Big data and deep data in scanning and electron microscopies: deriving functionality from multidimensional data sets
 Alex Belianinov^{1, 2},
 Rama Vasudevan^{1, 2},
 Evgheni Strelcov^{1, 2},
 Chad Steed^{1, 7},
 Sang Mo Yang^{1, 2, 4, 5},
 Alexander Tselev^{1, 2},
 Stephen Jesse^{1, 2},
 Michael Biegalski^{2},
 Galen Shipman^{1, 8},
 Christopher Symons^{1, 7},
 Albina Borisevich^{1, 3},
 Rick Archibald^{1, 6} and
 Sergei Kalinin^{1, 2}
DOI: 10.1186/s40679-015-0006-6
© Belianinov et al.; licensee Springer. 2015
Received: 27 January 2015
Accepted: 21 April 2015
Published: 13 May 2015
Abstract
The development of electron and scanning probe microscopies in the second half of the twentieth century has produced spectacular images of the internal structure and composition of matter with nanometer, molecular, and atomic resolution. Largely, this progress was enabled by computer-assisted methods of microscope operation, data acquisition, and analysis. Advances in imaging technology in the beginning of the twenty-first century have opened the proverbial floodgates on the availability of high-veracity information on structure and functionality. From the hardware perspective, high-resolution imaging methods now routinely resolve atomic positions with approximately picometer precision, allowing for quantitative measurements of individual bond lengths and angles. Similarly, functional imaging often leads to multidimensional data sets containing partial or full information on properties of interest, acquired as a function of multiple parameters (time, temperature, or other external stimuli). Here, we review several recent applications of big and deep data analysis methods to visualize, compress, and translate this multidimensional structural and functional data into physically and chemically relevant information.
Keywords
Scanning probe microscopy; Multivariate statistical analysis; High-performance computing
Review
Introduction
The ultimate goal for local imaging and spectroscopy techniques is to measure and correlate structure-property relationships with functionality by evaluating chemical, electronic, optical, and phonon properties of individual atomic and nanometer-sized structural elements [1]. If available directly, information on structure-property correlations at the single-molecule, bond, or defect level enables theoretical models to accurately guide materials scientists and engineers in the optimal use of materials at any length scale, and allows direct verification of fundamental and phenomenological physical models and direct extraction of the associated parameters.
Particularly significant challenges are offered by spatially inhomogeneous, partially ordered, and disordered systems, ranging from spin glasses [2,3] and ferroelectric relaxors [4,5], to solid-electrolyte interface (SEI) layers in batteries [6] and amorphized layers in fuel cells [7,8], to organic and biological materials. These systems offer a triple challenge: defining relevant local chemical and physical descriptors, probing their spatial distribution, and exploring their evolution in dynamic temperature, light, and chemical and electrochemical reaction processes. While these systems are complex, recent progress in information and application [9] of statistics suggests that such descriptions are possible; the challenge is to visualize and explore the data in ways that allow decoupling of various local dynamics under external physical and chemical stimuli.
Ideally, complete studies have to be performed as a function of global stimuli, such as temperature or a uniform electric field applied to the system, as well as local stimuli, using localized electric [10-13], thermal [14-18], or stress fields [19-23] exerted by a scanning probe microscopy (SPM) probe [24-26], either within classical SPM platforms or in combined SPM-scanning transmission electron microscopy (STEM) setups [27,28]. Further complications of the detection scheme in force-based SPMs require probing of the response in a frequency band around resonance (since the resonant frequency can be position dependent and single-frequency methods fail to capture these changes) [29-32].
Additionally, the instrument hardware challenge is exacerbated by the wealth of extracted information at both global and local scales, necessitating a drastic improvement in the capability to collect and analyze multidimensional data sets. For example, probing a local transformation requires sweeping a local stimulus (tip bias or temperature) while measuring the response. Note that all first-order phase transitions are hysteretic and often slow, constraining the measurement of the kinetic hysteresis (and differentiating it from thermodynamics) by measuring the system response as a function of time. This caveat requires first-order reversal curve-type studies, which effectively increase the dimensionality of the data (e.g., probing Preisach densities [33,34]).
Development of multidimensional SPM methods at Oak Ridge National Laboratory
| Technique | Dimensionality | Current data set | File volume | References |
|---|---|---|---|---|
| 1. Band excitation (BE) | 3D: space, ω | (256 × 256) × 64 | 32 MB | |
| 2. Switching spectroscopy PFM | 3D: space, voltage | (64 × 64) × 128 | 4 MB | |
| 3. Time relaxation PFM | 3D: space, time | (64 × 64) × 128 | 4 MB | |
| 4. AC sweeps | 4D: space, ω, voltage | (64 × 64) × 64 × 256 | 512 MB | |
| 5. BE SSPFM | 4D: space, ω, voltage | (64 × 64) × 64 × 128 | 256 MB | |
| 6. BE thermal | 4D: space, ω, temp | (64 × 64) × 64 × 256 | 512 MB | |
| 7. Time relaxation BE | 4D: space, ω, time | (64 × 64) × 64 × 64 | 4 MB | |
| 8. First-order reversal curves | 5D: space, ω, voltage, voltage | (64 × 64) × 64 × 64 × 16 | 2 GB | |
| 9. Time relaxation on sweep, BE | 5D: space, ω, voltage, time | (64 × 64) × 64 × 64 × 64 | 8 GB | |
| 10. FORC time BE | 6D: space, ω, voltage, voltage, time | (64 × 64) × 64 × 64 × 16 × 64 | 128 GB | Not yet realized |
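The growth of file volume with dimensionality can be checked with simple arithmetic. The following is a minimal sketch, assuming 8 bytes stored per data point and binary units (1 MB = 2^20 bytes); both are assumptions that are consistent with, for example, rows 1, 8, and 10 of the table but are not stated in the text. The helper names `dataset_bytes` and `human` are hypothetical.

```python
from math import prod

BYTES_PER_POINT = 8  # assumption: one 8-byte value (e.g., float64) per sample


def dataset_bytes(shape, bytes_per_point=BYTES_PER_POINT):
    """Raw size in bytes of a dense array with the given dimensions."""
    return prod(shape) * bytes_per_point


def human(n_bytes):
    """Format a byte count using binary units (1 MB = 2**20 bytes)."""
    for unit in ("B", "kB", "MB", "GB", "TB"):
        if n_bytes < 1024:
            return f"{n_bytes:g} {unit}"
        n_bytes /= 1024
    return f"{n_bytes:g} PB"


# Dimensions taken from rows 1, 8, and 10 of the table
be = dataset_bytes((256, 256, 64))
forc = dataset_bytes((64, 64, 64, 64, 16))
forc_time = dataset_bytes((64, 64, 64, 64, 16, 64))

for name, size in [("BE", be), ("FORC", forc), ("FORC time BE", forc_time)]:
    print(f"{name}: {human(size)}")
```

Each added 64-point dimension multiplies the volume by 64, which is why the step from 5D to 6D moves the data set from gigabytes to the hundred-gigabyte range.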
We note that additional registration-based problems emerge in combined structural and functional imaging, when the information obtained via a high-resolution structural channel (imaging) is complemented by lower-resolution spectroscopic probing collected on a coarse grid. These types of experiments bring about problems associated with drift correction and spatial registration of disparate data sets. Therefore, to identify relevant physical behaviors in the intrinsically high-dimensional data that result, in the absence of a deterministic physical model, clustering and unsupervised learning techniques can be used to establish statistically significant correlations in data sets.
As instrumental platforms and data acquisition electronics are becoming ubiquitous, efficiently storing and handling the large data sets they generate becomes critical. Hence, the key missing element is mastering ‘the big data’ implicitly present in the (S)TEM/SPM data sets. Here, we review some of the recent advances in the application of big data analysis techniques to structural and functional imaging data. These techniques include unsupervised learning and clustering techniques, supervised neural network-based classification, and deep data analysis of physically relevant multivariate statistics data.
Multivariate statistical methods
The purpose of this section is to familiarize the reader with the basic unsupervised and supervised learning methods used to reduce dimensionality and visualize data behavior in a high-dimensional data set. The material presented in this section gives only a brief overview, and the reader is encouraged to explore the methods further if they have any interest in utilizing them. Minimal mathematical formalism is presented, as the focus is to explain the functional aspect of each method as applied to spectral and imaging data, to describe each method's strengths and weaknesses, and to give a brief overview of the input and output parameters, if any, to ease the transition to actual use. All of the methods presented below share the same 2D data structure at the input, with rows as observations and columns as variables. This arrangement implies that in a high-dimensional data set, certain dimensions have to be combined. In our work, presented below, we combine dimensions by type; that is, spatial dimensions in X, Y, or Z can be mixed, as can energy dimensions such as AC or DC voltage. Other mixing schemes are also possible and in some areas perhaps necessary. More details on how each of the methods described below was implemented are given in the respective technical sections.
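The 2D input structure described above can be illustrated as follows. This is a minimal sketch, assuming NumPy and a synthetic 3D stack with two spatial axes and one spectral axis; the function names `unfold` and `fold` are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def unfold(stack):
    """(ny, nx, P) stack -> (ny*nx, P) matrix: rows = pixels, columns = variables."""
    ny, nx, n_var = stack.shape
    return stack.reshape(ny * nx, n_var), (ny, nx)


def fold(values, grid_shape):
    """Per-pixel values (e.g., a loading or a cluster label) -> 2D map."""
    return np.asarray(values).reshape(grid_shape)


rng = np.random.default_rng(0)
stack = rng.normal(size=(4, 5, 16))     # synthetic stand-in for a spectral image
X, grid = unfold(stack)                 # observations-by-variables matrix
mean_map = fold(X.mean(axis=1), grid)   # any per-row result folds back to a map
```

Because the spatial axes are only flattened, never shuffled, any per-observation output of the methods below can be folded back into a spatial map with the same pixel ordering.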
Principal component analysis
Perhaps the easiest way to visualize a multidimensional data set is through principal component analysis (PCA), an approach previously reported for various applications in electron and force-based scanning probe microscopy data [35-41]. PCA has been widely used in a number of scientific fields and owes its popularity to its ease of use and the wide availability of source code in practically any programming language. The algorithm does not take any parameters besides the data itself and outputs three important results: eigenvectors (arranged from most to least information dense), the respective loading (or score) maps associated with each eigenvector, and a scree plot that represents the information content as a function of eigenvector number. These three results allow the user to visualize the principal behaviors in the data, through eigenvectors and their loading maps, as well as judge the information content of each eigenvector via the scree plot. PCA, however, suffers from the difficulty of interpreting higher eigenvectors, where the information content typically decreases, the qualitative nature of information content assignment, and processing speed setbacks for truly large data sets (hundreds of thousands of observations, each with hundreds of thousands of variables).
In PCA, a spectroscopic image of $N \times M$ pixels formed by spectra containing $P$ points is represented as a superposition of the eigenvectors $w_k$,

$$A_i(U_j) = \sum_{k} a_{ik}\, w_k(U_j),$$

where $a_{ik} \equiv a_k(x, y)$ are position-dependent expansion coefficients or component weights, $A_i(U_j) \equiv A(x, y, U_j)$ is the spectral information at a selected pixel, and $U_j$ are the discrete bias values at which current is measured. The eigenvectors $w_k(U)$ and the corresponding eigenvalues $\lambda_k$ are found from the singular value decomposition of the covariance matrix, $C = AA^{T}$, where $A$ is the matrix of all experimental data points $A_{ij}$, i.e., the rows of $A$ correspond to individual grid points ($i = 1, \ldots, N \cdot M$), and columns correspond to voltage points, $j = 1, \ldots, P$. The eigenvectors $w_k(U_j)$ are orthogonal and are arranged such that the corresponding eigenvalues are placed in descending order by variance, $\lambda_1 > \lambda_2 > \ldots$. In other words, the first eigenvector $w_1(U_j)$ contains the most information within the spectral image data, the second contains the most common response after the subtraction of variance from the first one, and so on. In this manner, the first $p$ maps, $a_k(x, y)$, contain the majority of information within the data set, while the remaining $P - p$ sets are dominated by noise. The number of significant components, $p$, can be chosen based on the overall shape of the $\lambda_k$ dependence or from correlation analysis of the loading maps, $a_{ik} \equiv a_k(x, y)$, which correspond to each of the eigenvectors. Additionally, a scree plot is used to show the variance in each component as a function of the component's number.
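The decomposition described above can be sketched in a few lines. This is an illustrative example, assuming NumPy and synthetic data; it takes the SVD of the mean-centered data matrix directly, which yields the same eigenvectors as diagonalizing the covariance matrix. The function name `pca`, the two synthetic spectral shapes, and the noise level are all hypothetical choices made for the example.

```python
import numpy as np


def pca(A):
    """PCA of A (rows = pixels, columns = spectral points) via SVD.

    Returns eigenvectors (k, P), loadings (n_pixels, k), eigenvalues (k,),
    with components ordered by decreasing variance.
    """
    A_centered = A - A.mean(axis=0)
    U, s, Vt = np.linalg.svd(A_centered, full_matrices=False)
    eigenvalues = s**2 / (A.shape[0] - 1)   # variance carried by each component
    loadings = U * s                        # expansion coefficients a_ik
    return Vt, loadings, eigenvalues


rng = np.random.default_rng(1)
n_pix, P = 400, 32
bias = np.linspace(-1, 1, P)
w_true = np.vstack([bias, bias**2])              # two hypothetical spectral behaviors
a_true = rng.normal(size=(n_pix, 2)) * [3.0, 1.0]
A = a_true @ w_true + 0.01 * rng.normal(size=(n_pix, P))

w, a, lam = pca(A)
scree = lam / lam.sum()                          # fraction of variance per component
A_rec = A.mean(axis=0) + a[:, :2] @ w[:2]        # reconstruction from p = 2 components
```

Folding each column of `a` back onto the pixel grid gives the loading map of the corresponding eigenvector, and plotting `scree` against the component number gives the scree plot.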
Independent component analysis
Independent component analysis (ICA) is a method designed to extract presumably independent signals mixed within the data. Much like PCA, the output is a collection of independent spectra and their loading maps. Unlike PCA, however, the order of ICA components is insignificant, and ICA takes some input parameters and generally takes longer to run than PCA. One of the key ICA parameters is the number of independent components, a decision that can be highly nontrivial to make. Another often overlooked parameter is the number of principal components to retain; ICA uses PCA as a filter, and for low-dimensional data sets, or data sets with relatively few observations, the last retained principal component plays a huge role in the quality of the signal separation, as it may allow or bar certain details in the data from being presented to the algorithm.
After the whitening, independent signals can be approximated by an orthogonal transformation of the whitened signal, rotating the joint density of the mixed signals so as to maximize the non-normality of the marginal densities.
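The whitening and rotation steps can be illustrated on a two-signal toy problem. This is a minimal sketch, assuming NumPy, two synthetic uniformly distributed sources, and absolute excess kurtosis as the non-normality measure with a brute-force search over rotation angles; production ICA codes use more efficient contrast functions and optimizers, and the function names here are hypothetical.

```python
import numpy as np


def whiten(X):
    """Center X (rows = observations) and rescale it to identity covariance."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Columns of U are orthonormal; sqrt(n - 1) gives unit sample variance.
    return U * np.sqrt(X.shape[0] - 1)


def excess_kurtosis(y):
    """Zero for a Gaussian; its magnitude serves here as a non-normality score."""
    y = (y - y.mean()) / y.std()
    return np.mean(y**4) - 3.0


def rotate_to_independence(Z, n_angles=180):
    """Grid search over 2D rotations for the most non-Gaussian marginals."""
    best_score, best_Y = -np.inf, None
    for theta in np.linspace(0.0, np.pi / 2, n_angles, endpoint=False):
        c, s = np.cos(theta), np.sin(theta)
        Y = Z @ np.array([[c, -s], [s, c]]).T
        score = sum(abs(excess_kurtosis(Y[:, i])) for i in range(2))
        if score > best_score:
            best_score, best_Y = score, Y
    return best_Y


rng = np.random.default_rng(2)
sources = rng.uniform(-1, 1, size=(2000, 2))   # independent, non-Gaussian signals
mixing = np.array([[1.0, 0.5], [0.3, 2.0]])    # hypothetical mixing matrix
X = sources @ mixing.T                         # observed mixtures

Z = whiten(X)
Y = rotate_to_independence(Z)                  # recovered up to order, sign, and scale
```

The recovered components match the sources only up to permutation, sign, and scale, which is the reason the order of ICA components carries no meaning.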
Bayesian demixing
Bayesian demixing is a very powerful technique that shines where PCA and ICA fall short. First and foremost, Bayesian demixing returns a quantitative result with the units of demixed spectra being the units of the input data.
The demixed vectors are also always positive and sum to one, which makes the transition from statistics to science quite natural. There are many optional parameters that can be tweaked within the Bayesian code, but typically at least the number of independent components is required. The disadvantage of the Bayesian method is speed, and additional insight is necessary to optimize the algorithm. Typically, in our analysis flow, we start with PCA and ICA to identify the parameter space; once the region of interesting solutions or phenomena is identified, we perform Bayesian demixing.
While a plethora of Bayesian-based statistics methods exist, we have found the algorithm provided by Dobigeon et al. to be the fastest and easiest to use [43]. The Bayesian approach assumes data in a Y = MA + N form, where observations Y are a linear combination of position-independent endmembers, M, each weighted with respective relative abundances, A, and corrupted by an additive Gaussian noise N. This approach features the following: the endmembers and the abundance coefficients are non-negative, fully additive, and sum-to-one [44-47].
The algorithm operates by estimating the initial projection of endmembers in a reduced subspace via the N-FINDR [48] algorithm that finds a simplex of the maximum volume that can be inscribed within the hyperspectral data set using a nonlinear inversion. The endmember abundance priors along with noise variance priors are picked from a multivariate Gaussian distribution found within the data, whereas the posterior distribution is based on endmember independence calculated by Markov Chain Monte Carlo, with asymptotically distributed samples probed by the Gibbs sampling strategy. An additional, unique aspect of Bayesian analysis is that the endmember spectra and abundances are estimated jointly, in a single step, unlike multiple least-squares regression methods where initial spectra should be known [43].
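The forward model Y = MA + N and its constraints can be made concrete with synthetic data. This is a sketch under stated assumptions, using NumPy: the Gaussian-peak endmembers, the Dirichlet-distributed abundances, the noise level, and all array sizes are hypothetical. It illustrates only the mixing model, not the Bayesian estimation algorithm itself; the final least-squares step is a sanity check that presumes the endmembers are already known.

```python
import numpy as np

rng = np.random.default_rng(3)
P, R, n_pix = 64, 3, 500          # spectral points, endmembers, pixels (hypothetical)

u = np.linspace(0.0, 1.0, P)
centers = np.array([0.2, 0.5, 0.8])
# Endmember matrix M (P x R): non-negative, position-independent spectra
M = np.exp(-((u[:, None] - centers[None, :]) / 0.1) ** 2)

# Abundance matrix A (R x n_pix): non-negative, each column sums to one
A = rng.dirichlet(alpha=np.ones(R), size=n_pix).T

# Additive Gaussian noise and the observed data
N = 0.01 * rng.normal(size=(P, n_pix))
Y = M @ A + N

# Sanity check only: with M known, ordinary least squares recovers A.
A_ls = np.linalg.lstsq(M, Y, rcond=None)[0]
```

Because every column of `A` is a set of non-negative fractions summing to one, each pixel's spectrum is a convex combination of the endmembers, which is what gives the abundances a direct physical reading.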
Clustering
A very natural way to analyze data is to cluster it. There are many algorithms available that have a variety of built-in assumptions about the data and as such could predict the optimal clustering value, order clusters based on variance or other distance metrics, etc. We present a method, k-means clustering, which is rather flexible and easy to find on a variety of platforms and in many programming languages. The only required input value for k-means is the number of clusters; however, additional variables such as the distance metric, number of iterations, how the initial sample is calculated, and how to handle unorthodox data events can all have drastic effects on the results. This clustering algorithm is moderately fast and returns a simple integer index that assigns each observation to its respective cluster. The biggest downside of the k-means clustering algorithm is the random cluster ordering on the output; however, this information can be indirectly accessed by looking at the average distance between clusters (based on the supplied metric) as well as the number of points in the cluster.
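The k-means algorithm partitions the observations $x_j$ into $k$ sets $S_1, \ldots, S_k$ such that the within-cluster sum of squares,

$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x_j \in S_i} \lVert x_j - \mu_i \rVert^{2},$$

where $\mu_i$ is the mean of points in $S_i$, is minimized [49,50]. Here, we have used an implementation of the k-means algorithm that minimizes the sum, over all clusters, of the within-cluster sums of point-to-cluster-centroid distances. As a measure of distance (minimization parameter), in our data, we have typically used the sum of absolute differences, with each centroid being the component-wise median of the points in a given cluster.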
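The variant described above, with the sum of absolute differences as the distance and the component-wise median as the centroid, can be sketched as follows. This is an illustrative implementation assuming NumPy, random initialization from the data, and a fixed iteration cap; the function name `kmeans_cityblock` and the synthetic two-group data are hypothetical, and this is not the authors' code.

```python
import numpy as np


def l1_distances(X, centroids):
    """Sum of absolute differences from every point to every centroid."""
    return np.abs(X[:, None, :] - centroids[None, :, :]).sum(axis=2)


def kmeans_cityblock(X, k, n_iter=50, seed=0):
    """Lloyd-style clustering with L1 distance and median centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = l1_distances(X, centroids).argmin(axis=1)
        updated = centroids.copy()
        for i in range(k):
            members = X[labels == i]
            if len(members):             # keep the old centroid if a cluster empties
                updated[i] = np.median(members, axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    labels = l1_distances(X, centroids).argmin(axis=1)
    return labels, centroids


rng = np.random.default_rng(7)
group_a = rng.normal(0.0, 0.5, size=(100, 8))    # synthetic "spectra" of one type
group_b = rng.normal(10.0, 0.5, size=(100, 8))   # and of a clearly different type
X = np.vstack([group_a, group_b])
labels, centroids = kmeans_cityblock(X, k=2)
```

The integer values in `labels` depend on the random initialization, which reflects the arbitrary cluster ordering noted above; only the grouping itself is meaningful.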
Neural networks
Here K is commonly referred to as an activation function that defines the node output based on the set of inputs.
As the reader may note, ultimately, the cost function is dictated by the problem we are trying to solve. In the case of an unsupervised learning problem, we are dealing with a general estimation problem, so the cost function is chosen to reflect our imposed model on the system. In the case of supervised learning, we are given a set of examples and aim to infer the mapping based on the data in the training and other data sets. In the simplest case scenario, the cost function would be a mean-squared error type, which would try to minimize the average error between the network's output and the target values of the example sets.
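The roles of the activation function K and of a mean-squared error cost can be illustrated with a single node. This is a minimal sketch, assuming NumPy, a node of the common form K(w·x + b) with K = tanh, and plain gradient descent on synthetic targets; the weights, bias, learning rate, and iteration count are hypothetical choices for the example and do not describe any network used in this review.

```python
import numpy as np


def node(X, w, b, K=np.tanh):
    """Output of one node: activation K applied to the weighted sum of inputs."""
    return K(X @ w + b)


def mse(pred, target):
    """Mean-squared error between network output and target values."""
    return np.mean((pred - target) ** 2)


rng = np.random.default_rng(4)
X = rng.normal(size=(200, 3))
# Targets generated by a hypothetical "true" node, so the mapping is learnable
target = node(X, np.array([1.0, -2.0, 0.5]), 0.3)

w, b, lr = np.zeros(3), 0.0, 0.2
cost_start = mse(node(X, w, b), target)
for _ in range(3000):
    out = node(X, w, b)
    # Chain rule for the MSE cost with K = tanh, whose derivative is 1 - out**2
    delta = 2.0 * (out - target) * (1.0 - out**2) / len(X)
    w -= lr * (X.T @ delta)
    b -= lr * delta.sum()
cost_end = mse(node(X, w, b), target)
```

Supervised training of a larger network follows the same pattern: a cost is evaluated on labeled examples and the weights are adjusted to reduce it.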
Spectral domains
We illustrate the applications of multivariate data analytics techniques to multidimensional functional spectroscopies, which include bias, current, frequency, and time channels in SPM and electron energy loss spectroscopy (EELS) in STEM. The analysis involves signal decomposition along the energy or stimulus direction, whereas the spatial portion of the signal is left pristine. In this section, we illustrate analysis via unsupervised and supervised learning algorithms for scanning tunneling microscopy (STM) and atomic force microscopy (AFM)-based electromechanical force spectroscopies.
3D data - CITS in STM
In STM, an electrically conductive tip is brought into a current tunneling distance to a conductive sample [51,52]. In Z-imaging mode, the tip is scanned over the sample and a Z feedback is used to maintain a constant current while simultaneously adjusting and collecting the position of the feedback. Conversely, in the current imaging mode, Z height is kept constant and the current variation is measured [53]. In current imaging tunneling spectroscopy (CITS), the measurement is performed at an individual spatial point located at an (x, y) position on a grid with the current I recorded for a given applied voltage waveform U. The final data object is a 3D stack of spectral current images I(x, y, U), where I is the detected current, U is the tip bias, and (x, y) are spatial surface coordinates of the measurement [54].
The spatial variability of the electronic behavior across the surface was analyzed using PCA [35-37,39,57]. PCA eigenvector and loading map pairs for components 2 and 3 are shown in Figure 2b,c for the FeTe_{0.55}Se_{0.45} CITS data set. It is useful to analyze the eigenvectors and loadings simultaneously to examine the changes in the signal first (the eigenvector here) and its spatial distribution next (the loading). From a statistical perspective, we are mapping sources of electronic inhomogeneity arising from the negative portion of the I-V curve in both components, as illustrated in Figure 2b,c. The second eigenvector shown in Figure 2b (upper right corner inset) shows an increase in the −0.05 to 0 V half of the range, where the average signal has a negative slope in the same region. In the third eigenvector-loading pair, shown in Figure 2c, the variation is also more prominent in the negative half of the bias range where the current forms a well, compared to the smooth decay behavior in the average I-V. Therefore, changes in the current at negative bias are strong sources of data variance in the system and can be attributed to chemical segregation at the surface.
While the components are statistically significant and reflect major changes in the variability of the data, the connection to the physical properties PCA highlights is always nontrivial. This is mostly due to the fact that information variance, the property with respect to which PCA organizes the data, is sensitive to the variability in the signal, rather than the physical origin of the change. This suggests that PCA allows one to denoise, decorrelate, and visualize spatial variability of the response but does not directly yield additional knowledge with respect to the effects that are being studied. In the case of CITS data on FeTe_{0.55}Se_{0.45}, results of the third component, Figure 2c, can be legitimately questioned as the loading map seems to suggest behavior that is erratic and typical of an unstable tip-surface tunneling regime. It is then necessary to use other methods in order to supplement PCA results and determine the underlying source of variance in the signal and its relevance to the problem at hand, as will be illustrated by Bayesian demixing analysis of the local conductance behavior in the section ‘Deep learning’ [40,58,59].
Another commonly used unsupervised learning method that reflects major organization in the data structure is k-means clustering. Insight into the spatial variability of the electronic structure on the surface that is inaccessible by PCA can be gained from k-means clustering analysis of the CITS data [60]. As a measure of distance (minimization parameter), we have used the sum of absolute differences, with each centroid being the component-wise median of the points in a given cluster.
The k-means result for five clusters using the squared Euclidean distance metric is shown in Figure 2d, with the inset in the top right showing the mean I-V curves for the individual clusters (color-coded respectively). As seen in the k-means clustering result, the mapping is indeed sensitive to the changes in the negative bias portion of the I-V curve. Here we see clustering that is based on variance of conductivity or alternatively the width of the band gap. Perhaps a more interesting observation is the spatial distribution of the clusters, where the regions of the highest maximum current (cherry red) and lowest maximum current (green) are segregated and in most cases surrounded by patches of varying conductivity. Note that in this result, the single-pixel and short line-like agglomerates of pixel outliers seen in Figure 2c are absent. Overall, the behavior is more in line with the results of the second PCA component.
4D and 5D data - band excitation spectroscopy analysis
The multivariate analysis of a higher-dimensional data set (beyond 3D) is effectively illustrated by a band excitation piezoresponse force spectroscopy (BEPS) data set. This technique probes the electromechanical response of materials, which is directly related to the material’s ferroelectric state. The spectroscopic version of piezoresponse force microscopy (PFM) probes the local ferroelectric switching induced by the DC bias applied to the tip via a dynamic electromechanical response, effectively yielding the local piezoresponse loop.
Although PCA is useful in visualizing the structure of the data, there are no physically meaningful constraints on the eigenvectors. For example, if it is known (or postulated) that the measured signal is a linear combination of n independent signals, one may want to determine the pure components that correspond to each of these cases. For this particular problem, the ICA [42] technique provides a solution and allows demixing of signals into a user-defined number of vectors (components), with the constraint that the components must be statistically independent.
The spatial maps of the mixing coefficients show variability in the response and are markedly different from the PCA eigenvalue maps; for example, the bottom right area of the sample displays high response of the second component (Figure 6b), which increases the area enclosed within the left side of the butterfly loop. In this example, note that there is no reason for there to be four components to the hysteresis loop, i.e., we illustrate an example of the method, but based on component shape, there should be at least four as all components appear significantly different. Importantly, the fourth component displays a near-ideal ferroelectric loop (Figure 5d), and the strength of this component with respect to the other components can be seen as an indication of the degree of purely ferroelectric switching in those regions, as opposed to other components that appear to result from dominating influences by surface charges, polar nanoregions, or field-induced phase transformations. For instance, the first component appears largest in the top-left corner of the region studied (Figure 6a), and the coercive fields for this component are much lower, possibly due to the increased propensity of field-induced phase transformations (likely rhombohedral to tetragonal [67]) in this area. Thus, ICA is a highly useful method for blind source separation and provides a powerful method accompanying PCA to demix signals where the number of constituent components is either known from physics or can be postulated.
Supervised learning
The training of the neural network was performed on a region of the sample outlined with a black box in Figure 7c. The inputs to the network were the first six components of the principal component analysis decomposition (here, acting as a filter) of the data set within the training region. A network of three neurons was trained repeatedly on multiple examples until a minimal error was achieved. Following training, the network was presented with the data set collected on the whole area shown in Figure 7c and it correctly identified both of the bacterial species. Interestingly, other topographical features, distinct from the substrate, were classified as background, which identifies them as non-bacterial debris. However, a small, relatively flat region (upper right corner in Figure 7c) was classified as M. lysodeikticus, implying that this region could be covered in a membrane of lysed bacteria of that species. Thus, supervised learning presents a powerful image recognition tool that can identify objects based on a small subset of information provided in the training set. Even though successful neural network operation requires extensive training for accuracy, the computational cost during operation is very small. The illustrated example was computed on a typical user desktop without additional high-end components or computational clusters. Similarly, neural network approaches can be extended to training on theoretical model outputs, with the experimental results presented for analysis. Examples include functional fits to relaxation parameters [71] or Ising model simulations [72,73].
Deep learning
In this section, we discuss the pathways to establish correspondence between statistical analysis and a physical model, i.e., to transition from a search for correlation to a search for causation. The previously introduced first-order reversal curve current-voltage (FORC-IV) SPM technique [74] has been deployed in imaging and analysis of spatially uniform Ca-substituted BiFeO_{3} and NiO systems [74,75]. Those studies have shown that the locally measured hysteresis in the FORC-IV curves is related to changes in electronic conduction sensed by the tip in response to a bias-induced electrochemical process, and the area of the I-V loop is overall indicative of local ionic activity. FORC-IV spectroscopic imaging modes lack adequate data analysis and interpretation pathways due to the flexible, multidimensional nature of the data set and the volume of the data collected. In this example, we combine FORC-IV measurements with multivariate statistical methods based on signal demixing, in order to discriminate between different conductivity behaviors based on the shapes of the I-V curves in the full spectroscopic data set.
The multidimensional nature of these data, combined with the lack of analytical or numerical physical models, naturally calls for multivariate statistical analysis in order to extract the most comprehensive view of the physical behavior of the CFO-BFO system. While PCA and ICA are powerful methods that allow one to take a closer look into the structure of the data, a preferable method would preserve physical information in the data and allow fully quantitative analysis. Such a method will separate the data into a combination of well-defined components with clear spectroscopic behavior that has an intensity weight component, providing insight into the spatial distribution of the behavior. Ideally, these components should be physically viable, well-behaved, positive, have additive weights, etc. This level of analysis can be achieved by Bayesian linear demixing methods, specifically an algorithm conglomerate introduced by Dobigeon et al. [43].
The main advantage of these methods is a quantitative, interpretable result where the final endmembers are non-negative, in the units of input data, and with all of the respective abundances adding up to 1. Therefore, at each location, the data is decomposed into a linear combination of spectra where each pixel in the probed grid consists of a number of components (i.e., conducting behaviors) present in a corresponding proportion. Note that these constraints allow a direct transition from statistical analysis to physical behavior. By making the abundances additive and the endmembers positive, we can begin assigning physical behavior to the shape and nature of the endmember curves. By extension, analysis of the endmember loading maps adds the spatial component to the behavior that non-statistical methods of analysis lack entirely.
Image domain
The clustering and dimensionality reduction algorithms used in the previous sections are equally applicable to analysis in image coordinate space, exploring the correlation between individual structural elements found in the image itself. These can originate from both contrast and shape-based features contained in the image, as well as an analysis of features that are mathematically condensed into a representative set.
Sliding Fourier transform
Clustering and classification of atomic features
Imaging in k-space
We studied the first 220 s of growth by using k-means clustering, with ten clusters, with the mean of the clusters plotted in the upper half of Figure 12, while the temporal dependence of each cluster is graphed in the lower panel. After the deposition begins (at t = 50 s), there are five distinct clusters that characterize the growth process before the transition to step-flow mode. These highlight the pathway for the transition: it appears that it occurs with the streaks gradually losing intensity over time until they are more spot-like (seen in cluster 6). Beyond t = 120 s, four clusters characterize the remaining 100 s of growth, and these are outlined with olive dashed lines in the upper panel. Interestingly, there is little difference between these clusters (compared with the clusters in the layer-by-layer growth segment), and moreover, the similarity suggests little roughening effects in the grown film. We can therefore assign, unambiguously, that the layer-by-layer growth transitions to the step-flow mode when cluster 1 is active, i.e., at t = 120 s. Thus, the k-means clustering allows identification of growth mode transitions as well as the pathway through which this occurs in k-space, and furthermore allows identification of the existence or absence of surface roughening. The method is equally applicable to detect 2D → 3D growth mode transitions [82], disordered → ordered transitions [83], strain relaxation [84], etc.
Supervised learning: domain shape recognition
Principal component analysis combined with neural networks can be used for the analysis of ferroelectric domain shapes, which provides insight into the highly nontrivial mechanism of ferroelectric domain switching, and potentially establishes a new paradigm for the information encoding based on the capture domain shape in the image.
The experimental data set consisted of PFM images of the domains produced by an application of a number of electrical pulses of varying length, and a total of 288 domains were acquired for testing.
We used PCA to obtain a set of the descriptors that characterized the individual domains. Each domain image consisted of N × N pixels and was unfolded into a 1D vector of N2 length. PCA eigenvectors (Figure 13b) and corresponding weight coefficients (Figure 13c) characterized the domain morphology. Color map of the weights demonstrates clear differences between the domain groups corresponding to different switching pulses (Figure 13c). This approach illustrates use of eigenvectors for characterization of all of the experimentally observed features of the domain morphology, and the weights can be used as an input parameter for the recognition by a feedforward neural network.
To test this approach, the experimental data set was divided into training and test data sets. PCA over the training data set (about 15% of the domains) was used to calculate reference (etalon) eigenvectors, which were then used to decompose the test data set into testing weight coefficients. The training weights and the corresponding switching sequences were then used to train the neural network, and the testing weights were used as inputs for recognition.
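The train/test procedure can be illustrated with a short NumPy sketch. Everything specific in it is an assumption made for illustration: the synthetic disk-shaped `toy_domain` images, the 9 × 9 image size, the choice of two components, and in particular the nearest-centroid rule, which stands in for the feedforward neural network and is not the classifier used in the study. The point shown is that the eigenvectors are computed from the training images only, and the test images are then projected onto those same eigenvectors to obtain their weights.

```python
import numpy as np


def unfold(images):
    """Unfold each N x N image into a 1D vector of length N^2."""
    return images.reshape(len(images), -1)


def fit_reference_eigenvectors(train_images, n_components):
    """PCA on the training set only: returns the mean vector and leading eigenvectors."""
    x = unfold(train_images)
    mean = x.mean(axis=0)
    # Rows of vt are the principal axes of the centered training data.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(images, mean, eigvecs):
    """Weight coefficients of images in the basis of the reference eigenvectors."""
    return (unfold(images) - mean) @ eigvecs.T


# Synthetic stand-in data: small and large disk-shaped "domains" with noise.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[-4:5, -4:5]


def toy_domain(radius):
    disk = (xx ** 2 + yy ** 2 <= radius ** 2).astype(float)
    return disk + 0.05 * rng.standard_normal(xx.shape)


labels = np.array([0, 1] * 20)                      # two hypothetical pulse classes
images = np.stack([toy_domain(1.5 if lab == 0 else 3.5) for lab in labels])

train = np.arange(6)                                # 15% of the 40 images
test = np.arange(6, 40)

mean, eigvecs = fit_reference_eigenvectors(images[train], n_components=2)
w_train = project(images[train], mean, eigvecs)
w_test = project(images[test], mean, eigvecs)

# Stand-in classifier (NOT a neural network): nearest class centroid in weight space.
centroids = np.stack([w_train[labels[train] == c].mean(axis=0) for c in (0, 1)])
dists = np.linalg.norm(w_test[:, None, :] - centroids[None, :, :], axis=2)
predicted = dists.argmin(axis=1)
accuracy = float((predicted == labels[test]).mean())
print(accuracy)
```

The toy classes are far easier to separate than real switched domains, so the accuracy obtained here says nothing about the recognition rates achievable on experimental data.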
Experimental simulations of the suggested approach showed its practical applicability, with a recognition probability above 65%. This relatively low value is mainly due to the irreproducibility of the switching process, caused by the nonideal nature of the ferroelectric crystal.
High-performance computing
Scalable methods
The key concepts in generating effective high-performance computing methods are managing the latency of data transfer and balancing the workload. Algorithms that are structured to effectively utilize the ever-increasing capacity of high-performance computing are called scalable methods. The movement of data within high-performance computing systems and across storage devices is well understood and follows a hierarchy of transmission latencies; therefore, analyzing scanning and electron microscopy data on these platforms will require physics-based algorithms that are customized to exploit parallel work while minimizing communication cost [89]. Experimental scientists will need to join with computational scientists and applied mathematicians to continue to scale this analysis to the next generation of high-performance computing (HPC) systems [90].
Future HPC systems are expected to have processor cores, memory units, communication nodes, and other components totaling in the hundreds of millions [91], and faults in these systems are expected to occur on a time scale of seconds [92]. This underscores the need for algorithms designed specifically for the analysis of scanning and electron microscopy data to use robust workload-balancing tools that are resilient to errors in both algorithmic execution and data transfer.
Big data workflows
To effectively leverage scalable methods for analysis on large-scale HPC systems, a sophisticated data workflow is required. Whereas computational scientists are accustomed to dealing with the idiosyncrasies of HPC environments (compiler technologies, scientific libraries, communication libraries, complex data models), microscopists generally are not. This presents a challenge in delivering the promise of near real-time analysis to the users at scanning probe, focused X-ray, tomography, and electron microscopy imaging facilities [89]. To overcome this challenge, we are employing an automated workflow-based approach.
In a typical example, the user collects data from the instrument via the instrument control interface. As measurements progress, data are generated in a standard microscopy data format such as DigitalMicrograph version 3 (DM3) or a text-based file (see Figure 14). Upon completion of a measurement, the workflow begins with the transfer of the data via a lightweight communication node from the instrument to high-performance storage [93]. This approach allows the data to be pipelined to an HPC environment in parallel while subsequent measurements are taken at the instrument and other instruments are sending data.
Once the data file is stored within an HPC environment, the next stage of the workflow is conversion to a data model suitable for HPC-based analysis, generally using the Hierarchical Data Format version 5 (HDF5). With the data set now converted and resident on a parallel file system, the next stage of the workflow, analysis via scalable methods, can be executed. At this juncture, an analysis algorithm is selected based on the instrument, the measurement, the material composition, and other user-specified criteria. Once selected, the analysis is executed on an HPC system. The resultant data and statistics are then made available to the user for inspection and further analysis. Initial experimentation with this concept has shown that analysis can be completed in seconds, allowing near real-time feedback from the measurement. Upon completion of the analysis, the data are organized for possible archival. Once data movement and analysis are completed, interactive visual analysis is made available for further inspection of the data.
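The staged workflow can be sketched with the Python standard library. All names below (`transfer`, `convert`, `analyze`, the `ANALYSES` registry, and the measurement dictionaries) are hypothetical placeholders, not part of the facility's software. The transfer and conversion stages are stand-ins that only annotate a dictionary, whereas a real deployment would move files and write HDF5. The sketch shows two ideas from the text: the analysis routine is selected from the instrument and measurement type, and several measurements move through the stages concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Hypothetical registry: (instrument, measurement type) -> analysis routine.
ANALYSES = {
    ("STM", "CITS"): lambda values: {"mean": mean(values)},
    ("PFM", "BEPS"): lambda values: {"max": max(values)},
}


def transfer(measurement):
    """Stand-in for moving the raw file from the instrument to HPC storage."""
    return dict(measurement, location="hpc_storage")


def convert(measurement):
    """Stand-in for conversion to an analysis-ready data model (HDF5 in the text)."""
    return dict(measurement, data_model="analysis_ready")


def analyze(measurement):
    """Select the analysis from the instrument and measurement type, then run it."""
    routine = ANALYSES[(measurement["instrument"], measurement["kind"])]
    return dict(measurement, result=routine(measurement["values"]))


def run_workflow(measurement):
    return analyze(convert(transfer(measurement)))


measurements = [
    {"instrument": "STM", "kind": "CITS", "values": [1.0, 2.0, 3.0]},
    {"instrument": "PFM", "kind": "BEPS", "values": [0.2, 0.9, 0.4]},
]

# Measurements from different instruments flow through the stages concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_workflow, measurements))

for r in results:
    print(r["instrument"], r["kind"], r["result"])
```

Keeping each stage a plain function of one measurement record is a design choice of this sketch; it makes it straightforward to rerun a single stage after a transfer or execution fault.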
Scalable analytics
It is important to note that the difficulties surrounding scalable analytics in the context of the imaging methods discussed so far extend far beyond the need for task-based and data-based parallelism. In particular, one of the primary challenges expected to impede further progress is the application of statistical methods in extremely high dimensions. Due to the structure of the analysis problems in computational settings, the complexity of the problem space manifests itself as a high-dimensional analysis problem, where dimensionality is most often associated with the number of measurements being considered simultaneously. The curse of dimensionality is a persistent phenomenon in modern statistics due to our ability to measure at rates and scales unheard of until the modern era [94]. However, there are many strategies to mitigate the statistical consequences of high dimensionality.
While some of the methods noted earlier in this paper are computationally scalable, in many cases they are not appropriate for other reasons. For example, although PCA, ICA, k-means, and backpropagation for neural networks all fit the Statistical Query Model, and thus belong to a known set of problems that can essentially scale linearly, this does not necessarily solve the issues raised by high-dimensional analysis [95]. In particular, it is important to observe that in high-dimensional spaces, nearest neighbors become nearly equidistant [96]. This is particularly problematic for clustering algorithms but also has significant consequences for other dimensionality reduction techniques.
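The near-equidistance of neighbors in high dimensions is easy to reproduce numerically. The sketch below is an illustration under simple assumptions (points drawn uniformly from the unit hypercube, Euclidean distance, 200 points per trial); the function name `relative_contrast` is introduced here. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance. This ratio shrinks as the dimension grows.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(d_max - d_min) / d_min for distances from one query point to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


contrasts = {dim: relative_contrast(dim) for dim in (2, 20, 200, 2000)}
for dim, value in contrasts.items():
    print(f"dim={dim:5d}  relative contrast={value:.3f}")
```

When the contrast approaches zero, a nearest-neighbor assignment carries little information, which is why distance-based clustering degrades without prior dimensionality reduction.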
Clustering in high-dimensional spaces has been addressed using a variety of methods that consider scalability. A good example is the use of hashing in similarity measurements. Hashing techniques that facilitate neighborhood searches in high-dimensional space rely on various assumptions for tractability. Often, these assumptions include independence among the dimensions; Weiss et al. suggest using PCA to preprocess the data so that these assumptions hold more accurately [97]. Moreover, various hashing techniques attempt to preserve distances between points in different ways, such that the user must be savvy enough to understand these assumptions in order to choose the best approach [97,98]. For example, the method of Weiss et al. gains much of its power by only attempting to preserve the relative order of small distances. After a certain distance in space is reached, all distances beyond that are allowed to become equidistant in the space represented by the hash codes. However, this brings us back to the unfortunate situation that in extremely high dimensions, points tend to become equidistant, such that these hashing approaches cannot be expected to work for problems that do not have structure allowing some sort of dimensionality reduction.
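To make the idea of similarity-preserving hash codes concrete, the sketch below uses random-hyperplane hashing, a simple and widely known scheme that preserves angular similarity. It is a swapped-in technique chosen for brevity and is not the method of Weiss et al.; the dimension, the number of bits, and the test vectors are arbitrary assumptions. Each bit records on which side of a random hyperplane a vector falls, so nearby vectors share most bits and the Hamming distance between codes serves as a cheap proxy for the angle between vectors.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals with Gaussian components."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vector, hyperplanes):
    """One bit per hyperplane: the side of the plane on which the vector falls."""
    return tuple(
        int(sum(v * h for v, h in zip(vector, plane)) >= 0.0)
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of positions at which two codes differ."""
    return sum(x != y for x, y in zip(a, b))


DIM, N_BITS = 64, 32
planes = make_hyperplanes(DIM, N_BITS)

rng = random.Random(1)
point = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
near = [x + 0.01 * rng.gauss(0.0, 1.0) for x in point]   # slightly perturbed copy
far = [-x for x in point]                                # opposite direction

d_near = hamming(hash_code(point, planes), hash_code(near, planes))
d_far = hamming(hash_code(point, planes), hash_code(far, planes))
print(d_near, d_far)
```

Like the schemes discussed above, this one helps only when the data have structure that the codes can capture; it does not remove the concentration of distances in very high dimensions.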
We also suspect that many important patterns cannot be captured by linear dimensionality reduction techniques alone. However, nonlinear techniques, such as those shown by Roweis et al., Tenenbaum et al., Belkin et al., and Gerber et al., are less scalable [99–102]. Many such methods fall under the umbrella of manifold learning, which is a technique meant to take advantage of cases where the data lie on a nonlinear subspace that can be represented by a significantly smaller number of dimensions [103]. Many manifold learning approaches involve the solution of a symmetric diagonally dominant (SDD) linear system, but recent progress has been made in finding more efficient, scalable solutions to such problems [104].
When dimensionality reduction techniques still leave large numbers of potentially relevant measurements, other scalable approaches for dealing with high-dimensional analysis are still required. In the case of clustering, one such scalable approach that deals with high-dimensional clustering can be found in the methodology of Vatsavai et al. [105]. Note that this method also automatically attempts to select the number of clusters, a known problem for k-means clustering.
Many of the most effective solutions to the challenges presented by high-dimensional data have relied on the injection of additional knowledge. In the case where human expertise can play a part in pattern discovery and dimensionality reduction, data analysis becomes much easier. Unfortunately, more often than not, we are dealing with problems where the physics is unknown and the manual discovery of patterns is extremely difficult even with deep domain knowledge. Thus, more automated methods for incorporating additional information, such as the integration of alternate imaging modalities, become important.
Moreover, methods of automated pattern discovery in large data sets have made great progress in recent years. In particular, for image-based methods, much progress in automated feature extraction has occurred in the area known as deep learning [106]. However, such methods rely on large aggregated image repositories. This means that big data workflows have to be in place to retain large numbers of experimental results and allow their joint analysis. In addition, while these methods have proven to be scalable, they are also prone to finding many irrelevant patterns when utilizing networks with extreme numbers of parameters [107].
Visualization
Dynamic hypothesis generation and confirmation techniques are a necessity for enabling scientific progress in extreme-scale science domains. Indeed, when insight is detected in the data, new questions arise, leading to more detailed examination of specific constituents. Accordingly, scientific analysis techniques should enhance the scientist’s cognitive workflow by intelligently blending human interaction and computational analytics at scale via interactive data visualization. The orchestration of human cognition and computational power is critical for two primary reasons: (i) the data are too large for purely visual methods and require assistance from data processing and mining algorithms, and (ii) the tasks are too exploratory for purely analytical methods and call for human involvement. Having established our strategy for harnessing computational power through automated analytical algorithms, we will devote the remainder of this section to several key strategies for integrating human-guided scientific analysis at scale in the materials sciences.
Given the scale and complexity of the materials data, a visual analytics approach is the most viable solution to accelerate knowledge discovery. Thomas et al. define visual analytics as ‘the science of analytical reasoning facilitated by interactive visual interfaces’ [108]. The fundamental goal of visual analytics is to turn the challenge of the information overload into an opportunity by visually representing information and allowing direct interaction to generate and test hypotheses. The advantage of visual analytics is that users can focus their full cognitive and perceptual capabilities on the analytical process, while simultaneously leveraging advanced computational capabilities to guide the discovery process [109]. Visual analytics is a modern take on the concept of exploratory data analysis (EDA) [110]. Introduced by Tukey, EDA is a data analysis philosophy that emphasizes the involvement of both visual and statistical understanding in the analysis process.
To allow efficient EDA in materials science, the combination of multiple views (CMV) and focus + context information visualizations are needed. CMV is an interaction methodology that involves linked view manipulations distributed across multiple visualizations, and recent evaluations demonstrate that this approach fosters more creative and efficient analysis than non-coordinated views [111]. In a CMV system, as the scientist manipulates a particular visualization (e.g., item selections, filtering, variable integrations), the manipulations are immediately propagated to the other visualizations using a linked data model. In conjunction with CMV, focus + context representations support efficient EDA by preserving the context of the more complete overview of the data during zooming and panning operations. As the scientist zooms into the data views to see more details, the focus + context display simultaneously maintains the context or gestalt [112] of the entire data set. In this way, the operator is less likely to lose their orientation within the overall data space while investigating fine-grained details.
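The linked data model behind CMV can be sketched as a shared selection object that notifies every attached view. The classes `LinkedSelection` and `View` below are hypothetical and written only for illustration; they do not reproduce the interface of any particular visualization toolkit. A selection made in one view is routed through the shared model and immediately appears in all other views.

```python
class LinkedSelection:
    """Shared data model: stores the current selection and notifies every view."""

    def __init__(self):
        self._views = []
        self.selected = frozenset()

    def attach(self, view):
        self._views.append(view)

    def select(self, item_ids):
        self.selected = frozenset(item_ids)
        for view in self._views:
            view.on_selection(self.selected)


class View:
    """A visualization that highlights whatever the shared model has selected."""

    def __init__(self, name, model):
        self.name = name
        self.model = model
        self.highlighted = frozenset()
        model.attach(self)

    def brush(self, item_ids):
        """User interaction in this view, routed through the shared model."""
        self.model.select(item_ids)

    def on_selection(self, selected):
        self.highlighted = selected


model = LinkedSelection()
image_view = View("image", model)
spectrum_view = View("spectra", model)

# Selecting pixels 3, 7, and 11 in the image view highlights the same
# items in the spectrum view without the two views knowing about each other.
image_view.brush({3, 7, 11})
print(sorted(spectrum_view.highlighted))
```

Routing every interaction through one model, rather than letting views call each other, keeps the number of connections linear in the number of views.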
EDEN (Exploratory Data Analysis Environment) extends the classical parallel coordinates axis by providing cues that guide and refine the analyst’s exploration of the information space. This approach is akin to the concept of the scented widget described by Willett et al. [118]. Scented widgets are graphical user interface components that are augmented with an embedded visualization to enable efficient navigation in the information space of the data items. The concept arises from the information foraging theory described by Pirolli and Card [119], which relates human information gathering to the food foraging activities of animals. In this model, the concept of information scent is identified as the ‘user perception of the value, cost, or access path of information sources obtained by proximal cues’ [119]. In EDEN, the scented axis widgets are augmented with information from automated data mining processes (e.g., statistical filters, automatic axis arrangements, regression mining, correlation mining, and subset selection capabilities) that highlight potentially relevant associations and reduce knowledge discovery timelines.
The parallel coordinates plot is ideal for exploratory analysis of materials science data because it accommodates the simultaneous display of a large number of variables in a two-dimensional representation. In EDEN, the parallel coordinates plot is extended with a number of capabilities that facilitate exploratory data analysis and guide the scientist to the most significant relationships in the data. A full description of these extensions is beyond the scope of this article, but the reader is encouraged to consult our prior publications for more detailed explanations of our multidimensional analysis techniques [113,120]. EDEN is an exemplary case of the indispensable visual analytics techniques that provide intelligent user interfaces by leveraging both visual representations and human interaction, thereby enhancing scientific discovery with vital assistance from automated analytics. As we develop new visual analytics approaches like EDEN for materials science workflows, we expect to dramatically reduce knowledge discovery timelines through more intuitive and exploratory analysis guided by machine learning algorithms in an intelligent visual interface.
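The way a parallel coordinates plot fits many variables into two dimensions can be shown in a few lines. The sketch below is a generic illustration and does not reproduce EDEN; the function `to_polylines` and the three example variables are assumptions. Each variable is rescaled to its own vertical axis running from 0 to 1, and each record becomes a polyline whose vertex on axis j is the rescaled value of variable j.

```python
def to_polylines(rows):
    """Rescale every variable to [0, 1]; each record becomes one polyline."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.5  # constant axis: center it
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Hypothetical records: (temperature in K, bias in V, response in arbitrary units).
rows = [
    (300.0, 1.0, 0.2),
    (350.0, 2.0, 0.1),
    (400.0, 4.0, 0.0),
]

polylines = to_polylines(rows)
for line in polylines:
    print([round(v, 3) for v in line])
```

In this toy data set the polylines between the second and third axes cross, which is how an inverse relationship between two adjacent variables appears in a parallel coordinates display.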
Conclusions
The development of electron and scanning probe microscopies in the second half of the twentieth century was enabled by computer-assisted methods for automatic data acquisition, storage, analysis, and tuning and refinement of feedback loops as well as imaging parameters. In the last decade, high-resolution STEM and STM imaging techniques have enabled acquisition of high-veracity information [121] at the atomic scale, readily providing insight into positions and functionality of materials that had been inaccessible due to a lagging analysis framework in the microscopy communities. Naturally, progress in the complexity of dynamic and functional imaging leads to multidimensional data sets containing spectral information on local physical and chemical functionalities, which can be easily expanded further to acquire data as a function of a plethora of parameters such as time, temperature, or many other external stimuli.
Maximizing the scientific output from existing and future microscopes brings forth the challenge of analysis, visualization, and storage of data, as well as decorrelation and classification of the known and unknown hidden data parameters, that is, the traditional big data analysis. The existing infrastructure for such analysis has been developed in the context of medical and satellite imaging, and its extension to functional and structural imaging data is a natural next step. Of course, further development toward a flexible infrastructure where scientists can select or define their own analysis algorithms to analyze the data ‘on the fly’ as it is being collected can be envisioned. This will require scalable algorithms, high-performance computing, and storage infrastructure. Reducing the data sets to a more manageable size, while initially attractive, comes with the risk of losing significant information within the data, particularly for exploratory studies in which the phenomena of interest may not be captured by statistical methods.
Beyond the big data challenges [122,123] is the transition to a deep data approach, in which we fully utilize all the information present within the data to derive an understanding [124]. Namely, how do we ascribe relevant physical and chemical information contained within the data sets, differentiate relevant and coincidental behaviors, move beyond simple correlation, and link to scientific theory? High-resolution imaging allows us to explore the microscopic degrees of freedom in the system: how can we use theory to understand these behaviors, refine theoretical models, and ultimately enable knowledge-driven design and optimization of new materials [125]? To achieve this goal, new methods and theories will be necessary for defining the local chemical and physical descriptions and their spatial distribution and evolution during reactions. Although this task is complicated, recent progress in information and statistical theory suggests that such descriptions are possible [126].
One approach to achieving this goal is the user center model, which combines the development and maintenance of cutting-edge tools with experience and detailed knowledge of data interpretation in terms of relevant behaviors, all while maintaining an open-access policy that makes the findings available to the broader scientific community. Equally important will be the cross-disciplinary synergy between theory, imaging, and data analytics, harnessing the power of multivariate statistical methods to understand and explore multidimensional imaging and spectroscopy data sets.
Integration of the knowledge in the field will allow development of universal database libraries allowing identification and data mining of novel and well-understood materials, refinement and improvement of dynamic data, and ultimately creation of supervised expert systems that will allow rapid identification and analysis of unknown systems. Successes in fields such as medical diagnostics and imaging suggest that this is fully possible. These developments will further open the pathway for exploration and tailoring of desired material functionalities based on better information. We anticipate the emergence of Google-like environments that will allow storage and interpretation of collective knowledge and image interpretation in the context of data and historical knowledge. Rather than creating multiple samples, the structure-property relationships extracted from a single disordered sample could offer a statistical picture of materials functionality, providing the experimental counterpart to Materials Genome-type programs.
Notes
Abbreviations
AC: alternating current
AFM: atomic force microscopy
ANN: artificial neural network
BE: band excitation
BEPS: band excitation polarization spectroscopy
cAFM: conductive atomic force microscopy
CFO-BFO: CoFe_{2}O_{4}-BiFeO_{3}
CITS: current imaging tunneling spectroscopy
CMV: combination of multiple views
DC: direct current
DM3: DigitalMicrograph version 3
EDA: exploratory data analysis
EDEN: Exploratory Data Analysis Environment
FFT: fast Fourier transform
FORC: first-order reversal curve
FORC-IV: first-order reversal curve current-voltage
HPC: high-performance computing
HDF5: Hierarchical Data Format version 5
ICA: independent component analysis
IV: current-voltage
PCA: principal component analysis
PFM: piezoresponse force microscopy
RHEED: reflection high-energy electron diffraction
SPM: scanning probe microscopy
SS PFM: switching spectroscopy piezoresponse force microscopy
STEM: scanning transmission electron microscopy
STEM-HAADF: scanning transmission electron microscopy high-angle annular dark-field imaging
STM: scanning tunneling microscopy
STS: scanning tunneling spectroscopy
SDD: symmetric diagonally dominant
ω: frequency
Declarations
Acknowledgements
This research was sponsored by the Division of Materials Sciences and Engineering, BES, DOE (RKV, AT, SVK). The data analysis portion of this research (ES, MB) was conducted at the Center for Nanophase Materials Sciences, which is a DOE Office of Science User Facility. Research related to atomic resolution imaging (AB, AB, SJ) was sponsored by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy. The authors gratefully acknowledge Dr. S. Zhang (Penn. State) for providing the PMN-PT ferroelectric relaxor sample as well as Dr. Ying-Hao Chu and Ying-Hui Hsieh for providing BFO-CFO nanocomposite samples. SMY acknowledges the support by IBS-R009-D1, Korea.
Authors’ Affiliations
References
 Mody, C: Instrumental Community: Probe Microscopy and the Path to Nanotechnology. MIT Press, Boston, MA (2011)View ArticleGoogle Scholar
 Binder, K, Young, AP: Spinglasses: experimental facts, theoretical concepts, and open questions. Rev Mod Phys 58(4), 801–976 (1986). doi:10.1103/RevModPhys.58.801View ArticleGoogle Scholar
 Binder, K, Reger, JD: Theory of orientational glasses models, concepts, simulations. Adv Phys 41(6), 547–627 (1992). doi:10.1561/2200000006View ArticleGoogle Scholar
 Westphal, V, Kleemann, W, Glinchuk, MD: Diffuse phase transitions and randomfieldinduced domain states of the “relaxor” ferroelectric PbMg_{1/3}Nb_{2/3}O_{3}. Phys Rev Lett 68(6), 847–850 (1992). doi:dx.doi.org/10.1103/PhysRevLett.68.847View ArticleGoogle Scholar
 Tagantsev, AK, Glazounov, AE: Does freezing in PbMg1/3Nb2/3O3 relaxor manifest itself in nonlinear dielectric susceptibility? Appl Phys Lett 74(13), 1910–1912 (1999). doi:10.1063/1.123710View ArticleGoogle Scholar
 Winter, M, Besenhard, JO, Spahr, ME, Novak, P: Insertion electrode materials for rechargeable lithium batteries. Adv Mater 10(10), 725–763 (1998). doi:10.1002/(sici)15214095(199807)10:10<725::aidadma725>3.0.co;2zView ArticleGoogle Scholar
 Bagotsky, VS: Fuel Cells: Problems and Solutions. Wiley, Hoboken, NJ (2009)Google Scholar
 Adler, SB: Factors governing oxygen reduction in solid oxide fuel cell cathodes. Chem Rev 104(10), 4791–4843 (2004). doi:10.1021/cr020724oView ArticleGoogle Scholar
 Machta, BB, Chachra, R, Transtrum, MK, Sethna, JP: Parameter space compression underlies emergent theories and predictive models. Science 342, 604–607 (2013). doi:10.1126/science.1238723View ArticleGoogle Scholar
 Kalinin, SV, Balke, N: Local electrochemical functionality in energy storage materials and devices by scanning probe microscopies: status and perspectives. Adv Mater 22(35), E193–E209 (2010). doi:10.1002/adma.201001190View ArticleGoogle Scholar
 Balke, N, Jesse, S, Morozovska, AN, Eliseev, E, Chung, DW, Kim, Y, Adamczyk, L, Garcia, RE, Dudney, N, Kalinin, SV: Nanometerscale electrochemical intercalation and diffusion mapping of Liion battery materials. Nat Nanotechnol 5, 7349–7357 (2010)View ArticleGoogle Scholar
 Balke, N, Bdikin, I, Kalinin, SV, Kholkin, AL: Electromechanical imaging and spectroscopy of ferroelectric and piezoelectric materials: state of the art and prospects for the future. J Am Ceram Soc 92(8), 1629–1647 (2009). doi:10.1111/j.15512916.2009.03240.xView ArticleGoogle Scholar
 Kalinin, SV, Rodriguez, BJ, Jesse, S, Maksymovych, P, Seal, K, Nikiforov, M, Baddorf, AP, Kholkin, AL, Proksch, R: Local biasinduced phase transitions. Materials Today 11(11), 16–27 (2008). doi:10.1016/s13697021(08)702359View ArticleGoogle Scholar
 Felts, JR, Somnath, S, Ewoldt, RH, King, WP: Nanometerscale flow of molten polyethylene from a heated atomic force microscope tip. Nanotechnology 23(21), 215301 (2012). doi:10.1088/09574484/23/21/215301View ArticleGoogle Scholar
 King, WP, Kenny, TW, Goodson, KE, Cross, G, Despont, M, Dürig, U, Rothuizen, H, Binnig, GK, Vettiger, P: Atomic force microscope cantilevers for combined thermomechanical data writing and reading. Appl Phys Lett 78(9), 1300–1302 (2001). doi:dx.doi.org/10.1063/1.1351846View ArticleGoogle Scholar
 Jesse, S, Nikiforov, MP, Germinario, LT, Kalinin, SV: Local thermomechanical characterization of phase transitions using band excitation atomic force acoustic microscopy with heated probe. Appl Phys Lett 93(7), 073104 (2008). doi:10.1063/1.2965470View ArticleGoogle Scholar
 Nikiforov, MP, Jesse, S, Morozovska, AN, Eliseev, EA, Germinario, LT, Kalinin, SV: Probing the temperature dependence of the mechanical properties of polymers at the nanoscale with band excitation thermal scanning probe microscopy. Nanotechnology 20(39), 395709 (2009). doi:10.1088/09574484/20/39/395709View ArticleGoogle Scholar
 Somnath, S, Corbin, EA, King, WP: Improved nanotopography sensing via temperature control of a heated atomic force microscope cantilever. Sensors J, IEEE 11(11), 2664–2670 (2011). doi:10.1109/JSEN.2011.2157121View ArticleGoogle Scholar
 Kelly, SJ, Kim, Y, Eliseev, E, Morozovska, A, Jesse, S, Biegalski, MD, Mitchell, JF, Zheng, H, Aarts, J, Hwang, I: Controlled mechnical modification of manganite surface with nanoscale resolution. Nanotechnology 25(47), 475302 (2014). doi:10.1088/09574484/25/47/475302View ArticleGoogle Scholar
 Kim, Y, Kelly, SJ, Morozovska, A, Rahani, EK, Strelcov, E, Eliseev, E, Jesse, S, Biegalski, MD, Balke, N, Benedek, N: Mechanical control of electroresistive switching. Nano Lett 13(9), 4068–4074 (2013). doi:10.1021/nl401411rView ArticleGoogle Scholar
 Lu, H, Kim, D, Bark, CW, Ryu, S, Eom, C, Tsymbal, E, Gruverman, A: Mechanicallyinduced resistive switching in ferroelectric tunnel junctions. Nano Lett 12(12), 6289–6292 (2012). doi:10.1021/nl303396nView ArticleGoogle Scholar
 Zhang, JX, Xiang, B, He, Q, Seidel, J, Zeches, RJ, Yu, P, Yang, SY, Wang, CH, Chu, YH, Martin, LW, Minor, AM, Ramesh, R: Large fieldinduced strains in a leadfree piezoelectric material. Nat Nanotechnol 6(2), 98–102 (2011). doi:10.1038/nnano.2010.265View ArticleGoogle Scholar
 Dao, M, Chollacoop, N, Van Vliet, K, Venkatesh, T, Suresh, S: Computational modeling of the forward and reverse problems in instrumented sharp indentation. Acta Mater 49(19), 3899–3918 (2001). doi:10.1016/S13596454(01)002956View ArticleGoogle Scholar
 Garcia, R, Martinez, RV, Martinez, J: Nanochemistry and scanning probe nanolithographies. Chem Soc Rev 35(1), 29–38 (2006). doi:10.1039/B501599PView ArticleGoogle Scholar
 Martinez, J, Martínez, RV, Garcia, R: Silicon nanowire transistors with a channel width of 4 nm fabricated by atomic force microscope nanolithography. Nano Lett 8(11), 3636–3639 (2008). doi:10.1021/nl801599kView ArticleGoogle Scholar
 Van Vliet, KJ, Li, J, Zhu, T, Yip, S, Suresh, S: Quantifying the early stages of plasticity through nanoscale experiments and simulations. Phys Rev B 67(10), 104105 (2003). doi:dx.doi.org/10.1103/PhysRevB.67.104105View ArticleGoogle Scholar
 Chang, HJ, Kalinin, SV, Yang, S, Yu, P, Bhattacharya, S, Wu, PP, Balke, N, Jesse, S, Chen, LQ, Ramesh, R, Pennycook, SJ, Borisevich, AY: Watching domains grow: insitu studies of polarization switching by combined scanning probe and scanning transmission electron microscopy. J Appl Phys 110(5), 052014 (2011). doi:10.1063/1.3623779View ArticleGoogle Scholar
 Nelson, CT, Gao, P, Jokisaari, JR, Heikes, C, Adamo, C, Melville, A, Baek, SH, Folkman, CM, Winchester, B, Gu, YJ, Liu, YM, Zhang, K, Wang, EG, Li, JY, Chen, LQ, Eom, CB, Schlom, DG, Pan, XQ: Domain dynamics during ferroelectric switching. Science 334(6058), 968–971 (2011). doi:10.1126/science.1206980View ArticleGoogle Scholar
 Jesse, S, Guo, S, Kumar, A, Rodriguez, BJ, Proksch, R, Kalinin, SV: Resolution theory, and static and frequencydependent crosstalk in piezoresponse force microscopy. Nanotechnology 21(40), 405703 (2010). doi:10.1088/09574484/21/40/405703View ArticleGoogle Scholar
 Jesse, S, Kalinin, SV: Band excitation in scanning probe microscopy: sines of change. J Phys D Appl Phys 44(46), 464006–464021 (2011). doi:10.1088/00223727/44/46/464006View ArticleGoogle Scholar
 Kalinin, SV, Jesse, S, Proksch, R: Information acquisition & processing in scanning probe microscopy. J Name: R & D Magazine 50(4), 20 (2008)Google Scholar
 Rodriguez, BJ, Callahan, C, Kalinin, SV, Proksch, R: Dualfrequency resonancetracking atomic force microscopy. Nanotechnology 18(47), 475504–475509 (2007)View ArticleGoogle Scholar
 Mayergoyz, ID, Friedman, G: Generalized Preisach model of hysteresis. IEEE Trans Magn 24(1), 212–217 (1988). doi:10.1109/20.43892View ArticleGoogle Scholar
 Mitchler, PD, Roshko, RM, Dahlberg, ED: A Preisach model with a temperature and timedependent remanence maximum. J Appl Phys 81(8), 5221–5223 (1997). doi:10.1063/1.364473View ArticleGoogle Scholar
 Jesse, S, Kalinin, SV: Principal component and spatial correlation analysis of spectroscopicimaging data in scanning probe microscopy. Nanotechnology 20(8), 085714 (2009). doi:10.1088/09574484/20/8/085714View ArticleGoogle Scholar
 Nan Y, Belianinov A, Strelcov E, Tebano A, Foglietti V, Di Castro D, Schlueter C, Lee TL, Baddorf A P, Balke N, Jesse S, Kalinin S V, Balestrino G, Aruta C: Effect of doping on surface reactivity and conduction mechanism in samariumdoped ceria thin films. ACS Nano, 8(12), 1249412501. doi:10.1021/nn505345c
 Bosman, M, Watanabe, M, Alexander, DTL, Keast, VJ: Mapping chemical and bonding information using multivariate analysis of electron energyloss spectrum images. Ultramicroscopy 106(11–12), 1024–1032 (2006). doi:10.1016/j.ultramic.2006.04.016View ArticleGoogle Scholar
 Bonnet, N: Artificial intelligence and pattern recognition techniques in microscope image processing and analysis. In: Hawkes, PW (ed.) vol. 114. Advances in Imaging and Electron Physics, pp. 1–77. Elsevier Academic Press Inc, San Diego (2000)View ArticleGoogle Scholar
 Bonnet, N: Multivariate statistical methods for the analysis of microscope image series: applications in materials science. J MicroscOxf 190, 2–18 (1998). doi:10.1046/j.13652818.1998.3250876.xView ArticleGoogle Scholar
 Belianinov, A, Ganesh, P, Lin, W, Sales, BC, Sefat, AS, Jesse, S, Pan, M, Kalinin, SV: Research update: spatially resolved mapping of electronic structure on atomic level by multivariate statistical analysis. APL Materials 2(12), 120701 (2014). doi:dx.doi.org/10.1063/1.4902996View ArticleGoogle Scholar
 Belianinov, A, Kalinin, SV, Jesse, S: Complete information acquisition in dynamic force microscopy. Nat Commun 6 (2015). doi:10.1038/ncomms7550
 Hyvärinen, A, Karhunen, J, Oja, E: Independent component analysis, vol. 46. John Wiley & Sons, Danvers, MA (2004)
 Dobigeon, N, Moussaoui, S, Coulon, M, Tourneret, JY, Hero, AO: Joint Bayesian endmember extraction and linear unmixing for hyperspectral imagery. IEEE Trans Signal Process 57(11), 4355–4368 (2009). doi:10.1109/tsp.2009.2025797
 Parra, L, Mueller, KR, Spence, C, Ziehe, A, Sajda, P: Unmixing hyperspectral data. Advances in Neural Information Processing Systems (NIPS) 12, 942–948 (2000)
 Dobigeon, N, Tourneret, JY, Chein, IC: Semi-supervised linear spectral unmixing using a hierarchical Bayesian model for hyperspectral imagery. IEEE Trans Signal Process 56(7), 2684–2695 (2008). doi:10.1109/tsp.2008.917851
 Moussaoui, S, Brie, D, Mohammad-Djafari, A, Carteret, C: Separation of non-negative mixture of non-negative sources using a Bayesian approach and MCMC sampling. IEEE Trans Signal Process 54, 4133–4145 (2006). doi:10.1109/TSP.2006.880310
 Dobigeon, N, Moussaoui, S, Tourneret, JY: Blind unmixing of linear mixtures using a hierarchical Bayesian model. Application to spectroscopic signal analysis. In: Proc. IEEE-SP Workshop Stat. and Signal Processing, pp. 79–83. Madison, WI (2007)
 Winter, ME: N-FINDR: an algorithm for fast autonomous spectral endmember determination in hyperspectral data. In: Shen, MRDSS (ed.) SPIE, pp. 266–275 (1999)
 Hartigan, JA, Wong, MA: Algorithm AS 136: a K-means clustering algorithm. J R Stat Soc: Ser C: Appl Stat 28(1), 100–108 (1979). doi:10.2307/2346830
 MacQueen, JB: Some methods for classification and analysis of multivariate observations. In: Cam, L.M.L., Neyman, J. (eds.) Proc. of the fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press (1967)
 Binnig, G, Rohrer, H: Scanning tunneling microscopy. Helv Phys Acta 55(6), 726–735 (1982)
 Binnig, G, Rohrer, H, Gerber, C, Weibel, E: 7 × 7 reconstruction on Si(111) resolved in real space. Phys Rev Lett 50(2), 120–123 (1983). doi:10.1103/PhysRevLett.50.120
 Stroscio, JA, Kaiser, WJ: Scanning Tunneling Microscopy, vol. 27. Methods in Experimental Physics. Academic Press, San Diego, CA (1993)
 Asenjo, A, Gomez-Rodriguez, JM, Baro, AM: Current imaging tunneling spectroscopy of metallic deposits of silicon. Ultramicroscopy 42, 933–939 (1992). doi:10.1016/03043991(92)90381s
 Sales, BC, Sefat, AS, McGuire, MA, Jin, RY, Mandrus, D, Mozharivskyj, Y: Bulk superconductivity at 14 K in single crystals of Fe_{1+y}Te_{x}Se_{1−x}. Phys Rev B 79(9), 094521 (2009)
 Sefat, AS, Singh, DJ, Mater, DJ: Chemistry and electronic structure of iron-based superconductors. Mater Research Bull 36, 614 (2011)
 Tselev, A, Ivanov, IN, Lavrik, NV, Belianinov, A, Jesse, S, Mathews, JP, Mitchell, GD, Kalinin, SV: Mapping internal structure of coal by confocal micro-Raman spectroscopy and scanning microwave microscopy. Fuel 126, 32–37 (2014). doi:10.1016/j.fuel.2014.02.029
 Strelcov, E, Belianinov, A, Hsieh, YH, Jesse, S, Baddorf, AP, Chu, YH, Kalinin, SV: Deep data analysis of conductive phenomena on complex oxide interfaces: physics from data mining. ACS Nano 8(6), 6449–6457 (2014). doi:10.1021/nn502029b
 Strelcov, E, Belianinov, A, Sumpter, BG, Kalinin, SV: Extracting physics through deep data analysis. Materials Today 17(9), 416–417 (2014). doi:10.1016/j.mattod.2014.10.002
 Haykin, SS: Neural Networks: A Comprehensive Foundation. Prentice Hall, New York, NY (1999)
 Bintachitt, P, Trolier-McKinstry, S, Seal, K, Jesse, S, Kalinin, SV: Switching spectroscopy piezoresponse force microscopy of polycrystalline capacitor structures. Appl Phys Lett 94(4), 042906 (2009). doi:10.1063/1.3070543
 Marincel, DM, Zhang, HR, Britson, J, Belianinov, A, Jesse, S, Kalinin, SV, Chen, LQ, Rainforth, WM, Reaney, IM, Randall, CA, Trolier-McKinstry, S: Domain pinning near a single-grain boundary in tetragonal and rhombohedral lead zirconate titanate films. Phys Rev B 91, 134113 (2015). doi:10.1103/PhysRevB.91.134113
 Gruverman, A, Kholkin, A: Nanoscale ferroelectrics: processing, characterization and future trends. Rep Prog Phys 69(8), 2443–2474 (2006). doi:10.1088/00344885/69/8/r04
 Gruverman, A, Auciello, O, Ramesh, R, Tokumoto, H: Scanning force microscopy of domain structure in ferroelectric thin films: imaging and control. Nanotechnology 8, A38–A43 (1997). doi:10.1088/09574484/8/3a/008
 Gruverman, AL, Hatano, J, Tokumoto, H: Scanning force microscopy studies of domain structure in BaTiO_{3} single crystals. Jpn J Appl Phys Part 1 - Regul Pap Short Notes Rev Pap 36(4A), 2207–2211 (1997). doi:10.1143/jjap.36.2207
 Roelofs, A, Bottger, U, Waser, R, Schlaphof, F, Trogisch, S, Eng, LM: Differentiating 180 degrees and 90 degrees switching of ferroelectric domains with three-dimensional piezoresponse force microscopy. Appl Phys Lett 77(21), 3444–3446 (2000). doi:10.1063/1.1328049
 Li, F, Zhang, S, Xu, Z, Wei, X, Luo, J, Shrout, TR: Composition and phase dependence of the intrinsic and extrinsic piezoelectric activity of domain engineered (1−x)Pb(Mg_{1/3}Nb_{2/3})O_{3}−xPbTiO_{3} crystals. J Appl Phys 108(3), 034106 (2010). doi:10.1063/1.3466978
 Nikiforov, MP, Reukov, VV, Thompson, GL, Vertegel, AA, Guo, S, Kalinin, SV, Jesse, S: Functional recognition imaging using artificial neural networks: applications to rapid cellular identification via broadband electromechanical response. Nanotechnology 20(40), 405708 (2009). doi:10.1088/09574484/20/40/405708
 Jesse, S, Kalinin, SV, Proksch, R, Baddorf, AP, Rodriguez, BJ: The band excitation method in scanning probe microscopy for rapid mapping of energy dissipation on the nanoscale. Nanotechnology 18(43), 435503 (2007). doi:10.1088/09574484/18/43/435503
 Jesse, S, Vasudevan, RK, Collins, L, Strelcov, E, Okatan, MB, Belianinov, A, Baddorf, AP, Proksch, R, Kalinin, SV: Band excitation in scanning probe microscopy: recognition and functional imaging. Annu Rev Phys Chem 65, 519–536 (2014). doi:10.1146/annurevphyschem040513103609
 Kalinin, SV, Rodriguez, BJ, Budai, JD, Jesse, S, Morozovska, AN, Bokov, AA, Ye, ZG: Direct evidence of mesoscopic dynamic heterogeneities at the surfaces of ergodic ferroelectric relaxors. Phys Rev B 81(6), 064107 (2010). doi:10.1103/PhysRevB.81.064107
 Kumar, A, Ovchinnikov, O, Guo, S, Griggio, F, Jesse, S, Trolier-McKinstry, S, Kalinin, SV: Spatially resolved mapping of disorder type and distribution in random systems using artificial neural network recognition. Phys Rev B 84(2), 024203 (2011). doi:10.1103/PhysRevB.84.024203
 Ovchinnikov, OS, Jesse, S, Bintachitt, P, Trolier-McKinstry, S, Kalinin, SV: Disorder identification in hysteresis data: recognition analysis of the random-bond-random-field Ising model. Phys Rev Lett 103(15), 157203 (2009). doi:10.1103/PhysRevLett.103.157203
 Strelcov, E, Kim, Y, Jesse, S, Cao, Y, Ivanov, IN, Kravchenko, II, Wang, CH, Teng, YC, Chen, LQ, Chu, YH, Kalinin, SV: Probing local ionic dynamics in functional oxides at the nanoscale. Nano Lett 13(8), 3455–3462 (2013). doi:10.1021/nl400780d
 Kim, Y, Strelcov, E, Hwang, IR, Choi, T, Park, BH, Jesse, S, Kalinin, S: Correlative multimodal probing of ionically-mediated electromechanical phenomena in simple oxides. Sci Rep 3, 2924 (2013). doi:10.1038/srep02924
 Hsieh, YH, Liou, JM, Huang, BC, Liang, CW, He, Q, Zhan, Q, Chiu, YP, Chen, YC, Chu, YH: Local conduction at the BiFeO_{3}-CoFe_{2}O_{4} tubular oxide interface. Adv Mater 24(33), 4564–4568 (2012). doi:10.1002/adma.201201929
 Vasudevan, RK, Belianinov, A, Gianfrancesco, AG, Baddorf, AP, Tselev, A, Kalinin, SV, Jesse, S: Big data in reciprocal space: sliding fast Fourier transforms for determining periodicity. Appl Phys Lett 106(9), 091601 (2015). doi:10.1063/1.4914016
 DeSanto, P, Buttrey, DJ, Grasselli, RK, Lugmair, CG, Volpe, AF, Toby, BH, Vogt, T: Structural characterization of the orthorhombic phase M1 in MoVNbTeO propane ammoxidation catalyst. Top Catal 23(1–4), 23–38 (2003). doi:10.1023/A:1024812101856
 Grasselli, RK, Buttrey, DJ, Burrington, JD, Andersson, A, Holmberg, J, Ueda, W, Kubo, J, Lugmair, CG, Volpe, AF: Active centers, catalytic behavior, symbiosis and redox properties of MoV(Nb, Ta)TeO ammoxidation catalysts. Top Catal 38(1–3), 7–16 (2006). doi:10.1007/s112440060066x
 Shiju, NR, Guliants, VV: Recent developments in catalysis using nanostructured materials. Appl Catal A Gen 356(1), 1–17 (2009). doi:10.1016/j.apcata.2008.11.034
 Dobson, P, Joyce, B, Neave, J, Zhang, J: Current understanding and applications of the RHEED intensity oscillation technique. J Cryst Growth 81(1), 1–8 (1987). doi:10.1016/00220248(87)903551
 Boschker, JE, Folven, E, Monsen, ÅF, Wahlström, E, Grepstad, JK, Tybell, T: Consequences of high adatom energy during pulsed laser deposition of La_{0.7}Sr_{0.3}MnO_{3}. Cryst Growth Des 12(2), 562–566 (2012). doi:10.1021/cg201461a
 Vasudevan, RK, Tselev, A, Baddorf, AP, Kalinin, SV: Big-data reflection high energy electron diffraction analysis for understanding epitaxial film growth processes. ACS Nano 8(10), 10899–10908 (2014). doi:10.1021/nn504730n
 Massies, J, Grandjean, N: Oscillation of the lattice relaxation in layer-by-layer epitaxial growth of highly strained materials. Phys Rev Lett 71(9), 1411 (1993). doi:10.1103/PhysRevLett.71.1411
 Ievlev, AV, Morozovska, AN, Eliseev, EA, Shur, VY, Kalinin, SV: Ionic field effect and memristive phenomena in single-point ferroelectric domain switching. Nat Commun 5, 4545 (2014). doi:10.1038/ncomms5545
 Ievlev, AV, Kalinin, SV: Data encoding based on the shape of the ferroelectric domains produced by a scanning probe microscopy tip. Nano Lett (2015)
 Department of Energy Scientific Grand Challenges Workshop Series: Architectures and Technology for Extreme Scale Computing. http://science.energy.gov/~/media/ascr/pdf/programdocuments/docs/Arch_tech_grand_challenges_report.pdf (2009). Accessed 3 March 2015
 Department of Energy Scientific Grand Challenges Workshop Series: Discovery in Basic Energy Sciences: The Role of Computing at the Extreme Scale. http://science.energy.gov/~/media/ascr/pdf/programdocuments/docs/Bes_exascale_report.pdf (2009). Accessed 3 March 2015
 Chen, J, Choudhary, A, Feldman, S, Hendrickson, B, Johnson, CR, Mount, R, Sarkar, V, White, V, Williams, D: Synergistic Challenges in Data-Intensive Science and Exascale Computing. Department of Energy Office of Science, http://sdavscidac.org/images/publications/Che2013a/ASCAC_Data_Intensive_Computing_report_final.pdf (2013). Accessed 2 March 2015
 Department of Energy Scientific Grand Challenges Workshop Series: Cross-Cutting Technologies for Computing at the Exascale. http://science.energy.gov/~/media/ascr/pdf/programdocuments/docs/Crosscutting_grand_challenges.pdf (2009). Accessed 3 March 2015
 Dongarra, J, Beckman, P, Moore, T, Aerts, P, Aloisio, G, Andre, JC, Barkai, D, Berthou, JY, Boku, T, Braunschweig, B, Cappello, F, Chapman, B, Chi, X, Choudhary, A, Dosanjh, S, Dunning, T, Fiore, S, Geist, A, Gropp, B, Harrison, R, Hereld, M, Heroux, M, Hoisie, A, Hotta, K, Jin, Z, Ishikawa, Y, Johnson, F, Kale, S, Kenway, R, Keyes, D, et al.: The international exascale software project roadmap. Int J High Perform Comput Appl 25(1), 3–60 (2011). doi:10.1177/1094342010391989
 Department of Energy Scientific Grand Challenges Workshop Series: Exascale Workshop Panel Meeting Report. http://extremecomputing.labworks.org/crosscut/index.stm (2010). Accessed 3 March 2015
 Oak Ridge National Laboratory: Accelerating Data Acquisition, Reduction and Analysis. http://www.csm.ornl.gov/newsite/adara.html (2015). Accessed 3 March 2015
 Donoho, DL: High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality. Aide-Memoire of a Lecture at AMS Conference on Math Challenges of the 21st Century (2000)
 Chu, CT, Kim, SK, Lin, YA, Yu, YY, Bradski, G, Ng, AY, Olukotun, K: MapReduce for machine learning on multicore. In: Advances in Neural Information Processing Systems (NIPS) (2006)
 Parsons, L, Haque, E, Liu, H: Subspace clustering for high dimensional data: a review. SIGKDD Explor Newsl 6, 90–105 (2004)
 Weiss, Y, Fergus, R, Torralba, A: Multidimensional spectral hashing. In: European Conference on Computer Vision. Florence, Italy (2012)
 Weiss, Y, Torralba, A, Fergus, R: Spectral hashing. In: Advances in Neural Information Processing Systems (NIPS). Vancouver, Canada (2008)
 Belkin, M, Niyogi, P: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 15(6), 1373–1396 (2003). doi:10.1162/089976603321780317
 Gerber, S, Tasdizen, T, Whitaker, R: Robust nonlinear dimensionality reduction using successive 1-dimensional Laplacian eigenmaps. In: Proceedings of the 24th International Conference on Machine Learning (ICML), Corvallis, OR, pp. 281–288 (2007)
 Roweis, ST, Saul, LK: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)
 Tenenbaum, JB, de Silva, V, Langford, JC: A global geometric framework for nonlinear dimensionality reduction. Science 290, 2319–2323 (2000). doi:10.1126/science.290.5500.2319
 Lin, T, Zha, H: Riemannian manifold learning. IEEE Trans Pattern Anal Mach Intell 30(5), 796–809 (2008). doi:10.1109/TPAMI.2007.70735
 Kelner, JA, Orecchia, L, Sidford, A, Zhu, ZA: A simple, combinatorial algorithm for solving SDD systems in nearly-linear time. In: Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, Palo Alto, California, USA
 Vatsavai, RR, Symons, CT, Chandola, V, Jun, G: GX-Means: a model-based divide and merge algorithm for geospatial image clustering. In: International Conference on Computational Science, Singapore (2011)
 Bengio, Y: Learning deep architectures for AI. Found Trends Mach Learn 2(1), 1–127 (2009)
 Coates, A, Huval, B, Wang, T, Wu, DJ, Ng, AY, Catanzaro, B: Deep learning with COTS HPC systems. In: 30th International Conference on Machine Learning, Atlanta, Georgia, USA (2013)
 Thomas, JJ, Cook, KA: A visual analytics agenda. IEEE Computer Graphics and Applications 26(1), 10–13 (2006)
 Keim, DA, Mansmann, F, Schneidewind, J, Thomas, J, Ziegler, H: Visual analytics: scope and challenges. Springer Berlin Heidelberg, Berlin (2008)
 Tukey, JW: Exploratory Data Analysis. Addison-Wesley, Massachusetts (1977)
 Roberts, JC: Exploratory visualization with multiple linked views. In: Dykes, J, MacEachren, AM, Kraak, MJ (eds.) Exploring Geovisualization. Elsevier, San Diego, CA (2005)
 Arnheim, R: Art and visual perception: a psychology of the creative eye. Univ of California Press, Los Angeles, CA (1954)
 Steed, CA, Ricciuto, DM, Shipman, G, Smith, B, Thornton, PE, Wang, D, Shi, X, Williams, DN: Big data visual analytics for exploratory earth system simulation analysis. Comput Geosci 61, 71–82 (2013). doi:10.1016/j.cageo.2013.07.025
 Inselberg, A: The plane with parallel coordinates. Vis Comput 1(2), 69–91 (1985). doi:10.1007/BF01898350
 Inselberg, A: Parallel coordinates. Springer, New York, NY (2009)
 Hauser, H, Ledermann, F, Doleisch, H: Angular brushing of extended parallel coordinates. In: IEEE Symposium on Information Visualization (INFOVIS 2002), pp. 127–130. IEEE (2002)
 Heinrich, J, Weiskopf, D: Eurographics 2013 - State of the Art Reports, pp. 95–116. The Eurographics Association, Goslar (2012)
 Willett, W, Heer, J, Agrawala, M: Scented widgets: improving navigation cues with embedded visualizations. IEEE Trans Vis Comput Graph 13(6), 1129–1136 (2007)
 Pirolli, P, Card, S: Information foraging. Psychol Rev 106(4), 643 (1999)
 Steed, CA, Swan, J, Jankun-Kelly, T, Fitzpatrick, PJ: Guided analysis of hurricane trends using statistical processes integrated with interactive parallel coordinates. In: IEEE Symposium on Visual Analytics Science and Technology (VAST 2009), pp. 19–26. IEEE (2009)
 Yankovich, AB, Berkels, B, Dahmen, W, Binev, P, Sanchez, SI, Bradley, SA, Li, A, Szlufarska, I, Voyles, PM: Picometre-precision analysis of scanning transmission electron microscopy images of platinum nanocatalysts. Nat Commun 5 (2014). doi:10.1038/ncomms5155
 Spiegelhalter, D: The future lies in uncertainty. Science 345(6194), 264–265 (2014). doi:10.1126/science.1251122
 Efron, B: Bayes’ theorem in the 21st century. Science 340(6137), 1177–1178 (2013). doi:10.1126/science.1236536
 Baldi, P, Sadowski, P, Whiteson, D: Searching for exotic particles in high-energy physics with deep learning. Nat Commun 5 (2014). doi:10.1038/ncomms5308
 Brouwer, WJ, Kubicki, JD, Sofo, JO, Giles, CL: An investigation of machine learning methods applied to structure prediction in condensed matter. arXiv preprint arXiv:1405.3564 (2014)
 Schmidt, M, Lipson, H: Distilling free-form natural laws from experimental data. Science 324(5923), 81–85 (2009). doi:10.1126/science.1165893
 Jesse, S, Mirman, B, Kalinin, SV: Resonance enhancement in piezoresponse force microscopy: mapping electromechanical activity, contact stiffness, and Q factor. Appl Phys Lett 89(2) (2006). doi:10.1063/1.2221496
 Jesse, S, Kalinin, SV, Proksch, R, Baddorf, AP, Rodriguez, BJ: Energy dissipation measurements on the nanoscale: band excitation method in scanning probe microscopy. Nanotechnology 18, 435503 (2007). doi:10.1088/09574484/18/47/475504
 Nikiforov, MP, Thompson, GL, Reukov, VV, Jesse, S, Guo, S, Rodriguez, BJ, Seal, K, Vertegel, AA, Kalinin, SV: Double-layer mediated electromechanical response of amyloid fibrils in liquid environment. ACS Nano 4(2), 689–698 (2010). doi:10.1021/nn901127k
 Jesse, S, Baddorf, AP, Kalinin, SV: Switching spectroscopy piezoresponse force microscopy of ferroelectric materials. Appl Phys Lett 88(6), 062908 (2006). doi:10.1063/1.2172216
 Jesse, S, Lee, HN, Kalinin, SV: Quantitative mapping of switching behavior in piezoresponse force microscopy. Rev Sci Instrum 77(7), 073702 (2006). doi:10.1063/1.2214699
 Rodriguez, BJ, Jesse, S, Alexe, M, Kalinin, SV: Spatially resolved mapping of polarization switching behavior in nanoscale ferroelectrics. Adv Mater 20, 109 (2008). doi:10.1002/adma.200700473
 Jesse, S, Rodriguez, BJ, Choudhury, S, Baddorf, AP, Vrejoiu, I, Hesse, D, Alexe, M, Eliseev, EA, Morozovska, AN, Zhang, J, Chen, LQ, Kalinin, SV: Direct imaging of the spatial and energy distribution of nucleation centres in ferroelectric materials. Nat Mater 7(3), 209–215 (2008). doi:10.1038/nmat2114
 Tan, Z, Roytburd, AL, Levin, I, Seal, K, Rodriguez, BJ, Jesse, S, Kalinin, SV, Baddorf, AP: Piezoelectric response of nanoscale PbTiO_{3} in composite PbTiO_{3}−CoFe_{2}O_{4} epitaxial films. Appl Phys Lett 93, 074101 (2008). doi:10.1063/1.2969038
 Rodriguez, BJ, Choudhury, S, Chu, YH, Bhattacharyya, A, Jesse, S, Seal, K, Baddorf, AP, Ramesh, R, Chen, LQ, Kalinin, SV: Unraveling deterministic mesoscopic polarization switching mechanisms: spatially resolved studies of a tilt grain boundary in bismuth ferrite. Adv Funct Mater 19(13), 2053–2063 (2009). doi:10.1002/adfm.200900100
 Seal, K, Jesse, S, Nikiforov, MP, Kalinin, SV, Fujii, I, Bintachitt, P, Trolier-McKinstry, S: Spatially resolved spectroscopic mapping of polarization reversal in polycrystalline ferroelectric films: crossing the resolution barrier. Phys Rev Lett 103(5), 057601 (2009). doi:10.1103/PhysRevLett.103.057601
 Wicks, S, Seal, K, Jesse, S, Anbusathaiah, V, Leach, S, Garcia, RE, Kalinin, SV, Nagarajan, V: Collective dynamics in nanostructured polycrystalline ferroelectric thin films using local time-resolved measurements and switching spectroscopy. Acta Mater 58(1), 67–75 (2010). doi:10.1016/j.actamat.2009.08.057
 Rodriguez, BJ, Jesse, S, Bokov, AA, Ye, ZG, Kalinin, SV: Mapping bias-induced phase stability and random fields in relaxor ferroelectrics. Appl Phys Lett 95, 9 (2009). doi:10.1063/1.3222868
 Rodriguez, BJ, Jesse, S, Morozovska, AN, Svechnikov, SV, Kiselev, DA, Kholkin, AL, Bokov, AA, Ye, ZG, Kalinin, SV: Real space mapping of polarization dynamics and hysteresis loop formation in relaxor-ferroelectric PbMg_{1/3}Nb_{2/3}O_{3}-PbTiO_{3} solid solutions. J Appl Phys 108(4), 042006 (2010). doi:10.1063/1.3474961
 Rodriguez, BJ, Jesse, S, Kim, J, Ducharme, S, Kalinin, SV: Local probing of relaxation time distributions in ferroelectric polymer nanomesas: time-resolved piezoresponse force spectroscopy and spectroscopic imaging. Appl Phys Lett 92(23), 232903 (2008). doi:10.1063/1.2942390
 Kalinin, SV, Rodriguez, BJ, Jesse, S, Morozovska, AN, Bokov, AA, Ye, ZG: Spatial distribution of relaxation behavior on the surface of a ferroelectric relaxor in the ergodic phase. Appl Phys Lett 95(14), 142902 (2009). doi:10.1063/1.3242011
 Bintachitt, P, Jesse, S, Damjanovic, D, Han, Y, Reaney, IM, Trolier-McKinstry, S, Kalinin, SV: Collective dynamics underpins Rayleigh behavior in disordered polycrystalline ferroelectrics. Proc Natl Acad Sci U S A 107(16), 7219–7224 (2010). doi:10.1073/pnas.0913172107
 Griggio, F, Jesse, S, Kumar, A, Marincel, DM, Tinberg, DS, Kalinin, SV, Trolier-McKinstry, S: Mapping piezoelectric nonlinearity in the Rayleigh regime using band excitation piezoresponse force microscopy. Appl Phys Lett 98(21), 212901 (2011). doi:10.1063/1.3593138
 Jesse, S, Maksymovych, P, Kalinin, SV: Rapid multidimensional data acquisition in scanning probe microscopy applied to local polarization dynamics and voltage dependent contact mechanics. Appl Phys Lett 93(11), 112903 (2008). doi:10.1063/1.2980031
 Maksymovych, P, Balke, N, Jesse, S, Huijben, M, Ramesh, R, Baddorf, AP, Kalinin, SV: Defect-induced asymmetry of local hysteresis loops on BiFeO_{3} surfaces. J Mater Sci 44(19), 5095–5101 (2009). doi:10.1007/s108530093697z
 Anbusathaiah, V, Jesse, S, Arredondo, MA, Kartawidjaja, FC, Ovchinnikov, OS, Wang, J, Kalinin, SV, Nagarajan, V: Ferroelastic domain wall dynamics in ferroelectric bilayers. Acta Mater 58(16), 5316–5325 (2010). doi:10.1016/j.actamat.2010.06.004
 McLachlan, MA, McComb, DW, Ryan, MP, Morozovska, AN, Eliseev, EA, Payzant, EA, Jesse, S, Seal, K, Baddorf, AP, Kalinin, SV: Probing local and global ferroelectric phase stability and polarization switching in ordered macroporous PZT. Adv Funct Mater 21(5), 941–947 (2011). doi:10.1002/adfm.201002038
 Kim, Y, Kumar, A, Tselev, A, Kravchenko, II, Han, H, Vrejoiu, I, Lee, W, Hesse, D, Alexe, M, Kalinin, SV: Nonlinear phenomena in multiferroic nanocapacitors: Joule heating and electromechanical effects. ACS Nano 5(11), 9104–9112 (2011). doi:10.1021/nn203342v
 Nikiforov, MP, Gam, S, Jesse, S, Composto, RJ, Kalinin, SV: Morphology mapping of phase-separated polymer films using nanothermal analysis. Macromolecules 43(16), 6724–6730 (2010). doi:10.1021/ma1011254
 Nikiforov, MP, Hohlbauch, S, King, WP, Voitchovsky, K, Contera, SA, Jesse, S, Kalinin, SV, Proksch, R: Temperature-dependent phase transitions in zeptoliter volumes of a complex biological membrane. Nanotechnology 22(5), 055709 (2011). doi:10.1088/09574484/22/5/055709
 Balke, N, Jesse, S, Kim, Y, Adamczyk, L, Tselev, A, Ivanov, IN, Dudney, NJ, Kalinin, SV: Real space mapping of Li-ion transport in amorphous Si anodes with nanometer resolution. Nano Lett 10(9), 3420–3425 (2010). doi:10.1021/nl101439x
 Guo, S, Jesse, S, Kalnaus, S, Balke, N, Daniel, C, Kalinin, SV: Direct mapping of ion diffusion times on LiCoO_{2} surfaces with nanometer resolution. J Electrochem Soc 158(8), A982–A990 (2011). doi:10.1149/1.3604759
 Ovchinnikov, O, Jesse, S, Guo, S, Seal, K, Bintachitt, P, Fujii, I, Trolier-McKinstry, S, Kalinin, SV: Local measurements of Preisach density in polycrystalline ferroelectric capacitors using piezoresponse force spectroscopy. Appl Phys Lett 96(11), 112906 (2010). doi:10.1063/1.3360220
 Guo, S, Ovchinnikov, OS, Curtis, ME, Johnson, MB, Jesse, S, Kalinin, SV: Spatially resolved probing of Preisach density in polycrystalline ferroelectric thin films. J Appl Phys 108(8), 084103–084110 (2010). doi:10.1063/1.3493738
 Balke, N, Jesse, S, Kim, Y, Adamczyk, L, Ivanov, IN, Dudney, NJ, Kalinin, SV: Decoupling electrochemical reaction and diffusion processes in ionically-conductive solids on the nanometer scale. ACS Nano 4(12), 7349–7357 (2010). doi:10.1021/nn101502x
 Vasudevan, R, Liu, Y, Li, J, Liang, WI, Kumar, A, Jesse, S, Chen, YC, Chu, YH, Valanoor, N, Kalinin, SV: Nanoscale control of phase variants in strain-engineered BiFeO_{3}. Nano Lett 11(8), 3346–3354 (2011). doi:10.1021/nl201719w
 Arruda, TM, Kumar, A, Kalinin, SV, Jesse, S: Mapping irreversible electrochemical processes on the nanoscale: ionic phenomena in Li ion conductive glass ceramics. Nano Lett 11(10), 4161–4167 (2011). doi:10.1021/nl202039v
 Kumar, A, Ovchinnikov, OS, Funakubo, H, Jesse, S, Kalinin, SV: Real-space mapping of dynamic phenomena during hysteresis loop measurements: dynamic switching spectroscopy piezoresponse force microscopy. Appl Phys Lett 98(20), 202903 (2011). doi:10.1063/1.3590919
 Kumar, A, Ciucci, F, Morozovska, AN, Kalinin, SV, Jesse, S: Measuring oxygen reduction/evolution reactions on the nanoscale. Nat Chem 3(9), 707–713 (2011). doi:10.1038/nchem.1112
Copyright
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.