Skip to main content

Feature extraction via similarity search: application to atom finding and denoising in electron and scanning probe microscopy imaging


We develop an algorithm for feature extraction based on structural similarity and demonstrate its application for atom and pattern finding in high-resolution electron and scanning probe microscopy images. The use of the combined local identifiers formed from an image subset and appended Fourier, or other transform, allows tuning selectivity to specific patterns based on the nature of the recognition task. The proposed algorithm is implemented in Pycroscopy, a community-driven scientific data analysis package, and is accessible through an interactive Jupyter notebook available on GitHub.


Recent advances in (scanning) transmission electron microscopy (STEM) and scanning probe microscopy (SPM) made atomically resolved imaging of solids and surfaces routine [1,2,3,4,5]. STEM enables the visualization of the atomic structure in a broad range of materials from oxides to semiconductors and metals, and in many cases allows observation of the evolution of structure under reactive conditions or thermal stimulations [6,7,8,9,10,11]. Similarly, ultra-high vacuum (UHV) and liquid SPM modes were used to resolve atomic structures of metal and semiconductors, ad atom structures, etc. [12,13,14,15,16].

Both for (S)TEM and SPM, of interest is the fundamental analysis of materials physics and chemistry from imaging data. Indeed, until recently atomically resolved images were used solely to establish the local structure of materials and make qualitative observations on its quality, the presence of specific defects, etc. The progress in spatial resolution and related information limit enabled quantitative description of images, where (for STEM) atomic coordinates for some (or all) constitutive atoms can be extracted with picometer precision. Once available, this information can be used to reconstruct physical order parameter fields such as polarization [17,18,19,20], octahedral tilts [21,22,23], or chemical expansion [24]. Alternatively, local atomic configurations can be analyzed in an unbiased manner via statistical methods, providing information on local crystallography [25, 26]. Moreover, crystallographic information can be extracted from the shape of the atomic column [22, 27]. Parenthetically, the extraction of physical information from atomically resolved imaging data, along with 3D imaging, provides the primary stimulus for development of progressive high-resolution STEM platforms [1]. Similar approaches can be applied to scanning probe microscopy data [25, 28], albeit in this case the origin of the contrast is more complex.

These applications necessitate the development of robust and reliable techniques to extract atomic coordinates from atomically resolved images, requiring little or no human supervision. These generally require a combination of feature extraction methods with physics-based deconvolution. For STEM data, especially annular dark or bright field images, the analysis can be significantly simplified under the assumptions that the regions with maximum contrast correspond to atomic coordinates. For more complex cases, the target is the development of feature classification and feature extraction schemes. Here, we develop an approach based on sliding window decomposition and similarity search that enables fast and robust analysis of images with multiple periodic textures and can be used for denoising, feature extraction, and segmentation of data. We further note that in certain cases, the proposed algorithm leads to physically significant decompositions—however, we defer these studies to follow-on work focused on specific studies of imaging phenomena.


Figure 1a–c shows the fundamentals of the proposed approach, including denoising, feature-based clustering, and similarity search. The proposed algorithm leverages the fact that noise within different sections of the image is typically uncorrelated. Thus, images can be denoised by only retaining strongly correlated information and removing poorly correlated or uncorrelated information. To compare each section of an N × N pixel image with every other section of the image, we use an m × m pixel frame or window where N is much larger than m. For atomically resolved images, an optimal value for m can be calculated using information about the periodicity of the lattice [29]. See Figures S1 and S2 in Additional file 1 for examples of the original N × N pixel image and a corresponding m × m pixel window. The generated m × m pixel frame is slid across the original image column-by-image column and then image row-by-image row. At each position, the m × m pixels within the frame are copied to construct a stack of (N − m)2 windows each with m × m pixels.

Fig. 1
figure 1

Schematic illustrating the fundamentals of the singular value decomposition (SVD)-based image denoising technique and the pattern matching-based techniques for identifying atoms in images. a The denoising process starts with sliding a small window across the given image column-by-column and then row-by-row. b A stack of (N − m)2 windows, each with m × m pixels, is built by copying the contents of the window at each location. c This 3D stack of windows is flattened to a 2D matrix by flattening the m × m pixel windows to 1D arrays with m2 elements. SVD is performed on this 2D matrix to decompose the data into the most correlated and least uncorrelated (noise) components. The image is denoised by reconstructing the 2D matrix in c with only the most correlated SVD components and reversing the steps from c to a. d K-means clustering on the SVD results groups pixels exhibiting similar trends together in a cluster label map. Representative examples of repeating patterns or motifs in the label map are selected for pattern matching. e Each motif is compared to every section in the label map to generate a pattern matching scores’ map. Each continuous-valued scores map is thresholded to generate binary maps with segments and the centroids of these segments provide the coordinates of the repeating patterns such as atoms

As a second step, each window in the stack is flattened from an m × m two-dimensional matrix to a one-dimensional array containing m2 elements. When it is desired to emphasize subtle changes in the atomic periodicity in certain atomically resolved images, additional information, such as the magnitude of the fast Fourier transform of the window, can be appended to the window. This combination of real-space window and corresponding FFT with certain weighting factor allows one to balance relative weight of real- and reciprocal space features and greatly improves flexibility of the analysis. We further note that for different preponderant textures, other image transforms can be used, e.g., Radon transform for analysis of the linear domain structures. Overall, this operation transforms the three-dimensional stack of windows into a large two-dimensional matrix [(N − m)2 windows, each with m2 pixels].

Subsequently, we apply singular value decomposition (SVD) to identify the most and least correlated trends in the windows. Briefly, SVD breaks up a dataset into orthogonal components arranged in decreasing order of statistical significance as

$$A_{pq} = U_{pr} S_{r} V_{qr}^{\text{T}} ,$$

where A is the dataset of interest containing p windows with q pixels, V describes the r most significant components, each containing q pixels. U describes the abundance of these r components in the p windows and S the variance or statistical importance of each of the r components.

Consequently, the first few SVD components contain the most important correlations while the last few components contain the least correlated information and are typically considered to be noise. Figures S3–S5 in Additional file 1 show results from SVD applied to a windowed dataset. The original dataset, A, can be reconstructed exactly from the above equation or partially by using a subset of the r components. Thus, the 2D (N − m)2 × m2 dataset can be reconstructed using only the most informationally significant components from SVD. Subsequently, the windowing process shown in Fig. 1a–c can be used in reverse to generate the filtered image. Figure S6 in Additional file 1 shows the reconstruction of an image using the first few SVD components. Though the windowing process is shown for a square N × N pixel image, the same procedure can be used on rectangular images as well.

The results from SVD are further used to identify atoms or atomic columns via a pattern matching approach outlined in Fig. 1d–e. We begin the atom finding process by performing k-means clustering on U or a subset of U. k-Means clustering classifies data points into k clusters by Euclidean distance such that the variance within each cluster is minimized. In other words, k-means groups data points or pixels such that pixels within the same cluster are more similar to each other than those in other clusters. Applying k-means to U results in a (spatial) map of labels where the value at each pixel is the index of the cluster that the pixel belongs to k-means can also be applied to a subset of U to discount SVD components whose eigenvectors do not exhibit regular patterns. These components often contain information regarding long-range features (e.g., − drift), instrument noise (e.g., − 60 Hz noise), etc. Such manual selection of components in U can better enable k-means to capture the desired features. k-means requires k to be specified a priori and it is a challenge to determine an appropriate value of k that best represents the data. Hence, we ‘over-cluster’ the dataset, or choose a large value for k (e.g., – 24 to 60) to allow k-means to capture the finer nuances in the image, such as phase boundaries, in the original image.

Next, we manually select t square or rectangular windows in the denoised image that are centered on repeating patterns, such as atoms or atomic columns. Separate motifs are selected to represent each of the families of atoms or atomic columns. The coordinates of these windows are used to extract a corresponding set of t motifs from the spatial map of cluster labels obtained from k-means. The spatial abundance of each motif is calculated by scanning the motif across the spatial map of cluster labels, image column by image column and then image row by image row. For a given motif at a given location on the cluster label map, the ‘matching score’ is calculated as the number of pixels in the motif that match with the current window in the cluster label map. This matching score is divided by the number of pixels in the motif such that the score always ranges from 0 to 1. In the event that two motifs identify the same set of atoms, one of the motifs is removed. Additional motifs may be required to identify those atoms that are not captured by the original set of motifs. One example is the case where drift in the microscope results in distortions in the shapes of atoms in certain sections of the image.

This matching process results in t spatial maps of matching scores corresponding to each of the t motifs. These continuous-valued spatial maps of matching scores are manually thresholded to generate t binary maps where the score is set to 1 if the matching score is greater than the threshold and 0 otherwise. The threshold values for each pattern are manually chosen such that the number of matched areas is maximized while minimizing any overlap with segments from other patterns. For atomically resolved images, the coordinates of atoms can be estimated by calculating the centroid of each segment from the thresholded maps. When the same atom is identified for multiple motifs, supervised machine learning techniques such as K-nearest neighbors are used to remove duplicates and assign the atom to the correct motif. Subsequently, a variety of approaches can be used to refine the positions of the atoms for further analysis as necessary [23, 25, 28].


The selection of the appropriate number of components to reconstruct the image plays a very important role in the denoising process. Figure 2 shows the denoising process applied to an atomically resolved image obtained from a scanning transmission electron microscope (STEM). The image shows a clear lattice structure with some bright and some dim spots signifying atomic columns. As is the case in most annular dark-field STEM images, certain columns of atoms result in a signal that is often comparable to or below the noise floor, which makes it very challenging to identify these atomic columns. After applying the windowing algorithm, we applied SVD to the windowed dataset and reconstructed images with varying number of SVD components. Figure 2h shows that variance, or statistical significance, of components decreases exponentially with the number of components. In fact, the first 32 of the 180,625 components contain nearly all the statistically and physically relevant information. Reconstructing the dataset with only the first eight components resulted in an image with minimal noise. Though the dominant lattice structure is certainly visible in the cleaned image in Fig. 2i, l shows that finer shifts in the atomic positions are lost. The fast Fourier transform (FFT) of these images shows that the cleaned image only captures the signal from a few lower order peaks while information from several higher order peaks are lost. When reconstructing the dataset with the first 700 components, we see that the cleaned image is comparable to the original image since only a few components were removed during the reconstruction. Consequently, Fig. 2k shows that minimal information was discarded and the vast majority of information in the frequency space is retained in Fig. 2g. Reconstructing the dataset with only the first 32 components resulted in an image that retains the general lattice structure and the finer shifts in the atomic columns while discarding only the noise as seen in Fig. 2c, j. We reiterate that the image can be reconstructed using any subset of components (for example—components 2,5,8,9,15….) and not only using the first few components. Thus, it is critical that the appropriate components be chosen to reconstruct the images.

Fig. 2
figure 2

Choosing the appropriate number of SVD components to effectively denoise images. a Original atomically resolved image. Denoised images obtained by reconstruction using b 12, c 30, and d 256 SVD components. Magnitudes of the 2D FFTs of images reconstructed using e 12, f 30, and g 256 SVD components. h Variance of the SVD components showing the cut-offs for (red) 12, (green) 30, and (cyan) 256 SVD components as vertical lines. Removed noise, calculated as the difference between the original and denoised images, when reconstructing images using i 12, j 30, and k 256 SVD components. Magnitudes of the 2D FFTs of the removed noise when reconstructing images using l 12, m 30, and n 256 SVD components

Figure 3 shows this windowing and SVD-based filtering methodology applied to four atomically resolved images obtained from STEM and scanning tunneling microscopes (STMs). The sample description is provided in “Methods” section. In all cases, the original images show significant amount of noise that makes it challenging to identify certain atoms or columns. SVD on the windowed datasets resulted in components whose variance decreased exponentially with increasing number of components as seen in the variance plots. The denoised images were generated by reconstructing the windowed datasets using only the first 24–32 components from the complete 500–2000 components. In all cases, the denoised image clearly reveals the atomic columns regardless of the intensity of the signal. The removed signal mainly contains the uncorrelated noise in the original image. The denoised images are neither missing atoms nor do they have additional atoms that were not present in the original image. As this figure shows, our denoising approach works for images with different numbers of atoms, images with defects, images with shifts in the atomic periodicity caused by phase boundaries or grain boundaries (Fig. 3a, e, i), and images of surfaces with contaminants (Fig. 3m).

Fig. 3
figure 3

Image windowing and SVD-based denoising algorithm applied to four atomically resolved images. a, e, i, m Original images showing varying levels of noise, number of atoms, and presence of contaminants. b, f, j, n Variance of the SVD components and the number of components used to reconstruct images shown by the red line. c, g, k, o Denoised images obtained by reconstructing using a subset of the SVD components. d, h, l, p Noise removed from images, calculated as the difference between the original and denoised images

Following the denoising, the results from SVD on the windowed dataset can also be used to identify atoms via the pattern matching approach described above. The first and most challenging step in this process is configuring k-means clustering such that it effectively identifies subtle trends such as phase boundaries, domain walls, grain boundaries, dislocations, or cracks and segregates the dataset accordingly. Figure 4 shows results of k-means clustering applied to the U dataset obtained from SVD on the windowed dataset. Components containing long-range features and changes in background intensities were discarded in most cases for effective identification of subtle features. For the images in Fig. 4a, b, the absolute value of the FFT of the windows was appended to the windows to emphasize subtle changes in the periodicity in the image. These careful considerations enabled k-means to differentiate phase regions in the left and right side of Fig. 4a that are invisible to the human eye. Correspondingly, the cluster label map in Fig. 4i shows different patterns on the left and right separated by a red section in the center. Similarly, k-means was able to clearly identify the antiphase boundary in Fig. 4b as seen by the broad blue “crack” dividing the top and bottom of Fig. 4j. k-Means also clearly identified the grain boundary in Fig. 4c and the defects/impurities in Fig. 4d.

Fig. 4
figure 4

k-Means clustering applied to SVD results to identify trends in the dataset that can be invisible to the human eye. K-means clustering is applied to a subset of the U matrix obtained from SVD on the windowed dataset. ad Images denoised using the windowing and SVD-based denoising algorithm. eh Dendrograms showing the grouping of different clusters. The distance between two clusters is given by the height of the horizontal line connecting two clusters. il Maps of cluster labels obtained from K-means clustering. These maps show phase boundaries, cracks in the crystal structure, grain boundaries, and locations of contaminants in the image

Following successful clustering, this pattern matching technique can be applied to identify and differentiate atoms as shown in Fig. 5. The denoised image in Fig. 5a was reconstructed by carefully choosing a subset of the first 28 SVD components. k-Means clustering was performed on the same set of components to provide the map of cluster labels shown in Fig. 5b. Four motifs, centered on the two different classes of atomic columns, were manually chosen from the cluster label map, demarcated by the black squares in Fig. 5b and also shown in Fig. 5c. Figure 5d shows the overlay of the normalized matching scores for each of the four motifs and the atoms identified by different motifs are differentiated by color. We observed minimal overlap or mis-identification of the atoms at this stage and only 4 of the 300+ atoms were not identified due to artifacts in the center of the k-means cluster map. Independently thresholding the pattern matching scores maps resulted in binary maps shown in Fig. 5e. The coordinates of the atoms were calculated via the centroid of each segment in Fig. 5e.

Fig. 5
figure 5

Pattern matching approach for identifying and classifying atoms. a Image denoised using the denoising algorithm presented in this paper. b Map of cluster labels obtained from k-means clustering on a subset of the components obtained via SVD on the windowed dataset. Patterns to be used for pattern matching are shown by the black squares in b and are expanded in c. d Overlays of pattern matching scores ranging from 0 to 1 for each pattern. Results for the different motifs are distinguished by color. The color is set to transparent for a matching score of 0 and to the solid color corresponding to the motif for a matching score of 1. e Binary maps of pattern matching scores after thresholding to either 0 or 1. f Positions of the atoms identified for each motif obtained from the centroids of the segments in e. The d pattern matching scores, e thresholded scores, and f atom positions have been superimposed over a black-and-white map of the denoised image shown in a


The microscopy community has developed and used a variety of techniques for denoising images and finding the positions of atoms and atomic columns in images since the inception of the SPMs and STEMs [30]. Among the many techniques for denoising atomically resolved images, Gaussian blurring, filtering in the Fourier space, averaging over multiple unit cells, and averaging over a stack of images are some of the most commonly used methods. Most of these techniques are fast and simple but have major shortcomings. For example, filtering in the Fourier space is prone to adding additional atoms at vacancies, removing displacements in atomic positions, etc. Furthermore, it is challenging to further fine-tune the filter since the frequencies with significant information vary from image to image and are not known a priori. Cross-correlation and phase-correlation methods typically require a stack of multiple images and cannot work on a single image like our technique. The current state-of-art technique for image denoising is a non-local means (NLM) [31] technique called block-matching and 3D filtering (BM3D) [32], which identifies windows or patches that are similar, performs 3D wavelet denoising on similar patches and finally applies a Wiener filter. BM3D also shares the same shortcomings as frequency space filtering. Additional file 1: Figure S7–S10 shows the results from popular image filtering techniques. Additional file 1: Figure S11 compares the best results from five image denoising techniques. We observe that our technique is substantially better than all other techniques, including BM3D, at effectively removing the majority of the noise while retaining all the important information regarding the lattice structure. All other techniques only remove the short-range features or high-frequency components while the noise at low-frequency components still remains in the image. Furthermore, all the other techniques tend to erase atoms with relatively low intensities easily as the strength of the filter is increased.

We also compared our atom finding technique with other conventional alternatives. Similar to the image denoising alternatives, the atom finding techniques also have their own advantages and limitations. For example, though Gaussian convolution is fast and simple, it is unable to differentiate regions with different lattice structures and often deletes atoms with lower intensities along with the noise especially in images with low signal-to-noise ratio. Window-based convolutions on the other hand perform slightly better than Gaussian convolutions at identifying atoms with relatively low intensities but at the cost of a significant number of false-positives since the method is dominated by atoms or atomic columns with relatively high intensities. Moreover, similar to Gaussian convolution, window-based convolutions are also poor at distinguishing regions with different lattice structures. We find that our technique is superior to the aforementioned techniques since it has consistently been able to correctly identify all atoms and distinguish regions. See Additional file 1: Figure S12 for more information.

Clearly, each denoising and atom finding technique has its own merits and disadvantages and a single technique may not be ideal for every image. For instance, while our technique is consistent in denoising and atom identification, our (currently) computationally intensive algorithm requires modifications to make it suitable for real-time denoising of images and identification of atoms. We find that the best solution is for such algorithms to be made available freely via open-source packages to allow researchers to adopt the algorithms that suit them the best.


We implemented methods for finding atoms and patterns in the high-resolution images based on similarity search on sliding transforms of images. This approach is universally applicable to STEM, STM, and AFM data, and can be also applied to the other feature- finding problems. The use of the identification object comprised image subset and appended Fourier transform allows tuning for increased detectability of periodic structures, and can be adapted to other characteristic morphologies, e.g., via use of Hough transforms.

All the image denoising and atom finding algorithms presented in this paper are freely available in our open-source, community-driven, python package—Pycroscopy ( The scientific workflow presented in this paper is available via a Jupyter notebook ( that allows straightforward application of the presented methodology to arbitrary images.


The image in Fig. 2 is Li0.33 La0.57 TiO3 described in [30]. Image courtesy of Miaofang Chi, Oak Ridge National Laboratory.

The image in Fig. 3b is Mo-V-M oxide described in [31, 32]. Image courtesy of Albina Borisevich, Oak Ridge National Laboratory.

The image in Fig. 3i is a simulated image.

The image in Fig. 3m is WSe2 irradiated with He ions acquired via Nion Company UltraSTEM 100 at 100 keV. Image Courtesy of Nicholas Cross and Gerd Duscher, The University of Tennessee, Knoxville.


  1. Pennycook, S.J., Kalinin, S.V.: Microscopy: hasten high resolution. Nature 515, 487–488 (2014)

    Article  Google Scholar 

  2. Pennycook, S.J., Nellist, P.D. (eds.): Scanning transmission electron microscopy: imaging and analysis. Springer, New York (2011)

    Google Scholar 

  3. Shibata, N., Findlay, S.D., Kohno, Y., Sawada, H., Kondo, Y., Ikuhara, Y.: Differential phase-contrast microscopy at atomic resolution. Nat. Phys. 8, 611–615 (2012)

    Article  Google Scholar 

  4. Huang, P.Y., Kurasch, S., Alden, J.S., Shekhawat, A., Alemi, A.A., McEuen, P.L., Sethna, J.P., Kaiser, U., Muller, D.A.: Imaging atomic rearrangements in two-dimensional silica glass: watching silica’s dance. Science 342, 224–227 (2013)

    Article  Google Scholar 

  5. Morgenstern, K., Lorente, N., Rieder, K.H.: Controlled manipulation of single atoms and small molecules using the scanning tunnelling microscope. Phys. Status Solidi B-Basic Solid State Phys. 250, 1671–1751 (2013)

    Article  Google Scholar 

  6. Pennycook, S.J., Chisholm, M.F., Lupini, A.R., Varela, M., van Benthem, K., Borisevich, A.Y., Oxley, M.P., Luo, W., Pantelides, S.T.: Advances in imaging and electron physics, vol. 153, p. 327. Elsevier Academic Press Inc., San Diego (2008)

    Google Scholar 

  7. Ishikawa, R., Mishra, R., Lupini, A.R., Findlay, S.D., Taniguchi, T., Pantelides, S.T., Pennycook, S.J.: Direct observation of dopant atom diffusion in a bulk semiconductor crystal enhanced by a large size mismatch. Phys. Rev. Lett. 113, 155501 (2014)

    Article  Google Scholar 

  8. Yang, Z.Q., Yin, L.C., Lee, J., Ren, W.C., Cheng, H.M., Ye, H.Q., Pantelides, S.T., Pennycook, S.J., Chisholm, M.F.: Direct observation of atomic dynamics and silicon doping at a topological defect in graphene. Angew Chem. Int. Edit. 53, 8908–8912 (2014)

    Article  Google Scholar 

  9. Komsa, H.P., Kotakoski, J., Kurasch, S., Lehtinen, O., Kaiser, U., Krasheninnikov, A.V.: Two-dimensional transition metal dichalcogenides under electron irradiation: defect production and doping. Phys. Rev. Lett. 109, 035503 (2012)

    Article  Google Scholar 

  10. Susi, T., Meyer, J.C., Kotakoski, J.: Manipulating low-dimensional materials down to the level of single atoms with electron irradiation. Ultramicroscopy 180, 163–172 (2017)

    Article  Google Scholar 

  11. Zheng, H.M., Rivest, J.B., Miller, T.A., Sadtler, B., Lindenberg, A., Toney, M.F., Wang, L.W., Kisielowski, C., Alivisatos, A.P.: Observation of transient structural-transformation dynamics in a Cu2S nanorod. Science 333, 206–209 (2011)

    Article  Google Scholar 

  12. Gerber, C., Lang, H.P.: How the doors to the nanoworld were opened. Nat. Nanotechnol. 1, 3–5 (2006)

    Article  Google Scholar 

  13. Binnig, G., Quate, C.F., Gerber, C.: Atomic force microscope. Phys. Rev. Lett. 56, 930–933 (1986)

    Article  Google Scholar 

  14. Binnig, G., Rohrer, H., Gerber, C., Weibel, E.: 7X7 reconstruction on Si(111) resolved in real space. Phys. Rev. Lett. 50, 120–123 (1983)

    Article  Google Scholar 

  15. Fukuma, T., Kobayashi, K., Matsushige, K., Yamada, H.: True molecular resolution in liquid by frequency-modulation atomic force microscopy. Appl. Phys. Lett. 86, 193108 (2005)

    Article  Google Scholar 

  16. Fukuma, T., Onishi, K., Kobayashi, N., Matsuki, A., Asakawa, H.: Atomic-resolution imaging in liquid by frequency modulation atomic force microscopy using small cantilevers with megahertz-order resonance frequencies. Nanotechnology 23, 135706 (2012)

    Article  Google Scholar 

  17. Nelson, C.T., Winchester, B., Zhang, Y., Kim, S.J., Melville, A., Adamo, C., Folkman, C.M., Baek, S.H., Eom, C.B., Schlom, D.G., Chen, L.Q., Pan, X.Q.: Spontaneous vortex nanodomain arrays at ferroelectric heterointerfaces. Nano Lett. 11, 828–834 (2011)

    Article  Google Scholar 

  18. Jia, C.L., Mi, S.B., Urban, K., Vrejoiu, I., Alexe, M., Hesse, D.: Atomic-scale study of electric dipoles near charged and uncharged domain walls in ferroelectric films. Nat. Mater. 7, 57–61 (2008)

    Article  Google Scholar 

  19. Jia, C.L., Nagarajan, V., He, J.Q., Houben, L., Zhao, T., Ramesh, R., Urban, K., Waser, R.: Unit-cell scale mapping of ferroelectricity and tetragonality in epitaxial ultrathin ferroelectric films. Nat. Mater. 6, 64–69 (2007)

    Article  Google Scholar 

  20. Borisevich, A.Y., Chang, H.J., Huijben, M., Oxley, M.P., Okamoto, S., Niranjan, M.K., Burton, J.D., Tsymbal, E.Y., Chu, Y.H., Yu, P., Ramesh, R., Kalinin, S.V., Pennycook, S.J.: Suppression of octahedral tilts and associated changes in electronic properties at epitaxial oxide heterostructure interfaces. Phys. Rev. Lett. 105, 087204 (2010)

    Article  Google Scholar 

  21. Jia, C.L., Mi, S.B., Faley, M., Poppe, U., Schubert, J., Urban, K.: Oxygen octahedron reconstruction in the SrTiO(3)/LaAlO(3) heterointerfaces investigated using aberration-corrected ultrahigh-resolution transmission electron microscopy. Phys. Rev. B 79, 081405 (2009)

    Article  Google Scholar 

  22. Borisevich, A., Ovchinnikov, O.S., Chang, H.J., Oxley, M.P., Yu, P., Seidel, J., Eliseev, E.A., Morozovska, A.N., Ramesh, R., Pennycook, S.J., Kalinin, S.V.: Mapping octahedral tilts and polarization across a domain wall in BiFeO(3) from Z-contrast scanning transmission electron microscopy image atomic column shape analysis. ACS Nano. 4, 6071–6079 (2010)

    Article  Google Scholar 

  23. Kim, Y.M., Kumar, A., Hatt, A., Morozovska, A.N., Tselev, A., Biegalski, M.D., Ivanov, I., Eliseev, E.A., Pennycook, S.J., Rondinelli, J.M., Kalinin, S.V., Borisevich, A.Y.: Interplay of octahedral tilts and polar order in BiFeO3 films. Adv. Mater. 25, 2497–2504 (2013)

    Article  Google Scholar 

  24. Kumar, A., Leonard, D., Jesse, S., Ciucci, F., Eliseev, E.A., Morozovska, A.N., Biegalski, M.D., Christen, H.M., Tselev, A., Mutoro, E., Crumlin, E.J., Morgan, D., Shao-Horn, Y., Borisevich, A., Kalinin, S.V.: Spatially resolved mapping of oxygen reduction/evolution reaction on solid-oxide fuel cell cathodes with sub-10 nm resolution. ACS Nano. 7, 3808–3814 (2013)

    Article  Google Scholar 

  25. Lin, W.Z., Li, Q., Belianinov, A., Sales, B.C., Sefat, A., Gai, Z., Baddorf, A.P., Pan, M.H., Jesse, S., Kalinin, S.V.: Local crystallography analysis for atomically resolved scanning tunneling microscopy images. Nanotechnology 24, 415707 (2013)

    Article  Google Scholar 

  26. He, Q., Woo, J., Belianinov, A., Guliants, V.V., Borisevich, A.Y.: Better catalysts through microscopy: mesoscale M1/M2 intergrowth in molybdenum-vanadium based complex oxide catalysts for propane ammoxidation. ACS Nano. 9, 3470–3478 (2015)

    Article  Google Scholar 

  27. He, Q., Ishikawa, R., Lupini, A.R., Qiao, L., Moon, E.J., Ovchinnikov, O., May, S.J., Biegalski, M.D., Borisevich, A.Y.: Towards 3D mapping of BO6 octahedron rotations at perovskite heterointerfaces, unit cell by unit cell. ACS Nano. 9, 8412–8419 (2015)

    Article  Google Scholar 

  28. Gai, Z., Lin, W.Z., Burton, J.D., Fuchigami, K., Snijders, P.C., Ward, T.Z., Tsymbal, E.Y., Shen, J., Jesse, S., Kalinin, S.V., Baddorf, A.P.: Chemically induced Jahn–Teller ordering on manganite surfaces. Nat. Commun 5, 4528 (2014)

    Article  Google Scholar 

  29. Vasudevan, R.K., Ziatdinov, M., Jesse, S., Kalinin, S.V.: Phases and interfaces from real space atomically resolved data: physics-based deep data image analysis. Nano Lett. 16, 5574–5581 (2016)

    Article  Google Scholar 

  30. Voyles, P.M.: Informatics and data science in materials microscopy. Curr. Opin. Solid State Mater. Sci. 21, 141–158 (2017)

    Article  Google Scholar 

  31. Buades, A., Coll, B., Morel, J.-M.: A non-local algorithm for image denoising. IEEE 2, 60–65 (2005)

    Google Scholar 

  32. Dabov, K., Foi, A., Katkovnik, V., Egiazarian, K.: Image denoising with block-matching and 3 D filtering. IEEE 6064, 606414 (2006)

    Google Scholar 

Download references

Authors’ contributions

SJ, SK, and SS conceived the original ideas for the algorithms. SS and CRS implemented the algorithms in python, analyzed the data, and generated the figures. SS and SK prepared the manuscript. MC, AB, SJ, NC, and GD provided the atomically resolved images. All authors read and approved the final manuscript.


This research was sponsored by the Division of Materials Sciences and Engineering, Basic Energy Sciences, US Department of Energy (SVK and SS). This research was conducted and partially supported (CRS) at the Center for Nanophase Materials Sciences, which is an US DOE Office of Science User Facility. This research was supported by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC for the US Department of Energy (S.J.). We gratefully acknowledge Ondrej Dyck’s assistance in identifying the filtering parameters for an atomically resolved image.

Notice: This manuscript has been authored by UT-Battelle, LLC, under Contract No. DE-AC0500OR22725 with the US Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for the United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (

Competing interests

The authors declare that they have no competing interests.

Availability of data and materials

All the data and algorithms used in this paper are available on the public software repository-pycroscopy at

Ethics approval and consent to participate

Not applicable.


This research was sponsored by the Division of Materials Sciences and Engineering, Basic Energy Sciences, US Department of Energy. Part of this research was funded by Center for Nanophase Materials Sciences, which is an US DOE Office of Science User Facility. This research was supported by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC for the US Department of Energy.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Stephen Jesse.

Additional file

Additional file 1.

Additional figures.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Somnath, S., Smith, C.R., Kalinin, S.V. et al. Feature extraction via similarity search: application to atom finding and denoising in electron and scanning probe microscopy imaging. Adv Struct Chem Imag 4, 3 (2018).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI:


  • Electron microscopy
  • Image denoising
  • Atom finding
  • Scanning transmission electron microscopy
  • Scanning tunneling microscopy
  • Singular value decomposition
  • Clustering
  • Statistics