Applying shot boundary detection for automated crystal growth analysis during in situ transmission electron microscope experiments
© The Author(s) 2016
Received: 15 September 2016
Accepted: 4 December 2016
Published: 3 January 2017
In situ scanning transmission electron microscopy is being developed for numerous applications in the study of nucleation and growth under electrochemical driving forces. For this type of experiment, one of the key tasks is to identify when nucleation initiates. Typically, identifying the moment that crystals begin to form is a manual process requiring the user to observe the sample and respond accordingly (adjust focus, magnification, translate the stage, etc.). However, as the speed of the cameras used for these observations increases, the ability of a user to “catch” the important initial stage of nucleation decreases, because more information becomes available in the first few milliseconds of the process. Here, we show that video shot boundary detection can automatically detect frames where a change in the image occurs. We show that this method can be applied to quickly and accurately identify points of change during crystal growth. This technique allows for automated segmentation of a digital stream for further analysis and the assignment of arbitrary time stamps for the initiation of processes that are independent of the user’s ability to observe and react.
Atomic-scale images of interfaces/defects obtained from scanning transmission electron microscopes (STEM) have long been used to provide insights into the structure–property relationships of materials. For example, observations of atomic-scale intermixing at interfaces in semiconducting/oxide heterostructures have helped explain the unique electronic and magnetic properties of these systems [1, 2]. The development and application of the STEM techniques used in these and other studies (for example, [3–9]) start from the premise that the atoms in the structure do not move. However, the systems that are being developed for many novel energy technologies are far removed from this paradigm: their intrinsic functionality is wholly dependent on the motion of atoms. For example, in Li-ion batteries, the charge/discharge cycle involves the mobility of ions across the electrolyte–electrode interface. To identify the key aspects of the complex processes and transients occurring in energy technologies, we must therefore develop in situ or operando methods that allow us to observe directly the functions of the system taking place during operation of the device.
Currently, image capture and analysis are performed manually: the user starts the camera and watches for any change in the images as they are recorded. This is a time-consuming process that requires frames to be individually analyzed to identify regions of interest. However, this type of problem, the identification of where and when in a series of frames there is a change, lends itself to automation. Recent trends in digital and streaming media have rapidly introduced a number of techniques that can be used to automate the analysis of videos. These techniques have become increasingly important to streaming content providers looking to improve video search, indexing, and retrieval. In order to perform automated analysis of a video, it is typically segmented into a hierarchy of shots. A shot is a group of frames that make up a single camera action. This segmentation process, referred to as shot boundary detection (SBD), allows for further analysis of digital media by regions of similar content. Computational efficiency is crucial to video segmentation in order to provide timely feedback. Previous work has evaluated the performance of segmentation techniques based on the video domain, type of transition, and type of detection feature [17–19]. This provides a baseline for choosing and evaluating suitable techniques for the type of data typically produced by STEM.
Video is typically stored and transmitted in a compressed format, such as one of the moving picture experts group (MPEG) standards. While these compressed formats are convenient for storage and streaming, they are computationally expensive to decompress for the purposes of analysis. In the case of STEM, where image data are captured at a rate of hundreds or thousands of frames per second, the expense of decoding the video grows very quickly. In this case, performing analysis of the compressed stream directly becomes an attractive option to increase efficiency. In this paper, we demonstrate analysis performed directly on the compressed data stream. The example we use is the identification of the electrodeposition of Li during charge/discharge of a Li battery. The example identifies the onset of the deposition/first nucleation stages of Li metal that can be correlated with a specific voltage value controlling these changes. The potential to extend this form of compressed analysis to also identify where in the frame the process takes place first (adding a spatial coordinate to the temporal one) will also be discussed.
The in situ electrochemical STEM experiments were performed on an FEI 80–300 kV Cs-corrected Titan microscope equipped with a Schottky field-emission electron source, a monochromator, and a CEOS hexapole spherical probe aberration corrector. For these experiments, the microscope was operated at 300 keV in both bright-field (BF) and high-angle annular dark-field (HAADF) modes (Fig. 1b; Additional file 1). All images were obtained after calibration of the dose, and the dose was kept at or below 0.3 electrons/Å²/s to avoid beam damage effects. All the electrochemical measurements were performed with a commercially available Poseidon 500 (Protochips Inc., Raleigh, NC, USA) microfluidic in situ electrochemical stage, which allows for simultaneous observation of dynamic electrochemical measurements in the liquid environment. Figure 1a illustrates the in situ liquid electrochemical STEM (ec-STEM) cell used for Li dendrite deposition/stripping in 1 M LiPF6 in PC electrolyte with a trace amount of water, as shown in Fig. 1b. The in situ liquid ec-STEM cell is made from two silicon microchips containing 50-nm-thick silicon nitride membranes transparent to the electron beam and three Pt microelectrodes, aligned parallel to each other. The top electrochemical microchip has a 500 nm SU-8 spacer and the bottom microchip has a 150 nm gold spacer, giving a nominal spacing of 650 nm. The electron beam passes through the electrolyte and the two SixNy membranes, which allows the Li dendrite growth and dissolution to be recorded in real time at high spatial and temporal resolution during cyclic voltammetry or galvanostatic charge/discharge in both TEM and STEM modes at a 2–3 µL/min flow rate.
All the cyclic voltammetry experiments were conducted with a Gamry Reference 600 potentiostat and synchronized with simultaneous recording of the video sequence of the Li dendrite deposition/dissolution process at the Pt electrode from LiPF6 in PC electrolyte in the in situ ec-STEM cell.
Many techniques exist that aim to directly handle compressed video streams for quick and efficient processing. These techniques rely on the reduced signal and coefficients produced as part of the compression process. The coefficients generated directly relate to the original uncompressed signal and can be used to detect transitions in a video. While there are numerous ways for video frames (scenes) to transition, they can typically be categorized as either a cut or a gradual transition. A cut occurs when a scene ends in one frame and a new scene begins in the next frame. Gradual transitions are a change between two scenes where the content of one shot is slowly replaced with that of the next over several frames. Both of these types of shot boundaries can come in many different forms. In the case of crystal growth detection, we expect that after the initial nucleation event (a cut), a gradual transition will take place as the material grows. This makes the gradual transition the most common type of transition in these data and hence the primary focus of this work. An added complication for this type of experiment is that the object of focus (here the size and shape of the Li grains) tends to change over the course of several frames as the experiment is performed. With these types of gradual transitions, it is important to consider differences over a window of time. The window size varies depending on the speed and type of the transition. A general window size can be chosen to fit the transition type as well as the type of data observed.
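The role of the window can be illustrated with a minimal sketch. It is not the detector used in this work: the function name `windowed_change`, the window of five frames, and the synthetic signal are all assumptions chosen for illustration. The sketch compares the mean of a per-frame motion level in the frames before and after each position, so that a slow ramp spread over many frames still produces a clear response.

```python
import numpy as np

def windowed_change(signal, window=5):
    """Mean of the `window` samples from index i onward minus the mean of
    the `window` samples before it. Positions too close to either end of
    the signal are left at zero."""
    x = np.asarray(signal, dtype=float)
    out = np.zeros(len(x))
    for i in range(window, len(x) - window + 1):
        before = x[i - window:i].mean()
        after = x[i:i + window].mean()
        out[i] = after - before
    return out

# Synthetic per-frame motion level (hypothetical data):
# flat, then a gradual ramp over 10 frames, then flat again.
signal = np.concatenate([np.zeros(20), np.linspace(0.0, 1.0, 10), np.ones(20)])
change = windowed_change(signal, window=5)
peak_frame = int(np.argmax(change))  # falls inside the ramp
```

A frame-to-frame difference of the same ramp never exceeds 1/9 of the total change, whereas the windowed difference reaches more than half of it. A longer window suits slower transitions at the cost of coarser localization.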
The MPEG standard provides a set of guidelines for video compression and transmission of video at a variable bitrate. The standard makes use of two techniques to achieve compression: block-based motion compensation and the discrete cosine transform (DCT). These techniques take advantage of the spatial and temporal redundancy within a sequence of frames to reduce the amount of data necessary to reconstruct the video. The foundational component of a video is a frame. A frame is an image of a given width and height that represents one step in a video. Frames often contain regions of similar visual content. Storing the values for each individual pixel in an image is costly and unnecessary. To eliminate these redundant data, the image is divided into small blocks called macroblocks (MB), to which the DCT is applied. The transformation produces a matrix of coefficients that represent each block of data. In order to further minimize the amount of data stored, an additional technique called quantization is applied. Quantization reduces the transformation coefficient data to the smallest amount necessary to reconstruct each block. This additional step is designed to limit the frequencies stored for the image while reducing many of the frequency components to zero for optimal compression.
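The effect of the transform and quantization steps can be sketched on a single block. The example below is illustrative only: it uses an 8 × 8 block, an orthonormal DCT-II built with NumPy, and one uniform quantization step of 16, whereas a real MPEG encoder uses a quantization matrix and a rate-dependent scale. The block contents and all function names are assumptions.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)        # DC row has its own normalization
    return m

def dct2(block):
    """Two-dimensional DCT of a square block."""
    d = dct_matrix(block.shape[0])
    return d @ block @ d.T

def idct2(coeffs):
    """Inverse of dct2 (the basis matrix is orthonormal)."""
    d = dct_matrix(coeffs.shape[0])
    return d.T @ coeffs @ d

def quantize(coeffs, step=16.0):
    """Uniform quantization: divide by the step and round to integers."""
    return np.round(coeffs / step).astype(int)

# A smooth block: a horizontal intensity ramp repeated on every row.
block = np.tile(np.linspace(100.0, 140.0, 8), (8, 1))
coeffs = dct2(block)
q = quantize(coeffs, step=16.0)
reconstructed = idct2(q * 16.0)       # lossy reconstruction
nonzero = int(np.count_nonzero(q))    # only a few coefficients survive
```

For smooth content such as this ramp, nearly all quantized coefficients are zero, which is what makes the block cheap to store. The reconstruction differs slightly from the input, which is the lossy part of the process.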
A video is composed of a series of frames which, when played back at a certain frame rate, provide visually fluid motion. Frames in a video typically share data with one or more other frames. To eliminate the need to store this content for individual frames, special frames called prediction frames are used. These prediction frames (P-frames) reference other frames, or MB within a frame, which can be found before or after the predicted frame. Frames that do not reference other frames are referred to as intra-coded frames (I-frames). Frame references are calculated during a phase of the encoding process called motion estimation. The result of the motion estimation step is a model called the motion vector that describes the offset of coordinates shared between prediction and reference frames.
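Motion estimation can be sketched as an exhaustive block-matching search. This is a simplified illustration rather than the search an MPEG encoder actually performs: the sum of absolute differences as the cost, the ±4 pixel search range, and the synthetic frames are assumptions.

```python
import numpy as np

def best_motion_vector(ref, cur, top, left, size=16, search=4):
    """Find the offset (dy, dx) into the reference frame `ref` that best
    matches the size x size block of `cur` at (top, left), using the sum
    of absolute differences (SAD) as the matching cost."""
    target = cur[top:top + size, left:left + size].astype(float)
    best, best_cost = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidate blocks that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > ref.shape[0] or x + size > ref.shape[1]:
                continue
            candidate = ref[y:y + size, x:x + size].astype(float)
            cost = np.abs(target - candidate).sum()
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost

# Synthetic frames: the current frame is the reference frame with its
# content moved 2 pixels down and 3 pixels to the left.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64))
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))
mv, cost = best_motion_vector(ref, cur, top=24, left=24)
```

Here the vector points from the block in the current frame to its match in the reference frame, so content that moved down and left is found up and to the right, at `(-2, 3)`. When no good match exists, the residual cost stays high, which is the kind of information the compressed-domain features rely on.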
Types of video transitions
The videos used here have been encoded using the MPEG-2 standard. The MPEG-encoding process generates a number of statistics for each frame of a video. The encoding information can be accessed by partially decoding the compressed video. Partially decoding the video eliminates the need to calculate the original frame pixel values. The inverse transform performed for full decoding has been found to consume as much as 40% of total decoding time. Therefore, partial decoding results in a significant time savings over other methods. For the purposes of this paper, the FFmpeg library is used to process and decode video streams. Shot boundary detection in the compressed domain makes use of features derived from the reduced signal to find change. Two types of features that can be used in change detection are frame and motion information. The frame information refers to the type of frame encoding, such as I-frame or P-frame. This is important for decision making due to the different characteristics of each type of frame. Motion information includes the motion vector as well as MB motion features, such as the sum of variance (SoV). The SoV of each MB is used by the encoder to measure the amount of motion within the MB. This MB motion information not only determines how the encoder will encode the MB, but also serves as an indicator of the amount of change occurring within each block.
With the encoded video, the frame and motion information can be extracted. Frames are analyzed separately by frame type to take advantage of characteristics specific to each type. As previously discussed, predicted frames contain motion information which varies in size depending on the degree of change. Compared to P-frames, intra-coded frames (I-frames) have minimal motion information due to their limited relation to other frames. Motion information can be used to characterize the amount of change occurring within a frame. Scenes will have different motion levels, but motion information will remain similar within a scene. The measure of the level and rate of change is used to detect change points within a sequence of frames. There are multiple types of motion information available for each frame. One type of motion information is the MB SoV, which measures the total motion within a MB. Another type of motion information is the motion vector, which has been shown to be an effective indicator of change between a series of frames. Together, the SoV and the motion vector indicate how similar a predicted frame is to its reference frame.
Results and discussion
The results in this section demonstrate the application of automated change detection techniques to STEM videos. The sample videos are discussed, including the challenges presented in the videos and encoding parameters. Next the algorithm applied to the videos is explained. This covers any assumptions made about the data as well as any defined parameters. Finally, the results of the algorithm applied to the sample videos are shown.
Before applying automated analysis, it is important to discuss the video-encoding parameters. These parameters must be carefully chosen so that the encoding algorithm produces output appropriate for analysis. The two sample videos in this case were encoded with the FFmpeg multimedia library. This library allows for full control over the video-encoding process through a series of parameters. The parameters chosen for this case encode the video as MPEG-2 using a constant frame rate (CFR). As opposed to CFR, variable frame rates (VFR) aim to eliminate similar content between frames in order to decrease the amount of data stored. Using CFR in this case reduces additional processing and allows for a fixed video quality level.
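As a rough illustration of such an encoding step, the sketch below assembles an FFmpeg command line for MPEG-2 output at a fixed frame rate. The exact parameters used for the sample videos are not listed here, so every value (30 frames per second, a group of 12 frames, no B-frames, quantizer scale 2) and both file names are assumptions, as is the helper name `mpeg2_encode_command`.

```python
def mpeg2_encode_command(src, dst, fps=30, gop=12, qscale=2):
    """Build an ffmpeg argument list for constant-frame-rate MPEG-2 output.
    All default values are illustrative assumptions, not the settings
    used for the sample videos."""
    return [
        "ffmpeg",
        "-i", src,              # input video or image sequence
        "-c:v", "mpeg2video",   # encode the video stream as MPEG-2
        "-r", str(fps),         # fixed output frame rate
        "-g", str(gop),         # distance between I-frames
        "-bf", "0",             # no B-frames: only I- and P-frames
        "-q:v", str(qscale),    # fixed quantizer scale (constant quality)
        dst,
    ]

# Hypothetical file names; pass the list to subprocess.run to execute it.
cmd = mpeg2_encode_command("stem_sequence.avi", "stem_sequence.mpg")
```

Disabling B-frames is a choice made for this sketch so that every non-intra frame is a P-frame; it is not stated as a setting of the original encoding.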
Only the P-frames are considered in this case due to the inherent lack of motion information found in I-frames. Two visually distinguishable level changes occur in this sequence. Regions of static content remain roughly level, while rapid level changes indicate the presence of a change.
In order to detect regions of change, it is necessary for background noise to be low so that transitions are easily distinguishable. To further reduce noise, we square the difference signal. Squaring the difference signal emphasizes large changes while suppressing low-amplitude noise. The result is a non-negative measure of the difference between frames. An example of the noise reduction compared to the original difference signal is shown in Fig. 6. The peaks in the difference signal make it possible to distinguish where transitions occur.
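The squaring step can be sketched as follows, assuming that the difference signal is the frame-to-frame difference of a per-frame motion measure taken over P-frames only. The motion values, the frame-type labels, and the function name `squared_difference` are hypothetical.

```python
import numpy as np

def squared_difference(frame_types, motion):
    """Keep only P-frames, difference consecutive motion values, and
    square the result. Returns (difference, squared difference)."""
    motion = np.asarray(motion, dtype=float)
    keep = np.array([t == "P" for t in frame_types])
    p_motion = motion[keep]        # I-frames carry no motion information
    diff = np.diff(p_motion)       # signed frame-to-frame change
    return diff, diff ** 2         # squaring makes the signal non-negative

# Hypothetical sequence: a quiet scene followed by a jump in motion level.
frame_types = ["I", "P", "P", "P", "P", "I", "P", "P", "P"]
motion = [0.0, 1.0, 1.2, 0.9, 1.1, 0.0, 6.0, 6.1, 5.9]
diff, sq = squared_difference(frame_types, motion)
peak = int(np.argmax(sq))          # index of the largest change
```

In this example the peak is about 16 times larger than the largest noise value in the raw difference signal and more than 250 times larger after squaring, which is why the transitions stand out more clearly.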
For each pair of adjacent points, the relevance measure R is calculated. This measures the total change contributed by each of the components. The net change, denoted as ∆y, is the change in distance between points. Since each point of the sum of squared differences represents a distance from the origin, the net change is the difference between the point values. Large distances between points indicate a large amount of change over this time. The angle is measured between the vector formed by the two points and the horizontal axis. In areas with little change, the sum of squared differences will be nearly flat, which will result in an angle near zero. For regions of large change, the signal increases rapidly, resulting in angles near 90°.
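The two components can be sketched for one pair of adjacent points. The exact expression that combines them into R is not reproduced in the text above, so the product used below is only a placeholder assumption, as are the function name `relevance` and the horizontal spacing of one frame between points.

```python
import math

def relevance(y0, y1, dx=1.0):
    """Components of change between two adjacent points of the squared
    difference signal, spaced `dx` frames apart.
    Returns (net change, angle in degrees, combined score)."""
    dy = y1 - y0                                       # net change
    angle = math.degrees(math.atan2(abs(dy), dx))      # 0 (flat) to 90 (steep)
    # ASSUMPTION: this combination is a placeholder, not the paper's R.
    score = abs(dy) * angle / 90.0
    return dy, angle, score

flat = relevance(0.04, 0.09)    # two points in a static region
jump = relevance(0.04, 24.01)   # a point followed by a sharp rise
```

The angle depends on how the vertical axis is scaled relative to the frame spacing, so a practical implementation would normalize the signal before comparing angles against a threshold.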
Change (background replacement)
Identified region boundaries
Points of change can be grouped together to form transition regions. These regions are formed by grouping together points of change occurring near one another. For this instance, changes found within ten frames of another change are used to form the region.
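The grouping rule can be sketched as a single pass over the detected change points. The frame numbers and the function name `group_changes` are hypothetical. The gap of ten frames follows the text, and a gap of exactly ten is treated here as belonging to the same region.

```python
def group_changes(change_frames, max_gap=10):
    """Merge change points that lie within `max_gap` frames of the
    previous change point into (first_frame, last_frame) regions."""
    regions = []
    for frame in sorted(change_frames):
        if regions and frame - regions[-1][1] <= max_gap:
            regions[-1][1] = frame           # extend the current region
        else:
            regions.append([frame, frame])   # start a new region
    return [tuple(region) for region in regions]

# Hypothetical change points (frame indices).
regions = group_changes([12, 15, 22, 80, 84, 200])
# regions == [(12, 22), (80, 84), (200, 200)]
```

An isolated change point yields a region whose first and last frame coincide.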
The technique described in this paper builds upon research in the area of shot boundary detection in the compressed domain. This analysis technique was chosen due to its execution speed and overall performance in detecting transitions. Other techniques exist which rely on methods such as machine learning, frame-based color histograms, and luminance values. While these techniques may have similar effectiveness in detecting changes, their runtime efficiency is significantly lower. Recent comparisons of techniques show that detection in the compressed domain can run faster than real time, while other techniques require much more computational time.
We have demonstrated that video analysis techniques used for shot boundary detection can be used to identify changes in the movies showing the Li deposition/dissolution process in the in situ ec-STEM cell. Shot boundary detection offers a wide variety of techniques that can be applied to find points of change for different types of transitions and under different conditions. These methods allow for direct operation on compressed video without the need for full-frame decoding, which reduces the computational complexity. Metrics based on differences in motion between frames in MPEG video in the compressed domain are used. A metric is developed based on the total amount of change occurring at each point, which is used to identify transition regions. The experimental results show that the method identifies the points where changes occur. These techniques could be applied to find transition points, which can aid in manual interpretation of the results, or potentially be applied to direct automatic frame capture.
The video-encoding step produces a lossy signal which is typically avoided in the microscopy community. As such, this technique is strictly used as an automated means of detection. Future work may consider applying different compression algorithms, such as the latest H.264 standards. It may also be of interest to investigate other shot boundary detection algorithms that are more computationally expensive given that the analysis is a step performed independent of the experiment.
WAM wrote the majority of the manuscript and provided algorithmic background and approaches to automating detection of events in (S)TEM images. BLM performed the in situ battery experiments designed and implemented in collaboration with NDB. BLM assisted in validating the results of the compression analysis of the EM data. JT conceived the analytical approach in the paper using compression meta-data for event detection and characterization, designed the experimental approach used in the paper, and reviewed, edited, and approved the manuscript. JT and RG developed the technique described in this paper. RG applied the technique to the in situ electrochemical (S)TEM videos, acquired the video compression algorithm coefficients, formatted and annotated the figures, and helped edit the manuscript. All authors read and approved the final manuscript.
This work was supported in part by the Analysis in Motion (AIM) and Chemical Imaging (CII) Laboratory Directed Research and Development (LDRD) Initiatives at Pacific Northwest National Laboratory (PNNL). PNNL is a multi-program national laboratory operated by Battelle for the U.S. Department of Energy (DOE) under Contract DE-AC05-76RL01830. A portion of the research was performed using the Environmental Molecular Sciences Laboratory (EMSL), a national scientific user facility sponsored by the Department of Energy’s Office of Biological and Environmental Research and located at PNNL. In addition, the electrochemical aspects of this work were primarily supported by Joint Center for Energy Storage Research (JCESR), an Energy Innovation Hub funded by the Department of Energy, Office of Science, Basic Energy Sciences.
The authors declare that they have no competing interests.
Availability of data and materials
Image/video data used in this methods paper can be made available by contacting the authors. Other support, including possible software sharing, will require signed agreements; contact the authors.
Consent for publication
Copyright will be handled by PNNL legal. For questions, contact Debra Hamilton, +1 (509) 372-6867.
Ethics approval and consent to participate
NA/not human subject work.
U.S. Department of Energy (DOE) Contract DE-AC05-76RL01830.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- Jesson, D.E., Pennycook, S.J., Baribeau, J.M.: Phys. Rev. Lett. 66, 750–753 (1991)
- Ortalan, V., Uzun, A., Gates, B.C., Browning, N.D.: Nat. Nanotechnol. 5, 843–847 (2010)
- Evans, J.E., Jungjohann, K.L., Browning, N.D., Arslan, I.: Controlled growth of nanoparticles from solution with in situ liquid transmission electron microscopy. Nano Lett. 11, 2809–2813 (2011)
- Williamson, M.J., Tromp, R.M., Vereecken, P.M., Hull, R., Ross, F.M.: Dynamic microscopy of nanoscale cluster growth at the solid–liquid interface. Nat. Mater. 2, 532–536 (2003)
- de Jonge, N., Ross, F.M.: Electron microscopy of specimens in liquid. Nat. Nanotechnol. 6, 695–704 (2011)
- de Jonge, N., Peckys, D.B., Kremers, G.J., Piston, D.W.: Electron microscopy of whole cells in liquid with nanometer resolution. Proc. Natl. Acad. Sci. 106, 2159–2164 (2009)
- Mehdi, B.L., Stevens, A., Qian, J., Park, C., Henderson, W.A., Xu, W., Zhang, J.-G., Mueller, K.T., Browning, N.D.: The impact of Li grain size on Coulombic efficiency in Li batteries. Sci. Rep. 6, 34267 (2016). doi:10.1038/srep34267
- Mehdi, B.L., Qian, J., Nasybulin, E., Park, C., Welch, D.A., Faller, R., Mehta, H., Henderson, W.A., Xu, W., Wang, C.M., Evans, J.E., Liu, J., Zhang, J.-G., Mueller, K.T., Browning, N.D.: Observation and quantification of nanoscale processes in lithium batteries operando electrochemical (S)TEM. Nano Lett. 15, 2168–2173 (2015)
- Zheng, H.M., Smith, R.K., Jun, Y.W., Kisielowski, C., Dahmen, U., Alivisatos, A.P.: Observation of single colloidal platinum nanocrystal growth trajectories. Science 324, 1309–1312 (2009)
- Li, D.S., Nielsen, M.H., Lee, J.R.I., Frandsen, C., Banfield, J.F., De Yoreo, J.J.: Direction-specific interactions control crystal growth by oriented attachment. Science 336, 1014–1018 (2012)
- Gu, M., Parent, L.R., Mehdi, L., Unocic, R., McDowell, M., Sacci, R., Xu, W., Connell, J., Xu, P., Abellan, P., Chen, X., Yaohui, Z., Perea, D., Lauhon, L., Zhang, J., Liu, J., Browning, N.D., Cui, Y., Arslan, I., Wang, C.: Demonstration of an electrochemical liquid cell for operando transmission electron microscopy observation of the lithiation/delithiation behavior of Si nanowire battery anodes. Nano Lett. 13, 6106–6112 (2013)
- Woehl, T.J., Park, C., Evans, J.E., Arslan, I., Ristenpart, W.D., Browning, N.D.: Direct observation of abnormal Ostwald ripening in nanoparticle ensembles caused by aggregative growth. Nano Lett. 14, 373–378 (2014)
- Sutter, E., Jungjohann, K.L., Bliznakov, S., Courty, A., Maisonhaite, E., Tenney, S., Sutter, P.: In situ liquid-cell electron microscopy of silver–palladium galvanic replacement reactions on silver nanoparticles. Nat. Commun. 5, 4946 (2014)
- White, E.R., Singer, S.B., Augustyn, V., Hubbard, W.A., Mecklenburg, M., Dunn, B., Regan, B.C.: In situ transmission electron microscopy of lead dendrites and lead ions in aqueous solution. ACS Nano 6, 6308–6317 (2012)
- Abellán, P., Park, C., Mehdi, B.L., Xu, W., Zhang, Y., Parent, L.R., Gu, M., Arslan, I., Zhang, J., Wang, C.M., Evans, J.E., Browning, N.D.: Probing the degradation mechanisms in electrolyte solutions for Li-ion batteries by in situ TEM. Nano Lett. 14, 1293–1299 (2014)
- Cotsaces, C., Nikolaidis, N., Pitas, I.: Video shot detection and condensed representation: a review. Sig. Process. Mag. IEEE 23(2), 28–37 (2006)
- Yuan, J., Wang, H., Xiao, L., Zheng, W., Li, J., Lin, F., Zhang, B.: A formal study of shot boundary detection. IEEE Trans. Circuits Syst. Video Technol. 17(2), 168–186 (2007)
- Smeaton, A.F., Over, P., Doherty, A.R.: Video shot boundary detection: seven years of TRECVid activity. Comput. Vis. Image Underst. 114(4), 411–418 (2010)
- Gargi, U., Kasturi, R., Strayer, S.H.: Performance characterization of video-shot-change detection methods. IEEE Trans. Circuits Syst. Video Technol. 10(1), 1–13 (2000)
- Lefèvre, S., Vincent, N.: Efficient and robust shot change detection. J. Real-Time Image Proc. 2(1), 23–34 (2007)
- Mandal, M.K., Idris, F., Panchanathan, S.: A critical evaluation of image and video indexing techniques in the compressed domain. Image Vis. Comput. 17(7), 513–529 (1999)
- Lienhart, R.: Comparison of automatic shot boundary detection algorithms. SPIE Conf. Storage Retr. Image Video Databases 3656, 290–301 (1999)
- Gamaz, N., Huang, X., Panchanathan, S.: Scene change detection in MPEG domain. In: 1998 IEEE Southwest Symposium on Image Analysis and Interpretation (1998)
- Sikora, T.: MPEG digital video-coding standards. Sig. Process. Mag. IEEE 14(5), 82–100 (1997)
- Puri, A.: Video coding using the MPEG-2 compression standard. Vis. Commun. Image Proc. 2094, 1701–1713 (1993). doi:10.1117/12.157930
- Le Gall, D.: MPEG: a video compression standard for multimedia applications. Commun. ACM 34(4), 46–58 (1991)
- Fernando, W.A.C., Loo, K.-K.: Abrupt and gradual scene transition detection in MPEG-4 compressed video sequences using texture and macroblock information. In: 2004 International Conference on Image Processing, 2004. ICIP ‘04 (2004)
- Lian, S.: Automatic video temporal segmentation based on multiple features. Soft. Comput. 15(3), 469–482 (2011)
- Lee, S.-W., Kim, Y.-M., Choi, S.W.: Fast scene change detection using direct feature extraction from MPEG compressed videos. IEEE Trans. Multimed. 2(4), 240–254 (2000)
- FFmpeg. https://www.ffmpeg.org/ (2015)
- Lelescu, D., Schonfeld, D.: Statistical sequential analysis for real-time video scene change detection on compressed multimedia bitstream. IEEE Trans. Multimed. 5(1), 106–117 (2003)
- Meng, J., Juan, Y., Chang, S.-F.: Scene change detection in an MPEG-compressed video sequence. In: IS&T/SPIE’s Symposium on Electronic Imaging: Science & Technology. International Society for Optics and Photonics (1995)
- Sun, Z., Liu, J., Sun, J., Sun, X., Ling, J.: A motion location based video watermarking scheme using ICA to extract dynamic frames. Neural Comput. Appl. 18(5), 507–514 (2009)
- Calic, J., Izquierdo, E.: Efficient key-frame extraction and video analysis. In: Proceedings of International Conference on Information Technology: Coding and Computing, 2002 (2002)