Abstract
For centuries, the scientific discovery process has been based on systematic human observation and analysis of natural phenomena1. Today, however, automated instrumentation and large-scale data acquisition are generating datasets of such large volume and complexity as to defy conventional scientific methodology. Radically different scientific approaches are needed, and machine learning (ML) shows great promise for research fields such as materials science2,3,4,5. Given the success of ML in the analysis of synthetic data representing electronic quantum matter (EQM)6,7,8,9,10,11,12,13,14,15,16, the next challenge is to apply this approach to experimental data—for example, to the arrays of complex electronic-structure images17 obtained from atomic-scale visualization of EQM. Here we report the development and training of a suite of artificial neural networks (ANNs) designed to recognize different types of order hidden in such EQM image arrays. These ANNs are used to analyse an archive of experimentally derived EQM image arrays from carrier-doped copper oxide Mott insulators. In these noisy and complex data, the ANNs discover the existence of a lattice-commensurate, four-unit-cell periodic, translational-symmetry-breaking EQM state. Further, the ANNs determine that this state is unidirectional, revealing a coincident nematic EQM state. Strong-coupling theories of electronic liquid crystals18,19 are consistent with these observations.
Main
EQM research concentrates on exotic electronic phases that emerge when electrons interact so strongly that they lack a definite momentum. These electrons often self-organize into complex new states of EQM, such as electronic liquid crystals18,19, high-temperature superconductors20,21, fractionalized electronic fluids and quantum spin liquids. In this field, vast experimental datasets have emerged—for example, from real-space visualization of EQM using spectroscopic imaging scanning tunnelling microscopy (SISTM)17, from momentum-space visualization of EQM using angle-resolved photoemission spectroscopy, or from modern X-ray22 and neutron scattering. The challenge is to develop ML strategies that enable scientific discovery using large and complex experimental data structures from EQM experiments.
An excellent example is the electronic structure of the CuO2 plane in copper oxide compounds supporting high-temperature superconductivity20 (Fig. 1a). With one electron per Cu site, strong Coulomb interactions produce charge localization in an antiferromagnetic Mott insulator state. Removing p electrons (adding p holes) per CuO2 plaquette generates the ‘pseudogap’ phase20, which exhibits a strongly depleted density of electronic states N(E) for energies \(\left|E\right| < {\Delta }_{1}\), where Δ1 is the characteristic pseudogap energy scale that emerges for T < T∗(p) (Fig. 1a). Although the pseudogap phase has defied microscopic identification for decades20, recently it has been reported that rotational and translational symmetry are spontaneously broken in this phase. Rotational symmetry breaking is referred to as a nematic state18,19,23,24 and occurs at a wavevector of Q = 0 as the breaking of 90°-rotational (C4) symmetry at T < T∗(p) (Fig. 1a). This presents a conundrum because, in theory, ordering at Q = 0 cannot open an energy gap in the electronic spectrum. The translational symmetry breaking or density wave (DW) state, which should open such an energy gap, is detected using SISTM visualization17 and X-ray scattering22. It consists of periodic spatial modulations of electronic structure with finite wavevector Q, and thus with periodicity λ = 2π/|Q|, that occur within the pseudogap phase (Fig. 1a). A key challenge is to identify the correct microscopic theory for the DW state (see Methods section ‘Strong-coupling DW states’) and to find its relationship (if any) with both the nematic state and the pseudogap.
A DW state with wavevector Q is described by the spatially modulating function A(r) = D(r)cos[Q∙r + φ0(r)], where A(r) is the density amplitude, φ0(r) represents the effects of disorder and topological defects, and D(r) is the DW form factor symmetry. For a tetragonal crystal, an s-symmetry form factor remains unchanged under 90° rotations, whereas a d-symmetry form factor changes sign, as observed in copper oxides25. One theoretical approach to understanding a DW state is based on conventional electrons with well-defined wave momentum p(E) = ħk(E) (ħ is the reduced Planck constant). DW states can then appear at a wavevector of Q = ki(E = 0) − kf(E = 0) if many (ki(0), kf(0)) pairs are connected by the same wavevector Q—that is, nested (red arrow in Fig. 1b). Under these circumstances, Q should usually be incommensurate (Fig. 1b). Alternatively, strongly interacting particle-like electrons may have well-defined positions in real space, being fully localized in the Mott insulator phase or self-organized into electronic liquid-crystal states18,19,24. For copper oxides, such states are often predicted18,19,24 to exhibit periodic charge density modulations that are unidirectional and crystal-lattice-commensurate, with wavelength λ = 4a0, where a0 is the Cu–Cu distance, or wavevector \({\boldsymbol{Q}}=\frac{2{\rm{\pi }}}{{a}_{0}}\left(0.25,0\right)\) oriented along the Cu–O–Cu axis (Fig. 1c, Methods section ‘Strong-coupling DW states’). Such lattice-commensurate charge modulations in position-based theories (Fig. 1c) are expected to be robust against changes in electron density p and electron energy, whereas those associated with the geometry of the Fermi surface in momentum-based theories (Fig. 1b) are expected to evolve continuously with p.
A central challenge has therefore been to determine whether the electronic-structure modulations in hole-doped CuO2 (for example, Fig. 1d, e) are lattice-commensurate, unidirectional and with specific periodicity, or if they evolve continuously with electron density and energy. However, because of their inherent limitations, it has not been possible to discriminate these position- and momentum-based theoretic perspectives by using traditional analysis techniques. First, owing to the extreme disorder observed in copper oxide EQM images17 (Fig. 1d) or the broad line-widths detected simultaneously in reciprocal space22, theory demonstrates that conventional Fourier analysis is fundamentally limited26,27 in determining the exact symmetries of the EQM state. Second, when such complicated electronic-structure motifs exist at the atomic scale in real space17, Fourier analysis spreads all of that information throughout reciprocal space. Consequently, the customary Fourier analysis of SISTM and X-ray data that focuses on a single intensity peak, which has long reported incommensurate modulations that evolve continuously with p in the range 0.22 ≲ Q(2π/a0) ≲ 0.3 (refs 17,22), disregards much information. Specifically, the key insights contained in atomic-scale electronic-structure motifs (Fig. 1d), discommensurations28 and topological defects (Methods section ‘Fourier transform analysis of EQM images: disorder and information loss’) are all discarded. By contrast, ML analysis of EQM images holds great promise because it avoids this information loss and analyses the complete image array objectively.
High-data-volume imaging studies of EQM (for example, Fig. 1e) use SISTM, a technique for visualizing N(E) with subatomic resolution and a crystal-lattice register17. The resulting image array for a given sample is built from measurements of the differential electron tunnelling conductance dI/dV(r, V) = g(r, V) (I is the current) between the microscope tip and the sample, obtained at a square array of locations r and in a range of voltage differences V between the tip and the sample. For technical reasons, images of Z(r, V) = g(r, +V)/g(r, −V), which accurately represent the spatial symmetry of electronic structure but avoid systematic errors17, are most frequently used. Although Fourier analysis of Z(r, V) to yield Z(q, V) is an obvious approach to studying the EQM modulation wavevectors17,22, it has severe limitations, as discussed above. Therefore, identifying the fundamental broken-symmetry EQM state from an array of such Z(r, E = eV) images (for example, Fig. 1e), where e is the electron charge, is a paradigmatic challenge for ML techniques.
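The ratio map defined above can be illustrated with a minimal sketch. The function name `ratio_map` and the 2 × 2 conductance values are hypothetical and chosen only for illustration. The second part shows one simple reason a ratio can be attractive: any per-pixel multiplicative factor common to both bias polarities cancels. This is an illustration of the arithmetic only, not a statement about which systematic errors affect the real measurements.

```python
def ratio_map(g_pos, g_neg):
    """Return Z(r, V) = g(r, +V) / g(r, -V), evaluated pixel by pixel."""
    if len(g_pos) != len(g_neg):
        raise ValueError("both conductance maps must have the same shape")
    return [
        [gp / gn for gp, gn in zip(row_p, row_n)]
        for row_p, row_n in zip(g_pos, g_neg)
    ]


# Hypothetical 2 x 2 conductance maps (arbitrary units), not measured data.
g_plus = [[2.0, 3.0], [1.0, 4.0]]
g_minus = [[1.0, 2.0], [2.0, 4.0]]
z = ratio_map(g_plus, g_minus)

# A hypothetical per-pixel multiplicative factor shared by both polarities.
factor = [[0.5, 2.0], [4.0, 0.25]]
g_plus_scaled = [[f * g for f, g in zip(fr, gr)] for fr, gr in zip(factor, g_plus)]
g_minus_scaled = [[f * g for f, g in zip(fr, gr)] for fr, gr in zip(factor, g_minus)]
z_scaled = ratio_map(g_plus_scaled, g_minus_scaled)
```

In this sketch `z` and `z_scaled` are identical because the shared factor divides out.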
Here we introduce a specific ML approach that uses ANNs to achieve hypothesis testing with EQM image arrays. The technique is based on supervised ML with ANN–human collaboration. Its goals are to automatically search experimental EQM image arrays (for example, Fig. 1e), recognize spatial modulations in a variety of distinct categories, identify their fundamental periodicity and lattice register throughout an image, and distinguish whether the modulations are unidirectional or bidirectional. The first stage is the generation of sets of ANN training images, each labelled by a hypothesis (the different DW modulations to be discerned). Here, we test four hypotheses associated with four distinct types of ideal periodic modulations, all with a d-symmetry form factor, and with fundamental wavelengths λ = 4.348a0, 4.000a0, 3.704a0 and 3.448a0. We note that only category 2 represents a commensurate pattern with λ = 4a0. Four training sets are then generated for categories C = 1–4 using identical procedures, in which we introduce specific forms of heterogeneity designed to mimic the noise, intrinsic disorder and topological defects of the experimental data (Fig. 2a, Methods section ‘Generation of training image sets’). In all of these simulated training image sets, heterogeneity disrupts the long-range-ordered patterns in real space, as shown for a typical training image in Fig. 2b. It also scrambles the peaks in the d-symmetry Fourier transforms17 of the training images, rendering them broad and chaotic (Fig. 2c). In the second stage, we establish an ANN architecture that trains well with these training image sets. During training, the parameters of the ANN are adjusted iteratively to minimize a cross-entropy cost function29. Stochastic gradient descent, along with backpropagation30, is used for lowering the cost function. The training is complete and all parameters of each ANN are set when the cross-entropy31 saturates.
Each finalized ANN generally has an accuracy of >99% when tested on validation images (see Fig. 2d and Methods section ‘Configuration of ANN’). The ANN design is a fully connected feed-forward network with a single hidden layer (Fig. 3). Statistical reliability of this ML system for different network architectures and different initial conditions is achieved by training 81 distinct ANNs in parallel with the same training image set.
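As a quick check of how the four hypothesis wavelengths relate to modulation wavevectors, the sketch below converts each λ (in units of a0) to |Q| in units of 2π/a0 using λ = 2π/|Q|. The names `WAVELENGTHS`, `wavevector` and `is_commensurate` are illustrative, and treating "commensurate" as "an integer number of lattice constants" is a simplifying assumption that is sufficient for these four candidates.

```python
# Category label -> fundamental wavelength in units of a0 (values from the text).
WAVELENGTHS = {1: 4.348, 2: 4.000, 3: 3.704, 4: 3.448}


def wavevector(wavelength_a0):
    """|Q| in units of 2*pi/a0 for a modulation of period wavelength_a0."""
    return 1.0 / wavelength_a0


def is_commensurate(wavelength_a0, tol=1e-9):
    """Simplified test (assumption): the period is an integer multiple of a0."""
    return abs(wavelength_a0 - round(wavelength_a0)) < tol


q_values = {c: round(wavevector(lam), 2) for c, lam in WAVELENGTHS.items()}
commensurate = [c for c, lam in WAVELENGTHS.items() if is_commensurate(lam)]
```

The four categories correspond to |Q| ≈ 0.23, 0.25, 0.27 and 0.29 in units of 2π/a0, so they sample the range in which single-peak Fourier analysis has reported continuously evolving wavevectors, with only category 2 commensurate.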
Our ANN ensemble is first used to perform hypothesis tests for the experimental EQM image arrays as a function of electron density. The measured Z(r, E) electronic-structure images are from samples of the hole-doped copper oxide Bi2Sr2CaCu2O8 in the range 0.06 ≤ p ≤ 0.20. Obviously, disorder and complexity of EQM abound in Z(r, Δ1) throughout this electron-density range (black double-headed arrow in Fig. 1a) and are equally apparent in the broad fluctuating peaks around \(\frac{2{\rm{\pi }}}{{a}_{0}}\left({Q}_{x}\pm {\rm{\delta }}{Q}_{x},{\rm{\delta }}{Q}_{y}\right)\) and \(\frac{2{\rm{\pi }}}{{a}_{0}}\left({\rm{\delta }}{Q}_{x},{Q}_{y}\pm {\rm{\delta }}{Q}_{y}\right)\) in Z(q, Δ1) (see Fig. 3a, b). Definite fundamental periodicities seem undetectable in these Z(r, Δ1) data. The set of experimental Z(r, E) image arrays have a field of view of 16 nm × 16 nm and are measured in a sequence of independent experiments on distinct crystals with p ≈ 0.06, 0.08, 0.085, 0.14, 0.20 (critical temperature of Tc = 20, 45, 50, 74, 82 K, respectively). The ANNs analyse these Z(r, Δ1) images as a function of p, focusing on the pseudogap energy E = Δ1(p) because copper oxide EQM symmetry breaking emerges at this energy17,25. Figure 4a–e shows the actual Z(r, Δ1) images provided to the trained ANN system and Fig. 4f–j shows their d-symmetry Fourier transforms. The ANNs succeed with high reliability in discriminating and identifying the periodic motifs for all of these images (Methods section ‘Validation and benchmarking’). In Fig. 4k–o we show the response of the ANNs as the probability P(C) of the input EQM image being identified in category C. Here, the ANNs reveal that, on average, the phenomenology of the training images with C = 2 and λ = 4a0 has the highest probability of being recognized within the Z(r, Δ1) image array, but only for electron densities 0.06 ≤ p ≤ 0.14. 
Thus, the ANNs identify a predominant translational symmetry breaking, which occurs commensurately with the specific wavelength λ = 4a0 (Fig. 4a–d). Overall, the ANNs determine that the identical, commensurate, 4a0-period electronic-structure modulations are hidden in all of the E ≈ Δ1 EQM images from the 0.06 ≤ p ≤ 0.14 area of the CuO2 phase diagram.
A second key physics issue is the energy dependence within a Z(r, E) image array. Quasiparticle scattering interference17 (QPI) occurs when an impurity atom scatters wave-like states ki(E) into kf(E), resulting in quantum interference at wavevectors Qif = ki(E) − kf(E) and generating modulations of N(r, E) or its Fourier transform N(Qif, E). QPI is a physical phenomenon distinct from a DW state because, whereas the modulation wavevectors of QPI evolve rapidly with E, they do not for a DW state. Therefore, the ANNs explore a Bi2Sr2CaCu2O8 Z(r, E) array of 16 nm × 16 nm EQM images that are measured in a sequence of independent experiments at distinct electron energies of E = 66, 96, 126, 150 meV on the same crystal with p = 0.08. Figure 5a–d shows this Z(r, E) image set that is input to the same ANN system. EQM complexity in the identical field of view now evolves rapidly with electron energy because the images are dominated by QPI. Similarly, Fig. 5e–h shows the d-symmetry Fourier transforms Z(q, E) from Fig. 5a–d, which exhibit broad fluctuating peaks that evolve rapidly with electron energy, as expected in QPI. Well-defined fundamental periodicities are indiscernible in these Z(r, E) (Fig. 5a–d) and Z(q, E) (Fig. 5e–h) data. However, Fig. 5j–l demonstrates that the ANN suite finds the hypothesis category with the highest recognition probability to be again C = 2, which means that the predominant modulations have a period of 4a0 for all energies exceeding 66 meV (Fig. 5b–d). Again, despite intense masking by QPI phenomena, the ANNs recognize commensurate, 4a0-period DW modulations and reveal that these occur predominantly near the pseudogap energy E = Δ1.
A third ANN discovery in Fig. 5i–l is that the commensurate, 4a0-period modulations exhibit a strong preference for breaking symmetry under 90° rotations (C4). This is revealed because the ANN array yields up to three times higher probability in that specific category (C = 2) when the data are input in the x orientation (red) compared to when identical data are provided in the y orientation (yellow) (Fig. 5j–l). Although the extreme nanoscale disorder masks any directional preference in the images of Fig. 5a–d, the DW modulations occur primarily along the x axis of the CuO2 plane. ANN analysis of the energy dependence of this complete Z(r, E) image array, shown in Extended Data Fig. 1, further confirms that the appearance of this nematicity (Fig. 5i–l) occurs when approaching the pseudogap energy scale, which is |Δ1| ≈ 80 meV. Thus, the ANNs find that a nematic state emerges at the pseudogap energy because of highly disordered, but unidirectional, 4a0-period modulations. This discovery strongly implies that the nematic electronic structure of CuO2 is a vestigial nematic state32 whose characteristic energy gap is the pseudogap. Advanced theory32 predicts that a unidirectional DW that is reduced by disorder to extremely short coherent lengths should generate a nematic state dubbed the vestigial nematic state. Although experimental validation of this hypothesis is considered impossible using conventional Fourier transform techniques26,27, here it is demonstrably achievable by an ANN array (Fig. 5, Extended Data Fig. 1). Existence of a vestigial nematic state in carrier-doped CuO2 would provide a direct, internally consistent link between a nematic state and the unidirectional, 4a0-period DW modulations, whose energy gap is the pseudogap (Fig. 4). 
The evidence of the presence of a vestigial nematic emerged unexpectedly from ANN analysis of experimental image arrays not optimized for such studies; determining a complete p dependence with the ANN suite will require new measurements of appropriately optimized image arrays.
To summarize, we have developed and demonstrated a new general protocol for ML-based identification of symmetry-breaking ordered states in electronic-structure image arrays obtained from EQM visualization experiments. Our ANNs are trained to learn the defining motifs of each category, including its topological defects, and to recognize those motifs in real EQM image arrays (Fig. 1e). Despite the complexity of the hole-doped Mott insulator state, instrument distortion and noise, and the intense electronic disorder of the EQM image arrays studied (Figs. 1d, 3a, b, 4, 5), the ANNs repeatedly and reliably discover the predominant features of a specific ordered state. The signature of this state for 0.06 ≤ p ≤ 0.14 is an electronic-structure modulation that is lattice-commensurate, unidirectional, and has a d-symmetry form factor and a period of λ = 4a0 (Fig. 4). The predominance of this phenomenology (Fig. 4) implies that a strong-coupling position-based theory is central to these broken-symmetry states of carrier-doped CuO2. The ANN array also reveals that it is the λ = 4a0 DW modulations at the pseudogap energy that break the global rotational symmetry to generate a nematic state (Fig. 5, Extended Data Fig. 1). This implies that the pseudogap region of the CuO2 phase diagram (Fig. 1a) contains a vestigial nematic state. In addition, the demonstration that ANNs can process and identify specific broken symmetries of highly complex image arrays from non-synthetic experimental EQM data is a milestone for general scientific technique. Overall, these advances open the prospect of additional ML-driven scientific discovery in EQM studies.
Methods
Strong-coupling DW states
Real-space (position-based) strong-coupling theories for carrier-doped CuO2 predict lattice-commensurate, unidirectional, DWs in various electronic degrees of freedom. Among them are two candidate states that can both lead to 4a0-period modulations of the charge density and of the local density of states N(r) with wavevector Q = (2π/(4a0), 0). The first is a 4a0-period modulation in the charge density of the two oxygen sites Ox and Oy within each unit cell, with a relative phase of π between them. This is a charge DW with a d-symmetry form factor, which exists as a fundamental ordered state. The second is an 8a0-period modulation of the d-wave Cooper pair density, which can exist as a fundamental ordered state and induces a 4a0-period modulation in the charge density. These two distinct fundamental states are shown schematically in Extended Data Fig. 2a, b.
Fourier transform analysis of EQM images: disorder and information loss
A Fourier transform (FT) of two-dimensional image data is a linear transformation of the data. All the information in the original image appears in the full, complex FT throughout reciprocal space. Importantly, when there are complicated local patterns or motifs of short-range order at the atomic scale in real space, that information gets spread over all of reciprocal space. This is because what is extremely local in real space becomes completely delocalized in reciprocal space. However, in the traditional mode of FT analysis, a compact region in reciprocal space is selected as a region of importance because the intensity is peaked at that point. Crucially, this approach discards abundant information throughout reciprocal space away from the peak-intensity wavevector. For hole-doped CuO2, the real-space electronic structure at the atomic scale is uniquely complex (Fig. 1). For instance, one always finds that the scanning tunnelling microscopy (STM) image whose FT peak intensity occurs away from Q = 0.25 (see Extended Data Fig. 3a) hosts distinct local motifs that are commensurate with the lattice (see Extended Data Fig. 3b). Because any information that is local in position space gets spread over all reciprocal space, when one discards much of the data throughout reciprocal space, crucial insights contained in atomic-scale electronic-structure motifs, discommensurations and topological defects are lost. On the other hand, because of the versatility of ANN in capturing any function whatsoever33, the new ML approach allows one to impartially inspect the entirety of the data in each STM image with no loss of information. This is a key distinction between the traditional FT approach and the ML approach, which exhaustively analyses all of the data throughout real space.
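The statement that a feature local in real space is delocalized in reciprocal space can be illustrated in one dimension. The sketch below is a toy example with an assumed 16-site chain, not an analysis of image data. It compares the discrete FT of a single-site feature with that of an extended cosine wave.

```python
import cmath
import math


def dft(signal):
    """Plain discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    return [
        sum(signal[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]


n_sites = 16

# A feature confined to a single site (position 5 is arbitrary).
local_feature = [0.0] * n_sites
local_feature[5] = 1.0
local_spectrum = [abs(c) for c in dft(local_feature)]

# An extended wave with four full periods across the chain.
wave = [math.cos(2.0 * math.pi * 4 * x / n_sites) for x in range(n_sites)]
wave_spectrum = [abs(c) for c in dft(wave)]
```

The single-site feature has the same Fourier magnitude at every wavevector, whereas the extended wave is concentrated at k = 4 and its mirror k = 12. Selecting only a compact region around an intensity peak therefore retains the extended wave but discards most of the weight of the local feature.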
Generation of training image sets
The diversification of synthetic images of a unidirectional DW to create a training image set (see Extended Data Fig. 4) starts from components with d-wave and s-wave form factors (DFF and SFF, respectively) and includes (1) heterogeneity through independent amplitude and phase fluctuations and (2) topological defects or dislocations in DFF. For categories C = 1, 2, 3, 4 (with representative wavelength λC) the DFF \(\left({I}_{C,{\rm{f}},{\rm{d}}}^{{\rm{DFF}}}\right)\) and SFF \(\left({I}_{{\rm{f}}}^{{\rm{SFF}}}\right)\) modulations with noise models are
with overall constants ADFF = 1, ASFF = 0.5 and phase offsets φDFF = π/4, φSFF = 0. Here the amplitude field Af(x, y) and the phase field φf(x, y) represent smooth fluctuations (different random realizations in \({I}_{C,{\rm{f,d}}}^{{\rm{DFF}}}\left(x,y\right)\) and \({I}_{{\rm{f}}}^{{\rm{SFF}}}\left(x,y\right)\)), and Ad(x, y) and the phase field φd(x, y) denote dislocation defects. For each category, we generate different realizations labelled as f and d. For each f realization the Af(x, y) field is a two-dimensional Gaussian fluctuation field with spatial length scale ξA = 8a0, normalized between −1 and 1, while φf(x, y) is a two-dimensional Gaussian fluctuation field with the same spatial length scale, ξφ = 8a0, normalized between −π and π. The values of the correlation length scales ξA and ξφ are determined by a simple analysis of an SISTM Z(q, E) FT (Fig. 3). The strengths of the amplitude and phase fluctuations εA = 0.8 and εφ = 0.5, respectively, are also chosen to produce images in rough consistency with a typical Z(r, E). In each image, there are nd = 2 dislocations at random positions ri = (xi, yi), i = 1,…, nd, with windings wi = ±2π and total winding 0. The total dislocation-contributed fields are
where the amplitude recovery length is ξd = a0, as determined from Z(r, E).
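The dislocation fields themselves are not reproduced in this text, so the following is only a sketch under stated assumptions: each dislocation contributes a vortex-like phase ±arctan[(y − yi)/(x − xi)], and the amplitude is suppressed at each core and recovers as 1 − exp(−|r − ri|/ξd). Both functional forms, the function names and the defect positions are assumptions for illustration. The sketch then verifies numerically that the windings are ±2π around each defect and sum to zero.

```python
import math


def dislocation_phase(x, y, defects):
    """Assumed form: sum of vortex phases; defects are (x0, y0, sign) tuples."""
    return sum(sign * math.atan2(y - y0, x - x0) for x0, y0, sign in defects)


def dislocation_amplitude(x, y, defects, xi_d=1.0):
    """Assumed form: amplitude vanishes at each core, recovers over xi_d."""
    amp = 1.0
    for x0, y0, _ in defects:
        amp *= 1.0 - math.exp(-math.hypot(x - x0, y - y0) / xi_d)
    return amp


def winding(defects, cx, cy, radius, steps=720):
    """Accumulated phase change around a circle centred at (cx, cy)."""
    total = 0.0
    prev = dislocation_phase(cx + radius, cy, defects)
    for k in range(1, steps + 1):
        angle = 2.0 * math.pi * k / steps
        cur = dislocation_phase(
            cx + radius * math.cos(angle), cy + radius * math.sin(angle), defects
        )
        # Wrap each increment into [-pi, pi) to step across branch cuts.
        total += (cur - prev + math.pi) % (2.0 * math.pi) - math.pi
        prev = cur
    return total


# Two hypothetical dislocations of opposite winding (positions in units of a0).
defects = [(20.0, 20.0, +1), (60.0, 45.0, -1)]
w_first = winding(defects, 20.0, 20.0, 5.0)
w_second = winding(defects, 60.0, 45.0, 5.0)
w_total = winding(defects, 40.0, 32.0, 60.0)
```

A loop around either defect alone picks up ±2π, while a loop enclosing both picks up zero, matching the condition of nd = 2 dislocations with total winding 0.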
Then, the training set for each category C combines the different form factor components into the image intensity at pixel position (x, y) in units of a0 through IC(x, y) = IC,DFF(x, y)D(x, y) + ISFF(x, y)S(x, y) using atomic masks. The SFF mask S(x, y) is a sum of two-dimensional Gaussians with maxima equal to 1 and spatial widths equal to 0.35a0, each located at a Cu atom position (x and y integers), whereas the DFF mask D(x, y) is a sum of positive Gaussians at the locations Ox and negative ones at Oy. The total intensity IC(x, y) of all simulated images is normalized to take values between 0 and 1. All simulated images have six pixels per a0 and contain 2 × 86 × 86 unit cells, to give a total size of 516 × 516 pixels.
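A disorder-free version of the d-form-factor term can be sketched as follows. This is a simplified illustration, not the procedure used to build the training sets: it omits the fluctuation fields, the dislocations and the SFF term. It also assumes that the O sites sit at the Cu–Cu bond midpoints, with Ox at (n + 1/2, m) and Oy at (n, m + 1/2), that the DFF Gaussians have the same 0.35a0 width as the SFF ones, that this width is the Gaussian standard deviation, and that only the nearest O site of each type contributes at a given pixel. The function names are hypothetical.

```python
import math


def gaussian(dx, dy, width):
    """Unit-height 2D Gaussian; width is taken as the standard deviation."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * width * width))


def d_mask(x, y, width=0.35):
    """Positive Gaussian at the nearest O_x site, negative at the nearest O_y."""
    ox = (math.floor(x) + 0.5, round(y))  # assumed O_x position (n + 1/2, m)
    oy = (round(x), math.floor(y) + 0.5)  # assumed O_y position (n, m + 1/2)
    return gaussian(x - ox[0], y - ox[1], width) - gaussian(
        x - oy[0], y - oy[1], width
    )


def clean_dff_image(wavelength, n_cells=8, px_per_a0=6, phi_dff=math.pi / 4):
    """Disorder-free cos(2*pi*x/wavelength + phi) times D(x, y), scaled to [0, 1]."""
    size = n_cells * px_per_a0
    image = []
    for iy in range(size):
        y = iy / px_per_a0
        row = []
        for ix in range(size):
            x = ix / px_per_a0
            modulation = math.cos(2.0 * math.pi * x / wavelength + phi_dff)
            row.append(modulation * d_mask(x, y))
        image.append(row)
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    return [[(v - lo) / (hi - lo) for v in row] for row in image]


# Small 8 x 8 unit-cell example for category 2 (wavelength 4 a0).
image = clean_dff_image(4.0)
```

With 86 unit cells per side and six pixels per a0, the same function would produce the 516 × 516 pixel size quoted above. The sign change of `d_mask` under exchange of x and y is what makes the form factor d-symmetric.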
Configuration of ANN
In a feed-forward fully connected ANN, the neurons form a layered structure and the output of each neuron is sent to all of the neurons in the subsequent layer. Each neuron j in the first layer assesses the input, organized as a vector x = {xi} with weight matrix w = {wji} and bias vector b = {bj}, and determines the output through a nonlinear transformation \(f({\boldsymbol{w}}\cdot {\boldsymbol{x}}+{\boldsymbol{b}})\), called the activation function. The bias bj and the weights wji are the parameters of the ANN and are adjusted during the training. The activation function usually takes the form of the sigmoid function or the rectified linear unit (see the inset of Extended Data Fig. 5a). For input image data x = {xi} and an ANN with a single hidden layer of neurons (labelled by j), the neurons in the output layer corresponding to the different categories receive the inputs
We also use a softmax function for the output layer for normalized ANN outputs \({\sigma }_{c}={\text{e}}^{{\sigma }_{c}^{^{\prime} }}/\displaystyle {\sum }_{c}{\text{e}}^{{\sigma }_{c}^{^{\prime} }}\) that allows a probabilistic interpretation for the different categories denoted by the subscript c.
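The forward pass described above can be sketched for a single hidden layer. The expression for the output-layer inputs is not reproduced in this text, so the sketch assumes the standard form σ′c = Σj wcj f(Σi wji xi + bj) + bc with a sigmoid f. All weights, biases and the three-pixel input are hypothetical values chosen for illustration.

```python
import math


def sigmoid(z):
    """Sigmoid activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(values):
    """Normalized exponentials; the max is subtracted for numerical stability."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input -> sigmoid hidden layer -> softmax output probabilities."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    pre_activation = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]
    return softmax(pre_activation)


# Hypothetical toy network: 3 input pixels, 2 hidden neurons, 4 categories.
x = [0.2, 0.9, 0.4]
w_hidden = [[0.5, -1.0, 0.3], [1.2, 0.1, -0.7]]
b_hidden = [0.0, 0.1]
w_out = [[0.3, -0.2], [1.5, 0.8], [-0.4, 0.6], [0.0, 0.1]]
b_out = [0.0, 0.5, 0.0, 0.0]
probabilities = forward(x, w_hidden, b_hidden, w_out, b_out)

# With all weights and biases zero, no category is preferred.
uniform = forward(x, [[0.0] * 3] * 2, [0.0] * 2, [[0.0] * 2] * 4, [0.0] * 4)
```

The softmax guarantees that the four outputs are positive and sum to 1, which is what permits reading them as the probabilities P(C).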
For supervised ML, we divide the dataset into a training set containing 90% of the images and the remaining 10% are used for unbiased validation, speed control and overfitting detection during the training. The weights and biases of the ANN are optimized using stochastic gradient descent to minimize the cross-entropy cost function
Here x is a training data input vector labelled with a specific category yc(x) (for example, yc(x) = δc,1 if the training data belong to category 1) and σc(x) is the actual output of the ANN. The training process essentially minimizes the difference between the ANN output and the label.
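The cost function itself is not reproduced in this text. As a hedged illustration, the sketch below uses the common categorical form, −Σc yc ln σc averaged over a batch, which may differ in detail from the expression actually used. The function name and the example outputs are hypothetical.

```python
import math


def cross_entropy(labels, outputs, eps=1e-12):
    """Batch-averaged categorical cross-entropy (assumed form).

    labels and outputs are lists of per-sample lists; each label is one-hot.
    Outputs are clamped at eps so that log(0) never occurs.
    """
    total = 0.0
    for y, sigma in zip(labels, outputs):
        total -= sum(yc * math.log(max(sc, eps)) for yc, sc in zip(y, sigma))
    return total / len(labels)


category_1 = [1.0, 0.0, 0.0, 0.0]  # one-hot label: y_c = delta_{c,1}

cost_uninformed = cross_entropy([category_1], [[0.25, 0.25, 0.25, 0.25]])
cost_confident = cross_entropy([category_1], [[0.97, 0.01, 0.01, 0.01]])
cost_perfect = cross_entropy([category_1], [[1.0, 0.0, 0.0, 0.0]])
```

An output that is uniform over four categories costs ln 4, and the cost falls towards zero as the output approaches the label, which is the sense in which training minimizes the difference between the two.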
We use a batch size of 50 and L2 regularization to avoid overfitting. We include 50 neurons in the hidden layer and choose the sigmoid function as the neuron activation function unless stated otherwise. In Extended Data Fig. 5a we show examples of the cost function, as well as the accuracy on the validation dataset, for the sigmoid and the rectified linear unit activation functions during the training. Extended Data Fig. 5b shows the achieved accuracy and cross-entropy cost after 25 epochs as a function of the number of neurons in a single hidden layer. We trained 81 ANNs with random initial conditions by using a stochastic training process. The outputs of the finalized ANNs are robust and quantitatively consistent with each other. The results provided in the main text show the average and standard deviations from all 81 ANNs. To verify that our results are robust against changes to the architecture of the ANN, we trained six ANNs with 100 neurons in a single hidden layer and six ANNs with two hidden layers, and we found that the results agree within error bars.
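The averaging over independently trained networks can be sketched as below. The three output vectors are hypothetical stand-ins for the 81 ANNs, and the use of the population standard deviation is an assumption, since the text does not specify which estimator was used.

```python
import statistics


def ensemble_summary(outputs):
    """Per-category mean and standard deviation over an ensemble of networks.

    outputs is a list with one entry per network; each entry is the list of
    category probabilities that network assigns to the same input image.
    """
    n_categories = len(outputs[0])
    means = [
        statistics.fmean([out[c] for out in outputs]) for c in range(n_categories)
    ]
    # Population standard deviation (assumption; the estimator is unspecified).
    stds = [
        statistics.pstdev([out[c] for out in outputs]) for c in range(n_categories)
    ]
    return means, stds


# Hypothetical outputs of three networks for one image (not measured results).
outputs = [
    [0.1, 0.7, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.0, 0.8, 0.1, 0.1],
]
means, stds = ensemble_summary(outputs)
preferred_category = means.index(max(means)) + 1  # categories are numbered 1-4
```

Reporting the spread alongside the mean makes it possible to judge whether a preference for one category exceeds the variation caused by random initial conditions.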
Because they are drawn from a historic image-array archive not designed for ML-based studies, the SISTM image arrays Z(r, E) vary in spatial resolution from sample to sample from 1.7 to 11.5 pixels per average Cu–Cu distance. The number of CuO2 unit cells in the experimental images also varies from 2 × 55 × 55 to 2 × 175 × 175. The Cu and Ox,y atom positions, registered by the topograph, show random distortions of the lattice due to the STM tip-drift effect (Extended Data Fig. 6a).
To correct for the drift and standardize all of the Z(r, E) images, we prepare each Z(r, E) as follows: (1) using interpolation, we map the Z(r, E) image to the resulting input image so that each topographic atom position maps onto a position in a perfect atomic lattice with a Cu–Cu distance of a0 = 6 pixels (see Extended Data Fig. 6b, c), which corrects both the drift effect and standardizes the spatial resolution; (2) we crop or tile the image to a size of 516 × 516 pixels; (3) to study the degree of unidirectionality, for each input image we create a copy rotated by 90°, since the training images have modulations only along the x direction for simplicity and clarity. Extended Data Fig. 7 shows the Z(r, E) and prepared input data at different dopings of Bi2Sr2CaCu2O8. It should be noted that the results are reliable only if the test data lie reasonably consistently within the input space given by the synthetic training sets.
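Steps (2) and (3) of the preparation can be sketched on small nested lists. The details are assumptions: cropping is taken from the top-left corner, tiling is done by periodic repetition, and the rotation is anticlockwise. None of these choices is specified in the text, and the drift-correcting interpolation of step (1) is not attempted here.

```python
def crop_or_tile(image, size):
    """Return a size x size image: crop from the top-left corner if the input
    is larger, or repeat it periodically if it is smaller (assumed scheme)."""
    rows, cols = len(image), len(image[0])
    return [
        [image[r % rows][c % cols] for c in range(size)] for r in range(size)
    ]


def rotate90(image):
    """Rotate a 2D list by 90 degrees anticlockwise."""
    return [list(row) for row in zip(*image)][::-1]


small = [[1, 2], [3, 4]]
tiled = crop_or_tile(small, 3)
cropped = crop_or_tile([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 2)

# Stripes that vary along x become stripes that vary along y after rotation.
stripes_x = [[0, 1, 0, 1]] * 4
stripes_y = rotate90(stripes_x)
```

Because the training images modulate only along x, supplying both an image and its rotated copy lets the same networks test for modulations along either lattice direction. Note that periodic tiling can introduce seams at the tile boundaries.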
Validation and benchmarking
To assess the discriminatory power of the ANN categorization, we study obvious modulations in two experimental images (Extended Data Fig. 8): (1) the topograph of Bi2Sr2CaCu2O8, which has no discernible modulation except for the Cu atomic lattice (an SFF at Q = 0); (2) the Z(r, E) image of Ca2−xNaxCuO2Cl2 (NaCCOC), with obvious commensurate 4a0-period modulations, which are apparent in a DFF FT. The ANN indeed finds that no category is particularly compelling for the topograph of Bi2Sr2CaCu2O8 in Extended Data Fig. 8a (see Extended Data Fig. 8b). By contrast, the ANN finds category 2 to be the most compelling for the Z(r, E) image of NaCCOC in Extended Data Fig. 8c (see Extended Data Fig. 8d).
We also checked the robustness of our approach against the existence of Bi2Sr2CaCu2O8 superlattice modulations. The assessment of the ANNs was independent of the existence or absence (data with superlattice modulation removed from the FT) of the superlattice modulations.
We further tested the robustness of the ANN decisions against changes in the disorder model. For this we trained a new ANN with a training set generated with different disorder parameters. Specifically, we decreased the amplitude fluctuation intensity εA by 13% and the phase fluctuation intensity εφ by 20% while making the disorder profiles vary more rapidly in space by decreasing the correlation lengths ξA and ξφ by 6%. By repeating the assessment of the experimental data shown in Fig. 4k–m, o and Extended Data Fig. 1a with the new ANN, we find that the results remain unchanged. This is shown by the comparison between Fig. 4k–m, o and Extended Data Fig. 1a (shown in Extended Data Fig. 9a–e) and the output from the ANN trained with the new disorder model (shown in Extended Data Fig. 9f–j). The results show preference for the commensurate period 4a0 for systems with 0.06 ≤ p ≤ 0.14 (Extended Data Fig. 9a–d, f–i) and complete confusion over different candidate categories for p = 0.2 (Extended Data Fig. 9e, j). The energy dependence comparison between the ANN assessments presented in the main text (Extended Data Figs. 1a, 9e) and the assessments of the ANN trained with the altered disorder model (Extended Data Fig. 9j) shows that the tie between the onset of preference for the commensurate period 4a0 and the nematicity at the pseudogap energy scale is robust against variations in the disorder model used to train ANNs.
Discommensurations and maximum intensity wavevector
In ref. 28, we carried out FT-based linear analysis of equivalent data using the fact that the power spectral density was not smoothly distributed (Extended Data Fig. 10a, b). We introduced the concept of the demodulation residue (DR), using
where α = x, y indicates the component of the DR vector, r is the position vector in the image and L is the linear dimension of the image. The DR measures the phase fitness of the q-modulation in the spatial pattern ψ(r) through the filtered FT
where \(\widetilde{\psi }\left({\boldsymbol{k}}\right)\) is the FT of the data and Λ is the Fourier cutoff. By minimizing the DR, \({R}_{{\boldsymbol{q}}}\left(\psi \right)\equiv \sqrt{{\left[{{\boldsymbol{R}}}_{{\boldsymbol{q}}}^{x}\left(\psi \right)\right]}^{2}+{\left[{{\boldsymbol{R}}}_{{\boldsymbol{q}}}^{y}\left(\psi \right)\right]}^{2}}\), for a given modulation while considering different q-modulations, we showed that one can obtain the phase-averaged wavevector \(\bar{{\boldsymbol{Q}}}\) of DW modulations. Within the limits of the FT, which is a linear-basis transform, this approach facilitated dealing with situations in which the amplitude does not show well-defined peaks owing to severe disorder.
However, there are limitations in this approach because the FT is a linear transformation of the basis and is useful when the desired phenomenon has sharp features in the new basis, that is, the wavevector basis. Nevertheless, when there are randomly placed, highly disordered patches of a real-space DW pattern with sprinkles of topological defects, FT-based methods perform very poorly. Obviously, one would not attempt a FT in trying to recognize human faces in an image for precisely this reason. The limitation of FT-based methods is evident in that, even when a modulation pattern consists of a commensurate 4a0-period modulation (Q0 = 2π/(4a0)) everywhere except for a sequence of discommensurations (phase slips in the commensurate modulation pattern), the Rq(ψ) minimization (as well as the FT amplitude maximization) incorrectly identifies an apparent period of \(\bar{Q}=0.3\times 2{\rm{\pi /}}{a}_{0}\) (Extended Data Fig. 10e). Although in ref. 28 the DR minimization yielded \(\bar{Q}=2{\rm{\pi /}}\left(4{a}_{0}\right)\) for pseudogap energy data (a single dataset for each doping) for various dopings, this depended critically on visual inspection to identify commensurate patches in supplementary figure 6B of ref. 28 (see also Extended Data Fig. 3). Furthermore, the DR-based approach therein was averaged over topological defects (dislocations), ignoring their role. Finally, the DR-based approach required manual choice of the Fourier cutoff, again using visual inspection of the data. Hence the entire process is time-consuming, highly labour-intensive and fraught with human perceptual bias. It is therefore not possible to study the largest SISTM image arrays with this FT approach in any consistent way, rendering it impossible to inspect the complete electron density and electron energy dependence of the largest EQM image-array archives.
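The effect of discommensurations on the FT amplitude maximum can be reproduced in a one-dimensional toy model. The parameters are assumptions chosen for illustration and are not taken from Extended Data Fig. 10e: a modulation that is locally 4a0-periodic everywhere, with a phase slip of π/2 after every five lattice sites, on a chain of 200 sites. The average phase gradient then shifts the apparent wavevector to 1/4 + (π/2)/(2π × 5) = 0.30 in units of 2π/a0.

```python
import cmath
import math


def discommensurate_signal(n_sites, slip_every=5, slip=math.pi / 2):
    """Locally 4-site-periodic cosine with a phase slip every slip_every sites."""
    return [
        math.cos(2.0 * math.pi * x / 4.0 + slip * (x // slip_every))
        for x in range(n_sites)
    ]


def peak_wavevector(signal):
    """Wavevector (units of 2*pi/a0) of the largest FT amplitude, 0 < q <= 1/2."""
    n = len(signal)
    best_k, best_amplitude = 0, -1.0
    for k in range(1, n // 2 + 1):
        amplitude = abs(
            sum(s * cmath.exp(-2j * cmath.pi * k * x / n) for x, s in enumerate(signal))
        )
        if amplitude > best_amplitude:
            best_k, best_amplitude = k, amplitude
    return best_k / n


q_with_slips = peak_wavevector(discommensurate_signal(200))
q_without_slips = peak_wavevector(discommensurate_signal(200, slip=0.0))
```

In this toy model every five-site domain is commensurate with period 4a0, yet the FT amplitude maximum sits at q = 0.30 rather than 0.25. The local period is visible only in real space, which is the information that a single-peak analysis does not use.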
The ANN-based approach that we introduce in the main text is far more powerful, efficient and general. It does not rely on arbitrary choices, such as the cutoff Λ, or on visual selection of Fourier regions of interest, and is not tied to any basis. The ANN is inherently nonlinear, and an ANN with a sufficient number of neurons can express or detect any function (ref. 33). Owing to the versatility of ANNs, our ANN-based approach allows us to rapidly analyse a complete image-array dataset in its entirety, without any ad hoc Fourier filtering or selection. Hence the ANN approach is largely free of human bias. Moreover, once the ANNs are trained, the automatic assessment of a new dataset takes minutes, allowing for a high-throughput analysis. It is this efficiency that allowed the discovery of the connection between the nematic state and commensurate DW state, both setting in at the pseudogap energy scale (Extended Data Fig. 1).
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request. The data include experimental datasets, their standardized input forms for ANN analysis, the ANN output statistics, as well as Mathematica notebook files for generating training sets, standardizing input images and defining colour scales. The data used for Extended Data Figs. 1, 4 are provided as Source Data.
Code availability
The custom computer codes used to build and train the ANNs and to use the trained ANNs for data analysis are available from the corresponding author upon reasonable request.
References
Bacon, F., The Advancement of Learning (1605; Paul Dry Books, 2001).
Ouyang, R. et al. SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Phys. Rev. Mater. 2, 083802 (2018).
Stanev, V. et al. Machine learning modeling of superconducting critical temperature. npj Comput. Mater. 4, 29 (2018).
Rosenbrock, C. W., Homer, E. R., Csányi, G. & Hart, G. L. W. Discovering the building blocks of atomic systems using machine learning: application to grain boundaries. npj Comput. Mater. 3, 29 (2017).
Kusne, A. G. et al. On-the-fly machine-learning for high-throughput experiments: search for rare-earth-free permanent magnets. Sci. Rep. 4, 6367 (2014).
Carrasquilla, J. & Melko, R. G. Machine learning phases of matter. Nat. Phys. 13, 431–434 (2017).
Carleo, G. & Troyer, M. Solving the quantum many-body problem with artificial neural networks. Science 355, 602–606 (2017).
Torlai, G. & Melko, R. G. Neural decoder for topological codes. Phys. Rev. Lett. 119, 030501 (2017).
van Nieuwenburg, E. P. L., Liu, Y.-H. & Huber, S. D. Learning phase transitions by confusion. Nat. Phys. 13, 435–439 (2017).
Broecker, P., Carrasquilla, J., Melko, R. G. & Trebst, S. Machine learning quantum phases of matter beyond the fermion sign problem. Sci. Rep. 7, 8823 (2017).
Ch’ng, K., Carrasquilla, J., Melko, R. G. & Khatami, E. Machine learning phases of strongly correlated fermions. Phys. Rev. X 7, 031038 (2017).
Zhang, Y. & Kim, E.-A. Quantum loop topography for machine learning. Phys. Rev. Lett. 118, 216401 (2017).
Deng, D.-L., Li, X. & Das Sarma, S. Quantum entanglement in neural network states. Phys. Rev. X 7, 021021 (2017).
Stoudenmire, E. M. & Schwab, D. J. Supervised learning with tensor networks. Adv. Neural Inf. Process. Syst. 29, 4799–4807 (2016).
Schindler, F., Regnault, N. & Neupert, T. Probing many-body localization with neural networks. Phys. Rev. B 95, 245134 (2017).
Torlai, G. et al. Neural-network quantum state tomography. Nat. Phys. 14, 447–450 (2018).
Fujita, K. et al. in Strongly Correlated Systems: Experimental Techniques (eds Avella, A. & Mancini, F.) 73–109 (Springer, 2015).
Kivelson, S. A., Fradkin, E. & Emery, V. J. Electronic liquid-crystal phases of a doped Mott insulator. Nature 393, 550–553 (1998).
Zaanen, J. Self-organized one dimensionality. Science 286, 251–252 (1999).
Keimer, B., Kivelson, S. A., Norman, M. R., Uchida, S. & Zaanen, J. From quantum matter to high-temperature superconductivity in copper oxides. Nature 518, 179–186 (2015).
Wang, F. & Lee, D.-H. The electron-pairing mechanism of iron-based superconductors. Science 332, 200–204 (2011).
Comin, R. & Damascelli, A. Resonant X-ray scattering studies of charge order in cuprates. Annu. Rev. Condens. Matter Phys. 7, 369–405 (2016).
Fradkin, E. et al. Nematic Fermi fluids in condensed matter physics. Annu. Rev. Condens. Matter Phys. 1, 153–178 (2010).
Fradkin, E., Kivelson, S. A. & Tranquada, J. M. Theory of intertwined orders in high temperature superconductors. Rev. Mod. Phys. 87, 457 (2015).
Hamidian, M. H. et al. Atomic-scale electronic structure of the cuprate d-symmetry form factor density wave state. Nat. Phys. 12, 150–156 (2016).
Robertson, J. A. et al. Distinguishing patterns of charge order: stripes or checkerboards. Phys. Rev. B 74, 134507 (2006).
Del Maestro, A., Rosenow, B. & Sachdev, S. From stripe to checkerboard ordering of charge-density waves on the square lattice in the presence of quenched disorder. Phys. Rev. B 74, 024520 (2006).
Mesaros, A. et al. Commensurate 4a0-period charge density modulations throughout the Bi2Sr2CaCu2O8+x pseudogap regime. Proc. Natl Acad. Sci. USA 113, 12661–12666 (2016).
Nielsen, M. A. Neural Networks and Deep Learning (Determination Press, 2015).
Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).
Cover, T. M. & Thomas, J. A. Elements of Information Theory 2nd edn (Wiley, 1991).
Nie, L. et al. Quenched disorder and vestigial nematicity in the pseudogap regime of the cuprates. Proc. Natl Acad. Sci. USA 111, 7980–7985 (2014).
Cybenko, G. Approximation by superposition of a sigmoidal function. Math. Contr. Signals Syst. 2, 303–314 (1989).
Acknowledgements
We thank P. Ginsparg, J. Hoffman, S. Kivelson, R. Melko, A. Millis, M. Stoudenmire, K. Weinberger and J. Zaanen for discussions and communications. A.M. and Y.Z. acknowledge support from DOE DE-SC0010313; Y.Z. acknowledges support from DOE DE-SC0018946; E.-A.K. and J.C.S.D. acknowledge support by the Cornell Center for Materials Research with funding from the NSF MRSEC programme (DMR-1719875); E.K. and K.C. acknowledge support from the NSF through grant number DMR-1609560. E.-A.K. and E.K. acknowledge support from the Kavli Institute for Theoretical Physics (where initial discussions about the project took place), which is supported in part by the NSF under grant number PHY-1748958. S.U. and H.E. acknowledge support from a Grant-in-aid for Scientific Research from the Ministry of Science and Education of Japan and the Global Centers of Excellence Program of the Japan Society for the Promotion of Science. K.F. and J.C.S.D. acknowledge support from the US Department of Energy, Office of Basic Energy Sciences, under contract number DEAC02-98CH10886; S.D.E., M.H.H. and J.C.S.D. acknowledge support from the Moore Foundation’s EPiQS Initiative through grant GBMF4544; J.C.S.D. acknowledges support from Science Foundation Ireland under award SFI 17/RP/5445 and from the European Research Council (ERC) under award number DLV-788932.
Reviewer information
Nature thanks Giuseppe Carleo, Andrea Perali and the other anonymous reviewer(s) for their contribution to the peer review of this work.
Author information
Contributions
E.-A.K. and J.C.S.D. conceived the project. E.K., K.C., E.-A.K. and Y.Z. designed the ML strategy. Y.Z. implemented the ANN-based ML strategy. E.-A.K. and A.M. constructed the mathematical model for the training set and A.M. generated the training set. K.F., S.U. and H.E. synthesized and characterized the crystals studied. K.F., S.D.E. and M.H.H. carried out the experiments and image array data processing. E.-A.K. and J.C.S.D. supervised the investigation and wrote the paper with key contributions from K.F., Y.Z. and A.M. The manuscript reflects the contributions of all authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 ANN detection of unidirectionality at different electron energies.
Output categorization by 81 ANNs of 16 nm × 16 nm Z(r, E) images of Bi2Sr2CaCu2O8 in the electron energy range E = 30–150 meV in steps of 6 meV for P = 0.08 (Tc = 45 K). Markers are larger than the statistical spread (one standard deviation) of the ANN outputs, as estimated from our ensemble of 81 ANN realizations (see Methods). a, The output for modulation orientation X is obtained by inputting the Z(r, E) image array to the ANNs. b, The output for modulation orientation Y is obtained by inputting the 90°-rotated versions of the Z(r, E) used in a to the ANNs.
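The procedure in this caption (feeding an image and its 90°-rotated copy to every ANN in the ensemble, then reporting the ensemble mean and one-standard-deviation spread) can be sketched as follows. This is an illustrative outline only: the ANNs are represented by hypothetical callables that map an image to a list of category outputs, the rotation direction is chosen arbitrarily, and the use of the sample standard deviation is an assumption.

```python
import statistics


def rotate90(image):
    """Rotate a 2D list-of-rows image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def ensemble_summary(anns, image):
    """Mean and one-standard-deviation spread of each category output.

    `anns` is a list of callables (stand-ins for trained ANNs), each mapping
    an image to a list of category outputs of the same length.
    """
    outputs = [ann(image) for ann in anns]
    per_category = list(zip(*outputs))  # regroup outputs by category
    means = [statistics.mean(values) for values in per_category]
    spreads = [statistics.stdev(values) for values in per_category]
    return means, spreads


def categorize_both_orientations(anns, image):
    """Ensemble summaries for orientation X (as given) and Y (rotated input)."""
    return ensemble_summary(anns, image), ensemble_summary(anns, rotate90(image))
```

With an ensemble of 81 trained networks, `anns` would hold 81 such callables; the stand-in interface is a convenience for this sketch and says nothing about how the authors' networks are invoked.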
Extended Data Fig. 2 Schematic of DWs arising in the CuO2 plane according to strong-coupling position-based theories.
a, The d-symmetry 4a0 charge DW. The charge density at the Ox site is modulated with four-unit-cell periodicity along the horizontal direction, and similarly for that at Oy, but out of phase by π (d symmetry). Cu locations are marked by small dots. b, The 8a0 pair DW state. The d-wave Cooper pair density is modulated with eight-unit-cell periodicity along the horizontal direction. Such modulation in the Cooper pair density can cause a 4a0-period modulation in the local density of states N(r).
Extended Data Fig. 3 Local commensurate motifs in scanning tunnelling microscopy images.
a, Large-field-of-view, high-precision scanning tunnelling microscopy image of the electronic structure of Bi2Sr2CaCu2O8 with p ≈ 0.08, integrated to E = 100 meV. The larger inset shows the FT of the power spectral density, whereas the smaller inset shows the same data plotted along a line from 0.1 to 0.5 in units of 2π/a0. Clearly, the maximum intensity peak occurs at \(\left\langle Q\right\rangle =0.28\). b, Within each of the eight 6.5-nm2 regions marked in a there are many commensurate, unidirectional 4a0 electronic-structure motifs (inside the white rectangles). The Cu sites, independently determined from topographic imaging, are shown as fine dots.
Extended Data Fig. 4 Categories defined by electronic orders.
a–d, Example images from the simulated training set, from category C = 1 (a), C = 2 (b), C = 3 (c) and C = 4 (d), defined by DFF unidirectional modulation with wavelengths λC = 4.348a0, 4a0, 3.704a0 and 3.448a0, respectively. The CuO2 unit-cell size a0 is 6 pixels diagonally.
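As a quick arithmetic check, the four category wavelengths listed in this caption can be converted to wavevectors in units of 2π/a0 using Q = 1/λ. The short sketch below does only that conversion; the dictionary and function names are chosen here for illustration.

```python
# Category wavelengths lambda_C from the caption, in units of a0.
WAVELENGTHS = {1: 4.348, 2: 4.0, 3: 3.704, 4: 3.448}


def wavevector(wavelength):
    """Wavevector magnitude in units of 2*pi/a0 for a wavelength in units of a0."""
    return 1.0 / wavelength


WAVEVECTORS = {c: round(wavevector(lam), 2) for c, lam in WAVELENGTHS.items()}

for category, q in WAVEVECTORS.items():
    print(f"C = {category}: Q = {q} x 2*pi/a0")
```

The categories therefore correspond to evenly spaced wavevectors 0.23, 0.25, 0.27 and 0.29 (in units of 2π/a0), with C = 2 being the commensurate 4a0 case.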
Extended Data Fig. 5 ANN training and testing.
a, Examples of the accuracy of the ANN outputs for the independent validation dataset and the cross-entropy cost function are compared for different neuron activation functions during the initial training processes. The inset illustrates the nonlinear activation functions, that is, the sigmoid function and the rectified linear unit (ReLU). b, Examples of the accuracy and the cross-entropy cost versus the number of neurons in a single hidden layer after 25 epochs of training.
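For reference, the two activation functions named in this caption and a cross-entropy cost can be written in a few lines. This is a generic sketch: the categorical form of the cross-entropy used below (a one-hot target compared against a normalized output vector) is an assumption, and the exact cost and output layer used by the authors are not specified in this section.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectified linear unit, max(0, z)."""
    return max(0.0, z)


def cross_entropy(predicted, target, eps=1e-12):
    """Categorical cross-entropy, -sum_i t_i * ln(p_i).

    `predicted` is a list of probabilities summing to 1 and `target` is a
    one-hot list; `eps` guards against log(0).
    """
    return -sum(t * math.log(max(p, eps)) for p, t in zip(predicted, target))


# A maximally uncertain output over four categories costs ln(4).
uniform_cost = cross_entropy([0.25, 0.25, 0.25, 0.25], [0, 1, 0, 0])
print(f"cost for a uniform guess over 4 categories: {uniform_cost:.4f}")
```

A confident correct output drives the cost towards zero, whereas a confident wrong output is penalized heavily, which is why this cost is a common choice for categorization.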
Extended Data Fig. 6 Experimental SISTM images.
a, Example Z(r, E) of underdoped Bi2Sr2CaCu2O8 with hole density p = 0.06 (Tc = 20 K). The inset is a zoom-in with marked atom positions determined from the topograph (Cu, red/light; O, purple/dark). b, A small region of a. c, Standardized version of b (see Methods).
Extended Data Fig. 7 Experimental SISTM images used as input for categorization.
a, Z(r, E) of underdoped Bi2Sr2CaCu2O8 with p = 0.06 (Tc = 20 K) at energy E = Δ1 (see main text). f, The 516 × 516 pixel (2 × 86 × 86 CuO2 unit cells) input data from a (see Methods). b–e, g–j, As in a, f, but for underdoped Bi2Sr2CaCu2O8 with p = 0.08 (Tc = 45 K) (b, g), underdoped Bi2Sr2CaCu2O8 with p = 0.085 (Tc = 50 K) (c, h), underdoped Bi2Sr2CaCu2O8 with p = 0.14 (Tc = 74 K) (d, i) and overdoped Bi2Sr2CaCu2O8 with p = 0.20 (Tc = 82 K) (e, j). Too-small images are tiled, with unit cells intact at the tiling boundary, whereas too-large images are cropped.
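The tiling and cropping step mentioned at the end of this caption can be sketched as below. This is a minimal illustration rather than the authors' standardization code (which is described in Methods): it assumes the image is a 2D list of rows whose origin sits on a unit-cell boundary, that the unit cell spans a whole number of pixels (`cell`, a hypothetical parameter), and that cropping keeps the top-left corner.

```python
def fit_to_size(image, size, cell):
    """Crop or periodically tile a 2D list-of-rows image to size x size pixels.

    Before tiling, the image is trimmed to a whole number of unit cells
    (`cell` pixels each) so that no cell is cut at a tiling boundary.
    """
    rows = (len(image) // cell) * cell
    cols = (len(image[0]) // cell) * cell
    if rows == 0 or cols == 0:
        raise ValueError("image is smaller than one unit cell")
    trimmed = [row[:cols] for row in image[:rows]]
    # Indices wrap around the trimmed image: this tiles images that are too
    # small and simply crops images that are already large enough.
    return [[trimmed[r % rows][c % cols] for c in range(size)]
            for r in range(size)]


# Toy example: a 5 x 5 image with 2-pixel unit cells, tiled to 6 x 6.
toy = [[10 * r + c for c in range(5)] for r in range(5)]
tiled = fit_to_size(toy, size=6, cell=2)
cropped = fit_to_size(toy, size=2, cell=2)
```

For the inputs in this figure the target would be `size=516`, which is 86 unit cells at 6 pixels per cell.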
Extended Data Fig. 8 Benchmarking categorization using experimental images.
a, Input data for the topograph of overdoped Bi2Sr2CaCu2O8 with p = 0.22 (Tc = 70 K). b, Output categorization of a by 81 ANNs, showing absence of a translation-breaking signal. Results for the modulation orientations x and y are obtained by inputting the image in a and its 90°-rotated version, respectively, to the ANNs (see Methods). c, The input Z(r, E) data for NCCOC with a doping of p = 0.12 at E = 150 meV. d, Output categorization of c by 81 ANNs, showing commensurate modulations (category 2).
Extended Data Fig. 9 Categorization is robust to changes in training-set parameters.
a–d, The evolution of output categorizations upon increasing hole doping (as in Fig. 4k–m, o). e, Energy dependence of the output categorizations (as in Extended Data Fig. 1a). f–j, Categorizations of the same inputs as for a–e, respectively, obtained from the output of a single ANN trained using a different training set (see Methods).
Extended Data Fig. 10 Weakness of FT analysis of EQM.
a, b, The DFF Fourier amplitude, \(\left|\widetilde{\Psi }\left({\boldsymbol{q}}\right)\right|\), with the wavevector q restricted to a square area with its corner at the origin of Fourier space (black square) and its centre at \({{\boldsymbol{Q}}}_{X}=\frac{1}{4}{{\boldsymbol{G}}}_{X}\) (in a) and \({{\boldsymbol{Q}}}_{Y}=\frac{1}{4}{{\boldsymbol{G}}}_{Y}\) (in b), where GX and GY are the Bragg peaks. Data from a Bi2Sr2CaCu2O8 sample with a doping level of p = 0.10 (Tc = 65 K). Figure reproduced from ref. 28 with permission from PNAS. c, The modulation is the real part of the complex wave \(\psi \left(x\right)=A\left(x\right){{\rm{e}}}^{i\left[{Q}_{0}x+\phi \left(x\right)\right]}\), which has commensurate domains with local wavevector \({Q}_{0}=\frac{1}{4}\times \frac{2{\rm{\pi }}}{a}\) (period 4a). The amplitude A(x) ≥ 0 varies smoothly around 1. Phase slips are incorporated in φ(x) (see d). The average wavevector is \(\bar{Q}=0.3\times \frac{2{\rm{\pi }}}{a}\). d, The local phase φ(x) of ψ(x) in c, constructed as a discommensuration array in the phase argument \(\Phi \left(x\right)={Q}_{0}x+\phi \left(x\right)\). Phase slips of all discommensurations are set to +π. The distances between neighbouring discommensurations vary randomly around the average distance set by the value of incommensurability, \(\delta =\bar{Q}-{Q}_{0}=0.05\times \frac{2{\rm{\pi }}}{a}\). e, Fourier amplitudes \(\left|\widetilde{\psi }\left(q\right)\right|\) of the modulation ψ(x) in c (blue line) show a narrow peak at \(\bar{Q}=0.3\times \frac{2{\rm{\pi }}}{a}\). The demodulation residue |Rq| (red dashed line) has its minimum exactly at the average \(\bar{Q}\).
Cite this article
Zhang, Y., Mesaros, A., Fujita, K. et al. Machine learning in electronic-quantum-matter imaging experiments. Nature 570, 484–490 (2019). https://doi.org/10.1038/s41586-019-1319-8