Main

EQM research concentrates on exotic electronic phases that emerge when electrons interact so strongly that they lack a definite momentum. These electrons often self-organize into complex new states of EQM, such as electronic liquid crystals18,19, high-temperature superconductors20,21, fractionalized electronic fluids and quantum spin liquids. In this field, vast experimental datasets have emerged—for example, from real-space visualization of EQM using spectroscopic imaging scanning tunnelling microscopy (SISTM)17, from momentum-space visualization of EQM using angle-resolved photoemission spectroscopy, or from modern X-ray22 and neutron scattering. The challenge is to develop ML strategies that enable scientific discovery using large and complex experimental data structures from EQM experiments.

An excellent example is the electronic structure of the CuO2 plane in copper oxide compounds supporting high-temperature superconductivity20 (Fig. 1a). With one electron per Cu site, strong Coulomb interactions produce charge localization in an antiferromagnetic Mott insulator state. Removing p electrons (adding p holes) per CuO2 plaquette generates the ‘pseudogap’ phase20, which exhibits a strongly depleted density of electronic states N(E) for energies \(\left|E\right| < {\Delta }_{1}\), where Δ1 is the characteristic pseudogap energy scale that emerges for T < T*(p) (Fig. 1a). Although the pseudogap phase has defied microscopic identification for decades20, recently it has been reported that rotational and translational symmetry are spontaneously broken in this phase. Rotational symmetry breaking is referred to as a nematic state18,19,23,24 and occurs at a wavevector of Q = 0 as the breaking of 90°-rotational (C4) symmetry at T < T*(p) (Fig. 1a). This presents a conundrum because, in theory, ordering at Q = 0 cannot open an energy gap in the electronic spectrum. The translational symmetry breaking or density wave (DW) state, which should open such an energy gap, is detected using SISTM visualization17 and X-ray scattering22. It consists of periodic spatial modulations of electronic structure with finite wavevector Q, and thus with periodicity λ = 2π/|Q|, that occur within the pseudogap phase (Fig. 1a). A key challenge is to identify the correct microscopic theory for the DW state (see Methods section ‘Strong-coupling DW states’) and to find its relationship (if any) with both the nematic state and the pseudogap.

Fig. 1: EQM imaging in hole-doped CuO2.

a, Schematic phase diagram of hole-doped CuO2. At charge density p = 0 a single electron is localized at each Cu site in the Mott insulator state. As holes are introduced (electrons removed), the Mott insulator disappears quickly. High-temperature superconductivity emerges at slightly higher p, reaching its maximum critical temperature Tc near p ≈ 0.16. However, in the range p < 0.19 and up to a temperature of T*, an enigmatic phase of EQM (the pseudogap phase) is known to contain periodic charge-density modulations with imprecisely known wavevector Q. b, In the CuO2 Brillouin zone, the Fermi surface is defined as the momentum-space contour k(E = 0) that separates the occupied from the unoccupied electronic states, and its locus changes rapidly with changing carrier density p. DW states may then appear at a wavevector of Q = ki(E = 0) − kf(E = 0) if the electron states ki(E) and kf(E) are ‘nested’ (red and yellow arrows). c, Strongly correlated electrons may be fully localized in the Mott insulator phase or self-organized into electronic liquid-crystal states in real space. The schematic shows a simple example of a state with unidirectional charge-density modulations in the CuO2 plane, having wavelength λ = 4a0 or wavevector \({\boldsymbol{Q}}=\frac{2{\rm{\pi }}}{{a}_{0}}\left(0.25,0\right)\) (Methods section ‘Strong-coupling DW states’). d, Typical 24.4 nm × 24.4 nm (500 pixels × 500 pixels) SISTM electronic-structure image R(r, E = 150 meV) of the CuO2 plane of Bi2Sr2CaCu2O8 with p = 0.08 (Tc = 45 K). Complex spatial patterns (which look like highly disordered tweed) dominate. The contrast with the simple periodic arrangement of the simultaneously visualized atoms of the same crystal in the topograph (inset) is striking. e, Typical array of simultaneously acquired images Z(r, E) for p = 0.08, each with size 16 nm × 16 nm (245 pixels × 245 pixels) but at a different electron energy E in the range 6 meV < E < 150 meV in steps of 12 meV. Such arrays are the main type of dataset for which efficient ML analysis and discovery techniques are required. Experimental data are replicated at least three times for each doping.

A DW state with wavevector Q is described by the spatially modulating function A(r) = D(r)cos[Q·r + φ0(r)], where A(r) is the density amplitude, φ0(r) represents the effects of disorder and topological defects, and D(r) is the DW form factor symmetry. For a tetragonal crystal, an s-symmetry form factor remains unchanged under 90° rotations, whereas a d-symmetry form factor changes sign, as observed in copper oxides25. One theoretical approach to understanding a DW state is based on conventional electrons with well-defined wave momentum p(E) = ħk(E) (ħ is the reduced Planck constant). DW states can then appear at a wavevector of Q = ki(E = 0) − kf(E = 0) if many (ki(0), kf(0)) pairs are connected by the same wavevector Q, that is, nested (red arrow in Fig. 1b). Under these circumstances, Q should usually be incommensurate (Fig. 1b). Alternatively, strongly interacting particle-like electrons may have well-defined positions in real space, being fully localized in the Mott insulator phase or self-organized into electronic liquid-crystal states18,19,24. For copper oxides, such states are often predicted18,19,24 to exhibit periodic charge density modulations that are unidirectional and crystal-lattice-commensurate, with wavelength λ = 4a0, where a0 is the Cu–Cu distance, or wavevector \({\boldsymbol{Q}}=\frac{2{\rm{\pi }}}{{a}_{0}}\left(0.25,0\right)\) oriented along the Cu–O–Cu axis (Fig. 1c, Methods section ‘Strong-coupling DW states’). Such lattice-commensurate charge modulations in position-based theories (Fig. 1c) are expected to be robust against changes with electron density p and electron energy, whereas those associated with the geometry of the Fermi surface in momentum-based theories (Fig. 1b) are expected to evolve continuously with p.

A central challenge has therefore been to determine whether the electronic-structure modulations in hole-doped CuO2 (for example, Fig. 1d, e) are lattice-commensurate, unidirectional and with specific periodicity, or if they evolve continuously with electron density and energy. However, because of their inherent limitations, it has not been possible to discriminate these position- and momentum-based theoretic perspectives by using traditional analysis techniques. First, owing to the extreme disorder observed in copper oxide EQM images17 (Fig. 1d) or the broad line-widths detected simultaneously in reciprocal space22, theory demonstrates that conventional Fourier analysis is fundamentally limited26,27 in determining the exact symmetries of the EQM state. Second, when such complicated electronic-structure motifs exist at the atomic scale in real space17, Fourier analysis spreads all of that information throughout reciprocal space. Consequently, the customary Fourier analysis of SISTM and X-ray data that focuses on a single intensity peak, which has long reported incommensurate modulations that evolve continuously with p in the range 0.22 ≲ Q(2π/a0) ≲ 0.3 (refs 17,22), disregards much information. Specifically, the key insights contained in atomic-scale electronic-structure motifs (Fig. 1d), discommensurations28 and topological defects (Methods section ‘Fourier transform analysis of EQM images: disorder and information loss’) are all discarded. By contrast, ML analysis of EQM images holds great promise because it avoids this information loss and analyses the complete image array objectively.

High-data-volume imaging studies of EQM (for example, Fig. 1e) use SISTM, a technique for visualizing N(E) with subatomic resolution and a crystal-lattice register17. The resulting image array for a given sample is built from measurements of the differential electron tunnelling conductance dI/dV(r, V) = g(r, V) (I is the current) between the microscope tip and the sample, obtained at a square array of locations r and in a range of voltage differences V between the tip and the sample. For technical reasons, images of Z(r, V) = g(r, +V)/g(r, −V), which accurately represent the spatial symmetry of electronic structure but avoid systematic errors17, are most frequently used. Although Fourier analysis of Z(r, V) to yield Z(q, V) is an obvious approach to studying the EQM modulation wavevectors17,22, it has severe limitations, as discussed above. Therefore, identifying the fundamental broken-symmetry EQM state from an array of such Z(r, E = eV) images (for example, Fig. 1e), where e is the electron charge, is a paradigmatic challenge for ML techniques.
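As an illustration of why the ratio Z is convenient, the following minimal Python sketch forms Z(r, V) pixel by pixel from two conductance maps. The function name `ratio_map` and all numerical values are hypothetical and are not taken from the analysis code used in this work. The example assumes that a systematic error enters as a position-dependent prefactor common to both bias polarities, in which case it cancels in the ratio.

```python
def ratio_map(g_pos, g_neg):
    """Return Z(r, V) = g(r, +V) / g(r, -V) for two equally sized 2D maps."""
    if len(g_pos) != len(g_neg):
        raise ValueError("maps must have the same number of rows")
    z = []
    for row_pos, row_neg in zip(g_pos, g_neg):
        if len(row_pos) != len(row_neg):
            raise ValueError("maps must have the same number of columns")
        z.append([p / n for p, n in zip(row_pos, row_neg)])
    return z


# Toy 2 x 2 conductance maps (arbitrary units, hypothetical values).
g_plus = [[2.0, 4.0], [1.0, 3.0]]
g_minus = [[1.0, 2.0], [2.0, 6.0]]

# Hypothetical position-dependent prefactor multiplying both polarities.
prefactor = [[3.0, 5.0], [7.0, 0.5]]
g_plus_distorted = [[s * g for s, g in zip(rs, rg)]
                    for rs, rg in zip(prefactor, g_plus)]
g_minus_distorted = [[s * g for s, g in zip(rs, rg)]
                     for rs, rg in zip(prefactor, g_minus)]

z_clean = ratio_map(g_plus, g_minus)
z_distorted = ratio_map(g_plus_distorted, g_minus_distorted)
```

In this toy case `z_clean` and `z_distorted` agree at every pixel, because the common prefactor divides out.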

Here we introduce a specific ML approach that uses ANNs to achieve hypothesis testing with EQM image arrays. The technique is based on supervised ML with ANN–human collaboration. Its goals are to automatically search experimental EQM image arrays (for example, Fig. 1e), recognize spatial modulations in a variety of distinct categories, identify their fundamental periodicity and lattice register throughout an image, and distinguish whether the modulations are unidirectional or bidirectional. The first stage is the generation of sets of ANN training images, each labelled by a hypothesis (the different DW modulations to be discerned). Here, we test four hypotheses associated with four distinct types of ideal periodic modulations, all with a d-symmetry form factor, and with fundamental wavelengths λ = 4.348a0, 4.000a0, 3.704a0 and 3.448a0. We note that only category 2 represents a commensurate pattern with λ = 4a0. Four training sets are then generated for categories C = 1–4 using identical procedures, in which we introduce specific forms of heterogeneity designed to mimic the noise, intrinsic disorder and topological defects of the experimental data (Fig. 2a, Methods section ‘Generation of training image sets’). In all of these simulated training image sets, heterogeneity disrupts the long-range-ordered patterns in real space, as shown for a typical training image in Fig. 2b. It also scrambles the peaks in the d-symmetry Fourier transforms17 of the training images, rendering them broad and chaotic (Fig. 2c). In the second stage, we establish an ANN architecture that trains well with these training image sets. During training, the parameters of the ANN are adjusted iteratively to minimize a cross-entropy cost function29. Stochastic gradient descent, along with backpropagation30, is used for lowering the cost function. The training is complete and all parameters of each ANN are set when the cross-entropy31 saturates. Each finalized ANN generally has an accuracy of >99% when tested on validation images (see Fig. 2d and Methods section ‘Configuration of ANN’). The ANN design is a fully connected feed-forward network with a single hidden layer (Fig. 3). Statistical reliability of this ML system for different network architectures and different initial conditions is achieved by training 81 distinct ANNs in parallel with the same training image set.
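To make the four hypotheses concrete, the short Python sketch below converts each category wavelength into a wavevector magnitude through |Q| = 2π/λ and flags which period is an integer number of lattice constants. The names `WAVELENGTHS`, `wavevector` and `is_integer_period` are hypothetical, and the integer-period test is a deliberately simplified stand-in for lattice commensurability.

```python
# Wavelengths of the four hypothesis categories, in units of a0 (from the text).
WAVELENGTHS = {1: 4.348, 2: 4.000, 3: 3.704, 4: 3.448}


def wavevector(wavelength):
    """Magnitude of Q in units of 2*pi/a0 for a wavelength given in units of a0."""
    return 1.0 / wavelength


def is_integer_period(wavelength, tol=1e-9):
    """Simplified commensurability test: is the period an integer number of a0?"""
    return abs(wavelength - round(wavelength)) < tol


for category, lam in WAVELENGTHS.items():
    # Print category index, |Q| in units of 2*pi/a0, and the integer-period flag.
    print(category, round(wavevector(lam), 3), is_integer_period(lam))
```

Rounded to two decimal places, the four categories correspond to |Q| ≈ 0.23, 0.25, 0.27 and 0.29 in units of 2π/a0, and only category 2 passes the integer-period test.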

Fig. 2: Training an ANN to identify broken-symmetry states in SISTM data.

a, The ANN array is trained to recognize a DW in electronic-structure images (for example, Z(r, E)) representing different EQM states. A training image set is synthesized by appropriately diversifying pristine images of four distinct electronic ordered states. Each translational-symmetry-breaking ordered state is labelled by a category C = 1, 2, 3, 4 associated with wavelengths of λ = 4.348a0, 4.000a0, 3.704a0, 3.448a0, respectively. The training images in each category are diversified by appropriate addition of noise, short-correlation-length fluctuations in amplitude and phase, and topological defects. b, Example of a training image in category C = 2 (DW along the x axis with a d-symmetry form factor and λ = 4a0), where smooth amplitude and phase fluctuations, as well as topological defects (dislocations) at random positions, have been added to simulate typical phenomena encountered in experimental EQM visualization (for example, Fig. 1d). The full 516 × 516 pixel image contains 2 × 86 × 86 entire CuO2 unit cells with a Cu–Cu distance of 6 pixels diagonally. c, The d-symmetry Fourier transform of b. The absence of a well-defined modulation wavevector Q within the modulations in b has been successfully simulated in the training image, as seen by the region of momentum space (grey dashed circle) within which the amplitudes at different wavevectors vary greatly. The grey dots are at \({\boldsymbol{q}}=\frac{2{\rm{\pi }}}{{a}_{0}}\left(\pm 0.5,0\right)\) and \({\boldsymbol{q}}=\frac{2{\rm{\pi }}}{{a}_{0}}\left(0,\pm 0.5\right)\). d, Each ANN is trained by minimizing the cross-entropy cost function progressively through stochastic gradient descent and backpropagation. The process of going through the entire set of shuffled training data, also known as an epoch, is repeated until the cross-entropy and accuracy saturate. The overall accuracy of the finalized ANNs on the synthesized data is generally over 99%.

Fig. 3: ANN analysis of experimental EQM visualization data.

a, Typical measured 16 nm × 16 nm (440 pixels × 440 pixels) Z(r, E = 84 meV) image of Bi2Sr2CaCu2O8 with p = 0.08 (Tc = 45 K). The disorder and complexity of copper oxide EQM are manifest. b, Typical measured Z(q, E = 84 meV) image of Bi2Sr2CaCu2O8 with p = 0.08 (Tc = 45 K), which is the d-symmetry Fourier transform of a. The disorder and complexity of EQM are also evident here in the broad and fluctuating peaks around \(\frac{2{\rm{\pi }}}{{a}_{0}}\left({Q}_{x}\pm {\rm{\delta }}{Q}_{x},{\rm{\delta }}{Q}_{y}\right)\) and \(\frac{2{\rm{\pi }}}{{a}_{0}}\left({\rm{\delta }}{Q}_{x},{Q}_{y}\pm {\rm{\delta }}{Q}_{y}\right)\) with |δQx| = |δQy| ≈ 0.2. Grey dots are at \({\boldsymbol{q}}=\frac{2{\rm{\pi }}}{{a}_{0}}\left(\pm 0.4,0\right)\) and \({\boldsymbol{q}}=\frac{2{\rm{\pi }}}{{a}_{0}}\left(0,\pm 0.4\right)\). c, Schematic of the ANN analysis procedure for experimental Z(r, E) images. The successfully trained neural network with fixed parameters (weights W(1) and W(2) of the hidden layer and the output layer, respectively; biases) is a classifier; that is, it classifies each experimental image as belonging to one of the four categories. Neuron activation functions in our ANNs are taken to be the sigmoid function.

Our ANN ensemble is first used to perform hypothesis tests for the experimental EQM image arrays as a function of electron density. The measured Z(r, E) electronic-structure images are from samples of the hole-doped copper oxide Bi2Sr2CaCu2O8 in the range 0.06 ≤ p ≤ 0.20. Obviously, disorder and complexity of EQM abound in Z(r, Δ1) throughout this electron-density range (black double-headed arrow in Fig. 1a) and are equally apparent in the broad fluctuating peaks around \(\frac{2{\rm{\pi }}}{{a}_{0}}\left({Q}_{x}\pm {\rm{\delta }}{Q}_{x},{\rm{\delta }}{Q}_{y}\right)\) and \(\frac{2{\rm{\pi }}}{{a}_{0}}\left({\rm{\delta }}{Q}_{x},{Q}_{y}\pm {\rm{\delta }}{Q}_{y}\right)\) in Z(q, Δ1) (see Fig. 3a, b). Definite fundamental periodicities seem undetectable in these Z(r, Δ1) data. The experimental Z(r, E) image arrays have a field of view of 16 nm × 16 nm and are measured in a sequence of independent experiments on distinct crystals with p ≈ 0.06, 0.08, 0.085, 0.14, 0.20 (critical temperature of Tc = 20, 45, 50, 74, 82 K, respectively). The ANNs analyse these Z(r, Δ1) images as a function of p, focusing on the pseudogap energy E = Δ1(p) because copper oxide EQM symmetry breaking emerges at this energy17,25. Figure 4a–e shows the actual Z(r, Δ1) images provided to the trained ANN system and Fig. 4f–j shows their d-symmetry Fourier transforms. The ANNs succeed with high reliability in discriminating and identifying the periodic motifs for all of these images (Methods section ‘Validation and benchmarking’). In Fig. 4k–o we show the response of the ANNs as the probability P(C) of the input EQM image being identified in category C. Here, the ANNs reveal that, on average, the phenomenology of the training images with C = 2 and λ = 4a0 has the highest probability of being recognized within the Z(r, Δ1) image array, but only for electron densities 0.06 ≤ p ≤ 0.14. Thus, the ANNs identify a predominant translational symmetry breaking, which occurs commensurately with the specific wavelength λ = 4a0 (Fig. 4a–d). Overall, the ANNs determine that the identical, commensurate, 4a0-period electronic-structure modulations are hidden in all of the E ≈ Δ1 EQM images from the 0.06 ≤ p ≤ 0.14 area of the CuO2 phase diagram.

Fig. 4: ANN detection of the evolution of broken symmetry with electron density.

a–e, Measured 16 nm × 16 nm (440 pixels × 440 pixels) Z(r, E) images of Bi2Sr2CaCu2O8 for p = 0.06, 0.08, 0.085, 0.14, 0.20 (Tc = 20, 45, 50, 74, 82 K). Each image is measured at E = Δ1(p), the pseudogap energy at that electron density. Obviously, disorder and complexity of the copper oxide EQM abound throughout this electron density range (black double-headed arrow in Fig. 1a). f–j, The d-symmetry Fourier transforms Z(q, E) of the images in a–e. The disorder and complexity of EQM are again evident as broad fluctuating peaks around \(\frac{2{\rm{\pi }}}{{a}_{0}}\left({Q}_{x}\pm {\rm{\delta }}{Q}_{x},{\rm{\delta }}{Q}_{y}\right)\) and \(\frac{2{\rm{\pi }}}{{a}_{0}}\left({\rm{\delta }}{Q}_{x},{Q}_{y}\pm {\rm{\delta }}{Q}_{y}\right)\). Grey dots indicate the points \({\boldsymbol{q}}=\frac{2{\rm{\pi }}}{{a}_{0}}\left(\pm 0.4,0\right)\) and \({\boldsymbol{q}}=\frac{2{\rm{\pi }}}{{a}_{0}}\left(0,\pm 0.4\right)\). k–o, Output categorization of the input data shown in a–e by 81 ANNs. The numbers at the top show the fundamental wavelength of each category in units of a0. We make the statistical analysis using independent assessments on a given experimental image by 81 ANNs that are independently trained to arrive at the probability P(C) of the image belonging to category C; the error bars mark the statistical spread (one standard deviation) of P(C) (see Methods). Because the training images used for the ANNs are unidirectional (that is, their pristine orders are along the x axis), the categorization results for the modulation orientations X and Y (red and yellow bars, respectively) are obtained by inputting the Z(r, E) images and their 90°-rotated versions, respectively, to the ANNs.
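The ensemble statistics and the X/Y orientation test described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions: `make_toy_classifier` is a hypothetical stand-in for a trained ANN (it merely checks whether the first image row varies), only two classifiers are used instead of 81, and the spread is taken as the population standard deviation, which the text does not specify.

```python
import statistics


def rotate90(image):
    """Rotate a 2D list of pixel values by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*image)][::-1]


def ensemble_probability(classifiers, image):
    """Mean and spread of P(C) over independently trained classifiers."""
    outputs = [clf(image) for clf in classifiers]
    n_categories = len(outputs[0])
    mean = [statistics.mean([out[c] for out in outputs])
            for c in range(n_categories)]
    spread = [statistics.pstdev([out[c] for out in outputs])
              for c in range(n_categories)]
    return mean, spread


def orientation_response(classifiers, image):
    """P(C) for the image as given (X) and for its 90-degree rotation (Y)."""
    return {"X": ensemble_probability(classifiers, image),
            "Y": ensemble_probability(classifiers, rotate90(image))}


def make_toy_classifier(p2):
    """Hypothetical classifier: favours category 2 if the first row varies."""
    def classify(image):
        row = image[0]
        if max(row) != min(row):
            rest = (1.0 - p2) / 3.0
            return [rest, p2, rest, rest]
        return [0.25, 0.25, 0.25, 0.25]
    return classify


classifiers = [make_toy_classifier(0.7), make_toy_classifier(0.5)]
stripes = [[0, 1, 0, 1] for _ in range(4)]  # modulation along x only
response = orientation_response(classifiers, stripes)
```

For the toy stripe image, the category-2 probability is high only in the X orientation, which mirrors how a unidirectional modulation would appear as an imbalance between the red and yellow bars.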

A second key physics issue is the energy dependence within a Z(r, E) image array. Quasiparticle scattering interference17 (QPI) occurs when an impurity atom scatters wave-like states ki(E) into kf(E), resulting in quantum interference at wavevectors Qif = ki(E) − kf(E) and generating modulations of N(r, E) or its Fourier transform N(Qif, E). QPI is a physical phenomenon distinct from a DW state because, whereas the modulation wavevectors of QPI evolve rapidly with E, they do not for a DW state. Therefore, the ANNs explore a Bi2Sr2CaCu2O8 Z(r, E) array of 16 nm × 16 nm EQM images that are measured in a sequence of independent experiments at distinct electron energies of E = 66, 96, 126, 150 meV on the same crystal with p = 0.08. Figure 5a–d shows this Z(r, E) image set that is input to the same ANN system. EQM complexity in the identical field of view now evolves rapidly with electron energy because the images are dominated by QPI. Similarly, Fig. 5e–h shows the d-symmetry Fourier transforms Z(q, E) from Fig. 5a–d, which exhibit broad fluctuating peaks that evolve rapidly with electron energy, as expected in QPI. Well-defined fundamental periodicities are indiscernible in these Z(r, E) (Fig. 5a–d) and Z(q, E) (Fig. 5e–h) data. However, Fig. 5j–l demonstrates that the ANN suite finds the hypothesis category with the highest recognition probability to be again C = 2, which means that the predominant modulations have a period of 4a0 for all energies exceeding 66 meV (Fig. 5b–d). Again, despite intense masking by QPI phenomena, the ANNs recognize commensurate, 4a0-period DW modulations and reveal that these occur predominantly near the pseudogap energy E = Δ1.

Fig. 5: ANN detection of broken symmetry at different electron energies.

a–d, Measured 16 nm × 16 nm (440 pixels × 440 pixels) Z(r, E) images of Bi2Sr2CaCu2O8 at electron energies E = 66, 96, 126, 150 meV for p = 0.08 (Tc = 45 K). In the same field of view, EQM complexity evolves rapidly with electron energy, a purely quantum mechanical effect. e–h, The d-symmetry Fourier transforms Z(q, E) of the images shown in a–d. The disorder and complexity of EQM are strong, as seen in the broad fluctuating peaks around \(\frac{2{\rm{\pi }}}{{a}_{0}}\left({Q}_{x}\pm {\rm{\delta }}{Q}_{x},{\rm{\delta }}{Q}_{y}\right)\) and \(\frac{2{\rm{\pi }}}{{a}_{0}}\left({\rm{\delta }}{Q}_{x},{Q}_{y}\pm {\rm{\delta }}{Q}_{y}\right)\), but now δQx and δQy evolve rapidly with electron energy (another quantum mechanical effect). Grey dots are at \({\boldsymbol{q}}=\frac{2{\rm{\pi }}}{{a}_{0}}\left(\pm 0.4,0\right)\) and \({\boldsymbol{q}}=\frac{2{\rm{\pi }}}{{a}_{0}}\left(0,\pm 0.4\right)\). i–l, Output categorization of the input data shown in a–d by 81 ANNs. The statistical analysis is as in Fig. 4k–o.

A third ANN discovery in Fig. 5i–l is that the commensurate, 4a0-period modulations exhibit a strong preference for breaking symmetry under 90° rotations (C4). This is revealed because the ANN array yields up to three times higher probability in that specific category (C = 2) when the data are input in the x orientation (red) compared to when identical data are provided in the y orientation (yellow) (Fig. 5j–l). Although the extreme nanoscale disorder masks any directional preference in the images of Fig. 5a–d, the DW modulations occur primarily along the x axis of the CuO2 plane. ANN analysis of the energy dependence of this complete Z(r, E) image array, shown in Extended Data Fig. 1, further confirms that the appearance of this nematicity (Fig. 5i–l) occurs when approaching the pseudogap energy scale, which is |Δ1| ≈ 80 meV. Thus, the ANNs find that a nematic state emerges at the pseudogap energy because of highly disordered, but unidirectional, 4a0-period modulations. This discovery strongly implies that the nematic electronic structure of CuO2 is a vestigial nematic state32 whose characteristic energy gap is the pseudogap. Advanced theory32 predicts that a unidirectional DW that is reduced by disorder to extremely short coherence lengths should generate a nematic state dubbed the vestigial nematic state. Although experimental validation of this hypothesis is considered impossible using conventional Fourier transform techniques26,27, here it is demonstrably achievable by an ANN array (Fig. 5, Extended Data Fig. 1). Existence of a vestigial nematic state in carrier-doped CuO2 would provide a direct, internally consistent link between a nematic state and the unidirectional, 4a0-period DW modulations, whose energy gap is the pseudogap (Fig. 4). The evidence of the presence of a vestigial nematic emerged unexpectedly from ANN analysis of experimental image arrays not optimized for such studies; determining a complete p dependence with the ANN suite will require new measurements of appropriately optimized image arrays.

To summarize, we have developed and demonstrated a new general protocol for ML-based identification of symmetry-breaking ordered states in electronic-structure image arrays obtained from EQM visualization experiments. Our ANNs are trained to learn the defining motifs of each category, including its topological defects, and to recognize those motifs in real EQM image arrays (Fig. 1e). Despite the complexity of the hole-doped Mott insulator state, instrument distortion and noise, and the intense electronic disorder of the EQM image arrays studied (Figs. 1d, 3a, b, 4, 5), the ANNs repeatedly and reliably discover the predominant features of a specific ordered state. The signature of this state for 0.06 ≤ p ≤ 0.14 is an electronic-structure modulation that is lattice-commensurate, unidirectional, and has a d-symmetry form factor and a period of λ = 4a0 (Fig. 4). The predominance of this phenomenology (Fig. 4) implies that a strong-coupling position-based theory is central to these broken-symmetry states of carrier-doped CuO2. The ANN array also reveals that it is the λ = 4a0 DW modulations at the pseudogap energy that break the global rotational symmetry to generate a nematic state (Fig. 5, Extended Data Fig. 1). This implies that the pseudogap region of the CuO2 phase diagram (Fig. 1a) contains a vestigial nematic state. In addition, the demonstration that ANNs can process and identify specific broken symmetries of highly complex image arrays from non-synthetic experimental EQM data is a milestone for general scientific technique. Overall, these advances open the prospect of additional ML-driven scientific discovery in EQM studies.

Methods

Strong-coupling DW states

Real-space (position-based) strong-coupling theories for carrier-doped CuO2 predict lattice-commensurate, unidirectional, DWs in various electronic degrees of freedom. Among them are two candidate states that can both lead to 4a0-period modulations of the charge density and of the local density of states N(r) with wavevector Q = (2π/(4a0), 0). The first is a 4a0-period modulation in the charge density of the two oxygen sites Ox and Oy within each unit cell, with a relative phase of π between them. This is a charge DW with a d-symmetry form factor, which exists as a fundamental ordered state. The second is an 8a0-period modulation of the d-wave Cooper pair density, which can exist as a fundamental ordered state and induces a 4a0-period modulation in the charge density. These two distinct fundamental states are shown schematically in Extended Data Fig. 2a, b.

Fourier transform analysis of EQM images: disorder and information loss

A Fourier transform (FT) of two-dimensional image data is a linear transformation of the data. All the information in the original image appears in the full, complex FT throughout reciprocal space. Importantly, when there are complicated local patterns or motifs of short-range order at the atomic scale in real space, that information gets spread over all of reciprocal space. This is because what is extremely local in real space becomes completely delocalized in reciprocal space. However, in the traditional mode of FT analysis, a compact region in reciprocal space is selected as a region of importance because the intensity is peaked at that point. Crucially, this approach discards abundant information throughout reciprocal space away from the peak-intensity wavevector. For hole-doped CuO2, the real-space electronic structure at the atomic scale is uniquely complex (Fig. 1). For instance, one always finds that a scanning tunnelling microscopy (STM) image whose FT peak intensity occurs away from Q = 0.25 (see Extended Data Fig. 3a) hosts distinct local motifs that are commensurate with the lattice (see Extended Data Fig. 3b). Because any information that is local in position space gets spread over all reciprocal space, when one discards much of the data throughout reciprocal space, crucial insights contained in atomic-scale electronic-structure motifs, discommensurations and topological defects are lost. On the other hand, because of the versatility of ANNs in capturing any function whatsoever33, the new ML approach allows one to impartially inspect the entirety of the data in each STM image with no loss of information. This is a key distinction between the traditional FT approach and the ML approach, which exhaustively analyses all of the data throughout real space.
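The statement that local real-space information is spread over all of reciprocal space can be checked with a one-dimensional toy example. The Python sketch below is only an illustration: it uses a direct discrete Fourier transform on a 16-site chain (a hypothetical size chosen for readability), and compares a single-site motif with a pure period-4 cosine.

```python
import cmath
import math


def dft(signal):
    """Direct discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * cmath.pi * k * x / n)
                for x in range(n))
            for k in range(n)]


N_SITES = 16  # hypothetical chain length, in lattice sites

# A motif localized on a single site.
local_motif = [0.0] * N_SITES
local_motif[5] = 1.0

# A perfectly periodic modulation with a period of 4 sites.
periodic = [math.cos(2.0 * math.pi * x / 4) for x in range(N_SITES)]

motif_spectrum = [abs(c) for c in dft(local_motif)]
periodic_spectrum = [abs(c) for c in dft(periodic)]
```

The periodic modulation puts all of its weight into two Fourier components (k = 4 and k = 12 for this chain length), whereas the single-site motif contributes the same magnitude at every k, so a window around one peak retains only a small part of it.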

Generation of training image sets

The diversification of synthetic images of a unidirectional DW to create a training image set (see Extended Data Fig. 4) starts from components with d-wave and s-wave form factors (DFF and SFF, respectively) and includes (1) heterogeneity through independent amplitude and phase fluctuations and (2) topological defects or dislocations in DFF. For categories C = 1, 2, 3, 4 (with representative wavelength λC) the DFF \(\left({I}_{C,{\rm{f}},{\rm{d}}}^{{\rm{DFF}}}\right)\) and SFF \(\left({I}_{{\rm{f}}}^{{\rm{SFF}}}\right)\) modulations with noise models are

$$\begin{array}{l}{I}_{C,{\rm{f,d}}}^{{\rm{DFF}}}\left(x,y\right)={A}_{{\rm{DFF}}}\left[1+{\varepsilon }_{{\rm{A}}}{A}_{{\rm{f}}}\left(x,y\right)\right]{A}_{{\rm{d}}}\left(x,y\right){\rm{\cos }}\left[2{\rm{\pi }}x/{\lambda }_{C}+{\varepsilon }_{\phi }{\phi }_{{\rm{f}}}\left(x,y\right)+{\phi }_{{\rm{d}}}\left(x,y\right)+{\phi }_{{\rm{DFF}}}\right]\\ {I}_{{\rm{f}}}^{{\rm{SFF}}}\left(x,y\right)={A}_{{\rm{SFF}}}\left[1+{\varepsilon }_{{\rm{A}}}{A}_{{\rm{f}}}\left(x,y\right)\right]{\rm{\cos }}\left[{\varepsilon }_{\phi }{\phi }_{{\rm{f}}}\left(x,y\right)+{\phi }_{{\rm{SFF}}}\right]\end{array}$$
(1)

with overall constants ADFF = 1, ASFF = 0.5 and phase offsets φDFF = π/4, φSFF = 0. Here the amplitude field Af(x, y) and the phase field φf(x, y) represent smooth fluctuations (different random realizations in \({I}_{C,{\rm{f,d}}}^{{\rm{DFF}}}\left(x,y\right)\) and \({I}_{{\rm{f}}}^{{\rm{SFF}}}\left(x,y\right)\)), and Ad(x, y) and the phase field φd(x, y) denote dislocation defects. For each category, we generate different realizations labelled as f and d. For each f realization the Af(x, y) field is a two-dimensional Gaussian fluctuation field with spatial length scale ξA = 8a0, normalized between −1 and 1, while φf(x, y) is a two-dimensional Gaussian fluctuation field with the same spatial length scale, ξφ = 8a0, normalized between −π and π. The values of the correlation length scales ξA and ξφ are determined by a simple analysis of an SISTM Z(q, E) FT (Fig. 3). The strengths of the amplitude and phase fluctuations εA = 0.8 and εφ = 0.5, respectively, are also chosen to produce images in rough consistency with a typical Z(r, E). In each image, there are nd = 2 dislocations at random positions ri = (xi, yi), i = 1,…, nd, with windings wi = ±2π and total winding 0. The total dislocation-contributed fields are

$$\begin{array}{l}{A}_{{\rm{d}}}\left(x,y\right)=\prod _{i=1}^{{n}_{{\rm{d}}}}\left[1-{\rm{\exp }}\left(-| {\boldsymbol{r}}-{{\boldsymbol{r}}}_{i}| /{\xi }_{{\rm{d}}}\right)\right]\\ {\phi }_{{\rm{d}}}\left(x,y\right)=\sum _{j=1}^{{n}_{{\rm{d}}}}{\rm{\arg }}\left[{\rm{sgn}}\left({w}_{j}\right)\left(x-{x}_{j}\right)+{\rm{i}}\left(y-{y}_{j}\right)\right]\end{array}$$

where r = (x, y) and the amplitude recovery length is ξd = a0, as determined from Z(r, E).
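The dislocation fields and the d-form-factor modulation defined above can be evaluated point by point, as in the following minimal Python sketch. It is an illustration under stated assumptions rather than the generation code used here: positions are in units of a0, each dislocation is a hypothetical tuple (x_j, y_j, sign of w_j), the smooth fields Af and φf are passed in as plain numbers because the Gaussian random-field generation is omitted, and `winding` is an added helper that numerically checks the phase winding around a closed loop.

```python
import math


def dislocation_amplitude(x, y, cores, xi_d=1.0):
    """A_d(x, y): product over cores of 1 - exp(-|r - r_i| / xi_d)."""
    amplitude = 1.0
    for xc, yc, _sign in cores:
        amplitude *= 1.0 - math.exp(-math.hypot(x - xc, y - yc) / xi_d)
    return amplitude


def dislocation_phase(x, y, cores):
    """phi_d(x, y): sum over cores of arg[sgn(w_j)(x - x_j) + i(y - y_j)]."""
    return sum(math.atan2(y - yc, sign * (x - xc)) for xc, yc, sign in cores)


def dff_modulation(x, y, wavelength, cores, a_f=0.0, phi_f=0.0,
                   eps_a=0.8, eps_phi=0.5, a_dff=1.0, phi_dff=math.pi / 4):
    """d-form-factor modulation of Eq. (1) at one point (x, y)."""
    amplitude = (a_dff * (1.0 + eps_a * a_f)
                 * dislocation_amplitude(x, y, cores))
    phase = (2.0 * math.pi * x / wavelength + eps_phi * phi_f
             + dislocation_phase(x, y, cores) + phi_dff)
    return amplitude * math.cos(phase)


def winding(cores, radius, steps=360):
    """Accumulated change of phi_d around a circle centred on the origin."""
    total = 0.0
    previous = dislocation_phase(radius, 0.0, cores)
    for n in range(1, steps + 1):
        angle = 2.0 * math.pi * n / steps
        current = dislocation_phase(radius * math.cos(angle),
                                    radius * math.sin(angle), cores)
        # Wrap each increment into [-pi, pi) to step across branch cuts.
        total += (current - previous + math.pi) % (2.0 * math.pi) - math.pi
        previous = current
    return total


# Two dislocations of opposite winding, as in the training images (n_d = 2).
pair = [(-2.0, 0.0, +1), (2.0, 0.0, -1)]
```

A loop enclosing a single core accumulates ±2π, while a loop enclosing the pair accumulates zero, consistent with the requirement of total winding 0.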

Then, the training set for each category C combines the different form factor components into the image intensity at pixel position (x, y) in units of a0 through IC(x, y) = IC,DFF(x, y)D(x, y) + ISFF(x, y)S(x, y) using atomic masks. The SFF mask S(x, y) is a sum of two-dimensional Gaussians with maxima equal to 1 and spatial widths equal to 0.35a0, each located at a Cu atom position (x and y integers), whereas the DFF mask D(x, y) is a sum of positive Gaussians at the locations Ox and negative ones at Oy. The total intensity IC(x, y) of all simulated images is normalized to take values between 0 and 1. All simulated images have six pixels per a0 and contain 2 × 86 × 86 unit cells, to give a total size of 516 × 516 pixels.

Configuration of ANN

In a feed-forward fully connected ANN, the neurons form a layered structure and the output of each neuron is sent to all of the neurons in the subsequent layer. Each neuron j in the first layer assesses the input, organized as a vector x = {xi} with weight matrix w = {wji} and bias vector b = {bj}, and determines the output through a nonlinear transformation \(f({\boldsymbol{w}}\cdot {\boldsymbol{x}}+{\boldsymbol{b}})\), called the activation function. The bias bj and the weights wji are the parameters of the ANN and are adjusted during the training. The activation function usually takes the form of the sigmoid function or the rectified linear unit (see the inset of Extended Data Fig. 5a). For input image data x = {xi} and an ANN with a single hidden layer of neurons (labelled by j), the neurons in the output layer corresponding to the different categories receive the inputs

$${\sigma }_{c}^{^{\prime} }=\sum _{j}{w}_{cj}^{(2)}f\left(\sum _{i}{w}_{ji}^{(1)}{x}_{i}+{b}_{j}^{(1)}\right)+{b}_{c}^{(2)}$$

We also use a softmax function for the output layer for normalized ANN outputs \({\sigma }_{c}={\text{e}}^{{\sigma }_{c}^{^{\prime} }}/\displaystyle {\sum }_{c}{\text{e}}^{{\sigma }_{c}^{^{\prime} }}\) that allows a probabilistic interpretation for the different categories denoted by the subscript c.
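A minimal sketch of this forward pass, written with NumPy, is given below. The function names, the random weights and the reduced input size are hypothetical. Subtracting the largest σ′c before exponentiating is a standard numerical safeguard that leaves the softmax output unchanged.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w1, b1, w2, b2, f=sigmoid):
    """Single-hidden-layer ANN: returns the softmax-normalized outputs sigma_c."""
    hidden = f(w1 @ x + b1)                # f(sum_i w_ji^(1) x_i + b_j^(1))
    sigma_prime = w2 @ hidden + b2         # inputs received by the output neurons
    exponent = np.exp(sigma_prime - sigma_prime.max())
    return exponent / exponent.sum()       # softmax over the categories c

# Hypothetical sizes: 36 input pixels, 50 hidden neurons, 4 categories.
rng = np.random.default_rng(0)
n_in, n_hidden, n_cat = 36, 50, 4
w1, b1 = rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden)
w2, b2 = rng.normal(size=(n_cat, n_hidden)), np.zeros(n_cat)
sigma = forward(rng.random(n_in), w1, b1, w2, b2)
```

The outputs are positive and sum to 1, which is what permits the probabilistic reading of σc.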

For supervised ML, we divide the dataset into a training set containing 90% of the images and a validation set containing the remaining 10%, which is used for unbiased validation, speed control and overfitting detection during the training. The weights and biases of the ANN are optimized using stochastic gradient descent to minimize the cross-entropy cost function

$$\mathrm{C.E.}=-\frac{1}{N}\sum _{{\boldsymbol{x}}}\sum _{c=1}^{4}\left\{{y}_{c}({\boldsymbol{x}})\ln \left[{\sigma }_{c}({\boldsymbol{x}})\right]+\left[1-{y}_{c}({\boldsymbol{x}})\right]\ln \left[1-{\sigma }_{c}({\boldsymbol{x}})\right]\right\}$$

Here x is a training data input vector labelled with a specific category yc(x) (for example, yc(x) = δc,1 if the training data belong to category 1) and σc(x) is the actual output of the ANN. The training process essentially minimizes the difference between the ANN output and the label.
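The cost function can be evaluated as in the sketch below, which includes the overall minus sign that makes the cross-entropy a non-negative quantity to be minimized. The function name, the clipping constant and the example outputs are hypothetical.

```python
import numpy as np

def cross_entropy(sigma, y, eps=1e-12):
    """Cross-entropy averaged over N images; sigma and y have shape (N, categories)."""
    sigma = np.clip(sigma, eps, 1.0 - eps)  # keep the logarithms finite
    per_image = np.sum(y * np.log(sigma) + (1.0 - y) * np.log(1.0 - sigma), axis=1)
    return -per_image.mean()

# Hypothetical example: two images labelled as categories 1 and 3.
labels = np.eye(4)[[0, 2]]
confident = np.array([[0.97, 0.01, 0.01, 0.01],
                      [0.01, 0.01, 0.97, 0.01]])
undecided = np.full((2, 4), 0.25)
cost_confident = cross_entropy(confident, labels)
cost_undecided = cross_entropy(undecided, labels)
```

An output that matches the label gives a cost close to zero, whereas an output spread evenly over the four categories gives −[ln(1/4) + 3 ln(3/4)] per image.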

We use a batch size of 50 and L2 regularization to avoid overfitting. We include 50 neurons in the hidden layer and choose the sigmoid function as the neuron activation function unless stated otherwise. In Extended Data Fig. 5a we show examples of the cost function, as well as the accuracy on the validation dataset, for the sigmoid and the rectified linear unit activation functions during the training. Extended Data Fig. 5b shows the achieved accuracy and cross-entropy cost after 25 epochs as a function of the number of neurons in a single hidden layer. We trained 81 ANNs with random initial conditions by using a stochastic training process. The outputs of the finalized ANNs are robust and quantitatively consistent with each other. The results provided in the main text show the average and standard deviations from all 81 ANNs. To verify that our results are robust against changes to the architecture of the ANN, we trained six ANNs with 100 neurons in a single hidden layer and six ANNs with two hidden layers, and we found that the results agree within error bars.

Because they are drawn from a historic image-array archive not designed for ML-based studies, the SISTM image arrays Z(r, E) vary in spatial resolution from sample to sample from 1.7 to 11.5 pixels per average Cu–Cu distance. The number of CuO2 unit cells in the experimental images also varies from 2 × 55 × 55 to 2 × 175 × 175. The Cu and Ox,y atom positions, registered by the topograph, show random distortions of the lattice due to the STM tip-drift effect (Extended Data Fig. 6a).

To correct for the drift and standardize all of the Z(r, E) images, we prepare each Z(r, E) as follows: (1) using interpolation, we map the Z(r, E) image onto an input image in which each topographic atom position lies on a perfect atomic lattice with a Cu–Cu distance of a0 = 6 pixels (see Extended Data Fig. 6b, c), which both corrects the drift effect and standardizes the spatial resolution; (2) we crop or tile the image to a size of 516 × 516 pixels; (3) to study the degree of unidirectionality, for each input image we create a copy rotated by 90°, because the training images have modulations only along the x direction for simplicity and clarity. Extended Data Fig. 7 shows the Z(r, E) and prepared input data at different dopings of Bi2Sr2CaCu2O8. It should be noted that the results are reliable only if the test data lie reasonably well within the input space spanned by the synthetic training sets.
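Steps (2) and (3) can be sketched as follows for an image that has already been mapped onto the perfect lattice in step (1). The interpolation step itself is not shown. The function names are hypothetical, and we assume that tiling means periodic repetition of the image, which preserves lattice registry only when the image dimensions are multiples of 6 pixels.

```python
import numpy as np

def crop_or_tile(image, size=516):
    """Tile the image periodically if it is too small, then crop to size x size."""
    repeats = [-(-size // n) for n in image.shape]  # ceiling division per axis
    return np.tile(image, repeats)[:size, :size]

def prepare_inputs(image, size=516):
    """Return the standardized image and its copy rotated by 90 degrees."""
    standard = crop_or_tile(image, size)
    return standard, np.rot90(standard)

# Hypothetical drift-corrected image with 55 x 55 Cu sites at 6 pixels per a0.
rng = np.random.default_rng(1)
corrected = rng.random((330, 330))
input_x, input_y = prepare_inputs(corrected)
```

Feeding both orientations to an ANN trained only on modulations along the x direction allows the outputs for the two copies to be compared as a measure of unidirectionality.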

Validation and benchmarking

To assess the discriminatory power of the ANN categorization, we study obvious modulations in two experimental images (Extended Data Fig. 8): (1) the topograph of Bi2Sr2CaCu2O8, which has no discernible modulation except for the Cu atomic lattice (an SFF at Q = 0); (2) the Z(r, E) image of Ca2−xNaxCuO2Cl2 (NaCCOC), with obvious commensurate 4a0-period modulations, which are apparent in a DFF FT. The ANN indeed finds that no category is particularly compelling for the topograph of Bi2Sr2CaCu2O8 in Extended Data Fig. 8a (see Extended Data Fig. 8b). By contrast, the ANN finds category 2 to be the most compelling for the Z(r, E) image of NaCCOC in Extended Data Fig. 8c (see Extended Data Fig. 8d).

We also checked the robustness of our approach against the existence of Bi2Sr2CaCu2O8 superlattice modulations. The assessment of the ANNs was independent of the existence or absence (data with superlattice modulation removed from the FT) of the superlattice modulations.

We further tested the robustness of the ANN decisions against changes in the disorder model. For this we trained a new ANN with a training set generated with different disorder parameters. Specifically, we decreased the amplitude fluctuation intensity εA by 13% and the phase fluctuation intensity εφ by 20% while making the disorder profiles vary more rapidly in space by decreasing the correlation lengths ξA and ξφ by 6%. By repeating the assessment of the experimental data shown in Fig. 4k–m, o and Extended Data Fig. 1a with the new ANN, we find that the results remain unchanged. This is shown by the comparison between Fig. 4k–m, o and Extended Data Fig. 1a (shown in Extended Data Fig. 9a–e) and the output from the ANN trained with the new disorder model (shown in Extended Data Fig. 9f–j). The results show preference for the commensurate period 4a0 for systems with 0.06 < p < 0.14 (Extended Data Fig. 9a–d, f–i) and complete confusion over different candidate categories for p = 0.2 (Extended Data Fig. 9e, j). The energy dependence comparison between the ANN assessments presented in the main text (Extended Data Figs. 1a, 9e) and the assessments of the ANN trained with the altered disorder model (Extended Data Fig. 9j) shows that the tie between the onset of preference for the commensurate period 4a0 and the nematicity at the pseudogap energy scale is robust against variations in the disorder model used to train ANNs.

Discommensurations and maximum intensity wavevector

In ref. 28, we carried out FT-based linear analysis of equivalent data using the fact that the power spectral density was not smoothly distributed (Extended Data Fig. 10a, b). We introduced the concept of the demodulation residue (DR), using

$${{\boldsymbol{R}}}_{{\boldsymbol{q}}}^{\alpha }(\psi )\equiv \int {{\rm{d}}}^{2}{\boldsymbol{r}}\frac{1}{{L}^{2}}{\rm{R}}{\rm{e}}[{{\rm{\Psi }}}_{{\boldsymbol{q}}}^{\ast }({\boldsymbol{r}})(-i{{\rm{\partial }}}_{\alpha }){{\rm{\Psi }}}_{{\boldsymbol{q}}}({\boldsymbol{r}})]$$

where α = x, y indicates the component of the DR vector, r is the position vector in the image and L is the linear dimension of the image. The DR measures the phase fitness of the q-modulation in the spatial pattern ψ(r) through the filtered FT

$${\Psi }_{{\boldsymbol{q}}}\left({\boldsymbol{k}}\right)=\exp \left[-\frac{{\left({\boldsymbol{k}}-{\boldsymbol{q}}\right)}^{2}}{2{\Lambda }^{2}}\right]\exp \left(-i{\boldsymbol{q}}\cdot {\boldsymbol{x}}\right)\widetilde{\psi }\left({\boldsymbol{q}}+{\boldsymbol{k}}\right)\tag{2}$$

where \(\widetilde{\psi }\left({\boldsymbol{k}}\right)\) is the FT of the data and Λ is the Fourier cutoff. By minimizing the DR, \({R}_{{\boldsymbol{q}}}\left(\psi \right)\equiv \sqrt{{\left[{{\boldsymbol{R}}}_{{\boldsymbol{q}}}^{x}\left(\psi \right)\right]}^{2}+{\left[{{\boldsymbol{R}}}_{{\boldsymbol{q}}}^{y}\left(\psi \right)\right]}^{2}}\), for a given modulation while considering different q-modulations, we showed that one can obtain the phase-averaged wavevector \(\bar{{\boldsymbol{Q}}}\) of DW modulations. Within the limits of the FT, which is a linear-basis transform, this approach facilitated dealing with situations in which the amplitude does not show well-defined peaks owing to severe disorder.

However, this approach has limitations because the FT is a linear transformation of the basis and is useful only when the desired phenomenon has sharp features in the new basis, that is, the wavevector basis. When there are randomly placed, highly disordered patches of a real-space DW pattern with sprinkles of topological defects, FT-based methods perform very poorly. One would not attempt an FT in trying to recognize human faces in an image for precisely this reason. The limitation of FT-based methods is evident in that, even when a modulation pattern consists of a commensurate 4a0-period modulation (Q0 = 2π/(4a0)) everywhere except for a sequence of discommensurations (phase slips in the commensurate modulation pattern), the Rq(ψ) minimization (as well as the FT amplitude maximization) incorrectly identifies an apparent period of \(\bar{Q}=0.3\times 2{\rm{\pi /}}{a}_{0}\) (Extended Data Fig. 10e). Although in ref. 28 the DR minimization yielded \(\bar{Q}=2{\rm{\pi /}}\left(4{a}_{0}\right)\) for pseudogap energy data (a single dataset for each doping) for various dopings, this depended critically on visual inspection to identify commensurate patches in supplementary figure 6B of ref. 28 (see also Extended Data Fig. 3). Furthermore, the DR-based approach therein was averaged over topological defects (dislocations), ignoring their role. Finally, the DR-based approach required manual choice of the Fourier cutoff, again using visual inspection of the data. Hence the entire process is time-consuming, highly labour-intensive and fraught with human perceptual bias. It is therefore not possible to study the largest SISTM image arrays with this FT approach in any consistent way, rendering it impossible to inspect the complete electron density and electron energy dependence of the largest EQM image-array archives.
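The effect of discommensurations on the FT can be illustrated with a one-dimensional toy model, sketched below under stated assumptions. The chain length, the domain size of 5a0 and the phase slip of +π/2 at every domain wall are hypothetical choices, not parameters taken from Extended Data Fig. 10. With these choices the phase advances by an extra π/2 every 5a0, so the phase-averaged wavevector is 1/4 + (1/4)/5 = 0.3 in units of 2π/a0, although the local period is exactly 4a0 everywhere.

```python
import numpy as np

n_sites = 200   # chain length in units of a0, one sample per site (hypothetical)
domain = 5      # length of each commensurate domain in units of a0 (hypothetical)
x = np.arange(n_sites)

# Locally the period is exactly 4 a0; the phase jumps by +pi/2 at each domain wall.
phase_slips = (np.pi / 2) * (x // domain)
psi = np.cos(2 * np.pi * x / 4 + phase_slips)

amplitude = np.abs(np.fft.rfft(psi))
q = np.fft.rfftfreq(n_sites)   # wavevector in units of 2*pi/a0

q_peak = q[np.argmax(amplitude)]
amplitude_at_q0 = amplitude[np.argmin(np.abs(q - 0.25))]
print(q_peak, amplitude_at_q0)
```

In this toy model the FT amplitude peaks at 0.3 × 2π/a0 and carries essentially no weight at the commensurate wavevector 0.25 × 2π/a0, so amplitude maximization reports the phase-averaged wavevector and hides the commensurate real-space structure.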

The ANN-based approach that we introduce in the main text is far more powerful, efficient and general. It does not rely on arbitrary choices, such as the cutoff Λ, or on visual selection of Fourier regions of interest, and is not tied to any basis. The ANN is inherently nonlinear and an ANN with a sufficient number of neurons can express/detect any function33. Owing to the versatility of ANNs, our ANN-based approach allows us to rapidly analyse a complete image-array dataset in its entirety, without any ad hoc Fourier filtering or selection. Hence the ANN approach is quite unbiased. Moreover, once the ANNs are trained, the automatic assessment of a new dataset takes minutes, allowing for a high-throughput analysis. It is this efficiency that allowed the discovery of the connection between the nematic state and commensurate DW state, both setting in at the pseudogap energy scale (Extended Data Fig. 1).