Main

In the last decade, mechanical devices have been brought squarely into the domain of quantum science through a series of remarkable experiments exploring the physics of measurement, transduction and sensing6,7,8,9,10,11,12,13. Two paradigms for obtaining quantum control over such systems are those of cavity optomechanics, where the position \(\hat{x}\) of a mechanical oscillator parametrically couples to a higher-frequency electromagnetic cavity6, and quantum acoustics, where an artificial atom or qubit exchanges quanta with a mechanical oscillator. The latter is the acoustic analogue of cavity or circuit quantum electrodynamics (cQED), the archetypal playground of quantum optics, which has enabled a vast range of experiments probing quantum physics14 and led to the emergence of the superconducting approach to quantum information processing15. In quantum acoustics systems, the exchange of quanta between a qubit and a mechanical oscillator is described by the Hamiltonian \({ {\hat{H}} }_{{\rm{int}}}=g({\hat{\sigma }}_{-}+{\hat{\sigma }}_{+})(\hat{b}+{\hat{b}}^{\dagger })\), where g is the coupling rate and \({\hat{\sigma }}_{+}\left({\hat{\sigma }}_{-}\right)\) and \({\hat{b}}^{\dagger }\left(\hat{b}\right)\) are the raising (lowering) operators of the qubit and mechanical modes, respectively. Strong coupling is realized when g exceeds the decoherence rates of the qubit (γ) and the mechanical mode (κ); in this limit, a single excitation can be resonantly swapped multiple times before being lost to the environment. Experiments in this regime have demonstrated quantum control of mechanical systems at the single-phonon level7,12,16, as well as preparation of higher Fock states using more elaborate protocols11.

In both cavity optomechanics and quantum acoustics, approaches to probing the phonon-number states of a mechanical resonator invariably involve swapping phonons into a resonator or qubit acting as a meter. Alternatively, we can build a measurement apparatus that directly senses the mechanical energy stored in a resonator without the need to exchange excitations13. This quantum nondemolition (QND) approach to measuring motion has numerous advantages and remains a challenge in the study of mechanical systems in the quantum regime. Cavity optomechanical approaches to QND attempt to couple the detector to the \({\hat{x}}^{2}\), instead of the \(\hat{x}\), observable of the mechanical system17, but require optomechanical coupling rates beyond the current experimental capabilities18,19 to achieve phonon-number resolution. In quantum acoustics, detuning the transition frequency of the qubit ωge = ωm + Δ (where \(\left|g\right\rangle \) is the ground state and \(\left|e\right\rangle \) is the excited state) from the mechanical frequency ωm by |Δ| ≫ g prevents the direct swap of real excitations between the two systems. Instead, it leads to an off-resonant interaction between the qubit and mechanical system that results in an energy-dependent shift of the qubit frequency induced by virtual transitions. The effective Hamiltonian

$${\hat{H}}_{{\rm{e}}{\rm{f}}{\rm{f}}}={\omega }_{{\rm{m}}}{\hat{b}}^{\dagger }\hat{b}+\frac{1}{2}({\omega }_{{\rm{g}}{\rm{e}}}+2\chi {\hat{b}}^{\dagger }\hat{b}){\hat{\sigma }}_{z}$$
(1)

where \({\hat{\sigma }}_{z}\) denotes the Pauli z operator, accurately describes the system in this off-resonant regime3. Under \({ {\hat{H}} }_{{\rm{eff}}}\), the only interaction between the two systems manifests as a qubit frequency shift \(2\chi \hat{n}\), which is dependent on the phonon number \(\hat{n}={\hat{b}}^{\dagger }\hat{b}\). For superconducting charge qubits operating in the transmon regime, the effects of the higher excited level \(\left(\left|f\right\rangle \right)\) must also be taken into account when calculating χ. The resulting expression for the dispersive coupling rate χ is distinct from that for the two-level atom case, and is given by20

$$\chi =-\frac{{g}^{2}}{\Delta }\frac{\alpha }{\Delta -\alpha }$$
(2)

where α = ωge − ωef is the transmon anharmonicity. Because \({ {\hat{H}} }_{{\rm{eff}}}\) commutes with both the phonon-number operator \(\hat{n}\) and with \({\hat{\sigma }}_{z}\), the two systems cannot exchange energy, so measurements of the qubit excited-state population \(\left(1+{\hat{\sigma }}_{z}\right)/2\) do not perturb the phonon number. Furthermore, in the limit χ ≫ max{γ, κ}, the frequency shift 2χ induced by the presence of a single phonon in the oscillator becomes resolvable in the qubit excitation spectrum (see Fig. 1 for a schematic representation). We call this the phonon-number-splitting regime, in analogy to the dispersive regime of cQED, where photons in an electromagnetic cavity3,21 or other bosonic excitations22 lead to an energy-dependent atomic-transition frequency. In cQED, the dispersive regime has been instrumental in implementing new approaches to quantum measurement and error correction5.
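As a rough numerical illustration of equation (2), the short Python sketch below evaluates χ for the detuning and anharmonicity quoted in this article. The coupling rate used here is an assumption: it is inferred from the statement Δ ≈ −6g rather than read from Extended Data Table 1, and the function and variable names are ours.

```python
def dispersive_shift(g, delta, alpha):
    """Equation (2): chi = -(g**2 / delta) * alpha / (delta - alpha).

    All arguments share one unit (here MHz, that is, angular rate / 2*pi).
    """
    return -(g ** 2 / delta) * alpha / (delta - alpha)


# Values quoted in the text, divided by 2*pi and expressed in MHz.
DELTA_MHZ = -88.0   # qubit-mechanics detuning used in the experiment
ALPHA_MHZ = 138.0   # transmon anharmonicity, alpha = w_ge - w_ef

# ASSUMPTION: g is inferred from "Delta ~ -6g"; the measured value for the
# mode used in the experiment is listed in Extended Data Table 1.
G_MHZ = abs(DELTA_MHZ) / 6.0

chi_mhz = dispersive_shift(G_MHZ, DELTA_MHZ, ALPHA_MHZ)

# Two-level-atom result g^2/Delta, recovered from equation (2) when
# |alpha| is much larger than |Delta|; shown only for comparison.
two_level_chi_mhz = G_MHZ ** 2 / DELTA_MHZ

print(f"chi/2pi (transmon, eq. 2) = {chi_mhz:.2f} MHz")
print(f"2*chi/2pi                 = {2 * chi_mhz:.2f} MHz")
print(f"chi/2pi (two-level atom)  = {two_level_chi_mhz:.2f} MHz")
```

With these inputs the magnitude of 2χ/2π comes out close to 3 MHz, and the finite anharmonicity reduces |χ| relative to the two-level-atom estimate.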

Fig. 1: Phonon-number-splitting scheme.

The state of a mechanical oscillator is described in quantum mechanics by a linear superposition of equally spaced energy eigenstates \(\left|n\right\rangle \), each representing a state of n phonons in the system. This quantized structure is normally not resolvable because all of the transitions between the energy levels occur at the same frequency ωm. By coupling the resonator to a qubit of transition frequency ωge with a rate of g, we cause splitting in the qubit spectrum that is parameterized by the dispersive coupling rate χ. This allows us to distinguish between the different phonon-number states that are present in the oscillator.

Several technical hurdles have prevented phonon-number splitting from being observed in quantum acoustics. The sub-micrometre wavelength of gigahertz-frequency acoustic phonons—much smaller than the scale of the electrodes comprising the qubit circuits—leads to an enormous phonon mode density being accessible at the qubit transition frequency. Uncontrolled coupling to phonons is a known source of dissipation in cQED23 and represents a major challenge in combining qubits with the strong piezoelectrics needed for phonon sensing and control. Systems designed to have large coupling rates tend to couple strongly to parasitic modes, reducing the overall coherence of the qubits as well as the availability of viable operating frequencies. Approaches to mitigating these losses have included tunable couplers that isolate the qubit from the piezoelectric material by rapidly turning off the coupling after interaction12, and bulk wave resonators in which the participation ratio of the qubit electric field with the piezoelectric is reduced11,16. Our approach avoids sacrificing coupling, while maintaining qubit coherence. We reduce the density of accessible mechanical modes that the qubit can radiate into by confining the phonons to a very small piezoelectric region where only a few mechanical modes are present at the frequencies of interest. Ordinarily, the leakage of phonons out of this region through its supporting anchors would lead to rapid decoherence of both the mechanical system and the qubit. To realize leakage-free anchors, we create a periodic patterning of the elastic material that opens a phononic bandgap. Equivalently, we can view the mechanical resonator as a defect in a phononic crystal bandgap material that is etched into a piezoelectric film.

To fabricate the chip-scale system, we integrate microwave Josephson junction qubits (aluminium on high-resistivity silicon) with piezoelectric nanomechanical devices patterned from thin-film lithium niobate (LiNbO3; LN) (see Methods for fabrication details). As seen in Fig. 2, our system utilizes a transmon qubit of the type presented in ref. 24, which is controlled via on-chip microwave lines and read out dispersively using a coplanar waveguide microwave readout resonator. The transmon is coupled to an array of one-dimensional phononic crystal defect resonators through the piezoelectricity of LN. Each mechanical structure consists of a narrow, suspended beam of patterned LN (Fig. 2b) with a periodicity of a = 1 μm that opens a complete bandgap in the region ~2−2.4 GHz. The defect site at the centre of the phononic crystal supports highly confined mechanical modes with frequencies that lie within the bandgap (Fig. 2c, d). To address these modes, we place aluminium electrodes directly on top of the phononic crystal anchors. With one terminal grounded and another terminal contacting the transmon, the voltage fluctuations of the qubit create an electric field in the defect site, which is linearly coupled to its mechanical deformation by the piezoelectric effect. The structure is designed such that at least one of the localized modes generates a polarization that is aligned with the electric field produced by the electrodes (see Methods for design details).

Fig. 2: Fabricated device.

a, False-colour optical micrograph of the device, showing the readout resonator (purple), transmon qubit (green) and nanomechanical resonators (white box). The qubit flux control (Z) and excitation (XY) lines are shown in white. b, False-colour scanning electron micrograph of the suspended resonators. Each resonator consists of a defect site embedded in a phononic crystal that supports a complete phononic bandgap in the frequency range ~2−2.4 GHz. The structures are fabricated from a 250-nm-thick film of lithium niobate (dark blue) that is suspended above a silicon substrate, and are coupled to the qubit via thin aluminium electrodes (light blue) that address the defect modes. We form a connection between the electrodes and the qubit using superconducting bandages, which are visible as small squares at the edges of the LN-supporting slabs. c, Scanning electron micrograph of a phononic crystal defect. d, Finite-element method simulation of a mechanical defect mode, showing the localized deformation of the structure and the electrostatic potential ϕ(r) (colour scale) generated through the piezoelectricity of LN.

We first probe the mechanical resonances by measuring the qubit excitation spectrum as we tune its transition frequency ωge across the phononic bandgap region. Here, frequency control is provided by a magnetic flux applied via an on-chip flux line, and the qubit is excited using a dedicated charge line. The state of the qubit is measured by using its dispersive interaction with the microwave readout resonator. We scatter a pulse off the resonator and monitor the transmitted complex voltage (amplitude and phase) as we sweep the frequency of the spectroscopy pulse. After subtracting the voltage transmitted with the qubit in its ground state, the amplitude of the resulting signal is directly proportional to the qubit excited-state population. The results of these measurements are shown in Fig. 3a, where we observe a series of anticrossings corresponding to various defect modes. From these data we obtain the frequencies \(\left\{{\omega }_{{\rm{m}}}^{(i)}\right\}\) and coupling rates {gi} of the five most strongly coupled modes, each corresponding to an individual resonator in the array. We measure coupling rates in the range g/2π = 13−16 MHz, in fairly good agreement with finite-element simulations (see Methods). We also observe a set of anticrossings corresponding to a small number of additional, weakly coupled defect modes. For the phonon-number-splitting measurements presented later, we use the highest-lying mechanical mode at \({\omega }_{{\rm{m}}}^{(1)}/2{\rm{\pi }}=2.405\;{\rm{GHz}}\), for which we perform a ringdown measurement to find its decay rate, κ/2π = 370 kHz. Next, to characterize the coherence of the qubit we tune it to ωge/2π = 2.301 GHz, sufficiently far from all of the mechanical modes, and measure a qubit energy relaxation time of T1 = 1.14 μs and a total qubit linewidth of γ/2π ≈ 600 kHz. Finally, we extract the qubit anharmonicity α/2π = 138 MHz using a two-tone spectroscopy measurement of the \(\left|g\right\rangle \to \left|e\right\rangle \) and \(\left|e\right\rangle \to \left|f\right\rangle \) transitions. Altogether, these parameters place the system deep in the strong-coupling regime (g ≫ κ, γ) and open up the possibility of observing phonon-number splitting, with an expected dispersive shift of 2χ/2π ≈ 3 MHz.
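The anticrossings used to extract the coupling rates can be illustrated with the textbook two-coupled-mode model, sketched below in Python. This is a simplified picture, assuming a single mechanical mode, the rotating-wave approximation and the single-excitation manifold; it is not the fitting procedure used for Fig. 3. The mode frequency and coupling are the values quoted in the Fig. 3b caption, and the function name is ours.

```python
import math


def dressed_frequencies(f_qubit, f_mech, g):
    """Normal-mode frequencies of two coupled modes (single excitation, RWA).

    f_pm = (f_q + f_m)/2 +/- sqrt(g**2 + ((f_q - f_m)/2)**2)
    All arguments share one unit (here MHz).
    """
    mean = 0.5 * (f_qubit + f_mech)
    half_gap = math.sqrt(g ** 2 + (0.5 * (f_qubit - f_mech)) ** 2)
    return mean - half_gap, mean + half_gap


F_MECH_MHZ = 2257.0  # mechanical mode shown in Fig. 3b
G_MHZ = 15.2         # coupling rate quoted in the Fig. 3b caption

# Sweep the bare qubit frequency through the mechanical mode.
for f_q in (2200.0, 2240.0, 2257.0, 2274.0, 2300.0):
    lower, upper = dressed_frequencies(f_q, F_MECH_MHZ, G_MHZ)
    print(f"f_q = {f_q:7.1f} MHz -> branches {lower:8.2f}, {upper:8.2f} MHz")

# On resonance the two branches are separated by exactly 2g.
lower_res, upper_res = dressed_frequencies(F_MECH_MHZ, F_MECH_MHZ, G_MHZ)
print(f"splitting on resonance = {upper_res - lower_res:.1f} MHz")
```

In this model the minimum separation of the two branches is 2g, which is why a vertical slice at zero detuning gives direct access to the coupling rate.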

Fig. 3: Qubit spectroscopy and mechanical-mode structure.

a, Qubit spectrum as a function of the externally applied magnetic flux, Φe (Φ0 is the magnetic flux quantum). The arrows indicate the strongly coupled mechanical modes associated with each of the five phononic crystal defect resonators. The qubit frequency at the flux sweet spot (Φe = 0) is \({\omega }_{{\rm{ge}}}^{{\rm{(max)}}}/2{\rm{\pi }}=2.417\;{\rm{GHz}}\), in close proximity to the highest-lying mechanical mode at \({\omega }_{{\rm{m}}}^{(1)}/2{\rm{\pi }}=2.405\;{\rm{GHz}}\) that is used for the phonon-number-splitting experiment. A small number of weakly coupled features are present in the spectrum, corresponding to additional localized defect modes. a.u., arbitrary units. b, Close-up of the anticrossing with the mechanical mode at 2.257 GHz (dashed box in a). The vertical slice at zero detuning is shown in white to the right, and is used to calculate a coupling rate g/2π = 15.2 MHz.

To observe phonon-number splitting, we perform a pump–probe measurement consisting of a short phonon excitation pulse followed by a longer qubit spectroscopy pulse (Fig. 4 inset), along with a readout pulse at the end to infer the qubit excited-state population as described earlier. The phonon pulse is sent to the XY line of the qubit. Because the qubit and the mechanical system are weakly hybridized when far-detuned, the pulse drives the mechanical system into an approximately coherent state. The duration τ of the spectroscopy pulse is chosen to balance two competing effects: the pulse bandwidth ~1/τ needs to be sufficiently small to resolve the narrowest spectroscopic features, which have a width of about γ in our system, while τ cannot be much longer than the phonon lifetime, 1/κ, because the mechanical mode must remain excited during the measurement. Of the two requirements, \(\tau \gg {\gamma }^{-1}=270\;{\rm{ns}}\) and \(\tau \lesssim {\kappa }^{-1}=430\;{\rm{ns}}\), the first is necessary to observe phonon-number splitting and the second determines the effective size of the observed mechanical state. We choose τ = 1.5 μs, which satisfies the first (but not the second) condition, to obtain better resolution for the phonon-number peaks. As we perform the measurement, the mechanical mode experiences considerable dissipation, which limits the mean number of phonons that we can observe in this experiment to \(\langle \hat{n}\rangle \approx 1\). Additionally, the qubit frequency undergoes a slow drift during the course of the measurement. To account for this, we periodically measure the qubit frequency with the phonon pulse turned off and use this to offset the data before averaging them (see Methods for a detailed discussion).
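The timescales in this trade-off follow directly from the quoted linewidths. The Python sketch below recomputes them and, as an additional illustration, estimates how much of the initial phonon population would survive to the midpoint of the spectroscopy pulse. The survival estimate assumes simple free exponential decay at rate κ, which is our simplification and not the master-equation model used for Fig. 4.

```python
import math

GAMMA_OVER_2PI_HZ = 600e3  # total qubit linewidth, gamma/2pi
KAPPA_OVER_2PI_HZ = 370e3  # mechanical decay rate, kappa/2pi
TAU_S = 1.5e-6             # spectroscopy pulse duration

# Convert linewidths (quoted as rate/2pi) to time constants.
inv_gamma_s = 1.0 / (2.0 * math.pi * GAMMA_OVER_2PI_HZ)
inv_kappa_s = 1.0 / (2.0 * math.pi * KAPPA_OVER_2PI_HZ)

# ASSUMPTION: free exponential energy decay, n(t) = n(0) * exp(-kappa * t),
# evaluated at the midpoint of the spectroscopy pulse.
survival_at_midpoint = math.exp(-(TAU_S / 2.0) / inv_kappa_s)

print(f"1/gamma = {inv_gamma_s * 1e9:.0f} ns")
print(f"1/kappa = {inv_kappa_s * 1e9:.0f} ns")
print(f"tau / (1/gamma) = {TAU_S / inv_gamma_s:.1f}")
print(f"tau / (1/kappa) = {TAU_S / inv_kappa_s:.1f}")
print(f"fraction of phonons left at tau/2 = {survival_at_midpoint:.2f}")
```

Under this assumption fewer than one in five of the initially prepared phonons remain halfway through the spectroscopy pulse, consistent with the observation that dissipation limits the mean phonon number seen in the experiment.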

Fig. 4: Phonon-number splitting.

The qubit excitation spectrum is measured following a phonon excitation pulse of duration τmech = 175 ns and of varying amplitude (see inset for the pulse sequence). The detuning on the horizontal axis is relative to the qubit frequency ωge/2π = 2.317 GHz in the absence of a phonon excitation pulse. The initial phonon populations prepared by the pulse decay over the course of the measurement but are nevertheless visible as individual peaks separated by twice the dispersive coupling rate, 2χ. At the highest drive amplitudes we are able to resolve states with phonon numbers up to n = 3. We fit the data (blue points) using numerical master-equation simulations of the full pulse sequence (solid grey lines), with the mechanical drive strength as the only free fit parameter in the Hamiltonian. From these simulations we extract the mean phonon number \(\bar{n}=\left\langle \hat{n}\left({\tau }_{{\rm{mech}}}+\tau /2\right)\right\rangle \) midway through the qubit spectroscopy pulse, which we indicate next to each spectrum.

We use the highest-lying mechanical mode at \({\omega }_{{\rm{m}}}^{(1)}/2{\rm{\pi }}=2.405\;{\rm{GHz}}\) and detune the qubit by Δ ≈ −6g to ωge/2π = 2.317 GHz. By varying the amplitude of the preparation pulse, we prepare states of varying phonon occupations, resulting in the qubit spectra found in Fig. 4. In addition to the original \(\left|g\right\rangle \to \left|e\right\rangle \) qubit transition, we observe a series of peaks corresponding to different phonon-number states \(\left|n\right\rangle \) populated by the preparation pulse. The peaks are uniformly separated by 2χ/2π ≈ 3 MHz, in close agreement with the dispersive shifts that are expected for our device parameters. The amplitude of the nth peak is an indirect measure of the population of state \(\left|n\right\rangle \), as evidenced by the fact that the relative heights of the peaks associated with n > 0 increase at higher excitation voltages. We also observe phonon-number-dependent linewidths for each peak, which can be understood as dephasing of the qubit due to the more rapid decay of higher-lying Fock states25. This broadens higher-phonon-number peaks and obscures the quantization of the mechanical oscillator’s energy. Therefore, at sufficiently large phonon occupations we enter a regime in which the effect of the mechanical motion on the qubit spectrum is that of an a.c. Stark shift induced by a coherent field13,26.
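To make the link between peak heights and phonon populations concrete, the sketch below lists where the peaks of an ideal coherent state would sit and how they would be weighted. It assumes Poisson-distributed populations and ignores decay during the measurement and the phonon-number-dependent linewidths, so it should be read as an idealized picture and not as a fit to Fig. 4. The sign of 2χ follows from equation (2) with the parameters quoted in this article; the function name is ours.

```python
import math


def coherent_state_populations(mean_n, n_max):
    """Poisson populations P(n) = exp(-mean_n) * mean_n**n / n! for n = 0..n_max."""
    return [
        math.exp(-mean_n) * mean_n ** n / math.factorial(n)
        for n in range(n_max + 1)
    ]


# Peak spacing in MHz; negative because chi < 0 under equation (2) when
# Delta < 0 and alpha > 0 (magnitude quoted in the text as about 3 MHz).
TWO_CHI_MHZ = -3.0
MEAN_N = 1.0  # illustrative mean phonon number
N_MAX = 3     # highest phonon number resolved in the experiment

populations = coherent_state_populations(MEAN_N, N_MAX)
offsets_mhz = [n * TWO_CHI_MHZ for n in range(N_MAX + 1)]

for n, (offset, p) in enumerate(zip(offsets_mhz, populations)):
    print(f"n = {n}: peak at {offset:+.1f} MHz from the bare line, weight {p:.3f}")
```

For a mean occupation of one phonon, the n = 0 and n = 1 peaks carry equal weight and the n = 3 peak is already several times weaker, which matches the qualitative trend of increasing n > 0 peak heights at higher drive amplitude.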

We numerically model our measurement using time-domain master-equation simulations that evolve the joint state \(\hat{\rho }(t)\) of the mechanical mode and qubit over the course of the pulse sequence (see Supplementary Information). We use the full Hamiltonian to evolve the state—as opposed to the dispersive Hamiltonian of equation (1)—in order to correctly model the excitation of phonons via the transmon. The final state of the system following the excitation and spectroscopy pulses is given by \({\hat{\rho }}_{{\rm{f}}}=\hat{\rho }({\tau }_{{\rm{mech}}}+\tau )\) and is used to calculate the qubit excited-state populations, \({p}_{{\rm{e}}}={\rm{tr}}\left\{{\hat{\rho }}_{{\rm{f}}}| e\rangle \langle e| \right\}\). These are overlaid with the data in Fig. 4. The parameters used in the simulation are obtained from an independent set of calibrations, as described in Methods. The only free parameter is a correction factor (of the order of 1) for the mechanical drive strength. An offset and a scaling factor are used to overlay the simulated excitation spectrum on the measurements. To provide an approximate measure of the size of the mechanical states in the resonator, we indicate the mean phonon number \(\bar{n}={\rm{t}}{\rm{r}}\{\hat{\rho }({\tau }_{{\rm{m}}{\rm{e}}{\rm{c}}{\rm{h}}}+\tau /\hspace{-0.5pt}2){\hat{b}}^{\dagger }\hat{b}\}\) midway through the spectroscopy pulse next to each spectrum in Fig. 4.

We have demonstrated a platform for quantum acoustics that combines phononic crystal defect modes with superconducting qubits. By using a phononic crystal bandgap, we reduce the mechanical and qubit dissipation rates while maintaining a large phonon–qubit coupling, g. This enables us to dispersively resolve the phonon-number states of a mechanical resonator—a key step towards realizing QND measurements of a solid mechanical object and detecting quantum jumps of the phonon number4. Looking forward, we expect phononic-crystal-based quantum acoustics to enable a new class of hybrid quantum technologies and provide a natural platform for integrating strongly piezoelectric materials with superconducting qubits. These types of mechanical resonator are also suited for efficient optical readout owing to their large mechanical-mode confinement and can provide a route to the networking of microwave quantum machines27,28. Moreover, very long coherence times of the order of 300 μs have now been demonstrated on phononic-crystal devices implemented in silicon29, suggesting that the mechanical dissipation of our devices can be improved with further investigation. Ultracoherent mechanical resonators integrated with qubits provide a route to realizing quantum acoustic processors in which phononic registers act as quantum memories that may simplify the scaling of superconducting quantum machines30. Finally, by moving into the strong dispersive regime, our work enables further demonstrations, such as QND detection of single phonons4 and generation of ‘Schrödinger cat’ states of motion31. In this context, we highlight a recent, independent observation of phonon-number splitting in a surface acoustic-wave device32.

Methods

Device fabrication

Our fabrication process begins with a 500-nm-thick film of lithium niobate on a 500-μm-thick high-resistivity (ρ > 3 kΩ cm) silicon substrate. The LN film is first thinned to approximately 250 nm by blanket argon milling. We then pattern a mask on negative resist (hydrogen silsesquioxane) with electron-beam lithography and transfer it to the LN with an angled argon milling step33. After stripping the resist, we perform a thorough acid clean to remove re-deposited amorphous LN. This is critical, as any remaining residue substantially lowers the quality of the electrodes deposited in a later step. Next, we define the aluminium ground plane, feedlines and transmon capacitor on the exposed silicon substrate using photolithography, electron-beam evaporation and liftoff. The Al/AlOx/Al Josephson junctions are then formed using a standard Dolan bridge technique and double-angle evaporation34,35. Following junction growth, we deposit 50-nm-thick aluminium electrodes directly on top of the phononic crystals to couple the defect modes to the qubit. This liftoff mask is patterned using electron-beam lithography with alignment precision of about 10 nm to the existing LN structures. In the final metallization step, we evaporate aluminium bandages that form the superconducting connections between the qubit capacitor, the electrodes, the junctions and the ground plane36. The bandages are 500 nm thick to smoothly connect the phononic crystal electrodes—resting on the 250-nm-thick LN film—with the qubit capacitor and the ground plane below. After dicing the sample into individual chips, the LN structures are released with a masked XeF2 dry etch that attacks the underlying silicon with high selectivity37. Finally, the release mask is stripped in solvents, and individual chips are packaged for low-temperature measurement.

Device parameters

Extended Data Table 1 gives the device parameters for the qubit, the five strongly coupled mechanical modes and the coplanar waveguide readout resonator. The maximum qubit frequency \({\omega }_{{\rm{ge}}}^{({\rm{max}})}\) is extracted from a fit to the flux tuning curve, \({\omega }_{{\rm{g}}{\rm{e}}}({\Phi }_{{\rm{e}}})={\omega }_{{\rm{g}}{\rm{e}}}^{({\rm{max}})}\sqrt{|\cos ({\rm{\pi }}{\Phi }_{{\rm{e}}}/{\Phi }_{0})|}\). The transmon anharmonicity α = ωge − ωef can also be extracted from the flux tuning curve, and we confirm this value with a separate two-tone measurement of the \(\left|g\right\rangle \to \left|e\right\rangle \) and \(\left|e\right\rangle \to \left|f\right\rangle \) transitions. Additionally, the qubit is characterized by its energy relaxation time, T1, and its total linewidth, γ. These parameters are measured using the techniques described in section ‘Characterization of qubit and mechanical oscillator’, from which we obtain an estimate for the dephasing time Tϕ through γ = 1/(2T1) + 1/Tϕ. The five strongly coupled mechanical modes are characterized by their resonance frequencies \(\{{\omega }_{{\rm{m}}}^{(i)}\}\) and their coupling rates {gi} to the qubit, which are obtained by measuring the normal-mode splittings in the flux tuning dataset (Fig. 3a). We note that the extraction of g3 and g5 is complicated by the presence of additional weakly coupled modes. The decay rate κ of the mechanical mode \({\omega }_{{\rm{m}}}^{(1)}\) (used for the phonon-number-splitting experiment) was obtained via a ringdown measurement similar to the measurement of a qubit T1 (see section ‘Characterization of qubit and mechanical oscillator’). Finally, χ is the dispersive coupling rate between the transmon and the mechanical mode at the detuning \(\Delta ={\omega }_{{\rm{g}}{\rm{e}}}-{\omega }_{{\rm{m}}}^{(1)}\) used in the experiment, where Δ/2π = −88 MHz. We also list the frequency ωr and linewidth κr of the readout resonator.
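The two relations in this section, the flux tuning curve and the linewidth decomposition, are easy to evaluate numerically. The sketch below inverts the tuning curve to find the flux bias for the operating frequency and estimates Tϕ from the quoted T1 and γ. It is illustrative only: the quoted T1 and γ were measured at slightly different qubit frequencies, and the function names are ours.

```python
import math

F_MAX_GHZ = 2.417  # qubit frequency at the flux sweet spot (Fig. 3a caption)


def qubit_frequency(flux_ratio, f_max=F_MAX_GHZ):
    """Flux tuning curve f(Phi) = f_max * sqrt(|cos(pi * Phi / Phi0)|)."""
    return f_max * math.sqrt(abs(math.cos(math.pi * flux_ratio)))


def flux_for_frequency(f_target, f_max=F_MAX_GHZ):
    """Invert the tuning curve on the branch 0 <= Phi/Phi0 <= 0.5."""
    return math.acos((f_target / f_max) ** 2) / math.pi


def dephasing_time(gamma_rad_per_s, t1_s):
    """Solve gamma = 1/(2*T1) + 1/T_phi for T_phi."""
    return 1.0 / (gamma_rad_per_s - 1.0 / (2.0 * t1_s))


# Flux bias (in units of Phi0) that places the qubit at 2.317 GHz.
flux_op = flux_for_frequency(2.317)

# Dephasing time from gamma/2pi = 600 kHz and T1 = 1.14 us.
t_phi_s = dephasing_time(2.0 * math.pi * 600e3, 1.14e-6)

print(f"Phi_e / Phi_0 at 2.317 GHz = {flux_op:.3f}")
print(f"T_phi = {t_phi_s * 1e9:.0f} ns")
```

With these inputs the qubit linewidth is dominated by pure dephasing and not by energy relaxation, as expected for a transmon operated away from its flux sweet spot.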

Control of qubit and mechanical oscillator

The experimental setup is shown in Extended Data Fig. 1. In these experiments, we generate all qubit and phonon excitation pulses using an arbitrary waveform generator (AWG) (Tektronix series 5200) with a rate of \(5\times {10}^{9}\) samples per second. Because the qubit has a relatively low transition frequency (ωge/2π ≈ 2.4 GHz), the pulses are produced directly using the instrument’s built-in digital IQ mixer without further need for upconversion. The AWG output is then low-pass filtered at room temperature to remove Nyquist images, spurious intermodulation signals and clock bleedthrough. We use a separate AWG channel to generate the phonon excitation pulses, which are then combined with the qubit pulses at room temperature. Once in the cryostat, the signals are attenuated and filtered at various temperature stages before being routed to the qubit through a dedicated charge line on the device (labelled XY in Extended Data Fig. 1). Flux biasing is performed using a programmable voltage source (SRS SIM928), which is low-pass filtered at the 3-K stage (Aivon Therma-24G) and at the 7-mK stage; the d.c. signal is then sent to an on-chip flux line (labelled Z in Extended Data Fig. 1).

Qubit readout

The qubit state is read out dispersively via a superconducting coplanar waveguide resonator38. Square-envelope readout pulses are generated directly by the AWG with a carrier frequency of ωr/2π = 3.026 GHz, which roughly matches the resonance frequency of the readout resonator. One end of the resonator is capacitively coupled to the qubit, while the other end is inductively coupled to a through-feedline with a coupling rate of κr/2π = 1.3 MHz. After passing through two isolators (Quinstar QCY-030150S000), the signal is amplified at 3 K by a high-electron-mobility-transistor amplifier (Caltech CITCRYO1-12A), and at room temperature by two low-noise amplifiers (Miteq AFS4-02001800-24-10P-4 and AFS4-00100800-14-10P-4). Next, the signal is down-converted to an intermediate frequency (IF) of 125 MHz using a separate local oscillator (Keysight E8257D) and a double-balanced mixer (Marki ML1-0220I). Finally, the IF signal is amplified, low-pass filtered and digitized by an acquisition card (AlazarTech ATS9350) with 12-bit resolution and a sampling rate of \(500\times {10}^{6}\) samples per second. The data are first stored on-board and then transferred to a graphics processing unit for real-time processing. Additionally, a vector network analyser (Rohde & Schwarz ZNB20) is used in the readout chain to calibrate the frequency of the readout pulses.
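The digitized IF record still has to be reduced to a pair of quadratures. The sketch below shows one common way of doing this, digital demodulation by mixing with a cosine and sine at the IF and averaging, using the IF and sampling rate quoted above. It is a plain-Python illustration run on a synthetic record; the actual processing runs on a graphics processing unit and its implementation is not described in this article.

```python
import math

SAMPLE_RATE_HZ = 500e6  # digitizer sampling rate
IF_HZ = 125e6           # intermediate frequency (4 samples per IF period)
N_SAMPLES = 1000        # 2 us readout pulse at 500 MS/s


def demodulate(samples, if_hz=IF_HZ, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (I, Q) such that the record is I*cos(w*t) - Q*sin(w*t) on average.

    For a record A*cos(w*t + phi) spanning whole IF periods this gives
    I = A*cos(phi) and Q = A*sin(phi).
    """
    n = len(samples)
    i_sum = 0.0
    q_sum = 0.0
    for k, value in enumerate(samples):
        phase = 2.0 * math.pi * if_hz * k / sample_rate_hz
        i_sum += value * math.cos(phase)
        q_sum += value * math.sin(phase)
    return 2.0 * i_sum / n, -2.0 * q_sum / n


# Synthetic record with a known amplitude and phase (illustrative values).
AMPLITUDE = 0.8
PHASE_RAD = 0.3
record = [
    AMPLITUDE * math.cos(2.0 * math.pi * IF_HZ * k / SAMPLE_RATE_HZ + PHASE_RAD)
    for k in range(N_SAMPLES)
]

i_val, q_val = demodulate(record)
print(f"I = {i_val:.4f}, Q = {q_val:.4f}")
print(f"amplitude = {math.hypot(i_val, q_val):.4f}, phase = {math.atan2(q_val, i_val):.4f} rad")
```

Because the sampling rate is exactly four times the IF, every group of four samples covers one full IF period, so averaging over the whole record removes the component at twice the IF without an explicit low-pass filter.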

Characterization of qubit and mechanical oscillator

As described in section ‘Control of qubit and mechanical oscillator’, the system is driven through the transmon XY line at a time-dependent Rabi rate of Ω(t) = Ω0f(t), where f(t) is a normalized pulse envelope and Ω0 = AkVd is directly proportional to the drive voltage Vd. The conversion factor Ak, with k ∈ {1, 3}, depends on which AWG channel is used to drive the qubit (see Extended Data Fig. 1) and varies with frequency. We first calibrate the qubit excitation pulses through a Rabi oscillation measurement with the qubit at ωge/2π = 2.318 GHz, the frequency at which we performed the phonon-number-splitting measurements (see Extended Data Fig. 2a). Here we use Gaussian pulses \(f(t)={\rm{\exp }}\left(-{t}^{2}/2{\sigma }_{t}^{2}\right)\) of width σt = 50 ns and varying amplitude Ω0(Vd) = A1Vd. From these data, we extract A1 = 2π × 93.9 MHz \({{\rm{V}}}^{-1}\). We infer \({A}_{3}\approx \sqrt{2}{A}_{1}\) from the presence of the extra 3-dB attenuator.

To measure the qubit energy relaxation time, T1, we use the calibration to choose an appropriate π-pulse amplitude and prepare the qubit approximately in the excited state, \(\left|e\right\rangle \). We then measure the excited-state population (Extended Data Fig. 2b) as we vary the delay between preparation and readout. The resulting data are fitted with an exponential to extract T1. We perform this measurement at a variety of qubit frequencies, all of them sufficiently separated from the strongly coupled mechanical modes, and measure relaxation times in the range T1 = 1.0−1.4 μs. In addition, we perform steady-state spectroscopy with the qubit at the frequency of the phonon-number-splitting experiment to extract the total qubit linewidth, γ/2π ≈ 600 kHz.
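For orientation, the calibration factor A1 can be turned into an approximate π-pulse voltage. The sketch below does this under two assumptions that are ours and are not stated in the article: the rotation angle equals the time integral of Ω(t), and the Gaussian envelope is not truncated, so its area is σt√(2π).

```python
import math

A1_RAD_PER_S_PER_V = 2.0 * math.pi * 93.9e6  # calibration factor A1
SIGMA_T_S = 50e-9                            # Gaussian pulse width sigma_t


def gaussian_area(sigma_t):
    """Integral of exp(-t**2 / (2*sigma_t**2)) over all t."""
    return sigma_t * math.sqrt(2.0 * math.pi)


def pi_pulse_voltage(a_conv, sigma_t):
    """Drive voltage for which the integral of Omega(t) equals pi.

    ASSUMPTION: rotation angle = integral of Omega0 * f(t) dt, with an
    untruncated Gaussian envelope f(t).
    """
    omega0 = math.pi / gaussian_area(sigma_t)
    return omega0 / a_conv


v_pi = pi_pulse_voltage(A1_RAD_PER_S_PER_V, SIGMA_T_S)
omega0_over_2pi_hz = A1_RAD_PER_S_PER_V * v_pi / (2.0 * math.pi)

print(f"peak Rabi rate for a pi pulse: {omega0_over_2pi_hz / 1e6:.2f} MHz")
print(f"pi-pulse drive voltage:        {v_pi * 1e3:.1f} mV")
```

In practice the π-pulse amplitude is read off the measured Rabi oscillation and not computed this way; the estimate only indicates the expected scale of a few tens of millivolts.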

We perform a ringdown measurement to extract the decay rate κ of the mechanical mode used for the phonon-number-splitting experiment. Here, the qubit is first detuned by an amount |Δ| ≫ g to avoid hybridizing the modes (see Supplementary Information), and we then send a nearly resonant pulse at frequency \({\omega }_{{\rm{d}}}\approx {\omega }_{{\rm{m}}}^{(1)}\) to excite the mechanical mode. The mean mechanical occupation \(\left\langle {\hat{b}}^{\dagger }\hat{b}\right\rangle \) shifts the readout resonator through a small cross-Kerr interaction induced by the qubit. We therefore use this shift as an approximate measure of the occupation, in the same way that we measure the excited-state population of the qubit. Sweeping the delay between excitation and readout produces the ringdown curve shown in Extended Data Fig. 2b, with a decay rate of κ/2π = 370 kHz, corresponding to an energy relaxation time of \({\kappa }^{-1}=430\;{\rm{ns}}\).
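A decay rate can be extracted from a ringdown curve with a simple exponential fit. The sketch below uses a log-linear least-squares fit on noiseless synthetic data generated with the quoted κ; the data, the delay grid and the fitting method are all our assumptions for illustration, and the fit applied to the measured data may differ (for example, by including an offset).

```python
import math


def fit_decay_rate(times_s, signal):
    """Least-squares slope of ln(signal) versus time; returns the rate in 1/s.

    Assumes signal = A * exp(-rate * t) with no offset and strictly
    positive values.
    """
    logs = [math.log(s) for s in signal]
    n = len(times_s)
    t_mean = sum(times_s) / n
    y_mean = sum(logs) / n
    numerator = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_s, logs))
    denominator = sum((t - t_mean) ** 2 for t in times_s)
    return -numerator / denominator


# Synthetic, noiseless ringdown generated with the quoted decay rate.
KAPPA_RAD_PER_S = 2.0 * math.pi * 370e3
delays_s = [k * 50e-9 for k in range(40)]  # hypothetical delay grid
ringdown = [math.exp(-KAPPA_RAD_PER_S * t) for t in delays_s]

kappa_fit = fit_decay_rate(delays_s, ringdown)
print(f"kappa/2pi = {kappa_fit / (2.0 * math.pi) / 1e3:.1f} kHz")
print(f"1/kappa   = {1e9 / kappa_fit:.0f} ns")
```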

All the parameter estimates obtained from these characterization measurements are later used in the numerical simulations of the pulse sequence, which are discussed in Supplementary Information.

Pulse sequence

The phonon-number-splitting data were obtained using a pump–probe scheme in which a short phonon excitation pulse is sent at the mechanical frequency ωm and is immediately followed by a weak spectroscopy pulse. These pulses are generated with separate AWG channels and are later combined before entering the cryostat (see section ‘Control of qubit and mechanical oscillator’). Both pulses have cosine-shaped envelopes of the form V(t) = V0[1 − cos(2πt/τ)]/2, which are synthesized at a baseband frequency of νIF = 125 MHz and then digitally upconverted to their final carrier frequencies. For all of the phonon-number-splitting measurements, the length of the phonon excitation pulse is held fixed at τmech = 175 ns while its voltage is varied to prepare states with different mean phonon numbers \(\langle {\hat{b}}^{\dagger }\hat{b}\rangle \). The length and voltage of the qubit spectroscopy pulse are set to τ = 1.5 μs and V0 = 7.5 mV, respectively, corresponding to a Rabi rate of Ω0/2π ≈ 700 kHz (see section ‘Characterization of qubit and mechanical oscillator’ for details on pulse calibration). We separately verified that these pulse settings do not result in power broadening of the qubit line. Immediately following the spectroscopy pulse, a 2-μs square-envelope pulse is sent to the readout port of the device. We measure the I and Q quadratures of the scattered pulse and subtract the reference values I0 and Q0 recorded with the system in the ground state. The resulting signal is then an indirect measure of the excited-state population.
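The pulse envelope and the quoted spectroscopy Rabi rate can be checked with a few lines of Python. The sketch below evaluates the cosine-shaped envelope and converts the peak voltage to a Rabi rate using the calibration factor A1; that the spectroscopy pulse uses the channel calibrated by A1 is our assumption, and the function name is ours.

```python
import math

V0_VOLT = 7.5e-3            # spectroscopy pulse peak voltage
TAU_S = 1.5e-6              # spectroscopy pulse duration
A1_HZ_PER_V = 93.9e6        # A1 / 2pi, from the Rabi calibration


def envelope(t, v0=V0_VOLT, tau=TAU_S):
    """Cosine-shaped envelope V(t) = V0 * [1 - cos(2*pi*t/tau)] / 2 on [0, tau]."""
    if t < 0.0 or t > tau:
        return 0.0
    return v0 * (1.0 - math.cos(2.0 * math.pi * t / tau)) / 2.0


# Peak Rabi rate, Omega0/2pi = (A1/2pi) * V0.
# ASSUMPTION: the spectroscopy pulse is sent on the channel calibrated by A1.
rabi_over_2pi_hz = A1_HZ_PER_V * V0_VOLT

for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    t = fraction * TAU_S
    print(f"t = {t * 1e6:.3f} us -> V = {envelope(t) * 1e3:.2f} mV")
print(f"Omega0/2pi = {rabi_over_2pi_hz / 1e3:.0f} kHz")
```

The envelope rises smoothly from zero, peaks at V0 at the pulse midpoint and returns to zero, and the resulting peak Rabi rate of about 704 kHz agrees with the ≈700 kHz quoted in the text.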

Flux drift correction and qubit tracking

Our experiment uses a frequency-tunable transmon qubit that is tuned away from its flux sweet spot, making it susceptible to flux noise and drift. We take several precautions to reduce drift in the qubit frequency induced by variations in the environmental magnetic field. These include low-temperature magnetic shielding, vibrational isolation and low-pass filtering of the d.c. flux bias. Nonetheless, some of the data presented here required more than one hour of averaging. The slow drift in the qubit frequency—of the order of the linewidth of the qubit over one hour—smears out the peaks, which we correct in post-processing.

The drift correction scheme is realized by alternating between measurements of the qubit frequency and the phonon-number-splitting spectrum (Extended Data Fig. 3), and using the qubit frequency as a reference to align the phonon-number-splitting data for averaging. This is done by creating a two-part AWG sequence. In the first part of the sequence, we measure the qubit spectroscopic line in a narrow window around the expected (noiseless) qubit frequency while the mechanical system is left unexcited. The second part of the sequence performs the phonon-number-splitting measurement, that is, phonon excitation followed by qubit spectroscopy. The resulting IQ averages are returned for data collection every 20 s. In post-processing, the frequency drift is extracted from the qubit-tracking spectrum and then used to offset the phonon-number-splitting data. This scheme reduces the apparent qubit linewidth from 2.8 MHz to 1.1 MHz during an hour-long measurement, allowing us to compensate for slow (sub-hertz) flux drift and to improve the resolution of the phonon-number states (Extended Data Fig. 3).
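The post-processing step can be sketched as follows. This is a simplified illustration, not the analysis code used for the data presented here: the function names are hypothetical, the qubit line is assumed to appear as a peak in the tracking trace, and the line centre is estimated with a signal-weighted mean, for which a Lorentzian fit would be a natural replacement.

```python
import numpy as np

def estimate_centre(freqs, signal):
    """Estimate the line centre as the signal-weighted mean frequency."""
    w = np.clip(signal - np.min(signal), 0.0, None)
    return np.sum(freqs * w) / np.sum(w)

def align_and_average(track_freqs, track_traces, split_freqs, split_traces, f_ref):
    """Shift each splitting trace by the drift seen in its paired tracking trace.

    track_traces[i] and split_traces[i] are assumed to come from the same
    repetition of the two-part sequence; f_ref is the reference qubit frequency.
    """
    aligned = []
    for track, split in zip(track_traces, split_traces):
        drift = estimate_centre(track_freqs, track) - f_ref
        # A feature measured at f sits at f - drift in the drift-free frame.
        aligned.append(np.interp(split_freqs, split_freqs - drift, split))
    return np.mean(aligned, axis=0)
```

Each splitting trace is interpolated back onto the common frequency grid after the shift, so that traces recorded at different drift values can be averaged point by point.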

We note that fluctuations in the qubit frequency cause small (about 1.5%) changes in the qubit–phonon detuning Δ = ωge − ωm, which in turn cause dispersion in χ. Namely, for small changes δΔ in the detuning, we expect variations δχ in the dispersive shift per phonon of order

$$\frac{{\rm{\delta }}\chi }{\chi }\approx -\left(\frac{2\Delta -\alpha }{\Delta -\alpha }\right)\frac{{\rm{\delta }}\Delta }{\Delta }$$
(3)

Over the duration of the phonon-number-splitting measurements, we estimate that the peak-to-peak splitting varies by up to 75 kHz (δχ/χ ≈ 2.5%) given our operating parameters. Although this effect is small, it is preferable to reduce flux noise by more direct measures—post-processing can only improve the spectral clarity of phonon-number peaks to the extent that δχ ≪ γ, κ. One solution is to move to fixed-frequency qubits and use the a.c. Stark shift for frequency control, as done in other quantum acoustics experiments11.
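Equation (3) can be checked numerically with a short sketch. It assumes the usual transmon scaling of the dispersive shift, χ ∝ g²α/[Δ(Δ − α)], which is the form that reproduces the prefactor in equation (3). The constant prefactor and sign convention drop out of the relative variation. The values of Δ and α below are arbitrary illustrative numbers, not the device parameters.

```python
def chi(delta, alpha, g=1.0):
    """Dispersive shift up to a constant prefactor (assumed scaling):
    chi proportional to g^2 alpha / [delta (delta - alpha)]."""
    return g**2 * alpha / (delta * (delta - alpha))

def relative_chi_shift(delta, alpha, rel_d_delta):
    """First-order estimate of d(chi)/chi from equation (3)."""
    return -(2.0 * delta - alpha) / (delta - alpha) * rel_d_delta

# Arbitrary illustrative values (same units for delta and alpha).
delta, alpha, rel = 1.0, -0.5, 1e-4
exact = chi(delta * (1.0 + rel), alpha) / chi(delta, alpha) - 1.0
approx = relative_chi_shift(delta, alpha, rel)
```

For small δΔ/Δ the finite-difference result and the first-order expression agree to second order in δΔ/Δ. Taken together, the figures quoted in the text (δΔ/Δ ≈ 1.5% and δχ/χ ≈ 2.5%) correspond to a prefactor magnitude of roughly 1.7.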

Phononic crystal resonator design

As described in the main text, the mechanical resonators used in this work are one-dimensional phononic crystal resonators; each resonator is formed by introducing a single defect site to an artificial lattice that is patterned onto the LN. This localizes a set of vibrational modes at the defect site, provided that the modes lie within the bandgap of the surrounding lattice. This configuration can be thought of as a wavelength-scale resonator surrounded by acoustic Bragg mirrors. Each unit cell of the mirror region consists of a square-shaped block of LN uniformly covered by a 50-nm-thick aluminium layer. As shown in Extended Data Fig. 4a, the mirror cell is parameterized by its lattice constant a, strut length sl and strut width sw. Additionally, there are other geometric parameters that we cannot tightly control during fabrication, such as the LN thickness tLN, the sidewall angle θsw and the corner fillet radius R. We numerically simulate39 the eigenmodes of the mirror cell using Floquet boundary conditions, sweeping the wavevector k over the first Brillouin zone k ∈ [0, π/a]. This produces a band diagram such as the one shown in Extended Data Fig. 4b, where we use the same set of mirror cell parameters as those of the fabricated device. The diagram shows all possible bands of the structure within the frequency range of interest (including all polarizations and symmetries) and exhibits a clear phononic bandgap over the approximate range [1.6 GHz, 2.0 GHz]. This gap is similar in size to that observed in the experiment (about 400 MHz) but is centred at a lower frequency, which could be due to differences between the material properties of our films and those used for the simulations. As a final step, we verify the robustness of the phononic bandgap to variations in the mirror cell parameters to ensure that fabrication-induced fluctuations will not drastically alter the size or position of the gap.
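The way a bandgap emerges from a wavevector sweep with Bloch-periodic (Floquet) boundary conditions can be illustrated with a toy model. The sketch below is not the finite-element simulation of the mirror cell: it replaces the unit cell with a one-dimensional chain of two alternating masses joined by identical springs, and all parameter values are arbitrary.

```python
import numpy as np

def diatomic_bands(k, a=1.0, spring=1.0, m1=1.0, m2=2.0):
    """Angular frequencies of the two Bloch bands of a 1D two-mass chain."""
    phase = np.exp(-1j * k * a)
    # Mass-weighted dynamical matrix with Bloch-periodic boundary conditions.
    off = -spring * (1.0 + phase) / np.sqrt(m1 * m2)
    dyn = np.array([[2.0 * spring / m1, off],
                    [np.conj(off), 2.0 * spring / m2]])
    # Clip guards against tiny negative eigenvalues from rounding at k = 0.
    return np.sqrt(np.clip(np.linalg.eigvalsh(dyn), 0.0, None))

ks = np.linspace(0.0, np.pi, 101)  # first Brillouin zone for a = 1
bands = np.array([diatomic_bands(k) for k in ks])
# Bandgap edges: top of the lower band and bottom of the upper band.
gap = (bands[:, 0].max(), bands[:, 1].min())
```

In this toy model the gap opens at the zone edge k = π/a, between the frequencies √(2K/m2) and √(2K/m1) for m2 > m1, where K is the spring constant. No mode of the infinite chain exists at frequencies inside this range, which is the property the mirror cells exploit.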

The defect cell is created by stretching the local lattice constant to a larger value adef > a and introducing a break in the aluminium metallization, effectively forming a pair of electrodes separated by a gap of lgap (Extended Data Fig. 4a). This configuration supports modes that lie within the phononic bandgap and are therefore localized to the defect site (Extended Data Fig. 4c, d). Through the piezoelectric effect, the strain Sjk associated with each mode induces a polarization of Pi = eijkSjk in the crystal, where eijk is the piezoelectric coupling tensor. The modes of the structure can couple strongly to the qubit if the polarization field P overlaps with the electric field of the electrodes and is predominantly aligned along the same direction. Our devices are fabricated on X-cut LN, with the direction of propagation of the phononic crystals pointing along the Y crystal axis. This orientation allows for defect modes that have the ‘correct’ polarization, as shown in Extended Data Fig. 4c. Using the techniques outlined in ref. 40, we calculate coupling rates of g/(2π) ≈ 20–22 MHz for the fabricated defect geometries, in modest agreement with our measurements (difference of about 30%; Extended Data Fig. 4e). In addition, the defects generally support other localized modes that do not couple as strongly (g/(2π) ≲ 5 MHz).
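The contraction Pi = eijkSjk can be written compactly as a tensor product. The sketch below shows only the index bookkeeping: the tensor entries are hypothetical placeholders in arbitrary units, not the piezoelectric coefficients of LN, and the function name is chosen for this example.

```python
import numpy as np

def polarization(e, strain):
    """P_i = e_ijk S_jk, summing over the repeated indices j and k."""
    return np.einsum("ijk,jk->i", e, strain)

# Hypothetical coupling tensor with e_yxy = e_yyx = 1 (arbitrary units).
e = np.zeros((3, 3, 3))
e[1, 0, 1] = e[1, 1, 0] = 1.0

# Pure shear strain in the x-y plane.
strain = np.zeros((3, 3))
strain[0, 1] = strain[1, 0] = 0.01

p = polarization(e, strain)
```

With these placeholder values, a shear strain in the x-y plane produces a polarization along y only. This is the kind of selection between strain pattern and polarization direction that determines which defect modes couple to the electrode field.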

The device used in this experiment contains an array of five resonators that have the same mirror design, but different values of the defect width wdef. As discussed earlier, each resonator supports a small number of localized modes (Extended Data Fig. 4e), but only one of them has the correct polarization. In Extended Data Fig. 4f we show the simulated frequencies of such modes for the values of wdef used in our device. These simulation results clearly show that each of the five strongly coupled modes that we observe corresponds to a separate resonator in the array, and explain the origin of additional weakly coupled modes that are present in the spectrum.