1 Introduction

Lithium niobate (LN) and lithium tantalate (LT) are ferroelectric crystals with many applications in bulk as well as integrated optics devices [1, 23].

LN has negative birefringence, i.e. \(n_{e}-n_{o}<0\) for typical wavelengths and temperatures. The refractive index depends not only on wavelength, but also on temperature [10], the material composition [26] and doping [13]. The most commonly available crystals are grown from a congruent melt with a composition of \(x_{c}\) = [Li]/([Li]+[Nb]) = 48.38% [5]. The deviation from stoichiometry (50%) leads to a high density of intrinsic point defects and can cause undesirable effects such as photorefraction [12]. The consistent properties and large size of congruent crystals have allowed high-volume production of devices for commercial applications, in particular SAW filters [21] and Mach–Zehnder modulators [25]. Magnesium doping in LN reduces the influence of photorefractive effects, enabling applications in nonlinear optics [18], but results in compositional variation along the growth axis [28].

Some applications require stable birefringence over the intended lifetime of a device. One example is an integrated optics multiplexer where the birefringence between TE and TM modes is compensated by modal dispersion to enable efficient cross coupling between modes [16]. Another is an electro-optic modulator where the birefringent walk-off in the first crystal is canceled by the second crystal of equal length, with the optic axis of the second crystal rotated by 90\(^{\circ }\) relative to that of the first [3, 9].

In this paper, we report that the birefringence value of an LN crystal sample is affected by the temperature history of that sample. The effects are small, but can lead to refractive index shifts that interfere with device operation. We studied samples of congruent and off-congruent as well as MgO doped crystals and found that for crystals maintained at some constant temperature, the birefringence will asymptotically approach a value characteristic for that temperature. Any temperature cycling as typically performed in device processing will lead to birefringence changes that may take months to dissipate at room temperature. The observed behavior is consistent with lithium vacancies migrating over short distances, altering defect distribution patterns with an equilibrium distribution that is dependent on temperature.

2 Experimental

2.1 Measurement setup

Fig. 1 Experimental setup (not to scale)

The experimental setup is shown in Fig. 1. The aluminum block at the center of the oven is resistively heated. A temperature controller (Eurotherm 2216e) reads the temperature via a Pt-100 resistance thermometer. An additional temperature sensor (Analog Devices ADT7420) is mounted closer to the sample. We denote this reading the “sample temperature” even though the actual sample temperature may be somewhat different. A hole in the aluminum plate allows the beam to pass. A thin piece of glass covers the hole to prevent air convection. The LN sample is placed onto the cover glass after adding a small amount of mineral oil for improved thermal contact. The aluminum part is enclosed in silicone foam insulation, resulting in an exponential decay time for cooling (no heating power) of 23 min.

The illumination consists of a halogen bulb (10 W nominal) placed in front of a pinhole. The pinhole is imaged with a 1-inch diameter lens through the polarizer and a turning mirror, creating an illuminated circle with 800 \(\upmu\)m diameter at the sample. The first polarizer is oriented to evenly split the light intensity along the principal refractive index directions of the sample. A second polarizer is placed and oriented such that all light is blocked in the absence of the birefringent sample. The light is collected with a condenser lens onto the slit of a spectrometer (Ocean Optics USB4000) and recorded. Figure 2 shows the source spectrum (black) and a typical set of fringes through a 0.5 mm thick sample.

Fig. 2 Illumination spectrum and typical fringe spectrum measured

2.2 Temperature control

In most experimental runs, the controller is used in manual mode to periodically cycle the sample through a small temperature window above a threshold temperature. As soon as the sample temperature falls below the threshold, a heating pulse of fixed duration and power is applied to increase the oven temperature about two degrees, then the power is set to a lower level so that the oven can start to cool, and that power is maintained until the sample temperature falls below the threshold at which point the next heating pulse is applied. This method avoids controller-induced heating variations affecting the sample, so cooling is smooth and reproducible. A subset of data from each temperature cycle is evaluated where the sample is cooling and its temperature falls into the 1 K bracket above the threshold temperature.
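The cycling scheme and the data selection described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `PulseCycler`, the function `select_cooling_bracket`, all numerical parameter values, and the rule that a new pulse is only armed after the sample has risen above the threshold again are hypothetical and not taken from the actual controller settings.

```python
class PulseCycler:
    """Threshold-triggered heating pulses (all parameter values are hypothetical)."""

    def __init__(self, threshold, pulse_duration, pulse_power, hold_power):
        self.threshold = threshold            # threshold temperature in deg C
        self.pulse_duration = pulse_duration  # fixed pulse length in seconds
        self.pulse_power = pulse_power        # heater power during the pulse
        self.hold_power = hold_power          # lower power during slow cooling
        self.pulse_end = None                 # time at which the running pulse stops
        self.armed = True                     # assumption: one pulse per threshold crossing

    def power(self, t, sample_temp):
        """Return the heater power to apply at time t for the given sample temperature."""
        if self.pulse_end is not None and t < self.pulse_end:
            return self.pulse_power           # pulse still running
        if sample_temp >= self.threshold:
            self.armed = True                 # sample is back above the threshold
        elif self.armed:
            self.armed = False                # fire a single pulse for this crossing
            self.pulse_end = t + self.pulse_duration
            return self.pulse_power
        return self.hold_power


def select_cooling_bracket(temps, threshold, bracket=1.0):
    """Indices of samples recorded while cooling inside [threshold, threshold + bracket]."""
    selected = []
    for i in range(1, len(temps)):
        cooling = temps[i] < temps[i - 1]
        in_bracket = threshold <= temps[i] <= threshold + bracket
        if cooling and in_bracket:
            selected.append(i)
    return selected
```

Only the points returned by `select_cooling_bracket` would enter the evaluation; the heating pulse and the part of the cooling curve above the bracket are ignored.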

2.3 Crystal samples

Samples were cut from commercially available double-side polished X-cut wafers of LN, manufactured by G&H Ohio. The diameters of the congruent and MgO doped wafers were 100 and 76.2 mm, respectively. The thicknesses were 0.484 or 1.024 mm for congruent wafers, and 0.500 mm for the MgO doped wafer. All samples were cut close to the wafer centers, resulting in squares of \(4.35 \times 4.35\) mm along Y- and Z-direction. No anti-reflection coatings were applied. The perimeter of each sample was painted with silver paint (Ted Pella Leitsilber) to allow dissipation of pyro-electric charges that appear on the Z-faces caused by temperature changes.

Table 1 Crystal samples

Some of the samples were further processed to study the influence of point defects. To lower the proton content below that of the as-grown material [7], samples from CLN and MgO wafers were heated and held for 24 h at 1100 \(^{\circ }\)C in a sealed tube furnace with oxygen flowing at 1 l/min. This process has been shown to reduce the proton concentration to below 10% of the as-grown concentration. Other samples were annealed at 1100 \(^{\circ }\)C for 100 h in a lithium-rich atmosphere created by the presence of two-phase LN powders (65 mol% Li\(_2\)O) [6, 14]. The [Li]/[Nb] ratio in the crystal increases until either the treatment is terminated or the sample composition reaches a phase-boundary. For CLN, this boundary is close to stoichiometry ([Li]/[Nb] \(\approx\) 1), eliminating virtually all point defects. The changes in MgO are expected to increase the lithium content but the dopant will still be present as are other defects to satisfy charge neutrality. Table 1 shows the different sample compositions and thicknesses. Experiments were run on multiple different samples from each group, but all were cut from the center region of the wafer ensuring equivalent composition and thickness, and the consistency of measurements between samples within the same group was well within measurement uncertainty.

In a first experiment, a CLN-1 mm sample was heated to around 140 \(^{\circ }\)C, fringe tracking was initiated, and the oven power slowly ramped down to zero and back up for a total of two cycles. For the remainder of the experimental runs, the sample to be tested was first placed into a different oven and held at a controlled temperature for a time considered sufficiently long to achieve equilibrium. In the following, we call this the annealing temperature \(T_{anneal}\). The oven was then tilted and the hot sample allowed to fall into room temperature water in order to quickly drop the sample temperature and in effect “freeze in” the conditions achieved during annealing. The thermal shock sometimes led to cracks and individual samples were retired once the optically evaluated volume was affected. The oven from the setup shown in Fig. 1 was programmed to cycle above a given threshold temperature \(T_{thrsh}\), the sample was placed onto the cover glass and finally the top insulation replaced. The fringes were then tracked for several hours while the oven continued to cycle above the threshold temperature.

2.4 Location of fringe extrema

The birefringence primarily depends on wavelength and temperature, and we define a positive number B as

$$\begin{aligned} B=n_{o}-n_{e} \end{aligned}$$
(1)

When polarized light falls onto the sample, the ordinary and extraordinary beams propagate at speeds given by their respective refractive indices. This leads to a relative walk-off in phase, so the transmitted beam is generally elliptically polarized. The relative retardation of the waves can be expressed in fractions of waves as

$$\begin{aligned} \Gamma =\frac{d\left( T\right) B\left( \lambda ,T\right) }{\lambda } \end{aligned}$$
(2)

where d is the sample thickness of Table 1, taking into account thermal expansion along the X-direction [17]. The detected intensity after the polarizer will vary as \(\sin ^{2}\left( \Gamma \pi \right)\) and be minimal if \(\Gamma\) is an integer, maximal if \(\Gamma \bmod 1 =0.5\). Because of material dispersion, B will increase towards shorter wavelengths, and \(\Gamma\) will do so as well. While the setup will not measure the absolute value of \(\Gamma\), tracking the fringe positions will allow for a precise measurement of changes in birefringence B.
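As a quick numerical illustration of Eq. (2), the following sketch evaluates the retardation and the crossed-polarizer intensity. The function names are hypothetical; the input values (thickness 0.484 mm, B = 0.077 at 670 nm) are the ones quoted elsewhere in this paper, and thermal expansion is neglected here for simplicity.

```python
import math


def retardation(thickness_m, birefringence, wavelength_m):
    """Relative retardation in waves, Gamma = d * B / lambda (Eq. 2)."""
    return thickness_m * birefringence / wavelength_m


def crossed_polarizer_intensity(gamma):
    """Normalized detected intensity sin^2(pi * Gamma) behind the second polarizer."""
    return math.sin(math.pi * gamma) ** 2


# Thin congruent sample at 670 nm, thermal expansion neglected
gamma = retardation(0.484e-3, 0.077, 670e-9)

# Shift in retardation caused by a birefringence change of 1e-6
delta_gamma = retardation(0.484e-3, 1e-6, 670e-9)
```

With these inputs the sample retards by several tens of waves, so a birefringence change of order \(10^{-6}\) moves the fringes by only a small fraction of a wave, which is why the peak positions need to be fitted rather than read off.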

2.5 Fringe analysis

To eliminate spectrometer measurements with a poor ratio of signal to noise, we restricted data to a range of 550–900 nm having good intensity as seen in Fig. 2. For each measurement, the raw pixel data was normalized with the measured illumination intensity, resulting in a fringe spectrum as shown in Fig. 3. The contrast is suboptimal, partially because the polarizer is not optimally oriented, partially because the transmitted light is not a plane wave and different rays will experience slightly different apparent crystal thickness. The beam passes through a series of interfaces: air-sample, sample-oil, oil-glass, and finally glass-air and each will reflect some light, leading to wavelength-dependent intensity modulation of the observed pattern due to Fabry–Pérot fringing. This distorts the fringes as seen for the peaks at longer wavelengths in Fig. 3.

Fig. 3 Normalized intensity fringes and peak fit results

To get the best possible estimate of the peak position, the points making up each half-fringe are fitted to a \(\sin ^{2}\) function with adjustable amplitude, frequency and offsets. This method takes into account tens of points for each fit and gives a better estimate of the peak wavelength than if the location of the most extreme intensity were taken. Because the integer offset for \(\Gamma\) is not known, we initially assign a value of 0 to the longest wavelength for a minimum, and then move to the next extremum, incrementing the value by 0.5 for each consecutive extremum. According to (2), we multiply this point-wise defined function with \(\lambda /d\) to get values that represent the dispersion of B, but with unknown offset \(n\times \lambda /d\) where the fringe offset is an integer \(n\in \mathbb {N}\). Figure 4 shows data for a room temperature scan and compares it to a calculated curve using the equation of [10]. An offset of \(n=42\) waves together with fine-tuning the temperature for the calculated curve gives very good agreement with an rms deviation of \(1.7\times 10^{-5}\). To be able to track changes in birefringence, the experimental points (crosses in Fig. 4) are approximated by a polynomial which allows evaluating a value for any wavelength. The approximation is thought to be most reliable in the center section, both because of high source intensity, and because polynomial approximations often have lower accuracy towards the end of the range. We chose a wavelength of 670 nm which has roughly an equal number of extrema fitted at either side and will report all birefringence changes for that wavelength for the remainder of the paper. Most experiments used a fit polynomial of degree 4, resulting in rms deviation of about \(3.3\times 10^{-5}\). The residuals can be reduced further by increasing the polynomial degree but flatten out at around \(1.3\times 10^{-6}\) for degree 9 and higher.
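The assignment of fringe orders and the conversion to birefringence values can be sketched as follows. The function name and the synthetic test wavelengths are assumptions for illustration only; the sketch takes the fitted extremum wavelengths as given and does not reproduce the \(\sin ^{2}\) peak fitting or the polynomial approximation.

```python
def birefringence_from_extrema(wavelengths_nm, thickness_mm, n_offset):
    """Convert fitted extremum wavelengths into (wavelength, B) pairs.

    wavelengths_nm must be ordered from the longest wavelength (a minimum,
    assigned order 0 before adding the offset) to the shortest; each
    consecutive extremum adds half a wave. n_offset is the integer fringe
    offset n, which has to be chosen by comparison with literature data.
    """
    if any(a <= b for a, b in zip(wavelengths_nm, wavelengths_nm[1:])):
        raise ValueError("wavelengths must be strictly decreasing")
    thickness_nm = thickness_mm * 1e6
    pairs = []
    for k, lam in enumerate(wavelengths_nm):
        gamma = n_offset + 0.5 * k                      # retardation in waves
        pairs.append((lam, gamma * lam / thickness_nm))  # B = Gamma * lambda / d
    return pairs


# Synthetic check: a dispersion-free sample with B = 0.08 and d = 0.5 mm
synthetic = [0.5e6 * 0.08 / g for g in (45.0, 45.5, 46.0)]
recovered = birefringence_from_extrema(synthetic, 0.5, 45)
```

A wrong choice of `n_offset` shifts every point by a multiple of \(\lambda /d\), which changes the apparent dispersion and is what allows the offset to be identified by comparison with the calculated curve.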

Fig. 4 Birefringence calculated from literature and from fit locations in Fig. 3

3 Experimental results and analysis

Figure 5 shows the temperature profile that sample CLN-1 mm was subjected to while recording the fringes. Only changes in value of B can be measured, so we shift the measurement such that the value at 670 nm for the maximal temperature observed equals that calculated from the Sellmeier equation: \(B_{EL}(146.9\,^{\circ }\textrm{C})=0.077\).

Fig. 5 Temperature cycling recorded for CLN-1 mm sample

Fig. 6 Deviation of birefringence value B for temperature cycling

The birefringence value B for 670 nm decreases by \(38\times 10^{-6}\) for every degree of temperature increase [10]. The changes in birefringence of interest are comparatively small, so Fig. 6 shows the difference between the measured B and the Sellmeier value at the same temperature. A polynomial of degree 10 was applied to each of the over 18,000 fringe spectra recorded to calculate the value for B at 670 nm. The curves for the cooling segments overlap reasonably well as do those for the heating segments. The values at high temperature agree for all segments indicating that the value has equilibrated to a steady-state value independent of the thermal history. The difference between cooling and heating segments is biggest in the 100–130 \(^{\circ }\)C range. This is where the value of birefringence can change on a time-scale of tens of minutes, characteristic of the temperature changes. At lower temperatures, the different segments tend to agree with each other, not because a steady-state has been achieved, but because the birefringence change is too slow to be observed. The inset shows a histogram of the residuals when the data \(B_{exp}-B_{EL} (T)\) is approximated by a polynomial of degree 8 for the first cooling segment from 144 to 78 \(^{\circ }\)C. The rms value for these residuals is \(0.42\times {10^{-6}}\) and represents the estimated uncertainty due to noise and fringe peak-fitting uncertainties.

3.1 Magnitude and dynamics of birefringence change

To quantify these observations and get an idea of the rate at which the birefringence equilibrates, we subjected samples to the anneal and equilibration experiments described in Sect. 2.3. The individual runs with their results are listed in Table 2 for congruent composition and in Table 3 for MgO doped crystal samples. The samples were held at \(T_{anneal}\) for a time sufficient to reach equilibrium, at least 40 h for the lowest temperature, and 1 h for the others. After being dropped into room temperature water, the sample center temperature is expected to cool to below 100 \(^{\circ }\)C in less than 3 s for all annealing temperatures, in effect freezing in the sample state at \(T_{anneal}\). The fringe evaluation starts as soon as the top insulation is placed, starting the clock for the evaluation phase of the sample run. The sample temperature will adjust to that of the oven within a few tens of seconds, so the start of anneal time is somewhat uncertain. Most runs had an annealing temperature in the range 150–210 \(^{\circ }\)C, and a threshold temperature for characterization lower by 45 K or more. One run in each sample group had an annealing temperature of about 96 \(^{\circ }\)C with a higher temperature characterization. The range of threshold temperatures spans only about 25 K because data acquisition was difficult for very fast or very slow birefringence changes. To illustrate the steps taken for data analysis, we discuss the measurement where the CLN sample was annealed at 208.8 \(^{\circ }\)C and the oven was set to cycle above the threshold temperature \(T_{thrsh}=\)105 \(^{\circ }\)C. Figure 7 shows the first ten minutes of the recorded sample temperature, but the temperature cycling and recording continued for over 6 h total. The actual sample temperature dips below the threshold temperature because there is some delay between when the heating starts and the oven reacts. Two more temperature levels are indicated in the graph. The upper line represents \(T_{top} = T_{thrsh}\) + 1 K, and the dashed line is the evaluation temperature \(T_{eval}\) at the mid-point.

Fig. 7 Sample temperature at beginning of evaluation run

Fig. 8 Estimation of \(B(T_{eval})\) for first three cycles

The data within the temperature bracket is evaluated for each cooling segment to estimate the value of B(670 nm) at the time the sample crosses \(T_{eval}\). This is done by a linear fit to \(B(670 \, \textrm{nm}, T)\) and evaluating this fit at \(T_{eval}\) as shown in Fig. 8 for the first three cooling segments. Due to the unknown value n for the fringe offset, there is an unknown offset for B(T). This offset, however, remains unchanged during the run. The B values decrease with increasing temperature for each of the segments with a slope \(\partial B/\partial T\approx -58\times 10^{-6}\)/K. The three cooling segments (at times about 1, 5, and 9 min after placing the sample) are offset from each other, indicating changes in the crystal defect distribution. The fitted B value of each segment at the evaluation temperature is used for further analysis as shown in Fig. 9. Because the oven temperature is cycling, the sample temperature is not constant, and \(T_{eval}\) may not be the best temperature estimate for characterizing defect migration. We expect a temperature dependent relaxation process with the rate being proportional to \(\exp \left( - E_{act}/kT\right)\) where \(E_{act}\) is the activation energy, k the Boltzmann constant and T the absolute temperature. As we see below, a typical value is \(E_{act}=\) 1.355 eV, so we use this to calculate an effective sample temperature as a weighted average of the sample temperature

$$\begin{aligned} T_{eff}=\left[ \int \exp \left( - \frac{E_{act}}{kT(t)}\right) T(t)dt \right] \div \left[ \int \exp \left( - \frac{E_{act}}{kT(t)}\right) dt \right] . \end{aligned}$$
(3)

We can see in Tables 2 and 3 that this temperature is about 1.2 K higher than the threshold temperature (0.7 K above \(T_{eval}\)), both because the sample spends more time above \(T_{eval}\) and because hotter temperatures are more heavily weighted.
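Equation (3) can be evaluated numerically from a recorded temperature trace. The sketch below is one possible discretization using the trapezoidal rule; the function name and the short triangular test trace are assumptions chosen for illustration and are not measured data.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def effective_temperature(times_s, temps_c, e_act_ev=1.355):
    """Arrhenius-weighted mean temperature of Eq. (3), returned in deg C.

    Both integrals are approximated with the trapezoidal rule, so the
    samples do not need to be equally spaced in time.
    """
    temps_k = [t + 273.15 for t in temps_c]
    weights = [math.exp(-e_act_ev / (K_B_EV * tk)) for tk in temps_k]
    numerator = 0.0
    denominator = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        numerator += 0.5 * dt * (weights[i] * temps_k[i] + weights[i - 1] * temps_k[i - 1])
        denominator += 0.5 * dt * (weights[i] + weights[i - 1])
    return numerator / denominator - 273.15


# Hypothetical triangular cycle between 105 and 107 deg C
t_eff = effective_temperature([0.0, 1.0, 2.0], [105.0, 107.0, 105.0])
```

For this symmetric test trace the plain time average is 106 \(^{\circ }\)C, while the weighted value comes out slightly higher because the hotter part of the cycle carries more weight.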

The fitted B values for all the cooling segments of this experimental run are shown in Fig. 9 as red points. They show a rapid change early in the run, and tend towards an asymptotic value at longer times. The figure shows two fitted curves to the experimental points. The first is a simple exponential decay with \(\beta =1\), the second a stretched exponential with exponent \(\beta =0.65\) according to:

$$\begin{aligned} y_{stretched}=y_\infty - \Delta B \times \exp \left[ -{\left( \frac{t}{\tau }\right) }^\beta \right] . \end{aligned}$$
(4)

There are three adjustable optimization parameters: \(y_\infty\) is the asymptotic value, \(\Delta B\) the birefringence change from when the sample is placed to the asymptotic value, and \(\tau\) a time constant indicative of how fast the function relaxes. It is clear that the simple exponential decay does not provide a good approximation. The stretched exponential with \(\beta =0.65\) on the other hand gives excellent agreement. This value fits all experiments in Tables 2 and 3 reasonably well, so it was kept constant, allowing a meaningful comparison of time constants. Stretched exponential decay behavior has been observed previously in LN. Berben et al. [2] investigated the decay of photo-ionized electrons from Fe\(^{2+}\) sites and saw best agreement with \(\beta =0.26\). The authors postulated that the electrons get trapped on Nb anti-site defects and the varying range of distances between the source and trap gives rise to a range of recombination rates. It is not surprising that the value for \(\beta\) in our experiments is different since the migrating lithium ions move more slowly by several orders of magnitude.
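A fit of Eq. (4) with fixed \(\beta\) can be sketched without any fitting library, because for a given \(\tau\) the model is linear in \(y_\infty\) and \(\Delta B\). The grid search over \(\tau\) used below is a simplification chosen for this illustration and is not necessarily the optimizer used for the results reported here; the function names and the synthetic data are hypothetical.

```python
import math


def stretched(t, y_inf, delta_b, tau, beta=0.65):
    """Stretched exponential of Eq. (4)."""
    return y_inf - delta_b * math.exp(-((t / tau) ** beta))


def fit_fixed_beta(times, values, tau_grid, beta=0.65):
    """Return (y_inf, delta_b, tau) minimizing the squared error over tau_grid.

    For each candidate tau, y = y_inf + slope * u with u = exp(-(t/tau)^beta)
    is solved by ordinary linear least squares; delta_b equals -slope.
    """
    best = None
    n = len(times)
    for tau in tau_grid:
        u = [math.exp(-((t / tau) ** beta)) for t in times]
        su, sy = sum(u), sum(values)
        suu = sum(x * x for x in u)
        suy = sum(x * y for x, y in zip(u, values))
        slope = (n * suy - su * sy) / (n * suu - su * su)
        y_inf = (sy - slope * su) / n
        sse = sum((y - (y_inf + slope * x)) ** 2 for x, y in zip(u, values))
        if best is None or sse < best[0]:
            best = (sse, y_inf, -slope, tau)
    return best[1], best[2], best[3]


# Synthetic, noise-free data (times in minutes; parameter values are made up)
times = [float(t) for t in range(1, 361)]
values = [stretched(t, 0.0, 300e-6, 50.0) for t in times]
y_inf, delta_b, tau = fit_fixed_beta(times, values, [10.0 * k for k in range(1, 11)])
```

Keeping \(\beta\) fixed, as done for all runs in Tables 2 and 3, leaves \(\tau\) as the only nonlinear parameter, which makes the time constants of different runs directly comparable.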

Fig. 9 Birefringence \(B(T_{eval} = 105.5\,^{\circ }\textrm{C})\) for CLN sample after annealing at 208 \(^{\circ }\)C

For illustration purposes, the curves in Fig. 9 have been shifted by \(y_\infty\) of the stretched exponential fit. This parameter needs to be fitted for each experiment, but does not contain any useful information because the fringe offset n is unknown.

Table 2 CLN annealing runs
Table 3 MgO annealing runs

The results for all experimental runs are listed in Tables 2 and 3 and the change in birefringence \(\Delta B\) has been plotted in Fig. 10. The abscissa shows the temperature change from the annealing temperature to the effective temperature during evaluation. The data for each sample class is fitted well by a line crossing the origin, indicated by a cross. The slope for CLN is steeper than that for MgO, and the best fitting slopes are indicated in the figure. To get an estimate of the activation energy for the relaxation, we plotted the natural logarithm of the time constants \(\tau\) (at \(\beta =0.65\)) as a function of the inverse absolute temperature in Fig. 11. The expected behavior for an Arrhenius equation in such a plot is a straight line, and a linear fit was applied taking into account the uncertainties in the time constant estimates. Both sets of data fit well to such a model, and the slopes are indistinguishable within the uncertainties, resulting in an activation energy for both CLN and MgO of 1.356 ± 0.01 eV. The two sets of data are offset by about twice the uncertainty estimated for the offsets, suggesting a possibly more rapid equilibration in MgO than in CLN.
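The Arrhenius analysis amounts to a straight-line fit of \(\ln \tau\) against \(1/T\), with the slope equal to \(E_{act}/k\). The sketch below uses an unweighted least-squares fit for brevity, whereas the fit described above takes the uncertainties of the time constants into account. The function name, the prefactor and the synthetic time constants are assumptions for illustration and are not values from Tables 2 and 3.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def activation_energy(temps_c, taus):
    """Unweighted fit of ln(tau) = ln(tau0) + E_act / (k T).

    Returns (E_act in eV, tau0 in the units of taus).
    """
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(tau) for tau in taus]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals E_act / k
    intercept = mean_y - slope * mean_x    # equals ln(tau0)
    return slope * K_B_EV, math.exp(intercept)


# Synthetic time constants generated with E_act = 1.356 eV and a made-up prefactor
temps = [100.0, 110.0, 120.0]
taus = [1e-15 * math.exp(1.356 / (K_B_EV * (t + 273.15))) for t in temps]
e_act, tau0 = activation_energy(temps, taus)
```

Because the evaluated temperature window is narrow, small errors in \(\tau\) translate into comparatively large errors in the slope, which is one reason to weight the points by their uncertainties in a real analysis.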

Fig. 10 Change in birefringence B as a function of temperature change

Fig. 11 Activation energy derived from decay times

3.2 Sample thickness and composition

One may suspect that surface effects influence the measured retardation of light passing through the samples. Semiletov [27], for example, saw phase segregation at a surface layer for temperatures as low as 300 \(^{\circ }\)C. If such surface effects play a significant role, the linear relationship between thickness and retardation postulated in (2) will no longer hold. To test the linearity, a 1 mm thick CLN sample was evaluated to compare to the \(\approx\) 0.5 mm samples. Table 4 shows the birefringence change slope \(\Delta B/ \Delta T\) as estimated by single temperature measurements for this thicker sample and other samples having altered composition. For comparison, equivalent results for CLN and MgO samples are shown from Tables 2 and 3 and the slopes of Fig. 10.

Table 4 Change for different sample composition
Fig. 12 Birefringence change evolution for some samples summarized in Table 4

The CLN-1 mm sample is more than twice the thickness of the CLN samples, so a surface mechanism would result in a calculated slope less than half the CLN value. The measured slope is actually 4% higher, so we can conclude that the major mechanism is uniformly distributed across the bulk of the sample. We expect that a change in stoichiometry will effect a change in defect concentrations and change the slope value. The dry oxygen annealed samples show no discernible difference from the as-grown samples. The Li-enrichment has the most dramatic effect as can also be seen in Fig. 12. The magnitude of \(\Delta B\) is drastically reduced in LN-Li-rich, and that in MgO-Li-rich drops to almost half that of the as-grown samples. While the values for \(\tau\) for effective temperature at around 114 \(^\circ\)C are fairly consistent for all other samples, the birefringence relaxation in the Li-rich samples is much slower, with \(\tau\) more than 10\(\times\) the typical value for LN-Li-rich and 5\(\times\) for MgO-Li-rich. The time constants were still fitted with \(\beta =0.65\) and shifted by \(y_{\infty }\) but the actual \(\beta\) value may be different, resulting in large uncertainty for \(\Delta B\) because the measurement duration is still too short to reach the asymptotic value.

4 Discussion

To affect the refractive index for the long timescales observed, the crystal lattice configuration needs to undergo changes. Stoichiometry variations can be ruled out at this temperature as they would require transport of ionic species through the crystal surface. The lattice configuration changes can happen by re-configuration of intrinsic and/or extrinsic point defects. The activation energy determined above is very close to that observed for proton migration caused by photo-refractive charges (1.33 eV) [8] or annealing of waveguides (1.31 eV) [19]. However, the samples dry-annealed to reduce the proton concentrations show no measurable change from the as-grown samples, suggesting protons at typical concentrations in as-grown crystals are not significant. Another possible defect is the small polaron, often observed in chemically reduced crystals. The samples in this investigation were all annealed either in air or pure oxygen above \(1000\,^\circ\)C and showed no coloration typical of reduced crystals, making the presence of these defects unlikely. Furthermore, the activation energy for migration of small polarons is around 0.3 eV [22], clearly much lower than what was observed here.

The Li-enriched samples show a significant reduction in the effect for both types of samples. The effect in the near-stoichiometric LN sample is less than 20% of that in CLN. The much slower relaxation indicates that a different class of defects may be responsible. This large change confirms that the birefringence change in CLN is primarily caused by intrinsic defects present in CLN. A truly stoichiometric crystal has all lattice sites occupied by the proper atom, and this leaves no opportunity for any lattice changes with temperature. The effect of Li-enrichment in MgO crystals is less drastic, but we see the same trend of smaller and slower birefringence change.

4.1 Point defects in lithium niobate

Non-stoichiometric as well as MgO doped crystals contain large amounts of point defects that can re-arrange into different configurations. We base our discussion here on a defect model with the following point defects: lithium vacancies (\(V_{Li}^{'}\)), niobium anti-site defects (\(Nb_{Li}^{4\cdot }\)) for undoped crystals, and additionally magnesium occupation of Li-sites (\(Mg_{Li}^{\cdot }\)) as well as Nb-sites (\(Mg_{Nb}^{3'}\)) for the doped crystals. This model does not include niobium vacancies because their existence is energetically unfavorable [15, 20]. It is generally accepted that LN has a fully occupied oxygen lattice with no substitutions or vacancies. Crystal growers typically describe a crystal composition by referencing melt composition, e.g. “MgO doped congruent LN” despite the starting composition being quite different from a truly congruent melt that would result in a crystal of the same composition [11, 28]. Our CLN samples are grown from a melt with x = [Li]/([Li] + [Nb]) = 0.4838, and the MgO samples from a melt having the same x ratio, but with 5 mol% of MgO added. Below the photo-refractive threshold concentration for MgO, the only defects are \(Nb_{Li}^{4\cdot }\), \(V_{Li}^{'}\) and \(Mg_{Li}^{\cdot }\) [15]. As more MgO is added to the crystal and the threshold concentration is reached or exceeded, the \(Nb_{Li}^{4\cdot }\) defects have been eliminated and Mg starts to populate both Li and Nb sites.

Let us assume we have a crystal composition containing L moles of \(Li_2O\), N moles of \(Nb_2O_5\) and M moles of MgO. By applying the defect model described above and accounting for all cation lattice sites as well as requiring charge neutrality, we can calculate the fraction of atoms and vacancies on either Li- or Nb-sites.

$$\begin{aligned}&x =L/(L+N)\nonumber \\&m =M/(L+N+M) \nonumber \\&r =L/N =x/(1-x) \end{aligned}$$
(5)
$$\begin{aligned}&\left. \begin{array}{ll} Mg_{Nb} =0\\ Nb_{Li} =\frac{(1-r)-2m}{5+r-4m}\\ V_{Li} =\frac{1-r+(3-r)Nb_{Li}}{2}\\ Mg_{Li} =V_{Li}-4 Nb_{Li} \end{array}\right\} \text { }2m <1-r \end{aligned}$$
(6)
$$\begin{aligned}&\left. \begin{array}{ll} Mg_{Nb} =\frac{2m-(1-r)}{5+r-4m}\\ Nb_{Li} =0\\ V_{Li} =\frac{1-r-(3-r)Mg_{Nb}}{2}\\ Mg_{Li} =V_{Li}+3 Mg_{Nb} \end{array}\right\} \text { }2m \ge 1-r \end{aligned}$$
(7)

Equation (5) defines the molar fraction m of MgO doping, typically set to 5% in the melt. The ratio r is defined for convenience, and Eqs. (6) and (7) give the concentration of defects within the respective sites for a crystal composition defined by r and m. \(Mg_{Nb}\) for example represents the fraction of Nb sites that are occupied by a Mg atom and \(V_{Li}\) the fraction of vacant Li sites. By accounting for all site positions, we find \(Nb_{Nb}=1-Mg_{Nb}\) and \(Li_{Li}=1-Nb_{Li}-V_{Li}-Mg_{Li}\). Equation (6) applies for a crystal below the threshold, including undoped LN where \(m=0\). For our CLN samples, \(x=0.4838\) and we find \(Nb_{Li}=1.06\%\) and \(V_{Li}=4.23\%\). Equation (7) gives the defect densities for crystals above threshold. We see that when the Li/Nb ratio r is decreased, a higher MgO doping level m is needed to ensure a composition that is above threshold.

To estimate the crystal defect composition of the MgO samples, we need to relate the known melt composition to that in the crystal. Tan et al. [28] have established distribution coefficients for congruent melts with added MgO. By interpolating data from their Figures, we estimate the crystal grown from a 5% doped congruent melt to have \(x=0.478\) and \(m=5.35\%\). Using (7), we find \(Mg_{Nb}=0.4\%\), \(Mg_{Li}=5.0\%\), and \(V_{Li}=3.8\%\).
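The site fractions quoted for the CLN and MgO samples can be reproduced with a short calculation based on Eqs. (5)–(7). In this sketch the function name and the dictionary keys are hypothetical, and the above-threshold branch takes \(Mg_{Li}\) from charge neutrality as \(V_{Li}+3Mg_{Nb}\), which is the form that reproduces the quoted value \(Mg_{Li}=5.0\%\).

```python
def defect_fractions(x, m=0.0):
    """Site fractions of the defect model for composition x = L/(L+N), m = M/(L+N+M)."""
    r = x / (1.0 - x)               # Li/Nb ratio, Eq. (5)
    denom = 5.0 + r - 4.0 * m
    if 2.0 * m < 1.0 - r:
        # Below threshold, Eq. (6): Nb anti-sites present, no Mg on Nb sites
        mg_nb = 0.0
        nb_li = ((1.0 - r) - 2.0 * m) / denom
        v_li = (1.0 - r + (3.0 - r) * nb_li) / 2.0
        mg_li = v_li - 4.0 * nb_li
    else:
        # Above threshold, Eq. (7): no Nb anti-sites, Mg on both Li and Nb sites
        mg_nb = (2.0 * m - (1.0 - r)) / denom
        nb_li = 0.0
        v_li = (1.0 - r - (3.0 - r) * mg_nb) / 2.0
        mg_li = v_li + 3.0 * mg_nb  # charge neutrality: Mg_Li(+1) = V_Li(-1) + 3 Mg_Nb(-3)
    return {"Mg_Nb": mg_nb, "Nb_Li": nb_li, "V_Li": v_li, "Mg_Li": mg_li}


cln = defect_fractions(0.4838)         # undoped congruent crystal
mgo = defect_fractions(0.478, 0.0535)  # crystal estimated for the 5% MgO doped melt
```

For the undoped case the Mg fractions vanish and the vacancy fraction is exactly four times the anti-site fraction, as required when each \(Nb_{Li}^{4\cdot }\) is compensated by four \(V_{Li}^{'}\).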

Lithium-enrichment in CLN samples will create nearly stoichiometric crystals with drastically reduced \(V_{Li}^{'}\) and \(Nb_{Li}^{4\cdot }\) concentrations as compared to CLN. The additional Li\(_2\)O incorporated necessitates additional crystal unit cell growth at the surface and Nb atoms counter-diffusing against Li atoms [4]. For MgO samples, the lithium enrichment will increase the lithium content, but not alter the number of Nb or Mg atoms in the sample. As is the case with CLN, new crystal lattice units will grow on the surface to accommodate the additional oxygen incorporated. While both r and m are changed, the ratio of Mg atoms to Nb atoms will stay constant. This leads to the additional condition \((Mg_{Li}+Mg_{Nb})/(1-Mg_{Nb})=\text {const}\) during Li-enrichment. Zhang et al. [31] have measured the weight gain of an MgO sample during such annealing runs and found it to be 0.46% at long times. Using these two conditions, the model of (7) predicts a crystal with composition \(x=0.490\) and \(m=5.23\%\) with \(Mg_{Nb}=1.15\%\), \(Mg_{Li}=4.02\%\) and \(V_{Li}=0.77\%\). The vacancies are reduced by a factor of about five, but not completely eliminated.

To consider the influence of defects on birefringence, one needs to think about how the defects interact with each other and how these interactions may depend on temperature. Xu et al. conducted ab initio calculations for a congruent LN supercell of \(2\times 2\times 2\) hexagonal unit cells (containing 48 Li- and Nb-sites each) to estimate the formation energy of different arrangements of up to four \(V_{Li}^{'}\) around a \(Nb_{Li}^{4\cdot }\) center [29]. That paper’s Table 2 lists the formation energy \(E_{form}\), degeneracy D and system polarization \(P_s\) for arrangements with all four vacancies being on next neighbor sites of the anti-site defect. For the sake of a simple thought experiment, let us assume that these configurations are the only ones occurring. For thermal equilibrium, we can then calculate the probability \(p_i\) that a defect cluster is arranged in configuration i of the four cases listed.

$$\begin{aligned}&p_i \propto D_i\exp \left( - \frac{(E_{form})_i}{kT}\right) \nonumber \\&\sum _{i=1}^4 p_i = 1 \nonumber \\&P = \sum _{i=1}^4 p_i (P_s)_i \end{aligned}$$
(8)
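The evaluation of (8) can be sketched in a few lines of Python. This is only an illustration of the Boltzmann weighting: the function name `cluster_polarization` and the input lists `E_FORM`, `DEGEN` and `P_S` are hypothetical, and the numbers in them are made-up placeholders, not the values from Table 2 of [29].

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def cluster_polarization(e_form_ev, degeneracy, p_s, temp_k):
    """Return (probabilities p_i, weighted polarization P) as in (8)."""
    # Shifting all energies by the minimum avoids underflow and
    # cancels out when the weights are normalized.
    e_min = min(e_form_ev)
    weights = [d * math.exp(-(e - e_min) / (K_B_EV * temp_k))
               for e, d in zip(e_form_ev, degeneracy)]
    total = sum(weights)
    probs = [w / total for w in weights]
    mean_p = sum(p * ps for p, ps in zip(probs, p_s))
    return probs, mean_p


# Placeholder inputs for illustration only (NOT taken from [29]).
E_FORM = [0.00, 0.05, 0.10, 0.20]  # formation energies in eV, hypothetical
DEGEN = [1, 3, 3, 6]               # degeneracies, hypothetical
P_S = [0.70, 0.71, 0.72, 0.73]     # polarizations in C/m^2, hypothetical

probs_hot, p_hot = cluster_polarization(E_FORM, DEGEN, P_S, 209 + 273.15)
probs_cold, p_cold = cluster_polarization(E_FORM, DEGEN, P_S, 114 + 273.15)
delta_p = p_cold - p_hot  # defect-related polarization change on cooling
print(probs_hot, probs_cold, delta_p)
```

With the actual tabulated values in place of the placeholders, `delta_p` would correspond to the defect-related polarization change on cooling between the two temperatures.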

Inserting the provided values and evaluating P at two typical temperatures for our experiments (209 and \(114\,^\circ\)C) gives a defect-related spontaneous polarization change on cooling of \(\Delta P=9.0\times 10^{-4}\) C/m\(^2\). The actual value is likely different because we omitted a great many defect structures and did not consider longer-range interactions between clusters, which are expected to distort the lattice. The refractive index is affected via the quadratic electro-optic effect, namely \(\Delta (\frac{1}{n_e^2}) =2 g_{33} P_s \Delta P\) [30]. Inserting values \(P_s=0.71\) C/m\(^2\), \(g_{33}=0.09\) m\(^4\)/C\(^2\) and \(n_e=2.2\) for congruent LN, we estimate a change in extraordinary index of \(-0.6\times 10^{-3}\), or an expected slope of \(6.5\times 10^{-6}\)/K. As defined in (1), B moves in the opposite way, so this simplistic model predicts a slope \(\Delta B/ \Delta T\) of around \(-6.5\times 10^{-6}\)/K, about twice what is observed for CLN. Considering the crudeness of the calculation, this is good agreement, indicating that the observed changes in B are likely due to re-arrangement of Li vacancies. The change in polarization above is not to be confused with that of the pyro-electric effect, which has the opposite sign and is instantaneous. The mechanism for adjusting the cluster composition via lithium vacancy migration is further supported by the good agreement of measured activation energies for these processes [24, 29].
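The arithmetic behind this estimate can be retraced with a short script. It uses only the values quoted above together with the standard differential relation \(\Delta n_e = -\frac{n_e^3}{2}\Delta (\frac{1}{n_e^2})\); the function name `delta_n_e` is an illustrative choice.

```python
def delta_n_e(g33, p_s, delta_p, n_e):
    """Index change from the quadratic electro-optic relation
    Delta(1/n_e^2) = 2 * g33 * P_s * Delta P."""
    d_inv_n2 = 2.0 * g33 * p_s * delta_p
    # d(1/n^2) = -2 dn / n^3  =>  dn = -(n^3 / 2) d(1/n^2)
    return -0.5 * n_e ** 3 * d_inv_n2


# Values quoted in the text for congruent LN.
dn = delta_n_e(g33=0.09, p_s=0.71, delta_p=9.0e-4, n_e=2.2)

# Cooling from 209 C to 114 C, so Delta T is negative.
slope = dn / (114.0 - 209.0)

print(f"delta n_e = {dn:.2e}, slope = {slope:.2e} per K")
```

The script gives an index change of about \(-0.6\times 10^{-3}\) and a slope of roughly \(6.4\times 10^{-6}\)/K, in line with the rounded figures quoted above.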

For the MgO samples, no positively charged \(Nb_{Li}^{4\cdot }\) are available, so single lithium vacancies are expected to coordinate with the singly charged \(Mg_{Li}^\cdot\) sites. Their density is almost five times higher than that of the anti-sites in CLN. This, however, does not mean that the observed slope for birefringence change should be that much higher. The defect complexes around the positively charged defect are less tightly bound and likely show a smaller difference in polarization, which diminishes the overall effect on the slope. Both MgO and CLN cluster relaxation mechanisms rely on lithium vacancy migration, which is consistent with the two materials having similar activation energies. Lithium-enriched LN samples have a much lower density of defect clusters. This explains the much smaller B change observed, but it is not obvious how that would lead to the drastic slow-down of Li vacancy migration observed. The lithium-enriched MgO samples are still expected to have similar defect clusters, though at a lower density. As with the undoped samples, the experiments confirm the expected diminished B change, though the reduction does not scale linearly with estimated defect density, and the change in relaxation speed remains unexplained.

4.2 Computer simulation of simple defect model

CLN and MgO compositions have large concentrations of charged defects, which lead to lattice distortion affecting both cluster energy levels and activation energies for transitions between the various cluster configurations. In order to arrive at stretched exponential relaxation dynamics, it is necessary to have a broadened distribution of activation energies. The simplest model assumes just two distinct defect cluster arrangements with energy levels separated by \(\Delta E\) as shown in Fig. 13. Each defect site can transition between the two states with transition probability \(\pi _{up}=\pi _0 \exp \left( -\Delta E_{act}/kT\right)\) for going to the higher energy level, and \(\pi _{down}=\pi _0 \exp \left( -(\Delta E_{act}-\Delta E)/kT\right)\) for returning to the lower energy state. For a sample in thermal equilibrium, the fraction p of the population in the upper state is such that the transition rates balance: \(\pi _{up}(1-p)= \pi _{down} p\). It is independent of the transition dynamics and given by

$$\begin{aligned} p_{equ}(T)= \left( 1+ \exp \left( \frac{\Delta E}{kT}\right) \right) ^{-1} . \end{aligned}$$
(9)
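As a quick numerical check, the balance condition \(\pi _{up}(1-p)= \pi _{down} p\) can be solved for p and compared with (9). The sketch below does this for arbitrary placeholder values of the activation energy, \(\Delta E\) and temperature; the function names and the numbers are illustrative assumptions, not fitted parameters.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def transition_rates(e_act, delta_e, temp_k, pi0=1.0):
    """Up and down transition probabilities of the two-level model."""
    kt = K_B_EV * temp_k
    pi_up = pi0 * math.exp(-e_act / kt)
    pi_down = pi0 * math.exp(-(e_act - delta_e) / kt)
    return pi_up, pi_down


def p_equ(delta_e, temp_k):
    """Equilibrium upper-state fraction according to (9)."""
    return 1.0 / (1.0 + math.exp(delta_e / (K_B_EV * temp_k)))


# Placeholder values for illustration only.
E_ACT, DELTA_E, TEMP_K = 1.2, 0.05, 400.0

pi_up, pi_down = transition_rates(E_ACT, DELTA_E, TEMP_K)
# Solving pi_up * (1 - p) = pi_down * p for p:
p_balance = pi_up / (pi_up + pi_down)
print(p_balance, p_equ(DELTA_E, TEMP_K))
```

Both expressions agree, and the result depends neither on \(\pi _0\) nor on the activation energy, as stated above.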

The varying local environment of the defect sites results in each site having a somewhat different activation energy. For our model, we assume that the site activation energies have a Gaussian distribution with average \(E_{avg}\) and standard deviation \(\sigma _E\). Instead of simulating a continuous distribution, we use 21 groups of sites with \(E_{act}\) ranging from \(E_{avg}-3\sigma _E\) to \(E_{avg}+3\sigma _E\).

Fig. 13 Simple defect model assumed for simulation

The effect on the birefringence is assumed to stem from the difference in polarizability between the two configurations with a certain proportionality constant dB/dp. The modeling of an experimental run starts with all 21 groups of defects having the thermal equilibrium population for the annealing temperature \(p_{equ}(T_{anneal})\). The time-dependence is modeled using the classic Runge–Kutta method for the duration of the experiment with population transition rates evaluated at \(T_{eff}\). The birefringence at any time is calculated as a population-weighted average for upper state occupancy p. We are not taking into account thermo-optic effects so the model only predicts deviations from the asymptotic equilibrium value at \(T_{eff}\).

$$\begin{aligned} \Delta B^{model}(T_{eff},t)=\frac{dB}{dp} \left( \frac{1}{21}\sum _{i=1}^{21}p_i(t) - p_{equ}(T_{eff})\right) . \end{aligned}$$
(10)
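A minimal sketch of such a simulation is given below. It is not the code used for the fits: all parameter values in `PARAMS` are placeholders chosen only to give relaxation over a few hours, and the weighting of the 21 groups is an assumption. Here the groups are equally spaced in activation energy and carry normalized Gaussian weights; replacing the weights by 1/21 each reproduces the prefactor of (10) literally.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def p_equ(delta_e, temp_k):
    """Equilibrium upper-state fraction, Eq. (9)."""
    return 1.0 / (1.0 + math.exp(delta_e / (K_B_EV * temp_k)))


def make_groups(e_avg, sigma_e, n=21):
    """Equally spaced energies on E_avg +/- 3 sigma, Gaussian weights (assumed)."""
    energies = [e_avg + sigma_e * (-3.0 + 6.0 * i / (n - 1)) for i in range(n)]
    raw = [math.exp(-0.5 * ((e - e_avg) / sigma_e) ** 2) for e in energies]
    total = sum(raw)
    return energies, [w / total for w in raw]


def dpdt(p, e_act, delta_e, temp_k, pi0):
    """Rate equation dp/dt = pi_up * (1 - p) - pi_down * p."""
    kt = K_B_EV * temp_k
    pi_up = pi0 * math.exp(-e_act / kt)
    pi_down = pi0 * math.exp(-(e_act - delta_e) / kt)
    return pi_up * (1.0 - p) - pi_down * p


def rk4_step(p, dt, *args):
    """One classic fourth-order Runge-Kutta step."""
    k1 = dpdt(p, *args)
    k2 = dpdt(p + 0.5 * dt * k1, *args)
    k3 = dpdt(p + 0.5 * dt * k2, *args)
    k4 = dpdt(p + dt * k3, *args)
    return p + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def simulate_delta_b(t_anneal_k, t_eff_k, minutes, params, dt=1.0):
    """Return Delta B at t = 0, dt, 2 dt, ... after an equilibrated anneal."""
    energies, weights = make_groups(params["e_avg"], params["sigma_e"])
    pops = [p_equ(params["delta_e"], t_anneal_k)] * len(energies)
    p_eq = p_equ(params["delta_e"], t_eff_k)
    trace = []
    for _ in range(int(round(minutes / dt)) + 1):
        mean_p = sum(w * p for w, p in zip(weights, pops))
        trace.append(params["db_dp"] * (mean_p - p_eq))
        pops = [rk4_step(p, dt, e, params["delta_e"], t_eff_k, params["pi0"])
                for p, e in zip(pops, energies)]
    return trace


# Placeholder parameters (energies in eV, pi0 per minute); NOT fit results.
PARAMS = {"e_avg": 1.2, "sigma_e": 0.03, "delta_e": 0.05,
          "pi0": 1.0e13, "db_dp": 1.0e-3}

trace = simulate_delta_b(209.8 + 273.15, 106.3 + 273.15, 180, PARAMS)
print(trace[0], trace[60], trace[-1])
```

Because each group relaxes with its own rate, the weighted sum decays more slowly at long times than a single exponential would, which is the origin of the stretched exponential shape in this picture. The time step must stay small compared with the fastest group's relaxation time for the Runge-Kutta integration to remain accurate.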

The data underlying the fits in Table 2 was rarefied by reducing the points in the tails of the measurements in order to de-emphasize data with small changes. This resulted in data for the twelve experimental runs listed in Table 2 having between 26 and 49 points each. The theoretical model in (10) is then used to simulate the time evolution of the birefringence change at the temperatures for each of the experiments listed. This value will approach 0 as the site occupancy relaxes to the equilibrium population distribution. The model parameters were optimized by fitting the simulations (lines in Fig. 14) to the experimental points for each experimental run. As mentioned in Sect. 3.1, the asymptotic value is unknown, so the experimental points for each run were shifted by the \(y_\infty\) estimate obtained in the earlier fits to (4). The model parameters that best fit the experimental points, shown as circles in Fig. 14, are listed in Table 5. The uncertainties listed are estimated from the covariance matrix of the numerical fit. Up to 12 h of data was used for the fit, but Fig. 14 shows just the first 3 h of data together with the best fit model simulations. The model correctly predicts the opposite sign of \(\Delta B\) for the one run where evaluation was performed at a temperature higher than the anneal. The model also reproduces the different decay times for different evaluation temperatures as well as the stretched exponential shape. Agreement is generally good, though there are systematic deviations for the larger \(\Delta B\) values. This is likely due to the simplistic assumptions, e.g. having just two states for defect cluster configurations.

The same model was also applied to the MgO experimental runs summarized in Table 3. Each of the 11 experiments contained between 24 and 47 points. Table 6 and Fig. 15 show the corresponding results. The fit uncertainties are higher because the birefringence changes are smaller, increasing the effect of measurement errors. As we saw above in Fig. 11, the activation energy is quite similar in the two materials. The energy difference \(\Delta E\) between the two levels is somewhat lower in MgO, likely because of the different types of defect clusters. The parameter dB/dp is different by about 2\(\times\), both because the cluster density is different and cluster types have different polarizations.

Table 5 CLN experiment fit results
Table 6 MgO experiment fit results
Fig. 14 Best fit results for CLN

Fig. 15 Best fit results for MgO

Fig. 16 Two-temperature anneal

To test the predictive value of the fit results, we conducted an experiment on a CLN sample where an annealed sample (2 h at 209.8 \(^\circ\)C) was re-annealed for 402 min at a temperature of 84.2 \(^\circ\)C before tracking B at \(T_{eff}=106.3\,^\circ\)C. The duration of the second anneal is not sufficient to achieve thermal equilibrium, so the simulation needs to model both the second anneal and the evaluation period to assess the changing level population. Figure 16 shows the simulation result as a solid line and the experimental points as circles. Because only relative birefringence values can be measured, the experimental points have been shifted vertically such that the last experimental point coincides with the simulation curve. The simulation matches the experimental points reasonably well, though there are systematic deviations. The evaluation sections have been fitted to (4). The fit to the experimental points yields \(\tau =127\) min and is shown as a dashed curve. This is significantly larger than the \(69\pm 7\) min listed in Table 2 for the experiments with \(T_{eff}\approx 106.2\,^\circ\)C. The fit to the simulated evaluation period yields \(\tau =98.4\) min. While the agreement between experiment and simulation is not good, the result confirms that a low temperature treatment leading to a non-equilibrated crystal will increase the time constant for the subsequent evaluation at typical temperatures.
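The staged temperature schedule can be illustrated with a short sketch that follows the upper-state population of a single group of sites. It uses the closed-form solution of the two-level rate equation at constant temperature rather than Runge-Kutta integration, and the parameter values are placeholders, not the fitted values of Table 5. In the full model each of the 21 groups would be propagated in the same way.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def p_equ(delta_e, temp_k):
    """Equilibrium upper-state fraction, Eq. (9)."""
    return 1.0 / (1.0 + math.exp(delta_e / (K_B_EV * temp_k)))


def relax(p0, e_act, delta_e, temp_k, pi0, minutes):
    """Closed-form solution of dp/dt = pi_up*(1-p) - pi_down*p at fixed T."""
    kt = K_B_EV * temp_k
    pi_up = pi0 * math.exp(-e_act / kt)
    pi_down = pi0 * math.exp(-(e_act - delta_e) / kt)
    rate = pi_up + pi_down
    p_eq = pi_up / rate
    return p_eq + (p0 - p_eq) * math.exp(-rate * minutes)


def run_schedule(p0, e_act, delta_e, pi0, stages):
    """Apply a list of (temperature in C, duration in min) stages in order."""
    p = p0
    for temp_c, minutes in stages:
        p = relax(p, e_act, delta_e, temp_c + 273.15, pi0, minutes)
    return p


# Placeholder parameters (eV, per minute); NOT fit results.
E_ACT, DELTA_E, PI0 = 1.2, 0.05, 1.0e13

# The first anneal is assumed to have reached equilibrium at 209.8 C.
p_start = p_equ(DELTA_E, 209.8 + 273.15)
# Second anneal: 402 min at 84.2 C, too short to equilibrate.
p_after = run_schedule(p_start, E_ACT, DELTA_E, PI0, [(84.2, 402.0)])
# Population after a further 60 min of evaluation at 106.3 C.
p_eval = run_schedule(p_after, E_ACT, DELTA_E, PI0, [(106.3, 60.0)])
print(p_start, p_after, p_eval)
```

With these placeholder values the population after the second anneal lies between the equilibrium values for 84.2 and 209.8 \(^\circ\)C, which is why the evaluation period cannot be simulated from an equilibrium starting point.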

5 Conclusion

We have tracked the time evolution of LN crystal birefringence after an annealing step for samples held at constant temperature. The birefringence changes over time, and the change is likely caused by defect cluster re-arrangement facilitated by lithium vacancy migration. The effect was quantified for a variety of crystal compositions, annealing temperatures and evaluation temperatures. While the observed birefringence changes are small compared to other well-known effects such as the thermo-optic effect, they vary over timescales ranging from minutes at 120 \(^\circ\)C to months at room temperature. For CLN, the long-term birefringence change is proportional to the temperature difference between anneal and evaluation temperatures, with a constant slope \(\Delta B/\Delta T=- 3.5\times 10^{-6}\) K\(^{-1}\). The effect and timescales observed are consistent with lithium vacancies migrating. Lithium-enriched samples having lower densities of defects show a diminished effect. When designing or manufacturing optical devices that rely on precise matching of birefringence in crystals, these effects must be taken into account. One potential solution is to increase the operating temperature of the crystals above room temperature so that the defect clusters will reach thermal equilibrium in time-scales appropriate for the application. At a minimum, two crystals of equal length forming a matched pair need to be kept together during processes such as annealing and AR coating.

The measurement apparatus could be improved by temperature-stabilizing the detector array and by controlling the lab climate. The method should be applicable to characterizing lithium tantalate as well. It provides a tool to accurately measure birefringence changes and is promising for the study of defect clusters in crystals and their re-arrangement facilitated by ion or vacancy migration.