1 Introduction

Light propagation in sub-wavelength waveguides enables tight confinement over long propagation lengths to enhance nonlinear optical interactions. Not only can sub-wavelength waveguides compress light spatially, they also provide a tunable means to control the spreading of light pulses in time, producing significant effects even for nanojoule pulse energies. By exploring linear and nonlinear light propagation, first for free-space conditions, then for sub-wavelength guided conditions, we demonstrate how sub-wavelength structure can enhance nonlinear optics at the nanoscale. We demonstrate key applications including wavelength generation and all-optical modulation. Lastly, we show how to assemble these devices to form all-optical logic gates.

We begin by developing a fundamental understanding of light propagation in materials that we will build upon as we cover linear and nonlinear light propagation in nano-scale structures. We introduce plane-wave propagation in bulk materials and develop a simple model to explain the frequency dependence of the dielectric function. We observe how this frequency dependence affects optical pulse propagation. With this foundation, we will later explore light-on-light modulation using nonlinear optics.

1.1 Plane Waves

Throughout this chapter, we will develop and use several different wave equations. Each wave equation makes assumptions to localize energy both in time, using pulses, and in space, using waveguides. The first wave equation assumes plane waves at a single frequency (continuous wave). To localize in time, an optical pulse requires the interference of multiple frequencies. Therefore, we model a pulse as a single frequency modulated by an envelope function using the slowly varying envelope approximation. We use waveguides to confine light spatially by taking advantage of sustained propagating solutions of Maxwell’s equations, known as modes. Light guided within a mode propagates in an analogous way to plane waves using the guided-wave equation. Finally, we will augment this equation using the slowly varying envelope approximation to form a fourth wave equation to describe pulses in a waveguide. Strong light-matter interactions create a nonlinear polarization that we must include. We will introduce the physics of nonlinear optics in the simplest way possible, using plane waves. Lastly, we will modify the pulsed waveguide equation to include third-order nonlinear optical effects to form the Nonlinear Schrödinger Equation (NLSE).

1.1.1 Wave Equation for Plane Waves

We start with Maxwell’s equations in a linear, homogeneous material with no free charges or currents:

$$ \begin{array}{llll} {}&{(i)} \ {\vec{\nabla } \cdot \vec{E} = 0} \quad {\left( {\it iii} \right)} \ {\vec{\nabla } \times \vec{E} = - \frac{{\partial \vec{B}}}{{\partial t}}} \\ {}&{\left( {\it ii} \right)} \ {\vec{\nabla } \cdot \vec{B} = 0} \quad {\left( {\it iv} \right)} \ {\vec{\nabla } \times \vec{B} = \frac{{{\mu_r}{\varepsilon_r}}}{{{c^2}}}\frac{{\partial \vec{E}}}{{\partial t}}.}\end{array} $$
(7.1)

In this set of coupled differential equations, the vectors \( \vec{E} \) and \( \vec{B} \) are the electric and magnetic fields, \( {\varepsilon_r} \) and \( {\mu_r} \) are the relative electric permittivity and the relative magnetic permeability, and we define \( c \equiv 1/\sqrt {{{\varepsilon_0}{\mu_0}}} \), which is the speed of light in vacuum. If we take the curl of equation (iii) and substitute equations (i) and (iv), we obtain the wave equation [1]:

$$ {\nabla^2}\vec{E} - \frac{{{\mu_r}{\varepsilon_r}}}{{{c^2}}}\frac{{{\partial^2}\vec{E}}}{{\partial {t^2}}} = 0. $$
(7.2)

To gain physical insight into this equation, we define \( {k^2}/{\omega^2} \equiv {\mu_r}{\varepsilon_r}/{c^2} \) and solve this differential equation in one dimension for \( \vec{E}\left( {z,t} \right) \):

$$ \vec{E}\left( {z,t} \right) = \frac{1}{2}\left[ {{{\vec{E}}_0}{e^{i\left( {kz - \omega t} \right)}} + c.c.} \right] = {\rm Re} \left[ {{{\vec{E}}_0}{e^{i\left( {kz - \omega t} \right)}}} \right], $$
(7.3)

where \( {\vec{E}_0} \) is the complex electric field vector, \( z \) is the position, \( t \) is the time and \( c.c. \) denotes the complex conjugate of the previous term, ensuring the field is a real quantity. The last expression uses phasor notation, which is mathematically more compact. Consequently, we will use phasor notation occasionally and leave it to the reader to take the real part.

1.1.2 Velocity of Plane Waves in a Material

The first question one might ask is: what is the speed of this wave? For a fixed location, the time between crests is \( T = 2\pi /\omega = 1/f \), where \( f \) is the frequency and the quantity \( \omega = 2\pi f \) is the angular frequency. In a similar way, the spacing between crests is given by the wavelength in the material, \( {\lambda_{mat}} = 2\pi /k \), and we refer to \( k = 2\pi /{\lambda_{mat}} \) as the wavevector. In three dimensions, \( k \) is a vector that points in the direction of the phase velocity; however, we will use it as a scalar for one dimension. This wavelength in the material \( {\lambda_{mat}} \) should not be confused with the wavelength in vacuum \( {\lambda_0} \). For most materials we can approximate \( {\mu_r} \approx 1 \) [1]. We also allow the relative permittivity \( {\varepsilon_r} \) to be frequency dependent using \( {\varepsilon_r}\left( \omega \right) \). We determine the speed of the wave \( v \) by observing how long it takes a single crest to propagate one wavelength:

$$ v = \frac{{{\lambda_{mat}}}}{T} = {\lambda_{mat}}f = \frac{\omega }{k} = \frac{c}{{\sqrt {{{\varepsilon_r}\left( \omega \right)}} }} \equiv \frac{c}{{n\left( \omega \right)}}. $$
(7.4)

Here, we define the index of refraction as \( n\left( \omega \right) \equiv ck\left( \omega \right)/\omega \). If \( {\varepsilon_r} \) and \( {\mu_r} \) are unity (their values in vacuum), the velocity is the speed of light in vacuum \( c \). We refer to the velocity at which the crests and the troughs of the wave propagate as the phase velocity, to distinguish it from the pulse or group velocity.
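As a quick numerical illustration of Eq. (7.4), the following sketch computes the refractive index, the phase velocity, and the wavelength in the material for a lossless, non-magnetic medium. The function name `phase_velocity` and the sample values (\( {\varepsilon_r} = 2.25 \), a vacuum wavelength of 800 nm) are assumptions chosen for illustration and do not come from the text.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def phase_velocity(eps_r, mu_r=1.0):
    """Return (n, v) for a lossless medium, following Eq. (7.4)."""
    n = math.sqrt(eps_r * mu_r)
    return n, C / n

# Illustrative (assumed) values: eps_r = 2.25 corresponds to n = 1.5.
n, v = phase_velocity(2.25)
lambda_0 = 800e-9           # vacuum wavelength (m), assumed
lambda_mat = lambda_0 / n   # wavelength inside the material, 2*pi/k

print(f"n = {n:.3f}")
print(f"v = {v:.4e} m/s")
print(f"lambda_mat = {lambda_mat * 1e9:.1f} nm")
```

The wavelength in the material follows from \( {\lambda_{mat}} = 2\pi /k \) with \( k = n\omega /c \), so it is simply the vacuum wavelength divided by \( n \).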

1.1.3 Propagation Losses

We have defined the refractive index in terms of the square root of the dielectric function. What happens if the dielectric function is complex? A complex dielectric constant causes \( k \) to be complex and we must consider the consequences of a complex index of refraction, \( \tilde{n}\left( \omega \right) \). Therefore, we must further define the index of refraction as:

$$ n\left( \omega \right) \equiv {\rm Re} \left[ {\frac{{ck\left( \omega \right)}}{\omega }} \right] = {\rm Re} \left[ {\sqrt {{{\varepsilon_r}\left( \omega \right)}} } \right]. $$
(7.5)

Considering a complex wavevector for a single frequency given by \( \tilde{k} = k^{\prime} + ik^{\prime \prime} \), we get immediate physical insight by observing the propagation of a plane wave in a medium with a complex wavevector:

$$ \vec{E}\left( {z,t} \right) = \frac{1}{2}\left[ {{{\vec{E}}_0}{e^{i\left[ {\left( {k^{\prime} + ik^{\prime \prime}} \right)z - \omega t} \right]}} + c.c.} \right] = \frac{1}{2}\left[ {{{\vec{E}}_0}{e^{ - k^{\prime \prime}z}}{e^{i\left( {k^{\prime}z - \omega t} \right)}} + c.c.} \right] $$
(7.6)

We see that \( k^{\prime \prime} \) exponentially attenuates the wave as it propagates.

We rarely measure the electric field directly and instead measure the time-averaged power. For a plane wave, the time-averaged power per unit area is the intensity or irradiance \( I \), defined by

$$ I = \frac{{c{\varepsilon_0}n}}{2}{\left| {\vec{E}\left( {z,t} \right)} \right|^2}. $$
(7.7)

Writing Eq. (7.6) in terms of intensity, the expression becomes

$$ I = \frac{1}{2}c{\varepsilon_0}n{\left| {{{\vec{E}}_0}} \right|^2}{e^{ - 2k^{\prime \prime}z}}, $$
(7.8)

and the imaginary part of the wavevector \( k^{\prime \prime} \) is responsible for intensity attenuation. Attenuation is important because high intensities are critical for efficient nonlinear interactions; attenuation is therefore a limiting factor. We use this expression to define the attenuation coefficient \( \alpha = 2k^{\prime \prime} \), which has units of inverse length. When the attenuation is due to absorption, we refer to \( \alpha \) as the absorption coefficient. In a similar manner, if we allow the index of refraction to become complex, we define \( \tilde{n} \equiv n + i\kappa \), where the new term \( \kappa \) is known as the extinction coefficient, \( \kappa \left( \omega \right) \equiv ck^{\prime \prime}\left( \omega \right)/\omega \).

In the optical engineering literature, losses are usually notated in units of decibels per length, and it is convenient to relate this convention to the absorption coefficient. For a distance \( L \), the intensity decreases from \( {I_0} \) to \( I(L) \) and the loss is given by [2]:

$$ {\text{loss \ in \ dB}} {=} {-} 10{\log_{10}}\left(\! {\frac{{I(L)}}{{{I_0}}}} \!\right) {=} {-} 10{\log_{10}} \left( {\frac{{{I_0}{e^{ - \alpha L}}}}{{{I_0}}}} \right) {=} 10\left( {\alpha L} \right){\log_{10}}(e) {\approx} 4.34\alpha L. $$
(7.9)

Using this equation and assuming the losses are due to absorption, we can relate all of these quantities:

$$ \begin{gathered} \kappa \left( \omega \right)\left[ {\text{unitless}} \right] = k^{\prime \prime}\left( \omega \right)\frac{c}{\omega }\left[ {k^{\prime \prime}{\text{ in }}{{\text{m}}^{{ - 1}}}} \right] \hfill \\ = \frac{{\alpha \left( \omega \right)}}{2}\frac{c}{\omega }\left[ {\alpha {\text{ in }}{{\text{m}}^{{ - 1}}}} \right] \approx \frac{\text{loss}}{{8.68}}\frac{c}{\omega }\left[ {{\text{loss in dB/m}}} \right]. \hfill \\ \end{gathered} $$
(7.10)

We have assumed that the loss of light is due to absorption. However, any source of attenuation, such as absorption and scattering from inhomogeneities within the materials, limits nonlinear interactions. Therefore, we should use the inclusive definition of \( \alpha \) (the attenuation coefficient) when analyzing nonlinear devices.
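The conversions in Eqs. (7.9) and (7.10) are easy to get wrong by a factor of two, so a short sketch may help. The function names and the sample loss of 3 dB/cm at a vacuum wavelength of 1,550 nm are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def alpha_from_db_per_m(loss_db_per_m):
    """Attenuation coefficient alpha (1/m) from a loss in dB/m, Eq. (7.9)."""
    return loss_db_per_m / (10.0 * math.log10(math.e))

def kappa_from_alpha(alpha, wavelength_0):
    """Extinction coefficient kappa (unitless) from alpha, Eq. (7.10)."""
    omega = 2.0 * math.pi * C / wavelength_0
    return 0.5 * alpha * C / omega

def transmitted_fraction(alpha, length):
    """I(L)/I_0 = exp(-alpha * L), from Eq. (7.8)."""
    return math.exp(-alpha * length)

loss = 300.0  # assumed example: 3 dB/cm expressed in dB/m
alpha = alpha_from_db_per_m(loss)
kappa = kappa_from_alpha(alpha, 1550e-9)

print(f"alpha = {alpha:.2f} 1/m")
print(f"kappa = {kappa:.3e}")
print(f"fraction transmitted after 1 cm = {transmitted_fraction(alpha, 0.01):.3f}")
```

A loss of 3 dB corresponds to roughly half of the intensity being transmitted, which gives a simple check on the value returned by `transmitted_fraction`.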

1.2 Dielectric Function

The frequency-dependent dielectric function produces a frequency-dependent wavevector that is important for pulse propagation. To understand the origins of the frequency-dependent dielectric function, we develop a simple model here.

1.2.1 Drude-Lorentz Model

We classically model the interaction between an electromagnetic wave and electrons bound to their respective ion-cores using the Drude-Lorentz model. By modeling the electron-ion interaction as a one-dimensional simple harmonic oscillator, we explore the features of the dielectric function. The binding force between an electron and its ion is given by

$$ {F_{\it binding}} = - m{\omega_0}^2x, $$
(7.11)

where \( m \) is the mass of the electron, \( {\omega_0} \) is the resonant frequency of the electron-nucleus system, and \( x \) is the displacement of the electron. While in motion, the electron is also subject to a damping force with strength proportional to a constant \( \gamma \), resisting its movement. The damping force is given by

$$ {F_{\it damping}} = - m\gamma \frac{\textit{dx}}{\textit{dt}}. $$
(7.12)

Lastly, a steady oscillating electric field, using Eq. (7.3) with \( z = 0 \), provides the driving force given by:

$$ {F_{\it driving}} = - eE = - e{E_0}{e^{ - i\omega t}}, $$
(7.13)

proportional to the charge of the electron \( e \). The equation of motion thus becomes:

$$ m\frac{{{d^2}x}}{{d{t^2}}} + m\gamma \frac{{dx}}{{dt}} + m{\omega_0}^2x = - e{E_0}{e^{ - i\omega t}}, $$
(7.14)

whose solution is:

$$ x(t) = {x_0}{e^{ - i\omega t}},\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,{x_0} \equiv - \left( {\frac{e}{m}} \right)\frac{1}{{({\omega_0}^2 - {\omega^2}) - i\gamma \omega }}{E_0}. $$
(7.15)

The oscillating electron creates a dipole moment given by

$$ p(t) = - ex(t) = \left( {\frac{{{e^2}}}{m}} \right)\frac{1}{{({\omega_0}^2 - {\omega^2}) - i\gamma \omega }}{E_0}{e^{ - i\omega t}}. $$
(7.16)

In a bulk material, we have \( N \) of these dipoles per unit volume, and we express the resulting polarization (the dipole moment per unit volume) as a vector:

$$ \vec{P}(t) = \frac{{N{e^2}}}{m}\frac{1}{{({\omega_0}^2 - {\omega^2}) - i\gamma \omega }}\vec{E}(t) \equiv {\varepsilon_0}{\chi_e}\vec{E}(t). $$
(7.17)

From the polarization, with the definitions for the susceptibility \( \vec{P}(t) \equiv {\varepsilon_0}{\chi_e}\vec{E}(t) \) and the relative dielectric function \( {\varepsilon_r}\left( \omega \right) \equiv 1 + {\chi_e} \), we solve for the relative dielectric function:

$$ {\varepsilon_r}\left( \omega \right) = 1 + \frac{{N{e^2}}}{{m{\varepsilon_0}}}\frac{1}{{({\omega_0}^2 - {\omega^2}) - i\gamma \omega }}. $$
(7.18)

1.2.2 Dielectric Function for a Single Resonance

We realize that the relative dielectric function is a complex quantity \( {\varepsilon_r}\left( \omega \right) = {\varepsilon_r}^{\prime}\left( \omega \right) + i{\varepsilon_r}^{\prime \prime}\left( \omega \right) \). For a single resonance, we separate Eq. (7.18) into real and imaginary parts:

$$ {\varepsilon_r}^{\prime}\left( \omega \right) {+} i{\varepsilon_r}^{\prime \prime}\left( \!\omega \right) {=} \left( {1 {+} \frac{{N{e^2}}}{{m{\varepsilon_0}}}\frac{{({\omega_0}^2 {-} {\omega^2})}}{{{{({\omega_0}^2 {-} {\omega^2})}^2} {+} {\gamma^2}{\omega^2}}}} \!\right) {+} i\left( \! {\frac{{N{e^2}}}{{m{\varepsilon_0}}}\frac{{\gamma \omega }}{{{{({\omega_0}^2 - {\omega^2})}^2} {+} {\gamma^2}{\omega^2}}}} \! \right)\!. $$
(7.19)

Plotting the displacement of the electron \( x \) as a function of frequency \( \omega \) in Fig. 7.1 (left, solid line), we see that the amplitude is largest on resonance. The phase (dashed line) starts in phase with the driving field for frequencies below resonance and lags behind at higher frequencies. This results in a complex dielectric function which has a strong imaginary component \( {\varepsilon_r}^{\prime \prime}\left( \omega \right) \) on resonance (Fig. 7.1, right, dashed line). Observing the real component \( {\varepsilon_r}^{\prime}\left( \omega \right) \), we see that below the resonance the bound charges keep up with the driving field and the wave propagates more slowly, corresponding to a higher refractive index. At resonance, energy is transferred to the bound charges and the wave is attenuated. Well above the resonance, the bound charges cannot keep up and the dielectric acts like a vacuum.

Fig. 7.1 The displacement amplitude \( {x_0} \) and phase \( \varphi \) versus frequency for the Drude-Lorentz model (left). The complex frequency-dependent dielectric function for a single resonance (right)
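The single-resonance dielectric function of Eqs. (7.18) and (7.19) can be evaluated directly with complex arithmetic. The sketch below uses normalized units (\( {\omega_0} = 1 \)); the damping \( \gamma = 0.1 \) and the lumped oscillator strength \( N{e^2}/m{\varepsilon_0} = 1 \) are assumed values, and the function names are hypothetical.

```python
def eps_lorentz(omega, omega_0, gamma, strength):
    """Complex eps_r from Eq. (7.18); strength stands for N e^2 / (m eps_0)."""
    return 1.0 + strength / complex(omega_0**2 - omega**2, -gamma * omega)

def eps_parts(omega, omega_0, gamma, strength):
    """Real and imaginary parts written out explicitly as in Eq. (7.19)."""
    d = omega_0**2 - omega**2
    denom = d**2 + (gamma * omega)**2
    return 1.0 + strength * d / denom, strength * gamma * omega / denom

# Normalized, assumed parameters: omega_0 = 1, gamma = 0.1, strength = 1.
for omega in (0.5, 1.0, 2.0):
    eps = eps_lorentz(omega, 1.0, 0.1, 1.0)
    re, im = eps_parts(omega, 1.0, 0.1, 1.0)
    # Both routes should agree to rounding error.
    print(f"omega = {omega}: eps_r = {eps:.4f}, parts = ({re:.4f}, {im:.4f})")
```

With these parameters, the imaginary part peaks at \( \omega = {\omega_0} \), while the real part lies above unity below the resonance and below unity just above it.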

1.2.3 Multiple Resonances

In a bulk material, the electric field becomes a vector and we sum over all dipoles, creating a material polarization given by

$$ \vec{P}(t) = \left( {\frac{{N{e^2}}}{m}} \right)\sum\limits_j {\frac{{{f_j}}}{{({\omega_j}^2 - {\omega^2}) - i{\gamma_j}\omega }}\vec{E}(t)} \equiv {\varepsilon_0}{\chi_e}\vec{E}(t). $$
(7.20)

Here, we sum over the individual electrons for each molecule of the bulk material. The subscript \( j \) corresponds to a single resonant frequency \( {\omega_j} \) with a damping coefficient \( {\gamma_j} \), of which there are \( {f_j} \) electrons per molecule and there are N molecules per unit volume. This model is most quantitatively accurate in the dilute gas limit [1]; however, it provides a qualitative insight into solid materials. Using this polarization, the complex relative dielectric function is given by

$$ {\varepsilon_r}\left( \omega \right) = 1 + \left( {\frac{{N{e^2}}}{{{\varepsilon_0}m}}} \right)\sum\limits_j {\frac{{{f_j}}}{{({\omega_j}^2 - {\omega^2}) - i{\gamma_j}\omega }}} . $$
(7.21)

We can extend the dielectric function to include other resonances, as shown in Fig. 7.2. At very high frequencies, the dielectric acts like a vacuum. At ultraviolet and visible frequencies, electronic resonances play a dominant role. Lastly, for low frequencies, the driving fields are slow enough to access both ionic and dipolar resonances.

Fig. 7.2 Schematic illustration of the various contributions to the dielectric constant across the electromagnetic spectrum

1.2.4 Metals

Now, let us consider the case where the binding energy is very weak, as in the case of a metal. Here, we have \( {F_{\it binding}} \approx 0 \). Then the equation of motion reduces to

$$ m\frac{{{d^2}x}}{{d{t^2}}} + m\gamma \frac{{dx}}{{dt}} = - e{E_0}{e^{ - i\omega t}}\!, $$
(7.22)

which has a solution of

$$ x(t) = \left( {\frac{e}{m}} \right)\frac{1}{{{\omega^2} + i\gamma \omega }}{E_0}{e^{ - i\omega t}}. $$
(7.23)

Metals are known for their conductivity. In the low frequency limit, where \( \omega \ll \gamma \), the generated current \( J \) is proportional to the velocity, charge, and number of electrons per unit volume, given by:

$$ J = - Ne\frac{{dx}}{{dt}} = \frac{{N{e^2}}}{m}\frac{1}{{\gamma - i\omega }}E(t) \approx \frac{{N{e^2}}}{{m\gamma }}E(t) \equiv \sigma E(t). $$
(7.24)

From this expression, we readily define the conductivity \( \sigma \). In the high frequency limit \( \omega \gg \gamma \) we find that

$$ J = \frac{{N{e^2}}}{m}\frac{1}{{\gamma - i\omega }}E(t) \approx i\frac{{N{e^2}}}{{m\omega }}E(t) \equiv \sigma E(t), $$
(7.25)

so the conductivity \( \sigma \) is purely imaginary and \( J \) is 90° out of phase with \( E(t) \).
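To see both limits at once, we can evaluate the full expression \( \sigma \left( \omega \right) = N{e^2}/\left[ {m\left( {\gamma - i\omega } \right)} \right] \) from Eq. (7.24) numerically. This is a minimal sketch: the electron density (\( {10^{28}} \) m\(^{-3}\)), the damping rate (\( {10^{14}} \) s\(^{-1}\)), and the two test frequencies are assumed values for illustration only.

```python
E_CHARGE = 1.602176634e-19  # elementary charge (C)
E_MASS = 9.1093837015e-31   # electron mass (kg)

def drude_conductivity(omega, n_density, gamma):
    """sigma(omega) = N e^2 / (m (gamma - i omega)), from Eq. (7.24)."""
    return n_density * E_CHARGE**2 / (E_MASS * complex(gamma, -omega))

n_density = 1e28  # electrons per m^3 (assumed)
gamma = 1e14      # damping rate in 1/s (assumed)

sigma_lo = drude_conductivity(1e10, n_density, gamma)  # omega << gamma
sigma_hi = drude_conductivity(1e16, n_density, gamma)  # omega >> gamma

# Low frequency: nearly real, close to N e^2 / (m gamma).
print(f"low:  {sigma_lo:.4e}")
# High frequency: dominated by the imaginary part, N e^2 / (m omega).
print(f"high: {sigma_hi:.4e}")
```

With the \( {e^{ - i\omega t}} \) time dependence used in this chapter, the high-frequency conductivity has a positive imaginary part, which the printed value makes easy to confirm.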

At optical frequencies, if the frequency is large relative to the damping (\( \omega \gg \gamma \)) we can approximate \( \gamma \approx 0 \), and we obtain the free electron model:

$$ {\varepsilon_r}\left( \omega \right) = 1 - \frac{{N{e^2}}}{{m{\varepsilon_0}}}\frac{1}{{{\omega^2}}} \equiv 1 - \frac{{{\omega_p}^2}}{{{\omega^2}}}. $$
(7.26)

Here, we define the plasma frequency through \( {\omega_p}^2 \equiv N{e^2}/\left( {m{\varepsilon_0}} \right) \). For a typical metal, having \( {10^{22}} \) electrons/cm\(^3\), this density corresponds to a plasma frequency of \( 6 \times {10^{15}} \) rad/s or a vacuum wavelength of 330 nm. When the frequency of light is above the plasma frequency, the dielectric function is real and positive (\( {\varepsilon_r}^{\prime \prime} = 0 \)) and the metal is transparent. At the plasma frequency, the real part of the dielectric function becomes zero. Below the plasma frequency, the dielectric function is negative, so the refractive index is purely imaginary; the wave does not propagate and is reflected instead. Therefore, the metal acts as a high-pass filter and can be used as a mirror.
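The quoted numbers can be reproduced with a few lines of arithmetic. The sketch below takes the electron density from the text (\( {10^{22}} \) electrons/cm\(^3\)) and uses standard values for the physical constants; the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge (C)
E_MASS = 9.1093837015e-31   # electron mass (kg)
EPS_0 = 8.8541878128e-12    # vacuum permittivity (F/m)
C = 299_792_458.0           # speed of light in vacuum (m/s)

def plasma_frequency(n_density):
    """omega_p = sqrt(N e^2 / (m eps_0)), in rad/s, with N in 1/m^3."""
    return math.sqrt(n_density * E_CHARGE**2 / (E_MASS * EPS_0))

def eps_free_electron(omega, omega_p):
    """Lossless free-electron dielectric function, Eq. (7.26)."""
    return 1.0 - (omega_p / omega) ** 2

n_density = 1e22 * 1e6  # convert 1e22 cm^-3 to m^-3
omega_p = plasma_frequency(n_density)
lambda_p = 2.0 * math.pi * C / omega_p

print(f"omega_p = {omega_p:.2e} rad/s")
print(f"lambda_p = {lambda_p * 1e9:.0f} nm")
print(f"eps_r at 2*omega_p   = {eps_free_electron(2.0 * omega_p, omega_p):.2f}")
print(f"eps_r at 0.5*omega_p = {eps_free_electron(0.5 * omega_p, omega_p):.2f}")
```

The last two lines illustrate the sign change of the dielectric function across the plasma frequency: positive above it and negative below it.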

1.3 Pulse Propagation

Short optical pulses are a key tool for nonlinear optical research as they can achieve high intensities even with small pulse energies and low average powers. However, pulsed transmission requires multiple frequencies propagating coherently together, as shown in Fig. 7.3 (left). The frequency-dependent phase velocity causes the pulse envelope to propagate at its own velocity and can lead to temporal pulse spreading during propagation, as shown in Fig. 7.3 (right). In this section, we will address these issues and develop the mathematical framework to handle pulses.

Fig. 7.3 A pulse is made up of many frequencies (left) which can propagate with their own phase velocities, leading to temporal pulse spreading (right)

1.3.1 Phase Versus Group Velocity

To understand how a pulse propagates, let us begin with a simple model (two continuous waves of differing angular frequencies and wavevectors) and explore what happens when they propagate together. The two propagating waves are given by:

$$ {y_1} = A\sin \left( {{k_1}z - {\omega_1}t} \right)\,\,{\text{and}}\,\,{y_2} = A\sin \left( {{k_2}z - {\omega_2}t} \right). $$
(7.27)

Here, \( A \) is the amplitude of the waves (identical for both), \( {k_1} \) and \( {k_2} \) are the wavevectors, and \( {\omega_1} \) and \( {\omega_2} \) are the angular frequencies of the first and second waves, respectively. Each wave has a speed, or phase velocity, given by:

$$ {v_1} = \frac{{{\omega_1}}}{{{k_1}}} = {f_1}{\lambda_1}\,\,{\text{and}}\,\,{v_2} = \frac{{{\omega_2}}}{{{k_2}}} = {f_2}{\lambda_2}. $$
(7.28)

If we superimpose these waves by adding them and then apply the trigonometric identity,

$$ \sin \alpha + \sin \beta = 2\cos \left( {\frac{{\alpha - \beta }}{2}} \right)\sin \left( {\frac{{\alpha + \beta }}{2}} \right), $$
(7.29)

we arrive at the following result:

$$ y {\equiv} {y_1} + {y_2} {=} 2A\cos \left[ {\frac{1}{2}\left( {z\left( {{k_1} - {k_2}} \right) {-} t\left( {{\omega_1} - {\omega_2}} \right)} \right)} \right]\sin \left[ {\frac{{{k_1} + {k_2}}}{2}z - \frac{{{\omega_1} {+} {\omega_2}}}{2}t} \right]. $$
(7.30)

We can simplify this expression using the following definitions:

$$ \begin{array}{llll} \Delta k &\equiv {k_1} - {k_2} {\text{ \ and \ }} {\Delta \omega \equiv {\omega_1} - {\omega_2}} \\ k &\equiv \frac{{{k_1} + {k_2}}}{2} {\text{ \ and \ }} {\omega \equiv \frac{{{\omega_1} + {\omega_2}}}{2},} \end{array} $$
(7.31)

to show

$$ y = 2A\cos \left[ {\frac{1}{2}\left( {z\Delta k - t\Delta \omega } \right)} \right]\sin \left( {kz - \omega t} \right). $$
(7.32)

This expression takes the form of a fast oscillating term (the sine function) modulated slowly by the cosine function. To visualize this behavior, we set \( t = 0 \) and plot Eq. (7.32) in Fig. 7.4. We see the beating between these two frequencies forms the basis of a simple pulse or group. We call the cosine function the envelope, and the fast oscillatory part the carrier.

Fig. 7.4 Two waves interfere, forming a simple pulse train whereby a carrier wave is modulated by a slowly varying envelope \( \cos \left( {z\Delta k/2} \right) \)

Observing Eq. (7.32), we see two relevant velocities. The velocity of the carrier wave from the sine term is:

$$ {v_p} = \frac{\omega }{k} = f\lambda . $$
(7.33)

We refer to this velocity as the phase velocity. Similarly, the envelope has a velocity given by

$$ {v_g} = \frac{{\Delta \omega }}{{\Delta k}} \to \frac{{d\omega }}{{dk}}, $$
(7.34)

which we refer to as the group velocity. If we consider the case of no dispersion, the phase velocities of the two waves are the same \( {v_p} = \frac{{{\omega_1}}}{{{k_1}}} = \frac{{{\omega_2}}}{{{k_2}}} \) and we find that:

$$ \begin{array}{llll} {v_g} &= \frac{{\Delta \omega }}{{\Delta k}} = \frac{{{\omega_1} - {\omega_2}}}{{{k_1} - {k_2}}} = \frac{{{\omega_1}/\left( {{k_1}{k_2}} \right) - {\omega_2}/\left( {{k_1}{k_2}} \right)}}{{1/{k_2} - 1/{k_1}}} \hfill \\ &= \frac{{{v_p}/{k_2} - {v_p}/{k_1}}}{{1/{k_2} - 1/{k_1}}} = {v_p}\frac{{1/{k_2} - 1/{k_1}}}{{1/{k_2} - 1/{k_1}}} = {v_p}, \hfill \\ \end{array} $$
(7.35)

and therefore the group velocity and the phase velocity are identical. There are several categories of linear dispersion as shown in Table 7.1.

Table 7.1 Types of linear dispersion

If we consider a single optical element, such as a mirror, lens, or length of fiber, the group velocity will cause a fixed time-delay for the pulse, known as the group delay. The group delay can deviate substantially from the delay caused by the phase velocity (inversely dependent on the index) in a dispersive medium. The difference between the phase and group velocity within a single optical element will cause an offset between the absolute phase of the carrier and the envelope of the pulse, known as the carrier-envelope offset.
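The two-wave model of Eqs. (7.27) to (7.35) can be checked numerically. The following sketch compares the direct sum of the two sine waves with the envelope-carrier form of Eq. (7.32), and evaluates the phase and group velocities with and without dispersion. All wavevectors and frequencies are assumed values in arbitrary units, and the function names are hypothetical.

```python
import math

def superposition(z, t, k1, w1, k2, w2, amp=1.0):
    """Direct sum y1 + y2 of the two waves in Eq. (7.27)."""
    return amp * math.sin(k1 * z - w1 * t) + amp * math.sin(k2 * z - w2 * t)

def envelope_carrier(z, t, k1, w1, k2, w2, amp=1.0):
    """Envelope (cosine) times carrier (sine), Eq. (7.32)."""
    dk, dw = k1 - k2, w1 - w2
    k, w = 0.5 * (k1 + k2), 0.5 * (w1 + w2)
    return 2.0 * amp * math.cos(0.5 * (z * dk - t * dw)) * math.sin(k * z - w * t)

def velocities(k1, w1, k2, w2):
    """Return (v_p, v_g) from Eqs. (7.33) and (7.34) for two discrete waves."""
    v_p = (w1 + w2) / (k1 + k2)
    v_g = (w1 - w2) / (k1 - k2)
    return v_p, v_g

# Dispersive case (assumed values): the two phase velocities differ.
print(velocities(10.0, 20.0, 9.0, 18.5))
# No dispersion: both waves have omega/k = 2, so v_g equals v_p.
print(velocities(10.0, 20.0, 9.0, 18.0))
# The two forms of the superposition agree at an arbitrary point.
print(superposition(0.3, 0.7, 10.0, 20.0, 9.0, 18.5),
      envelope_carrier(0.3, 0.7, 10.0, 20.0, 9.0, 18.5))
```

In the dispersive case the group velocity differs from the phase velocity, whereas in the dispersion-free case the two coincide, as Eq. (7.35) states.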

1.3.2 Gaussian Pulse

In our simple model, we have only considered two propagating waves. As we add more waves in between these initial waves, the envelope is no longer defined by a simple cosine function and the pulses can separate in time, as shown in Fig. 7.3. To describe such a pulse, we use the slowly-varying envelope approximation (in phasor notation):

$$ \vec{E}\left( {z,t} \right) = A\left( {z,t} \right){e^{i\left( {kz - {\omega_0}t} \right)}}\hat{x}. $$
(7.36)

Here, \( A\left( {z,t} \right) \) is the envelope function and \( \exp \left( {i\left( {kz - {\omega_0}t} \right)} \right) \) is the carrier. A Gaussian is a common shape for an ultrashort pulse, given by:

$$ \begin{array}{llll} \vec{E}\left( {z,t} \right) &= {{\vec{E}}_0}\exp \left( { - \frac{1}{2}{{\left( {\frac{t}{\tau }} \right)}^2}} \right){e^{i\left( {kz - {\omega_0}t} \right)}} \hfill \\ &= {{\vec{E}}_0}\exp \left( { - 2\ln (2){{\left( {\frac{t}{{{\tau_{\it FWHM}}}}} \right)}^2}} \right){e^{i\left( {kz - {\omega_0}t} \right)}}, \hfill \\ \end{array} $$
(7.37)

Here, \( {\vec{E}_0} \) is the amplitude of the electric field, \( t \) is time and both \( \tau \) and \( {\tau_{\it FWHM}} \) reflect the pulse duration. Additionally, \( {\omega_0} \) and \( k \) are the angular frequency and wavevector of the carrier wave (respectively).

We can measure the pulse duration in several ways by applying different clipping levels (\( 1/e \), \( 1/{e^2} \), full-width at half maximum) and calculating these in terms of either the electric field or the power. Using Eq. (7.37), we give two definitions for the pulse duration with respect to the time-averaged power:

$$ P(t) = {P_p}\exp \left( { - {{\left( {\frac{t}{\tau }} \right)}^2}} \right) = {P_p}\exp \left( { - 4\ln (2){{\left( {\frac{t}{{{\tau_{\it FWHM}}}}} \right)}^2}} \right). $$
(7.38)

Here, \( P(t) \) is the power as a function of time \( t \), and \( {P_p} \) is the peak power. In the first pulse-duration definition, \( \tau \) is the duration from the peak to the \( 1/e \)-clipping level and is often used for its mathematical simplicity. Experimentally, we use the full-width at half-maximum duration, related to \( \tau \) using \( {\tau_{\it FWHM}} = 2\sqrt {{\ln 2}} \tau \).

A Gaussian pulse is very convenient because the spectrum also takes the shape of a Gaussian. A pulse is as short as possible if the spectral phase is frequency-independent. We refer to such a pulse as being transform limited and we use the time-bandwidth product: \( {\tau_{\it FWHM}}\Delta \nu \approx 0.44 \) (for a Gaussian, with both the duration and the spectral width measured as the full-width at half-maximum). Alternatively, we can write this expression in terms of wavelength to show: \( {\tau_{\it FWHM}} \approx 0.44{\lambda^2}/\left( {c\Delta \lambda } \right) \). Thus, the transform limited pulse duration is inversely proportional to the spectral width. For example, a transform limited 100-fs pulse has a spectral width of \( \Delta \lambda \approx 9.4\ {\text{nm}} \) at 800 nm and a width of \( \Delta \lambda \approx 35.3\ {\text{nm}} \) at 1,550 nm.
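The example values above follow directly from the time-bandwidth product. A minimal sketch, assuming a Gaussian pulse with a product of 0.44 and a hypothetical helper named `spectral_width`:

```python
C = 299_792_458.0    # speed of light in vacuum (m/s)
TBP_GAUSSIAN = 0.44  # time-bandwidth product for a Gaussian (FWHM widths)

def spectral_width(duration_fwhm, wavelength_0, tbp=TBP_GAUSSIAN):
    """Spectral width (m) of a transform-limited pulse.

    Rearranged from duration = tbp * lambda^2 / (c * delta_lambda).
    """
    return tbp * wavelength_0**2 / (C * duration_fwhm)

for wavelength_0 in (800e-9, 1550e-9):
    width = spectral_width(100e-15, wavelength_0)
    print(f"{wavelength_0 * 1e9:.0f} nm: delta_lambda = {width * 1e9:.1f} nm")
```

Because the width scales with \( {\lambda^2} \), the same pulse duration requires almost four times more bandwidth (in nanometers) at 1,550 nm than at 800 nm.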

1.4 Temporal Pulse Broadening

We observed how a frequency-dependent propagation constant gives rise to phase velocity and group velocity. If we continue to expand the propagation constant in a Taylor series, we see:

$$ \begin{array}{llll} k\left( \omega \right) &= {k_0} + {\left. {\frac{{dk}}{{d\omega }}} \right|_{{\omega_0}}}\left( {\omega - {\omega_0}} \right) + \frac{1}{2}{\left. {\frac{{{d^2}k}}{{d{\omega^2}}}} \right|_{{\omega_0}}}{\left( {\omega - {\omega_0}} \right)^2} + ... \hfill \\ & = {k_0} + k^{\prime}\left( {\omega - {\omega_0}} \right) + \frac{1}{2}k^{\prime \prime}{\left( {\omega - {\omega_0}} \right)^2} + ... \hfill \\ \end{array} $$
(7.39)

Of these coefficients, we recall that \( {k_0} \) and \( k^{\prime} \) are related to the phase and group velocities, respectively. Note that from here on, primes on \( k \) denote derivatives with respect to \( \omega \), not the real and imaginary parts of the complex wavevector used in Eq. (7.6). We find that higher order dispersion, beginning with \( k^{\prime \prime} \), changes the shape of the pulse and reduces the peak intensity, and is therefore a critical factor for many nonlinear experiments.

1.4.1 Group Velocity Dispersion

There is a simple way to think about the spreading of a pulse as it propagates in a material. Consider a transform limited pulse of light, consisting of a spectrum of colors. If we cut the spectrum in half, we will have a higher-energy “blue” pulse and a lower-energy “red” pulse. If we propagate these partial pulses in a medium with only linear dispersion (\( k^{\prime} \) is constant and \( k^{\prime \prime} = 0 \)), both will propagate with the same group velocity, and we can recombine the blue and red pulses to obtain the original pulse duration. If the medium has higher order dispersion, \( k^{\prime \prime} \ne 0 \) and the group velocity changes as a function of frequency, causing the red and blue pulses to separate in time as they propagate. When we recombine them, there will be a delay between the red and blue pulses, and their combination will be of longer duration than the original pulse. This fixed delay is known as the group delay dispersion (GDD). An optical element, such as a fixed length of fiber or a microscope objective, will have a fixed amount of GDD.

If we care about the pulse duration as we propagate, we require the GDD per unit length, which is known as the group velocity dispersion (GVD). We define the group velocity dispersion in a material as:

$$ {\text{GVD}} = \frac{\partial }{{\partial \omega }}\left( {\frac{1}{{{v_g}}}} \right) = \frac{\partial }{{\partial \omega }}\left( {\frac{{\partial k}}{{\partial \omega }}} \right) = \frac{{{\partial^2}k}}{{\partial {\omega^2}}}, $$
(7.40)

which has units of time squared per length (often \( {\text{fs}^2}/{\text{mm}} \)).
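When only \( k\left( \omega \right) \) is known, the derivatives in Eqs. (7.39) and (7.40) can be estimated with central finite differences. The sketch below does this for a toy, lossless single-resonance model of the refractive index; the resonance frequency, the oscillator strength, and the function names are all assumptions made for illustration and do not describe a real material.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def k_toy(omega, omega_0=2.0e16, strength=4.0e32):
    """k = n(omega) * omega / c for an assumed lossless single resonance."""
    n = math.sqrt(1.0 + strength / (omega_0**2 - omega**2))
    return n * omega / C

def derivatives(k_func, omega, h):
    """Central differences for dk/domega and d^2k/domega^2 at omega."""
    k_minus, k_mid, k_plus = k_func(omega - h), k_func(omega), k_func(omega + h)
    first = (k_plus - k_minus) / (2.0 * h)
    second = (k_plus - 2.0 * k_mid + k_minus) / h**2
    return first, second

omega = 2.0 * math.pi * C / 800e-9  # carrier at a vacuum wavelength of 800 nm
k_prime, k_double_prime = derivatives(k_toy, omega, h=1e-3 * omega)

v_phase = omega / k_toy(omega)
v_group = 1.0 / k_prime
gvd_fs2_per_mm = k_double_prime * 1e30 * 1e-3  # s^2/m -> fs^2/mm

print(f"v_phase / c = {v_phase / C:.4f}")
print(f"v_group / c = {v_group / C:.4f}")
print(f"GVD = {gvd_fs2_per_mm:.1f} fs^2/mm")
```

For this toy model, the carrier lies well below the resonance, so the second derivative is positive and the group velocity is smaller than the phase velocity.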

We can clarify the difference between GDD and GVD by considering an experiment that consists of several optics (mirrors, lenses, etc.) leading up to a nonlinear pulse propagation experiment (for example, a fiber or a photonic chip). Each optical element before the fiber adds a fixed amount of GDD, broadening the pulse. However, we must consider the interplay between the nonlinearity and dispersion as the pulse propagates within the fiber and therefore, we consider the fiber’s GVD.

Just as we had normal and anomalous dispersion previously, the group velocity dispersion can be normal or anomalous. Here, we consider positive values of the GVD as normal and negative values as anomalous. This labeling convention is because most materials at visible wavelengths show normal GVD. For example, the GVD of silica is normal (positive) in the visible, reaches zero around 1.3 μm, and becomes anomalous (negative) for longer wavelengths, such as 1.5 μm.

To make things slightly more confusing, the optical engineering community takes the derivative of the refractive index with respect to the wavelength, producing another term, known as the dispersion parameter \( D \) given by:

$$ D = - \frac{\lambda }{c}\frac{{{d^2}n}}{{d{\lambda^2}}} = - \frac{{2\pi c}}{{{\lambda^2}}}GVD = - \frac{{2\pi c}}{{{\lambda^2}}}\frac{{{\partial^2}k}}{{\partial {\omega^2}}}. $$
(7.41)

The dispersion parameter has units of time per length squared, often specified in ps/nm/km. These units are convenient when we estimate strong pulse broadening. It is important to note that the sign of the dispersion parameter is opposite to that of GVD, and thus, a material with anomalous dispersion has a positive dispersion parameter. These are often used interchangeably, so one should avoid saying “positive” and “negative” dispersion and instead use “normal” and “anomalous” dispersion.
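Converting between the two conventions in Eq. (7.41) is mostly a matter of unit bookkeeping. In the sketch below, the GVD of \( - 22\ {\text{fs}^2}/{\text{mm}} \) at 1,550 nm is an assumed, illustrative value, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def dispersion_parameter(gvd, wavelength_0):
    """D in s/m^2 from GVD in s^2/m, using Eq. (7.41)."""
    return -2.0 * math.pi * C / wavelength_0**2 * gvd

def to_ps_per_nm_km(d_si):
    """Convert D from s/m^2 to ps/(nm km): 1 s/m^2 = 1e6 ps/(nm km)."""
    return d_si * 1e6

gvd = -22e-27  # assumed: -22 fs^2/mm expressed in s^2/m
d_si = dispersion_parameter(gvd, 1550e-9)

print(f"D = {to_ps_per_nm_km(d_si):.1f} ps/nm/km")
```

The negative (anomalous) GVD produces a positive dispersion parameter, which illustrates the sign flip between the two conventions.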

1.4.2 Dispersive Pulse Broadening

To observe dispersive broadening, we can consider the Gaussian pulse from Eq. (7.37) and apply the effects of group velocity dispersion given by \( k^{\prime \prime} \). To simplify the analysis, we assume \( {k_0} = 0 \) and \( k^{\prime} = 0 \), both of which will not broaden our pulse, as we have shown. We take the Fourier transform of Eq. (7.37):

$$ E\left( {z = 0,\omega } \right) = \frac{1}{{\sqrt {{2\pi }} }}\int_{ - \infty }^\infty {{E_0}{e^{\frac{{ - {t^2}}}{{2{\tau^2}}}}}{e^{ - i{\omega_0}t}}{e^{i\omega t}}dt} = {E_0}\tau {e^{\frac{{ - {\tau^2}{{\left( {{\omega_0} - \omega } \right)}^2}}}{2}}}. $$
(7.42)

From here, we can add the spectral phase and take the inverse Fourier transform:

$$ \begin{gathered} E\left( {z = L,t} \right) = \frac{1}{{\sqrt {{2\pi }} }}\int_{ - \infty }^\infty {{E_0}\tau {e^{\frac{{ - {\tau^2}{{\left( {{\omega_0} - \omega } \right)}^2}}}{2}}}{e^{ - i\frac{{k^{\prime \prime}}}{2}{{\left( {\omega - {\omega_0}} \right)}^2}L}}{e^{ - i\omega t}}d\omega } \hfill \\ = \frac{{\tau {E_0}}}{{\sqrt {{{\tau^2} + ik^{\prime \prime}L}} }}{e^{\frac{{ - {t^2}}}{{2{\tau^2}{\left( {1 + {{\left( {\frac{{k^{\prime \prime}L}}{{{\tau^2}}}} \right)}^2}} \right)}}}}}{e^{i\frac{{k^{\prime \prime}L{t^2}}}{{2\left( {{{\left( {k^{\prime \prime}L} \right)}^2} + {\tau^4}} \right)}}}}{e^{ - i{\omega_0}t}}. \end{gathered} $$
(7.43)

We notice that the amplitude changes as the pulse broadens and there is additional phase from the propagation. We also observe that the middle term closely resembles the form of Eq. (7.42) if we let:

$$ \tau ^{\prime} = \tau \sqrt {{1 + {{\left( {\frac{{k^{\prime \prime}L}}{{{\tau^2}}}} \right)}^2}}}, $$
(7.44)

thus, the pulse broadens in time by \( \sqrt {{1 + {{\left( {k^{\prime \prime}L/{\tau^2}} \right)}^2}}} \). We can write this expression in terms of the full-width at half-maximum to show:

$$ \tau {^{\prime}_{\it FWHM}} = {\tau_{\it FWHM}}\sqrt {{1 + {{\left( {4\ln 2\frac{{\it GVD}}{{{\tau_{\it FWHM}}^2}}d} \right)}^2}}} \approx 4\ln 2\frac{{\it GVD}}{{{\tau_{\it FWHM}}}}d, $$
(7.45)

as it propagates through a distance \( d \). This approximation is valid for strong dispersive broadening. For silica fiber at 800 nm, GVD = 36 fs²/mm, which corresponds to a dispersion parameter \( D \) of −106 ps/nm/km. Consider a 100 fs pulse at 800 nm; after propagating through 1 m of silica fiber, it will have a duration of approximately 1 ps. As the broadening is very strong, we see that using the dispersion parameter is convenient to estimate the pulse duration. For this situation, a 100 fs pulse at 800 nm has a spectral bandwidth of \( \Delta \lambda \approx 9.4{\text{ nm}} \), thus \( \Delta \tau \approx \left| { - 106{\text{ ps/nm/km}}} \right|\left( {9.4\,{\text{nm}} \times 1{\text{ m}}} \right) = 996{\text{ fs}} \), which is approximately correct. However, this expression is invalid for small broadening, for example, from a 10-cm length of fiber. Recall that a short pulse has a wide bandwidth and is therefore more susceptible to dispersive broadening than a longer pulse with a reduced bandwidth. This phenomenon creates the counter-intuitive effect where a narrowed spectrum can produce a shorter pulse for a given amount of dispersion.
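
The two forms of Eq. (7.45) can be compared numerically. The sketch below uses the silica numbers from the paragraph above; the function names are illustrative, and the factor \( 4\ln 2 \) assumes that \( {\tau_{\it FWHM}} \) is the intensity full-width at half-maximum.

```python
import math


def broadened_fwhm_fs(tau_fwhm_fs, gvd_fs2_per_mm, distance_mm):
    """Exact form of Eq. (7.45): FWHM duration after a dispersive length."""
    x = 4.0 * math.log(2.0) * gvd_fs2_per_mm * distance_mm / tau_fwhm_fs**2
    return tau_fwhm_fs * math.sqrt(1.0 + x**2)


def strong_broadening_fwhm_fs(tau_fwhm_fs, gvd_fs2_per_mm, distance_mm):
    """Approximate form of Eq. (7.45), valid only for strong broadening."""
    return 4.0 * math.log(2.0) * gvd_fs2_per_mm * distance_mm / tau_fwhm_fs


# 100 fs pulse in silica (36 fs^2/mm) after 1 m and after 10 cm.
for length_mm in (1000.0, 100.0):
    exact = broadened_fwhm_fs(100.0, 36.0, length_mm)
    approx = strong_broadening_fwhm_fs(100.0, 36.0, length_mm)
    print(f"{length_mm:6.0f} mm: exact {exact:7.1f} fs, approximation {approx:7.1f} fs")
```

After 1 m the two values agree to within about a percent (roughly 1 ps). After 10 cm the approximation returns about 100 fs while the exact expression gives about 141 fs, which illustrates why the strong-broadening estimate fails for short lengths.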

1.4.3 Group Velocity Dispersion Compensation

To facilitate strong nonlinear interactions with low pulse energies, we require very short pulses. As each optical element adds dispersion, potentially broadening the pulse, we address the practical concern of pulse compression here. As the change in pulse duration is a linear effect, we can counteract an amount of normal GDD with an equal amount of anomalous GDD to recover the original pulse duration. With sufficient GDD, we can pre-compensate additional optics to form the shortest pulse somewhere later in the optical path. For visible and most NIR optical elements, the dispersion is normal. To compensate, we can use a device that has tunable anomalous GDD known as a pulse compressor.

Tunable anomalous group delay dispersion can be achieved using both gratings and prisms, as shown in Fig. 7.5 (top and bottom, respectively). These configurations consist of four elements (gratings or prisms) such that the first two provide half of the intended group delay dispersion starting with a collimated beam and ending with parallel, spatially dispersed colors. The second pair provides the second half of the GDD and reforms the collimated beam. Very often, we use a single grating or prism pair with a mirror that reflects the spatially dispersed colors backward to “fold” the compressor onto itself. This folding-mirror is angled slightly so that the compressed, retroreflected beam can be diverted by another mirror that is initially missed by the incoming beam.

Fig. 7.5

Schematic of two types of pulse compressors: a grating compressor (top) and a prism compressor (bottom)

To determine the GDD through a grating or prism pair, we determine the wavelength-dependent path length and the resulting phase through a single pair. From here, we calculate \( {d^2}\varphi /d{\omega^2} \). For the case of a grating pair, the total GDD is [3]

$$ \frac{{{d^2}\varphi }}{{d{\omega^2}}} = - \frac{{\left( {1/{\omega^2}} \right)\left( {\lambda /\Lambda } \right)\left( {2\pi L/\Lambda } \right)}}{{{{\left[ {1 - {{\left( {\sin {\theta_i} - \lambda /\Lambda } \right)}^2}} \right]}^{3/2}}}}. $$
(7.46)

Here, \( \omega \) is the angular frequency, \( \lambda \) is the wavelength, \( \Lambda \) is the spatial period, \( {\theta_i} \) is the angle of incidence of the incoming light, and \( L \) is the separation of the gratings. We tune the amount of dispersion compensation by changing the separation of the gratings, \( L \).
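
As a rough numerical illustration of Eq. (7.46), the sketch below evaluates the GDD of a single grating pair. The grating density (1200 lines/mm), incidence angle (30°), and separation (1 cm) are assumed example values, not parameters taken from the text.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum (m/s)


def grating_pair_gdd_fs2(wavelength_m, period_m, incidence_rad, separation_m):
    """GDD of a single grating pair from Eq. (7.46), returned in fs^2."""
    omega = 2.0 * math.pi * C_M_PER_S / wavelength_m
    ratio = wavelength_m / period_m
    denom = (1.0 - (math.sin(incidence_rad) - ratio) ** 2) ** 1.5
    gdd_s2 = -(1.0 / omega**2) * ratio * (2.0 * math.pi * separation_m / period_m) / denom
    return gdd_s2 * 1e30  # s^2 -> fs^2


# Assumed example: 800 nm light, 1200 lines/mm, 30 degree incidence, 1 cm separation.
gdd = grating_pair_gdd_fs2(800e-9, 1e-3 / 1200.0, math.radians(30.0), 0.01)
print(f"GDD = {gdd:.0f} fs^2")
```

The result is negative (anomalous) and scales linearly with the separation \( L \), which is what makes the separation a convenient tuning knob.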

For the case of a prism pair, the situation is more complicated because of the additional material dispersion within the prism itself. We find the following approximation is useful for small amounts of material dispersion when the prism pair is cut and aligned to the Brewster angle (for maximum efficiency) [4]:

$$ \frac{{{d^2}\varphi }}{{d{\omega^2}}} \approx \frac{{{\lambda^3}}}{{2\pi {c^2}}}\left[ { - 8L{{\left( {\frac{{dn}}{{d\lambda }}} \right)}^2} + 8\left( {\frac{{{d^2}n}}{{d{\lambda^2}}}} \right)\left( {{D_{1/{e^2}}}} \right)} \right]. $$
(7.47)

Here, \( L \) is the separation of the prisms, \( {D_{1/{e^2}}} \) is the beam diameter (at the \( 1/{e^2} \) clipping level), \( n \) is the index of refraction, and \( c \) is the speed of light. The first term provides tunable anomalous dispersion and depends on the prism material and the separation of the prisms (similar to gratings). The second term is very often normal and depends on the propagation length in the prism material.

For a fixed separation \( L \), the amount of dispersion compensation obtained using gratings is very large compared to prisms, making for shorter separation distances and potentially smaller footprints. For example, to compensate roughly 1,000 fs², fused silica prisms require a separation of 77 cm and SF10 glass prisms only require 21 cm [4]. However, grating compressors typically have high amounts of loss and cannot be used for all applications, such as intra-cavity dispersion compensation for a femtosecond laser.
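
Eq. (7.47) can be explored with a short sketch. All numerical inputs below are assumptions for illustration: \( dn/d\lambda \approx -1.73 \times 10^4 \) m⁻¹ is an assumed value for fused silica near 800 nm, \( d^2n/d\lambda^2 \approx 3.97 \times 10^{10} \) m⁻² is chosen to be consistent with the 36 fs²/mm GVD quoted earlier (through Eq. (7.41)), and the 2 mm beam diameter is arbitrary.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum (m/s)


def prism_pair_gdd_fs2(wavelength_m, separation_m, beam_diameter_m,
                       dn_dlambda, d2n_dlambda2):
    """Approximate GDD of a Brewster prism pair from Eq. (7.47), in fs^2.

    dn_dlambda is in 1/m and d2n_dlambda2 is in 1/m^2.
    """
    prefactor = wavelength_m**3 / (2.0 * math.pi * C_M_PER_S**2)
    geometric = -8.0 * separation_m * dn_dlambda**2      # tunable, anomalous
    material = 8.0 * d2n_dlambda2 * beam_diameter_m      # usually normal
    return prefactor * (geometric + material) * 1e30     # s^2 -> fs^2


# Assumed fused-silica-like inputs at 800 nm with a 77 cm prism separation.
net_gdd = prism_pair_gdd_fs2(800e-9, 0.77, 2e-3, -1.73e4, 3.97e10)
print(f"net GDD = {net_gdd:.0f} fs^2")
```

With these assumed inputs, the 77 cm separation gives a net GDD on the order of −1,000 fs², in line with the fused silica example above. Setting the separation to zero leaves only the positive material term.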

2 Nonlinear Optics

The strong electric fields achievable in a laser can drive the motion of electrons and atoms to create a nonlinear polarization. This nonlinear polarization gives rise to many effects, from the generation of new frequencies to light-by-light modulation. In this section, we will explore these effects and establish a foundation to later explore all-optical devices.

2.1 Nonlinear Polarization

For weak electric fields, the polarization depends linearly on the material susceptibility (in SI units):

$$ \vec{P} = {\varepsilon_0}\chi \vec{E}. $$
(7.48)

For stronger electric fields, we expand Eq. (7.48),

$$ P = {\varepsilon_0}\left( {{\chi^{(1)}}E + {\chi^{(2)}}{E^2} + {\chi^{(3)}}{E^3} + ...} \right), $$
(7.49)

to produce a series of polarizations:

$$ P = {P^{(1)}} + {P^{(2)}} + {P^{(3)}} + ..., $$
(7.50)

where \( {P^{(1)}} \equiv {\varepsilon_0}{\chi^{(1)}}E \) is the linear polarization, \( {P^{(2)}} = {\varepsilon_0}{\chi^{(2)}}{E^2} \) is the second-order nonlinear polarization, and so on. Although we have written this polarization in terms of scalars, the electric field is a vector quantity and therefore \( {\chi^{(1)}} \) is a second rank tensor (with 9 elements), \( {\chi^{(2)}} \) is a third rank tensor (27 elements), and \( {\chi^{(3)}} \) is a fourth rank tensor (81 elements), to which we often apply symmetry arguments to isolate unique, non-zero terms [5].

These nonlinear polarization terms act as a driving field in the wave equation, forming the nonlinear wave equation:

$$ {\nabla^2}\vec{E} - \frac{{{n^2}}}{{{c^2}}}\frac{{{\partial^2}\vec{E}}}{{\partial {t^2}}} = \frac{1}{{{\varepsilon_0}{c^2}}}\frac{{{\partial^2}{{\vec{P}}^{NL}}}}{{\partial {t^2}}}. $$
(7.51)

Here, \( n \) is the linear index of refraction and \( {P^{NL}} \) is the nonlinear polarization (excluding the linear term from Eq. (7.50) contained within \( n \)). This driving term acts as a source of new propagating waves.

2.1.1 Second-Order Nonlinear Polarization

The second order nonlinear polarization is responsible for several effects involving three photons. For example, two photons can combine to make a third photon. If the two initial photons are of the same (different) frequency, this effect is second harmonic generation (sum-frequency generation). Alternatively, one photon can split to make two photons through difference frequency generation.

Not all bulk crystals exhibit second-order nonlinearities. Let us consider the centrosymmetric case where a crystal has inversion symmetry. For the second order polarization, we find that

$$ {\vec{P}^{(2)}} = {\varepsilon_0}{\chi^{(2)}}\vec{E}\vec{E}. $$
(7.52)

If we invert the sign of the fields, we expect the same polarization, only with an opposite sign, therefore

$$ - {\vec{P}^{(2)}} = {\varepsilon_0}{\chi^{(2)}}\left( { - \vec{E}} \right)\left( { - \vec{E}} \right) = {\varepsilon_0}{\chi^{(2)}}\vec{E}\vec{E}. $$
(7.53)

Both equations cannot be simultaneously true in a centrosymmetric crystal, unless \( {\chi^{(2)}} \) vanishes. Therefore, materials possessing inversion symmetry do not exhibit second-order nonlinearities within the bulk material. Examples include many types of crystals, amorphous materials such as glasses, as well as liquids and gases. However, inversion symmetry is broken at an interface, leading to second-order nonlinearities even for materials with bulk inversion symmetry [6, 7].

Now, let us consider second-harmonic generation in a non-centrosymmetric crystal. Here, we have an electric field given by

$$ \vec{E}\left( {\vec{r},t} \right) = \frac{1}{2}\left( {E\left( {\vec{r},t} \right){e^{ - i{\omega_0}t}} + {\text{c}}{\text{.c}}{.}} \right)\hat{x}, $$
(7.54)

where we consider a single polarization in the x-direction. We use the slowly varying envelope approximation through the use of the scalar field \( E\left( {\vec{r},t} \right) \), which we will refer to simply as E. The nonlinear polarization is given by

$$ {\vec{P}^{NL}}\left( {\vec{r},t} \right) = \frac{1}{2}\left( {{P^{NL}}\left( {\vec{r},t} \right){e^{ - i{\omega_{NL}}t}} + {\text{c}}{\text{.c}}{.}} \right)\hat{x}, $$
(7.55)

which produces a second-order polarization:

$$ {\vec{P}^{(2)}} = {\varepsilon_0}{\chi^{(2)}}\vec{E}\vec{E} = {\varepsilon_0}{\chi^{(2)}}\left[ {\frac{{E{E^*}}}{2} + \frac{1}{2}\left( {\frac{{{E^2}}}{2}{e^{ - i2{\omega_0}t}} + c.c.} \right)} \right]\hat{x}\hat{x}. $$
(7.56)

From this expression, we see two terms. The first is at zero frequency, and is not responsible for generating a propagating wave as it vanishes when operated on by the time-derivative in the wave equation. The next term is at twice the initial frequency, known as the second harmonic. We can solve this expression for the slowly varying amplitude of the nonlinear polarization and the frequency to show:

$$ P_{2{\omega_0}}^{(2)} = \frac{{{\varepsilon_0}{\chi^{(2)}}{E^2}}}{2}\,\,{\text{and}}\,\,{\omega_{NL}} = 2{\omega_0}. $$
(7.57)

The process of generating second-harmonic signal efficiently requires additional considerations, known as phase matching, which is often achieved using birefringent crystals [8]. In a similar way, a field consisting of two different frequencies can generate terms at the second harmonic of each frequency, as well as for their summation and difference.
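
The frequency content of Eq. (7.56) can be checked numerically by squaring a sampled cosine field and projecting onto each harmonic. This is a minimal sketch with an assumed real envelope \( E = 2 \) (arbitrary units); the helper `cosine_amplitude` is an illustrative name.

```python
import math


def cosine_amplitude(samples, harmonic):
    """Amplitude of cos(harmonic * w0 * t) in samples spanning one optical period."""
    n = len(samples)
    total = sum(s * math.cos(2.0 * math.pi * harmonic * k / n)
                for k, s in enumerate(samples))
    return total / n if harmonic == 0 else 2.0 * total / n


E0 = 2.0  # real field envelope, arbitrary units (assumed)
N = 64    # samples per optical period
field = [E0 * math.cos(2.0 * math.pi * k / N) for k in range(N)]
squared = [e * e for e in field]  # proportional to P^(2) / (eps0 * chi2)

dc_term = cosine_amplitude(squared, 0)          # zero-frequency term of Eq. (7.56)
second_harmonic = cosine_amplitude(squared, 2)  # amplitude at 2 * w0
fundamental = cosine_amplitude(squared, 1)      # nothing is generated at w0
print(dc_term, second_harmonic, fundamental)
```

Both the zero-frequency term and the second-harmonic amplitude come out as \( {E^2}/2 \), matching Eq. (7.57), and there is no component at the original frequency.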

2.1.2 Third-Order Nonlinear Polarization

The majority of materials are centrosymmetric and therefore, the first non-zero nonlinear polarization is usually the third-order polarization. Writing this polarization out, we find:

$$ \begin{array}{lllll}{\vec{P}^{(3)}}(t) = {\varepsilon_0}{\chi^{(3)}}\vec{E}\vec{E}\vec{E} = {\varepsilon_0}{\chi^{(3)}}&\left[ {\frac{1}{2}\left( {\frac{{{E^3}}}{4}{e^{ - i3{\omega_0}t}} + c.c.} \right)}\right.\\ & + \left.{\frac{1}{2}\left( {\frac{3}{4}{{\left| E \right|}^2}E{e^{ - i{\omega_0}t}} + c.c.} \right)} \right]\hat{x}\hat{x}\hat{x}. \end{array} $$
(7.58)

Two terms emerge, leading to distinct physical processes. As before, the slowly varying envelope amplitude and nonlinear frequency for the first term are:

$$ P_{3{\omega_0}}^{(3)} = \frac{{{\varepsilon_0}{\chi^{(3)}}{E^3}}}{4}\,\,{\text{and}}\,\,{\omega_{NL}} = 3{\omega_0}, $$
(7.59)

respectively. This term is responsible for third harmonic generation. This process is often not efficient unless we use phase-matching techniques.

Meanwhile, the second term is extremely relevant, as the polarization occurs at the original frequency. The slowly varying amplitude and nonlinear frequency are:

$$ P_{{\omega_0}}^{(3)} = \frac{{3{\varepsilon_0}{\chi^{(3)}}{{\left| E \right|}^2}E}}{4}\,\,{\text{and}}\,\,{\omega_{NL}} = {\omega_0}. $$
(7.60)

This effect leads to a nonlinear index of refraction and is extremely important for devices. The nonlinear index is perhaps the most frequently observed nonlinearity. Not only do all materials exhibit third-order nonlinearities, but also, phase-matching is automatically satisfied as the polarization is at the original frequency. Therefore, we can observe this nonlinear effect without the specialized configurations required of other nonlinearities, such as second- and third-harmonic generation.
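
The factors of 1/4 and 3/4 in Eqs. (7.59) and (7.60) can be confirmed numerically by cubing a sampled cosine field. As a sketch, the envelope \( E = 2 \) (arbitrary units) and the helper name are assumptions for illustration.

```python
import math


def cosine_amplitude(samples, harmonic):
    """Amplitude of cos(harmonic * w0 * t) in samples spanning one optical period."""
    n = len(samples)
    total = sum(s * math.cos(2.0 * math.pi * harmonic * k / n)
                for k, s in enumerate(samples))
    return total / n if harmonic == 0 else 2.0 * total / n


E0 = 2.0  # real field envelope, arbitrary units (assumed)
N = 64    # samples per optical period
field = [E0 * math.cos(2.0 * math.pi * k / N) for k in range(N)]
cubed = [e**3 for e in field]  # proportional to P^(3) / (eps0 * chi3)

at_fundamental = cosine_amplitude(cubed, 1)      # expected 3/4 * E0^3
at_third_harmonic = cosine_amplitude(cubed, 3)   # expected 1/4 * E0^3
print(at_fundamental, at_third_harmonic)
```

The component at the original frequency is three times larger than the third-harmonic component, before any phase-matching considerations.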

2.1.3 Anharmonic Oscillator Model

In the same way that we used the Drude-Lorentz model to gain physical insight into the linear dielectric function, we can use the classical anharmonic oscillator model to explore electronic nonlinearities. We base this nonlinear model on the same principles as before, but replace the binding force with an altered, nonlinear force. Just as the binding force for the Drude model is proportional to an electron’s displacement from its equilibrium position (x), the second-order polarization requires a term proportional to \( {x^2} \). For the (more applicable) third-order case, the binding force is:

$$ {F_{\it binding}} = - m{\omega_0}^2x + mb{x^3}, $$
(7.61)

where we have introduced a force term proportional to \( {x^3} \) with a phenomenological proportionality constant of b. Typically, b is on the order of \( \omega_0^2/{d^2} \), where d is on the order of the Bohr radius [5]. The equation of motion becomes

$$ m\frac{{{d^2}x}}{{d{t^2}}} + m\gamma \frac{{dx}}{{dt}} + m{\omega_0}^2x - mb{x^3} = - e{E_0}{e^{ - i\omega t}}. $$
(7.62)

We proceed in much the same way as before, which we will not show here. For the case of self-phase modulation, we arrive at [5]:

$$ {\chi^{(3)}} = \frac{{Nb{e^4}}}{{{\varepsilon_0}{m^3}D{{\left( \omega \right)}^3}D\left( { - \omega } \right)}}, $$
(7.63)

where \( D\left( \omega \right) = \omega_0^2 - {\omega^2} - i\gamma \omega \). This expression takes a very similar form as \( {\chi_e} \) from Eq. (7.18), with additional factors of \( {e^2} \) and \( {m^{ - 2}} \), along with multiple degenerate resonances given by \( D\left( \omega \right) \). We can approximate this expression if we are far from resonance using:

$$ {\chi^{(3)}} \approx \frac{{N{e^4}}}{{{\varepsilon_0}{m^3}\omega_0^6{d^2}}}. $$
(7.64)
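
To see how Eq. (7.64) follows from Eq. (7.63), the sketch below evaluates both with \( b = \omega_0^2/{d^2} \). The number density, resonance frequency, damping rate, and atomic length are assumed order-of-magnitude inputs, not values from the text.

```python
E_CHARGE = 1.602176634e-19  # C
E_MASS = 9.1093837015e-31   # kg
EPS0 = 8.8541878128e-12     # F/m

# Assumed, illustrative material parameters.
DENSITY = 4e28    # oscillators per m^3
OMEGA0 = 7e15     # resonance frequency (rad/s)
GAMMA = 1e13      # damping rate (rad/s)
D_ATOMIC = 1e-10  # atomic-scale length d (m)
B = OMEGA0**2 / D_ATOMIC**2


def resonance_denominator(omega):
    """D(omega) as defined below Eq. (7.63)."""
    return OMEGA0**2 - omega**2 - 1j * GAMMA * omega


def chi3_anharmonic(omega):
    """Eq. (7.63) for self-phase modulation."""
    d_plus = resonance_denominator(omega)
    d_minus = resonance_denominator(-omega)
    return DENSITY * B * E_CHARGE**4 / (EPS0 * E_MASS**3 * d_plus**3 * d_minus)


def chi3_off_resonance():
    """Eq. (7.64), the far-from-resonance estimate."""
    return DENSITY * E_CHARGE**4 / (EPS0 * E_MASS**3 * OMEGA0**6 * D_ATOMIC**2)


print(f"Eq. (7.64):              {chi3_off_resonance():.3e} m^2/V^2")
print(f"Eq. (7.63) at omega = 0: {abs(chi3_anharmonic(0.0)):.3e} m^2/V^2")
print(f"Eq. (7.63) at 800 nm:    {abs(chi3_anharmonic(2.35e15)):.3e} m^2/V^2")
```

In the low-frequency limit the two expressions coincide exactly, while closer to the resonance the full expression grows because \( D\left( \omega \right) \) shrinks.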

2.2 Nonlinear Index of Refraction

The third-order nonlinear polarization gives rise to a nonlinear index of refraction that we use to make all-optical devices. An intensity-dependent index leads to self-phase modulation (SPM) for coherent waves and cross-phase modulation for non-coherent waves. In addition, by allowing \( {\chi^{(3)}} \) to take on complex values, we discover a new source of nonlinear absorption, analogous to the linear absorption caused by a complex \( {\chi^{(1)}} \).

2.2.1 Intensity-Dependent Refractive Index

If we take the slowly varying envelope amplitudes for the linear polarization \( {P^{(1)}} = {\varepsilon_0}{\chi^{(1)}}E \) and add this new term, we can write an effective linear susceptibility:

$$ {P^{(1)}} + P_{{\omega_0}}^{(3)} = {\varepsilon_0}{\chi^{(1)}}E + \frac{{3{\varepsilon_0}{\chi^{(3)}}{{\left| E \right|}^2}}}{4}E = {\varepsilon_0}\left( {{\chi^{(1)}} + \frac{{3{\chi^{(3)}}}}{4}{{\left| E \right|}^2}} \right)E = {\varepsilon_0}\chi_{eff}^{(1)}E. $$
(7.65)

This effective susceptibility creates an effective index of refraction:

$$ n = \sqrt {{{\rm Re} \left( \varepsilon \right)}} = \sqrt {{{\rm Re} \left( {1 + \chi_{eff}^{(1)}} \right)}} = \sqrt {{1 + {\rm Re} \left( {{\chi^{(1)}}} \right) + \frac{{3{\rm Re} \left( {{\chi^{(3)}}} \right)}}{4}{{\left| E \right|}^2}}}, $$
(7.66)

which we can simplify by expanding around small \( {\left| E \right|^2} \):

$$ n \approx \sqrt {{1 + {\rm Re} \left( {{\chi^{(1)}}} \right)}} + \frac{{3{\rm Re} \left( {{\chi^{(3)}}} \right)}}{{8\sqrt {{1 + {\rm Re} \left( {{\chi^{(1)}}} \right)}} }}{\left| E \right|^2} = {n_0} + \frac{3}{{8{n_0}}}{\rm Re} \left( {{\chi^{(3)}}} \right){\left| E \right|^2} = {n_0} + {\bar{n}_2}{\left| E \right|^2}. $$
(7.67)

Here, \( {n_0} = \sqrt {{1 + {\rm Re} \left( {{\chi^{(1)}}} \right)}} \) is the linear index of refraction. This derivation produces our first definition of the nonlinear refractive index (in terms of \( {\left| E \right|^2} \)):

$$ {\bar{n}_2} = \frac{3}{{8{n_0}}}{\rm Re} \left( {{\chi^{(3)}}} \right). $$
(7.68)

In terms of the intensity, Eq. (7.67) becomes:

$$ n = {n_0} + {n_2}I = {n_0} + \frac{3}{{4n_0^2{\varepsilon_0}c}}{\rm Re} \left[ {{\chi^{(3)}}} \right]I. $$
(7.69)

The nonlinear index is often referred to as the Kerr effect. For a single wave propagating in a Kerr medium, the wave’s intensity modulates the index of refraction, which modulates the phase of the wave as it propagates. We refer to this process as self-phase modulation. A similar effect occurs across different, non-coherent waves, known as cross-phase modulation [5]. We note that \( {n_2} \) can be either positive or negative, depending on the origin of the nonlinearity (electronic polarization, molecular orientation, thermal, etc.). For the electronic, non-resonant nonlinear polarization in silica, \( {n_2} \) is positive and therefore, we will assume a positive value of \( {n_2} \). For silica, the nonlinear index of refraction is 2.2–3.4 × 10⁻²⁰ m²/W [9].
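
Eq. (7.69) links \( {n_2} \) and \( {\rm Re} \left[ {{\chi^{(3)}}} \right] \) in SI units. As a sketch, the functions below convert in both directions; the linear index \( {n_0} = 1.45 \) for silica is an assumed value, and the function names are illustrative.

```python
EPS0 = 8.8541878128e-12    # F/m
C_M_PER_S = 299_792_458.0  # m/s


def n2_from_chi3(chi3_real, n0):
    """Eq. (7.69): n2 in m^2/W from Re(chi3) in m^2/V^2."""
    return 3.0 * chi3_real / (4.0 * n0**2 * EPS0 * C_M_PER_S)


def chi3_from_n2(n2, n0):
    """Inverse of Eq. (7.69): Re(chi3) in m^2/V^2 from n2 in m^2/W."""
    return 4.0 * n0**2 * EPS0 * C_M_PER_S * n2 / 3.0


N0_SILICA = 1.45  # assumed linear index of silica
for n2 in (2.2e-20, 3.4e-20):  # range quoted above for silica
    print(f"n2 = {n2:.1e} m^2/W  ->  Re(chi3) = {chi3_from_n2(n2, N0_SILICA):.2e} m^2/V^2")
```

Under these assumptions, the quoted range of \( {n_2} \) corresponds to a \( {\chi^{(3)}} \) of a few times 10⁻²² m²/V².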

We have used SI units during this derivation, thus \( {\chi^{(3)}} \) is in units of m²/V². However, \( {\chi^{(3)}} \) is often given in electrostatic units (cm³/erg, or simply esu). Meanwhile, the nonlinear index is typically quoted in units of cm²/W. The conversion is [5]:

$$ {n_2}\left( {\frac{{{{\text{cm}}^2}}}{{\text{W}}}} \right) = \frac{{12{\pi^2}}}{{n_0^2c}}{10^7}\chi_{esu}^{(3)}. $$
(7.70)

2.2.2 Two-Photon Absorption

Using a complex valued \( {\chi^{(3)}} \), we see that the third-order nonlinear polarization implies a nonlinear extinction coefficient, given by:

$$ \kappa = {\kappa_0} + {\kappa_2}I = {\kappa_0} + \frac{3}{{4n_0^2{\varepsilon_0}c}}{\rm Im} \left[ {{\chi^{(3)}}} \right]I. $$
(7.71)

Using \( \alpha = 2{\omega_0}\kappa /c \), we can write this expression as the nonlinear absorption:

$$ \alpha (I) = {\alpha_0} + {\alpha_2}I = {\alpha_0} + \frac{{3{\omega_0}}}{{2n_0^2{\varepsilon_0}{c^2}}}{\rm Im} \left[ {{\chi^{(3)}}} \right]I. $$
(7.72)

The new term is responsible for two-photon absorption and has units of length per power. For two-photon absorption to occur, the total energy of both photons must be large enough to promote an electron from the valence to the conduction band, and therefore \( {\chi^{(3)}} \) must necessarily be frequency dependent. For degenerate (same frequency) two-photon absorption, the single-photon energy must be at least half of the band-gap energy.

We note that two-photon absorption is a third-order process and not a second-order process. We can see this distinction by observing Eq. (7.58), whereby the imaginary third-order nonlinear polarization produces a wave at the original frequency. This out-of-phase wave adds destructively with the original wave, resulting in attenuation.

2.3 Self-Phase Modulation Effects

Self-phase modulation gives rise to several effects that we can use in devices. Applying SPM in the spatial domain leads to self-focusing; and in the time domain, this effect is responsible for spectral broadening, supercontinuum generation, and four-wave mixing. These processes require high intensities to be efficient; therefore, all sources of loss are important for materials and devices. With losses in mind, we explore the nonlinear figures of merit to assess current and future materials for third-order nonlinear optical applications.

2.3.1 Nonlinear Phase

Understanding the accumulation of nonlinear phase is essential, as it forms the basis for many all-optical devices. Starting with Eq. (7.3), we realize that as the wave propagates, it accumulates a phase given by \( \varphi = kL \), where \( L \) is the distance traveled. Adding the nonlinear index we see:

$$ \varphi = k(I)L = \frac{{2\pi }}{\lambda }\left( {n + {n_2}I} \right)L = \frac{{2\pi nL}}{\lambda } + \frac{{2\pi {n_2}L}}{\lambda }I = {\varphi_L} + {\varphi_{NL}}, $$
(7.73)

so the intensity changes the accumulated phase by an amount:

$$ {\varphi_{NL}} = \frac{{2\pi {n_2}L}}{\lambda }I, $$
(7.74)

through a process known as self-phase modulation. For a plane wave at a single frequency, the wave gains a fixed phase change per unit distance. As we will show later, we can use this nonlinear phase to modulate nonlinear interferometers.
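
To get a feel for the magnitudes in Eq. (7.74), the sketch below computes the nonlinear phase and the intensity needed to reach a target phase. The wavelength (1550 nm), length (1 m), and \( {n_2} = 2.6 \times 10^{-20} \) m²/W (within the silica range quoted earlier) are assumed example values.

```python
import math


def nonlinear_phase(n2, intensity, length, wavelength):
    """Eq. (7.74): nonlinear phase in radians (SI units throughout)."""
    return 2.0 * math.pi * n2 * length * intensity / wavelength


def intensity_for_phase(phase, n2, length, wavelength):
    """Intensity (W/m^2) needed to accumulate a given nonlinear phase."""
    return phase * wavelength / (2.0 * math.pi * n2 * length)


N2 = 2.6e-20          # m^2/W, assumed silica-like value
WAVELENGTH = 1.55e-6  # m, assumed
LENGTH = 1.0          # m, assumed

needed = intensity_for_phase(math.pi, N2, LENGTH, WAVELENGTH)
print(f"Intensity for a pi phase shift over 1 m: {needed:.2e} W/m^2")
```

Because the phase is proportional to the product of intensity and length, halving the length doubles the required intensity.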

2.3.2 Self-Focusing

If we make the intensity non-uniform in space or in time, these gradients result in self-focusing or spectral broadening, respectively. To demonstrate the effects of a spatial non-uniformity, let us consider a Gaussian beam profile passing through a thin, flat, nonlinear material. At the center of the beam, the intensity is higher than at the edges, therefore the center accumulates a larger amount of nonlinear phase. For a material with a positive nonlinear index, the distribution of nonlinear phase shifts is analogous to a positive lens and causes the wave to focus. The Z-scan technique uses this intensity-dependent lens to measure the nonlinearity of bulk samples [10, 11].

2.3.3 Spectral Broadening

If we make the intensity non-uniform in time using a pulse, self-phase modulation broadens the pulse’s spectrum. Consider a pulse at a fixed location in space that has an intensity that changes in time, \( I(t) \). Using Eq. (7.73), we can take a derivative with respect to time, to find a change in frequency given by:

$$ \Delta \omega = - \frac{{d\varphi }}{{dt}} = - \frac{{2\pi }}{\lambda }L{n_2}\frac{{dI(t)}}{{dt}}. $$
(7.75)

For the case of a Gaussian pulse, we remember that Eq. (7.38) can be converted to intensity by dividing by an area and replacing \( {P_p} \to {I_0} \). With Eq. (7.75), we find the frequency shift for a Gaussian pulse is:

$$ \Delta \omega = \frac{{2\pi }}{\lambda }L{n_2}\frac{{8\ln (2)}}{{{\tau^2}}}tI(t). $$
(7.76)

From this equation, we realize that \( I(t) \) is always positive and \( tI(t) \) goes from negative to positive. At a fixed position, negative time is the front of the pulse and therefore, the front of the pulse experiences a negative frequency shift, creating longer, “red” wavelengths in the front. Oppositely, the back end of the pulse (positive time values) is frequency shifted positively and the wavelengths appear “blue-shifted”. These are new frequencies that are not present in the original pulse. Alternatively, one can picture this process as the peak of a pulse traveling with a slower phase velocity and therefore, the carrier wave “stretches out” in the front and “bunches up” in the back, creating red and blue wavelengths in the front and back, respectively.
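
The sign of the frequency shift across the pulse can be checked with a short sketch that compares Eq. (7.76) against a finite-difference evaluation of Eq. (7.75). It assumes that \( \tau \) in Eq. (7.76) is the intensity full-width at half-maximum, so that \( I(t) = {I_0}\exp \left( { - 4\ln 2\,{t^2}/{\tau^2}} \right) \); all numerical parameters are illustrative assumptions.

```python
import math


def gaussian_intensity(t, peak_intensity, tau_fwhm):
    """Gaussian intensity profile with intensity FWHM tau_fwhm (assumed form)."""
    return peak_intensity * math.exp(-4.0 * math.log(2.0) * t**2 / tau_fwhm**2)


def spm_shift_analytic(t, peak_intensity, tau_fwhm, n2, length, wavelength):
    """Eq. (7.76): frequency shift (rad/s) at time t within the pulse."""
    prefactor = 2.0 * math.pi / wavelength * length * n2
    return (prefactor * 8.0 * math.log(2.0) / tau_fwhm**2
            * t * gaussian_intensity(t, peak_intensity, tau_fwhm))


def spm_shift_numeric(t, peak_intensity, tau_fwhm, n2, length, wavelength, dt=1e-17):
    """Eq. (7.75) evaluated with a central finite difference for dI/dt."""
    slope = (gaussian_intensity(t + dt, peak_intensity, tau_fwhm)
             - gaussian_intensity(t - dt, peak_intensity, tau_fwhm)) / (2.0 * dt)
    return -2.0 * math.pi / wavelength * length * n2 * slope


# Assumed example: 100 fs pulse at 800 nm, 1 cm of material, I0 = 1e15 W/m^2.
PARAMS = dict(peak_intensity=1e15, tau_fwhm=100e-15, n2=2.6e-20,
              length=0.01, wavelength=800e-9)
front = spm_shift_analytic(-50e-15, **PARAMS)  # leading edge (negative time)
back = spm_shift_analytic(+50e-15, **PARAMS)   # trailing edge (positive time)
print(f"front: {front:.3e} rad/s, back: {back:.3e} rad/s")
```

For a positive \( {n_2} \), the shift is negative (red) at the front and positive (blue) at the back, as described above.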

Group velocity dispersive effects are extremely important in a real system. For a positive nonlinearity and normal GVD, both effects cause the red to move toward the front of the pulse and the blue to the back, reducing the intensity, leading to a limited amount of spectral broadening. If GVD is low, the broadening can be very strong, forming a continuum of wavelengths, known as supercontinuum or “white light” generation. In the case of anomalous GVD, the effects of self-phase modulation and GVD are opposite and can lead to a situation where the two effects balance one another, creating stable packets of light, known as solitons.

2.3.4 Nonlinear Figures of Merit

Material losses limit the performance of SPM-based all-optical switches. Switches made using nonlinear interferometers, for example, require a specified total nonlinear phase for full modulation (i.e. 2π). Total losses (both intrinsic and from fabrication) limit the maximum amount of nonlinear phase by attenuating the wave as it propagates, making full modulation impossible for certain materials [12]. We quantify these limitations using dimensionless parameters known as the nonlinear figures of merit.

To proceed, we ask: for a given material, what is the maximum amount of nonlinear phase that we can accumulate? Let us consider the simplest case of a monochromatic plane-wave experiencing both linear and nonlinear attenuation. Our first step must be to determine the intensity as a function of distance using:

$$ \frac{{dI(z)}}{{dz}} = - \left[ {{\alpha_0} + {\alpha_2}I(z)} \right]I(z). $$
(7.77)

This differential equation has a solution of the form:

$$ I(z) = {I_0}{e^{ - {\alpha_0}z}}{\left[ {1 + {I_0}{\alpha_2}\left( {\frac{{1 - {e^{ - {\alpha_0}z}}}}{{{\alpha_0}}}} \right)} \right]^{ - 1}}. $$
(7.78)

The total nonlinear phase accumulated over a distance \( L \) is therefore:

$$ {\text{total }}{\varphi_{NL}} = \int_0^L {\frac{{2\pi {n_2}}}{\lambda }I(z)dz} = 2\pi \left( {\frac{{{n_2}}}{{{\alpha_2}\lambda }}} \right)\ln \left[ {1 + {I_0}{\alpha_2}\left( {\frac{{1 - {e^{ - {\alpha_0}L}}}}{{{\alpha_0}}}} \right)} \right]. $$
(7.79)

From this expression, we define the effective length for plane waves and in waveguides as [11]:

$$ {L_{eff}} = \frac{{1 - {e^{ - {\alpha_0}L}}}}{{{\alpha_0}}}. $$
(7.80)
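
The closed-form results in Eqs. (7.78) to (7.80) can be cross-checked by integrating the intensity numerically. This is a sketch under assumed loss and intensity values (chosen only so that both loss terms matter); the function names are illustrative.

```python
import math


def effective_length(alpha0, length):
    """Eq. (7.80)."""
    return (1.0 - math.exp(-alpha0 * length)) / alpha0


def intensity(z, i0, alpha0, alpha2):
    """Eq. (7.78): intensity after a distance z with linear and two-photon loss."""
    return i0 * math.exp(-alpha0 * z) / (1.0 + i0 * alpha2 * effective_length(alpha0, z))


def total_nonlinear_phase(length, i0, alpha0, alpha2, n2, wavelength):
    """Closed form of Eq. (7.79)."""
    return (2.0 * math.pi * n2 / (alpha2 * wavelength)
            * math.log(1.0 + i0 * alpha2 * effective_length(alpha0, length)))


def total_nonlinear_phase_numeric(length, i0, alpha0, alpha2, n2, wavelength, steps=2000):
    """Midpoint-rule integration of the integrand in Eq. (7.79)."""
    dz = length / steps
    integral = sum(intensity((k + 0.5) * dz, i0, alpha0, alpha2) for k in range(steps)) * dz
    return 2.0 * math.pi * n2 / wavelength * integral


# Assumed example values (SI units).
ARGS = dict(length=0.1, i0=1e13, alpha0=10.0, alpha2=1e-11, n2=2.6e-20, wavelength=1.55e-6)
print(total_nonlinear_phase(**ARGS), total_nonlinear_phase_numeric(**ARGS))
```

The two values agree closely, which confirms that the logarithm in Eq. (7.79) is the integral of Eq. (7.78).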

Taking \( L \to \infty \), the effective length becomes \( 1/{\alpha_0} \) and the maximum nonlinear phase is given by:

$$ { \max }{\varphi_{NL}} = 2\pi \left( {\frac{{{n_2}}}{{{\alpha_2}\lambda }}} \right)\ln \left[ {1 + {{\left( {\frac{{{n_2}}}{{{\alpha_2}\lambda }}} \right)}^{ - 1}}\left( {\frac{{{n_2}{I_0}}}{{{\alpha_0}\lambda }}} \right)} \right]. $$
(7.81)

From this result, we can define two figures [13, 14]:

$$ W \equiv \left( {\frac{{{n_2}{I_0}}}{{{\alpha_0}\lambda }}} \right)\,\,{\text{and}}\,\,{T^{ - 1}} \equiv \left( {\frac{{{n_2}}}{{{\alpha_2}\lambda }}} \right), $$
(7.82)

and we can rewrite Eq. (7.81) as:

$$ { \max }{\varphi_{NL}} = 2\pi {T^{ - 1}}\ln \left[ {1 + TW} \right]. $$
(7.83)

The \( W \) and \( {T^{ - 1}} \) figures of merit are associated with linear losses and two-photon absorption, respectively.

Now, let us look at the simplest case, where we have only linear losses, by taking the limit of Eq. (7.83) as \( {T^{ - 1}} \to \infty \). The maximum nonlinear phase amounts to \( 2\pi W \), so requiring \( { \max }{\varphi_{NL}} > 2\pi \) means that \( W > 1 \). This result suggests the need for a minimum operating intensity: for a given attenuation coefficient \( {\alpha_0} \) we require an intensity of \( {I_0} > {\alpha_0}\lambda /{n_2} \). The damage threshold of the material or additional nonlinear effects (particularly two- and three-photon absorption) pose limits for all-optical switching. Alternatively, for a given operating intensity, we must keep the total losses (absorption and scattering) low enough that \( {\alpha_0} < {n_2}{I_0}/\lambda \).

Usually, we are only interested in materials with no intrinsic linear absorption and therefore, two-photon absorption and imperfections from the fabrication are limiting factors. For strong two-photon absorption, \( {T^{ - 1}} \) is small, and the wave is strongly absorbed at the start of propagation. After the intensity quickly drops, we continue to accumulate nonlinear phase at a reduced rate, requiring very large devices for full modulation. For a fixed value of \( {T^{ - 1}} \), we can determine the value of \( W \) required for \( { \max }{\varphi_{NL}} > 2\pi \):

$$ {W_{req}} > \frac{{\left( {{e^T} - 1} \right)}}{T} \approx 1 + \frac{1}{2}T + \frac{1}{6}{T^2} + ... $$
(7.84)
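
The relationship between Eqs. (7.83) and (7.84) can be sketched in a few lines. The code uses the definitions of \( W \) and \( {T^{ - 1}} \) from Eq. (7.82); the material values in the example are assumptions chosen for illustration.

```python
import math


def figure_w(n2, i0, alpha0, wavelength):
    """W from Eq. (7.82)."""
    return n2 * i0 / (alpha0 * wavelength)


def figure_t_inverse(n2, alpha2, wavelength):
    """T^-1 from Eq. (7.82)."""
    return n2 / (alpha2 * wavelength)


def max_nonlinear_phase(w, t):
    """Eq. (7.83): maximum nonlinear phase for an infinitely long medium."""
    return 2.0 * math.pi * math.log1p(t * w) / t


def required_w(t):
    """Eq. (7.84): W needed to reach a maximum nonlinear phase of 2*pi."""
    return math.expm1(t) / t


# Assumed example values (SI units).
w = figure_w(n2=2.6e-20, i0=1e13, alpha0=1.0, wavelength=1.55e-6)
t = 1.0 / figure_t_inverse(n2=2.6e-20, alpha2=1e-14, wavelength=1.55e-6)
print(f"W = {w:.3f}, T = {t:.3f}, required W = {required_w(t):.3f}")
print(f"max phase = {max_nonlinear_phase(w, t) / (2.0 * math.pi):.3f} x 2*pi")
```

Substituting the required \( W \) back into Eq. (7.83) returns exactly \( 2\pi \), and in the limit \( T \to 0 \) the maximum phase reduces to \( 2\pi W \).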

For reasonable values of \( {W_{req}} \) (near unity), we use the following guidelines [13, 14]:

$$ W = \left( {\frac{{{n_2}{I_0}}}{{{\alpha_0}\lambda }}} \right) > 1\,\,{\text{and}}\,\,{T^{ - 1}} = \left( {\frac{{2{n_2}}}{{{\alpha_2}\lambda }}} \right) > 1. $$
(7.85)

These two figures of merit are a simple and effective way to evaluate a nonlinear material. Although the intrinsic nonlinearity of silica fiber is relatively low, it is an excellent nonlinear material, having exceptionally low loss and low two-photon absorption. These factors become much more critical in new material systems with high nonlinearity and serve as a check for applications. For example, silicon has a very strong intrinsic nonlinearity, 200–300 times that of silica glass. However, this high nonlinearity is accompanied by strong two-photon absorption, and thus the \( {T^{ - 1}} \) figure of merit is below unity for all telecommunications wavelengths and is only above unity for wavelengths longer than 1,800 nm [15]. Although low figures of merit make ultrafast interferometric switches impractical in silicon, other effects are used for switching [16].

3 Nanoscale Optics

In this section, we will explore sub-wavelength waveguides both theoretically and experimentally and use these in the next section to enhance nonlinear interactions. To motivate the use of waveguides for nonlinear optics, we observe that the total accumulated nonlinear phase depends on the intensity and the effective length in Eq. (7.79). We can reach high intensities by focusing a laser using a lens; however, the beam will quickly diverge, leading to little accumulated nonlinear phase. Considering a Gaussian beam, the peak intensity is \( {I_0} = {P_0}/\left( {\pi w_0^2} \right) \), where \( {P_0} \) is the power, and \( {w_0} \) is the beam waist. For a focused beam, the divergence limits the length of nonlinear phase accumulation. If the effective length for a Gaussian beam is given by the Rayleigh distance \( {z_R} = \pi w_0^2/\lambda \), the intensity-length product for a focused beam is [9]:

$$ {I_0}{z_R} = \left( {\frac{{{P_0}}}{{\pi w_0^2}}} \right)\left( {\frac{{\pi w_0^2}}{\lambda }} \right) = \frac{{{P_0}}}{\lambda }. $$
(7.86)

The intensity is inversely proportional to the square of the spot size, \( {w_0} \); meanwhile, the focused spot will diverge based on the Rayleigh distance, which is directly proportional to the square of the spot size. These combined effects lead to a situation whereby tight focusing, alone, does not change the accumulated nonlinearity. We can achieve much stronger nonlinear interactions if we can counteract the divergence by keeping the wave confined as it propagates. Such confinement is precisely what we achieve using a waveguide. If the waveguide has a loss of \( {\alpha_0} \), using Eq. (7.80), the intensity-length product is [9]:

$$ {I_0}{L_{eff}} = \left( {\frac{{{P_0}}}{{\pi w_0^2}}} \right)\left( {\frac{{1 - {e^{ - {\alpha_0}L}}}}{{{\alpha_0}}}} \right) \approx \frac{{{P_0}}}{{\pi w_0^2{\alpha_0}}}. $$
(7.87)

The maximum intensity-length product (for a waveguide of infinite length) depends completely on the loss of the waveguide and is far greater than for a focused beam. This ability to accumulate a large amount of nonlinear phase in a waveguide is an essential tool for nonlinear optics. In addition, we see that the nonlinear interaction scales inversely with the area. Therefore, the highest nonlinear interactions will occur for waveguides with the smallest effective area and thus, by employing nano-scale waveguides, we can achieve very large nonlinear interactions efficiently.
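
To make the comparison between Eqs. (7.86) and (7.87) concrete, the following is a minimal Python sketch. The power, wavelength, mode radius, and loss used here are illustrative assumptions rather than values from the text, and the function names are hypothetical.

```python
import math


def focused_beam_product(power_w, wavelength_m):
    """Intensity-length product I0 * zR = P0 / lambda of Eq. (7.86).

    The beam waist cancels, so it does not appear as an argument.
    """
    return power_w / wavelength_m


def waveguide_product(power_w, mode_radius_m, loss_per_m, length_m=math.inf):
    """Intensity-length product I0 * L_eff of Eq. (7.87) for a lossy waveguide."""
    intensity = power_w / (math.pi * mode_radius_m ** 2)
    if math.isinf(length_m):
        effective_length = 1.0 / loss_per_m  # limit of a very long waveguide
    else:
        effective_length = (1.0 - math.exp(-loss_per_m * length_m)) / loss_per_m
    return intensity * effective_length


# Assumed, illustrative parameters (not taken from the text).
P0 = 1.0              # power in W
WAVELENGTH = 1.55e-6  # wavelength in m
MODE_RADIUS = 0.5e-6  # waveguide mode radius w0 in m
LOSS = 23.0           # loss alpha_0 in 1/m (roughly 1 dB/cm)

focused = focused_beam_product(P0, WAVELENGTH)
guided = waveguide_product(P0, MODE_RADIUS, LOSS)
enhancement = guided / focused
print(f"waveguide / focused-beam intensity-length product: {enhancement:.3g}")
```

Under these assumptions the ratio reduces to \( \lambda /\left( {\pi w_0^2{\alpha_0}} \right) \), so it grows as either the mode area or the loss is reduced.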

3.1 Waveguides

Waveguides use multiple reflections to confine light signals into discrete channels, known as modes. By considering geometrical arguments, we will first develop an intuitive understanding of the guiding condition using reflections from metallic mirrors. Similarly, we will use total internal reflection to form waveguides in dielectric materials. By observing the wavelength-dependence of the guiding condition, we will demonstrate how waveguides can alter the effective propagation constant and use it to engineer the effective dispersion within a waveguide. Lastly, we will observe the field distributions within waveguides, which will enable us to create enhanced nonlinearities using sub-wavelength structure.

3.1.1 Metal Mirror Waveguide

We will start by considering the simplest waveguide formed by two parallel metallic mirrors. Considering the geometry of Fig. 7.6 (left), we see that for the wave to sustain itself, the phase accumulated from multiple bounces must add constructively with the original wave. Each bounce off a metal mirror adds a phase shift of π. Therefore, the total phase from the path length and from each bounce must be an integer multiple of 2π. From the geometry in Fig. 7.6, this condition is [17]:

$$ \begin{array}{llll} \frac{{2\pi }}{{{\lambda_0}}}\left( {AC - AB} \right) - 2\pi &= \frac{{2\pi }}{{{\lambda_0}}}\left( {AC - AC\cos \left( {2\theta } \right)} \right) - 2\pi \\ &= \frac{{2\pi }}{{{\lambda_0}}}2d\sin \theta - 2\pi = 2\pi m^{\prime}, \\ \end{array} $$
(7.88)

where \( m^{\prime} \) is an integer (starting from zero). Simplifying this expression and using \( m = m^{\prime} + 1 \), we obtain:

$$ \begin{array}{*{20}{c}} {2d\sin \theta = m{\lambda_0}} & {m = 1,\,2,\,...} \end{array} $$
(7.89)
Fig. 7.6

A metallic mirror waveguide showing the periodicity condition requirement (left) and a graphical method to determine the solutions to Eq. (7.89) (right)

Therefore, we have a discrete number of propagating solutions, as shown in Fig. 7.6 (right). Each solution is known as a mode and acts like a channel for an electromagnetic wave. The number of guided modes for a single polarization is given by:

$$ M \doteq \frac{{2d}}{{{\lambda_0}}}. $$
(7.90)

We find that no guiding can occur when the distance becomes less than half of the wavelength.

Depending on the orientation of the electric field, two unique waves can propagate. If the electric field is parallel to the mirrors and perpendicular to the direction of propagation, we obtain the transverse-electric (TE) polarization, which has no electric field component in the direction of propagation. Swapping the electric and magnetic fields results in the transverse-magnetic (TM) polarization. These will become important when we consider dielectric waveguides, as the boundary conditions are slightly different for TE- versus TM-waves.

In the case of a metal mirror waveguide, this condition is identical for both TE and TM polarizations, and therefore the total number of supported modes is twice this value. Alternatively, we can write down an equation for the cutoff frequency (the lowest frequency that is still guided) by setting \( \sin \theta = 1 \) and \( m = 1 \), then solving for the frequency to show:

$$ {\omega_c} = \frac{{\pi c}}{d}. $$
(7.91)
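
As a hedged illustration of Eqs. (7.89) and (7.91), the short Python sketch below lists the bounce angles of a mirror waveguide and evaluates its cutoff. The function names and the example mirror separations are assumptions chosen for illustration, and the sketch assumes that a mode with \( \sin \theta = 1 \) exactly does not count as propagating.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def mirror_modes(d, wavelength):
    """Return (m, theta_m) for each solution of 2 d sin(theta) = m lambda, Eq. (7.89).

    Only solutions with sin(theta_m) < 1 are kept (assumption: grazing
    solutions at exactly sin(theta) = 1 are treated as cut off).
    """
    modes = []
    m = 1
    while m * wavelength / (2.0 * d) < 1.0:
        modes.append((m, math.asin(m * wavelength / (2.0 * d))))
        m += 1
    return modes


def mirror_cutoff_frequency(d):
    """Angular cutoff frequency omega_c = pi c / d of Eq. (7.91)."""
    return math.pi * C / d


# Example separations (assumed) for 600-nm light.
modes_450 = mirror_modes(450e-9, 600e-9)    # one mode
modes_1000 = mirror_modes(1000e-9, 600e-9)  # three modes
modes_200 = mirror_modes(200e-9, 600e-9)    # below half a wavelength: no modes

# The cutoff frequency corresponds to a free-space wavelength of 2 d.
cutoff_wavelength = 2.0 * math.pi * C / mirror_cutoff_frequency(450e-9)
for m, theta in modes_1000:
    print(f"m = {m}: theta = {math.degrees(theta):.1f} deg")
```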

3.1.2 Planar Dielectric Guiding

Alternatively, we can form a waveguide using dielectric materials. This case is very similar to the two-mirror case, except the wave is guided using total internal reflection. Therefore, we require a high index material surrounded on the top and bottom by lower index materials. The relative index difference is related to the index contrast and determines parameters including the number of modes, the phase and group velocities, GVD and the confinement.

For total internal reflection, the angle between the wavevector and the normal to the interface must be greater than the critical angle. The critical angle is given by \( \sin {\theta_c} = {n_2}/{n_1} \), where \( {n_1} \) is the index in the guiding layer and \( {n_2} \) is the cladding index. We should note that our current waveguide discussion uses the complementary angle. In our present notation, waves will be totally internally reflected if \( \theta < \left( {\pi /2 - {\theta_c}} \right) \). Using these conditions, we modify Eq. (7.88) by scaling the length by the index of the guiding layer, \( {n_1} \), and we replace the phase-shift upon reflection, previously \( \pi \), with a new material- and polarization-dependent phase-shift, \( \varphi \):

$$ \frac{{2\pi }}{{{\lambda_0}}}{n_1}\left( {2d\sin \theta } \right) - 2\varphi = 2\pi m^{\prime} $$
(7.92)

If we use the convention that \( \lambda = {\lambda_0}/{n_1} \), the condition becomes [17]:

$$ \frac{{2\pi }}{\lambda }2d\sin \theta - 2\varphi = 2\pi m. $$
(7.93)

The phase change depends on the materials (through \( {\theta_c} \)) and on the polarization [18]:

$$ {\varphi_{TE}} = 2{\tan^{ - 1}}\left( {\sqrt {{\frac{{{{\cos }^2}{\theta_c}}}{{{{\sin }^2}\theta }} - 1}} } \right),\,{\text{and}} $$
(7.94)
$$ {\varphi_{TM}} = 2{\tan^{ - 1}}\left( {{{\csc }^2}{\theta_c}\sqrt {{\frac{{{{\cos }^2}{\theta_c}}}{{{{\sin }^2}\theta }} - 1}} } \right), $$
(7.95)

for TE- and TM-polarizations, respectively. Using these expressions, we determine the guiding condition to be:

$$ \tan \left[ {\frac{{\pi d}}{\lambda }\sin {\theta_{TE}} - \frac{{m\pi }}{2}} \right] = \sqrt {{\frac{{{{\sin }^2}\left( {\pi /2 - {\theta_c}} \right)}}{{{{\sin }^2}{\theta_{TE}}}} - 1}}, \,\,{\text{and}} $$
(7.96)
$$ \tan \left[ {\frac{{\pi d}}{\lambda }\sin {\theta_{TM}} - \frac{{m\pi }}{2}} \right] = {\csc^2}{\theta_c}\sqrt {{\frac{{{{\sin }^2}\left( {\pi /2 - {\theta_c}} \right)}}{{{{\sin }^2}{\theta_{TM}}}} - 1}} . $$
(7.97)

To find the modes graphically, we can plot both the left- and right-hand sides in terms of \( \sin \theta \); the solutions occur wherever the two curves intersect, as shown in Fig. 7.7.

Fig. 7.7

Graphical solutions to the waveguiding condition from Eq. (7.97)
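
The graphical construction can also be carried out numerically. The following Python sketch solves the TE guiding condition by bisection, working with the phase form of Eq. (7.93) combined with Eq. (7.94), which is equivalent to Eq. (7.96) but avoids the poles of the tangent. The function name, the tolerance, and the example indices (a silica-like core with \( {n_1} = 1.458 \) in air) are assumptions made for illustration.

```python
import math


def te_mode_angles(d, wavelength0, n1, n2, tol=1e-12):
    """Bounce angles theta_m (radians) of the TE modes of a symmetric slab.

    d is the core thickness and wavelength0 the free-space wavelength
    (same length unit); n1 and n2 are the core and cladding indices.
    """
    lam = wavelength0 / n1                   # wavelength inside the core
    s_max = math.sqrt(1.0 - (n2 / n1) ** 2)  # sin(pi/2 - theta_c) = cos(theta_c)

    def phase_mismatch(s, m):
        # Round-trip phase of Eq. (7.93) minus 2 pi m, with s = sin(theta)
        # and the TE reflection phase of Eq. (7.94).
        phi = 2.0 * math.atan(math.sqrt(max(s_max ** 2 / s ** 2 - 1.0, 0.0)))
        return (2.0 * math.pi / lam) * 2.0 * d * s - 2.0 * phi - 2.0 * math.pi * m

    angles = []
    m = 0
    # The mismatch increases monotonically with s, so mode m exists only if
    # it is positive at the largest guided value s_max.
    while phase_mismatch(s_max, m) > 0.0:
        lo, hi = 1e-12 * s_max, s_max
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if phase_mismatch(mid, m) > 0.0:
                hi = mid
            else:
                lo = mid
        angles.append(math.asin(0.5 * (lo + hi)))
        m += 1
    return angles


# Assumed example: silica-like core (n1 = 1.458) in air, 600-nm light.
angles_thin = te_mode_angles(250e-9, 600e-9, 1.458, 1.0)
angles_thick = te_mode_angles(1000e-9, 600e-9, 1.458, 1.0)
print(len(angles_thin), "TE mode(s) for d = 250 nm")
print(len(angles_thick), "TE mode(s) for d = 1000 nm")
```

Each root corresponds to one intersection of the two curves in the graphical method; bisection is used here only because the phase mismatch is monotonic in \( \sin \theta \), and other root finders would work equally well.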

In addition, we can calculate the number of modes by determining the number of crossings. We realize that the left-hand side is periodic in \( \sin \theta \) and we can expect a solution for each period until the right-hand side goes to zero. Considering both even and odd values of m, the function has a periodicity of \( \lambda /\left( {2d} \right) \) in terms of \( \sin \theta \). We find that the right-hand side goes to zero when \( \sin \theta = \sin \left( {\pi /2 - {\theta_c}} \right) \). Therefore, the number of modes, M, for TE-polarization is given by:

$$ M \doteq \frac{{\sin \left( {\pi /2 - {\theta_c}} \right)}}{{\lambda /\left( {2d} \right)}}. $$
(7.98)

Note that we must round up in this case. Alternatively, we can write this expression in terms of the refractive indices by using:

$$ \sin \left( {\pi /2 - {\theta_c}} \right) = \cos {\theta_c} = \sqrt {{1 - {{\sin }^2}{\theta_c}}} = \sqrt {{1 - n_2^2/n_1^2}} = \frac{{\sqrt {{n_1^2 - n_2^2}} }}{{{n_1}}}, $$
(7.99)

to show:

$$ M \doteq \frac{{2d}}{{\lambda {n_1}}}\sqrt {{n_1^2 - n_2^2}} = \frac{{2d}}{{{\lambda_0}}}\sqrt {{n_1^2 - n_2^2}}, $$
(7.100)

for TE-polarization [17]. (Note the substitution of the free-space wavelength.) This expression is not correct if we consider an asymmetric waveguide with a different upper and lower cladding. Similarly, we have a cutoff frequency of [17]:

$$ {\omega_c} = \frac{{\pi c}}{d}\frac{1}{{\sqrt {{n_1^2 - n_2^2}} }}, $$
(7.101)

above which multiple modes of one polarization will propagate.

Multiple modes in a waveguide can make designing and fabricating devices difficult. Although modes are linearly independent, small perturbations in real systems can couple multiple modes to one another. Therefore, we often try to make waveguides single-mode. We consider a waveguide single-mode if it only supports a single guided mode for a particular polarization (TE or TM). We can obtain single-mode operation by adjusting the waveguide materials and geometry, as shown in Eqs. (7.90) and (7.100).

Comparing the single-mode condition for mirror versus dielectric guiding, we find that using 600-nm light, the single-mode thickness d for mirror guiding is in the range from 300 to 600 nm, below which no modes are supported. On the other hand, for fused silica, we find single-mode guiding for thicknesses less than 283 nm, with no minimum thickness, in contrast to the mirror-guided case. If we reduce the index contrast between the core and the cladding, we can increase the size of the waveguide, as is done for standard silica fiber.
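
The numbers in this comparison follow directly from Eqs. (7.90) and (7.100). The Python sketch below reproduces them under stated assumptions: the fused-silica index is taken as \( {n_1} = 1.458 \) with an air cladding, and the low-contrast pair of indices is purely illustrative. The function names are hypothetical.

```python
import math


def mirror_single_mode_window(wavelength0):
    """Mirror separations (d_min, d_max) giving one mode per polarization, from Eq. (7.90)."""
    return wavelength0 / 2.0, wavelength0


def slab_mode_count(d, wavelength0, n1, n2):
    """Number of TE modes of a symmetric slab from Eq. (7.100), rounded up."""
    return math.ceil(2.0 * d / wavelength0 * math.sqrt(n1 ** 2 - n2 ** 2))


def slab_single_mode_limit(wavelength0, n1, n2):
    """Core thickness below which Eq. (7.100) gives a single TE mode."""
    return wavelength0 / (2.0 * math.sqrt(n1 ** 2 - n2 ** 2))


WAVELENGTH = 600e-9
N_SILICA = 1.458  # assumed index of fused silica near 600 nm
N_AIR = 1.0

d_min_mirror, d_max_mirror = mirror_single_mode_window(WAVELENGTH)
d_max_slab = slab_single_mode_limit(WAVELENGTH, N_SILICA, N_AIR)

# Assumed low-contrast example (core 1.450, cladding 1.445): a much larger
# core remains single-mode.
d_max_low_contrast = slab_single_mode_limit(WAVELENGTH, 1.450, 1.445)

print(f"mirror: {d_min_mirror * 1e9:.0f} to {d_max_mirror * 1e9:.0f} nm")
print(f"silica slab in air: below {d_max_slab * 1e9:.0f} nm")
print(f"low-contrast slab: below {d_max_low_contrast * 1e9:.0f} nm")
```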

3.1.3 Propagation Constants

A guided wave will propagate at a different speed due to the multiple bounces it must undergo. Therefore, it no longer makes sense to use the wavevector \( k \) to describe the rate of phase propagation, and we define a new constant \( \beta = 2\pi {n_{eff}}/{\lambda_0} \), known as the propagation constant. The wave travels in a manner analogous to a plane wave; however, \( k \) is replaced by \( \beta \), and the wave propagates with an effective index of \( {n_{eff}} \). We can illustrate the propagation constant by considering the mirror waveguide shown in Fig. 7.8. We see that [17]:

$$ {\beta^2} = {k^2} - k_y^2 = {k^2}\left( {1 - {{\sin }^2}{\theta_m}} \right) = {k^2}\left( {1 - \frac{{{m^2}\lambda_0^2}}{{4{d^2}}}} \right) = {k^2} - \frac{{{m^2}{\pi^2}}}{{{d^2}}}, $$
(7.102)
Fig. 7.8

Propagation constant for a metallic mirror waveguide

We write this expression in terms of the cutoff frequency to obtain:

$$ {\beta_m} = \frac{\omega }{c}\sqrt {{1 - {m^2}\frac{{\omega_c^2}}{{{\omega^2}}}}} = \frac{\omega }{c}\cos {\theta_m}. $$
(7.103)

The propagation constant for a mirror waveguide is less than the free-space propagation constant, implying that the phase velocity is greater than the speed of light. We determine the group velocity for a mirror waveguide using:

$$ {v_g} = {\left[ {\frac{{d{\beta_m}}}{{d\omega }}} \right]^{ - 1}} = c\sqrt {{1 - {m^2}\frac{{\omega_c^2}}{{{\omega^2}}}}} = c\cos {\theta_m}, $$
(7.104)

showing that the group velocity is still slower than the speed of light. Similarly, we calculate the GVD for a mirror waveguide using:

$$ {\it GVD} = \frac{{{d^2}{\beta_m}}}{{d{\omega^2}}} = - \left( {\frac{{{m^2}\omega }}{c}} \right){\left( {\frac{{{\omega_c}}}{{{\omega^2} - {m^2}\omega_c^2}}} \right)^2}\sqrt {{1 - {{\left( {\frac{{m{\omega_c}}}{\omega }} \right)}^2}}} . $$
(7.105)

This simple model provides two critical insights. First, we find that the dispersion for a metal waveguide is always negative (anomalous), contrary to many dielectric materials. Secondly, we find that the GVD depends heavily on the cutoff frequency of the waveguide \( {\omega_c} = \pi c/d \) and becomes very strong as we approach the cutoff frequency. Therefore, the GVD is heavily dependent on the waveguide geometry. Thus, by changing the mirror separation d, we have the ability to tune the GVD, which we have seen is a critical parameter for pulse propagation.
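
These statements can be checked numerically. The Python sketch below evaluates Eqs. (7.103) to (7.105) for a mirror waveguide and compares the closed-form GVD with a finite-difference second derivative of \( {\beta_m} \). The mirror separations, the wavelength, and the function names are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def beta(omega, d, m=1):
    """Propagation constant of Eq. (7.103)."""
    omega_c = math.pi * C / d
    return (omega / C) * math.sqrt(1.0 - (m * omega_c / omega) ** 2)


def group_velocity(omega, d, m=1):
    """Group velocity of Eq. (7.104)."""
    omega_c = math.pi * C / d
    return C * math.sqrt(1.0 - (m * omega_c / omega) ** 2)


def gvd(omega, d, m=1):
    """Closed-form GVD of Eq. (7.105), in s^2/m."""
    omega_c = math.pi * C / d
    return (-(m ** 2 * omega / C)
            * (omega_c / (omega ** 2 - m ** 2 * omega_c ** 2)) ** 2
            * math.sqrt(1.0 - (m * omega_c / omega) ** 2))


def gvd_numeric(omega, d, m=1, rel_step=1e-3):
    """Central finite-difference estimate of d^2(beta)/d(omega)^2."""
    h = rel_step * omega
    return (beta(omega + h, d, m) - 2.0 * beta(omega, d, m) + beta(omega - h, d, m)) / h ** 2


# Assumed example: 600-nm light, mirror separations of 500 nm and 350 nm.
OMEGA = 2.0 * math.pi * C / 600e-9
D = 500e-9
v_phase = OMEGA / beta(OMEGA, D)
v_group = group_velocity(OMEGA, D)

print(f"v_phase / c = {v_phase / C:.3f}, v_group / c = {v_group / C:.3f}")
print(f"GVD (d = 500 nm) = {gvd(OMEGA, D):.3e} s^2/m")
print(f"GVD (d = 350 nm) = {gvd(OMEGA, 350e-9):.3e} s^2/m")
```

For this model the product of the phase and group velocities equals \( {c^2} \), and reducing the separation toward cutoff makes the GVD more strongly negative, consistent with the discussion above.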

In a similar way, we determine the propagation constant for a dielectric waveguide starting with:

$$ 2d\sqrt {{\frac{{{\omega^2}}}{{c_1^2}} - {\beta^2}}} = 2{\varphi_r} + 2\pi m, $$
(7.106)

with integer values of m, where \( {c_1} = c/{n_1} \) is the speed of light in the core. Similar to the mirror waveguide, we can use this condition to derive the group velocity for a dielectric waveguide. The derivation is rather mathematical and is of limited use in the present context. Therefore, we refer the reader to the derivation presented in [17], and present the result here:

$$ {v_g} = \frac{{d\cot \theta + \Delta z}}{{{n_1}d\csc \theta /c + \Delta \tau }}. $$
(7.107)

Here, \( {n_1} \) is the refractive index of the core, and \( \theta \) is the angle associated with the planar mode. In addition, we have introduced \( \Delta z \), which corresponds to the additional distance that the wave propagates along the boundary in the form of an evanescent wave for each round trip. This additional propagation takes a time \( \Delta \tau \). We can relate this result to our previous result if we have no evanescent field, by setting \( \Delta z = \Delta \tau = 0 \). Equation (7.107) then reduces to \( {v_g} = (c/{n_1})\cos \theta \), which is identical to Eq. (7.104), with the addition of a dielectric between the two mirrors.
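
As a short worked step for this limit, setting \( \Delta z = \Delta \tau = 0 \) in Eq. (7.107) and using only \( \cot \theta = \cos \theta /\sin \theta \) and \( \csc \theta = 1/\sin \theta \) gives:

$$ {v_g} = \frac{{d\cot \theta }}{{{n_1}d\csc \theta /c}} = \frac{c}{{{n_1}}}\frac{{\cos \theta /\sin \theta }}{{1/\sin \theta }} = \frac{c}{{{n_1}}}\cos \theta . $$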

Just as for a metal mirror waveguide, the group velocity depends on the thickness of the dielectric waveguide. The dielectric’s intrinsic GVD alters the total GVD, which is sometimes approximated as the sum of the material dispersion and the waveguide dispersion [19, 20]. We must also consider the penetration of the evanescent wave into the cladding. Because of these considerations, and the complications of a two-dimensional waveguide cross-section, we often use numerical simulations.

As many materials display normal GVD at optical frequencies, we are often interested in reducing the amount of normal dispersion by introducing anomalous waveguide dispersion. For this case, we see two general design considerations. First, we should design our waveguides with dimensions that are comparable to the wavelength of interest (near cutoff). Second, we should use core and cladding materials with high index-contrasts to reduce the penetration of the evanescent field and better approximate the strong anomalous dispersion found in the mirror waveguide.

3.1.4 Waveguide Equation

Light propagates differently in a waveguide than it does in free-space; however, it is similar to plane wave propagation. To reflect this similarity, we can rewrite the wave equation and look for solutions that take the same form as plane waves. We use a spatially varying dielectric constant \( {\varepsilon_r}\left( {x,y} \right) \) that is invariant along the z-direction, in which the structure extends infinitely. We start with the equation for the time-harmonic vector and scalar potentials in a source-free, non-magnetic, non-uniform, dielectric medium [3]:

$$ {\nabla^2}\vec{A} + {\omega^2}\frac{{{\varepsilon_r}}}{{{c^2}}}\vec{A} = - i\frac{\omega }{{{c^2}}}\nabla {\varepsilon_r}\Phi . $$
(7.108)

The electric field is determined by \( \vec{E} = - i\omega \vec{A} - \nabla \Phi \). We see that the right hand side of Eq. (7.108) couples the vector and scalar potential. If we assume that \( \nabla {\varepsilon_r} \) is small and set it to zero, we can obtain:

$$ {\nabla^2}\vec{A} + \frac{{{\omega^2}}}{{{c^2}}}{\varepsilon_r}\vec{A} = 0. $$
(7.109)

We can look for solutions that resemble plane waves of the form:

$$ \vec{A} = \hat{y}F\left( {x,y} \right){e^{ - i\beta z}}, $$
(7.110)

where we assume that all of the z dependence is in \( {e^{ - i\beta z}} \), and therefore \( F\left( {x,y} \right) \) is the modal distribution, which describes the spatial profile of the mode. After substitution, the differential equation for \( F\left( {x,y} \right) \) is:

$$ \nabla_T^2F + \left( { - {\beta^2} + \frac{{{\omega^2}}}{{{c^2}}}{\varepsilon_r}\left( {x,y} \right)} \right)F = 0, $$
(7.111)

where \( \nabla_T^2 \) is the Laplacian in the transverse spatial directions. This equation has the same form as the time-independent Schrodinger equation for a two-dimensional potential well with a potential of \( - {\omega^2}{\mu_0}\varepsilon \left( {x,y} \right) \). Therefore, this equation’s solutions are completely analogous to those for a quantum well and, as we have seen previously, exist only for specific values of the propagation constant \( \beta \).
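
To make the analogy explicit, a brief side-by-side comparison may help. The symbols \( \psi \), \( V \), \( E \), \( \hbar \), and the particle mass \( {m_e} \) are the standard quantum-mechanical quantities and are introduced here only for this comparison. The Schrodinger equation for a stationary state in two dimensions can be rearranged as:

$$ \nabla_T^2\psi + \frac{{2{m_e}}}{{{\hbar^2}}}\left( {E - V\left( {x,y} \right)} \right)\psi = 0. $$

Comparing term by term with Eq. (7.111), and using \( {\omega^2}{\varepsilon_r}/{c^2} = {\omega^2}{\mu_0}\varepsilon \), we can identify:

$$ \frac{{2{m_e}}}{{{\hbar^2}}}V\left( {x,y} \right) \leftrightarrow - \frac{{{\omega^2}}}{{{c^2}}}{\varepsilon_r}\left( {x,y} \right),\quad \frac{{2{m_e}}}{{{\hbar^2}}}E \leftrightarrow - {\beta^2}. $$

In this picture, regions of high refractive index act as potential wells, and guided modes with large \( \beta \) play the role of deeply bound states.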

3.1.5 Waveguide Field Distributions

For the case of a metallic mirror waveguide, the electric field must necessarily be zero at the metallic boundary. However, for the case of dielectric waveguides, the field extends beyond the boundary as a decaying evanescent field. Solving Eq. (7.111), a TE mode has an internal field given by [17]:

$$ {F_m}(y) \propto \left\{ {\begin{array}{*{20}{c}} {\cos \left( {2\pi \frac{{\sin {\theta_m}}}{\lambda }y} \right),} & {m = 0,\,2,\,4,\,...} \\ {\sin \left( {2\pi \frac{{\sin {\theta_m}}}{\lambda }y} \right),} & {m = 1,\,3,\,5,\,...} \\ \end{array} } \right. - \frac{d}{2} \leqslant y \leqslant \frac{d}{2}, $$
(7.112)

with \( \lambda = {\lambda_0}/{n_1} \). Outside of the waveguide core, the evanescent field is given by:

$$ {F_m}(y) \propto \left\{ {\begin{array}{ll} {\exp \left[ { - \left( {{n_2}{k_0}\sqrt {\frac{{{{\cos }^2}{\theta_m}}}{{{{\cos }^2}\left( {\pi /2 - {\theta_c}} \right)}} - 1} } \right)y} \right],} & {y > d/2} \\ {\exp \left[ {\left( {{n_2}{k_0}\sqrt {\frac{{{{\cos }^2}{\theta_m}}}{{{{\cos }^2}\left( {\pi /2 - {\theta_c}} \right)}} - 1} } \right)y} \right],} & {y < - d/2} \\ \end{array} } \right. . $$
(7.113)

We see the evanescent field decays into the surrounding medium, as shown in Eq. (7.113). The higher order modes, having larger angles, \( {\theta_m} \), will extend further into the cladding [17].
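
To illustrate how the decay in Eq. (7.113) depends on the mode angle, the Python sketch below evaluates the decay constant and the corresponding 1/e penetration depth. The indices, the wavelength, and the two angles are illustrative assumptions (the angles are not solved modes of a particular waveguide), and the function names are hypothetical.

```python
import math


def evanescent_decay_constant(theta_m, wavelength0, n1, n2):
    """Decay constant of the cladding field in Eq. (7.113), in 1/m.

    theta_m is the bounce angle in radians and must satisfy the guiding
    condition theta_m < pi/2 - theta_c.
    """
    k0 = 2.0 * math.pi / wavelength0
    # cos(pi/2 - theta_c) = sin(theta_c) = n2 / n1
    ratio_sq = (math.cos(theta_m) / (n2 / n1)) ** 2
    if ratio_sq <= 1.0:
        raise ValueError("angle is not guided by total internal reflection")
    return n2 * k0 * math.sqrt(ratio_sq - 1.0)


def penetration_depth(theta_m, wavelength0, n1, n2):
    """Distance over which the evanescent field falls to 1/e."""
    return 1.0 / evanescent_decay_constant(theta_m, wavelength0, n1, n2)


# Assumed example: silica-like core (n1 = 1.458) in air, 633-nm light.
N1, N2, WAVELENGTH = 1.458, 1.0, 633e-9
depth_small_angle = penetration_depth(0.3, WAVELENGTH, N1, N2)
depth_large_angle = penetration_depth(0.6, WAVELENGTH, N1, N2)

print(f"theta = 0.3 rad: depth = {depth_small_angle * 1e9:.0f} nm")
print(f"theta = 0.6 rad: depth = {depth_large_angle * 1e9:.0f} nm")
```

The decay constant is equivalent to \( \sqrt {{\beta^2} - n_2^2k_0^2} \) with \( \beta = {n_1}{k_0}\cos {\theta_m} \), so the field extends further into the cladding as the angle approaches \( \pi /2 - {\theta_c} \).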

Just as for quantum wells, there are analogous oscillatory solutions for waveguides, as shown in Fig. 7.9. For mirror waveguides, the modes resemble an infinite quantum well, with the wave going to zero at the boundaries (Fig. 7.9, left). Alternatively, we see that waves guided by a dielectric extend beyond the boundaries and propagate in the form of an evanescent wave, analogous to a finite quantum well (Fig. 7.9, right).

Fig. 7.9

The spatial distributions of the wave functions of a quantum well have the same form as those of a guided electromagnetic wave. Here, we see several wave functions for an infinite quantum well (far left), a mirror waveguide (middle left), a finite potential quantum well (middle right), and a dielectric waveguide (far right)

We have only dealt with planar waveguides so far, for simplicity. We can confine the wave in the other transverse dimension as well, which enables us to concentrate light strongly in two dimensions. Adding an additional degree of freedom leads to many different types of waveguides, including cylindrical waveguides, such as fibers, as well as rectangular waveguides such as channel, strip, and ridge waveguides [2].

3.2 Silica Nanowires

We have seen that in dielectric waveguides, the thickness and refractive indices determine the number of modes, their spatial distribution, propagation constants, and GVD. High index contrasts enable stronger confinement, greater control over the total GVD and can lead to bends with very small radii of curvature. However, high index contrast waveguides require excellent uniformity. Abrupt tapers and interfacial roughness cause scattering and can be a large source of optical loss. Therefore, smooth, uniform waveguides are a necessity for operating in this high-index contrast regime. In this section, we will explore silica nanowires as a simple and effective waveguiding system for nano-scale optics.

3.2.1 Silica Nanowire Fabrication

Silica nanowires are easily fabricated using a fiber-pulling technique [21]. Starting with a standard silica fiber, we remove the protective coating to expose the cladding. We place the bare fiber within a flame until the silica softens and then the fiber is drawn until it breaks, Fig. 7.10 (left). This process creates a tapered region starting from a typical diameter of 125 μm that ends in a fiber that is 1–5 μm in diameter. We make smaller diameter nanowires, down to 20 nm, by attaching the free-end of the nanowire to a sapphire taper, placing the sapphire into the flame, and drawing a second time (Fig. 7.10, right) [21, 22]. Nanowires drawn this way are advantageous because one end is still attached to a standard optical fiber, facilitating simple coupling. Alternatively, we use a mechanical method for pulling nanowires, which results in slightly larger dimensions (down to 90 nm) with the advantage of having standard optical fiber on both ends [23].

Fig. 7.10

Two-step drawing process for silica nanowires. First, we draw the standard fiber over a flame to form nanowires down to ~1 μm (left). By attaching the fiber to a sapphire taper and drawing a second time, we fabricate nanowires down to tens of nanometers (right) [22]

Reliable results from the drawing process depend on several critical factors. First, the fiber must be extremely clean when being drawn to ensure uniform heating. Second, the flame must be well controlled and well characterized to ensure that the nanowire is at the optimum hot spot. Third, the flame must be of very high quality, such as a hydrogen flame.

3.2.2 Mechanical Properties

Silica nanowires have extremely high quality mechanical properties [21, 24]. They can be made small (down to 20 nm) [22], with lengths up to tens of mm [21]. These are extremely uniform; for example, a 260-nm diameter fiber displays an 8-nm diameter variation over a total length of 4 mm, resulting in a uniformity of \( {10^6} \) with an aspect ratio of \( 7 \times {10^{ - 5}} \) [23]. Meanwhile, the surface roughness is less than 0.5 nm (Fig. 7.11, middle) [21, 22]. They are extremely strong with an estimated tensile strength of 5 GPa [21].

Fig. 7.11

Physical properties of silica nanowires. A coiled, 260-nm diameter silica nanowire that is 4-mm long (left). An SEM image of a silica nanowire, demonstrating smooth sidewalls (middle). We can bend and form these into devices such as this nano-knot (right)

We can bend and twist silica nanowires to form devices (Fig. 7.11, right). The small diameter of the nanowires enables a strong van der Waals force that can hold devices in place. Nanowire bends are elastic and therefore straighten when stress is released; however, annealing nanowires enables such temporary bends to become permanent.

3.2.3 Optical Properties

The smoothness and uniformity of silica nanowires, combined with the high index-contrast between silica (n = 1.45) and air (n = 1), produces high quality sub-micrometer waveguides. We launch light into a nanowire by coupling into the macroscopic optical fiber using standard objective coupling techniques. Additionally, we collect the light exiting the nanowire using a similar method, thus creating a bi-directional link to the nanoscale.

Guiding losses display an interesting trend as a function of diameter. Figure 7.12 (left) shows the measured propagation loss as a function of diameter at two wavelengths, 633 and 1,550 nm. We see that losses are lower for longer wavelengths. In addition, larger diameter nanowires result in lower losses, typically below 0.1 dB/mm [23].

Fig. 7.12

Optical losses versus nanowire diameter for 633- and 1,550-nm light (left). Diameter dependent scattering of silica nanowires (right)

When guiding light in a nanowire, we observe scattering from microscopic dust particles (Fig. 7.12, right) and notice an interesting dependence on the diameter. For large fibers, we observe virtually no scatter (not shown). As we decrease the diameter below 1 μm, we find significant and uniform scattering. As we decrease the diameter below 450 nm, the scattering becomes restricted to increasingly fewer, brighter spots.

We can explain the peculiar scattering shown in Fig. 7.12 (right) intuitively if we observe the transverse shape of the mode. We calculate and plot the Poynting-vector profile for an 800-nm, 500-nm, and 200-nm diameter silica nanowire in Fig. 7.13 (left, middle, and right, respectively), with a wavelength of 633 nm. We find that for a waveguide with diameter of 800 nm and larger, the fraction of the wave guided within the core (which we refer to as the confinement) is very high, and there is little evanescent field. For smaller diameters (around 500 nm), a significant evanescent field develops, leading to more scattering. For even smaller diameters, the power is almost entirely guided on the outside of the fiber in the form of an evanescent wave. The large evanescent wave leads to very strong scattering. We might expect more scattering from smaller nanowires, but we actually observe fewer scattered particles. We can easily explain this trend when we realize that the surface area decreases in smaller nanowires, leading to less area for the dust particles to attach, reducing the total number of dust particles.

Fig. 7.13

Numerically simulated Poynting vector for 633-nm light in a silica nanowire for several diameters [23]

Quantitatively, Fig. 7.14 shows the confinement as a function of the diameter for silica nanowires guiding 633- and 1,550-nm light. For very large diameters, the light is confined completely within the core; however, the light is not very concentrated as it spreads across the total cross section of the waveguide. For diameters near or slightly smaller than the wavelength, the fraction of light within the core is over 90%, thus the light is tightly confined and well contained within the material, which we refer to as strong confinement. Lastly, diameters less than one quarter of the wavelength guide light almost completely in the evanescent field.

Fig. 7.14

Calculated optical confinement in a silica nanowire for 633- and 1,550-nm light [23, 25]

The substantial evanescent waves in silica nanowires are advantageous for sensing. When a considerable fraction of the light is guided in the evanescent field, the wave will strongly interact with nearby materials, enabling efficient coupling over short distances [26]. In combination with the strong van der Waals force, coupling is as simple as touching one nanowire to the other, as shown in Fig. 7.15 (left). Coupling can also happen on extremely short length-scales, down to several microns, as shown using simulation (Fig. 7.15, right) [26].

Fig. 7.15

Optical image of two silica-nanowires evanescently coupling (left) [21]. Numeric simulation showing strong coupling over a distance of 3 μm (right) [26]

Just as we can bend, form, and cut silica nanowires mechanically, we can use these techniques to form devices. So far, we have only considered devices that are freely suspended in air. To make practical devices, we require a substrate with lower index; otherwise, the wave will escape into the cladding below. Potential substrates include MgF2 (n = 1.37), mesoporous silica (n = 1.18) [23, 27], and silica aerogel (n = 1.05) [24]. Using such a substrate, we can form devices such as the directional coupler shown in Fig. 7.16 (middle) [24]. We can achieve very small bending radii, as low as 5 μm with losses around 1 dB per 90-degree turn, as shown in Fig. 7.16 (left) [24]. In addition, we can use silica nanowires to couple into and out of other nano-scale materials such as ZnO nanowires, shown in Fig. 7.16 (right) [27].

Fig. 7.16

The high index contrast between silica and air enables tight bends for devices (left) [21]. We can form micro-scale devices such as a directional coupler (middle) [24]. We can couple into other nano-scale devices such as this ZnO waveguide (right) [27]

4 Nonlinear Optics at the Nanoscale

We have explored both the linear and nonlinear propagation of plane waves in materials and in waveguides, developed an understanding of linear pulse propagation, and have identified how waveguides can enable high intensities in space (strong confinement) and time (short-pulse dispersion management). Now we will merge these concepts together to introduce nonlinear optics at the nanoscale and demonstrate several applications.

4.1 Nonlinear Schrodinger Equation

The generalized nonlinear Schrodinger equation (NLSE) is the workhorse of nonlinear fiber and integrated optics. It adds a third-order nonlinear response to the waveguide equation and describes the temporal evolution of a pulse in a nonlinear waveguide. Although we will discuss the simplest version here, we can extend this model to include wide-bandwidth pulses, a delayed nonlinearity, and two-photon absorption.

4.1.1 Nonlinear Schrodinger Equation

To develop the nonlinear Schrodinger equation, we will follow the derivation by Agrawal [9]. Let us consider the wave equation in terms of the linear and nonlinear polarization in the time domain:

$$ {\nabla^2}\vec{E} - \frac{1}{{{c^2}}}\frac{{{\partial^2}\vec{E}}}{{\partial {t^2}}} = {\mu_0}\frac{{{\partial^2}{{\vec{P}}_L}}}{{\partial {t^2}}} + {\mu_0}\frac{{{\partial^2}{{\vec{P}}_{NL}}}}{{\partial {t^2}}}. $$
(7.114)

Our strategy for simplifying this differential equation starts by including the nonlinear polarization into an effective dielectric constant \( {\varepsilon_{eff}}\left( \omega \right) \). Next, we use separation of variables to isolate a transverse-wave equation from the propagation equation, as we did when analyzing the modes in a waveguide. We incorporate the nonlinear polarization by solving the transverse equation and using the nonlinear part of \( {\varepsilon_{eff}}\left( \omega \right) \) as a perturbation. The lowest order perturbation alters the propagation constant, while keeping the distribution of the mode intact. We apply the resulting linear and nonlinear propagation constants to the propagation equation to arrive at the NLSE, which describes the temporal evolution of a pulse in a nonlinear waveguide.

As we have seen previously, the effects of dispersion are easiest to handle in the frequency domain. For clarity, we will denote variables in the frequency domain using a tilde above. Therefore, we take the Fourier transform of Eq. (7.114), to show:

$$ {\nabla^2}\tilde{E} + {\varepsilon_{eff}}\left( \omega \right)k_0^2\tilde{E} = 0, $$
(7.115)

and we define an effective relative dielectric constant in the frequency domain:

$$ {\varepsilon_{eff}}\left( \omega \right) = 1 + \tilde{\chi }_{xx}^{(1)}\left( \omega \right) + {\varepsilon_{NL}}. $$
(7.116)

Although not explicitly stated, the effective dielectric constant varies in the transverse direction to form the waveguide. By combining both the linear and nonlinear terms into an effective dielectric constant, we simplify the analysis.

Next, we combine the slowly varying amplitude approximation from Eq. (7.36) with the modal distribution from Eq. (7.110) in the time domain:

$$ \vec{E} = F\left( {x,y} \right)A\left( {z,t} \right)\exp \left[ {i\left( {{\beta_0}z - {\omega_0}t} \right)} \right]\hat{x}. $$
(7.117)

Here, we have a carrier wave given by \( \exp \left[ {i\left( {{\beta_0}z - {\omega_0}t} \right)} \right] \) that is slowly modulated by an envelope function, \( A\left( {z,t} \right) \). Taking the Fourier transform of this expression, we substitute

$$ \tilde{E} = F\left( {x,y} \right)\tilde{A}\left( {z,\omega - {\omega_0}} \right)\exp \left( {i{\beta_0}z} \right), $$
(7.118)

into Eq. (7.115). Now, we perform a separation of variables to show:

$$ \frac{{{\partial^2}F}}{{\partial {x^2}}} + \frac{{{\partial^2}F}}{{\partial {y^2}}} + \left[ {{\varepsilon_{eff}}\left( \omega \right)k_0^2 - \tilde{\beta }{{\left( \omega \right)}^2}} \right]F = 0, $$
(7.119)

and

$$ 2i{\beta_0}\frac{{\partial \tilde{A}}}{{\partial z}} + \left( {{{\tilde{\beta }}^2}\left( \omega \right) - \beta_0^2} \right)\tilde{A} = 0. $$
(7.120)

We have used the slowly varying amplitude approximation and assumed the term \( {\partial^2}\tilde{A}/\partial {z^2} \) is negligible compared to \( \beta \left( \omega \right)\left( {\partial \tilde{A}/\partial z} \right) \). Equation (7.119) closely resembles (7.111), determines the modal distribution \( F\left( {x,y} \right) \) and the propagation constant \( \tilde{\beta }\left( \omega \right) \), and includes the nonlinear dielectric constant \( {\varepsilon_{eff}}\left( \omega \right) \).

We incorporate the nonlinearity by using first-order perturbation theory. We replace \( \varepsilon \) with \( {n^2} \) and approximate \( \varepsilon = {\left( {n + \Delta n} \right)^2} \approx {n^2} + 2n\Delta n \). Next, we use

$$ \Delta n = {n_2}{\left| E \right|^2} + \frac{{i\alpha }}{{2{k_0}}}, $$
(7.121)

where \( {n_2} \) is the nonlinear index of refraction and \( \alpha \) is the attenuation coefficient. Using perturbation theory results in a modal distribution \( F\left( {x,y} \right) \) that is unaffected by the nonlinearity, to first order. Meanwhile, the nonlinearity comes into play through a perturbation of the propagation constant:

$$ \tilde{\beta }\left( \omega \right) = \beta \left( \omega \right) + \Delta \beta \left( \omega \right), $$
(7.122)

where

$$ \Delta \beta \left( \omega \right) = \frac{{{\omega^2}n\left( \omega \right)}}{{{c^2}\beta \left( \omega \right)}}\frac{{\int {\int_{ - \infty }^\infty {\Delta n\left( \omega \right){{\left| {F\left( {x,y} \right)} \right|}^2}dxdy} } }}{{\int {\int_{ - \infty }^\infty {{{\left| {F\left( {x,y} \right)} \right|}^2}dxdy} } }}. $$
(7.123)

This expression is completely analogous to self-phase modulation using a nonlinear wavevector \( k \) from Eq. (7.73), except applied to the propagation constant. This situation is slightly more complicated because we must consider the modal distribution. However, we can use the waveguide geometry, through \( F\left( {x,y} \right) \), to increase the rate of nonlinear phase accumulation.

Solving Eq. (7.119) and applying (7.123), we obtain the modal distribution and both the linear and nonlinear propagation constants. Now, we can apply these results to Eq. (7.120). To further simplify Eq. (7.120), we assume that the frequency-dependent propagation constant remains close to the propagation constant of the carrier wave \( \tilde{\beta }\left( \omega \right) \approx {\beta_0} \), and we approximate \( {\tilde{\beta }^2}\left( \omega \right) - \beta_0^2 \approx 2{\beta_0}\left( {\tilde{\beta }\left( \omega \right) - {\beta_0}} \right) \). Now, the Fourier transform of the slowly varying envelope \( \tilde{A}\left( {z,\omega - {\omega_0}} \right) \) satisfies:

$$ \frac{{\partial \tilde{A}}}{{\partial z}} = i\left[ {\beta \left( \omega \right) + \Delta \beta \left( \omega \right) - {\beta_0}} \right]\tilde{A}. $$
(7.124)

As we are only concerned with the spectral bandwidth around the carrier frequency of the pulse, we Taylor expand the propagation constant in the frequency domain:

$$ \beta \left( \omega \right) = {\beta_0} + \left( {\omega - {\omega_0}} \right){\beta_1} + \frac{1}{2}{\left( {\omega - {\omega_0}} \right)^2}{\beta_2} + \frac{1}{6}{\left( {\omega - {\omega_0}} \right)^3}{\beta_3} + ... $$
(7.125)

where \( {\beta_0} = \beta \left( {{\omega_0}} \right) \) is the propagation constant of the carrier wave and

$$ {\beta_m} \equiv {\left( {\frac{{{d^m}\beta }}{{d{\omega^m}}}} \right)_{\omega = {\omega_0}}}. $$
(7.126)

Similarly, we expand \( \Delta \beta \left( \omega \right) \):

$$ \Delta \beta \left( \omega \right) = \Delta {\beta_0} + \left( {\omega - {\omega_0}} \right)\Delta {\beta_1} + \frac{1}{2}{\left( {\omega - {\omega_0}} \right)^2}\Delta {\beta_2} + ..., $$
(7.127)

using a comparable definition of \( \Delta {\beta_m} \) to Eq. (7.126). Using these expansions, keeping terms up to second order in \( \beta \left( \omega \right) \) and only the leading term \( \Delta {\beta_0} \) in \( \Delta \beta \left( \omega \right) \), we take the inverse Fourier transform of Eq. (7.124) to arrive at:

$$ \frac{{\partial A}}{{\partial z}} + {\beta_1}\frac{{\partial A}}{{\partial t}} + \frac{{i{\beta_2}}}{2}\frac{{{\partial^2}A}}{{\partial {t^2}}} = i\Delta {\beta_0}A. $$
(7.128)
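
The step from Eq. (7.124) to Eq. (7.128) can be made explicit. As a sketch, we assume the sign convention \( A\left( {z,t} \right) = \frac{1}{{2\pi }}\int {\tilde{A}\left( {z,\omega - {\omega_0}} \right)\exp \left[ { - i\left( {\omega - {\omega_0}} \right)t} \right]d\omega } \), chosen here to match the \( \exp \left( { - i{\omega_0}t} \right) \) time dependence of the carrier. Under this convention, each factor of \( \left( {\omega - {\omega_0}} \right) \) in the frequency domain becomes a time derivative:

$$ {\left( {\omega - {\omega_0}} \right)^m}\tilde{A} \leftrightarrow {\left( {i\frac{\partial }{{\partial t}}} \right)^m}A. $$

Substituting the truncated expansions (7.125) and (7.127) into Eq. (7.124), the \( {\beta_0} \) terms cancel and the inverse transform gives

$$ \frac{{\partial A}}{{\partial z}} = i\left[ {i{\beta_1}\frac{\partial }{{\partial t}} - \frac{{{\beta_2}}}{2}\frac{{{\partial^2}}}{{\partial {t^2}}} + \Delta {\beta_0}} \right]A, $$

which rearranges term by term into Eq. (7.128).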

We note that both the nonlinearity and the losses are contained in the \( \Delta {\beta_0} \) term. Using the assumption that the modal distribution \( F\left( {x,y} \right) \) does not vary considerably over the pulse bandwidth and \( \beta \left( \omega \right) \approx n\left( \omega \right)\omega /c \), we can write:

$$ \frac{{\partial A}}{{\partial z}} + {\beta_1}\frac{{\partial A}}{{\partial t}} + \frac{{i{\beta_2}}}{2}\frac{{{\partial^2}A}}{{\partial {t^2}}} + \frac{\alpha }{2}A = i\gamma \left( {{\omega_0}} \right){\left| A \right|^2}A. $$
(7.129)

This equation is related to the nonlinear Schrodinger equation from quantum mechanics if the loss term \( \alpha \) is zero. We have also introduced the nonlinear parameter, or effective nonlinearity [9]:

$$ \gamma \left( {{\omega_0}} \right) = \frac{{2\pi }}{\lambda }\frac{{\int {\int_{ - \infty }^\infty {{n_2}\left( {x,y} \right){{\left| {F\left( {x,y} \right)} \right|}^4}dxdy} } }}{{{{\left( {\int {\int_{ - \infty }^\infty {{{\left| {F\left( {x,y} \right)} \right|}^2}dxdy} } } \right)}^2}}}. $$
(7.130)

The effective nonlinearity has units of \( \mathrm{W^{-1}\,m^{-1}} \) when \( {\left| A \right|^2} \) represents the optical power.

Equation (7.129) is approximately valid for pulses that are longer than 1 ps and does not include the effects of Raman or Brillouin scattering. Raman scattering requires the inclusion of a delayed (non-instantaneous) nonlinearity. Meanwhile, shorter pulses have a much larger bandwidth that requires several modifications. For short, large-bandwidth pulses propagating in the linear regime, \( \beta \left( \omega \right) \) cannot be sufficiently modeled unless we include higher-order dispersive terms beyond \( {\beta_2} \). Additionally, for short pulses in the nonlinear regime, the nonlinear parameter’s frequency dependence must also be included, leading to an effect known as self-steepening [9]. These additional effects change the details of how a pulse evolves as it propagates; however, Eq. (7.129) describes many of the major features.

The NLSE is only solvable analytically under certain conditions using the inverse-scattering method [28]. However, an efficient method for numerically simulating the NLSE exists, which is known as the split-step Fourier method [29, 30]. The technique discretizes pulse propagation in the z-direction. We split each discrete step into two parts. We apply only the nonlinear phase in the time domain during the first part. Subsequently, we Fourier transform and apply linear dispersion in the frequency domain, followed by a second Fourier transform back to the time domain, which completes the z-step. This method is both efficient and straightforward to apply.
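
The two-part step described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of [29, 30]. It assumes Eq. (7.129) written in a frame moving at the group velocity (so the \( {\beta_1} \) term drops out), a periodic time window, and NumPy's FFT. The function name `split_step_nlse` and the normalized test values are hypothetical.

```python
import numpy as np


def split_step_nlse(A0, t, length, n_steps, beta2, gamma, alpha=0.0):
    """Propagate the envelope A0(t) a distance `length` with Eq. (7.129),
    written in a frame moving at the group velocity (T = t - beta1 * z)."""
    A = np.asarray(A0, dtype=complex).copy()
    dz = length / n_steps
    # Angular-frequency offsets (omega - omega0) on the FFT grid.
    omega = 2.0 * np.pi * np.fft.fftfreq(t.size, d=t[1] - t[0])
    # Dispersion and loss over one step, applied in the frequency domain.
    linear_step = np.exp((0.5j * beta2 * omega**2 - 0.5 * alpha) * dz)
    for _ in range(n_steps):
        # Part 1: nonlinear phase only, in the time domain.
        A = A * np.exp(1j * gamma * np.abs(A) ** 2 * dz)
        # Part 2: transform, apply the linear step, transform back.
        A = np.fft.ifft(linear_step * np.fft.fft(A))
    return A


# Usage (normalized, hypothetical units): a sech pulse with beta2 = -1,
# gamma = 1, and unit peak power, for which L_D = L_NL.
t = np.linspace(-20.0, 20.0, 1024, endpoint=False)
A_in = 1.0 / np.cosh(t)
A_out = split_step_nlse(A_in, t, length=np.pi / 2, n_steps=2000,
                        beta2=-1.0, gamma=1.0)
energy_in = np.sum(np.abs(A_in) ** 2)
energy_out = np.sum(np.abs(A_out) ** 2)
```

With no loss, each part of the step only multiplies the field by unit-modulus factors, so the discrete pulse energy is preserved. This simple ordering is first-order accurate in the step size; symmetrized variants reduce the error at the cost of slightly more bookkeeping.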

4.1.2 Effective Nonlinearity

The effective nonlinearity is the nonlinear phase per unit distance, per unit power; thus, a large value of \( \gamma \) can enable smaller and more efficient devices. With the effective nonlinearity, we can rewrite Eq. (7.73) for the nonlinear phase within a waveguide by including the propagation constant \( \beta \) and the effective nonlinearity \( \gamma \left( \omega \right) \):

$$ \varphi = \beta L + \gamma \left( \omega \right)LP = {\varphi_L} + {\varphi_{NL}}. $$
(7.131)

We note that \( \gamma \) takes into account both the distribution of the mode and a location-dependent nonlinearity. The modal distribution is critical for silica nanowires because the only significant nonlinearity occurs within the silica and not the air cladding. The spatially dependent nonlinearity is also important for waveguides formed from multiple materials with different intrinsic nonlinearities.

If the field is contained within a single material, such as in optical fiber, we can write:

$$ \gamma \left( {{\omega_0}} \right) = \frac{{{n_2}\left( {{\omega_0}} \right){\omega_0}}}{{c{A_{eff}}}}, $$
(7.132)

where the effective mode area is:

$$ {A_{eff}} = \frac{{{{\left( {\int {\int_{ - \infty }^\infty {{{\left| {F\left( {x,y} \right)} \right|}^2}dxdy} } } \right)}^2}}}{{\int {\int_{ - \infty }^\infty {{{\left| {F\left( {x,y} \right)} \right|}^4}dxdy} } }}. $$
(7.133)

Alternatively, we can write the effective area in terms of the mode-field diameter (MFD) d, related to the effective area by \( {A_{eff}} = \pi {\left( {d/2} \right)^2} \).
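
As an illustration of Eqs. (7.132) and (7.133), the sketch below evaluates the effective area numerically for a sampled mode profile and converts it to an MFD and a nonlinear parameter. The Gaussian profile, its radius `w`, and the value of `n2_assumed` are assumptions chosen for illustration only; they are not the nanowire mode or a value taken from this chapter.

```python
import numpy as np


def effective_area(F, dx, dy):
    """Effective mode area, Eq. (7.133), from a sampled mode profile F(x, y)."""
    intensity = np.abs(F) ** 2
    numerator = (np.sum(intensity) * dx * dy) ** 2
    denominator = np.sum(intensity ** 2) * dx * dy
    return numerator / denominator


def nonlinear_parameter(n2, wavelength, a_eff):
    """Eq. (7.132): gamma = n2 * omega0 / (c * A_eff) = 2 * pi * n2 / (lambda * A_eff)."""
    return 2.0 * np.pi * n2 / (wavelength * a_eff)


# Hypothetical Gaussian mode with 1/e field radius w (not a nanowire mode).
w = 0.3e-6                                   # m, assumed
x = np.linspace(-6 * w, 6 * w, 601)
X, Y = np.meshgrid(x, x)
F = np.exp(-(X**2 + Y**2) / w**2)
dx = x[1] - x[0]

a_eff = effective_area(F, dx, dx)            # close to pi * w**2 for this profile
mfd = 2.0 * np.sqrt(a_eff / np.pi)           # from A_eff = pi * (d / 2)**2
n2_assumed = 2.7e-20                         # m^2/W, assumed order of magnitude
gamma = nonlinear_parameter(n2_assumed, 800e-9, a_eff)   # W^-1 m^-1
```

For a Gaussian profile the integrals can be done by hand and give \( {A_{eff}} = \pi {w^2} \), which makes this a convenient check before applying the same function to a computed mode.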

4.1.3 Pulse Propagation Using the Nonlinear Wave Equation

Pulse propagation within a nonlinear waveguide is considerably different from the continuous-wave case. For large nonlinear interactions, high powers are necessary. This requirement benefits from restricting the wave spatially by reducing the effective area, and controlling the spreading of the pulse in time using dispersion. Therefore, there is an interplay between the nonlinearity and the group velocity dispersion.

To get a sense of which effects are dominant, we look to Eq. (7.129) and determine length scales on which these relative effects become important. The length associated with group velocity dispersion is:

$$ {L_D} = \frac{{T_0^2}}{{\left| {{\beta_2}} \right|}}, $$
(7.134)

known simply as the dispersion length. Similarly,

$$ {L_{NL}} = \frac{1}{{\gamma {P_0}}} $$
(7.135)

is known as the nonlinear length. When the physical length of the waveguide is comparable to these length scales, the associated effects become important.

There are four relevant combinations of these length scales [9]. For \( L \ll {L_D} \) and \( L \ll {L_{NL}} \), both the dispersion and the nonlinearity are minimal and the pulse passes through unaffected, ideal for transmission systems. When \( L \geqslant {L_D} \) and \( L \ll {L_{NL}} \), the pulse undergoes significant dispersive broadening, but the spectrum remains constant. If \( L \ll {L_D} \) and \( L \geqslant {L_{NL}} \), the pulse is affected primarily by the nonlinearity and the spectrum will change via self-phase modulation. This regime is important for wide spectral broadening because the pulse remains relatively short due to the minimal dispersive broadening.

The final combination requires special attention [9]. When both \( L \geqslant {L_D} \) and \( L \geqslant {L_{NL}} \), the interplay between the dispersion and the nonlinearity produces dramatically different results depending on the relative signs of the two effects. If both effects are positive (\( {n_2} > 0 \) and \( {\beta_2} > 0 \), dispersion in the normal regime), spectral broadening can occur and even temporal pulse compression can be achieved under certain conditions [9]. If the dispersion is anomalous with \( {\beta_2} < 0 \) and \( {n_2} > 0 \), stable solutions, known as solitons, can form [9, 31]. Under these conditions, the group velocity dispersion causes the blue to move toward the front and the red toward the back; however, the nonlinearity causes the opposite effect and the two effects balance one another. These solutions are quantized in the sense that the shape repeats itself after a regular, fixed distance. Solitons are extremely stable because they can shed excess energy in the form of a dispersive wave until a stable solution is formed [32].

The lowest order (fundamental) soliton is of particular interest for communications systems because the dispersion and nonlinearity are in constant balance (\( {L_D} = {L_{NL}} \)) and the pulse maintains its shape as it propagates [33, 34]. In addition, the spectral phase across the pulse is flat, which has two advantages for all-optical switching [35–37]. First, a fundamental soliton is intrinsically transform-limited, leading to efficient switching. Secondly, unlike other pulses, the flat-phase guarantees that the entire pulse will undergo switching, avoiding pulse distortions.
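
The length scales in Eqs. (7.134) and (7.135), the four regimes, and the balance condition \( {L_D} = {L_{NL}} \) can be collected into a short sketch. The function names and numerical values below are hypothetical, and the sharp comparison \( L \geqslant {L_D} \) is a crude stand-in for the gradual transition between regimes.

```python
def dispersion_length(T0, beta2):
    """Eq. (7.134): L_D = T0**2 / |beta2| (T0 in s, beta2 in s^2/m)."""
    return T0**2 / abs(beta2)


def nonlinear_length(gamma, P0):
    """Eq. (7.135): L_NL = 1 / (gamma * P0) (gamma in W^-1 m^-1, P0 in W)."""
    return 1.0 / (gamma * P0)


def regime(L, L_D, L_NL):
    """Crude classification of the four combinations of length scales."""
    dispersive = L >= L_D
    nonlinear = L >= L_NL
    if dispersive and nonlinear:
        return "dispersion and nonlinearity"
    if dispersive:
        return "dispersion only"
    if nonlinear:
        return "nonlinearity only"
    return "neither"


def fundamental_soliton_peak_power(T0, beta2, gamma):
    """Peak power for which L_D = L_NL; requires anomalous dispersion."""
    if beta2 >= 0:
        raise ValueError("solitons require anomalous dispersion (beta2 < 0)")
    return abs(beta2) / (gamma * T0**2)


# Hypothetical values, for illustration only.
T0 = 100e-15          # s
beta2 = -2e-26        # s^2/m (anomalous)
gamma = 0.5           # W^-1 m^-1

L_D = dispersion_length(T0, beta2)
P_sol = fundamental_soliton_peak_power(T0, beta2, gamma)
L_NL = nonlinear_length(gamma, P_sol)      # equals L_D by construction
label = regime(1e-3, L_D, L_NL)            # a 1-mm device at this power
```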

4.2 Nonlinear Properties of Silica Nanowires

We will theoretically explore the advantages of silica nanowires for nonlinear optics. This discussion will illustrate how high index-contrast sub-wavelength waveguides can achieve large effective nonlinearities and tunable group velocity dispersion.

4.2.1 Effective Nonlinearities

The high index contrast between silica and air enables strong light confinement, facilitating large effective nonlinearities (\( \gamma \)). High confinement within a waveguide, alone, does not increase \( \gamma \). Instead, we must consider both the effective mode area and the distribution of the mode, as shown in Eq. (7.130).

To study how we can maximize \( \gamma \) in silica nanowires, we will first consider how the MFD changes as we change the physical diameter (PD) of the nanowire [25]. Figure 7.17 (left) shows the MFD versus the PD. For illustration, we also plot the PD versus itself (a dashed line with a slope of unity). We see that for large diameter fibers, the MFD is smaller than the PD [25]. As we reduce the PD, the MFD becomes equal to the PD. Then the evanescent field increases rapidly, causing the MFD to become much larger than the PD. Because the nonlinearity of air is negligible, we expect the largest \( \gamma \) when the MFD is smallest, while remaining in the core. This crossover occurs when the MFD is equal to the PD, which corresponds to a diameter around 550 nm for 800-nm light.

Fig. 7.17

Strong nonlinear-optical interaction in silica nanowires. The mode-field diameter versus the physical diameter (left). The effective nonlinearity as a function of the physical diameter (right) [25]

From the simple structure of silica nanowires, we expect the highest effective nonlinearity when the MFD coincides with the PD. We calculate and plot the effective nonlinearity for a silica nanowire as a function of the diameter (Fig. 7.17, right) [25]. As expected, the largest nonlinearity occurs for a diameter of 550 nm for 800-nm light. The nonlinearity peaks at a value of \( 660\;\mathrm{W^{-1}\,km^{-1}} \), which is over 300 times the effective nonlinearity of standard telecommunications fibers [9]. This result is impressive considering that we are using a material, silica, which has a comparatively low nonlinear index of refraction [38].

4.2.2 Group Velocity Dispersion

As we have seen previously, waveguide dispersion enables us to engineer the total dispersion, greatly influencing nonlinear pulse propagation. Figure 7.18 (left) shows the wavelength-dependent total dispersion for silica nanowires of varying diameters [39]. We see that the material dispersion in silica is normal for short wavelengths, reaches zero around 1,300 nm, and then becomes anomalous for longer wavelengths. If we operate around 800 nm, we are restricted to normal dispersion, unless we can counteract the normal dispersion with anomalous waveguide dispersion.

Fig. 7.18

Dispersion engineering in silica nanowires. The dispersion parameter D versus wavelength for several waveguide dimensions (left). The dispersion parameter D as a function of silica nanowire dimension for 800-nm light (right) [39]

Figure 7.18 (right) shows the dispersion for 800-nm light as a function of nanowire diameter [39]. For diameters of 700–1,200 nm, strong anomalous waveguide dispersion overcomes the material dispersion and results in anomalous total dispersion. For diameters near 700 nm, total dispersion is very low. For even smaller dimensions, the dispersion becomes strongly normal. This ability to tune the total dispersion is a key element for engineering efficient nonlinear interactions.

4.3 Supercontinuum Generation in Silica Nanowires

We can achieve significant nonlinear effects if the nonlinear length is comparable to the waveguide length. To demonstrate these strong nonlinearities, we fabricate a series of silica nanowires with minimum diameters that vary from 360 to 1,200 nm [40]. For each nanowire, we launch 90-fs pulses at a central wavelength of 800 nm and observe the output spectrum. To understand the phenomena, we will first consider the energy dependence of the spectral broadening in a single nanowire. Next, we will observe the diameter dependence of spectral broadening, which incorporates changes in both the effective nonlinearity and the group velocity dispersion.

4.3.1 Energy-Dependent Spectral Broadening

Figure 7.19 shows the output spectra for a 1,200-nm diameter silica nanowire as we increase the input pulse energy. For an input pulse of 0.5 nJ, we observe no broadening and the output pulse spectrum is identical to the input pulse spectrum. Around 1 nJ, we reach the threshold for spectral broadening. For higher pulse energies, we see larger broadening and the development of blue peaks near 415 and 437 nm. These peaks are not believed to be second-harmonic signals, as bulk amorphous silica is centrosymmetric [23]; instead, we attribute this phenomenon to higher-order soliton self-splitting, as has been observed previously [41–45].

Fig. 7.19

Energy-dependent spectral broadening in a 1,200-nm diameter silica nanowire [40]

4.3.2 Diameter-Dependent Spectral Broadening

Using similar pulse energies, we observe the diameter dependence of supercontinuum generation in Fig. 7.20 [40]. For a diameter of 360 nm, we find very little spectral broadening. The absence of broadening for a 360-nm nanowire is consistent with a significant evanescent field reducing the nonlinear parameter, as shown in Fig. 7.17. As we increase the diameter to 445 nm, we begin to observe broadening due to a larger nonlinear parameter; however, the broadening is limited by the strong normal dispersion (Fig. 7.18). For a 525-nm diameter fiber, the near-maximal nonlinear parameter is responsible for the large broadening observed; meanwhile, the 700-nm nanowire’s spectrum can be partially attributed to the near-zero dispersion. For the 850-nm diameter fibers, we enter the anomalous dispersive regime and reduced broadening may be related to soliton formation. Lastly, the 1,200-nm nanowire shows additional structure in the blue part of the visible spectrum, which can be attributed to dispersive waves [41–45].

Fig. 7.20

Diameter-dependent supercontinuum generation in silica nanowires [40]

We see that by combining strong confinement and dispersion engineering in sub-wavelength nanowires, we can achieve significant spectral broadening using nJ pulse energies. These compact, integratable supercontinuum sources open the door to more advanced devices such as all-optical modulators.

4.4 All-Optical Modulation, Switching, and Logic

Now that we have come to understand nonlinear optics and the advantages of using nano-scale waveguides, we will explore light-by-light modulation, switching, and logic. All-optical modulation requires an optical device whose output depends nonlinearly on its input. Strong all-optical modulation leads to all-optical switching, whereby light-signals turn other light-signals on and off. Finally, these all-optical switches form the basis for all-optical logic devices. Logic devices have the additional requirement that predefined input bits (logical 0’s and 1’s, determined by power or energy ranges) must undergo a Boolean logic operation to produce an output bit(s).

In this section, we will look at Sagnac interferometers as a prototype for all-optical modulation, switching, and logic. For simplicity, our analysis will be restricted to self-switching, which utilizes self-phase modulation. Similar results can be derived for cross-phase modulation [31]. We will demonstrate all-optical modulation in a Sagnac interferometer made using a silica nanowire. Lastly, we will end with a discussion of how to form all-optical logic gates using Sagnac interferometers.

4.4.1 Sagnac Interferometers

A Sagnac interferometer, as shown in Fig. 7.21, is an excellent configuration to demonstrate all-optical modulation, switching, and logic. These interferometers are perhaps the simplest nonlinear interferometer configuration achievable in a silica nanowire. These symmetric, self-balancing interferometers utilize counter-propagating waves and are therefore stable, immune to temperature gradients, and easy to work with [36, 46–48]. Sagnac interferometers provide a tunable balance between switching energy and device size simply by changing path lengths. These devices are non-resonant and therefore can sustain extremely high bit-rates [49]. Although we will only consider all-optical switching and logic as a final application, Sagnac interferometers are versatile, enabling operations including multiplexing/demultiplexing, switching and logic [50] and signal regeneration [51].

Fig. 7.21

Diagram of a Sagnac interferometer showing the input pulse, the coupling region, the clockwise and counter-clockwise paths as well as the transmitted pulse [23, 31]

To understand how a Sagnac interferometer works, we can consider a nonlinear interferometer operated with a quasi-continuous-wave pulse. We form the interferometer by making a loop as shown in Fig. 7.21, then input light into the waveguide on the left of the loop and observe the output in the waveguide to the right.

The linear version of a Sagnac interferometer acts as a mirror [31]. A pulse enters the input in Fig. 7.21, and then encounters a directional coupler. A fraction of the pulse energy couples into the adjacent waveguide, adding a π/2 phase-shift, then it propagates around the loop in the counter-clockwise direction. The other fraction remains within the same waveguide (no phase-shift), and proceeds in the clockwise direction. These two pulses collect identical amounts of phase as they traverse the loop and then recombine at the directional coupler. The output is a summation of the counter-clockwise pulse and the clockwise pulse. The counter-clockwise pulse picks up a second π/2 phase-shift at the directional coupler, resulting in a total phase shift of π relative to the clockwise pulse at the output. This phase-shift results in destructive interference at the output. Meanwhile, constructive interference occurs on the input side and the pulse is reflected back toward the input, creating a mirror. For a 50–50 directional coupler, the Sagnac interferometer acts as a “perfect mirror”. If the ratio is different from 50–50, part of the light will reach the output.

A Sagnac interferometer can be made nonlinear, sometimes referred to as a nonlinear optical loop mirror (NOLM) [37, 47, 48, 50, 52–60], by forming the loop using a nonlinear optical material. When traversing the loop, the clockwise and counter-clockwise pulses collect both linear and nonlinear phase. Both pulses collect identical linear phase. If the coupling ratio is anything other than 50–50, there will be a nonlinear phase difference between the pulses, causing power-dependent interference at the output. Just as we observe fringes in a Michelson interferometer when we change one of the path lengths, we observe power-dependent fringes in a Sagnac interferometer. These fringes form a basic, all-optical modulator that we can extend to perform all-optical switching and logic applications. The speed of these devices is only limited by the response time of the nonlinearity, although a delay in the device will occur as the pulses propagate. For a pure electronic nonlinearity, this response is on the order of several femtoseconds [5], making such a device compatible with Tb/s operations [49].

4.4.2 Analysis of a Sagnac Interferometer

The operation of a nonlinear Sagnac interferometer can be quite complex if we include all possible effects to correctly model pulse propagation; however, we can get a very good sense of how one works if we consider a quasi-continuous-wave pulse [31]. Essentially, we modulate a carrier wave with a square envelope, which simplifies the analysis to a continuous-wave approximation that uses peak powers (not pulse energies). For this approximation, we ignore dispersion and temporal phenomena, such as spectral broadening. For simplicity, we will also ignore cross-phase modulation between the counter-propagating pulses. This effect is negligible if the pulse durations are much shorter than the loop-propagation time. In the continuous-wave regime, cross-phase modulation is the dominant source of nonlinear phase; however, the results are identical [31].

Analysis in the quasi-continuous-wave regime first requires us to determine the power that is traveling in the clockwise versus the counter-clockwise directions. We define a coupling parameter \( \rho \) that represents the fraction of the power that stays within the input leg of the directional coupler and traverses the loop in the clockwise direction. Thus, \( \rho {P_{in}} \) is the clockwise power in the loop, corresponding to an electric field of magnitude \( \sqrt {\rho } {E_{in}} \), for an input power of \( {P_{in}} \) and input electric field of \( {E_{in}} \). Similarly, in the counter-clockwise direction, the power and electric field magnitudes are \( \left( {1 - \rho } \right){P_{in}} \) and \( \sqrt {{1 - \rho }} {E_{in}} \), respectively. When we build our devices, we can change \( \rho \) within the directional coupler by adjusting the separation between the two waveguides and/or adjusting the length of the parallel section (described using coupled-mode theory [3]).

Now, we determine the total phase accumulated in both directions, which requires that we define the length of the loop, L. We consider three sources of phase: from the directional coupler, linear propagation, and nonlinear propagation. Within the directional coupler, the power transferred from the input pulse to the counter-clockwise pulse receives a phase shift of π/2. This phase shift occurs a second time when the counter-clockwise pulse is transferred to the output, leading to a total phase shift of π for the counter-clockwise pulse only (the clockwise pulse remains in the original waveguide). Combining this phase-shift with the linear and nonlinear phase from Eq. (7.131), we obtain the total phase at the output for the clockwise and counter-clockwise pulses:

$$ {\varphi_{CW}} = {\varphi_L} + {\varphi_{CWNL}} = \beta L + \gamma \left( \omega \right)L\rho {P_{in}}, $$
(7.136)
$$ {\varphi_{CCW}} = {\varphi_L} + {\varphi_{CCWNL}} + {\varphi_{DC}} = \beta L + \gamma \left( \omega \right)L\left( {1 - \rho } \right){P_{in}} + 2\left( {\pi /2} \right). $$
(7.137)

Here, \( {\varphi_{CW}} \) is the total phase in the clockwise direction at the output, made up of a linear and a nonlinear contribution, \( {\varphi_L} \) and \( {\varphi_{CWNL}} \), respectively. In the counter-clockwise direction, these terms are \( {\varphi_{CCW}} \), \( {\varphi_L} \), and \( {\varphi_{CCWNL}} \), respectively. Additionally, there is a phase shift \( {\varphi_{DC}} \) from the directional coupler in the counter-clockwise direction.

Lastly, we sum the electric fields at the output to determine the transmission. The magnitude of the electric field for the clockwise pulse becomes \( \rho {E_{in}} \) at the output, having picked up an additional factor of \( \sqrt {\rho } \) from the directional coupler. Similarly, the magnitude of the electric field for the counter-clockwise pulse is \( \left( {1 - \rho } \right){E_{in}} \) at the output. The electric field at the output is:

$$ {E_{out}} = \rho {E_{in}}{e^{i{\varphi_{CW}}}} + \left( {1 - \rho } \right){E_{in}}{e^{i{\varphi_{CCW}}}} = \left( {\rho {E_{in}}{e^{i{\varphi_{CWNL}}}} - \left( {1 - \rho } \right){E_{in}}{e^{i{\varphi_{CCWNL}}}}} \right){e^{i{\varphi_L}}}. $$
(7.138)

We can solve this equation for the transmission to show:

$$ T = \frac{{{{\left| {{E_{out}}} \right|}^2}}}{{{{\left| {{E_{in}}} \right|}^2}}} = 1 - 2\rho \left( {1 - \rho } \right)\left\{ {1 + \cos \left[ {\left( {1 - 2\rho } \right)\gamma {P_{in}}L} \right]} \right\}. $$
(7.139)

From this equation, we see that if \( \rho \) is 0.5, the transmission is zero and therefore, we require that \( \rho \ne 0.5 \) to achieve modulation.

We can understand the nature of this device by plotting the output power, \( {P_{in}}T \), as a function of the input power, as shown in Fig. 7.22. We also note that we can change the spacing of the fringes relative to the input power, using the coupling parameter. For small values of \( \rho \), the fringes are closely spaced in power and the modulation is minimal. As the coupling parameter \( \rho \) approaches 0.5, the first minimum requires higher powers while the modulation increases.
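
The curves described above can be reproduced with a short sketch of Eq. (7.139). As a cross-check, the second function sums the fields of Eq. (7.138) directly, using the nonlinear phases from Eqs. (7.136) and (7.137) and dropping the common linear phase. The function names, the power range, and the product \( \gamma L \) used here are hypothetical.

```python
import numpy as np


def sagnac_transmission(p_in, rho, gamma, length):
    """Transmission of the nonlinear Sagnac loop, Eq. (7.139)."""
    p_in = np.asarray(p_in, dtype=float)
    phase = (1.0 - 2.0 * rho) * gamma * p_in * length
    return 1.0 - 2.0 * rho * (1.0 - rho) * (1.0 + np.cos(phase))


def sagnac_transmission_from_fields(p_in, rho, gamma, length):
    """Same quantity from the field sum in Eq. (7.138)."""
    p_in = np.asarray(p_in, dtype=float)
    phi_cw_nl = gamma * length * rho * p_in             # Eq. (7.136)
    phi_ccw_nl = gamma * length * (1.0 - rho) * p_in    # Eq. (7.137)
    # The coupler phase of pi appears as the minus sign.
    e_ratio = (rho * np.exp(1j * phi_cw_nl)
               - (1.0 - rho) * np.exp(1j * phi_ccw_nl))
    return np.abs(e_ratio) ** 2


# Output power versus input power for several coupling parameters.
p_in = np.linspace(0.0, 100.0, 501)     # W, hypothetical range
gamma, length = 1.0, 1.0                # hypothetical, gamma * L = 1 W^-1
curves = {rho: p_in * sagnac_transmission(p_in, rho, gamma, length)
          for rho in (0.1, 0.3, 0.45, 0.5)}
```

The entry for \( \rho = 0.5 \) is zero at every input power, which is the perfect-mirror case; the other entries show fringes whose spacing in power grows as \( \rho \) approaches 0.5.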

Fig. 7.22

Transmitted power as a function of input power for a typical nonlinear Sagnac interferometer [23, 31]

Although changing the coupling parameters is straightforward experimentally, simply making longer devices is not always possible and requires that we consider both linear and nonlinear losses. Linear losses, as we have seen previously, effectively limit the amount of nonlinear phase that we can accumulate. Similarly, two-photon absorption limits the highest power that we can effectively use.

In addition to losses, a real device must include the effects of the pulse shape as well as dispersion. When moving to pulsed operation without dispersion, the phase accumulated will be different across the pulse. The non-uniform phase will cause the peak of the pulse to switch before the front and back, creating pulse distortions, and lowering the modulation. With normal dispersion, a similar effect will occur. Even with all of these complications, by exploiting nonlinear optics in nano-scale waveguides, significant modulation with minimal pulse distortion is possible using solitons, as has been shown in fiber [36].

4.4.3 All-Optical Modulation Using Silica Nanowires

We have fabricated silica nanowires and formed Sagnac interferometers to demonstrate all-optical modulation. Using a 500-nm silica nanowire, we have observed all-optical modulation of femtosecond-pulses at a wavelength of 800 nm, as shown in Fig. 7.23 [23]. The approximate interaction length for this device is 500 μm and we modulate it using pulse energies up to 2 nJ.

Fig. 7.23

Experimental data for a silica nanowire-based Sagnac interferometer (triangles) and theoretical fit (solid line) [23]

We can fit this data to the model in Eq. (7.139) and estimate a coupling parameter of 0.08. Although we have achieved light-on-light modulation, our data shows that it is necessary to increase the coupling ratio in order to obtain stronger modulation.

4.4.4 All-Optical Switching

In all-optical switching devices, the power ratio between “on” and “off” states, as well as the power required for full modulation, are important design considerations. The ratio of the “on” to the “off” power is the depth of modulation or extinction ratio [2]. We can approximate the depth of modulation for our Sagnac interferometer by evaluating Eq. (7.139) when the argument of the cosine function is equal to π (the approximate location of the first maximum) and 2π (the approximate location of the first minimum). The extinction ratio is approximately given by:

$$ \frac{{{P_{\max }}}}{{{P_{\min }}}} \approx \frac{1}{{2{{\left( {1 - 2\rho } \right)}^2}}}. $$
(7.140)

For example, we find that the extinction ratio reaches 3 dB when \( \rho = 0.25 \) and 17 dB when \( \rho = 0.45 \).
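
These two values follow directly from Eq. (7.140); a minimal sketch of the arithmetic (the function names are hypothetical):

```python
import math


def extinction_ratio(rho):
    """Approximate P_max / P_min from Eq. (7.140)."""
    return 1.0 / (2.0 * (1.0 - 2.0 * rho) ** 2)


def to_decibels(ratio):
    """Convert a power ratio to dB."""
    return 10.0 * math.log10(ratio)


# rho = 0.25 gives a ratio of 2; rho = 0.45 gives a ratio of 50.
er_db_025 = to_decibels(extinction_ratio(0.25))
er_db_045 = to_decibels(extinction_ratio(0.45))
```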

Stronger modulation requires higher input powers or longer length devices; therefore, we consider this trade-off by observing the power required to reach the first minimum, given by:

$$ {P_{switch}} \approx \frac{{2\pi }}{{\left( {1 - 2\rho } \right)L\gamma }}. $$
(7.141)

As we approach a coupling ratio of 0.5, the required power for full switching increases as \( {\left( {1 - 2\rho } \right)^{ - 1}} \). Comparing this expression to Eq. (7.140), as \( \rho \) approaches 0.5, the extinction ratio improves as \( {\left( {1 - 2\rho } \right)^{ - 2}} \). Therefore, we can achieve a substantially increased extinction ratio for only a moderate operational power increase. In addition, we can design a longer device to offset the additional power requirements.
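
The trade-off between Eqs. (7.140) and (7.141) can be made concrete by comparing two coupling parameters. In this sketch the function names and the values of `gamma` and `length` are hypothetical; only the ratios matter.

```python
import math


def switching_power(rho, gamma, length):
    """Power at the first minimum, Eq. (7.141)."""
    return 2.0 * math.pi / ((1.0 - 2.0 * rho) * length * gamma)


def extinction_ratio(rho):
    """Approximate P_max / P_min from Eq. (7.140)."""
    return 1.0 / (2.0 * (1.0 - 2.0 * rho) ** 2)


gamma = 0.5        # W^-1 m^-1, hypothetical
length = 1e-3      # m, hypothetical

# Moving rho from 0.25 to 0.45 raises the switching power as (1 - 2 rho)^-1
# and the extinction ratio as (1 - 2 rho)^-2.
power_increase = (switching_power(0.45, gamma, length)
                  / switching_power(0.25, gamma, length))
extinction_increase = extinction_ratio(0.45) / extinction_ratio(0.25)

# Doubling the device length halves the required switching power.
length_benefit = (switching_power(0.45, gamma, 2.0 * length)
                  / switching_power(0.45, gamma, length))
```

For these two coupling parameters, the switching power rises by a factor of 5 while the extinction ratio improves by a factor of 25.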

4.4.5 All-Optical Logic

Now we show how to form all-optical logic gates using a nonlinear Sagnac interferometer. The nonlinear optical loop mirror serves as a prototypical all-optical logic-gate for future communications systems. These and similar devices have been shown using bulk fiber [49, 50, 61, 62] as well as using active nonlinear devices [51]. Our configuration begins by adding two (or more) inputs before our Sagnac interferometer, as shown in Fig. 7.24 (top). For each logical operation, we must define logical 1’s and 0’s for both the input and the output.

Fig. 7.24

All-optical XOR logic gate. Physical layout (top), input versus output for the Sagnac interferometer, showing logical 0’s and 1’s (bottom left), truth table for the resulting XOR gate (bottom right) [23]

As a first example, we will explore an exclusive-OR (XOR) gate, having the truth table shown in Fig. 7.24 (bottom right). We observe the power-dependent transmission plot in Fig. 7.24 (bottom left), using a coupling ratio of 0.45 to define our logical 1’s and 0’s. On the input, a logical 1 has an input power greater than 28 W. On the output, we define a logical 1 as an output power greater than 18 W. Using the Sagnac’s transmission function, we can construct the truth table. If neither A nor B is present at the input, the device produces a logical 0. If only input A or input B is present, the output is close to the first maximum, resulting in a logical 1. If both A and B are present at the input, the total input power is close to the first minimum, and the device produces a logical 0. This demonstrates the truth table for an XOR gate.
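The construction of this truth table can be sketched numerically. The code below is an illustration under stated assumptions, not the calculation behind Fig. 7.24. It assumes the standard nonlinear-loop-mirror transmission, \( P_{out} = P_{in}\{1 - 2\rho(1-\rho)[1 + \cos((1-2\rho)\gamma L P_{in})]\} \), which reproduces Eq. (7.140). It also assumes that the input powers simply add, and that the product \( \gamma L \) (`GAMMA_L`) is chosen so that a single 28 W input sits exactly at the first maximum. All names are illustrative.

```python
import math

RHO = 0.45            # coupling ratio used for the XOR gate
P_LOGIC_IN = 28.0     # W, input power representing a logical 1
P_THRESHOLD_OUT = 18.0  # W, output power above which we read a logical 1

# Assumption: gamma * L is set so that one 28 W input gives a cosine argument of pi.
GAMMA_L = math.pi / ((1.0 - 2.0 * RHO) * P_LOGIC_IN)  # 1/W


def sagnac_output(p_in: float, rho: float = RHO, gamma_l: float = GAMMA_L) -> float:
    """Transmitted power for the assumed loop-mirror transmission function."""
    phase = (1.0 - 2.0 * rho) * gamma_l * p_in
    return p_in * (1.0 - 2.0 * rho * (1.0 - rho) * (1.0 + math.cos(phase)))


def xor_gate(a: int, b: int) -> int:
    """Map logical inputs to a logical output via the Sagnac transmission."""
    p_in = P_LOGIC_IN * (a + b)  # assumption: the two input powers add
    return int(sagnac_output(p_in) > P_THRESHOLD_OUT)


truth_table = [(a, b, xor_gate(a, b)) for a in (0, 1) for b in (0, 1)]

if __name__ == "__main__":
    for a, b, out in truth_table:
        print(f"A={a} B={b} -> output {out}")
```

Under these assumptions, a single input is fully transmitted (28 W out), while two inputs together reach the first minimum and transmit only about 0.56 W, well below the 18 W threshold.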

We can construct other gates by adjusting the coupling coefficient, and possibly adding a clock signal. We are particularly interested in constructing a “universal” logic gate. Such gates can be combined to form the entire set of Boolean logic operations. One particular universal logic gate we can construct using a Sagnac interferometer is a NAND gate. A NAND gate has the truth table shown in Fig. 7.25 (bottom right). As we can see, this gate is more difficult to create than an XOR gate because we require a logical 1 when there is no input. Therefore, we add an additional input to our Sagnac interferometer and deliver a clock pulse with every set of input pulses. By adjusting our coupling parameter to 0.435, we find that with no input at either A or B, the clock pulse produces a logical 1 at the output. For an input in either A or B, the output of the device is around 15 W, constituting a logical 1 at the output. Now, if both A and B are present, the device reaches the second minimum, resulting in a logical 0. This prototypical device shows how we can construct an all-optical NAND gate and thereby create all other Boolean logic gates.
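The claim that a NAND gate suffices to build all other Boolean gates can be illustrated at the purely logical level. The sketch below treats `nand` as an ideal Boolean primitive; it does not model the optical device, its power levels, or the clock pulse. It composes NOT, AND, OR, and XOR from NAND alone using standard constructions, and the function names are illustrative.

```python
def nand(a: int, b: int) -> int:
    """Ideal NAND primitive on bits 0/1 (stands in for the optical gate)."""
    return 1 - (a & b)


def not_(a: int) -> int:
    """NOT from a single NAND with both inputs tied together."""
    return nand(a, a)


def and_(a: int, b: int) -> int:
    """AND as an inverted NAND."""
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    """OR via De Morgan: a OR b = NAND(NOT a, NOT b)."""
    return nand(not_(a), not_(b))


def xor_(a: int, b: int) -> int:
    """XOR from four NAND gates."""
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


if __name__ == "__main__":
    print("A B | NAND AND OR XOR")
    for a in (0, 1):
        for b in (0, 1):
            print(f"{a} {b} |   {nand(a, b)}    {and_(a, b)}   {or_(a, b)}   {xor_(a, b)}")
```

In an optical implementation, each stage would also need to restore the signal to the power levels that define logical 1's and 0's at the next gate's input, which this logical sketch does not address.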

Fig. 7.25

All-optical NAND logic gate. Physical layout (top), input versus output for the Sagnac interferometer, showing logical 0’s and 1’s (bottom left), truth table for the resulting NAND gate (bottom right) [23]

5 Summary

We have shown the advantages of using sub-wavelength dielectric waveguides for nonlinear optics. We have developed a theoretical basis for linear and nonlinear propagation of continuous-wave and pulsed optical signals in free space, within materials, and in waveguides. We have seen how nanoscale, high-index-contrast waveguides are capable of enhancing nonlinear interactions spatially, using strong confinement, and in time, using tunable dispersion. To demonstrate these effects in a physical system, we have fabricated silica nanowires, explored their mechanical properties, and observed their linear and nonlinear optical properties. Lastly, we have developed two devices, a supercontinuum source and an all-optical modulator, which serve as the foundation for more advanced nonlinear devices, including all-optical logic devices, both in silica and in future material systems.