1 History of Classical Device Simulation

The success of microelectronic technology has been partly enabled and supported by sophisticated technology computer-aided design (TCAD) tools, which are used to assist in development and engineering at practically all stages from process definition to circuit optimization. The reduction in technology development cost and time achieved through the use of TCAD has been estimated at around 40% [1].

In the early days of transistor technology, relatively simple compact models based on the drift-diffusion formalism, idealized doping profiles, and quasi-one-dimensional assumptions produced sufficiently accurate results. To reduce the cost of chip manufacturing, the number of devices on a chip has been continually increased by increasing the chip size and decreasing the size of the individual devices. For these devices, the previously valid simplifying assumptions were increasingly violated, thus requiring a more rigorous description. Development of compact models, though still possible and of fundamental importance, became more complicated, requiring long development times and continuous modifications.

The drift-diffusion (DD) model was first presented by van Roosbroeck in 1950 [2]. Instead of a closed-form compact model, the basic semiconductor equations can be solved directly for a given doping profile and geometry. The same model was later derived from the Boltzmann transport equation (BTE) by the method of moments [3] or from basic principles of irreversible thermodynamics [4, 5]. The self-consistent numerical solution of carrier transport models dates back to the seminal work of Scharfetter and Gummel [6, 7] and eliminates many of the assumptions required for compact models. Due to their simplicity and excellent numerical properties, the drift-diffusion equations have become the workhorse for most applications in technology computer-aided design.

In the 1980s, the first tools appeared which offered a solution of the drift-diffusion equations on two-dimensional domains. The first monograph comprehensively covering aspects from modeling and discretization to applications was that of Selberherr [3], published in 1984. Since then, current transport models have been continuously refined and extended to more accurately capture transport phenomena occurring in state-of-the-art semiconductor devices.

The steady need for refinement and extension is caused by the ongoing feature size reduction in technology. As the supply voltages cannot be scaled accordingly without compromising the circuit performance, the electric fields inside the devices have increased. Large electric fields, which rapidly change over small length scales, give rise to nonlocal and hot-carrier effects which begin to dominate device performance. These phenomena are a concern for industrial applications, and an accurate description is required.

To deal with nonlocal and hot-carrier effects, extended models have been proposed which consider the carrier energy as an independent solution variable. The most prominent examples of such models are the hydrodynamic and energy transport (ET) models. These models are based on the work of Stratton [8] and Bløtekjær [9] who solved the BTE using the method of moments with three or four moments. Many variants of these models have been published, which are reviewed in [10]. However, detailed examinations in the deep submicron area show that using three or four moments is still not sufficient for specific problems which depend on high-energy tails. One method for improving the approximation is the six-moments (SM) model [11].

Figure 37.1 gives an overview of transport models in use for modern TCAD. The most accurate methods solve the Boltzmann equation directly, either by Monte Carlo methods or by the spherical harmonic expansion. Yet the computational costs of these methods are challenging even today, and in many cases, simpler moment-based methods are preferred. In addition, for the simulation of small state-of-the-art devices, quantum effects have to be included.

Fig. 37.1 Hierarchy of transport models

This handbook chapter is concerned with macroscopic transport models based on the method of moments as used in classical device simulation. Other chapters in the handbook specialize in modeling with the spherical harmonic expansion, the Monte Carlo simulation method, and quantum transport modeling. Classical monographs on macroscopic transport models are [3] and [12]. A modern mathematical reference on macroscopic transport models is [13]. A review article about general classical device simulation is [5]. The limits of validity for classical macroscopic models and models for nanoscale device simulation beyond these limits are discussed in [14].

1.1 Structure of the Review

Following the historical notes in Sect. 37.1, we start our exposition in Sect. 37.2 with a phenomenological derivation of the drift-diffusion model which is the simplest reasonable macroscopic transport model. The drift-diffusion model assumes an equilibrium between carrier energy and electric field, which however is no longer valid in modern electronic devices. Several assumptions are required for its validity to hold, and a heuristic derivation is instructive for a good physical understanding.

The theoretical cornerstone of classical device simulation is the Boltzmann equation which is discussed in Sect. 37.3. In classical device simulation, electron transport is dominated by the scattering operator. We introduce the relaxation time approximation in the Boltzmann equation which allows an analytical treatment of the scattering term.

Macroscopic transport models based on the Boltzmann equation are derived by the method of moments, which is a powerful approach to obtain a series of weighted balance and flux equations. This approach is studied in Sect. 37.4. To enable an analytical treatment, we assume parabolic bands and give the corresponding equations for the first four moments. Following Bløtekjær’s approach, we derive a hydrodynamic model based on the first three moments by assuming an isotropic thermal energy tensor and a phenomenological closure for the heat flow.

In Sect. 37.5, we derive energy transport models starting from the hydrodynamic model by making additional simplifications, which are known as the diffusion approximation. To motivate the diffusion approximation, the drift-diffusion equations are derived from the Boltzmann equation using the method of moments. The assumptions necessary for the derivation are analyzed.

The validity of drift-diffusion and energy transport models is investigated theoretically and numerically in Sect. 37.6. Differences and common features of transport models can be understood by looking closely at the assumptions used in their derivation from the Boltzmann equation. Based on this analysis, we discuss critical issues for the validity of energy transport and hydrodynamic models. The highest-order even moment available in typical energy transport models is the temperature. Based on Monte Carlo simulation data, it is demonstrated that the energy distribution in \(n^+nn^+\) test structures strongly deviates from a heated Maxwellian distribution for channel lengths below 100 nm. We then compare results from drift-diffusion and energy transport models with Monte Carlo simulations.

A natural extension of energy transport and hydrodynamic models is the inclusion of higher-order moments. A transport model based on six moments of the distribution function is presented in Sect. 37.7. This model does not assume any particular shape of the distribution function. The additional even order moment, the kurtosis of the distribution function, is shown to contain fundamental information about the shape of the distribution function.

In Sect. 37.8, we present applications of higher-order transport models to \(n^+nn^+\) test structures and double-gate MOSFETs. We compare results of the energy transport model and the six-moment model with Monte Carlo simulation results and discuss short-channel effects.

Finally, we close the review by summarizing our results in Sect. 37.9.

2 The Phenomenological Drift-Diffusion Model

In this section, the drift-diffusion current relations will be derived using a phenomenological approach. This set of equations was first presented by van Roosbroeck in 1950 [2]. Together with the Poisson equation and the carrier continuity equations, these equations constitute the basic semiconductor equations, the simplest reasonable transport model possible [3]. The presentation in this section conceptually follows [15].

2.1 Poisson’s Equation and the Continuity Equation

The Poisson equation is a common constituent to all charge transport models for semiconductor devices and serves as the link between the electrostatic potential and the charge distribution within a semiconductor device.

In the eddy-current-free quasi-stationary case (electro-quasistatic model), which typically holds true for semiconductor devices [16,17,18], the electric field E is expressed as the gradient of a scalar potential ψ.

$$\displaystyle \begin{aligned} \mathbf{E} = - {\boldsymbol{\nabla} \psi} {} \end{aligned} $$
(37.1)

Using Gauss’s law and the proper material equation for the quasi-stationary case one obtains the Poisson equation:

$$\displaystyle \begin{aligned} {\boldsymbol{\nabla} \cdot (\varepsilon \, {\boldsymbol{\nabla} \psi})} =- \rho {} \end{aligned} $$
(37.2)

ε is the dielectric permittivity (assumed to be scalar hereafter). In semiconductors, two separate particle systems are responsible for charge transport, namely, electrons (n) and holes (p). Neglecting defects and fixed charges, the total space charge density in semiconductors is primarily composed of the charges of electrons, the holes, and the ionized dopant atoms:

$$\displaystyle \begin{aligned} \rho = {q}\mkern2mu (p + N_{\mathrm{D}}^+) - {q}\mkern2mu (n + N_{\mathrm{A}}^-) ={q}\mkern2mu (p-n+N) {} \end{aligned} $$
(37.3)

Here \(N = N_{\mathrm{D}}^+ - N_{\mathrm{A}}^-\) denotes the net concentration of ionized dopants. In order to obtain a complete description, the Poisson equation has to be solved self-consistently with the transport equation system.
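To illustrate how (37.2) is used numerically, the following minimal Python sketch discretizes the one-dimensional Poisson equation with constant permittivity on a uniform grid and solves the resulting tridiagonal system. The function name `solve_poisson_1d`, the grid, and the numerical values (a uniform charge density corresponding to 10^16 cm^-3 ionized donors in a 1 µm silicon slab with grounded ends) are assumptions chosen for illustration only; they are not taken from the chapter.

```python
Q = 1.602176634e-19               # elementary charge in C
EPS_SI = 11.7 * 8.8541878128e-12  # assumed permittivity of silicon in F/m

def solve_poisson_1d(rho, eps, dx, psi_left=0.0, psi_right=0.0):
    """Solve eps * d2(psi)/dx2 = -rho at the interior nodes of a uniform grid.

    rho lists the space charge density (C/m^3) at the interior nodes;
    psi_left and psi_right are Dirichlet boundary values in V.
    """
    n = len(rho)
    # Three-point stencil: psi[i-1] - 2 psi[i] + psi[i+1] = -rho[i] dx^2 / eps
    rhs = [-r * dx * dx / eps for r in rho]
    rhs[0] -= psi_left      # known boundary values move to the right-hand side
    rhs[-1] -= psi_right
    # Thomas algorithm for the tridiagonal matrix with entries (1, -2, 1)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = 1.0 / -2.0
    d_prime[0] = rhs[0] / -2.0
    for i in range(1, n):
        denom = -2.0 - c_prime[i - 1]
        c_prime[i] = 1.0 / denom
        d_prime[i] = (rhs[i] - d_prime[i - 1]) / denom
    psi = [0.0] * n
    psi[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        psi[i] = d_prime[i] - c_prime[i] * psi[i + 1]
    return psi

# Uniformly charged slab with grounded ends (illustrative values).
length = 1.0e-6                   # slab length in m
n_nodes = 99                      # number of interior nodes
dx = length / (n_nodes + 1)
rho0 = Q * 1.0e22                 # 1e16 cm^-3 expressed in m^-3
psi = solve_poisson_1d([rho0] * n_nodes, EPS_SI, dx)

# Analytic solution for constant rho: psi(x) = rho0 / (2 eps) * x * (length - x)
x = [(i + 1) * dx for i in range(n_nodes)]
exact = [rho0 / (2.0 * EPS_SI) * xi * (length - xi) for xi in x]
max_error = max(abs(a - b) for a, b in zip(psi, exact))
```

For a constant charge density the three-point stencil reproduces the parabolic solution up to rounding, which makes this a convenient sanity check. In a device simulator, rho depends on n and p, so this solve would be only one step of the self-consistent iteration.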

Charge transport is described by the current continuity equation which expresses local charge conservation:

$$\displaystyle \begin{aligned} {\boldsymbol{\nabla} \cdot \mathbf{J}} + {{\partial_{t}} \rho} = 0 {} \end{aligned} $$
(37.4)

Like the Poisson equation, the continuity equation can be derived from Maxwell’s equations. The conduction current density J can be written as the sum of two components:

$$\displaystyle \begin{aligned} \mathbf{J} = {\mathbf{J}}_{\mathrm{n}} + {\mathbf{J}}_{\mathrm{p}} {} \end{aligned} $$
(37.5)

Jn and Jp are the electron and hole current densities, respectively.

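It is also convenient to split the current continuity equation (37.4) into two equations by introducing a formal separation parameter R:

$$\displaystyle \begin{aligned} {\boldsymbol{\nabla} \cdot {\mathbf{J}}_{\mathrm{n}}} - {q} \, {{\partial_{t}} n} = {q} \, R {} \end{aligned} $$
(37.6)
$$\displaystyle \begin{aligned} {\boldsymbol{\nabla} \cdot {\mathbf{J}}_{\mathrm{p}}} + {q} \, {{\partial_{t}} p} = - {q} \, R {} \end{aligned} $$
(37.7)

Adding (37.6) and (37.7) recovers (37.4) for a time-independent net doping. R can be interpreted as the net recombination rate defined as

$$\displaystyle \begin{aligned} R = R_{\mathrm{n}} - G_{\mathrm{n}} = R_{\mathrm{p}} - G_{\mathrm{p}} {} \end{aligned} $$
(37.8)

where Rn/Gn and Rp/Gp are the recombination/generation rates for electrons and holes, respectively.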

2.2 Drift and Diffusion Current

The simplest current transport model at hand is based on the so-called drift-diffusion assumption. This model distinguishes between two charge carrier transport mechanisms: the drift of charge carriers due to an electric field caused by a gradient in the electric potential and the diffusion of the charge carriers due to a spatial gradient in the charge carrier concentration.

2.2.1 Drift Current

The component of the current which is caused by the electric field is called drift current. From a macroscopic point of view, the current density and the electric field are related by Ohm’s law:

$$\displaystyle \begin{aligned} {\mathbf{J}}^{\mathrm{drift}} = \sigma \, \mathbf{E} {} \end{aligned} $$
(37.9)

σ is the electric conductivity.

Semiconductors are in principle anisotropic due to their crystal structure. However, due to symmetry properties, especially in the case of germanium and silicon, the anisotropy of the conductivity is very small, and the electric conductivity can be well approximated by a scalar.

From a microscopic point of view, the current density can be expressed as

$$\displaystyle \begin{aligned} {\mathbf{J}}^{\mathrm{drift}} = {q} \, p \, \bar{\mathbf{v}}_{\negthinspace \mathrm{p}} - {q} \, n \, \bar{\mathbf{v}}_{\negthinspace \mathrm{n}} \ {} \end{aligned} $$
(37.10)

where q is the elementary charge, n and p the electron and hole concentrations, and \(\bar {\mathbf {v}}_{\negthinspace \mathrm {n}}\) and \(\bar {\mathbf {v}}_{\negthinspace \mathrm {p}}\) the mean velocities of electrons and holes, respectively. The electric field accelerates the carriers, but due to various scattering mechanisms, the velocity of the carriers in a constant field is bounded and is proportional to the electric field:

$$ \begin{array}{lll} \bar{\mathbf{v}}_{\negthinspace \mathrm{n}} & =&\displaystyle - \mu_{\mathrm{n}}(\mathbf{E}) \, \mathbf{E} \end{array} $$
(37.11)
$$ \begin{array}{lll} \bar{\mathbf{v}}_{\negthinspace \mathrm{p}} & =&\displaystyle \mu_{\mathrm{p}}(\mathbf{E}) \, \mathbf{E} \end{array} $$
(37.12)

μn and μp are the respective mobilities of electrons and holes. In order to improve the applicability range of the drift current expression, a field-dependent mobility is generally used (see Sect. 37.2.4). By inserting these equations into (37.10), we get

$$\displaystyle \begin{aligned} {\mathbf{J}}^{\mathrm{drift}} = {q} \, p \, \mu_{\mathrm{p}} \, \mathbf{E} + {q} \, n \, \mu_{\mathrm{n}} \, \mathbf{E} . {} \end{aligned} $$
(37.13)

Comparing (37.13) with (37.9) and (37.5), the following relations are obtained:

$$ \begin{array}{lll} {\mathbf{J}}^{\mathrm{drift}}_{\mathrm{n}} & =&\displaystyle \sigma_{\mathrm{n}} \, \mathbf{E}\end{array} $$
(37.14)
$$ \begin{array}{lll} {\mathbf{J}}^{\mathrm{drift}}_{\mathrm{p}} & =&\displaystyle \sigma_{\mathrm{p}} \, \mathbf{E} \end{array} $$
(37.15)

The conductivities are given as

$$ \begin{array}{lll} \sigma_{\mathrm{n}} & =&\displaystyle {q} \, n \, \mu_{\mathrm{n}} \end{array} $$
(37.16)
$$ \begin{array}{lll} \sigma_{\mathrm{p}} & =&\displaystyle {q} \, p \, \mu_{\mathrm{p}}. \end{array} $$
(37.17)
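As a small numerical illustration of (37.13), (37.16), and (37.17), the following Python sketch evaluates the conductivities and the resulting drift current density. The function names and all numbers (doping level, mobilities, field) are assumed, order-of-magnitude values for n-type silicon at room temperature, not data from the chapter.

```python
Q = 1.602176634e-19  # elementary charge in C

def conductivities(n, p, mu_n, mu_p):
    """Return (sigma_n, sigma_p) in S/cm following (37.16) and (37.17).

    n and p are carrier concentrations in cm^-3;
    mu_n and mu_p are mobilities in cm^2/(V s).
    """
    return Q * n * mu_n, Q * p * mu_p

def drift_current_density(n, p, mu_n, mu_p, e_field):
    """Total drift current density in A/cm^2 for a field in V/cm, cf. (37.13)."""
    sigma_n, sigma_p = conductivities(n, p, mu_n, mu_p)
    return (sigma_n + sigma_p) * e_field

# Illustrative (assumed) values: n = 1e16 cm^-3, p = 1e4 cm^-3.
sigma_n, sigma_p = conductivities(n=1e16, p=1e4, mu_n=1200.0, mu_p=450.0)
j_drift = drift_current_density(1e16, 1e4, 1200.0, 450.0, e_field=100.0)
```

With these numbers the minority-carrier contribution is many orders of magnitude smaller than the majority-carrier one, so the drift current is carried almost entirely by electrons.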

2.2.2 Diffusion Current

The component of the current which is caused by the thermal motion of the carriers is called diffusion current. It is driven by a gradient in the carrier concentration. The law of diffusion, which originally stems from the theory of dilute gases, defines the flux densities:

$$ \begin{array}{lll} {\mathbf{F}}_{\mathrm{n}} & =&\displaystyle - D_{\mathrm{n}} \, {\boldsymbol{\nabla} n}\end{array} $$
(37.18)
$$ \begin{array}{lll} {\mathbf{F}}_{\mathrm{p}} & =&\displaystyle - D_{\mathrm{p}} \, {\boldsymbol{\nabla} p} \end{array} $$
(37.19)

Here, Dn and Dp are the diffusion coefficients for electrons and holes. The particle flux densities Fn and Fp have to be multiplied by the charge of the particle to get the electric current densities:

$$ \begin{array}{lll} {\mathbf{J}}^{\mathrm{diffusion}}_{\mathrm{n}} & =&\displaystyle - {q} \, {\mathbf{F}}_{\mathrm{n}}\end{array} $$
(37.20)
$$ \begin{array}{lll} {\mathbf{J}}^{\mathrm{diffusion}}_{\mathrm{p}} & =&\displaystyle {q} \, {\mathbf{F}}_{\mathrm{p}} \end{array} $$
(37.21)

2.3 The Semiconductor Equations

Superposition of the current components yields the drift-diffusion current relations:

$$ \begin{array}{lll} {\mathbf{J}}_{\mathrm{n}} & =&\displaystyle {q} \, n \, \mu_{\mathrm{n}} \mathbf{E} \,+\, {q} \, D_{\mathrm{n}} {\boldsymbol{\nabla} n}\end{array} $$
(37.22)
$$ \begin{array}{lll} {\mathbf{J}}_{\mathrm{p}} & =&\displaystyle {q} \, p \, \mu_{\mathrm{p}} \mathbf{E} \,-\, {q} \, D_{\mathrm{p}} {\boldsymbol{\nabla} p}\end{array} $$
(37.23)

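For conditions close to thermal equilibrium and for nondegenerate carrier systems where Boltzmann statistics are valid, the diffusion coefficients are related to the mobilities by the Einstein relation:

$$\displaystyle \begin{aligned} D_{\mathrm{n}} = \mu_{\mathrm{n}} \, \frac{k_{\mathrm{B}} \, T_{\mathrm{n}}}{{q}} {} \end{aligned} $$
(37.24)
$$\displaystyle \begin{aligned} D_{\mathrm{p}} = \mu_{\mathrm{p}} \, \frac{k_{\mathrm{B}} \, T_{\mathrm{p}}}{{q}} {} \end{aligned} $$
(37.25)

kB is Boltzmann’s constant, and Tn and Tp are the temperatures of electrons and holes.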

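Since no information about the carrier temperature is available, the carrier temperature is set equal to the lattice temperature TL, assuming the thermal equilibrium approximation [19]. We finally write the equations for the electrons as:

$$\displaystyle \begin{aligned} {\boldsymbol{\nabla} \cdot {\mathbf{J}}_{\mathrm{n}}} - {q} \, {{\partial_{t}} n} = {q} \, R {} \end{aligned} $$
(37.26)
$$\displaystyle \begin{aligned} {\mathbf{J}}_{\mathrm{n}} = {q} \, n \, \mu_{\mathrm{n}} \mathbf{E} \,+\, \mu_{\mathrm{n}} \, k_{\mathrm{B}} \, T_{\mathrm{L}} {\boldsymbol{\nabla} n} {} \end{aligned} $$
(37.27)

The analogous equations for holes, namely

$$\displaystyle \begin{aligned} {\boldsymbol{\nabla} \cdot {\mathbf{J}}_{\mathrm{p}}} + {q} \, {{\partial_{t}} p} = - {q} \, R {} \end{aligned} $$
(37.28)
$$\displaystyle \begin{aligned} {\mathbf{J}}_{\mathrm{p}} = {q} \, p \, \mu_{\mathrm{p}} \mathbf{E} \,-\, \mu_{\mathrm{p}} \, k_{\mathrm{B}} \, T_{\mathrm{L}} {\boldsymbol{\nabla} p} {} \end{aligned} $$
(37.29)

are added to complete the system. Together with Poisson’s equation (37.2), the set of basic semiconductor equations is obtained.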

The drift-diffusion scheme takes only local quantities into account. As such, it completely neglects nonstationary transport effects which occur in response to a sudden variation of the electric field, either in time or in space. A robust discretization of the drift-diffusion equations was proposed by Scharfetter and Gummel [6, 7], which is still in use today.
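The idea behind the Scharfetter-Gummel discretization can be sketched in a few lines of Python. The sketch assumes the widely used textbook form of the scheme, in which the electron current between two neighboring grid points is taken as constant, the potential as linear, and the Einstein relation at the lattice temperature is used; it is not meant to reproduce the exact formulation of [6, 7]. The function names, the SI unit convention, and the default temperature are assumptions made for this illustration.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K
Q = 1.602176634e-19  # elementary charge in C

def bernoulli(x):
    """Bernoulli function B(x) = x / (exp(x) - 1), continued smoothly at x = 0."""
    if abs(x) < 1e-10:
        return 1.0 - 0.5 * x          # first-order Taylor expansion near zero
    return x / math.expm1(x)          # may overflow for extremely large x

def sg_electron_current(n_i, n_ip1, psi_i, psi_ip1, h, mu_n, t_lattice=300.0):
    """Electron current density (A/m^2) between grid points i and i+1.

    n_i, n_ip1: electron concentrations in m^-3
    psi_i, psi_ip1: electrostatic potentials in V
    h: grid spacing in m, mu_n: mobility in m^2/(V s)
    """
    v_t = K_B * t_lattice / Q         # thermal voltage k_B T_L / q
    d_n = mu_n * v_t                  # Einstein relation with T_n = T_L
    delta = (psi_ip1 - psi_i) / v_t   # normalized potential drop
    return Q * d_n / h * (n_ip1 * bernoulli(delta) - n_i * bernoulli(-delta))
```

Two limits make the behavior transparent. For a vanishing potential difference, both Bernoulli factors equal one and the expression reduces to the central-difference diffusion current. If the concentrations follow the equilibrium relation n_ip1 = n_i exp(delta), the two terms cancel and the current vanishes, as required by (37.22) in equilibrium.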

As one of the basic limitations of the drift-diffusion model, it should be pointed out that the Einstein relations are only valid near equilibrium and that the diffusivities are largely underestimated if the lattice temperature rather than the carrier temperature is used [20, p.145]. In order to account for hot-carrier effects, the average energy can be estimated via the homogeneous energy balance equation, which can be derived from the homogeneous limit of an energy transport model based on the Boltzmann equation (see Sect. 37.5.4). However, for rapidly increasing electric fields, the average energy lags behind the electric field and can be considerably smaller than the one predicted by the homogeneous energy balance equation. Purely phenomenological approaches to describe heat transport in semiconductors are available [5], but are not further discussed here.

2.4 Parameter Modeling

Even though this set of basic semiconductor equations is complete, it cannot be solved without a further description of the material parameters, namely the mobilities μn and μp and the generation/recombination rate R.

2.4.1 Mobility

The carrier mobilities in semiconducting materials are determined by various physical mechanisms, because the charge carriers are scattered by thermal lattice vibrations, ionized impurities, neutral impurities, vacancies, interstitials, dislocations, surfaces, and each other.

The engineering approach to take all this into account is to use fitted analytical models based on measurements or to extract parameters from sophisticated Monte Carlo simulations. Rigorous first-principles models for the carrier mobilities exist, but they are complicated and hard to implement.

In the drift-diffusion equations, mobility models depending on the electric field are usually employed. Various models have been developed; examples can be found in [21,22,23]. For weak electric fields, the mobility is constant with respect to the field, and therefore the relation between the velocity and the electric field is linear. For large electric fields, the relationship begins to deviate from linear, and the velocity saturates for very high fields. The maximum velocity observed in bulk silicon measurements is the saturation velocity \(v_{\mathrm{sat}}\), which requires the mobility to approach the form \(\mu \rightarrow v_{\mathrm{sat}}/E\) at high fields. In silicon, the value of \(v_{\mathrm{sat}}\) is about \(10^{7}\) cm/s [24]. Simulation tools commonly differentiate between low- and high-field mobilities and allow the user to select the models independently.
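A field-dependent mobility with the two limits described above can be sketched as follows. The sketch uses a Caughey-Thomas-type expression as one commonly used example; it is not necessarily the form used in [21,22,23], and the low-field mobility and the exponent beta are assumed, illustrative values.

```python
def caughey_thomas_mobility(e_field, mu_low, v_sat, beta):
    """Field-dependent mobility in cm^2/(V s).

    e_field: electric field in V/cm
    mu_low: low-field mobility in cm^2/(V s)
    v_sat: saturation velocity in cm/s
    beta: fitting exponent controlling the transition between the two limits
    """
    x = mu_low * abs(e_field) / v_sat
    return mu_low / (1.0 + x**beta) ** (1.0 / beta)

# Assumed parameters loosely representative of electrons in silicon.
MU_LOW = 1400.0   # cm^2/(V s)
V_SAT = 1.0e7     # cm/s
BETA = 2.0

for field in (1.0e2, 1.0e4, 1.0e6):   # fields in V/cm
    mu = caughey_thomas_mobility(field, MU_LOW, V_SAT, BETA)
    print(f"E = {field:.0e} V/cm  mu = {mu:8.2f} cm^2/(V s)  v = {mu * field:.3e} cm/s")
```

At low fields the mobility stays at its low-field value, so the velocity grows linearly with the field. At high fields the product of mobility and field approaches the saturation velocity from below, which is the behavior \(\mu \rightarrow v_{\mathrm{sat}}/E\) stated in the text.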

However, as will be shown below, the carrier mobility does not depend on E, but is a functional of the carrier distribution function. Mobility models in higher-order transport models can use more information from the distribution function. In energy transport models, for example, the carrier temperature can be used as a solution variable. As a consequence, effects like the velocity overshoot can be approximately described.

2.4.2 Carrier Generation and Recombination

The recombination rate R was formally introduced in Sect. 37.2.1 by splitting the continuity equation into two individual parts for electrons and holes. From a physical point of view, this term includes the generation and the recombination of electron-hole pairs.

Various physical mechanisms can cause the generation/recombination of an electron-hole pair. For instance, the absorption or emission of a photon, the absorption or emission of a phonon, three particle transitions, and transitions assisted by recombination centers can occur. The impact of these mechanisms depends on the operation conditions and the properties of the employed materials.

One important generation/recombination process is the well-known Shockley-Read-Hall (SRH) mechanism [25, 26], which describes a two-step phonon transition. It involves a trap level which is energetically located within the bandgap and enables the electron to recombine with a hole. Four partial processes can be separated: the capture and the emission of both electrons and holes on the trap level. Balance equations can be formulated for the trap occupancy function, which leads to the Shockley-Read-Hall model. In the stationary limit, the rate dependence on the carrier concentration is described by the expression

$$\displaystyle \begin{gathered} \begin{aligned} R &= R_{\mathrm{n}} - G_{\mathrm{n}} = R_{\mathrm{p}} - G_{\mathrm{p}} \\ &= \frac{n \, p - n_i^2} {\tau_{\mathrm{p}} \, (n + n_1) + \tau_{\mathrm{n}} \, (p + p_1)} \end{aligned} \end{gathered} $$
(37.31)
$$\displaystyle \begin{gathered} n_1 \, p_1 = n_i^2 . \end{gathered} $$
(37.32)

Here ni denotes the intrinsic carrier concentration (constant). The auxiliary concentrations n1 and p1 depend on lattice temperature and the trap energy level. The parameters τn and τp are the lifetimes for electrons and holes depending on the lattice temperature. Various physical mechanisms may influence the recombination/generation process. A doping dependence of recombination lifetimes is experimentally observed in silicon technology.

In thermal equilibrium, the net recombination rate vanishes (\(n_0 p_0=n_i^2\)). An excess number of carriers (\(n p > n_i^2\)) leads to increased recombination, and a low carrier concentration (\(n p < n_i^2\)) leads to increased generation. Lifetimes can be modeled depending on trap concentration, temperature, and doping.
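The sign behavior of the stationary rate (37.31) can be checked with a short Python sketch. The default choice n1 = p1 = ni (a trap level at the intrinsic level, which satisfies (37.32)), the intrinsic concentration, and the lifetimes are assumptions made for this illustration.

```python
def srh_rate(n, p, n_i, tau_n, tau_p, n1=None, p1=None):
    """Net Shockley-Read-Hall recombination rate following (37.31).

    Concentrations are in cm^-3, lifetimes in s, the result in cm^-3 s^-1.
    If n1 and p1 are omitted, a trap at the intrinsic level is assumed,
    so that n1 = p1 = n_i and n1 * p1 = n_i**2 as required by (37.32).
    """
    if n1 is None:
        n1 = n_i
    if p1 is None:
        p1 = n_i
    return (n * p - n_i**2) / (tau_p * (n + n1) + tau_n * (p + p1))

# Assumed illustrative parameters for silicon at room temperature.
N_I = 1.0e10      # intrinsic concentration in cm^-3
TAU = 1.0e-6      # lifetime in s, taken equal for electrons and holes

r_equilibrium = srh_rate(1.0e16, 1.0e4, N_I, TAU, TAU)   # n p = n_i^2
r_injection = srh_rate(1.0e16, 1.0e14, N_I, TAU, TAU)    # n p > n_i^2
r_depletion = srh_rate(0.0, 0.0, N_I, TAU, TAU)          # n p < n_i^2
```

The three cases correspond to vanishing net rate, net recombination (positive R), and net generation (negative R). In the injection case the result is close to the excess hole concentration divided by the hole lifetime, the familiar low-level injection limit.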

The carrier recombination process is critical for any device that involves both electron and hole flows. Generation/recombination phenomena are involved in many fundamental effects, like leakage current and device breakdown.

3 Microscopic Transport Modeling

Following a more rigorous approach, macroscopic transport equations used in semiconductor device simulation are normally derived from the Boltzmann transport equation (BTE), which determines the microscopic phase space distribution function f = f(r, k, t) for a stochastic ensemble of particles of a single carrier type. The unknown distribution function in the BTE is considered as a classical, everywhere positive, distribution function.

However, the Boltzmann transport equation requires information only accessible by quantum mechanical considerations [14]. These are the band structure, expressions for the scattering and generation/recombination rates, and the Pauli exclusion principle reflecting the Fermi statistics of carriers. In this section, we discuss the Boltzmann equation and simplified models for band structure, scattering, and generation/recombination processes, which are employed in the derivation of macroscopic transport models.

3.1 The Boltzmann Transport Equation

The BTE describing charge transport in semiconductors is an integro-differential equation in the seven-dimensional space (r, k, t). Here r is the spatial variable, t is time, and the variable k is called the wave vector. Alternatively, one can use the crystal momentum p = ħk as variable.

The BTE is a classical kinetic equation which reads for electrons [27]

$$\displaystyle \begin{aligned} {\partial_{\mathrm{t}}}f + \mathbf{v} \cdot {{\boldsymbol{\nabla}_{\negmedspace \mathbf{r}}} f} + \frac{\mathbf{F}}{\hbar} \cdot {{\boldsymbol{\nabla}_{\negmedspace \mathbf{k}}} f} = C[f]-R[f] {} . \end{aligned} $$
(37.33)

The operator C on the right-hand side is called the collision operator (see Sect. 37.3.5). In general, this operator is nonlinear and nonlocal in the variable k. The operator R on the right-hand side describes generation/recombination terms and depends on the electron and hole distribution function (see Sect. 37.3.6). Mathematically, the operators C and R are integral operators nonlocal in k-space. Scattering and generation/recombination have an interpretation in terms of stochastic processes for classical point particles. In general, these operators depend also on any local impurities and material defects.

The quantity v = v(r, k) on the left-hand side in (37.33) is the group velocity of the carrier

$$\displaystyle \begin{aligned} \mathbf{v}(\mathbf{r},\mathbf{k}) &= \frac{1}{\hbar} {{\boldsymbol{\nabla}_{\negmedspace \mathbf{k}}} {\mathcal{E}}(\mathbf{r},\mathbf{k})} . {} \end{aligned} $$
(37.34)

The energy of the carrier \({\mathcal {E}}(\mathbf {r},\mathbf {k})\) is determined by the band structure of the material (see Sect. 37.3.2).

In (37.33), F is an external force exerted on the electrons, which in general depends on both r and k. It is given as

$$\displaystyle \begin{aligned} \mathbf{F}(\mathbf{r},\mathbf{k}) = {} & - {{\boldsymbol{\nabla}_{\negmedspace \mathbf{r}}} E_{\mathrm{c}}(\mathbf{r})} - {q} \bigl( \mathbf{E}(\mathbf{r}) + \mathbf{v}(\mathbf{r},\mathbf{k}) {\times} \mathbf{B}(\mathbf{r}) \bigr) \\ & - {{\boldsymbol{\nabla}_{\negmedspace \mathbf{r}}} {\mathcal{E}}(\mathbf{r},\mathbf{k})} \end{aligned} $$
(37.35)

where Ec is the conduction band minimum.

The Boltzmann equation is valid for general inhomogeneous materials with arbitrary band structure [28]. If the influence of the magnetic field is omitted and a homogeneous material is assumed, then F depends only on the local electric field as

$$\displaystyle \begin{aligned} \mathbf{F}(\mathbf{r}) = - {q} \,\mathbf{E}(\mathbf{r}) . {} \end{aligned} $$
(37.36)

As in the drift-diffusion model (see Sect. 37.2.1), the electric field \(\mathbf{E} = -{\boldsymbol{\nabla}} \psi\) must be determined from Poisson’s equation (37.2), where we have

$$\displaystyle \begin{gathered} {\boldsymbol{\nabla} \cdot} (\varepsilon {\boldsymbol{\nabla}} \psi) = {q} \,(n-p-N) {} \end{gathered} $$
(37.37)

with N representing the net doping. The right-hand side contains the electron and hole concentrations, n and p. These are the zeroth-order moments of the distribution functions fn(r, k, t) and fp(r, k, t). For a complete description of the system of charge carriers, two Boltzmann equations with different physical parameters have to be solved self-consistently, one equation for the electron distribution fn and a second one for the hole distribution fp. The electron and hole ensembles couple through the Poisson equation and through generation and recombination of electron-hole pairs. The coupling of the transport equations with the Poisson equation requires an iterative self-consistent solution procedure [6].

3.2 The Band Structure

For the evaluation of the transport equation, some information about the band energy \({\mathcal {E}}(\mathbf {r},\mathbf {k})\) must be provided. The inverse effective mass tensor is defined by the second-order derivative of \({\mathcal {E}}(\mathbf {r},\mathbf {k})\) as

$$\displaystyle \begin{aligned} {\hat{\boldsymbol{m}}}^{-1}(\mathbf{r},\mathbf{k}) &= \frac{1}{\hbar} {{\boldsymbol{\nabla}_{\negmedspace \mathbf{k}}} {\otimes} \mathbf{v}(\mathbf{r},\mathbf{k})} \end{aligned} $$
(37.38)
$$\displaystyle \begin{aligned} &= \frac{1}{\hbar^2} {{\boldsymbol{\nabla}_{\negmedspace \mathbf{k}}} {\otimes} {{\boldsymbol{\nabla}_{\negmedspace \mathbf{k}}} {\mathcal{E}}(\mathbf{r},\mathbf{k})}} {} \end{aligned} $$
(37.39)

where ⊗ denotes the tensor product.

A common assumption in macroscopic transport models is that the band structure is isotropic, that is, the kinetic energy depends only on the magnitude of the wave vector k. Then, an isotropic effective mass approximation via the trace of the mass tensor is employed as [29]

$$\displaystyle \begin{aligned} {m}^{-1} = \frac{1}{3} \,{\mathrm{tr}\, {\langle {\hat{\boldsymbol{m}}}^{-1} \rangle}}. \end{aligned} $$
(37.40)

The simplest isotropic model is given by a parabolic relationship between the energy and the crystal momentum p = ħk, which is assumed to be valid for energies close to the band minimum:

$$\displaystyle \begin{aligned} {\mathcal{E}} = \frac{p^2}{2 m} {} \end{aligned} $$
(37.41)

With increasing carrier energies, non-parabolic bands must be taken into account. A first-order non-parabolic relationship was given by Kane [30]. However, an analytical treatment of the corresponding carrier energy and group velocity is no longer possible.

In the parabolic case, we get for the group velocity

$$\displaystyle \begin{aligned} \mathbf{v}=\frac{\hbar \,\mathbf{k}}{m}=\frac{\mathbf{p}}{m} \, \end{aligned} $$
(37.42)

and the band energy can be expressed in various equivalent ways as

$$\displaystyle \begin{aligned} {\mathcal{E}} = \frac{\hbar^2 \, k^2}{2 \, m}= \frac{p^2}{2 \, m}=\frac{\mathbf{v}\cdot\mathbf{p}}{2}=\frac{m \, v^2}{2} . \end{aligned} $$
(37.43)
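As a brief consistency check, given here as a worked sketch rather than as part of the original derivation, inserting the parabolic dispersion (37.41) into the definitions (37.34) and (37.39) recovers the group velocity (37.42) and a constant, isotropic inverse mass tensor. The symbol \(\hat{\boldsymbol{I}}\) for the identity tensor is introduced only for this check:

$$\displaystyle \begin{aligned} \mathbf{v} &= \frac{1}{\hbar} \, {\boldsymbol{\nabla}_{\negmedspace \mathbf{k}}} \frac{\hbar^2 \, k^2}{2 \, m} = \frac{\hbar \, \mathbf{k}}{m} , \\ {\hat{\boldsymbol{m}}}^{-1} &= \frac{1}{\hbar^2} \, {\boldsymbol{\nabla}_{\negmedspace \mathbf{k}}} {\otimes} {\boldsymbol{\nabla}_{\negmedspace \mathbf{k}}} \frac{\hbar^2 \, k^2}{2 \, m} = \frac{1}{m} \, \hat{\boldsymbol{I}} , \\ \frac{1}{3} \, \mathrm{tr}\, {\hat{\boldsymbol{m}}}^{-1} &= \frac{1}{m} . \end{aligned} $$

Because the tensor does not depend on k, the averaging in (37.40), understood as a normalized average, leaves it unchanged, and the isotropic effective mass coincides with the band mass m.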

Using the crystal momentum p as variable, the Boltzmann equation for electrons simplifies in the parabolic case for a homogeneous material to

$$\displaystyle \begin{aligned} {{\partial_{\mathrm{t}}} f} + \frac{\mathbf{p}}{m} \cdot {{\boldsymbol{\nabla}_{\negmedspace \mathbf{r}}} f} - \, {q} \, {\boldsymbol{\mathrm{E}}} \cdot {{\boldsymbol{\nabla}_{\negmedspace \mathbf{p}}} f} = C[f]-R[f] \end{aligned} $$
(37.44)

where the left-hand side, set equal to zero, is the classical Liouville equation.

3.3 Macroscopic Observables

Rather than looking at the microscopic distribution function f(r, k) itself, a common simplification is to investigate only some macroscopic counterparts which are averages over k-space. As a consequence, the moments no longer depend on k but only on r. Macroscopic quantities are obtained by integrating the corresponding microscopic weight function multiplied by the distribution function f. A general moment of the distribution function can be defined as

$$\displaystyle \begin{aligned} {\langle \Phi \rangle} = {\int \Phi f \,\, {\mathrm{d}}^3 \mathbf{k}} {} \end{aligned} $$
(37.45)

with a suitable weight function Φ =  Φ(k). In the definition of the physical observables, we will use moments rescaled with a fixed constant:

$$ \displaystyle \begin{aligned}{\ll \varPhi \gg} = \frac{1}{4 \pi^3} {\langle \varPhi \rangle} {}\end{aligned} $$
(37.46)

Here, the prefactor \(1/(4\pi^3)\) incorporates the density of states in three-dimensional k-space [31, p.26]. This factor is composed of the spin degeneracy, which implies a factor of two, and a further factor of \(1/(2\pi)\) per degree of freedom, which results from the transition from discrete states to a continuum distribution function, so that \(2/(2\pi)^3 = 1/(4\pi^3)\).

The moments contain information about the full microscopic distribution function, and they are used to define macroscopic observables or state variables. Some common physical observables (e.g., carrier temperature and velocity) are derived by additionally normalizing the moment with the local carrier concentration n(r) (see table below). The angular bracket always denotes integration without normalization. As normalization of a distribution function is a nonlinear transformation, the models derived by the method of moments (see Sect. 37.4.1) are more naturally written in terms of (non-normalized) moments.

In the following, the group velocity v is separated into a random part c and the mean value \( \bar {\mathbf {v}} = {\ll \mathbf {v} \gg }/{\ll 1 \gg} \) such that \(\mathbf {c} = \mathbf {v} - \bar {\mathbf {v}}\). We then define the following symbols:

Here Tn is the electron gas temperature and J the electron current density.

Macroscopic moments provide the link between the microscopic description in terms of a distribution function f and a phenomenological description such as described in Sect. 37.2. Using the method of moments (see Sect. 37.4), a system of equations for a given set of observables can be systematically derived from the Boltzmann equation.

3.4 Analytical Distributions

The Boltzmann equation can be solved approximately by making a suitable ansatz for the distribution function. An ansatz can also be used to derive closure relations for higher-order moment models (see Sect. 37.4.3). Important analytical classes of distribution functions are described below.

3.4.1 Fermi-Dirac and Maxwell-Boltzmann

The Fermi-Dirac and Maxwell-Boltzmann distribution functions (see Fig. 37.2) are frequently used to model the distribution of carriers in equilibrium which describes the case of a negligible applied electric field.

Fig. 37.2 Fermi-Dirac versus Maxwell-Boltzmann distribution

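The Fermi-Dirac distribution is defined as

$$\displaystyle \begin{aligned} f_{\mathrm{FD}}({\mathcal{E}}) = \frac{1}{1 + \exp\Big(\frac{{\mathcal{E}} - {\mathcal{E}}_{\mathrm{F}}}{k_{\mathrm{B}} T_{\mathrm{L}}}\Big)} \end{aligned} $$

For \({\mathcal{E}}_{\mathrm{c}} \gg {\mathcal{E}}_{\mathrm{F}}\), the Fermi-Dirac distribution can be approximated by a Maxwell-Boltzmann distribution (see Fig. 37.3):

$$\displaystyle \begin{aligned} f_{\mathrm{MB}}({\mathcal{E}}) = \exp\Big({-\frac{{\mathcal{E}} - {\mathcal{E}}_{\mathrm{F}}}{k_{\mathrm{B}} T_{\mathrm{L}}}}\Big) \end{aligned} $$

This is valid for nondegenerate, lowly doped semiconductors and corresponds to neglecting the Pauli principle. While the assumption of lowly doped semiconductors is virtually never justified for semiconductor devices, the Maxwell-Boltzmann distribution is still preferentially used, since it gives closed-form results when integrated, for example, for the calculation of moments.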

Fig. 37.3 In the limit Ec ≫ EF, the Fermi-Dirac distribution can be approximated by a Maxwell-Boltzmann distribution

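Rewriting the Maxwell distribution in terms of the kinetic energy \({\mathcal{E}}(\mathbf{p})\) measured from the band edge gives

$$\displaystyle \begin{aligned} f_{\mathrm{M}}(\mathbf{p}) = A \, \exp\Big({-\frac{{\mathcal{E}}(\mathbf{p})}{k_{\mathrm{B}} T_{\mathrm{L}}}}\Big) \end{aligned} $$

Here the prefactor A is given by

$$\displaystyle \begin{aligned} A = \exp\Big({-\frac{{\mathcal{E}}_{\mathrm{c}} - {\mathcal{E}}_{\mathrm{F}}}{k_{\mathrm{B}} T_{\mathrm{L}}}}\Big) \end{aligned} $$

and does not depend on p. For parabolic bands, we get

$$\displaystyle \begin{aligned} f_{\mathrm{M}}(\mathbf{p}) = A \, \exp\Big({-\frac{p^2}{2 \, m \, k_{\mathrm{B}} T_{\mathrm{L}}}}\Big) \end{aligned} $$

which is often referred to as “cold Maxwellian,” as the lattice temperature appears in it.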
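How quickly the Maxwell-Boltzmann approximation approaches the Fermi-Dirac distribution can be illustrated numerically. The sketch below assumes the standard textbook forms of both distributions; the function names and the value of the thermal energy are our own illustrative choices.

```python
import math

def fermi_dirac(energy, fermi_level, kt):
    """Occupation probability 1 / (1 + exp((E - E_F) / kT))."""
    return 1.0 / (1.0 + math.exp((energy - fermi_level) / kt))

def maxwell_boltzmann(energy, fermi_level, kt):
    """Nondegenerate limit exp(-(E - E_F) / kT)."""
    return math.exp(-(energy - fermi_level) / kt)

KT = 0.025852  # eV, thermal energy near 300 K (assumed value)

for x in (0.0, 1.0, 3.0, 10.0):  # x = (E - E_F) / kT
    fd = fermi_dirac(x * KT, 0.0, KT)
    mb = maxwell_boltzmann(x * KT, 0.0, KT)
    rel_error = (mb - fd) / fd
    print(f"x = {x:4.1f}  FD = {fd:.4e}  MB = {mb:.4e}  rel. error = {rel_error:.3e}")
```

For these two expressions the relative error equals exp(−(E − EF)∕kT), so the approximation is only accurate when the energies of interest lie several kT above the Fermi level.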

3.4.2 Heated Displaced Maxwellian

A heated displaced Maxwellian distribution function (also called shifted Maxwellian)

$$\displaystyle \begin{aligned} f_{\mathrm{sM}}(\mathbf{k}) = \exp\big( {a + \mathbf{b} \cdot \mathbf{k} - c \, k^2} \big) \end{aligned} $$
(37.59)

is a frequently used ansatz for the distribution function in the Boltzmann equation. In terms of its moments (37.45), this distribution can be parameterized as

$$\displaystyle \begin{aligned} f(\mathbf{k}) = \frac{n_0}{(2 \pi T_2)^{\frac{3}{2}}} \exp\Big({-\frac{(\mathbf{k}-{\mathbf{k}}_0)^2}{2 T_2}}\Big) \end{aligned} $$
(37.60)

where

$$ \begin{array}{lll} n_0 & = &\displaystyle {\langle 1 \rangle} = \Big(\frac{\pi}{c}\Big)^{ \frac{3}{2}} \exp \Big(a +\frac{b^2}{4c} \Big)\end{array} $$
(37.61)
$$ \begin{array}{lll} {\mathbf{k}}_0 & =&\displaystyle \frac{ {\langle \mathbf{k} \rangle} }{n_0} = \frac{1}{2c} \, \mathbf{b}\end{array} $$
(37.62)
$$ \begin{array}{lll} T_2 & = &\displaystyle \frac{ {\langle (\mathbf{k}-{\mathbf{k}}_0)^2 \rangle} } {3\, n_0}= \frac{1}{2c}. \end{array} $$
(37.63)

For k0 = 0, we get a heated Maxwellian as a special case, while k0≠0 is required to result in a current density. Note, however, that a shifted Maxwellian has zero heat flow. Hence, this ansatz cannot represent solutions where the heat flow is physically relevant.
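The closed-form relations (37.61)–(37.63) can be cross-checked by numerical integration. The following is a minimal sketch, not part of the original derivation; the parameter values, the integration cutoff, and the helper names `integrate` and `shifted_maxwellian_moments` are illustrative assumptions.

```python
import math

def integrate(func, lo, hi, steps=4000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h

def shifted_maxwellian_moments(a, b, c, cutoff=12.0):
    """Return (n0, k0, T2) of exp(a + b.k - c k^2) by numerical integration.

    The integrand factorizes over the Cartesian components, so
    one-dimensional integrals are sufficient.
    """
    def moment_1d(i, power):
        return integrate(
            lambda k: k**power * math.exp(b[i] * k - c * k * k), -cutoff, cutoff
        )

    m0 = [moment_1d(i, 0) for i in range(3)]
    m1 = [moment_1d(i, 1) for i in range(3)]
    m2 = [moment_1d(i, 2) for i in range(3)]
    n0 = math.exp(a) * m0[0] * m0[1] * m0[2]
    k0 = [m1[i] / m0[i] for i in range(3)]
    # <(k - k0)^2> / n0 is the sum of the three one-dimensional variances.
    variance = sum(m2[i] / m0[i] - k0[i] ** 2 for i in range(3))
    return n0, k0, variance / 3.0

a, b, c = 0.3, (0.4, -0.2, 0.1), 0.8  # arbitrary illustrative parameters
n0, k0, t2 = shifted_maxwellian_moments(a, b, c)
b_squared = sum(x * x for x in b)

print(n0, (math.pi / c) ** 1.5 * math.exp(a + b_squared / (4 * c)))  # (37.61)
print(k0, [x / (2 * c) for x in b])                                  # (37.62)
print(t2, 1 / (2 * c))                                               # (37.63)
```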

3.4.3 Diffusive Maxwellian

If the displacement b in (37.59) is small, then it is justified to approximate the shifted Maxwellian by a linearized shifted Maxwellian:

$$\displaystyle \begin{aligned} f(\mathbf{k}) &= \exp\big({a + \mathbf{b} \cdot \mathbf{k} - c \, k^2}\big) \notag\\ &= \exp\big({a - c \, k^2}\big) \, \exp\left({ \, \mathbf{b} \cdot \mathbf{k}}\right) \notag\\ &\approx f_{\mathrm{M}}(|\mathbf{k}|) \, (1 + \, \mathbf{b} \cdot \mathbf{k}) = \tilde{f}(\mathbf{k}) . {} \end{aligned} $$
(37.64)

Note that the distribution \(\tilde {f}\) is no longer positive on all of \(\mathbb{R}^3\). In terms of its moments (37.45), a linearized shifted Maxwellian can be parameterized as

$$\displaystyle \begin{aligned} \tilde{f}(\mathbf{k})=\frac{n_0}{(2 \pi M_2)^{\frac{3}{2}}} \exp \Big({-\frac{{\mathbf{k}}^2}{2 M_2}}\Big) \Big(1+ \frac{\mathbf{k} \cdot {\mathbf{k}}_0}{M_2}\Big){} \end{aligned} $$
(37.65)

where

$$ \begin{array}{lll} n_0 & =&\displaystyle {\langle 1 \rangle} = \Big(\frac{\pi}{c}\Big)^{\frac{3}{2}} \exp (a)\end{array} $$
(37.66)
$$ \begin{array}{lll} {\mathbf{k}}_0 & =&\displaystyle \frac{ {\langle \mathbf{k}\rangle}}{n_0}=\frac{1}{2c} \, \mathbf{b}\end{array} $$
(37.67)
$$ \begin{array}{lll} M_2 & = &\displaystyle \frac{{\langle {\mathbf{k}}^2 \rangle}}{3\, n_0}= \frac{1}{2c}. \end{array} $$
(37.68)

The expression “diffusive Maxwellian” stems from the fact that this distribution is implicitly used in the context of the diffusion approximation (see Sect. 37.5.1) to derive parabolic equation systems from the Boltzmann equation.
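Two properties of the linearized shifted Maxwellian can be verified numerically: its first moment equals b∕(2c), and it becomes negative where b ⋅k < −1. The sketch below restricts the integration to the direction of b, since the transverse components cancel in the ratio of first to zeroth moment; the parameter values and helper names are illustrative assumptions.

```python
import math

def integrate(func, lo, hi, steps=4000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h

# Illustrative values; b is the component of the displacement along the drift direction.
a, b, c = 0.0, 0.2, 1.0

def f_lin(k):
    """Linearized shifted Maxwellian along the direction of b."""
    return math.exp(a - c * k * k) * (1.0 + b * k)

n0 = integrate(f_lin, -12.0, 12.0)
k_mean = integrate(lambda k: k * f_lin(k), -12.0, 12.0) / n0

print(k_mean, b / (2 * c))     # first moment compared with b / (2c)
print(f_lin(-1.0 / b - 1.0))   # evaluated where b*k < -1
```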

3.5 The Microscopic Relaxation Time Approximation

Electron transport is often dominated by the scattering operator. In a crude approximation, the collision term on the right-hand side of equation (37.33) represents the various scattering processes and can be approximately modeled as

$$\displaystyle \begin{aligned} C[f] \approx - \frac{f - f_0}{\tau} \end{aligned} $$
(37.69)

which is commonly referred to as relaxation time approximation [32, p.144]. Here the index 0 represents the quasi-equilibrium distribution function f0(k). Depending on the type of model, both the distribution f0 and the relaxation time τ can depend on the distribution function f or its moments. The relaxation time approximation can only be justified near quasi-equilibrium and even then only for specific types of scattering processes, i.e., elastic or isotropic. In contrast, energy relaxation by optical inter-valley scattering can hardly be accounted for by a relaxation time approximation.

One consequence of the relaxation time approximation is that the distribution decays exponentially toward its quasi-equilibrium value f0 with the time constant τ, whenever it is disturbed from its quasi-equilibrium value. To allow for an easy analytic treatment, a Fermi-Dirac or a Maxwellian equilibrium distribution might be used. While this is a crude approximation, it allows for a straightforward numerical implementation. In a more rigorous way, scattering is modeled based on Fermi’s golden rule and doping-dependent rates. A discussion on the validity of the relaxation time approximation is given in [31, p.139].
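The exponential decay implied by (37.69) can be made explicit for the spatially homogeneous, field-free case, in which the Boltzmann equation reduces to ∂t f = −(f − f0)∕τ at each fixed k. The sketch below integrates this equation with the explicit Euler method and compares it with the analytic solution; the relaxation time and the occupation values are assumed for illustration only.

```python
import math

def relax(f_init, f_eq, tau, t_end, steps=10000):
    """Integrate df/dt = -(f - f_eq) / tau with the explicit Euler method."""
    dt = t_end / steps
    f = f_init
    for _ in range(steps):
        f += -dt * (f - f_eq) / tau
    return f

TAU = 0.1e-12             # s, assumed relaxation time
F_EQ, F_INIT = 0.2, 0.5   # assumed occupation values at one fixed k

t_end = 3 * TAU
numeric = relax(F_INIT, F_EQ, TAU, t_end)
analytic = F_EQ + (F_INIT - F_EQ) * math.exp(-t_end / TAU)
print(numeric, analytic)
```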

3.6 Microscopic Generation and Recombination

Similar to the relaxation time approximation for scattering, there are microscopic models for generation and recombination of electron-hole pairs. A simple model for generation/recombination is presented in [15].

The Shockley-Read-Hall model (see Sect. 37.2.4) has a microscopic formulation in terms of the distribution function f. The generation-recombination term R[f] in the Boltzmann equation for electrons (37.33) is based on the assumption that generation-recombination occurs via a trap level Et. Carriers in the valence and conduction band can be transferred to this trap level via phonon-assisted transitions. In k-space, the model reads

Here fe denotes the even part of the distribution function f, and ft is the trap occupancy. The parameter \({\mathcal {E}}_{\mathrm {t}}\) denotes the fixed trap level inside the semiconductor bandgap, while for the transition rates an energy-independent carrier lifetime τn is assumed. This simple model allows for an analytical treatment under stationary conditions. Generation-recombination terms do not enter into the macroscopic flux equations, because R[f] is an even function in k.

In general, generation/recombination terms depend on both the electron and the hole distribution functions in an integral, nonlocal manner [3]. This makes solving the corresponding coupled transport equations for electrons and holes extremely difficult and represents a significant challenge [33].

4 Macroscopic Transport Models

Solving the Boltzmann equation deterministically by discretizing the differential and integral operators is computationally very expensive. A widely used stochastic method for solving the Boltzmann equation is the Monte Carlo method. It has been proven to give accurate results but may be computationally expensive, too. In particular, due to its stochastic nature, very large computation times are required to resolve rare events. For instance, if the distribution of high-energy carriers is relevant or if the carrier concentration is very low in specific regions of the device, Monte Carlo simulations tend to produce results with high variance.
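The cost of resolving rare events can be estimated with a simple counting argument. The sketch below assumes independent samples and a plain counting estimator of the probability of an event; it is a generic statistical illustration with hypothetical function names, not the estimator of any specific Monte Carlo device simulator.

```python
import math

def relative_std_error(probability, samples):
    """Relative standard error of a counting estimate of an event probability."""
    return math.sqrt((1.0 - probability) / (samples * probability))

def samples_needed(probability, target_rel_error):
    """Number of independent samples required for a given relative error."""
    return math.ceil((1.0 - probability) / (probability * target_rel_error**2))

# Sample count for 10 % relative error as the event becomes rarer.
for p in (1e-2, 1e-4, 1e-6):
    print(p, samples_needed(p, 0.1))
```

The required number of samples grows inversely with the probability of the event, which is why sparsely populated regions of phase space dominate the computation time.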

An approximate deterministic solution method for the Boltzmann equation, which is often sufficient for TCAD demands, is the method of moments, which transforms the Boltzmann equation into a macroscopic transport model, i.e., a small system of partial differential equations in the four-dimensional space (r, t). The solution variables are a selected set of moments, which often correspond to macroscopic observables. In this section, we give the basic steps of this method.

4.1 The Method of Moments

The method of moments is characterized by multiplying the Boltzmann equation with weight functions and integrating the whole equation over k-space. The moments of the distribution then become the unknowns of the equation system. The integration eliminates the coordinates in k-space and leaves a set of partial differential equations in (r, t) space.

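Generation and recombination processes are not considered in the following. Application of the method of moments to the Boltzmann equation gives the moment equations. For a general scalar Φ, we obtain a scalar equation:

$$\displaystyle \begin{aligned} {\partial_{\mathrm{t}}} {\langle \Phi \rangle} + {\boldsymbol{\nabla}_{\negmedspace \mathbf{r}}} \cdot {\langle \mathbf{v} \, \Phi \rangle} + {\mathrm{q}} \, \mathbf{E} \cdot {\langle {\boldsymbol{\nabla}_{\negmedspace \mathbf{p}}} \Phi \rangle} = {\int \Phi \, C[f] \,\, {\mathrm{d}}^3 \mathbf{k}} \end{aligned} $$

and for a vectorial Φ, we obtain a vectorial equation:

$$\displaystyle \begin{aligned} {\partial_{\mathrm{t}}} {\langle \boldsymbol{\Phi} \rangle} + {\boldsymbol{\nabla}_{\negmedspace \mathbf{r}}} \cdot {\langle \mathbf{v} {\otimes} \boldsymbol{\Phi} \rangle} + {\mathrm{q}} \, \mathbf{E} \cdot {\langle {\boldsymbol{\nabla}_{\negmedspace \mathbf{p}}} {\otimes} \boldsymbol{\Phi} \rangle} = {\int \boldsymbol{\Phi} \, C[f] \,\, {\mathrm{d}}^3 \mathbf{k}} \end{aligned} $$

where ⊗ denotes the tensor product [28]. In the derivation, partial integration for integrals over product terms of type \((\Phi \, {\boldsymbol{\nabla}_{\negmedspace \mathbf{p}}} f)\) is used.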

If we assume a parabolic band and a weight function Φ which is a (multivariate) polynomial in p, then \(\mathbf {v}=\frac {\mathbf {p}}{m}\), and the terms in the angular brackets on the left-hand side in the moment equations are polynomials again.

4.2 The Macroscopic Relaxation Time Approximation

In analogy to the microscopic relaxation time approximation, the collision term on the right-hand side of equations () and (), which represents the various scattering processes, can be approximately modeled as

$$\displaystyle \begin{aligned} {\int {\hat{\boldsymbol{\Phi}}}_j \, C[f] \,\, {\mathrm{d}}^3 \mathbf{k}} \approx - \frac{{\langle {\hat{\boldsymbol{\Phi}}}_j \rangle} - {\langle {\hat{\boldsymbol{\Phi}}}_j \rangle}_0}{\tau_{{\hat{\boldsymbol{\Phi}}}_j}} \end{aligned} $$
(37.73)

which is commonly referred to as macroscopic relaxation time approximation [32, p.144]. Here the index 0 represents the average over the equilibrium distribution. Both the equilibrium moments and the relaxation times depend on moments of the distribution function f. In this approach, the moment decays exponentially toward its equilibrium value \({\langle {\hat {\boldsymbol {\Phi }}}_j \rangle }_0\) with the time constant \(\tau _{{\hat {\boldsymbol {\Phi }}}_j}\) under homogeneous conditions. As we may have different time constants for each moment, the macroscopic relaxation time approximation is in some way more general than the microscopic relaxation time approximation (37.69).

The equilibrium distribution function f0(k) is a symmetric function. Since the even weight functions are symmetric in k and the odd weight functions are antisymmetric in k, only the even moments of the equilibrium distribution function will be nonzero, whereas the odd moments will vanish

$$\displaystyle \begin{aligned} {\langle\boldsymbol{\Phi}_{j_{\mathrm{odd}}} \rangle}_0 = \!{\int \!\boldsymbol{\Phi}_j(\mathbf{k}) \, f_0(\mathbf{k}) \,\, {\mathrm{d}}^3 \mathbf{k}} = 0. \end{aligned} $$
(37.74)

Nonlinearity enters into the moment equations through the relaxation time approximation of the scattering operator, because both the equilibrium distribution f0 (and hence its moments) and the relaxation time parameters τ depend on the distribution function f.

4.3 The Closure Problem

The weight functions for the moment method are usually chosen as being proportional to powers of increasing order of the wave vector k or equivalently (in the parabolic case) as powers of the momentum p. Using polynomial weight functions of all orders, the method of moments transforms the parabolic band BTE into an infinite set of equations for the polynomial moments. To obtain a practically relevant model, the infinite hierarchy has to be truncated to a finite number of equations by restricting the set of weight functions to a finite number.

With this finite set of polynomial weight functions, an equation system containing only polynomial moments is obtained. The unknowns of the resulting system are the moments corresponding to the weight functions. However, in general, additional moments will show up, and the system will contain more unknowns than equations. The additional moments have to be expressed by the available moments. This is a nontrivial task and known as the closure problem.

A simple way to close the system of equations is to make a suitable ansatz for the shape of the distribution function. Then all moments can be calculated explicitly in terms of the ansatz parameters. With this kind of closure, the method of moments becomes equivalent to the Petrov-Galerkin method. This is a finite element method where the set of ansatz functions (e.g., shifted Maxwellians) and the set of test functions (e.g., polynomials) are not identical, in contrast to the usual Galerkin method. If the equation is linear and the ansatz is linear in the unknown parameters, we get a linear system of equations; else it is nonlinear (this would be the case for a shifted Maxwellian ansatz).

No form of the distribution function has to be assumed in the derivation of the moment equations, but in that case separate closure relations are needed. We can distinguish three types of closure relations:

  • First, the highest-order moment closure. The equation of highest order contains the moment of the next order, which has to be suitably approximated using available information, typically the lower-order moments.

  • Second, approximation of the tensors appearing in the equation system. They have to be expressed by the available scalar and vectorial moments in order to make the final equation system tractable.

  • Finally, approximations for the moments of the scattering term. Practical implementations use the macroscopic relaxation time approximation and the closure problem consists in finding models for the mobilities and relaxation times. These parameters depend in general on all other moments.

4.4 Moment Equations for a Parabolic Band

Commonly used macroscopic models like hydrodynamic or energy transport models use polynomial moments up to order four. To derive a four-moment system, we use the following weight functions of even order:

$$ \Phi_0 = 1 $$
(37.75)
$$ \begin{array}{lll} \Phi_2 & =&\displaystyle {\mathcal{E}} = \frac{p^2}{2 \, m} \,\end{array} $$
(37.76)

while the weight functions for the odd orders are

$$ \begin{array}{lll} {\boldsymbol{\Phi}}_1 & =&\displaystyle \mathbf{v} = \frac{\mathbf{p}}{m} \,\end{array} $$
(37.77)
$$ \begin{array}{lll} {\boldsymbol{\Phi}}_3 & =&\displaystyle {\mathcal{E}} \, \mathbf{v}\, = \frac{p^2}{2 \, m} \, \frac{\mathbf{p}}{m} .\end{array} $$
(37.78)

As will be seen, using these weight functions in the method of moments, the scalar weight functions will lead to the balance equations, whereas the vectorial weight functions will lead to the flux equations. The calculation of the gradients of the weight functions of even order is straightforward:

$$ \begin{array}{lll} \hspace{-2pc} {{\boldsymbol{\nabla}_{\negmedspace \mathbf{p}}} \Phi_0} & =&\displaystyle 0\end{array} $$
(37.79)
$$ \begin{array}{lll} \hspace{-2pc} {{\boldsymbol{\nabla}_{\negmedspace \mathbf{p}}} \Phi_2} & =&\displaystyle \mathbf{v} \end{array} $$
(37.80)

while the gradients of the odd weight functions can be written as

$$ \begin{array}{lll} \!\!{{\boldsymbol{\nabla}_{\negmedspace \mathbf{p}}} \otimes {\boldsymbol{\Phi}}_1} & =&\displaystyle \frac{{\hat{\boldsymbol{\delta}}}}{m}\end{array} $$
(37.81)
$$ \begin{array}{lll} \!\!{{\boldsymbol{\nabla}_{\negmedspace \mathbf{p}}} \otimes {\boldsymbol{\Phi}}_3} & =&\displaystyle \frac{{\mathcal{E}}}{m} \, {\hat{\boldsymbol{\delta}}} + \mathbf{v} {\otimes} \mathbf{v}. \end{array} $$
(37.82)

Here \({\hat {\boldsymbol {\delta }}}\) is the unit matrix in three dimensions.
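The gradient (37.82) of the third-order weight function can be checked against central finite differences. The sketch below uses an effective mass of one in arbitrary units and an arbitrary momentum vector; both, as well as the function names, are assumptions made for illustration.

```python
M = 1.0  # effective mass in arbitrary units (assumption for illustration)

def phi3(p):
    """Weight function Phi_3 = (p^2 / 2m) * p / m from (37.78)."""
    energy = sum(x * x for x in p) / (2.0 * M)
    return [energy * x / M for x in p]

def numerical_gradient(p, h=1e-6):
    """Central differences for d(Phi_3)_j / d p_i, returned as grad[i][j]."""
    grad = []
    for i in range(3):
        plus = list(p)
        minus = list(p)
        plus[i] += h
        minus[i] -= h
        fp, fm = phi3(plus), phi3(minus)
        grad.append([(fp[j] - fm[j]) / (2.0 * h) for j in range(3)])
    return grad

def analytic_gradient(p):
    """Right-hand side of (37.82): (E / m) * unit matrix + v (x) v."""
    energy = sum(x * x for x in p) / (2.0 * M)
    v = [x / M for x in p]
    return [[(energy / M if i == j else 0.0) + v[i] * v[j] for j in range(3)]
            for i in range(3)]

p = [0.3, -1.2, 0.7]
num, ana = numerical_gradient(p), analytic_gradient(p)
max_dev = max(abs(num[i][j] - ana[i][j]) for i in range(3) for j in range(3))
print(max_dev)
```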

Applying the macroscopic relaxation time approximation and inserting the calculated gradients from the previous section into equations () and () lead to the equation set

where τn, τm = τv, \(\tau _{\mathcal {E}}\), and τS are the relaxation times for particle density, momentum/velocity, energy, and energy flux density, respectively.

Complementing the macroscopic observables already defined in Sect. 37.3.3, all moment equations will be written using the following observables [34]:

$$\displaystyle \begin{aligned}\mbox{Momentum:}\qquad \quad\,\,\,\, \bar{{\mathbf{p}}} = \frac{1}{n} {\ll \mathbf{p} \gg}\end{aligned} $$
(37.87)
$$ \displaystyle \begin{aligned}\mbox{Thermal\ tensor:}\qquad \, {\hat{\boldsymbol{T}}} = \frac{1}{n} {\ll \frac{m}{2} \mathbf{c} {\otimes} \mathbf{c} \gg}{}\end{aligned} $$
(37.88)
$$ \displaystyle \begin{aligned}\mbox{Energy-like tensor:}\quad {\hat{\boldsymbol{U}}} = \frac{1}{n} {\ll \frac{1}{2} \mathbf{v} {\otimes} \mathbf{p} \gg}\end{aligned} $$
(37.89)

In these definitions, the even moments \(w, {\hat {\boldsymbol {U}}}, {\hat {\boldsymbol {R}}}\) are normalized, while the fluxes J and S are not. The energy tensor and the thermal energy tensor are also known as the stress tensor and the pressure tensor in fluid dynamics.

In the parabolic case, we get the following relations:

$$ \begin{array}{lll} \bar{\mathbf{p}} & =&\displaystyle m \bar{\mathbf{v}}\end{array} $$
(37.91)
$$ \displaystyle \begin{aligned}\begin{array}{rcl}w=\displaystyle\frac{1}{n}{\ll {\mathcal{E}} \gg}=\frac{1}{n} {\ll \frac{m}{2} v^2 \gg}= {\mathrm{tr}\, {\hat{\boldsymbol{U}}}}\end{array}\end{aligned} $$
(37.92)

Thus, the scalar values for the energy and the thermal energy are the traces of the corresponding tensors.

Average energy w and energy flux S are raw moments of second and third order. Temperature Tn and heat flux Q are (up to scaling) the corresponding central moments. In the parabolic case, central and raw moments are related by

$$ \begin{array}{lll} w & =&\displaystyle w_{\mathrm{T}} + \frac{m}{2} \bar{v}^2\end{array} $$
(37.94)
$$ \begin{array}{lll} {\hat{\boldsymbol{U}}} & =&\displaystyle {\hat{\boldsymbol{T}}} + \frac{m}{2} \bar{\mathbf{v}} {\otimes} \bar{\mathbf{v}}\end{array} $$
(37.95)
$$ \begin{array}{lll} \mathbf{S} & =&\displaystyle \mathbf{Q} + 2 n {\hat{\boldsymbol{T}}} \bar{\mathbf{v}} + n w_{\mathrm{T}} \bar{\mathbf{v}} + n \frac{m}{2} \bar{v}^2 \bar{\mathbf{v}} {}.\end{array} $$
(37.96)

If we define convective energy as

$$\displaystyle \begin{aligned} w_{{ \mathrm{conv}}}=\frac{m}{2}\bar{v}^2 , \end{aligned} $$
(37.97)

the total energy is the sum of thermal and convective energy.
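Relation (37.94) between the raw and the central second-order moment holds for any distribution, which can be illustrated with a set of sampled velocities. In the sketch below, the per-carrier averages are replaced by sample means; the Gaussian velocity samples, the drift value, and the unit mass are hypothetical choices for illustration.

```python
import random

random.seed(0)
M = 1.0  # effective mass, arbitrary units (assumption)

# Hypothetical velocity samples: thermal spread plus a drift along x.
samples = [
    (random.gauss(0.4, 1.0), random.gauss(0.0, 1.0), random.gauss(0.0, 1.0))
    for _ in range(2000)
]
count = len(samples)

# Mean velocity and random part c = v - v_mean.
v_mean = [sum(v[i] for v in samples) / count for i in range(3)]

# Raw second-order moment: average energy w.
w = sum(0.5 * M * sum(x * x for x in v) for v in samples) / count

# Central second-order moment: thermal energy w_T.
w_thermal = sum(
    0.5 * M * sum((v[i] - v_mean[i]) ** 2 for i in range(3)) for v in samples
) / count

# Convective energy (37.97).
w_conv = 0.5 * M * sum(x * x for x in v_mean)

print(w, w_thermal + w_conv)  # both sides of (37.94)
```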

4.5 Hydrodynamic Models

A hydrodynamic model for semiconductors was first introduced by Bløtekjær [9], who made several assumptions listed below to derive this system. Due to the hyperbolic nature and the resulting numerical challenges, hydrodynamic models are rarely used in macroscopic device simulation. However, hydrodynamic models are based on fewer assumptions than the energy transport models which actually can be derived from the hydrodynamic models by additional simplifications. For this reason, the generic hydrodynamic transport model is discussed first. The energy transport models are discussed in Sect. 37.5.

We now consider only the first three moment equations ()–() as originally done by Bløtekjær (see [9]). This system assumes a single effective parabolic energy band and consists of five equations (line Φ1 is vectorial). The unknowns corresponding to the weight functions are \(\{n,\bar {\mathbf {v}},w\}\). Since we have , an equivalent set of unknowns is \(\{n,\bar {\mathbf {v}},T_{\mathrm {n}}\}\). All other moments appearing in the system have to be expressed through these unknowns.

4.5.1 Isotropic Symmetry

To reduce the number of unknowns, we use symmetry considerations for the thermal energy tensor. We can split the energy tensor according to (37.95):

$$\displaystyle \begin{aligned} {\hat{\boldsymbol{U}}} = {\hat{\boldsymbol{T}}} + \frac{m}{2} \bar{\mathbf{v}} {\otimes} \bar{\mathbf{v}} \end{aligned} $$
(37.102)

The random component of the velocity has zero average, that is, 〈c〉 = 0. Under the assumption that the even part of the distribution of c is isotropic, the following relation can be derived:

$$\displaystyle \begin{aligned} {\langle \mathbf{c} {\otimes} \mathbf{c} \rangle} = { \frac{1}{3}} {\langle c^2 \rangle} {\hat{\boldsymbol{I}}} \end{aligned} $$
(37.103)

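This assumption is frequently considered to be justified because of the strong influence of scattering. With this assumption, we can represent the thermal energy tensor \({\hat {\boldsymbol {T}}}\) by a scalar temperature Tn:

$$\displaystyle \begin{aligned} {\hat{\boldsymbol{T}}} = \frac{1}{2} \, k_{\mathrm{B}} T_{\mathrm{n}} \, {\hat{\boldsymbol{I}}} \end{aligned} $$

With this assumption, (37.95) now becomes

$$\displaystyle \begin{aligned} {\hat{\boldsymbol{U}}} = \frac{1}{2} \, k_{\mathrm{B}} T_{\mathrm{n}} \, {\hat{\boldsymbol{I}}} + \frac{m}{2} \bar{\mathbf{v}} {\otimes} \bar{\mathbf{v}} \end{aligned} $$

and the six unknowns of the (symmetric) energy tensor \({\hat {\boldsymbol {U}}}\) are reduced to a single unknown Tn.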

We employ the usual definition:

$$\displaystyle \begin{aligned} \mu = \frac{{\mathrm{q}} \, \tau_m}{\mathrm{m}} {} \ \end{aligned} $$
(37.106)

where μ will be identified with the electron mobility in the drift-diffusion model, which is used in place of the momentum relaxation time.

With the isotropic approximations () and , the moment equations ()–() can be written in the usual variables as [35]:

$$ \begin{array}{lll} {\boldsymbol{\nabla}} \cdot \mathbf{J} & =&\displaystyle {\mathrm{q}} {\partial_{\mathrm{t}}}n\end{array} $$
(37.107)
$$ \begin{array}{lll} {\boldsymbol{\nabla}} \cdot \mathbf{S} & =&\displaystyle -{\partial_{\mathrm{t}}}(n w)\notag \\ & &\displaystyle +\, \mathbf{E} \cdot \mathbf{J} -n \frac{w - w_0}{\tau_{\mathcal{E}}} .\end{array} $$
(37.109)

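Note that we have set n0 = n in equation (), i.e., we do not relax the carrier density. The equilibrium value w0 is given as

$$\displaystyle \begin{aligned} w_0 = \frac{3}{2} \, k_{\mathrm{B}} T_{\mathrm{L}} . \end{aligned} $$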

4.5.2 Phenomenological Highest-Order Moment Closure

The third equation (37.109) needs to be closed in the energy flux S. A suitable approximation for the energy flux S has to be found, and different approaches have been published. Assuming isotropic symmetry and using definition , we can express S from equation (37.96) as

which still contains the heat flux Q as unknown third-order moment.

A simple way to close the equation is to assume that f is a shifted Maxwellian. For a shifted Maxwellian, the heat flux Q vanishes, which gives

By contrast, Bløtekjær [9] took a phenomenological approach and approximated the heat flux Q by Fourier’s law as

in which the thermal conductivity is given by the Wiedemann-Franz law as

where pγ is an empirical correction factor.

4.5.3 Numerical Properties

Equations (37.107)–(37.109) yield the full hydrodynamic model for parabolic band structures. The system needs to be supplemented by Fourier’s law, which is used to close the system. This equation system is similar to the Euler equations of fluid dynamics with the addition of a heat conduction term and the collision terms. In the absence of electric fields and scattering terms, equations (37.107)–(37.109) together with the Maxwellian closure correspond to the Euler equations of gas dynamics [13, p.201]. Mathematically, they constitute a quasilinear hyperbolic system of conservation laws since mass, momentum, and energy are conserved.

The hydrodynamic model describes the propagation of charge carriers in an electronic device as the flow of a compressible fluid. The electron gas has a sound speed of \(v_{\mathrm{c}} = \sqrt{k_{\mathrm{B}} T_{\mathrm{n}} / m}\), and the carrier flow may be either supersonic or subsonic. With Tn = ξTL and TL = 300 K, we have \(v_{\mathrm{c}} = \sqrt{\xi} \times 1.3 \times 10^7\,\mathrm{cm/s}\), while for TL = 77 K, \(v_{\mathrm{c}} = \sqrt{\xi} \times 6.6 \times 10^6\,\mathrm{cm/s}\) [36]. Typical maximum carrier temperatures, which are observed experimentally inside electronic devices, give rise to sound speeds of about \(5 \times 10^7\,\mathrm{cm/s}\), which is higher than the maximum velocities observed in typical simulations.
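The quoted sound speeds can be reproduced approximately with a short calculation. The sketch below assumes that the sound speed is given by the square root of kB Tn∕m and that the effective mass is 0.26 free-electron masses; both are assumptions on our part, chosen because they yield values close to the figures quoted from [36].

```python
import math

K_B = 1.380649e-23       # J/K, Boltzmann constant
M0 = 9.1093837015e-31    # kg, free-electron mass
M_EFF = 0.26 * M0        # assumed effective mass (not stated in the text)

def sound_speed_cm_per_s(t_lattice, xi=1.0, mass=M_EFF):
    """Assumed sound speed sqrt(k_B * T_n / m) with T_n = xi * T_L, in cm/s."""
    t_n = xi * t_lattice
    return 100.0 * math.sqrt(K_B * t_n / mass)

v_300 = sound_speed_cm_per_s(300.0)
v_77 = sound_speed_cm_per_s(77.0)
print(f"T_L = 300 K: {v_300:.2e} cm/s")
print(f"T_L =  77 K: {v_77:.2e} cm/s")

# Heating factor xi for which the sound speed reaches 5e7 cm/s at T_L = 300 K.
xi_needed = (5.0e7 / v_300) ** 2
print(f"xi for 5e7 cm/s: {xi_needed:.1f}")
```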

If the flow is supersonic, electron shock waves can occur inside the semiconductor device [36]. Such shock waves arise typically either at low temperatures or at a small length scale. In general, special numerical methods have to be applied, because the system of equations is hyperbolic [36,37,38,39]. For this type of system, the traditionally used Scharfetter-Gummel scheme [7] and its extension to the energy transport models [40,41,42,43,44,45] cannot be applied. A possible approximation consists in treating the convective term as a perturbation by freezing the dependence on the state variables at each step and using the values from the previous recursion [46]. However, in those cases where the temporal or spatial variation is important, this approach degrades the convergence [47]. In order to derive a numerically feasible spatial discretization, methods from fluid dynamics known as upwinding have been applied [47, 48]. In addition, the handling of the boundary conditions is more involved [39, 49].

These numerical properties make the hydrodynamic model challenging to solve, and the energy transport models discussed in the next section are virtually always preferred for practical device modeling.

5 Energy Transport Models

The hydrodynamic transport model introduced in Sect. 37.4.5 includes convective terms analogous to the differential equations in fluid dynamics. With these convective terms, we get a system of partial differential equations of hyperbolic type, which is difficult to solve with numerical methods. Therefore, in order to get rid of these inconvenient terms, the diffusion approximation is introduced. The resulting systems are of parabolic type and differ in their numerical properties from the original hyperbolic problem.

In the literature, the terms hydrodynamic transport model and energy transport model are often used arbitrarily or synonymously. For the sake of consistency, the commonly employed three- or four-moment models incorporating the diffusion approximation are better referred to as energy transport models [5]. Transport models based on the diffusion approximation are termed diffusive models, the prototypical example being the drift-diffusion model.

Assuming the validity of the diffusion approximation, we are now going to discuss two prototypical schemes to derive energy transport models from the Boltzmann equation which have been proposed by Stratton [8] and Bløtekjær [50], respectively. They use different variants of the method of moments and employ different strategies to close the system of equations.

5.1 The Diffusion Approximation

A heuristic way to motivate the diffusion approximation consists in analyzing the assumptions necessary for the derivation of the drift-diffusion model from the Boltzmann equation. We start with the first two equations of the hydrodynamic transport model:

$$ \begin{array}{lll} {\boldsymbol{\nabla}} \cdot \mathbf{J} & =&\displaystyle {q} {\partial_{\mathrm{t}}}n\end{array} $$
(37.115)

This model assumes a parabolic band and an isotropic thermal energy tensor.

In order to derive the drift-diffusion model ()– from the hydrodynamic moment equations (37.115)– for Φ0 and Φ1, we need two crucial assumptions which form the core of the diffusion approximation. Both assumptions can mathematically be justified in the framework of diffusion scaling [13, p.52].

First, we assume that the equation for the odd moment J is quasi-stationary. As signal frequencies are well below \(1/(2\pi\tau_{\mathrm{m}}) \approx 10^{12}\) Hz, the time derivative in the current relation is neglected [51].
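The frequency estimate above can be connected to the mobility definition (37.106). The sketch below converts the quoted frequency of about 10^12 Hz into a momentum relaxation time and then into a mobility; the effective mass of 0.26 free-electron masses is an assumed, silicon-like value that does not appear in the text.

```python
import math

Q = 1.602176634e-19      # C, elementary charge
M0 = 9.1093837015e-31    # kg, free-electron mass
M_EFF = 0.26 * M0        # assumed effective mass (illustrative)

def tau_from_cutoff(frequency_hz):
    """Momentum relaxation time tau_m from the frequency 1 / (2 pi tau_m)."""
    return 1.0 / (2.0 * math.pi * frequency_hz)

def mobility(tau_m, mass):
    """Mobility mu = q tau_m / m, definition (37.106), in m^2/(V s)."""
    return Q * tau_m / mass

tau_m = tau_from_cutoff(1e12)
mu = mobility(tau_m, M_EFF)
print(f"tau_m = {tau_m:.3e} s, mu = {mu * 1e4:.0f} cm^2/(V s)")
```

Under these assumptions the relaxation time is a fraction of a picosecond, which is why the quasi-stationary treatment of the flux equation is adequate for signal frequencies far below the terahertz range.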

Second, the contribution of the drift kinetic energy to the carrier energy is frequently neglected:

This simplification is valid if the thermal energy is much larger than the convective energy (see Sect. 37.6.1 for a discussion of validity). Mathematically, we approximate the raw second-order moment by the central moment. The last assumption has an analogue for the energy tensor. We split the energy tensor according to equation (37.95):

$$\displaystyle \begin{aligned} {\hat{\boldsymbol{U}}} = {\hat{\boldsymbol{T}}} + \frac{m}{2} \bar{\mathbf{v}} {\otimes} \bar{\mathbf{v}} \end{aligned} $$
(37.118)

and drop the convective tensor on the right-hand side, which gives

With this approximation, we can neglect the convective term:

$$\displaystyle \begin{aligned} \frac{\tau_{\mathrm{m}}}{{\mathrm{q}}} {{{{{{\boldsymbol{\nabla}}}}}}} \cdot \Big( \mathbf{J} {\otimes} \frac{\mathbf{J}}{n} \Big) \ \end{aligned} $$
(37.120)

in the current relation, and a parabolic equation system is obtained which only covers the subsonic flow regions. This is a very common approximation employed in virtually all device simulators, and we are not aware of any numerically robust scheme which can solve the HD model for realistic and relevant semiconductor devices.

With these simplifications, we still have the continuity equation () for Φ0, and the equation for Φ1 turns into

This still differs from the drift-diffusion current relation:

with respect to the term inside the gradient operator. Only if we assume that the carrier temperature Tn is in equilibrium with the lattice temperature TL and that TL is spatially homogeneous are the two current relations equivalent.

The derivation of the drift-diffusion model from the Boltzmann equation using the method of moments unmasks the assumptions behind the drift-diffusion model. The assumptions made above can be justified from a mathematical point of view in the sense that they follow consistently from appropriately scaling the BTE and taking the limit α → 0 of vanishing Knudsen number. This is discussed below.

5.2 Diffusion Scaling

The Knudsen number α appears as scaling parameter in the Boltzmann equation. It represents the mean free path τ0v0 relative to the device dimension [52]:

$$\displaystyle \begin{aligned} \alpha = \frac{\tau_0 v_0}{x_0} \end{aligned} $$
(37.123)

τ0 is the characteristic time between scattering events, v0 denotes the velocity scale, and x0 is given by the size of the simulation domain.

Macroscopic models assume that scattering is dominant in the sense that the mean free path is much smaller than the device dimension [13]. This means that a particle will undergo many collisions along its way through the device. Carriers in a semiconductor at room temperature can be considered a collision-dominated system, for which α ≪ 1.

The choice of the scaling – hydrodynamic or diffusive – depends on the equilibrium states associated with the collision operator [13, p.47]. Hydrodynamic scaling assumes the characteristic timescale of the system to be

$$\displaystyle \begin{aligned} t_0 = \frac{\tau_0}{\alpha} . \end{aligned} $$
(37.124)

This means that we consider a timescale much larger than the collision time. Diffusion scaling assumes the characteristic timescale of the system to be even larger:

$$\displaystyle \begin{aligned} t_0 = \frac{\tau_0}{\alpha^2} \end{aligned} $$
(37.125)

In diffusion scaling, the Boltzmann equation for electrons becomes (for parabolic bands)

$$\displaystyle \begin{aligned} \alpha^2 {{\partial_{\mathrm{t}}} f} + \alpha \Big( \frac{\mathbf{p}}{m} \cdot {{\boldsymbol{\nabla}_{\negmedspace \mathbf{r}}} f} - \, {q} \, {\boldsymbol{\mathrm{E}}} \cdot {{\boldsymbol{\nabla}_{\negmedspace \mathbf{p}}} f} \Big) = C[f]. \end{aligned} $$
(37.126)

The Boltzmann equation in diffusion scaling can be solved by a Hilbert expansion in the parameter α. This can be used to derive the drift-diffusion equations in a mathematically consistent way (see [13, p.106]). Using diffusion scaling (see [13, p.52]), one obtains that convective terms of the form 〈v〉⊗〈v〉 are negligible compared to 〈v⊗v〉. Furthermore, in the diffusion approximation, the drift kinetic energy \(\frac {1}{2} m \bar {v}^2\) is neglected against the thermal energy, and in the flux equation, the time derivative is ignored. The validity of these assumptions for semiconductor devices is further discussed in Sect. 37.6.1.
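The scaling relations (37.123)–(37.125) are easy to evaluate for concrete numbers. In the sketch below, the scattering time, the velocity scale, and the device length are assumed order-of-magnitude values chosen for illustration; they are not taken from the text.

```python
def knudsen_number(tau0, v0, x0):
    """alpha = tau0 * v0 / x0, definition (37.123)."""
    return tau0 * v0 / x0

def time_scales(tau0, alpha):
    """Characteristic times for hydrodynamic (37.124) and diffusion (37.125) scaling."""
    return tau0 / alpha, tau0 / alpha**2

TAU0 = 0.1e-12   # s, assumed time between scattering events
V0 = 1.0e5       # m/s, assumed velocity scale (1e7 cm/s)
X0 = 1.0e-6      # m, assumed device dimension (1 micrometer)

alpha = knudsen_number(TAU0, V0, X0)
t_hydro, t_diff = time_scales(TAU0, alpha)
print(f"alpha = {alpha:.3g}")
print(f"hydrodynamic time scale = {t_hydro:.3g} s")
print(f"diffusive time scale    = {t_diff:.3g} s")
```

For these assumed values the mean free path is one percent of the device length, so the diffusive time scale exceeds the scattering time by four orders of magnitude. Shrinking the device length increases α and weakens the justification for the diffusion approximation.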

5.3 Stratton’s Approach

Historically, Stratton [8] performed one of the first derivations of an extended system of transport equations. This approach employs a parity decomposition, where the distribution function is split into even and odd parts:

$$\displaystyle \begin{aligned} f(\mathbf{r},\mathbf{k}) = f^{\mathrm{e}}(\mathbf{r},\mathbf{k}) + f^{\mathrm{o}}(\mathbf{r},\mathbf{k}) \end{aligned} $$
(37.127)

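Because of the antisymmetry \(f^{\mathrm {o}}(\mathbf {r},-\mathbf {k}) = -f^{\mathrm {o}}(\mathbf {r},\mathbf {k})\), it follows that \(\langle f^{\mathrm {o}} \rangle = 0\). Assuming a single microscopic relaxation time approximation for the collision operator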

$$\displaystyle \begin{aligned} \hspace{-3pc}C[f] = - \frac{f - f_{\mathrm{eq}}}{\tau({\mathcal{E}},\mathbf{r})}{} \end{aligned} $$
(37.128)

the Boltzmann equation splits into two coupled equations, which allow the odd part \(f^{\mathrm {o}}\) to be expressed as a function of \(f^{\mathrm {e}}\) via

$$\displaystyle \begin{aligned} f^{\mathrm{o}} = -\tau({\mathcal{E}},\mathbf{r}) \Big( \mathbf{v} \cdot {\boldsymbol{\nabla}_{\negmedspace \mathbf{r}}} f^{\mathrm{e}} - \frac{{\mathrm{q}}}{\hbar} \mathbf{E} \cdot {\boldsymbol{\nabla}_{\negmedspace \mathbf{k}}}{f^{\mathrm{e}}} \Big) {} . \end{aligned} $$
(37.129)
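The step leading to (37.129) can be sketched as follows, assuming stationary conditions and an equilibrium distribution that is even in k (both assumptions are made here for illustration). Since v and the gradient with respect to k are odd in k, the streaming terms map even functions to odd ones and vice versa. Collecting the odd and the even contributions of the Boltzmann equation with the collision operator (37.128) separately gives

$$\displaystyle \begin{aligned} \mathbf{v} \cdot {\boldsymbol{\nabla}_{\negmedspace \mathbf{r}}} f^{\mathrm{e}} - \frac{{\mathrm{q}}}{\hbar} \mathbf{E} \cdot {\boldsymbol{\nabla}_{\negmedspace \mathbf{k}}} f^{\mathrm{e}} &= - \frac{f^{\mathrm{o}}}{\tau({\mathcal{E}},\mathbf{r})} , \\ \mathbf{v} \cdot {\boldsymbol{\nabla}_{\negmedspace \mathbf{r}}} f^{\mathrm{o}} - \frac{{\mathrm{q}}}{\hbar} \mathbf{E} \cdot {\boldsymbol{\nabla}_{\negmedspace \mathbf{k}}} f^{\mathrm{o}} &= - \frac{f^{\mathrm{e}} - f_{\mathrm{eq}}}{\tau({\mathcal{E}},\mathbf{r})} . \end{aligned} $$

Multiplying the first line by −τ reproduces (37.129).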

Stratton proposed a power law for the microscopic relaxation time. Assuming that \(f^{\mathrm {e}}\) is a heated Maxwellian distribution, the equation system below is obtained:

$$ \begin{array}{lll} {\boldsymbol{\nabla}} \cdot \mathbf{J} & =&\displaystyle {q} {\partial_t}n\end{array} $$
(37.131)

The inclusion of the mobility within the gradient operator in the current relation is incorrect if the doping is not homogeneous and the mobility is doping-dependent. In his original paper [8], Stratton discussed a Schottky-barrier junction where the doping was assumed to be uniform. This is probably why he included the mobility within the gradient operator [53].

The current relation can be rewritten in terms of the exponent

$$\displaystyle \begin{aligned} \nu_{\mathrm{n}} = \frac{T_{\mathrm{n}}}{\mu}\frac{\partial \mu}{\partial T_{\mathrm{n}}} = \frac{\partial \ln \mu}{\partial \ln T_{\mathrm{n}}} {} \end{aligned} $$
(37.136)

which has to be considered a fitting parameter with common values in the range [−1.0, −0.5], since no analytic expression for μ is known. Note that the thermal diffusion term vanishes for νn = −1.0.
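The role of the exponent νn can be seen by expanding the gradient term that contains the mobility. The following sketch assumes that the mobility varies in space only through the carrier temperature, μ = μ(Tn), which corresponds to the uniform doping discussed above:

$$\displaystyle \begin{aligned} {\boldsymbol{\nabla}} (n \mu T_{\mathrm{n}}) = \mu T_{\mathrm{n}} {\boldsymbol{\nabla}} n + n \Bigl( \mu + T_{\mathrm{n}} \frac{\partial \mu}{\partial T_{\mathrm{n}}} \Bigr) {\boldsymbol{\nabla}} T_{\mathrm{n}} = \mu \bigl( T_{\mathrm{n}} {\boldsymbol{\nabla}} n + n (1 + \nu_{\mathrm{n}}) {\boldsymbol{\nabla}} T_{\mathrm{n}} \bigr) . \end{aligned} $$

For νn = −1.0, the contribution proportional to the temperature gradient drops out, which is the vanishing of the thermal diffusion term noted above.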

A problem with the power-law formulation for τ is that pν has to be approximated in order to cover all scattering processes. In particular, in the presence of impurity scattering, the parameter pν is in the range [−1.5, 0.5], depending on charge screening [54]. Hence, the average value depends on the applied field and on the doping profile, and it is not possible to give a unique value for pν.

5.4 Bløtekjær’s Approach

Without requiring any assumptions on the form of the distribution function, Bløtekjær [9] derived conservation equations by taking the moments of the Boltzmann equation using the weight functions 1, ħk, and \({\mathcal {E}}\), which define the moments of zeroth, first, and second order. The exposition in Sects. 37.4.4 and 37.4.5 is in line with Bløtekjær’s approach. The resulting system of equations can be written as follows [34]:

$$ \begin{array}{lll} {\partial_{\mathrm{t}}}n+ {\boldsymbol{\nabla}} \cdot (n \bar{\mathbf{v}})& =&\displaystyle n C_{\mathrm{n}}\end{array} $$
(37.137)
$$ \begin{array}{lll} {\partial_{\mathrm{t}}}(n \bar{\mathbf{p}})+{\boldsymbol{\nabla}} \cdot (n {\hat{\boldsymbol{U}}}) - n \mathbf{F}& =&\displaystyle n {\mathbf{C}}_{\mathrm{m}}\end{array} $$
(37.138)
$$ \begin{array}{lll} {\partial_{\mathrm{t}}}(n w)+{\boldsymbol{\nabla}} \cdot \mathbf{S}- n \bar{\mathbf{v}} \cdot \mathbf{F}& =&\displaystyle n C_{\mathcal{E}}\end{array} $$
(37.139)

Under the assumption that the carrier mass is position-independent, these equations are valid for arbitrary band structures. In the case of a position-dependent mass, additional force terms have to be included in (37.137)–(37.139) [55].

Bløtekjær originally proposed the concept of macroscopic relaxation times introducing one relaxation time for each moment equation:

$$\displaystyle \begin{aligned} {\mathbf{C}}_{\mathrm{m}} &= - \frac{\bar{\mathbf{p}}}{\tau_{\mathrm{m}}} {} \end{aligned} $$
(37.140)
$$\displaystyle \begin{aligned} C_{\mathcal{E}} &= - \frac{w - w_0}{\tau_{\mathcal{E}}}{} \end{aligned} $$
(37.141)

which introduces the momentum and energy relaxation times τm and \(\tau _{\mathcal {E}}\), respectively. A discussion of this approximation is given in [56]. Recombination processes are neglected here; thus, Cn = 0.
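As a consistency check of the relaxation time ansatz, consider the homogeneous and stationary limit of the momentum balance (37.138) with (37.140). Assuming parabolic bands with \(\bar {\mathbf {p}} = m \bar {\mathbf {v}}\) and the force F = −qE acting on electrons (a sign convention assumed here), one finds

$$\displaystyle \begin{aligned} - n \mathbf{F} = - n \frac{\bar{\mathbf{p}}}{\tau_{\mathrm{m}}} \quad \Rightarrow \quad \bar{\mathbf{p}} = \tau_{\mathrm{m}} \mathbf{F} = - q \tau_{\mathrm{m}} \mathbf{E} \quad \Rightarrow \quad \bar{\mathbf{v}} = - \mu \mathbf{E} , \qquad \mu = \frac{q \tau_{\mathrm{m}}}{m} . \end{aligned} $$

In this limit, the momentum relaxation time directly determines the mobility.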

5.4.1 Three-Moment Energy Transport Model

Bløtekjær’s system of equations is not closed, because it contains more unknown moments than equations. In order to express the equations by the moments n, \(\bar {\mathbf {v}}\), and w (or Tn), closure relations have to be introduced.

Customarily, parabolic bands are assumed, and the closure relations (37.91), (37.94), and (37.95) for \(\bar {\mathbf {p}}\), \({\hat {\boldsymbol {U}}}\), and w can be used. In addition, Bløtekjær assumed an isotropic temperature tensor and the same phenomenological closure for the highest-order moment as in his hydrodynamic model, i.e., an energy flux based on Fourier’s law with a thermal conductivity κ. Bløtekjær also used the diffusion approximation as discussed above for the drift-diffusion equation (Sect. 37.5.1). The resulting simplified equations are

$$ \begin{array}{lll} {\boldsymbol{\nabla}} \cdot \mathbf{J} & =&\displaystyle {q} {\partial_{\mathrm{t}}}n\end{array} $$
(37.142)

Equations (37.142)–(37.145) form a typical three-moment model which has been closed using Fourier’s law and is commonly known in the literature as the energy transport model. Actually, this name is somewhat misleading, as the model consists of two conservation equations (carrier continuity and energy balance) and two transport equations (current and energy flux); the latter transport equation is empirical.

In the case of a homogeneous and stationary bulk silicon system, the energy balance equation reduces to

$$\displaystyle \begin{aligned} {n}\frac{w-w_0}{{\tau_{{\mathcal{E}}}}}= {\boldsymbol{E}} \cdot \mathbf{J}. \end{aligned} $$
(37.146)

From the homogeneous energy balance equation (37.146), one obtains a relation between the carrier temperature and the electric field [10]. The estimated temperature Tn is roughly proportional to the square of the electric field.
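The homogeneous balance (37.146) can be evaluated with a few lines of code. The sketch below is only an illustration under assumptions that are not stated at this point of the text: parabolic bands with w = (3∕2)kBTn and w0 = (3∕2)kBTL, a pure drift current J = qnμE, and a field-independent mobility. The function name and the parameter values are hypothetical.

```python
Q = 1.602176634e-19   # elementary charge in C
K_B = 1.380649e-23    # Boltzmann constant in J/K


def carrier_temperature(e_field, mobility, tau_e, t_lattice=300.0):
    """Carrier temperature in K from the homogeneous energy balance (37.146).

    Assumptions: w = 3/2 k_B T_n, w0 = 3/2 k_B T_L, J = q n mu E,
    and a mobility that does not depend on the field.
    SI units: e_field in V/m, mobility in m^2/Vs, tau_e in s.
    """
    return t_lattice + 2.0 * Q * tau_e * mobility * e_field**2 / (3.0 * K_B)


# Illustrative values: mu = 0.1 m^2/Vs, tau_E = 0.35 ps, E = 10 kV/cm = 1e6 V/m
t_n = carrier_temperature(e_field=1.0e6, mobility=0.1, tau_e=0.35e-12)
print(f"T_n = {t_n:.0f} K")
```

With a constant mobility, the temperature rise above TL is exactly quadratic in the field; a mobility that decreases with the carrier temperature weakens this dependence at high fields.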

5.4.2 Four-Moment Energy Transport Model

To overcome the difficulties associated with the Fourier law closure, which is empirical and leaves κ undefined, an additional moment equation of the BTE can be taken into account [34]. From the moment equation for the energy flux, we have

$$\displaystyle \begin{aligned} {{\partial_{\mathrm{t}}} \mathbf{S}} + \frac{1}{m} {\boldsymbol{\nabla} \cdot n{\hat{\boldsymbol{R}}}} + {q} \mathbf{E} \Bigl(\frac{nw}{m} + 2 \frac{n{\hat{\boldsymbol{U}}}}{m} \Bigr) = - \frac{\mathbf{S}}{\tau_{\mathrm{S}}}. \end{aligned} $$
(37.151)

In this equation, the time derivative can be ignored using the same scaling argument that led to a neglect of the time derivative in (37.138). The energy flux mobility is defined in analogy to the mobility as

$$\displaystyle \begin{aligned} \mu_{\mathrm{S}} = \frac{q \tau_{\mathrm{S}}}{m}. \end{aligned} $$
(37.152)

This gives

$$\displaystyle \begin{aligned} \mathbf{S} = -\frac{\mu_{\mathrm{S}}}{q} \Bigl( {\boldsymbol{\nabla} \cdot (n{\hat{\boldsymbol{R}}})} - q n \mathbf{E} (w + 2 {\hat{\boldsymbol{U}}}) \Bigr). \end{aligned} $$
(37.153)

A closure relation for \({\hat {\boldsymbol {R}}}=\frac {1}{n} {\langle \mathbf {v} {\otimes } \mathbf {p} \,{\mathcal {E}} \rangle }\) has to be introduced, which can be obtained, for example, by assuming a heated Maxwellian distribution. This gives the Maxwellian closure

$$\displaystyle \begin{aligned} \hat{\boldsymbol{R}} = \frac{5}{2} k_{\mathrm{B}}^2 T_{\mathrm{n}}^2 \, \hat{\boldsymbol{I}} . {} \end{aligned} $$
(37.154)
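The prefactor 5∕2 in (37.154) can be checked numerically. The following sketch samples a non-drifting heated Maxwellian for parabolic bands and estimates one diagonal component of \({\hat {\boldsymbol {R}}}\) in units of (kBTn)². The Monte Carlo sampling is used here purely as an illustration and is unrelated to the device Monte Carlo simulations referenced in the text; the function name, the sample size, and the seed are arbitrary choices.

```python
import random


def maxwellian_r_xx(samples=200_000, seed=1):
    """Estimate R_xx / (k_B T_n)^2 for a non-drifting Maxwellian.

    Velocities are sampled in units of sqrt(k_B T_n / m). With parabolic
    bands (p = m v, E = m v^2 / 2), the integrand v_x p_x E / (k_B T_n)^2
    becomes v_x^2 * (v_x^2 + v_y^2 + v_z^2) / 2.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        vx = rng.gauss(0.0, 1.0)
        vy = rng.gauss(0.0, 1.0)
        vz = rng.gauss(0.0, 1.0)
        total += vx * vx * (vx * vx + vy * vy + vz * vz) / 2.0
    return total / samples


estimate = maxwellian_r_xx()
print(estimate)  # close to 5/2, up to statistical noise
```

The exact value follows from the Gaussian moments 〈vx⁴〉 = 3 and 〈vx²vy²〉 = 1 in these units, which give (3 + 1 + 1)∕2 = 5∕2.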

Together with the scalar isotropic diffusion approximation, one obtains the closure for a four-moment model.

5.4.3 Discussion of Bløtekjær’s Closure

The four-moment closure should be compared with Bløtekjær’s phenomenological closure. For that purpose, one first rewrites

$$\displaystyle \begin{aligned} {\boldsymbol{\nabla}} (n T_{\mathrm{n}}^2) = T_{\mathrm{n}} \bigl( {\boldsymbol{\nabla}} (n T_{\mathrm{n}}) + n {\boldsymbol{\nabla}} T_{\mathrm{n}} \bigr) . \end{aligned} $$
(37.159)

Using (37.159), one gets an expression for the closure of the four-moment model in a form that allows for a direct comparison with Bløtekjær’s phenomenological closure. It follows that Bløtekjær’s closure and the four-moment closure become identical if μS∕μ = 1 and pγ = 0 in the three-moment model. However, Monte Carlo simulations show that the ratio μS∕μ strongly depends on the average energy and shows a pronounced hysteresis when plotted over the average energy, as shown in Fig. 37.4a. Similar observations can be made for the electron mobility [34], which is shown in Fig. 37.4b. In essence, this implies that neither μ nor μS can be accurately expressed as a function of w or Tn.

Fig. 37.4

(a) Ratio of μS and μ as a function of the carrier temperature inside the n+nn+ test structures obtained from Monte Carlo simulations. (b) Electron mobility inside n+nn+ test structures obtained from Monte Carlo simulations

5.5 Relaxation Times and Carrier Mobilities

Due to the scattering operator in Boltzmann’s equation, the relaxation times and mobilities depend on the distribution function. Since the distribution function is not uniquely determined by the average energy, models only based on the average energy are bound to fail. Furthermore, the band structure can play a dominant role. Nevertheless, all models should be able to correctly reproduce the homogeneous limit for which there is a unique relationship between the electric field and the carrier temperature. Hence, the relaxation times and the mobilities are often derived from homogeneous field measurements or Monte Carlo simulations. In the following, the most commonly used analytical transport parameter models for silicon are discussed. Table-based parameter modeling is discussed later in Sect. 37.7.3.

5.5.1 Mobility

Two models for the energy dependence of the mobility are frequently used, the model after Baccarani and Wordeman [57]:

$$\displaystyle \begin{aligned} \frac{\mu(T_{\mathrm{n}})}{\mu_0} = \frac{T_{\mathrm{L}}}{T_{\mathrm{n}}} {} \end{aligned} $$
(37.161)

and the one-dimensional model going back to Hänsch [58]. Here, μ0 is the zero-field mobility.

Under homogeneous conditions, the energy flux is proportional to the particle current, which can be used to simplify the full Hänsch model.

As has been shown in [34, 59], the simplified Hänsch model reproduces the mobility quite well in the regions with increasing E. However, for decreasing E, the full Hänsch model should be used [34, 60], which, unfortunately, is numerically very difficult due to the vector-valued fluxes J and S, particularly in two or three dimensions.

In Fig. 37.5, the three models are compared with parameters extracted from Monte Carlo simulation data. All models have been evaluated using the data from the Monte Carlo simulation for an n+nn+ test structure with Lc = 200 nm, once with Emax = 100 kV∕cm and once with Emax = 300 kV∕cm. For electric fields smaller than 100 kV∕cm, equation () gives reasonable results but breaks down for larger electric fields. The full Hänsch model is the only model that captures the hysteresis properly and thus, as shown in Fig. 37.5a, the mobility at the beginning of the drain region. The hysteresis in the other models stems from the doping dependence of μ0.

Fig. 37.5

Comparison of the full Hänsch model (a), simplified Hänsch model (b), and Baccarani mobility model (c). The three different mobility models (solid lines) are plotted in contrast to Monte Carlo simulation data (symbols). Only the full Hänsch model covers the hysteresis appropriately, at least for lower fields

5.5.2 Energy Relaxation Time

The simplest and most commonly used model for the energy relaxation time \(\tau _{\mathcal {E}}\) is a constant value. Typically used values for electrons in silicon at room temperature are in the range [0.3, 0.4] ps, although values in the range [0.08, 0.68] ps have been used [61]. A constant value is justified in view of the fact that Monte Carlo simulations show only a small hysteresis and a small energy dependence [62]. However, different energy dependencies have been published. The differences seem to originate from the different band structures employed in the various Monte Carlo simulation codes.

Based on theoretical considerations, Baccarani and Wordeman [57] proposed an expression for \(\tau _{\mathcal {E}}\) which should be used together with (37.161) to correctly reproduce the homogeneous limit. Within Hänsch’s approach, \(\tau _{\mathcal {E}}\) is only required to be independent of the carrier temperature for the model to correctly predict the homogeneous limit. For a suitable parameter choice, the Hänsch mobility models are equivalent to Baccarani’s model in the homogeneous case.

5.6 Bløtekjær Versus Stratton Approach

Stratton’s closure is quite different from the four-moment closure, and it is not clear if both models can be made consistent with each other by a special choice of parameters. A thoroughly discussed difference between Bløtekjær’s approach and Stratton’s approach is the position of the mobility relative to the gradient in the current relation. Stratton himself has addressed this issue in [63], and it has also been discussed by Landsberg [64, 65]. Further comparisons of the two formulations are given in [60, 66,67,68,69]. In both formulations, the parameters μ1 and μ2 are called mobilities, but these parameters actually differ substantially in their definition. The formulations are compared in [60], and both approaches can be justified if the respective mobilities are modeled properly. The mobilities are equal in the case of bulk simulations and are suitably expressed using energy-dependent models [20, 58]. However, in the nonhomogeneous case where the electric field changes abruptly, the mobilities are no longer single-valued functions of the temperature.

The formulation based on μ1 has the advantage that it can be approximated by its bulk value for increasing values of the electric field, whereas the value of μ2 is always different from the bulk case. Hence, μ1 is more useful for numerical simulation as it can be approximately modeled as a function of the temperature. Using an empirical ansatz for Cm, namely

$$\displaystyle \begin{aligned} {\mathbf{C}}_{\mathrm{m}} = {\mathbf{C}}^*_{\mathrm{m}} + \lambda_{\mathrm{p}} \, {\boldsymbol{\nabla}} \cdot {\hat{\boldsymbol{U}}} \end{aligned} $$
(37.167)

where λp is a dimensionless transport coefficient and \({\mathbf {C}}^*_{\mathrm {m}}\) is the homogeneous component, Tang et al. [62] demonstrated that the formulation of Stratton is obtained from Bløtekjær’s with λp = ηn, where

$$\displaystyle \begin{aligned} \eta_{\mathrm{n}} = - \frac{T_{\mathrm{n}}}{\mu^*}\frac{\partial \mu^*}{\partial T_{\mathrm{n}}} {} . \end{aligned} $$
(37.168)

In this expression, μ∗ = μ∗(Tn) is the bulk mobility, which is representable as a function of the temperature. The variable ηn, as given by (37.168), is analogous to −νn, given by (37.136) in the Stratton model. However, by comparing Monte Carlo simulation results, it has been demonstrated that λp ≠ ηn in the inhomogeneous case [70].

6 Limits of Validity for Drift-Diffusion and Energy Transport Models

The range of validity for macroscopic models such as drift-diffusion and energy transport models has been extensively examined [71, 72]. When the critical dimensions of devices shrink below a certain value (around 200 nm for silicon at room temperature), Monte Carlo simulations reveal strong off-equilibrium transport effects such as velocity overshoot and nonlocality of important model parameters.

In this section, we start with a discussion of the simplifying assumptions used in the previous sections to derive numerically feasible transport models from the Boltzmann equation.

Then we study energy distribution functions for one-dimensional n+nn+ test structures. Energy transport models use up to four moments of the distribution function, while the highest even moment included is the temperature. Based on Monte Carlo simulation results, it is demonstrated that below 100 nm the energy distribution function strongly deviates from a Maxwellian shape in test structures.

Finally, we compare results from a drift-diffusion model and an energy transport model with Monte Carlo simulation data. In contrast to the drift-diffusion model, higher-order models such as the energy transport model can describe nonlocal effects such as the velocity overshoot as they can make use of the information available in the additional moments.

6.1 Critical Issues

The moment-based hydrodynamic and energy transport models given in the previous sections are based on various simplifying assumptions and employ several approximations of different severity. These approximations are crucial to derive numerically feasible models, and they will be discussed in the following.

6.1.1 Non-parabolic Band Structure

Many of the employed models are based on the single effective parabolic band model. Due to its simplicity, a closed-form solution for the single effective parabolic band model exists, while, even for the relatively simple non-parabolicity correction model by Kane [30], it is not possible to obtain a closed-form representation of the moment equations. Non-analytic table-based correction terms describing non-parabolicity can be added to macroscopic transport models [73, 74].

6.1.2 Tensor Quantities and Anisotropy

So far we have barely discussed the assumption of isotropic symmetry used in the approximation of tensors by scalar quantities like the carrier mass and temperature. This type of simplification is used in both hydrodynamic and energy transport models.

One-dimensional simulations carried out in [34] demonstrate that the transverse temperature component Tyy is smaller than the longitudinal temperature component Txx. This is an indicator that there is an elongation of the distribution function in the direction of the electric field and hence the assumed isotropic symmetry of the energy tensor is not valid (Fig. 37.6). The rigorous approach taken in [75] models all components of the temperature tensor. For ballistic diodes and bipolar transistors, no substantial difference between the scalar temperature and the trace of the temperature tensor is found. In contrast, a difference of up to 15% in the linear region of the transfer characteristics is observed for aggressively scaled MOSFETs.

Fig. 37.6

Main components of the temperature tensor \({\hat { \boldsymbol {T}}}\) for two different n+nn+ test structures

6.1.3 Drift Energy Versus Thermal Energy

Another common approximation [76] is to neglect the drift kinetic energy in the average carrier energy. However, as has been pointed out by Baccarani and Wordeman [57], the convective energy can reach values comparable to the thermal energy. A plot of the ratio

$$\displaystyle \begin{aligned} \frac{ \frac{1}{2} m \bar{v}^2}{{\langle {\mathcal{E}} \rangle}} \end{aligned} $$
(37.170)

inside n+nn+ test structures is given in Fig. 37.7a. As can be seen, the error introduced by neglecting the kinetic energy can reach 30% at the beginning of the channel where the carrier temperature is still low and a velocity overshoot is observed. This effect has been studied in [57, 77].
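The order of magnitude of the ratio (37.170) can be estimated with a short calculation. The sketch below assumes parabolic bands, so that the average energy is the sum of the drift part and the thermal part (3∕2)kBTn. The effective mass of 0.26 m0 and the velocity and temperature values are illustrative assumptions and are not taken from the simulations shown in Fig. 37.7.

```python
M0 = 9.1093837015e-31  # electron rest mass in kg
K_B = 1.380649e-23     # Boltzmann constant in J/K


def drift_energy_ratio(velocity, t_carrier, m_eff=0.26 * M0):
    """Ratio (37.170) of drift kinetic energy to average carrier energy.

    Assumption: <E> = 1/2 m v^2 + 3/2 k_B T_n (parabolic bands).
    SI units: velocity in m/s, t_carrier in K, m_eff in kg.
    """
    drift = 0.5 * m_eff * velocity**2
    thermal = 1.5 * K_B * t_carrier
    return drift / (drift + thermal)


# Illustrative: overshoot velocity of 1.5e7 cm/s = 1.5e5 m/s, carriers still near 300 K
ratio = drift_energy_ratio(velocity=1.5e5, t_carrier=300.0)
print(f"drift share of the average energy: {ratio:.2f}")
```

For these assumed values, the drift part amounts to roughly 30% of the average energy, which is consistent in magnitude with the error quoted above for the beginning of the channel.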

Fig. 37.7

The ratio of \(m \bar {v}^2/2\) and \({\langle {\mathcal {E}} \rangle }\) and the ratio of the gradients for different n+nn+ test structures

When looking at the moment equations, it can be seen that not only the absolute value of the convective energy but also its gradient is important. A plot of the ratio

$$\displaystyle \begin{aligned} \frac{ {\boldsymbol{\nabla}} \frac{1}{2} m \bar{v}^2}{ {\boldsymbol{\nabla}} {\langle {\mathcal{E}} \rangle}} \end{aligned} $$
(37.171)

based on one-dimensional data is given in Fig. 37.7b which indicates that this term becomes increasingly important for nanoscale devices. In particular, pronounced spikes are observed at both junctions.

6.1.4 Highest-Order Moment Closure

The equation of highest order contains the moment of the next order which has to be suitably approximated using available information, typically the lower-order moments. As the lower-order equations remain unchanged by this choice, it is worth noting that the whole information about the remaining higher-order equations has to be packed into this closure.

One approach to derive a suitable closure relation is to assume a distribution function and calculate the fourth-order moment. Typically, a heated Maxwellian distribution is assumed [10]. This assumption is problematic, because it only uses partial, and in general insufficient, information about the distribution function.

6.1.5 Mobilities and Relaxation Times

Particularly in the diffusion-dominant regime, the scattering operator is dominant in the Boltzmann equation and must be modeled with sufficient care [13]. The physical mechanisms behind scattering are typically described using Fermi’s golden rule. Using macroscopic transport models, scattering is approximated by transport parameters such as mobilities and relaxation times. From the Boltzmann equation, it is clear that these parameters depend on the full distribution function through the collision operator. Since the distribution function is not uniquely described by the average energy, models based solely on the average energy have limited validity particularly at the drain side of MOSFETs. However, from a practical perspective, this is the only viable option found so far. In order to increase their accuracy, table-based models depending on doping, temperature, and carrier energy can be extracted from bulk Monte Carlo simulations, which is discussed in Sect. 37.7.3.

6.1.6 Complexity of Models

Modeling of carrier transport is far more complex than can be described in this work. Accurate, but complex models for a lot of physical effects from different domains are available. To give just one example, we have hardly discussed the intricacies of carrier generation and recombination. But practical engineering approaches will often rely on fitted analytical models. Models must also be numerically feasible, and one has to compromise between numerical costs and physical sophistication. Handling the trade-off between model accuracy and numerical costs is the craft of technology computer-aided design.

6.2 Non-Maxwellian Energy Distribution

Test structures with n+nn+ topology exhibit features similar to those of a MOSFET or a bipolar transistor, e.g., a distinctive velocity overshoot and a mixture of a hot and a cold distribution in the drain region. Given their simplicity, they have been very popular for studying the basic behavior of macroscopic transport models for very small devices without the additional levels of complexity introduced by two-dimensional MOS devices [73, 78].

6.2.1 Qualitative Analysis

Monte Carlo simulations indicate that the shape of the distribution function inside an n+nn+ test structure behaves qualitatively as shown in Fig. 37.8.

Fig. 37.8

Schematic evolution of the distribution function inside an n+nn+ test structure

For cold carriers injected at the contacts, the Maxwellian shape provides a good description. In Region I, carriers diffuse against the built-in energy barrier. While moving along Region I, the number of low-energy carriers in the distribution function decreases due to reflection at the energy barrier. Therefore, contrary to what the Maxwellian approximation predicts, only the low-energy range of the energy distribution is filtered out, whereas nearly no changes in the high-energy tail are observed.

It is interesting to note that inside the channel the slope in the low-energy range is almost the same as at the end of the channel. However, the knee energy E1, which is the energy of the kink between different slopes of the distribution function, changes. It shifts toward higher energies as the carriers travel through the channel in Region II.

In Region III, the small number of hot carriers from the channel meets the large pool of cold carriers in the drain, which is visible in the distribution function by a rapid increase of the low-energy part. Region III is very small, and there is little change in the high-energy tail. As the hot carriers travel through Region IV, the temperature of the high-energy tail relaxes to the equilibrium temperature.

It will be argued in Sect. 37.7.2 that the kurtosis β of the distribution function, which is defined as the fourth-order standardized moment, provides enough information to distinguish between different regions of the test structure. This provides the basis for the development of more accurate models.
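Why a fourth-order quantity helps can be illustrated with a mixture of a cold and a hot Maxwellian population. The sketch below assumes parabolic bands and the normalization β = (3∕5)〈ℰ²〉∕〈ℰ〉², chosen here so that β = 1 for a single Maxwellian; the definition actually used in Sect. 37.7.2 may differ in detail. The function name and the mixture parameters are hypothetical.

```python
def kurtosis_mixture(cold_fraction, t_cold, t_hot):
    """Normalized kurtosis beta = (3/5) <E^2> / <E>^2 of a two-Maxwellian mixture.

    Assumptions: parabolic bands, so a Maxwellian at temperature T has
    <E> = 3/2 k_B T and <E^2> = 15/4 (k_B T)^2. The constant factors and
    k_B cancel, leaving a ratio of temperature averages.
    """
    c = cold_fraction
    mean_t = c * t_cold + (1.0 - c) * t_hot
    mean_t2 = c * t_cold**2 + (1.0 - c) * t_hot**2
    return mean_t2 / mean_t**2


beta_single = kurtosis_mixture(1.0, 300.0, 3000.0)   # cold population only
beta_mixed = kurtosis_mixture(0.9, 300.0, 3000.0)    # 10% hot carriers
print(beta_single, beta_mixed)
```

Any admixture of a second population raises β above one, whereas the average energy alone cannot distinguish the mixture from a single heated Maxwellian.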

6.2.2 Shape of the Distribution Function

The transport models discussed so far only contain moments up to, and including at most, the third order. In general, the distribution function is insufficiently described by these moments. The only even moments available are the zeroth- and the second-order moment which correspond to particle density and average energy. These moments are the parameters of a heated Maxwell distribution function which often gives only a poor approximation to the actual energy distribution.

This is demonstrated for an n+nn+ test structure with Lc = 200 nm in Fig. 37.9. Here, the energy distribution function, which is obtained from Monte Carlo simulations, is shown for four characteristic points inside the structure. For points A and C, which have the same average energy, the distribution function is completely different. In particular, the mixture of a cold and a hot-carrier population leads to an inadequate description via the average carrier energy. In such cases, the extension of the energy transport model to a six-moment transport model (see Sect. 37.7.2) increases the accuracy significantly.
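The consequence for the high-energy tail can be made quantitative with a simple estimate. The following sketch compares the fraction of carriers above a threshold energy for a hypothetical mixture of a cold and a hot Maxwellian with that of a single Maxwellian of the same average energy. Parabolic bands are assumed, and the mixture parameters and the threshold of 1 eV are illustrative choices rather than values from Fig. 37.9.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def tail_fraction(energy_ev, temperature):
    """Fraction of carriers with energy above energy_ev for a Maxwellian.

    Assumes parabolic bands (density of states proportional to sqrt(E)).
    """
    x = energy_ev / (K_B_EV * temperature)
    return math.erfc(math.sqrt(x)) + 2.0 * math.sqrt(x / math.pi) * math.exp(-x)


cold_fraction, t_cold, t_hot = 0.9, 300.0, 3000.0    # hypothetical mixture
t_avg = cold_fraction * t_cold + (1.0 - cold_fraction) * t_hot  # same average energy

threshold = 1.0  # eV
single = tail_fraction(threshold, t_avg)
mixture = (cold_fraction * tail_fraction(threshold, t_cold)
           + (1.0 - cold_fraction) * tail_fraction(threshold, t_hot))
print(f"single Maxwellian: {single:.3e}, mixture: {mixture:.3e}")
```

In this example, the single Maxwellian underestimates the number of carriers above the threshold by several orders of magnitude, although both distributions have the same average energy.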

Fig. 37.9

Electron temperature (a) and distribution functions (b) at four characteristic points inside an n+nn+ test structure with Lc = 200 nm. The average energies at points A and C are the same, whereas the distribution function looks completely different. Note the high-energy tail at point D where the carrier temperature is 370 K

A heated Maxwellian distribution captures neither the basic shape of the energy distribution function nor the change in the shape when the end of the channel is reached. This is shown in Fig. 37.10, where the failure of the Maxwellian approximation is obvious. The error is largest for high-energy carriers, where the tail of the distribution function is either severely overestimated or severely underestimated.

Fig. 37.10

The heated Maxwellian distribution approximation at the four characteristic points of Fig. 37.9. (a) Inside the channel, the number of hot electrons is dramatically overestimated because the Maxwell distribution cannot reproduce the thermal tail. (b) Inside the drain region, a cold and a hot population are visible, which cannot be resolved with a single Maxwell distribution

The importance of the high-energy tail increases with decreasing channel length. This issue becomes crucial for modeling short-channel and hot-carrier effects, e.g., for gate tunneling or hot-carrier degradation, which will be discussed in Sect. 37.8.3.

6.3 Numerical Evaluation

In order to check the validity of the drift-diffusion and energy transport models, we compare simulations of MOS transistors as a function of the channel length.

6.3.1 Modeling of Velocity Overshoot

The carrier velocity in most devices operating near room temperature and under modest bias conditions is limited by scattering. For constant or slowly varying electric fields, carriers cannot exceed a certain saturation velocity vsat = 10⁷ cm∕s. However, as demonstrated in Fig. 37.11, for short-channel devices, the situation is different. As the channel length decreases, the gradient of the electric field inside the device increases. Thus, the carriers will be accelerated without colliding with the lattice (Tn = TL) for at least a few picoseconds. Therefore, the random component of the carrier velocity induced by scattering events is small, which leads to a maximum drift velocity in the range of 10⁷ cm∕s to 10⁸ cm∕s [79]. This is known as the velocity overshoot.

Fig. 37.11

Velocity profiles of a 50 nm, 100 nm, and 200 nm long n+nn+ structure calculated with the Monte Carlo method are presented after [10]. The velocity overshoot at the beginning of the lowly doped n-region is clearly visible

In the energy transport model, Tn is an independent solution variable, while in the drift-diffusion model, Tn is only estimated through a local relationship via the electric field. Neglecting diffusion, the velocity follows the mobility as \(\bar {v} = \mu E\). Commonly used analytical models for carrier mobilities in the energy transport and drift-diffusion models are given as

$$ \begin{array}{lll} \mathrm{Energy}\ \mathrm{transport}\ \mathrm{model}\mbox{:}\ \ \mu & =&\displaystyle \mu_0 \frac{T_{\mathrm{L}}}{T_{\mathrm{n}}}\notag\\ \mathrm{Drift-diffusion}\ \mathrm{model}\mbox{:}\ \ \mu & =&\displaystyle \mu_0 \frac{1}{1+\frac{E}{E_{\mathrm{crit}}}}.\notag\end{array} $$

Here Ecrit ≈ 10 kV∕cm is the critical field and μ0 is the zero-field mobility. With this simple model, the velocity in the drift-diffusion model saturates at vsat = μ0Ecrit.

The drift-diffusion mobility follows the field, while the energy transport mobility follows the temperature. Comparing models, we see that the drift-diffusion mobility model estimates the electron temperature via the electric field as

$$\displaystyle \begin{aligned} T_{\mathrm{n}} = T_{\mathrm{L}} (1 + E/E_{\mathrm{crit}}) . \end{aligned} $$
(37.172)
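The two consequences of the drift-diffusion mobility model, the bounded velocity and the local temperature estimate (37.172), can be tabulated directly. In the sketch below, μ0 = 1000 cm²∕Vs is an assumed value, chosen so that μ0Ecrit matches the saturation velocity of 10⁷ cm∕s for Ecrit = 10 kV∕cm; the function names are hypothetical.

```python
def dd_velocity(e_field, mu0=1000.0, e_crit=1.0e4):
    """Drift velocity in cm/s for mu = mu0 / (1 + E / E_crit).

    Units: e_field and e_crit in V/cm, mu0 in cm^2/Vs (assumed value).
    """
    return mu0 * e_field / (1.0 + e_field / e_crit)


def dd_temperature(e_field, t_lattice=300.0, e_crit=1.0e4):
    """Local temperature estimate (37.172) implied by the drift-diffusion mobility."""
    return t_lattice * (1.0 + e_field / e_crit)


for e in (1.0e3, 1.0e4, 1.0e5, 1.0e6):
    print(f"E = {e:.0e} V/cm: v = {dd_velocity(e):.3e} cm/s, "
          f"T_n = {dd_temperature(e):.0f} K")
```

The velocity approaches μ0Ecrit from below for large fields but never exceeds it, while the temperature estimate follows the local field without any delay.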

In reality, however, the response of the distribution f to the electric field is delayed, because carriers need time to pick up the energy from the field.

As a consequence, rapidly varying electric fields can lead to an overshoot in the carrier velocity. The reason for this velocity overshoot is that the mobility depends on the distribution function rather than on the electric field. As the mobility has not yet been reduced by the increasing temperature but the electric field is already high, an overshoot in the velocity \(\bar {v} = \mu \, E\) is observed until the carrier energy comes into equilibrium with the electric field again. One of the first works dealing with this effect is [80]. Such nonlocal effects cannot be properly modeled using the common drift-diffusion transport model.
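The delayed response can be mimicked by a deliberately simple toy model, which is not the energy transport model of the text. The sketch assumes that, after a sudden field step, the carrier temperature relaxes exponentially with the energy relaxation time toward the local estimate (37.172), that the mobility is μ0TL∕Tn, and that momentum relaxation is instantaneous. All parameter values and the function name are hypothetical.

```python
import math


def overshoot_velocity(t, e_low, e_high, mu0=1000.0, e_crit=1.0e4,
                       t_lattice=300.0, tau_e=0.35e-12):
    """Velocity in cm/s at time t (s) after a field step from e_low to e_high (V/cm).

    Toy assumptions: T_n relaxes exponentially with tau_e toward the local
    estimate (37.172); the mobility is mu0 * T_L / T_n; momentum relaxation
    is treated as instantaneous.
    """
    t_start = t_lattice * (1.0 + e_low / e_crit)
    t_final = t_lattice * (1.0 + e_high / e_crit)
    t_n = t_final + (t_start - t_final) * math.exp(-t / tau_e)
    return mu0 * (t_lattice / t_n) * e_high


v_sat = 1000.0 * 1.0e4  # mu0 * E_crit in cm/s
for t in (0.0, 0.1e-12, 0.5e-12, 2.0e-12, 10.0e-12):
    v = overshoot_velocity(t, e_low=1.0e3, e_high=1.0e5)
    print(f"t = {t * 1e12:4.1f} ps: v / v_sat = {v / v_sat:.2f}")
```

Immediately after the step, the velocity exceeds vsat because the mobility still corresponds to cold carriers; it falls below vsat once the temperature has adapted to the field. Because momentum relaxation is neglected, the initial peak is exaggerated, and only the qualitative transient is meaningful.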

6.3.2 Limitations of the Drift-Diffusion Model

It was found in [71] through a comparison with Monte Carlo simulations that relaxation time-based models tend to overestimate nonstationary carrier dynamics, especially the energy distribution.

Figure 37.12 compares the drift-diffusion estimate for the temperature with results from Monte Carlo simulations for various MOS transistors. The gate length varies from 1000 nm (Fig. 37.12a) down to 100 nm (Fig. 37.12d). These and other simulation results demonstrate that the drift-diffusion model is a good approximation down to a gate length of about 250 nm but becomes increasingly inaccurate below. At shorter gate lengths, the carrier energy no longer directly follows the field. In most regions, the temperature is overestimated by the drift-diffusion model. In some parts, it is also underestimated.

Fig. 37.12

Comparison of the temperature approximated within the drift-diffusion model with Monte Carlo simulation results for MOSFETs with gate lengths of 1000 nm (a), 350 nm (b), 200 nm (c), and 100 nm (d). Drift-diffusion results provide a reasonable approximation down to a gate length of 200 nm

For the same MOS transistors as before, we also compare the velocity between the drift-diffusion model and results from Monte Carlo simulations (see Fig. 37.13). The critical gate length for this example is LG ≈ 350 nm (Fig. 37.13b). At 100 nm (Fig. 37.13d), one can see a significant velocity overshoot in a large portion of the channel. This velocity overshoot is not modeled correctly by the drift-diffusion model where the velocity is bounded by vsat.

Fig. 37.13

Comparison of the velocity calculated by the drift-diffusion model with Monte Carlo simulation results for MOSFETs with varying gate lengths of 1000 nm (a), 350 nm (b), 200 nm (c), and 100 nm (d). At 350 nm and below, Monte Carlo simulation results show a velocity overshoot. Velocity in the drift-diffusion model stays bounded
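Why the drift-diffusion velocity stays bounded can be seen from a small sketch, assuming a local field-dependent mobility of the Caughey-Thomas type. The function name, the parameter values, and the choice of this particular mobility expression are assumptions for illustration; the text does not state which mobility model was used in the simulations.

```python
def dd_velocity(e_field, mu0=0.14, v_sat=1.0e5, b=2.0):
    """Local-field velocity v = mu(E) * E in SI units (V/m, m^2/Vs, m/s).

    With a = mu0 * E / v_sat the velocity is v_sat * a / (1 + a**b)**(1/b),
    which is strictly smaller than v_sat for every finite field.
    """
    a = mu0 * e_field / v_sat
    return mu0 * e_field / (1.0 + a**b) ** (1.0 / b)
```

Because the mobility is a function of the local field only, the velocity cannot exceed `v_sat` regardless of how quickly the field changes along the channel, which is the behavior seen for the drift-diffusion curves.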

6.3.3 Drift-Diffusion Versus Energy Transport

Banoo and Lundstrom [72] compared the results obtained by (37.142)– with a drift-diffusion model and a solution of the BTE obtained by using the scattering matrix approach. They found that the used variant of the energy transport model dramatically overestimates both the drain current and the velocity inside the device.

Figure 37.14 shows the IV characteristics from drift-diffusion and energy transport models and compares them with Monte Carlo simulation results. For long-channel devices with LG ≈ 1 µm, all models yield the same results (not shown here). At 250 nm (Fig. 37.14a), the models are still close. For submicron devices with LG ≈ 100 nm (Fig. 37.14b), the energy transport model is closer to Monte Carlo simulation results. For short-channel devices with LG ≈ 50 nm (Fig. 37.14c), the energy transport model overestimates the current. In Fig. 37.14d, at a channel length of LG ≈ 25 nm, the drift-diffusion model is even closer to the Monte Carlo simulation results than the energy transport model.

Fig. 37.14

Comparison of drain currents from the drift-diffusion and energy transport models with Monte Carlo simulation results for a double-gate MOSFET with a gate length of 250 nm (a), 100 nm (b), 50 nm (c), and 25 nm (d). The drift-diffusion model stays accurate down to a gate length of 250 nm. The energy transport model is accurate down to 100 nm

To better understand this behavior, one may look at the distributed quantities inside the transistor, for instance, the velocity profile. Figure 37.15 shows the velocity profile for channel lengths of 250 nm (Fig. 37.15a), 50 nm (Fig. 37.15b), and 25 nm (Fig. 37.15c). While the velocity in the drift-diffusion model is always bounded, the Monte Carlo simulation results show a velocity overshoot. This is quite well matched in the energy transport model at 250 nm. However, at shorter channel lengths, the velocity is overestimated in the energy transport model, and at 25 nm, the velocity profile of the energy transport model is far off from Monte Carlo simulation results.

Fig. 37.15

Comparison of velocity profiles from a drift-diffusion and energy transport model with Monte Carlo simulation results for MOSFETs with varying gate lengths of 250 nm (a), 50 nm (b), and 25 nm (c). Down to 250 nm, the energy transport model matches the Monte Carlo simulation results quite well. At shorter gate lengths, the energy transport model overestimates the velocity

6.3.4 Spurious Velocity Overshoot

Compared to drift-diffusion models, energy transport models have advantages concerning the modeling of velocity overshoot in the main channel region. However, when the electric field decreases rapidly, these models also tend to show a nonphysical (spurious) velocity overshoot. This is the case, for instance, at the end of the channel of a MOSFET. The same effect is more pronounced in n+nn+ test structures (see Sect. 37.8.1 for more simulation results). Models based on Bløtekjær’s approach have been frequently associated with such implausible spikes in the velocity characteristics which do not occur in Monte Carlo simulations. Since the velocity overshoot at the end of the channel is not observed in Monte Carlo simulations, this effect is known as the spurious velocity overshoot (SVO) [81,82,83].

In [34], this spurious velocity overshoot is investigated using different mobility models. It is found that improved results are possible when proper mobility models are used. For example, with Hänsch’s mobility model (), these spikes are strongly diminished but not completely removed. Unfortunately, Monte Carlo simulations show that () also overestimates the real velocity overshoot at the beginning of the channel.

7 Advanced Macroscopic Transport Models

As device geometries are further reduced without a corresponding reduction of the supply voltages, the electric fields occurring inside the devices increase rapidly. Furthermore, strong gradients in the electric field are observed. These highly inhomogeneous field distributions give rise to distribution functions which deviate significantly from the frequently assumed Maxwellian distribution.

7.1 Ansatz-Based Approaches

Various improved macroscopic transport models have been developed based on analytical descriptions of the distribution function, e.g., [84]. Unfortunately, the model from Liotta and Struchtrup [85] and the model from Nekovee [86], which are both discussed below, require a large number of moments, because the chosen ansatz functions converge slowly.

Liotta and Struchtrup [85] find that a hierarchy containing 12 moment equations is needed to reproduce simulation results obtained from an expansion in spherical harmonics. The proposed model uses an ansatz for the distribution function to calculate the closure relations, which is less general than moment-based models with carefully derived closure relations.

A conclusion similar to Liotta and Struchtrup is drawn in [86] by Nekovee et al., who solve the Boltzmann equation using moment-based models. However, this “moment” model is based on an expression for the distribution function using a sum of Hermite polynomials multiplied by a Maxwellian shape. They find that the hierarchy of equations converges too slowly and the model fails to predict results for ballistic diodes. Instead of the moments, the parameters of the distribution function are considered in [86]. Hence, we feel that their conclusion is not applicable to moment-based models [10, 87].

Another approach [88,89,90] derives only the closure relations from an analytical distribution function model based on the maximum entropy principle. However, these models are difficult to handle numerically as they are hyperbolic.

An explicit ansatz for the distribution function can be avoided in the derivation of a higher-order moment-based transport model. In particular, a macroscopic transport model has been proposed in [51] which considers the first six moments in the Boltzmann transport equation and closes the system using an empirical relation matched to Monte Carlo simulation results.

7.2 Higher-Order Moment Models

A natural extension of energy transport models which retains the basic structure of the equation system is to include higher-order moment equations. The only assumption made about the distribution function is the validity of the diffusion approximation in order to obtain a numerically tractable (parabolic) model [52]. Ignoring generation and recombination terms, the flux and balance equations of the six-moment model read for electrons:

$$\displaystyle \begin{aligned} {\boldsymbol{\nabla}} \cdot \frac{\mathbf{J}}{q} = \partial_{\mathrm{t}} n .\end{aligned} $$
(37.173)

The additional unknowns of the six-moment model are the kurtosis of the distribution function β and the kurtosis flux density K, which are defined as

$$\displaystyle \begin{aligned} \beta = \frac{3}{5} \frac{{\langle {\mathcal{E}}^2 \rangle}}{{\langle {\mathcal{E}} \rangle}^2} \qquad \mathrm{and} \qquad \mathbf{K} = {\langle \mathbf{v} \, {\mathcal{E}}^2 \rangle}.\end{aligned} $$
(37.179)

This definition of the kurtosis is based on the diffusion approximation. The additional transport parameters are the kurtosis relaxation time τβ and the kurtosis flux mobility μK. In terms of the kurtosis, the fourth-order moment is obtained by solving the definition (37.179) for \(\langle {\mathcal{E}}^2 \rangle\). In the six-moment model above, we have used an empirically found closure [51] for the sixth-order moment which depends on the third power of β. With this closure, the six-moment model is contained as a special case in the model hierarchy described by Jüngel in [13, p. 187].
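As a worked illustration, the moments can be written out under the assumption of parabolic bands, where \(\langle {\mathcal{E}} \rangle = \tfrac{3}{2} k_{\mathrm{B}} T_{\mathrm{n}}\). The first line follows directly from the definition (37.179). The second line is only a sketch of a closure that is cubic in β: the prefactor 105/8 is the heated-Maxwellian value of \(\langle {\mathcal{E}}^3 \rangle\) and is an assumption made here for consistency, not a quotation of the closure used in [51].

$$\displaystyle \begin{aligned} \langle {\mathcal{E}}^2 \rangle &= \frac{5}{3} \, \beta \, \langle {\mathcal{E}} \rangle^2 = \frac{15}{4} \, (k_{\mathrm{B}} T_{\mathrm{n}})^2 \, \beta , \\ \langle {\mathcal{E}}^3 \rangle &\approx \frac{105}{8} \, (k_{\mathrm{B}} T_{\mathrm{n}})^3 \, \beta^3 .\end{aligned} $$

For β = 1, both expressions reduce to the corresponding moments of a heated Maxwellian.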

The first four equations are the same as for the energy transport model, except that the kurtosis β appears in the energy flux equation (). As a consequence, the energy flux equation cannot be written in the form frequently used for energy transport models as proportional to the current density (cf. ()) without producing additional terms. This modification makes the coupled equation system difficult to solve, and approximations have been used [51]. Note that the six-moment model reduces to a standard energy transport model when the equations containing K are dropped and a value of unity is assumed for β.

Discretization and numerical solution of the six-moment model are covered in [15]. A generalized Maxwellian closure for the sixth moment is proposed in [51], and a cumulant-based closure is discussed in [91, 92]. Correction terms for non-parabolicity are added to the six-moment model in [73, 74].

7.2.1 Properties of the Kurtosis

For a heated Maxwell distribution and parabolic bands, the kurtosis βM is identically 1. Thus, a β ≠ 1 quantifies the deviation from the heated Maxwellian shape. Note that a Maxwellian shape is never observed in Monte Carlo simulations, except for the contact regions where the carriers are still cold and the Pauli principle is neglected.
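The statement that β = 1 characterizes a heated Maxwellian can be checked numerically with a short sketch. It evaluates the definition (37.179) for an isotropic energy distribution, assuming a parabolic-band density of states proportional to the square root of the energy. The function names and the three example distributions are hypothetical and only serve as illustration.

```python
import math

K_T = 0.0259  # thermal energy at 300 K in eV


def kurtosis(dist, e_max, n=20000):
    """beta = (3/5) <E^2> / <E>^2 for an energy distribution dist(E).

    Averages are weighted with sqrt(E) (parabolic bands, an assumption)
    and evaluated with the midpoint rule on [0, e_max].
    """
    h = e_max / n
    m0 = m1 = m2 = 0.0
    for i in range(n):
        e = (i + 0.5) * h
        w = math.sqrt(e) * dist(e)
        m0 += w
        m1 += w * e
        m2 += w * e * e
    mean_e = m1 / m0
    mean_e2 = m2 / m0
    return 0.6 * mean_e2 / mean_e**2


def maxwellian(e, kt=K_T):
    """Heated Maxwellian shape (unnormalized)."""
    return math.exp(-e / kt)


def with_hot_tail(e, frac=0.05, kt_cold=K_T, kt_hot=5 * K_T):
    """Cold Maxwellian plus a small hot population (unnormalized weights)."""
    return (1 - frac) * math.exp(-e / kt_cold) + frac * math.exp(-e / kt_hot)


def truncated(e, kt=K_T, e_cut=3 * K_T):
    """Maxwellian whose high-energy tail is cut off."""
    return math.exp(-e / kt) if e < e_cut else 0.0
```

In this sketch, a pronounced high-energy tail pushes β above unity, whereas a suppressed tail gives β below unity.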

For the special case of a bulk semiconductor, the kurtosis can be expressed as a function of the carrier temperature. By considering the homogeneous six-moment model, the following expression is obtained [10]:

$$\displaystyle \begin{aligned} \beta_{\mathrm{bulk}}(T_{\mathrm{n}}) = \frac{T_{\mathrm{L}}^2}{T_{\mathrm{n}}^2} + 2 \, \frac{\tau_\upbeta}{\tau_{\mathcal{E}}} \frac{\mu_{\mathrm{S}}}{\mu_{\mathrm{n}}} \Bigl(1 - \frac{T_{\mathrm{L}}}{T_{\mathrm{n}}}\Bigr) .\end{aligned} $$
(37.182)

This is analogous to the homogeneous energy transport equation used to estimate the carrier temperature from the electric field as discussed in Sect. 37.5.4. Since in inhomogeneous samples Tn already lags behind the field (for increasing E), some nonlocality is also expected for β, and an expression estimating β from Tn is preferable to one using E. Figure 37.16a shows that the doping dependence of the bulk kurtosis βbulk(Tn) is only relevant at doping concentrations higher than \(10^{16}\,\mathrm{cm}^{-3}\) and for lower carrier temperatures.
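Equation (37.182) is straightforward to evaluate, as the following sketch shows. The default ratios `tau_ratio` (for τβ/τE) and `mob_ratio` (for μS/μn) are placeholder values chosen for illustration only; in practice both ratios depend on doping and carrier energy.

```python
def beta_bulk(t_n, t_l=300.0, tau_ratio=0.5, mob_ratio=0.8):
    """Bulk kurtosis of Eq. (37.182) as a function of the carrier temperature.

    tau_ratio = tau_beta / tau_E and mob_ratio = mu_S / mu_n are
    placeholder constants (assumptions), temperatures are in kelvin.
    """
    x = t_l / t_n
    return x * x + 2.0 * tau_ratio * mob_ratio * (1.0 - x)
```

For cold carriers (Tn = TL) the expression returns exactly 1, and for very hot carriers it approaches twice the product of the two ratios. With the placeholder values above this limit lies below unity.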

Fig. 37.16

The kurtosis for bulk (a) and for different n+nn+ test structures (b). Note the strong deviation from unity after the second junction

As shown in Fig. 37.16b, the kurtosis behaves quite differently in inhomogeneous samples than would be expected from the bulk relation βbulk(Tn). While inside the channel β is typically below unity and reasonably close to the bulk relation, values larger than unity are observed at the end of the channel region. This deviation corresponds to the high-energy tail of the distribution function in Fig. 37.9. Typical values of the kurtosis β are in the range [0.75, 3] which indicates a strong deviation from a heated Maxwellian distribution.

7.2.2 Modeling of the Transport Parameters

Modeling of the additional parameters μK and τβ faces similar problems as discussed in Sect. 37.5.5. These parameters contain information on hot-carrier effects. Both parameters show a clear hysteresis when plotted over the average energy as shown in Fig. 37.17.

Fig. 37.17

(a) The ratio of the mobilities μS/μ and μK/μ for an n+nn+ test structure with Lc = 200 nm. (b) The energy and the kurtosis relaxation times \(\tau _{\mathcal {E}}\) and τβ

In the six-moment model, the interplay between the various parameters is highly complex and has a strong impact on the numerical stability of the whole transport model. Simple analytical models based on constant values, such as those used in [51], do not always deliver satisfactory results.

For an appropriate description of higher-order transport models, it is therefore important to model the transport parameters with as few simplifying assumptions as possible. To minimize uncertainties arising from this issue, all physical parameters might be extracted as a function of the doping concentration and the average energy from homogeneous Monte Carlo simulations. This is discussed below.

7.3 Table-Based Parameter Modeling

Modeling transport parameters like the mobilities μ, μS, and μK and the relaxation times \(\tau _{{\mathcal {E}}}\) and τβ is challenging, since they all depend on the actual shape of the distribution function, on the scattering rates, as well as on the band structure [74]. Having many adjustable parameters is an inconvenience inherent in many higher-order transport models based on analytical models for the mobilities and relaxation times [10]. In particular, a consistent comparison with Monte Carlo simulations is difficult, when the resulting transport models do not reproduce the Monte Carlo simulation results in the homogeneous case. For a meaningful comparison, the models should be consistent at least in the bulk case, which already is far from trivial.

In [93], a transport parameter model based on homogeneous fullband Monte Carlo tables has been introduced. By adjusting this method for the six-moment model, all higher-order transport parameters are extracted from bulk Monte Carlo simulations for different doping concentrations and for different driving forces. In the macroscopic models, the transport parameters are then considered as a function of doping and the average energy [73, 94]. In this way, the fullband structure of the material and scattering mechanisms such as phonon-induced scattering are inherently included in the model. Correction factors for non-parabolic band structures have also been obtained [74].

Since all model parameters are obtained from bulk Monte Carlo simulations, the transport models are free of fit parameters which leaves us with “no knobs to turn” [95]. Macroscopic models based on Monte Carlo simulation data improve on their counterpart models based on analytical models significantly, both in terms of numerical stability and in the agreement with Monte Carlo device simulations [74].
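How such a table might be used inside a macroscopic simulator can be sketched as a lookup with bilinear interpolation in the logarithm of the doping concentration and in the average energy. The table entries below are made-up placeholders, not Monte Carlo data, and the names `DOPING`, `ENERGY`, `MOBILITY`, and `lookup` are hypothetical.

```python
import bisect
import math

# Hypothetical table: rows = doping (cm^-3), columns = average energy (eV).
# The numbers are placeholders, NOT extracted Monte Carlo results.
DOPING = [1e14, 1e16, 1e18]
ENERGY = [0.04, 0.1, 0.3, 1.0]
MOBILITY = [            # cm^2/Vs
    [1400.0, 600.0, 200.0, 60.0],
    [1200.0, 580.0, 200.0, 60.0],
    [300.0, 250.0, 150.0, 60.0],
]


def _bracket(grid, x):
    """Return (i, t) with x = (1 - t) * grid[i] + t * grid[i + 1].

    Values outside the tabulated range are clamped to the nearest edge.
    """
    x = min(max(x, grid[0]), grid[-1])
    i = min(bisect.bisect_right(grid, x) - 1, len(grid) - 2)
    t = (x - grid[i]) / (grid[i + 1] - grid[i])
    return i, t


def lookup(table, doping, energy):
    """Bilinear interpolation in (log10 of doping, average energy)."""
    log_grid = [math.log10(d) for d in DOPING]
    i, s = _bracket(log_grid, math.log10(doping))
    j, t = _bracket(ENERGY, energy)
    low = (1 - t) * table[i][j] + t * table[i][j + 1]
    high = (1 - t) * table[i + 1][j] + t * table[i + 1][j + 1]
    return (1 - s) * low + s * high
```

The same lookup would be applied to each tabulated parameter (μ, μS, μK, and the relaxation times), so that no analytical fit parameters enter the model. Clamping at the table edges is a design choice of this sketch.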

In Fig. 37.18, the extracted bulk mobility parameter set for higher-order macroscopic transport models is displayed. Here the carrier mobility μ (Fig. 37.18a) and higher-order mobilities μS (Fig. 37.18b) and μK (Fig. 37.18c) as a function of the electric field Eabs for different doping concentrations Nd are presented. As can be observed for each type of mobility, the values of the mobility are independent of the doping concentration for fields above 100 kV∕cm, which is where v = vsat and μ = vsat/E. For low fields, the mobilities are much larger at low doping concentrations compared to high doping concentrations.

Fig. 37.18

(a) Carrier mobility μ, (b) energy flux mobility μS, and (c) second-order energy flux mobility μK versus driving field for different doping concentrations. For fields higher than 100 kV∕cm, the mobilities are independent of the doping concentration. For low fields, the values of the mobilities are much larger at low doping than at high doping concentrations

The values of all three types of mobilities are comparable at high doping concentrations. For low doping concentrations and low fields, the energy flux mobility and the second-order flux mobility are smaller than the carrier mobility.

Figure 37.19 presents the relaxation times \({\tau _{{\mathcal {E}}}}\) and τβ for different doping concentrations as a function of the kinetic energy of the carriers. As can be seen, for high energies, the relaxation times become doping-independent.

Fig. 37.19

Energy relaxation time \({\tau _{{\mathcal {E}}}}\) (a) and second-order energy relaxation time τβ (b) extracted from bulk Monte Carlo simulations as a function of the kinetic energy for different bulk dopings. For very high energies, the relaxation times decrease due to the increase of optical phonon scattering

For a fixed energy, the Monte Carlo simulations predict lower relaxation times with higher Nd. In a certain field regime, where optical phonons can be neglected, the energy relaxation time \({\tau _{{\mathcal {E}}}}\) is approximately constant. However, for driving fields above 450 kV∕cm, the energy relaxation time is no longer constant but decreases due to the increase of optical phonon scattering, which is an inelastic process.

The carrier velocity as a function of the lateral field and for different Nd is shown in Fig. 37.20. The saturation velocity of Si is reached at a driving field of 150 kV∕cm.

Fig. 37.20

Bulk velocity of electrons as a function of the driving field Eabs for a doping of \(10^{14}\,\mathrm{cm}^{-3}\), \(10^{16}\,\mathrm{cm}^{-3}\), and \(10^{18}\,\mathrm{cm}^{-3}\). In the low-field regime, the electron velocity for high dopings is lower than the velocity of the low dopings, while the value of the velocity converges for high fields

In Fig. 37.21, the bulk carrier temperature as a function of the electric field calculated with the bulk fullband Monte Carlo method is presented. The quadratic dependence of the carrier temperature on the electric field () is a good approximation for fields lower than 200 kV∕cm, while for fields up to 450 kV∕cm a linear approximation can be used assuming μ = vsat/E. However, for driving fields above 450 kV∕cm, the linear approximation breaks down due to optical phonon scattering.
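The two regimes can be made explicit with a short derivation. It assumes the standard stationary, homogeneous energy balance, in which the power gained from the field equals the energy lost through relaxation; this form is assumed here to correspond to the relation of Sect. 37.5.4 and is not copied from it.

$$\displaystyle \begin{aligned} \frac{3}{2} \, k_{\mathrm{B}} \, \frac{T_{\mathrm{n}} - T_{\mathrm{L}}}{\tau_{\mathcal{E}}} = q \, \mu \, E^2 \quad &\Longrightarrow \quad T_{\mathrm{n}} = T_{\mathrm{L}} + \frac{2}{3} \, \frac{q}{k_{\mathrm{B}}} \, \tau_{\mathcal{E}} \, \mu \, E^2 , \\ \mu = \frac{v_{\mathrm{sat}}}{E} \quad &\Longrightarrow \quad T_{\mathrm{n}} = T_{\mathrm{L}} + \frac{2}{3} \, \frac{q}{k_{\mathrm{B}}} \, \tau_{\mathcal{E}} \, v_{\mathrm{sat}} \, E .\end{aligned} $$

With a field-independent mobility and relaxation time, the first line is quadratic in E. Once the velocity has saturated, the second line is linear in E as long as the energy relaxation time stays approximately constant.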

Fig. 37.21

Carrier temperature Tn as a function of the driving field in a homogeneous bulk simulation carried out with the fullband Monte Carlo method. For lower fields, the carrier temperature is a function of \(E^2\), while for high fields, the temperature is closer to a linear dependence on the driving field

8 Applications

This section discusses applications of the six-moment model in the deca-nanometer regime. The channel length range of deca-nanometer devices is here defined approximately from 100 nm down to 20 nm.

In order to consider the high-field case as accurately as possible, a transport model based on fullband Monte Carlo tables is considered. The results of the Monte Carlo table-based higher-order transport models are benchmarked for n+nn+ test structures and double-gate MOSFETs against Monte Carlo simulations. Occasionally, the spherical harmonic expansion simulator is used as a reference, because the results are very close to those obtained by the Monte Carlo method, but the computational effort is considerably lower, particularly for rare events which determine, for example, the high-energy tail of the energy distribution function.

When the channel length is reduced to increase the operation speed and the number of components per chip, the so-called short-channel effects arise [96]. An adequate description of these phenomena relies on a detailed description of the distribution function which is provided by the six-moment model [11, 74]. A variety of short-channel effects such as the velocity overshoot, the impact ionization, and the influence of hot electrons on the carrier distribution function are investigated.

8.1 n+nn+ Test Structures

We first investigate the validity of higher-order transport models on a series of the most popular test devices, one-dimensional n+nn+ structures. Due to the simplified one-dimensional structure and the requirement of only one carrier type, the influence of the various parameters on basic quantities like the velocity or the carrier temperature can be more easily separated and interpreted. However, as can be seen from the examples below, even for these simple structures, such an interpretation is far from trivial.

8.1.1 Model Check

As discussed in the previous sections, macroscopic transport models are based on many empirical assumptions. Hardly any of these can be justified theoretically below a channel length of one micrometer. Hence, we choose 1000 nm as the calibration point, where all macroscopic transport models together with the spherical harmonic approach, which is the reference simulator here, should yield the same result.

Figure 37.22 shows the output currents of different n+nn+ structures for channel lengths of 100 nm, 250 nm, and 1000 nm calculated with the DD, ET, SM, and the spherical harmonic expansion model. For a channel length of 1000 nm, all models yield the same results with an error below 1% as can be observed in Fig. 37.23. While the error of the ET and SM model stays more or less constant below 2.5% for a channel length down to 250 nm, the error of the DD model continuously increases and reaches a value of −16% for a channel length of 100 nm. Therefore, simulating short-channel devices with the DD model gives only poor results. While the inaccuracy of the ET model starts to increase below 250 nm, the SM model still gives results very close to spherical harmonic expansion simulations.

Fig. 37.22

Output currents for different n+nn+ structures calculated with DD, ET, and SM models. As a reference, spherical harmonic expansion simulations are used. For 1000 nm, all models predict the same current, while the DD model underestimates the current for a channel length of 100 nm

Fig. 37.23

Relative error of the current calculated with the DD, ET, and the SM model as a function of the channel length LCh. A voltage of 1 mV∕nm × LCh has been applied. While the relative error in the ET and the SM model is always below 7.5%, the relative error in the DD model approaches 16% at a channel length of 100 nm
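The bias rule and the error measure used in this comparison can be written down as a small sketch. The function names are hypothetical, and the signed definition of the relative error is an assumption that is consistent with the negative values quoted for the DD model. Note that 1 mV∕nm corresponds to an average field of 10 kV∕cm.

```python
def applied_voltage(l_ch_nm, field_mv_per_nm=1.0):
    """Bias in volts giving an average field of field_mv_per_nm along the channel."""
    return field_mv_per_nm * 1e-3 * l_ch_nm


def relative_error(i_model, i_ref):
    """Signed relative error; negative values mean the model underestimates."""
    return (i_model - i_ref) / i_ref
```

For example, a 100 nm structure is biased at 0.1 V and a 1000 nm structure at 1 V, and a model current of 0.84 in units of the reference current corresponds to an error of −16%.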

8.1.2 Velocity Profile

In Fig. 37.24, the evolution of the velocity profile within several n+nn+ structures calculated with the DD, ET, SM, and the spherical harmonic expansion model as a reference is shown. The bias conditions were chosen in such a way to result in an electric field of 50 kV∕cm in the middle of the intrinsic region (loosely referred to as channel region in the following) for each channel length. Overall, the SM model predicts a velocity profile closer to the spherical harmonic expansion data than the DD and the ET model. From this, we conclude that the closure relation of the SM model is improved compared to the ET model.

Fig. 37.24

Evolution of the carrier velocity profiles for decreasing channel lengths calculated with the DD, the ET, and the SM model. The velocities are compared to the results obtained from spherical harmonic expansion simulations. While the maximum velocity of the DD model is the saturation velocity vsat, the spurious velocity overshoot at the end of the channel in the ET and the SM model is clearly visible. The velocity overshoot at the beginning of the channel can be quantitatively identified for devices smaller than 100 nm in the ET and the SM model

For long-channel devices, all models yield a similar velocity profile. For decreasing channel length below LCh = 250 nm, the spurious velocity overshoot in the ET model and the reduced one in the six-moment model are clearly visible. On the other hand, the DD model does not predict any velocity overshoot at all and stays always below the bulk saturation velocity vsat.

In order to explain this effect, various theories have been put forward. Some authors have argued that there is a relation with non-parabolicity effects [97], whereas other theories relate it to the hysteresis in the mobility [98].

However, it was demonstrated in [99] that the SVO is caused by deficient models for the closure relation and the transport parameters. For higher-order transport models, the error in the SVO decreases. This can also be seen in Fig. 37.24 where the six-moment model is consistently better than the energy transport model.

8.1.3 Drain Current

The improved accuracy in the device characteristics obtained from higher-order moments is also reflected in the currents, which is demonstrated in Fig. 37.25. Here, the output characteristics of a 40 nm and 80 nm channel length n+nn+ structure calculated with the DD, ET, SM, and the reference spherical harmonic expansion model are shown.

Fig. 37.25

Output currents of an 80 nm and a 40 nm channel length n+nn+ structure calculated with the DD, ET, SM, and spherical harmonic expansion model. The ET model overestimates the current at 40 nm, while the SM model yields the most accurate result

While the relative error of the current calculated with the SM and the ET model stays small in long-channel devices (see Fig. 37.23), there is a significant deviation for short-channel devices as shown in Fig. 37.26. In the short-channel range, from 40 to 100 nm, the error of the current calculated with the SM model stays below 10%. In contrast, at a channel length of 40 nm, the errors of the DD and the ET model are at −30% and 40%, respectively.

Fig. 37.26

Relative error in the current of the DD, the ET, and the SM model for an n+nn+ structure in the channel range from 100 nm down to 40 nm. At 40 nm, the relative error of the SM model is below 6%, while the errors of the DD and the ET model are at −30% and 40%, respectively

As has been pointed out, the ET model accurately describes current transport down to a channel length of about 80 nm, but a strong increase in the error of the current can be observed below 80 nm. Therefore, the ET model is a suitable transport model for devices down to 80 nm channel lengths only. For a channel length below 80 nm, the SM model appears to be the model of choice. One additional strength of the six-moment model is that it gives more information about the distribution function than the ET model.

8.2 A Double-Gate MOSFET

Previously, the transport models have been validated with n+nn+ structures. Although it has been frequently claimed that n+nn+ structures emulate the behavior of MOS transistors, the most important devices in silicon technology, this is only partly true.

8.2.1 Drain Current

In Fig. 37.27, we compare results from the drift-diffusion, energy transport, and six-moment models with Monte Carlo simulation results for double-gate MOSFETs. All models use consistent bulk parameters. For long-channel devices down to a gate length of 250 nm (Fig. 37.27a), all of the models yield consistent results for the drain current.

Fig. 37.27

Comparison of drain currents from the drift-diffusion, energy transport, and six-moment models with Monte Carlo simulation data for a double-gate MOSFET with varying gate lengths of 250 nm (a), 100 nm (b), 50 nm (c), and 25 nm (d). The six-moment model stays accurate down to a gate length of about 50 nm

Fig. 37.28

Comparison of the velocity from the drift-diffusion, energy transport, and six-moment model with Monte Carlo simulation data for a double-gate MOSFET with gate length of 250 nm (a), 50 nm (b), and 25 nm (c). The spurious velocity overshoot is much less pronounced than for n+nn+ structures. The six-moment model stays closest to the Monte Carlo simulation results

However, the DD model consistently underestimates the current, particularly in the saturation regime. The ET model starts to deviate from the Monte Carlo simulation data at 100 nm (Fig. 37.27b), while the SM model properly reflects the output characteristic down to 50 nm (Fig. 37.27c). For gate lengths below 50 nm, all macroscopic models fail to describe the reference current.

8.2.2 Velocity

For a detailed analysis, we compare distributed quantities. Among several quantities, we find that velocity profiles are the most critical. Getting the right velocity profiles is important for correct terminal currents. Velocity profiles are depicted in Fig. 37.28. As above, for 250 nm, all models give reasonable results (Fig. 37.28a). At 50 nm and below (Fig. 37.28b-c), the ET model overestimates the velocity. The results of the SM model stay closest to Monte Carlo simulations but are also off at 25 nm.

The energy transport models tend to overestimate the velocity overshoot and introduce a spurious velocity overshoot at the end of the channel region of n+nn+ structures. For MOS transistors, on the other hand, the SVO coincides with the velocity overshoot at the end of the channel and is therefore not explicitly visible.

8.2.3 Predictiveness

To compare the predictiveness of macroscopic models, we plot drain current and transit frequency as a function of the gate length in Fig. 37.29. In both panels, the error of the DD model is much higher. The six-moment model and the energy transport model show similar behavior. However, the six-moment model is closer to the Monte Carlo simulation results as it utilizes more information about the distribution function.

Fig. 37.29

Comparison of drain currents (a) and transit frequency (b) as a function of the gate length from the drift-diffusion, energy transport, and six-moment model with Monte Carlo simulation data for a double-gate MOSFET. The six-moment model is predictive down to 50 nm

Despite the apparent limitations, it is possible to use drift-diffusion transport models also for deca-nanometer devices [100, 101]. These drift-diffusion models use mobility models and saturation velocities which are inconsistent with bulk data [102] and can provide short-term fixes to available models. This, obviously, is a fitting approach and has limited value for predictive simulations.

As a side note, we would like to mention that with shrinking device geometries quantum effects gain more importance and limit the validity of the BTE itself [103]. When the device dimensions are comparable to the carrier wavelength, the carriers can no longer be treated as classical pointlike particles, and effects originating from the quantum mechanical nature of propagation begin to determine transport [104].

8.2.4 Numerical Properties

From a practical point of view, it has to be pointed out that convergence problems are an issue and inhibit the use of higher-order moment equations in everyday engineering applications. Unfortunately, simulation codes based on higher-order moment equations have never reached a robustness comparable to the drift-diffusion model. Several discretization schemes have been proposed [45], but no generally accepted discretization like the Scharfetter-Gummel scheme [7] for the drift-diffusion equations exists.
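For reference, the Scharfetter-Gummel flux for the drift-diffusion equations can be sketched in a few lines. This is the classical scheme for the electron continuity equation only, not a discretization of the higher-order moment equations; the function names, the argument order, and the default thermal voltage are assumptions of this sketch.

```python
import math


def bernoulli(x):
    """Bernoulli function B(x) = x / (exp(x) - 1), safe near 0 and for large |x|."""
    if abs(x) < 1e-10:
        return 1.0 - 0.5 * x
    if x > 0.0:
        # Rewritten with exp(-x) to avoid overflow for large positive x.
        return x * math.exp(-x) / (-math.expm1(-x))
    return x / math.expm1(x)


def sg_electron_flux(n_left, n_right, psi_left, psi_right, h, diffusivity, v_t=0.02585):
    """Scharfetter-Gummel electron flux J/q between two neighboring grid points.

    n_*: electron concentrations, psi_*: electrostatic potentials (V),
    h: grid spacing, diffusivity: D_n, v_t: thermal voltage (V).
    """
    delta = (psi_right - psi_left) / v_t
    return diffusivity / h * (bernoulli(delta) * n_right - bernoulli(-delta) * n_left)
```

The flux vanishes exactly for a Boltzmann-distributed concentration and reduces to a plain finite difference when the potential is flat. These two limits are among the reasons the scheme is so robust for the drift-diffusion equations.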

8.3 Modeling of Hot-Carrier Effects

Hot carriers can lead to a number of interesting phenomena and are difficult to model in the drift-diffusion model, because too little information about the distribution function is available.

Within DD and ET, it is difficult to estimate the deviation of the energy distribution function from the Maxwellian shape. By contrast, the SM model provides the kurtosis, which has been shown to be crucial in that respect.

8.3.1 Kurtosis

As will be shown below, the kurtosis provides information to distinguish the channel region from the drain region [87]. The kurtosis of an n+nn+ structure with 100 nm channel length and a total length of 500 nm is visualized in Fig. 37.30 for different source and drain dopings. A constant channel doping of \(10^{16}\,\mathrm{cm}^{-3}\) is considered. As can be observed, for low doping, the maximum peak of the kurtosis is at 300 nm. For high doping, the maximum is at about 220 nm.

Fig. 37.30

Kurtosis calculated for different source and drain doping concentrations in an n+nn+ structure with a channel length of 100 nm. With increase in the doping concentrations, the peak of the kurtosis moves closer to the channel, as the cold carriers in the highly doped region strongly suppress the high-energy tail

This can be explained as follows: Due to the higher concentration of cold electrons in the highly doped drain region, the relaxation of hot carriers is faster than in the low doped drain region. Hence, the maximum peak of β for the high doping concentration case is 25% higher than for low doping concentrations.

In Fig. 37.31, the second-order temperature Θ defined as

$$\displaystyle \begin{aligned} {\Theta} = {\beta}{T_{\mathrm{n}}} \end{aligned} $$
(37.184)

and the carrier temperature Tn are presented for a short- and a long-channel device. For the long-channel device, the hot distribution part can be neglected due to the small deviation of the second-order temperature Θ from Tn. Note that for cold carriers, where the energy distribution function follows a Maxwellian distribution for which β = 1, we also have Θ = Tn.

Fig. 37.31

Carrier temperature Tn together with the second-order temperature Θ for a 1000 nm and a 60 nm device. A bias leading to a maximum field of 50 kV∕cm has been applied. While in the long-channel device a Maxwellian can be used, the high-energy tail in the short-channel device in the drain region increases

Large field gradients strongly influence the kurtosis, as shown in Fig. 37.32. The upper part of the figure shows the kurtosis for fields of 5 kV∕cm, 20 kV∕cm, and 50 kV∕cm in a structure with a 100 nm channel length. The electric field has been evaluated in the middle of the channel at Point A. For a low field of 5 kV∕cm, where the carrier temperature is low (see the lower part), a heated Maxwellian (βM = 1) can be used, while for 20 kV∕cm, an increase of the kurtosis at the beginning of the drain region is apparent. For high fields, the kurtosis increases significantly overall. Comparing the upper and lower parts of Fig. 37.32 shows that the kurtosis starts to rise where the carrier temperature falls from its maximum toward the equilibrium value. This is the region where the hot electrons from the channel meet the large pool of cold electrons in the drain region, which quickly thermalizes the energy distribution function.

Fig. 37.32

Kurtosis β (a) and the carrier temperature (b) for electric fields of 5 kV∕cm, 20 kV∕cm, and 50 kV∕cm through a 100 nm channel n+nn+ device (the value of the fields is calculated at Point A from Fig. 37.30). For high fields, the kurtosis increases at the beginning of the drain region, which means that the high-energy tail of the distribution function becomes very important. The kurtosis exceeds unity in the region where the carrier temperature drops

8.3.2 Velocity Overshoot

Within this work, the velocity overshoot has already been shown for n+nn+ test structures in Fig. 37.24 and for double-gate MOSFETs in Fig. 37.28. For completeness, the velocity is given in Fig. 37.33 for the same type of structure that has been simulated in the previous figures. The ET transport model yields the same velocity profile in the low-field regime as the SM model, which indicates that a Maxwellian is a good approximation at low fields. For high fields, however, the ET model predicts a higher velocity than the DD and SM models and exhibits a maximum at the end of the channel.

Fig. 37.33

The velocity profile for fields of 5 kV∕cm and 50 kV∕cm is shown for a MOSFET. For low fields, all models yield the same velocity profiles, which is an indication that the heated Maxwellian can be used. For high fields, a significant deviation of the velocity profiles can be observed

8.3.3 Impact Ionization

Impact ionization due to high lateral fields occurs particularly in n-channel MOSFETs, because of the higher velocity of electrons compared to holes. The electrons in the conduction band collide with electrons in the valence band and generate electron-hole pairs. Hence, the probability of impact ionization for electrons in a strong field is determined by the probability that the electrons will acquire the ionization energy of the atoms from the field [105].

Figure 37.34 shows the impact ionization rates of a 200 nm (Fig. 37.34a) and a 50 nm (Fig. 37.34b) channel device calculated with the DD, ET, and SM models and the Monte Carlo method. As can be observed, the impact ionization rate predicted by the SM model is closer to the Monte Carlo simulation data than the rates predicted by the ET and DD models, due to the better modeling of the distribution function in the SM model [106, 107].

Fig. 37.34

The impact ionization rate calculated with Monte Carlo, the DD, ET, and the SM model for a 200 nm (a) and a 50 nm structure (b). Due to the better modeling of the distribution function in the SM model, the results are closer to the Monte Carlo simulation data than those of the DD and the ET model

8.3.4 Hot-Carrier Gate Currents

In the channel region of a MOSFET, the energy distribution deviates from the ideal shape implied by a heated Maxwellian. The carrier energy distribution influences tunneling processes, which is important for the modeling of gate leakage currents in turned-on devices. If a heated Maxwellian approximation is used for the description of hot-carrier tunneling, the gate current density is heavily overestimated. This effect is found to be especially pronounced for devices with short gate lengths. Some non-Maxwellian models are reviewed in [108], where it is found that a model based on the solution variables of a six-moment transport model accurately reproduces the Monte Carlo simulation results.

8.3.5 Hot-Carrier Degradation

Hot-carrier degradation is a major device reliability issue, particularly for MOSFETs. The degradation is commonly assumed to be driven by the generation of traps at or near the Si/SiO2 interface.

Charge carriers in the channel can gain high kinetic energy and trigger the creation of defects at the Si/SiO2 interface [109, 110]. The driving force of this process is the energy deposited by charge carriers. A proper physics-based description of this complex phenomenon again requires detailed knowledge of the carrier transport in the targeted devices [111, 112, 113, 114, 115, 116].

9 Summary and Conclusion

Macroscopic models for current transport in semiconductors can be formulated by following two approaches: phenomenologically, where the semiconductor equations are based on the laws of mass and energy conservation as well as the principles of irreversible thermodynamics, or systematically, where a set of equations is derived starting from the Boltzmann transport equation.

Transport models can be derived from the semiclassical Boltzmann equation using various methods. The method of moments can be considered the most general one, as it does not require an explicit ansatz for the distribution function. Within the method of moments, the Boltzmann transport equation is multiplied by a polynomial weight function, and the whole equation is integrated over k-space. Each equation contains information about the next higher-order moment, which yields an infinite number of coupled equations. To obtain a tractable equation set, this hierarchy has to be truncated at a certain order. The next higher-order moment then has to be modeled as a function of the available lower-order moments, which is referred to as the closure of the equation system.

When only the first two moments are considered, the drift-diffusion model is obtained, which is still predominantly used in engineering applications due to its simplicity and numerical stability. As the drift-diffusion model cannot capture nonlocal effects, which become increasingly important for miniaturized devices, its use becomes questionable. Therefore, higher-order equations have to be considered.

Inclusion of the first three or four moments of Boltzmann’s equation results in hydrodynamic models which are, however, numerically too complicated to solve for everyday use due to their hyperbolic nature. Within the framework of the diffusion approximation, the convective terms in the hydrodynamic models can be neglected, resulting in much simpler parabolic energy transport models, which are offered by the leading commercial device simulators in addition to the traditional drift-diffusion model.

A detailed study reveals some common problems observed in macroscopic transport models using only the first three or four moments. Most importantly, the energy distribution function is frequently modeled by a heated Maxwellian distribution. This distribution function model is then used to derive a closure relation. However, Monte Carlo simulations show that the energy distribution function is only poorly described by a heated Maxwellian distribution function, both for bulk and inhomogeneous devices. Another important issue is the modeling of the relaxation times and mobilities. Both quantities show a pronounced hysteresis when plotted over the average energy. This matters because both quantities are often modeled as single-valued functions of the average energy. A comparison with Monte Carlo simulation data reveals that this error in the mobilities and the error introduced in the closure relation cause a spurious overshoot in the velocity characteristics.

Based on the observations made during the evaluation of transport models including the first four moments of Boltzmann’s transport equation, an extended model has been proposed which includes the first six moments. The additional even-order moment, the kurtosis of the distribution function, has proven beneficial for device simulation and the modeling of hot-carrier effects.

Several interesting cases where the heated Maxwellian assumption introduces impermissibly large errors have been discussed, particularly impact ionization, hot-carrier gate currents, and hot-carrier degradation. Highly satisfactory results are obtained in all cases. The six-moment transport model, the six-moment impact ionization model, and the six-moment gate current model have been implemented in the device simulator Minimos-NT [117]. It has been demonstrated that the six-moment model can improve on the drift-diffusion and energy transport models and still has the advantage of vastly reduced numerical costs in relation to Monte Carlo simulations [11, 74].