3.1 Introduction

As defined in Chapter 2, reliability engineering is the process of analyzing the expected or actual failure modes of a product and identifying actions to reduce or mitigate their effect. A failure mode describes the way in which a product or process could potentially fail to perform its desired function. It can be defined in several ways, the most common of which places it in a progression of time, where the failure mode comes between a cause and an effect. In some cases, however, the cause or the effect may itself be the failure mode, or a single event may be a cause, an effect and a failure mode at once. In practice, it is more likely that a single cause has multiple effects or that a combination of causes leads to one effect. Failure modes are sometimes also called categories of failures and may be broadly divided into two types, Design Failure Modes and Manufacturing Failure Modes, depending on their origin in the product development phase.

In this chapter, we take a broad look at defect mechanisms and associated failure modes typically observed in a variety of MEMS enabled products, without necessarily limiting ourselves to a specific type of fabricated device. The chapter focuses on failure modes observed primarily in the MEMS element and, to a lesser extent, on failure modes related to the interaction of the sensor and the package, but not specifically on the IC that may co-exist with the sensor in the product. More detailed studies of IC related failures are widely available [1] and are usually very specific to the process technology node. One key characteristic that differentiates MEMS sensors from traditional ICs is the use of a wide variety of unique process steps and engineering materials. The focus will not be on the specific micromachining process steps used to create the MEMS element (such as in [2]) but on the failure modes that are encountered in such processing and that depend on the particular product being developed. Surface micromachining, for example, is one of the most popular fabrication flows for MEMS; it uses a doped silicon starter wafer with subsequent layers of polysilicon, oxide, nitride and metal, as shown in Fig. 3.1. Such MEMS are fabricated with a wide variety of materials including metals, dielectrics [3], and polymers, none of which is as unique as silicon [4] in being used for both the electrical and mechanical parts of a MEMS element [5]. Silicon has been shown to be versatile not only for designers but also because it is highly compatible with standard manufacturing processes used in the semiconductor industry, which reduces the need to develop completely new fabrication infrastructure.

Fig. 3.1 MEMS assembly fabricated in polysilicon (reprinted with permission Copyright 2005 Simon Fraser University – Institute of Micromachine and Microfabrication Research [6])

As mentioned earlier, most failures observed in any MEMS enabled product can usually be traced back to a design or manufacturing decision [7], and to the choice made to use a particular material or process step. In a given product, the design phase may introduce failure mechanisms of three different varieties (functional, material, or non-analyzed), depending on whether the design was properly and sufficiently analyzed for the chosen fabrication process and operating conditions, and on whether the particular process chosen has been properly characterized. This has not always been easy, because the earliest MEMS products relied heavily on a methodology of successive iteration (Footnote 1) to achieve the necessary performance and functionality. In recent years, however, there has been a significant increase in the use of MEMS design tools that can provide detailed insight into the behavior of MEMS devices prior to actual fabrication. This has significantly reduced the overall product development time and cost, and it has also significantly reduced the number of possible failure modes or mechanisms.

The next section discusses potential failure modes that originate in the design phase.

3.2 Design Phase Failure Modes

The discussion in this section is restricted to failure modes that are distinguished by their origin in the design phase where they mainly impact the reliability of the product through the performance of the device. Broadly, we can identify two sub-categories of failures – functional and material modes.

3.2.1 Functional Failure Modes

Functional failure modes imply a degradation, loss or absence of intended performance under operating conditions due to inadequate or insufficient design, leading to a deviation from the product specification. A functional failure affects the functionality of the device in the field and thus impacts the overall reliability of the part. The loss of function may occur at the beginning of life or later (Footnote 2), but either way the failure is due to insufficient design. A good example of such a functional failure mode is the catastrophic failure of the device due to mechanical shock.

In Fig. 3.2, the catastrophic failure of a polysilicon spring is shown. Such a failure mode can routinely occur in MEMS devices either due to inappropriate handling of the part during assembly or if the device is subjected to a large enough in-use shock in the field. The behavior of silicon during high-g (shock) (Footnote 3) events depends on a variety of factors, but fundamentally it comes down to the strength of the material and the level of shock the structure was designed or analyzed for, including appropriate assumptions about corner cases. In order to accurately predict failure, a failure criterion is necessary, and the choice between a quasistatic and a rate-dependent criterion becomes important. In one study [9], the quasistatic fracture strength was cited as being a valid criterion for the dynamic performance of a MEMS device. However, for more accurate modeling of such failure processes it may be necessary to also consider the effects of grain morphology [8], surface roughness, and defect distribution. Additionally, in electro-mechanical devices there is the further complication of a pull-in instability [10], which has been observed under these dynamic conditions to produce complicated failure modes [11].

Fig. 3.2 Shock induced failure of a polysilicon spring – Reprinted with permission Copyright 2009 – Sensors MDPI [8]

The capability to predict these types of failures is essential to minimizing the risk of field failures and improving the reliability of the part.

The functional failure modes are divided between MEMS element design, system level design and package design. It is possible to also have functional failures due to design of the conditioning circuitry but this is not covered in this book.

3.2.1.1 Element Design

Element design failures include mask data faults, design rule violations, and engineering analysis faults that lead to failures where the MEMS element does not perform as expected.

3.2.1.1.1 Mask Data Faults

Mask data faults are fairly common in MEMS design because of the nature of MEMS fabrication process flows which are usually exclusively developed for a specific device. This makes it difficult to create comprehensive design rule checkers (DRCs) that are capable of catching each and every flaw in the mask set used with a particular process flow, and it is not uncommon to have manually executed layout reviews that are time consuming and prone to errors.

Another issue unique to MEMS design is the use of different CAD layout tools and formats. Historically, it has been fairly common to employ multiple formats (DXF, GDS etc.) for handling CAD data, and data translation from one format to another can also introduce faults. A further potential source of flaws in a MEMS mask design is that MEMS devices often use non-Manhattan shapes such as a circular mass or a curved spring, and semiconductor CAD tools are not fully equipped to handle such shapes. As a result of all of these issues, faults in the mask design are common. A good example of such a flaw is shown in Fig. 3.3.

Fig. 3.3 Typical mask layout faults: (a) design rule violations may occur when the shape of the MEMS structure changes, and (b) misplacement of parts of the layout

The effect of such mask data faults can be quite serious from a reliability standpoint, because in some cases they may result in an incomplete etch or an over-etch that introduces a structural flaw in the MEMS element. Such a flaw could be initially benign but can manifest itself in the field [12].

3.2.1.1.2 CAD Models

In a MEMS process, materials are deposited onto and etched from non-ideal geometries with complex inter-layers that result from the particular process sequence. The ability to accurately simulate and predict the functionality of a part in 3D depends largely on the accuracy of the CAD model representation. Most solid modeling tools use a 3D representation called NURBS (Footnote 4) to realize specific shapes, but since these are idealized representations the resulting models are characterized by flat surfaces and sharp edges, as shown in Fig. 3.4. For surface micromachined structures, such as those depicted in Fig. 3.4, this idealization is a reasonable approximation, and with a structured design methodology that examines behavior at the process and property corners, it is possible to bound the behavioral performance of the sensor.

Fig. 3.4 CAD solid model representation of a MEMS accelerometer (reprinted with permission Copyright 1997 Analog Devices)

However, MEMS designers create ever more complex designs in which it is a greater challenge to capture precise geometrical features, making it more likely that the predictive capability of numerical simulations is limited by the accuracy of the CAD solid model. From observation of the final device we know that a realistic representation of the device would not be possible using the idealized geometrical representation described above, and so more recently voxel based tools [13] have begun to tackle this complexity and produce more realistic CAD models. In Fig. 3.5, one can see that these models capture much more detail of the real device.

Fig. 3.5 Comparison of a 3D rendered model and an actual MEMS device (reprinted with permission Copyright 2008 – Coventor Inc.)

3.2.1.1.3 Material Properties

In the design phase, material properties are essential quantities for properly analyzing the behavior of the device, and inaccuracy in these properties often leads to another type of functional failure. Even though the modulus and density of most materials used in MEMS are widely available [14], process dependent properties such as residual stress, stress gradient, fracture strength, fatigue limit, and others are not simple to measure, and it is fairly common practice for the designer to simply use bulk properties (Footnote 5) during the design phase. However, the material properties of a thin film can be quite different from bulk properties. The residual stress or stress gradient within a thin film originates from either intrinsic or extrinsic sources. Intrinsic sources include material phase change, grain growth, crystal misfit, and doping, whereas extrinsic sources include plastic deformation, thermal expansion and external loads.

The lack of accurate thin film properties at the beginning of the design effort can lead to several functional failure modes, where the device does not perform as expected. In Fig. 3.6, the performance of the comb drive is highly dependent on the initial curvature within the released polysilicon layer. The stress gradient produces curvature in MEMS accelerometers that can degrade offset performance or take the part out of specification or, worse still, stress relaxation may cause the part to gradually drift out of specification over life. Finally, the lack of accurate reliability related material property data makes it challenging to predict the field reliability of a device [15].

Fig. 3.6 Curvature in a released sensor is a result of thin film stress which is highly process dependent – Reprinted with permission Copyright 1997 – Analog Devices

3.2.1.1.4 Analysis and Simulation

Another type of failure that may occur is due to insufficient design analysis of the particular MEMS element. MEMS design tools today are highly specialized analysis tools that are capable of directly accepting a mask layout file, converting it into a 3-D numerical simulation model based on the process flow, and incorporating all relevant and necessary material properties [16]. A variety of full-field simulators such as finite element analysis (FEA) or boundary element analysis (BEA) can then simulate the device behavior under the prescribed boundary and initial conditions encountered during operation of the device. In the case of a common element such as a comb drive used in an accelerometer, gyro or resonator, one analysis of interest might be the deflection behavior due to process induced stress gradients (Footnote 6), as shown in Fig. 3.7.

Fig. 3.7 Simulated displacement due to applied stress gradient in a capacitive accelerometer (reprinted with permission Copyright 2000 – Coventor Inc.)
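A closed-form estimate gives a rough feel for the size of this deflection before a full-field simulation is run. The sketch below is not taken from the chapter or from any design tool. It assumes a released cantilever of uniform cross-section with a linear through-thickness stress gradient, for which the standard small-deflection result is a curvature equal to the gradient divided by Young's modulus. The function names and the numerical values (160 GPa for polysilicon, a 1 MPa/µm gradient, a 200 µm beam) are illustrative assumptions.

```python
def curvature_from_stress_gradient(stress_gradient, youngs_modulus):
    """Curvature (1/m) of a released beam with a linear stress gradient (Pa/m)."""
    return stress_gradient / youngs_modulus


def tip_deflection(stress_gradient, youngs_modulus, length):
    """Small-deflection tip displacement (m) of a cantilever bent to constant curvature."""
    kappa = curvature_from_stress_gradient(stress_gradient, youngs_modulus)
    return 0.5 * kappa * length ** 2


# Assumed, illustrative inputs (not data from the chapter)
E_POLYSILICON = 160e9      # Pa
STRESS_GRADIENT = 1e12     # Pa/m, i.e. 1 MPa per micrometre of thickness
BEAM_LENGTH = 200e-6       # m

delta = tip_deflection(STRESS_GRADIENT, E_POLYSILICON, BEAM_LENGTH)
print(f"estimated tip deflection: {delta * 1e6:.3f} um")
```

Because the deflection grows with the square of the beam length, long comb fingers are far more sensitive to the gradient than short ones; a plate-like structure would need the biaxial modulus instead of Young's modulus.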

Simulation tools are routinely used to analyze very complex conditions encountered in MEMS devices, such as fluid-structure interactions in ink-jets, fluid-chemical analysis in bio-MEMS, etc. However, the most commonly encountered analysis in MEMS is a simple electrostatic pull-in analysis for electro-mechanical devices. If proper care is not taken, it is possible for these simulations to systematically under- or over-predict the pull-in voltage that is important to the overall function of the product. As one can see in Fig. 3.8, the pull-in voltage modeled by the common coupled finite element-boundary element approach can be over-predicted by as much as 20% (or more) if the model is not meshed with enough elements [17]. Although some failures may be caught at the fabrication of the first prototypes, there are others that are difficult to predict by simulation and remain hidden until they manifest themselves in the field. One such example is the pull-in behavior of an RF switch, where dielectric charging causes the pull-in voltage to vary over time [18]. Models for charge accumulation within dielectrics may not be included during the design phase.

Fig. 3.8 Comparison of modeled pull-in behavior of two similar FEA models with an analytical model (reprinted with permission Copyright – Coventor Inc.)
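The analytical reference in such a comparison is often the ideal parallel-plate model. The sketch below is a minimal version of that textbook model, assuming a rigid plate on a linear spring with no fringing fields; it is not the model used in [17]. The spring constant, gap and plate area are hypothetical values chosen only for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def pull_in_voltage(k, gap, area):
    """Pull-in voltage (V) of an ideal parallel-plate actuator on a linear spring."""
    return math.sqrt(8.0 * k * gap ** 3 / (27.0 * EPS0 * area))


def electrostatic_force(voltage, gap, area, x):
    """Attractive force (N) after the plate has travelled x toward the electrode."""
    return EPS0 * area * voltage ** 2 / (2.0 * (gap - x) ** 2)


def overprediction_percent(v_model, v_reference):
    """Signed error of a modeled pull-in voltage relative to a reference, in percent."""
    return 100.0 * (v_model - v_reference) / v_reference


# Hypothetical device: 1 N/m spring, 2 um gap, 100 um x 100 um plate
k, gap, area = 1.0, 2e-6, 100e-6 * 100e-6
v_pi = pull_in_voltage(k, gap, area)
print(f"analytical pull-in voltage: {v_pi:.2f} V")

# At pull-in the plate sits at one third of the gap and the forces balance
print(f"spring force there:        {k * gap / 3.0:.3e} N")
print(f"electrostatic force there: {electrostatic_force(v_pi, gap, area, gap / 3.0):.3e} N")
```

A simulated value can then be passed to `overprediction_percent` together with `v_pi` to quantify the kind of mesh-related error discussed above.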

Device behaviors that are not accurately predicted, whether for lack of time, simulation tool capability or physical understanding, can lead to systematic non-performance or to drift of a parameter during field operation, resulting in a failure.

3.2.1.2 System Level Design

Another design limitation that is routinely encountered and that may lead to field failures is related to system complexity. A MEMS product is a complex system comprising a MEMS element, electronics, and a package, and it is a significant modeling challenge to predict overall system behavior, because the sub-components must be simplified to a workable level of abstraction without an unacceptable loss of accuracy. The system level models may then not give the designer enough predictive information to identify a potentially serious failure mode. Essentially, failure predictability decreases as system level model abstraction increases. The Texas Instruments DLP® product, which contains over a million individually addressable mirrors (1024×1024 pixels) with signal processing at each pixel and a custom package (Footnote 7), is a good example of system complexity. One way to understand the problem of “sufficient analysis” is that for each DMD chip the failure rate is set to be <1 ppm, or less than 1 mirror per chip. This requires the modeling used to predict the overall functional performance of the chip to be of extremely high fidelity [19].
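The arithmetic behind the "1 ppm is about 1 mirror per chip" statement can be checked directly. The sketch below uses the 1024×1024 array size and the 1 ppm figure quoted above; the assumption that mirrors fail independently, and the function names, are illustrative and not part of the cited work.

```python
import math

MIRRORS_PER_CHIP = 1024 * 1024   # array size quoted in the text
DEFECT_RATE_PPM = 1.0            # failure-rate ceiling quoted in the text


def expected_failed_mirrors(n_mirrors, rate_ppm):
    """Mean number of failed mirrors on one chip."""
    return n_mirrors * rate_ppm * 1e-6


def prob_defect_free_chip(n_mirrors, rate_ppm):
    """Probability that every mirror works, assuming independent failures."""
    return (1.0 - rate_ppm * 1e-6) ** n_mirrors


print(f"mirrors per chip: {MIRRORS_PER_CHIP}")
print(f"expected failed mirrors at 1 ppm: "
      f"{expected_failed_mirrors(MIRRORS_PER_CHIP, DEFECT_RATE_PPM):.3f}")
print(f"chance of a fully working chip at 1 ppm: "
      f"{prob_defect_free_chip(MIRRORS_PER_CHIP, DEFECT_RATE_PPM):.2f}")
```

Under the independence assumption, a per-mirror rate right at the 1 ppm ceiling would still leave most chips with at least one failed mirror, which illustrates why the design target has to sit well below it.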

3.2.1.2.1 Design Integration

Usually, a MEMS product comprises a MEMS sensor and control circuitry, as described in the block diagram in Fig. 3.9. The ability to co-simulate the behavior of the entire system including the sensor can be another significant challenge, considering that circuit level simulators are usually not capable of adequately representing the MEMS element [16].

Fig. 3.9 Control system and circuitry for XL50 (reprinted with permission Copyright – Analog Devices)

Typical failure modes that could arise due to a lack of such integrated design capability fall into two main categories:

  1. Process corners: During the circuit design phase, designers simulate the performance of the ASIC at all process corners and temperatures. During this stage of the design process, the inability to adequately represent the MEMS element over the element’s own process corners or behavior over temperature within the design can lead to multiple failures that affect both functionality and reliability.

  2. Circuit Timing: Again during the circuit design phase, it is critical to get the circuit timing for the drive and sense signal chains precisely synchronized. If a suitably sophisticated sensor model cannot be integrated into the system level model, this may result in timing failures which are sometimes extremely difficult to pin-point and which lead to field failures under certain use conditions.
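The idea of sweeping the MEMS element over its own process corners can be sketched with a lumped spring-mass model. Everything in the example below is a hypothetical illustration: the nominal stiffness and mass, the ±10% and ±5% spreads, and the choice of resonant frequency as the quantity handed to the circuit designer are assumptions, not values from any product.

```python
import itertools
import math

# Hypothetical nominal element parameters and fractional process spreads
NOMINAL_K = 2.0        # N/m
NOMINAL_M = 1.0e-9     # kg
K_SPREAD = 0.10
M_SPREAD = 0.05


def resonant_frequency(k, m):
    """Undamped resonant frequency (Hz) of a lumped spring-mass element."""
    return math.sqrt(k / m) / (2.0 * math.pi)


def corner_values(nominal, spread):
    """Low, nominal and high values of one process-dependent parameter."""
    return (nominal * (1.0 - spread), nominal, nominal * (1.0 + spread))


def sweep_corners():
    """Evaluate the element at every combination of stiffness and mass corners."""
    results = {}
    for k, m in itertools.product(corner_values(NOMINAL_K, K_SPREAD),
                                  corner_values(NOMINAL_M, M_SPREAD)):
        results[(k, m)] = resonant_frequency(k, m)
    return results


corners = sweep_corners()
f_min, f_max = min(corners.values()), max(corners.values())
print(f"{len(corners)} corners, frequency range {f_min:.0f} Hz to {f_max:.0f} Hz")
```

The resulting frequency band is the kind of information a drive or sense timing simulation would need in order to cover the element's corners as well as the ASIC's own.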

In recent years, substantial advances in CAD tools have enabled designers to co-simulate both the MEMS and the circuitry required to drive it, in a single environment [16] leading to more robust simulation of the system as a whole.

3.2.1.3 Package Design

Assembly or packaging of MEMS devices presents a unique challenge for MEMS designers and reliability engineers alike (Fig. 3.10). The similarities with conventional IC packaging are clear in the sense that the package must reject certain inputs like moisture, contamination, etc. and tolerate common loads such as temperature, shock, handling or tester forces. However, unlike in conventional IC packaging, the MEMS element must interact directly with the outside world in order to perform its design function. In the case of accelerometers, this does not require the package to be designed markedly differently from a conventional IC package, but in the case of a wide variety of other MEMS products that sense pressure, temperature, or chemical species, or that control light or sound, there needs to be direct interaction between the sensing element and the input that needs to be measured.

Fig. 3.10 MEMS packaging challenge

Each MEMS application usually requires a new package design to optimize its performance or to meet the needs of the system. This is the primary reason why the cost fraction of packaging a MEMS device remains high [10]. There are several categories of MEMS packages, including metal, ceramic, plastic, and thin-film multilayer packages [20], that are similar to standard semiconductor packaging. However, there are several factors that make MEMS packaging more complicated and establish the need for comprehensive MEMS-package design integration [21]. Some of these factors are:

  • The package usually includes a discrete circuit chip besides the MEMS device.

  • The package contains a hermetically bonded silicon cap over the sensor structure which is sensitive to assembly forces.

  • Assembly material selection depends on the application and is complicated by differences in properties such as CTE, modulus, and glass transition.

  • Package failure modes observed and reliability issues can be quite diverse [22].

As a result of such factors, there are two basic categories of failure modes related to the assembly process: those related to package materials, and those related to sensor-package interaction.

3.2.1.3.1 Package Materials

There are a variety of packages used for MEMS sensors [15], and these contain an even wider variety of materials, including metals, plastics, ceramics, and polymers of various kinds, that all have to function together to ensure performance over the life of the sensor. As shown in Fig. 3.11, a typical MEMS accelerometer may be packaged in a plastic over-molded package surrounded by plastic mold compound, gel, and die attach, all of which have different material properties [23]. The reliability of the product depends as much on the package materials as it does on the MEMS element, and several potential failure modes have been observed due to package material selection and characteristics.

Fig. 3.11 Cross-section of a typical plastic overmolded MEMS sensor (reprinted with permission Copyright 2009 Chipworks [24])

These failures may be broadly divided into two types:

  1. Interfacial: Interfacial failures arise primarily due to differences in strength or CTE between adjacent materials or poor interfacial adhesion between the layers. In many packages, it is possible to find interfaces such as the one shown in Fig. 3.12, which clearly shows that the die-attach (DA) thickness varies and that adhesion may even be poor in certain areas.

Fig. 3.12 Quality of die attach bond line for a MEMS die attached to a substrate (reprinted with permission Copyright 2008 – Analog Devices)

The failure mechanisms induced due to poor interfaces are almost always related to the stress on the MEMS element. Depending on the package and the applied stress state the sensor element can react quite differently but it is clear that in such cases the long term performance of the sensor is affected.

  2. Bulk: Material failure modes that arise from the behavior of package related materials are, again, almost always stress inducing on the MEMS element. One important property is the glass transition (Footnote 8) temperature (Tg) of the plastic used in overmolded plastic packages. In a MEMS product such as that shown in Fig. 3.11, there is a clear interdependence between the performance of the plastic and the reliability of the product. The Tg of some typical polymers used is in the 120–150°C range but may depend on the cure time, i.e. a longer cure time increases the cross-linking and results in a higher Tg. The higher the Tg, the better the performance of the part over the operational temperature range, which typically extends up to 85°C. For example, in [25] the gyro package shows large nonlinear behavior over the temperature range up to 140°C because the mold compound goes through a glass transition. However, increasing the Tg might also increase the susceptibility to package cracking during reflow [26].

Other typical defects in the die attach (see Fig. 3.13) include voids, interfacial voids and delamination, and cracks, and for plastic over-molded packages additional defects include wire sweeping, incomplete filling, cracking, blistering and flashing. All of these defects can result in a variety of reliability related failure modes including drift, and component failure. The integrity and strength of the package materials used play a crucial role in the overall reliability of the part.

Fig. 3.13 Die attach failure modes

3.2.1.3.2 MEMS – Package Interaction

MEMS by their nature require application specific packaging and since the package immediately surrounds the MEMS sensor it has a direct effect on its thermal-mechanical behavior, environmental compatibility and contamination. A major contributor to increased product development cycles is the lack of focus on the package early on in the design phase. It is therefore critical that during the design phase a thorough study of the influence of the packaging on performance be conducted and that this occurs simultaneously with the sensor element design. There are several valid methodologies [27, 28] that depend on the relative size of the package, sensor size and the level of detail required in the particular analysis. This size difference has been known to create a serious challenge for numerical simulators attempting to perform brute force simulations of the MEMS and package together.

The coupling between the package and sensor chip is most commonly observed with temperature effects, where the difference in CTE between the mold compound, die-attach and sensor chip leads to a complex stress state at the sensor. In a MEMS gyro packaged in a plastic overmolded package [25], the local CTE mismatch produces a convex bending of the package at the maximum of the temperature range, but the curvature is of opposite sign at temperatures lower than 125°C and at room temperature. The MEMS gyroscope in this case must be designed to be less sensitive to these strains, as they deform the spring elements of the gyroscope and lead to resonant frequency changes of the sensing and driving modes. In order to increase the robustness of the gyro to this type of deformation, the designers in this case modified the spring design to reduce the frequency shift observed across the operational temperature range (Fig. 3.14).

Fig. 3.14 MEMS Gyro packaged in a plastic over-molded package – Reproduced with permission Copyright – 2007 IEEE [25]
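The sign reversal of the package curvature can be illustrated with the simplest possible mismatch estimate, the free thermal strain difference between two bonded materials. The sketch below is an assumption-laden illustration and not the analysis of [25]: the CTE values are generic figures for a mold compound and for silicon, the stress-free temperature of 125°C is hypothetical, and the model ignores geometry, stiffness and the change of mold compound CTE near its glass transition.

```python
def mismatch_strain(alpha_a, alpha_b, temperature, stress_free_temperature):
    """Free thermal strain difference between materials a and b (dimensionless)."""
    return (alpha_a - alpha_b) * (temperature - stress_free_temperature)


# Assumed, illustrative values (not taken from the cited study)
ALPHA_MOLD = 15e-6         # 1/K, generic mold compound
ALPHA_SILICON = 2.6e-6     # 1/K
T_STRESS_FREE = 125.0      # degC, hypothetical stress-free temperature

for temperature in (25.0, 85.0, 125.0, 140.0):
    strain = mismatch_strain(ALPHA_MOLD, ALPHA_SILICON, temperature, T_STRESS_FREE)
    print(f"{temperature:6.1f} degC: mismatch strain {strain:+.2e}")
```

The strain changes sign on either side of the stress-free temperature, which is the qualitative behavior a spring design has to tolerate across the operating range.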

The MEMS designer needs to be able to account for similar effects due to hygroscopic swelling of the mold compound [29] and the die attach [30], both of which have long term effects on the behavior and reliability of the part.

In the next section we will take a closer look at some common material failure modes within the sensor element itself. However, the reader is referred to Chapter 4 for more details on specific failure modes that manifest in the field.

3.2.2 MEMS Material Failure Modes

Specific material failure modes in MEMS devices can be varied and highly process dependent but are commonly divided into the following categories: Thermo-Mechanical Failures, Electrical Failures, and Environmental Failures.

3.2.2.1 Thermo-Mechanical (TM) Failures

Thermo-mechanical failures are those failures resulting from thermo-mechanical forces and generally include the most common of MEMS stress failures i.e. residual stress:

  1. Contact Wear: RF switches or contact actuators [31] are Class III-IV (Footnote 9) MEMS devices where contact of proximal surfaces (as shown in Fig. 3.15) occurs frequently and with sufficient force to cause time dependent damage resulting in wear of the contact surfaces [32]. During normal operation of the device, these surfaces come into repeated contact and the material in the contact zone is subjected to large stresses under conditions of large current densities and temperature, which eventually lead to wear failures. The reliability of the switch is dependent on the material properties and processing conditions of the contact zone [33].

Fig. 3.15 RF MEMS switch contact (reprinted with permission Copyright – Northeastern University)

Other examples of RF switch reliability may be found in an interesting case study in Chapter 2, and the wear of the SAM coating on aluminum surfaces in the DMD® mirror is discussed in Section 5.5.7 (AFM Methods). The wear observed in the latter case occurs purely under mechanical contact conditions that are less extreme than those encountered in an RF switch.

In the case of RF switches, the contacts need to be able to transmit a current of sufficient magnitude through a very small area, resulting in very large gradients of temperature and stress which cause local damage. As a result of this accumulated damage to the interacting surfaces, the contact resistance gradually increases over life until eventually the contact breaks down, as shown in Fig. 3.16.

Fig. 3.16 Lifetime data for a typical RF switch (reprinted with permission Copyright – 2002 MANCEF [34])

The requirements for improving wear resistance (and consequently the reliability of the device) are low adhesion, high current capacity and low contact resistance [31]. Contact resistance (assuming an elastic-plastic contact [35]) is made up of two main terms, the constriction resistance (R_c) and the film (tunneling) resistance (R_t), which depend on the hardness (H), contact force (F), resistivity (ρ), film resistivity (ρ_t), and elastic-plastic factor (ξ), as described in Equation (3.1).

$$R = R_{\rm c} + R_{\rm t} = 0.89\,\rho \left( \frac{\xi H}{nF} \right)^{1/2} + \frac{\rho_{\rm t}\,\xi H}{F}$$
(3.1)
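Equation (3.1) is easy to evaluate numerically, which makes the dependence on contact force concrete. The sketch below transcribes the two terms as written; the function names and the material values (a gold-like resistivity, a hardness of 1 GPa, ξ = n = 1, and a film resistivity in Ω·m²) are illustrative assumptions rather than data from [35].

```python
import math


def constriction_resistance(rho, hardness, force, xi=1.0, n=1.0):
    """First term of Eq. (3.1): R_c = 0.89 * rho * sqrt(xi * H / (n * F))."""
    return 0.89 * rho * math.sqrt(xi * hardness / (n * force))


def film_resistance(rho_t, hardness, force, xi=1.0):
    """Second term of Eq. (3.1): R_t = rho_t * xi * H / F."""
    return rho_t * xi * hardness / force


def contact_resistance(rho, rho_t, hardness, force, xi=1.0, n=1.0):
    """Total contact resistance R = R_c + R_t in ohms."""
    return (constriction_resistance(rho, hardness, force, xi, n)
            + film_resistance(rho_t, hardness, force, xi))


# Assumed, illustrative inputs in SI units
RHO = 2.4e-8        # ohm*m, gold-like bulk resistivity
RHO_T = 1.0e-12     # ohm*m^2, hypothetical film resistivity
HARDNESS = 1.0e9    # Pa

for force_uN in (50, 100, 200, 400):
    r = contact_resistance(RHO, RHO_T, HARDNESS, force_uN * 1e-6)
    print(f"F = {force_uN:4d} uN  ->  R = {r * 1e3:.1f} mOhm")
```

The constriction term falls only as the inverse square root of the force while the film term falls as the inverse of the force, so large increases in force are needed for a modest reduction in resistance once the film contribution is small.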

To create a lower stable contact resistance it becomes necessary to increase the contact force (F) but this also increases the adhesion force between the surfaces which then requires more force to break the contact, and so is not really an optimal approach [35]. The relationship between the adhesion force and contact force is determined by the nature of the contact i.e. elastic or elastic-plastic, as well as the occurrence of contact heating and welding. The adhesion factor which is the ratio of the separation (breaking) force to the contact force is given by:

$$\frac{F_{\rm separation}}{F_{\rm contact}} = \frac{\zeta_1 \zeta_2 \zeta_3}{2}$$
(3.2)

where ζ_1 is the slide factor (1–1.5), ζ_2 is the elastic de-compression factor (0.6–1.0), and ζ_3 is the film factor (0–1.0). For pure (bulk) gold (99%), the adhesion factor is 0.68, while for ruthenium (Ru) coated contacts it is ∼0.22. A lower adhesion factor means a lower breaking force, which will produce less damage at the contact surfaces. Figure 3.17 shows the contact resistance as a function of contact force for several metallic materials such as gold (Au) and rhodium (Rh).

Fig. 3.17 Contact resistance as a function of contact force for several metals (reprinted with permission Copyright – 2007 SPIE [36])
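The adhesion factor of Equation (3.2) and the resulting breaking force can be tabulated in a few lines. The sketch below uses the factor ranges and the gold and ruthenium values quoted in the text; the 100 µN contact force and the function names are assumptions made for illustration.

```python
def adhesion_factor(slide, decompression, film):
    """Eq. (3.2): ratio of separation force to contact force."""
    return slide * decompression * film / 2.0


def separation_force(contact_force, factor):
    """Force (N) needed to break a contact closed with contact_force (N)."""
    return factor * contact_force


# Upper bound implied by the quoted ranges of the three factors
print(f"maximum adhesion factor: {adhesion_factor(1.5, 1.0, 1.0):.2f}")

# Adhesion factors quoted in the text; the contact force is a hypothetical value
CONTACT_FORCE = 100e-6  # N
for material, factor in (("Au (bulk)", 0.68), ("Ru coated", 0.22)):
    f_break = separation_force(CONTACT_FORCE, factor)
    print(f"{material:10s}: breaking force {f_break * 1e6:.0f} uN")
```

For the same closing force, the ruthenium coated contact needs roughly a third of the breaking force of bulk gold, which is the sense in which a lower adhesion factor reduces damage per cycle.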

The force to break the contact is determined by the adhesion and welding forces and it is this breaking force which eventually causes wear and degradation of the contacting surfaces. For example, the difference in the breaking force between evaporated gold and sputtered gold is an order of magnitude leading to longer lasting contacts from evaporated gold [35].

Another factor in such types of asperity contacts is the temperature rise in the contact which is predicted by the following Equation (3.3):

$$\frac{I^2}{F} = \frac{16L\left( T^2 - T_A^2 \right)}{\pi H \rho_A^2 \left[ 1 + \frac{2}{3}\alpha \left( T - T_A \right) \right]^2}\,\frac{n}{\xi}$$
(3.3)

where L is the Lorenz constant, n characterizes the surface condition, ρ_A is the resistivity at the ambient temperature, H is the hardness, and T_A and T are the ambient and melting temperatures respectively. The decrease in life cycles between hot and cold switching is observed in Fig. 3.18.
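Equation (3.3) can be rearranged for the current at which the contact spot is predicted to reach a given temperature. The sketch below does this directly. It treats α as the temperature coefficient of resistivity, which is an assumption since the text does not define it, and the gold-like material values, the 1 GPa hardness and ξ = n = 1 are illustrative rather than measured.

```python
import math

LORENZ = 2.44e-8  # Lorenz constant, W*ohm/K^2


def current_for_contact_temperature(force, t_contact, t_ambient, hardness,
                                    rho_ambient, alpha, n=1.0, xi=1.0):
    """Current (A) at which Eq. (3.3) predicts the contact reaches t_contact (K)."""
    numerator = 16.0 * LORENZ * (t_contact ** 2 - t_ambient ** 2) * n
    bracket = 1.0 + (2.0 / 3.0) * alpha * (t_contact - t_ambient)
    denominator = math.pi * hardness * rho_ambient ** 2 * bracket ** 2 * xi
    return math.sqrt(force * numerator / denominator)


# Assumed, illustrative gold-like inputs in SI units
T_MELT = 1337.33     # K
T_AMBIENT = 300.0    # K
HARDNESS = 1.0e9     # Pa
RHO_AMBIENT = 2.4e-8 # ohm*m
ALPHA = 3.4e-3       # 1/K

for force_uN in (50, 100, 200):
    i_melt = current_for_contact_temperature(force_uN * 1e-6, T_MELT, T_AMBIENT,
                                             HARDNESS, RHO_AMBIENT, ALPHA)
    print(f"F = {force_uN:3d} uN  ->  current to reach melting: {i_melt:.2f} A")
```

Since I² is proportional to F, the tolerable current grows only with the square root of the contact force, which is consistent with hot switching being harder on the contacts than cold switching.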

  2. Fatigue: Fatigue is the collective name for multiple phenomena that arise due to different mechanisms in brittle and ductile materials, resulting in a progressive decline in load bearing capacity that eventually leads to catastrophic failure. These types of failures are particularly troublesome because of the process dependent behavior of thin film materials over many accumulated cycles of stress. Cyclic fatigue damage may cause several types of performance failures such as resonant frequency decreases (Section 5.5.8), drift and catastrophic failure. The key design parameter for fatigue is the Endurance Limit (S_m), but some MEMS materials such as aluminum do not have a well defined endurance limit, while others such as silicon or SiN are not known to exhibit fatigue at typical operational levels. A more detailed discussion of fatigue mechanisms in MEMS materials is available in Section 4.2.5, but fatigue is in essence the dominant failure mode associated with crack initiation and growth due to a time-varying stress [40].

  3. Work Hardening: Work hardening is a characteristic of ductile materials such as metals and alloys and occurs when there is overstress above the yield limit (σ_y). At low levels of work hardening it is not uncommon to see a shift in the residual stress state, leading to subtle changes in the performance of a device over time, e.g. the curvature or frequency of the device might shift. At higher levels this could eventually lead to catastrophic failure in the form of plastic deformation that completely degrades the function of the device. Examples of susceptible components include LIGA MEMS, metal hinges, or contacts.

  4. Delamination: Delamination between deposited layers has been observed in MEMS devices and may occur (as shown in Fig. 3.19) due to either processing defects or high stress levels at interfaces. A processing defect arises from incomplete or non-uniform deposition and can be a source of weakness in a device, leading either to a non-performing part or to failure early in the life of the part. Delamination can also occur when the part is in the field, where it is more difficult to detect [41] because of the extent or location of the defect within the device. Typically this failure mode is triggered by a high-stress event, e.g. thermal or acceleration shock, or by anodic oxidation. Another form of delamination, called spall, is a dynamic failure mode observed in multi-layered thin metal films subjected to high strain rates in shock situations.

    MEMS failures due to delamination may occur more often because commonly used process steps such as wafer bonding and chip-to-chip bonding (e.g. wafer-to-wafer bonding) are very challenging to perfect. Results presented in [22] show that accelerated testing and thermal cycling can induce the onset of delamination; however, the influence of mechanical shock on delamination is not as large as expected. Adhesion properties between different material layers or capillary forces could result in weakness of the bonded or interface layers, and thermal effects such as CTE mismatch between layers usually play a very active role in causing such failures.

  5. Creep Failure: Creep failures occur mostly in metals subject to time-dependent loading at an elevated temperature. There are several types of creep, i.e. dislocation glide, Nabarro-Herring (NH) creep, Coble creep, grain boundary sliding, etc. Although a mechanism like dislocation glide occurs at any temperature, others like dislocation, NH or Coble creep typically occur when the homologous temperature (T_H; see footnote 10) satisfies T_H > 0.4. Failures initiate at grain boundaries, but it is also possible to activate secondary creep due to strain rate effects. The high stresses and gradients introduce time-dependent behavior through dislocation glide and diffusion mechanisms, and the strain levels can be large compared to the average size scale of a MEMS device. A good example of a high stress state that might induce creep in MEMS is an RF switch held in the on state at an elevated temperature [42]. The primary concern is the use of certain metals as structural materials in MEMS, because creep can occur even at room temperature, degrading the performance of the device. Additional discussion of this topic and mechanism may be found in Section 4.2.4.
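As a rough illustration of the T_H > 0.4 guideline in the creep item above, the following sketch computes the homologous temperature, taken here as the operating temperature divided by the melting temperature with both in kelvin. The melting points are approximate handbook values supplied as assumptions, and the function names are hypothetical; this is a screening aid, not a creep model.

```python
# Approximate melting points in kelvin (handbook values; assumptions for this sketch).
MELTING_POINT_K = {"Al": 933.5, "Au": 1337.3, "Cu": 1357.8, "Ni": 1728.0}


def homologous_temperature(metal, temp_c):
    """Return T_H = T / T_melt, with both temperatures in kelvin."""
    return (temp_c + 273.15) / MELTING_POINT_K[metal]


def exceeds_creep_threshold(metal, temp_c, threshold=0.4):
    """True when T_H is above the 0.4 guideline quoted in the text."""
    return homologous_temperature(metal, temp_c) > threshold


# Compare room temperature with an elevated operating temperature.
for metal in MELTING_POINT_K:
    for temp_c in (25.0, 125.0):
        t_h = homologous_temperature(metal, temp_c)
        flag = exceeds_creep_threshold(metal, temp_c)
        print(f"{metal} at {temp_c:5.1f} C: T_H = {t_h:.2f}, above 0.4: {flag}")
```

With these assumed values, aluminum crosses the 0.4 guideline between 25°C and 125°C while the other three metals stay below it. The check only flags the diffusion-assisted mechanisms; as noted above, some metals can creep even at room temperature.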

Fig. 3.18 Switch lifetime vs. actuation voltage from several published works [37–39] (reprinted with permission Copyright – 2002 MANTECH [33])

Fig. 3.19 Delaminated or debonded interface

The dominant physical phenomenon (i.e. physics of failure) in each of these TM failure modes is material stress or strain beyond a certain limit.

3.2.2.2 Electrical (EL) Failures

Electrical failures occur due to static or dynamic charge transfer within materials or across gaps or surfaces and in MEMS devices this can lead to several potential failure modes.

  1. Dielectric Charging: Dielectric charging is the accumulation of electric charge in an insulating dielectric layer. In certain MEMS applications such as capacitive RF MEMS switches [39], the insulating dielectric between the electrodes can accumulate charge over time, leading to a failure in which the switch either remains stuck after removal of the actuation voltage or fails to make contact under a sufficiently high voltage. In such switches, charge accumulation results from large electric fields across very thin dielectric layers [43]. The trapped charges have no conductive path and accumulate over time, leading to two possible failure modes: (a) drift, in which the performance of the device changes slowly over time because of the stored charge, and (b) latch-up, in which the accumulated charge changes the pull-in dynamics of the switch and can increase the pull-out voltage to a point where the mechanical restoring force is not sufficient to open the switch. Considerable effort has been devoted to both the experimental characterization of dielectric charging and the development of models that can be used to predict its impact on the electro-mechanical behavior of a capacitive switch [18]. Further discussion is available in Section 4.3.1.

  2. Electromigration: In semiconductor devices, electromigration is a well-documented phenomenon in which voids or hillocks form over time due to high current densities in thin-film conductors within integrated circuits. Figure 3.20 shows an interconnect damaged by significant momentum transfer from electrons to conductor atoms [44]. The failure modes associated with this phenomenon are quite clear: loss of function due to shorts, and changes in parasitic impedances over time. In the design phase, Black's empirical equation, which predicts the MTTF of a wire subject to electromigration, may be used to estimate the effects of current density and temperature on reliability. Section 4.3.3 delves into more detail on this in-use failure mode.

  3. Electro-static Discharge (ESD) or Arcing: Another potential electrical failure mode that commonly results in catastrophic damage in MEMS is ESD or arcing (Fig. 3.21). The presence of very small gaps (on the order of a few nanometers to a few microns), the possibility of gap closure during use, and geometries that can lead to non-uniform, high electric fields make MEMS structures particularly vulnerable to electrostatic discharge, overvoltage, charging, or corona effects. There is limited research in this area [45], and more work is needed, particularly for RF MEMS switches.
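The electromigration item above refers to Black's empirical equation without stating it. In its commonly quoted form it reads MTTF = A · J^(-n) · exp(E_a / (k·T)), where J is the current density, T the absolute temperature and k the Boltzmann constant. The sketch below uses that form to compare a stress condition with a use condition. The prefactor A, the exponent n = 2 and the activation energy of 0.7 eV are illustrative assumptions, not values taken from this chapter, and the function names are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mttf(current_density, temp_k, prefactor, exponent_n=2.0, activation_ev=0.7):
    """Black's equation: MTTF = A * J**(-n) * exp(Ea / (k * T)).

    current_density is in A/m^2 and temp_k in kelvin. The prefactor A is
    empirical; exponent_n and activation_ev are assumed illustrative values.
    """
    arrhenius = math.exp(activation_ev / (BOLTZMANN_EV_PER_K * temp_k))
    return prefactor * current_density ** (-exponent_n) * arrhenius


def acceleration_factor(j_use, t_use_k, j_stress, t_stress_k,
                        exponent_n=2.0, activation_ev=0.7):
    """Ratio MTTF(use) / MTTF(stress); the unknown prefactor A cancels."""
    use = black_mttf(j_use, t_use_k, 1.0, exponent_n, activation_ev)
    stress = black_mttf(j_stress, t_stress_k, 1.0, exponent_n, activation_ev)
    return use / stress


# Example: use at 85 C and 1e10 A/m^2 versus stress at 150 C and 2e10 A/m^2.
af = acceleration_factor(1e10, 358.15, 2e10, 423.15)
print(f"Acceleration factor: {af:.1f}")
```

Working with the ratio of two conditions avoids having to know the prefactor A, which is normally fitted to test data for a specific metallization.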

Fig. 3.20 SEM micrograph showing voids and hillocks (reproduced with permission Copyright – [44])

Fig. 3.21 Catastrophic failure due to ESD in RF MEMS switches (reprinted with permission Copyright – 2008 IACM, ECCOMAS [45])

The reliability of MEMS devices can be heavily influenced by electrical failures and it is important for the designer to understand the sources of these failures and account for them. A more detailed treatment of this topic is available in the next chapter (Section 4.3.2).

3.2.2.3 Environmental (ENV) Failures

MEMS applications are diverse in their interaction with the environment. In some applications, such as pressure sensors and microphones, the sensing element is directly exposed to the operating environment, which can be quite aggressive. In a harsh-environment application such as tire pressure monitoring (TPMS), the sensor element has to sense the air pressure directly on one side of the diaphragm; however, besides the application stresses and temperatures, this environment typically contains particulates of various sizes and a multitude of contaminants. The interactions between environmental forces and the materials within the device can result in a variety of failure modes in MEMS. Some of these are listed below and are covered in more detail in Chapter 4.

  1. Anodic Oxidation: Anodic oxidation can be a fatal failure mechanism in polysilicon MEMS devices that operate in humid environments. The exact failure mode depends on the design but it is known that positively charged polysilicon traces can fail due to oxidation [46], and that such oxidation can cause delamination between polysilicon and nitride layers [41]. A more thorough discussion of anodic oxidation is available in Section 4.4.2.

  2. Corrosion: There are many types of failure mechanisms related to corrosion, including galvanic, crevice, pitting and stress corrosion, dendrite growth, whisker growth, and corrosion due to moisture, microorganisms and biological contamination. While it is not possible to cover all of these in detail, the reader is directed to Section 4.4.3 for more information on galvanic corrosion.

  3. Grain Growth: Grain growth in MEMS materials like polysilicon has been observed under conditions of high stress and temperature, leading to failure mechanisms in which the device ceases to function or its behavior changes gradually. From experiments on polysilicon [47] it has been observed that the grain growth mechanism is significantly affected by the doping conditions, specifically when the dopant concentration in the grains is above the solid solubility limit (and is apparently independent of the method of polysilicon doping). Generally, polysilicon grains grow by secondary recrystallization, which is driven by grain boundary energy as opposed to defect energy, and the rate is temperature and stress dependent. In particular, geometrical features that cause higher stress states (e.g. film corners) will have lower grain growth.

In Chapter 4, a more detailed look at other environmental failure mechanisms caused by radiation, as well as the physics of failure involved in selected cases, is presented; for now we will continue to look at failure modes in MEMS that have their origin in the product development phase.

3.2.3 Non-analyzed Conditions

Non-analyzed conditions are a sub-category of failure modes that serves as a catch-all for the many different environmental (or other) conditions the device may be subjected to that are not analyzed a priori. It is impossible to simulate all possible environmental or operating conditions prior to fabrication, so a robust strategy of testing and qualifying the device under a series of burn-in, acceleration and other protocols is used to reveal weaknesses or uncover potential failure modes. Examples of these are described in Chapter 2.

A good example of such a factor is the stress corrosion cracking of polysilicon [48]. In the absence of a corrosive environment, a brittle material like polysilicon should be relatively insensitive to cyclic fatigue, but such fatigue effects are observed in MEMS polysilicon samples tested in air [49], as seen in Fig. 3.22. The fatigue damage may originate from contact stresses at surface asperities, which exacerbate subcritical crack growth during further cyclic loading.

Fig. 3.22 High cycle fatigue of polysilicon (reprinted with permission Copyright – 2002 Science [48])

Under these conditions, a corrosive ambient such as laboratory air exacerbates the fatigue process through formation of an additional thickness of surface oxide on surface asperities or crack surfaces which generates higher stresses during compressive stress cycles. Without cyclic loading, polysilicon does not undergo stress corrosion cracking.

3.2.3.1 Leakage Currents

Leakage currents in MEMS can cause havoc in the performance of the part over life. Previous studies [50] have researched the implications of leakage currents in surfaces and volumes of dielectrics within the MEMS device itself but quite often the failures due to leakage currents are not limited to the sensor itself but could occur due to assembly or semiconductor processing.

Most MEMS have some control circuitry and I/O pads that influence the performance of the part, and it is often here that we encounter leakage currents that can cause reliability failures. Conductive die-attach material, or even silicon chip-outs from wafer dicing or handling, can lodge in bond pad regions or exposed interfaces, leading to leakage currents between isolated parts of the design. The presence of an oxide layer on the surfaces of such particles makes it less likely that they will be detected at final test or during field use, but when the oxide starts to degrade due to stress effects, the resulting leakage currents can start to influence part performance. In traditional IC chips, such failures are detected by techniques such as XIVA/LIVA/CIV (footnote 11), and similar techniques can be used for MEMS.

In summary, the functional and material failure modes described in this section are the major failure modes encountered in MEMS design. In the next section we will look at failure modes that have their origin in the manufacturing phase of product development.

3.3 Manufacturing Failure Modes

Manufacturing related failures are due to specific processing characteristics and are usually difficult to eliminate. They are mainly of two types, depending on where they originate in the manufacturing process. In general, the manufacturing process is divided between front-end processing, which typically includes clean room processing, photolithography, etching, etc., and back-end processing, which includes wafer dicing, assembly and final test. Figure 3.23 identifies some of the main manufacturing steps where defects may be introduced. It is also common to use the terms local and global defects rather than front end or back end. Local defects are primarily contamination and any form of mis-processing, such as voids or stringers, and could potentially also include effects of design rule sensitivity, whereas global defects cover a broader spectrum, from those due to wafer-level variations and handling to those due to assembly.

Fig. 3.23 Manufacturing process related defects

3.3.1 Front End Process Defects

Process related defects fall into three broad categories: material transport failures such as those due to deposition and etch steps, wafer bonding failures such as loss of hermeticity, and tribological failures such as stiction.

3.3.1.1 Local (Wafer) Defects

Local defects due to contamination are a common failure mode that is routinely dealt with throughout the industry. In general, local defects in MEMS are primarily of four types – particulate, ionic, organic contamination defects, and voids and stringers.

Particulate defects refer to nano- or micron-size particulates (typically 18–90 nm at 0.052–0.064 #/cm² for FEOL (footnote 12) and 36–180 nm at 0.052–0.064 #/cm² for BEOL (footnote 13) [51]) that will cause a variety of potential failure modes in ASICs, whereas in MEMS larger particles can be tolerated. Given the small feature sizes involved, a particle (Fig. 3.24), especially if conductive, could potentially cause open or short circuits, degrade the motion of the MEMS element, or cause intermittent performance deviations (Fig. 3.25).

Fig. 3.24 Particulate contamination on a passive shock element (reprinted with permission Copyright – 2008 Sandia [52])

Fig. 3.25 TOF-SIMS image of a 60 μm² area of a wafer after two different cleaning steps, showing copper ion counts along the corresponding highlighted lines (reprinted with permission Copyright – 1999 Micromagazine.com [57])

Obviously, the size of a particle has much to do with its kinetics, in terms of which forces act on it and, more importantly, what can be done to eliminate particles below or above a certain threshold size. For example, for particles <10 nm in size, the motion is dominated by Brownian motion and is heavily influenced by gas or liquid molecules. Typically, particles that are created close to the wafer surface may be deposited by Brownian motion. At the next size up, between 0.1 and 1 μm, the motion of the particle is influenced by thermophoresis, a non-continuum effect caused by a temperature gradient, e.g. a cold wafer introduced into a hot oven. Particles are repelled from hot surfaces and attracted to cold surfaces, leading to higher contamination levels in the cold regions [53]. Lastly, at sizes above a critical size d_cr (1 μm or greater), inertial or gravitational forces dominate.
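The size regimes in the preceding paragraph can be written as a small lookup. The sketch below is only a restatement of the thresholds quoted in the text, with the critical size d_cr assumed to be exactly 1 μm; the range between 10 nm and 0.1 μm is not assigned a mechanism in the text, so the (hypothetical) function reports it as unspecified rather than guessing.

```python
def dominant_transport(diameter_um, critical_size_um=1.0):
    """Return the transport mechanism the text associates with a particle size.

    diameter_um is the particle diameter in micrometers. critical_size_um
    stands in for d_cr and is assumed to be 1 um for this sketch.
    """
    if diameter_um <= 0:
        raise ValueError("particle diameter must be positive")
    if diameter_um < 0.01:           # below 10 nm
        return "Brownian motion"
    if diameter_um < 0.1:            # 10 nm to 0.1 um: not covered by the text
        return "not specified in the text"
    if diameter_um < critical_size_um:
        return "thermophoresis"
    return "inertial or gravitational forces"


for size_um in (0.005, 0.05, 0.5, 5.0):
    print(f"{size_um} um: {dominant_transport(size_um)}")
```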

The adhesion and removal of particles from wafer surfaces [54] is of critical importance to MEMS manufacturing because these particles can directly result in defects leading to failure. Adhesion forces are categorized [55] as follows:

  • Adhesion forces that dominate in the region of the contact and its immediate surroundings, such as van der Waals and electrostatic forces.

  • Adhesion forces due to chemical bonding, such as the electrical double layer (EDL, a function of solution pH) or hydrogen and covalent bonds (e.g. SiO2/glass). An EDL forms when particles in solution become charged, and the zeta potential (footnote 14) affects particle deposition.

  • Adhesion forces caused by interfacial reactions such as diffusion, condensation or diffusive mixing (RH dependent).

Once a particle is in contact with the surface, it is not uncommon for adhesion-induced deformation [56] to occur. This deformation increases with time, which reduces the removal efficiency of such particles.

Ionic Defects generally refer to the presence of metallic or non-metallic ions on the surface of the wafer. Metallic ions are highly mobile and can cause charging of dielectrics or other layers, which can directly impact performance. More problematic, however, is the presence of ionic defects together with strong local electric fields, which tend to concentrate the accumulation of ions on the surface of the wafer. Ions can be detected with a variety of techniques (discussed in Chapter 5), of which TOF-SIMS (Fig. 3.25) and TXRF are quite common [57]. A possible technique to mitigate such defects before field testing is burn-in, although a wash followed by burn-in may be more effective (Fig. 3.26).

Fig. 3.26 Switch contact resistance as a function of organic residual contamination level determined by Auger spectroscopy (reprinted with permission Copyright – 2007 SPIE [36])

Organic Defects are generally residues of carbon-based organic materials used in fabrication processes and typically comprise molecules from sources such as photoresist (polyimides, SU-8, etc.), methyl alcohol, acetone, isopropyl alcohol, etc. that are left behind by wet or dry wafer cleaning operations. The effect of such organic defects is generally small during normal operation of the device, but occasionally they can lead to failures [36]. Good examples are the accumulation of organic molecules at sites with high field fluctuations and the contribution of organics to the degradation of anti-stiction coatings. Optimization of the cleaning process is necessary to minimize the presence of organic contaminants at the end of wafer fab processing. One method for removing these defects from critical surfaces before field testing is burn-in (48 h at 150°C in a dry atmosphere), but it is generally better to perform cleaning steps prior to hermetic sealing or packaging to minimize the possibility of field failures.

Voids and Stringers are isolated defects that have been observed in MEMS processing, due to either the chemical or the physical conditions that the wafer is subjected to during fabrication. Voids have been observed to form under a variety of process conditions including deposition, annealing, corrosion, etc. For example, stress-induced voiding is a common occurrence in IC manufacturing (Fig. 3.20) and is observed in trace wires, usually due to electromigration at grain boundaries. Stringers or streamers, on the other hand, are due to incomplete process steps or a marginal violation of a design rule, which gives rise to residual stringers within the moving MEMS element. These can linger even after cleaning steps and may subsequently lead to failures in the field. The stringer shown in Fig. 3.27 was found after a short circuit was detected during operation of the device. The failure analysis revealed that the stringer created an electrical path between two adjacent conductive surfaces.

Fig. 3.27 Example of a stringer lodged in an isolation trench (reprinted with permission Copyright – 2000 Coventor)

Local wafer defects such as those described in this section are very common in most IC manufacturing, but in MEMS manufacturing these same defects cause a variety of reliability issues in the field.

3.3.1.2 Material Transport – Deposit/Etch Failures

A variety of deposition and etch defects are routinely encountered in MEMS fabrication (footnote 15). In modern IC and MEMS manufacturing there are established techniques [12] to identify defects and discard the specific die where they occur. The presence of a defect is more often than not a yield issue: the die or part will not function as intended, or at all, and since this can typically be detected at final test it does not specifically pose a problem for long-term reliability. There are, however, certain classes of defects that are not detectable by electrical testing and only manifest themselves in the field. It is these defects that cause reliability issues and lead to degraded performance of the part.

In MEMS fabrication there is a wide variety of material addition techniques and from time-to-time deposition defects will occur in all of them either because of the tool used or because of interaction of the design and the flow conditions within the process zone. While it is impossible to go into great detail with each process step used in the MEMS industry we will look briefly into a few such steps and the defects they produce to give the reader a basic idea of which failure mechanisms can impact overall reliability.

  • Chemical Vapor Deposition (CVD): The primary defect classes that occur during this deposition step are point defects, clusters, dislocations, and stacking faults. Due to the high temperatures (above 600°C) and relatively low deposition rates, some of these defects can form points of weakness (crack initiation sites) or grow larger (grain size structures), which could lead to anomalies. CVD results in the conformal deposition of material over the previously deposited layer, and this creates some potentially vulnerable areas in the design, specifically anchor locations and steps, where defects can lead to reliability issues.

  • Photolithography: Several types of photolithography related defects can be generated during fabrication. Defects that prevent motion of the MEMS element are the most easily detected, by self-test (or BIST; footnote 16) or final electrical test, because a proof mass or finger obstructed by a particle or residue will not respond to the applied stimulus in the same way as an unobstructed one and can thus be effectively screened out. There are several sources of photo-track-induced defects [12], such as bubbles in the developer dispense, incomplete post-develop rinsing, scumming of the resist, etc., and each will interact with a given design in a unique way, leading to different defect size and localization distributions.

    Reliability issues are also caused by the same defects when they are undetectable by electrical screening. In this case, defects like particles that are smaller than a critical gap, or residue that adheres to the moving element can be undetectable during the electrical test. Optical inspection on each and every die is prohibitively time consuming and expensive for high volume applications and is not really an effective solution. Quite often the development of an electrical test to screen out offending die is the only path for corrective action but ultimately optimization of the process to remove these defects is necessary.

  • Evaporation and Sputtering: These are similar processes for depositing materials on the surface of a MEMS wafer. The primary defects are point defects that form either due to the characteristics of the process tool or because of a pre-existing local defect (e.g. particles). For example, in the evaporation of gold using an e-beam evaporator one can sometimes observe local defects shaped like round balls (diameter ∼100 nm or larger), known as “spit” gold, which are contaminants on the target surface that act as a catalyst. Vibration can dislodge these defects, resulting in particulate defects during field operation and hence a reliability failure.

There are several other common deposition techniques such as electrodeposition, and thermal oxidation that also can result in localized defects.

Etch related failures are also fairly common sources of defects that result in reliability failures. A particularly common problem is the variation of etch characteristics across a wafer. As can be seen in Fig. 3.28, the thickness of deposited nitride can vary significantly at the edges of the wafer compared to the center. This typically results in parts with a variety of different responses, and it is entirely possible to get parts that are close to set limits and even failures. A more severe problem is that of marginal parts, i.e. parts that are within acceptable performance limits but close to the margins. In operation, these parts can quickly fail due to particular in-field stresses (Fig. 3.29).
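One way to make the idea of a marginal part concrete is a guard-band check at final test: a part inside the specification limits but within some fraction of the window from either limit is flagged for attention. The sketch below is a hypothetical illustration only; the 10% guard fraction, the function name and the example values are assumptions and do not come from the text.

```python
def classify_part(value, lower_limit, upper_limit, guard_fraction=0.1):
    """Classify a measured parameter as 'pass', 'marginal' or 'fail'.

    A part is 'marginal' when it is inside the limits but closer than
    guard_fraction of the specification window to either limit.
    """
    if lower_limit >= upper_limit:
        raise ValueError("lower_limit must be below upper_limit")
    if value < lower_limit or value > upper_limit:
        return "fail"
    guard = guard_fraction * (upper_limit - lower_limit)
    if value < lower_limit + guard or value > upper_limit - guard:
        return "marginal"
    return "pass"


# Hypothetical measurements from wafer center to wafer edge, limits 0 to 10.
for measurement in (5.0, 8.2, 9.5, 10.4):
    print(measurement, classify_part(measurement, 0.0, 10.0))
```

Flagging marginal parts separately lets them be correlated with wafer position, which is where the across-wafer variation described above would be expected to show up.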

Fig. 3.28 Process etch variation across the wafer

Fig. 3.29 Stress relaxation behavior for pure Al films (<100 nm) as a function of temperature (reprinted with permission Copyright – 2000 Applied Physics Letters [60])

Finally, one should remember that the process steps involved in MEMS, just like those in semiconductor processing, are highly controlled chemical reactions that can also produce failures due to incompleteness of the reaction. For example, a resist removal step can be incomplete due to insufficient process time, physico-chemical differences in material, etch design rules, etc. The material left behind can then cause failures in the form of the local defects described above.

3.3.1.3 Stress Relaxation Effects

There are several types of thin-film stress effects commonly encountered during MEMS fabrication that can influence the long-term reliability of a MEMS device. The effects of residual stress, creep, and fracture were discussed earlier in this chapter (footnote 17), but it is also important to mention stress relaxation effects, which are brought about by fabrication conditions and can impact long-term reliability.

Investigations of stress relaxation in nanoscale thin films (such as aluminum [58]) have found that the relaxation is strongly dependent on temperature and film thickness, with the relaxation rate being highest for the highest temperatures and the thinnest films. In polysilicon, stress relaxation is negligible at room temperature but has been observed at elevated temperatures above 1000°C [59]. In metals, however, the relaxation mechanism is attributable to dislocation motion or grain boundary sliding [60], and annealing is commonly used to relax stress in metal films at relatively low temperatures.

The relaxation of metallic thin films can adversely affect the performance of a MEMS device in many different ways. The use of metal films in coatings (for optical devices) or conductor electrodes (RF switches) makes it important to factor in the stress relaxation into the design. An alternate approach is to effect the relaxation through an annealing step (at elevated temperature) during manufacture.

3.3.1.4 Process Tribological Failures – Stiction

Stiction is one of the primary tribological failure mechanisms in MEMS and occurs when suspended structures are pinned unexpectedly by adhesion during contact between proximal surfaces [61, 62]. In MEMS, particularly surface micromachined structures, the surface area to volume ratio is large and the stiffness of the restoring springs is typically small, which makes these proximal surfaces particularly prone to stiction, which may occur during processing or in the presence of liquid (e.g. elevated RH levels) [63]. Stiction can also occur during use (e.g. under shock), when it is occasionally called in-use stiction, as it is dependent on the state of the surfaces and specifically on the surface energies post manufacture.

Stiction as a failure mechanism can be understood by considering the adhesion of the two surfaces in contact. Adhesion occurs due to van der Waals forces, electrostatic forces (trapped charge), capillary forces or a combination thereof [66]. During fabrication, the use of wet chemical processes can leave behind ions and dangling bonds as well as minute amounts of water from liquid trapped by pressure differences and surface tension forces. When the surfaces of two solids are brought close to each other (solid-solid contact), a surface force arises due to direct interaction between the molecules or atoms at the surfaces [32], and this force can be positive or negative depending on the proximity of the surface pair. In liquid-mediated contact, the adhesion arises from surface tension forces, and this adhesion energy can be quantified using analytical models [67] and test structures [68, 69], such as the free-standing cantilever beams shown in Fig. 3.30.

Fig. 3.30 Pinned and free standing cantilever beams (reprinted with permission Copyright – 1993 IEEE [64, 65])

In MEMS devices, the contact between surfaces can occur horizontally or vertically depending on the particular design. The electrostatic force between surfaces usually has to be factored into the force balance as shown in the diagram in Fig. 3.31:

Fig. 3.31 Pull-in and pull-out curve for a typical MEMS micromachined structure

The electrostatic pull-in force brings the two contacting surfaces together but it is the restoring force that has to overcome stiction if the device is to function correctly [70]. In most devices, the restoring force is typically enabled through a spring-like structure with a constant restoring force or a bias in the opposite direction.
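The force balance described here can be sketched with the standard lumped parallel-plate model, which is not given in this chapter and is used below purely as an assumption: a linear spring of stiffness k, an initial gap g and an electrode area A, for which the pull-in voltage is sqrt(8·k·g³ / (27·ε0·A)). The release check simply compares the spring force at full travel with an adhesion force supplied by the user. All function names and example values are hypothetical.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def pull_in_voltage(stiffness, gap, area):
    """Pull-in voltage of an ideal parallel-plate actuator with a linear spring."""
    return math.sqrt(8.0 * stiffness * gap ** 3 / (27.0 * EPSILON_0 * area))


def electrostatic_force(voltage, gap, area):
    """Attractive force between parallel plates at the given (remaining) gap."""
    return EPSILON_0 * area * voltage ** 2 / (2.0 * gap ** 2)


def releases_after_contact(stiffness, travel, adhesion_force):
    """True when the spring restoring force at full travel exceeds adhesion."""
    return stiffness * travel > adhesion_force


# Assumed example: k = 10 N/m, 2 um gap, 100 um x 100 um electrode.
k, g, a = 10.0, 2e-6, 1e-8
print(f"Pull-in voltage: {pull_in_voltage(k, g, a):.2f} V")
print("Releases with 5 uN adhesion:", releases_after_contact(k, g, 5e-6))
print("Releases with 50 uN adhesion:", releases_after_contact(k, g, 50e-6))
```

In this idealized model the actuation voltage is assumed to be removed before release, so only the spring and the adhesion force compete; fringing fields, dielectric layers and residual charge are ignored.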

The surface energy U_s is defined in terms of the contact area A_c and the work per unit area (γ) required to separate the surfaces to infinity.

$$U_{\rm s} = 2A_{\rm c}\gamma$$
(3.4)

If we consider a Hertzian (footnote 18) contact with circular contact area (πa²), the total contact force is the sum of the mechanical force and the adhesion force, the latter given by the following expression:

$$F_{\rm adhesion} = 4\sqrt{\gamma E A_{\rm c}}$$
(3.5)

For non-circular contact areas the adhesion force is determined by using the same equations with an equivalent radius (a). The surface energy γ is highly dependent on processing conditions and can be modulated with the use of anti-stiction coatings (e.g. SAMs; footnote 19) or particular drying or cleaning steps [66, 71].

The simplest strategy for quantifying stiction in a given process is to measure the free-standing lengths of cantilevers fabricated in the same process. For simple cantilevers, derived formulae [64, 65] relate the contact length to the beam stiffness and adhesion energy for the two simple configurations of a deformed cantilever, i.e. the “S” and “Arc” shapes (Fig. 3.32), and so provide the relationship between the dimensions of the beam and the surface adhesion energy. Below some limiting length the beam will not stick to the substrate, so to determine the surface energy for a material and process one can measure the detachment length from an array of cantilevers and fit the values to equation (3.6). As the free-standing length approaches the beam length, the cantilever pivots and changes from an S-shape to an arc shape. The models assume that the adhesion energy includes elastic deformation of the substrate, and the surface energy γ_s is given by:

$$\gamma_s = \frac{3}{2}\frac{E t^3 h^2}{\left(l - l_s\right)^4}$$
(3.6)

Fig. 3.32 Cantilever beam adhering to the substrate in an (a) S-shape and (b) arc shape (reprinted with permission Copyright – 1993 IEEE [64])
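As a minimal sketch of how equations (3.4) and (3.6) might be evaluated, the code below computes the adhesion energy from cantilever measurements. The symbols are not defined in the text, so the following reading is an assumption: E is the Young's modulus of the beam, t its thickness, h the gap to the substrate, l the beam length and l_s the adhered length, so that l − l_s is the free-standing length. Averaging per-beam estimates is a simple stand-in for the fitting step and is not claimed to be the procedure of [64, 65]; the material values are illustrative.

```python
from statistics import mean


def surface_energy_s_shape(youngs_modulus, thickness, gap, beam_length, adhered_length):
    """Equation (3.6): gamma_s = (3/2) * E * t^3 * h^2 / (l - l_s)^4, in J/m^2."""
    free_length = beam_length - adhered_length
    if free_length <= 0:
        raise ValueError("adhered length must be shorter than the beam length")
    return 1.5 * youngs_modulus * thickness ** 3 * gap ** 2 / free_length ** 4


def mean_surface_energy(youngs_modulus, thickness, gap, beams):
    """Average the per-beam estimates for (beam_length, adhered_length) pairs."""
    return mean(
        surface_energy_s_shape(youngs_modulus, thickness, gap, length, adhered)
        for length, adhered in beams
    )


def contact_surface_energy(contact_area, gamma):
    """Equation (3.4): U_s = 2 * A_c * gamma, in joules."""
    return 2.0 * contact_area * gamma


# Assumed example: polysilicon-like beam, E = 160 GPa, t = h = 2 um.
beams = [(300e-6, 100e-6), (400e-6, 195e-6), (500e-6, 305e-6)]
gamma = mean_surface_energy(160e9, 2e-6, 2e-6, beams)
print(f"Estimated adhesion energy: {gamma * 1e3:.2f} mJ/m^2")
```

Because the free-standing length enters to the fourth power, small errors in the measured length dominate the uncertainty in the estimate, which is one reason an array of beams is used rather than a single structure.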

The physical state of the contact surfaces is a critical factor in the existence and development of stiction during use. In simple terms, a high surface energy will make it easier for stiction to play a role in the performance of the device, and since surface chemistry is easily affected by a variety of factors such as oxide or contaminant films, moisture, roughness, ambient gas, and obviously the design, it becomes necessary to consider the balance and control of these factors in preventing stiction [66]. It is also now easier to understand how the surface energy may change over time either due to repeated contact (from pull-in or shock) or due to a change in moisture levels, or a change in ambient gas or surface roughness which could result in higher surface energies and stiction during field operation.

3.3.1.5 Wafer Bonding (or Hermeticity)

A significant number of MEMS devices produced today are hermetically sealed using wafer-to-wafer bonding, which enables hermetic packaging of the MEMS die before they leave the fab line and so minimizes contamination from particles and ambient gases. In several applications, the MEMS element needs low pressure or vacuum conditions to perform optimally. In these cases the hermetic seal is created by traditional wafer bonding methods, which include anodic, glass frit, eutectic, solder, reactive and fusion bonding [72]. It is also common to see a variety of gases or gas mixtures (N2, Ne, etc.) employed for optimal performance of the MEMS sensor element. Some of these techniques, such as anodic bonding and solder bonding, are known to have issues such as a lower limit on the achievable cavity pressure and contamination from generated or surface-desorbed gases.

Glass frit bonding is by far the most common technique for achieving a hermetic seal in MEMS sensors today. The frit paste is composed of solvents, organic binder, lead borosilicate glass, and alumina silicate glass (cordierite). It is screen printed onto one wafer and dried, and the temperature is then increased until the glass melts (glazes) above 400°C. The second wafer is then aligned and bonded under pressure to the first wafer, after which the wafers are slowly cooled back to room temperature. Several failure mechanisms can be traced back to the bonding process; the common ones are incomplete seal glass coverage (Fig. 3.33), squish-out of the seal glass on both sides of the seal, lead particles, glass cracks, and incomplete adhesion.

Fig. 3.33

Cross-section of a bonded sensor showing (a) cross-section, (b) seal glass squish out, (c) incomplete seal glass coverage and (d) gaps in seal glass (reprinted with permission Copyright – 2008 Analog Devices)

In devices packaged at atmospheric pressure, any loss of hermeticity due to the above-mentioned failure modes makes the device susceptible to the ingress of undesirable gases or elements from the ambient operating environment. The external packaging of the device can also make a significant difference. In plastic over-molded packages, moisture diffuses through the package relatively quickly, so it can enter the finest of gaps in the glass seal (through capillary action) and accumulate within the cavity over time, causing stiction or even corrosion. Notably, these types of defects may be undetectable before failure because the rate of moisture ingress can be very slow.

In the case of cavities sealed below or above atmospheric pressure, the failure of the seal glass to maintain hermeticity will cause the cavity pressure to immediately revert to atmospheric pressure, resulting in a measurable change in the performance of the device. In this case, a manufacturing defect such as incomplete seal glass coverage could be detected before the part is in the field. However, a seal glass failure could also occur in the field due to external loading conditions such as shock, and the resulting performance change could be detrimental to the intended application.
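The contrast between a gross leak, which equalizes the cavity almost at once, and a fine leak, which may take far longer to show up, can be illustrated with a simple first-order model. The sketch below assumes a cavity of fixed volume exchanging gas with the ambient through a constant leak conductance, so that the cavity pressure relaxes exponentially toward ambient. This model, the function names, and all numerical values are assumptions made for illustration and are not taken from the chapter; real leak paths need not behave this simply.

```python
import math


def cavity_pressure(t, p0, p_amb, volume, conductance):
    """Cavity pressure [Pa] after time t [s] for an assumed constant-conductance leak.

    p0 = initial cavity pressure [Pa], p_amb = ambient pressure [Pa],
    volume = cavity volume [m^3], conductance = leak conductance [m^3/s].
    """
    tau = volume / conductance  # time constant of the assumed model
    return p_amb + (p0 - p_amb) * math.exp(-t / tau)


def time_to_pressure(p_target, p0, p_amb, volume, conductance):
    """Time [s] for the cavity to drift from p0 to p_target under the same model."""
    ratio = (p_target - p_amb) / (p0 - p_amb)
    if not 0.0 < ratio <= 1.0:
        raise ValueError("p_target must lie between p0 and p_amb")
    return -(volume / conductance) * math.log(ratio)


# Hypothetical cavity: 1 mm^3 sealed at 100 Pa, ambient at 101325 Pa.
fine_leak_time = time_to_pressure(1000.0, 100.0, 101325.0, 1e-9, 1e-15)
gross_leak_time = time_to_pressure(1000.0, 100.0, 101325.0, 1e-9, 1e-9)
```

In this model the time to reach a given pressure scales inversely with the leak conductance, so the hypothetical fine leak above takes a million times longer than the gross leak to produce the same pressure shift. This is consistent with the observation that gross seal defects can be caught at test while fine leaks may only appear in the field.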

3.3.2 Back End Process Failures

In MEMS product development, back end process steps include all the steps after the final wafer fabrication step, such as dicing, assembly process steps, and ATE (automatic test equipment) testing. These steps can introduce a variety of failure modes; we limit our discussion to a few of the worst offenders and briefly mention others.

3.3.2.1 Wafer Dicing

Wafer dicing is the process step in which a high-speed saw cuts the MEMS wafer along predefined streets to singulate individual dice. Dicing is typically done with a diamond-tipped blade a few mils (thousandths of an inch) thick. Silicon is a very brittle material, and during a high-speed mechanical sawing operation, blade vibration in the presence of diamond particles and coolant can damage sensitive die. Sawing also produces silicon chips, or chip-outs, that break away from the diced surfaces, as shown in Fig. 3.34. A majority of these particles can be removed with subsequent cleaning steps, but a fraction can linger in reentrant gaps and grooves in the die and lead to failures. Visual inspection during the dicing step and careful cleaning afterwards are necessary to minimize the proliferation of such chip-outs.

Fig. 3.34

Evidence of chipping from the cut surface during a dicing step

Chip-outs within the final package can degrade the performance of the part or cause catastrophic failure. Such defects may not be detected during final testing of the part, leading to field escapes in which certain operating conditions later trigger part failure.

Other dicing techniques such as cleaving and stealth dicing are also commonly used, but their suitability depends on the thickness and crystal orientation of the wafer as well as on cost. Lastly, DBG (Dice Before Grind) is a singulation process employed when normal sawing would create unacceptable levels of chipping and edge damage. This technique has been demonstrated for ultra-thin die as thin as 25 μm [73].

3.3.2.2 Wafer Handling

Wafer handling is another important back-end (BE) process step that can introduce failure modes that compromise the reliability of the part. Wafer handling occurs during both front-end and back-end operations, but it is during back-end operations that handling of the wafer becomes more sensitive because of the difference in operating environments between front-end and back-end lines.

A common example is shown in Fig. 3.35, where the level of training of fab technicians is clearly correlated with the number of scratches introduced on the surface of a wafer. Damage related to scratches can be a major reliability issue depending on the location and severity of the scratch. A scratch can easily become an initiation site for a crack that propagates from the top or bottom surface of the wafer through its thickness. The resulting defect may not cause failure immediately but can leave a weakened part that then becomes a reliability issue.

Fig. 3.35

Differences in number of scratches per wafer between trained and untrained technicians
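To illustrate why even a shallow scratch can weaken a part, the sketch below evaluates the standard linear elastic fracture mechanics relation σ_f = K_Ic / (Y √(πa)), which estimates the stress at which a surface flaw of depth a begins to propagate. The relation is textbook fracture mechanics rather than something derived in this chapter, and the fracture toughness (0.9 MPa·√m, assumed as a representative order of magnitude for silicon) and geometry factor (Y = 1.12, assumed for a shallow surface flaw) are illustrative assumptions only.

```python
import math


def fracture_stress(k_ic, flaw_depth, geometry_factor=1.12):
    """Estimated stress [Pa] at which a surface flaw starts to propagate.

    k_ic = fracture toughness [Pa*sqrt(m)] (assumed value supplied by caller),
    flaw_depth = scratch or crack depth a [m],
    geometry_factor = dimensionless Y (1.12 assumed for a shallow surface flaw).
    """
    if flaw_depth <= 0:
        raise ValueError("flaw_depth must be positive")
    return k_ic / (geometry_factor * math.sqrt(math.pi * flaw_depth))


# Hypothetical comparison of scratch depths with an assumed toughness value.
K_IC_ASSUMED = 0.9e6  # Pa*sqrt(m), illustrative only
strengths = {depth: fracture_stress(K_IC_ASSUMED, depth)
             for depth in (0.1e-6, 1e-6, 4e-6)}
```

Because the estimated strength scales as the inverse square root of the flaw depth, a scratch four times deeper halves the stress the die can tolerate. A part with such a scratch may pass final test and still fail later under a shock or thermal load that an undamaged part would survive.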

3.3.2.3 Packaging

In Section 3.2.1.3 we discussed the major design related failure modes encountered in packaging and assembly. The field of electronic packaging is broad and the reader can find plenty of references on specific failure modes pertaining to the manufacture of different kinds of packages, as well as wire bonding, die-attach and over-molding [74]. From a MEMS manufacturing perspective there are two important areas related to the manufacturing of the part that can significantly influence the part reliability – package manufacturing design rules, and material control.

Assembly design rules used to create a MEMS packaged part are usually identical, or at least very similar, to those used in the packaging of IC chips. For example, in a two-chip packaged part, where the MEMS and IC chip co-exist in the same package, the edge of the MEMS die has to be placed a certain minimum distance away from the edge of the ASIC pad row to allow for wire bonding. This design rule exists to prevent squish-out of the die-attach from contaminating the pad row and to allow for proper wire bonding. The rule can nevertheless be violated through a combination of factors including die placement tolerance, die-attach cure conditions, or other assembly process conditions. Strict incoming and process quality screening for contaminants and foreign materials, as well as for design rule violations, is absolutely necessary to minimize the risk of internal corrosion in MEMS products.

Material control involves optimizing the assembly materials set (and processes) to achieve repeatability in the package construction, which is intimately connected to the MEMS sensor. There are many examples to choose from, but in packaging MEMS devices the stress state of the device is of critical importance and, as we have seen before, many factors can modify the device performance. In plastic over-molded packages, for example, the post-mold cure time influences mechanical strength, glass transition, and adhesion strength and may play a significant role in the reliability of the part.

In summary, assembly processes can significantly influence the overall reliability of the part. The process controls in the back end need to be as stringent as those in the front end to ensure high reliability of the overall product.

3.4 Summary

In this chapter, we have looked at a variety of MEMS failure modes that have their origin in the design or manufacturing phases of the product development cycle. A fair majority of the design failure modes can be avoided with robust design practices, which are usually cumulative in the sense that over time the design team adds analyses that can predict the propensity of a part to cause yield or reliability issues. For manufacturing failure modes, the ability to avoid those that impact part reliability is more tenuous because of two main factors: the sensitivity of MEMS device performance to front end processes, and the interaction of the MEMS device with its immediate surroundings. These factors must be tightly controlled to avoid failure modes that can impact part reliability. In the next chapter, we will look in more detail at the mechanisms of some of the more common MEMS in-use failure modes.