1 Introduction

Copper films have found increasing popularity in wafer level packaging (WLP) owing to their excellent thermal and electrical properties [1–3]. A major concern is the thermal stress caused by coefficient of thermal expansion (CTE) mismatch during subsequent high-temperature treatment. Owing to the large biaxial modulus and high tensile strength of Cu, the stress in Cu films can be very high and its evolution quite complicated, causing problems such as voids, hillocks, or degradation of electrical properties [4–8].

There has been increasing interest in evaluating the stress evolution during thermal cycling in order to improve product yield and device reliability, and various models have been proposed in recent years [9, 10]. Factors such as preparation technique (as-deposited or as-electroplated), film thickness, patterns, and passivation/capping layers have been investigated [11–13]. Deformation models based on different mechanisms, such as dislocation glide [4, 14], substrate-constrained diffusion [15], grain-boundary (GB) diffusion coupled with surface and interfacial diffusion [16], or the Bauschinger effect [17], have been proposed, and many mathematical models such as Coble creep, strain hardening, and power-law breakdown/creep have been investigated [18–20]. Because different mechanisms are involved at different temperatures, these models are usually very complicated, with many fitted parameters, and their agreement with experimental results is not satisfactory. A more concise and practical model is desired.

In this work, a thick Cu blanket film is prepared on a silicon wafer, and the thermal stress evolution is measured in situ by the warpage method. Combined with microstructure analyses, a stress evolution model based on an equivalent diffusional energy is proposed, which greatly reduces the complexity of the formulas. Compared with conventional models, the current model has fewer fitted parameters and shows better agreement with the experimental results.

2 Experimental

2.1 Sample preparation

The sample structure is simply a Cu film electroplated on a 4-inch N-type (100) silicon wafer substrate. For the convenience of warpage measurement (discussed below), the wafer is 420 μm thick and double-side polished (DSP). First, a 10,000 ± 100 Å thick SiOx film is grown on both sides of the wafer at 1100 °C by thermal oxidation. Specifically, the wafer undergoes 15 min of dry-oxygen oxidation, then 149 min of wet-oxygen oxidation, and finally another 15 min of dry-oxygen oxidation. After the oxide passivation, a 100 nm thick titanium layer is deposited as the adhesion/barrier layer, followed by a 200 nm Cu seed layer; both are sputter-deposited by PVD in the same chamber at a pressure below \(10^{-5}\) Pa, continuously and without breaking the vacuum. Finally, a 5 μm thick Cu blanket film is deposited in an electroplating bath at room temperature (RT) with additives such as leveler and surfactant. A thick Cu film in this work is defined as a film with more than one grain along the direction normal to the film, as shown in Fig. 1.

Fig. 1 Illustration of the layered structure of the Cu film on Si wafer substrate

2.2 In-situ warpage measurement

After the film is prepared, the wafer undergoes thermal cycling in a low-vacuum chamber under rarefied nitrogen protection. The temperature is ramped from RT to 375 °C and then back to RT, at a temperature rate of 10 °C/min for one group and 5 °C/min for a comparison group. Note that below 100 °C the cooling rate becomes slower than the preset value because of the limited thermal conductivity of the refrigerating system.

During the thermal cycling, the wafer warpage is measured in situ by a Multi-beam Optical Sensor (MOS) system from k-Space Associates, Inc. [21–23]. Twelve laser beams (3 × 4) are projected onto the backside of the wafer and reflected to a CCD; the shift \(\delta\) of the light spots is detected, and the curvature \(\kappa\) is derived from it. Figure 2 schematically illustrates the curvature of a warped wafer and the MOS measuring system.

Fig. 2 Schematic illustration of (a) the curvature of a warped wafer and (b) the MOS warpage measurement system

There are two scanning methods: mode A and mode B. Mode A is isothermal surface scanning, which yields a 3D morphology of the surface of the warped wafer, as displayed in Fig. 3. Because the wafer can warp into different shapes, such as a saddle, a cylinder, or even more complicated shapes, due to buckling under perturbation [24], the morphology should first be explored by mode A. The 3D scanning field is at the center of the wafer with a scanning radius \(r_{s}\) = 24 μm. Because it is a time-consuming process, mode A is only applicable to isothermal conditions.

Fig. 3 Schematic plot of mode A, the isothermal surficial scanning

Mode B is a time-saving point scanning process, which can be conducted with temperature variation. Usually the center point (actually an extremely small area) of the wafer is scanned and the curvature of this point represents the whole wafer curvature.

3 Results

3.1 Wafer warpage analysis

To explore the shape of the warped wafer, mode A is conducted first. Figure 4 presents the measured 3D morphology of the warped wafer after thermal cycling, showing that the shape of the warped wafer in the current work is approximately a paraboloid of revolution (ignoring the anisotropy of single-crystal Si).

Fig. 4 The 3D morphology of a warped wafer after thermal cycling in the current work

For convenience, the paraboloid is approximated by a sphere because this leads to negligible error. From Fig. 3 it can be shown by simple geometry that, for a detected bow \(B\) (the maximum surface height), the curvature of the paraboloid is \(\kappa_{p} = \frac{2B}{r_{s}^{2}}\left( 1 + \frac{4B^{2}}{r_{s}^{2}} \right)^{-\frac{3}{2}}\), while that of a sphere is \(\kappa_{s} = \frac{2B}{r_{s}^{2}}\left( 1 + \frac{B^{2}}{r_{s}^{2}} \right)^{-1}\). The difference between them in the current work is less than 0.01 %. An in situ warpage measurement can then be conducted by mode B, i.e., by scanning only the center of the wafer during temperature cycling, with the curvature of the wafer represented by the curvature at the center point.
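The size of the sphere approximation error can be checked numerically. The following minimal sketch evaluates the two curvature expressions quoted above and their relative difference. The bow and scan radius used in the example call are illustrative assumptions, not measured values from this work, and the function names are hypothetical.

```python
def paraboloid_curvature(bow: float, r_s: float) -> float:
    """kappa_p as quoted in the text: (2B / r_s^2) * (1 + 4B^2 / r_s^2)^(-3/2)."""
    return 2.0 * bow / r_s**2 * (1.0 + 4.0 * bow**2 / r_s**2) ** -1.5


def sphere_curvature(bow: float, r_s: float) -> float:
    """kappa_s as quoted in the text: (2B / r_s^2) * (1 + B^2 / r_s^2)^(-1)."""
    return 2.0 * bow / r_s**2 / (1.0 + bow**2 / r_s**2)


def relative_difference(bow: float, r_s: float) -> float:
    """Relative difference between the two curvatures, normalized by kappa_s."""
    k_p = paraboloid_curvature(bow, r_s)
    k_s = sphere_curvature(bow, r_s)
    return abs(k_p - k_s) / k_s


# Illustrative values only (assumed, not from the measurement):
# a bow that is 0.3 % of the scan radius, both in the same length unit.
rel = relative_difference(bow=3e-6, r_s=1e-3)
print(f"relative difference: {rel:.2e}")
```

For a small ratio \(B/r_{s}\), the relative difference scales approximately as \(5B^{2}/r_{s}^{2}\), so the approximation stays below 0.01 % as long as the bow is less than roughly 0.4 % of the scan radius.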

The curvature–temperature (\(\kappa - T\)) response is displayed in Fig. 5. Because the microstructure during the first cycle is thermally unstable, only the \(\kappa - T\) response of the second cycle is plotted. Positive curvature indicates tension and negative curvature indicates compression. For as-deposited Cu films in IC processes, an intrinsic tensile stress is typically generated because of grain growth, defect annihilation, insufficient ion implantation, substitutional impurities [25], etc.

Fig. 5 The measured \(\kappa - T\) curves at different temperature rates

After thermal cycling, a hysteretic \(\kappa - T\) behavior is formed, which is normally regarded as originating from plastic deformation. Hysteresis is characteristic of thin metal films, because a thin film is under biaxial stress, which in turn leads to a large effective shear stress and corresponding plastic deformation [17]. In contrast, no obvious hysteresis is observed in polyimide films or TSV wafers [21, 22], indicating elastic behavior.

For both groups, the stabilized high-temperature (HT) curvature (marked as C′) is 0.079 ± 0.001 m⁻¹, and the RT curvature after thermal cycling (marked as A′) is about 0.235 ± 0.001 m⁻¹. The elastic range can be roughly regarded as \(A'B'\) (or \(AB\) in the inset) during heating, and \(C'D'\) (or \(CD\) in the inset) during cooling. The elastic range during heating is obviously longer than that during cooling, indicating that plastic deformation is easier to activate during cooling [12]. The dramatic stress evolution occurs at about 200 °C.

3.2 Stress distribution

The film stress evolution can be derived from the \(\kappa - T\) response. The film stress \(\sigma_{f}\), containing thermal stress component and intrinsic component, can be derived by a revised Stoney Formula [26], which gives better accuracy for thick films:

$$\sigma_{f} = \sigma_{0} + \frac{M_{s} t_{s}^{3}}{6 t_{f} (t_{f} + t_{s})}(\kappa - \kappa_{0})$$
(1)

where \(\kappa\), \(E\), \(t\) and \(\mu\) are the measured curvature, Young’s modulus, thickness, and Poisson’s ratio, respectively. The biaxial modulus is \(M = E/(1 - \mu )\), and the subscripts \(s\) and \(f\) refer to the Si wafer substrate and the composite films, respectively. According to the RT curvature and formula (1), the biaxial stress in the Cu film at RT is about 260 MPa.
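As a rough plausibility check of formula (1), the sketch below evaluates it with the thicknesses and RT curvature given in the text. The substrate biaxial modulus of 180.5 GPa for (100) Si, and the choice \(\kappa_{0} = 0\) and \(\sigma_{0} = 0\), are assumptions made for illustration only; they are not stated in this work, and the function name is hypothetical.

```python
def film_stress(kappa, kappa_0, t_f, t_s, m_s, sigma_0=0.0):
    """Film stress from formula (1); all quantities in SI units (1/m, m, Pa)."""
    return sigma_0 + m_s * t_s**3 / (6.0 * t_f * (t_f + t_s)) * (kappa - kappa_0)


# From the text: 420 um thick substrate, 5 um Cu film, RT curvature 0.235 1/m.
# Assumed, not from the text: M_s = 180.5 GPa, kappa_0 = 0, sigma_0 = 0.
sigma_rt = film_stress(
    kappa=0.235,
    kappa_0=0.0,
    t_f=5e-6,
    t_s=420e-6,
    m_s=180.5e9,
)
print(f"film stress at RT: {sigma_rt / 1e6:.0f} MPa")
```

Under these assumptions the result is of the same order as the roughly 260 MPa quoted above; the exact value depends on the modulus, the film thickness taken for the composite stack, and the reference curvature and stress.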

The stress state can be strongly influenced by the sample dimensions, i.e., the edge effect [27], especially for relatively thick films (such as the 5 µm film in the present work), so that the final planar stress deviates from the value deduced by the Stoney formula [13]. In a film structure with lateral dimensions much greater than the film thickness, the ratio of shear to normal stress at the film/substrate interface is significant only near the edges of the film. Away from the edges, the shear stress is small and the stress state in the film is purely biaxial.

Consequently, the edge effect, which involves the shear and normal stress distributions as well as the von Mises stress that is critical for plastic deformation, is analyzed by finite element analysis (FEA). The thermal stress is assumed to be generated by cooling from a reference temperature to RT; the reference temperature is chosen such that the simulated warpage equals the experimental value at RT.

Figure 6 displays the in-plane and out-of-plane film stress distributions at RT predicted by FEA simulation, which generally agree with the analytical results except near the edge. The in-plane distribution is taken along the path \(L_{r}\) in the r-direction, and the out-of-plane distribution along the path \(L_{z}\) in the z-direction, normal to the film, as marked in the figure. The normal stress and the von Mises stress decline gradually near the wafer edge. The ratio of shear stress to normal stress is also plotted; it is significant only near the edge, in good agreement with the edge effect model [27]. The entrance length is about 100 µm, which can be ignored when considering the overall wafer-level film stress.

Fig. 6 (a) The in-plane (\(L_{r}\), along r-direction) and (b) out-of-plane (\(L_{z}\), along z-direction) stress distribution at ambient temperature predicted by FEA simulation

The von Mises stress shows a distribution similar to that of the normal stress, suggesting that the plastic deformation can be considered uniformly distributed except at the very edge of the wafer. In the direction normal to the film (z-direction), the stress decreases slightly from the interface to the film surface, but can generally be treated as uniformly distributed.

3.3 Microstructure investigation

The microstructure is investigated by focused ion beam (FIB) imaging, as displayed in Fig. 7. For better contrast between different crystal orientations in the FIB ion images, the copper surface is first chemically polished and then lightly ion etched before imaging.

Fig. 7 FIB ion images of (a) the as-deposited Cu film and (c) the Cu film after 10 °C/min thermal cycling. (b) is a magnification of the cross section in (a)

It can be seen from Fig. 7 that few voids are observed at the Cu/Si interface in the as-deposited sample. After thermal cycling, many micro-voids are generated at the interface, suggesting that substantial atomic diffusion occurs during thermal cycling.

The images confirm that the thick film in this work contains two or three grains along the film normal, an obvious difference from thin films, in which columnar grains dominate [28]. In thin films, atoms in the interior generally diffuse to (or from) the surface vertically along GBs and then migrate horizontally along the surface or interface, whereas in the current thick films the main diffusion paths are not limited to vertical GBs and the horizontal surface/interface. In other words, horizontal GB diffusion paths may also play an important role in stress relaxation.

4 Discussion

4.1 Temperature rate influence

Different temperature increasing/decreasing rates (temperature rate for short hereinafter) produce different \(\kappa - T\) characteristics during heating but quite similar ones during cooling, as seen in Fig. 5. The deviation begins at about 150 °C, in the tensile stress range during heating, and the largest deviation occurs at about 250 °C. The temperature at which the warpage declines to zero during heating is defined as the reference temperature \(T_{ref}\). Interestingly, the different loading rates produce a negligible difference during cooling, and the nonlinear behavior shows that plastic relaxation is pervasive. It can be inferred that the stress evolution during heating is a time-dependent (non-stationary) process, which is also demonstrated by isothermal stress relaxation [19, 29, 30], while during cooling it is mainly temperature-dominated, given the negligible difference between the two loading rates [31].

Figure 5 also indicates that a larger temperature rate leads to faster stress evolution. For instance, the average stress of the sample heated at 10 °C/min declines to zero after 14.3 min, and the corresponding temperature \(T_{ref}\) is 173 °C, while for the 5 °C/min sample the stress at 173 °C is about 26 MPa and declines to zero after 32.4 min of heating, with \(T_{ref}\) = 192 °C. Both curves reach the same point at 375 °C. From a microscopic viewpoint, a faster loading rate allows less time for atomic diffusion or dislocation glide, but it also imposes a larger superheat and correspondingly a larger driving energy. If diffusion mechanisms such as GB, lattice, surface, or interface diffusion do play the key role in stress relaxation/evolution, the difference between heating and cooling matches the fact that the diffusion time is longer during heating than during cooling at the same temperature rate [12].
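The heating times and reference temperatures quoted above can be cross-checked with a linear ramp. The short sketch below assumes a starting temperature of 30 °C; this value is not stated in the text and is inferred here only because it makes both quoted pairs consistent. The function name is hypothetical.

```python
def temperature_after(minutes, rate_c_per_min, start_c=30.0):
    """Temperature (deg C) reached after a linear ramp from start_c.

    start_c = 30 deg C is an assumption inferred from the quoted numbers.
    """
    return start_c + rate_c_per_min * minutes


# (temperature rate in deg C/min, heating time in min) as quoted in the text
for rate, minutes in ((10.0, 14.3), (5.0, 32.4)):
    t_ref = temperature_after(minutes, rate)
    print(f"{rate:4.1f} degC/min for {minutes} min -> {t_ref:.0f} degC")
```

With this assumed starting temperature, the two ramps reproduce the quoted reference temperatures of 173 °C and 192 °C.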

4.2 Linear temperature-dependent diffusional energy \(Q\)

The thermal strain \(\varepsilon_{\text{th}}\), which consists of an elastic component and a plastic component, results from the CTE mismatch and is constrained by the silicon expansion and the wafer warpage (warpage leads to film stretching or contraction). It is defined as:

$$\varepsilon_{th} = \varepsilon_{e} + \varepsilon_{p} = \int_{T_{ref}}^{T} \left( \alpha_{f}(T') - \alpha_{s}(T') \right) dT' + \kappa (t_{f} + t_{s})$$
(2)

The nonlinearity of the \(\kappa - T\) curve is in fact determined by the plastic strain rate \(\varepsilon_{p}^{\prime}\):

$$\frac{d\sigma}{dT} = M_{f} \left( (\alpha_{Si} - \alpha_{Cu}) + \varepsilon_{p}^{\prime} / T^{\prime} \right)$$
(3)

The plastic strain rate in formula (3) is defined as \(\varepsilon_{p}^{\prime} = d\varepsilon_{p} /dt\), and \(T^{\prime} = dT/dt\) denotes the temperature rate, so that \(\varepsilon_{p}^{\prime} /T^{\prime} = d\varepsilon_{p} /dT\).

As Fig. 7 reveals that substantial atomic diffusion occurs during thermal cycling, it can be speculated that the stress evolution in Fig. 5 is mainly dominated by diffusional creep, such as diffusion through grain boundaries (i.e., Coble creep) or through the lattice (i.e., Nabarro-Herring creep). Thus the plastic strain rate in formula (3) can be considered to be controlled by a diffusional process:

$$\varepsilon^{\prime} = \varepsilon_{0}^{\prime} \exp \left( - \frac{Q}{k_{B} T} \right)$$
(4)

where \(Q\) is the thermally activated diffusional energy, which varies from 0.54 to 1.08 eV in the literature owing to different models, mechanisms, and measurement methods [19, 29–35].

The dominant mechanism for plastic deformation evidently differs with temperature: dislocation creep at low temperature, surface migration at medium temperature, interface/GB diffusion at higher temperature, and lattice diffusion at even higher temperature [15, 20, 36, 37]. In addition, the annealing rate, protective atmosphere, substrate constraint, passivation, and other factors may all affect the stress evolution of the film. It is difficult to propose a simple formula that describes all the mechanisms simultaneously with fair agreement. In fact, the formulas in most previous work either are complicated, with many fitted parameters, or fail to predict the stress evolution accurately during thermal cycling.

To avoid the complexity caused by the different diffusional mechanisms, \(Q\) in formula (4) is assumed to depend linearly on temperature, so that it can reflect different mechanisms at different temperatures with only three fitted parameters. Combining formulas (1)–(4) with the measured \(\kappa - T\) response in Fig. 5a, \(Q\) is deduced, as displayed in Fig. 8. It lies in the range of 0.3–0.9 eV, in general accordance with reported values [33, 34], which suggests that the assumption of a linearly temperature-dependent \(Q\) is reasonable.
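The combination of formula (4) with a linearly temperature-dependent \(Q\) can be sketched as follows. The parameterization \(Q(T) = Q_{0} + Q_{1}T\) with the prefactor \(\varepsilon_{0}^{\prime}\) as the third parameter is one possible reading of "three fitted parameters", not necessarily the form used here. All numerical values are placeholders: the line is simply drawn through 0.3 eV at an assumed RT of 30 °C and 0.9 eV at 375 °C, and the prefactor is arbitrary.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def q_linear(temp_k, q0, q1):
    """Assumed linear form of the diffusional energy, Q(T) = q0 + q1 * T, in eV."""
    return q0 + q1 * temp_k


def plastic_strain_rate(temp_k, eps0_rate, q0, q1):
    """Formula (4) with the linear Q(T) above; the result has the units of eps0_rate."""
    return eps0_rate * math.exp(-q_linear(temp_k, q0, q1) / (K_B_EV * temp_k))


# Placeholder parameters (hypothetical, not the fitted values of this work)
T_LOW, T_HIGH = 303.15, 648.15          # 30 degC (assumed RT) and 375 degC, in K
Q1 = (0.9 - 0.3) / (T_HIGH - T_LOW)     # slope in eV/K
Q0 = 0.3 - Q1 * T_LOW                   # intercept in eV
EPS0_RATE = 1.0                         # arbitrary prefactor in 1/s

for temp_k in (T_LOW, 473.15, T_HIGH):
    q = q_linear(temp_k, Q0, Q1)
    rate = plastic_strain_rate(temp_k, EPS0_RATE, Q0, Q1)
    print(f"T = {temp_k:.2f} K, Q = {q:.3f} eV, rate = {rate:.3e} 1/s")
```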

Fig. 8 Comparison of the derived temperature-dependent Q with reported values. A is from Surholt et al. [34], and B–D are from Gupta et al. [33]

To verify the assumption of a linearly temperature-dependent \(Q\), the diffusivity \(\delta D\) is further analyzed, which is defined as:

$$\delta D = C_{0} \exp \left( - \frac{Q}{k_{B} T} \right)$$
(5)

where \(\delta\) is the GB width, \(D\) is the GB diffusion coefficient, and \(C_{0}\) is the pre-exponential factor. The pre-exponential factor here is taken from the results for unpassivated Cu films reported by Gan et al. [29]. It can be seen from Fig. 9 that the diffusivity deduced in the current work shows a similar trend and a comparable order of magnitude to the literature [29, 33, 34], supporting the validity of the current model.

Fig. 9 Comparison of the diffusivities with previous studies. (a) Surholt [34], (b) Frost et al. [32, 34], (c) Gupta et al. [33]

Despite the comparable values, the \(Q\) deduced in the current work is somewhat smaller than 1.07 eV, the GB diffusional energy reported for unpassivated Cu films [16]. The difference may originate from the different grain structures of the thick films here and the thin films in the literature. Thin copper films in the literature generally consist of columnar grains, so the main diffusion paths are vertically along GBs (\(D_{GB.v}\)) and along the grain surface (\(D_{surf}\)), and the stress relaxation is determined by \(D_{GB}\) because \(D_{surf} \gg D_{GB}\). In the current work, however, there are 2–3 grains along the normal direction of the thick Cu film, as seen in Fig. 7. In addition to \(D_{surf}\) and \(D_{GB.v}\), there is an additional horizontal diffusion path along GBs (\(D_{GB.h}\)). Apparently, \(D_{GB.v}\) and \(D_{GB.h}\) are not independent, and the actual diffusion \(D_{\text{final}}\) depends on the coupling of the two, as schematically illustrated in Fig. 10.

Fig. 10 Schematic illustration of the diffusion paths in a thin film (top) and a thick film (bottom)

4.3 Warpage prediction by proposed model

Finally, assuming that the plastic deformation is determined only by a linearly temperature-dependent diffusional energy \(Q\), the curvature evolution can be modeled by the following formula:

$$\kappa = \frac{6 M_{f} t_{f} (t_{f} + t_{s})}{M_{s} t_{s}^{3}} \int_{RT}^{HT} \left( \Delta \alpha - \frac{\varepsilon_{0}^{\prime}}{T^{\prime}} \exp \left( - \frac{Q}{k_{B} T} \right) \right) dT$$
(6)

Substituting the linearly temperature-dependent \(Q\) into formula (6) yields the simulated \(\kappa - T\) curve. Figure 11a, b plots the simulated \(\kappa - T\) curves at temperature rates of 10 and 5 °C/min, respectively. For comparison, the experimentally measured \(\kappa - T\) response is also plotted. The model accurately predicts the stress evolution during thermal cycling of thick films in the current temperature range. The accuracy can be attributed to the temperature-dependent \(Q\), because a variable diffusional energy in the proposed model represents different mechanisms at different temperatures.
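A numerical evaluation of formula (6) can be sketched with a simple trapezoidal rule. The sketch assumes that the moduli in the prefactor are the biaxial moduli of film and substrate, that the thickness in the denominator is the substrate thickness, that \(\Delta \alpha\) is a constant positive mismatch magnitude, and that \(Q(T) = Q_{0} + Q_{1}T\). All material constants and model parameters below are placeholders chosen for illustration; they are not the fitted values of this work, and the function and key names are hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

# Placeholder parameters (hypothetical, for illustration only)
PARAMS = {
    "M_f": 180e9,        # assumed biaxial modulus of the Cu film, Pa
    "M_s": 180.5e9,      # assumed biaxial modulus of (100) Si, Pa
    "t_f": 5e-6,         # film thickness from the text, m
    "t_s": 420e-6,       # substrate thickness from the text, m
    "d_alpha": 14e-6,    # assumed constant CTE mismatch magnitude, 1/K
    "eps0_rate": 1e-3,   # assumed prefactor of formula (4), 1/s
    "q0": -0.227,        # assumed intercept of Q(T), eV
    "q1": 1.739e-3,      # assumed slope of Q(T), eV/K
}


def curvature_change(t_start, t_end, rate, p, n=2000):
    """Trapezoidal integration of formula (6) between two temperatures (K).

    rate is the temperature rate dT/dt in K/s; p is a parameter dict like PARAMS.
    """
    prefactor = (6.0 * p["M_f"] * p["t_f"] * (p["t_f"] + p["t_s"])
                 / (p["M_s"] * p["t_s"] ** 3))

    def integrand(temp_k):
        q = p["q0"] + p["q1"] * temp_k
        creep = p["eps0_rate"] * math.exp(-q / (K_B_EV * temp_k)) / rate
        return p["d_alpha"] - creep

    h = (t_end - t_start) / n
    total = 0.5 * (integrand(t_start) + integrand(t_end))
    total += sum(integrand(t_start + i * h) for i in range(1, n))
    return prefactor * total * h


# Two temperature rates from the text, converted to K/s
for rate_c_per_min in (10.0, 5.0):
    dk = curvature_change(303.15, 648.15, rate_c_per_min / 60.0, PARAMS)
    print(f"{rate_c_per_min:4.1f} degC/min: curvature change = {dk:.4f} 1/m")
```

Setting the prefactor `eps0_rate` to zero recovers the purely elastic response, which is a convenient sanity check before fitting the diffusional parameters to a measured \(\kappa - T\) curve.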

Fig. 11 The simulated warpage evolution at temperature rates of (a) 10 °C/min and (b) 5 °C/min, showing good agreement with experimental measurements

5 Summary and conclusion

In this work, a 5 μm thick Cu blanket film is electroplated onto a 4-inch, 420 μm thick DSP Si wafer, and the stress evolution during thermal cycling from RT to 375 °C is analyzed in situ by an MOS system. The stabilized RT curvature after thermal cycling is about 0.2347 ± 0.001 m⁻¹, corresponding to a biaxial stress of about 260 MPa. FIB ion images reveal that grain growth during thermal cycling is negligible in the current work, whereas substantial atomic diffusion has occurred, suggesting that the deformation during thermal cycling is dominated by diffusional creep. As the dominant diffusion mechanism differs with temperature, an equivalent diffusional energy that increases linearly with temperature is proposed, and a stress evolution model based on this equivalent diffusional energy is deduced. Unlike conventional diffusion analyses, the current model assumes the diffusional energy to be linearly related to temperature, which in effect represents different diffusional mechanisms at different temperatures. Compared with conventional models, which usually have complex equations, the current model has a concise formula with only three fitted parameters and shows much better agreement with the experimental results.