1 Introduction

As the foundation and core components of power electronic systems, power modules are the "hub" of power conversion and control [1]. Power modules usually operate under very severe working conditions, such as irradiation, NBTI, high electric fields, hot carrier injections, unclamped inductive switching, and so on [2,3,4,5]. They can withstand more than five million power cycles in their life cycle [6]. Thus, the aging problem is inevitable.

According to an industrial survey, about 34% of converter failures are caused by device failures [7], and 55% of system failures are caused by temperature [8]. In addition, the failure rate of power devices is known to double when the temperature increases by 10 ℃ [9]. Therefore, it is very important to accurately monitor the junction temperature of IGBT modules [10]. There are three reasons why this is important. (1) Changes in the junction temperature can affect the physical parameters of a semiconductor and change its characteristics. (2) Any change in the IGBT health state caused by common faults (such as bond wire lift off and solder layer fatigue) is represented by a change in the IGBT junction temperature. Auxiliary manual intervention can prolong the lifetime of the device. (3) Improving the power density of the IGBT power module improves the system reliability and reduces cost.

At present, there are four popular methods for junction temperature monitoring and estimation: (a) physical contact methods; (b) optical methods; (c) thermal impedance model prediction; and (d) thermo-sensitive electric parameters (TSEPs). Physical contact methods place a thermal sensor directly on the IGBT chip to measure the junction temperature [11, 12]. Although this approach is simple and low cost, it is highly invasive and has a slow response. Optical methods use infrared cameras to monitor the temperature of a device [13, 14]. They are accurate and can measure the surface temperature field distribution of the die. However, they are costly, and the package of the tested module must be opened for direct measurement. The thermal impedance model prediction method simulates the junction temperature distribution through a finite element method once the power loss and thermal impedance models are determined [15, 16]. Although this method is non-invasive, it is affected by the time-varying parameters of the thermal resistance network, which results in junction temperature prediction errors [17]. The physical characteristics of semiconductor materials are correlated with temperature [18]. Techniques that infer the junction temperature from this relationship are referred to as TSEP methods. They are among the most promising temperature measurement methods since the power module does not need to be opened and the response is rapid.

Most studies on monitoring the chip junction temperature with TSEPs concentrate on single-chip power modules. Commonly used TSEPs include the Miller plateau duration [19], the change rate of the voltage between the collector and the emitter [20], the short-circuit current [21], the turn-on/turn-off delay time [22], the threshold voltage [23], the gate current [24], the collector–emitter voltage [25], etc. The safe operation of IGBT modules using parallel chips is becoming more and more important due to the development of high-power inverters [26]. However, for multichip modules with uneven temperature distributions between chips, the aforementioned methods are no longer meaningful due to their lower sensitivity. In multichip modules, the electrothermal characteristics of the chips are never identical, and the chip cooling and operating conditions are non-uniform, so some chips suffer from higher stresses. The solder layer of the most heavily stressed chip degrades the most. At the same time, the resulting increase in thermal resistance accelerates the inhomogeneous degradation in multichip IGBT modules [27].

For a single-chip module or multichip modules with an even temperature distribution, the evaluated temperature is the junction temperature of the chip or chips. However, what the evaluated temperature indicates is still unknown when multichip modules are subjected to an uneven temperature distribution. From a practical point of view, condition monitoring technology for power modules with parallel chips is urgently needed. A comparison of the new TSEP method with existing methods is shown in Table 1.

Table 1 Comparison of the proposed method with existing methods

This paper clarifies the electrothermal coupling characteristics between chips in multichip modules, and establishes a mathematical model of the electrothermal relationship. Tests indicate that the temperature evaluated by the threshold voltage approaches the maximum temperature in the chips within the module. In addition, the junction temperature evaluated by the change rate of the collector current and that of the collector–emitter voltage approaches the average junction temperature of the module.

2 Temperature evaluation of the TSEPs in multichip modules

Paralleling multiple devices or chips is considered a promising solution for increasing the module or converter capacity. However, for multichip modules with uneven temperature distributions between the chips, estimating the module temperature through TSEPs is no longer valid due to the low sensitivity of these parameters. In this section, a mathematical model of the electrothermal relationship in a multichip power module is described in detail. Then, the physical significance of the junction temperature evaluated through existing TSEPs in a multichip module is clarified.

2.1 Threshold voltage

Previous studies [23] have shown that the threshold voltage decreases as the temperature increases. Thus, the chip with the highest temperature (and hence the lowest threshold voltage) plays a decisive role in the turn-on of a multichip module. During the turn-on transient, the chip with the maximum temperature turns on first and is subjected to a large current stress. Its temperature is then further elevated through a positive feedback mechanism. Therefore, it is identified as the chip with the heaviest stress, which is more prone to aging and eventual failure. The threshold voltage of the multichip module is therefore:

$$V_{{{\text{thp}}}} (T){ = }V_{{{\text{th}}}} (T_{\max } ).$$
(1)

Consequently, the temperature evaluated by the threshold voltage is the maximum temperature of the chips in the multichip power module.
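As a rough illustration of Eq. (1), the sketch below models each chip with a linear Vth(T) characteristic and takes the module threshold as the lowest chip threshold. The 25 °C threshold value, the chip temperatures, and all function names are assumptions made for this example; only the order of magnitude of the sensitivity (about 10 mV/°C) comes from the text.

```python
# Hypothetical linear threshold-voltage model for one chip (values assumed).
VTH_25C = 5.8          # V at 25 degC, illustrative only
SENSITIVITY = -0.010   # V/degC, about -10 mV/degC


def vth_chip(temp_c):
    """Threshold voltage of a single chip at temp_c (degC)."""
    return VTH_25C + SENSITIVITY * (temp_c - 25.0)


def vth_module(chip_temps_c):
    """Eq. (1): the paralleled chips share one gate, so the module starts to
    conduct when the chip with the lowest threshold (the hottest) turns on."""
    return min(vth_chip(t) for t in chip_temps_c)


def temp_from_vth(vth):
    """Invert the single-chip characteristic to get an evaluated temperature."""
    return 25.0 + (vth - VTH_25C) / SENSITIVITY


chip_temps = [60.0, 75.0, 90.0]   # degC, uneven distribution (assumed)
t_eval = temp_from_vth(vth_module(chip_temps))
print(f"evaluated: {t_eval:.2f} degC, hottest chip: {max(chip_temps):.2f} degC")
```

Under these assumptions the evaluated temperature coincides with the hottest chip, regardless of how the remaining chips are distributed.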

2.2 Maximum collector–emitter voltage change rate during turn-off

To establish a mathematical model of the electrothermal relationship in a multichip module, the internal structure of n IGBT chips in parallel is built, as shown in Fig. 1, where L1n is the parasitic inductance in the gate path, while L2n and Rn are parasitic inductance and parasitic resistance of the main power path, respectively. The gate current ig is obtained as follows:

$$i_{{\text{g}}} (t) = \sum\limits_{x = 1}^{n} {i_{{{\text{g}}x}} (t)} .$$
(2)

Fig. 1 Internal structure of n IGBT chips in parallel

During the turn-off period, the collector–emitter voltage rises rapidly while the gate current charges the Miller capacitor, and the collector–emitter voltage change rate reaches its maximum value [20]. The gate–collector (Miller) capacitance of chip x depends on the temperature as follows:

$$C_{{{\text{gc}}x}} (T) = \frac{{C_{{{\text{ox}}}} A\sqrt {\frac{{q \cdot N_{{\text{B}}} (T) \cdot \varepsilon }}{{2V_{{{\text{ce}}}} }}} }}{{C_{{{\text{ox}}}} + A\sqrt {\frac{{q \cdot N_{{\text{B}}} (T) \cdot \varepsilon }}{{2V_{{{\text{ce}}}} }}} }} \propto N_{{\text{B}}} (T) = kT_{x} ,$$
(3)

where Cgcx is the gate–collector capacitance of chip x. The maximum change rate of the collector–emitter voltage is then:

$$\frac{{{\text{d}}v_{{{\text{ce}}}} }}{{{\text{d}}t}}_{\max } = \frac{{i_{{\text{g}}} }}{{C_{{{\text{gc}}}} (T)}} = \frac{{\sum\nolimits_{x = 1}^{n} {i_{{{\text{g}}x}} (t)} }}{{\sum\nolimits_{x = 1}^{n} {C_{{{\text{gc}}x}} (T)} }}.$$
(4)

From Eqs. (3) and (4), the relationship between the maximum change rate of the collector–emitter voltage and the temperature is obtained when n chips are in parallel as follows:

$$\frac{{{\text{d}}v_{{{\text{ce}}}} }}{{{\text{d}}t}}_{\max } = \frac{{\sum\nolimits_{x = 1}^{n} {i_{{{\text{g}}x}} (t)} }}{{k\sum\nolimits_{x = 1}^{n} {T_{x} } }}.$$
(5)

It can be seen that the maximum change rate of the collector–emitter voltage is independent of the parasitic parameters in the gate path and the main power path. When all of the internal chips are treated as being at one common temperature, as seen from the module terminals:

$$\frac{{{\text{d}}v_{{{\text{ce}}}} }}{{{\text{d}}t}}_{\max } = \frac{{\sum\nolimits_{x = 1}^{n} {i_{{{\text{g}}x}} (t)} }}{{nkT_{j} }},$$
(6)

where Tj is the evaluated temperature from the TSEPs in the module. From Eqs. (5) and (6), the following is obtained:

$$T_{j} = \frac{1}{n}\sum\limits_{x = 1}^{n} {T_{x} } .$$
(7)

Then, the temperature evaluated by the maximum change rate of the collector–emitter voltage is the mean temperature of the chips within the multichip power module.
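The step from Eqs. (5) and (6) to Eq. (7) can be checked numerically. The sketch below is only an illustration: the proportionality constant k, the per-chip gate currents, and the chip temperatures are hypothetical values, and the function names are not from the paper.

```python
def dvdt_max_parallel(gate_currents, chip_temps, k):
    """Eq. (5): total gate current divided by the total Miller capacitance,
    with each chip capacitance modelled as C_gcx = k * T_x."""
    return sum(gate_currents) / (k * sum(chip_temps))


def evaluated_temp(dvdt_max, gate_currents, k):
    """Eq. (6) solved for T_j, i.e. all n chips assumed at one temperature."""
    n = len(gate_currents)
    return sum(gate_currents) / (n * k * dvdt_max)


k = 2.0e-12                          # F per temperature unit (assumed)
gate_currents = [0.10, 0.12, 0.08]   # A, deliberately unequal (assumed)
chip_temps = [60.0, 75.0, 90.0]      # uneven chip temperatures (assumed)

dvdt = dvdt_max_parallel(gate_currents, chip_temps, k)
t_j = evaluated_temp(dvdt, gate_currents, k)
print(f"evaluated T_j = {t_j:.2f}, mean chip temperature = "
      f"{sum(chip_temps) / len(chip_temps):.2f}")
```

The unequal gate currents cancel out in the ratio, which is the numerical counterpart of the statement that the result does not depend on how the gate current is shared between the chips.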

2.3 Maximum collector current change rate during turn-off

For an inductive load, the maximum change rate of the collector current is [28]:

$$\frac{{{\text{d}}i_{c} }}{{{\text{d}}t}}_{\max } = \frac{{ - 2I_{{\text{L}}} }}{{\tau \left( {\frac{T + 273.15}{{300}}} \right)^{1.5} }} \approx \frac{{ - I_{{\text{L}}} }}{\tau (0.003T + 0.42)},$$
(8)

where \(\tau\) is the high-level injection lifetime of the cut-off layer at 26.85 ℃ (300 K), and IL is the load current. Therefore, the relationship between the maximum change rate of the collector current and the temperature is obtained when n chips are paralleled as follows:

$$\frac{{{\text{d}}i_{{\text{c}}} }}{{{\text{d}}t}}_{\max } = \frac{{ - {\text{d}}\sum\nolimits_{x = 1}^{n} {i_{{{\text{c}}x}} } }}{{{\text{d}}t}}_{\max } = \frac{{ - I_{{\text{L}}} }}{{\tau \sum\limits_{x = 1}^{n} {0.003T_{x} } + 0.42n\tau }}.$$
(9)

It can be seen that the maximum change rate of the collector current is only affected by the total load current and the temperature. The parasitic parameters in the gate path and the main power path differ between chips, which results in uneven current sharing inside the power module, while the total current of the multichip module remains unchanged. Thus, the parasitic parameters do not change the maximum change rate of the collector current. When all of the internal chips are treated as being at one common temperature, as seen from the module terminals:

$$\frac{{{\text{d}}i_{{\text{c}}} }}{{{\text{d}}t}}_{\max } = \frac{{ - I_{{\text{L}}} }}{{0.003\tau nT_{{\text{j}}} + 0.42n\tau }}.$$
(10)

From Eqs. (9) and (10), the following can be obtained:

$$T_{j} = \frac{1}{n}\sum\limits_{x = 1}^{n} {T_{x} } .$$
(11)

Then, the temperature evaluated by the maximum change rate of the collector current is the mean temperature of the chips within the multichip power module.
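Two points in this subsection lend themselves to a quick numerical check: how closely the linear term in Eq. (8) follows the exact power-law term, and whether inverting Eq. (10) on a value produced by Eq. (9) returns the mean chip temperature. The sketch below assumes a hypothetical lifetime, load current, and set of chip temperatures; none of these numbers come from the paper.

```python
def exact_factor(temp_c):
    """Half of the exact denominator term in Eq. (8)."""
    return 0.5 * ((temp_c + 273.15) / 300.0) ** 1.5


def linear_factor(temp_c):
    """Linearised denominator term used on the right-hand side of Eq. (8)."""
    return 0.003 * temp_c + 0.42


def didt_max_parallel(i_load, tau, chip_temps):
    """Eq. (9) for n paralleled chips."""
    n = len(chip_temps)
    return -i_load / (tau * sum(0.003 * t for t in chip_temps) + 0.42 * n * tau)


def evaluated_temp(didt_max, i_load, tau, n):
    """Eq. (10) solved for T_j."""
    return (-i_load / didt_max - 0.42 * n * tau) / (0.003 * tau * n)


# Relative error of the linearisation over a typical operating range.
grid = range(25, 151, 25)
max_rel_err = max(abs(linear_factor(t) / exact_factor(t) - 1.0) for t in grid)
print(f"max relative error on grid: {max_rel_err:.3f}")

tau = 1.0e-6                      # s, hypothetical lifetime
i_load = 45.0                     # A, hypothetical load current
chip_temps = [60.0, 75.0, 90.0]   # degC, uneven distribution (assumed)
t_j = evaluated_temp(didt_max_parallel(i_load, tau, chip_temps),
                     i_load, tau, len(chip_temps))
print(f"evaluated T_j = {t_j:.2f} degC")
```

On this grid the linear form stays within roughly 5% of the exact expression, and the inverted temperature matches the arithmetic mean of the chip temperatures.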

According to the discussion above, although some of the fundamental equations are not novel, it is found for the first time that the junction temperature evaluated by the threshold voltage represents the highest chip temperature in a multichip module, whereas the junction temperature evaluated by the change rate of the voltage or of the current represents the average chip temperature.

3 Electrothermal coupling in multichip modules

3.1 Effect of the electrical difference between chips on temperature distribution

To improve the reliability of the multichip module, IGBT chips from the same batch with nominally identical electrical parameters are connected in parallel. However, the parasitic parameters in the gate path and the main power path differ between chips due to the asymmetry in the layout design, which can lead to uneven current sharing. The junction temperature of chip z in thermal steady state is:

$$T_{jsz} { = }T_{{\text{c}}} { + }P_{{\text{loss,z}}} R_{{{\text{th}}}} ,$$
(12)

where Ploss,z is the power loss of chip z and Rth is the junction-to-case thermal resistance. According to the datasheet of the device and Eq. (12), the temperature difference between chips caused by electrical differences, which arise from parasitic parameters or chip layout, does not exceed 3 ℃. In addition, the sensitivity of the threshold voltage is about 10 mV/℃. Thus, the chip temperature difference caused by chip manufacturing tolerance is only 2–3 ℃.
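A short worked example of Eq. (12) shows why a modest loss imbalance translates into only a few degrees of temperature difference. The case temperature, thermal resistance, and per-chip losses below are hypothetical and are not taken from the device datasheet.

```python
def steady_state_tj(t_case_c, p_loss_w, r_th_k_per_w):
    """Eq. (12): steady-state junction temperature of one chip."""
    return t_case_c + p_loss_w * r_th_k_per_w


T_CASE = 60.0          # degC (assumed)
R_TH = 0.5             # K/W junction-to-case (assumed)
losses = [40.0, 44.0]  # W, a 10 % imbalance from uneven current sharing (assumed)

temps = [steady_state_tj(T_CASE, p, R_TH) for p in losses]
delta_t = max(temps) - min(temps)
print(f"chip temperatures: {temps}, difference: {delta_t:.1f} K")
```

With these assumed values the 4 W loss difference produces a 2 K spread, which is of the same order as the bound of 3 ℃ stated above.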

3.2 Effect of the temperature difference caused by chip solder layer degradation on current distribution

Lai et al. found that the solder layer begins to deteriorate first, followed by an increase in junction temperature, which further promotes the lift-off of bonding wires [29]. The modules function properly at the beginning of their life. Although there is thermal coupling between the chips, previous research has revealed that, above the base plate, there is barely any lateral heat transfer between chips, and the temperature difference between chips caused by electrical differences or chip position differences does not exceed 3 ℃. After a long period of operation, some dies are likely to endure higher stress as a result of the inevitable differences in the chip electrothermal characteristics and the unevenness of the cooling system. Starting from the small temperature difference caused by the electrical characteristics and position of the chips, the chip under the highest stress ages faster, which increases the temperature difference further and results in positive feedback.

Electrothermal simulations are carried out in MATLAB, and the electrothermal characteristics of the two IGBTs are extracted from SKM50GB12T4 IGBTs rated at 1200 V/50 A. According to previous research [30], for multichip modules, the higher the switching frequency, the greater the junction temperature difference between chips. A common switching frequency of 4 kHz and a duty cycle of 50% are selected for the simulation.

At different temperatures, the I–V characteristics and switching losses are represented by a linearly interpolated look-up table (LUT) to evaluate the power losses of parallel IGBTs. As shown in Fig. 2, a controlled power supply is connected to two IGBTs, and its current and junction temperature are determined by the LUT. The power losses are output to the thermal network model to estimate their junction temperatures. Meanwhile, the LUT also gives the I–V characteristic curves of two IGBTs at different junction temperatures to evaluate the current sharing characteristics.

Fig. 2 Electrical–thermal modeling

The degradation level (DL) is defined according to the increase of the chip-to-case thermal resistance. Since IGBT 1 is a healthy chip (DL 0), it is assumed that the degradation of IGBT 2 changes from DL 0 to DL 4. Figure 3 presents the temperature differences and current differences between the two IGBTs with uneven degradation levels. It was found that the junction temperature difference between chips due to solder layer degradation under rated current injection can be 40 °C. However, the current difference at the operating point of the rated current injection is less than 5A. As shown in Fig. 4, the percentage of temperature difference increase between chips is higher than the percentage of current difference reduction. It is also found that there is a corresponding relationship between the DL of the solder layer and the difference between the maximum temperature and average temperature of the chips in the multichip module.
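The LUT-based electrothermal loop described above can be sketched in a few lines. This is a simplified stand-in written in Python rather than the MATLAB model used in the paper: every LUT entry, the case temperature, the healthy thermal resistance, and the mapping from degradation to a thermal resistance scale factor are assumed values chosen only to make the feedback loop visible.

```python
# Hypothetical two-point LUT versus junction temperature (not datasheet data).
T_GRID = (25.0, 150.0)          # degC
V0_LUT = (0.90, 0.80)           # V, on-state knee voltage
R_LUT = (0.020, 0.028)          # ohm, on-state slope resistance
ESW_LUT = (0.10e-3, 0.16e-3)    # J per ampere per switching cycle

F_SW, DUTY = 4000.0, 0.5        # 4 kHz and 50 % duty cycle, as in the text
T_CASE, R_TH0 = 60.0, 0.5       # degC and K/W for a healthy chip (assumed)


def lut(table, temp_c):
    """Linear interpolation between the two LUT points, clamped at the ends."""
    t_lo, t_hi = T_GRID
    frac = min(max((temp_c - t_lo) / (t_hi - t_lo), 0.0), 1.0)
    return table[0] + frac * (table[1] - table[0])


def share_current(i_total, t1, t2):
    """Split the total current so both chips see the same on-state voltage."""
    v01, r1 = lut(V0_LUT, t1), lut(R_LUT, t1)
    v02, r2 = lut(V0_LUT, t2), lut(R_LUT, t2)
    i1 = (v02 - v01 + r2 * i_total) / (r1 + r2)
    return i1, i_total - i1


def power_loss(i, temp_c):
    """Conduction loss plus switching loss for one chip."""
    v_on = lut(V0_LUT, temp_c) + lut(R_LUT, temp_c) * i
    return DUTY * v_on * i + F_SW * lut(ESW_LUT, temp_c) * i


def solve(i_total, rth_scale_2, iters=200, tol=1e-9):
    """Fixed-point iteration; rth_scale_2 scales R_th of the degraded chip."""
    t1 = t2 = T_CASE
    for _ in range(iters):
        i1, i2 = share_current(i_total, t1, t2)
        t1_new = T_CASE + power_loss(i1, t1) * R_TH0
        t2_new = T_CASE + power_loss(i2, t2) * R_TH0 * rth_scale_2
        converged = abs(t1_new - t1) < tol and abs(t2_new - t2) < tol
        t1, t2 = t1_new, t2_new
        if converged:
            break
    return t1, t2, i1, i2


for scale in (1.0, 1.5, 2.0):   # assumed thermal-resistance increase of IGBT 2
    t1, t2, i1, i2 = solve(50.0, scale)
    print(f"scale {scale:.1f}: dT = {t2 - t1:6.2f} K, dI = {i1 - i2:5.2f} A")
```

In this toy model the degraded chip runs hotter and, because its on-state voltage rises with temperature at this current level, carries slightly less current, so the temperature difference grows much faster than the current difference.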

Fig. 3 Temperature and current differences between parallel IGBTs

Fig. 4 Percentage of temperature and current differences between parallel IGBTs

By clarifying the physical significance of the junction temperature evaluated through the dynamic TSEPs, this study creates new pathways for future research to realize the condition monitoring of initial solder layer degradation in multichip modules.

4 Experimental verification

4.1 Experimental platform

From the above analysis, the three TSEPs are only affected by the DC bus voltage, load current, and temperature. They are independent of the parasitic parameters in the gate path and the main power path. When initial solder layer degradation occurs in the module, the temperature distribution of the chips becomes uneven. However, the total load current and DC bus voltage of the power module remain constant. This research focuses on finding the relationship between the terminal characteristics of the multichip module and the internal chip junction temperatures. It also aims to clarify the physical significance of the junction temperature evaluated by electrical parameters. To eliminate uneven current sharing caused by the spread of the IGBT parameters and by inconsistencies in the external main circuit and driving circuit parameters, chips (devices) of the same model and batch are used in parallel in the multichip module. For multichip modules, it is difficult to accurately control different temperature distributions or uneven aging of chips. Therefore, devices of the same type and batch are selected and connected in parallel to simulate a multichip power module in a double pulse test rig. The differences between the simulated multichip power module and an actual multichip power module are the layout and parasitic parameters, of which the TSEPs are independent. The circuit topology and experimental platform are shown in Fig. 5. The half bridge power module consists of an IGBT and a diode chip on each leg, as shown in Fig. 5b. The upper leg of the half bridge module SKM50GB12T4 (1200 V, 50 A) and Infineon discrete IGBTs RGS50TSX2DHR (1200 V, 50 A) packaged in TO-247 are separately selected as the devices under test (DUTs) to verify the applicability of the proposed model and the feasibility of the simulated multichip modules in two applications. The details of the two DUTs are shown in Fig. 5c.

Fig. 5 Double pulse test circuit topology and experimental platform: (a) circuit topology and experimental platform; (b) layout of the power module; (c) details of the two DUTs

To obtain waveforms of parallel IGBTs under different junction temperatures, two heating platforms were placed under D1 and D2. Before the test, the baseplates of D1 and D2 were heated for 20 min to ensure thermal equilibrium [31]. An infrared camera was used to measure the temperature of the module chip and of the black case of the discrete devices, so that the accuracy of the temperatures set for D1 and D2 is higher than 99%. While the temperature of the baseplate was adjusted, the corresponding voltage and current waveforms of the IGBTs under different temperature differences were measured.

4.2 Threshold voltage

The condition of a multichip module with an even temperature distribution is simulated by heating two parallel IGBTs to the same temperature. Figure 6 shows the Vge and Ic waveforms of different devices with an even temperature distribution. Vth is measured as the value of vge at the instant the IGBT turns on. The values of Vth for different DUTs as a function of temperature when the conduction current is 40 mA, 60 mA, and 80 mA are shown in Fig. 7. Although the values of Vth for different semiconductor devices are different, they all show a linear relationship with temperature. The temperature sensitivities under different conduction currents are almost identical, at approximately 10.76 mV/°C for discrete IGBTs and approximately 9.44 mV/°C for modules. Thus, the Vth LUT for different devices can be established.

Fig. 6 Vge and Ic waveforms of different devices with even temperature distributions

Fig. 7 Vth of different DUTs as a function of temperature when the conduction current is 40 mA, 60 mA, and 80 mA, respectively
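Building the Vth LUT from even-temperature calibration points amounts to a straight-line fit followed by an inversion. The sketch below uses synthetic calibration points generated with the reported sensitivity for discrete IGBTs (about 10.76 mV/°C, taken as negative because Vth falls with temperature) and a hypothetical 25 °C threshold; the points are not measured data from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic calibration set: both parallel devices heated to the same temperature.
cal_temps = [25.0, 50.0, 75.0, 100.0, 125.0]               # degC
cal_vth = [5.9 - 0.01076 * (t - 25.0) for t in cal_temps]  # V (assumed offset)

slope, intercept = fit_line(cal_temps, cal_vth)


def temp_from_vth(vth):
    """Invert the fitted line: evaluated junction temperature for a Vth reading."""
    return (vth - intercept) / slope


print(f"fitted sensitivity: {slope * 1e3:.2f} mV/degC")
print(f"Vth = 5.20 V -> {temp_from_vth(5.20):.1f} degC")
```

With real measurements the fit would carry residuals, and a separate line would be fitted for each device type, since the module and the discrete IGBTs show different sensitivities.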

Heating two IGBTs to different temperatures simulates a multichip module with an uneven temperature distribution. Figure 8 shows the Vge and Ic waveforms of different devices with uneven temperature distributions. As noted above, the temperature sensitivities under different values of conduction current are almost the same. Hence, the gate–emitter voltage corresponding to a conduction current of 80 mA is chosen as the threshold voltage. The temperatures evaluated by the Vth LUT of the different devices are shown in Table 2. As can be seen from Table 2, for both discrete IGBTs and power modules, the junction temperatures evaluated by Vth all approach the maximum temperature of the parallel chips. Although the parasitic inductance of the power module topology is much higher than that of the discrete IGBTs, the maximum estimation error for any type of IGBT is no more than 3 ℃. Thus, the physical significance of the temperature evaluated by the threshold voltage is the maximum chip temperature in a multichip module.

Fig. 8 Vge and Ic waveforms of different devices with uneven temperature distributions

Table 2 Tj (°C) evaluated by the Vth of different DUTs

4.3 Maximum collector–emitter voltage change rate during turn-off

Heating two parallel IGBTs to the same temperature simulates a multichip module with an even temperature distribution. Figure 9 shows the dVce/dt waveforms of different devices. It can be seen that the maximum change rate of the collector–emitter voltage decreases as the temperature increases, which indicates a negative temperature dependence. The dVce/dtmax values of different DUTs as a function of temperature are shown in Fig. 10. Although the maximum change rates of the collector–emitter voltage of different devices are different, they have a linear relationship with temperature. The results show that the temperature sensitivity is approximately 9.88 V/(μs·°C) for discrete IGBTs and approximately 7.69 V/(μs·°C) for modules. Thus, it is possible to create a LUT of the maximum change rate of the collector–emitter voltage for different devices.

Fig. 9 dVce/dt waveforms of different devices with even temperature distributions

Fig. 10 dVce/dtmax of different DUTs as a function of temperature

A multichip module with an uneven temperature distribution is simulated by heating two parallel IGBTs to different temperatures. Figure 11 shows the dVce/dt waveforms of different devices with uneven temperature distributions. The temperatures evaluated by the LUT of the maximum change rate of the collector–emitter voltage for the different devices are shown in Table 3. From this table, for both discrete IGBTs and power modules, the junction temperatures evaluated by the maximum change rate of the collector–emitter voltage all approach the average temperature of the parallel chips. Regardless of the type of IGBT used, the maximum estimation error is no more than 3 ℃. Thus, the physical significance of the temperature evaluated by the maximum change rate of the collector–emitter voltage is the average temperature of the chips in a multichip module.

Fig. 11 dVce/dt waveforms of different devices with uneven temperature distributions

Table 3 Tj (°C) evaluated by the dVce/dt of different DUTs

4.4 Maximum collector current change rate during turn-off

Figure 12 shows the dic/dt waveforms of different devices with even temperature distributions. These results indicate that the maximum change rate of the collector current decreases as the temperature increases, which is a negative temperature dependence. The dic/dtmax values of different DUTs as a function of temperature are shown in Fig. 13. Although the maximum change rates of the collector currents of different devices are different, they are linear with respect to the measured temperature. These results show that the temperature sensitivity is approximately 13.55 A/(μs·°C) for discrete IGBTs and approximately 16.72 A/(μs·°C) for modules. Thus, it is possible to create a LUT of the maximum change rates of the collector currents for different devices.

Fig. 12 dic/dt waveforms of different devices with even temperature distributions at 500 V and 45 A

Fig. 13 dic/dtmax of different DUTs as a function of temperature at 500 V and 45 A

Figure 14 shows the dic/dt waveforms of different devices with uneven temperature distributions. The temperatures evaluated by the LUT of the maximum change rate of the collector current for the different devices are shown in Table 4. As can be seen from this table, for both discrete IGBTs and power modules, the junction temperatures evaluated by the maximum change rates of the collector currents all approach the average temperature of the parallel chips. Regardless of the type of IGBT used, the maximum estimation error is no more than 3 ℃. Thus, the physical significance of the temperature evaluated by the maximum collector current change rate is the average temperature of the chips in a multichip module.

Fig. 14 dic/dt waveforms of different devices with uneven temperature distributions at 500 V and 45 A

Table 4 Tj (°C) evaluated by the dic/dt of different DUTs

5 Conclusion

This paper establishes a mathematical model of the electrothermal relationship. The characteristics of the electrothermal coupling between chips in a multichip module, and the influence of solder layer aging on this coupling, are analyzed through simulations. The simulation and experimental results show the following.

  1. The proposed junction temperature measurement method is independent of both the internal chip layout and the parasitic parameters inside multichip modules.

  2. The physical significance of the temperature evaluated by the threshold voltage is the highest temperature of one of the chips in a multichip module, and the physical significance of the temperature evaluated by the maximum change rate of the collector–emitter voltage or the maximum change rate of the collector current is the average temperature of the chips in multichip modules.

  3. There is a corresponding relationship between the DL of the solder layer and the difference between the maximum temperature and average temperature of the chips in multichip modules.

Although the mathematical model of the electrothermal relationship is not novel, this research provides a new path for future work. The relationship between the solder layer DL and the temperature difference will be studied further, and a solder layer aging monitoring method based on combined TSEPs will be proposed. In addition, different IGBT modules will be connected in parallel to develop a current sharing control method for parallel IGBTs.