
1 Introduction

In order to supply space equipment with highly reliable electronic components, specialized testing centers conduct a variety of tests for each installed semiconductor device. The electronic component base (ECB) intended for installation in spacecraft equipment is subjected, along with the input testing, to additional rejection tests, including selective destructive physical analysis (DPA). DPA allows us either to confirm the good quality of a batch of ECB or to identify batches that have defects caused by the manufacturing technology and not detected during conventional rejection tests and additional non-destructive testing. In order to be able to transfer the results of DPA of several devices to the entire batch of semiconductor devices, the following requirement is put forward for the ECB intended for installation in space equipment: all devices from the same batch must be made from the same raw materials. Manufacturers of general-purpose equipment (not designed solely for use in spacecraft) cannot guarantee that this requirement is met. Therefore, the problem of automatic grouping of semiconductor devices into production batches is very relevant.

It was shown in [1] that the problem of allocating homogeneous batches can be reduced to a problem of cluster analysis. The authors of [1] consider k-means, p-median and other optimization models for solving such a problem. Each group (cluster) must represent a homogeneous batch. To solve the problem of identifying homogeneous batches, the application of the k-means clustering optimization algorithm is proposed in [2,3,4]. In [5], the authors consider a clustering method based on the EM algorithm, which maximizes the log-likelihood function. A model of separation of homogeneous production batches based on a mixture of Gaussian distributions was proposed in [6]. In [7], the authors propose using ensembles of optimization models (k-means, k-medoids, k-medians), EM, as well as their optimized versions. In [1], the authors consider the application of genetic optimization algorithms with greedy heuristic procedures, in combination with the EM algorithm, for the separation of homogeneous batches of electronic devices. The advantage of the new algorithms over classical clustering algorithms for multidimensional data is shown.

In this paper, the initial data are represented by multidimensional sets (arrays) of parameters of electronic radio components (ERC), measured as the results of several hundred mandatory non-destructive tests [8]. In order to reduce the dimensionality of the input parameter sets used for clustering devices into homogeneous batches, we propose the application of factor analysis methods. The aim of factor analysis is to find a simple structure that accurately reflects and reproduces the real dependencies existing in nature [9]. Factor analysis is based on the definition of the factor model

$$\begin{aligned} X_i=\sum \limits _{j=1}^{m} a_{ij}F_j+u_i \end{aligned}$$
(1)

where \(X_i\) is the vector of values of a measured parameter (\(i=1,\dots ,n\)), \(F_j\) are primary factors (\(j=1,\dots ,m\)), \(a_{ij}\) are coefficients called factor loadings, and \(u_i\) are characteristic (specific) factors describing the part of the parameter that is not included in any primary factor. If \(m<n\), a reduction of the original problem dimensionality takes place. In this article, by reducing the dimension of the data we mean reducing the number of input variables through the introduction of factors.
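
As a numerical illustration of model (1), the following Python sketch (not part of the original study) builds synthetic data from m common factors and specific factors; the sizes and loading values are arbitrary and serve only to show how m factor scores can replace n observed parameters when \(m<n\).

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_params, m = 500, 6, 2                   # hypothetical sizes: 6 parameters, 2 common factors

F = rng.normal(size=(n_obs, m))                  # primary factors F_j
A = rng.uniform(-1.0, 1.0, size=(n_params, m))   # factor loadings a_ij (arbitrary values)
U = 0.3 * rng.normal(size=(n_obs, n_params))     # characteristic (specific) factors u_i

X = F @ A.T + U                                  # observed parameters, Eq. (1), one column per X_i

print(X.shape, "->", F.shape)                    # (500, 6) -> (500, 2): m factor scores per device
```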

The quality improvement is achieved both by more coordinated functioning of radio elements with identical characteristics (from a single production batch) and by improving the quality and reliability of the results of destructive testing, for which it is possible to select elements from each production batch [1]. This paper is devoted to the problem of reducing the dimension of the original data for the corresponding cluster analysis problems and attempts to find an optimal set of informative features to be used in such cluster analysis optimization problems.

2 Data and Preprocessing

As an example of real data, in this paper we consider a sample consisting of seven different homogeneous batches. The sample is deliberately composed of homogeneous batches, some of which are extremely difficult to separate by known methods of cluster analysis.

The sample presented in this paper is one of the largest that the specialized test center has faced. The total number of devices in all batches is 3987: batch 1 contains 71 devices, batch 2 contains 116, batch 3 contains 1867, batch 4 contains 1250, batch 5 contains 146, batch 6 contains 113, and batch 7 contains 424. Each batch contains information about 205 measured input parameters of the device. Input parameters whose data vector contains only zero values, or for which the share of non-zero values does not exceed 10%, were excluded from consideration. After this filtering, 67 input parameters remain for further processing.
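
A possible implementation of this filtering step is sketched below, assuming the measurements are stored in a pandas DataFrame with one column per parameter; the function name and the DataFrame `measurements` are illustrative, and the 10% threshold follows the description above.

```python
import pandas as pd

def filter_parameters(df: pd.DataFrame, min_nonzero_share: float = 0.10) -> pd.DataFrame:
    """Drop parameter columns that are all zeros or whose share of
    non-zero values does not exceed the given threshold."""
    nonzero_share = (df != 0).mean(axis=0)        # share of non-zero values per column
    return df.loc[:, nonzero_share > min_nonzero_share]

# Usage (hypothetical): `measurements` holds 3987 devices x 205 measured parameters.
# filtered = filter_parameters(measurements)     # 67 columns are expected to remain here
```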

In the first step, the analysis of the input parameters showed that the considered set of parameters can be divided into three groups:

  1. parameters for which the histograms represent the normal Gaussian distribution (In21-In28, In39-In46, In92-In107);

  2. parameters for which the histograms represent a Gaussian distribution with frequency gaps (In84-In91);

  3. parameters for which the histograms do not correspond to Gaussian distributions (In57-In64, In75-In82, In10-In20).

Fig. 1. Histogram of observed frequencies and graphs of the fit of the distributions. Normal Gaussian distribution

For each group, the histograms of observed frequencies and graphs of the fit of the distributions are given, using several input parameters as examples (Figs. 1, 2 and 3).

Fig. 2. Histogram of observed frequencies and graphs of the fit of the distributions. Gaussian distribution with frequency gaps

In the second step, the parameters were normalized according to Eq. (2),

$$\begin{aligned} a_{i,k}=\frac{a_{i,k}^*-\overline{a_k^*}}{\delta _k^{max}-\delta _k^{min}} \end{aligned}$$
(2)

where \(a_{i,k}^*\) is the value of the measured parameter before normalization, \(\overline{a_k^*}\) is the average value of the parameter, and \(\delta _k^{min}\) and \(\delta _k^{max}\) are the lower and upper bounds of the parameter drift, respectively. The drift is the amount of change of an ERC parameter arising during the additional non-destructive testing that simulates extreme operating conditions. This method of normalization by the drift bounds was proposed in [1]. It was shown experimentally that this method of normalization gives a separation into production batches with a much smaller number of errors.
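
A minimal sketch of the normalization (2), assuming the raw measurements and the per-parameter drift bounds \(\delta _k^{min}\) and \(\delta _k^{max}\) are available as NumPy arrays (the function name is illustrative):

```python
import numpy as np

def normalize_by_drift(a_star: np.ndarray,
                       delta_min: np.ndarray,
                       delta_max: np.ndarray) -> np.ndarray:
    """Normalize each parameter k by centering on its mean and dividing by the
    width of its drift interval, following Eq. (2)."""
    mean_k = a_star.mean(axis=0)                 # column-wise averages of the raw values
    return (a_star - mean_k) / (delta_max - delta_min)

# a_star: devices x parameters matrix of raw measurements;
# delta_min, delta_max: per-parameter drift bounds obtained from the additional
# non-destructive tests (assumed to be available).
```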

Fig. 3. Histogram of observed frequencies and graphs of the fit of the distributions. The histogram does not correspond to a Gaussian distribution

3 Factor Analysis Using Pearson’s Correlation Matrix

In the first step, we determine the Pearson correlation coefficient matrix [9] for the input parameters. In the second step, we determine the matrix of factor loadings. Assuming the orthogonality of the factors, we obtain

$$\begin{aligned} R=A \cdot A^T \end{aligned}$$
(3)

where R is the correlation matrix and A is the factor loadings matrix.

The number of factors in the factor model was determined by two criteria. The first of them, the Kaiser criterion [10], selects factors with eigenvalues greater than one. However, the number of sufficient factors also depends on the total share of variance reproduced by these factors. The second of them, the Cattell scree criterion [11], selects factors from a scree plot of the factor eigenvalues. The number of factors is defined at the point on the plot where the decrease of eigenvalues from left to right slows down most sharply. Since the Kaiser criterion selects factors with eigenvalues greater than one, and the Cattell scree criterion involves visual inspection of the scree plot, no specialized software is needed to apply these criteria.
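
The following sketch shows how these two ingredients could be computed in Python; combining the Kaiser rule with the 70% cumulative-variance requirement via the maximum of the two counts is our reading of the procedure used in this paper, and all names are illustrative.

```python
import numpy as np

def select_factor_count(data: np.ndarray, min_explained: float = 0.70):
    """Number of factors from the eigenvalues of the Pearson correlation matrix:
    Kaiser's rule (eigenvalues > 1) combined with a cumulative-variance floor."""
    R = np.corrcoef(data, rowvar=False)              # Pearson correlation matrix
    eigvals = np.linalg.eigvalsh(R)[::-1]            # eigenvalues, descending
    explained = np.cumsum(eigvals) / eigvals.sum()   # cumulative share of variance

    n_kaiser = int(np.sum(eigvals > 1.0))                          # Kaiser criterion
    n_variance = int(np.searchsorted(explained, min_explained)) + 1
    # eigvals, plotted against their index, give the scree for Cattell's criterion
    return max(n_kaiser, n_variance), eigvals
```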

In addition, to simplify the factor structure, rotation is used to find one of the possible coordinate systems in the space of factors. The consequence is that high correlations are maximized and low correlations are minimized. The rotation problem is formulated as follows [9]: we need to find a transformation matrix T such that

$$\begin{aligned} A^{*}=A \cdot T\;\;\;\;\;\; R=A \cdot A^T=A^{*} \cdot A^{*T} \end{aligned}$$
(4)

The following methods of orthogonal rotation are used in this paper: Varimax with Kaiser normalization and Quartimax with Kaiser normalization [12]. Varimax rotation maximizes the total variance of the squared loadings of the common factors for each input attribute. Quartimax rotation is based on the fact that the sum of squares of pairwise products of the elements of matrix A decreases as the loadings tend to zero.
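
As a rough open-source stand-in for this extraction-plus-rotation step, scikit-learn's FactorAnalysis supports 'varimax' and 'quartimax' rotations; note that it uses a maximum-likelihood-style extraction rather than the principal components extraction used later in this paper, so the resulting loadings will not reproduce the numbers reported below. The random data matrix is a placeholder for the normalized measurements.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
X_norm = rng.normal(size=(400, 20))        # placeholder for the normalized measurements

for rotation in ("varimax", "quartimax"):
    fa = FactorAnalysis(n_components=5, rotation=rotation, random_state=0)
    scores = fa.fit_transform(X_norm)      # factor values for each device
    loadings = fa.components_.T            # parameters x factors loading matrix
    print(rotation, loadings.shape, scores.shape)
```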

Various combinations of batches were subjected to factor analysis: the full mixed lot and its subsets consisting of four, three and two batches. The full mixed lot consists of seven homogeneous batches. The mixed four-batch lot consists of batch 1, batch 2, batch 5, and batch 6. The mixed three-batch lot consists of batch 1, batch 2, and batch 6. The mixed two-batch lot consists of batch 1 and batch 2.

In this paper, the number of factors was determined by the Kaiser criterion, with the requirement that the total proportion of variance reproduced by these factors be at least 70%.

4 Computational Experiments with Various Compositions of the Mixed Lot

To extract factors, we used the principal components method, the principal factor method with multiple R-square, the principal axes method, the maximum likelihood factors method, the iterated communalities method (MINRES) and the centroid method [9]. For further consideration, we used the principal components method, since it describes the maximum variance of the input parameters.

For the whole mixed lot, the Cattell criterion recommends selecting 4 factors in the model, and this number does not change with any rotation (Fig. 4). According to the Kaiser criterion, taking into account the requirement of at least 70% of the total variance, five factors are selected. Uberla [9] recommends selecting the larger number in disputed cases, therefore we allocate 5 factors for further consideration. Factor 1 corresponds to the highest loadings on the parameters In92-In107. This factor describes 22.779–23.954% of the total variance. Factor 2 corresponds to the highest loadings on the parameters In58-In64 and In76-In82. This factor describes an additional 19.335–21.265% of the total variance. Factor 3 corresponds to the highest loadings on the parameters In39-In46. This factor describes an additional 12.300–14.776% of the total variance. Factor 4 (parameters In10, In11, In13, In14, In18) describes 9.003–9.375% of the total variance. Factor 5 (parameters In21-In28) describes 6.781% (unrotated), 11.928% (Varimax) and 11.993% (Quartimax) of the total variance. Regardless of the rotation method, the final solution has a cumulative percentage of the total variance of 75.794% (Table 1).

Table 1. Rotation of factor structure. Full mixed lot
Fig. 4. Scree plot for the whole mixed lot (Adv.Grapher)

The total number of devices in the mixed lot composed of four batches is 446. For further processing, 62 input parameters remain. The Cattell criterion, regardless of the rotation, recommends selecting 4 factors in the model (Fig. 5); however, according to the Kaiser criterion, taking into account the requirement of at least 70% of the total variance, we allocate 6 factors. Substantial loadings on Factor 1 appear for the parameters In21-In28 and In39-In46. This factor describes 23.304–38.622% of the total variance. Factor 2 shows substantial loadings for the parameters In58-In64 and describes an additional 13.220–17.761% of the total variance. Factor 3 has substantial loadings for In91-In107, Factor 4 for In79-In82, and Factor 6 for In57 and In78. Regardless of the rotation method, the final solution has a cumulative percentage of the total variance of 70.364% (Table 2).

Table 2. Rotation of factor structure. Four-batch mixed lot
Fig. 5. Scree plot for the four-batch mixed lot (Adv.Grapher)

The total number of devices in the mix of three batches is 300. The Cattell criterion, regardless of the rotation, recommends selecting 3 factors in the model (Fig. 6). According to the Kaiser criterion, taking into account the requirement of at least 70% of the total variance, we also allocate 3 factors. Substantial loadings on Factor 1 appear for the parameters In21-In28 and In39-In46. This factor describes 37.09–46.39% of the total variance. Factor 2 has substantial loadings for In92-In107 and describes 22.03–26.61% of the total variance. Factor 3 has substantial loadings for In84-In91 and describes an additional 9.192–13.905% of the total variance. Regardless of the rotation, the final solution has a cumulative percentage of the total variance of 77.61% (Table 3).

Fig. 6. Scree plot for the three-batch mixed lot (Adv.Grapher)

Table 3. Rotation of factor structure. Three-batch mixed lot

The number of devices in the simplest mixed lot of two batches is 187. According to the Kaiser criterion, taking into account the requirement of at least 70% of the total variance, we allocate 2 factors. Factor 1 shows the highest loadings for the parameters In21-In28 and In39-In46 and describes 45.41–66.20% of the total variance. Factor 2 shows the highest loadings for In92-In95, In100-In102 and In106, and describes an additional 7.28–28.07% of the total variance. Regardless of the rotation, the solution has a cumulative percentage of the total variance of 73.48% (Table 4).

Table 4. Rotation of factor structure. Two-batch mixed lot

5 Adequacy of the Factor Model

Verification of the sufficiency of the number of factors in the model was performed using the Kaiser and Cattell criteria. Verification of the adequacy of the factor model reduces to checking whether a simple structure is achieved. A simple structure is a configuration of vectors rotated to a state in which the vast majority of vectors lie on or near the coordinate hyperplanes [9]. In addition, the simple structure is “contrast”: factor loadings are high for the variables that determine a given factor and close to zero for all others. To test the significance of a simple structure in various areas of research, the modern scientific literature offers the Bargmann test [9], the Lawley-Bartlett test [9], the Bartlett-Wilks test [9] and the Burt test [9]. In this paper, we use Bargmann's test [13] because this criterion is able to show that the principal axis rotation procedure is not completed and to control the density of the variable positions. It is necessary to calculate the number of zero loadings for each factor:

$$\begin{aligned} \left| \frac{a_{ij}}{h_i}\right| <0.1 \end{aligned}$$
(5)

where \(a_{ij}\) are the factor loadings on each parameter and \(h_i\) is the square root of the communality (the communality refers to the variance of a parameter due to the common factors). If the number of zero loadings is not lower than the tabulated value, the simple structure is considered to be achieved.
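
A sketch of the zero-loadings count in (5) is given below; the tabulated critical values of Bargmann's test are not reproduced here, so the function only returns the per-factor counts to be compared with the table (names are illustrative).

```python
import numpy as np

def zero_loading_counts(loadings: np.ndarray, threshold: float = 0.1) -> np.ndarray:
    """Count 'zero' loadings per factor according to Eq. (5):
    |a_ij / h_i| < 0.1, where h_i^2 is the communality of parameter i."""
    h = np.sqrt((loadings ** 2).sum(axis=1))         # square roots of communalities
    ratios = np.abs(loadings / h[:, None])
    return (ratios < threshold).sum(axis=0)          # one count per factor

# The returned counts are compared with the tabulated critical values of
# Bargmann's test at the chosen significance level (e.g. alpha = 0.05 or 0.25).
```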

For the full mixed lot, the Bargmann test is satisfied for 3 of 5 factors in the case of the unrotated structure and for all factors in the case of rotation with \(\alpha \le 0.05\) (where \(\alpha \) is the level of significance). For the four-batch mixed lot, the test is satisfied for 3 of 6 factors in the case of the unrotated structure and for 4 of 6 factors in the case of the rotated structure with \(\alpha \le 0.25\). For the three-batch mixed lot, the test is satisfied for just 1 factor in the case of the unrotated structure and for 2 factors in the case of the rotated structure with \(\alpha \le 0.25\). For the two-batch mixed lot, the test is satisfied in one case with \(\alpha \le 0.25\) (Table 5).

Table 5. Bargmann test

Analysis of the percentage of zero loadings shows that, for any rotation, the number of cases for which the Bargmann test is satisfied increases as the number of batches increases.

The factor values obtained by the orthogonal rotations described above are used as input data for the clustering algorithms. Clustering was performed with the Deductor Studio Academic tool. The EM algorithm was applied with a lower bound of the likelihood of 0.2, an accuracy level of \(10^{-5}\) and a maximum of 100 iterations. Self-organizing Kohonen maps (SOM) [14] were applied with linear initialization with eigenvalues, a bubble neighborhood function and a significance level of 0.1%. The clustering accuracy for the considered mixed lots with different orthogonal rotations is presented in Table 6.
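
Since Deductor Studio Academic is a proprietary tool, the sketch below uses scikit-learn's GaussianMixture, an EM-based algorithm, as an approximate stand-in applied to the factor scores, with a comparable tolerance and iteration limit; the SOM step and the Deductor-specific settings are not reproduced, and the random data is a placeholder for the factor values.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
factor_scores = rng.normal(size=(300, 3))   # placeholder for the factor values of a mixed lot

gmm = GaussianMixture(n_components=3,       # number of homogeneous batches in the lot
                      covariance_type="full",
                      tol=1e-5, max_iter=100, random_state=0)
labels = gmm.fit_predict(factor_scores)     # EM-based cluster label for each device
print(np.bincount(labels))
```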

The analysis of Table 6 shows that, for any orthogonal rotation and clustering algorithm, the clustering accuracy increases from 39% up to 98% as the number of homogeneous batches in the sample decreases.

Table 6. Clustering results

Clustering results on the three-batch and two-batch mixed lots are shown in Figs. 7 and 8, respectively. In both cases, separation of the batches takes place exclusively along Factor 1.

Fig. 7. Clustering results for the three-batch mixed lot

Fig. 8. Clustering results for the two-batch mixed lot

6 Conclusions

The possibility of using factor analysis for the separation of a mixed lot consisting of an arbitrary number of homogeneous batches of electronic radio components has been proposed and described in this paper. The use of the factor model is appropriate for improving the accuracy of batch separation, regardless of the clustering algorithm used. It is shown that the optimal number of selected factors depends on the number of devices considered in the mixed lot, as well as on the measured input parameters of the devices in a given sample. Regardless of the type of orthogonal rotation, the clustering accuracy decreases as the number of homogeneous batches in the mixed lot increases. A similar result was shown earlier in [6, 7], where an ensemble approach of clustering algorithms was used, and in [5], where the efficiency of the EM algorithm on small volumes of input data was demonstrated. At the same time, the considered factor analysis methods do not allow us to obtain a universal set of a small number of features for the separation of a mixed lot consisting of an arbitrary number of homogeneous batches. Thus, although the proposed method makes it possible to somewhat reduce the dimensionality of the data, the use of multidimensional data remains inevitable for reliable separation of homogeneous batches with cluster analysis methods.