

# **Integrated circuit probe card troubleshooting based on rough set theory for advanced quality control and an empirical study**

**Chen‑Fu Chien1,2 · Hsin‑Jung Wu1**

Received: 22 June 2021 / Accepted: 11 October 2022 / Published online: 3 November 2022 © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022

#### **Abstract**

Wafer probe test plays a crucial role to distinguish the good dies from the remaining defected dies on the wafers via the probe card as the testing signal interface between the tester and the integrated circuits on the fabricated wafers. Unexpected probe card failures that happen during the testing process will affect testing quality, reduce overall equipment efficiency and productivity. In practice, the engineers rely on domain knowledge and the process of trial and error for fault diagnosis and troubleshooting. However, as the IC device features are continuously shrinking with an increasing number and density of the bond pads of the circuits on the wafer, fault diagnosis and troubleshooting for probe card have become complicated and timeconsuming. To fll the gap, this study aims to develop a data-driven framework that integrates rough set theory and domain knowledge to derive effective decision rules to enhance the decision quality and efficiency for advanced quality control and smart manufacturing. An empirical study was conducted in a leading semiconductor testing company in Taiwan for validation. The proposed framework can shorten fault diagnosis procedure and enhance productivity, while enhancing probing test integrity to reduce both the producer risk and customer risk. The developed solution is implemented in real setting.

**Keywords** Rough set theory · Probe card · Industry 3.5 · Advanced quality control (AQC) · Semiconductor manufacturing

# <span id="page-0-0"></span>**Introduction**

Semiconductor products have become one of the most important components of various supply chains that leading countries have reemphasized the importance of semiconductor manufacturing (Chien et al., [2020b\)](#page-12-0). Indeed, driven by Moore's Law (Moore, [1965](#page-12-1)) that the number of transistors fabricated on a wafer will be doubled every 12 or 24 months, semiconductor industry has continuously migrated the technologies to shrink the feature size of the integrated circuit (IC). Thus, semiconductor industry is highly capital intensive, with complex and lengthy manufacturing process. In order to meet customer requirements and maintain competitive advantage, continuous yield enhancement for overall wafer effectiveness is critical for the semiconductor manufacturing companies (Chien & Hsu, [2014](#page-11-0); Chien et al., [2013a,](#page-11-1) [2020a\)](#page-11-2). In particular, wafer probe test plays a crucial role to distinguish the good dies from the remaining defected dies on each of the fabricated wafers that can reduce both the producer risk and customer risk via efective probing test. However, as the IC device features are continuously shrinking via technology migration, the fabrication processes as well as IC testing are increasingly complicated.

Semiconductor manufacturing consists of two phases for IC testing including circuit probing conducted on each fabricated wafer for sorting and fnal testing performed on the packaged IC (Hsu & Chien, [2007\)](#page-12-2). The functionality and design specifcations of each die can then be ensured by conducting IC testing. IC probe card is a crucial component as the signal interface between the tester and the tested wafers to detect failures against the designed electrical specifcation such as current, voltage, leakage, trigger and other functional speed as illustrated in Fig. [1.](#page-1-0) To obtain reliable test results, the probe card that is characterized by a set of mechanical and electrical parameters should match with the tester and the devices to be tested such as device size and shape, the number of bond pads, and signal characteristics. Advanced

 $\boxtimes$  Chen-Fu Chien cfchien@mx.nthu.edu.tw

 $1$  Department of Industrial Engineering and Engineering Management, National Tsing Hua University, 101 Section 2 Kuang Fu Road, Hsinchu 30013, Taiwan

<sup>2</sup> Artifcial Intelligence for Intelligent Manufacturing Systems (AIMS) Research Center, Ministry of Science & Technology, Hsinchu 30013, Taiwan



<span id="page-1-0"></span>**Fig. 1** IC probe card and wafer probing test

quality control (AQC) is critical for yield enhancement in advance among the suppliers of semiconductor supply chain for virtual vertical integration to address the increasing challenges for maintaining technology migration for IC shrinkage (Chien et al., [2020a](#page-11-2)). Since probe card quality will afect the sensitivity and specifcity of testing outcomes, it is important to enhance advanced quality control of the probe cards to ensure data integrity of IC probe testing results.

As one of the suppliers for semiconductor supply chain, it is difficult for probe card manufacturers to verify the usage quality of their products in real setting. Design of experiments will be employed for electromagnetic analysis and circuit model verifcation of new probe cards for in-house research and development. However, owing to trade secret and confdentiality, little usage data can be collected from the customers for probing test on their IC devices. Thus, most of the probe card manufacturers reply on domain knowledge and the process of trial and error for troubleshooting. With the advancement of semiconductor process technology and shrinking critical dimensions, the diagnosing and troubleshooting procedure for probe card has become much more complicated and time consuming (Fu et al., [2022\)](#page-12-3). The probe card fault diagnosis procedure will differ based on multiple machine parameter settings and the types of failures. Thus, it is challenging for new engineers lacking sufficient fault diagnosis know-how, increasing the difficulty for finding effective troubleshooting solutions in short time. Therefore, it is crucial to develop a systematic fault diagnosis framework to improve the efficiency of probe card troubleshooting.

Once probe card fault occurs during wafer testing, probe card manufacturer needs to conduct a series of on-site function testing to clarify the fault issues and then execute troubleshooting solutions to ensure the testing quality of restored testers. The whole fault diagnosis process may take up to30 days, causing long machine downtime. In practice, as the customer inform a fault occurred, the engineers rely on domain knowledge and the process of trial and error for fault diagnosis and troubleshooting that are time-consuming. However, little research has been done to conduct probe card fault diagnosis with a systematic and efective method. Furthermore, since the customers will not share on-site testing data for proprietary information protection, only limited amount of data can be collected in real settings. Thus, potential data-driven approaches such as decision tree or random forest are limited by the data of the present problem. Rough set theory (RST) has been efectively employed for data mining and knowledge discovery in the incomplete or insufficient information (Chien & Chen,  $2007$ ; Kusiak,  $2001$ ; Pawlak, [1982](#page-12-5), [1997;](#page-12-6) Peng et al., [2017\)](#page-12-7).

Focusing on real setting to address the limitation and data quality, this study aims to construct a data-driven framework that integrates RST and domain knowledge to enhance the efficiency and decision quality of probe card troubleshooting for advanced quality control and smart manufacturing for Industry 3.5. To ensure data integrity for further analysis, data preparation is conducted based on domain knowledge. Domain knowledge will be incorporated to verify the derived rules from RST and thus support the decision making of on-site service engineers for troubleshooting efectively. An empirical study was conducted in a leading semiconductor testing company in Taiwan for the approach validation. The results have shown the efectiveness and feasibility of the proposed approach than the existing approach of trial and error to derive decision rules to efectively support the engineers shortening troubleshooting and equipment downtime. Indeed, the proposed framework is implemented in this company.

This paper is organized as follows. ["Introduction](#page-0-0)" section introduces the background and motivation of this research. ["Literature review"](#page-2-0) section reviews the related studies on IC testing fault diagnosis and rough set theory method. "[UNI-](#page-3-0)[SON framework](#page-3-0)" Section proposes a probe card troubleshooting framework. ["Empirical study"](#page-8-0) section validates the proposed framework with empirical study in IC probe card troubleshooting. ["Conclusion](#page-10-0)" section concludes thus study with contributions, limitations and future research directions.

## <span id="page-2-0"></span>**Literature review**

### **IC testing fault diagnosis**

With the rapid development of consumer electronics, market demand of high-quality IC chips is continuously growing. Indeed, real-time monitoring of the manufacturing process is needed to support fault detection to efectively eliminate the cause of the faults and thus reduce abnormal yield loss (Chien et al., [2013a\)](#page-11-1). With the shrinkage of critical dimensions and feature sizes of IC, semiconductor manufacturing process has become much more sophisticated, thereby increasing testing and package cost (Fu et al., [2022](#page-12-3)).

In order to ensure IC quality and save packaging cost, semiconductor testing is an important process that involves the entire semiconductor supply chain. To ensure the produced wafers meet specifcations, IC testing is performed to distinguish the good products from the defected ones. Each die on the fabricated wafer will be probing tested by the probe card of the tester to flter electrically dysfunctional dies. The testing results are shown as the wafer bin maps with spatial defect patterns for further diagnosis and yield enhancement (Chien et al., [2013b](#page-11-4)). Probe card that serves as the mechanical and electrical signal interface between the wafer and testing equipment is a critical component of the wafer testing system, since probe card quality strongly infuences IC yield rate. Misjudgment of specifcations is not allowed, a stable and reliable contact interface is needed, therefore high reliability of probe card is required in wafer testing. Figure [2](#page-3-1) illustrates the testing procedure for probe card.

The whole semiconductor manufacturing process involves hundreds of steps and various equipment are used, such as wafer fabrication equipment, wafer testing system and package machine. Since equipment breakdown may cause signifcant loss of productivity and proft, a systematic fault diagnosis approach is needed to eliminate the root cause quickly and efectively. Furthermore, it is crucial to enhance data integrity by combining the domain knowledge and careful data preparation to fnd potentially useful troubleshooting solutions.

A number of studies have been done for semiconductor equipment fault diagnosis. Hong et al. ([2011](#page-12-8)) presented a fault detection and classifcation (FDC) method for semiconductor manufacturing equipment by applying modular neural network in tool data modeling and address the associated uncertainty by Dempster-Shafer (D-S) theory. Li et al. ([2013\)](#page-12-9) developed a semiconductor equipment fault diagnosis expert system based on Bayesian networks with improved causal relationship questionnaire and probability scale methods as inference machine. Nawaz et al. [\(2014\)](#page-12-10) applied Bayesian networks to construct a semiconductor etching equipment fault diagnosis framework using principal component analysis (PCA) to analyze status variable identifcation (SVID) data. Kim et al. ([2017\)](#page-12-11) proposed a time of propagation delay pass fail approach to diagnose interconnect failures of high parallelism probe card. Khakifrooz et al. ([2018\)](#page-12-12) employed Bayesian inference for mining semiconductor manufacturing big data for defect detection and yield enhancement. Rostami et al. ([2018\)](#page-12-13) applied Support Vector Machine (SVM) to detect abnormal observations and extract and classify fault fngerprints. Fu et al. ([2022\)](#page-12-3) employed Bayesian network to model the causal relationship between diagnosis variables for IC testing probe card fault diagnosis. Little research has been done on probe card troubleshooting for advanced quality control.

## **Rough set theory**

RST is a data mining methodology to discover hidden patterns and derive useful decision rules under the incomplete or insufficient information (Pawlak, [1982\)](#page-12-5). Different from conventional approaches, RST does not need to make the assumptions about the independence of variables and normality of data distribution (Chien et al., [2016\)](#page-11-5). Furthermore, RST can deal with incomplete information for the present problem, while the "IF–THEN" rules derived by RST can provide simple and explainable information for the decision makers. Thus, RST is particularly suitable to support the

<span id="page-3-1"></span>**Fig. 2** Probe card testing



engineers who should employ a series of steps to detect the faults for troubleshooting.

A number of studies have applied RST in various domains such as semiconductor manufacturing (Kusiak, [2001\)](#page-12-4), printed circuit board manufacturing (Tseng et al., [2004\)](#page-12-14), decision rule mining for machining method chains (Wang et al., [2022](#page-12-15)), and product feature design (Chien et al., [2014](#page-12-16), [2016](#page-11-5); Wu et al., [2020\)](#page-12-17). In particular, Kusiak [\(2001\)](#page-12-4) introduced rough set theory and data mining to extract efective decision rules from data sets for making predictions in the semiconductor industry. Chien et al. ([2016\)](#page-11-5) developed a data-driven framework based on RST to efectively extract product visual aesthetics and identify useful design concepts for product design. Wu et al. ([2020](#page-12-17)) integrated rough set theory and information entropy to develop a knowledge recommender to enhances knowledge acquisition and reuse for new product development. Wang et al. [\(2022](#page-12-15)) developed a decomposition-reorganization approach for mining decision rules for machining method chains based RST.

Furthermore, RST has been applied for fault diagnosis in various areas including power system (Muralidharan & Sugumaran, [2013;](#page-12-18) Peng et al., [2004,](#page-12-19) [2017;](#page-12-7) Shen et al., [2000\)](#page-12-20), medical diagnosis (Jothi & Inbarani, [2016](#page-12-21)), and image classifcation (Hassanien et al., [2009\)](#page-12-22). Shen et al. [\(2000\)](#page-12-20) applied RST in diagnosing the valve fault for diesel engine and proposed a method suitable for discretizing the frequency and time domain attributes. Peng et al. [\(2004\)](#page-12-19) used RST as a data mining tool for fault diagnosis on distribution feeder, that useful patterns and rules are derived for faulty equipment diagnosis and fault location. Muralidharan and Sugumaran [\(2013](#page-12-18)) presented a rough set based rule generation and fuzzy classifcation of wavelet features for fault diagnosis of monoblock centrifugal pump. Peng et al. [\(2017\)](#page-12-7) applied RST in fault diagnosis of diferent cable fault types by rejecting interference signals and recognizing partial discharge signals from diferent sources, and results showed that the proposed method have higher accuracy than SVM and Back-propagation Neural Network. Jothi and Inbarani ([2016](#page-12-21)) presented a hybrid tolerance rough set and frefy based approach to classify MRI brain tumor image, that can select imperative features of brain tumor. Hassanien et al. ([2009\)](#page-12-22) presents a review of rough set and near set applications in medical imaging such as image segmentation, object extraction and image classifcation, while hybrid approaches including neural networks, particle swarm optimization, SVM, and fuzzy sets are discussed. Extensive review including extensions, theory and applications of RST can also be found, for example, in the research by Zhang et al. [\(2016](#page-12-23)). However, little research has been done in applying RST for probe card troubleshooting.

## <span id="page-3-0"></span>**UNISON framework**

Focusing on the present problem for probe card troubleshooting, this study develops a UNISON decision framework that integrates RST and domain knowledge to derive rules from incomplete information including six phases: understand and define the decision problem, identify the niche for decision quality improvement, structure the infuence relationships among uncertain events, sense and describe expected outcomes, judge and measure overall performance, and tradeoff and decision making, as illustrated in Fig. [3](#page-4-0). Indeed, UNSION framework has been revised and efectively applied to various contexts including IC fnal testing strategy (Chien et al., [2007\)](#page-12-24), overlay error reduction in semiconductor manufacturing (Chien & Hsu, [2011\)](#page-11-6), demand forecast (Chien et al., [2020b](#page-12-0); Fu & Chien, [2019\)](#page-12-25), product design innovation based on user experience (Chien et al., [2016](#page-11-5)), knowledge management (Hu et al., [2019\)](#page-12-26), IC probe

## **Understand and defne the problem**

Probe card is a highly customized product, its design may be diferent based on diferent tester and testing programs. With the technology migration of semiconductor manufacturing, probe card manufacturing has become much more complicated. Unexpected probe card failure happen during the testing process will cause machine downtime, which reduces overall equipment efficiency. Thus it is crucial for probe card



<span id="page-4-0"></span>**Fig. 3** RST-based UNISON framework for probe card troubleshooting

manufacturer to immediately respond and provide technical service to recover the failure as soon as possible and ensure customer satisfaction.

Probe card troubleshooting is challenging due to the problem complexity and information limitation (Fu et al., [2022\)](#page-12-3). There are many factors that affect the troubleshooting solutions, various solutions may map to the same abnormal situation under different circumstances. It is difficult to comprehensively consider all the possible infuential factors by human judgement, and even more difficult for new coming engineers to quickly identify efective troubleshooting solutions. Due to proprietary information protection of the IC testing companies, it is difficult to perform on-site testing and obtain complete probe usage information, only partial data can be collected, leading to incomplete or insufficient information.

## **Identify the niche**

The objective of probe card troubleshooting is to shorten diagnosis and processing time and improve IC testing efficiency. Each of the component in the test system may be an uncertain event that afects probe card troubleshooting decisions. The test system consists of the automated test equipment (ATE), probe card and device under test. ATE is an equipment that measures the functionality and performance of a device under test that consists of tester heads, prober and testing programs. The probe card includes printed circuit boards, electronic components, and probes. By constructing a rough set based decision analysis framework, experience and knowledge of domain experts are converted into systematic analytics for Industry 3.5 that employs intelligent solutions to empower decision makers (Chien et al., [2020b](#page-12-0); Fu et al., [2022;](#page-12-3) Lin et al., [2022\)](#page-12-27). To enhance data integrity, the data needed for analytics is also defned after discussion with the domain experts. With the proposed framework, relationship between uncertain events can be structured and troubleshooting solutions can be kept consistent, as well as improving the decision quality.

#### **Structure infuence relationship**

In real settings, troubleshooting actions and procedures will be diferent according to diferent product types and testing conditions. Based on domain knowledge, the infuence relationship for probe card troubleshooting processes was structured. Due to the difficulty of collecting on-site testing data, engineers often diagnose the fault cases using the abnormal situation data and product information available. Thus, the abnormality situation and product information are two main factors that infuence probe card troubleshooting solutions. Experienced domain experts have their own troubleshooting procedures for frequently occurring abnormal situations, but the expert knowledge is not recorded, leading to difficulty of experience sharing and knowledge management. Therefore, this study standardizes the troubleshooting procedure and derive useful decision rules from multiple attributes.

Data preparation process includes data collection, data inspection and cleaning, data transformation and partition to ensure the data is in good quality and can be efectively analyzed (Lee & Chien, [2022](#page-12-28)). First, three types of data were collected based on domain knowledge, including abnormality data, product data and troubleshooting solution data from historical probe card fault cases.

- (i) Abnormality data: The abnormality data includes the abnormal situation of the probe card and on-site testing fail items of the fault cases.
- (ii) Product data: The product data is the basic information related to each probe card, including product type, design house for the tested wafer, customer and tester type.
- (iii) Solution data: The troubleshooting solutions for the fault cases are regarded as the solution data. The solution data contains the solution codes and its efectiveness of solving the abnormal situation.

Second, consistency and completeness of the data is checked by correcting errors, removing noise and deleting incomplete objects. Third, the data was transformed into analytical form with condition and outcome attributes. Fourth, the collected data is partitioned into training dataset  $(k\%)$  and testing dataset (100-k%), where the decision rules were derived from the training dataset and the testing dataset were used for rule validation.

## **Sense and describe the outcomes**

RST based data analysis starts from a decision table, called an information system (Pawlak, [2002\)](#page-12-29). The decision table presents the relationship between the decisions that will be made when the data meets specifed conditions. In the decision table, each row stands for an object and each column stands for an attribute. The attributes can be further classifed into condition and decision attributes.

#### **Information system**

In particular, the information system S can be defned as follows:

$$
S = (U, A, V, f) \tag{1}
$$

where *U* denotes the universe of all objects $u_j, u_j \in U$ . A is the finite set of the attributes $a_k$ . V is the set of all attribute value, such that

$$
V = U_{a_k \in A} V_{a_k} \tag{2}
$$

where  $V_{a_k}$  is finite attribute domain of the attribute value of attribute  $a_k$ . Finally,  $f$  denotes an information function such that, for every  $u_i \in U$  and  $a_k \in A$ ,

$$
f(u_j, a_k) \in V_{a_k} \tag{3}
$$

#### **Indiscernibility relation**

In RST data analysis, considering specifc attributes, objects that have identical attribute values are indiscernible. For example, Objects 1 and 2 in Table [1](#page-6-0) have the same attribute in "Customer", indicating that Objects 1 and 2 were indiscernible in the attribute "Customer". Let D be a non-empty subset of set *A* of all attributes, i.e.,  $D \subseteq A$ . The D-indiscernibility relation  $(I_D)$  defines that the objects  $x_1$  and  $x_2$  are D-indiscernible with respect to D as follows:

$$
(x_1, x_2) \in I_D \iff f(x_1, a_d) = f(x_2, a_d) \forall a_d \in D \tag{4}
$$

Based on the indiscernibility relations, the object universe can be decomposed into blocks of indiscernible objects (i.e., elementary sets). For example, considering the attribute "design house", two elementary sets {1, 3, 5} and {2, 4} will be generated. That is, objects 1, 3, 5 are indiscernible since their "design house" is the same. Furthermore, if  $D = {``design house''}, "tester''}, then three elementary sets$  $\{1, 3\}, \{2, 4\}$  and  $\{5\}$  will be derived. The combination of the elementary sets  $I_D(.)$  having the indiscernible relations of  $I_D$  is denoted as UlI<sub>D</sub>. For the above example D = {"design" house", "tester"},  $UI_D = {\{1, 3\}, \{2, 4\}, \{5\}}$ .

Table [1](#page-6-0) illustrates an example decision table of five objects in the rows that are characterized with attributes in the columns including one decision attribute (i.e., troubleshooting solution) and four condition attributes (i.e., design house, customer, tester and fault problem).

#### **Approximation of sets**

<span id="page-6-0"></span>**Table 1 Illustr** 

In RST, the lower and upper approximations are used to represent the uncertain relationship among the objects. The lower approximation of a set consists of all the

D-indiscernible elementary sets included in X. That is, under given attributes, all the objects that can be certainly classifed are included. The upper approximation of a set is the union of indiscernible objects having non-empty intersection with X. That is, The upper approximation of a set consists of all the objects that is possible to be classifed into the set, considering specifc attributes.

The lower approximation of X in D is denoted as *D*(*X*), while the upper approximation of X in D is denoted as *D*(*X*). The defnition is as follows:

$$
\underline{D}(X) = \left\{ x \in U | I_D(x) \subseteq X \right\} \tag{5}
$$

$$
\overline{D}(X) = \left\{ x \in U | I_D(x) \cap X \neq \emptyset \right\}
$$
\n<sup>(6)</sup>

The boundary region  $BN<sub>D</sub>(X)$  is defined as the difference between the upper and lower approximation of X in D and could be denoted as follows:

$$
BN_D(X) = \underline{D}(X) - \overline{D}(X) \tag{7}
$$

For example, as shown in Table [1,](#page-6-0) the objects can be classified into two sets, i.e.,  $X_0$  and  $X_1$ , according to the decision attribute of "troubleshooting solution" as follows:

 $X0 = \{1, 2, 4\}$  denotes the objects using "SB0311" solution.

 $X1 = \{3, 5\}$  denotes the objects using " SB0501" solution.

Let D = {" design house", "tester"}, then  $UI_D = \{\{1,$ 3}, {2, 4}, {5}}. The objects that can be included in  $X_0$  is {2, 4}, which is the lower approximation of  $X_0$ . That is,  $DX_0 = \{2, 4\}$ . For the upper approximation,  $DX_0 = \{1, 2, 3, 4\}.$ 

Therefore, the boundary region can be derived as follows:

$$
BN_D(X_0) = DX_0 - \underline{DX}_0 = \{1, 2, 3, 4\} - \{2, 4\} = \{1, 3\} \tag{8}
$$

Furthermore, the accuracy of the approximation for  $X_i$  in D is defned as follows:

$$
\alpha_D(X_i) = \frac{card \underline{D}(X)}{card \overline{D}(X)}
$$
\n(9)



In which the cardinality, denoted as *card(.)*, is the number of objects in the set. The range of  $\alpha_D(X_i)$  is between 0 and 1. If  $\alpha_D(X_i) = 1, X_i$  is a exact set with respect to D, while  $\alpha_D(X_i)$  < 1,  $X_i$  is a rough set with respect to D. Following the example mentioned above, consider  $D = {``design house''},$ "tester"},

$$
\underline{D}X_0 = \{2, 4\} \Rightarrow \operatorname{card}(\underline{D}X_0) = 2\tag{10}
$$

$$
\overline{D}X_0 = \{1, 2, 3, 4\} \Rightarrow \operatorname{card}(\overline{D}X_0) = 4 \tag{11}
$$

Thus,

$$
\alpha_D(X_0) = \frac{cardDX_0}{card\overline{D}X_0} = \frac{2}{4} = 0.5\tag{12}
$$

Since the value of  $\alpha_D(X_0)$  is smaller than one,  $X_0$  is a rough set with respect to D.

#### **Attribute reduction and rule extraction**

A main purpose of the RST approach is to derive compact rules from the selected reducts, where reduct is a subset from the original set of attributes that can perceive the information as applying the whole set of attributes (Paw-lak, [1997\)](#page-12-6). Given a subset of attributes *D* ⊆ *A*, considering *a<sub>d</sub>* ∈ *D*. If  $I_D = I_{D-\lbrace a_d \rbrace}$ , then *a<sub>d</sub>* is dispensable in D; elsewise  $a_d$  is indispensable in D. Thus, dispensable attributes can be excluded without original classifcation. Let D and E have the equivalence relation over U, where D *⊆* AandE *⊆* A. The D-positive region of E is defned as:

$$
POS_{D}(E) = \bigcup_{X \in U/I_{E}} \underline{D}X
$$
\n(13)

that denotes the set of objects that can be categorized into the E-elementary sets employing the knowledge expressed by I<sub>D</sub>. Suppose  $a_i$  ∈ *D*, if  $POS_D(E) = POS_{D-\{a_i\}}(E)$ , then  $a_i$ is E-dispensable in D; otherwise  $a_i$  is E-indispensable in D. Reduct is the set of indispensable attributes, which is the essential knowledge that can capture all the fundamental concepts. The subset F of D is called the E-reduct if and only if F is the E-dispensable subset of F, where  $POS_{D}(E) = POS_{F}(E).$ 

For instances, suppose  $D = {^\circ}$  design house", "tester"} and  $E = \{$  "troubleshooting solution}, the elementary sets generated for D is  $U/I_D = \{\{1, 3\}, \{2, 4\}, \{5\}\}\$ , while the elementary sets generated for E is  $U/I_E = \{\{1, 2, 4\}, \{3, 4\}\}$ 5} }. Thus,  $POS_D(E) = \{2, 4, 5\}$ .

In RST, *Core*(*D*) is defned as the intersection of the reducts as follows:

$$
Core(D) = \cap Reduct(D)
$$
\n(14)

where *Reduct*(*D*) is the set of all the reducts of D. In particular, *Core*(*D*) contains the most important subset of attributes, in which any removal of an attribute would decrease the classifcation power.

Moreover, for better explainability, the derived reducts can be transformed into decision rules based on the essential attributes. Following the above example, Objects 2, 4 and 5 belong to E-positive region of D, with the same condition attributes of reduct= ${$ " design house", "tester" }. The derived rules are follows:

IF "Design house= DH2"" and "Tester= ND2", THEN "Troubleshooting solution=SB0311." IF "Design house= DH1" and "Tester= ND2", THEN "Troubleshooting solution=SB0501."

#### **Rules generation**

RST is applied to generate the reducts and derive candidate decision rules from the training dataset, which is shown as follows.

- Step 1: Construct the decision table. If there are numerical attributes, data discretization must be performed to transform the continuous data into several intervals; otherwise, go to Step 2.
- Step 2: Generate all possible reducts with removing inessential attributes and retain attributes which have obvious relationship with decision variables.
- Step 3: Evaluate the generated reducts based on domain knowledge and obtain the sets of satisfed reducts.
- Step 4: Generate rules from the extracted reducts.
- Step 5: Delete the rules where the decision attributes are not the focus class.
- Step 6: Calculate the support value for each generated rule.
- Step 7: For each generated rule, if support value  $\geq$  threshold value, the rule is then selected and added to the candidate rules pool.
- Step 8: Arrange all the candidate rules as output.

#### **Overall judgment and measurement**

Three indices of support, confdence, and lift were applied to validate the rules derived from the proposed approach.

Firstly, support denotes the signifcance of the candidate rule. Secondly, confdence denotes the prediction accuracy of the rule. Thirdly, lift is the information gain ratio of the rule.

The following steps are employed to examine the objects in the testing data set to estimate the validity of the derived rules.

- Step 1: Define the threshold for confidence and lift as  $\theta_c$ and $\theta_l$ .
- Step 2: Input the testing dataset to calculate confidence and lift value for each candidate rule.
- Step 3: If confidence value  $> \theta_c$  and lift value  $> \theta_l$ , select this rule into the decision rules pool; or else delete this rule.
- Step 4: After checking the confdence and lift value for all the decision rules, have a discussion with domain experts to confrm the meaning and interpretation of the derived rules.

## **Trade‑of and decision**

By applying RST analysis, troubleshooting suggestions can be provided for diferent circumstances. While the threshold of support can be set to denote the signifcance of the candidate rules to be investigated, the confdence of a derived rule denotes a percentage value showing how frequently the rule is correct. Thus, the threshold for confdence should be higher is better. Nevertheless, since probe card manufacturers need to eventually fnd out defect causes for troubleshooting. We used confdence level to prioritize the derived rules in order to support troubleshooting process. Lift is a measure of the performance of a derived rule based on RST at predicting as having an enhanced response with respect to the measure based on a random choice with the population as a whole. Thus, the threshold for lift should be larger than 1. The decision rules can be further expressed in "IF–THEN" form, thus the relationships between the abnormal situations, product information and troubleshooting solutions can be structured. Important features that mainly afect the troubleshooting solutions are also identifed.

In addition, the derived rules were discussed with domain experts for further verifcation and implication interpretation in probe card troubleshooting. With useful information provided by the decision rules, engineers can improve the efficiency and decision quality of probe card fault diagnosis. Since new abnormal situations may appear from time

to time, the decision rules obtained from RST should be updated regularly to capture the latest information.

## <span id="page-8-0"></span>**Empirical study**

An empirical study was conducted in a leading semiconductor testing company in Taiwan for validation. Historical data provided by the case company are systematically transformed for proprietary information protection.

## **Understand and defne advanced quality control for probe cards**

The case company is customer oriented, providing testing related products for various industries. Although the case company aims to enhance customer competiveness by developing latest technology and advanced manufacturing techniques, it is challenging to avoid abnormal situations of probe card, since its quality cannot be tested before actual usage during the wafer testing process. It is crucial for the case company to provide local maintenance services to ensure customer satisfaction. In actual situations, customer service engineers need to frst clarify with the customer the abnormal situations happened, and then provide the relevant information to troubleshooting engineers for further analysis and troubleshooting. This case company is a probe card manufacturer with limited and incomplete testing information that can be collected from experiments and the manufacturing process while the customers may not be able to share usage information in real settings.

#### **Identify the niche for probe card troubleshooting**

The possible niche for the problem lies in fnding the possible infuence factors for probe card troubleshooting based on domain expertise to improve IC testing efficiency. Through integration of data mining techniques and domain knowhow, useful decision rules for probe card troubleshooting can be generated for diferent abnormal situations, shorten fault diagnosis time and reduce equipment downtime, to support probe card quality improvement and subsequent packaging cost reduction. Uncertainties that mainly afects probe card troubleshooting includes ATE, DUT and probe card. In the troubleshooting process, engineers frst need to validate condition of the ATE by using the probe card on another ATE. After checking the ATE is in normal condition, perform testing on another DUT to ensure that the testing failure is not caused by the previous DUT. By executing the above mentioned steps, uncertainties of the frst two factors can be eliminated. Digital decision solutions should be developed to support the engineers for troubleshooting via a number of decision rules to diagnose the faults efectively.

# **Structure the infuence relationships of probe card defects**

To empower digital decision and improve probe card troubleshooting efficiency, it is crucial to transform domain expertise into record data and construct a systematic troubleshooting approach. According to the proposed framework, product information and abnormal situation are two important factors that affects the troubleshooting procedure, where the abnormal situation data is collected from the testing machine. For analytics purpose, the troubleshooting solution data are transformed from textual narratives to systematic coding. For example, the solution described as "Add a capacitor on the spider side probe", can be extracted as "Electrical components-capacitor-specifcations-add", named as "SD111". Following this procedure, 15 abnormal situations and 248 solution codes for probe card troubleshooting were defned based on domain knowledge and historical data.

For data preparation, historical probe card fault cases data including product data, abnormality data and solution data were collected, and then merged into a dataset referring to record ID for decision table construction. The proposed RST decision system consists of condition attributes (1 for product information and 3 for abnormal situation) and 1 decision attribute (troubleshooting solutions), as shown in Table [2](#page-9-0). For data cleaning, the missing and inconsistent data were completed and corrected by verifying the rationality of results with domain experts. In addition, 21 objects including 21 types of troubleshooting solution were also removed from the target data set, because none of them had sufficient supporting objects. (with less than 2 supporting objects). After the data cleaning process, we collected 215 pieces of data regarding the historical probe card fault cases and troubleshooting solution. The levels of the attributes is shown in Table [3](#page-9-1). Finally, the data were randomly divided

<span id="page-9-0"></span>**Table 2** Illustration of probe card troubleshooting

| <b>Item</b> | Condition attributes |                              |                               |              | Decision<br>attribute |
|-------------|----------------------|------------------------------|-------------------------------|--------------|-----------------------|
|             | Customer             | Abnor-<br>mal situa-<br>tion | Charac-<br>teristic<br>result | Fault source | Solution code         |
| 1           | OC01                 | <b>SCAN</b>                  | CR05                          | Voltage      | SF131                 |
| 2           | OC01                 | <b>SCAN</b>                  | CR <sub>06</sub>              | Trigger      | SF131                 |
| 3           | OC <sub>02</sub>     | <b>SCAN</b>                  | CR09                          | Voltage      | SF131                 |
| 4           | OC02                 | <b>SCAN</b>                  | CR10                          | Voltage      | SD111                 |
| 5           | OC02                 | <b>SCAN</b>                  | CR10                          | Voltage      | SF151                 |

<span id="page-9-1"></span>**Table 3** Number of levels of the attributes



into training dataset that contained 172 objects (80%) and testing dataset that contained 43 objects (20%).

# **Sense and describe the outcomes of derived decision rules**

Following the proposed framework, RST is applied to derive candidate decision rules for probe card troubleshooting. Due to the less amount of data collected and problem complexity, candidate rules were selected if there were at least two items supporting the rules. The decision system comprised 4 condition attributes and one troubleshooting solution code; 46 candidate rules were derived, and a sample of them is shown in Table [4.](#page-10-1) Rule 1 was a reduct with only one attribute, and 8 supporting data, meaning "IF the customer is OC1, THEN the solution code is SF131". In addition, rule 4 was another reduct with three attributes and 6 supporting data, meaning "IF the customer is OC1, abnormal situation is ATPG, and characteristic result is Voltage, THEN the solution code is SF131".

# **Overall judgment and measurement for decision support**

To estimate the validity of the derived candidate rules, the threshold of the lift was 1. The confdence level denotes the one-shot probability that is the probability for successful troubleshooting via frstly employing this rule. A sample of the results of the candidate rules is shown in Table [5](#page-10-2). Following the result of the frst cross validation, take Rule 1 in Table [5](#page-10-2) as an illustration. Rule 1 could be accepted since its confdence value 0.5 is larger than the one-shot probability threshold of 0.25 and the lift is 1.95 that is larger than the lift threshold. However, Rule 4 would be rejected since the lift is 0.98 less than 1, though its confdence value 0.25 is larger than the one-shot probability of 0.06. Therefore, through the validation process for the testing data, 39 rules were extracted.

Among the rules extracted, there are rules with the same condition attributes, but diferent solution code, which can be further sorted according to the confdence level, with a sample of rules for ATPG abnormal situation are illustrated <span id="page-10-1"></span>**Table 4** Partial candidate rules for probe card troubleshooting

suggested solutions for "C



#### <span id="page-10-2"></span>**Table 5** The validation results of partial candidate rules



<span id="page-10-3"></span>

in Table [6](#page-10-3). Where engineers can adopt the rules with a higher accuracy frst, which implicates a higher probability to successfully fx the abnormal situation. Through the derived ''IF–THEN" rules for probe card troubleshooting, the relationships between the abnormal situations, product information, and the troubleshooting solutions can be identifed.

#### **Trade‑of and decision for troubleshooting**

Based on the proposed approach, decision rules for probe card troubleshooting are extracted based on historical fault cases and domain knowledge, where the validity and implication of the results are checked by discussion with domain experts. The derived rules which passed the validation criteria can provide insight for troubleshooting engineers to identify the key factor that infuences the troubleshooting solutions, as well as providing information for troubleshooting solutions suggestions, which is benefcial for improving probe card troubleshooting efficiency. Indeed, when taking into account the factor of customer, the case company can provide extra care and service for specifc important customers to enhance their satisfactory. Furthermore, the decision rules can be especially useful in providing troubleshooting solution suggestions when limited information are available, which can also speed up the time needed for fault analysis. The derived rules according to the ranking of confdence and life can be employed to prioritize the potential troubleshooting solutions for support the engineers in light of the obtained information to enhance the efectiveness and efficiency for advanced quality control of probe cards. The developed solution is thus implemented in this company.

## <span id="page-10-0"></span>**Conclusion**

Focusing on realistic needs, this study has developed a systematic framework that integrates RST and domain knowledge to systematically derive decision rules from incomplete information including the abnormal situations and product information to suggest prioritized solutions for probe card troubleshooting. Indeed, probe card that serves as the testing signal interface between the tester and wafer is an indispensable component to ensure the integrity of IC testing that is increasing challenging owing to the shrinking IC feature sizes. The main contribution and novelty of this research can be summarized as follows: Firstly, domain knowledge and troubleshooting know-how are efectively incorporated for defning the condition attributes for developing RST and validating the derived rules. Secondly, an empirical study was conducted for validation in a leading probe card manufacturer, in which the results have shown practical viability of the proposed approach. Indeed, the reducts and the rules derived by RST can provide efective suggestions for the engineers to shorten the troubleshooting time and reduce the equipment downtime. Thirdly, the validated solution has been embedded in the digital decision system that is implemented in real settings. The probe card manufacturers can efectively employ the proposed solution to integrate domain knowledge and troubleshooting experience that can reduce the loss of core know-how due to retirement or resignation of talents via maintaining troubleshooting techniques.

As critical dimensions of IC are continuously shrinking, high quality and reliability of probe card is crucial to ensure the sensitivity and integrity of IC probing test for advanced quality control. Future research can be done in a number of directions as follows: Firstly, the proposed approach can be employed for similar troubleshooting issues with incomplete information for other semiconductor testing tools such as prober and tester for advanced quality control. Secondly, more research can be done to retrain the RST model with updated data, while more decision rules can be extracted by applying the proposed approach in diferent manufacturing contexts. Thirdly, more studies can be done to incorporate other variables including product related factors such as IC design house and product type as well as the data of equipment status variables collected from the sensors to explore potential causal relationships to support trouble shooting. Fourthly, future research can be done to derive more comprehensive decision rules by employing other data-driven approaches such as decision tree and Bayesian network for comparison. Future studies should be done to develop a digital decision system for real-time decisions with integration of customized planned data storage system and data-driven approaches for smart manufacturing.

# **Appendix**

The terminology and notations for RST are defned as the following table.



**Acknowledgements** This research is supported by National Science and Technology Council, Taiwan (MOST 110-2634-F-007-008; MOST 110-2634-F-007-017) and the MPI Corporation, Taiwan.

# **References**

- <span id="page-11-3"></span>Chien, C.-F., & Chen, L. (2007). Using rough set theory to recruit and retain high-potential talents for semiconductor manufacturing. *IEEE Transactions on Semiconductor Manufacturing, 20*(4), 528–541.
- <span id="page-11-2"></span>Chien, C.-F., Chen, Y.-H., & Lo, M.-F. (2020a). Advanced quality control of silicon wafer specifcations for yield enhancement for smart manufacturing. *IEEE Transactions on Semiconductor Manufacturing, 33*(4), 569–577.
- <span id="page-11-6"></span>Chien, C.-F., & Hsu, C.-Y. (2011). UNISON analysis to model and reduce step-and-scan overlay errors for semiconductor manufacturing. *Journal of Intelligent Manufacturing, 22*(3), 399–412.
- <span id="page-11-0"></span>Chien, C.-F., & Hsu, C.-Y. (2014). Data mining for optimizing IC feature designs to enhance overall wafer efectiveness. *IEEE Transactions on Semiconductor Manufacturing, 27*(1), 71–82.
- <span id="page-11-1"></span>Chien, C.-F., Hsu, C.-Y., & Chang, K.-H. (2013a). Overall wafer efectiveness (OWE): A novel industry standard for semiconductor ecosystem as a whole. *Computers & Industrial Engineering, 65*(1), 117–127.
- <span id="page-11-4"></span>Chien, C.-F., Hsu, C.-Y., & Chen, P.-N. (2013b). Semiconductor fault detection and classifcation for yield enhancement and manufacturing intelligence. *Flexible Services and Manufacturing Journal, 25*(3), 367–388.
- Chien, C.-F., Hsu, S.-C., & Chen, Y.-J. (2013c). A system for online detection and classifcation of wafer bin map defect patterns for manufacturing intelligence. *International Journal of Production Research, 51*(8), 2324–2338.
- <span id="page-11-5"></span>Chien, C.-F., Kerh, R., Lin, K.-Y., & Yu, A.P.-I. (2016). Data-driven innovation to capture user-experience product design: An

empirical study for notebook visual aesthetics design. *Computers & Industrial Engineering, 99*, 162–173.

- <span id="page-12-16"></span>Chien, C.-F., Lin, K.-Y., & Yu, A.P.-I. (2014). User-experience of tablet operating system: An experimental investigation of Windows 8, iOS 6, and Android 4.2. *Computers & Industrial Engineering, 73*, 75–84.
- <span id="page-12-0"></span>Chien, C.-F., Lin, Y.-S., & Lin, S.-K. (2020b). Deep reinforcement learning for selecting demand forecast models to empower Industry 3.5 and an empirical study for a semiconductor component distributor. *International Journal of Production Research, 58*(9), 2784–2804.
- <span id="page-12-24"></span>Chien, C.-F., Wang, H.-J., & Wang, M. (2007). A UNISON framework for analyzing alternative strategies of IC fnal testing for enhancing overall operational efectiveness. *International Journal of Production Economics, 107*(1), 20–30.
- <span id="page-12-25"></span>Fu, W., & Chien, C.-F. (2019). UNISON data-driven intermittent demand forecast framework to empower supply chain resilience and an empirical study in electronics distribution. *Computers & Industrial Engineering, 135*, 940–949.
- <span id="page-12-3"></span>Fu, W., Chien, C.-F., & Tang, L. (2022). Bayesian network for integrated circuit testing probe card fault diagnosis and troubleshooting to empower Industry 3.5 smart production and an empirical study. *Journal of Intelligent Manufacturing, 33*(3), 785–798.
- <span id="page-12-22"></span>Hassanien, A. E., Abraham, A., Peters, J. F., Schaefer, G., & Henry, C. (2009). Rough sets and near sets in medical imaging: A review. *IEEE Transactions on Information Technology in Biomedicine, 13*(6), 955–968.
- <span id="page-12-8"></span>Hong, S. J., Lim, W. Y., Cheong, T., & May, G. S. (2011). Fault detection and classifcation in plasma etch equipment for semiconductor manufacturing *e*-diagnostics. *IEEE Transactions on Semiconductor Manufacturing, 25*(1), 83–93.
- <span id="page-12-2"></span>Hsu, S.-C., & Chien, C.-F. (2007). Hybrid data mining approach for pattern extraction from wafer bin map to improve yield in semiconductor manufacturing. *International Journal of Production Economics, 107*(1), 88–103.
- <span id="page-12-26"></span>Hu, Y.-F., Hou, J.-L., & Chien, C.-F. (2019). A UNISON framework for knowledge management of university–industry collaboration and an illustration. *Computers & Industrial Engineering, 129*, 31–43.
- <span id="page-12-21"></span>Jothi, G., & Inbarani, H. H. (2016). Hybrid Tolerance Rough Set-Firefy based supervised feature selection for MRI brain tumor image classifcation. *Applied Soft Computing, 46*, 639–651.
- <span id="page-12-12"></span>Khakifrooz, M., Chien, C.-F., & Chen, Y.-J. (2018). Bayesian inference for mining semiconductor manufacturing big data for yield enhancement and smart production to empower Industry 4.0. *Applied Soft Computing, 68*, 990–999.
- <span id="page-12-11"></span>Kim, G.-Y., Kang, S.-H., & Nah, W. (2017). Novel TDR test method for diagnosis of interconnect failures using automatic test equipment. *IEEE Transactions on Instrumentation and Measurement, 66*(10), 2638–2646.
- <span id="page-12-4"></span>Kusiak, A. (2001). Rough set theory: A data mining tool for semiconductor manufacturing. *IEEE Transactions on Electronics Packaging Manufacturing, 24*(1), 44–50.
- <span id="page-12-28"></span>Lee, C.-Y., & Chien, C.-F. (2022). Pitfalls and protocols of data science in manufacturing practice. *Journal of Intelligent Manufacturing, 33*, 1189–1207.
- <span id="page-12-9"></span>Li, B., Han, T., & Kang, F. (2013). Fault diagnosis expert system of semiconductor manufacturing equipment using a Bayesian network. *International Journal of Computer Integrated Manufacturing, 26*(12), 1161–1171.
- <span id="page-12-27"></span>Lin, Y.-S., Chien, C.-F., & Chou, D. (2022). UNISON decision framework for hybrid optimization of wastewater treatment and

recycle for Industry 3.5 and cleaner semiconductor manufacturing. *Resources, Conservation and Recycling, 182*(106282), 1–11.

- <span id="page-12-1"></span>Moore, G. E. (1965). Cramming more components onto integrated circuits. *Electronics, 38*(8), 114–117.
- <span id="page-12-18"></span>Muralidharan, V., & Sugumaran, V. (2013). Rough set based rule learning and fuzzy classifcation of wavelet features for fault diagnosis of monoblock centrifugal pump. *Measurement, 46*(9), 3057–3063.
- <span id="page-12-10"></span>Nawaz, J. M., Arshad, M. Z., & Hong, S. J. (2014). Fault diagnosis in semiconductor etch equipment using Bayesian networks. *Journal of Semiconductor Technology and Science, 14*(2), 252–261.
- <span id="page-12-5"></span>Pawlak, Z. (1982). Rough sets. *International Journal of Computer & Information Sciences, 11*(5), 341–356.
- <span id="page-12-6"></span>Pawlak, Z. (1997). Rough set approach to knowledge-based decision support. *European Journal of Operational Research, 99*(1), 48–57.
- <span id="page-12-29"></span>Pawlak, Z. (2002). Rough sets, decision algorithms and Bayes' theorem. *European Journal of Operational Research, 136*(1), 181–189.
- <span id="page-12-19"></span>Peng, J.-T., Chien, C., & Tseng, T. (2004). Rough set theory for data mining for fault diagnosis on distribution feeder. *IEE Proceedings-Generation, Transmission and Distribution, 151*(6), 689–697.
- <span id="page-12-7"></span>Peng, X., Wen, J., Li, Z., Yang, G., Zhou, C., Reid, A., Hepburn, D. M., Judd, M. D., & Siew, W. H. (2017). Rough set theory applied to pattern recognition of Partial Discharge in noise afected cable data. *IEEE Transactions on Dielectrics and Electrical Insulation, 24*(1), 147–156.
- <span id="page-12-13"></span>Rostami, H., Blue, J., & Yugma, C. (2018). Automatic equipment fault fngerprint extraction for the fault diagnostic on the batch process data. *Applied Soft Computing, 68*, 972–989.
- <span id="page-12-20"></span>Shen, L., Tay, F. E., Qu, L., & Shen, Y. (2000). Fault diagnosis using rough sets theory. *Computers in Industry, 43*(1), 61–72.
- <span id="page-12-14"></span>Tseng, T.-L.B., Jothishankar, M., & Wu, T. T. (2004). Quality control problem in printed circuit board manufacturing—An extended rough set theory approach. *Journal of Manufacturing Systems, 23*(1), 56–72.
- <span id="page-12-15"></span>Wang, R., Guo, X., Zhong, S., Peng, G., & Wang, L. (2022). Decision rule mining for machining method chains based on rough set theory. *Journal of Manufacturing Systems, 33*, 799–807.
- <span id="page-12-17"></span>Wu, Z., He, L., Wang, Y., Goh, M., & Ming, X. (2020). Knowledge recommendation for product development using integrated rough set-information entropy correction. *Journal of Manufacturing Systems, 31*, 1559–1578.
- Yu, H.-C., Lin, K.-Y., & Chien, C.-F. (2014). Hierarchical indices to detect equipment condition changes with high dimensional data for semiconductor manufacturing. *Journal of Intelligent Manufacturing, 25*(5), 933–943.
- <span id="page-12-23"></span>Zhang, Q., Xie, Q., & Wang, G. (2016). A survey on rough set theory and its applications. *CAAI Transactions on Intelligence Technology, 1*(4), 323–333.

**Publisher's Note** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.