Introduction

Dimensional quality control is a major challenge in discrete part manufacturing. For instance, two-thirds of all quality-related engineering changes in the automotive and aerospace industries are caused by failures related to dimensional variation (Ceglarek and Shi 1995). Hence, automatic in-process sensing and data collection techniques are employed in complex multi-station manufacturing processes to identify the root causes of 6-sigma variations.

In automotive assembly processes, end-of-line or distributed sensing is generally used to diagnose process variation sources (Khan et al. 1998, 1999; Ding et al. 2003; Khan and Ceglarek 2000). Distributed sensing is more effective than end-of-line sensing because it can identify more critical variation sources (Ding et al. 2003). Effective root cause diagnosis of product 6-sigma variation faults relies on optimal sensor distribution in the multi-station assembly process, whereas poor sensor distribution often produces large amounts of conflicting and vague information. The optimal sensor distribution problem in multi-station assembly processes involves determining: (i) the location of measurement stations; (ii) the number of sensors required at each measurement station; and (iii) the location of sensors within each measurement station. The term “location of sensor” can be interpreted as either: (i) the location where a sensor is actually installed; or (ii) the location of a point or a feature on a given part or subassembly that the sensor measures. The latter, i.e., the point which is measured, is commonly used in quality control research. Under this interpretation, sensor distribution may be defined as the selection of points or features to be measured at different measurement stations. In particular, measurement of a selected set of points leads to an inference about the root cause(s) of product 6-sigma variation faults (Mandroli et al. 2006). Several researchers in manufacturing have focused on sensor networks (Levi et al. 2010) and on fault diagnosis and prediction in assembly systems (Jeremy et al. 2011; Baydar and Saitou 2004). Levi et al. (2010) deal with the security performance of sensor networks in real-world applications. Jeremy et al. (2011) discuss the monitoring of fixture faults in automotive assembly processes using autoregressive models. Baydar and Saitou (2004) discuss error prediction, diagnosis, and recovery for discrete part manufacturing using Monte Carlo simulation and a genetic algorithm.

Research on sensor distribution can be classified by objective function, optimization approach, and type of process considered (see Table 1). Objectives such as the diagnosability index, A-optimality, D-optimality, E-optimality, and pattern distance have been predominantly used in the literature to characterize sensor distribution. A-optimality maximizes the sum of all eigenvalues of the Fisher information matrix, D-optimality maximizes its determinant, and E-optimality maximizes its smallest eigenvalue. However, these objectives are computationally complex because of their non-linear characteristics.
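To make the three criteria concrete, the following minimal Python sketch evaluates them for two hypothetical 2×2 Fisher information matrices. The matrices, the function name `criteria_2x2`, and the restriction to the symmetric 2×2 case (which admits a closed-form eigenvalue formula) are illustrative assumptions and are not taken from the cited works. The A-criterion is computed as the sum of eigenvalues, following the definition given above.

```python
import math

def criteria_2x2(fim):
    """Return (trace, determinant, smallest eigenvalue) of a symmetric 2x2 matrix."""
    (a, b), (c, d) = fim
    assert abs(b - c) < 1e-12, "matrix must be symmetric"
    trace = a + d            # sum of eigenvalues (A-criterion as defined in the text)
    det = a * d - b * c      # product of eigenvalues (D-criterion)
    # Closed-form smallest eigenvalue of a symmetric 2x2 matrix (E-criterion).
    half_gap = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    lam_min = trace / 2.0 - half_gap
    return trace, det, lam_min

# Two hypothetical information matrices, one per candidate sensor layout.
layout_a = [[4.0, 0.0], [0.0, 1.0]]
layout_b = [[2.5, 0.0], [0.0, 2.5]]

result_a = criteria_2x2(layout_a)
result_b = criteria_2x2(layout_b)
print(result_a, result_b)
```

In this toy case both layouts have the same trace, so the sum-of-eigenvalues criterion cannot separate them, whereas the determinant and the smallest eigenvalue both favor the second, more balanced layout.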

Table 1 Methodologies used in literature for sensor distribution problem and its classification based on single and multiple station assembly system

The sensor distribution problem becomes even more complex when these objectives are evaluated in a high-dimensional search space (Ding et al. 2003; Liu et al. 2005). This paper selects the E-optimality objective for evaluating sensor layouts because it subsumes the other objectives (Liu et al. 2005). Furthermore, the existing optimization algorithms for sensor distribution have been tested only on low-dimensional problems, mostly production systems with a single assembly station (see Table 1). Table 1 classifies the sensor distribution methodologies used in the literature by type of objective and optimization approach. In addition, some studies analyze the sensor distribution problem without proposing an optimization approach, using objectives such as diagnosability (Ding et al. 2002b; Zhou et al. 2003), pattern distance (Ding et al. 2002a), and A-, D-, and E-optimality (Djurdjanovic and Ni 2003).

As illustrated in Table 1, state-of-the-art approaches such as the exchange algorithm, SQP, random search, and direct search provide optimal sensor layouts in which the measurement points are selected arbitrarily on the part or subassembly (unrestricted search), rather than from Key Characteristics (KCs) that are free from measurement difficulties. That is, the state-of-the-art approaches do not consider the ease of calibrating measurement gauges, feature-based measurement error (Huang et al. 2004), or the lack of explicit relations between measured features and geometric dimensioning and tolerancing (GD&T) characteristics (Meadows 1995). Hence, the solutions provided by existing approaches are often costly or difficult to implement in industrial applications, as they cannot be easily integrated with the required product assurance and quality control standards. There is therefore a need for an effective and efficient methodology that obtains optimal sensor layouts which maximize production system diagnosability and simultaneously maximize the number of measurement points placed at Key Product Characteristics (KPCs) and Key Control Characteristics (KCCs), which are specifically selected for product assurance and quality control standards during the design phase of product and process validation, respectively. However, there are a large number of KCs with various complex interactions defined by the GD&T, and it is not economically justifiable to measure all of the KCs in a multi-station assembly process. Therefore, optimal sensor distribution is a relevant and challenging problem.

This paper proposes a feature-based sensor distribution approach that maximizes the number of measurement points placed at critical design features called Key Characteristics (KCs). KCs are available as part of the product and process design information (CAD/CAM) and are classified into Key Product Characteristics (KPCs) and Key Control Characteristics (KCCs), which represent critical product and process design features, respectively. The feature-based approach starts with a genetic algorithm (GA)-based approach that considers only KCs as candidates for measurement point selection. GA is used because the search space is very large, owing to the number of KCs and of their possible combinations into sensor layouts, with various complex interactions defined by the GD&T. The sensor layout resulting from GA provides measurements with the best alignment to the product design requirements (GD&T). However, restricting the selection of measurement points to the predefined set of KCs can decrease the overall diagnosability of 6-sigma variation faults (i.e., the sensitivity of sensor layouts). Therefore, an iterative procedure is employed that starts from the sensor layout in which all measurement points are KCs (obtained by GA) and searches for a sensor layout with higher sensitivity. The procedure iteratively replaces KC(s) in the GA layout with arbitrary point(s) selected by state-of-the-art approaches, and it is repeated until the sensitivity of the sensor layout exceeds a predefined threshold. Thus, the proposed feature-based optimal sensor distribution approach integrates both (i) traditional sensor distribution approaches such as random search, exchange algorithms, and direct search (unrestricted selection of measurement points) and (ii) GA-based approaches (predefined KCs as candidates for measurement point selection) to maximize the number of KCs selected as measurement points subject to a minimum required production system diagnosability.

The remainder of this paper is organized as follows. Section “Information required for sensor distribution problem in multi-station assembly” briefly discusses the challenges and complexity of the sensor distribution problem. Section “Sensor distribution problem formulation” presents the mathematical formulation of the objective function and related constraints. Section “Feature-based approach for sensor distribution” discusses the feature-based approach, which combines the GA-based procedure with state-of-the-art approaches based on random search, exchange algorithms, and direct search. Section “GA-based approach for sensor distribution from predetermined KCs” details the GA-based procedure for the optimal sensor distribution problem with predetermined KCs as the measurement points. Section “Case Study” applies the proposed methodology to a cab assembly process. Finally, Section “Summary and Conclusions” provides a summary and conclusions along with a discussion of future research directions.

Information required for sensor distribution problem in multi-station assembly

Sensor distribution for process control and quality improvement is a complex problem that requires design information to model the critical details of the products and processes involved in multi-station assembly. The information required for sensor distribution can be divided into: product information; process information; and information related to interactions between process and product. The complexity of this design information creates significant challenges, as outlined below.

Product information complexity

Early design evaluation of multi-station assembly processes is very important for new product development and for designing a robust manufacturing system to improve product quality. A common automotive product assembly consists of 200–300 sheet metal parts and subassemblies, which are assembled on 55–75 assembly stations (Ceglarek and Shi 1995). Complexity therefore arises when selecting measurement points for a sensor layout from the large number of combinations of KPCs on multiple parts and their subassemblies at several stations. The assembly process of a body-in-white is represented in the form of a process tree, as shown in Fig. 1. Each KPC on a part or subassembly can be represented as a design feature; for example, in the automotive body assembly process, four major features are measured on the product: (i) points; (ii) edges; (iii) holes; and (iv) slots. It is important to measure KPCs because they directly represent product performance. However, KPCs are selected with different objectives in mind, and thus not all of them can be measured (see Table 2).

Fig. 1 The process tree of a body-in-white

Table 2 Classification of KCs and measurement points based on design requirements, goal, and purpose

Process information complexity

A multi-station assembly process is one that involves more than one assembly station to manufacture a complex product. For example, automotive body assembly includes multiple stations at which parts are assembled into a complex product; a common assembly process for an automotive product consists of 55–75 stations (Ceglarek and Shi 1995).

To evaluate the dimensional quality of the assembled product, measurement points are selected on parts. Figure 2 illustrates a 3-D fixture layout for the plenum lower subassembly, which is restrained during assembly operations by a set of 4-way and 2-way fixture pins and three datum fixture pads. Fixtures of this type are used throughout the assembly stations to constrain part/subassembly movements (\(\ge \)1,000 possible variation sources). Thus, a large number of fixtures (KCCs) control the variations in assembly operations. On parts/subassemblies, these KCCs are defined as design features such as points, holes, edges, and slots. Each design feature is defined by GD&T characteristics, which are important for estimating process capability. Hence, KCCs on parts/subassemblies have to be measured to identify the root causes of faults. However, not all KCCs can be measured, because of limited visibility and accessibility of measurement points and the associated costs. Therefore, measurement points must be selected so as to maximize the sensitivity of the sensor layout to variations. The sensitivity of sensor layouts is detailed in Section “Sensor distribution problem formulation”.

Fig. 2 Fixture layout on a 3-D Plenum lower subassembly; F/A: Fore/Aft, CC: Cross Car, U/D: Up/Down

Information complexity related to interactions among product and process (KPCs and KCCs)

To deliver the intended dimensional accuracy of each product, as defined by the KPCs and their tolerances, dozens of fixtures are used to constrain all parts and intermediate subassemblies throughout the production line. Each fixture restricts the movements of a part/subassembly by using locators, which are in contact with the part/subassembly at KCC locations. Direct calibration of tooling is a time-consuming process conducted with theodolites or laser trackers, and it frequently requires stopping production.

Therefore, fixtures used in production are seldom calibrated by direct measurement of the locators. Instead, they are calibrated indirectly, either by measuring KCC points on the parts/subassemblies, if available, or by measuring surrogate points (key measurement points) whose relation to the KCC(s) can be estimated accurately (for example, with the stream-of-variations analysis (SOVA) model described below). The process of distributing the sensors therefore needs to (i) maximize the number of measured KCCs subject to cost constraints, such as a limited number of measurement stations and measurement points; and then (ii) select additional points that minimize the uncertainty in the model relating measurement points to KPCs and simultaneously maximize production system diagnosability. This approach is discussed in Section “Sensor distribution problem formulation”.

In multi-station assembly processes, the propagation of fixture variation generated at each station and its impact on product quality are described mathematically by an assembly response function such as the SOVA model. The SOVA model for multi-station assembly processes, illustrated in Fig. 3 (Jin and Shi 1999; Huang et al. 2007), is represented as

$$\begin{aligned} \mathbf{X}(k)&= \mathbf{A}(k-1)\,\mathbf{X}(k-1)+\mathbf{B}(k)\,\mathbf{P}(k)+\mathbf{E}(k),\quad k=1,2,\ldots ,\mathrm{N} \end{aligned}$$
(1)
$$\begin{aligned} \mathbf{Y}(k)&= \mathbf{C}(k)\,\mathbf{X}(k)+\mathbf{W}(k),\quad \{k\}\subset \{1,2,3,\ldots ,\mathrm{N}\} \end{aligned}$$
(2)

where \(k\) is the station index and N is the number of stations. X(\(k\)) represents the random dimensional deviations that result from the assembly process at station \(k\). The input vector P(\(k\)) represents the random deviations associated with the fixture locators at station \(k\). Process errors and unmodeled higher-order terms are represented by E(\(k\)). Y(\(k\)) and C(\(k\)) represent the product measurements and the observation matrix at station \(k\), respectively. W(\(k\)) is white noise representing measurement noise.

Fig. 3 Diagram of multi-station assembly process with N stations

Equation (1) states that the part deviation X(\(k\)) at the \(k\)th station is influenced by the deviation accumulated up to station \(k-1\), i.e., X(\(k-1\)), and by the deviation contributed at station \(k\), i.e., P(\(k\)). In Eq. (2), the observation vector Y(\(k\)) is obtained at measurement station \(k\). Because sensors are installed on one or more stations in a production line, the index set of the observation equation (Eq. 2) is a subset of {1, 2, 3, ..., N}, whereas the index set of the state equation is the complete set. In end-of-line sensing, \(k\) = N only, i.e., all the measurement points are located at the end of the production line. In distributed sensing, \(k\) in Eq. (2) ranges over a subset of {1, 2, 3, ..., N}, i.e., measurement points are selected on parts assembled at multiple stations.
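The recursion in Eqs. (1) and (2) can be illustrated with a small simulation. The sketch below is a minimal pure-Python example under stated assumptions: the two-station line, the identity matrices used for A and B, the fixture deviations in `P`, and the names `simulate_sova`, `C_end_of_line`, and `C_distributed` are all hypothetical and chosen only to show how end-of-line and distributed sensing differ in the index set of the observation equation.

```python
import random

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def vadd(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vectors)]

def simulate_sova(A, B, C, P, x0, noise_sd=0.0, seed=0):
    """Propagate Eq. (1) over stations 1..N; apply Eq. (2) where C(k) is given.

    A[k-1], B[k], and P[k] are keyed by station index; C holds only the
    measurement stations, so its keys form the subset {k} of Eq. (2).
    """
    rng = random.Random(seed)
    n_stations = len(B)
    x = list(x0)
    X, Y = {}, {}
    for k in range(1, n_stations + 1):
        e = [rng.gauss(0.0, noise_sd) for _ in x]        # E(k): process error
        x = vadd(matvec(A[k - 1], x), matvec(B[k], P[k]), e)
        X[k] = x
        if k in C:                                       # sensors at station k
            y = matvec(C[k], x)
            w = [rng.gauss(0.0, noise_sd) for _ in y]    # W(k): sensor noise
            Y[k] = vadd(y, w)
    return X, Y

# Hypothetical two-station line with a two-dimensional state.
I2 = [[1.0, 0.0], [0.0, 1.0]]
A = {0: I2, 1: I2}
B = {1: I2, 2: I2}
P = {1: [1.0, 0.0], 2: [0.0, 2.0]}
C_end_of_line = {2: [[1.0, 1.0]]}                   # k = N only
C_distributed = {1: [[1.0, 0.0]], 2: [[1.0, 1.0]]}  # k in {1, 2}

X, Y_eol = simulate_sova(A, B, C_end_of_line, P, x0=[0.0, 0.0])
_, Y_dist = simulate_sova(A, B, C_distributed, P, x0=[0.0, 0.0])
print(X, Y_eol, Y_dist)
```

With noise switched off, the end-of-line layout returns a single observation that mixes both fixture deviations, while the distributed layout also observes the deviation introduced at station 1 before it is combined with that of station 2.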

The matrices A(\(k\)) and B(\(k\)) in the state space model represent the process design, such as the change of fixture layouts at each station and the effect of fixture layout changes across stations. The matrix C(\(k\)) can be interpreted as the sensor layout at the \(k\)th station (number and locations of measurement points). These matrices are determined from the product and process information (CAD/CAM) and thus tend to be large. Furthermore, the mathematical indices formulated for sensor distribution on the basis of these matrices become computationally complex. The interpretation of the system matrices A, B, and C is summarized in Table 3.

Table 3 Interpretation of the SOVA matrices (Ding et al. 2003)

In distributed sensing, the sensor distribution problem can be divided into: (i) determining the measurement stations (i.e., the values of \(k\) in Eq. 2); and (ii) determining the location of measurement points on the parts or subassemblies at each measurement station. Generally, the number of measurement stations in a multi-station assembly process is restricted because of the high capital investment required to construct measurement stations and install measurement sensors. Figure 4 shows an assembly station and a measurement station of an assembly line.

Fig. 4 (a) Automotive assembly station; and (b) measurement station

After the measurement stations are identified, the set of measurement points located on parts/subassemblies at those stations is selected from a large number of candidate measurement points. The number of possible combinations of measurement points in a sensor layout adds further to the complexity of the sensor distribution problem. The following section presents the mathematical formulation of the sensor distribution problem, which is used in the feature-based approach to obtain the optimal sensor layout.

Sensor distribution problem formulation

In this section, the sensor distribution problem for distributed sensing is formulated using the SOVA model (Jin and Shi 1999; Huang et al. 2007) for multi-station assembly processes (see Section “Information required for sensor distribution problem in multi-station assembly”). Based on the SOVA model (Eqs. 1, 2), numerous performance measures for optimal sensor placement have been introduced in the literature, such as: the maximum distance between variation patterns (Khan et al. 1999); the diagnosability index (\(\mu \)) (Ding et al. 2003); and the sensitivity index (\(S_{m}\)) (Liu et al. 2005). The diagnosability condition does not distinguish among diagnosable systems, even though some sensor systems perform better than others in that they can more easily detect a small change in the variation sources. This difference in detection capability is characterized by the concept of “sensitivity”. It is desirable that a sensor system not only has full diagnosability but is also sensitive to the underlying changes in the variation sources. Hence, this paper goes beyond diagnosability and aims to maximize the sensitivity indices. A non-zero value of the sensitivity index, as developed in this paper, guarantees full diagnosability. Because the sensitivity index also differentiates among diagnosable systems, it is a more demanding objective.

The linear input-output relation between the observation vector Y(\(k\)) and the variation sources P(\(k\)) follows from the SOVA model in Eqs. (1) and (2). The input-output model is

$$\begin{aligned} \mathbf{Y}=\mathbf{J}\cdot \mathbf{P}+\mathbf{J}\left( 0 \right) \cdot \mathbf{X}\left( 0 \right) +\mathbf{D} \end{aligned}$$
(3)

where \(\mathbf{Y}^\mathrm{T}=[\mathbf{Y}^\mathrm{T}(1)\ \mathbf{Y}^\mathrm{T}(2)\ \ldots \ \mathbf{Y}^\mathrm{T}(\mathrm{N})]\), \(\mathbf{D}^\mathrm{T}=[\mathbf{D}^\mathrm{T}(1)\ \mathbf{D}^\mathrm{T}(2)\ \ldots \ \mathbf{D}^\mathrm{T}(\mathrm{N})]\), and \(\mathbf{D}(k)\equiv \sum \limits _{i=1}^{k} \mathbf{C}(k){\varvec{\Phi }}(k,i)\mathbf{E}(i)+\mathbf{W}(k)\). \({\varvec{\Phi }}(i,j)\) is interpreted as the change of fixture layout among multiple stations (from the \(j\)th to the \(i\)th station, following the index order used in Eq. 4).

The coefficient matrix J of the first term of Eq. (3) is defined as:

$$\begin{aligned} \mathbf{J}=\left[ \begin{array}{cccc} \mathbf{C}(1)\mathbf{B}(1) & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{C}(2){\varvec{\Phi }}(2,1)\mathbf{B}(1) & \mathbf{C}(2)\mathbf{B}(2) & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{C}(\mathrm{N}){\varvec{\Phi }}(\mathrm{N},1)\mathbf{B}(1) & \mathbf{C}(\mathrm{N}){\varvec{\Phi }}(\mathrm{N},2)\mathbf{B}(2) & \cdots & \mathbf{C}(\mathrm{N})\mathbf{B}(\mathrm{N}) \end{array} \right] \end{aligned}$$
(4)

and coefficient of X(0) term as:

$$\begin{aligned} \mathbf{J}(0)=\left[ \begin{array}{c} \mathbf{C}(1){\varvec{\Phi }}(1,0) \\ \mathbf{C}(2){\varvec{\Phi }}(2,0) \\ \vdots \\ \mathbf{C}(\mathrm{N}){\varvec{\Phi }}(\mathrm{N},0) \end{array} \right] \end{aligned}$$
(5)

The deviations due to stamping processes, X(0), are ignored because only the deviations of parts introduced during the assembly processes are considered. Thus, the linear diagnostic model can be represented as:

$$\begin{aligned} \mathbf{Y}=\mathbf{J}\cdot \mathbf{P}+\mathbf{D} \end{aligned}$$
(6)

In root cause diagnosis, inferences can be made about P based on a sample of measurements of Y.
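As an illustration of how the block structure of Eq. (4) can be assembled, the following pure-Python sketch builds J for a hypothetical two-station line. It assumes that the transition matrix is \(\Phi(k,i)=\mathbf{A}(k-1)\cdots \mathbf{A}(i)\) with \(\Phi(i,i)=\mathbf{I}\), which is consistent with the diagonal blocks C(k)B(k) in Eq. (4) but is not stated explicitly in the text. The helper names (`transition`, `build_J`) and the numerical matrices are illustrative only, and stations without sensors are simply skipped.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def transition(A, k, i, n):
    """Assumed definition: Phi(k, i) = A(k-1) ... A(i), with Phi(i, i) = I."""
    phi = identity(n)
    for t in range(i, k):
        phi = matmul(A[t], phi)
    return phi

def build_J(A, B, C, n_stations, n_state):
    """Stack the block rows C(k) Phi(k, i) B(i) of Eq. (4) for measured stations."""
    J = []
    for k in range(1, n_stations + 1):
        if k not in C:          # no sensors at station k: no block row
            continue
        blocks = []
        for i in range(1, n_stations + 1):
            if i <= k:
                phi_b = matmul(transition(A, k, i, n_state), B[i])
                blocks.append(matmul(C[k], phi_b))
            else:               # later stations cannot affect station k
                blocks.append([[0.0] * len(B[i][0]) for _ in C[k]])
        for r in range(len(C[k])):
            J.append([value for block in blocks for value in block[r]])
    return J

# Hypothetical data: two stations, two-dimensional state, one sensor per station.
A = {1: [[1.0, 0.0], [1.0, 1.0]]}
B = {1: identity(2), 2: identity(2)}
C = {1: [[1.0, 0.0]], 2: [[0.0, 1.0]]}

J = build_J(A, B, C, n_stations=2, n_state=2)
print(J)
```

The second row shows the propagation effect: the sensor at station 2 responds to both locator deviations of station 1 (through \(\Phi(2,1)\)) as well as to the second locator deviation of station 2.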

In the model represented by Eq. (6), the J matrix is determined by system design parameters such as locator and sensor locations; it is called the system matrix in engineering systems design. Note that P is not a vector of parameters but a vector of unknown random inputs. In fact, Eq. (6) can be represented as a linear mixed model with both fixed and random effects:

$$\begin{aligned} \mathbf{Y}=\mathbf{J}\cdot {\varvec{\upmu }}+\mathbf{J}\cdot \tilde{\mathbf{P}} +\mathbf{D} \end{aligned}$$
(7)

where \({\varvec{\upmu }}\) is the mean vector of P and \(\tilde{\mathbf{P}}\) is the zero-mean random part of the variation sources. Hence, \({\varvec{\upmu }}\) corresponds to the fixed effects and \(\tilde{\mathbf{P}}\) to the random effects. For root cause diagnosis, one needs to detect abnormal changes in the mean components \({\varvec{\upmu }} =[\mu _1 \ldots \mu _p ]^{T}\) and the variance components \({\varvec{\uptheta }} =[\sigma _1^2 \ldots \sigma _p^2 ]^{T}\). If \(\mathbf{m}_{\mathbf{Y}}\) and \(\Sigma _{\mathbf{Y}}\) represent the mean and covariance matrix of Y, then the model in Eq. (7) can be written as

$$\begin{aligned} {\mathbf{m}}_{\mathbf{Y}} ={\mathbf{J}}\cdot {\varvec{\upmu }} \end{aligned}$$
(8)
$$\begin{aligned} vec\left( {\Sigma _{\mathbf{Y}} } \right) =\pi (\mathbf{J}){\varvec{\uptheta }} +\sigma _D^2 vec({\mathbf{I}}) \end{aligned}$$
(9)

where \(\pi (\cdot )\) is a matrix transform defined on a matrix \(\mathbf{Z}=[{\mathbf{z}}^{1}\cdots {\mathbf{z}}^{k}\cdots {\mathbf{z}}^{n}]^{T}\) having \(\mathbf{z}^{k}\) as its \(k\)th row vector, \(k=1,2,\ldots ,n\), as

$$\begin{aligned} \pi (\mathbf{Z})=\left[ {(\mathbf{z}^{1}*\mathbf{z}^{1})^{T}\cdots (\mathbf{z}^{1}*\mathbf{z}^{n})^{T}\cdots (\mathbf{z}^{n}*\mathbf{z}^{1})^{T}\cdots (\mathbf{z}^{n}*\mathbf{z}^{n})^{T}} \right] \nonumber \\ \end{aligned}$$
(10)

and ‘*’ represents the Hadamard product of two vectors. In defining diagnosability, the sensitivity for detecting changes in the mean and variance components can be defined as the ratio of the change in the mean or variance of Y to a perturbation of the mean or variance of the input sources. Hence, given measurements Y, the mean-detecting sensitivity \((S_{m})\) and the variation-detecting sensitivity \((S_{v})\) are defined as:

$$\begin{aligned} S_m&= \mathop {\min }\limits _{\delta {\varvec{\upmu }} \ne \mathbf{0}} \frac{(\delta \mathbf{m}_{\mathbf{Y}} )^{T}(\delta \mathbf{m}_{\mathbf{Y}} )}{(\delta {\varvec{\upmu }})^{T}(\delta {\varvec{\upmu }})} \end{aligned}$$
(11)
$$\begin{aligned} S_v&= \mathop {\min }\limits _{\delta {\varvec{\uptheta }}\ne \mathbf{0}} \frac{tr\big ( ( \delta \tilde{\Sigma }_{\mathbf{Y}} )^{T}( \delta \tilde{\Sigma }_{\mathbf{Y}} ) \big )}{(\delta {\varvec{\uptheta }})^{T}(\delta {\varvec{\uptheta }})} \end{aligned}$$
(12)

where, \(\delta \tilde{\Sigma }_{\mathbf{Y}}\) is the covariance matrix obtained from the process variation sources.
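The linear relation in Eq. (9) can be checked numerically. The sketch below is a minimal pure-Python example under two assumptions that the text does not spell out: the rows of \(\pi(\mathbf{Z})\) are taken to be the Hadamard products \(\mathbf{z}^{i}*\mathbf{z}^{j}\) stacked over all ordered pairs \((i,j)\), and the covariance of Y is taken to be \(\mathbf{J}\,\mathrm{diag}({\varvec{\uptheta }})\,\mathbf{J}^{T}+\sigma _D^2\mathbf{I}\) (independent variation sources and uncorrelated noise). The matrix `J`, the variances `theta`, and `sigma_d2` are hypothetical values.

```python
def pi_transform(Z):
    """Stack the Hadamard products of all ordered row pairs (z^i, z^j) as rows."""
    return [[a * b for a, b in zip(zi, zj)] for zi in Z for zj in Z]

def covariance_of_y(J, theta, sigma_d2):
    """Assumed model: Sigma_Y = J diag(theta) J^T + sigma_D^2 I."""
    n = len(J)
    return [[sum(J[i][l] * J[j][l] * theta[l] for l in range(len(theta)))
             + (sigma_d2 if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def vec(M):
    """Stack the entries of M row by row (equals column stacking if M is symmetric)."""
    return [value for row in M for value in row]

# Hypothetical system matrix (2 measurements, 2 variation sources).
J = [[1.0, 2.0], [3.0, 4.0]]
theta = [0.5, 2.0]      # variances of the variation sources
sigma_d2 = 0.1          # noise variance

lhs = vec(covariance_of_y(J, theta, sigma_d2))
identity_vec = vec([[1.0, 0.0], [0.0, 1.0]])
rhs = [sum(p * t for p, t in zip(row, theta)) + sigma_d2 * d
       for row, d in zip(pi_transform(J), identity_vec)]
print(lhs, rhs)
```

Both sides agree entry by entry, which is the sense in which the covariance of Y is linear in the variance components.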

Since the relations in Eqs. (8) and (9) are linear, the eigenvalue property of symmetric matrices allows the above sensitivity indices to be expressed in terms of \(\mathbf{J}\) as:

$$\begin{aligned} S_m =\lambda _{\min } ({\mathbf{J}}^{T}\mathbf{J})\quad \text{ and } \,S_v =\lambda _{\min } (\pi (\mathbf{J})^{T}\pi (\mathbf{J})) \end{aligned}$$
(13)

where \(\lambda _{\min } (\cdot )\) denotes the smallest eigenvalue of a matrix. An inequality relates \(S_{m}\) and \(S_{v}\): for the same J, \(S_m^2\) is a lower bound for \(S_{v}\). That is

$$\begin{aligned} S_m^2 \le S_v ,\quad \text{ for } \text{ same } \,\mathbf{J} \end{aligned}$$
(14)
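Equations (13) and (14) can be illustrated on a small example. The sketch below is a pure-Python illustration restricted, for simplicity, to a hypothetical J with two variation sources, so that both \(\mathbf{J}^{T}\mathbf{J}\) and \(\pi(\mathbf{J})^{T}\pi(\mathbf{J})\) are 2×2 and their smallest eigenvalues have a closed form. The function names and the numerical matrix are assumptions, and \(\pi(\cdot )\) is implemented as the stack of Hadamard products of all ordered row pairs.

```python
import math

def gram(M):
    """Return M^T M for a matrix given as a list of rows."""
    p = len(M[0])
    return [[sum(row[a] * row[b] for row in M) for b in range(p)]
            for a in range(p)]

def lambda_min_2x2(S):
    """Smallest eigenvalue of a symmetric 2x2 matrix (closed form)."""
    (a, b), (_, d) = S
    return (a + d) / 2.0 - math.sqrt(((a - d) / 2.0) ** 2 + b * b)

def pi_transform(Z):
    """Stack the Hadamard products of all ordered row pairs (z^i, z^j) as rows."""
    return [[x * y for x, y in zip(zi, zj)] for zi in Z for zj in Z]

def sensitivities(J):
    """Return (S_m, S_v) as in Eq. (13) for a J with two columns."""
    s_m = lambda_min_2x2(gram(J))
    s_v = lambda_min_2x2(gram(pi_transform(J)))
    return s_m, s_v

# Hypothetical system matrix: 3 measurements, 2 variation sources.
J = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
s_m, s_v = sensitivities(J)
print(s_m, s_v, s_m ** 2 <= s_v)
```

For this J, \(\pi(\mathbf{J})^{T}\pi(\mathbf{J})\) equals the entry-wise square of \(\mathbf{J}^{T}\mathbf{J}\), and the computed values satisfy \(S_m^2 \le S_v\) as stated in Eq. (14).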

From Eq. (14), maximizing \(S_{m}\) raises the lower bound on \(S_{v}\). Hence, \(S_{m}\) can be considered a unified criterion for the problem of sensor distribution in multi-station assembly processes. The design variables for the sensor distribution problem are the number of sensors and their locations on parts at different measurement stations, represented by the vector \({\varvec{\Psi }}(s)\), where \(s\) is the number of sensors. The \(s\) sensors are divided among \(n\) measurement stations as \(s_{1}, s_{2}, \ldots , s_{n}\), where \(s_{k}\) is the number of sensors allocated to the \(k\)th measurement station. Hence,

$$\begin{aligned} s=\sum _{k=1}^n {s_k } \end{aligned}$$
(15)

\({\varvec{\Psi }}(s)\) consists of the X, Y, and Z coordinates of the measurement points on parts/subassemblies at the measurement stations, and is represented as:

$$\begin{aligned} {\varvec{\Psi }} (s)=[X_1 ^{1}Y_1 ^{1}Z_1 ^{1}\cdots X_{s_1 } ^{1}Y_{s_1 } ^{1}Z_{s_1 } ^{1}:\cdots :X_1 ^{n}Y_1 ^{n}Z_1 ^{n}\cdots X_{s_n } ^{n}Y_{s_n} ^{n}Z_{s_n} ^{n}]\nonumber \\ \end{aligned}$$
(16)

where \(X_i^j, Y_i^j\), and \(Z_i^j\) are the coordinates of the \(i\)th sensor placed at the \(j\)th station. The sensor distribution approach in this paper is based on the sensitivity index \(S_{m}({\varvec{\Psi }}(s))\), which characterizes the quality of the sensor layout \({\varvec{\Psi }}(s)\).
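The design vector of Eq. (16) and the sensor count of Eq. (15) can be sketched as a simple data structure. The example below is a minimal Python illustration; the function name `flatten_layout` and the coordinate values are hypothetical and are not part of the original formulation.

```python
def flatten_layout(stations):
    """Build (s_k list, s, Psi) from per-station lists of (X, Y, Z) points.

    stations[j] holds the measurement points of the (j+1)th measurement station.
    """
    s_k = [len(points) for points in stations]      # sensors per station
    s = sum(s_k)                                    # Eq. (15)
    psi = [coord for points in stations             # Eq. (16): X, Y, Z of each
           for point in points for coord in point]  # sensor, station by station
    return s_k, s, psi

# Hypothetical layout: two sensors at station 1, one sensor at station 2.
stations = [
    [(10.0, 20.0, 5.0), (15.0, 22.0, 5.0)],
    [(40.0, 18.0, 7.5)],
]
s_k, s, psi = flatten_layout(stations)
print(s_k, s, psi)
```

Each sensor contributes three entries to the vector, so its length is always \(3s\).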

Feature-based approach for sensor distribution

This section details the feature-based approach for sensor distribution, which combines the GA-based approach (see Section “GA-based approach for sensor distribution from predetermined KCs”) with state-of-the-art approaches such as random search, exchange algorithms, and direct search. The feature-based approach tries to maximize the number of KCs in the sensor layout while maintaining a high sensitivity (\(S_{m}\)) of the layout. Initially, only KCs are analyzed, using GA, to obtain a sensor layout with a high sensitivity value. If the sensitivity index of the resulting solution is lower than the predefined threshold, state-of-the-art approaches are used to select measurement points from the entire region of the parts. More information about the approach is provided in the latter half of this section. The following text first discusses the problems of selecting arbitrary points as measurement points.

As mentioned in the introduction, placing sensors on arbitrary points usually raises the following problems:

(i) Ease of calibration: The measurement points selected should lie in regions that are easily accessible to the measurement device. This avoids time-consuming setups of the measuring device, which increase the overall inspection time of the assembly process.

(ii) Measurement error associated with the measurement point on the part: Measurement devices have inherent errors caused by the lack of feature traceability for some points on the part. Lack of feature traceability means that, instead of measuring a given point, the measurement device may actually measure the area around the selected point (Huang et al. 2004). This causes a measurement error for each such measurement point, which can have a significant impact on measurement accuracy and hence on the process control and diagnostic algorithms currently used in manufacturing. Figure 5 illustrates the concept of measurement errors related to some points on the part. The nominal position of a point on the part is A; when the part is mislocated because of a positioning error in the Z direction, the position becomes \(A^{*}\). The measurement error arises when the measurement device measures point \(A^{**}\) instead of \(A^{*}\); Fig. 5 shows the resulting measurement error in the Y direction. A feature such as a point on a plane can be measured with full accuracy in one direction, known as the feature tracing direction. The measurement error associated with a measuring device depends mainly upon: (a) the direction of measurement; (b) the geometry of the feature; and (c) the direction of the pattern variation. The relations for estimating the errors in each direction are detailed in Appendix A.

(iii) Tolerance values of the measurement points: Before assembly operations are performed, design engineers apply geometric dimensioning and tolerancing guidelines in most design operations. Based on these guidelines, tolerance values are assigned to the predetermined critical features/points (Meadows 1995). Arbitrarily selected points may lack such predefined tolerance values.

The following text discusses the feature-based approach in detail.

Fig. 5 Illustration of the error in the Y direction caused by a part mislocation in Z direction. A: nominal position of the measurement point; \(A^{*}\): mislocated position of the measurement point caused by part mislocation in Z direction; \(A^{**}\): point measured (Huang et al. 2004)

The overall decision-making flow of the feature-based approach is presented in Fig. 6. The methodology starts by assembling the CAD data and the design information about KCs. The CAD data provide the geometric and dimensional information (GD&T) of the parts, subassemblies, and final product, including all KCs (KPCs and KCCs) and their tolerances. The design information provides details about the different KCs, in the form of features and points on the parts, which are easy to calibrate, free from feature-based measurement errors, and have tolerances defined at the design stage. The GA-based procedure is then applied first, selecting the measurement points solely from the available KCs.

Fig. 6

Feature-based approach to identify optimal sensor layout in multi-station assembly processes; \(\zeta \) is the % of KCs in sensor layout as measurement points

The GA-based approach finds the sensor layout that has all measurement points as KCs and the highest sensitivity value for a given number of sensors. The GA-based approach is described in detail in Section “GA-based approach for sensor distribution from predetermined KCs”. Intuitively, the sensor layouts obtained from the GA-based approach may not be as sensitive as the layouts from state-of-the-art approaches, because all the state-of-the-art approaches consider entire regions on the part for measurement point selection. Therefore, the decision on whether to accept the sensor layout from the GA-based approach as the final solution is based on a threshold value (T) of the sensitivity index. The threshold T is defined as \(\tau \,\%\) of the potential sensitivity value (\(S_{p})\), which is attained if the restriction of selecting measurement points from KCs is removed. The sensor layout from the GA-based approach is accepted if its sensitivity index (\(S_{m})\) satisfies \(S_{m} \ge \) T; otherwise, an iterative procedure is employed that removes KC(s) from the sensor layout and searches for replacement points using state-of-the-art approaches such as exchange algorithms, random search, and direct evaluation techniques. The iterative procedure of sensor distribution is illustrated in Fig. 7.

Fig. 7

Iterative procedure for optimal sensor layout; \(\zeta \) is the % of KCs in sensor layout as measurement points

The iterative procedure takes as input the CAD data and the sensor layout obtained by the GA-based approach (\(SL^{GA}\)), which uses only KCs as measurement points. In each iteration, a KC in the sensor layout is removed and replaced by an arbitrary point selected by a state-of-the-art method (exchange algorithms, random search, and direct evaluation) or GA; the method resulting in the highest \(S_{m}\) is selected for comparison with the T value. If the resulting sensor layout has \(S_{m} \ge \) T, the layout is considered to be the final optimal sensor layout. If \(S_{m} < \) T for the resulting layout, the number of KCs to be replaced (represented as K) is incremented by 1 and the procedure is run again with the state-of-the-art approaches. A second stopping criterion for this procedure is reached when the percentage of KCs in the sensor layout (\(\zeta \)) becomes zero, i.e., when there are no KCs left in the sensor layout to be replaced.
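The loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `sensitivity` is a hypothetical callable standing in for the sensitivity index \(S_m\), a plain random search stands in for the state-of-the-art methods, and every name below is introduced here for illustration only.

```python
import random


def iterative_layout(layout_ga, arbitrary_points, sensitivity, threshold,
                     trials=200, seed=0):
    """Replace K KCs by arbitrary points, growing K until S_m >= T.

    layout_ga        -- list of KCs chosen by the GA-based approach (SL^GA)
    arbitrary_points -- hypothetical pool of non-KC candidate points
    sensitivity      -- hypothetical callable returning S_m for a layout
    Returns (layout, S_m, zeta), zeta being the % of KCs left in the layout.
    """
    rng = random.Random(seed)
    n = len(layout_ga)
    best_layout, best_s = list(layout_ga), sensitivity(layout_ga)
    k = 0  # number of KCs replaced so far
    while best_s < threshold and k < n:
        k += 1
        best_layout, best_s = None, float("-inf")
        # Random search is used here as a stand-in for the exchange,
        # random-search and direct-evaluation methods named in the text.
        for _ in range(trials):
            kept = rng.sample(layout_ga, n - k)
            added = rng.sample(arbitrary_points, k)
            candidate = kept + added
            s_m = sensitivity(candidate)
            if s_m > best_s:
                best_layout, best_s = candidate, s_m
    zeta = 100.0 * (n - k) / n
    return best_layout, best_s, zeta


# Toy data (assumed): every KC contributes 1.0 and every arbitrary point 2.0.
scores = {f"kc{i}": 1.0 for i in range(5)}
scores.update({f"p{i}": 2.0 for i in range(10)})


def toy_sensitivity(layout):
    return sum(scores[p] for p in layout)


layout, s_m, zeta = iterative_layout(
    [f"kc{i}" for i in range(5)], [f"p{i}" for i in range(10)],
    toy_sensitivity, threshold=6.5)
print(s_m, zeta)  # 7.0 60.0
```

In the toy run, replacing one KC gives a sensitivity of 6.0, which is still below the threshold of 6.5, so a second KC is replaced and the loop stops with three of the five KCs retained.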

Figure 8 illustrates the situation in which \(SL^{GA}\) has \(S_{m} < \text{ T }\) and the sensor layout from the best state-of-the-art approach has \(S_{m} > \text{ T }\). The sensor layout \(SL^{GA}\) has all its measurement points as KCs, i.e., \(\zeta \) = 100 %. For a state-of-the-art approach, \(\zeta < 100\) %, as the resulting sensor layout has measurement points that can be either arbitrary points or KCs. Hence, the sensor layouts from state-of-the-art approaches have the advantage of greater \(S_{m}\) values than the GA-based approach, but they are inferior to it in that \(\zeta \) is lower.

Fig. 8

The case when \(S_{m}\) from GA-based approach is less than T and \(S_{m} > \text{ T }\) for state-of-the-art approach

The proposed feature-based approach is applied to the sensor distribution optimization problem, where the objective is to maximize \(\zeta \) (the percentage of KCs in the resulting sensor layout) such that \(S_{m} \ge \text{ T }\). Therefore, the problem can be formulated as:

$$\begin{aligned} Max. \zeta \end{aligned}$$
(17)
$$\begin{aligned} \text{ Subject }\,\text{ to }:S_m \ge T \end{aligned}$$
(18)

The feature-based methodology for solving the abovementioned problem is based on the knowledge developed by applying the GA-based approach (search for measurement points in KCs) and the state-of-the-art approaches developed in this paper. As shown in Fig. 8, the main aim of the methodology is to increase the number of KCs in the sensor layout obtained by state-of-the-art approaches while maintaining the sensitivity value above the threshold T.

The following section discusses in detail the steps involved in the GA-based approach utilizing KCs as measurement points.

GA-based approach for sensor distribution from predetermined KCs

In this section, CAD data and predetermined KCs from design information are used to obtain the sensor layout \((SL^{GA})\) with a high sensitivity index using the GA-based approach. The steps involved in the GA-based approach are detailed as follows:

Determination of measurement station

In this step, each assembly station of the multi-station assembly system is classified as either a measurement or a non-measurement station. The index for identifying measurement stations is detailed by Ding et al. (2003), who studied variation transmission in multi-station assembly processes and proposed an index for identifying the measurement stations. Determining the variation transmission index requires the fixture layout geometry B(i) and the fixture layout changes between stations, as modeled by \(\varvec{\Phi }(k,i)\) (Ding et al. 2003). Assume that there are \(p_i\) 3-2-1 fixtures on station ‘\(i\)’ and that each of them physically supports a rigid part. The total number of degrees of freedom to be restrained is then

$$\begin{aligned} p_i \cdot DOF=m(i)=\text{ dimension }({\mathbf{P}}(i)), \end{aligned}$$
(19)

where \(m(i)\) is the number of independent variation sources related to the \(p_i\) fixtures. The variation transmission ratio is defined to quantify the variation transmission between stations as

$$\begin{aligned} \eta \left( \frac{i}{k}\right) \equiv \frac{\rho \bigg (\pi \big (\varvec{\Phi }(k,i)\cdot \mathbf{B}(i)\big )\bigg )}{m(i)}, \end{aligned}$$
(20)

where \(\eta (i/k)=1\) indicates that the complete information regarding fixture variation is transmitted from station \(i\) to station \(k\). The detailed analysis of \(\eta (i/k)\) is provided in Ding et al. (2003). Specifically, if \(\eta (i/k)=1\) for all values of ‘\(i\)’, then sensor placement is required only on the last station, i.e., the \(N\)th station. The \(i\)th station is designated for taking measurements if \(\eta (i/k)< 1\), i.e., the variance information lost during transmission from station \(i\) to station \(N\) is retrieved if a sufficient number of sensors are installed on the \(i\)th station. Consequently, a decision variable \(\alpha _i\) is defined as

$$\begin{aligned} \alpha _i =\left\{ {{\begin{array}{ll} 1&{} \quad \text{ if }\, \eta (i/N)=1 \\ 0&{} \quad \text{ if }\, \eta (i/N)<1 \\ \end{array} }} \right. \end{aligned}$$
(21)

The variable \(\alpha _i\) is computed for all the stations of the multi-station assembly process in order to identify the measurement stations.

Input candidate measurement points

The design information about the parts to be assembled is used to obtain a set of candidate measurement points. The design information of a part includes the KCs, i.e., KPCs and KCCs, which designers define at the design stage as the critical points or features that must be measured for dimensional quality inspection of the products and processes. In the GA-based approach, the measurement points are selected only from KCs (KPCs and KCCs). Thus, difficulties such as sensor calibration, feature-based measurement errors and tolerance allocation are eliminated. However, the large number of KCs available for the process and products makes the search space of the sensor distribution problem computationally large.

Measurement point selection on a measurement station

In this subsection, the measurement stations and measurement points obtained from Sections “Determination of measurement station” and “Input candidate measurement points” are used to find the sensor layout with the maximum sensitivity index value. First, each station is classified as a measurement or a non-measurement station based on the decision variable \(\alpha _i \). The candidate measurement points, based on the part information, are available from Section “Input candidate measurement points”. These measurement points occur in large numbers, and combining them into a sensor layout for a given number of sensors is a combinatorial optimization problem. Hence, GA is used for the sensor distribution problem, as it belongs to the category of evolutionary algorithms, which are recognized as efficient techniques for complex optimization problems.

GA implementations are available in the optimization toolboxes of most standard software packages. The objective function of the sensor distribution problem is the sensitivity index (\(S_{m})\) formulated in Section “Sensor distribution problem formulation” (Eq. 13), and the search space is the set of predetermined measurement points obtained from Section “Input candidate measurement points”. Standard values of the GA tuning parameters, i.e., crossover probability, mutation probability and population size, are used for an effective search of the solution space. The GA is stopped when 1,000 successive iterations fail to produce a better sensitivity index. The output of applying GA to sensor distribution is the sensor layout of a single station with the maximum \(S_{m}\) value.
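A minimal Python sketch of such a GA for choosing \(s\) measurement points from a candidate KC set is given below. It is an illustration under stated assumptions rather than the implementation used in this paper: the `objective` callable is a hypothetical stand-in for the sensitivity index \(S_m\) of Eq. 13, the tournament selection, elitism and union-based crossover are choices made here for brevity, and all function and parameter names are illustrative. The default population size, crossover probability and mutation probability follow the values reported in the case study, and the stall limit mirrors the 1,000-iteration stopping rule.

```python
import random


def ga_select(candidates, s, objective, pop_size=20, p_cross=0.8,
              p_mut=0.02, stall_limit=1000, seed=0):
    """Pick s distinct candidates maximising `objective` (hypothetical S_m)."""
    rng = random.Random(seed)
    pop = [rng.sample(candidates, s) for _ in range(pop_size)]
    best = max(pop, key=objective)
    best_fit = objective(best)

    def pick():
        # Binary tournament selection (an assumed selection scheme).
        a, b = rng.sample(pop, 2)
        return a if objective(a) >= objective(b) else b

    stall = 0
    while stall < stall_limit:
        children = [list(best)]  # elitism: carry the best layout over
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < p_cross:
                # Crossover: draw s points from the union of both parents.
                pool = sorted(set(p1) | set(p2))
                child = rng.sample(pool, s)
            else:
                child = list(p1)
            for i in range(s):
                # Mutation: swap a gene for a candidate not yet in the layout.
                if rng.random() < p_mut:
                    unused = [c for c in candidates if c not in child]
                    if unused:
                        child[i] = rng.choice(unused)
            children.append(child)
        pop = children
        gen_best = max(pop, key=objective)
        if objective(gen_best) > best_fit:
            best, best_fit, stall = list(gen_best), objective(gen_best), 0
        else:
            stall += 1
    return best, best_fit


# Toy objective (assumed): each candidate point has an additive weight.
weights = [3.0, 1.0, 4.0, 1.5, 5.0, 9.0, 2.0, 6.0, 5.5, 3.5]
layout, fit = ga_select(list(range(10)), 3,
                        lambda pts: sum(weights[p] for p in pts),
                        stall_limit=200)
print(sorted(layout), fit)
```

Because the best layout is always copied into the next generation, the best sensitivity value found never decreases from one generation to the next.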

Sensor distribution in case of multi-station assembly systems

In this subsection, the GA-based procedure for measurement point selection is extended to a multi-station assembly system, building on Section “Measurement point selection on a measurement station”. The available sensors are divided among the measurement stations. With the allocated number of sensors, measurement point selection is then carried out on each measurement station as discussed in Section “Measurement point selection on a measurement station”. The overall procedure for optimal sensor distribution in a multi-station assembly system is given in the following steps.

Determination of measurement station and possible sensor layout

  • Step 1: For stations \(k=1,2,3...N\), the corresponding decision variable \(\alpha _k\) is calculated to determine the measurement stations. The number of measurement stations is denoted by ‘\(n\)’ and the measurement station indices are stored in a vector \(\varvec{\upomega }\) of dimension \(1\times n\).

  • Step 2: The total number of sensors ‘\(s\)’ is divided randomly among the ‘\(n\)’ measurement stations as \(s_1 ,s_2 , s_3 \ldots s_n\) such that all \(s_k \ge 1\), where \(s_k\) denotes the number of sensors available for placement on the \(k\)th measurement station.

Determination of best sensor layout from the predetermined KCs

  • Step 3: Apply GA to find the optimal sensor layout \(({\varvec{\Psi }}^{l}(s))\) having the highest sensitivity value \((S_m^l )\) in iteration \(l\).

  • Step 4: If \(S_m^l >S_m^{Best}\) then \(S_m^{Best} \leftarrow S_m^l, {\varvec{\Psi }}^{Best}(s)\leftarrow {\varvec{\Psi }}^{l}(s)\). Here, \({\varvec{\Psi }}^{Best}(s)\) and \(S_m^{Best}\) are the best sensor layout obtained so far and its sensitivity value, respectively.

  • Step 5: If \(l<L_{\max }\), then set \(l=l+1\) and repeat the procedure from Step 2, where \(L_{max}\) is the user-defined maximum number of iterations. Otherwise, stop.
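Steps 2 to 5 can be sketched in Python as follows. This is a hedged illustration only: `run_ga` is a hypothetical callable that stands in for the GA of Step 3 and returns a layout together with its sensitivity value for a given sensor allocation, Step 1 is assumed to have been carried out already, and the function names are introduced here for illustration.

```python
import random


def split_sensors(s, n, rng):
    """Step 2: randomly split s sensors over n stations with every s_k >= 1."""
    cuts = sorted(rng.sample(range(1, s), n - 1))
    bounds = [0] + cuts + [s]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


def distribute(n_stations, s, run_ga, l_max=20, seed=0):
    """Repeat Steps 2-4 for l_max iterations and keep the best layout."""
    rng = random.Random(seed)
    best_layout, best_s = None, float("-inf")
    for _ in range(l_max):                            # Step 5
        alloc = split_sensors(s, n_stations, rng)     # Step 2
        layout, s_m = run_ga(alloc)                   # Step 3 (hypothetical)
        if s_m > best_s:                              # Step 4
            best_layout, best_s = layout, s_m
    return best_layout, best_s


def toy_run_ga(alloc):
    # Toy stand-in (assumed): reward balanced allocations.
    return list(alloc), float(min(alloc))


layout, s_m = distribute(n_stations=3, s=25, run_ga=toy_run_ga)
print(layout, s_m)
```

Drawing \(n-1\) distinct cut points from \(1,\ldots ,s-1\) guarantees that every station receives at least one sensor and that the allocations sum to \(s\).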

The outputs of the above procedure are \({\varvec{\Psi }}^{Best}(s)\) and \(S_m^{Best} \). The following section illustrates the application of the proposed feature-based methodology to an industrial case study.

Case study

The feature-based approach is illustrated on a case study involving a five-station cab assembly process. The process tree of the product assembled on the five stations is provided in Fig. 9. The process tree includes parts/subassemblies such as the underbody, right door frame, left door frame, front bow, central bow, and rear bow, which are assembled on five stations (as presented in Ceglarek and Prakash 2012). Because the case study involves the assembly of 3-D parts, a newly formulated 3-D SOVA model is employed to model variation propagation in the multi-station assembly process (Huang et al. 2007).

Fig. 9

The process tree of the cab assembly process with 5 stations

Therefore, in the case of the 3-D part assembly process, the deviations arising on the \(k\)th station (X(\(k))\) are due to three translational and three rotational DOF. The state equations for the five-station assembly are

$$\begin{aligned} \mathbf{X}(k)&= \mathbf{B}(k) \times \mathbf{P}(k) +\mathbf{E}(k),\quad k=1 \end{aligned}$$
(22)
$$\begin{aligned} \mathbf{X}(k)&= \mathbf{A}(k-1) \times \mathbf{X}(k-1) +\mathbf{B}(k) \times \mathbf{P}(k) +\mathbf{E}(k),\quad k=2,3,\ldots ,5 \end{aligned}$$
(23)

On the basis of the derivation and analysis carried out in Huang et al. (2007), the 3-D SOVA matrices (A, B) for the five-station cab assembly process are constructed.
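The recursion in Eqs. (22) and (23) can be illustrated with a short NumPy sketch. The matrices below are placeholders chosen so that the result is easy to check by hand; they are not the 3-D SOVA matrices of the cab assembly, and the six-dimensional state (three translations and three rotations of a single part) and all names are assumptions made for illustration.

```python
import numpy as np


def propagate(A, B, P, E):
    """Return [X(1), ..., X(N)] for the state equations (22) and (23).

    A, B, P, E are dicts keyed by station index k; A holds keys 1..N-1,
    the others hold keys 1..N.
    """
    n_stations = len(B)
    X = {1: B[1] @ P[1] + E[1]}                           # Eq. (22), k = 1
    for k in range(2, n_stations + 1):                    # Eq. (23), k >= 2
        X[k] = A[k - 1] @ X[k - 1] + B[k] @ P[k] + E[k]
    return [X[k] for k in range(1, n_stations + 1)]


# Placeholder data (assumed): identity matrices, unit fixture deviations
# and no noise, so that X(k) simply accumulates to k times a vector of ones.
N, dim = 5, 6
A = {k: np.eye(dim) for k in range(1, N)}
B = {k: np.eye(dim) for k in range(1, N + 1)}
P = {k: np.ones(dim) for k in range(1, N + 1)}
E = {k: np.zeros(dim) for k in range(1, N + 1)}

states = propagate(A, B, P, E)
print(states[-1])  # [5. 5. 5. 5. 5. 5.]
```

With identity matrices and no noise, each station adds its own fixture deviation to what is inherited from the previous station, which is the accumulation mechanism the state equations are meant to capture.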

As discussed in Section “Feature-based approach for sensor distribution”, the CAD data and design information about the cab assembly parts are used to apply the proposed feature-based approach for sensor distribution. The feature-based methodology starts by considering only predetermined KCs (available from CAD and design information) when selecting the measurement points by the GA-based procedure (see Section “GA-based approach for sensor distribution from predetermined KCs”). The GA-based approach finds an optimal sensor layout for the given number of sensors (25 in this case). The values of the other parameters used for running the GA-based approach are \(L_{max}=20; \alpha _{1}=0, \alpha _{2}=1, \alpha _{3}=0, \alpha _{4}=1, \alpha _{5}=1\); and \(n\) = 3 (see Section “GA-based approach for sensor distribution from predetermined KCs” for an explanation of these variables). The results of the GA-based approach on the cab assembly are reported in Table 4. As is evident from Table 4, the GA-based approach is computationally more efficient than the state-of-the-art approaches, such as simulated annealing (SA), the exchange algorithm and the direct evaluation strategy, which require more computational time. However, the solution found by the state-of-the-art approaches is more sensitive than that of the GA-based approach. This is because the GA-based approach considers only KCs for measurement point selection, as opposed to the unrestricted search of the state-of-the-art approaches. The following paragraphs briefly describe the search methods used in this section for comparison.

Table 4 Comparison among various approaches against the proposed KC-based approaches when \(s\) = 25

GAs are search methods inspired by the principles of natural evolution. The algorithm starts with a random set of solutions called chromosomes, whose fitness is determined by evaluating the objective function. Survival of the fittest is simulated by allowing better chromosomes to produce offspring chromosomes (through crossover and mutation). The offspring are then evaluated and, if they provide better solutions, enter the next iteration’s population. This process is repeated for a large number of iterations to obtain the best chromosome. The main parameter values used in the GA-based approach are a population size of 20, a mutation probability of 0.02, and a crossover probability of 0.8.

Simulated annealing resembles the physical annealing of solids. It starts with an initial solution at a high temperature, explores nearby solutions by a perturbation process, and replaces the current solution with a higher-energy solution if one is obtained. This is repeated for a number of iterations determined by the cooling rate. In this application, the main parameter values are an initial temperature of 1,000, a cooling rate of 0.98, and 20 solutions checked at each stage.

Exchange algorithms start with a randomly selected set of measurement locations and exchange the current locations with those in a candidate location set to improve the chosen objective function. Candidate locations are obtained by discretizing the assembly parts (10 mm). When the number of measurement locations is large (\(>\)10), the CPU time of exchange algorithms is high, because exchange algorithms were initially developed for experimental design with a relatively small number of factors and experiments.

Direct search methods, as opposed to gradient-based search, do not require any information about the gradient of the objective function. A direct search algorithm searches a set of solutions around the current solution, looking for one in which the value of the objective function is lower than the value at the current solution. Because of the large number of objective function evaluations for the set of solutions, computational times are typically higher.
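To make the simulated annealing description concrete, a minimal Python sketch for the point selection problem is given below. Several details are assumptions rather than facts from this paper: the Metropolis acceptance rule (the textbook choice, written here for a maximised objective), the single-point swap used as the perturbation, the stopping temperature `t_min`, and the `objective` callable standing in for the sensitivity index. The initial temperature, cooling rate and number of solutions per stage follow the values quoted above.

```python
import math
import random


def simulated_annealing(candidates, s, objective, t0=1000.0, cooling=0.98,
                        per_stage=20, t_min=1e-3, seed=0):
    """Select s distinct candidates, maximising a hypothetical objective."""
    rng = random.Random(seed)
    current = rng.sample(candidates, s)
    cur_val = objective(current)
    best, best_val = list(current), cur_val
    t = t0
    while t > t_min:  # assumed stopping rule
        for _ in range(per_stage):
            unused = [c for c in candidates if c not in current]
            if not unused:
                break
            # Perturbation: swap one selected point for an unused candidate.
            neighbour = list(current)
            neighbour[rng.randrange(s)] = rng.choice(unused)
            val = objective(neighbour)
            delta = val - cur_val
            # Metropolis rule: always accept improvements, and accept a
            # worse layout with a probability that shrinks as t decreases.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, cur_val = neighbour, val
                if cur_val > best_val:
                    best, best_val = list(current), cur_val
        t *= cooling
    return best, best_val


# Toy objective (assumed): each candidate point has an additive weight.
weights = [3.0, 1.0, 4.0, 1.5, 5.0, 9.0, 2.0, 6.0, 5.5, 3.5]
layout, value = simulated_annealing(
    list(range(10)), 3, lambda pts: sum(weights[p] for p in pts))
print(sorted(layout), value)
```

With these settings the temperature falls from 1,000 to the assumed stopping value in several hundred stages, so the number of objective evaluations grows with both the number of stages and the number of solutions checked per stage.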

The suitability of the sensor layout from the GA-based approach is decided by comparing its sensitivity value (\(S_{m})\) with the threshold value (T). The threshold is obtained from the potential sensitivity (\(S_{p})\) value, which is taken to be 40.00. Therefore, the value of T becomes 36.00 (taking \(\tau \) = 90), which is greater than the \(S_{m}\) obtained from the GA-based approach and lower than the sensitivity value obtained by the state-of-the-art approaches (see Table 4). This scenario is discussed in detail in Section “Feature-based approach for sensor distribution” (see Fig. 8). Therefore, the iterative procedure discussed in that section is employed to replace KCs in the sensor layout. The iterative procedure is used to retain the maximum number of KCs in the sensor layout obtained by the state-of-the-art approaches, so that the best sensor layout has \(S_m \ge T\) and the maximum number of measurement points as KCs. After running this procedure, five measurement points (KCs) in the sensor layout obtained by the GA-based approach have been replaced by arbitrary points on parts/subassemblies. The \(S_{m}\) value of the resulting sensor layout is 38.21. Therefore, sensor layouts obtained by the feature-based approach can be used for measurement purposes in multi-station assembly processes. In this case study, the option of using sensor layouts directly from state-of-the-art approaches has not been employed because of the potential cost that would be incurred if calibration, tolerance allocation and measurement error analysis had to be carried out for a sensor layout with arbitrary measurement points.
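The numbers in this paragraph can be checked with a short worked calculation. The value of \(\zeta \) below is not reported in the text; it is inferred here under the assumption that all 25 measurement points of \(SL^{GA}\) are KCs and that exactly five of them are replaced.

$$\begin{aligned} T&=\frac{\tau }{100}\,S_p=0.90\times 40.00=36.00,\\ S_m&=38.21\ge T,\\ \zeta&=\frac{25-5}{25}\times 100\,\%=80\,\%. \end{aligned}$$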

Therefore, the sensor layout obtained by applying the feature-based methodology will pose fewer challenges related to calibration, tolerancing and measurement errors, owing to the presence of KCs in the final sensor layout.

Summary and conclusions

This paper presents a feature-based approach for determining the optimal sensor distribution in multi-station assembly processes. The main objective of the proposed method is to maximize the number of KCs that are used as measurement points in a sensor layout. A sensitivity index, defined as the capability of the measurement system to detect the underlying root causes of variation, is used to characterize the sensor layout. The application of the feature-based sensor distribution methodology is illustrated on a 3-D automotive part, where the GA-based approach (which considers only predetermined KCs for measurement point selection) is integrated with state-of-the-art approaches with a view to increasing the number of predetermined points in the sensor layout subject to the threshold sensitivity value. A complete mathematical formulation of the KC maximization problem and its related constraints, such as (i) ease of calibration, (ii) measurement errors, and (iii) tolerance allocation, is not detailed in this paper. Instead, conceptual guidelines have been discussed so that future research in this area may focus on it.