1 Introduction

In statistical quality control, the term profile refers to situations in which the quality of a process or product is represented by a functional relationship between a response and one or more predictors. When the response variable follows a distribution that belongs to the exponential family, the profile is called a generalized linear profile (GLP).

Because profiles are similar to regression models, monitoring methods are typically based on regression modelling. Although the maximum likelihood (ML) approach is the traditional way of estimating parameters, alternative estimation methods have been discussed in the literature due to the problems arising from multicollinearity among the predictors. Some of these alternative estimation methods have been integrated into profile monitoring.

A common way of monitoring GLPs with a count response is to apply Poisson profile monitoring methods. Skinner et al. (2003) proposed a technique based on the likelihood ratio statistic and used the deviance residuals from Poisson regression for monitoring purposes. Amiri et al. (2011) examined Hotelling’s \(T^2\) approach based on estimated model parameters. Asgari et al. (2014) developed a procedure involving a mixture of log and square-root link functions for modelling processes with a count response and monitored the process via proposed Shewhart and exponentially weighted moving average (EWMA) charts based on standardized residuals. Qi et al. (2016) suggested a control chart based on weighted likelihood ratio statistics for monitoring GLPs. Later, Marcondes Filho and Sant’Anna (2016) proposed a Shewhart-type residual control chart based on principal component (PC) scores for Poisson processes. The effect of the parameter estimator in Poisson profile monitoring was investigated by Maleki et al. (2019). Wen et al. (2021) proposed a regression-adjusted EWMA chart that adjusts and updates the expected values according to the situation to monitor the Poisson process. Mammadova and Özkale (2021a) studied the impact of the tuning parameter on ridge deviance-based control charts, and Iqbal et al. (2022) presented homogeneously weighted moving average control charts where the monitored observations are either deviance or standardized residuals of the generalized linear model (GLM).

The aforementioned approaches have been extended to monitoring COM-Poisson profiles since the Poisson distribution becomes unsuitable when the data set shows signs of over- or underdispersion. The flexible two-parameter COM-Poisson distribution was proposed by Conway and Maxwell (1962) to overcome the challenges caused by the difference between the mean and variance of a count data set. Park et al. (2018) adapted the principal components regression (PCR) approach for monitoring Poisson processes and constructed an r-chart for COM-Poisson profile monitoring. Park et al. (2020) examined COM-Poisson regression-based control charts and utilized randomized residuals to build a Shewhart chart. Rao et al. (2020) studied a mixed EWMA and cumulative sum (CUSUM) chart for COM-Poisson profiles, while Shewhart, EWMA, and CUSUM charts on the basis of ridge deviance residuals were developed by Mammadova and Özkale (2021b) for monitoring Poisson as well as COM-Poisson profiles. Jamal et al. (2021) monitored a real-time highway safety surveillance data set with CUSUM and EWMA charts and used randomized quantile and deviance residuals of the COM-Poisson regression model for the monitoring.

To address the multicollinearity problem, options other than the ridge estimator have been developed for GLMs. The iterative PCR estimator and the first-order approximated Liu estimator were proposed for GLMs by Marx and Smith (1990) and Kurtoğlu and Özkale (2016), respectively. Özkale (2019) studied a combination of the Liu and PCR estimators, the r–d class estimator, to minimize the effect of multicollinearity. Abbasi and Özkale (2021) developed the iterative r–k class estimator for GLMs by combining the ridge and PCR estimators. These authors showed the superiority of the proposed approaches over the ML and ridge estimators in terms of the mean squared error criterion through simulation studies. Apart from the mentioned studies, several studies are specifically devoted to the examination of the COM-Poisson distribution in the framework of GLMs (see Guikema and Goffelt (2008); Lord et al. (2008); Sellers and Shmueli (2010); Francis et al. (2012)). A reformulation of the distribution was suggested by Guikema and Goffelt (2008) to overcome computational limitations in modelling data sets. Characteristics of the COM-Poisson regression model, its estimation, diagnostics, and interpretation were discussed by Guikema and Goffelt (2008) and Sellers and Shmueli (2010); Guikema and Goffelt (2008) utilized the Bayesian technique for parameter estimation, whereas Sellers and Shmueli (2010) and Francis et al. (2012) used unconstrained optimization. Abdella et al. (2019) introduced a penalized likelihood technique in the form of ridge estimation, Mammadova and Özkale (2021b) provided an iterative closed-form solution to the ridge estimator given by Abdella et al. (2019), and Sami et al. (2022b) proposed a COM-Poisson ridge regression estimator, which is a ridge estimator obtained at the final iteration of the ML estimator. Recently, Sami et al. (2022a) proposed a modified one-parameter Liu estimator, and Rasheed et al. (2022) developed a modified jackknifed Liu-type estimator for COM-Poisson regression.

In this paper, we propose an extension of residual-based CUSUM and EWMA charts for identifying abnormalities in the COM-Poisson profile mean by using the PCR and r–k class estimators. We intend to address the multicollinearity problem and streamline the monitoring process by reducing the dimension of the data set while detecting out-of-control observations as quickly as possible by utilizing CUSUM and EWMA charts based on the PCR and r–k class estimators. Compared with the control charts previously proposed in the literature, our contributions can be summarized as follows:

  • CUSUM and EWMA control charts, which are used to detect small changes in the process, become more effective when combined with the r–k class estimator, an estimator that performs well under multicollinearity.

  • The CUSUM and EWMA control charts based on the r–k class estimator provide a general framework of CUSUM and EWMA control charts based on ridge and PCR estimators.

  • The CUSUM and EWMA control charts based on the r–k class estimator outperform the CUSUM and EWMA control charts based on the ML estimator in the presence of multicollinearity.

The following is the outline for this paper: Sect. 2 covers a brief description of the COM-Poisson distribution and estimation methods for the COM-Poisson regression model in the case of multicollinearity. Construction of the deviance-based CUSUM and EWMA charts for monitoring GLP with correlated predictors and dispersed count response is presented in Sect. 3. Section 4 provides a performance analysis of the proposed method through a simulation study. Section 5 delivers a real-life application that is carried out in the example of the SECOM data set. The concluding remarks are given in Sect. 6.

2 COM-Poisson modelling

2.1 COM-Poisson distribution

Introduced in the early 1960s, the COM-Poisson distribution has attracted increasing attention from researchers in the recent past due to its flexibility. Characterizations of the distribution have been thoroughly investigated by Boatwright et al. (2003), Shmueli et al. (2005), and Li et al. (2020).

The probability mass function of the COM-Poisson distribution is defined as

$$\begin{aligned} f(y_i)=\frac{\mu _i^{y_i}}{(y_i!)^v}\frac{1}{Z(\mu _i,v)},\quad y_i=0,1,2,\ldots ,\quad i=1,2,\ldots ,n \end{aligned}$$

where \(\mu _i\ge 0\) is the centering parameter, \(v \ge 0\) is the shape (dispersion) parameter, and \(Z(\mu _i,v)=\sum _{s=0}^{\infty }{\frac{\mu _i^s}{(s!)^v}}\) is the normalizing constant. Overdispersion in the data set is represented by \(v<1\), equidispersion by \(v=1\), and underdispersion by \(v>1\).
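As a minimal illustration (not part of the original derivation), the pmf can be evaluated in R by truncating the infinite sum in \(Z(\mu _i,v)\); the truncation bound `s_max` is an assumption made here for numerical purposes.

```r
# Normalizing constant Z(mu, v) via a truncated sum on the log scale;
# s_max is an illustrative truncation bound, not taken from the paper
com_poisson_logZ <- function(mu, v, s_max = 500) {
  s <- 0:s_max
  log_terms <- s * log(mu) - v * lgamma(s + 1)   # log of mu^s / (s!)^v
  m <- max(log_terms)
  m + log(sum(exp(log_terms - m)))
}

# COM-Poisson pmf: f(y) = mu^y / (y!)^v / Z(mu, v)
com_poisson_pmf <- function(y, mu, v) {
  exp(y * log(mu) - v * lgamma(y + 1) - com_poisson_logZ(mu, v))
}

# Overdispersed example (v < 1): probabilities for y = 0,...,5
sapply(0:5, com_poisson_pmf, mu = 2, v = 0.75)
```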

Depending on the values of the centering and shape parameters, the COM-Poisson distribution reduces to three well-known distributions as special or limiting cases. These are

  • Geometric distribution: \(v=0\) and \(\mu _i<1\);

  • Poisson distribution: \(v=1\);

  • Bernoulli distribution: \(v\rightarrow \infty \).

When \(v=0\) and \(\mu _i\ge 1\), the normalization parameter \(Z(\mu _i,v)\) does not converge and the distribution is undefined (Shmueli et al. 2005).

2.2 Parameter estimation in COM-Poisson regression

Let \(y_{n \times 1}=[y_1,y_2, \dots , y_n]'\) be the response vector with COM-Poisson distribution and \(X_{n \times p}=[x_1, x_2, \dots , x_n]'\) be the predictor matrix with \(x_i'=[x_{i1}, x_{i2}, \dots , x_{ip}]\), \(i=1,2, \dots , n\), being the i-th observation. The log-link function can be applied for modelling the relationship between y and X as \(\text {log}(\mu )=X\beta \), where \(\beta _{p \times 1}=[\beta _1,\beta _2,\dots , \beta _p]'\) is the vector of unknown parameters.

The \(\beta \) parameters are estimated with the help of the log-likelihood function of the COM-Poisson distribution that is provided by Sellers and Shmueli (2010) as

$$\begin{aligned} l(\beta ;y) =v\sum _{i=1}^n{y_i \text {log} (\mu _i)}-v\sum _{i=1}^n{\text {log}(y_i!)}-\sum _{i=1}^n{\text {log}\left( Z(\mu _i;v)\right) }. \end{aligned}$$
(1)

Sellers and Shmueli (2010) and Francis et al. (2012) proposed using the iteratively reweighted least squares (IRLS) technique which was presented by Nelder and Wedderburn (1972) and Wood (2017). Then, a closed form solution for the IRLS estimator known as the ML estimator was given by Mammadova and Özkale (2021b) as

$$\begin{aligned} \begin{aligned} {\hat{\beta }}^{(t)}_{ML}=\left( X'{\hat{W}}_{ML}^{(t-1)}X\right) ^{-1} X'{\hat{W}}_{ML}^{(t-1)}u_{ML}^{(t-1)} \end{aligned} \end{aligned}$$

where t is the iteration step, \(u_{ML}^{(t-1)}=X{\hat{\beta }}_{ML}^{(t-1)}+({\hat{W}}_{ML}^{(t-1)})^{-1}(y-{\hat{\mu }}^{(t-1)}_{ML})\) is the working response, and \({\hat{W}}_{ML}\) is the estimated weight matrix evaluated at \({\hat{\beta }}^{(t-1)}_{ML}\). After the iterations converge, \({\hat{\beta }}_{ML}\) is obtained as \({\hat{\beta }}_{ML}=\left( X'{\hat{W}}_{ML}X\right) ^{-1}X'{\hat{W}}_{ML}u_{ML}\).
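A schematic R sketch of this IRLS scheme is given below. The functions `W_fun` (returning \({\hat{W}}\) evaluated at the current \({\hat{\beta }}\), i.e., the weight matrix of Eq. (2)) and `mu_fun` (returning the fitted means under the log-link) are placeholders assumed to be supplied by the user; the sketch illustrates the update, not the authors' implementation.

```r
# Generic IRLS loop for the ML estimator: beta^(t) = (X' W X)^{-1} X' W u
irls_ml <- function(X, y, W_fun, mu_fun, beta0, tol = 1e-6, max_iter = 100) {
  beta <- beta0
  for (t in seq_len(max_iter)) {
    W  <- W_fun(beta)                       # diagonal weight matrix at current beta
    mu <- mu_fun(beta)                      # fitted means at current beta
    u  <- X %*% beta + solve(W, y - mu)     # working response
    beta_new <- solve(t(X) %*% W %*% X, t(X) %*% W %*% u)
    if (sqrt(sum((beta_new - beta)^2)) <= tol) {  # convergence check
      beta <- beta_new
      break
    }
    beta <- beta_new
  }
  drop(beta)
}
```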

The weight matrix for the COM-Poisson model was first introduced by Sellers and Shmueli (2010) and later elaborated by Francis et al. (2012) as \(W=\text {diag}(w_{ii})\), \(i=1,2,\ldots ,n\), where

$$\begin{aligned} \begin{aligned} w_{ii}&= \sum _{s=0}^{\infty }{\frac{\frac{v(v-1)s^2(\text {exp}(\mu _i ))^{2s}\left( \frac{\left( \text {exp}(\mu _i )\right) ^s}{s!}\right) ^{v-2}}{(s!)^2}+\frac{vs^2(\text {exp}(\mu _i ))^{s}\left( \frac{(\text {exp}(\mu _i ))^s}{s!}\right) ^{v-1}}{s!}}{\sum _{s=0}^{\infty }{\left( \frac{(\text {exp}(\mu _i ))^s}{s!}\right) }^v}}\\&\quad -\sum _{s=0}^{\infty }{\frac{\left[ \frac{vs(\text {exp}(\mu _i ))^{s}\left( \frac{(\text {exp}(\mu _i ))^s}{s!}\right) ^{v-1}}{s!}\right] ^2}{\sum _{s=0}^{\infty }{\left[ \left( \frac{(\text {exp}(\mu _i ))^s}{s!}\right) ^v\right] ^2}}}. \end{aligned} \end{aligned}$$
(2)

In the case of uncorrelated predictors, it is well known that the ML estimator is a reliable method. However, multicollinearity poses challenges with the computation of the inverse matrix of \(X'WX\), which is essential for ML estimation. Therefore, alternative approaches were proposed.

One of these alternatives is the iterative ridge estimator presented by Mammadova and Özkale (2021b) in COM-Poisson regression as

$$\begin{aligned} \begin{aligned} {\hat{\beta }}^{(t)}_{ridge}=&(X'{\hat{W}}_{ridge}^{(t-1)}X+kI_p)^{-1} X' {\hat{W}}_{ridge}^{(t-1)}u_{ridge}^{(t-1)} \end{aligned} \end{aligned}$$

where t refers to the iteration step, \(u_{ridge}^{(t-1)}=X{\hat{\beta }}^{(t-1)}_{ridge}+({\hat{W}}_{ridge}^{(t-1)})^{-1}(y-{\hat{\mu }}^{(t-1)}_{ridge})\) is the working response, \({\hat{W}}_{ridge}\) is the weight matrix in Eq. (2) evaluated at \({\hat{\beta }}^{(t-1)}_{ridge}\), and k is the tuning parameter. The ridge estimator of \(\beta \) at convergence has the form \({\hat{\beta }}_{ridge}=\left( X' {\hat{W}}_{ridge}X+kI_p\right) ^{-1}X'{\hat{W}}_{ridge}u_{ridge}\).

Another suitable alternative to ML estimation in the case of multicollinearity is PCR estimation. Unlike the ridge estimator, the PCR estimator does not require a tuning parameter; instead, it addresses the multicollinearity problem by generating a new set of uncorrelated variables using the singular value decomposition (SVD) technique discussed for GLMs by Marx and Smith (1990), Aguilera et al. (2006), Özkale and Arıcan (2016), and Abbasi and Özkale (2021). Jolliffe (2002) stated that the SVD is effective in terms of both computation and interpretation in PCR and also emphasized the importance of standardizing the predictors to zero mean and unit variance to eliminate the scale dependence of the PCs.

In brief, the SVD approach can be described as follows. Let the linear predictor \(\eta \) be expressed as \(\eta =X\beta = XTT'\beta =X^*\omega \) where \(X^*=XT\) and \(\omega =T'\beta \). Here T is the orthogonal matrix satisfying \(T'X'{\hat{W}}_{ML}XT=\Lambda = \text {diag} (\lambda _1,\lambda _2,\dots , \lambda _p)\), where \(\lambda _1=\lambda _{\text {max}} \ge \lambda _{2} \ge \dots \ge \lambda _{p}=\lambda _{\text {min}}\) are the eigenvalues of the \(X'{\hat{W}}_{ML}X\) matrix. The \(X^*\) matrix can be partitioned as \(X^*=[X^*_r \quad X^*_{p-r}]\), where \(X^*_r=XT_r\) \((r\le p)\) is the matrix of the PCs that will be retained in the model and \(T_r\) consists of the first r columns of T. Accordingly, \(\omega \), T, and \(\Lambda \) are partitioned as \(\omega = [\omega _r \quad \omega _{p-r}]\), \(T= [T_r \quad T_{p-r}]\), and \(\Lambda =\text {diag}(\Lambda _r, \Lambda _{p-r})\), where \(\Lambda _r=X^{*'}_r {\hat{W}}_{ML} X^*_r=\text {diag} (\lambda _1,\dots ,\lambda _r)\), \(\Lambda _{p-r}= X_{p-r}^{*'}{\hat{W}}_{ML}X_{p-r}^*=\text {diag} (\lambda _{r+1},\dots ,\lambda _p)\), and r is the number of PCs that will be included in the model.

By using the SVD, Abbasi and Özkale (2021) obtained the PCR and r–k class estimators in GLMs, building on the PCR estimator of Marx and Smith (1990) and the r–k class estimator introduced by Baye and Parker (1984) in linear regression. We adjust the PCR and r–k class estimators for the COM-Poisson model as

$$\begin{aligned} {\hat{\beta }}_{PCR}^{(t+1)}= T_r\left( T_r'X'{\hat{W}}_{ML}XT_r\right) ^{-1}T_r'X'{\hat{W}}_{ML}u_{PCR}^{(t)} \end{aligned}$$
(3)

where \(u_{PCR}^{(t)}=XT_rT_r'{\hat{\beta }}^{(t)}_{PCR}+\left( {\hat{W}}_{ML}\right) ^{-1}\left( y-\mu _{PCR}^{(t)}\right) \) is evaluated at \({\hat{\beta }}_{PCR}^{(t)}\) and

$$\begin{aligned} {\hat{\beta }}_{r-k}^{(t+1)}= T_r\left( T_r'X'{\hat{W}}_{ML}XT_r+kI_r\right) ^{-1}T_r'X'{\hat{W}}_{ML}u_{r-k}^{(t)} \end{aligned}$$
(4)

where \(u_{r-k}^{(t)}=XT_rT_r'{\hat{\beta }}^{(t)}_{r-k}+\left( {\hat{W}}_{ML}\right) ^{-1}\left( y-\mu ^{(t)}_{r-k}\right) \).

Abbasi and Özkale (2021) can be consulted for detailed information on obtaining the PCR and r–k class estimators in GLMs. To summarize, the general idea behind the PCR and r–k class estimators is that the linear predictor is reduced to a lower-dimensional linear predictor by deleting the PCs whose coefficient estimates have large variances, i.e., those associated with the smallest eigenvalues. Then, the IRLS idea is applied to this reduced linear predictor, and the resulting estimator is transformed back to the original parameter space to give the PCR estimator in GLMs. The r–k class estimator, on the other hand, is obtained by applying the ridge idea to the reduced linear predictor and transforming the resulting estimator back to the original parameter space. Note that in Eqs. (3) and (4) both estimators use the same number of PCs. The main difference is that the r–k class estimator further reduces the degree of multicollinearity and uses the tuning parameter for this purpose.

The PCR and r–k class estimators at convergence are respectively as

$$\begin{aligned} {\hat{\beta }}_{PCR}= T_r\left( T_r'X'{\hat{W}}_{ML}XT_r\right) ^{-1}T_r'X'{\hat{W}}_{ML}u_{PCR} \end{aligned}$$

and

$$\begin{aligned} {\hat{\beta }}_{r-k}= T_r\left( T_r'X'{\hat{W}}_{ML}XT_r+kI_r\right) ^{-1}T_r'X'{\hat{W}}_{ML}u_{r-k}. \end{aligned}$$
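For concreteness, a hedged R sketch of a single update of Eqs. (3) and (4) is given below; `W_hat` denotes \({\hat{W}}_{ML}\) held fixed at the ML fit and `T_r` the matrix of retained eigenvectors, both assumed to be available from the preceding computations.

```r
# One update of the PCR estimator (Eq. (3)) and the r-k class estimator (Eq. (4)).
# X: predictor matrix, y: response, T_r: retained eigenvectors (p x r),
# W_hat: estimated ML weight matrix, beta_t / mu_t: current estimate and fitted means.
update_pcr <- function(X, y, T_r, W_hat, beta_t, mu_t) {
  u <- X %*% T_r %*% t(T_r) %*% beta_t + solve(W_hat, y - mu_t)  # working response
  M <- t(T_r) %*% t(X) %*% W_hat                                 # T_r' X' W
  T_r %*% solve(M %*% X %*% T_r, M %*% u)                        # back-transform to beta
}

update_rk <- function(X, y, T_r, W_hat, beta_t, mu_t, k) {
  u <- X %*% T_r %*% t(T_r) %*% beta_t + solve(W_hat, y - mu_t)
  M <- t(T_r) %*% t(X) %*% W_hat
  r <- ncol(T_r)
  T_r %*% solve(M %*% X %*% T_r + k * diag(r), M %*% u)          # ridge on the reduced space
}
```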

2.2.1 Tuning parameter selection

The tuning parameter determines the effectiveness of the ridge estimator as well as the r–k class estimator since increasing the tuning parameter pulls the estimator further away from the true parameter value. Studies conducted by Hoerl and Kennard (1970), Hoerl et al. (1975), Lawless and Wang (1976), Kibria (2003), Alkhamisi et al. (2006), Alkhamisi and Shukur (2007), Månsson and Shukur (2011), Kibria et al. (2012), Algamal (2018), Zaldivar (2018), and others cover a wide range of approaches for calculating the tuning parameter of the ridge estimator in linear regression and GLMs.

Abbasi and Özkale (2021) adapted the tuning parameter selection method proposed by Hoerl and Kennard (1970) in linear regression for the estimation of the r–k class estimator in the GLMs. We adjust the same tuning parameter for the COM-Poisson regression models and obtain

$$\begin{aligned} k =\frac{rv}{{\hat{\beta }}^ {{(0)}^{'}}T_rT_r'{\hat{\beta }}^ {(0)}} \end{aligned}$$
(5)

where \({\hat{\beta }}^{(0)}\) is the initial value of the \(\beta \) parameter which is usually taken as the ordinary least squares (OLS) estimator.
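In R, with the notation above (the predictor matrix `X`, the response `y`, the retained eigenvectors `T_r`, and the dispersion value `v` assumed available), Eq. (5) could be computed roughly as follows; this is a sketch of the formula, not the authors' code.

```r
# Tuning parameter of Eq. (5): k = r * v / (beta0' T_r T_r' beta0)
beta0 <- solve(t(X) %*% X, t(X) %*% y)   # OLS initial value
k <- (ncol(T_r) * v) / drop(t(beta0) %*% T_r %*% t(T_r) %*% beta0)
```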

2.3 Deviance residuals

The deviance residual measures the discrepancy between the fitted value \({\hat{\mu }}_i\) (the estimate of \(E(y_i)\)) and the actual response \(y_i\) for a given observation i. Dobson (2002) defined the deviance residual for the i-th observation of a GLM as the square root of twice the difference between the log-likelihoods of the saturated and fitted models. The sign of the difference between the actual response and the fitted response determines the sign of the deviance residual.

The i-th deviance residual based on ML, ridge, PCR, and r–k class estimators can be expressed by using the log-likelihood functions of the fitted model provided in Eq. (1) as

$$\begin{aligned} d_{est,i}=&\text {sign}(y_i-\widehat{E(y_i)}) \times \sqrt{2 \left[ l(y_i, y_i; {\hat{v}}) - l(\widehat{E(y_i)},y_i; {\hat{v}}) \right] } \end{aligned}$$
(6)

where \(l(\widehat{E(y_i)},y_i; {\hat{v}})\) and \(l(y_i, y_i; {\hat{v}})\) are the log-likelihood functions of the fitted and saturated models, respectively, \(\widehat{E(y_i)}= {\hat{\mu }}_{est,i}^{1/v}-\frac{v-2}{2v}\), the subscript est designates the method employed for modelling, i.e. \(est\in \{ML, ridge, PCR, r-k\}\), and \({\hat{\mu }}_{est,i}\) is the fitted value obtained by using the corresponding estimator at convergence. The deviance residuals in Eq. (6) are constrained such that \(y_i > c\) for \(v < 1/\left( 2c+1\right) \), \(c \in N^+\).
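A schematic R sketch of Eq. (6) is shown below; `loglik_i` stands for the per-observation log-likelihood contribution implied by Eq. (1) and is passed in as a user-supplied function, so the sketch stays agnostic about the exact form used for \(Z\).

```r
# Deviance residual of Eq. (6):
# d_i = sign(y_i - Ehat_i) * sqrt(2 * [l(y_i, y_i; v) - l(Ehat_i, y_i; v)])
# loglik_i(mean, y, v): per-observation log-likelihood (assumed supplied by the user)
deviance_residuals <- function(y, Ehat, v, loglik_i) {
  sign(y - Ehat) * sqrt(2 * (loglik_i(y, y, v) - loglik_i(Ehat, y, v)))
}

# Usage sketch: d_ml <- deviance_residuals(y, Ehat_ml, v_hat, loglik_i)
```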

3 Monitoring of COM-Poisson profiles

Page (1954) proposed the CUSUM chart, which utilizes past information available from previously plotted points for effective monitoring. Later, Roberts (1959) introduced the EWMA chart as an alternative to the CUSUM chart. EWMA charts also accumulate current and past information from the observations, which makes the control chart sensitive to small shifts. Since then, several modifications and extensions of these control charts have been investigated. Montgomery (2020) describes the CUSUM and EWMA charts as alternatives to the Shewhart chart when detecting small shifts is important.

Mammadova and Özkale (2021b) extended the traditional CUSUM and EWMA charts by using deviance and ridge deviance residuals to define the control chart statistics, and gave the formulas in Table 1 to construct CUSUM and EWMA charts based on these residuals. In Table 1, the deviance-based charts are referred to as CUSUM\(_{ML}\) and EWMA\(_{ML}\), whereas the ridge deviance-based charts are referred to as CUSUM\(_{ridge}\) and EWMA\(_{ridge}\) to prevent confusion. \({\hat{\mu }}_{ML}^{0}\) and \({\hat{\mu }}_{ridge}^{0}\) are the in-control means, and \({\hat{\sigma }}_{ML}^{0}\) and \({\hat{\sigma }}_{ridge}^{0}\) are the in-control standard deviations of the deviance and ridge deviance residuals, respectively. K and h are the reference value and the decision value of the CUSUM chart, respectively. \(0<\lambda \le 1\) refers to the smoothing parameter of the EWMA chart, while L is the EWMA control limit constant.

Table 1 Construction of the deviance and ridge-deviance residual-based CUSUM and EWMA control charts

Although the ridge estimator is frequently used to handle multicollinearity, the r–k class estimator obtained by combining the ridge and PCR estimators gives better results than the ridge estimator when a multicollinearity problem exists. For this reason, we propose new control charts as alternatives to the ridge deviance-based charts of Mammadova and Özkale (2021b) by defining CUSUM and EWMA chart statistics based on PCR and r–k deviance residuals, which we denote respectively as CUSUM\(_{PCR}\), EWMA\(_{PCR}\), CUSUM\(_{r-k}\), and EWMA\(_{r-k}\). The difference between these charts and the ones given by Mammadova and Özkale (2021b) is that the charts introduced by Mammadova and Özkale (2021b) use the ridge deviance residuals, while the newly proposed charts use the PCR and r–k deviance residuals, respectively. Thus we obtain control charts based on the PCR deviance residuals that are as good as the control charts based on the ridge deviance residuals but easier to calculate because they do not depend on a tuning parameter. Furthermore, we obtain control charts based on the r–k class estimator, which give better results than the control charts depending on the PCR and ridge deviance residuals. In this way, we improve the control charts proposed by Mammadova and Özkale (2021b) to perform better in the case of multicollinearity.

3.1 The newly proposed control charts based on PCR and r–k deviance residuals

In this subsection, we define CUSUM and EWMA control charts based on PCR and r–k class deviance residuals.

We give the CUSUM\(_{PCR}\) and CUSUM\(_{r-k}\) charts’ statistics respectively as

$$\begin{aligned} \begin{aligned} C_{PCR,i}^-= \text {min}[0, {\hat{\mu }}^0_{PCR}-K-d_{PCR,i}+C_{PCR,i-1}^-],\quad i=1,2,\ldots ,n\\ C_{PCR,i}^+= \text {max}[0, d_{PCR,i}-{\hat{\mu }}^0_{PCR}-K+C_{PCR,i-1}^+ ],\quad i=1,2,\ldots ,n \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} C_{r-k,i}^-= \text {min}[0, {\hat{\mu }}^0_{r-k}-K-d_{r-k,i}+C_{r-k,i-1}^-],\quad i=1,2,\ldots ,n\\ C_{r-k,i}^+= \text {max}[0, d_{r-k,i}-{\hat{\mu }}^0_{r-k}-K+C_{r-k,i-1}^+ ],\quad i=1,2,\ldots ,n \end{aligned} \end{aligned}$$

where \({\hat{\mu }}^0_{PCR}\) and \({\hat{\mu }}^0_{r-k}\) correspond to the in-control mean of the PCR and r–k deviance residuals, \({\hat{\sigma }}^0_{PCR}\) and \({\hat{\sigma }}^0_{r-k}\) correspond to the in-control standard deviation of the PCR and r–k deviance residuals, respectively. The initial values are taken as zero: \(C_{PCR,0}^-=C_{PCR,0}^+=0\), \(C_{r-k,0}^-=C_{r-k,0}^+=0\).

The control limits for the CUSUM\(_{PCR}\) and CUSUM\(_{r-k}\) charts are as follows

$$\begin{aligned} CL_{PCR}=\pm h{\hat{\sigma }}^0_{PCR} \end{aligned}$$
(7)
$$\begin{aligned} CL_{r-k}=\pm h{\hat{\sigma }}^0_{r-k} . \end{aligned}$$
(8)

We define control chart statistics for the EWMA chart based on the PCR deviance residuals (EWMA\(_{PCR}\)) and EWMA chart based on the r–k deviance residuals (EWMA\(_{r-k}\)) as

$$\begin{aligned} z_{PCR,i}=\lambda d_{PCR,i}+(1-\lambda )z_{PCR,i-1},\quad i=1,2,\ldots ,n \end{aligned}$$
(9)
$$\begin{aligned} z_{r-k,i}=\lambda d_{r-k,i}+(1-\lambda )z_{r-k,i-1},\quad i=1,2,\ldots ,n \end{aligned}$$
(10)

where the initial values \(z_{PCR,0}\) and \(z_{r-k,0}\) are the in-control mean of the corresponding deviance residual.

The control limits for EWMA\(_{PCR}\) and EWMA\(_{r-k}\) are respectively as

$$\begin{aligned} CL_{PCR}={\hat{\mu }}^0_{PCR} \pm L{\hat{\sigma }}^0_{PCR} \sqrt{(\lambda /(2-\lambda ))\left( 1-(1-\lambda )^{2i}\right) },\quad i=1,2,\ldots ,n \end{aligned}$$
(11)
$$\begin{aligned} CL_{r-k}={\hat{\mu }}^0_{r-k} \pm L{\hat{\sigma }}^0_{r-k} \sqrt{(\lambda /(2-\lambda ))\left( 1-(1-\lambda )^{2i}\right) },\quad i=1,2,\ldots ,n . \end{aligned}$$
(12)
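For concreteness, the CUSUM\(_{PCR}\) recursions and the EWMA\(_{PCR}\) statistic with its limits can be computed in R as sketched below; the r–k versions are identical with \(d_{r-k}\), \({\hat{\mu }}^0_{r-k}\), and \({\hat{\sigma }}^0_{r-k}\) substituted. The constants `K`, `h`, `lambda`, and `L` are assumed to have been chosen to meet the target \(ARL_0\), and the function names are illustrative only.

```r
# One-sided CUSUM statistics C^+ and C^- for deviance residuals d,
# with in-control mean mu0 and reference value K; limits are +/- h * sigma0
cusum_stats <- function(d, mu0, K) {
  n <- length(d)
  Cp <- Cm <- numeric(n)
  for (i in seq_len(n)) {
    Cp_prev <- if (i == 1) 0 else Cp[i - 1]
    Cm_prev <- if (i == 1) 0 else Cm[i - 1]
    Cp[i] <- max(0, d[i] - mu0 - K + Cp_prev)
    Cm[i] <- min(0, mu0 - K - d[i] + Cm_prev)
  }
  list(C_plus = Cp, C_minus = Cm)
}

# EWMA statistic z_i = lambda * d_i + (1 - lambda) * z_{i-1}, with z_0 = mu0,
# and time-varying limits mu0 +/- L * sigma0 * sqrt(lambda/(2-lambda) * (1-(1-lambda)^(2i)))
ewma_stats <- function(d, mu0, sigma0, lambda, L) {
  n <- length(d)
  z <- numeric(n)
  z_prev <- mu0
  for (i in seq_len(n)) {
    z[i] <- lambda * d[i] + (1 - lambda) * z_prev
    z_prev <- z[i]
  }
  half_width <- L * sigma0 * sqrt(lambda / (2 - lambda) * (1 - (1 - lambda)^(2 * seq_len(n))))
  list(z = z, ucl = mu0 + half_width, lcl = mu0 - half_width)
}
```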

Since the combinations of K and h affect the performance of the control charts, the selection of K and h values for the CUSUM chart as well as the \(\lambda \) and L combinations for the EWMA chart is a sensitive task. It is usually recommended to select the values on the basis of the pre-specified in-control average run length value.

The run length (RL) is the number of observations until the first out-of-control observation is identified by the control chart. The average run length (ARL) is a commonly used metric to evaluate the performance of a control chart. This metric is classified as in-control ARL (\(ARL_0\)) and out-of-control ARL (\(ARL_1\)). \(ARL_0\) is expected to be large, while it is desirable for \(ARL_1\) to be small.

4 Simulation study

In this section, we present a simulation study to compare the performance of residual-based control charts under different settings. The simulation study is carried out in R software. The simulation study consists of two stages:

  • Stage 1: In this stage, the goal is to obtain the control chart constants that meet desired in-control \(ARL_0\).

  • Stage 2: The objective of this stage is to evaluate and compare the performance of the control charts.

Each stage is described by the algorithms given in Fig. 1, and details are provided in Sects. 4.1 and 4.2.

Fig. 1 Simulation algorithms for Stage 1 and Stage 2

4.1 Determination of control chart constants

For a fixed number of observations and correlation levels, different combinations of predictor number, dispersion levels, shift type, and shift sizes are considered for the simulation study and their values are given in Table 2.

Table 2 Simulation settings

Detailed information about the simulation settings is as follows:

  (i) The number of observations, dispersion, and correlation levels are fixed as given in Table 2. The \(\beta \) vector is set with elements \(\beta _i=1\), \(i=1,2,\dots ,p\). The target \(ARL_0\) is set to approximately 370.

  (ii) The correlated predictor matrix \(X_{n \times p}\) is generated using the formula presented by McDonald and Galarneau (1975) (a sketch of this step, together with step (vii), is given after the list):

    $$\begin{aligned} x_{ij}=(1-\rho ^2)^{1/2}s_{ij}+\rho s_{i,p+1}, \quad i=1,2,\dots ,n, \quad j=1,2,\dots ,p \end{aligned}$$

    where \(s_{ij}\) are independent standard normal random numbers and \(\rho ^2=0.95\) is the desired correlation between any two predictors. The predictors are then standardized by unit length standardization.

  (iii) The control chart constants are initialized as \(h=0\) for the CUSUM charts and \(L=0\) for the EWMA charts. Montgomery (2020) notes that \(K =0.5\), combined with a corresponding h, generally provides a CUSUM chart with good ARL performance against a \(1\sigma \) shift. Lucas and Saccucci (1990) presented optimal combinations of \(\lambda \) and L that effectively minimize the \(ARL_1\) of the EWMA chart. Montgomery (2020), on the other hand, emphasized that \(0.05\le \lambda \le 0.25\) performs efficiently in practice. Based on these results, we set the reference value for the CUSUM charts as \(K=0.5\) and the smoothing parameter for the EWMA charts as \(\lambda =0.05, 0.1, 0.2\) to see the impact of the smoothing parameter on the overall performance.

  (iv) The COM-Poisson distributed response variable y is generated as \(y \sim COM-Poisson(E(y),v)\), where \(E(y)=\text {exp}(X\beta )\) is the mean function.

  (v) The log-link function is used to model the relationship between the predictor matrix and the response variable.

  (vi) To calculate the ML and ridge parameters of the COM-Poisson regression model, the OLS estimator \({\hat{\beta }}^{(0)}=(X'X)^{-1}X'y\) is chosen as the initial value, while for the PCR and r–k class estimators \({\hat{\beta }}^{(0)}=T_rT_r'(X'X)^{-1}X'y\) is chosen as the initial value. The convergence criterion for the iteration is \(\Vert \hat{\beta }^{(t)}-{\hat{\beta }}^{(t-1)}\Vert \le 1\times 10^{-6}\).

  (vii) The number of PCs is determined by the percentage of total variation (PTV) criterion, defined by Jolliffe (2002) as \(PTV=\left( \frac{\sum _{i=1}^{r}{\hat{\zeta }}_{i}}{\sum _{i=1}^{p}{\hat{\zeta }}_{i}} \right) \times 100\), where \({\hat{\zeta }}_i\), \(i=1,\dots ,p\), are the eigenvalues of the \(X'WX\) matrix. This criterion selects the smallest number of PCs, taken in order of decreasing variance, for which the chosen percentage, in our case \(95\%\), is exceeded (see the sketch after this list).

  (viii) The tuning parameter given in Eq. (5) is used in obtaining the ridge and r–k class estimators.

  (ix) The deviance residuals are calculated as in Eq. (6). In obtaining the deviance residuals for the saturated model, we set the normalization constant Z equal to one when \(v<1\) and \(y_i=0\), \(i=1,2,\dots , n\), as suggested by Sellers and Shmueli (2010).

  (x) We conduct the simulation study to calculate the control chart statistics, then compare each control chart statistic to the control limit of the corresponding chart and obtain the RL.

  (xi) We reset the counter to zero and repeat steps (iv)–(x) 100 times.

  (xii) We calculate the average of the RLs, which is \(ARL_0\). If the desired \(ARL_0\) is not obtained, we increase the control chart constants (h for the CUSUM charts, L for the EWMA charts) by 0.001 and repeat steps (iv)–(xi) until the desired \(ARL_0 \approx 370\) is approximately attained.
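The following R sketch illustrates steps (ii) and (vii): generating the correlated predictors of McDonald and Galarneau (1975) and selecting the number of PCs by the PTV criterion. The weight matrix `W_hat` is assumed to be the estimated ML weight matrix (an identity placeholder is used here), and the seed and dimensions are illustrative.

```r
set.seed(1)                      # illustrative seed
n <- 100; p <- 4; rho2 <- 0.95   # settings as in Table 2 (rho^2 = 0.95)

# Step (ii): x_ij = sqrt(1 - rho^2) * s_ij + rho * s_{i,p+1}, rho = sqrt(rho2)
S <- matrix(rnorm(n * (p + 1)), n, p + 1)
X <- sqrt(1 - rho2) * S[, 1:p] + sqrt(rho2) * S[, p + 1]
X <- apply(X, 2, function(x) x / sqrt(sum(x^2)))   # unit length standardization

# Step (vii): number of PCs by the PTV criterion on the eigenvalues of X' W X
W_hat <- diag(n)                                   # placeholder for the ML weight matrix
eig <- eigen(t(X) %*% W_hat %*% X, symmetric = TRUE)
ptv <- cumsum(eig$values) / sum(eig$values) * 100
r <- which(ptv >= 95)[1]                           # smallest r exceeding 95%
T_r <- eig$vectors[, 1:r, drop = FALSE]            # retained eigenvectors
```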

Stage 1 results of the simulation study are given in Table 3. Table 3 presents the control chart constants for the CUSUM and EWMA charts corresponding to the specified values of n, p, \(\textit{v}\), K, \(\lambda \), and \(ARL_0\), together with the corresponding in-control means and in-control standard deviations that serve to compute the control limits. The control limits used in Stage 2 are then calculated from the constants in Table 3.

Table 3 CUSUM and EWMA chart constants and the corresponding \(ARL_0\) values, in-control mean and in-control standard deviations

4.2 Performance analysis

In Stage 2, the control charts are tested by introducing a pre-specified shift into the process mean; the two types of shift given in Table 2 are used in the simulation. The additive shift is formulated as \(\mu _1=\mu _0+\delta {\hat{\sigma }}^0_{ML}=\text {exp}(X\beta )+\delta {\hat{\sigma }}^0_{ML}\) and the multiplicative shift has the form \(\mu _1=\text {exp}(X(\beta +\delta {\hat{\sigma }}^0_{ML}))\), where \({\hat{\sigma }}^0_{ML}\) is the in-control standard deviation of the deviance residuals.
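A brief R sketch of how the two shift types could be injected is shown below; `sigma0_ML` denotes \({\hat{\sigma }}^0_{ML}\), `delta` the shift size, and the function name is illustrative.

```r
# Additive shift:        mu1 = exp(X beta) + delta * sigma0_ML
# Multiplicative shift:  mu1 = exp(X (beta + delta * sigma0_ML))
shifted_mean <- function(X, beta, delta, sigma0_ML,
                         type = c("additive", "multiplicative")) {
  type <- match.arg(type)
  if (type == "additive") {
    exp(X %*% beta) + delta * sigma0_ML
  } else {
    exp(X %*% (beta + delta * sigma0_ML))
  }
}
```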

The performance of the control charts is evaluated based on the RL. In Stage 2, we use \(ARL_1\) and standard deviation of run-length (SDRL) to assess the performance of control charts. The chart with the lowest \(ARL_1\) and SDRL values is considered as the best.

Stage 2 results of the simulation study are presented in Appendix as Tables 14, 15, 16, and 17, and the key findings are given in the following paragraphs based on the shift type for each control chart:

4.2.1 Results for CUSUM charts

  • \({\underline{Additive\, \,shift, \,\,in \,\,terms \,\,of \,\,ARL_1\,\,}}\)

    \(\circ \):

    In all combinations, the CUSUM\(_{r-k}\) chart outperformed the other control charts. For most of the combinations, CUSUM\(_{ML}\) shows the worst performance. The \(ARL_1\) values of the CUSUM\(_{PCR}\) chart follow those of either the CUSUM\(_{r-k}\) or the CUSUM\(_{ridge}\) chart. The CUSUM\(_{PCR}\) chart has the highest \(ARL_1\) values, especially when \(p=7, 10\) and \(\delta \ge 2\) for overdispersed responses. CUSUM\(_{ridge}\) has performance similar to CUSUM\(_{PCR}\); it mostly outperforms CUSUM\(_{ML}\) except for a few combinations when \(p=4,7\) and \(v=1, 1.5\).

    \(\circ \):

    At fixed shift size, as the number of predictors increases, the performance of all CUSUM charts improves for \(v<1\). For fixed \(\delta \) and p, \(ARL_1\) values decrease as dispersion level increases. The performance of the CUSUM charts increases as the shift size increases in all combinations of \(p=4,7\). When \(p=4\), an analogous increase in performance is valid for \(\delta \ge 1\).

    \(\circ \):

    An increase in p has a positive effect on performance by reducing the \(ARL_1\) values. Also, there is an increase in the performance of CUSUM charts when \(p=7\) for all \(\delta \) and when \(p=4, 10\) for \(\delta \ge 0.5\).

  • \({\underline{Additive \,\,shift, \,\,in \,\,terms \,\,of \,\,SDRL}}\)

    \(\circ \):

    The SDRL results of the control charts exhibit an essentially similar pattern to those of \(ARL_1\). For fixed p and \(\delta \), the SDRL values of all CUSUM charts decrease as the value of v increases, and for fixed p and v, the SDRL values of all CUSUM charts increase as the shift size increases. CUSUM\(_{r-k}\) outperforms other CUSUM charts in most combinations. In rare cases, CUSUM\(_{r-k}\) is exceeded by CUSUM\(_{ridge}\) or CUSUM\(_{PCR}\) charts.

  • \({\underline{Multiplicative\,\, shift, \,\,in \,\,terms \,\,of \,\,ARL_1\,}}\)

    \(\circ \):

    With very little variation in the \(ARL_1\) values, the performances of CUSUM charts are similar to the results when the response is additively shifted. CUSUM\(_{r-k}\) outperforms other CUSUM charts in all cases. CUSUM\(_{ridge}\) and CUSUM\(_{PCR}\) charts also show similar performance. The exceptions are \(p=7\) and \(v=1.5\) for CUSUM\(_{ridge}\) and \(p=4, 7\), and \(v=0.75\) for CUSUM\(_{PCR}\). In these combinations, \(ARL_1\) values of the control charts are the highest.

    \(\circ \):

    The increased dispersion level results in increased \(ARL_1\) values across all CUSUM charts regardless of the number of predictors and shift size.

    \(\circ \):

    Regardless of the dispersion level, the performance of the CUSUM charts increases as the number of predictors increases. For \(v=0.75\), with very minor changes, the performance of the CUSUM charts increases as the shift size increases, except for \(p=7\).

  • \({\underline{Multiplicative \,\,\,shift, \,\,in \,\,terms \,\,of \,\,SDRL}}\)

    \(\circ \):

    Like the \(ARL_1\) results, the CUSUM\(_{r-k}\) chart outperforms at least one of the CUSUM charts in terms of SDRL and is often followed by either CUSUM\(_{ridge}\) or CUSUM\(_{PCR}\). Both CUSUM\(_{ridge}\) and CUSUM\(_{PCR}\) often have smaller SDRL values compared to CUSUM\(_{ML}\). The SDRL values of the CUSUM charts increase as v increases. In general, the SDRL performance of the control charts tends to improve as the number of predictors increases. This is particularly evident when the response is underdispersed.

4.2.2 Results for EWMA charts

  • \({\underline{Additive \,\,shift, \,\,in \,\,terms \,\,of \,\,ARL_1}}\)

    \(\circ \):

    In the sense of \(ARL_1\), EWMA\(_{r-k}\) surpasses the other EWMA charts regardless of the smoothing parameter. In some combinations of \(v \ge 1\) with \(p=4,7\), EWMA\(_{r-k}\) (\(\lambda =0.05\)) is outperformed by the other EWMA charts. On the contrary, EWMA\(_{r-k}\) (\(\lambda =0.1\)) and EWMA\(_{r-k}\) (\(\lambda =0.2\)) are the best or second-best control charts. In cases where either EWMA\(_{r-k}\) (\(\lambda =0.1\)) or EWMA\(_{r-k}\) (\(\lambda =0.2\)) is the second best, EWMA\(_{PCR}\) is outperformed by all EWMA-type control charts. EWMA\(_{ridge}\) typically follows EWMA\(_{r-k}\) except for \(\lambda =0.1, p=10\) and \(\lambda =0.2, p=4,7\), both when the dispersion level is greater than or equal to one.

    \(\circ \):

    For cases where the response is overdispersed, the performances of EWMA charts increase as the shift size increases, except for \(p=7\). Apart from EWMA\(_{r-k}\), \(ARL_1\) values increase as the smoothing parameter for the EWMA chart gets larger, regardless of p, v, and \(\delta \).

    \(\circ \):

    While EWMA(\(\lambda =0.05\)) charts perform better for \(p=4\) with a significantly small shift size, EWMA charts with \(p=10\) perform much better in remaining combinations for \(v=1\) in terms of \(ARL_1\). An increase in the performance of EWMA charts in terms of \(ARL_1\) is especially noticeable when \(p=10\). An increase in the number of predictors increases the \(ARL_1\) performance of the EWMA charts when \(\delta \le 2\).

  • \({\underline{Additive \,\,shift, \,\,in \,\,terms \,\,of \,\,SDRL}}\)

    \(\circ \):

    In terms of SDRL, EWMA\(_{r-k}(\lambda =0.05)\) commonly surpasses the other EWMA charts, whereas EWMA\(_{ML}\) has the highest values compared to the others. The performance of EWMA\(_{r-k}\) is followed by that of the EWMA\(_{ridge}\) or EWMA\(_{PCR}\) chart. EWMA\(_{ML}\) attains the lowest SDRL values in more cases for \(\lambda =0.1\) than for \(\lambda =0.05, 0.2\). Based on the SDRL values, the performance of the control charts decreases as the dispersion increases and increases as the shift size increases in fixed \(\delta \) and \(p=10\) combinations, regardless of the smoothing parameter. With an increasing number of predictors and \(\delta \ge 0.75\), the SDRL values of the EWMA (\(\lambda =0.05\)) charts become smaller. When the response is overdispersed and \(\lambda =0.1, 0.2\), the performance of the EWMA charts improves as p increases. These results are inconsistent when \(v\le 1\).

  • \({\underline{Multiplicative \,\,shift, \,\,in \,\,terms \,\,of \,\,ARL_1}}\)

    \(\circ \):

    Even though the performance of the EWMA charts does not follow a clear pattern as the smoothing parameter increases, EWMA charts with \(\lambda =0.2\) often have better results than the corresponding charts with \(\lambda =0.05, 0.1\). However, when \(\lambda =0.1\), the number of cases where EWMA\(_{ML}\) outperforms the other charts is high. In some combinations with \(\lambda =0.2\), the EWMA\(_{ridge}\) and EWMA\(_{PCR}\) charts have the same performance. When \(\lambda =0.05, 0.1\), EWMA\(_{r-k}\) is either the best or the second-best control chart in terms of \(ARL_1\). This control chart has the highest values when \(\lambda =0.2\), \(v=1.5\), \(p=4,7\). EWMA\(_{ridge}(\lambda =0.05)\) is frequently positioned second or third, except where \(p=10\), \(v=0.75\), and \(\delta \le 0.5\). EWMA\(_{PCR}\) (\(\lambda =0.05\)), on the other hand, has high \(ARL_1\) values when \(p=7\) and the response is overdispersed, and when \(p = 7, 10\) and the response is underdispersed. Regardless of the smoothing parameter, EWMA\(_{ML}\) is the control chart with the worst performance.

    \(\circ \):

    For the EWMA charts with \(\lambda =0.05\), \(ARL_1\) values grow as v increases when \(p=4,10\). When \(\lambda =0.1, 0.2\), this pattern is observed in all combinations regardless of p.

    \(\circ \):

    Regardless of the v value, the performance of the EWMA charts with \(\lambda =0.1, 0.2\) increases as the number of predictors increases. Moreover, the performance of the EWMA charts increases as shift size increases. When \(p=10\), the \(ARL_1\) values of all EWMA charts get larger as the dispersion level goes from 0.75 to 1.5.

  • \({\underline{Multiplicative \,\,shift, \,\,in \,\,terms \,\,of \,\,SDRL}}\)

    \(\circ \):

    While in several combinations of the simulation inputs either EWMA\(_{r-k}\) (\(\lambda = 0.05\)) or EWMA\(_{r-k}\) (\(\lambda =0.1\)) has high values, it outperforms at least two EWMA charts in terms of SDRL. Except for a few combinations when \(p=10\), EWMA\(_{r-k}\) outperforms the other EWMA charts when \(\lambda =0.2\). In contrast, EWMA\(_{ML}\)(\(\lambda =0.05\)) and EWMA\(_{ML}\)(\(\lambda =0.2\)) provide the poorest outcomes in the majority of combinations. EWMA\(_{PCR}\)(\(\lambda =0.2\)) outperforms at least one EWMA chart in terms of SDRL. For fixed v and \(\delta \), the SDRL values of the EWMA charts decrease as the number of predictors increases. An increase in shift size also improves the SDRL performance. This change has a greater influence on the outcomes, especially when \(\lambda =0.2\) and \(p=10\). An increasing value of v causes an increase in the SDRL values for \(p = 10\).

The control charts with the best performance in terms of \(ARL_1\) and SDRL are presented in Table 4. Table 4 can be used as a reference to determine the best control chart for each scenario considered in the simulation.

Table 4 The control charts with the best performance based on the simulation results

5 Real life application

In this section, the proposed method for monitoring the COM-Poisson profile is illustrated via a case study by analyzing the SECOM data set. This data set is obtained from a semiconductor manufacturing process and is available on the UCI machine learning repository. The data set was provided by McCann and Johnston (2008).

5.1 Data information

A modern semiconductor manufacturing process is equipped with advanced technology. The monitoring of the process is carried out on a continuous basis through the collection of signals/variables from sensors and/or process measurement points. Each type of signal can be considered a feature that can be used to improve the quality of the semiconductors.

The data set presented in this section is generated from a similar process. After being measured by the 590 sensors, each of the 1567 instances of the production line was subjected to a Pass/Fail test. The test result associated with a specific date-time stamp is either \(-1\) or 1, where \(-1\) stands for pass and 1 for failure. Missing values are denoted by ‘NaN’. The final data set is a \(1567 \times 591\) matrix which consists of 590 features and a class label. Moldovan et al. (2017), Moldovan et al. (2018), and Kim et al. (2017) used the data set to illustrate different machine learning algorithms for classification problems. Cao et al. (2020) applied a mixture of two refinement methods to detect quality issues in the data set. Using the data set, Takahashi et al. (2019) demonstrated a simplified machine learning implementation to reduce the time spent on failure prediction in manufacturing processes. Kwon and Kim (2020) adapted iterative feature selection for failure prediction and used the data set to evaluate the performance of the proposed method.

Inspired by the aforementioned studies, we designed the analysis of the SECOM data set with deviance residual-based control charts. Our purpose is to determine whether the CUSUM and EWMA charts based on PCR and r–k deviance residuals outperform the CUSUM and EWMA charts based on ML and ridge deviance residuals. The analysis of the data is conducted in R by using the "COMPoissonReg" and "SPC" packages.

5.2 Data reconstruction

In this subsection, we rearrange the data to make it usable for our purposes, following the steps described below. The original data, with a binary response variable, were collected on certain days and the time of each data point was recorded, so steps 3–5 are used to convert them to count data.

  1. Null values are replaced with the previous value, which results in a \(1567\times 590\) X matrix;

  2. All categorical predictors are removed and only continuous predictors are kept. Then, the predictor matrix X is \(1567\times 444\);

  3. The 24-h period was divided into 8 groups, each called a time frame. The time frame variable is created based on the time stamp, and the corresponding time intervals are \(1 \rightarrow 01:00-03:59\), \(2 \rightarrow 04:00-06:59\), \(3 \rightarrow 07:00-09:59\), \(4 \rightarrow 10:00-12:59\), \(5 \rightarrow 13:00-15:59\), \(6 \rightarrow 16:00-18:59\), \(7 \rightarrow 19:00-21:59\), \(8 \rightarrow 22:00-01:59\) (next day);

  4. The date and the time frame are combined to create the “date and time frame (DTF)”, e.g. 19-7-2008-4 stands for date 19-7-2008 and time frame 4;

  5. The number of passes in each DTF is summed to create the response. The new predictor variable values are obtained by taking the average of each predictor variable within the corresponding DTF. In this way, a new data set based on DTF is created; the resulting matrix X is \(476 \times 444\) dimensional (a sketch of steps 3–5 is given after this list);

  6. Within the data set arranged in steps 1–5, features/variables whose absolute pairwise correlations lie between 0.948 and 0.952 were selected to be included in the analysis. Then, we get the predictor matrix X of dimension \(476\times 8\), where the predictors in the model are the values recorded by the sensors numbered 93, 95, 98, 103, 116, 198, 470, and 523. The pairwise correlation matrix is then obtained as seen in Table 5.
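As an illustration of steps 3–5, the time-frame construction and DTF aggregation could be carried out in R roughly as below. The object `secom`, its `timestamp` column, its recoded `pass` indicator (1 for pass, 0 for fail), and the vector of sensor column names `sensor_cols` are assumptions made for the sketch, not the actual variable names of the SECOM files.

```r
# Step 3: assign each record to one of 8 three-hour time frames
hour_of_day <- as.integer(format(secom$timestamp, "%H"))
secom$frame <- ifelse(hour_of_day %in% c(22, 23, 0), 8, (hour_of_day + 2) %/% 3)

# Step 4: date and time frame label (DTF), e.g. "19-7-2008-4"
secom$dtf <- paste(format(secom$timestamp, "%d-%m-%Y"), secom$frame, sep = "-")

# Step 5: count the passes per DTF and average each sensor within a DTF
y <- tapply(secom$pass, secom$dtf, sum)                               # count response
X <- aggregate(secom[, sensor_cols], by = list(dtf = secom$dtf), FUN = mean)
```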

Table 5 The correlation matrix

5.3 Data processing

In this subsection, we first determine the distribution of the data set created in Sect. 5.2, which consists of eight predictors and 476 samples. Since Kim et al. (2017) mentioned that the data distribution is very irregular in each feature, we scaled the X matrix prior to modelling the data set. Then, the ML estimator of the regression coefficients is obtained, which is then used for multicollinearity diagnostics, tuning parameter selection, and selection of the number of PCs.

Three count data models, the COM-Poisson, Negative Binomial, and Poisson, are compared to determine the best-fitting distribution for the response variable using the log-likelihood, Akaike information criterion (AIC), and Bayesian information criterion (BIC). The log-likelihood, AIC, and BIC computed for the COM-Poisson, Negative Binomial, and Poisson models are given in Table 6. The COM-Poisson distribution is recognized as the best-fitting distribution due to its high log-likelihood and low AIC and BIC values. The estimated dispersion for the response variable is \(v = 0.2019\), which indicates overdispersion.

Table 6 Diagnostic analysis of best-fitted distribution for response variable

After it is seen that the response variable follows the COM-Poisson distribution, we estimate the parameters of the model. The ML estimator is first obtained using the OLS estimator as the initial value, with \(\Vert \hat{\beta }^{(t)}-{\hat{\beta }}^{(t-1)}\Vert \le 1\times 10^{-6} \) as the convergence criterion.

The scaled information matrix is examined for multicollinearity. We calculated the eigenvalues of the scaled information matrix as 64.271407, 56.797600, 54.648692, 48.305966, 1.584542, 1.432938, 1.418611, 1.326779 and the variance inflation factor (VIF) values as 28.49139, 28.50746, 29.49110, 29.49507, 28.37309, 28.51614, 28.56399, 28.34829. Since all VIF values are larger than 10, all the predictors are involved in multicollinearity, supporting step 6 in Sect. 5.2.
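A possible R sketch of these diagnostics is given below, where `W_hat` denotes the estimated weight matrix at the ML fit; the exact scaling of the information matrix used in the paper may differ, so this is only an illustration of the computation rather than a reproduction of the reported values.

```r
# Diagnostics from the information matrix X' W X
info <- t(X) %*% W_hat %*% X

# Eigenvalues: a large spread (large vs. very small values) indicates collinearity
eigen(info, symmetric = TRUE)$values

# VIFs from the correlation-form information matrix; values > 10 signal multicollinearity
D <- diag(1 / sqrt(diag(info)))
vif <- diag(solve(D %*% info %*% D))
```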

The tuning parameter for the ridge and r–k class estimators is obtained as \(k=0.001536528\) by using Eq. (5). The number of PCs explaining approximately 99\(\%\) of the overall variance is obtained as 7. Then, the ML, ridge, PCR, and r–k deviance residuals are calculated as in Eq. (6).

To see how the estimation methods affect the residuals, the deviance residuals are compared with each other. The absolute values of the pairwise differences between the residuals were computed, and the minimum and maximum of these values, together with the ten largest values, are tabulated in Table 7. Table 7 shows how the different estimation methods affect the residuals and for which observations the effect is largest.

Table 7 Comparison of deviance residuals calculated from the models with ML, ridge, PCR, and r–k class estimators

Figure 2 shows that the closest residual values are between \(d_{PCR}\) and \(d_{r-k}\), followed by \(d_{ML}\) and \(d_{ridge}\); the remaining pairwise differences are clearly larger than these two. The widest range is between \(d_{ML}\) and \(d_{r-k}\).

Fig. 2 Difference intervals for deviance residuals

5.4 Monitoring process: performed for CUSUM and EWMA separately

Having seen in Sect. 5.3 that the SECOM data set follows a COM-Poisson distribution with correlated predictors and that the residuals differ when different estimation methods are used, we turn to profile monitoring.

Due to the lack of prior information on the status of the data set, we split the data set into two sets: a training set and a test set. The first \(25\%\) of the data set (the first 119 observations) forms the training set and is used to create the data set of the in-control state, while the last \(75\%\) (the last 357 observations) forms the test set, which is used to examine the performance of the control charts.

After splitting the data into training and test sets, we use the training set to construct the control limits from the in-control data and the test set for the analysis. The steps followed for each set are as follows.

  1. Creating in-control data from the training set

     (a) Utilizing built-in functions of the "SPC" package with the mean and standard deviation of the deviance residuals of the training set, we obtained the control limits that met the \(ARL_0 \approx 200\) criterion;

     (b) We calculated the CUSUM chart statistics with \(K=0.5\) and the EWMA chart statistics with smoothing parameters \(\lambda =0.05, 0.1, 0.2\) for the ML, ridge, PCR, and r–k deviance-based control charts;

     (c) We compared the control chart statistics with the control limits of each control chart. The results are given in Tables 8 and 9 for the CUSUM and EWMA charts, respectively. Table 8 shows that 80 observations exceeded the upper control limit of the CUSUM\(_{ML}\) and CUSUM\(_{ridge}\) charts, while 79 observations exceeded that of the CUSUM\(_{PCR}\) and CUSUM\(_{r-k}\) charts. On the other hand, Table 9 shows that the EWMA (\(\lambda =0.05\)) charts detected two observations (the 2nd and 3rd) while the EWMA (\(\lambda =0.1\)) charts detected only one observation (the 2nd) as out-of-control. EWMA\(_{ML}\) and EWMA\(_{ridge}\) detected one observation (the 2nd), whereas EWMA\(_{PCR}(\lambda =0.2)\) and EWMA\(_{r-k}(\lambda =0.2)\) detected two observations (the 2nd and 88th);

     (d) We pooled the out-of-control observations detected by the ML, ridge, PCR, and r–k deviance-based control charts and removed the distinct ones to create the in-control training set. 80 observations were eliminated for monitoring with the CUSUM charts and three observations for the EWMA charts;

     (e) By obtaining the control limits meeting \(ARL_0=200\), we calculated the mean and standard deviation of each in-control training set; the results are given in Tables 10 and 11.

  2. Analysis of the test set

     (a) We calculated the control chart statistics for each chart by using the test data;

     (b) We compared the control limits obtained from the training set with the control chart statistics obtained in (a);

     (c) We analyzed the performance of the control charts in terms of \(ARL_1\), where the \(ARL_1\) values are obtained by using the appropriate functions of the "SPC" package; the results are presented in Table 12.

Table 8 Control limits and the number of out-of-control observations identified by each CUSUM chart
Table 9 Control limit constants and the number of out-of-control observations identified by each EWMA chart
Table 10 CUSUM chart limits, in-control mean and standard deviation of training set
Table 11 EWMA chart limit constants, in-control mean and standard deviation of training set
Table 12 \(ARL_1\) results of CUSUM and EWMA charts

The process steps from Sects. 5.3 to 5.4 can be summarized with the flowchart given in Fig. 3.

Fig. 3 Analyzing the SECOM data set

5.4.1 Performance evaluation of the control charts

To see the direction of the shift, the differences between the mean of the residuals in the in-control training set and the mean of the residuals in the test set are computed and the results are given in Table 13. The mean residual of the test set decreased compared with the mean of the residuals in the in-control training set. According to the results in Table 13, a negative shift has occurred. The negative change is particularly evident in Fig. 4, since the control chart statistics are positioned below the zero line, and in Figs. 5, 6 and 7, since they show a downward trend.

Table 13 Mean change \({\hat{\mu }}^1_{est}-{\hat{\mu }}^0_{est}\)

Table 12 shows that both CUSUM\(_{r-k}\) and EWMA\(_{r-k}\) are the best in comparison to the corresponding control charts based on the ML, ridge, and PCR deviance residuals in terms of \(ARL_1\). The CUSUM\(_{PCR}\) and EWMA\(_{PCR}\) charts follow CUSUM\(_{r-k}\) and EWMA\(_{r-k}\), respectively and the CUSUM\(_{ML}\) and EWMA\(_{ML}\) charts have the highest \(ARL_1\) values.

The visual analysis of the test set is conducted by plotting the CUSUM and EWMA chart statistics against the observation number for the test set, together with the control limits calculated from Tables 10 and 11; the outcomes are shown in Figs. 4, 5, 6, and 7. Figures 5, 6 and 7 show the effect of the smoothing parameter on the EWMA chart.

Fig. 4 Analysis of the test set with CUSUM charts

Fig. 5 Analysis of the test set with EWMA charts with \(\lambda =0.05\)

Fig. 6 Analysis of the test set with EWMA charts with \(\lambda =0.1\)

Fig. 7 Analysis of the test set with EWMA charts with \(\lambda =0.2\)

6 Conclusions

The traditional way to monitor count data in statistical process control is to use approaches designed for the Poisson distribution. However, the Poisson distribution is appropriate only in the case of equidispersion, and real-life data sets may be underdispersed or overdispersed. If indications of under- or overdispersion are present in the data set, it is more appropriate to use the COM-Poisson distribution.

This study presents CUSUM\(_{PCR}\), CUSUM\(_{r-k}\), EWMA\(_{PCR}\), and EWMA\(_{r-k}\) control charts based on deviance residuals to detect out-of-control observations in the COM-Poisson profile, addressing the issue of multicollinearity in profile monitoring. The proposed control charts are compared to the CUSUM and EWMA control charts based on the deviance and ridge deviance residuals through a simulation study and a real-life data set analysis, and their performances are evaluated in terms of the \(ARL_1\) and SDRL statistics.

Results showed that the r–k deviance-based charts mainly outperform the other control charts, while the deviance-based charts show the worst results in most combinations in the case of multicollinearity. Moreover, the CUSUM charts show better results compared to the EWMA charts. It is to be noted that the choice of the smoothing parameter affects the performance of the EWMA charts. Both types of charts are more effective in detecting additive shifts than multiplicative shifts in the majority of the combinations considered in the simulation. Furthermore, for a fixed dispersion value and shift size, the changes in the result values are clearer and more consistent in cases with a high number of predictors. The results are generally inconsistent when the response is underdispersed. In summary, in the case of multicollinearity, the CUSUM\(_{r-k}\) and EWMA\(_{r-k}\) charts based on deviance residuals perform better than the alternative CUSUM and EWMA charts in determining small shifts in the process. The best-performing chart varies according to parameters such as the number of predictors, the dispersion level, and the shift size.