Introduction

In the first part of this series, Quesenberry’s Q control charts were introduced and an initial discussion of their performance in analytical chemistry was presented [1]. The discussion focused on situations where the raw data were in a state of statistical control and on situations with initial outliers. Control charts are one of the most important tools associated with quality control activities for testing and calibration laboratories as described in ISO/IEC 17025 [2]. Classical control charts have received much attention in analytical laboratories [3,4,5]. However, these classical charts have a start-up problem: a previous study or training phase is necessary to define the central line and the control limits before the charts can be run in the control phase.

When Quesenberry proposed his Q charts for individual measurements for short and long runs [6,7,8], their performance was studied based on numerical simulations. Only one variant or case of these charts was mentioned before in the field of analytical chemistry [9, 10]. However, it was recognized that the possibilities have not yet been widely tested in analytical practice [10].

The purpose of this series of publications is to experimentally test Quesenberry’s Q charts for individual measurements, to demonstrate their capabilities, and to discuss their applicability in various circumstances encountered in analytical laboratories. These charts could be particularly useful when the laboratory has no historical data on performance of the required analytical process (i.e., the needed analytical process is not previously validated or verified) and/or CRM (or an appropriate surrogate) is not available in the laboratory.

The main objectives of this second part of the series of publications on Q charts are to test their performance in: (1) the presence of persistent systematic error and (2) the presence of simultaneous small drifts of process mean and process variance. Their performance is compared under the same conditions with the classical Shewhart charts for individual measurements and the moving range of two.

The basics of Q charts for process mean and process variance and their algorithms are presented in the first part of this series [1]. Some details on the application of this special kind of Shewhart control chart were given in the same publication.

Materials and methods

Certified reference materials, analytical procedure, and measurement conditions

Data from the determination of the mass fraction of Co and SiO2 by ICP-OES, in the serpentinite SNi and laterite L1 CRMs (Certified Reference Materials), respectively, were used. Both CRMs were developed through an interstate project of COMECON (Council of Mutual Economic Assistance) with the participation of more than 30 laboratories from Cuba and countries of Western and Eastern Europe [11, 12]. Updates of the CRMs were issued recently. The validity periods have been extended until May 2030. The certified values for Co and SiO2 are (0.150 ± 0.005) mg/g and (28.00 ± 0.22) mg/g, respectively, where the numbers following the symbol ± are the uncertainty of the reference value and not a confidence interval.

Details on the analytical procedure were published in the first part [1]. The mass fractions were evaluated by ICP-OES (inductively coupled plasma optical emission spectrometry), and the spectral lines (in nm) were Co II 228.615 and Si I 251.611. Data for quality control were obtained under [time + operator]-different intermediate precision conditions of measurement [13], over several weeks.

Preliminary statistical analysis of raw data

A preliminary statistical analysis was applied to validate the raw data before running the charts in the control phase. The raw data of the selected elements were statistically analyzed with Microsoft Excel and Statgraphics Centurion XV, version 15.2.05 [14]. In this research, Q values were calculated with Microsoft Excel and imported into Statgraphics Centurion XV. The statistical tests and control charts prepared with Excel were validated with the statistical application. For both elements, the first 25 values were used for data validation, to determine whether they were obtained in a state of statistical control or whether any characteristic or trend of interest was manifested in them. Initially, scatter plots, box and whisker plots, and Grubbs’ and Dixon’s tests were applied. To determine whether the parent population could be considered normal, normal probability plots, the Shapiro–Wilk test, and the Kolmogorov–Smirnov and Anderson–Darling goodness-of-fit tests were also applied. To search for evidence against the null hypotheses of absence of systematic deviations in the means and variances of the processes, Student's t and Fisher's F tests were used. Additionally, autocorrelation and randomness were studied. The confidence intervals for the autocorrelation and partial autocorrelation coefficients among values of the variable at different lags were evaluated. If all the coefficients lie within their 95% confidence intervals, the null hypothesis of no autocorrelation is accepted. Randomness was examined using three tests, each based on a different principle: the number of points above and below the median test, the number of runs up and down test, and the Box-Pierce test. In the two cases discussed in this report, if a value raised an alarm in the charts, it was not replaced by a new one. This was done in order to compare the responses and performances of the Q charts with those of the charts for individual measurements and the moving range of two. All statistical tests were performed at a significance level of \(\alpha = 0.05\). Some further details can be found elsewhere [1]. The mass fraction values presented here are not necessarily laboratory outcomes.
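The autocorrelation check described above can be illustrated with a minimal sketch. It assumes the usual sample lag-1 autocorrelation coefficient and the large-sample approximation ±1.96/√n for the 95% limits; the statistical package used in this work may compute the limits differently, and the function names are hypothetical.

```python
import math

def lag1_autocorrelation(values):
    """Sample autocorrelation coefficient at lag 1."""
    n = len(values)
    mean = sum(values) / n
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in values)
    return num / den

def approx_95_bound(n):
    """Approximate 95% limit for a coefficient under no autocorrelation
    (large-sample approximation, an assumption of this sketch)."""
    return 1.96 / math.sqrt(n)

# Usage idea: with n = 25 the approximate limit is about 0.39, so a lag-1
# coefficient slightly above that value would be read as slight autocorrelation.
```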

Application of control charts

Details on the application of the three kinds of Shewhart control charts were presented in the first part of this series. They were run in the control phase, with the zones defined beforehand (as shown in Figs. 1 and 2) using the CRMs and the raw data validation. The tests for special causes (or decision rules) used throughout this work were a selection of Nelson’s recommendations [15], as detailed in the first part of this series [1].

Fig. 1

Control chart for individual measurements for the first 25 results for the determination of the mass fraction of Co in the serpentinite SNi CRM by means of ICP-OES. The presence of systematic error is evident from the results of preliminary statistical analysis of raw data and the pattern of this chart

Fig. 2

Control chart of the moving range of two for the first 25 results for the determination of the mass fraction of Co in the serpentinite SNi CRM. The chart has no alarm signal

The charts for individual measurements and the moving range of two were used as the basis for the study of Q charts. In the presence of alarms, signaled values were neither rejected nor replaced. This procedure makes it possible to study the performance of the Q charts in different circumstances and patterns of the points displayed. The development of Q charts and the respective use of available information from CRM and data validation are described in the first part of this series.

EWMA charts are complementary to Shewhart charts because they are somewhat more effective in detecting small and moderate shifts in the parameters [16]. Some details concerning them were pointed out in [1]. This type of chart is well established in the scientific literature [17, 18]. The constants used for the EWMA charts associated with Q charts were \(\lambda = 0.25\) and \(K = 2.90\), which give action limits at ± 1.096.
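The stated action limits follow from the asymptotic EWMA standard deviation for standard normal input, \(K\sqrt{\lambda/(2-\lambda)}\). The sketch below reproduces that arithmetic and the EWMA recursion; the starting value of 0 (the center line of a Q chart) and the function names are assumptions of this illustration, not details taken from the original procedure.

```python
import math

def ewma_limits(lam=0.25, k=2.90):
    """Asymptotic EWMA action limits for standard normal input (e.g. Q values)."""
    half_width = k * math.sqrt(lam / (2.0 - lam))
    return -half_width, half_width

def ewma(values, lam=0.25, start=0.0):
    """EWMA sequence z_i = lam * x_i + (1 - lam) * z_{i-1}, with z_0 = start."""
    z = start
    out = []
    for x in values:
        z = lam * x + (1.0 - lam) * z
        out.append(z)
    return out

def ewma_alarms(values, lam=0.25, k=2.90):
    """1-based positions where the EWMA falls outside the action limits."""
    lower, upper = ewma_limits(lam, k)
    return [i for i, z in enumerate(ewma(values, lam), start=1)
            if z < lower or z > upper]

print(round(ewma_limits()[1], 3))  # 1.096
```

With a constant input of 2.0 (a hypothetical sustained shift of two standard deviations), the first alarm in this sketch appears at the third point.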

Results and discussion

Raw data with persistent systematic error

To study the performance of Q charts in the presence of a persistent systematic error, 25 values from quality control for the determination of the mass fraction of Co in the serpentinite SNi CRM were used. In the initial statistical analysis of the data for Co, a Student's t test rejected the null hypothesis of absence of systematic error; the alternative hypothesis of its presence was therefore accepted. A Fisher's F test did not find evidence to reject the null hypothesis of equality of the variances across the set of 25 values. It was concluded that the values were obtained with a presumably constant dispersion, but with a persistent systematic error of around 5.7%. In addition, the normal probability plot suggested that the parent population was normal. The Shapiro–Wilk normality test failed to reject the null hypothesis of normality. Likewise, neither the Kolmogorov–Smirnov nor the Anderson–Darling goodness-of-fit test allowed normality to be rejected. The three randomness tests did not allow rejection of the null hypothesis of independence of the 25 values, but a slight autocorrelation was accepted because the correlation coefficient at lag 1 was slightly higher than the upper limit of its 95% confidence interval. Nevertheless, autocorrelation was rejected for the first 10 values, which were also accepted as random based on the three randomness tests mentioned. When the Shapiro–Wilk test for normality was applied to these 10 values, no evidence was found against the normality of the parent population. In addition, in the set of the first 10 values, Dixon’s and Grubbs’ tests did not identify any outlier. However, according to a Student’s t test, it was necessary (and convenient) to accept the alternative hypothesis regarding the presence of a persistent systematic error. Although the set of values was not obtained in a state of statistical control, it is appropriate for studying the behavior of Q charts in the presence of a persistent systematic error in short and long runs.

Figure 1 shows the chart for individual measurements for 25 values of the mass fraction of Co. The solid line represents the center line and the dashed lines represent the six control limits: above the center line, the upper action limit, the outer upper warning limit, and the inner upper warning limit; below the center line, the inner lower warning limit, the outer lower warning limit, and the lower action limit. The zones between the action limits and the outer warning limits, between the outer and inner warning limits, and between the inner warning limits and the center line are labeled A, B, and C, respectively. Crosses represent alarm signals (see text for their positions and causes). The chart has several alarm signals from the sixth run onward due to several of the applied rules. Since all points are on or above the center line, it is obvious that there is a significant deviation from the certified value. The alarm signals were: sample numbers 18 and 25 (a point beyond zone A); 6 and 17–25 (four out of five points in a row in zone B or beyond); 8–25 (eight points in a row in zone C or beyond); and 15, 22, 24, and 25 (two out of three points in a row in zone A or beyond). The EWMA chart associated with the chart for individual measurements (not presented here) also has several points above the upper action limit. Specifically, alarms are triggered at the fifth and sixth points and from the 15th sample number onward. The presence of a persistent systematic error in both the short and long run is evident from this chart. It is consistent with the results of the Student’s t tests.
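The four decision rules named above can be sketched as a single "m out of n points beyond a threshold on the same side" test applied to standardized values (distance from the center line in units of standard deviation, so that zones C, B, and A end at 1, 2, and 3). This is a simplified illustration under that assumption, not the exact implementation used in this work; the function names are hypothetical, and points lying exactly on a limit or on the center line are not counted.

```python
def run_rule(z, m, n, threshold):
    """1-based positions where at least m of the last n points lie beyond
    `threshold` on the same side of the center line."""
    alarms = []
    for i in range(n - 1, len(z)):
        window = z[i - n + 1 : i + 1]
        above = sum(v > threshold for v in window)
        below = sum(v < -threshold for v in window)
        if above >= m or below >= m:
            alarms.append(i + 1)
    return alarms

def zone_alarms(z):
    """Apply the four rules to a list of standardized values."""
    return {
        "beyond_zone_A": run_rule(z, 1, 1, 3.0),
        "2_of_3_in_zone_A_or_beyond": run_rule(z, 2, 3, 2.0),
        "4_of_5_in_zone_B_or_beyond": run_rule(z, 4, 5, 1.0),
        "8_in_row_zone_C_or_beyond": run_rule(z, 8, 8, 0.0),
    }
```

Because each rule is evaluated independently, the same point can be flagged by more than one rule, as happens for several sample numbers in the chart described above.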

Figure 2 displays the moving range of two chart for the same data. The solid line represents the center line and the dashed lines represent the control limits: above the center line, the upper action limit, the outer upper warning limit, and the inner upper warning limit; below the center line, the inner lower warning limit. The outer lower warning limit and the lower action limit are equal to 0. The zones between the action limits and the outer warning limits, between the outer and inner warning limits, and between the inner warning limits and the center line are labeled A, B, and C, respectively. The same is true below the center line, but only zone C is labeled because the other lower limits are 0. A thick dashed line represents the median of the data. The chart does not show any alarm signal. In addition, the numbers of points above and below the median (thick dashed line in Fig. 2) are acceptable. Therefore, it is assumed that the precision was under control in the intermediate precision measurement condition for several weeks.
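A minimal sketch of the quantities behind a moving range of two chart follows. It assumes the center line is the average moving range estimated from the data and uses the standard factor 3.267 for the upper action limit of ranges of two observations; the charts in this work may instead derive their limits from a previously validated standard deviation, and the dictionary keys are hypothetical names.

```python
import statistics

D4_N2 = 3.267  # standard upper-limit factor for ranges of two observations

def moving_range_chart(values):
    """Moving ranges of two, center line, action limits, and median counts."""
    mr = [abs(b - a) for a, b in zip(values, values[1:])]
    center = statistics.mean(mr)
    median = statistics.median(mr)
    return {
        "moving_ranges": mr,
        "center_line": center,
        "upper_action_limit": D4_N2 * center,
        "lower_action_limit": 0.0,
        "median": median,
        # Points exactly equal to the median are counted on neither side.
        "above_median": sum(v > median for v in mr),
        "below_median": sum(v < median for v in mr),
    }
```

The counts above and below the median correspond to the check mentioned in the text: a strong imbalance would hint at a change in dispersion even without a formal alarm.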

The different cases of the Q chart for the mean (cases KK, μ known and σ known; UK, μ unknown and σ known; KU, μ known and σ unknown; and UU, μ unknown and σ unknown) are shown in Fig. 3. The case KK is trivial. If there is relevant information on the reference value and the precision from previous validation (or verification) studies, classical Shewhart control charts should be used directly in the control phase. Nevertheless, the case KK is useful for comparison. The case KK of the Q chart for the mean (diamonds in Fig. 3) has an identical pattern to the chart for individuals (Fig. 1). It shows the same alarm signals as the latter chart. The associated EWMA Q chart also has an identical pattern to the EWMA chart associated with the chart for individual measurements (and the same alarms in the same positions, Fig. 4). In Fig. 4, the solid line represents the central line and the broken lines represent the action limits. Alarms are represented by crosses.
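For reference, the four Q statistics for the mean of individual measurements can be summarized as follows. This is a sketch in the form in which Quesenberry's statistics are commonly stated [6]; the notation is an assumption of this summary and may differ from the equations given in the first part [1]. Here \(\mu_0\) and \(\sigma_0\) are the known values, \(\bar{x}_{r-1}\) and \(S_{r-1}\) are the sample mean and standard deviation of the first \(r-1\) values, \(G_\nu\) is the Student t distribution function with \(\nu\) degrees of freedom, and \(\Phi^{-1}\) is the inverse standard normal distribution function.

```latex
\begin{align*}
\text{KK:}\quad & Q_r = \frac{x_r - \mu_0}{\sigma_0}, & r &= 1, 2, \dots \\
\text{UK:}\quad & Q_r = \sqrt{\frac{r-1}{r}}\,\frac{x_r - \bar{x}_{r-1}}{\sigma_0}, & r &= 2, 3, \dots \\
\text{KU:}\quad & Q_r = \Phi^{-1}\!\left\{ G_{r-1}\!\left( \frac{x_r - \mu_0}{S_{0,r-1}} \right) \right\},
  \quad S_{0,r-1}^2 = \frac{1}{r-1}\sum_{j=1}^{r-1} (x_j - \mu_0)^2, & r &= 2, 3, \dots \\
\text{UU:}\quad & Q_r = \Phi^{-1}\!\left\{ G_{r-2}\!\left( \sqrt{\frac{r-1}{r}}\,\frac{x_r - \bar{x}_{r-1}}{S_{r-1}} \right) \right\}, & r &= 3, 4, \dots
\end{align*}
```

Cases KK and KU measure each value against \(\mu_0\), whereas cases UK and UU measure it against the running mean of the preceding values.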

Fig. 3

Q control charts for the mean (the four cases: KK, UK, KU, and UU) for the 25 results for the determination of the mass fraction of Co. The center line, control limits, and zones as in Fig. 1. Several alarms are detected (see text)

Fig. 4

EWMA Q control charts for the mean (the four cases: KK, UK, KU, and UU) for the 25 results for the determination of the mass fraction of Co. The KK and KU cases show many alarms along the charts, but the UK and UU cases show none (except for the last point of the UU case)

In Fig. 3, it can be observed that the general pattern of the case KU is very similar to the pattern of the chart for individual measurements. The case KU shows many alarms (sample numbers 5, 6, 9–25) due to several applied rules for special causes. The applied rules are: a point beyond zone A (18 and 25), eight points in a row in zone C or beyond (9 to 25), four out of five points in a row in zone B or beyond (5, 6, 17 to 25), and two out of three points in a row in zone A or beyond (15, 22, 24, and 25). Note that in some cases, different rules apply to the same point at the same time. This high sensitivity arises because the case KU uses a relevant piece of information: the certified reference value. The central tendency is anchored to this value. The EWMA chart associated with this case also shows many points above the upper action limit (Fig. 4, sample numbers 4–7 and 13–25). In a short run situation (say, up to 10 values), the case KU of the Q chart for the mean also shows good sensitivity. In the presence of systematic error, alarm signals appeared early (first alarm at the fifth sample number). This also happens in the corresponding EWMA chart (Fig. 4, first alarm at the fourth sample).

The other two cases of the Q control chart for the mean (UK and UU, Fig. 3) show a pattern similar to that of the case KK, but shifted. The shift to lower values could be due to the fact that they use the sequential sample means instead of the certified value. These cases behave quite differently: they show alarms only at the end of the chart (sample numbers 24 and 25 in both cases, due to eight points in a row in zone C or beyond). The EWMA chart associated with the UK case shows no alarms at all (Fig. 4, triangles), and the EWMA Q chart associated with case UU (Fig. 4, squares) shows only one alarm at the end (at sample number 25, where the point is above the upper action limit). They were not sensitive enough to detect the presence of a clear systematic error in the short or long runs. These low sensitivities to systematic errors are explained by the fact that in both cases, the certified value is not used. This is the cost of lacking essential information. If the Q charts were applied in a short run (say, up to the 10th sample number), the early alarms in cases KK and KU would make it obvious that something is wrong. However, the cases UK and UU would be useless, because they did not show any alarm signal.
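The effect of not using the certified value can be illustrated with a small sketch for the two cases with known σ, using the commonly stated forms of the KK and UK statistics (an assumption here; see the first part [1] for the expressions actually used). The data are hypothetical: standardized values with a constant bias of 1.5σ and an alternating ±0.5σ disturbance.

```python
import math

def q_kk(x, mu0, sigma0):
    """Case KK: each value is compared with the known mean mu0."""
    return [(v - mu0) / sigma0 for v in x]

def q_uk(x, sigma0):
    """Case UK: each value is compared with the mean of the preceding
    values. Defined from r = 2, so the list starts at Q_2."""
    q = []
    running_sum = x[0]
    for r in range(2, len(x) + 1):
        mean_prev = running_sum / (r - 1)
        q.append(math.sqrt((r - 1) / r) * (x[r - 1] - mean_prev) / sigma0)
        running_sum += x[r - 1]
    return q

# Hypothetical data: reference value 0, sigma 1, persistent bias of 1.5 sigma.
mu0, sigma0, bias = 0.0, 1.0, 1.5
x = [mu0 + bias + (0.5 if i % 2 == 0 else -0.5) for i in range(12)]
kk = q_kk(x, mu0, sigma0)  # all values above the center line
uk = q_uk(x, sigma0)       # values alternate around the center line
```

In this sketch every KK value lies above the center line, so a run rule would fire by the eighth point, while the UK values oscillate around zero because the running mean absorbs the bias.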

The Q charts for the variance (both cases: K, μ unknown and σ known; U, μ unknown and σ unknown) are shown in Fig. 5. They have no alarm signals. The corresponding EWMA charts (not shown here) also have no alarm signals. The performance of the Q charts for the process variance and the associated EWMA charts reinforces the conclusion that the precision was under control, as expressed in relation to the chart of the moving range of two (Fig. 2).

Fig. 5

Q control charts for the variance (for the two cases: K, μ unknown and σ known; U, μ unknown and σ unknown) for the 25 results for the determination of the mass fraction of Co in the serpentinite SNi. The center line, control limits, and zones as in Fig. 1. No alarms are shown in the charts

Performance of Q charts in the presence of simultaneous small drifts of the mean and the variance

The quality control data for the determination of the mass fraction of SiO2 in the L1 laterite CRM were used to study the performance of Q charts in the presence of simultaneous small drifts of the mean and variance. From the preliminary statistical analysis of the 25 values for SiO2, the absence of outliers was concluded based on Dixon’s and Grubbs’ tests. According to the normal probability plot, the Shapiro–Wilk normality test, and the goodness-of-fit tests, no evidence was found to reject the hypothesis of normality of the parent population. No grounds were found to reject the hypothesis of absence of autocorrelation, and therefore, the values were accepted as independent. In turn, the three randomness tests did not allow rejection of the hypothesis of randomness. When Student's t tests were applied taking the certified value as reference, no evidence was found against the null hypothesis of absence of systematic error, either for the complete set of 25 results or for the first 10 values taken separately. Thus, the values were validated to run the charts in the control phase. However, when the same test was applied to the set made up of the 10th to the 25th values, evidence was found that allowed rejection of the null hypothesis and acceptance of the presence of a systematic error. On the other hand, when a Fisher's F test was applied to compare the variances of the sets of results from the first to the ninth value and from the 10th to the 25th value, it was evident that they were not homogeneous, which suggests that there was a shift in the standard deviation of the process toward lower values around the 10th value.
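The segment-wise comparisons described above rest on two simple statistics, sketched here with the standard library only. The function names are hypothetical, and the decision step (comparison with tabulated critical values of the t and F distributions at \(\alpha = 0.05\)) is deliberately left out of the sketch.

```python
import math
import statistics

def t_statistic(values, reference):
    """One-sample Student t statistic of the mean against a reference value."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - reference) / standard_error

def variance_ratio(first, second):
    """F statistic as the larger sample variance divided by the smaller one."""
    v1 = statistics.variance(first)
    v2 = statistics.variance(second)
    return max(v1, v2) / min(v1, v2)

# Usage idea (hypothetical variable names): split the series at the suspected
# change point and test each segment separately.
#   early, late = data[:9], data[9:]
#   t_late = t_statistic(late, certified_value)   # degrees of freedom: len(late) - 1
#   f_ratio = variance_ratio(early, late)
```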

Figure 6 shows the control chart for individual measurements for SiO2. The center line and control limits were calculated using the reference value and the standard deviation from the preliminary statistical analysis of the data. Beginning with the ninth sample, all points are on or below the center line. Five alarms are displayed at the 21st to 25th samples because four out of five points in a row are in zone B or beyond. These facts suggest that there is probably a shift in the mean around the 10th sample number. The associated EWMA chart (not shown here) shows alarms at the 23rd and 24th runs, due to points below the lower action limit. A Student's t test for the mean of the values from the ninth onward allowed the rejection of the null hypothesis of absence of systematic error and the acceptance of the alternative hypothesis of a significant negative bias, due to a small negative systematic error. Therefore, it was necessary to accept the presence of a small shift to lower values in the mean from around the 10th sample.

Fig. 6

Control chart for individual measurements for the results for the determination of the mass fraction of SiO2 in the laterite L1 CRM by ICP-OES. The center line, control limits, and zones as in Fig. 1. Five alarms are displayed at sample numbers 21 to 25 due to four out of five points in a row in zone B or beyond. The geometrical pattern suggests that there was a shift of the mean to lower values at about the 10th sample. This fact was confirmed by a Student’s t test

In the control chart of the moving range of two (Fig. 7), there is a tendency toward lower moving ranges from about the 10th sample onward. On the 25th run, there is an alarm signal due to eight points in a row in zone C or beyond. From the 10th sample, there are 13 values below the median of the moving ranges and only three values above. This suggests that there was probably also a shift in the standard deviation to lower values at or after the 10th sample number. A Fisher's F test comparing the variances of the first nine results and the rest led to the conclusion that the variances were not equal. It was therefore accepted that a drift in the variance to lower values occurred at about the 10th value. This fact is consistent with the pattern shown in Fig. 6. Therefore, it was concluded that simultaneous drifts to lower values in the mean and the variance occurred around the 10th sample.

Fig. 7

Control charts of the moving range of two for the determination of the mass fraction of SiO2 in the laterite L1 CRM. The center line, median, control limits, and zones as described in Fig. 2. The cross represents an alarm signal at sample number 25, due to eight points in a row in zone C or beyond

Figure 8 displays the four cases for the Q chart for the process mean (cases KK, UK, KU, and UU). All cases show a pattern very similar to the pattern of the chart for the individual measurements, but the number and positions of the alarm signals may differ. The case KK has the same alarms in the same positions as the chart for the individual measurements, but with slight differences in the origin (four out of five points in a row in zone B or beyond: sample numbers 21 to 24; and eight points in a row in zone C or beyond: sample numbers 24 and 25). Case UK has alarms from sample numbers 16 to 25 due to eight points in a row in zone C or beyond. Case KU has the same alarms in the same position due to the same causes as case KK. Finally, case UU has alarms in the same position as case UK, due to the same decision rule. Whatever the positions of the alarms, the presence in a row of six or seven points below and close to the center line (between the 9th and the 15th or 16th sample numbers) leads to the assumption of a drift in the mean and/or the variance.

Fig. 8

Q control charts for the process mean (cases KK (diamonds), UK (triangles), KU (circles), and UU (squares)) for the 25 results for the determination of the mass fraction of SiO2 in the laterite L1 CRM. The center line, control limits, and zones as in Fig. 1. Alarms are indicated by crosses. See text for further details

It is difficult to explain in detail the differences in the number of alarms raised and their positions. As already concluded, in this example there is a combination of small shifts of the mean and the standard deviation. On the one hand, it is necessary to take into account the differences in the weights of the shifts of both parameters (μ and σ); on the other hand, there are random contributions to the Q values due to the sequential character of the estimates of the sample mean and the standard deviation. These contributions depend on the relative values of the two statistics. Since the arguments of the Q values for the cases UK, KU, and UU contain the sequentially estimated mean in the numerator and/or the sequentially estimated standard deviation in the denominator, the Q values have a random behavior. Nevertheless, it is clear that the four cases reproduce the general pattern of the chart for the individual measurements.

However, the four cases for the Q chart for the process mean could be divided into two groups: cases KK and KU on the one hand, and cases UK and UU on the other. In the first two cases, the certified value is used as the known mean. The 16th value of the mass fraction of SiO2 was 28.02 mg/g, slightly higher than the certified value, 28.00 mg/g. Thus, the two corresponding Q values lay above the center line (though practically on it). However, in the other two cases (UK and UU), the points were below the center line due to the natural random variation of the sequential mean, according to its updating formula. Consequently, in the cases UK and UU, alarms occurred at an earlier sample number (due to eight points in a row in zone C or beyond, below the center line). This is a chance occurrence and not necessarily the result of a higher sensitivity of these cases. In conclusion, in the presence of simultaneous slight drifts in the process mean and the process variance, alarm signals were delayed and the sensitivities of the four cases were very similar to that of the chart for the individual measurements.

Figure 9 shows the EWMA Q charts associated with the four cases of Q charts for the mean. They also have a very similar pattern to the EWMA chart associated with the chart for individual measurements (the latter is not shown here). Cases KK and KU have two alarms, at samples 23 and 24, due to points below the lower action limit. They are located at the same positions as in the EWMA chart associated with the chart for individual measurements. However, cases UK and UU do not show any alarms in their associated EWMA charts. Similarly, these differences can be attributed to the random nature of the sequential estimates of μ and σ.

Fig. 9

EWMA Q control charts for the mean (cases KK, UK, KU, and UU) for the determination of the mass fraction of SiO2 in the laterite L1 CRM. The central line and action limits as in Fig. 4. Alarms are indicated by crosses

Figure 10 displays the Q control charts for the variance (cases K (diamonds) and U (squares)). Both charts have a descending pattern consistent with the variance shifting to lower values. Case K shows an alarm at sample 24, but case U has three alarms (20th to 24th samples). All of these alarms are due to eight points in a row in zone C or beyond. The descending patterns and alarms in both charts clearly indicate that there has been a shift to lower values in the standard deviation. The corresponding EWMA charts are shown in Fig. 11. The EWMA chart for case K also shows an alarm signal at the 22nd sample, but for case U, two alarms appeared at the 20th and 24th positions. They also present a descending pattern. The Q charts for the process variance and the corresponding EWMA charts confirm that there was a shift in the standard deviation, which is in agreement with the results of the F test mentioned above and with the patterns of the chart for individual measurements (Fig. 6) and the chart of the moving range of two (Fig. 7). The small differences between the two cases can be attributed to the random nature of the ranges and/or their sequential sums in the denominator (see Eqs. (9) and (10) of the first part of this series [1]).

Fig. 10

Q control charts for the variance (cases K and U) for the determination of the mass fraction of SiO2. The center line, control limits, and zones as in Fig. 1. For the case K, one alarm is signaled by means of a cross at the 24th sample number, but for the case U, three alarms are located at sample numbers 20 to 24. Observe the descending pattern from about the 10th sample number

Fig. 11

EWMA Q control charts for the variance (cases K and U) for the determination of the mass fraction of SiO2. Case K (diamonds) and case U (squares). The center line and action limits as in Fig. 4. Case K shows alarms at the 22nd and 24th samples. Case U has only one alarm at the 22nd sample. They also show a clear descending pattern

Conclusions

In the presence of an obvious persistent systematic error (but precision under control), Q charts for the process mean, cases KK and KU, showed a reasonable performance. They were sensitive enough to detect very early the lack of control with respect to the process mean in the short and long runs. The inclusion of the certified reference value in the arguments of their algebraic expression anchors their Q statistics to a very important reference in this situation. The associated EWMA charts also played a role. However, the cases UK and UU became insensitive and impractical in short and long runs. The use of the associated EWMA charts did not solve the lack of sensitivity. The use of the relevant certified value is very important to detect the systematic effect, and it is a clear advantage of cases KK and KU. This is the price to pay for the lack of essential information. The Q charts for the process variance performed as expected when the dispersion was under control.

In the presence of simultaneous small drifts in the mean and variance, Q charts for the process mean and process variance did not exhibit enough sensitivity for fast detection of the shift in either parameter. The alarms appeared at the final sample numbers, as in the classical charts. They showed a performance comparable to the charts of the individual measurements and the moving range of two, respectively. Although there was a delay before alarms appeared, it must be taken into account that the drifts were small (probably about one standard deviation or less in the mean and about a half in the standard deviation). The EWMA charts related to the Q charts also showed a performance comparable to the EWMA chart associated with the chart of individual measurements, and they were a complement during the analysis of the charts.

The performance of Q charts in the presence of a gradual increase of the process mean and in the presence of a slight autocorrelation of the raw data will be discussed in the third part of this series.