1 Introduction

High-throughput and in-situ measurements have attracted growing interest in process monitoring and quality control in manufacturing. Recent advances in ambient ionization techniques, which move the ionization process from vacuum into an atmospheric environment, have brought mass spectrometry (MS) into high-throughput real-time analysis without the need for sample manipulation [1]. Desorption electrospray ionization (DESI) and direct analysis in real time (DART) are reported in a number of frequently cited articles to be the most established ambient ion sources [2, 3]. However, the interactions among factors are often overlooked during conventional method development in ambient mass spectrometry because only one factor is investigated at a time, and in most circumstances a developed method requires re-optimization during its routine application [4]. Although ambient ionization mass spectrometry has been presented as a simple, high-throughput tool for diverse applications, its method robustness and system-suitability tests have not yet been discussed in existing studies.

Since the concept of Quality by Design (QbD) became an important paradigm in the pharmaceutical industry, analytical method development following a similar approach has attracted considerable attention [5]. Applying QbD principles to analytical methods not only improves the understanding of the operating parameters and method performance, but also increases the analytical quality, data reliability, and regulatory flexibility [6]. The analytical design space (DS), a key element of method development with a QbD approach, assures and increases the analytical quality whenever measurement uncertainty is of concern. Moreover, the analytical DS is a measure of process robustness, which can be validated by proper system-suitability tests [7]. General strategies for applying QbD principles to analytical measurements have been discussed in several reports [8, 9]. However, the overwhelming majority of the published procedures involved chromatographic separation using high/ultra performance liquid chromatography (HPLC/UPLC) and, surprisingly, only one of them was dedicated to capillary electrophoresis (CE) [10]. In this study, the flexibility of the QbD concept is explored in the development of an ambient mass spectrometry method, and the robustness of the developed method is evaluated.

To develop a QbD-compliant analytical method, the goals of the method, also known as critical quality attributes (CQAs), must be clearly defined at the outset. Because ambient MS analysis offers higher sensitivity and specificity than spectral or chromatographic analysis, the developed methods often aim at extracting the maximum information about the chemical composition of a given sample without sample preparation. Meanwhile, each useful signal in the mass spectra must meet performance criteria such as accuracy, precision, and a good signal-to-noise ratio [11, 12]. Therefore, the performance of DART-MS can be regarded as an integration of multiple criteria. When a successful operation is a reflection of multiple responses, response overlay and desirability functions become necessary [13]. Risk assessment [14] and factorial analysis are effective in identifying the most important factors, ranges, and levels associated with the analytical method [15, 16]. The design of experiments (DoEs), on the other hand, helps to understand the relationships between the system quality and operating parameters. Ultimately, a method control strategy is always required to validate the analytical DS.

In this study, a strategy is presented for an ambient MS method developed with the QbD concept. As an illustration, a pharmaceutical preparation is monitored by direct analysis in real time mass spectrometry (DART-MS). A fishbone diagram and the Plackett-Burman experimental design are utilized to screen the factors [5]. Response surface analysis combined with the Box-Behnken experimental design is subsequently performed to derive the analytical DS [9].

2 Materials and Methods

2.1 Chemicals and Samples

Samples used throughout the study were a liquid herbal preparation, compound Kushen injection, which contained alkaloid compounds with similar chemical structures and proven anti-tumor/anti-cancer activity [17]. Pharmaceutical preparations were provided by Shanxi Zhendong Pharmaceutical Co., Ltd. (Shanxi, China). HPLC-grade methanol was supplied by Merck (Hesse-Darmstadt, Germany). Before each analysis, a sample (50 μL) was charged into a 0.5 mL centrifuge tube and, at the same time, a glass sampling tube was washed successively with water and methanol. The glass sampling tube was then immersed in the sample for 5 s to ensure that the amount of sample coated on the Dip-it probe was constant. All the preparations were directly analyzed without any pretreatment.

Standard substances, including sophocarpine, matrine, oxysophocarpine, and oxymatrine, were obtained from the National Institutes for Food and Drug Control (Beijing, China). D-Glucose and D-Fructose were purchased from Sigma-Aldrich (St. Louis, MO, USA). Ultrapure water was produced in the laboratory with the Milli-Q water purification system (Molsheim, France). Solutions of the six standards were prepared in water for the identification of the ion signals in the mass spectra.

2.2 DART-MS Equipment and Ionization Conditions

DART-MS analysis was carried out using a DART-SVP ionization source (IonSense, Saugus, MA, USA) interfaced to a triple-quadrupole mass spectrometer (MDS SCIEX API4000, Applied Biosystems, Foster City, CA, USA). High-purity nitrogen (N2) was selected as the ionizing medium. During the sample analysis, N2 was heated and delivered at a pressure of 0.2 or 0.5 MPa to form a hot gas stream for thermo-desorption and ionization of analytes. DART was operated in either the positive or the negative ion mode. Sample-coated glass tubes were placed on the Dip-it sampler (IonSense, Saugus, MA, USA) and passed through the gas beam at a speed of 0.2, 0.6, or 1.0 mm·s⁻¹. The sample was positioned at a distance of 10, 15, 20, or 30 mm from the DART outlet. The grid electrode at the DART exit gun was set to ±50 or ±250 V. MS was adjusted to the positive ion mode with selected ion monitoring (SIM). All the operations of DART and MS were controlled by DART v. 3.0.3b and Analyst 1.5.0 software, respectively.

2.3 Methods

2.3.1 Critical Quality Attributes and the Modeled Response

CQAs are the responses measured to evaluate the quality of a developed analytical method. For DART-MS analysis, CQAs can be related to the analytical sensitivity, accuracy, and repeatability of each ion, while the global quality is a summary over all the monitored ions. To describe the DART-MS response mathematically, the absolute peak area and signal-to-noise ratio (S/N) of each ion in the mass spectrum were used to evaluate the analytical sensitivity and accuracy. Repeatability of DART-MS was expressed by the relative standard deviation (RSD) of repeated measurements. Figures 1 and 2 show the five major components and a representative full-scan DART-MS spectrum of a tested sample, respectively. To increase the analytical sensitivity, MS was adjusted to the selected ion monitoring (SIM) mode of 14 ions as summarized in Table 1.

Figure 1

Chemical structures of the main alkaloids in the tested preparation

Figure 2

Typical DART-MS profile (positive ionization mode) of a tested sample. The adopted operational parameters were: heater temperature, 300 °C; sampler speed, 0.2 mm·s⁻¹; and distance from the DART outlet, 15 mm

Table 1 Ions Detected in the DART-MS Spectra with the Selected Ion Monitoring Mode and Method Validation at the Selected Working Point (n = 6)

In this study, Derringer’s desirability functions were introduced to assign desired and accepted values to CQAs. The measurements of each individual criterion were converted into desirability values using a one-sided transformation [18]. The scale of the desirability function ranged from d = 0, for a completely unsatisfactory response, to d = 1, for a highly desired response, according to Equation (1):

$$ {d}_i=\left\{\begin{array}{ll}0\hfill & \left({Y}_i\le {Y}_i^{-}\right)\hfill \\ {}{\left(\frac{Y_i-{Y}_i^{-}}{Y_i^{+}-{Y}_i^{-}}\right)}^r\hfill & \left({Y}_i^{-}<{Y}_i<{Y}_i^{+}\right)\hfill \\ {}1\hfill & \left({Y}_i\ge {Y}_i^{+}\right)\hfill \end{array}\right. $$
(1)

where \( Y_i \) is the ith response variable; \( Y_i^{-} \) is the minimum acceptable value of criterion \( Y_i \); \( Y_i^{+} \) is the value beyond which further improvement is no longer important; and r is a weighting factor that describes the shape of the desirability function.
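Equation (1) can be sketched in a few lines of Python. This is only an illustration: the function name `desirability` and the numerical limits in the usage lines are hypothetical and are not taken from the study.

```python
def desirability(y, y_min, y_max, r=1.0):
    """One-sided (larger-is-better) Derringer desirability, Equation (1).

    y      -- measured response Y_i
    y_min  -- minimum acceptable value Y_i^-
    y_max  -- value Y_i^+ beyond which improvement no longer matters
    r      -- shape (weighting) factor
    """
    if y_max <= y_min:
        raise ValueError("y_max must be larger than y_min")
    if y <= y_min:
        return 0.0
    if y >= y_max:
        return 1.0
    return ((y - y_min) / (y_max - y_min)) ** r


# Hypothetical limits (0 and 10), chosen only to show the effect of r.
for r in (0.5, 1.0, 2.0):
    print(r, desirability(5.0, 0.0, 10.0, r))
```

With r = 1 the transformation is linear between the two limits; r < 1 rewards responses just above the lower limit more generously, whereas r > 1 demands values close to the upper limit.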

In chromatographic methods, an S/N of the chromatographic peak larger than 10 and a precision RSD within 3 % are the key criteria for good quantitative accuracy and repeatability [19]. Recently, several qualitative and quantitative determinations were developed with DART-MS [20, 21], in which an RSD within 15 % was accepted as the deviation of analytical repeatability. The maximum peak area in the Plackett-Burman experiment represents a highly desired response, as the maximum of a peak area usually suggests satisfactory ionization of an analyte. However, the S/N of an ion peak falls below 10 if its peak area is an order of magnitude less than the maximum, and the peak integration becomes rather inaccurate. Consequently, the desirabilities of the peak area, S/N, and RSD were defined by the following functions, where the factor r was set to 1:

$$ d\left(i,P\right)=\left\{\begin{array}{ll}0\hfill & \left(P\le {10}^6\right)\hfill \\ {}\frac{P-{10}^6}{ \max -{10}^6}\hfill & \left({10}^6<P< \max \right)\hfill \\ {}1\hfill & \left(P\ge \max \right)\hfill \end{array}\right. $$
(2)
$$ d\left(i,\frac{S}{N}\right)=\left\{\begin{array}{ll}0\hfill & \left(\frac{S}{N}<10\right)\hfill \\ {}1\hfill & \left(\frac{S}{N}\ge 10\right)\hfill \end{array}\right. $$
(3)
$$ d\left(i,R\right)=\left\{\begin{array}{ll}0\hfill & \left(R>15\right)\hfill \\ {}\frac{15-R}{15-3}\hfill & \left(3<R\le 15\right)\hfill \\ {}1\hfill & \left(R\le 3\right)\hfill \end{array}\right. $$
(4)

where i denotes the ith ion in the mass spectra; P is the absolute peak area of the ion; max is the maximum peak area observed in the Plackett-Burman experiment; \( \frac{S}{N} \) is the signal-to-noise ratio of the ion peak; and R is the RSD (%) of the six measurements. If the order of magnitude of the maximum peak area is \( 10^6 \), the threshold \( 10^6 \) in Equation (2) is replaced by \( 10^5 \).

According to the weighted score tests in the preliminary experiment, a composite desirability function was derived to summarize the quality of the respective ion signal. The global performance of DART-MS spectra was, thus, an integration of all the monitored ions. Hence, the modeled response was calculated by the following Equation (5):

$$ D={\displaystyle \sum_{i=1}^{14}{D}_i={\displaystyle \sum_{i=1}^{14}\left[10d\left(i,P\right)+d\left(i,R\right)+5d\left(i,\frac{S}{N}\right)\right]}} $$
(5)

where D i is the score of the ith ion in the mass spectra.
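Equations (2) to (5) can be sketched as follows. This is a minimal illustration rather than the software used in the study: the zero branch of Equation (4) is read as R > 15, the function names are hypothetical, and the two ions in the usage lines are invented values.

```python
def d_area(p, p_max, floor=1e6):
    """Equation (2): linear desirability of peak area between floor and p_max."""
    if p <= floor:
        return 0.0
    if p >= p_max:
        return 1.0
    return (p - floor) / (p_max - floor)


def d_sn(sn):
    """Equation (3): 1 if S/N is at least 10, otherwise 0."""
    return 1.0 if sn >= 10 else 0.0


def d_rsd(rsd):
    """Equation (4): 1 for RSD <= 3 %, 0 above 15 %, linear in between."""
    if rsd > 15:
        return 0.0
    if rsd <= 3:
        return 1.0
    return (15 - rsd) / (15 - 3)


def ion_score(p, p_max, sn, rsd, floor=1e6):
    """D_i in Equation (5): weights 10, 1 and 5 for area, RSD and S/N."""
    return 10 * d_area(p, p_max, floor) + d_rsd(rsd) + 5 * d_sn(sn)


def global_score(ions):
    """D in Equation (5): sum of D_i over all monitored ions."""
    return sum(ion_score(**ion) for ion in ions)


# Two invented ions (the study monitors 14); values are for illustration only.
ions = [
    {"p": 5.5e6, "p_max": 1.0e7, "sn": 50, "rsd": 9.0},
    {"p": 2.0e7, "p_max": 1.0e7, "sn": 8, "rsd": 2.0},
]
print(global_score(ions))
```

Because each ion can contribute at most 10 + 1 + 5 = 16, the modeled response for 14 monitored ions is bounded above by 224. The `floor` argument allows the lower threshold to be switched from 10^6 to 10^5 for weaker ions.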

2.3.2 Risk Assessment and Factorial Analysis

The risk assessment of the chromatographic method has been described in previous studies. A fishbone diagram for DART-MS is shown in Figure S-1 (Supporting Information), adapted from that reported by Borman et al. for HPLC [22]. The Plackett-Burman experimental design was then used to estimate the effects of six independent factors, shown in bold in the fishbone diagram, on the modeled response. The ranges of the factors were defined in the preliminary experiment, and the levels and factors used in the Plackett-Burman design are shown in Table S-1 (Supporting Information). Six replicates were performed for each determination, and their averages were recorded. The multivariate regression and Pareto ranking analysis were both performed with Minitab ver. 16.1.0.0 software (State College, PA, USA).
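A screening design of this kind can be sketched in plain Python. The sketch assumes the standard 12-run Plackett-Burman layout (the run count is not stated in the text above and the actual design is in Table S-1), uses coded levels of -1 and +1, and estimates main effects from an invented response; the function names are hypothetical.

```python
# Standard generator row of the 12-run Plackett-Burman design.
GENERATOR = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]


def plackett_burman_12(n_factors):
    """Return a 12-run design (coded -1/+1) for up to 11 two-level factors."""
    if not 1 <= n_factors <= len(GENERATOR):
        raise ValueError("the 12-run design accommodates 1 to 11 factors")
    k = len(GENERATOR)
    # Eleven cyclic shifts of the generator row plus one all-minus row.
    rows = [[GENERATOR[(j - shift) % k] for j in range(k)] for shift in range(k)]
    rows.append([-1] * k)
    return [row[:n_factors] for row in rows]


def main_effects(design, responses):
    """Main effect of each factor: mean response at +1 minus mean at -1."""
    n = len(design)
    return [
        2.0 * sum(row[j] * y for row, y in zip(design, responses)) / n
        for j in range(len(design[0]))
    ]


design = plackett_burman_12(6)
# Invented response that depends only on the first and third factors.
responses = [10 + 3 * row[0] - 2 * row[2] for row in design]
print(main_effects(design, responses))
```

Because the columns are balanced and mutually orthogonal, each main effect is estimated independently of the others, which is what makes the Pareto ranking of the six factors meaningful with so few runs.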

2.3.3 Response Surface Analysis

The Box-Behnken design with five center points was then employed to understand the relationships between the significant factors and modeled response, with levels and selected factors summarized in Table S-2 (Supporting Information). Other minor condition parameters were kept constant. The experimental design and response surface analysis were performed by Design-Expert ver. 8.0.6 software (Minneapolis, MN, USA). Sample analysis was conducted with six replicates, and their averages were recorded.
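For three factors, the coded Box-Behnken layout with five center points can be generated as sketched below. Only coded levels (-1, 0, +1) are produced, since the actual factor levels are given in Table S-2 and are not reproduced here; the function name is hypothetical.

```python
from itertools import combinations, product


def box_behnken_3(n_center=5):
    """Coded Box-Behnken design for three factors.

    Each pair of factors is varied over the four (+/-1, +/-1) combinations
    while the remaining factor is held at its center level (0).
    """
    n_factors = 3
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(run)
    # Replicated center points provide an estimate of pure error.
    runs.extend([0] * n_factors for _ in range(n_center))
    return runs


design = box_behnken_3()
print(len(design))
for run in design:
    print(run)
```

The design contains 12 edge-midpoint runs plus 5 center runs, i.e., 17 runs in total, and never places all three factors at their extreme levels simultaneously.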

2.3.4 Method Validation

The analytical DS of DART-MS was derived from the results of the response surface analysis. Two aspects of method validation, analytical repeatability and method robustness, were evaluated at an operating working point.

3 Results and Discussion

3.1 Plackett-Burman Experimental Design

For the modeled response, the temperature, sampler speed, and distance (P < 0.05) were statistically significant factors according to the Pareto ranking analysis of the Plackett-Burman design. The coefficient of determination \( R^2 \) and adjusted \( R^2 \) of the model were 0.9219 and 0.8283, respectively. Figure 3 shows the Pareto chart of standardized regression coefficients for the factors, derived from the multivariate regression analysis. No major differences in the mass spectra were observed under varied gas pressures. It has been reported that increasing the flow rate of the ionizing gas increases the number of metabolites detected in a sample, whereas high gas pressure can cause particle dispersion and lead to contamination of the MS inlet [23]. In another study, the gas temperature, gas flow rate/pressure, and grid voltage of DART-MS were optimized by DoEs, and the analysis of variance (ANOVA) suggested that the DART exit grid voltage had no significant impact on the signal intensity of the mass spectra [24]. These reported observations agree well with the current experimental results. Because nitrogen, rather than the customary helium, was used as the discharge gas, gas pressure might be a significant factor in other studies, although it was not in the current one. Among the parameters that affected the formation and transmission of ions in DART, the significant factors were the temperature of the ionizing gas, the linear rail speed of the sampler, and the distance from the DART exit to the inlet of the mass spectrometer.
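The relationship between the two fit statistics quoted above can be checked with the usual adjusted \( R^2 \) formula. The sketch assumes a 12-run design with six model terms; the run count is an assumption, since Table S-1 is not reproduced here.

```python
def adjusted_r2(r2, n_runs, n_terms):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_runs - 1) / (n_runs - n_terms - 1)


# Assumed: 12 runs, 6 factors (one regression term per factor).
print(adjusted_r2(0.9219, n_runs=12, n_terms=6))
```

Under these assumptions the result is about 0.828, which agrees with the reported adjusted value to within rounding of \( R^2 \).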

Figure 3

Pareto chart of factorial analysis showing the effects of the factors on the DART-MS performance

3.2 Box-Behnken Design and Response Surface Analysis

Based on the results of the factorial analysis, DART was operated in the positive ion mode with N2 transferred under a pressure of 0.2 MPa, and the grid electrode at the DART exit gun was kept constant at +250 V. Meanwhile, the temperature (A), sampler speed (B), and distance (C) were optimized using response surface methodology.

Multiple regression analysis yielded a fitted second-order polynomial equation. The ANOVA of the model revealed that the three first-order linear terms, the second-order quadratic terms of temperature (\( A^2 \)) and sampler speed (\( B^2 \)), and the interaction term of temperature and sampler speed (AB) were highly significant, whereas the remaining terms were negligible. Table 2 summarizes the significance test results for the coefficients of the regression model, which was simplified by removing the insignificant terms. The quality of fit of the simplified second-order polynomial model was evaluated with the coefficient of determination \( R^2 \) and adjusted \( R^2 \), which were 0.9824 and 0.9718, respectively. The “fitness” of the model was evaluated through the probability value (P = 0.0000 < 0.05) and the lack-of-fit test (P = 0.6612 > 0.05), suggesting that the model was adequate to predict the response variable accurately. Additionally, the “Adeq Precision” of the model was 36.831, suggesting that the model could be used for design space navigation [25]. The simplified second-order polynomial model for noncoded data can be expressed as:

Table 2 Analysis of Variance for the Response Surface Quadratic Model of the Three Factors on the Modeled Response
$$ D=100.08+0.74A-137.93B-1.67C-0.29 AB-7.87\times {10}^{-4}{A}^2+108.88{B}^2 $$
(6)
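Equation (6) can be evaluated directly for any combination of uncoded factor settings. The sketch below uses the operating conditions quoted in the Figure 2 caption (300 °C, 0.2 mm·s⁻¹, 15 mm); the function and argument names are hypothetical.

```python
def predicted_d(temp_c, speed_mm_s, dist_mm):
    """Modeled response D from Equation (6), uncoded factors.

    temp_c      -- ionizing gas temperature A (deg C)
    speed_mm_s  -- sampler speed B (mm/s)
    dist_mm     -- distance C (mm)
    """
    a, b, c = temp_c, speed_mm_s, dist_mm
    return (100.08 + 0.74 * a - 137.93 * b - 1.67 * c
            - 0.29 * a * b - 7.87e-4 * a ** 2 + 108.88 * b ** 2)


print(round(predicted_d(300, 0.2, 15), 2))
```

For these settings the model predicts D of approximately 185.6. Predictions outside the ranges covered by the Box-Behnken design are extrapolations and should be treated with caution.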

The contour plots for each pair of independent variables of the second-order polynomial model are presented in Figure 4. As the temperature increases from 100 to 250 °C and the sampler speed drops from 1.0 to 0.2 mm·s⁻¹, the modeled response D increases rapidly. However, when the temperature exceeds 250 °C, D decreases slightly with a further increase in temperature. Meanwhile, reducing the linear rail speed and the distance from the DART outlet to the mass spectrometer inlet favors an increase in D. At low temperature, the thermo-desorption of condensed-phase analytes is insufficient, whereas at excessively high temperature, the analytes decompose before they reach the MS detector and the appearance of undesired ions may degrade the signals (Figure S-2 in Supporting Information). A faster sampler speed or a longer distance results in a shorter residence time of the sample in the ionizing gas, which makes the analyte molecules less likely to collide reactively and achieve sufficient ionization. Saang’onyo and Smith [26] and Li et al. [27] both reported that a speed of 0.2 mm·s⁻¹ produced the highest signal in DART-MS spectra, and that the DART-MS spectra of geniposide were similar after the gas temperature increased from 250 to 400 °C. In addition, the lower limit of the distance in this work was set at 10 mm to minimize contamination of the MS inlet.

Figure 4

Contour plots of response surface analysis showing the effects of the ionization temperature, sampler speed, and distance on the DART-MS performance

3.3 DART-MS Design Space

Previous experiments indicated that D values above 180 ensured excellent DART-MS performance. The practical DS explored for DART-MS is depicted in Figure S-3 (Supporting Information), where the temperature has a maximum allowed increment of 50 °C. The highlighted points are the critical working points within the DS region.
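One simple way to picture such a design space is to scan a grid of factor settings with the fitted model and keep those whose predicted D exceeds 180. The sketch below is not the procedure used to construct Figure S-3: the temperature grid (100 to 400 °C in 50 °C steps) is an assumption, the speeds and distances are the values listed in Section 2.2, and the function names are hypothetical.

```python
from itertools import product


def predicted_d(temp_c, speed_mm_s, dist_mm):
    """Modeled response D from Equation (6), uncoded factors."""
    a, b, c = temp_c, speed_mm_s, dist_mm
    return (100.08 + 0.74 * a - 137.93 * b - 1.67 * c
            - 0.29 * a * b - 7.87e-4 * a ** 2 + 108.88 * b ** 2)


def in_design_space(temp_c, speed_mm_s, dist_mm, threshold=180.0):
    """True if the predicted response exceeds the acceptance threshold."""
    return predicted_d(temp_c, speed_mm_s, dist_mm) > threshold


temps = range(100, 401, 50)   # assumed grid, deg C
speeds = (0.2, 0.6, 1.0)      # mm/s, values from Section 2.2
dists = (10, 15, 20, 30)      # mm, values from Section 2.2

accepted = [
    (t, s, d)
    for t, s, d in product(temps, speeds, dists)
    if in_design_space(t, s, d)
]
for point in accepted:
    print(point, round(predicted_d(*point), 1))
```

A point estimate of this kind ignores prediction uncertainty; a more conservative design space would also require the lower confidence bound of the prediction to exceed the threshold.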

A verification study of these critical points was carried out to inspect the accuracy and robustness of generated DS [28]. Six replicates on each point were executed by three analysts. The actual D values agree well with the predicted values in Table 3, indicating that good analytical quality assurance is provided by the analytical DS. The consistent results of the three analysts further demonstrate that the whole operating space is robust for the analytical performance.

Table 3 Parameters of Critical Points in Constructed Design Space of DART-MS, and the Experimental and Predicted Values of the Modeled Response D on Each Point

3.4 Working Point Selection and Method Validation

A constructed DS usually contains a number of potential working points. For analytical flexibility, an intermediate point in the DS was selected as the operating working point in the current study, and DART was adjusted to a temperature of 300 °C, a sampler speed of 0.2 mm·s⁻¹, and a distance of 15 mm.

3.4.1 Repeatability

The repeatability was tested by separate sampling for intra-day and inter-day precision (n = 6). The analytical precisions of all the monitored ions are summarized in Table 1. The majority of the RSD values were within 5 %, indicating good repeatability of the developed method.
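The RSD used as the repeatability measure is the sample standard deviation expressed as a percentage of the mean. A minimal sketch, with invented peak areas standing in for six replicate measurements of one ion:

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) = 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)


# Invented peak areas for six replicates of a single ion.
replicates = [5.21e6, 5.05e6, 5.34e6, 5.12e6, 5.28e6, 5.17e6]
print(round(rsd_percent(replicates), 2))
```

The sample (n - 1) standard deviation is used here, which is the usual convention for a small number of replicates.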

3.4.2 Method Robustness

To challenge the method by maximizing the chance that problems occur, robustness studies at a selected working point, covering the variability of the process, analytical method, and analyst, need to be performed [29]. Because DART-MS requires no sample preparation, its robustness was assessed only with minor variations in the distance, temperature, and sampling glass tube. None of these variations caused significant changes in the signal intensity of the DART-MS spectra or in the RSD of the ion peak areas, demonstrating that the selected working point was qualified.

3.5 Applications

The validated method was then applied to assess the pharmaceutical quality of different manufacturing batches. Ten batches of the tested preparation, including two failed products, were collected and analyzed by the developed DART-MS method with six replicates. Authentication of these samples was carried out by principal component analysis (PCA) of the ion peak areas with SIMCA-P+ v. 12.0.1 software (Umetrics AB, Västerbotten, Sweden). The first two principal components (PCs) explained 86.0 % of the observed variance. Samples of the 10 batches were successfully discriminated in the score plot of PC1 and PC2 (Figure 5). The validated method shows considerable potential for quality assurance in the pharmaceutical industry.

Figure 5

Score plot of the principal component analysis of the DART-MS spectral data of the tested samples obtained with the validated method. Samples of batches 1–8 were normal products, and samples of batches 9–10 were failed products

4 Conclusion

A systematic strategy has been presented to explore the design space of DART-MS. Introducing the QbD concept into method development resulted in greatly improved repeatability and analytical quality. In this approach, the performance of DART-MS was described mathematically as a composite desirability function of the CQAs. Risk assessment and factorial analysis were successfully used to screen the factors with the largest effects on method performance. Three factors, the temperature, distance, and sampler speed, were found to have statistically significant effects on the modeled response, and their relationships were further investigated by the Box-Behnken design coupled with response surface analysis. The modeled response was applied to construct a DS of DART-MS that provided assurance of analytical quality. The verification and robustness evaluations at a selected working point indicated that the validated method performed well in terms of ion signal intensity and repeatability. Finally, a practical application of the method was illustrated, which suggests that the QbD-compliant MS method has considerable potential for quality control studies in the pharmaceutical industry.