Introduction

Mass spectral libraries play an important role in qualitative analysis for the identification of unknown compounds. Over the last 40 years, large sets of electron impact mass spectra have been collected, which are extensively used to test the identity of all sorts of chemicals. The established libraries are considered to be reliable, robust, and transferable.

The development of tandem mass spectral libraries began in the late 1990s, triggered by the invention of atmospheric pressure ionization techniques [1–5]. It was soon realized that collision-induced decomposition is much more difficult to control than electron impact fragmentation. A number of experimental conditions, including the kind of collision gas, the collision gas pressure, and the collision voltage, influence the fragmentation chemistry. Tandem mass spectra were found to be difficult to reproduce. At that time, reproducibility was considered the prime prerequisite for using tandem mass spectral libraries as a universal identification tool. Over the years, several strategies have been developed to address the poor reproducibility of tandem mass spectrometric fragmentation. In one approach, reference compounds were used to normalize or standardize experimental conditions [6–9]. Normalization, however, hardly improved the comparability of tandem mass spectra. In particular, problems with cross-instrument spectrum variation are still encountered. Another attempt to improve transferability relied on the collection of compound-specific fragment ion mass spectra acquired at several different collision energies [8, 10–13]. Despite considerable success in using this kind of library on one instrument or a certain type of instrument, platform independence has hardly ever been proven. A promising approach towards a platform-independent tandem mass spectrometric library was presented recently [14–18]. The MSforID project (www.msforid.com) relies on the combination of a highly efficient search algorithm with a comprehensive mass spectral library established on a high-resolution mass spectrometer. The developed search algorithm is based on peak matching and exhibits a high tolerance towards changes within the intensity distribution among different fragmentation pathways.
The reference library was established on a quadrupole–quadrupole-time-of-flight instrument (QqTOF) using ten different collision energies for acquiring compound-specific reference spectra. In a multicenter study, we have demonstrated that the MSforID library is transferable to various instrumental platforms [16, 17]. Extending recent communications, we present herein a thorough evaluation of the match accuracy of the MSforID library. Statistical parameters were used to characterize the search reliability. The results were compared with those obtained for the quadrupole–quadrupole-linear ion trap (QTrap) library established by the Weinmann group [13]. Almost 13,000 tandem mass spectra were included in this study. The MSforID search algorithm was used for spectral matching.

Experimental

MSforID library

The MSforID library contained over 9,900 spectra of 1,003 compounds relevant in clinical and forensic toxicology [14, 16–18]. A detailed description of the mass spectral library is provided on www.msforid.com. All spectra acquired in positive ion mode (8,252) were used for the evaluation of the library search performance. Library spectra were acquired in Innsbruck using a QqTOF (QSTAR XL, AB Sciex, Foster City, CA, USA) equipped with a modified TurboIonSpray source [19]. Mass calibration and optimization of instrumental parameters were performed in the positive ion mode by infusion of a mixture of 1.0 mg/l caffeine and 1.0 mg/l reserpine dissolved in 0.05% aqueous acetic acid solution containing 50% acetonitrile (v/v) at a flow rate of 2.0 μl/min. The spray voltage was 4 kV. Gas flows of 1–3 arbitrary units (nebulizer gas) and 40 arbitrary units (turbo gas) were employed. The temperature of the turbo gas was adjusted to 200 °C. For MS/MS, the Q1 resolution was set to unit resolution. A nitrogen flow of 5 arbitrary units was used to initiate fragmentation. For the acquisition of reference spectra, compound solutions in 0.05% aqueous acetic acid solution containing 50% acetonitrile (v/v) were prepared and directly infused into the mass spectrometer at a flow rate of 2.0 μl/min. Depending on the ionization efficiency of the compound, solutions with concentrations in the range of 0.10 to 10.0 mg/l were applied. For each compound, fragmentation was accomplished at ten different collision energy (CE) values. Starting with 5 eV, the CE was increased in steps of 5 eV up to 50 eV, leading to low, medium, and extensive fragmentation of the reference compounds. Mass spectra were collected in the range 50–700 units. The accumulation time was set to 1.0 s. Product ion mass spectra were recorded on a personal computer with the Analyst QS software (1.0, service pack 8, AB Sciex). Before storage in the library, files were filtered to delete unspecific signals in the reference spectra.

Weinmann library

The Weinmann library contained over 5,600 spectra of 1,253 compounds relevant in clinical and forensic toxicology [13]. PDF files of the generated spectra are accessible via the website http://www.chemicalsoft.de/MSMS_QTrap/MSMS_QTrap-index.php. All spectra acquired in positive ion mode (4,368) were used for the evaluation of the library search performance. Library spectra were acquired in Freiburg using a QTrap instrument equipped with a turbo ionspray source (QTrap 2000, AB Sciex). Dynamic fill time was used with a minimum fill time of 6 ms and a maximum fill time of 250 ms, and fragments were trapped in a range starting at 50 units and ending at the precursor mass plus at least 5-unit tolerance. CID was induced with nitrogen at a pressure of 3.8 to 4.0 × 10⁻⁵ Torr at the collision energies 20, 35, and 50 eV. Additionally, a collision-energy-spread spectrum at 35 ± 15 eV was recorded, for which all the product ions generated at 20, 35, and 50 eV were trapped simultaneously. All other potentials were default values corresponding to the standard instrument installation. The scan rate was set to 4,000 units/s. Before acquisition of a series of compounds, the instrument was calibrated daily with polypropylene glycol to achieve a mass accuracy of ±0.1 units and a resolution of 0.7 ± 0.1 units at half height for Q1 and the specified standard resolution for the linear ion trap (LIT) at the scan rate of 4,000 units/s. Haloperidol and amitriptyline (positive ionization) were used for quality control of the spectra after every ten injections. For the acquisition of reference spectra, pure compound solutions (in some cases solutions made of tablets) were prepared, and 1–2,000 ng of each compound was injected into the system using standard reversed-phase analytical columns with gradient elution. The average of the spectra across the chromatographic signal was added to a Microsoft Access database, which was generated by the Analyst software (version 1.4.1, AB Sciex).
The number of fragment ions stored was limited to 16.

Spectra collected on multiple instrumental platforms

To study the capabilities of the mass spectral libraries to deal with spectra acquired on different instruments, spectra that were obtained from the following types of tandem mass spectrometers were used: QqTOF (QSTAR Pulsar I, AB Sciex), QTrap (QTrap 4000, AB Sciex), quadrupole–quadrupole–quadrupole (QqQ, TSQ Quantum Ultra, Thermo Fisher Scientific, Waltham, MA, USA), and linear ion trap Fourier transform ion cyclotron resonance mass spectrometer (LIT-FTICR, LTQ-FT Ultra, Thermo Fisher Scientific). The QTrap was operated in two different scanning modes: in “product ion scan (pi)” as well as in “enhanced product ion scan (epi)” mode. On the LIT-FTICR instrument, product ions were either analyzed at low resolution in the LIT or at high resolution in the FTICR. Altogether, 355 spectra were collected from amiloride, buphenin, cinchocaine, cyclizine, desipramine, dihydroergotamine, dosulepin, dixyrazine, ethambutol, etilefrine, etofylline, mefruside, methysergide, metoclopramide, phenazone, phenytoin, sulfamethoxazole, sulthiame, and tetracycline. A more detailed description of the data set can be found elsewhere [16, 17].

Library search

The principles of the library search approach have been described elsewhere [14, 16, 17]. Briefly, the measured product ion mass spectrum of an unknown compound represents the input for library search. The spectrum is compared with all mass spectra stored in the library. In each case, the similarity is expressed in the form of the “reference spectrum-specific match probability” (mp). Next, the mp values corresponding to a certain reference compound are combined to one value (=“average match probability”, amp) that specifies the similarity between the unknown and that specific reference compound. To facilitate comparison, amp is converted into the “relative average match probability” (ramp). Single ramp values range between 0 and 100. A high compound-specific ramp value indicates high similarity between the unknown and the reference compound. Finally, a list is gathered as output of the search algorithm, which is sorted in order of decreasing ramp. The substance with the highest ramp is considered to represent the unknown compound. The ramp value is used to classify the library search result. Recently, we could show that 50.0 represents a convenient cutoff point for the ramp value at which sensitivity and specificity of the library search approach both exceed 95% [16, 17]. As a final qualifying criterion to exclude putative false-positive hits, the m/z of the precursor ion is used. Only if the m/z of the best matching compound agrees with the m/z of the precursor ion will the identity be confirmed.
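The final classification step described above can be illustrated with a minimal sketch. It does not reproduce the calculation of mp, amp, or ramp, which is described in the cited references; it only assumes that a list of compound-specific ramp values is already available. The names `Hit` and `classify_search`, the inclusive treatment of the cutoff, and the use of a numerical tolerance `mz_tol` for the precursor ion comparison are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    compound: str        # name of the reference compound
    ramp: float          # relative average match probability (0-100)
    precursor_mz: float  # m/z of the precursor ion of the reference compound


def classify_search(hits: List[Hit], query_precursor_mz: float,
                    ramp_cutoff: float = 50.0,
                    mz_tol: float = 0.10) -> Optional[str]:
    """Return the name of the identified compound, or None if no hit qualifies.

    Assumptions (not taken from the source): a ramp equal to the cutoff
    passes, and precursor agreement means |delta m/z| <= mz_tol.
    """
    if not hits:
        return None
    # Sort in order of decreasing ramp; the top entry is the candidate.
    best = sorted(hits, key=lambda h: h.ramp, reverse=True)[0]
    if best.ramp < ramp_cutoff:
        return None  # similarity too low for a qualified match
    if abs(best.precursor_mz - query_precursor_mz) > mz_tol:
        return None  # precursor m/z disagrees: putative false positive
    return best.compound
```

Only the best-ranked compound is examined here, in line with the description that the substance with the highest ramp is taken to represent the unknown; lower-ranked entries are not considered as fallbacks.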

Automated library search was performed with a program written in ActivePerl 5.6.11 (Active State Corporation, Vancouver, BC, Canada). All calculations were performed on a personal computer under Windows XP™ operating system (1.7 GHz Pentium, 1.0 GB RAM). The mass tolerance was set to ±0.10 for low-resolution spectra and to ±0.010 for high-resolution spectra. For library search, spectra were converted into txt files. Each file contained the m/z of the precursor ion as well as a list of fragment ion m/z values and the corresponding relative signal intensities.
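To illustrate the role of the mass tolerance, the following sketch counts the fragment ions of a sample spectrum that find a partner in a reference spectrum within a given tolerance window. It is a simplified stand-in and not the MSforID match probability: intensities are ignored entirely, and the function name `count_matching_peaks` and the representation of a spectrum as a list of (m/z, relative intensity) pairs are assumptions.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, relative intensity)


def count_matching_peaks(query: List[Peak], reference: List[Peak],
                         tol: float) -> int:
    """Count query peaks with at least one reference peak within +/- tol.

    Hypothetical helper for illustration; intensities are deliberately
    not used in this simplified comparison.
    """
    ref_mz = [mz for mz, _ in reference]
    return sum(
        1 for mz, _ in query
        if any(abs(mz - r) <= tol for r in ref_mz)
    )
```

With a tolerance of 0.10, a reference signal 0.06 units away from a sample signal still counts as a match, whereas with a tolerance of 0.010 it does not. A narrower window therefore leaves fewer opportunities for coincidental matches.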

Performance evaluation

The performance of the two libraries (MSforID, Weinmann) was evaluated with three different experiments. In the first experiment, the spectra of the libraries were searched against their corresponding library after excluding either this single compound-specific spectrum or all compound-specific spectra prior to searching. In the second experiment, the libraries were searched against each other. Either library was used as reference or sample set. For the third experiment, spectra acquired on different mass spectrometric instruments were matched to both libraries. The number of positive identifications and the number of negative identifications were counted and used to calculate statistical parameters (Table 1), including sensitivity, specificity, positive likelihood ratio (LR+), negative likelihood ratio (LR−), positive predictive value (PPV), negative predictive value (NPV), and odds ratio.

Table 1 Summary of statistical parameters used to evaluate the performance of library search
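The parameters named above can be computed from the four counts of a 2 × 2 classification table. The sketch below uses the standard textbook definitions; it is assumed, but not confirmed here, that these correspond to the formulas of Table 1. The names `Counts` and `diagnostics` are illustrative.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Counts:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def diagnostics(c: Counts) -> Dict[str, float]:
    """Standard diagnostic test parameters from a 2 x 2 table.

    Assumes no denominator is zero (e.g. fp > 0 and fn > 0);
    this sketch does not guard against degenerate cases.
    """
    sens = c.tp / (c.tp + c.fn)
    spec = c.tn / (c.tn + c.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "LR+": sens / (1.0 - spec),          # positive likelihood ratio
        "LR-": (1.0 - sens) / spec,          # negative likelihood ratio
        "PPV": c.tp / (c.tp + c.fp),         # positive predictive value
        "NPV": c.tn / (c.tn + c.fn),         # negative predictive value
        "odds_ratio": (c.tp * c.tn) / (c.fp * c.fn),
    }
```

Note that the odds ratio equals LR+ divided by LR−, so it summarizes both likelihood ratios in a single figure.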

Results and discussion

A set of experiments was used to evaluate and to compare the performance of two tandem mass spectral libraries. The first library was the MSforID library recently developed by the Oberacher group in Innsbruck; the second one was the Weinmann library established by Weinmann and Dresen in Freiburg. There are three main differences between the two libraries that might be responsible for performance differences. (1) The MSforID library was established on a high-resolution instrument. (2) For the Weinmann library, the number of fragment ions stored within a single reference spectrum was limited to the 16 most intense ions. (3) The MSforID library contains a larger number of compound-specific reference spectra (ten vs. four).

For all tests, the MSforID search algorithm was used. Neither any kind of training nor any kind of modification to the search algorithm was necessary to obtain reliable search results with the Weinmann library. The applied search algorithm represents a ready-to-use tool for handling any kind of tandem mass spectral library.

In the first set of experiments used to evaluate the search performance, the spectra of the libraries were searched against their corresponding library after excluding either this single compound-specific spectrum or all compound-specific spectra prior to searching. By taking a compound-specific spectrum out of the library and matching it against the remaining set of spectra, a positive control experiment was generated. This kind of experiment was used to calculate the sensitivity. A negative control experiment was generated by taking all compound-specific spectra out of the library and matching them against the remaining set of spectra. This kind of experiment was used to calculate the specificity. For each library, sensitivities and specificities observed at varying ramp cutoffs were combined to a library-specific receiver operating characteristic (ROC) curve (Fig. 1a). The performances of the libraries were judged by the position of the ROC curve. Generally, poor tests have lines close to the rising diagonal, whereas perfect tests would produce curves that coincide with the left and top sides of the plot, where both the sensitivity and the specificity are 1. The obtained ROC curves rise steeply and pass close to the top left-hand corner, which shows that both libraries exhibit a high degree of predictive accuracy. As can be deduced from Fig. 1a, the MSforID library exhibited an overall better performance than the Weinmann library, even though the maximum sensitivity of the Weinmann library (0.979) was slightly higher than the maximum sensitivity of the MSforID library (0.976). To assess which parameter was responsible for the observed differences in test performance, plots of sensitivity vs. ramp cutoff and 1-specificity vs. ramp cutoff were generated (Fig. 1b, c). The sensitivity curves were almost congruent, so sensitivity was ruled out as the source of the difference. Major differences, however, were observed for the specificity curves.
The comparatively higher specificity of the MSforID library is mainly attributable to the higher mass spectrometric performance of the instrument used to generate the library entries. The maximum allowable mass tolerance was ±0.10 for the QTrap data vs. ±0.010 for the QqTOF data. The narrower mass window enabled the exclusion of a higher number of false-positive matches. The ROC curves were further used to determine the best-suited cutoff points for the ramp value at which optimal sensitivity and specificity would have been achieved with each library. For this purpose, the point on the curve closest to the left and top side of the plot was determined. Cutoffs in the range 40–50 were found to represent convenient test thresholds to classify search results. For the MSforID library, an optimal cutoff of 42.4 was obtained. A sensitivity of 0.953 and a specificity of 0.936 were achieved. For the Weinmann library, an optimal cutoff of 50.4 was determined. A sensitivity of 0.931 and a specificity of 0.903 were retrieved.
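The selection of the optimal cutoff can be sketched as follows. The sketch assumes that "closest to the left and top side of the plot" means the smallest Euclidean distance to the corner of the ROC plot at which 1-specificity is 0 and sensitivity is 1; the distance measure actually applied is not stated in the text. The function name `best_cutoff` and the tuple layout are illustrative.

```python
import math
from typing import List, Tuple

# (ramp cutoff, sensitivity, specificity) for one point of the ROC curve
RocPoint = Tuple[float, float, float]


def best_cutoff(points: List[RocPoint]) -> RocPoint:
    """Return the ROC point closest to the ideal corner of the plot.

    In ROC coordinates a point lies at x = 1 - specificity and
    y = sensitivity, so its distance to the corner (0, 1) is
    hypot(1 - specificity, 1 - sensitivity). Euclidean distance is an
    assumption of this sketch.
    """
    return min(points, key=lambda p: math.hypot(1.0 - p[2], 1.0 - p[1]))
```

Under this criterion, sensitivity and specificity are weighted equally. If false-positive and false-negative identifications carried different costs, a weighted distance would shift the selected cutoff accordingly.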

Fig. 1 Evaluation of the reliability of a match in the MSforID library and in the Weinmann library illustrated with (a) ROC curves as well as plots of (b) sensitivity and (c) 1-specificity vs. ramp cutoff

A parameter that might affect the search performance is the number of fragment ions stored within a single reference spectrum. Restriction of the maximum number of fragment ions reduces the overall probability of randomly matching fragment ions during spectral comparison, which can influence both sensitivity and specificity. For the Weinmann library, the number of fragment ions stored within a single reference spectrum was limited to the 16 most intense ions. In the MSforID library, all detected fragment ions were stored except for those which were identified as unspecific noise due to implemented filtering steps. To study the impact of the restriction of the maximum number of fragment ions on search performance, modified MSforID libraries were generated. The number of fragment ions stored within a single reference spectrum was limited either to the three, five, ten, or 16 most intense ions. In the first set of experiments, the obtained libraries were searched against themselves, leaving either a single compound-specific spectrum or all compound-specific spectra out of matching. In each case, sensitivities and specificities were determined using ramp cutoffs of 40.0 and 50.0. The obtained results are depicted in Fig. 2a, b. For the modified libraries with a limit of five to 16 signals, a positive effect on both sensitivity and specificity was observed. The increase of specificity was moderate; a more noticeable improvement was obtained for the sensitivity. To decide which modified library would represent the most efficient one, the set of 355 spectra collected on different instrumental platforms was matched to each of the libraries. The number of positive matches using ramp cutoffs of 40.0 and 50.0 was counted (Fig. 2c). Overall, the best results were obtained for the MSforID (N max = 16) library. The performance of that specific library was tested with leave-one-out experiments. The obtained ROC curve is shown in Fig. 1a. 
The ROC curve was used to determine the best-suited cutoff point. An optimal cutoff of 43.2 was determined. A sensitivity of 0.963 and a specificity of 0.940 were achieved. The limitation was particularly advantageous for the fraction of true-positive matches (Fig. 1b). Within this group, average ramp values increased from 88.9 to 91.5. Accordingly, a larger fraction of true-positive matches were able to exceed a defined ramp threshold. At a ramp cutoff of 50.0, for instance, the sensitivity increased from 0.935 to 0.952. The observed changes of specificity were moderate (Fig. 1c). At ramp cutoffs below 50, a slight increase was observed. At a ramp cutoff of 40.0, for instance, an improvement of 0.005 was observed.
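The restriction of a reference spectrum to its most intense fragment ions can be sketched as a simple filter. The function name `keep_most_intense` and the representation of a spectrum as (m/z, relative intensity) pairs are assumptions; how ties in intensity were handled when the modified libraries were generated is not stated, and this sketch keeps the signal that appears first in the input.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, relative intensity)


def keep_most_intense(peaks: List[Peak], n_max: int = 16) -> List[Peak]:
    """Keep at most n_max peaks with the highest intensity.

    The result is returned in order of increasing m/z. Python's sort is
    stable, so among equally intense peaks the earlier one is retained.
    """
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:n_max]
    return sorted(top, key=lambda p: p[0])
```

Spectra that already contain n_max or fewer signals pass through unchanged apart from being ordered by m/z, so the filter affects only fragment-rich spectra, which are typically those acquired at higher collision energies.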

Fig. 2 Impact of the maximum number of fragment ions stored in a reference spectrum on (a) sensitivity and (b) specificity obtained in leave-one-out experiments as well as (c) the relative number of positive matches received for the data set acquired on different instruments

To assess the performance of each library, i.e., the MSforID (N max = 16) library and the Weinmann library, with an independent data set, the libraries were matched to each other. One library was used as sample set to search within the other one. Altogether, more than 12,600 tandem mass spectra (8,252 from the MSforID library (N max = 16) and 4,368 from the Weinmann library) were used for this experiment. Positive controls were represented by 1,669 spectra out of the Weinmann library and 4,755 spectra out of the MSforID (N max = 16) library. The remaining spectra were negative controls. Different statistical parameters were used to compare the library search performances (Table 1) at two different cutoff values (40.0 and 50.0). The results are summarized in Table 2. Both libraries performed well and enabled the sensitive and specific identification of compounds. For instance, LR+ above 5–10 and LR− below 0.2–0.1 clearly indicate that both libraries provide strong to convincing test evidence. Irrespective of the kind of statistical parameter calculated, however, the MSforID (N max = 16) library exhibited better performance than the Weinmann library. In particular, the sensitivity varied significantly. Some of the false-negative matches can be explained by difficulties of the Weinmann library in efficiently matching those spectra of the MSforID library that were collected at very low collision energies to the correct compound. In other cases, duplicated entries (=sets of reference spectra corresponding to one and the same compound stored under different names as well as sets of reference spectra corresponding to different stereoisomers of a certain compound) were found to give rise to false-negative results by leveling down ramp values of correct matches below the cutoff.

Table 2 Statistical evaluation of the reliability of library search by matching the MSforID library (N max = 16) and the Weinmann library against each other

As the final test of match performance, the set of 355 spectra collected on different instrumental platforms was matched to both the MSforID (N max = 16) library and the Weinmann library. The number of false-negative assignments was used as statistical parameter. Without defining any ramp cutoff, both libraries performed well. For both libraries, the false-negative rate was below 2.5% (Fig. 3a), which suggests that the developed search algorithm represents an efficient tool to match tandem mass spectra from any source correctly to a library. Setting the ramp cutoff to 40.0, which is necessary to obtain a qualified match, we observed major performance differences between the two libraries. With the MSforID (N max = 16) library, only 2.8% false-negative assignments were observed. Irrespective of the kind of instrumental platform used, the false-negative rate was below 5.3% (Fig. 3b). Thus, the MSforID library can be rated as transferable and platform independent. With the Weinmann library, 16.3% of outcomes did not qualify. As can be deduced from Fig. 3b, the best results for the Weinmann library were obtained with “tandem-in-space” instruments (QqQ, QTrap, and QqTOF). The observed errors were 1.8–12.3% within this group of instruments. Matching of spectra acquired on the “tandem-in-time” instrument (LIT-FTICR) was less successful. For instance, more than 40% of spectra acquired in LIT-FTICR mode did not retrieve a qualified match. “Tandem-in-time” instruments are known to produce fragment ion mass spectra that contain a low number of fragment ions [16, 20]. Only a limited number of fragmentation pathways are activated. In particular, low molecular weight fragment ions are very weak or even absent. The Weinmann library has difficulty handling this kind of spectrum and returning a qualified match.
The MSforID library is more appropriate because it contains a larger number of compound-specific reference spectra acquired at very low collision energies, which exhibit a higher degree of similarity to “tandem-in-time” spectra.

Fig. 3 Number of incorrect assignments received for the sample spectra acquired on different instruments by matching them either to the MSforID library (N max = 16) or the Weinmann library. To obtain a qualified match, the ramp cutoff was set (a) to 0 or (b) to 40.0

Conclusions

We have performed a number of experiments to evaluate the performance of two important tandem mass spectral libraries (MSforID library and Weinmann library). Almost 13,000 spectra were included. The MSforID algorithm was used for library search. Statistical evaluation of the experimental results revealed the following:

1. The search algorithm developed for the MSforID library proved to be a ready-to-use tool to search within both kinds of tandem mass spectral library and in principle can be used for other libraries as well.

2. Both libraries tested enable the sensitive and specific identification of a compound using a ramp cutoff of 40–50, a parameter which can be set by the user.

3. Matches to the MSforID library are even more reliable due to the high mass accuracy and mass resolution of ions produced by the QqTOF instrument. By limiting the maximum number of fragment ions stored in compound-specific spectra to 16, the performance could be further improved, due to elimination of low-abundance fragment ions.

4. Both libraries show good transferability to “tandem-in-space” instruments. The transfer to “tandem-in-time” instruments, however, was more efficient for the MSforID library. To use a library with this kind of instrument, compound-specific reference spectra should cover a large range of different collision energy settings, especially the range with low collision energies. Ion traps most often deliver only a very limited set of higher molecular weight fragment ions, which are more similar to low collision energy fragment ion spectra than to high collision energy fragment ion spectra obtained with QqTOF or QqQ instrumentation.

5. For the generation of a highly efficient tandem mass spectral library which is transferable to high- and low-resolution instrumentation, it is beneficial to use a high-resolution “tandem-in-space” instrument.

Due to increasing performance and robustness, tandem mass spectral libraries have the potential to become important tools for the qualitative analysis of small (bio-)organic molecules. We believe that scientists would particularly benefit from a tandem mass spectral library for compound identification with different kinds of instrumentation that enables a fast and uncomplicated inclusion of new compounds (e.g., designer drugs, which are nowadays easily available due to globalization and the internet market). Libraries such as MSforID and the presented search algorithm could be a means of achieving this aim. Accordingly, we cordially invite scientists to contribute to the MSforID project by either providing reference compounds or submitting reference spectra.