Introduction

Bile acids are important because of their roles in the elimination of cholesterol and the absorption of lipids and fat-soluble vitamins in the intestine [1]. The two main bile acids present in human bile are cholic acid and chenodeoxycholic acid, which are primarily found as glycine or taurine amidates [2]. Bile acids are found at millimolar levels in the gallbladder and at micromolar levels in serum and urine [3]. Many diseases, along with metabolic and liver disorders, are characterized by increased concentrations of the bile acids [1, 2]. Thus, a sensitive and quantitative method is needed for the diagnosis of liver and gastro-intestinal tract diseases [4]. Many chromatographic methods have been developed for the analysis of bile acids from various biological sources; these methods have been thoroughly reviewed by Scalia [4] and Roda et al. [3]. The current reference method for the analysis of bile acids is GC–MS. However, this method requires extensive sample pre-treatment including extraction, purification, hydrolysis of the conjugates and derivatization [3, 5]. In recent years, many LC–MS techniques have been developed for the qualitative and quantitative analysis of bile acids, including FAB–MS, thermospray MS, ESI–MS/MS and ion-spray MS [2, 6, 7, 8]. However, these methods are still somewhat restricted by low sensitivity, poor specificity, limited resolution, and/or long analysis times [3, 4].

In recent years, quantitative analysis with matrix-assisted laser desorption/ionization time-of-flight (MALDI–TOF) mass spectrometry has been demonstrated for compounds of biological interest [9, 10, 11, 12]. For quantification, internal standards are necessary to compensate for the poor shot-to-shot reproducibility inherent in MALDI analysis [9, 13]. An ideal internal standard would be chemically similar to the analyte, close to the analyte in mass, and chemically stable during analysis [14]. While the internal standard chosen for this analysis, N-1-naphthylphthalamic acid (NPA), is chemically different from the analytes, it still proved to be effective. Possible internal standards that were structurally similar to the analytes either did not give adequate signal or had interference peaks. Six bile acid conjugates were selected for analysis: taurocholic acid (TCA), glycocholic acid (GCA), taurolithocholic acid (TLCA), glycolithocholic acid (GLCA), taurochenodeoxycholic acid (TCDCA), and glycochenodeoxycholic acid (GCDCA). The structures for these compounds and the internal standard are shown in Fig. 1. MALDI's high tolerance to contaminants, good mass resolution and low detection limits make it ideal for the analysis of bile acids from biological media [9]. The bile acids were separated from urine with solid-phase extraction (SPE) cartridges and quantitatively analyzed using MALDI–TOF mass spectrometry. Quantification of bile acids was demonstrated at biologically relevant concentrations directly from urine with minimal sample preparation.

Fig. 1. Structures and molecular weights of bile acids studied and the internal standard

Experimental

Chemicals

The matrix, 9-aminoacridine (9AA), was purchased from Sigma–Aldrich (St Louis, MO, USA). The internal standard, N-1-naphthylphthalamic acid, was purchased from Chem Service (West Chester, PA, USA). Taurolithocholic acid, taurochenodeoxycholic acid, taurocholic acid, glycochenodeoxycholic acid, and glycocholic acid were all purchased as the sodium salts from Sigma–Aldrich. Glycolithocholic acid was purchased from Steraloids (Newport, RI, USA) as a sodium salt. All chemicals were used without further purification. HPLC grade solvents were purchased from Fisher Scientific (Pittsburgh, PA, USA). Urine samples were collected from volunteers in the laboratory.

Sample preparation

The bile acids were initially dissolved in ethanol at the desired concentration. To make the spiked urine samples, 150 μL of these ethanol solutions was added to 3 mL of urine. A blank urine sample, where pure ethanol is added, was prepared and analyzed with each experiment. All stated concentrations for the bile acids were the concentrations of the bile acid in the urine. Varian Bond Elut C18 extraction cartridges with a sorbent mass of 100 mg and a cartridge volume of 3 mL were purchased from Varian Sample Preparation Products (Harbor City, CA, USA). Each sample was pulled through its own SPE cartridge under vacuum using the following protocol. The cartridges were first conditioned with 3 mL of methanol followed by 3 mL of water. The urine sample (2 mL) was then applied to the cartridge followed by 2 mL of water as a wash. The bile acids were then eluted with 1.5 mL of methanol, and the eluent was collected and used as is for MALDI analysis. The matrix, 9AA, was made up at a concentration of 20 mg mL−1 in 50:50 acetone–methanol, and the internal standard, NPA, was made up at a concentration of 28 μg mL−1 in methanol. Two volumes of the 9AA solution and one volume of the NPA solution were added to one volume of the methanol eluent.
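The mixing step fixes the concentrations that actually reach the sample plate. A minimal sketch of that dilution arithmetic is given here, assuming that the volumes are additive; the function and variable names are illustrative only and are not part of the original protocol.

```python
# Dilution arithmetic for the 2:1:1 (9AA : NPA : SPE eluent) mixture.
# Assumption: volumes are additive on mixing.


def final_concentration(stock_conc, parts, total_parts):
    """Concentration of one component after mixing, by simple dilution."""
    return stock_conc * parts / total_parts


# Volume ratio stated in the text.
PARTS = {"9AA": 2, "NPA": 1, "eluent": 1}
TOTAL_PARTS = sum(PARTS.values())

matrix_final = final_concentration(20.0, PARTS["9AA"], TOTAL_PARTS)  # mg/mL
npa_final = final_concentration(28.0, PARTS["NPA"], TOTAL_PARTS)     # ug/mL
eluent_dilution = TOTAL_PARTS / PARTS["eluent"]                      # fold

print(f"9AA on plate: {matrix_final} mg/mL")
print(f"NPA on plate: {npa_final} ug/mL")
print(f"SPE eluent diluted {eluent_dilution}-fold")
```

Under this assumption the spotted solution contains 10 mg mL−1 9AA and 7 μg mL−1 NPA, and the SPE eluent is diluted four-fold.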

Instrumentation and data analysis

A Voyager DE-STR time-of-flight laser mass spectrometer (Applied Biosystems, Foster City, CA, USA), with a nitrogen laser emitting at 337 nm, was used for all studies. The instrument was run in the negative-ion mode with the reflector on. The low mass gate was set at 250 m/z, and delayed extraction was set to a time of 70 nanoseconds. For each solution to be analyzed, 2.0 μL was applied in quintuplicate using the dried-droplet method [9] to a disposable 64-well gold plate. The auto-sampler was used with three different raster search patterns. Eight scans were collected at each position in a pattern and averaged to total 128 scans for each pattern. Positions with counts less than 10,000, indicating insufficient signal intensity, or with counts greater than 65,000, which is near the maximum detection limit of the detector, were discarded. For the suppression and reproducibility experiments, data acquisition, mass calibration and peak area determination were carried out with PerSeptive GRAMS/386 software (version 3.04 Level III, Galactic Industries). All the spectra for each search pattern were then averaged together, and the area under each peak was calculated using the integral function program. For the calibration curves, both peak area and height values were collected with GRAMS/32 AI software (version 6.0, Galactic Industries) to allow for comparison. The peak height and area were determined for each spectrum and the values for each raster pattern were averaged together. The limits of integration were calculated using the tangent algorithm, and the peak separation parameter was set at 50%. All stated peak area or height ratios, RA or RH, are calculated as shown below:

$$\begin{aligned} \text{peak area ratio} &= \frac{\text{bile acid peak area}}{\text{NPA peak area}} \\ R_{\mathrm{A}} &= \text{peak area ratio} - \text{peak area ratio of blank} \end{aligned}$$

RH was obtained in the same way from the peak heights. The limits of integration for the blanks were the average limits of integration for the rest of the calibration curve, and peak areas and heights were calculated using the integrat.ab program. The peak area ratio of the blank was subtracted to account for the endogenous bile acids present in the urine samples. The blank signal was fairly small as the samples were from a healthy adult. All t-test values were calculated as described by Miller and Miller [15].
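The position filter and the blank-corrected ratio RA described in this section can be sketched as follows. This is an illustration only: the data layout, the function names and all peak areas are hypothetical, and averaging the per-position ratios is an assumption about how the averaging was organized.

```python
from statistics import mean

MIN_COUNTS = 10_000  # positions below this were discarded (insufficient signal)
MAX_COUNTS = 65_000  # positions above this were discarded (near detector limit)


def usable(position):
    """True when the position's counts fall inside the accepted window."""
    return MIN_COUNTS <= position["counts"] <= MAX_COUNTS


def mean_peak_area_ratio(positions, analyte):
    """Average analyte/NPA peak area ratio over the usable positions."""
    ratios = [p[analyte] / p["NPA"] for p in positions if usable(p)]
    return mean(ratios)


def blank_corrected_ratio(sample_positions, blank_positions, analyte):
    """R_A: sample peak area ratio minus the blank urine peak area ratio."""
    return (mean_peak_area_ratio(sample_positions, analyte)
            - mean_peak_area_ratio(blank_positions, analyte))


# Made-up peak areas (arbitrary units), for illustration only.
sample = [
    {"counts": 8_000, "GLCA": 120.0, "NPA": 400.0},    # discarded: too weak
    {"counts": 30_000, "GLCA": 500.0, "NPA": 1000.0},  # kept
    {"counts": 70_000, "GLCA": 900.0, "NPA": 1500.0},  # discarded: too strong
    {"counts": 40_000, "GLCA": 660.0, "NPA": 1200.0},  # kept
]
blank = [{"counts": 35_000, "GLCA": 20.0, "NPA": 1000.0}]

r_a = blank_corrected_ratio(sample, blank, "GLCA")
print(f"R_A = {r_a:.3f}")
```

The same functions apply to peak heights if the stored values are heights rather than areas.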

Suppression

In order to test for any suppression effects, urine samples were spiked, as described above, with each individual bile acid, and one further sample was spiked with a mixture of all six bile acids, for a total of seven samples. All of these solutions were analyzed as described above. The peak area ratio for each bile acid in the sample containing only that bile acid, RA(Individual), and the peak area ratio for the same bile acid in the mixed sample, RA(Mixed), were calculated, and these two peak area ratios were compared using the t-test. When the peak area ratios were significantly different according to the t-test, the concentrations and volumes of the internal standard and the matrix were optimized through an iterative process. When one set of conditions was found for which there was no significant difference between the peak area ratios for all of the bile acids, the method was considered to be optimized. The optimized conditions were as follows: volume ratio, 2:1:1 9AA:NPA:SPE eluent; internal standard (NPA) concentration, 28 μg mL−1; and matrix (9AA) concentration, 20 mg mL−1.

Reproducibility

A urine sample with all six bile acids was prepared and stored in the refrigerator. A portion of this sample was analyzed on three consecutive days as previously described. The day having the highest peak area ratio, RA, was compared with the day having the lowest peak area ratio using the t-test.

Calibration curves

The urine samples for the calibration curves were prepared and analyzed as described above in triplicate on three different days. The average ratio of the peak areas and peak heights was then plotted against the concentration of the bile acid in the urine. Simple calibration curves were prepared with Microsoft Excel 97, and the more complex one with 95% confidence limits for slope and standard error bars for each point was calculated using Axum 6.0 (Release 2, Mathsoft). The values for the 95% confidence limits of the slope and intercept (Eqs 5.7 and 5.8, respectively, in Ref. [15]), the relative standard deviation of the slope (p. 22 in Ref. [15]), and the limit of detection (Eq. 5.11 in Ref. [15]) were calculated as described by Miller and Miller [15].
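The calibration statistics named in this section can be sketched with the usual unweighted least-squares expressions. The sketch assumes that these textbook formulas correspond to the cited equations in Ref. [15]; the function name and the data are hypothetical, and the critical t value is supplied from a table rather than computed.

```python
from math import sqrt


def calibration_statistics(conc, ratio, t_crit):
    """Unweighted least-squares line ratio = a + b * conc with basic statistics.

    t_crit is the two-sided 95% Student t value for n - 2 degrees of freedom,
    taken from a table (for example 4.30 when n = 4).
    """
    n = len(conc)
    x_bar = sum(conc) / n
    y_bar = sum(ratio) / n
    sxx = sum((x - x_bar) ** 2 for x in conc)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(conc, ratio))

    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept

    # Standard deviation of the residuals about the fitted line.
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(conc, ratio))
    s_yx = sqrt(ss_res / (n - 2))

    s_b = s_yx / sqrt(sxx)                                   # std. dev. of slope
    s_a = s_yx * sqrt(sum(x * x for x in conc) / (n * sxx))  # std. dev. of intercept

    return {
        "slope": b,
        "intercept": a,
        "s_yx": s_yx,
        "slope_ci": (b - t_crit * s_b, b + t_crit * s_b),
        "intercept_ci": (a - t_crit * s_a, a + t_crit * s_a),
        "slope_rsd_percent": 100.0 * s_b / b,
        # Concentration whose predicted signal is intercept + 3 * s_yx.
        "lod": 3.0 * s_yx / b,
    }


# Made-up calibration points, for illustration only.
stats = calibration_statistics([0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.1, 2.9],
                               t_crit=4.30)
low, high = stats["intercept_ci"]
print("intercept consistent with zero:", low <= 0.0 <= high)
```

Checking whether zero lies inside the intercept confidence interval is the same comparison that is later applied to the y-intercepts of the calibration curves.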

Quantification test

The peak area ratio, RA, of two of the calibration curve samples collected on one day was used with the appropriate calibration curve to calculate their concentrations. The 95% confidence limits of the calculated average concentrations were calculated as $x_0 \pm t\,s_{x_0}$, where $s_{x_0}$ was determined using Eq. 5.10 from Miller and Miller [15]. This calculated average concentration was then compared to the actual concentration using the t-test. The percentage error was also calculated as:

$$\text{percentage error} = \left| \frac{\text{actual} - \text{calculated}}{\text{actual}} \right| \times 100\%.$$
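The back-calculation of a concentration from a measured ratio, with its confidence limits and percentage error, can be sketched as below. The expression for the standard deviation of the predicted concentration is the common textbook form for an unweighted calibration line and is assumed here to match the cited Eq. 5.10; all names and numbers are illustrative.

```python
from math import sqrt


def predict_concentration(y0, m, a, b, s_yx, n, y_bar, sxx, t_crit):
    """Concentration x0 and its confidence half-width from a mean response y0.

    y0    : mean peak area ratio of the unknown (from m replicate readings)
    a, b  : intercept and slope of the calibration line
    s_yx  : standard deviation of the calibration residuals
    n     : number of calibration points
    y_bar : mean response of the calibration points
    sxx   : sum of squared deviations of the calibration concentrations
    t_crit: two-sided 95% t value for n - 2 degrees of freedom (from a table)
    """
    x0 = (y0 - a) / b
    s_x0 = (s_yx / b) * sqrt(1.0 / m + 1.0 / n
                             + (y0 - y_bar) ** 2 / (b ** 2 * sxx))
    return x0, t_crit * s_x0


def percent_error(actual, calculated):
    """Absolute relative error of the calculated concentration, in percent."""
    return abs((actual - calculated) / actual) * 100.0


# Made-up calibration summary and unknown, for illustration only.
x0, half_width = predict_concentration(
    y0=1.5, m=3, a=0.06, b=0.96, s_yx=sqrt(0.016),
    n=4, y_bar=1.5, sxx=5.0, t_crit=4.30,
)
print(f"x0 = {x0:.3f} +/- {half_width:.3f}")
print(f"error vs. a known value of 1.6: {percent_error(1.6, x0):.1f}%")
```

The half-width grows as the measured ratio moves away from the mean calibration response, which is why samples near the middle of the calibration range give the tightest limits.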

Results and discussion

An example negative-ion MALDI spectrum of a spiked urine sample with glycine conjugates and taurine conjugates at concentrations of 107 and 4.29 μg mL−1, respectively, is shown in Fig. 2. These concentrations are from near the middle of the calibration curve. The internal standard, NPA, is seen at 290.1 m/z, and the bile acids are seen between 432 m/z and 515 m/z. Additional peaks arising from both the matrix, 9AA, and other urine components can also be seen in the spectrum, denoted as M and U, respectively. However, none of these peaks interfered with the peaks of interest. The main method of ionization in 9AA is the removal of a proton [16]. Thus, the bile acids are seen as [M–H]− peaks. For the glycine conjugates, this would be the proton of the carboxylic acid group, while for the taurine conjugates, the proton on the sulfonic acid group would be most easily extracted.

Fig. 2. MALDI spectrum of a urine sample spiked with glycine and taurine conjugates (107 and 4.29 μg mL−1, respectively). M = matrix peak; U = peaks from other urine components

Suppression

One of the major impediments to the application of MALDI to quantitative experiments is signal suppression, especially of less concentrated components [9, 11, 17]. However, this effect can be compensated for by the use of an internal standard [9, 18] and by optimization of the experimental parameters [9]. In order to test whether the analyte signals were being suppressed by other components in the sample, each bile acid was prepared in two ways: individually and as a mixture. All six of the bile acids were spiked into the mixed urine sample, while, in the individual state, each bile acid was the only one spiked into the urine sample. The area ratio of each bile acid in the individual state, RA(Individual), was compared with that in the mixed state, RA(Mixed), using the t-test. If RA(Individual) and RA(Mixed) were significantly different, then the concentrations of the internal standard and the matrix and the volumes of all components were varied until no significant difference was seen. The optimized conditions were as follows: volume ratio, 2:1:1 9AA:NPA:SPE eluent; internal standard (NPA) concentration, 28 μg mL−1; and matrix (9AA) concentration, 20 mg mL−1. The peak area ratios, RA, with their standard deviations and the t statistic for the final optimized method are shown in Table 1. The t-test critical value is 2.78 at the 95% confidence limit with four degrees of freedom. Since all the calculated t values in Table 1 are below this value, these optimized conditions eliminated the suppression effects for the internal standard and the bile acids.
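The comparison of RA(Individual) with RA(Mixed) can be sketched as a two-sample t-test with a pooled standard deviation. Two assumptions are made here: that the pooled-variance form of the test was the one used, and that each mean rests on three values, which is what four degrees of freedom would imply. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT_95_DF4 = 2.78  # critical value quoted in the text (95%, 4 d.f.)


def pooled_t(sample_1, sample_2):
    """Two-sample t statistic with pooled variance, and its degrees of freedom."""
    n1, n2 = len(sample_1), len(sample_2)
    s1, s2 = stdev(sample_1), stdev(sample_2)
    dof = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / dof
    t = abs(mean(sample_1) - mean(sample_2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, dof


# Made-up peak area ratios, for illustration only.
ra_individual = [0.52, 0.55, 0.50]
ra_mixed = [0.54, 0.51, 0.56]

t_value, dof = pooled_t(ra_individual, ra_mixed)
suppressed = t_value > T_CRIT_95_DF4
print(f"t = {t_value:.2f} with {dof} d.f.; significant difference: {suppressed}")
```

A t value below the critical value means the individual and mixed ratios cannot be distinguished at the 95% level, which is the criterion used here for the absence of suppression.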

Table 1. Comparison of bile acids individually and in a mixture. All average peak area ratios, RA, are shown with their standard deviation. The t-test critical value is 2.78 at the 95% confidence limit

Figure 3 shows portions of the spectra from two urine samples with the same concentration of bile acids (51.4 μg mL−1) and internal standard (NPA, 28 μg mL−1), and illustrates how the internal standard and the optimized method compensated for signal suppression caused by an unexpected urine component, a hyoscyamine metabolite. Hyoscyamine is a prescription drug that was being taken when this urine sample was collected. If only the right hand sides of the spectra are examined, it would appear that these two samples do not have the same concentration of GLCA and GCDCA. The peak area for the bile acids increases by 37% for GCDCA and 42% for GLCA between 3A and 3B, which might falsely lead one to believe that the concentration of the urine sample that generated spectrum 3B was higher. However, if the ratios of the peak areas are compared, only a 1% decrease for GCDCA and a 2% increase for GLCA is seen between Figs. 3A and 3B, because the peak area of the internal standard, NPA, was similarly affected by the hyoscyamine metabolite. Thus, by using an internal standard and optimizing the method, the unexpected hyoscyamine metabolite did not cause any problems in the quantitative analysis.
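The percentages quoted above also fix, approximately, how much the NPA peak area itself changed between the two spectra, because the ratio change equals the analyte area change divided by the NPA area change. The short check below uses only the rounded percentages from the text, so the results are approximate; the function name is illustrative.

```python
def implied_npa_change(area_change, ratio_change):
    """Fractional change in NPA peak area implied by the analyte changes.

    ratio_new / ratio_old = (area_new / area_old) / (npa_new / npa_old),
    so npa_new / npa_old = (1 + area_change) / (1 + ratio_change).
    """
    return (1.0 + area_change) / (1.0 + ratio_change) - 1.0


# Rounded values from the text: peak area change, peak area ratio change.
gcdca = implied_npa_change(area_change=0.37, ratio_change=-0.01)
glca = implied_npa_change(area_change=0.42, ratio_change=0.02)

print(f"NPA change implied by GCDCA: {gcdca:.1%}")
print(f"NPA change implied by GLCA:  {glca:.1%}")
```

Both analytes imply an increase in the NPA peak area of roughly 38 to 39%, so the two sets of figures are mutually consistent within rounding.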

Fig. 3. Spectra from two urine samples having the same bile acid (51.4 μg mL−1) and internal standard (NPA, 28 μg mL−1) concentrations. *metabolite of hyoscyamine

Reproducibility

In order to determine if the method was reproducible, a urine sample spiked with all six bile acids was prepared and analyzed on three consecutive days. A complete analysis was performed on each day from extraction to data collection. The average peak area ratios, RA, and standard deviations are shown in Table 2, along with the t statistic for the largest and smallest area ratios. The t-test critical value is 2.78 at the 95% confidence limit for four degrees of freedom, and all of the t values are well below this value. Since the largest and smallest peak area ratios shown were not statistically different, the third peak area ratio would not be statistically different either. Therefore, the developed method is reproducible over three consecutive days.

Table 2. Day-to-day reproducibility of bile acid peak area ratios. Peak area ratios (RA) are shown with their standard deviations. The largest and smallest area ratios were compared with the t-test. The t-test critical value is 2.78 at the 95% confidence limit

Calibration curves

Figure 4 shows portions of the MALDI spectra from four points on the calibration curve, and illustrates how the ratio of the analyte signal to the internal standard signal is used for quantification. The internal standard (NPA) is at a constant concentration of 28 μg mL−1, and the concentration of GLCA is 171, 107, 51.4, and 8.57 μg mL−1. As the GLCA concentration decreases, the GLCA peak height and area can be seen to decrease relative to the peak height and area of the internal standard.

Fig. 4. GLCA spectra as a function of concentration. The internal standard (NPA) is at a constant concentration of 28 μg mL−1. The height of the internal standard peak was normalized in all spectra to allow for visual comparison

The calibration curves for the glycine and taurine conjugates are shown in Figs. 5 and 6, respectively. Each point represents the average peak area ratio, RA, for that concentration over three days. The equations for the linear regression lines and the correlation coefficients are also shown in the figures for each analyte. While the correlation coefficient is a commonly reported statistic and all values here are greater than 0.996, it is more easily misinterpreted than the confidence limits of the slope and intercept [9, 15]. For clarity's sake, only one calibration curve, that for GLCA, is shown complete with the standard error bars for each point and the 95% confidence limits of the slope in Fig. 7. Table 3 gives the confidence limits of the slopes and intercepts for all of the calibration curves, along with the relative standard deviation (%RSD) of the slope and the limit of detection (LOD).

Fig. 5. Calibration curves for glycine conjugates. Each point represents the average peak area ratio, RA, for that concentration over three days. Observed linear range: 8–171 μg mL−1

Fig. 6. Calibration curves for taurine conjugates. Each point represents the average peak area ratio, RA, for that concentration over three days. Observed linear range: 0.238–9.52 μg mL−1

Fig. 7. GLCA calibration curve. Error bars depict the standard error of individual points. Dashed lines represent the 95% confidence limit of the slope

Table 3. Comparison of calibration curve statistics using peak areas and peak heights

The precision of the slope, as expressed by %RSD, was approximately two percent when using the peak area ratios and approximately three percent when using the peak height ratios. In addition, when the peak area ratios are used, the y-intercept value for all of the calibration curves is not statistically different from zero within the 95% confidence limits. y-Intercept values that are greater than zero can often be attributed to an improperly subtracted background [9]. At lower signal intensities, the peak shape is more variable, which makes peak area measurements more accurate and precise than peak height measurements. Thus, when measuring the very low signal of background spectra, the y-intercept values are slightly higher when the peak height ratios are used. Since the LODs were also slightly lower when the peak areas were used, the peak area ratios were chosen as the preferred measurement.

The matrix ionizes the sulfonate group of the taurine conjugates much more easily than the carboxylic acid group of the glycine conjugates [16]. The effect of this increased ionization efficiency of the taurine conjugates can be seen in the increased slope and lower detection limits of the taurine conjugates. The slopes of the calibration curves also follow an interesting pattern that can be attributed to ease of ionization. When examining either set of conjugates, it can be seen that the slope increases slightly with increasing number of hydroxyl groups on the ring system.

Quantification test

In order to test the accuracy of the method for analyzing unknown samples, the concentrations of two samples were determined for each bile acid from one day's measurements. While determining the concentration of an unknown sample from a measured peak area ratio is simple, calculation of the confidence level of the measured result is more complicated and must include the error of the slope and intercept [9, 15]. The average peak area ratio was used to determine the concentration with the previously determined calibration curve, and the 95% confidence limits of the average concentrations were calculated as $x_0 \pm t_{(n-2)}\,s_{x_0}$, as described by Miller and Miller [15]. The calculated concentration was then compared to the known spiked concentration of the sample using the t-test. These values are all shown in Table 4. All of the t values were below the t critical value of 4.30, and thus it can be said with 95% confidence that the calculated average was not statistically different from the known value. This agreement is also reflected in the percent error values, which are all below ten percent.

Table 4. Comparison of actual and calculated concentrations of bile acids. The calculated average concentration values are reported with their 95% confidence limits. The t-test critical value is 4.30 at the 95% confidence limit

Comparison with other methods

The extraction, purification, hydrolysis and preparation of volatile derivatives required for the GC–MS reference method are tedious and time-consuming. In addition, the derivatization required to make the sample volatile also results in the loss of information about conjugation [3, 4, 5]. The present MALDI–TOF method requires no derivatization step, and allows simultaneous analysis of multiple bile acids. Furthermore, with an analysis time for unknowns of approximately 10 min, the developed method is considerably faster than the GC–MS and HPLC–MS methods [7, 8]. The analysis time and limits of detection found here are comparable to those of the LC–ESI–MS/MS method developed by Perwaiz et al. [6]. A MALDI–MS system also has the advantage over an ESI–MS system of allowing multiple unknown samples to be analyzed on the sample plate, without any sample memory effects. Thus, a MALDI–MS method would allow more samples to be analyzed at one time. The present MALDI–TOF–MS method has also been applied to plasma samples and will be published independently.

Conclusion

In conclusion, a fast, easy method has been developed for the quantification of bile acids directly from urine by MALDI–MS. No significant suppression effects were seen and the method was found to be reproducible from day to day. The concentrations of known test samples were accurately determined based on the proposed calibration curves.