Introduction

Hyperbilirubinaemia, defined as a serum bilirubin concentration >250 µmol/l, is a frequent problem in term neonates. As a consequence, serum bilirubin measurement is the standard method to assess jaundiced neonates. It is, unfortunately, invasive, painful, and costly. To overcome these drawbacks, non-invasive methods of bilirubin measurement have been proposed. Since Yamanouchi described a transcutaneous bilirubinometer in 1980 [22], several devices have been developed to non-invasively assess serum bilirubin concentration [16, 17, 21, 22]. These instruments have been shown to give results that correlate well with serum bilirubin values, but exhibit large scattering [1]. Consequently, with the application of non-invasive bilirubinometers, the number of serum measurements could be reduced and painful heel pricks avoided but not completely abolished [5, 15]. Warnings have been published for the use of bilirubinometers in newborns with dark and yellow skin and in bright sunlight [23].

The most readily available and cheapest non-invasive method for assessment of jaundice is the human eye. As early as 1969, Kramer [13] published a study which correlated the clinically observed cephalocaudal advancement of jaundice with the values of unconjugated serum bilirubin. Although in clinical practice the human eye is used to screen babies for jaundice, this method has, to our knowledge, not yet been systematically compared with non-invasive bilirubinometers.

The aim of our study was therefore to compare the clinical assessment of jaundice using the method of Kramer and two methods of transcutaneous bilirubinometry (Minolta JM-102 and BiliCheck) with serum bilirubin values. The second aim was to investigate the effect of skin colour and ambient light on the three non-invasive methods.

Subjects and methods

Patient selection

Full-term healthy newborns, with a birth weight of at least 2000 g and not older than 6 days, were recruited from July 1st 2002 until June 30th 2003 in the maternity ward. Exclusion criteria were: haemolysis, jaundice within the first 36 h, and phototherapy.

Clinical assessment of jaundice

The five dermal zones of the cephalocaudal progression of icterus, described by Kramer, were used [13]. These dermal zones and the corresponding ranges of serum bilirubin values are given in Fig. 1. Jaundice progressing to zone 5 corresponds to a serum bilirubin of >250 µmol/l. All infants were assessed independently by the primary investigator and by the nurse in charge.

Fig. 1 Correlation between icteric dermal zones (Kramer) and serum bilirubin values

The Minolta JM-102 bilirubinometer

The Minolta Airshields JM-102 bilirubinometer is based on a two-wavelength analysis and gives a reflectance unit (index) [11, 12]. The manufacturer recommends that each institution establish a conversion table adapted to its specific patient mix and reference method. The corresponding values obtained in a pilot study are shown in Table 1. The equation for converting Minolta JM-102 index values (x) to serum bilirubin equivalents in µmol/l (y) is: y = 16x − 124. Following the manufacturer's instructions, two readings were taken from the sternum and only the higher one included [7, 12]. Care was taken to avoid skin areas that were bruised, excessively hairy, or hypermelanotic.
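The conversion above can be sketched in a few lines of Python. The function name is ours and purely illustrative; the slope and intercept are the pilot-study values quoted in the text and would need to be re-derived for any other institution or reference method.

```python
def minolta_index_to_umol_per_l(index: float) -> float:
    """Convert a Minolta JM-102 index to a serum bilirubin equivalent (µmol/l).

    Uses the linear relation y = 16x - 124 from the text. The coefficients are
    institution-specific calibration values, not a property of the device.
    """
    return 16.0 * index - 124.0


# Illustrative index values only (not study data).
for idx in (20, 22, 24):
    print(idx, minolta_index_to_umol_per_l(idx))
```

An index of 22, for instance, maps to 228 µmol/l under this relation, which is in line with the rounded value of about 230 µmol/l used for this cut-off elsewhere in the paper.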

Table 1 Guide values for the conversion of Minolta JM-102 index numbers to serum bilirubin equivalents

The BiliCheck bilirubinometer

The BiliCheck system averages the spectra of five replicate measurements at one site to give a bilirubin estimate in µmol/l (or mg/dl) that is based on reflectance data analysed at multiple wavelengths [2]. Measurements were performed on the forehead and on the sternum.

Serum bilirubin assay

For each comparison, two samples of 40 µl blood, collected in capillaries from a heel prick, were centrifuged for 10 min and the serum bilirubin measured with a Bilimeter II (Pfaff, Neuburg, Germany), a standard photometer using two wavelengths (455 nm and 575 nm). This instrument was calibrated with test solutions and has a technical error of ±1.3% for values up to 350 µmol/l. In a previous study, its performance was compared to HPLC and was shown to correlate well with HPLC values [17]. The mean value of the two capillaries was used for comparison.

Study procedure

Before starting the investigation, nurses on the maternity ward were instructed in the clinical assessment of the progression of jaundice using the five dermal zones described by Kramer [13]. A nurse routinely checked each infant for jaundice at intervals no greater than 12 h. If jaundice reached the thighs (zone 3), the primary investigator undertook the same visual assessment of icterus without knowing the nurse’s result and performed measurements with the two transcutaneous devices. The ambient light, whether daylight or fluorescent light (Osram Lumilux warm white L36 W31), was recorded. Infants were also included in the study if jaundice did not reach zone 3 by 72 h; in these infants a blood sample was taken for bilirubin measurement together with the routine metabolic screening. Thus, no additional heel prick was required. For all blood samplings, the nurse administered an oral sweetener for pain relief [6] before pricking the pre-warmed heel and taking the two capillary blood samples. Blood sampling occurred within 10 min after the application of the three non-invasive methods. The sampled blood was protected from light and analysed within 30 min.

Data analysis

The statistical analysis was performed using SPSS (Version 11.5). We compared the different methods (non-invasive measurements versus serum bilirubin) using linear regression analysis, an intraclass correlation coefficient (ICC) [14, 18], a Bland-Altman analysis [4], and receiver operating characteristic (ROC) analysis [10]. The ICC evaluates the level of agreement between different measurement methods. ICC=1 indicates perfect agreement while ICC=0 indicates no agreement at all. In contrast to the correlation coefficient (r) of the regression analysis, the ICC reflects systematic differences between methods [14]. For example, if one method always yields double the value of the other method, the conventional correlation coefficient shows a value of 1, which reflects the perfect linear relation. The ICC for such an example is, however, smaller than 1 and thus represents the mismatch. For the Bland-Altman analysis, Kramer zones (Fig. 1) and Minolta units (Table 1) were converted into µmol/l and the difference “non-invasive values − serum values” plotted against serum values. This analysis was used to calculate the 95% confidence intervals (CI) (±2 standard deviations) and to investigate the influence of skin colour and ambient light. To compare the performance of the three non-invasive tests, sensitivity, specificity, and the area under the ROC curve [10] were calculated. The study protocol was approved by our local ethics committee and parental consent was obtained for each infant.
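The doubling example can be made concrete with a minimal Python sketch. The text does not state which ICC form was computed in SPSS, so the two-way, single-measure, absolute-agreement form used here is an assumption, and the numbers are invented for illustration.

```python
from statistics import mean


def pearson_r(x, y):
    """Conventional (Pearson) correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def icc_absolute_agreement(x, y):
    """Two-way, single-measure, absolute-agreement ICC for two methods.

    Assumed form; the ICC variant used in the study is not specified.
    """
    n, k = len(x), 2
    values = list(x) + list(y)
    grand = mean(values)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]   # one row per infant
    col_means = [mean(x), mean(y)]                    # one column per method
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical values (µmol/l); the second method always reads double.
serum = [100, 150, 200, 250, 300]
doubled = [2 * v for v in serum]

print("r   =", pearson_r(serum, doubled))
print("ICC =", icc_absolute_agreement(serum, doubled))
```

With these invented values the correlation coefficient is 1 while the ICC falls well below 1, which is the behaviour described above: the ICC penalises the systematic difference that r ignores.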

Results

Patients

A total of 140 infants born between 37 0/7 and 41 6/7 gestational weeks (median 39 weeks) were included in this study, of whom 92 were white (Caucasian) and 48 were non-white (18 of Asian, 30 of Indian or African origin). The median birth weight was 3320 g (range 2050–4400 g). Ten infants (7%) were growth retarded with a birth weight <10th percentile.

Comparison between non-invasive methods and serum bilirubin

All three non-invasive methods correlated highly with serum bilirubin (Fig. 2) with the two bilirubinometers performing better than the clinical assessment. This result was also confirmed by the intraclass correlation (Table 2).

Fig. 2 Results of non-invasive methods plotted against serum bilirubin values. White infants (open circles), non-white infants (solid triangles)

Table 2 Linear correlation coefficients (R²) and ICC for non-invasive methods compared with serum bilirubin values for white and non-white infants

The scattering for the Minolta instrument, quantified as ±2 SDs or 95% CI, was 4 units (equivalent to 56 µmol/l) for both white and non-white infants. The BiliCheck showed more variability (P=0.02) in non-white infants (±68 µmol/l CI) than in white infants (±52 µmol/l CI) (Fig. 3). Assessments made using the Kramer method had a 95% CI of ±1.5 zones (corresponding to ±75 µmol/l). The inter-observer agreement between clinical assessment by nurses and by the physician was significantly (P<0.05) better for white infants (kappa 0.56) than for non-white infants (kappa 0.36). Table 3 shows the sensitivity and specificity for all three non-invasive methods at various cut-offs for serum bilirubin values >250 µmol/l. The area under the ROC curve was 0.98 for Minolta, 0.92 for BiliCheck over forehead, 0.88 for BiliCheck over sternum, and 0.84 for the Kramer method.
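The scattering figures quoted here come from the Bland-Altman differences. A minimal sketch of that calculation follows, assuming paired readings in µmol/l; the function name and the sample values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(noninvasive, serum):
    """Return (bias, lower, upper) for the differences non-invasive - serum.

    The limits are bias ± 2 sample standard deviations, matching the
    ±2 SD convention described in the data analysis section.
    """
    diffs = [n - s for n, s in zip(noninvasive, serum)]
    bias = mean(diffs)
    spread = 2 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical paired readings (µmol/l), for illustration only.
serum_values = [180, 210, 240, 260, 290]
transcutaneous = [170, 220, 230, 270, 280]

bias, lower, upper = bland_altman(transcutaneous, serum_values)
print(f"bias={bias:.1f}, limits=({lower:.1f}, {upper:.1f})")
```

A negative bias indicates that the non-invasive method underestimates serum bilirubin on average, while the width of the limits corresponds to the scatter reported for each method.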

Fig. 3 Bland-Altman plot: BiliCheck (forehead) in white (open circles) and non-white infants (solid triangles) versus serum bilirubin

Table 3 Sensitivity and specificity for detection of serum bilirubin >250 µmol/l by the method of Kramer (nurses) and the two transcutaneous methods (Minolta JM-102 and BiliCheck on the forehead)

Influence of the ambient light

The Kramer method and the BiliCheck measurements over the sternum showed significantly less accuracy in daylight than in white fluorescent light (P<0.001 for Kramer, P<0.02 for BiliCheck). Measurements performed with the Minolta JM-102 instrument were not influenced by ambient light.

Discussion

Our study confirms the previously reported highly significant correlation of BiliCheck and Minolta JM-102 readings with total serum bilirubin values [3, 11, 16, 21]. The correlation of clinical assessment with serum bilirubin values for both nurses and the primary investigator is lower but still statistically significant. These results, based on linear regression analyses, were also confirmed by ICC, thereby ruling out systematic deviations.

BiliCheck applied to the forehead and clinical assessment by the physician were significantly less accurate in non-white infants than in white infants. The BiliCheck systematically underestimated serum bilirubin values by 25 µmol/l on average in non-white infants (P<0.001) (Fig. 3). This finding is in contrast with previous reports and should be investigated further. In the meantime, caution is indicated with transcutaneous bilirubinometers in infants with dark skin.

For clinical practice, the specificity and sensitivity of a test are considered to be the most useful parameters for identification or exclusion of patients with a disease (Table 3). As both can be varied—in opposite directions—by changing the cut-offs, a combined analysis such as the ROC area is best suited to compare the performance of a test. In the present study, carried out in term newborn infants, Minolta had the best performance (ROC=0.98), closely followed by BiliCheck over forehead (ROC=0.92) and over sternum (ROC=0.88), with clinical assessment providing a weaker, though still acceptable, performance (ROC=0.84). It should be noted that the presented results are valid only for term newborns. We have previously shown that all three non-invasive test performances were lower in preterm infants in the range of 34 0/7 to 36 6/7 gestational weeks [19].
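The trade-off between sensitivity and specificity, and the way the ROC area summarises it, can be illustrated with a short sketch. The data below are invented, the function names are ours, and the rank-based area calculation is one standard way of obtaining the ROC area, not necessarily the routine used in SPSS.

```python
def sensitivity_specificity(readings, serum, cutoff, serum_threshold=250):
    """Sensitivity and specificity of 'reading >= cutoff' for serum > threshold."""
    tp = fn = tn = fp = 0
    for reading, s in zip(readings, serum):
        diseased = s > serum_threshold
        positive = reading >= cutoff
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one infant above and one below the threshold")
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(readings, serum, serum_threshold=250):
    """ROC area as the probability that a diseased infant has the higher reading."""
    pos = [r for r, s in zip(readings, serum) if s > serum_threshold]
    neg = [r for r, s in zip(readings, serum) if s <= serum_threshold]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical paired values (µmol/l), for illustration only.
tcb = [150, 180, 200, 220, 240, 260, 280]
tsb = [140, 200, 190, 260, 230, 270, 300]

for cutoff in (190, 250):
    print(cutoff, sensitivity_specificity(tcb, tsb, cutoff))
print("AUC", roc_auc(tcb, tsb))
```

Raising the cut-off from 190 to 250 µmol/l in this toy data set lowers sensitivity and raises specificity, whereas the ROC area is a single value that does not depend on the chosen cut-off.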

Clinical assessment

Our study confirms the findings of Kramer [13] who reported a mean serum bilirubin increase of 50±37 µmol/l (±1 SD) for each dermal zone of jaundice in white and non-white infants. However, a closer analysis of the relation between dermal zones and serum bilirubin shows only a small increase of 13 µmol/l from zone 2 to zone 3. This means that the relation between these two variables is not linear but S-shaped (Fig. 2).

In our study all infants with jaundice progression below zone 3 (only head and upper trunk), assessed by nurses and by the physician, had serum bilirubin below 250 µmol/l. Therefore, for practical purposes this cut-off can be used to rule out hyperbilirubinaemia with very high confidence, eliminating the need to perform further investigations. In infants with jaundice progression to zones 3 and 4, the risk for hyperbilirubinaemia is 14% and 25% respectively. As a consequence, transcutaneous bilirubinometry or a serum bilirubin measurement should be performed in these infants.

Minolta JM-102

The Minolta (Airshields) JM-102 showed the best performance of all three non-invasive methods. However, the inconvenience of having to derive institution-based calibration factors, adapted to both the population and the reference method, remains a problem. Our action guide values were established with white infants (Table 1) and are approximately the same as those reported by Wick [20]. However, our reference range is about 20 µmol/l higher than that published by Harish and Sharma [11] who investigated a non-white population of 60 term newborns in India. The practical consequence from this difference is that skin pigmentation affects transcutaneous jaundice assessment.

With its high sensitivity, the Minolta JM-102 bilirubinometer is ideal for identifying infants without hyperbilirubinaemia (<250 µmol/l). This means that infants with a Minolta index below 22 need not endure a heel prick for serum bilirubin measurement. Of all infants with Minolta readings of 23 units or lower, only 6% had serum bilirubin levels >250 µmol/l and none had a serum bilirubin >300 µmol/l. Therefore, this cut-off can be considered safe provided that transcutaneous measurement is repeated every 4 to 6 h. It should be noted, however, that this is valid only for healthy term infants more than 36 h old and not for younger or for sick infants.

BiliCheck

Our findings confirm the previously reported good correlation of BiliCheck values with serum bilirubin values in healthy white term infants [8, 16, 17]. However, in infants with non-white skin, the BiliCheck instrument showed a significantly lower performance (Fig. 3), with wider CI and underestimation of serum bilirubin by 25±35 µmol/l, a finding similar to that already reported by Engle et al. [9]. Until this finding has been confirmed or refuted in a larger patient sample, BiliCheck values should be interpreted with caution in non-white infants.

In the daily clinical situation, a heel prick can only be avoided if the BiliCheck shows values below 190 µmol/l (sensitivity 94%, Table 3).

Another unexpected finding is the fact that BiliCheck measurements were affected by ambient light. For BiliCheck measurements over the sternum, the correlation with serum bilirubin was significantly less accurate with daylight than with fluorescent room light (P<0.018). The same tendency was observed over the forehead (P<0.184). We therefore suggest that transcutaneous bilirubinometer readings are dependent on the light conditions under which the measurements are performed. This hypothesis should be tested in a larger population.

Conclusion

All three non-invasive methods are well suited for estimation of serum bilirubin but have relatively large 95% CI. Non-invasive measurements may be affected by skin pigmentation and ambient light. In healthy term newborns, hyperbilirubinaemia (>250 µmol/l) can be safely ruled out by eye if jaundice does not reach the abdomen or the extremities (Kramer zones 1 and 2), by a Minolta reading of <22 units (<230 µmol/l), or by a BiliCheck reading below the cut-off of 190 µmol/l. If respective thresholds are exceeded, serum bilirubin concentration should be measured.
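The rule-out thresholds summarised above can be written down as a small lookup, shown here only as an illustrative sketch. The dictionary keys and function name are hypothetical, and the thresholds apply only under the conditions stated in the paper (healthy term infants more than 36 h old, institution-specific Minolta calibration).

```python
# Thresholds from the conclusion: a value at or above the threshold means
# hyperbilirubinaemia cannot be ruled out by that method alone.
RULE_OUT_THRESHOLDS = {
    "kramer_zone": 3,         # zones 1 and 2 rule out
    "minolta_index": 22,      # readings < 22 units rule out
    "bilicheck_umol_l": 190,  # readings < 190 µmol/l rule out
}


def needs_serum_bilirubin(method: str, value: float) -> bool:
    """Return True if the reading reaches the threshold for its method."""
    return value >= RULE_OUT_THRESHOLDS[method]


# Illustrative readings only.
print(needs_serum_bilirubin("kramer_zone", 2))         # below threshold
print(needs_serum_bilirubin("minolta_index", 23))      # at or above threshold
print(needs_serum_bilirubin("bilicheck_umol_l", 175))  # below threshold
```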