Introduction

Neonatal jaundice is common, affecting 60–85% of term infants. Although it is usually mild and does not need treatment, it must be monitored to avoid potential adverse consequences [1, 2]. Pathological jaundice is the most common cause of readmission to the neonatal unit [3, 4]. The current tendency to discharge newborns early after birth raises concern about potential underdiagnosis of pathological jaundice after discharge. Preventive interventions for the early detection of neonatal hyperbilirubinemia include the determination of transcutaneous total bilirubin [2,3,4,5]. Jaundice meters have been widely used and validated for monitoring bilirubin levels in the newborn for many years. The accuracy of the first generation of bilirubinometers was strongly limited by skin pigmentation; however, their reliability has improved over the past few years [6]. Their use is painless for the newborn and cheaper than measuring serum bilirubin [7, 8]. However, some controversy remains as to their reliability depending on gestational age, ethnic group or skin color, and exposure to phototherapy [9,10,11,12,13,14,15,16].

We treat a multiethnic population at our hospital; therefore, newborns have varying skin tones. In the literature, neonates have been classified by ethnicity, by related skin tones, or by cosmetic color charts. However, a neonatal skin color scale based on real skin tone, regardless of ethnicity or country of origin, does not exist to date [11,12,13,14,15,16,17]. In some cases, different skin colors have been demonstrated to interfere with transcutaneous bilirubin measurements; therefore, we believed that a skin color classification of neonates would help physicians to be confident in the reliability of the transcutaneous bilirubin measurements [13, 14]. We created our own scale of four skin tones based on the real skin colors of newborns in the first 24 h of life, regardless of their parents’ origin, and conducted this study to validate this scale, which we decided to call the "Neomar neonatal skin color scale".

Our objectives were to assess agreement in skin color assignment determined by our specific neonatal skin color scale between different observers, and to compare the skin color assignment with the values of melanin obtained by the colorimeter Mexameter® MX 18.

Patients and methods

This was a prospective, observational study that aimed to validate a specific neonatal skin color scale which we developed (Fig. 1). We compared inter-observer color assignment according to our neonatal skin color scale with the objective measure of skin melanin with a spectrophotometer.

Fig. 1 Neonatal skin color scale

All our newborns were eligible to participate if their parents agreed and signed written informed consent (Appendix). We expected that the number of patients in each skin tone group would be balanced except for the “Dark” group, which would perhaps account for less than a quarter of the newborns given the ethnic characteristics of our population.

We created a skin color scale of 4 groups based upon the newborn’s chest skin and nipples/areolae compared with photographs of neonatal chests (two patients per color) taken at 24 h of life. These eight photographs were taken in our neonatal unit in the morning, with ambient light, in the same place, with the same camera in automatic mode, without flash and at the same distance (15 cm) and angle (90°). We chose two patients of each color to create our color scale, but we did not take photos of the enrolled patients. We used a Canon PowerShot S95 digital camera (Canon Inc., Tokyo, Japan) ISO 3200 with a resolution of 10.0 megapixels.

After a brief training on the use of the new color scale, all doctors, nurses, and nurse assistants who work in our neonatal unit participated in this study by collecting cases and registering data. Patient recruitment started in October 2016 and finished when we enrolled the targeted number of patients (258 cases). Two observers, who were blinded to each other’s decision and to melanin index, assigned every newborn to a color group at 24 h of life according to our skin color scale. Color assignment was done in our neonatal unit, which is artificially illuminated; therefore, the environmental light conditions were the same regardless of the time of day. Our neonatal unit ambient light intensity is 100–120 lux.

At the moment of color assignment, skin melanin and erythema indices were measured in the sternal region with a non-invasive technique using the skin colorimeter Mexameter® MX 18 (Courage + Khazaka electronic GmbH, Köln, Germany). This reflectance meter is based on absorption/reflection of the light from the skin. Its probe emits three specific light wavelengths: 568, 660, and 880 nm, which correspond to green, red, and infrared light, respectively. A receiver measures the light reflected by the skin. As the quantity of emitted light is defined, the quantity of light absorbed by the skin can be calculated. The Mexameter® MX 18 measures absorbed and reflected light at the green and red wavelengths for hemoglobin and at the red and near-infrared wavelengths for melanin. A melanin index is computed from the intensity of the absorbed and the reflected light at 660 and 880 nm. An erythema index is computed from the intensity of the absorbed and the reflected light at 568 and 660 nm. The probe only needs to be applied to the skin for 1 s. Measurements are not affected by other pigments such as bilirubin. The results are expressed as melanin and erythema indices in arbitrary units on a scale from 0 to 999. Measurement uncertainty is ± 5% [18, 19]. All measurements with the Mexameter® MX 18 were performed in the same room (our neonatal unit) with no daylight and under controlled ambient conditions (26 ± 2 °C).

Transcutaneous bilirubin was measured in the sternal region in all the participants at the moment of color assignment (24 h after birth) by means of a jaundice meter (Dräger Jaundice Meter JM-105, Minolta, Dräger Medical GmbH, Lübeck, Germany).

All newborns were in a quiet state, not receiving phototherapy, with intact skin and good peripheral perfusion at the moment of color assignment and of melanin, erythema, and transcutaneous bilirubin measurements.

Statistical analyses

We calculated that a minimum sample size of n = 249 was needed to estimate a Kappa index ≥ 0.7 with a precision of 0.1 (for the 95% confidence interval), assuming a p1 of 0.2 and a p2 of 0.3 (the proportions of newborns that observers 1 and 2, respectively, would assign to a given color category). We used Stata (STATA 12.0, College Station, Texas, USA) to calculate sample size and to perform the statistical analyses.

To analyze inter-observer agreement in the classification of neonatal skin color with our scale, we measured the Kappa index agreement and its 95% confidence interval. We also calculated inter-rater agreement with the intraclass correlation coefficient (ICC).
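As an illustration of the agreement statistic described above, the following is a minimal sketch of an unweighted Cohen's Kappa for two observers. It is not the Stata procedure used in the study; the function name `cohen_kappa` and the example ratings are hypothetical, and whether the study used a weighted or unweighted Kappa is not stated, so the unweighted form is an assumption.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's Kappa for two raters (illustrative sketch)."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both raters must rate the same non-empty set of subjects")
    n = len(ratings_a)

    # Observed agreement: proportion of subjects given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical color assignments (1-4) by two observers for eight newborns.
observer_1 = [1, 1, 2, 2, 3, 3, 4, 4]
observer_2 = [1, 1, 2, 3, 3, 3, 4, 4]
kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.3f}")
```

Because the color scale is ordinal, a weighted Kappa that penalizes distant disagreements more heavily would be a reasonable alternative; the sketch above treats every disagreement equally.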

The Kendall tau-b correlation coefficient was used to test the correlation between our color scale and the Mexameter® MX 18 and between our color scale and the erythema index and transcutaneous bilirubin. We also compared the melanin index, erythema index, and transcutaneous bilirubin in the different skin color groups with ANOVA and Sidak post hoc tests. We also modeled the variable “skin color” as a function of the melanin and erythema indices and the transcutaneous bilirubin. Taking into account that our outcome is an ordered outcome in terms of color scale, we performed an ordered logistic regression taking color = 1 as the base.
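The Kendall tau-b coefficient mentioned above can be sketched as follows. This is an illustrative pure-Python version, not the Stata routine used in the study; the function name `kendall_tau_b` and the example values are hypothetical. Tau-b is suited to an ordinal scale with few categories because it corrects for the many ties that such a scale produces.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau-b for two equal-length sequences (illustrative sketch)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: contributes to neither term
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical data: color group (1-4) and melanin index for six newborns.
color_group = [1, 1, 2, 2, 3, 4]
melanin_index = [100, 110, 105, 140, 170, 280]
print(f"tau-b = {kendall_tau_b(color_group, melanin_index):.3f}")
```

The pairwise loop is quadratic in the number of subjects, which is acceptable for a sample of a few hundred newborns.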

Our hospital Ethics Committee accepted and approved this study. We obtained a written informed consent from the participants’ parents prior to enrolling them in the study. After creating our own neonatal skin color scale (see details in Fig. 1), we classified newborns according to this scale.

Results

We enrolled 258 healthy newborns in the present study: 112 from color 1, 98 from color 2, 40 from color 3, and 8 from color 4. Most were full-term infants who were not under phototherapy at the time of color assignment and melanin/erythema index measurement. Table 1 describes characteristics of our population and color groups. There were no differences among the four color groups in terms of gender, gestational age, birth weight, birth length, type of delivery, prematurity rate, or feeding choice. There were significant differences in the melanin index between the four groups (Kruskal-Wallis, p < 0.001). There were no differences in the erythema index among the groups (Kruskal-Wallis, p = 0.431). There were no differences in the levels of transcutaneous bilirubin among color groups 1, 2, and 3. There were differences between colors 1, 2, and 3 compared with color 4 (Kruskal-Wallis, p = 0.029).

Table 1 Characteristics of our study population. SD, standard deviation

During the study, failures in the Mexameter® MX 18 did not allow us to obtain data for long periods, so the recruitment took longer than planned, a total of 2 years. The Kappa value was 0.73 (95% CI 0.66–0.81, p < 0.001) for inter-observer agreement for color assignment according to our skin color scale. Thus, the agreement was substantial, as the two observers agreed on 83% of the neonates. The ICC was 0.878 (95% CI 0.847–0.904, p < 0.001).
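The relation between the reported Kappa (0.73) and the observed agreement (83%) can be made explicit with a short worked equation. This is a sketch that assumes an unweighted Cohen's Kappa and uses the rounded values given in the text; the symbols $p_o$ (observed agreement) and $p_e$ (agreement expected by chance) are introduced here for illustration.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
\quad\Longrightarrow\quad
p_e = \frac{p_o - \kappa}{1 - \kappa}
    = \frac{0.83 - 0.73}{1 - 0.73}
    \approx 0.37
```

Under these assumptions, roughly 37% agreement would be expected by chance alone, which is why an 83% raw agreement corresponds to a Kappa of about 0.73 rather than a value closer to 1.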

Mean melanin levels were significantly different (p < 0.001) among the 4 color groups (Fig. 2). Mean melanin index was 107 for color 1, 139 for color 2, 171 for color 3, and 278 for color 4. The Kendall tau-b correlation between our color scale and the Mexameter® MX 18 was 0.447 (p < 0.001). This indicates a moderate association between the color scale and the melanin index measured with the Mexameter® MX 18.

Fig. 2 Melanin and erythema indexes and transcutaneous bilirubin depending on color group

The Kendall tau-b correlation between our color scale and erythema was 0.053 (not significant) and between our color scale and transcutaneous bilirubin, 0.099 (p = 0.043). Thus, neither parameter showed a relevant correlation with the color scale. Erythema index and transcutaneous bilirubin did not show significant differences between the different skin color groups (Fig. 2 and Tables 2 and 3), whereas melanin index was significantly different (Fig. 2 and Tables 2 and 3). Moreover, in the ordered logistic regression, the color group classification was only significantly associated with melanin. Taking color = 1 as the base, the odds ratio (OR) for melanin was 1.02 (95% CI 1.01–1.03, p < 0.001).
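An odds ratio of 1.02 refers to a change of one unit in the melanin index, which is small on a 0 to 999 scale. The sketch below shows how a per-unit odds ratio from a logistic model scales to a larger difference, since the log-odds are linear in the predictor. The function name `scaled_odds_ratio` and the choice of a 10-unit difference are illustrative assumptions, and because the reported OR is rounded to two decimals the scaled values are only approximate.

```python
def scaled_odds_ratio(or_per_unit, delta):
    """Odds ratio for a `delta`-unit change, given the per-unit odds ratio.

    In a logistic model, OR = exp(beta), so a change of `delta` units
    multiplies the odds by exp(beta * delta) = OR ** delta.
    """
    if or_per_unit <= 0:
        raise ValueError("An odds ratio must be positive")
    return or_per_unit ** delta


# Reported per-unit OR for melanin and its 95% CI bounds (rounded values).
point, lower, upper = 1.02, 1.01, 1.03
delta = 10  # hypothetical difference of 10 melanin index units

print(f"OR per {delta} units: {scaled_odds_ratio(point, delta):.2f} "
      f"({scaled_odds_ratio(lower, delta):.2f} to {scaled_odds_ratio(upper, delta):.2f})")
```

Read this way, a difference of 10 melanin units corresponds to roughly 22% higher odds of assignment to a darker color category, under the proportional-odds assumption of the ordered logistic model.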

Table 2 Melanin and erythema indices and transcutaneous bilirubin at the moment of color assignment (24 h after birth) depending on color group. n, number of patients; SD, standard deviation
Table 3 Sidak post hoc test p values after comparing melanin index, erythema index, and transcutaneous bilirubin at the moment of color assignment depending on color group

Discussion

Accurate and early diagnosis of pathological jaundice in a hospital with early discharge policies and a multiethnic population with varying skin tones, such as ours, is of great importance due to the risks of underdiagnosis [1,2,3,4,5]. Therefore, the reliability of the transcutaneous determination of bilirubin, although improved in recent years [6], must be analyzed in these multiethnic populations because there is still some controversy about the impact of skin color [11,12,13,14,15,16]. In order to do so, we should be able to classify newborns depending on skin color.

Classifying neonates into ethnic groups is complicated, imprecise, and unreliable, due to the variability of skin color within the same ethnic group and an overlapping of skin types. Few studies have measured the skin color of neonates, and few have evaluated the impact of skin color on the accuracy of bilirubinometers in multiethnic populations [11,12,13,14,15,16,17, 20]. Currently available color scales, like the Fitzpatrick skin type chart, are based on adult skin, which has been exposed to the sun. A reliable classification of skin phototype or skin color or tone does not exist for the newborn. The only two studies which classified newborns by skin color used color references from registered facial cosmetic brands to define three subgroups (light, medium, and dark) [15, 16]. One study that evaluated the effect of skin pigmentation on the accuracy of pulse oximetry in infants with hypoxemia classified skin color using the Munsell System Soil Color Chart (2009 Revision, Munsell Color, Grand Rapids, Michigan), Hue 7.5YR [17]. However, all these colors differ greatly from the real neonatal skin tones; therefore, we decided to create and validate our own scale of four skin tones based on the real skin colors of newborns in the first 24 h of life. We believed that by classifying newborns according to their skin color, we would increase the reliability of transcutaneous bilirubin determination in each color group, and therefore avoid unnecessary blood tests as well as underdiagnosed pathological jaundice.

In order to validate our scale, we had to assess inter-observer reproducibility and agreement and also consider that actual skin color is affected by many substances. The main determinant of skin color is the pigment melanin, followed by hemoglobin. This is why we focused our study on their measurement, by means of the melanin index and the erythema index. The device we used, the Mexameter® MX 18, determines both pigments. The Mexameter® MX 18 is one of the two most used devices in dermatological research [19] and has been used in prior studies to assess skin color. Matias et al. compared skin color analyses by the Mexameter® MX 18, Antera® 3D, and Skin-Colorimeter® CL 400 in a sample of 30 adults who were exposed to a controlled ultraviolet B light [19]. Md Isa et al. measured skin color in first trimester pregnant women in Malaysia and compared the reliability of the Fitzpatrick skin type chart to the Mexameter® MX 18 [21]. Park used the same device to measure skin color of neonatal infants [20].

A limitation of our study is that we did not focus on other minor skin chromophores such as reduced hemoglobin, collagen, carotenes, and vitamins (e.g., riboflavin) [8, 22, 23]. However, melanin and hemoglobin are the main factors that determine skin color in healthy newborn infants who have not been exposed to food, chemicals, medication, or a stimulant such as sunlight [20]. We considered that reduced hemoglobin would be irrelevant in our study because all the participants were healthy babies with normal oxygenation. We found differences in transcutaneous bilirubin between color type 4 and the others, but we believe that this is due to the small sample size of color 4 neonates, and we would rather not draw conclusions from this. We did not measure serum bilirubin at 24 h of life when assessing skin color. However, we did measure serum bilirubin at 48–72 h, with preliminary results indicating a good correlation with transcutaneous bilirubin. These results are anticipated to be published soon.

In the assessment of reproducibility and inter-rater reliability, we measured inter-observer agreement both with the intraclass correlation coefficient (ICC) and with the Kappa statistic. We obtained a good ICC of 0.878 (95% CI 0.847–0.904, p < 0.001) [24]. The Kappa index of 0.73 is considered to be “substantial” according to Cohen, and is accepted as good agreement for healthcare research because it is over 0.60. However, a higher Kappa, for example above 0.81, would be preferable since it would be considered “almost perfect” [25].
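The interpretation of Kappa described above can be sketched as a simple lookup. The text gives only the "substantial" and "almost perfect" labels and the 0.60 and 0.81 thresholds; the lower bands in this sketch follow the commonly used six-level convention and are an assumption, as are the function name `kappa_agreement_label` and the exact handling of boundary values.

```python
def kappa_agreement_label(kappa):
    """Map a Kappa value to a descriptive agreement label (illustrative sketch).

    Bands below "substantial" are assumed from the commonly used convention;
    only "substantial" and "almost perfect" are named in the text.
    """
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("Kappa must lie between -1 and 1")
    if kappa <= 0.0:
        return "no agreement"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Reported point estimate and the bounds of its 95% confidence interval.
for value in (0.66, 0.73, 0.81):
    print(value, kappa_agreement_label(value))
```

Applying the labels to the confidence bounds as well as to the point estimate shows how much of the interval falls within a given band.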

We believe that our study confirms both of our hypotheses: our scale allows the classification of neonates according to skin color with a good inter-observer reproducibility and it correlates well with the melanin index (light skin has a lower melanin index than dark skin). More studies will be required to further validate the scale in other hospitals and to assess its clinical usefulness. We are currently conducting an observational study to assess correlation of transcutaneous and serum bilirubin depending upon skin color to reaffirm the clinical usefulness of our neonatal skin color scale.

Conclusions

Our proposed skin color scale correlates well with the level of melanin in neonatal skin at 24 h of life for colors 1, 2, and 3. A larger sample of color 4 neonates would allow us to obtain more reliable results for this group. Our results show that the only chromophore that differs among our four color groups is melanin, as there were no differences among the groups in the erythema index or in the levels of transcutaneous bilirubin. This new neonatal skin color scale is reproducible and can be readily used to classify neonatal skin color. We are currently completing additional research that aims to establish the clinical relevance of our neonatal skin color scale in order to increase the accuracy of transcutaneous bilirubin determination.