Introduction

Near-infrared spectroscopy (NIRS) is a non-invasive monitoring technique that has been clinically used in the neonatal intensive care unit (NICU) for over a decade to continuously measure regional tissue oxygenation at the bedside. It may be a useful tool in critically ill neonates as a long-term trend monitor, assessing the balance of tissue oxygen delivery and consumption, providing cerebral and somatic oximetry values, and allowing earlier detection of hemodynamic and brain perfusion abnormalities.1,2

Near-infrared light is emitted from a light source, passes through the infant’s skin and subcutaneous tissue, and is partially absorbed by oxygenated and deoxygenated hemoglobin before being reflected to detectors on the same sensor. The tissue saturation level (rSO2) is calculated based on the ratio of arterial and venous blood (25:75 ratio) and the balance of oxygen delivery and consumption in the underlying tissue.3

Cerebral oxygen saturation (rScO2) is measured with a sensor placed on either side of the forehead. Values have been validated in neonates using jugular venous saturation.4,5,6 Interpretation of rScO2 measurements must take into account additional variables that may alter cerebral blood flow and oxygenation including systemic oxygenation (SpO2), cardiac output, anemia, carbon dioxide (CO2) tension, and metabolic demand.

Different types of sensors are available for clinical use, ranging from neonatal to adult sizes, with the majority of early neonatal NIRS literature describing the use of adult sensors.7,8 Recommended target ranges for rScO2 (55-85%) were established from population normative data using the small adult sensors and formed the basis for published and ongoing clinical interventional NIRS trials.7,8,9 Neonatal sensors were designed to better fit a newborn’s head and are now more commonly used in the NICU. Due to the shorter distances between the emitting light source and the detectors, there are known differences in measurements between the sensors. Generally, in vitro experimental models suggest the neonatal sensor has a linear correlation and reads approximately 10% higher than the adult sensor.10,11,12 This implies clinically significant differences, particularly in terms of cerebral hypoxia and hyperoxia cut-off values for regional tissue oxygenation, which may require intervention.8 However, there is a lack of clinical data to support this assumption, especially considering that studies directly comparing neonatal and adult sensors enrolled only a small number of neonates (n = 16).13

Therefore, it is essential that sensor differences be studied in a rigorous fashion in a prospective study of a neonatal population undergoing cerebral NIRS monitoring. We performed a two-center prospective observational study with the primary objective of determining the absolute difference in measurements of cerebral oxygenation in infants when using neonatal sensors versus adult sensors. This study was conducted in two level III/IV NICUs in Brazil and the US, where NIRS is routinely utilized in specific populations of neonates at risk for brain injury. We also investigated differences in sensor measurements under varying conditions of systemic hypoxia, bradycardia, and anemia as would be encountered in the clinical setting.

Methods

Study population

Infants were eligible for this study if they were admitted to the NICU and undergoing routine rScO2 monitoring at Lucile Packard Children’s Hospital Stanford, Palo Alto, CA, USA or at Irmandade da Santa Casa de Misericórdia de São Paulo, São Paulo, Brazil between November 2019 and May 2021. Per clinical guidelines at both centers, NIRS monitoring is performed in preterm infants <32 weeks’ gestation during the first 7 days of life or when a hemodynamically significant patent ductus arteriosus (PDA) is suspected. Monitoring is also done in term or preterm infants with congenital heart disease, hypoxic–ischemic encephalopathy (HIE), respiratory failure, hemodynamic instability, or metabolic disorders.14 Infants were excluded if skin integrity was insufficient to allow placement of both neonatal and adult sensors. Approval of the institutional review board at each site was obtained, and written informed consent was required for participation.

Intervention

Standard placement of a neonatal cerebral sensor (INVOS™ OxyAlert™ Infant/Neonatal NIRSensor, IS, Medtronic) on the left or right forehead was matched with a small adult sensor (INVOS™ Small Adult SomaSensors, SAFB-SM, Medtronic) on the opposing side of the forehead for simultaneous measurement of rScO2. Two separate NIRS devices (INVOS™ 5100C, Medtronic, Minneapolis, MN) collected data in real time: one calibrated for the neonatal sensor and the other calibrated for the small adult sensor. Mepitel® (Mölnlycke, Gothenburg, Sweden) skin dressing was used under the sensors for skin protection as per unit policy. Measurements were obtained continuously for a 3-h period, after which the locations of the neonatal and adult sensors were exchanged (left to right side and right to left side) for a subsequent 3 h of monitoring. At the end of the monitoring period, the adult sensor was removed, and the neonatal sensor was left in place at the clinical team’s discretion. All data were downloaded for offline data processing. Demographic and perinatal variables including birth weight, gestational age, sex, antenatal steroid exposure, maternal race, Apgar scores at 1 and 5 min, small for gestational age status, and mode of delivery were recorded. During the study, healthcare providers at both sites used clinical guidelines published in the SafeBoosC phase II clinical trial to treat cerebral hypoxia/hyperoxia.15

Data processing

The raw time series data for each patient included rScO2 measurements every 3 to 7 s from the two NIRS sensors, as well as heart rate (HR) and SpO2 measures. Datapoints were flagged and discarded for non-physiologic measures (NIRS value <15 or >95; HR ≥ 250 or = 0; SpO2 = 0) or by an anomaly detection algorithm, which involved fitting a Loess regression to smooth the data on a time scale of 2 min and removing data points ≥3 standard deviations (SDs) from the smoothed fit (Supplementary Fig. 1). Data were synchronized by matching to the closest timepoint within 3 s and rolled up to the 1-min time scale by averaging.
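
The cleaning and synchronization steps described above can be sketched roughly as follows. This is an illustrative sketch only, not the study's code (the text states that data cleaning was done in R): the function names, the tuple layout, and the sample values are hypothetical, and the Loess-based anomaly step is omitted because its implementation details are not given here.

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import mean


def nirs_ok(value):
    """Keep NIRS values inside the stated range (discard <15 or >95)."""
    return 15 <= value <= 95


def vitals_ok(hr, spo2):
    """Discard HR >= 250 or HR == 0, and SpO2 == 0, as stated in the text."""
    return hr != 0 and hr < 250 and spo2 != 0


def pair_nearest(series_a, series_b, tolerance_s=3.0):
    """Pair each (time, value) in series_a with the closest-in-time sample of
    series_b, dropping samples with no partner within tolerance_s seconds.
    Both series must be sorted by time."""
    times_b = [t for t, _ in series_b]
    pairs = []
    for t, value_a in series_a:
        i = bisect_left(times_b, t)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times_b)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(times_b[k] - t))
        if abs(times_b[j] - t) <= tolerance_s:
            pairs.append((t, value_a, series_b[j][1]))
    return pairs


def minute_means(pairs):
    """Average paired readings within each 1-min bin (bin index = time // 60)."""
    bins = defaultdict(list)
    for t, a, b in pairs:
        bins[int(t // 60)].append((a, b))
    return {m: (mean([a for a, _ in v]), mean([b for _, b in v]))
            for m, v in sorted(bins.items())}


# Synthetic (time in seconds, rScO2 in %) samples, for illustration only.
neonatal = [(0.0, 72.0), (5.0, 74.0), (10.0, 99.0), (65.0, 70.0)]
adult = [(1.0, 63.0), (7.0, 65.0), (11.0, 64.0), (70.0, 61.0)]

neonatal_clean = [(t, v) for t, v in neonatal if nirs_ok(v)]
adult_clean = [(t, v) for t, v in adult if nirs_ok(v)]
paired = pair_nearest(neonatal_clean, adult_clean)
per_minute = minute_means(paired)
```

In this toy input, the neonatal reading of 99 is discarded as non-physiologic, and the neonatal sample at 65 s is dropped because its nearest adult sample is 5 s away, outside the 3-s window.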

Statistics

Descriptive statistics of the patient cohort were performed with means with SDs or medians with interquartile ranges (IQRs) for continuous variables and counts and percentages for categorical variables. For statistical modeling of time series data, a linear regression model with a generalized estimating equation (GEE) approach was used to account for the temporal autocorrelation within patients. The primary focus was to examine calibration of the sensors against each other: specifically, given a neonatal sensor measurement, the expected value from the adult sensor would be determined. To address this question, a linear GEE model was trained and also tested for possible non-linearities by adding logarithmic and quadratic terms. To study the impact of laterality of sensor position, as well as patient demographics, interaction terms were added to this linear model.
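
To make the GEE idea concrete, the sketch below fits a straight line to clustered data and computes cluster-robust (sandwich) standard errors. It rests on assumptions that the text does not state: a Gaussian family with identity link and an independence working correlation, under which the GEE point estimates coincide with ordinary least squares and the within-patient correlation enters only through the variance. The function name and the data are hypothetical, no small-sample correction is applied, and which sensor serves as predictor is left generic (x, y).

```python
from collections import defaultdict
from math import sqrt


def gee_independence(x, y, groups):
    """Fit y = a + b*x and return (a, b, se_a, se_b) with standard errors
    that are robust to correlation among observations sharing a group."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n

    # "Bread": inverse of X'X for the design matrix with columns [1, x].
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]

    # Per-patient score vectors: sums of residual * [1, x] within each group.
    scores = defaultdict(lambda: [0.0, 0.0])
    for xi, yi, g in zip(x, y, groups):
        resid = yi - (a + b * xi)
        scores[g][0] += resid
        scores[g][1] += resid * xi

    # Sandwich variance: sum over patients of (bread @ score) squared.
    var_a = var_b = 0.0
    for s0, s1 in scores.values():
        var_a += (bread[0][0] * s0 + bread[0][1] * s1) ** 2
        var_b += (bread[1][0] * s0 + bread[1][1] * s1) ** 2
    return a, b, sqrt(var_a), sqrt(var_b)


# Synthetic readings (%) from three hypothetical patients, illustration only.
patient = ["p1", "p1", "p1", "p2", "p2", "p2", "p3", "p3", "p3"]
x = [60.0, 62.0, 64.0, 70.0, 72.0, 74.0, 80.0, 82.0, 84.0]
y = [55.0, 58.0, 57.0, 62.0, 64.0, 66.0, 70.0, 71.0, 74.0]
intercept, slope, se_intercept, se_slope = gee_independence(x, y, patient)
```

Treating all time points as independent would typically understate the uncertainty when readings within a patient are correlated; grouping the scores by patient is what addresses this.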

Measurement error in the adult sensor may lead to regression dilution bias and underestimation of the slope of the relationship between the two sensors, which was corrected using the method of Rosner et al.16 We estimated the correction factor λ by comparing within-patient variation to between-patient variation in the adult sensor measures. For the linear model, this comparison suggested a correction factor of λ = 1.22 ± 0.06, corresponding to a measurement error of about 5 percentage points in either direction associated with the adult sensor. A simulation study suggested that this method is effective for this parameter regime.17 Confidence intervals combining the error from the regression estimate with the error from the estimate of λ were constructed using published techniques.18 We conducted two sensitivity analyses to confirm the robustness of this correction procedure, shown in Supplementary Figs. 2 and 3. First, we used a Deming regression,19 which is an errors-in-variables model, and we assumed the two sensors had commensurate levels of measurement error (δ = 1). To handle autocorrelation, we randomly sampled datapoints between 30 and 90 min apart, and to determine confidence intervals we bootstrapped this entire procedure. Second, we conducted a standard Bland-Altman analysis to test for systematic miscalibration and level-dependence between the two sensors. Our results were robust to both of these sensitivity analyses. Finally, Supplementary Fig. 4 shows the calibration of the two sensors compared by left/right placement of the sensors.
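
The three techniques named above can be illustrated with a short sketch. It uses the standard closed-form Deming slope and the usual Bland-Altman bias with 1.96-SD limits of agreement. The dilution correction is shown under the assumption that λ multiplies the naive slope and that the corrected line still passes through the sample means; the exact form used in the study follows Rosner et al. and is not reproduced here. The function names and data are hypothetical, and the subsampling and bootstrap steps are omitted.

```python
from math import sqrt
from statistics import mean, stdev


def deming(x, y, delta=1.0):
    """Deming regression of y on x. delta is the ratio of the error variance
    in y to the error variance in x (delta = 1: equal error in both)."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - ybar) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
    diff = syy - delta * sxx
    slope = (diff + sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return slope, ybar - slope * xbar


def correct_dilution(naive_slope, xbar, ybar, lam):
    """Assumed form of the correction: scale the naive slope by lam and
    re-anchor the intercept so the line passes through (xbar, ybar)."""
    slope = lam * naive_slope
    return slope, ybar - slope * xbar


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for the differences a - b."""
    d = [ai - bi for ai, bi in zip(a, b)]
    bias, sd = mean(d), stdev(d)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Synthetic paired readings (%), for illustration only.
adult = [55.0, 60.0, 65.0, 70.0]
neonatal = [59.0, 66.0, 73.0, 80.0]

deming_slope, deming_intercept = deming(adult, neonatal, delta=1.0)
bias, lower, upper = bland_altman(neonatal, adult)
# lam = 1.22 is the value reported in the text; the naive slope is made up.
corrected_slope, corrected_intercept = correct_dilution(1.0, 60.0, 70.0, lam=1.22)
```

With δ = 1 the Deming fit reduces to orthogonal regression, which treats both sensors as equally noisy rather than assigning all measurement error to one of them.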

The analytic plan for our primary analysis was registered publicly at https://osf.io/5rfws prior to conducting the analysis, although correcting for regression dilution bias was not considered in the original analysis plan. Data cleaning was conducted in R version 4.1.2. All regression models were trained in Python version 3.8.5, using the statsmodels package.

Results

Population

Forty-four infants were enrolled over an 18-month period. Demographic and perinatal characteristics, as well as indications for NIRS monitoring are shown in Table 1. The majority of infants were not receiving inotropic support, sedation, or anti-epileptic medications at the time of monitoring, and only 4 infants (9%) had a concomitant diagnosis of severe intraventricular hemorrhage.

Table 1 Demographics and perinatal characteristics.

Relationship between neonatal and adult sensor

The neonatal sensor on average demonstrated higher values than the corresponding adult sensor, as shown in the case example in Fig. 1. The relationship between the adult and neonatal sensor values was modeled as a linear equation with mean and 95% confidence intervals shown in Fig. 2; models fitted to patient-level data and to data from all time points gave similar results. Note that there was considerably more noise at the time point level: the standard deviation of the model residuals was 7.2 percentage points for the time point-level model compared to 4.5 for the patient-level model. While the two sensors can be discordant at any given time point, they were strongly correlated when averaged over a longer time period. Table 2 provides conversion values between neonatal and adult sensor readings for a range of values. At the traditional thresholds of intervention for cerebral hypoxia (55%) or hyperoxia (85%), there were notable discrepancies between the neonatal and adult sensor readings (Fig. 3). For an adult sensor value of 55%, the neonatal sensor values demonstrated an approximately normal distribution around a slightly higher mean value of 58.8 ± 4.0%. However, for an adult sensor value of 85%, the neonatal sensor values were primarily clustered at a median value of 95% (the highest measurable value displayed by the NIRS device) with an IQR of 92.4-95.0%. At a clinically acceptable, mid-range adult sensor value of 70%, the corresponding neonatal sensor values demonstrated a higher median of 79.7% (IQR 73.7–83.7%).

Fig. 1: Sample NIRS tracings.

An example of the time series data: this patient initially had an adult NIRS sensor on the right side of their forehead and a neonatal sensor on the left; the sensors were switched after 103 min. Note that rScO2 values in the 50–60% range are similar between neonatal and adult sensors, while there is a larger difference when readings are higher.

Fig. 2: Comparison of the neonatal and adult sensors.

a The results of the patient-level model. The rScO2 values for each patient were averaged over the monitoring period; this model compares the average neonatal sensor reading with the average adult sensor reading. The dotted line represents the best fit linear formula, with 95% confidence interval shown in gray. b The results of the time point-level model. The linear GEE model results are corrected for regression dilution bias and are shown by the dotted line and the gray region.

Table 2 Conversion table.

Fig. 3: Neonatal sensor readings at certain fixed values of the adult sensor.

Histograms of the neonatal sensor reading when the adult sensor reads 55% (a), 70% (b), and 85% (c). Best fit from the time point-level GEE model for the mean neonatal sensor reading when the adult sensor is at the specified value.

No significant differences in rScO2 values were found with sensor placement on the left versus right forehead with either the neonatal or the adult sensor, when infants with IVH were excluded from this analysis. Similarly, the calibration between the neonatal and adult sensors did not change based on side of sensor placement (neonatal sensor: left side, mean ± SD = 72.9 ± 4.0%; right side, mean ± SD = 71.6 ± 4.8%; p = 0.25; adult sensor: left side, mean ± SD = 63.0 ± 3.7%; right side, mean ± SD = 64.0 ± 3.1%; p = 0.33).

Sensor differences under varying clinical conditions

Under conditions of systemic hypoxia with SpO2 < 80%, rScO2 was lower than under conditions of normoxia for both the neonatal sensor (66.5 ± 4.3% vs. 70.9 ± 4.1%, p = 0.004) and for the adult sensor (60.4 ± 3.4% vs. 62.7 ± 3.1%, p = 0.03). However, differences between the neonatal and adult sensor readings under hypoxic conditions were smaller compared to differences under normoxic conditions (Fig. 4a): the mean difference in rScO2 between neonatal and adult sensors was 6.1 ± 2.0% during hypoxic conditions and 8.3 ± 1.9% under normoxic conditions with SpO2 ≥ 80% (p = 0.05).

Fig. 4: Comparing the two sensors under various patient conditions.

a Box and whiskers plots of the distribution of values for both the neonatal and adult NIRS sensors, by the patient’s simultaneous SpO2. Median values shown by horizontal black lines. GEE model means are superimposed in white text, and the average difference between the two sensors is displayed in each box. The two sensors give readings that are more concordant for patients with low SpO2 than for patients with higher SpO2 (p = 0.05). b When classified by the patient’s simultaneous heart rate, heart rate does not significantly affect the difference between the two sensors. c When classified by the patient’s most recent hematocrit level, hematocrit levels did not significantly affect the difference between the two sensors.

Minimal differences in rScO2 were seen among periods of bradycardia (HR < 80 bpm), periods of normal heart rate (HR 80-180 bpm), and periods of tachycardia (HR > 180 bpm). While the neonatal sensor on average displayed higher values than the adult sensor, HR range did not significantly impact this relationship (Fig. 4b).

More anemic infants with hematocrit <35% demonstrated lower rScO2 compared to those with hematocrit >45%, independent of sensor type (neonatal versus adult). However, the differences between neonatal and adult sensor readings were not significantly different at various hematocrit ranges (Fig. 4c).

NIRS measures did not differ as a function of demographic variables including gestational age at birth, birth weight, and sex. The impact of other perinatal conditions including indication for NIRS monitoring or the presence of IVH were similarly not significant and were not adjusted for in the calibration model.

Discussion

Our study found a difference in rScO2 between the adult and neonatal NIRS sensors using the INVOS™ 5100C (Medtronic, Minneapolis, MN) device. Neonatal sensors consistently displayed higher values than adult sensors, but the difference varied depending on the absolute value of rScO2. For an adult sensor reading of 85% (cerebral hyperoxia threshold), the neonatal sensor values were predominantly grouped at 95%. For an adult sensor reading of 55% (cerebral hypoxia threshold), the neonatal sensor values showed an approximately normal distribution around a higher mean value of 58.8 ± 4.0%. We found no difference when comparing sensor placement on the left versus right forehead. Systemic hypoxia (SpO2 < 80%) reduced the disparity between neonatal and adult sensor measurements, but heart rate and hematocrit level did not appear to have a significant impact on the relationship between the neonatal and adult sensor values.

Our findings differ from previous studies that described a fixed difference between neonatal and adult sensor readings, with neonatal sensors consistently reading approximately 10% higher. Dix et al. compared readings from 16 neonates with the INVOS™ 5100C adult and neonatal sensors, measuring periods of 1 h each.13 The authors reported a close relationship between the two sensors (r = 0.88, p < 0.001) and an average difference of 10 ± 5%. Sorensen et al. used an in vitro model with a blood-lipid phantom that consisted of a mixture of isotonic saline, erythrocyte suspension, and Intralipid® 200 mg/ml. The INVOS adult and pediatric sensors were linearly correlated, with the pediatric sensor reading systematically higher than the adult sensor (y = 0.96x + 17.91; r² = 0.99).10 In a similar study, Kleiser et al. also utilized a blood-lipid phantom model mimicking the neonatal brain to establish a relationship between oxygenation values acquired using various oximeters and sensors. Specific intervention thresholds corresponding to rSO2 = 55% and 85% (measured by the INVOS adult oximeter) for a typical neonate with total hemoglobin concentration (ctHb) of 45 μM were estimated. Calculated neonatal rScO2 hypoxic and hyperoxic thresholds were 63% and 96%, respectively.12
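
As a worked illustration of why a single linear in vitro relation implies a nearly constant offset, the sketch below evaluates the Sorensen et al. equation at the two intervention thresholds and sets the result beside the typical neonatal readings reported in this study. It assumes that x in the equation is the adult sensor reading and y the pediatric sensor reading; the function name is hypothetical. The in vivo figures are the mean (58.8%) and median (95%) quoted in the Results, so the comparison is indicative only, and the phantom equation concerns the pediatric rather than the neonatal sensor.

```python
def pediatric_from_adult(adult_reading):
    """In vitro relation reported by Sorensen et al.: y = 0.96x + 17.91.
    Assumption: x is the adult sensor, y the pediatric sensor."""
    return 0.96 * adult_reading + 17.91


# Adult threshold (%) -> typical neonatal reading in this study (%).
in_vivo = {55: 58.8, 85: 95.0}

for adult, neonatal in in_vivo.items():
    predicted = pediatric_from_adult(adult)
    print(
        f"adult {adult}%: phantom line {predicted:.2f}% "
        f"(offset {predicted - adult:.2f}), "
        f"in vivo {neonatal:.1f}% (offset {neonatal - adult:.1f})"
    )
```

The phantom line gives 70.71% and 99.51%, that is, offsets of about 15.7 and 14.5 points that barely change with level, whereas the in vivo offsets in this study are about 3.8 points at 55% and 10 points at 85%. The 99.51% prediction also lies above the 95% display ceiling of the device.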

Technical aspects such as differences in processing algorithms or scattering subtraction are a potential explanation for higher neonatal rScO2 values compared with adult sensors. The level dependence of the discrepancy may be explained by a limitation of processing algorithms and the different absorption properties of deoxygenated hemoglobin. The INVOS™ 5100C (Medtronic) sensors use light-emitting diodes to emit near-infrared light of two wavelengths (730 and 810 nm). Two detectors are located next to the light-emitting diodes. By subtracting the shallow (shorter) signal from the deeper (further) signal, surface interference is minimized.20,21 Both adult and neonatal sensors from the INVOS™ 5100C device used in this study have two wavelengths and similar source-detector separations of 3 cm and 4 cm. However, the neonatal sensor was designed to have higher sensitivity, and the processing algorithm was adjusted to boost signal intensities transmitted through the infant’s thinner skull.22

Although the blood-lipid phantom experiment represents a robust in vitro model to analyze several components of rScO2 variability, clinical data are important to define the normal range of rScO2 values in neonates. The cerebral hypoxia threshold of 63% for the neonatal INVOS sensor is being utilized in the treatment guideline for SafeBoosC III, the largest ongoing randomized, pragmatic phase III clinical trial investigating interventions for cerebral hypoxia in preterm infants below 28 weeks gestational age to decrease the composite outcome of severe brain injury or death at 36 weeks postmenstrual age.9 This cerebral hypoxia threshold was established from preterm population-based rScO2 data of 55% using a small adult INVOS sensor10 and subsequently extrapolated to 63% based on the in vitro blood-lipid phantom model.12 Analyses of SafeBoosC III data may provide additional guidance on the effectiveness of this cerebral hypoxia threshold. The use of higher hypoxic thresholds, based on a small number of human comparative studies and in vitro models, may lead to unnecessary and potentially harmful interventions (e.g., increase in oxygen administration, red blood cell transfusion, volume boluses, or inotropes) and ultimately affect patient outcomes. In a separate multicenter study using neonatal sensors, Chock et al. studied the association between rScO2 values and the adverse outcome of death or severe neuroradiographic abnormalities. They found that rScO2 < 50% was associated with this adverse outcome (area under the curve, 0.76). The use of risk-based normative values rather than population norms may be an alternative approach to reduce unnecessary interventions.23

The differences in the hyperoxia threshold values also have several implications. In infants with HIE, a high rScO2 value can be explained by low energy metabolism after severe brain injury. Previous studies using adult sensors described an association between supranormal rScO2 at 24 h of life and death or adverse neurodevelopmental outcomes in this population.24,25,26 Since the difference between adult and neonatal sensor rScO2 measurements can be over 10% when absolute rScO2 is high, the sensitivity and specificity of these findings may significantly change when using neonatal sensors.

Our findings related to sensor placement on the left versus right forehead agree with previous literature.22,27 Lemmers et al. conducted a prospective study simultaneously monitoring rScO2 in 36 very preterm neonates during the first 3 days of life. The authors found a close correlation between left and right rScO2 (r = 0.89, p = 0.01). During stable and normal SpO2, differences between left and right NIRS-monitored rScO2 rarely exceeded 7%. This pattern was affected by an unstable arterial oxygenation pattern with substantial drops of SpO2, during which differences between left and right rScO2 values exceeding 10% were observed.27 Our study was unable to replicate this finding as we did not compare left-to-right differences with the same type of sensor during periods of unstable arterial oxygenation.

The reduced disparity between neonatal and adult sensor measurements during hypoxia may be explained by the closer agreement between these sensors’ measurements when rScO2 values were lower. Previous studies have shown that low blood pressure and bradycardia, independent of hypoxemia, commonly affect regional tissue oxygenation.28 Decreased heart rate may lead to lower cardiac output and regional tissue oxygenation. Similarly, anemia may result in lower rScO2, although it remains unclear if there is a hematocrit threshold below which rScO2 is notably impacted.29 In our study, conditions of bradycardia and hematocrit <35% did not significantly decrease rScO2 to the same extent as systemic hypoxia.

To the best of our knowledge, this clinical study describes the largest cohort of infants being monitored simultaneously with adult and neonatal NIRS sensors, with approximately 4500 time points analyzed per infant. Additionally, our study evaluated the sensor discrepancies in a wide range of clinical scenarios and populations (prematurity, HIE, congenital heart disease, PDA), including periods of hemodynamic and ventilatory instability, which is important for generalization of results to a broader NICU population.

Our study has several limitations. First of all, we evaluated sensor readings from a single NIRS device. Several previous studies report significant differences between absolute values across different types of NIRS monitors.12,13,30,31 The discrepancy may even be higher when oxygenation is low. Andresen et al. monitored rScO2 in 10 preterm infants during apneic episodes with neonatal sensors from the INVOS™ 5100C and Nonin SenSmart™ X-100. The individual regressions displayed large and statistically significant variations in both infants and adults, suggesting that different NIRS devices give very different estimates when the oxygenation is low.32

Additionally, our study did not control for ctHb levels during monitoring periods. NIRS measures the average concentrations of oxyhemoglobin and deoxyhemoglobin, and previous studies showed a pronounced dependence of oximeter sensitivity on ctHb.33,34 Kleiser et al., using a blood-lipid phantom model, revealed a ctHb dependence at the SafeBoosC intervention thresholds.12 At the hypoxic threshold rSO2, INVOS adult and neonatal sensors showed dependence on ctHb, with an uncertainty range of 9.2%. While a low ctHb level may influence oxygen delivery and extraction, it is unlikely that the effect was significant in our study, as the most recent hematocrit, measured within a maximum of 82 h (mean 19.3 h) from NIRS data collection, did not affect the difference in readings between sensors.

In conclusion, this study adds relevant information regarding differences in rScO2 using neonatal and adult sensors in a large cohort of neonates monitored with a single device. Marked variability in differences during high and low rScO2 readings was noted, with approximately 10% difference when adult sensors read 85%, but nearly similar (58.8%) readings when adult sensors read 55%. These findings raise a concern that estimating fixed differences of approximately 10% between adult and neonatal sensors may lead to an inaccurate diagnosis of cerebral hypoxia and result in subsequent unnecessary interventions. Further technical investigations into why the adult and neonatal sensors read differently are needed and future studies with neonatal NIRS monitoring should evaluate the association between optimal cerebral hypoxia thresholds and outcomes in larger clinical trials.