Abstract
In neonatal intensive care units (NICUs), 87.5% of the alarms raised by the monitoring system are false alarms, often caused by movements of the neonates. Such false alarms are not only stressful for the neonates, their parents and caregivers, but may also lead to longer response times in real critical situations. The aim of this project was to reduce the rate of false alarms by employing machine learning algorithms (MLAs), which intelligently analyze data from standard physiological monitoring in combination with cerebral oximetry data (in-house built, OxyPrem). Materials & Methods: Four popular MLAs were selected to categorize the alarms as false or real: (i) decision tree (DT), (ii) 5-nearest neighbors (5-NN), (iii) naïve Bayes (NB) and (iv) support vector machine (SVM). We acquired and processed monitoring data (mean duration (SD): 54.6 (± 6.9) min) of 14 preterm infants (gestational age: 26 6/7 (± 2 5/7) weeks). A hybrid method of filter and wrapper feature selection generated the candidate subset for training these four MLAs. Results: A high specificity of >99% was achieved by all four approaches. DT showed the highest sensitivity (87%). The cerebral oximetry data improved the classification accuracy. Discussion & Conclusion: Despite the as yet small amount of training data, the four MLAs achieved an excellent specificity and a promising sensitivity. The current sensitivity is insufficient since, in the NICU, it is crucial that no real alarms are missed. This will most likely be improved by including more subjects and data in the training of the MLAs, which makes pursuing this approach worthwhile.
D. Ostojic and S. Guglielmini have contributed equally to this chapter.
1 Introduction
Preterm infants are not able to regulate their oxygen levels reliably. Hypoxia or hyperoxia may lead to injury, in particular of the brain, which may result in long-term disabilities. Therefore, oxygen saturation, the most commonly monitored proxy for oxygenation, and other vital parameters are continuously measured in neonatal intensive care units (NICUs). If the oxygen saturation is out of the set range, an alarm sounds and the staff react accordingly by adjusting the oxygen supply or stimulating the infant to breathe. However, 87.5% of the alarms have been shown to be false alarms, which are mostly caused by movements of the infants [1]. False alarms increase the stress for patients, parents, and staff. In particular, if an infant generates many false alarms and a real critical situation then suddenly occurs, the response time of the caregivers may be increased. This is due to a phenomenon known as ‘alarm fatigue’ [2,3,4] or as the ‘crying-wolf syndrome’ [5]. Alarm fatigue is a real safety concern and may harm the patients [2,3,4]. One option to solve this problem is to harness technology to suppress false alarms, for example fuzzy logic, a multivalued logic for the classification of mathematical objects with blurred boundaries. To caregivers, the vital signs are signals with fuzzy boundaries. For example, a value of SpO2 slightly below its 87% threshold may mean the patient is still doing fine and the alarm can be dismissed as false. Using a fuzzy-logic-based system [1], a suppression of 99.4% of all false alarms was demonstrated. Unfortunately, neither fuzzy logic nor machine learning algorithms (MLAs) have been implemented in monitoring devices in NICUs, due to insufficient accuracy and legal issues [6, 7]. Nevertheless, MLAs are a promising modern approach that may solve the problem of false alarms in NICUs.
The aim of our project, therefore, was to test how efficiently MLAs are able to reduce the rates of false alarms by intelligently analyzing data from standard physiological monitoring and from cerebral oximetry.
2 Methods
Data of 25 preterm infants were recorded in the NICU of the University Hospital Zurich (Switzerland) under a declaration of ‘no objection’ by the Ethics Committee of Zurich (KEK-ZH, Req-2016-00720). Parental consent was obtained prior to enrolment. Data of 11 participants were excluded because of missing actionable alarms or events labeled by the caregivers as real alarms, or because of poor signal quality. The demographic data of the remaining 14 subjects are shown in Table 1.
The standard monitoring device (SMD), the Infinity Delta XL (Dräger, Germany), provided the arterial oxygen saturation (SpO2) and the heart rate (HR). The data were streamed for a mean of 55 min via the serial port at a time resolution of 0.5 s. Simultaneously, our in-house built near-infrared spectroscopy (NIRS) oximeter (OxyPrem) [8] measured the cerebral oxygen saturation (StO2) from the parietal cortex.
An additional research SpO2 sensor [9], on the opposite fronto-parietal side, measured HR and SpO2. The alarm limits of the SMD were set to: SpO2 < 87%, SpO2 > 95% (only for assisted ventilation or additional O2 supply), HR < 80 bpm and HR > 250 bpm. These limits were frequently crossed, so many alarms were triggered. The staff labeled these alarms as true (C1) or false (C0) according to their professional judgment (an example is shown in Fig. 1 and Table 2).
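As an illustration of the alarm limits listed above, the following minimal Python sketch flags limit violations for a single sample. It is not the SMD's firmware logic: the function name `limit_alarms` and the `extra_oxygen` flag (standing for assisted ventilation or additional O2 supply) are hypothetical, and real monitors typically add delays and averaging that are not modeled here.

```python
def limit_alarms(spo2, hr, extra_oxygen=False):
    """Return the limit violations for one sample (SpO2 in %, HR in bpm).

    `extra_oxygen` is a hypothetical flag that is True when the infant is on
    assisted ventilation or additional O2; only then is the upper SpO2 limit
    applied, as described in the text.
    """
    alarms = []
    if spo2 < 87:
        alarms.append("SpO2 low")
    if extra_oxygen and spo2 > 95:
        alarms.append("SpO2 high")
    if hr < 80:
        alarms.append("HR low")
    if hr > 250:
        alarms.append("HR high")
    return alarms


print(limit_alarms(85, 150))                     # ['SpO2 low']
print(limit_alarms(97, 150))                     # []
print(limit_alarms(97, 150, extra_oxygen=True))  # ['SpO2 high']
```

Every alarm produced this way would still need the caregiver's label (C1 or C0) before it can serve as training data.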
The pre-processing of the 14 datasets was performed in two steps: (i) the research SpO2 sensor caused interference on the StO2 (NIRS) signal, and hence StO2 was filtered to reduce this noise (Fig. 2); (ii) the data had different sampling rates and were resampled to a common rate of 2 Hz.
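The resampling step can be sketched as follows. The source does not state which interpolation method was used, so linear interpolation is an assumption here, and the function name `resample_linear` is hypothetical. The sketch expects at least two samples with strictly increasing time stamps in seconds.

```python
def resample_linear(times, values, rate_hz=2.0):
    """Linearly interpolate (times, values) onto a uniform grid at rate_hz.

    Assumes len(times) >= 2 and strictly increasing times (in seconds).
    Linear interpolation is an assumption, not the documented method.
    """
    step = 1.0 / rate_hz
    n_out = int((times[-1] - times[0]) / step + 1e-9) + 1
    out_t, out_v = [], []
    j = 0
    for i in range(n_out):
        t = times[0] + i * step
        # Advance to the source interval [times[j], times[j + 1]] containing t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        out_t.append(t)
        out_v.append(values[j] + w * (values[j + 1] - values[j]))
    return out_t, out_v


# A signal sampled at 1 Hz is brought to the common 2 Hz grid.
grid, spo2_2hz = resample_linear([0.0, 1.0, 2.0], [90, 92, 88])
print(spo2_2hz)  # [90.0, 91.0, 92.0, 90.0, 88.0]
```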
The HR and SpO2 of the SMD and the StO2 and the oxy- and deoxy-hemoglobin concentrations ([O2Hb] and [HHb]) of the OxyPrem were directly employed as MLA features. Three additional features that represent the area between the StO2, SpO2 and HR signals and their respective alarm limits were calculated. The rationale for this is that the severity of a deviation depends on both its duration and its depth, which also reflects how caregivers judge an alarm. In addition, the last feature corresponds to the phase difference between [O2Hb] and [HHb].
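To make the area feature concrete, here is a minimal Python sketch for a lower alarm limit. The exact computation (integration rule, window, reset behavior) is not given in the source; the rectangle rule, the reset to zero once the signal recovers, and the name `area_below_limit` are all assumptions chosen for illustration.

```python
def area_below_limit(signal, limit, dt=0.5):
    """Running area between a signal and its lower alarm limit.

    While the signal is below `limit`, the deficit (limit - value) is summed
    over time with the rectangle rule (step `dt` in seconds). The area is
    reset to zero once the signal is back at or above the limit; this reset
    rule is an assumption, the source does not specify one.
    """
    areas = []
    area = 0.0
    for value in signal:
        if value < limit:
            area += (limit - value) * dt
        else:
            area = 0.0
        areas.append(area)
    return areas


# SpO2 samples at 2 Hz (dt = 0.5 s) against the 87% limit; result in %·s.
print(area_below_limit([90, 86, 84, 84, 88], limit=87))
# [0.0, 0.5, 2.0, 3.5, 0.0]
```

A short, shallow dip thus yields a small area, whereas a long or deep desaturation accumulates a large one, which is the combination of time and depth described above.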
To reduce the classifier’s complexity and improve the accuracy of the classification, the most relevant features were automatically identified by a feature selection step. First, the Fisher’s discriminant ratio [10] reduced the set from 10 to 6 features. The Sequential Backward Selection [11] then decreased the set further to 4 features: Area SpO2, HR, SpO2 and StO2. The feature selection was carried out on the individual subject level, and in two subjects solely HR and SpO2 were selected. The resulting reduced feature sets were used to train and test the four supervised machine learning algorithms: (i) decision tree (DT), (ii) 5-nearest neighbors (5-NN), (iii) naïve Bayes (NB) and (iv) support vector machine (SVM). Each was implemented in the MATLAB Statistics and Machine Learning Toolbox. The learning phase was repeated 10 times, and the classifier was tested each time on different data. To avoid overfitting [12], a ten-fold cross-validation procedure was employed, i.e. in a rotational manner, nine subsets were used as a training set and the remaining one as a test set.
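The filter step and the cross-validation split can be sketched in a few lines. This is an illustration in Python, not the MATLAB implementation used in the study. It assumes the common two-class form of Fisher's discriminant ratio, (mean1 - mean0)^2 / (var1 + var0), and a plain, unshuffled and unstratified ten-fold partition; the function names and the toy feature values are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Fisher's discriminant ratio of one feature for classes 0 and 1.

    Raises ZeroDivisionError if both classes have zero variance.
    """
    c0 = [v for v, y in zip(values, labels) if y == 0]
    c1 = [v for v, y in zip(values, labels) if y == 1]
    return (mean(c1) - mean(c0)) ** 2 / (pvariance(c1) + pvariance(c0))


def rank_features(features, labels):
    """Sort feature names by decreasing Fisher ratio (most relevant first)."""
    return sorted(features,
                  key=lambda name: fisher_ratio(features[name], labels),
                  reverse=True)


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    for fold in range(k):
        test = indices[fold::k]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


# Toy data (made up): label 1 = true alarm, 0 = false alarm.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "hr": [150, 160, 140, 155, 145, 165],
    "spo2": [92, 93, 94, 84, 85, 86],
}
print(rank_features(features, labels))  # ['spo2', 'hr']

for train, test in k_fold_indices(20, k=10):
    pass  # fit the classifier on `train`, evaluate it on `test`
```

With as few true alarms as in this dataset, a stratified split that keeps some true alarms in every fold would probably be preferable; the plain partition above is only the simplest variant.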
3 Results
During a mean measurement time of 55 min in 14 subjects, 27 true and 578 false alarms occurred. The real alarms occurred at a median rate of 1.5 per measurement. The most important features to discriminate the type of alarm in the classification phase were the area below the SpO2 threshold, HR, SpO2 and StO2, in this order. To evaluate the performance of the classification, the accuracy (percentage of correctly classified instances over the total number of instances), sensitivity (percentage of real alarms that were correctly classified as real alarms), and specificity (percentage of false alarms that were correctly classified as false alarms) are shown in Fig. 3 and Table 3. These values are based on the average of the 10 iterations. It is noteworthy that the cerebral StO2 improved the classification accuracy.
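The three performance measures can be written out from the confusion-matrix counts using their standard definitions. The following Python sketch uses made-up toy labels, not the study's data; the function name `alarm_metrics` is hypothetical, and label 1 stands for a true alarm (C1) and 0 for a false alarm (C0).

```python
def alarm_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary alarm labels.

    Label 1 = true alarm, label 0 = false alarm. Assumes both classes
    occur in y_true, otherwise a division by zero is raised.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # real alarm kept
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # real alarm missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # false alarm suppressed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm kept
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy example (not study data): 4 true alarms, 6 false alarms.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(alarm_metrics(y_true, y_pred))
```

Because false alarms vastly outnumber true ones, accuracy and specificity can be high even when several real alarms are missed, which is why sensitivity is the decisive figure for clinical use.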
4 Discussion, Conclusions and Outlook
The achieved specificity of >99% and sensitivity of 87.52% are comparable to the results reported in [13,14,15] for (N)ICUs, which were obtained either with MLAs or with other approaches to categorizing the alarms, e.g. multimodal descriptive statistics or Fourier and Hilbert transforms. It has to be kept in mind that, generally, MLAs require a large amount of data to be trained effectively. In our case, the amount of data was certainly not optimal: the dataset was small and, because of the imbalance between true and false alarms, contained very few real alarms. Taking this into consideration, impressive accuracy, specificity and sensitivity were nevertheless achieved. Of course, for a clinical application the highest sensitivity of 87.52%, achieved by the DT MLA, would not be sufficient. Since missing a real alarm could be detrimental for patients, 100% detection of the real alarms is needed. To obtain higher sensitivity in the future, we will include larger datasets and use strategies to handle the above-mentioned imbalance. We conclude that all four tested MLAs yield promising results, and we are convinced that 100% sensitivity can be achieved by increased training. This study shows that false alarms can indeed be eliminated with modern MLAs, and that once the legal problems are solved, MLAs can rapidly be implemented in the standard care of the NICU. This will lead to better working conditions for the staff and a higher quality of care, as well as less stress for the neonates and their parents. Ultimately, this will reduce brain injury and long-term disabilities.
References
Wolf M, Keel M, von Siebenthal K et al (1996) Improved monitoring of preterm infants by Fuzzy logic. Technol Health Care 4:193–201
Sendelbach S, Funk M (2013) Alarm fatigue: a patient safety concern. AACN Adv Crit Care 24:378–386
Johnson KR, Hagadorn JI, Sink DW (2017) Alarm safety and alarm fatigue. Clin Perinatol 44:713–728
Li T, Matsushima M, Timpson W et al (2018) Epidemiology of patient monitoring alarms in the neonatal intensive care unit. J Perinatol
Lawless ST (1994) Crying wolf: false alarms in a pediatric intensive care unit. Crit Care Med 22:981–985
Johnson AEW, Ghassemi MM, Nemati S et al (2016) Machine learning and decision support in critical care. Proc IEEE 104:444–466
Cerka P, Grigiene J, Sirbikyte G (2015) Liability for damages caused by artificial intelligence. Comput Law Secur Rev 31:376–389
Kleiser S, Ostojic D, Nasseri N et al (2018) In vivo precision assessment of a near-infrared spectroscopy-based tissue oximeter (OxyPrem v1.3) in neonates considering systemic hemodynamic fluctuations. J Biomed Opt 23:1–10
Proença M, Grossenbacher O, Dasen S et al (2018) Performance assessment of a dedicated reflectance pulse oximeter in a neonatal intensive care unit. In: Patton J (ed) 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Honolulu, HI
Webb AR (2002) Statistical pattern recognition, 2nd edn. Wiley, Chichester
Aha DW, Bankert RL (1996) A comparative evaluation of sequential feature selection algorithms. In: Learning from data. Lecture notes in statistics, vol 112. Springer, New York
Chandrashekar G, Sahin F (2014) A survey on feature selection methods. Comput Electr Eng 40:16–28
Eerikainen LM, Vanschoren J, Rooijakkers MJ et al (2016) Reduction of false arrhythmia alarms using signal selection and machine learning. Physiol Meas 37:1204–1216
Plesinger F, Klimes P, Halamek J et al (2016) Taming of the monitors: reducing false alarms in intensive care units. Physiol Meas 37:1313–1325
Monasterio V, Burgess F, Clifford GD (2012) Robust classification of neonatal apnoea-related desaturations. Physiol Meas 33:1503–1516
Acknowledgments
This work was supported by the Nano-Tera RTD project NewbornCare, the Clinical Research Priority Program (CRPP) Molecular Imaging Network Zürich (MINZ) of the University of Zurich and the Swiss National Science Foundation project 159490.
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Ostojic, D. et al. (2020). Reducing False Alarm Rates in Neonatal Intensive Care: A New Machine Learning Approach. In: Ryu, PD., LaManna, J., Harrison, D., Lee, SS. (eds) Oxygen Transport to Tissue XLI. Advances in Experimental Medicine and Biology, vol 1232. Springer, Cham. https://doi.org/10.1007/978-3-030-34461-0_36
DOI: https://doi.org/10.1007/978-3-030-34461-0_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34459-7
Online ISBN: 978-3-030-34461-0
eBook Packages: Biomedical and Life Sciences (R0)