1 Introduction

As a result of the introduction of modern intensive care and neonatal resuscitation technologies, the number of surviving premature infants with a gestational age of fewer than 30 weeks is increasing annually. According to the International Classification of Diseases, 10th revision (ICD-10) [1], prematurity is usually classified according to two criteria: birth weight and gestational age. By body weight at birth, four degrees of prematurity are distinguished: I degree, with a body weight of 2500–2000 g; II degree, 1999–1500 g; III degree, children born with a very low body weight of 1499–1000 g; and IV degree, children born with an extremely low body weight (ELBW) of 999–500 g. By gestational age at birth, prematurity is divided as follows: I degree, 35–37 weeks of gestation; II degree, 32–34 weeks; III degree, 29–31 weeks; and IV degree, 22–28 weeks.

Artificial lung ventilation (ALV) is widely used in the treatment of preterm infants with a gestational age of fewer than 30 weeks. Thus, 89% of newborns with ELBW require mechanical ventilation on the first day of life, and respiratory support was used during inpatient treatment in 95% of surviving premature infants [2]. In addition, studies comparing invasive and non-invasive respiratory support revealed that non-invasive respiratory support was used at birth in 83% of premature infants with ELBW, but these patients still needed mechanical ventilation during inpatient treatment [3]. Research by B.J. Stoll et al. showed that 74% of premature babies with a gestational age of fewer than 28 weeks at birth required surfactant therapy during inpatient treatment [4]. A study of newborns with a gestational age of 25–28 weeks at birth and relatively stable respiratory activity at birth revealed that 46% of them subsequently needed to switch from non-invasive methods to intubation and mechanical ventilation [4].

In this regard, mechanical ventilation is still actively used in premature infants with acute respiratory failure at birth, even though non-invasive methods have proven advantages over it. Invasive ventilation can contribute to various complications, including an increased risk of death and of neurodevelopmental impairment later in the child's life [2, 5].

Thus, the treatment of patients with ELBW at birth requires extubation as early as possible in order to prevent complications of mechanical ventilation. It was found that each additional week on mechanical ventilation is associated with an increased risk of subsequent neurodevelopmental delay in the child. In addition, the endotracheal tube, as a foreign body, can act as an entry gate for pathogens, increasing the risk of ventilator-associated pneumonia and sepsis [6].

According to some data, the success of the transition from invasive to non-invasive respiratory support ranges from 60–73% [7, 8] to 80–86% [9]. Neonates who have not been successfully extubated have a high risk of episodes of hypoxemia and hypercapnia, bradycardia, cerebrovascular accident, and atelectasis [10]. In this regard, the task of finding predictors of their effective and safe transition from mechanical ventilation to non-invasive respiratory support is very urgent and vital for this category of patients.

The morbidity structure of a patient with ELBW is characterized by greater severity and a combination of disorders due to morpho-functional immaturity, an unfavorable intrauterine course of pregnancy, and concomitant infectious pathology [11, 12]. One of the main tasks of the intensive care physician is the timely recognition of the stage and depth of the pathogenetic process in each specific case. Treatment tactics in such patients, when based only on medical experience and professional intuition, are subject to a certain amount of error. Currently, technologies for modeling biological systems are being implemented in medicine. Such modeling relies on digital technologies that allow doctors, regardless of their level of professional training and the equipment of the medical institution, to choose personalized therapeutic tactics. An important condition for this is the simplicity and availability of mathematical models for institutions at all levels of medical care.

The aim of this work is to test and refine the previously developed statistically reliable method for assessing the severity of the condition of premature infants with a gestational age of fewer than 30 weeks at birth on the basis of multivariate statistical analysis of data on a larger sample of patients [13].

2 Research Objective

In the first stage, anamnestic data and the results of medical and diagnostic procedures were assessed in 62 premature infants treated in the department of anesthesiology and intensive care (with wards for newborns) of the State Institution “Republican Scientific and Practical Center “Mother and Child””, Minsk, Belarus.

Twenty signs were analyzed: anamnestic data, acid–base state of blood at birth, near-infrared spectroscopy (NIRS) for the first 24 h of a child's life, respiratory support for a newborn, hemodynamic status, and fluid balance in the first 24 h of a child's life. The analyzed signs are given in Table 1.

Table 1 Anamnestic and laboratory data

The examined patients required respiratory support. Indicators X1–X6 were recorded in the delivery room, and indicators X7–X20 in the first 24 h of life. All premature infants were divided into two samples. The first sample consists of 32 newborns and is referred to as the training sample. It was used to investigate whether a statistically significant classification of the state of newborns by severity is possible in principle. The second sample of 30 preterm infants is referred to as the test sample. It is used to check the adequacy of the classification proposed on the training sample. If the adequacy proves satisfactory, the condition is then classified for the entire sample of 62 newborns, and a methodology is developed for a comprehensive assessment of the health of premature newborns with a gestational age of fewer than 30 weeks.

3 Mathematical Models and Methods

For practical use, the method of assessing the health of premature newborns should, on the one hand, be simple and understandable for doctors. On the other hand, it should be based on a rigorous mathematical model that makes the obtained results easy to interpret. Logistic regression meets these requirements [14]. However, its effective application requires a well-chosen set of diagnostic features that allow the existing set of observations to be divided into clusters. In that case, despite their relative simplicity, classification methods based on logistic regression can be more effective than more powerful recognition procedures.

Logistic regression yields a well-interpreted indicator of severity in the form of the probability of assigning a patient to a particular group of patients. For two classes, it is as follows.

A training sample of values of the features \(X_1, X_2, \ldots, X_m\) is given as pairs \(({{\varvec{x}}}_i, y_i)\), i = 1, 2, …, n,

where \({{\varvec{x}}}_i = \left( {\begin{array}{*{20}c} {x_{i0} } \\ {x_{i1} } \\ \cdots \\ {x_{im} } \\ \end{array} } \right) = \left( {\begin{array}{*{20}c} 1 \\ {x_{i1} } \\ \cdots \\ {x_{im} } \\ \end{array} } \right)\)—vector of values of the i-th object, \({{\varvec{X}}} = \left( {\begin{array}{*{20}c} 1 & {x_{11} } & \ldots & {x_{1m} } \\ 1 & {x_{21} } & \ldots & {x_{2m} } \\ \ldots & \ldots & \ldots & \ldots \\ 1 & {x_{n1} } & \ldots & {x_{nm} } \\ \end{array} } \right)\); \(X_j = \left( {\begin{array}{*{20}c} {x_{1j} } \\ {x_{2j} } \\ \cdots \\ {x_{nj} } \\ \end{array} } \right)\); \({{\varvec{y}}} = \left( {\begin{array}{*{20}c} {y_1 } \\ {y_2 } \\ \ldots \\ {y_n } \\ \end{array} } \right)\), \(y_i \in \left\{ {0;\;1} \right\}\)—binary variable indicating affiliation of the i-th object to the corresponding class, for example, to the first class at yi = 0 and to the second–at yi = 1; m is the number of features for each object; n is the number of observations. The classification is performed using the logistic function [14]

$$h({{\varvec{x}}}) = \frac{1}{{1 + \exp \{ - {{\varvec{b}}}^T {{\varvec{x}}}\} }},$$
(1)

taking values in the interval (0; 1). The classification threshold is \(h({{\varvec{x}}}) = 0.5\). Vector \({{\varvec{b}}} = (b_0 \;b_1 \;...\;b_m )^T\) in (1) defines a separating linear boundary described by the hyperplane equation \(\Pi :\;\;{{\varvec{b}}}^T {{\varvec{x}}} = 0\).

Let us introduce the function \(W({{\varvec{x}}}) = {{\varvec{b}}}^T {{\varvec{x}}}\). Let us define the region D1 of possible values of x for the first class as \(D_1 = \{ {{\varvec{x}}}:W({{\varvec{x}}}) < 0\}\), and for the second class as \(D_2 = \{ {{\varvec{x}}}:W({{\varvec{x}}}) > 0\}\). Then \(\forall {{\varvec{x}}} \in D_1\) \(h({{\varvec{x}}}) < 0.5\) and \(\forall {{\varvec{x}}} \in D_2\) \(h({{\varvec{x}}}) > 0.5\). If x belongs to the hyperplane Π, then \(h({{\varvec{x}}}) = 0.5\). That is, for an arbitrary observation x*, the probability of its assignment to the first class equals \(P({{\varvec{x}}}^* \in D_1 ) = 1 - h({{\varvec{x}}}^* )\), and to the second class \(P({{\varvec{x}}}^* \in D_2 ) = h({{\varvec{x}}}^* )\).
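The relationship between W(x), h(x), and the two class probabilities can be illustrated with a minimal Python sketch. The function names and the demonstration coefficients below are hypothetical and are not taken from the study; only formula (1) and the definitions of D1 and D2 come from the text.

```python
import math


def logistic(w: float) -> float:
    """Logistic function h = 1 / (1 + exp(-w)), evaluated without overflow."""
    if w >= 0:
        return 1.0 / (1.0 + math.exp(-w))
    e = math.exp(w)
    return e / (1.0 + e)


def class_probabilities(b, x):
    """Return (P(x in D1), P(x in D2)) for coefficients b and features x.

    b = (b0, b1, ..., bm); x = (x1, ..., xm) without the leading constant 1.
    """
    w = b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))  # W(x) = b^T x
    h = logistic(w)
    return 1.0 - h, h


# Hypothetical coefficients and observation, for illustration only.
b_demo = [0.5, -1.0, 2.0]
x_demo = [1.5, 0.25]
p_d1, p_d2 = class_probabilities(b_demo, x_demo)
print(f"P(D1) = {p_d1:.4f}, P(D2) = {p_d2:.4f}")
```

Here W(x) = −0.5 is negative, so the observation falls in D1 and h(x) is below the threshold of 0.5.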

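In [14], a method was proposed for calculating the coefficients of the vector b by solving the problem \(Q({{\varvec{b}}}) = \sum_{i = 1}^n {\ln \left( {1 + e^{ - y_i {{\varvec{b}}}^T {{\varvec{x}}}_i } } \right)} \to \mathop {\min }\limits_{{{\varvec{b}}} \in {{\varvec{R}}}^{m + 1} }\). In this form of the objective, the class labels are coded as \(y_i \in \left\{ { - 1;\;1} \right\}\) instead of \(\left\{ {0;\;1} \right\}\).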

The vector of coefficients b can be estimated by different algorithms, for example, the Newton–Raphson algorithm [15, 16]. However, when all observations are classified correctly, the objective function Q(b) has a zero lower bound that is approached only at infinity, and therefore the minimization problem has no exact solution. An increase in the components of vector b causes an unlimited growth in the magnitude of some values \(- y_i {{\varvec{b}}}^T {{\varvec{x}}}_i\). As a result, computational errors grow, leading to arithmetic overflow and termination of the algorithm [16].
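The absence of a finite minimizer under perfect separation can be seen on a toy example. The sketch below assumes labels coded as −1 and +1 (the coding under which the objective Q(b) is the logistic loss) and uses made-up, linearly separable data. Scaling a separating vector b by an ever larger factor keeps lowering Q(b) towards zero without ever reaching it.

```python
import math


def log_loss(b, xs, ys):
    """Q(b) = sum of ln(1 + exp(-y_i * b^T x_i)), labels y_i in {-1, +1}."""
    total = 0.0
    for x, y in zip(xs, ys):
        margin = y * sum(bj * xj for bj, xj in zip(b, x))
        # ln(1 + exp(-margin)) written so that exp() never overflows
        if margin > 0:
            total += math.log1p(math.exp(-margin))
        else:
            total += -margin + math.log1p(math.exp(margin))
    return total


# Toy separable data (hypothetical); each x starts with the constant 1.
xs = [(1.0, -2.0), (1.0, -1.0), (1.0, 1.0), (1.0, 2.0)]
ys = [-1, -1, 1, 1]
direction = (0.0, 1.0)  # this vector separates the two classes exactly

scales = (1, 10, 100)
losses = [log_loss([t * d for d in direction], xs, ys) for t in scales]
for t, q in zip(scales, losses):
    print(f"scale {t:>3}: Q = {q:.3e}")
```

A naive evaluation of exp(−y b^T x) for a badly misclassified point and a large b would raise an overflow error, which is the failure mode the paragraph describes.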

The program [17] was used to estimate the vector of coefficients b. It implements a zero-order algorithm, described in [18], that performs a random search while keeping the length of the vector b fixed at each iteration. This eliminates the uncontrolled growth of computational errors [19] and ensures the stability of the algorithm.
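The idea of a derivative-free search over vectors of fixed length can be sketched as follows. This is not the program [17] or the algorithm of [18], whose details are not given here. It is a simple accept-if-better random search written under assumptions: labels coded as −1 and +1, a Gaussian perturbation step, and toy data. All function names and parameter values are hypothetical.

```python
import math
import random


def log_loss(b, xs, ys):
    """Q(b) = sum of ln(1 + exp(-y_i * b^T x_i)), labels y_i in {-1, +1}."""
    total = 0.0
    for x, y in zip(xs, ys):
        margin = y * sum(bj * xj for bj, xj in zip(b, x))
        if margin > 0:
            total += math.log1p(math.exp(-margin))
        else:
            total += -margin + math.log1p(math.exp(margin))
    return total


def normalize(b, length):
    """Rescale b to the prescribed Euclidean length."""
    norm = math.sqrt(sum(v * v for v in b))
    return [length * v / norm for v in b]


def random_search(xs, ys, length=1.0, step=0.5, iters=2000, seed=0):
    """Zero-order search: perturb b, restore its length, keep improvements."""
    rng = random.Random(seed)
    dim = len(xs[0])
    b = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)], length)
    best = log_loss(b, xs, ys)
    for _ in range(iters):
        trial = normalize([v + step * rng.gauss(0.0, 1.0) for v in b], length)
        q = log_loss(trial, xs, ys)
        if q < best:  # no gradients are used, only function values
            b, best = trial, q
    return b, best


# Toy separable data (hypothetical); each x starts with the constant 1.
xs = [(1.0, -2.0), (1.0, -1.0), (1.0, 1.0), (1.0, 2.0)]
ys = [-1, -1, 1, 1]
b_hat, q_hat = random_search(xs, ys)
print("b =", [round(v, 3) for v in b_hat], "Q =", round(q_hat, 4))
```

Because the length of b is held constant, the exponents in Q(b) stay bounded even for perfectly separable data, so the search cannot drift to infinity.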

4 Results and Discussion

Cluster analysis revealed the presence of two groups of premature children according to the severity of their health conditions, which were defined as “severe” (D1) and “very severe” (D2) [20, 21].

Thus, taking into account the conclusions of the cluster analysis and the results of a complete medical examination, a training sample was formed: 21 patients were assigned to group D1, and 11 patients to group D2. The classification problem is to distinguish between the groups of patients D1 and D2 and, if this succeeds, to construct a decision rule for classification. To do this, it was necessary to solve the problem of multidimensional classification (recognition) of two groups (clusters) in terms of anamnestic data and the results of medical and diagnostic procedures X1–X20. The essence of the solution is to find a subset of indicators from the initial set that allows (if this is possible in principle) the differences between these groups to be recognized with statistical reliability. This task was solved in two stages. First, using discriminant analysis, a system of informative features was formed, and then, using logistic regression, a classification decision rule was built.

Using discriminant analysis [22, 23] in the Statistica package, a discriminant function was constructed with a minimum p-level that was less than 0.0001. The following qualitative and quantitative indicators were informative: X3, X4, X6, X8, X9, X10, X14, X16, X17, X18. Recognition of the two groups D1 and D2 was statistically reliable since all the indicators have high statistical reliability (over 98%). This means that the formed system of indicators sufficiently distinguishes between “severe” and “very severe”. All patients from the training sample were correctly classified.

Then, based on the statistically significant features obtained using discriminant analysis, the classification decision rule was constructed in the form of binary logistic regression. To ensure the computational stability of the algorithm for constructing the logistic regression, the data were standardized (the indicators were brought to zero mean and unit variance). The integrative predictive index has the following form:

$$\begin{aligned} Z &= b_0 + b_3 X_3 + b_4 X_4 + b_6 X_6 + b_8 X_8 + b_9 X_9 \\ & \quad + b_{10} X_{10} + b_{14} X_{14} + b_{16} X_{16} + b_{17} X_{17} + b_{18} X_{18} , \end{aligned}$$
(2)

where b0 = 7.268, b3 = −10.171, b4 = 7.229, b6 = −20.056, b8 = 2.388, b9 = 18.367, b10 = −8.374, b14 = −9.813, b16 = −3.721, b17 = 16.845, b18 = −16.898.

If the result calculated according to the predictive rule (2) is less than zero, then the child is predicted to have a “very severe” condition with probability \(P_0 = e^Z /(1 + e^Z )\). If the result Z is greater than zero, then the child is predicted to have a “severe” condition with probability \(1 - P_0 = 1/(1 + e^Z )\). Group D1 included 21 patients, and group D2 included 11 patients. All patients were correctly classified.
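Evaluating rule (2) for one patient amounts to standardizing the ten indicators and taking a weighted sum. The sketch below uses the coefficients reported for rule (2). The function names are hypothetical, and the means and standard deviations needed for standardization are not given in the text, so they must be supplied by the user. The sketch only computes Z and the quantity \(e^Z /(1 + e^Z )\); the mapping from the sign of Z to a severity group is left to the rule as stated in the text.

```python
import math

# Coefficients of rule (2) as reported in the text, keyed by indicator number.
B0 = 7.268
COEF = {3: -10.171, 4: 7.229, 6: -20.056, 8: 2.388, 9: 18.367,
        10: -8.374, 14: -9.813, 16: -3.721, 17: 16.845, 18: -16.898}


def standardize(value, mean, std):
    """Bring a raw indicator to zero mean and unit variance.

    mean and std are sample statistics that are NOT given in the text.
    """
    return (value - mean) / std


def predictive_index(z_features):
    """Z = b0 + sum of b_j * X_j over the indicators of rule (2).

    z_features maps indicator number to its standardized value.
    """
    return B0 + sum(COEF[j] * z_features[j] for j in COEF)


def probability_p0(z):
    """e^Z / (1 + e^Z), evaluated without overflow for large |Z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical patient whose standardized indicators all equal the sample mean.
average_patient = {j: 0.0 for j in COEF}
z_demo = predictive_index(average_patient)
print(f"Z = {z_demo:.3f}, e^Z / (1 + e^Z) = {probability_p0(z_demo):.4f}")
```

For a patient with every standardized indicator at zero, Z reduces to the intercept b0 = 7.268.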

In the second stage, the quality of recognition according to the predictive rule (2) was checked on a test sample of another 30 premature infants with an established severity of health condition. Here, as before, the data were standardized.

Table 2 shows the results of recognizing the state of health of patients. The columns are: №№, the observation number (serial number of the patient in the total sample); IND, the group to which the observation belongs (IND = 0 for the “severe” group, IND = 1 for the “very severe” group); P0, the probability of assigning the patient to the “very severe” group (IND = 1); 1 − P0, the probability of assigning the patient to the “severe” group (IND = 0); W, the group calculated for the patient according to the prognostic rule (2) (W = 0 for “severe”, W = 1 for “very severe”).

Table 2 Results for the test group of patients

According to Table 2, five out of thirty patients were misclassified (№№ 34, 37, 40, 45, 48), which is 16.7%. This discrepancy is explained by the relatively small training sample used to construct the predictive rule (2). It should be noted that for a training sample of only 32 observations this is an acceptable result, which suggests that the feature space was chosen successfully.

Now we consider all 62 observations as a training sample (32 patients from the training sample plus 30 new patients from the test sample). There were 31 boys (50.00%) and 31 girls (50.00%). Of the newborns, 44 were from singleton pregnancies (70.97%) and 18 from multiple pregnancies (29.03%). The mean gestational age of the infants was 28.095 ± 0.947 weeks. By gestational age, 24 newborns had III degree of prematurity and 38 infants had IV degree of prematurity.

The body weight at birth was 1061.774 ± 222.501 g; by body weight, 31 babies had III degree and 31 babies had IV degree of prematurity. These clinical data indicate that the studied sample was formed from an extremely severe group of pediatric intensive care patients.

Three children (4.84%) were born vaginally, and 59 infants (95.16%) were born by abdominal delivery (38/64.41% by emergency and 21/35.59% by planned delivery).

All the children (62/100.00%) in the delivery room received surfactant replacement therapy at a dose of 248.639 ± 88.903 mg/kg.

The discriminant analysis showed that in this case the indicators X2, X3, X4, X5, X8, X9, X10, X14, X16 are statistically significant (p-level less than 0.05), i.e. the feature set changed only slightly: X2 and X5 entered the model, while X6, X17, and X18 dropped out.

Table 3 shows the results of discriminant data analysis. The discriminant function was constructed with the minimum p-level, which was less than 0.0001. The resulting model includes features whose p-level turned out to be less than 0.05, i.e. all indicators turned out to be statistically significant with reliability higher than 0.95.

Table 3 Results of checking the statistical significance of signs for the test group of patients

Based on the formed diagnostic features, an integrative prognostic index was built as follows.

$$\begin{aligned} Z & = b_0 + b_2 X_2 + b_3 X_3 + b_4 X_4 + b_5 X_5 + b_8 X_8 \\ & \quad + b_9 X_9 + b_{10} X_{10} + b_{14} X_{14} + b_{16} X_{16} , \end{aligned}$$
(3)

where b0 = 2.312, b2 = 7.488, b3 = −6.937, b4 = 3.601, b5 = 7.039, b8 = 2.436, b9 = 11.227, b10 = −6.559, b14 = −6.050, b16 = −8.604.

Rule (3) made it possible to correctly classify all 62 patients. The classification results are shown in Tables 4 and 5.

Table 4 Classification results for the training sample of patients
Table 5 Results of classification for the test sample of patients

Analysis of Tables 4 and 5 shows that all patients were recognized correctly, and there is not a single case in which the probability P0 is close to the borderline value of 0.5. Thus, based on a sample of 62 observations, we obtained a set of diagnostic indicators for recognizing the health status of premature newborns.

Let us compare the considered groups of premature newborns using entropy analysis [24, 25]. Here, as before, the data were standardized.

The results of vector entropy analysis for statistically significant indicators are shown in Table 6.

Table 6 Results of vector entropy analysis of groups “severe” (IND = 0) and “very severe” (IND = 1)

We can see that in the “very severe” group, compared with the “severe” group, the entropies of chaos and self-organization are much higher (the entropy of chaos is 2.87 or 27.4% higher, and the entropy of self-organization is greater by 0.42 or 38.4%). This means that in “very severe” newborns, in general, the variation in indicators, their variability is higher, and the tightness of the relationship between indicators is lower. Accordingly, the total entropy of “very severe” is 3.29 or 35.4% higher than that of “severe” ones. Thus, from the point of view of considering the organism as a complex system, the elements (informative diagnostic indicators) in the group (in the system) “very severe” behave more chaotically and interact much less in comparison with the elements of the group (of the system) “severe”.
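One way to read these quantities, assuming the indicators are jointly Gaussian (an assumption not stated in the text; the method of [24, 25] may differ in detail), is the exact decomposition of the differential entropy of a multivariate normal vector. The total entropy equals the sum of the marginal entropies, which plays the role of the “chaos” term, plus one half of the logarithm of the determinant of the correlation matrix, which is never positive and plays the role of the “self-organization” term. The function names and the toy data in the sketch are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long data columns."""
    centered = [[x - mean(c) for x in c] for c in columns]
    norms = [math.sqrt(sum(x * x for x in c)) for c in centered]
    m = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(m)] for i in range(m)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[pivot][k] == 0.0:
            return 0.0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            det = -det
        det *= a[k][k]
        for r in range(k + 1, n):
            factor = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
    return det


def gaussian_entropy_parts(columns):
    """Return (chaos term, self-organization term, total) under normality.

    Raises ValueError if the columns are exactly collinear (det R = 0).
    """
    n = len(columns[0])
    variances = [sum((x - mean(c)) ** 2 for x in c) / (n - 1) for c in columns]
    h_chaos = sum(0.5 * math.log(2 * math.pi * math.e * v) for v in variances)
    h_self = 0.5 * math.log(determinant(correlation_matrix(columns)))
    return h_chaos, h_self, h_chaos + h_self


# Toy columns (hypothetical): the first pair is uncorrelated, the second is not.
col_a = [1.0, 2.0, 3.0, 4.0]
uncorrelated = gaussian_entropy_parts([col_a, [1.0, -1.0, -1.0, 1.0]])
correlated = gaussian_entropy_parts([col_a, [1.0, 2.0, 3.0, 5.0]])
print("uncorrelated:", uncorrelated)
print("correlated:  ", correlated)
```

Under this reading, a self-organization term closer to zero means weaker relationships between the indicators, which matches the interpretation given for the “very severe” group.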

5 Conclusion

The combination of a set of signs of anamnestic data, signs of acid–base state at birth, signs of respiratory support of the newborn, hemodynamic status, and NIRS allows us to reliably judge the severity of the condition of premature infants with a gestational age of fewer than 30 weeks.

The calculated decision rule is an affordable technique for working with premature newborns. It can help doctors assess in advance the severity of the pathological process in the child's body and provide the necessary complex of therapeutic measures in a timely manner to stabilize the condition of patients in neonatal intensive care units.

Vector-entropy analysis showed a significant difference in the behavior of groups of premature newborns “severe” and “very severe” as complex multi-dimensional systems. Informative diagnostic indicators in the “very severe” group behave more chaotically and interact much less in comparison with the elements of the “severe” group.

The obtained results can be introduced into the practice of children's intensive care units at various levels of perinatal care for newborn children.

This work was supported by a joint Russian-Belarusian project of the Russian Foundation for Basic Research (grant no. 20-51-00001) and the BRFFR (grant no. M20R-008).