Introduction

Hemophagocytic lymphohistiocytosis (HLH) is a life-threatening disorder characterized by pathological immune activation and hyperinflammation. Early diagnosis and adequate therapy are essential for the survival of patients with HLH [1, 2]. However, early diagnosis is challenging because HLH has a wide range of clinical manifestations, including fever, pulmonary dysfunction, cytopenias, liver disease, coagulopathy, skin manifestations, and neurologic symptoms [1, 3, 4]. Moreover, the diagnosis of HLH requires multiple laboratory tests, several of which are time-consuming to perform. According to the diagnostic criteria of the HLH-2004 study, a diagnosis of HLH can be established if a patient has a molecular diagnosis consistent with HLH or meets five of the following eight clinical criteria: fever, splenomegaly, cytopenias, hypertriglyceridemia and/or hypofibrinogenemia, hemophagocytosis, decreased natural killer (NK)-cell function, elevated ferritin, and elevated soluble CD25 (sCD25) levels [5]. Three of these criteria (fever, splenomegaly, and cytopenias) can be easily assessed during routine clinical practice. However, the remaining criteria are less likely to be evaluated unless there is a specific reason to do so, particularly genetic testing and assessments of hemophagocytosis, NK-cell function, and sCD25 levels. Given the complexity of the diagnostic criteria for HLH, the initial step in the diagnostic process is to suspect the condition and then order the diagnostic tests. Clinical data have shown that adequate diagnostic testing is one of the most important factors for the early diagnosis of HLH [6].
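The HLH-2004 decision rule summarized above can be written as a short function. This is an illustrative sketch only: the function name and the criterion labels are hypothetical, and each criterion is assumed to have already been judged as met or not met by the treating clinician.

```python
# Hypothetical labels for the eight HLH-2004 clinical criteria named in the text.
HLH_2004_CRITERIA = (
    "fever",
    "splenomegaly",
    "cytopenias",
    "hypertriglyceridemia_or_hypofibrinogenemia",
    "hemophagocytosis",
    "low_nk_cell_function",
    "elevated_ferritin",
    "elevated_scd25",
)


def meets_hlh_2004(criteria_met, molecular_diagnosis=False):
    """Return True if a molecular diagnosis is present or >= 5 of the 8 criteria are met."""
    unknown = set(criteria_met) - set(HLH_2004_CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return bool(molecular_diagnosis) or len(set(criteria_met)) >= 5
```

A patient with only fever, splenomegaly, and cytopenias meets three criteria and therefore does not satisfy the rule until at least two of the less readily available criteria are also confirmed.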

Because HLH is relatively rare and has a heterogeneous presentation, deciding when to suspect HLH and order diagnostic tests can be difficult [1, 7, 8]. Given that the outcome of HLH can be fatal, some researchers have proposed that HLH should be suspected when any of the typical and diagnostic features of the condition are present [1, 7, 8]. However, there is a lack of consensus regarding when to suspect HLH, and the effectiveness of suspecting HLH whenever any of its features is present has yet to be determined. Screening identifies patients who may be at increased risk of a certain disease and thus enables early diagnosis and treatment. The development of screening tests for HLH could aid in the decision of when to suspect HLH.

An ideal screening test for HLH would aid in selecting patients for the HLH diagnostic workup in an efficient manner by reducing the number of unnecessary diagnostic tests while minimizing the risk of missing HLH cases. At hospital admission, the presence of fever, splenomegaly, and cytopenias can be observed through the first routine physical examination and laboratory test. These three criteria serve as the earliest screening tests for HLH. However, the performance of these three criteria has not yet been determined. In our previous study, all HLH patients fulfilled the criteria for cytopenias during hospitalization; nearly 67% of the patients exhibited cytopenias within 48 h of hospital admission, and 33% developed cytopenias thereafter [6]. Although most HLH patients exhibit fever and splenomegaly, these symptoms are also commonly seen in various other diseases, making them non-specific markers for the screening of HLH. To support the development of evidence-based diagnostic procedures for HLH, the screening performances of fever, splenomegaly, and cytopenias need to be evaluated. Furthermore, for HLH patients who did not exhibit simultaneous fever, splenomegaly, and cytopenias during the early stages, additional screening tests are necessary.

Unlike diagnostic criteria, screening criteria need to be cost-effective and, ideally, include items that can be easily checked in routine practice. Several laboratory tests, such as complete blood cell count and liver and kidney function tests, are routinely performed at hospital admission in most clinical settings, and abnormalities in these tests have been observed in HLH patients [9,10,11]. The potential utility of these tests for early screening of HLH therefore deserves further evaluation.

This study has three objectives: (1) to assess the effectiveness of using fever, splenomegaly, and cytopenias as early screening criteria for pediatric HLH; (2) to construct a screening model utilizing the most commonly measured laboratory parameters; and (3) to develop a step-by-step screening procedure for pediatric HLH. The developed screening procedure can aid in the early diagnosis of pediatric HLH. It can also serve as a tool for identifying high-risk populations for the purpose of studying early diagnosis and treatment of HLH.

Methods

Study Population

The clinical data of pediatric patients hospitalized at Hunan Children’s Hospital (Changsha, China) between 1 January 2018 and 31 March 2022 were reviewed retrospectively. Patients with data obtained within 24 h of admission allowing evaluation of fever, splenomegaly, and cytopenia status were included. The outcome of interest was the diagnosis of HLH during hospitalization. We excluded patients suspected of having HLH who were either discharged or deceased before a definitive diagnosis could be established.

The study protocol was reviewed and approved by the Medical Ethics Committee of Hunan Children’s Hospital (approval numbers: HCHLL-2019-40 and HCHLL-2022-50). The study was performed in accordance with the ethical standards as laid down in the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. The requirement for written informed consent was waived by the Medical Ethics Committee of Hunan Children’s Hospital.

Diagnoses and Variables

HLH was diagnosed according to the HLH-2004 criteria [5]. Fever, splenomegaly, hemoglobin level, and platelet and neutrophil counts were selected as screening parameters because they were the most readily available items in the HLH-2004 diagnostic criteria. Other candidate screening parameters were selected from the most commonly performed laboratory blood tests, including complete blood cell count and liver and kidney function tests, which were conducted in over 90% of patients during hospital admission assessments. A total of 34 laboratory parameters were examined (Table S1), including white blood cell, platelet, red blood cell (RBC), neutrophil, monocyte, lymphocyte, eosinophil, and basophil counts; lymphocyte, neutrophil, monocyte, eosinophil, and basophil percentages; hemoglobin, hematocrit, mean RBC volume, mean RBC hemoglobin concentration, and mean RBC hemoglobin; total protein, albumin, total bilirubin, direct bilirubin, indirect bilirubin, aspartate aminotransferase, alanine aminotransferase, total bile acids, globulin, lactate dehydrogenase (LDH), creatinine, uric acid, blood urea nitrogen, myoglobin, creatine kinase (CK), and CK-MB. The first test result within 24 h of hospital admission was extracted from each patient’s electronic medical records and used to develop the screening model.

Development of the Screening Procedure and Statistical Analysis

To determine the optimal combination of fever, splenomegaly, hemoglobin level, platelet count, and neutrophil count for the screening of HLH, the performance of various combinations was evaluated. If the performance of the optimal combination was unsatisfactory for a screening test, a stepwise screening procedure was considered as an alternative. Some HLH patients can be identified through screening steps based on fever, splenomegaly, and cytopenias, but others may be missed at these steps. To identify the latter patients, a scoring model was developed using commonly available laboratory parameters.

The study population requiring model-based screening was randomly divided into a training set (70%) and a validation set (30%) using the PROC SURVEYSELECT procedure in SAS 9.4 software. The screening score model was constructed in three steps using data from the training set. First, independent variables were selected from all candidate variables through logistic regression analysis based on the Akaike Information Criterion (AIC)-optimal selection method with cross-validation [12]. The variable selection procedure is described in more detail in a previous study [13]. Second, we assigned a score to each variable. The included laboratory variables were dichotomized according to local reference ranges; for variables that were also included in the HLH-2004 diagnostic criteria, the cutoff values used therein were applied for dichotomization. Instead of identifying new cutoff points, we used existing reference ranges for laboratory parameters to avoid overfitting and improve generalizability. Another logistic regression model was constructed using the dichotomized variables. Each variable was weighted by multiplying its regression coefficient (beta value) by 10 and rounding to the nearest integer. Patients were assigned either the specified weight or 0 for each screening variable, depending on whether its criterion was met. For each patient, the total score was calculated by summing the scores of all screening variables. Finally, we identified the optimal cutoff point for the screening score: a logistic model was developed that included the score as its only independent variable, and the optimal cutoff point corresponded to the maximum Youden’s index on the receiver operating characteristic (ROC) curve. Patients with a total score that exceeded the cutoff point were considered to be at higher risk for HLH, such that they would benefit from further diagnostic workup.
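The weighting and cutoff steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the SAS/R code used in the study: the function names are hypothetical, the coefficients would come from a fitted logistic model, and the cutoff search simply scans the observed scores rather than fitting a logistic model on the score. Note that Python's built-in `round` sends exact halves to the nearest even integer, which may differ from other software.

```python
def weights_from_betas(betas):
    """Convert regression coefficients into integer points: round(10 * beta)."""
    return {name: round(10 * beta) for name, beta in betas.items()}


def total_score(flags, weights):
    """Sum the weights of all criteria that are met (unmet criteria contribute 0)."""
    return sum(weights[name] for name, met in flags.items() if met)


def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A patient is screen-positive when score > cutoff. labels are 1 (HLH) or 0.
    Ties are resolved in favor of the lowest cutoff.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both HLH and non-HLH patients are required")
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s > cut)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s <= cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j
```

Keeping the cutoff search separate from the weighting makes it straightforward to re-derive the cutoff in another population while reusing the same weights.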
To evaluate the screening performance of the scoring criteria, the area under the curve (AUC), sensitivity, specificity, false-negative rate (FNR), false-positive rate (FPR), negative predictive value (NPV), and positive predictive value (PPV) were calculated using data from the training and validation sets.
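Apart from the AUC, the measures listed above follow directly from a 2 × 2 table of screening result against final HLH diagnosis. The sketch below shows the standard definitions; the function name is hypothetical, and it assumes every denominator is non-zero.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute screening performance from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # HLH cases flagged by the screen
        "specificity": tn / (tn + fp),  # non-HLH patients correctly not flagged
        "fnr": fn / (tp + fn),          # false-negative rate = 1 - sensitivity
        "fpr": fp / (fp + tn),          # false-positive rate = 1 - specificity
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }
```

Because HLH is rare, the PPV of a screen can be modest even when sensitivity and specificity are both high, which is why all of these measures are reported together.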

Data were presented as absolute values with percentages or quartiles, as appropriate. The Chi-squared and Wilcoxon rank sum tests were used for between-group comparisons. All tests were two-tailed, and the type 1 error rate was set to 5%. Missing data were not imputed. Statistical analyses were performed using SAS (ver. 9.4, SAS Institute) and R (ver. 4.1.3, R Core Team) software.

Sensitivity Analysis

In the primary analysis, the HLH screening score was applied to patients who had reached the final step of the screening process. A sensitivity analysis was conducted to assess the performance of the screening score as a standalone screening step for all patients. Because EBV-HLH is one of the most common forms of HLH in pediatric patients [7], especially in Asia, another sensitivity analysis was conducted to evaluate the ability of the screening score to identify EBV-HLH and non-EBV-HLH cases. Finally, because the availability of scoring parameters may vary across countries, a further sensitivity analysis evaluated the screening performance after either omitting or replacing a single scoring parameter, in order to provide alternative screening parameters.

Results

Study Population and Incidences of HLH

Between January 2018 and March 2022, the medical records of 83,965 pediatric patients admitted to Hunan Children’s Hospital documented fever and splenomegaly status at hospital admission. Of these patients, 5863 (6.98%) were excluded from our analysis due to missing data on hemoglobin levels, platelet counts, or neutrophil counts within 24 h of admission; a further 52 (0.06%) patients with suspected HLH were excluded because there was no definitive diagnosis. Thus, a total of 78,050 (92.96%) patients were included in the final analysis (Fig. 1). Table 1 shows the demographic and clinical characteristics of the included patients. Among all included patients, 160 (0.2%) were diagnosed with HLH. Patients with HLH were more likely to have fever (96.3%) and splenomegaly (70.6%) than those without the disease (fever, 43.2%; splenomegaly, 4.9%). The rates of cytopenias affecting ≥ 1 lineage, ≥ 2 lineages, and 3 lineages were 83.8%, 51.9%, and 23.1% in patients with HLH; these rates were significantly higher than those of patients without HLH (12.8%, 2.4%, and 0.8%, respectively).

Fig. 1 Study flowchart

Table 1 Demographic and clinical characteristics of hospitalized pediatric patients with and without HLH

Screening Performance of Fever, Splenomegaly, and Cytopenias

The performance of the screening process using different combinations of fever, splenomegaly, and cytopenias is presented in Table 2. Ideally, screening criteria for HLH should have high sensitivity and specificity. As shown in Table 2, fever, splenomegaly, and cytopenias alone had low sensitivity and/or specificity. The sensitivity was highest (99.4%) for “fever or splenomegaly,” but this combination had a low specificity (54.7%). Sensitivity decreased while specificity increased when cytopenias affecting more lineages were considered (Table 2).

Table 2 Screening performance of fever, splenomegaly, and cytopenias at hospital admission for pediatric HLH

Our stepwise screening strategy used combinations of criteria with high sensitivity or specificity. Using fever or splenomegaly as the first screening criterion, 99.4% sensitivity and 54.7% specificity were achieved. This initial screening step classified 54.58% of the patients as low risk for HLH, and there was one false-negative HLH case (Fig. 1). Cytopenias affecting ≥ 2 lineages was selected as the second screening step for patients with fever or splenomegaly because it showed better overall performance compared with cytopenias affecting ≥ 1 lineage and 3 lineages. Patients with cytopenias affecting ≥ 2 lineages plus fever or splenomegaly were classified as high risk, warranting further evaluation. The first two steps achieved a sensitivity of 51.9% and a specificity of 98.4%. However, because 48.1% of the HLH cases remained unrecognized, another screening step was needed to identify HLH patients with fever or splenomegaly in whom cytopenias did not affect two or more lineages.

The Scoring Model

Patients with fever or splenomegaly but without cytopenias affecting ≥ 2 lineages were randomly allocated to the training set (n = 23,875) or validation set (n = 10,231). Thirty-four laboratory parameters were compared between patients with and without HLH in the training set (Table S1). Twenty-three laboratory parameters showed significant between-group differences (P < 0.05; Table S1). Together with fever and splenomegaly, these parameters were included in the logistic regression analyses. Based on AIC-optimal selection with cross-validation, six parameters were selected as model predictors (with an AUC of 94.39%): splenomegaly, the platelet and neutrophil counts, and the albumin, total bile acid, and LDH levels. The distributions of these parameters in the HLH and non-HLH patients in the training set are presented in Fig. 2a–f. To develop the screening score criteria, the laboratory parameter values were dichotomized using cutoffs based on the HLH-2004 diagnostic criteria (platelet count, 100 × 10⁹/L; neutrophil count, 1.0 × 10⁹/L) or local laboratory references (albumin level, 35 g/L; total bile acid level, 9.67 μmol/L; LDH level, 450 IU/L). A second logistic regression model was developed, including splenomegaly and the five dichotomized laboratory parameters as predictors (Table S2). The regression coefficients obtained from this second model were multiplied by 10, rounded to the nearest integer, and used as the score for each parameter when the dichotomized criterion was met. For each patient, the scores of the six parameters were summed to obtain the total screening score. The median screening score was 64 (interquartile range (IQR): 46–78) for patients with HLH versus 9 (IQR: 0–22) for patients without HLH (Fig. 2g).

Fig. 2 Distribution of scoring parameters and the ROCs of the screening score. Distributions of (a) splenomegaly, (b) platelets, (c) neutrophils, (d) albumin, (e) total bile acid, (f) lactate dehydrogenase (LDH), and (g) screening score in patients with and without HLH in the training set. ROCs of the screening score in the (h) training set and (i) validation set

According to the maximum Youden’s index in the ROC, the optimal cutoff point for the screening score was 37. The AUC of the logistic regression model in which the screening score was the independent variable was 93.0% for the training set and 93.6% for the validation set. The performance of the screening score is shown in Table 3. The sensitivity and specificity were 83% and 91.2% for the training set and 87.0% and 90.6% for the validation set, respectively (Table 3).
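Applying the score to an individual patient can be sketched as follows. The cutoffs and the threshold of 37 are those reported in the text; everything else is an assumption made for illustration. In particular, the per-parameter weights are given in Table S2 and are not reproduced here, so `demo_weights` contains arbitrary placeholder values, and the comparison directions (low platelets, neutrophils, and albumin; high total bile acid and LDH; strict inequalities) are assumed rather than stated.

```python
# Cutoffs as reported in the text; comparison directions are assumptions.
PLATELET_CUTOFF = 100.0        # x 10^9/L (HLH-2004)
NEUTROPHIL_CUTOFF = 1.0        # x 10^9/L (HLH-2004)
ALBUMIN_CUTOFF = 35.0          # g/L (local reference)
TOTAL_BILE_ACID_CUTOFF = 9.67  # umol/L (local reference)
LDH_CUTOFF = 450.0             # IU/L (local reference)
SCORE_CUTOFF = 37              # optimal cutoff reported for the screening score


def screening_flags(splenomegaly, platelets, neutrophils, albumin, total_bile_acid, ldh):
    """Dichotomize the six scoring parameters (directions assumed, see lead-in)."""
    return {
        "splenomegaly": bool(splenomegaly),
        "low_platelets": platelets < PLATELET_CUTOFF,
        "low_neutrophils": neutrophils < NEUTROPHIL_CUTOFF,
        "low_albumin": albumin < ALBUMIN_CUTOFF,
        "high_total_bile_acid": total_bile_acid > TOTAL_BILE_ACID_CUTOFF,
        "high_ldh": ldh > LDH_CUTOFF,
    }


def screening_score(flags, weights):
    """Sum the weights of the criteria that are met."""
    return sum(weights[name] for name, met in flags.items() if met)


def is_screen_positive(score, cutoff=SCORE_CUTOFF):
    """A score strictly above the cutoff is treated as screen-positive."""
    return score > cutoff


# Placeholder weights for demonstration only; NOT the values from Table S2.
demo_weights = {
    "splenomegaly": 20,
    "low_platelets": 20,
    "low_neutrophils": 10,
    "low_albumin": 15,
    "high_total_bile_acid": 10,
    "high_ldh": 20,
}
```

With the real weights substituted for `demo_weights`, the same three functions would reproduce the third screening step for a single admission record.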

Table 3 Screening performance of the screening score for HLH in pediatric patients presenting with fever or splenomegaly but without cytopenias affecting ≥ 2 lineages

The Three-Step Screening Process for HLH

A three-step screening procedure for HLH was developed (Fig. 3). Step 1: Is fever or splenomegaly present? (Yes: risk for HLH should be considered, go to Step 2 for further evaluation; No: HLH less likely; sensitivity = 99.4%, specificity = 54.7%). Step 2: Are cytopenias affecting at least two lineages present? (Yes: consider HLH; No: go to Step 3; sensitivity = 52.2%, specificity = 96.4%). Step 3: Calculate the screening score. Is the total score > 37? (Yes: consider HLH; No: HLH less likely; sensitivity = 84.2%, specificity = 91.0%). The overall sensitivity and specificity of the three-step screening procedure were 91.9% and 94.4%, respectively.
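The decision flow of the three steps can be expressed as a short function. This is a sketch with hypothetical function names and return labels. The platelet and neutrophil thresholds are those quoted in the text; the hemoglobin threshold (< 90 g/L) is taken from the HLH-2004 criteria rather than from this text, and the separate HLH-2004 threshold for infants younger than 4 weeks is ignored here.

```python
def count_cytopenic_lineages(hemoglobin_g_l, platelets_e9_l, neutrophils_e9_l):
    """Count lineages below the HLH-2004 cytopenia thresholds (0 to 3)."""
    return sum([
        hemoglobin_g_l < 90,     # g/L; HLH-2004 value, assumed (not in this text)
        platelets_e9_l < 100,    # x 10^9/L
        neutrophils_e9_l < 1.0,  # x 10^9/L
    ])


def three_step_screen(fever, splenomegaly, cytopenic_lineages, score=None, cutoff=37):
    """Return 'consider HLH' or 'HLH less likely' following the three steps."""
    # Step 1: fever or splenomegaly must be present to continue.
    if not (fever or splenomegaly):
        return "HLH less likely"
    # Step 2: cytopenias affecting at least two lineages flag the patient.
    if cytopenic_lineages >= 2:
        return "consider HLH"
    # Step 3: otherwise the screening score decides.
    if score is None:
        raise ValueError("Step 3 requires the screening score")
    return "consider HLH" if score > cutoff else "HLH less likely"
```

The score is only needed for patients who reach Step 3, so it is an optional argument; a patient resolved at Step 1 or Step 2 never requires the additional laboratory parameters.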

Fig. 3 Three-step screening procedure for pediatric HLH

Sensitivity Analysis

In the primary analysis, as the third step of the screening procedure, an HLH screening score was calculated for patients with fever and/or splenomegaly and a lack of cytopenias affecting two or more lineages. To evaluate the effectiveness of the screening score as a standalone screening step for all patients (n = 78,050), a sensitivity analysis was conducted. The results indicated a sensitivity of 90.6% and a specificity of 92.1%.

Among the 160 HLH patients included in this study, 107 were EBV-positive. An additional sensitivity analysis of the ability of the screening score to identify EBV-HLH and non-EBV-HLH cases revealed sensitivities of 93.5% and 84.9%, respectively.

In the logistic regression model, total bile acid and hemoglobin levels had similar regression coefficients; only one of these parameters was required because including both did not improve the screening model’s performance. The screening procedure including the hemoglobin level had the same Youden index (0.863) as the procedure including the total bile acid level. However, the sensitivity of the latter procedure was higher (91.9% vs. 90.6%), which could result in more true HLH cases being identified; therefore, the total bile acid level was included among the final scoring criteria. Because the total bile acid level might not be routinely measured in hospitalized children worldwide, and given that it can be influenced by food intake and that its reference range applies to fasting samples, we conducted further sensitivity analyses by omitting this parameter or replacing it with the hemoglobin level (Table S3). After omitting the total bile acid level, the sensitivity of the screening procedure (92.5%) increased by 0.6 percentage points, but the specificity (93.1%) decreased by 1.3 percentage points. Replacing the total bile acid level with the hemoglobin level yielded a sensitivity of 90.6% and a specificity of 95.7%; these values indicate that the hemoglobin level can serve as an alternative if total bile acid data are not available.
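The comparison of the three variants can be checked with a few lines of arithmetic, using only the sensitivities and specificities reported above. The variant labels and variable names are hypothetical.

```python
def youden_index(sensitivity, specificity):
    """Youden's index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


# (sensitivity, specificity) of the overall procedure, as reported in the text.
variants = {
    "total bile acid": (0.919, 0.944),
    "hemoglobin": (0.906, 0.957),
    "bile acid omitted": (0.925, 0.931),
}

youden_by_variant = {
    name: round(youden_index(sens, spec), 3)
    for name, (sens, spec) in variants.items()
}
```

Under these figures the bile acid and hemoglobin variants tie on Youden's index, so the choice between them rests on whether sensitivity or specificity is prioritized, while omitting the parameter gives a slightly lower index.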

Discussion

In this study, the screening criteria of cytopenias affecting ≥ 2 lineages plus fever or splenomegaly had a sensitivity of 51.9% and a specificity of 98.4% for identifying HLH among pediatric inpatients. To improve the screening sensitivity, we developed a screening score using common laboratory parameters and established a three-step screening procedure for pediatric HLH. The overall sensitivity and specificity were 91.9% and 94.4%, respectively.

Completing all assessments needed to establish whether at least five of the eight HLH-2004 criteria are met is time-consuming, and tests for some criteria might not be available in low-resource settings; therefore, simplified criteria have been proposed [14, 15]. Smits et al. (2021) identified a so-called minimal parameter set consisting of phagocytosis, splenomegaly, cytopenias, ferritin, triglycerides, and fibrinogen for predicting HLH [14]. The HScore was developed for diagnosing HLH in adults and is based on nine variables including known underlying immunosuppression, high temperature, organomegaly, triglyceride, ferritin, serum glutamic oxaloacetic transaminase, fibrinogen levels, cytopenia, and hemophagocytosis features on bone marrow aspirate [15]. Validation studies showed that the HScore can be used in both adult and pediatric populations [16, 17]. However, these diagnostic tools were designed to be used when HLH is suspected, and the decision to order a diagnostic workup for HLH is based mainly on clinical experience. Nearly 63% of our pediatric HLH inpatients did not exhibit fever, splenomegaly, and cytopenias simultaneously at hospital admission. In other words, a large proportion of patients later diagnosed with HLH did not present with typical HLH features. For an early diagnosis of HLH, better screening methods are needed.

We observed marked differences in laboratory test results at hospital admission between inpatients who developed HLH and those who did not, suggesting that although the early diagnosis of HLH might be challenging, early indicators are present. Using a data-driven approach, we developed a screening procedure for HLH based on clinical and laboratory parameters that are regularly assessed during hospital admission. This procedure has the potential to offer the earliest possible screening results for HLH, which could aid in the decision-making process for HLH diagnostic evaluation and monitoring. We also provided an alternative model for cases with unavailable data for the total bile acid level.

Patients may develop HLH before or after hospital admission. In our previous study, approximately 33% of pediatric HLH patients were diagnosed after 3 days of hospitalization [6]. The interval between HLH symptom onset and definitive diagnosis ranges from 4 days to over 2 months [6]. The screening procedure developed in this study can be used to identify patients at high risk for HLH. For patients with positive screening results who do not meet the HLH-2004 diagnostic criteria, ongoing monitoring of HLH development is important. Furthermore, for patients with a negative screening outcome, the results may change over time, and ongoing monitoring for HLH should be conducted if there are ongoing concerns regarding the diagnosis.

Early diagnosis of HLH is essential for successful treatment. However, many patients are not diagnosed until severe symptoms have developed, which explains why diagnoses are often made in the intensive care unit [4, 18, 19]. To improve survival, early diagnostic markers and less aggressive treatments for HLH are needed. Studies aiming to improve early diagnosis and treatment should include patients with early-stage HLH. Another potential application of our screening tool is the identification of high-risk patients for clinical studies. For example, using a nested case-control study design, biological samples could be collected from patients with positive screening results. Such patients may develop HLH before or after sampling, or not at all, during hospitalization. For patients with positive screening results at hospital admission who are subsequently diagnosed with HLH, early biological samples could be analyzed for early diagnostic markers. Most biomarker studies analyzed samples collected after HLH was suspected or diagnosed [20,21,22]. Our screening method constitutes a new approach to identifying patients at an early stage and collecting early samples.

This study had several limitations. First, because of the retrospective design, some clinical features of HLH, such as skin manifestations and neurologic symptoms [1], were not assessed as candidate screening criteria. Similarly, the screening performance of triglyceride, fibrinogen, and ferritin levels could not be evaluated because those parameters were only assessed in a small proportion of patients in our cohort. Furthermore, not all patients in our study underwent diagnostic tests for HLH; therefore, HLH may have been underdiagnosed. Additionally, we could not distinguish between primary and secondary HLH because genetic tests were conducted on only a few patients. Because the clinical manifestations of primary and secondary HLH may differ [23], the ability of our screening procedure to detect each form requires validation. It should also be noted that our screening procedure was developed using data from a single center, which could result in selection bias. Multicenter validation studies are needed given that the cutoff of the screening score could differ among populations, as is the case for the HScore [24,25,26]. We were also unable to distinguish patients who developed HLH before versus after hospital admission because not all patients with HLH underwent HLH diagnostic workup at hospital admission. Finally, our screening procedure needs to be validated in prospective studies, and there is a continuous need for improvement in both the screening and diagnostic procedures for HLH.

Conclusion

A significant proportion of pediatric HLH patients present at the hospital without simultaneous fever, splenomegaly, and cytopenias. Our three-step screening procedure, based on common clinical and laboratory parameters, can effectively identify pediatric patients who may be at higher risk for HLH.