Introduction

Noninvasive differentiation of benign and malignant lymphadenopathy remains a challenge. Given the significant overlap in size and morphology between benign and malignant lymph nodes [1, 2], standard anatomical imaging modalities such as CT, MRI and ultrasound (US) have demonstrated limited accuracy in lymph node characterization [1, 3, 4]. This is an especially challenging problem in children, who are susceptible to a wide range of infections associated with lymphadenopathy that can persist over weeks and months. While several algorithms for the evaluation of children with lymphadenopathy have been published [5,6,7], the definitive management of children with unexplained lymphadenopathy has not been established. Serial imaging with ultrasound and CT is often performed, but the frequency of imaging and duration of persistent lymphadenopathy necessitating further workup can vary significantly. Biopsy provides definitive diagnosis but is invasive and often requires surgical excision under sedation or general anesthesia for adequate sampling in the pediatric population. Molecular imaging techniques such as 18F-2-fluoro-2-deoxyglucose positron emission tomography (18F-FDG PET) can be used to identify malignant lymphadenopathy and sites of primary malignancy [8,9,10]. However, concerns regarding ionizing radiation exposure, financial cost and need for sedation associated with 18F-FDG PET imaging in children all preclude its routine use in pediatric patients with lymphadenopathy [1, 11].

In this population, an important goal is to identify imaging features beyond size that can discriminate benign from malignant lymph nodes on routine CT images. Textural analysis (TA) is a post-processing technique that can be applied to diagnostic images to quantify heterogeneity of individual voxel signal intensities within tissues or organs of interest [12]. The distribution of these signal intensity values can be associated with specific microscopic disease processes and identify features beyond radiologist visual inspection [12, 13]. TA has been applied to CT and MR images of a variety of tissues to help differentiate benign from malignant processes and predict lesion treatment response, but these applications have been limited to adult cohorts [14,15,16,17,18].

The purpose of this study was to establish whether TA of enlarged lymph nodes on contrast-enhanced CT images in pediatric patients can distinguish benign from malignant lymphadenopathy.

Materials and methods

Patients and imaging

Our institutional review board approved this single-institution retrospective query of a radiology report database to identify pediatric patients 18 years of age and younger who had undergone CT imaging of the neck, chest, abdomen or pelvis between 2006 and 2016. Only studies performed with intravenous contrast agent were included. We reviewed CT reports to identify studies in which at least one enlarged lymph node measuring 10–20 millimeters (mm) in maximum short-axis diameter was present. We then reviewed the electronic medical records of all patients and determined the underlying cause of lymphadenopathy based on provider notes and clinical data.

Malignant diagnoses were based on histology from previous biopsy or surgical resection. Lymphadenopathy was considered malignant in cases where lymph node biopsy or resection demonstrated the presence of tumor cells by pathology report. In patients with histology-proven malignancy, enlarged lymph nodes without histological diagnosis were also considered malignant if they (1) demonstrated high avidity by 18F-FDG PET or iodine-123 metaiodobenzylguanidine (MIBG) scintigraphy that was similar in intensity to the primary malignancy, and (2) decreased in size following systemic cancer therapy. All malignant lymph nodes were selected and measured on CT scans performed prior to therapy initiation. Lymph nodes were classified as benign if the lymphadenopathy was present in the context of confirmed inflammatory bowel disease (IBD) or acute appendicitis based on endoscopic mucosal bowel biopsy or surgical appendectomy, respectively. To exclude malignancy in these patients, we obtained clinical follow-up for at least 12 months (mean duration 30.8 months) after imaging to ensure that malignancy was not subsequently diagnosed and did not account for the lymphadenopathy.

All CT images were acquired on a 16- or 64-slice multidetector CT scanner (LightSpeed; GE Healthcare, Waukesha, WI) following administration of intravenous contrast agent. Five-millimeter slice thickness was used for all studies. CT dose parameters followed a previously published standard institutional pediatric low-dose algorithm that accounts for patient size and indication [19], with statistical iterative reconstruction applied (adaptive statistical iterative reconstruction, GE Healthcare).

Lymph node selection and textural analysis

Fellowship-trained body radiologists (A.M.C., abdominal imaging fellow; A.K., 2 years of post-fellowship experience; S.M., 5 years of post-fellowship experience; A.T., 4 years of post-qualification experience) reviewed CT images and identified lymph nodes measuring 10–20 mm in maximum short-axis dimensions based on the original radiology report and confirmed by the axial image containing the widest lymph node diameter. We then manually placed a single region of interest (ROI) contouring the periphery of the lymph node. We performed lymph node selection and ROI placement in a blinded fashion with respect to lymph node final diagnosis.

TA was performed on the identified lymph nodes on 12-bit images (4,096 gray levels) using TexRAD (Feedback Medical, Cambridge, UK), a commercially available software program utilized in previous TA studies [18, 20,21,22]. TA was performed for each lymph node both with and without image filtration by a spatial band-pass filter that extracts and enhances characteristics of varying sizes and intensity differences (spatial scale filter; SSF) [18, 21, 23,24,25]. Histograms of intensity values for each node were then quantitatively analyzed for various TA features including mean (average value of the pixels within the ROI), standard deviation (SD: a measure of how much variation or dispersion exists from the mean), entropy (a measure of irregularity), mean value of positive pixels (MPP: average value of all pixels with positive values), skewness (a measure of the asymmetry of the histogram) and kurtosis (a measure of the peakedness of the histogram) [24,25,26].
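The histogram features listed above can be illustrated with a minimal sketch. It uses standard textbook definitions (population SD, Shannon entropy in bits, excess kurtosis); these are assumptions, because the exact formulas implemented in TexRAD are not stated here. The function name `histogram_features` and its input, a flat list of ROI pixel values, are hypothetical.

```python
import math
from collections import Counter


def histogram_features(pixels):
    """First-order texture features for a flat list of ROI pixel values.

    Assumed textbook definitions; not necessarily those used by TexRAD.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    # Population variance and standard deviation of the ROI values
    variance = sum((p - mean) ** 2 for p in pixels) / n
    sd = math.sqrt(variance)
    # Shannon entropy (bits) of the gray-level histogram
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Mean of positive pixels: average of values strictly greater than zero
    positives = [p for p in pixels if p > 0]
    mpp = sum(positives) / len(positives) if positives else 0.0
    if sd > 0:
        # Standardized third and fourth central moments
        skewness = sum((p - mean) ** 3 for p in pixels) / (n * sd ** 3)
        kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * sd ** 4) - 3.0
    else:
        skewness = kurtosis = 0.0
    return {
        "mean": mean,
        "sd": sd,
        "entropy": entropy,
        "mpp": mpp,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


if __name__ == "__main__":
    # Toy values for illustration only, not patient data
    print(histogram_features([-1, 0, 1, 2]))
```

In this sketch a perfectly symmetric histogram yields a skewness of zero, and a histogram whose values are spread evenly over many gray levels yields higher entropy than one concentrated in a few.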

Statistical analysis

We performed the Student’s t-test to determine significance of differences between continuous variables. We also performed bivariate logistic regression modeling to identify texture features associated with malignancy, with chi-square evaluation of the Wald Z-statistic used to assess statistical significance of predictors. Stepwise logistic regression analysis was also performed with optimal model selection based on Akaike Information Criterion (AIC) minimization. The AIC is an established estimator of the relative quality of statistical models for a given set of data, with AIC minimization reflecting minimal loss of information associated with a given model and thus higher model quality [27].
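For reference, the criterion is conventionally written as follows, where the symbols k (the number of estimated parameters in the model) and \hat{L} (the maximized value of the model likelihood) are introduced here for illustration and do not appear elsewhere in this article:

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

Among candidate models fitted to the same data, the model with the lowest AIC is preferred; the 2k term penalizes each additional predictor, so a texture feature is retained only if it improves the likelihood enough to offset that penalty.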

Receiver operating characteristic (ROC) analysis of log likelihood ratios for the variables in the optimal regression model was then used to determine the optimum threshold values for distinguishing benign and malignant lymph nodes, with test performance characteristics calculated for different threshold values. We then applied the McNemar test to allow for comparison of the sensitivity, specificity and accuracy of TA in distinguishing benign from malignant lymphadenopathy when compared to the use of lymph node attenuation as the determinant. For all statistical tests, P<0.05 was considered significant. All statistical analysis was performed using Stata version 13 (StataCorp, College Station, TX).
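The paired comparison made by the McNemar test can be sketched as below. This is an illustration under stated assumptions: it uses the exact (binomial) form of the test, whereas the variant computed in Stata for this study is not specified, and the names `mcnemar_exact`, `b` and `c` are hypothetical. Here `b` is the number of lymph nodes classified correctly by the first method only, and `c` the number classified correctly by the second method only.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant pair counts.

    Under the null hypothesis, each discordant pair is equally likely to
    favor either method, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Probability of observing k or fewer discordant pairs in the smaller cell
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail and cap at 1 for the two-sided p-value
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical discordant counts for illustration, not study data
    print(mcnemar_exact(0, 6))
```

Only the discordant pairs contribute to the test; lymph nodes that both methods classify correctly, or both classify incorrectly, do not affect the p-value.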

Results

We identified a total of 63 CTs demonstrating lymphadenopathy and sufficient clinical data for categorization during the study period, including 34 patients with malignant lymph nodes and 29 patients with benign lymph nodes (secondary to appendicitis or inflammatory bowel disease). Mediastinum and retroperitoneum were the most common sites for malignant lymphadenopathy in this study cohort (7 cases each), while mesenteric lymphadenopathy constituted a significant majority of the cases of benign lymphadenopathy (19 cases). The overall study cohort comprised 40 males and 23 females, with average ages of 12.4 years and 12.1 years at the time of CT acquisition in the malignant and benign groups, respectively. Patient demographics and underlying diagnoses are summarized in Table 1. Mean follow-up of the patients in the benign group was 30.8 months.

Table 1 Demographics of study patients, including gender, age at CT acquisition, location of lymphadenopathy and underlying primary diagnosis

Each lymph node had TA performed on the primary unfiltered image as well as the image after 2-mm spatial scaling filtering (SSF2) to highlight clustered features (Fig. 1). The SSF2 filter performed best among the different filter settings in distinguishing benign from malignant lymphadenopathy (SSF1–6; other data not shown). Malignant and benign lymph nodes demonstrated significantly different TA parameters of mean, SD, MPP and skewness on the unfiltered image data and significantly different TA parameters of mean, entropy, MPP and skewness on the filtered image data (Table 2).

Fig. 1

Axial contrast-enhanced CT images show examples of benign (a, right inguinal lymphadenopathy in a 9-year-old boy with Crohn disease) and malignant (b, left cervical lymphadenopathy in a 10-year-old boy with Hodgkin lymphoma) lymphadenopathy with regions of interest (ROI) applied using TexRAD to allow calculation of texture quantification parameters with and without the use of a 2-mm filter. MPP mean value of positive pixels, SD standard deviation

Table 2 Univariate analysis using the Student’s t-test identifies texture quantification parameters that exhibit significant differences in benign and malignant lymphadenopathy, with and without 2-mm image filtration

We then performed bivariate regression analysis on these TA variables individually as predictors of lymph node malignancy. Several TA variables (mean, SD, MPP and skewness) were individually significant predictors of lymph node malignancy based on unfiltered image data, while mean, entropy and skewness were significant predictors based on filtered image data (Table 3). Stepwise logistic regression was then performed on TA features in aggregate (Table 3) to identify the best predictive model of lymph node malignancy based on AIC minimization. A model including mean and MPP values performed best for the unfiltered image data (goodness-of-fit C-statistic [AUC] of 0.875), while a model including the TA features of mean and entropy performed best for the filtered image data (C-statistic [AUC] of 0.907).

Table 3 Univariate regression analysis was performed using the Wald Z-statistic on the texture quantification parameters demonstrated in Table 2 to be significantly different between benign and malignant lymphadenopathy to identify those with malignant predictive potential for both unfiltered and filtered lymph nodes

We then calculated optimum threshold values for defining malignancy from the logistic regression models using receiver operating characteristic analysis of log likelihood ratios (LLRs) for the TA variables in each model (Table 4). With these threshold values (LLR>−0.03 for unfiltered image data; LLR>0.73 for filtered image data), the unfiltered image data analysis exhibited 88.2% sensitivity, 72.4% specificity and 81.0% accuracy (percentage correctly classified) for detecting malignant lymph nodes. Filtered image data analysis exhibited 82.4% sensitivity, 86.2% specificity and 84.1% accuracy for detecting malignant lymph nodes.
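The reported performance figures can be reproduced arithmetically. In the sketch below, the confusion-matrix counts are not reported in the text; they are back-calculated (an assumption) from the stated percentages and the group sizes of 34 malignant and 29 benign lymph nodes. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as fractions."""
    sensitivity = tp / (tp + fn)          # malignant nodes correctly flagged
    specificity = tn / (tn + fp)          # benign nodes correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Back-calculated counts (assumption): 34 malignant and 29 benign nodes
unfiltered = diagnostic_metrics(tp=30, fn=4, tn=21, fp=8)
filtered = diagnostic_metrics(tp=28, fn=6, tn=25, fp=4)

if __name__ == "__main__":
    for name, metrics in (("unfiltered", unfiltered), ("filtered", filtered)):
        sens, spec, acc = (round(100 * m, 1) for m in metrics)
        print(f"{name}: sensitivity {sens}%, specificity {spec}%, accuracy {acc}%")
```

Under these assumed counts, the filtered model trades two additional missed malignant nodes for four fewer false positives, which accounts for its higher overall accuracy.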

Table 4 Optimum threshold values for defining malignancy derived from the logistic regression models using receiver operating characteristic analysis of log likelihood ratios (LLRs) for the textural analysis variables in each model

Finally, we compared the performance of CT textural analysis to the performance of mean lymph node Hounsfield unit (HU) attenuation value without TA for determining lymph node malignancy. Based on ROC analysis of the unfiltered lymph node data, the optimum CT attenuation threshold for detecting malignancy was a mean HU threshold greater than 29.8, which yielded 100% sensitivity, 6.9% specificity and 57.1% accuracy (Table 4). Both TA-derived thresholds, whether based on unfiltered or filtered image data, were statistically superior in specificity and accuracy to the non-TA-based attenuation threshold (P<0.05, McNemar test, Table 4). To illustrate the varying outcomes, individual cases assessed using mean lymph node HU alone, which demonstrate true-positive, true-negative, false-positive and false-negative results compared to tissue diagnosis, are shown in Fig. 2.

Fig. 2

Sample application of textural analysis in cases of benign and malignant lymphadenopathy on axial contrast-enhanced CT images. Right paratracheal lymphadenopathy in a 10-year-old boy (a) and portocaval lymphadenopathy in a 14-year-old girl (b), both with Hodgkin lymphoma. Mesenteric lymphadenopathy in a 7-year-old boy (c) and a 6-year-old girl (d), both with acute appendicitis. A Hounsfield unit (HU) value of 79.2 was the threshold used when textural analysis (TA) was applied to distinguish benign from malignant lymphadenopathy. FN false negative, FP false positive, TN true negative, TP true positive

Discussion

Accurate characterization and management of lymphadenopathy in children is challenging because distinguishing benign from malignant lymph nodes cannot be reliably achieved using size criteria [28,29,30]. Other non-size-based imaging techniques have been applied to lymph node characterization, with 18F-FDG PET being the most widely used imaging technique [31, 32]. However, 18F-FDG PET cost and ionizing radiation limit its routine use in children and in clinical practice, and it is therefore typically reserved for children with established malignancy. This study evaluates CT textural analysis as a novel tool to differentiate benign from malignant lymph nodes in pediatric patients, a tool that has not been explored previously in the published literature. Advantages of this technique include the fact that CT is routinely used in pediatric patients to assess distribution of lymphadenopathy initially detected by US or physical exam. Recent technology advances have made CT imaging fast enough to be performed in almost all children awake without the need for sedation. We restricted our analysis to lymph nodes between 10 mm and 20 mm in short-axis diameter, which is a size range that represents lymphadenopathy that is indeterminate for malignancy based on size alone. Also, recent hardware and image reconstruction innovations have led to substantial reduction in CT ionizing radiation doses in children [33]. Part of this dose reduction involves the use of iterative reconstruction algorithms to restore diagnostic image quality in low-dose image datasets. It is known that the iterative reconstruction algorithm used can impact textural analysis [33, 34]. In our study, we limited CT studies to images reconstructed with statistical-based reconstruction algorithms, to minimize this variation. TA was also performed on images of a uniform thickness (5 mm).

Our results indicate that TA can achieve a sensitivity and specificity of 88.2% and 72.4%, respectively, in the absence of a filter, and 82.4% and 86.2%, respectively, with additional image filtration, for lymph node categorization in a pediatric population, significantly improving on the performance of mean CT attenuation alone. Because CT textural analysis examines patterns of signal intensity on a per-voxel basis, it can extract imaging features regarding the distribution of voxel signal intensities within a lymph node that are beyond radiologist visual image interpretation. Similarly, CT textural analysis can also highlight clustering of voxel signal intensities through spatial scaling filtration. The use of both CT textural analysis filtering and heterogeneity analysis resulted in improved performance of lymph node characterization compared with mean attenuation values alone. Stepwise logistic regression demonstrated that decreased MPP and increased entropy are TA features that combine with lower mean HU attenuation to best model lymph node malignancy. Our results suggest that tumor cell replacement of lymph nodes leads to both reduced and more heterogeneous enhancement compared with benign reactive lymph nodes.

This improved ability to exclude malignancy in children with lymphadenopathy could reduce the need for additional imaging tests and biopsies to follow enlarged lymph nodes if further clinical application of TA is shown to be practical. This is particularly relevant in young children who could be spared the ionizing radiation dose associated with 18F-FDG PET as well as potential complications associated with percutaneous or surgical biopsy. One potential implication for our study is a new role for artificial intelligence in the evaluation of lymph nodes seen on CT. We believe that our results provide validation for the use of artificial intelligence in lymph node characterization that would augment radiologist interpretation of CT examinations.

Our study has limitations, including its retrospective nature as well as the relatively small numbers of lymph nodes included. We specifically only included lymph nodes within a size range that was indeterminate for malignancy (10–20 mm maximum short-axis diameter) and had a clearly defined diagnosis of benignity or malignancy, which led us to exclude many lymph nodes from consideration. Lymph nodes of this size were selected because they were thought to represent the greatest diagnostic challenge given the significant overlap between malignant and benign etiologies in this range. We note that the anatomical distribution of benign lymph nodes (all within the abdomen) was more limited compared with the malignant lymph node cohort (7/34 mediastinal). This was a result of our selection of two abdominal diseases (acute appendicitis and IBD) associated with lymphadenopathy as the source of benign lymph nodes. Given that no significant difference in TA features was observed between the mediastinal and retroperitoneal nodes in the malignant group, we do not believe that anatomical location significantly impacts TA results. We also acknowledge that not all lymph nodes had histological confirmation of diagnosis, particularly in the benign lymph node group. We intentionally chose appendicitis and IBD cases because of their high incidence of benign lymphadenopathy. There is a small risk of bowel adenocarcinoma in people with IBD, primarily in adults [34], but none of the children with IBD in our cohort demonstrated bowel malignancy on follow-up. In more challenging scenarios, such as children with known malignancy and concomitant infectious/inflammatory conditions, we believe TA would be beneficial but this would require future investigation. Finally, we performed textural analysis of lymph nodes based on a single manually segmented ROI without blinding of the study participants. Future studies with volumetric analysis tools are needed to perform TA of lymph nodes in their entirety. In addition, the financial costs and added time associated with routine implementation of CT textural analysis have not been considered.

Conclusion

This study demonstrates the ability of textural analysis to distinguish benign from malignant lymphadenopathy on CT imaging in children. If shown to be clinically applicable, this has the potential to reduce the need for serial follow-up imaging and tissue sampling. The addition of textural analysis to pediatric CT protocols could ultimately aid in the characterization of indeterminate lymph nodes incidentally detected on imaging. However, future studies are needed to confirm the clinical applicability of TA in distinguishing benign from malignant lymphadenopathy.