The role of high-resolution computed tomography (HRCT) as a diagnostic and prognostic tool in idiopathic pulmonary fibrosis (IPF) is well recognized, as is the importance of quantifying the extent of parenchymal abnormalities on HRCT in both clinical practice and clinical trials [1]. Nevertheless, purely visual scoring and assessment of interstitial lung disease (ILD) are affected by high inter- and intra-observer variability and poor sensitivity to subtle changes [2].

Recently, many automated tools have been proposed to overcome these limitations, aiding the radiologist in diagnosing IPF and in assessing prognosis and disease progression [3]. One such tool, CALIPER (Computer-Aided Lung Informatics for Pathology Evaluation and Rating), was developed by the Biomedical Imaging Resource Laboratory at the Mayo Clinic Rochester (Rochester, MN, USA) in 2007 to characterize and quantify lung abnormalities on HRCT, and was subsequently implemented and tested in different interstitial lung diseases [4].

Jiao et al. conducted a systematic review published in this issue of European Radiology, focusing on the relationship between CALIPER-derived parameters and pulmonary function tests (PFTs) in patients with IPF. The review included nine studies that examined the predictive value of CALIPER-derived parameters for mortality, outcome stratification, and disease progression in IPF patients [5].

To understand how CALIPER might be used in different clinical settings, it is useful to highlight some technical aspects of the software. CALIPER is a texture analysis tool, based on histogram signature mapping techniques, that recognizes the main interstitial lung abnormalities as defined by the Lung Tissue Research Consortium (LTRC): low attenuation areas, ground-glass opacities, honeycombing, and reticular pulmonary infiltrates [4]. Through the LTRC, 976 volumes of interest of 15 × 15 × 15 pixels from HRCT scans of patients with pathologically confirmed ILD were selected by a consensus of expert radiologists to identify the canonical histogram signatures for each class of visual abnormality and to create the training sets. The signatures of each visual class of interstitial lung abnormality were then used for the volumetric classification of the HRCT data and to create the reference dataset against which each new CT is compared [6]. The total ILD score is the sum of the ground-glass, reticular abnormality, and honeycombing volumes; the ILD percentage score is the ratio of the total ILD score to the total lung parenchymal volume [6].
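
The two ILD scores described above reduce to simple arithmetic, which can be sketched as follows. This is an illustrative calculation only and not CALIPER code: the function name, the parameter names, the choice of millilitres as the unit, and the example volumes are all assumptions introduced here.

```python
def ild_scores(ground_glass_ml, reticular_ml, honeycombing_ml, total_lung_ml):
    """Return (total ILD volume, ILD percentage score).

    Hypothetical helper: names and units (mL) are assumptions for illustration.
    """
    if total_lung_ml <= 0:
        raise ValueError("total lung parenchymal volume must be positive")
    # Total ILD score: sum of ground-glass, reticular and honeycombing volumes.
    total_ild_ml = ground_glass_ml + reticular_ml + honeycombing_ml
    # ILD percentage score: total ILD volume relative to total lung volume.
    ild_percent = 100.0 * total_ild_ml / total_lung_ml
    return total_ild_ml, ild_percent


# Made-up volumes, not patient data.
total_ild, ild_pct = ild_scores(300.0, 450.0, 50.0, 4000.0)
print(total_ild, ild_pct)  # 800.0 20.0
```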

For vascular-related structures (VRS), the software excludes the larger vessels at the hilum during lung segmentation and calculates the volume of pulmonary vessels with a diameter larger than approximately 3 mm. The VRS percentage score is defined as the ratio of the total vessel volume to the total lung parenchymal volume [7].
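
The VRS percentage score can be sketched in the same way. Again, this is only an illustration under stated assumptions: the function name, the parameter names, the unit (mL), and the example values are hypothetical, and the vessel segmentation itself is not reproduced here.

```python
def vrs_percent(vessel_volume_ml, total_lung_ml):
    """Return the VRS percentage score.

    Hypothetical helper: assumes the vessel volume has already been
    measured by the segmentation step (hilar vessels excluded).
    """
    if total_lung_ml <= 0:
        raise ValueError("total lung parenchymal volume must be positive")
    # Ratio of total vessel volume to total lung parenchymal volume, in percent.
    return 100.0 * vessel_volume_ml / total_lung_ml


# Made-up volumes, not patient data.
print(vrs_percent(200.0, 4000.0))  # 5.0
```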

Jiao et al. [5] summarize the different roles of CALIPER in patients with IPF.

First, five studies assessing the association between CALIPER-derived parameters and PFTs at baseline were evaluated; they demonstrated a strong correlation of both CALIPER-ILD and CALIPER-VRS with forced vital capacity (FVC). The same results were confirmed at the 1-year follow-up.

Regarding the annual variation of CALIPER-derived parameters, two studies demonstrated a significant increase in ground-glass opacity and reticulation in IPF patients.

Jiao et al. [5] then reviewed the role of CALIPER in predicting mortality and long-term outcome, underlining that both CALIPER-VRS and CALIPER-ILD predict mortality, with CALIPER-VRS being a strong and independent predictor of survival and of long-term outcome. Longitudinal changes in CALIPER-honeycombing and CALIPER-VRS also correlate with mortality. As highlighted previously, CALIPER-VRS is the most promising CALIPER parameter for assessing IPF patients at baseline and during follow-up.

CALIPER might also be used to stratify outcome in IPF patients, one of the main goals in clinical practice: when incorporated into the Gender-Age-Physiology (GAP) score, it gave better results than the standard GAP score. Moreover, a threshold of CALIPER-ILD ≥ 20% and CALIPER-VRS ≥ 20% stratifies mortality in IPF patients, whereas a threshold of CALIPER-ILD ≥ 20% and CALIPER-VRS ≥ 5% predicts FVC reduction during follow-up.
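
As an illustration of how such cut-offs might be applied to a pair of percentage scores, the sketch below checks each value against the thresholds quoted above. It is not a validated clinical rule: the function and constant names are hypothetical, and because the text does not specify how the two parameters are combined, each flag is reported separately rather than merged into a single risk class.

```python
# Cut-offs exactly as quoted in the text (percent of total lung volume).
MORTALITY_CUTOFFS = {"ild": 20.0, "vrs": 20.0}
FVC_DECLINE_CUTOFFS = {"ild": 20.0, "vrs": 5.0}


def threshold_flags(ild_percent, vrs_percent):
    """Report, per parameter, whether each quoted cut-off is met.

    Hypothetical helper for illustration; it does not combine the flags.
    """
    scores = {"ild": ild_percent, "vrs": vrs_percent}
    return {
        "mortality": {k: scores[k] >= c for k, c in MORTALITY_CUTOFFS.items()},
        "fvc_decline": {k: scores[k] >= c for k, c in FVC_DECLINE_CUTOFFS.items()},
    }


# Made-up scores, not patient data.
flags = threshold_flags(ild_percent=22.0, vrs_percent=6.0)
print(flags["mortality"])    # {'ild': True, 'vrs': False}
print(flags["fvc_decline"])  # {'ild': True, 'vrs': True}
```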

Following early papers that evaluated the effectiveness of CALIPER-derived parameters through their correlation with PFTs, this important review summarizes the main contributions that CALIPER could make to the management of IPF patients, in particular the role of the software in predicting mortality and prognosis at baseline. Indeed, given the unpredictable course of the disease, the ability to stratify patients at the first CT evaluation is important for the clinical management of IPF. Furthermore, the paper underlines the software's performance in assessing disease progression in both treated and untreated patients. Therefore, given the poor reproducibility and time-consuming nature of visual scoring, this tool may be helpful in clinical practice for monitoring disease progression.

The review also underlines the main limitations of the automated software, such as the need to build large databases so that results obtained in single studies can be validated in large cohorts, and the technical issues caused by differing reconstruction kernels and section thicknesses.

Future CALIPER studies, and quantitative analysis studies more generally, should aim to overcome these limitations and to determine how sensitive the software is to the minimal changes relevant for assessing disease progression, which is currently defined by PFTs.