Introduction

Arterial calcification is a characteristic of advanced atherosclerosis [1]. In general, arterial calcification occurs at two anatomical sites in the arterial wall, the media and the intima. The intimal calcification, which is seen as patchy scattered deposits occurring within atherosclerotic plaques, has been shown to be associated with plaque vulnerability [2, 3].

Several imaging techniques have been employed to detect and quantify arterial calcification in routine clinical work. Plain radiography has traditionally been used to detect arterial calcification, especially aortic calcification. Aortic calcification detected on radiography of the prelumbar region or thorax constitutes an independent risk factor for cardiovascular morbidity and mortality [4–8]. On the other hand, quantification of coronary artery calcification (CAC) predicts the likelihood of future coronary heart disease events [9–12]. Currently, multi-detector row computed tomography (MDCT) is considered the gold standard for evaluating CAC. CAC quantification by CT allows for risk stratification as well as disease monitoring [13, 14].

As atherosclerosis is a systemic disease, the presence of aortic arch calcification (AAC) is associated with CAC, which has been demonstrated by MDCT of the thorax [15, 16]. However, CT is expensive, requires a high radiation dose and is only performed for dedicated indications. In contrast, the most frequently applied radiological imaging technique in patients with suspected coronary artery disease is still chest radiography, which is widely used and considered a diagnostic baseline imaging technique. Direct assessment of CAC on chest radiography permits neither reproducible detection nor exclusion of CAC [17]. AAC, however, can easily be noted and quantified on chest radiography [5, 8]. If AAC as determined by chest radiography correlates with CAC, AAC may be employed as a surrogate marker of CAC and possible coronary artery disease. Hence, the purpose of this study was to evaluate the association between AAC detected on chest radiography and CAC as determined by MDCT.

Materials and methods

Patient population

The local institutional review board approved this retrospective study and waived the requirement for written informed consent. Patients who had undergone unenhanced cardiac CT with CAC evaluation and chest radiography within 6 months of each other between March 2005 and July 2010 were eligible for inclusion in this study. The total number of CAC examinations during this time span was 214. Exclusion criteria were interference from pacemakers and coronary bypass grafting staples on CT. The final study population consisted of 128 patients (53 female, 75 male; mean age 69.3 ± 14.7 years). Indication for CT imaging was risk stratification in suspected coronary heart disease (n = 81) or assessment of aortic root anatomy in patients evaluated for transcatheter aortic valve implantation (n = 47). The latter patients were included on purpose to avoid bias caused by including only patients with an increased likelihood of coronary artery disease. Chest radiography was performed preoperatively in 54 patients and for assessment of suspected cardiovascular and/or pulmonary disease in 74 patients.

Aortic arch calcification scoring from chest radiography

Chest radiographs (postero-anterior, n = 98; antero-posterior, n = 30) were reviewed by two independent observers (observer A, 4 years of experience; observer B, 12 years of experience). For assessment of intraobserver agreement, observer A repeated the review of all chest radiographs after a 4-week interval. Observers were blinded to patient-identifying information and to results of CT calcium scoring. First, AAC was defined as a dichotomous variable: either present or absent. Then, AAC was graded semi-quantitatively on a four-point scale: grade 0 = no visible calcification; grade 1 = small spots of calcification or a single thick area of calcification of the aortic arch; grade 2 = two thick areas of calcification; grade 3 = three or more areas of thick calcification or more than quadrant calcification of the aortic knob (Fig. 1a).

Fig. 1

Four-point grading scale of aortic arch calcifications (AAC) and distribution of AAC grade on chest X-ray. a The extent of AAC on conventional chest X-ray was divided into four grades: grade 0 = no visible calcification; grade 1 = small spots of calcification or a single thick area of calcification of the aortic knob; grade 2 = two thick areas of calcification; grade 3 = three or more areas of thick calcification or more than quadrant calcification of the aortic knob. b Distribution of AAC grade determined by reviewing chest X-rays in all 128 subjects

CT: image acquisition

Computed tomography examinations were performed in 81 patients using a 64-slice multi-detector CT system (Brilliance, Philips Medical Systems, Best, the Netherlands) equipped with a standard cardiac reconstruction and postprocessing package. Imaging parameters were as follows: tube voltage, 140 kV; effective tube current, 300 mAs; collimation, 64 × 0.625 mm; pitch, 0.2; rotation time, 0.5 s.

In 47 patients CT examinations were performed using a dual-source multi-detector CT system (SOMATOM Definition, Siemens Health Care Sector, Forchheim, Germany). CT parameters were as follows: tube voltage, 120 kV; tube current time product 80 mAs/rotation; collimation, 2 × 32 × 0.6 mm; slice acquisition 2 × 64 × 0.6 mm (z-flying focal spot technique), prospective ECG-gating; rotation time, 0.33 s.

All CTs were performed without injection of contrast material. Examinations were performed during inspiratory breath hold. Images were reconstructed with 3-mm slice thickness, 1.5-mm reconstruction increment and a medium kernel (kernel B).

Coronary artery calcium scoring

Image evaluation was performed using semi-automatic software (“Heartbeat CS”, MxView 5.0.1, Philips Medical Systems in 81 patients and “Calcium Scoring”, Siemens Healthcare Sector in 47 patients). All pixels with a density above a defined threshold (130 HU) were automatically colour marked. Lesions were selected manually and the software recognised the lesions on subsequent images. From the selected areas, the software calculated the lesion volume (in cubic millimetres) and CAC score for each patient according to the Agatston method [18]. For further analysis, CAC was defined as a dichotomous variable: either present or absent. Then CAC scores were categorised into the following four grades: grade 0, CAC score of 0–10; grade 1, CAC scores of 11–100; grade 2, CAC scores of 101–400; grade 3, CAC scores of greater than 400 [12].
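
The four CAC grades described above amount to a simple threshold mapping of the Agatston score. The following Python sketch is illustrative only and is not part of the scoring software used in the study; the function name cac_grade is hypothetical, and the handling of non-integer scores between category bounds (for example 10.5, assigned here to grade 1) is an assumption, since the text lists integer ranges only.

```python
def cac_grade(agatston_score: float) -> int:
    """Map an Agatston CAC score to the four grades used in this study.

    Grade 0: 0-10, grade 1: 11-100, grade 2: 101-400, grade 3: > 400.
    Assumption: scores falling between the listed integer bounds are
    assigned to the next higher grade.
    """
    if agatston_score < 0:
        raise ValueError("Agatston score cannot be negative")
    if agatston_score <= 10:
        return 0
    if agatston_score <= 100:
        return 1
    if agatston_score <= 400:
        return 2
    return 3


def cac_present(agatston_score: float) -> bool:
    """Dichotomous variable: any score above zero counts as CAC present."""
    if agatston_score < 0:
        raise ValueError("Agatston score cannot be negative")
    return agatston_score > 0
```

Treating any score above zero as "present" is likewise an assumption, as the text does not state the threshold used for the dichotomous variable.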

Statistical analysis

The role of the four-point grading system of AAC in conventional chest radiography was assessed as a potential surrogate marker of coronary artery disease with CAC scoring by MDCT considered as the standard of reference.

Discrete variables are given as numbers and percentages, continuous variables as means ± standard deviation or means (95 % confidence intervals). Intraobserver agreement was calculated from the two AAC readings of observer A. Interobserver agreement was calculated from the first AAC reading of observer A and the AAC reading of observer B. We calculated the mean AAC grade of the three AAC readings (first and second readings of observer A and the reading of observer B) for further statistical analyses. Agreement between the calculated mean AAC grading and CAC scores was calculated by using weighted κ statistics, with linear weightings for rating differences. A κ value over 0.81 was considered excellent agreement, whereas a κ value of 0.61 to 0.80 was considered good agreement [19]. Data distribution was assessed. Non-normally distributed data were log transformed to obtain a normal distribution. After the statistical analyses, log-transformed values were back-transformed. CAC scores among the four AAC gradings were compared using analysis of variance (F test) and polynomial contrast tests. For AAC, cut-offs between grades 0 and 1 and between grades 2 and 3 were used to compute sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and accuracy. Receiver operating characteristic (ROC) curves were generated for both cut-off values. A P value less than 0.05 (two-tailed) was considered statistically significant. Nominal P values are reported without adjustment for multiplicity. All statistical analyses were carried out using SAS 9.2 software.
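
For readers unfamiliar with the linearly weighted κ statistic mentioned above, the following Python sketch shows one way to compute it for two sets of ordinal grades. It is illustrative only and does not reproduce the SAS procedure used in the study; the function name linear_weighted_kappa is hypothetical, and the sketch assumes integer grades coded 0 to n_categories − 1.

```python
from collections import Counter


def linear_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1).

    ratings_a, ratings_b: equal-length sequences of integer grades
    in the range 0 .. n_categories - 1 (assumed coding).
    """
    if n_categories < 2:
        raise ValueError("need at least two categories")
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)
    pair_counts = Counter(zip(ratings_a, ratings_b))  # observed cell counts
    marginal_a = Counter(ratings_a)                   # row totals
    marginal_b = Counter(ratings_b)                   # column totals

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = abs(i - j) / (n_categories - 1)
            observed_disagreement += weight * pair_counts[(i, j)] / n
            expected_disagreement += (
                weight * (marginal_a[i] / n) * (marginal_b[j] / n)
            )

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return 1.0 - observed_disagreement / expected_disagreement
```

With linear weights, a disagreement of one grade is penalised one third as much as a disagreement across the full four-point scale, which is why this statistic suits ordinal gradings better than unweighted κ.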

Results

The mean CAC score as determined by MDCT was 487 (range 0–3,339). Scores of 0–10 were found in 38 patients, scores of 11–100 in 20 patients, scores of 101–400 in 28 patients and scores greater than 400 in 42 patients. Polynomial testing revealed a significant linear trend between age and the amount of CAC (P < 0.001).

AAC grading on conventional chest radiography classified 52 patients with grade 0, 32 patients with grade 1, 23 patients with grade 2 and 21 patients with grade 3 (Fig. 1b). Excellent intraobserver and good interobserver agreement of AAC grading was noted, with weighted κ statistics of 0.82 (95 % CI 0.75–0.89) and 0.75 (95 % CI 0.68–0.82), respectively.

Moderate agreement between the calculated mean AAC grading of the three readings and CAC scores was found, with a weighted κ statistic of 0.46 (95 % CI 0.36–0.56). Analyses of CAC scores of the four different AAC grading groups revealed a mean CAC score of 8.0 (95 % CI 4.6–13.8) for AAC grade 0, 125.1 (95 % CI 58.9–265.8) for AAC grade 1, 292.3 (95 % CI 133.8–638.7) for AAC grade 2 and 1,048.2 (95 % CI 429.9–2,555.7) for AAC grade 3 (Fig. 2). There was a linear trend between four-point AAC grading and CAC scoring (P < 0.0001), and significant differences in CAC scores between AAC grades 0 and 1 (P < 0.0001), as well as between AAC grades 2 and 3 (P = 0.0368), but not between AAC grades 1 and 2 (P = 0.1282).

Fig. 2

Significant positive linear trend between four-point AAC grading and CAC score. Polynomial testing revealed a linear trend between CAC and AAC grades (P < 0.0001). Despite the overall significant trend, there was no significant difference between mean values of CAC scores of AAC grade 1 versus 2. Filled circles indicate mean values, and bars indicate 95 % confidence intervals

As four-point grading of AAC revealed no significant difference between the CAC scores of AAC grades 1 and 2, we pooled AAC grades 1 and 2 to obtain a three-point grading scale for further analyses. Analyses of the three-point AAC grading revealed a linear trend between AAC grades and CAC scoring (P < 0.0001), and a significant difference in CAC scores among all three resulting AAC grades: AAC grade 0 versus grades 1–2 (P < 0.0001), grades 1–2 versus grade 3 (P = 0.0017) and grade 0 versus grade 3 (P < 0.0001; Fig. 3). The mean CAC score was 8.0 (95 % CI 4.6–13.8) for AAC grade 0, 188.3 (95 % CI 109.1–324.8) for pooled AAC grades 1 and 2, and 1,048.2 (95 % CI 427.8–2,568.0) for AAC grade 3.

Fig. 3

Significant difference in mean CAC scores of three-point AAC grading. Pooling of AAC grades 1 and 2 revealed not only a linear trend (P < 0.0001) of the resulting three-point AAC grading and CAC score, but also a significant difference between the CAC scores of all three AAC grades: grade 0 versus grades 1–2 (P < 0.0001), grades 1–2 versus grade 3 (P = 0.0017) and grade 0 versus grade 3 (P < 0.0001). Filled circles indicate mean values, and bars indicate 95 % confidence intervals

We pooled AAC grades 0, 1 and 2 and compared them with AAC grade 3 to assess the diagnostic performance of AAC grading to correctly identify patients with CAC scores greater than 400. The mean CAC score for pooled AAC grades 0–2 was 38.8 (95 % CI 24.2–62.3) and significantly lower (P < 0.001) than for AAC grade 3 with a mean CAC score of 1,048.2 (95 % CI 349.3–3,145.1; Fig. 4a). ROC analyses revealed an area under the curve (AUC) of 0.675 (95 % CI 0.600–0.751) (Fig. 4b). For identification of a CAC score greater than 400, the cut-off between AAC grades 0–2 and 3 yielded 17 true-positive, 81 true-negative, 3 false-positive and 27 false-negative classifications. Sensitivity was 38.6 %, specificity 96.4 %, PPV 85.0 %, NPV 75.0 % and accuracy 76.6 %.
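
The diagnostic performance figures for the cut-off between AAC grades 0–2 and 3 can be recomputed from a 2 × 2 table. The following Python sketch is illustrative only and was not part of the study's SAS analysis; the function name diagnostic_metrics is hypothetical, and the counts used (17 true positives, 81 true negatives, 3 false positives, 27 false negatives) are those that reproduce the reported percentages.

```python
def diagnostic_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Return standard diagnostic accuracy measures as fractions (0..1)."""
    if min(tp, tn, fp, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0 or tn + fn == 0:
        raise ValueError("a row or column of the 2 x 2 table is empty")
    total = tp + tn + fp + fn
    return {
        "sensitivity": tp / (tp + fn),   # diseased correctly flagged
        "specificity": tn / (tn + fp),   # non-diseased correctly cleared
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,
    }


# AAC grade 3 as the "positive" test for a CAC score greater than 400
metrics = diagnostic_metrics(tp=17, tn=81, fp=3, fn=27)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f} %")
```

The low sensitivity follows directly from the 27 patients with a CAC score greater than 400 who did not show AAC grade 3, whereas the high specificity reflects that only 3 patients with AAC grade 3 had a CAC score of 400 or lower.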

Fig. 4

Line plot analyses and ROC analyses of CAC scores of pooled AAC grades 0–2 versus AAC grade 3. a Aligned line plot analyses illustrate the significant difference (P < 0.001) in CAC scores of pooled AAC grades 0–2 versus AAC grade 3. b Receiver operating characteristic (ROC) analyses derived from calculated differences of pooled AAC grades 0–2 versus AAC grade 3 (AUC = 0.675; 95 % CI 0.600–0.751)

Discussion

The extent of aortic arch calcification on conventional chest radiography and its significance are generally disregarded in routine clinical work. This study compared the extent of AAC depicted on chest radiography with CAC as determined by CT. The excellent intraobserver and good interobserver agreement of semi-quantitative AAC grading confirmed that our four-point grading of AAC, modified from the scale recently proposed by Iijima et al. [5] and Hashimoto et al. [8], is a reliable and reproducible method. We found a significant linear trend between AAC and CAC. However, when evaluating the validity of semi-quantitative AAC grading to assess CAC, the four-point grading revealed no significant difference between the CAC scores of AAC grades 1 and 2. Therefore, we introduced a three-point grading scale by pooling AAC grades 1 and 2, which revealed significant differences in CAC among the resulting three AAC grades.

It is known that the presence of CAC closely correlates with the risk of acute cardiovascular events [20, 21]. According to Rumberger et al., a negative or extremely low coronary calcium score (10 or lower) is consistent with the absence of a significant coronary obstructive lesion [12, 22]. Calcium scores of 11 to 100 are consistent with mild atherosclerotic plaque burden. In these patients active risk modification is critical and daily use of aspirin may be indicated. Calcium scores of 101 to 400 are consistent with moderate coronary plaque and a high likelihood of associated moderate non-obstructive coronary disease. In these patients lowering serum cholesterol levels should be considered and clinical risk stratification should be performed by exercise testing. Patients with coronary calcium scores higher than 400 have an advanced plaque burden with a 90 % specificity for at least one obstructive coronary lesion and are at a high risk of developing symptomatic ischaemic disease [9, 12]. Although CT-based CAC scoring is widely available, plain chest radiography remains a diagnostic baseline imaging technique in patients with suspected coronary artery disease. Unfortunately, quantifying CAC on plain chest radiographs suffers from an unacceptable rate of false-positive and false-negative findings and insufficient interobserver agreement [17]. Thus, plain chest X-rays are of no value for direct detection of CAC. However, as demonstrated in this study, assessment of AAC on chest radiography is reliable and is supported by a linear trend between AAC and CAC. Moreover, using a cut-off between AAC grades 2 and 3 yielded a specificity of 96.4 % for CAC scores greater than 400. Consequently, the presence of AAC grade 3 implies a high probability of a CAC score greater than 400. These results seem to support the proposed three-point AAC grading system on chest radiography as a tool for risk stratification in patients with suspected coronary artery disease.
These findings are in line with recent studies which demonstrated that AAC constitutes a strong independent predictor of increased cardiovascular events [5, 8]. However, we modified the grading system as described by Iijima et al., as—in our opinion—differentiating between “thin” and “thick” areas is rather difficult and grade 3 (severe calcification) is not only present in the case of circumferential calcification [5, 8].

Although calcifications are present in both “stable” and “unstable” plaques, it may be assumed that a greater calcified plaque burden increases the likelihood of both plaque subtypes. Indeed, the extent of coronary atheromatous disease remains the most powerful predictor of subsequent or recurrent cardiac events [23]. However, indirect determination of CAC by quantification of AAC on chest radiographies could help to avoid costly CTs only in the minority of patients having AAC grade 3. The radiation exposure of 0.01–0.05 mSv on conventional chest radiographies is significantly lower when compared with CAC scoring by CT [24]. However, the exact quantification of CAC by CT has to be considered when weighing the different imaging techniques. In the majority of cases having AAC grades 0–2, one cannot avoid the cost of CT.

The high specificity of 96.4 % obtained in our study population seems to support our AAC grading system for the screening of asymptomatic patients, where a high specificity is desired, even with a trade-off for sensitivity. However, it has to be kept in mind that our study population is subject to a certain bias owing to the inclusion criteria. In screening radiology, the diagnosis of an asymptomatic disease must be balanced against the need to limit unnecessary diagnostic work-up in the screened population. On the other hand, the low sensitivity of 38.6 % makes AAC grading inadvisable for confirmation of coronary artery disease in symptomatic patients [25]. However, if chest radiography has been performed in an individual, the acceptable PPV and NPV of 85.0 % and 75.0 %, respectively, indicate that a report on the presence and exact semi-quantitative grade of AAC on conventional chest radiography is justified, serving as a surrogate marker of CAC.

This retrospective study has several limitations. First, owing to its retrospective nature, the gold standard for detecting coronary artery disease, coronary angiography, was not performed. Second, there are neither data on further diagnostic procedures like stress-ECG nor follow-up data that provide a link to the cardiovascular risk in the reported patients. However, as the aim of the study was to evaluate the correlation of AAC grade on chest radiographs and CAC scoring as determined by MDCT, follow-up information is not necessarily needed. Moreover, the association between AAC (determined by MDCT) and coronary artery calcification and increased risk of death has been demonstrated in recent studies [15, 16]. Third, the chest radiographs evaluated in our study included postero-anterior views (acquired in the upright position) as well as antero-posterior views (acquired in the supine position), which are generally harder to interpret and which may have an influence on the diagnostic performance. However, this reflects the everyday practice in the clinical setting and hence should not be interpreted as a potential bias.

In conclusion, semi-quantitative grading of aortic arch calcification on chest radiography is a reliable tool and correlates positively with coronary artery calcification scoring as determined by CT. The significant differences in CAC scores as a function of a three-point AAC grading support reporting aortic arch calcifications as “not present”, “moderate” or “severe” in a comprehensive radiological report. Our results indicate that grading of AAC as “severe” strongly suggests coronary artery calcification, which supports integration of AAC grading into the risk stratification strategy in patients with suspected coronary disease.