Introduction

“Early detection”, while necessary, is insufficient to establish the efficacy of a test intended to screen for cancer. Only reduction of disease-specific mortality among a screened population can indicate efficacy. Clinical researchers with an interest in lung cancer have pursued an effective screening test for over 50 years. The National Lung Screening Trial (NLST) answered this call, randomizing more than 50,000 individuals at risk of lung cancer to either low-dose helical CT (LDCT) or chest radiography (CXR) annually for three years. The results showed that LDCT screening reduced lung cancer-specific mortality among heavy (30 pack-years or more) current or former (within 15 years) smokers between the ages of 55 and 74 from 309 to 247 deaths per 100,000 person years (relative risk 0.8) [1••]. Since then, after announcements about the cost effectiveness of screening for lung cancer with LDCT, the US Preventive Services Task Force (USPSTF) has issued draft recommendations for annual LDCT for heavy current or former smokers age 55–79 (http://www.uspreventiveservicestaskforce.org/draftrec.htm). Even before these preliminary announcements, many centers began to offer screening with LDCT. Cautious editorial writers have correctly pointed out the perils of indiscriminately offering screening on a widespread basis. These hazards are numerous, and some are potentially insidious. The objective of any screening program should, first and foremost, be to minimize harm when screening an asymptomatic population. NLST results point to the potential benefits of screening, but converting a clinical trial into clinical practice is the challenge. This review will attempt to outline the elements of that process.

Converting Efficacy into Effectiveness

The results of the NLST were robust, causing great excitement for the lung cancer community. But clinical trial efficacy cannot automatically be converted into real-world effectiveness. When trying to generalize the findings of a clinical trial, it is necessary to examine the details of the study, and ask: “Do the patients in the study resemble the patients in your examination room?” The NLST investigators anticipated this question and compared participants in the NLST with the population of eligible smokers in the US using data from the US census tobacco supplement. They found that screen-eligible subjects in the US population were less educated, slightly older, and more likely to be current smokers than the NLST study subjects [2]. Although each of these differences was small, one cannot ignore the potential effect they would have on real-world efficacy of LDCT screening. Other important questions include: “Did the setting of the study have any effect on the outcome of the study? If so, do the qualities of the study setting match your practice setting?” For example, screening programs must recognize the need for quality control of the specifications of CT imaging, the qualifications and expertise of the interpreting radiologists, and the standardized criteria for reporting abnormalities. Each of these factors was prospectively determined and closely monitored in the conduct of the NLST [3••], and the effect of these “behind the scenes” quality-control measures on the effectiveness and safety of LDCT screening should not be taken for granted.

Furthermore, although evaluation of abnormal screening findings was not standardized in the NLST, all findings were generally followed up in centers where there is significant expertise in the management of incidental or screen-detected nodules. It is a mistake to look at the data on screening in isolation from the setting in which these reductions in mortality were achieved. Anyone starting a lung cancer screening program should understand both the risks and benefits of screening, including the incidence of false-positive results and appropriate diagnostic pathways. Most nodules discovered on LDCT are benign, and the process should focus on obtaining proof that the nodule is benign with the least amount of cost and risk to the patient. In almost all instances this means careful reassurance of anxious patients, and serial follow up at intervals meaningful enough to determine if growth has occurred but that preserve treatment options. In addition to the potential for physical harm and psychological distress, the potential public health cost(s) of screening must be considered.

Recognizing the enormous potential for harm is necessary when screening populations for a very rare condition (less than 2 % of those screened). This potential harm is measured not only in the costs associated with screening but also in reductions in quality of life, the potential for invasive procedures for incidental findings, and, of course, the risk of biopsy and/or surgery. It is for these reasons that I recommend screening be primarily conducted in centers with access to multidisciplinary teams with experience in management of incidental lung nodules. LDCT screening for lung cancer is not a single test performed in isolation. Screening must be considered as a process that has, at its core, the objective of preventing death from lung cancer. Consequently, it is impossible to overstate the importance of effective tobacco-cessation efforts being built into any screening program. Implementing a screening program should start by addressing several key questions, each of which is discussed in turn:

  • Who should be screened?

  • How do we communicate screening benefits and risks to patients?

  • What is the role of smoking cessation as part of a lung cancer screening program?

  • In what environment should screening take place and test results be managed?

  • How should CT examinations be performed and interpreted for screening?

  • What is the proper approach for patients with an abnormal screen?

  • What is the proper approach for patients with a negative screen?

  • What data should lung cancer screening programs collect?

  • What is the cost effectiveness of lung cancer screening?

Who Should Be Screened?

Identifying individuals for whom LDCT screening can potentially be beneficial requires a quantifiable measure of lung cancer risk. The NLST investigators sought to maximize the benefit of screening by reducing the factor of competing mortality in a high-age group. An age criterion of 55–74 years was combined with a 30 pack-year or more tobacco history, which had to be either current or recent (within 15 years; [1••, 3••, 4]). The median number of tobacco pack-years for the NLST subjects was approximately 48. Available data suggest that only current or recent former smokers meeting these criteria for tobacco use and age range should be considered for LDCT screening.
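The entry criteria described above can be written as a simple check. The following is a minimal sketch only: the class and function names are hypothetical, the thresholds are those quoted in this section (age 55–74, 30 pack-years or more, current smoker or quit within 15 years), and it is not a substitute for a validated risk model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SmokingHistory:
    """Hypothetical container for the three items the NLST criteria use."""
    age: int                                  # years
    pack_years: float                         # packs per day x years smoked
    current_smoker: bool
    years_since_quit: Optional[float] = None  # None for current smokers


def meets_nlst_criteria(h: SmokingHistory) -> bool:
    """Return True if the history matches the NLST entry criteria quoted in the text."""
    age_ok = 55 <= h.age <= 74
    exposure_ok = h.pack_years >= 30
    if h.current_smoker:
        recent_ok = True
    else:
        # Former smokers qualify only if they quit within the last 15 years.
        recent_ok = h.years_since_quit is not None and h.years_since_quit <= 15
    return age_ok and exposure_ok and recent_ok


if __name__ == "__main__":
    print(meets_nlst_criteria(SmokingHistory(65, 48, True)))
    print(meets_nlst_criteria(SmokingHistory(40, 10, False, 20)))
```

A yes/no rule of this kind treats everyone who passes as equally at risk, which is the limitation the following paragraphs address.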

Tobacco pack-years is a valid reflection of lung cancer risk. However, there is wide variation in lung cancer risk even among those who are heavy smokers [5–7, 8••]. Although we can easily identify smoking as a risk factor for lung cancer, it is more useful as an epidemiologic tool than for predicting the risk for any individual. The importance of identifying risk at an individual level is further emphasized by a recent study published by NLST investigators in which they examined risk of lung cancer death among NLST participants in the control group, developing a multivariable regression model of clinical features that predicted a five-year risk of lung cancer death. They then determined whether the benefit of screening was distributed equally among quintiles of risk. As might be expected, they found that subjects in the three highest quintiles of lung cancer mortality risk (60 % of participants at highest risk of lung-cancer death) accounted for 88 % of screening-prevented lung-cancer deaths and for 64 % of participants with false positive results. The 20 % of participants at lowest risk (quintile 1) accounted for only 1 % of prevented lung-cancer deaths [9]. If screening is offered on a more widespread basis, it will be more effective and more cost effective when we can identify those whose history puts them at the highest risk of lung cancer mortality.

Several models have evolved to quantify lung cancer mortality risk over a defined period of time. These models use characteristics easily identified in demographic and medical history to quantify risk of lung cancer ([5, 6, 10, 11]; Table 1). Although all of these models have internal validity and have to different extents been validated on external populations, none has been used to prospectively identify those who might benefit from LDCT screening. This need for prospective validation and refinement of risk-prediction models suggests a need to develop registries to track screened individuals and the outcomes of LDCT screening.

Table 1 Examples of published models of lung cancer risk

A useful comparison of screening efficacy is the number of individuals needed to screen (NNS) to prevent one death [12]. For the NLST, the risk of death from lung cancer in the LDCT and control populations was 1.4 and 1.7 %, respectively, a difference of 0.3 %; this gives an NNS of ~320 people screened to save one life. For comparison, with annual mammography for women, screening between 380 and 1,900 persons (depending upon age) for ~10 years results in one life saved. By this measure, LDCT screening of high-risk individuals is a very effective intervention. Using NNS as a measure of screening efficacy can also be helpful when considering whom to screen. A paper by Bach and Gould elegantly pointed out that data describing average risks and benefits from large studies accurately capture those risks and benefits for few, if any, individuals. Although the NNS for the entire NLST population was 320, the average individual in the NLST (a 65 year old with a 50 pack-year current tobacco history) has an NNS of 256, whereas for very low risk individuals (the authors used as an example a 40 year old former smoker) the NNS was over 35,000 to prevent one lung cancer death [13•]. When properly applied, the NNS statistic is therefore unique in that it is useful information both for public policy makers and for individuals in the examination room.
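The NNS is the reciprocal of the absolute risk reduction. The short sketch below (the function name is hypothetical) applies that definition to the rounded percentages quoted above. With rounded inputs it returns a value near 333; the figure of ~320 cited in the text presumably reflects the unrounded trial data, which are not reproduced here.

```python
def number_needed_to_screen(risk_control: float, risk_screened: float) -> float:
    """NNS = 1 / absolute risk reduction, with risks given as proportions."""
    absolute_risk_reduction = risk_control - risk_screened
    if absolute_risk_reduction <= 0:
        raise ValueError("screening must lower risk for the NNS to be defined")
    return 1.0 / absolute_risk_reduction


if __name__ == "__main__":
    # Rounded risks of lung cancer death quoted in the text: 1.7 % vs 1.4 %.
    print(round(number_needed_to_screen(0.017, 0.014)))
```

Because the NNS depends only on the difference in absolute risk, a small absolute benefit in a low-risk individual translates directly into a very large NNS, which is the point made by Bach and Gould.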

How Do We Communicate Screening Benefits and Risks to Patients?

In addition to relating absolute risks of lung cancer death as accurately as possible, there are other useful items that patients should consider when being offered LDCT screening. Individuals seeking screening should be counseled not only on the benefit of LDCT with regard to reducing lung cancer mortality, but also on the likelihood of a positive screen and its subsequent management, the possibility that a significant non-lung cancer-related abnormality may be detected, and the implications of a negative screen. The risk of radiation exposure should also be acknowledged, especially when patients express concerns. The nature of screening and the implication of false positive findings are difficult for many people to comprehend. Patients should be counseled on the notion that screening for lung cancer is not a “test” but rather a process, and one that carries measurable risks. Below are some questions that patients may raise.

What is the likelihood of a “positive” CT screen, and what does it mean? Lung nodules 4 mm and larger without specific benign features (specific patterns of calcification or fat) are regarded as “positive” for the purposes of screening. In NLST, 27 % of CT screened individuals had an abnormal screen on the first round, and 39 % were abnormal after three rounds of screening. In other screening trials, most of which are single-group cohorts, up to 50 % of subjects had an abnormal first screening CT examination. In the Mayo Clinic cohort, after five annual CT screening examinations, at least one non-calcified lung nodule was found in 74 % of participants [14]. The likelihood of a nodule being detected depends, in part, on the individual’s geographic location, the thickness of the CT slice reconstruction, and the radiologist’s experience. In NLST the vast majority of abnormal screens were, of course, not cancer. Only 4 % of patients with a positive screen had lung cancer. Conversely, 96 % of the abnormal findings were false positives. Most abnormal screens are small lung nodules, 4–8 mm in size, for which the probability of lung cancer is very low, and proper management is additional low-dose CT examinations of the nodule. Nodules less than 4 mm are not regarded as a positive screen, and individuals with these findings should continue to the next annual LDCT. Data from a non-randomized trial suggest that simply increasing the threshold definition of “abnormal” from 4 mm up to 7 or 8 mm can significantly reduce the number of “false positives” (by perhaps 50 %) with only a 5 % decrease in the number of cancers detected [15]. This practice, although appealing in principle, cannot be recommended unless larger studies can confirm that a higher threshold for abnormal does not diminish screening efficacy.
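When counseling patients it can help to restate these percentages as counts. The sketch below is illustrative only: the function name is hypothetical, and applying the 4 % cancer rate among positive screens to the 27 % first-round positive rate is an assumption made for the sake of the example, since the text does not state that the 4 % figure is specific to the first round.

```python
def expected_screen_outcomes(n_screened: int,
                             positive_rate: float = 0.27,
                             cancer_given_positive: float = 0.04) -> dict:
    """Expected counts of positive screens, cancers, and false positives.

    Defaults are the NLST proportions quoted in the text; combining them
    in one round is an illustrative assumption.
    """
    positives = n_screened * positive_rate
    cancers = positives * cancer_given_positive
    return {
        "positive_screens": positives,
        "cancers": cancers,
        "false_positives": positives - cancers,
    }


if __name__ == "__main__":
    for name, value in expected_screen_outcomes(1000).items():
        print(f"{name}: {value:.0f}")
```

Under these assumptions, roughly 270 of every 1,000 people screened would be told they have an abnormal result, and about 11 of them would have lung cancer.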

If my screening CT is negative, do I need to come back for another one? Screening for lung cancer must be thought of as a process, not a single test. To achieve results comparable with those from a clinical trial one must duplicate all the features of the trial. In the NLST, adherence to screening was over 90 % [1••]. Adherence is a major hurdle to converting efficacy to effectiveness in cancer screening and other public health initiatives [16]. Subjects seeking screening for lung cancer should be able to articulate that the reason to be screened is to reduce their probability of dying from lung cancer, and that this is best achieved first and foremost by tobacco cessation, then by LDCT screening. The latter takes three annual LDCTs to complete. Only those who adhere to the prescribed schedule of CT screening can expect to realize the reduced probability of lung cancer death.

What if you find a nodule? A discussion of how to evaluate screen-detected nodules is given in the section “What is the Proper Approach for Patients with an Abnormal Screen?”.

What is the likelihood that an invasive procedure will be required to evaluate an abnormal screen? Very few invasive procedures were performed in the NLST population and the incidence of complications was only 1.4 % in screened individuals who had an invasive procedure. Among those without cancer, less than 0.1 % of the positive screening tests led to a major complication after an invasive procedure. Of the over 26,000 subjects in the CT group of NLST, 16 subjects (10 of whom had cancer) died within 60 days of an invasive procedure (<0.03 %) [1••]. These data further emphasize the need for patients with screen-detected abnormalities to be guided through the evaluation by clinicians experienced in evaluating indeterminate nodules. Nothing will negate the efficacy of screening quite as fast as a complication from an invasive procedure for a benign nodule. Knowing when to do nothing has become a necessary skill.

What about the risk from radiation? Most of our understanding of radiation-induced cancer comes from extrapolation of exposure–disease history from survivors of atomic bomb fallout. Assumptions about risk include the notion that DNA repair capacity from a single large dose of radiation is the same as DNA repair after smaller doses. These assumptions are almost certainly flawed, and are likely to lead to over-estimates of risk. Nevertheless, widespread use of CT imaging, especially among those with other cancer risk factors, is likely to cause a real but very small number of additional cancers. Estimates of the size of this risk vary, but all are well below the number of additional lives that could be spared lung cancer mortality as a result of LDCT screening [17, 18]. We should acknowledge to our patients that there is risk with any medical imaging that uses ionizing radiation. All use of radiation should be actively limited to the lowest possible level. Newer software-based methods that can further reduce the already low dose of screening CTs are being developed, and the imperative to reduce radiation dose should never be taken for granted [19]. One of the major achievements of registries for coronary calcium screening CT scans was the robust reductions in radiation dose that were achieved simply by tracking this as an outcome measure in centers participating in the registry [20]. The point I stress, again, is that screening should only be conducted in centers that can match the stringent quality-control criteria for the physics of the scanner that were developed and monitored in the NLST.

What is the Role of Smoking Cessation as Part of a Lung Cancer Screening Program?

If the purpose of a lung cancer screening program is to reduce lung cancer mortality, we must acknowledge that this objective is achieved most efficiently by smoking cessation. Unquestionably, this is an essential component of any lung cancer screening program. All patients seen in a screening setting should be given counseling and offered pharmacotherapy options for smoking cessation. Effective smoking cessation requires overcoming both the symptoms of nicotine addiction and the deeply ingrained habit. Picking up and lighting a cigarette are natural and automatic behaviors for most smokers. A 30 pack-year smoker has picked up a pack, removed a cigarette, lit it, and taken a drag nearly 220,000 times. Imagine how good you would be at anything you did that often, and how little conscious thought it would require on your part. This is the habit, and these are the features of smoking cessation that require conscious thought and active effort to overcome. Smokers are often able to navigate the three or so weeks of nicotine withdrawal but fall victim to the habitual and automatic nature of picking up and lighting a cigarette [21]. Tobacco cessation products are effective at reducing nicotine withdrawal symptoms but less effective at interrupting habitual behavior. Bupropion and varenicline may be effective in part because they disrupt the neural pathways involved in habitual maintenance [22–24]. Follow up phone calls to encourage continued abstinence after physician visits may also be effective [25]. Table 2 lists a suggested hierarchy of drugs to aid smoking cessation.
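The figure of nearly 220,000 cigarettes follows directly from the definition of a pack-year. A minimal sketch of the arithmetic, assuming a standard pack of 20 cigarettes and a 365-day year (both assumptions, as the text does not state them; the function name is hypothetical):

```python
CIGARETTES_PER_PACK = 20  # assumption: standard pack size
DAYS_PER_YEAR = 365       # assumption: leap days ignored


def cigarettes_smoked(pack_years: float) -> int:
    """Approximate lifetime cigarettes for a given pack-year history."""
    # One pack-year is one pack per day for one year.
    return round(pack_years * DAYS_PER_YEAR * CIGARETTES_PER_PACK)


if __name__ == "__main__":
    print(cigarettes_smoked(30))  # 30 x 365 x 20 = 219,000
```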

Table 2 Suggested pharmacologic regimens for treating nicotine dependence based upon severity of addiction

In What Environment Should Screening Take Place and Test Results Be Managed?

Screening in general is a primary care discipline. Therefore, it is natural to assume that the responsibility for lung cancer screening will fall to primary care providers also. This addresses critical features of access, and convenience, both necessary elements of public health intervention. However, there are important differences between current cancer screening strategies and lung cancer screening; the greater incidence of false positives in LDCT is just one. Additionally, as noted by the authors of the study, the low incidence of complications from investigation of screen-detected abnormalities in the NLST may not be duplicated if follow up is conducted outside high-volume centers [1••]. One of the most important factors determining the success of screening will be the surgical mortality of lung cancer resection, which was lower in the NLST (1 %) than previously reported for the general US population (4 %) [1••, 26]. Also, because screening for lung cancer should only be offered to high-risk individuals and not to the general population, there is great need to identify individuals at high risk of lung cancer. Estimating risk is not a standardized or precise practice, even in highly specialized centers [5, 6, 10, 11]. These caveats indicate that the responsibilities of well run screening programs include:

  1. development and validation of tools to assess risk;

  2. dissemination of guidelines within the region or institution for managing abnormal CT screening results for primary care physicians; and

  3. serving as consultants for managing abnormal findings, especially when additional intervention (e.g. positron emission tomography (PET) scan, biopsy, or surgical resection) may be needed.

How Should CT Scans Be Performed and Interpreted for Screening?

The NLST used a low-radiation exposure protocol for LDCT screening scans. Although the term “low-dose” is not standardized, the NLST scan conditions resulted in radiation exposure of approximately 1.5 milliSievert (mSv). For comparison, a conventional chest CT involves exposure to approximately 5–8 mSv. Slice thickness should be 3 mm or smaller, with overlapping reconstructions at 50 % of the slice thickness. For each indeterminate nodule, the size, shape, morphology, lobar/segmental location, series, and slice number should be explicitly recorded in the report to facilitate comparison with results from future examinations. CT reports should include standard guidelines for further evaluation of positive test results. At our institution it is standard practice to include tables of recommendations from the Fleischner Society Guidelines in any CT report of incidentally detected nodule(s) [27, 28].

What is the Proper Approach for Patients with an Abnormal Screen?

It is important to follow published guidelines on the management of small pulmonary nodules, and although in-depth discussion of the management of screen-detected nodules is beyond the scope of this review, one must always be aware that evaluation of nodules starts first and foremost with an estimate of the pre-test probability of cancer [29]. Numerous prediction models exist for this purpose [30–32]. One widely used model uses six easily identifiable variables: three clinical variables (age, cancer history, and tobacco history) and three nodule characteristics (diameter, upper/lower lobe location, and edge characteristics; [31]).
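To show how a six-variable model of this kind produces a pre-test probability, the sketch below implements a logistic model with the coefficients commonly quoted for the Mayo Clinic model (Swensen et al.). Two assumptions are made here and should be checked before any use: that this is the model the text refers to (the text does not name it), and that the coefficients are transcribed correctly from the original publication. The function name is hypothetical, and the example is not a clinical tool.

```python
import math


def mayo_nodule_probability(age_years: float,
                            ever_smoker: bool,
                            prior_extrathoracic_cancer: bool,
                            diameter_mm: float,
                            spiculated: bool,
                            upper_lobe: bool) -> float:
    """Pre-test probability of malignancy from a six-variable logistic model.

    Coefficients are those commonly quoted for the Mayo Clinic model;
    treat them as an assumption to verify against the primary source.
    """
    x = (-6.8272
         + 0.0391 * age_years
         + 0.7917 * ever_smoker                  # bool counts as 0 or 1
         + 1.3388 * prior_extrathoracic_cancer
         + 0.1274 * diameter_mm
         + 1.0407 * spiculated
         + 0.7838 * upper_lobe)
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    # A 65 year old smoker with a smooth 6 mm lower-lobe nodule.
    low = mayo_nodule_probability(65, True, False, 6, False, False)
    # The same patient with a spiculated 20 mm upper-lobe nodule.
    high = mayo_nodule_probability(65, True, False, 20, True, True)
    print(f"{low:.2f} {high:.2f}")
```

Under these assumptions the small smooth nodule has a probability of malignancy of roughly 6 %, consistent with the low pre-test probability that justifies serial observation for most screen-detected nodules.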

Most screen-detected abnormalities are small nodules with very low probability of cancer, and reassurance of patients with lung nodules is one of the most common tasks in screening. An excellent study by Weiner et al. [33] in which they elicited patient preferences about “nodule discussions” with their doctors provides guidance for approaching these conversations. The objective for most patients with indeterminate nodules is to prove they are benign. Ultimately, we have only two tools to prove a non-calcified nodule is benign—surgery and time. For the vast majority of indeterminate small nodules, the proper approach to achieve proof is simple serial observation. In addition to citing data that fewer than 4 % of lung nodules found in NLST participants were cancer, there are other data that can be used for this reassurance. In one non-randomized study, 27,500 individuals had a negative baseline screening CT. Subsequently 1,460 (5.3 %) developed a new lung nodule on subsequent annual screens. Of these new nodules, only 5 % (n = 70) were lung cancer [34]. I reassure patients with new indeterminate lung nodules of the safety of serial CT observation when the probability of cancer is very low. When these reassuring facts are related to patients in language they comprehend combined with a very low probability of cancer, nearly all patients see the logic behind conservative recommendations for follow up. When approaching a patient with a nodule of low probability for lung cancer, the Fleischner Society Guidelines [27] have performed very well for both patients and physicians in detecting malignancy while minimizing the number of CT scans required to achieve proof of benignness [27, 28].

For those with intermediate probability nodules large enough to be characterized on PET—generally 8 mm or larger—a CT–PET scan should be performed [29]. A PET scan can prove neither the presence nor absence of cancer, but it is a very useful tool to move the pendulum toward or away from diagnosis of cancer, and serves as an indispensable staging tool for those with a high probability of cancer. Biopsy offers very little benefit and considerable risk to an otherwise healthy individual who has a nodule/mass with a high probability of malignancy. These patients should be referred for surgical resection [29, 35]. A biopsy should be considered in limited circumstances—for a high-risk surgical patient with a suspicious nodule which may require radiotherapy, or when CT scan, clinical assessment, and PET provide discordant evidence of malignancy. A biopsy is a tool to prove the presence of cancer; it will rarely provide definitive evidence that a nodule is benign. A benign result on a biopsy such as granuloma or organizing pneumonia is not license to ignore the nodule. Some degree of follow up to exclude false-negative biopsy results is usually advisable to ensure resolution or lack of growth.

What is the Proper Approach for Patients with a Negative Screen?

Patients with normal screening results should be reminded that screening for lung cancer is a process over time, not a single “test”, and that the mortality benefit of screening seen in the NLST was achieved on the basis of three yearly CT scans. There are currently no data to support screening beyond three years, but USPSTF recommendations are based on modeling data from the NLST extrapolated to a large population. Their recommendations suggest annual screens between the ages of 55 and 80 for those with an appropriate tobacco history. I am concerned about the open-ended nature of these recommendations (which currently are issued in draft form only). Further data and projected modeling of mortality and cost-effectiveness analysis from published screening studies will probably furnish information enabling a decision on whether to screen beyond three annual CT scans. For now, I look for further guidance from yet to be published studies.

What Data Should Lung Cancer Screening Programs Collect?

Establishing a lung cancer screening program in the current health care environment should prompt a commitment to establish a registry and standardized database of screened subjects and a biospecimen repository (where feasible) to facilitate development and/or validation of biomarkers, and a commitment to develop and/or prospectively validate a model that measures risk at the individual level. Minimal suggested variables for registries include:

  • lung cancer risk factors of screened individuals (with standardization, using published models of risk);

  • results of the screen and conditions of the CT scan (dose, slice thickness);

  • presence of any abnormal findings;

  • further testing conducted to investigate those findings; and

  • complications resulting from screen-detected abnormalities.

Registries developed with the purpose of prospectively tracking the results of LDCT screening can hold screening programs accountable for the results of screening and promote quality control, just as surgical registries have done for common procedures [3638].
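The minimal variables listed above map naturally onto a single record per screening examination. The following is a sketch of one possible layout, not a proposed standard: every field name is hypothetical, and the summary function shows only one of many quality measures a registry might report.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScreeningRecord:
    """One LDCT screening examination; all field names are hypothetical."""
    subject_id: str
    # Lung cancer risk factors and standardized risk estimate
    age: int
    pack_years: float
    current_smoker: bool
    risk_model: Optional[str] = None          # name of the published model used
    predicted_risk: Optional[float] = None
    # Conditions of the CT scan
    effective_dose_msv: Optional[float] = None
    slice_thickness_mm: Optional[float] = None
    # Result of the screen and any abnormal findings
    screen_positive: bool = False
    abnormal_findings: List[str] = field(default_factory=list)
    # Further testing and resulting complications
    follow_up_tests: List[str] = field(default_factory=list)
    complications: List[str] = field(default_factory=list)


def complication_rate(records: List[ScreeningRecord]) -> float:
    """Fraction of screening examinations followed by any complication."""
    if not records:
        return 0.0
    with_complication = sum(1 for r in records if r.complications)
    return with_complication / len(records)


if __name__ == "__main__":
    registry = [
        ScreeningRecord("A001", 62, 40, True, effective_dose_msv=1.5),
        ScreeningRecord("A002", 70, 55, False, screen_positive=True,
                        abnormal_findings=["6 mm nodule, right upper lobe"],
                        follow_up_tests=["LDCT at 6 months"]),
    ]
    print(complication_rate(registry))
```

Recording dose and slice thickness alongside outcomes is what allows a registry to track radiation exposure and complications as quality measures over time.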

What is the Cost Effectiveness of Lung Cancer Screening?

One of the major strengths of the NLST study was the prospective inclusion of a sub-study designed to determine the cost per quality-adjusted life year ($/QALY), or cost-effectiveness, of lung cancer screening. These data (unpublished at the time this paper was written) were announced on 24 June 2013 at a joint meeting of the National Cancer Advisory Board and the NCI Board of Scientific Advisors (http://videocast.nih.gov/live.asp?live=12906&bhcp=1 accessed 3 September 2013). The $/QALY for three rounds of LDCT screening of high-risk smokers enrolled in the NLST was $72,916—in line with other screening tests in practice today, including mammography. Every cost-effectiveness study is based on assumptions, and the effect of these assumptions is modeled in a sensitivity analysis when the base case analysis is complete. During this public presentation, the major uncertainties in the NLST analysis suggested that, to the extent the assumptions were incorrect, the calculations of $/QALY were likely to be an overestimate. Selected variables that would improve the cost effectiveness of LDCT screening for lung cancer included: underestimation of the efficacy of plain CXR as a screening modality (unlikely), additional “catch up” cases in the control group (cases that were not detected in the control group that would eventually develop during longer follow up—the model that was used assumed no further catch up), decreasing cost of LDCT, and fewer follow up studies conducted for indeterminate nodules. More importantly, variables that could significantly increase $/QALY were also identified, and most were simply the opposite of the variables that would reduce costs. One very important fault that could unacceptably increase the $/QALY (reduce cost effectiveness) of LDCT screening is screening a population with lower risk for lung cancer than that of the NLST-screened population. 
As we await the peer reviewed publication of these data, we must keep these factors in mind when making decisions about whom to screen.
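The sensitivity of $/QALY to the risk of the screened population can be seen from the ratio itself: incremental cost divided by incremental quality-adjusted life years gained. The sketch below uses entirely hypothetical inputs (they are not NLST data) and a hypothetical function name, to show how the same per-person cost produces a much higher $/QALY when the benefit per person is smaller.

```python
def cost_per_qaly(incremental_cost: float, incremental_qalys: float) -> float:
    """Incremental cost divided by incremental QALYs gained, per person."""
    if incremental_qalys <= 0:
        raise ValueError("no QALY gain: the ratio is undefined")
    return incremental_cost / incremental_qalys


if __name__ == "__main__":
    # Hypothetical per-person values, chosen only to illustrate the ratio.
    higher_risk = cost_per_qaly(incremental_cost=1500, incremental_qalys=0.020)
    lower_risk = cost_per_qaly(incremental_cost=1500, incremental_qalys=0.005)
    print(f"{higher_risk:,.0f} vs {lower_risk:,.0f} dollars per QALY")
```

In this illustration, a fourfold smaller benefit per person raises the cost per QALY fourfold, which is why extending screening to lower-risk populations can erode cost effectiveness so quickly.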

Conclusions

Lung cancer screening with three annual LDCT scans has the potential to reduce mortality from lung cancer among a population of older heavy smokers who can benefit from lung cancer treatment. Although the importance of this achievement by the NLST investigators cannot be overstated, it should be viewed as a first step, not as an end in itself. Many major questions remain about how best to realize this mortality reduction, minimize harm, and contain costs in a practical real-world context. Screening for lung cancer can and should be conducted, but it will be most effective if it is accompanied by continued research into risk modeling, patient communication, biomarkers of risk, and diagnostic biomarkers. For clinicians seeking to establish a program of lung cancer screening, I urge that this be conducted responsibly, adhering to practices specified in the design of the NLST, and with as much attention given to proper selection of those at risk, and management of screen-detected abnormalities, as is given to the initial screening process.