INTRODUCTION

Alcohol misuse is common1 and causes extensive morbidity and mortality.2 Brief alcohol counseling decreases drinking,3,4 and a National Commission on Prevention Priorities designated alcohol screening and brief alcohol interventions the third highest US adult prevention priority.5 Routine alcohol screening is required to identify patients who might benefit from brief alcohol counseling.6

The Veterans Affairs (VA) Health Care System implemented routine screening for alcohol misuse in 2004,6 and since 2006 has required that the Alcohol Use Disorders Identification Test-Consumption Questions (AUDIT-C) be used for screening. Each VA is expected to meet performance targets, but the approach used to implement alcohol screening (e.g., at triage, by primary care providers, or by paper questionnaire) is left up to individual facilities or networks. Over 90% of VA outpatients nationwide are screened with the AUDIT-C, and rates of documented brief alcohol counseling in VA are increasing.7

The AUDIT-C has been validated when interviewer-administered, and when completed on mailed questionnaires with results shared with primary care providers,8–13 but little is known about its performance when implemented as part of routine clinical care. A prior small study raised concerns about the quality of clinical alcohol screening in VA despite the use of validated screening questionnaires.14 In January 2008, an electronic clinical reminder was disseminated that prompted VA providers to ask AUDIT-C questions verbatim, in a private setting, and in a nonjudgmental manner.

The purpose of this study was to evaluate the quality of clinical alcohol screening in the VA from 2007-2008, by comparing the results of the AUDIT-C documented during routine clinical care to the results of the AUDIT-C completed on a confidential mailed survey within 90 days of the clinical screen. A second aim was to evaluate factors associated with discordance between the results of clinical and survey alcohol screens, which was not possible in the previous smaller study.

METHODS

The study sample included VA outpatients who were in two independent quality improvement programs conducted for the VA Office of Quality and Performance: the External Peer Review Program (EPRP)15 and the Survey of Healthcare Experiences of Patients (SHEP).16 Patients were eligible if they had AUDIT-C results available from both programs between October 1, 2006 and September 30, 2008. EPRP assesses VA care, including AUDIT-C alcohol screening, by reviewing VA medical records. SHEP is the VA’s patient satisfaction survey (response rate of 54.5% during the study), which also includes the AUDIT-C. The EPRP drew a random sample of outpatients nationwide, but oversampled certain subgroups (e.g., women 52–69 years old and patients with specific chronic diseases). Both the EPRP and SHEP sampled patients with recent outpatient visits, but the samples were independent during this study, so the study sample resulted from chance overlap in patients included in the EPRP and SHEP (Fig. 1).

Figure 1

Study sample. The prevalence of positive survey screens among all 427,612 survey respondents who completed the AUDIT-C was 13.4% (95% CI 13.3-13.5), and the prevalence of positive clinical screens among all 207,181 patients who had AUDIT-C results abstracted from the medical records was 6.4% (95% CI 6.3-6.5). These were higher than in the study sample, as expected, likely due to greater survey non-response in younger patients16 and heavier drinkers.30

The study was approved by the VA Office of Quality and Performance and the University of Washington and VA Puget Sound Institutional Review Boards, with waivers of written informed consent and HIPAA authorization.

Measures

Alcohol Screening

AUDIT-C questions ask about typical quantity and frequency of drinking and the frequency of binge drinking and have established 3-month test–retest reliability.17,18 AUDIT-C scores range from 0 to 12 points. Scores of 5 or more points were considered a positive alcohol screen in this study, consistent with the VA performance measure for brief alcohol counseling implemented in October 2007.7,19 This threshold minimizes the burden of counseling patients with false-positive AUDIT-C screens.9,10

Clinical Screen

Clinical AUDIT-C screening documented in the medical record and abstracted by EPRP medical record reviewers could be conducted by telephone, at intake for appointments by medical assistants or nurses, by medical providers, or by self-report (e.g., on paper) with results later entered into VA’s electronic medical record. Approaches to screening vary across VA clinics and facilities, but anecdotal reports suggest that most used a clinical reminder in the electronic medical record to score the AUDIT-C and document screening.

Survey Screen

AUDIT-C results from mailed SHEP surveys received within 90 days of the clinical screen were used as a comparison standard. The survey introduction stated “all information is strictly anonymous. It will not be shared with your doctor”. Survey screens were classified as follows: no past year alcohol use (AUDIT-C score 0 points); low-risk drinking (1-3 points men; 1-2 women); possible alcohol misuse (4 points men; 3-4 women); alcohol misuse (5-7 points); and severe alcohol misuse (8-12 points).20–23
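The five-category classification above can be written as a small lookup. The following is a minimal sketch only: the function name, the `sex` argument, and the category labels are illustrative choices, not part of the study's own analysis code.

```python
def classify_survey_screen(score: int, sex: str) -> str:
    """Map an AUDIT-C score (0-12) to the five survey-screen categories.

    `sex` must be "male" or "female"; the low-risk ceiling differs by sex.
    """
    if not 0 <= score <= 12:
        raise ValueError("AUDIT-C scores range from 0 to 12")
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")

    if score == 0:
        return "no past-year alcohol use"
    if score >= 8:
        return "severe alcohol misuse"
    if score >= 5:
        return "alcohol misuse"

    # Scores 1-4: low-risk drinking is 1-3 points for men and 1-2 for women;
    # anything above that ceiling (but below 5) is possible alcohol misuse.
    low_risk_max = 3 if sex == "male" else 2
    if score <= low_risk_max:
        return "low-risk drinking"
    return "possible alcohol misuse"
```

For example, a score of 3 is classified as low-risk drinking for a man but as possible alcohol misuse for a woman.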

Discordant Screening Results

Screening results were considered discordant if one screen was positive (AUDIT-C ≥ 5) and the other negative.
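As an illustration of this definition, the sketch below flags a pair of screens as discordant when exactly one of them reaches the threshold of 5 points. The function and constant names are hypothetical.

```python
POSITIVE_THRESHOLD = 5  # AUDIT-C >= 5 counts as a positive screen in this study


def is_positive(score: int) -> bool:
    """Return True if an AUDIT-C score meets the study's screening threshold."""
    return score >= POSITIVE_THRESHOLD


def is_discordant(clinical_score: int, survey_score: int) -> bool:
    """Return True if one screen is positive and the other is negative."""
    return is_positive(clinical_score) != is_positive(survey_score)
```

Note that two scores can differ substantially without being discordant (e.g., 5 and 12 are both positive), while scores of 4 and 5 are discordant despite differing by one point.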

Other Measures

Patient Characteristics

Race, education, marital status, and income were self-reported on surveys. Gender, age, and tobacco use in the past year were obtained from medical record reviews. A measure of any mental health or alcohol or other substance use disorder diagnosis documented in the year prior to medical record reviews was based on ICD-9 codes from administrative data obtained by EPRP.

Regional VA Networks

Patients were assigned to one of 21 VA networks based on where they were screened clinically. Network directors oversee operations at all hospitals and freestanding community-based outpatient clinics in their region, and their annual contracts include financial incentives for meeting performance targets. If discordance varied across networks it could reflect different clinical cultures and alcohol screening strategies. Although greater variation was expected at the facility level (n = 128), the study sample included too few patients for precise facility-level estimates of discordance.

Temporal Factors Potentially Associated with Discordance

Three temporal measures were evaluated: the number of days between the two screens (≤14; 15-30; 31-60; and 61-90 days); the order of the survey and clinical screens; and the date of the clinical screen relative to implementation efforts. A brief alcohol counseling performance measure,7 initiated in October 2007, was hypothesized to act as a disincentive to identify alcohol misuse and thereby increase discordance, whereas an alcohol screening clinical reminder disseminated 3 months later (January 2008) was hypothesized to decrease discordance because it prompted providers to ask screening questions verbatim, in a private setting, and non-judgmentally.
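The three temporal measures could be derived from the two screening dates roughly as follows. This is a sketch under stated assumptions: the article gives only the months of the two implementation efforts, so the first day of each month is used here as a hypothetical cut-off, and all function names and category labels are illustrative.

```python
from datetime import date

# Assumed cut-off dates (the article reports only "October 2007" and
# "January 2008"; the first of each month is an illustrative choice).
PERFORMANCE_MEASURE_START = date(2007, 10, 1)
CLINICAL_REMINDER_START = date(2008, 1, 1)


def days_between_category(clinical_date: date, survey_date: date) -> str:
    """Bin the absolute gap between the two screens into the study's four groups."""
    days = abs((survey_date - clinical_date).days)
    if days > 90:
        raise ValueError("screens more than 90 days apart fall outside the study window")
    if days <= 14:
        return "<=14 days"
    if days <= 30:
        return "15-30 days"
    if days <= 60:
        return "31-60 days"
    return "61-90 days"


def clinical_screen_first(clinical_date: date, survey_date: date) -> bool:
    """Return True if the clinical screen preceded the survey screen."""
    return clinical_date < survey_date


def implementation_period(clinical_date: date) -> str:
    """Place the clinical screen relative to the two implementation efforts."""
    if clinical_date < PERFORMANCE_MEASURE_START:
        return "before performance measure"
    if clinical_date < CLINICAL_REMINDER_START:
        return "after performance measure, before clinical reminder"
    return "after clinical reminder"
```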

Analyses

The dichotomous results of survey and clinical screens were cross-tabulated to identify discordance in the total sample and across demographic, clinical, and temporal subgroups. Multivariable logistic regression was used to identify factors independently associated with discordance. Two models are presented: one without survey screen results and one with them. The smaller model included demographic characteristics, tobacco use, prior year mental health or substance use disorder diagnoses, the VA network where clinical screening occurred, and the three temporal measures. To evaluate whether findings were confounded by severity of self-reported alcohol misuse, the second model added the results of survey screens (5 AUDIT-C categories). The adjusted prevalence of discordance is presented to reflect the magnitude as well as the statistical significance of differences in the prevalence of discordance across subgroups. All analyses were conducted using STATA (version 11.1).24

RESULTS

The study included 6,861 patients with clinical and survey AUDIT-C screens within 90 days of each other (Fig. 1). Participants were predominantly older, white men, and 17% were women due to oversampling (Table 1). About twice as many patients screened positive for alcohol misuse on survey screens as on clinical screens: 11.1% (95% CI 10.4-11.9%) and 5.7% (5.1-6.2%), respectively. On average, the time between clinical and survey screens was 54 days (SD = 22), and clinical screens preceded survey screens for 71.7% of the sample, as expected because the clinical screen would precede the survey screen when both medical record reviews and patient satisfaction surveys were triggered by the same outpatient visit.

Table 1 Study Sample: VA Outpatients who had Both Clinical and Survey Alcohol Screens within 90 Days (N = 6,861)

Discordance Between Survey and Clinical Screening

Overall, 561 (8.2%; 7.5-8.8%) of 6,861 patients had discordant clinical and survey screens, with patients who screened positive on the survey much more likely to have discordant screening results (Table 2). Among the 561 patients with discordant screening results, 468 (83%) screened positive on the survey screens, and 93 (17%) screened positive on clinical screens. Whereas 468 (61.2%; 57.7-64.6%) of 765 patients with positive survey screens had discordant screens, only 93 (1.5%; 1.2-1.8%) of 6,096 patients with negative survey screens had discordant screens (Table 2). In contrast, 93 (23.8%; 19.6-28.1%) of 390 patients who screened positive on clinical screens had discordant screens, and 468 (7.2%; 6.6-7.9%) of 6,471 who screened negative on clinical screens had discordant screens (Table 2). Among patients whose clinical screens indicated no alcohol use in the past year (AUDIT-C score 0), 21.9% reported drinking on survey screens. In comparison, among those whose survey screens indicated no past-year alcohol use, only 8.7% had clinical screens indicating past-year drinking.
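The proportions in this paragraph all derive from the same 2 × 2 cross-tabulation of survey and clinical results. The sketch below recomputes them from the reported counts. It assumes a normal-approximation (Wald) 95% interval, which yields limits consistent with those reported, although the article does not state which interval method was used; the function and dictionary names are illustrative.

```python
import math


def proportion_with_wald_ci(events: int, total: int, z: float = 1.96):
    """Return (proportion, lower, upper) using a normal-approximation interval."""
    p = events / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


# Counts reported in the text: (patients with discordant screens, group size)
groups = {
    "all patients": (561, 6861),
    "positive survey screen": (468, 765),
    "negative survey screen": (93, 6096),
    "positive clinical screen": (93, 390),
    "negative clinical screen": (468, 6471),
}

results = {}
for label, (events, total) in groups.items():
    p, low, high = proportion_with_wald_ci(events, total)
    # Store percentages rounded to one decimal, as in the article
    results[label] = (round(100 * p, 1), round(100 * low, 1), round(100 * high, 1))
    print(f"{label}: {results[label][0]}% ({results[label][1]}-{results[label][2]}%)")
```

The asymmetry is visible directly in the counts: the same 468 patients make up 61.2% of those with positive survey screens but only 7.2% of those with negative clinical screens, because the two denominators differ so much in size.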

Table 2 Discordant Clinical and Survey Alcohol Screening Results (N = 6,861)

Association with Patient Characteristics

Younger, male, and unmarried patients, those who smoked or had prior mental health or substance use disorder diagnoses, and patients who reported alcohol use on the survey AUDIT-C, especially those with positive survey screens, were more likely to have discordant alcohol screening results (Table 3). The pattern of discordance across AUDIT-C scores from clinical screens showed the expected normal distribution around the screening threshold (Fig. 2, Panel a). In contrast, there was a non-normal pattern of discordance across AUDIT-C scores from survey screens (Fig. 2, Panel b). For example, among those with the highest survey screens (AUDIT-C scores 8-12), 106/228 (46.5%) had discordant clinical screens (Table 3), reflecting at least 4-8 point differences between AUDIT-C scores from survey and clinical screens.

Table 3 Prevalence of Discordant Results between Clinical and Survey Alcohol Screens (N = 6,861)
Figure 2

Among patients with each AUDIT-C score based on clinical screens (Panel a) or survey screens (Panel b), the percent with discordant results on the other screen (N = 6,861). Alcohol screening results are considered discordant if a patient screened positive on the clinical or survey screen but not the other. Panel a: The percent of patients with each AUDIT-C score (0–12 points) on the clinical screen who had discordant survey screen results. Panel b: The percent of patients with each AUDIT-C score (0–12 points) on the survey screen who had discordant clinical screen results.

Differences across VA Networks

Rates of discordance differed across networks, ranging from 4% to 13% (p = 0.002). The network prevalence of discordance among patients with positive survey screens ranged from 43% to 100% (p = 0.002), whereas the prevalence of discordance among patients with negative survey screens did not differ across networks (0.5–4%; p = 0.236). Compared to a large network with the lowest screening discordance (Network A), 11 of 21 networks had significantly higher discordance in bivariate logistic regression analyses.

Timing and Order of the Two Screens

There was no association between discordance and the time between, or order of, survey and clinical screens (Table 3). There was also no association between discordance and the timing of clinical screens with regard to implementation efforts, including a performance measure for brief alcohol counseling or dissemination of a screening clinical reminder that included recommendations aimed at improving the validity of screening (Table 3).

Multivariable Analyses

In multivariable analyses including all variables in Table 3 except survey AUDIT-C scores, the prevalence of discordance was increased in many patient subgroups (Table 3). However, many patient characteristics associated with discordance are known to be associated with AUDIT-C scores,25 and after adding survey AUDIT-C scores (5 categories) to the model, only two factors aside from survey AUDIT-C scores were associated with discordance: self-reported Black/African American race and VA network (Table 3), suggesting that many associations were confounded by alcohol misuse. There were no significant interactions between race and VA network, or between race and AUDIT-C groups (p = 0.99 and 0.95, respectively).

DISCUSSION

This study of alcohol screening in the VA found that 61% of patients who screened positive for alcohol misuse on a mailed survey screened negative when screened clinically despite use of the same validated screening questionnaire, suggesting that many patients who could benefit from brief alcohol counseling are being missed by clinical screening in VA. Black race and the VA Network where clinical screening was conducted were the only factors other than survey AUDIT-C scores that were associated with discordance between clinical and survey AUDIT-C screens in fully-adjusted analyses.

Some discordance between clinical and survey screens is expected. Patients might be more motivated to report drinking honestly in clinical settings if they feel that their provider needs the information and/or it might be relevant to their health. In addition, some discordance is expected due to random measurement error, changes in patients’ drinking between two screens, or regression to the mean. However, the observed discordance between clinical and survey AUDIT-Cs cannot be accounted for by these factors. Discordance due to random measurement error and changes in drinking would be expected to occur in a normal distribution around the screening threshold similar to that observed across clinical screening scores (Fig. 2, Panel a). However, the distribution of discordance across survey screening scores was not normal (Fig. 2, Panel b). Furthermore, discordance due to changes in drinking should increase as the time between screens increases, but no such association was observed. Discordance cannot solely reflect regression to the mean or decreased drinking at repeat screening because there was no association between the order of screens and discordance. Furthermore, randomized controlled trials suggest that repeated screening leads to lower reported consumption on later screens.26–28 This bias would be expected to result in lower AUDIT-C scores on the survey, which tended to follow the clinical screen, the opposite of the observed association.

Social desirability bias likely contributed to the observed discordance. Over twice as many patients who screened positive on confidential mailed surveys had discordant results compared to patients who had positive clinical screens (61% vs. 24%). If the latter is an estimate of the magnitude of “expected” discordance, 37% (95% CI 34-41%) of patients who screened positive on surveys had “excess” discordance. Patients may under-report alcohol consumption on clinical alcohol screening due to stigma or a desire to avoid discussing their drinking with providers.29 The fact that Black patients were more likely than White patients to have discordant screening results might reflect greater social desirability bias among these patients, although it could also reflect bias due to differences in the way the AUDIT-C was interpreted and/or administered across racial/ethnic subgroups.
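The “excess” discordance estimate is a simple difference between two proportions. As a minimal sketch using the counts reported in the Results (variable names are illustrative), the point estimate can be reproduced as follows. The confidence interval is not recomputed here because the article does not state how it was derived.

```python
# Counts reported in the Results: discordant screens among patients with
# positive survey screens and among patients with positive clinical screens.
survey_positive_discordant, survey_positive_total = 468, 765
clinical_positive_discordant, clinical_positive_total = 93, 390

observed = survey_positive_discordant / survey_positive_total
# Treated in the text as an estimate of "expected" discordance
expected = clinical_positive_discordant / clinical_positive_total
excess = observed - expected

print(f"observed: {100 * observed:.1f}%")
print(f"expected: {100 * expected:.1f}%")
print(f"excess:   {100 * excess:.1f}%")
```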

Factors other than social desirability bias likely contributed to the magnitude of the observed discordance for several reasons. First, the AUDIT-C was validated in studies that used interviewers to administer screens,10 and in which patients were aware providers would receive screening results.8 Second, the variation in discordance across VA networks suggests that institutional factors contributed to the observed discordance. Anecdotal reports suggest that considerable variability exists across sites in the privacy of screening. Differences in training and/or decisions about who conducts screening might also contribute to variability; medical assistants or nurses may be more likely to follow screening instructions verbatim than primary care or mental health providers assessing alcohol use as part of the medical history.

Several limitations of this study should be noted. In order to study a large diverse national sample without alerting providers that the quality of screening was being evaluated, this study used AUDIT-Cs from confidential mailed surveys as a comparison standard. While in-depth interviews are often the ideal comparison standard, recruiting patients and providers for such studies biases results.30 In addition, primary data collection would have delayed assessment of the quality of screening. Finally, use of secondary data allowed us to evaluate regional variation in the quality of alcohol screening in a cost-efficient manner. For these reasons, the AUDIT-C from a mailed survey was the best available comparison standard for this translational study of alcohol screening implementation.

Other limitations relate to the study sample. Patients who returned the VA’s outpatient satisfaction survey are older and drink less than non-respondents,30 so this study may under-estimate discordance, since patients who screened positive for alcohol misuse had higher discordance. The study sample was also too small to evaluate facility-level variation. VA patients differ in important ways from other outpatients, and the VA health care system is currently atypical in that it uses performance incentives to achieve high rates of alcohol screening. Finally, this study did not collect data on alcohol screening procedures across VA networks, so it could not determine whether differences in implementation of alcohol screening account for the observed differences in discordance across VA networks.

Nevertheless, this study has important implications for health care systems implementing routine alcohol screening. Almost 7% of the total study sample screened positive for alcohol misuse on surveys but were missed by clinical screening, and 1.5% of the total sample had severe alcohol misuse that was missed. No known prior studies have evaluated the validity of alcohol screening when integrated into routine clinical care. This study found that use of validated questionnaires does not, by itself, ensure the quality of screening and suggests that the quality of clinical alcohol screening should be monitored, even when well-validated screening questionnaires are used. While it is unknown whether a similar issue affects screening for other health risk behaviors and mental health conditions, this issue merits evaluation. Self-administered measures of alcohol screening, whether on paper, online as in electronic health risk assessments (eHRA),31 or by interactive voice recording on the telephone (IVR),32 may be the most valid approaches to implementing alcohol screening. If alcohol screening is administered by clinicians, they may need focused training to prepare them to screen in a valid manner.33,34
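The whole-sample figures quoted above follow directly from counts given in the Results. The sketch below recomputes them, and also scales the missed proportion to 100,000 screened patients, which corresponds to the figure given in the article's closing paragraph. Constant and function names are illustrative.

```python
TOTAL_SAMPLE = 6861
MISSED_POSITIVE = 468  # positive survey screen, negative clinical screen
MISSED_SEVERE = 106    # survey AUDIT-C 8-12 with a discordant clinical screen


def percent_of_sample(count: int, total: int = TOTAL_SAMPLE) -> float:
    """Express a count as a percentage of the study sample."""
    return 100 * count / total


def per_100k(count: int, total: int = TOTAL_SAMPLE) -> int:
    """Scale a count to a rate per 100,000 patients screened."""
    return round(100_000 * count / total)


print(f"missed by clinical screening: {percent_of_sample(MISSED_POSITIVE):.1f}%")
print(f"severe misuse missed: {percent_of_sample(MISSED_SEVERE):.1f}%")
print(f"missed per 100,000 screened: {per_100k(MISSED_POSITIVE)}")
```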

This study is another demonstration of the challenge of developing effective performance measures for preventive care.35 The current VA alcohol screening performance measure that sets targets for rates of alcohol screening creates incentives for documentation of screening results, but does not provide incentives for high quality screening that identifies patients with alcohol misuse. Furthermore, setting very high targets for screening could contribute to lower quality screening by encouraging providers to document screening when they do not have time to ask screening questions verbatim in a private setting. Future research must address the need for performance measures that create incentives for providers not only to screen, but to identify patients with alcohol misuse.

The VA has recently succeeded in achieving high rates of annual alcohol screening. Brief alcohol interventions are effective for patients with alcohol misuse,36 and patients with alcohol use disorders benefit from referral or repeated brief interventions, with and without lab monitoring or medications.37–40 This study suggests that mandating clinical use of a validated alcohol screening questionnaire does not ensure high quality screening. Three out of every five patients who screened positive for alcohol misuse on confidential mailed surveys were not identified by clinical screening. Put another way, 6,821 patients with alcohol misuse would be missed out of every 100,000 patients screened in VA. Moreover, significant variation across VA networks, after accounting for differences in patient characteristics, suggests organizational influences on the quality of alcohol screening. Together these findings indicate a need to focus on monitoring and improving the quality of alcohol screening in order to identify as many patients as possible who could benefit from brief alcohol interventions.