Practical Assessment of Alcohol Use Disorder in Routine Primary Care: Performance of an Alcohol Symptom Checklist

BACKGROUND

Sixty-one million people in the United States (US) drink alcohol above recommended limits1 and 13.9% of US adults meet past-year criteria for alcohol use disorder (AUD).2 In the US, AUD is associated with 3% of deaths annually and 5% of years of life lost due to premature mortality.3,4 Despite this large burden of disease, up to 78% of patients with AUD who visit primary care go undiagnosed5 and 80% or more go untreated1,2,6,7,8,9 even though effective AUD treatments can be delivered in primary care.10,11,12

A lack of practical, validated measures for diagnosing AUD in routine medical care perpetuates this treatment gap. Brief screening measures, such as the Alcohol Use Disorders Identification Test–Consumption screen (AUDIT-C),13,14 can detect potentially harmful levels of alcohol consumption, but comparably practical measures for assessing AUD symptoms are lacking. Experts recommend using an Alcohol Symptom Checklist, where patients self-report the presence or absence of the 11 AUD symptoms defined by the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5)15 to facilitate patient-provider discussions about AUD and other substance use disorders.16,17 However, the psychometric performance of such a measure when used in routine care has never been tested.

In 2015, Kaiser Permanente (KP) Washington began implementing an Alcohol Symptom Checklist,18,19,20 which is now used in all primary care clinics and has been completed by over 11,000 patients who reported high-risk drinking on an alcohol screening measure (AUDIT-C scores of 7–12). The present study evaluated the psychometric performance of this Alcohol Symptom Checklist by assessing its ability to measure AUD symptoms along a unidimensional continuum, discriminate AUD severity consistently with scientifically accepted views of AUD,15 and measure AUD severity similarly across subgroups based on age, sex, race, and ethnicity.

METHODS

Study Setting

This cross-sectional study used secondary clinical data from electronic health records (EHRs) and insurance claims of KP Washington, an integrated health system in Washington State.

Screening and Assessment Procedures

As part of an effort to integrate behavioral health care within primary care, all 32 KP Washington primary care clinics implemented annual alcohol screening for adult patients, followed by an EHR prompt to have patients complete an Alcohol Symptom Checklist if the screening indicated high-risk drinking (i.e., AUDIT-C scores of 7–12).20 As of February 2020, alcohol screens were completed by 91% of adult primary care patients. Most alcohol screens were completed during primary care appointments (65.0%); others were completed during mental health (28.5%) and other types of visits (e.g., obstetrics, urgent care; 6.5%). In the most common case where screening occurred in primary care, the EHR automatically prompted check-in staff and medical assistants to give the paper screening questionnaire if it had not been completed within the past year. Primary care patients then completed the screening questionnaire in waiting areas or exam rooms and medical assistants entered responses into the EHR prior to the patient’s appointment with primary care providers. If a primary care patient screened positive for high-risk drinking within the past year without completing a subsequent Alcohol Symptom Checklist, the EHR prompted medical assistants to give them a paper-based Alcohol Symptom Checklist, which medical assistants entered into the EHR.

Patient Population

Eligible patients were identified from EHR data and met four criteria: (a) had ≥ 1 visit to a KP Washington primary care clinic between October 1, 2015, and February 29, 2020, (b) screened positive for high-risk drinking (AUDIT-C score of 7–12), (c) completed the Alcohol Symptom Checklist 0–365 days after the positive alcohol screen, and (d) were at least 18 years old when the checklist was completed. The study was approved by the KP Washington Health Research Institute’s Institutional Review Board with a waiver of consent and HIPAA authorization to use existing EHR data.
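The eligibility rules above can be expressed as a simple record-level check. The following is a minimal sketch only: the function names, argument names, and data layout are hypothetical and are not taken from the study's actual data pipeline.

```python
from datetime import date

# Study window as stated in the eligibility criteria.
STUDY_START = date(2015, 10, 1)
STUDY_END = date(2020, 2, 29)


def age_on(birth_date: date, on_date: date) -> int:
    """Return completed years of age on a given date."""
    had_birthday = (on_date.month, on_date.day) >= (birth_date.month, birth_date.day)
    return on_date.year - birth_date.year - (0 if had_birthday else 1)


def is_eligible(visit_dates, audit_c_score, screen_date, checklist_date, birth_date) -> bool:
    """Apply the four stated criteria to one (hypothetical) patient record."""
    if checklist_date is None:
        return False  # no Alcohol Symptom Checklist on record
    has_visit = any(STUDY_START <= d <= STUDY_END for d in visit_dates)
    high_risk = 7 <= audit_c_score <= 12
    timely = 0 <= (checklist_date - screen_date).days <= 365
    adult = age_on(birth_date, checklist_date) >= 18
    return has_visit and high_risk and timely and adult
```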

Measures

Alcohol Screening Measure

The AUDIT-C13,14 is a validated21,22 self-report measure of alcohol consumption included in KP Washington’s annual behavioral health screen. It includes three items that assess the frequency of drinking, typical drinks per drinking day, and frequency of heavy drinking (eSupplement). Items are answered on a 5-point scale (0–4 points) and then summed (total score 0–12 points). If AUDIT-C scores were 7–12,23 the EHR prompted staff to ask patients to complete an Alcohol Symptom Checklist.
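As an illustration of the scoring rule described above, the sketch below sums the three item scores and applies the 7–12 threshold that triggered the checklist prompt. The function names are hypothetical and do not reflect the EHR's actual logic.

```python
def audit_c_total(frequency: int, typical_drinks: int, heavy_frequency: int) -> int:
    """Sum the three AUDIT-C item scores (each 0-4) into a 0-12 total."""
    items = (frequency, typical_drinks, heavy_frequency)
    for score in items:
        if not 0 <= score <= 4:
            raise ValueError("each AUDIT-C item is scored 0-4")
    return sum(items)


def prompts_checklist(total: int) -> bool:
    """True when the total falls in the high-risk range (7-12) used in this study."""
    return 7 <= total <= 12
```

For example, item scores of 4, 2, and 3 give a total of 9, which falls in the high-risk range.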

Alcohol Symptom Checklist

KP Washington’s Alcohol Symptom Checklist (eSupplement) is an 11-item self-report questionnaire that asks patients whether they have experienced each of the 11 DSM-5 AUD criteria within the past year, a timeframe consistent with diagnostic standards.15 Summed scores reflect criteria counts (0–11) that may aid the determination of an AUD diagnosis (≥ 2 criteria endorsed) and its severity as defined in DSM-5 (mild: 2–3 criteria; moderate: 4–5 criteria; severe: 6–11 criteria), if clinicians determine symptoms are recurrent.
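The mapping from criteria counts to DSM-5 severity categories can be sketched as follows. This is an illustrative helper with hypothetical names; it summarizes checklist responses only and does not replace the clinician's determination that symptoms are recurrent.

```python
def criteria_count(responses) -> int:
    """Count endorsed criteria from 11 yes/no checklist responses."""
    if len(responses) != 11:
        raise ValueError("the checklist has 11 items")
    return sum(1 for endorsed in responses if endorsed)


def dsm5_severity(count: int) -> str:
    """Map a criteria count (0-11) to the DSM-5 severity category."""
    if not 0 <= count <= 11:
        raise ValueError("criteria count must be 0-11")
    if count >= 6:
        return "severe"
    if count >= 4:
        return "moderate"
    if count >= 2:
        return "mild"
    return "below diagnostic threshold"
```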

Demographics and Other Measures

Age was grouped into categories similar to those used in a previous study evaluating an AUD diagnostic measure in a national epidemiological sample (18–24, 25–44, 45–64, and 65+ years).24 Sex was coded as male or female based on available EHR data. Race and ethnicity were coded to align with categories defined by the US National Institutes of Health (race: White, Black/African American, Asian/Asian American, American Indian/Alaska Native, or Native Hawaiian/Pacific Islander; ethnicity: non-Hispanic or Hispanic). Participants with other (4.5%) or unknown race (5.2%) or unknown ethnicity (5.4%) were included in full-sample analyses but excluded from race- and ethnicity-specific analyses. Medicaid and Medicare insurance statuses were obtained from enrollment records. AUD diagnoses up to 2 years before completing Alcohol Symptom Checklists were identified using ICD-9 and ICD-10 codes from EHRs and insurance claims.

Analytic Approach

Psychometric analyses tested whether the Alcohol Symptom Checklist performed in a manner consistent with current scientific conceptualizations of AUD15 and with the performance of AUD diagnostic measures tested in previous epidemiological studies.24,25,26

Unidimensionality

Current diagnostic standards15 and population-based studies24,25,26 characterize AUD as a brain disorder with a unidimensional continuum of severity reflected by the number of AUD criteria present. This contrasts with popular perceptions of AUD as a binary or categorical condition that is often assumed to be severe (e.g., “alcoholism”)27,28 and with historical diagnostic standards that distinguished alcohol abuse and alcohol dependence as separate diagnoses without severity designations.29

In this study, confirmatory factor analysis tested whether the Alcohol Symptom Checklist measured a single, continuous underlying dimension of AUD severity (vs. multiple dimensions) by specifying a model with all 11 AUD criteria loading onto a single, continuous latent variable. The factor model was fit using maximum likelihood within a two-parameter logistic item response theory (IRT) framework via the mirt package in R.30,31,32,33 Adequacy of the unidimensional factor model was evaluated by comparing model fit indices to a priori cutoffs for acceptable fit: comparative fit index >.95, root mean square error of approximation <.05, standardized root mean square residual <.05 (possible range for all indices: 0–1).34,35
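For readers less familiar with IRT, the two-parameter logistic model can be written in its standard textbook form as shown below. The symbols are assumptions introduced here for illustration: \theta_j is the latent AUD severity of patient j, a_i is the discrimination parameter of item i, and b_i is its severity parameter. Software may estimate an equivalent slope-intercept form.

```latex
P(X_{ij} = 1 \mid \theta_j) = \frac{1}{1 + \exp\left[-a_i\,(\theta_j - b_i)\right]}
```

Under this form, a patient whose latent severity equals an item's severity parameter (\theta_j = b_i) has a 50% probability of endorsing that item.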

Item Characteristics

IRT analyses characterized how well the Alcohol Symptom Checklist items discriminated higher versus lower latent AUD severity. Specifically, the IRT model36 estimated each item’s ability to discriminate whether a patient has higher versus lower latent AUD severity (discrimination parameter) and where along the continuum of latent AUD severity the item best discriminates (severity parameter). These parameters were illustrated graphically using item characteristic curves to show the probabilities of endorsing each of the 11 items based on a patient’s underlying latent AUD severity.
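An item characteristic curve can be traced by evaluating the two-parameter logistic function across a range of latent severity values. The sketch below assumes the standard textbook parameterization; the parameter values are hypothetical and are not the estimates reported in Table 2.

```python
import math


def endorse_probability(theta: float, discrimination: float, severity: float) -> float:
    """Probability of endorsing an item under a two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - severity)))


# Hypothetical item: discrimination 2.0, severity 1.0 (not study estimates).
curve = [(theta, endorse_probability(theta, 2.0, 1.0)) for theta in (-2, -1, 0, 1, 2)]
for theta, probability in curve:
    print(f"theta={theta:+d}  P(endorse)={probability:.3f}")
```

A larger discrimination value makes the curve steeper around the severity parameter, and a larger severity value shifts the curve to the right.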

Differential Item Functioning

To ensure equitable interpretation of Alcohol Symptom Checklists, it is important to understand whether the number of AUD criteria endorsed on the checklist is influenced by other factors besides AUD severity, such as the patient’s age, sex, race, or ethnicity. If a patient’s likelihood of endorsing specific Alcohol Symptom Checklist items is influenced by demographic factors, rather than by their AUD severity, those items are said to have differential item functioning, which may suggest the instrument measures AUD severity differently for different subgroups.

Differential item functioning was tested using established procedures37,38,39 that compared the fit of an IRT model that assumed identical item discrimination and severity parameters for all demographic subgroups to models that estimated discrimination and severity parameters separately for each subgroup. Differential item functioning was indicated when the latter model resulted in significantly better fit via likelihood ratio testing.
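The likelihood ratio comparison described above takes the following general form. The notation is an assumption introduced here for illustration: L_0 is the maximized likelihood of the model with parameters constrained to be equal across subgroups, L_1 is that of the model with subgroup-specific parameters, and k_0 and k_1 are their numbers of free parameters.

```latex
G^2 = -2\left[\ln L_0 - \ln L_1\right], \qquad
G^2 \sim \chi^2_{\,k_1 - k_0} \ \text{under the null hypothesis of no differential item functioning}
```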

Assessing the Clinical Impact of Differential Item Functioning

Importantly, differential item functioning can be statistically significant without having a clinically meaningful impact on the determination of AUD severity.37,40 Because AUD diagnosis and its severity are determined by the number of AUD criteria present,15 knowing the impact of differential item functioning on the number of criteria endorsed is more clinically important than knowing whether there is differential item functioning for individual items within the checklist.40

We evaluated the cumulative impact of differential item functioning in three ways, namely (1) examining how the expected number of criteria endorsed by patients with the same latent AUD severity increased or decreased due to differential item functioning across different demographic subgroups,37 (2) comparing estimates of latent AUD severity (factor scores) with versus without correction for differential item functioning,41 and (3) examining differences in fit indices of models with versus without correction for differential item functioning for each demographic subgroup.42,43
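The first approach can be illustrated by summing item endorsement probabilities to obtain an expected criteria count at each level of latent severity, then comparing subgroups. The sketch below assumes a two-parameter logistic model in its standard form; all parameter values and names are hypothetical and do not reproduce the study's estimates.

```python
import math


def endorse_probability(theta: float, a: float, b: float) -> float:
    """Two-parameter logistic endorsement probability."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def expected_criteria(theta: float, items) -> float:
    """Expected number of criteria endorsed: sum of item probabilities."""
    return sum(endorse_probability(theta, a, b) for a, b in items)


def max_count_difference(items_reference, items_focal, thetas) -> float:
    """Largest absolute gap in expected counts between two subgroups."""
    return max(
        abs(expected_criteria(t, items_reference) - expected_criteria(t, items_focal))
        for t in thetas
    )


# Hypothetical (discrimination, severity) pairs for 11 items in two subgroups.
reference = [(2.0, 0.5)] * 11
focal = [(2.0, 0.5)] * 10 + [(2.0, 0.9)]  # one item shifted for the focal group
thetas = [x / 10 for x in range(-30, 31)]
print(round(max_count_difference(reference, focal, thetas), 3))
```

Because only one hypothetical item differs between the two subgroups, the gap in expected counts stays well below one criterion across the severity range.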

RESULTS

Sample Description

Of 449,795 primary care patients who completed AUDIT-C screens, 17,760 (3.9%) had AUDIT-C screening scores of 7–12 indicating high-risk drinking at any point during the study period. Within this subgroup, 11,464 (64.5%) completed an Alcohol Symptom Checklist and were included in subsequent analyses (88.5% of checklists were completed on the same day as the AUDIT-C screening). Among those with high-risk drinking, patients were more likely to subsequently complete an Alcohol Symptom Checklist if they were male, had an AUDIT-C score of 10–12, or did not have a past 2-year AUD diagnosis from a healthcare provider; however, rates of Alcohol Symptom Checklist completion did not differ by age, race, ethnicity, or Medicaid/Medicare insurance status (see Table 1). There were 2064 patients (18.0% of those with Alcohol Symptom Checklists) who completed more than one checklist; for these patients, only the first checklist was retained to maintain independent observations.

Table 1 Characteristics of Sample: Patients With and Without Alcohol Symptom Checklists After Screening Positive for High-Risk Drinking on the AUDIT-C

Characteristics of the sample that completed the Alcohol Symptom Checklist are shown in Table 1. On average, participants were 43.6 years old and 30.5% were female. Most participants were White and non-Hispanic, but racial and ethnic minority subgroups were large enough (n = 163 to 780) to adequately test differential item functioning.44

Rates of AUD Criteria

Of patients who completed an Alcohol Symptom Checklist, 54.1% reported ≥ 2 AUD criteria, exceeding the threshold for an AUD diagnosis, including 20.1%, 12.1%, and 21.8% reporting symptoms consistent with mild, moderate, and severe AUD, respectively. In contrast, only 19.6% of patients who completed the Alcohol Symptom Checklist had AUD diagnosed within the 2-year period before completing the checklist based on EHR diagnoses and insurance claims. The 11 AUD criteria were endorsed at varying rates, ranging from 16.2% reporting being intoxicated more than once in situations where it was dangerous to 44.9% reporting continued drinking despite knowing or suspecting it creates or worsens mental or physical problems (Table 2).

Table 2 Prevalence of Specific Alcohol Use Disorder (AUD) Criteria and Item Response Theory (IRT) Parameter Estimates

Tests of Unidimensionality

Confirmatory factor analysis indicated that the Alcohol Symptom Checklist measured latent AUD severity along a unidimensional continuum. The unidimensional factor model provided excellent fit34,35 to the 11 Alcohol Symptom Checklist items: comparative fit index = 0.99, root mean square error of approximation = 0.04, standardized root mean square residual = 0.02.

Item Characteristics

Item characteristic curves (Fig. 1) show the probabilities of each AUD criterion being reported with respect to a patient’s latent AUD severity. IRT discrimination and severity parameters are shown in Table 2. Most items had high discrimination parameters, indicating they were strongly associated with AUD severity (Table 2). Comparatively, discrimination parameters were somewhat lower for the two items assessing tolerance and hazardous use. Some items discriminated AUD severity better when latent AUD was mild to moderate (e.g., physical/psychological problems, larger/longer, craving) as demonstrated by lower severity parameters; other items discriminated AUD severity better when AUD was moderate to severe (e.g., hazardous use, withdrawal, neglect roles) as demonstrated by higher severity parameters (Table 2).

Fig. 1

Item characteristic curves for the eleven alcohol use disorder (AUD) criteria assessed on the Alcohol Symptom Checklist. Latent AUD severity (x-axis) is distributed such that 0 corresponds to the sample mean (approximately 2.98 AUD criteria) and ±1 corresponds with ±1 standard deviation (SD) from the mean (SD = 3.23 AUD criteria).

Differential Item Functioning

Seven items had statistically significant differential item functioning associated with age (tolerance, larger/longer, quit/control, hazardous use, social/interpersonal problems, craving, activities given up). Three items each had significant differential item functioning associated with sex (larger/longer, hazardous use, craving) and with race (tolerance, larger/longer, quit/control). One item had significant differential item functioning associated with ethnicity (quit/control; see eSupplement for detailed reporting).

Clinical Impact of Differential Item Functioning

Figure 2 displays the expected number of AUD criteria endorsed (y-axis), based on a patient’s latent AUD severity (x-axis) for all demographic subgroups (shown as separate curves that nearly overlap). Differences in expected criteria counts for individuals from different demographic subgroups with the same latent AUD severity were small and never diverged by more than half of one criterion, suggesting minimal cumulative impact of differential item functioning on total criteria counts and therefore minimal impact on clinical interpretation. When AUD severity was held constant, differential item functioning was expected to produce differences in AUD criteria counts of no more than 0.38 criteria for age (patients aged 18–24 reporting fewer criteria), < 0.1 criteria for sex, < 0.42 criteria for race (White patients reporting fewer criteria), and < 0.1 criteria for Hispanic ethnicity (Fig. 2; see eSupplement for additional details).

Fig. 2

Analyses of differential test functioning for demographic subgroups. Latent alcohol use disorder (AUD) severity (x-axis) is distributed such that 0 corresponds to the sample mean (2.98 AUD criteria) and ±1 corresponds with ±1 standard deviation (SD) from the mean (SD = 3.23 criteria). Separate curves are shown to illustrate the expected number of AUD criteria endorsed (y-axis) for each demographic subgroup. Vertical distances between the plotted lines indicate the expected differences in the number of AUD criteria endorsed that are attributable to differential item functioning, when latent AUD severity is held constant between members of different subgroups. In all cases, the expected differences in the number of AUD criteria endorsed attributable to differential item functioning (i.e., vertical distances between lines) were less than half of one criterion, which is not a clinically meaningful difference. More detailed differential item functioning results are presented in the online supplement.

Estimated values of latent AUD severity (i.e., factor scores) with versus without correction for differential item functioning were similar and highly correlated (r ≥ 0.996), indicating that correction for differential item functioning typically had minimal impact on estimated latent AUD severity. Differences in the comparative fit indices of models with identical versus independently estimated discrimination and severity parameters for each subgroup were never larger than 0.01, indicating that correction for differential item functioning had minimal effect on absolute model fit (see eSupplement).42

DISCUSSION

Currently, there are no validated, practical measures available to assess AUD symptoms in routine primary care despite the considerable disease burden1,2,3,4 and ability to treat AUD.10,11,12 Prior research has shown that implementation of routine alcohol screening and assessment of AUD symptoms among patients with high-risk drinking as part of behavioral health integration is feasible19 and affordable,45 and that patients are often willing to report AUD symptoms on Alcohol Symptom Checklists.18 The current study further shows that the Alcohol Symptom Checklist, administered in routine care and entered into patients’ EHRs, is psychometrically valid, performs equitably across age, sex, race, and ethnicity subgroups, and can detect AUD symptoms for many patients who report high-risk drinking but have not been previously diagnosed with AUD.

The two items that exhibited somewhat lower discrimination parameters (tolerance, hazardous use) also had lower discrimination parameters in prior studies of AUD symptom measures,24,26,46 suggesting these criteria may discriminate AUD severity less strongly than other criteria. Similar to prior epidemiological studies,24,25,26 some items demonstrated differential item functioning, but the cumulative effect of differential item functioning on overall AUD criteria counts and estimated AUD severity was too small to have a clinically meaningful impact.24,26

Most patients with AUD contact the health system through primary care47,48 and only 8% of them receive care in traditional addiction treatment settings.49 Primary care is therefore uniquely positioned to provide AUD-related care that many patients will not otherwise seek or receive.10,50,51,52 Primary care settings are increasingly adopting practices to screen patients for high-risk drinking,53 as recommended by the US Preventive Services Task Force, which also recommends more in-depth risk assessment when high-risk drinking is detected.54

Alcohol Symptom Checklists can be implemented as part of a broader effort to address behavioral health within primary care and can systematically assess AUD symptoms among patients who report high-risk drinking. Using the checklist after patients report high-risk drinking may help prompt patient-centered discussions about the negative consequences of alcohol use (e.g., “you indicated wanting to cut down or stop drinking but being unable to do so – can you tell me more about that?”), while also potentially helping improve problem recognition28 and guiding AUD diagnosis, treatment, and management.16,17,18 The use of a written checklist (versus verbal assessment of AUD symptoms) may help facilitate more valid responding, mitigate potential patient discomfort, and avoid stigma.55,56 Additional research may test whether Alcohol Symptom Checklists improve rates of AUD diagnosis and treatment engagement.

Limitations and Strengths

Our study has limitations. Patients may have underestimated or underreported their drinking on screens and/or underreported AUD symptoms. Additionally, 35.4% of patients with high-risk drinking did not have an EHR-documented Alcohol Symptom Checklist, and the reasons are unknown (e.g., clinic staff may have had limited time or discomfort administering it; patients may have refused or presented with more urgent concerns). Results should therefore be interpreted as supporting the performance of the Alcohol Symptom Checklist among patients who report high-risk drinking and who complete the checklist as part of routine care. Sex, race, and ethnicity were modeled as categorical variables that do not capture the dimensional, intersectional, social, historical, and cultural factors that could potentially affect drinking and AUD (e.g., sexism, racism, social determinants of health).57,58 Our use of a routine care sample also prevented us from administering structured diagnostic interviews of AUD symptoms to test criterion validity. Future studies should compare the checklist to a gold standard criterion.

The study also has several strengths. Evaluating data collected in routine care (rather than during research participation) greatly enhances external validity by focusing on the performance of the Alcohol Symptom Checklist completed clinically where results are entered into EHRs and available to healthcare providers. The routine care sample also reduces the risk of sampling bias by including individuals who might not enroll in research studies. The psychometric evaluation used rigorous methods. The AUDIT-C is widely used and freely available for public use, along with the Alcohol Symptom Checklist (eSupplement), improving the potential to disseminate these tools in practice.

CONCLUSION

This study validated the Alcohol Symptom Checklist as a practical measure of AUD symptoms that provides scaled, unidimensional information to help primary care clinicians assess the presence of AUD and its severity, doing so equitably across age, sex, race, and ethnicity subgroups. Routinely screening for high-risk drinking and assessing AUD symptoms when patients report high-risk drinking may help primary care providers diagnose AUD, inform clinical decision-making, and promote patient-centered discussions about alcohol.