Introduction

Antiretroviral therapy (ART) can reduce viral load to undetectable levels in people living with HIV (PLWH), forestalling the progression to AIDS (Chan, Wong, & Lee, 2006; Herbst et al., 2009; Palella et al., 1998) and reducing the risk of HIV transmission (Krakower, Jain, & Mayer, 2015; World Health Organization, 2012). Patient behavior forms the foundation of this success and also serves as its rate-limiting step. Suboptimal ART adherence leads to AIDS-related morbidity and risk of mortality (Sherr et al., 2010; Bangsberg et al., 2001), creates opportunities for transmission through sexual and drug-injection networks (Friedman et al., 2007; Johnson et al., 2014), and promotes drug-resistant viral strains (Assoumou et al., 2013; Ferreira et al., 2013; Sinha et al., 2012). Adequate adherence helps preserve the effectiveness of this class of treatments and protect both individual and public health.

Clinicians try to identify patients with suboptimal medication adherence. In a global healthcare context of limited resources and high patient-to-provider ratios (Park-Wyllie, Kam, & Bayoumi, 2009; World Health Organization, 2006), there is a need for adherence monitoring methods that are simple, efficient, and cost-effective. Unannounced pill counts, electronic drug monitoring, tracking pharmacy refill records, and the direct observation of therapeutic drug levels each have the benefit of third party objectivity but carry a heavy financial and person-hour burden; these methods may be unrealistic for scale up in many areas where routine adherence monitoring is needed most. Self-report measures are more expedient and cost-effective, but are subject to biases of recall, reporting, and social desirability. Some estimates suggest rates of “adherence inflation” that are as much as 15 % greater than results using objective measures of adherence (Shi et al., 2010).

The visual analogue scale (VAS) is a type of rating system in which the respondent is presented with a line that visually represents a range of possible ratings or responses to a question. The respondent is instructed to place a mark at a point on the line that represents their rating or response. The VAS line can be presented horizontally or vertically; lines can vary in length, but the accepted standard length of 10 cm is commonly used (Wewers & Lowe, 1990). Labeled endpoints anchor the scale boundaries and can be quantitative (e.g., 1–10, 0–100 %) or qualitative (e.g., best–worst, severe–slight). The VAS typically assists in measuring subjective clinical phenomena (e.g., pain, dizziness) that are otherwise difficult to describe (Torrance, Feeny, & Furlong, 2001; Wewers & Lowe, 1990).

Marking a numbered line suggests a process that is ostensibly simple and language independent, but there are both conceptual and practical concerns about the use of the VAS to measure patient medication adherence. In HIV, for example, the VAS is typically presented as a numerical scale of doses taken, anchored by 0 and 100 % on either end of the line. Respondents are then asked to mark the point on the line that represents their adherence over the past 30 days. While simple on its face, this process may require a burdensome level of conceptual abstraction. Researchers conducting cognitive interviews have observed that the mathematical manipulation needed to respond to this standard VAS prompt led to a greater number of errors relative to other self-report measures (Wilson et al., 2014). Additional concerns relate to VAS data management. Despite pretensions to quantification, over-reporting of adherence is as common in VAS data as in other, more qualitatively framed self-report measures (Wilson, Carter, & Berg, 2009). In most samples, VAS score distributions exhibit a pronounced negative skew, violating statistical assumptions for parametric testing of this outcome. Subsequent transformation of the variable can impact interpretability (Cohen et al., 2003).

However, there are several factors that might also incline researchers and clinicians toward VAS use in medication adherence research and care. By providing data on a continuous rating scale, the VAS permits more sophisticated analytic possibilities than can be found with data from categorical response sets or Likert scales (Treiblmaier & Filzmoser, 2009). With its single-item structure and visual/graphical format, the VAS appears simpler and briefer to administer than a standard HIV medication adherence questionnaire (Feldman et al., 2013; Kerr et al., 2012). These same physical characteristics also make the VAS inherently well suited to the touch screen interface of today’s handheld technology and the burgeoning field of mHealth (Muessig et al., 2015). Web-based survey design platforms typically provide a VAS option for presenting questionnaire items, and studies have demonstrated the feasibility and acceptability of using patients’ own communication devices for adherence monitoring (Bastawrous & Armstrong, 2013; Brinkel et al., 2014; Furberg et al., 2012). All this makes the VAS a potential option for use in the next wave of mHealth surveillance and interventions, including those targeting medication adherence. In the last 10 years, researchers have studied the VAS measurement of ART adherence compared with other standard measurement instruments. This has yielded a mixed literature: how well the VAS measures ART adherence remains unclear. As the emphasis on monitoring of and interventions to promote medication adherence continues to grow, there is a pressing need to assess the value of this ostensibly simple and expedient self-report measure.

The present study provides a quantitative review and meta-analysis of the VAS used to measure ART adherence. We seek to address the following research questions: (1) What is the average strength of association between the VAS and other measures of medication adherence? (2) How well do VAS scores predict patient viral load? (3) Do methodological factors of VAS administration and study design influence the strength of the VAS–viral load relationship? We hypothesized a priori that studies with stricter adherence to standards of methodological quality would report greater concordance between VAS and viral load values in their samples.

Methods

Data collection

Multiple strategies were used to identify relevant studies. Two independent researchers conducted Boolean searches of the PubMed/Medline, PsycINFO, and CINAHL electronic databases. Unpublished or “gray literature” was sought using the electronic database ProQuest Dissertations and Theses as well as through hand searches of recent oral and poster abstracts from the archives of the International AIDS Conference and the International Conference on HIV Treatment and Prevention Adherence (2012–2014). Search terms included permutations of visual analog scale and adherence (e.g., [“VISUAL ANALOG SCALE” OR “VISUAL ANALOGUE SCALE” OR VAS] AND [“ADHERENCE” OR “MEDICATION ADHERENCE”]). Studies that met the following a priori criteria were included in this review: (1) use of VAS to measure antiretroviral adherence with PLWH; (2) a comparison of VAS to at least one other measure, biomarker, or clinical outcome of antiretroviral adherence; and (3) sufficient data to calculate effect size. Whenever a study met criteria (1) and (2) but did not report sufficient data for effect size calculation, we contacted the corresponding author to request the necessary information.

Coding of studies

Using a standardized, pilot tested coding form (available upon request), two reviewers independently abstracted study data including general information (e.g., study location, year of data collection), participant characteristics (e.g., age, gender, race/ethnicity), design elements, and comparison measures used in addition to the VAS. Coders displayed an acceptable rate of agreement (agreement rate: 93.75 %; weighted κ: 0.817, p < 0.001). Discrepancies were reconciled through discussion.

Risk of bias

To assess risk of bias within individual studies (Higgins & Green, 2011), we used a methodological quality (MQ) rating form developed by the United Kingdom’s National Institute for Health and Care Excellence (NICE) specifically for the assessment of “quantitative studies reporting correlations and associations” (NICE, 2012). This 19-item instrument evaluates the extent to which bias may be present through appraisal of such factors as population sampling (selection bias), methodological/analytical design (measurement bias), and statistical power. Raters score items with either a minus sign (“−”) indicating high risk of bias; a plus sign (“+”) indicating moderate risk of bias; or a double plus sign (“++”) indicating low risk of bias for that parameter. In accordance with these guidelines, we report descriptions of internal and external validity summary ratings categorically, converting these to numerical scores as necessary for the purpose of testing methodological quality scores as a moderator.

We also assessed for the risk of publication bias favoring studies demonstrating greater concordance between the VAS and other measures or outcomes as well as any asymmetry of effect sizes by study. In addition to graphing a standard “funnel plot” (Fig. 1) we also calculated its statistical equivalent using Begg’s correlation between observed effect size and inverse weighted variance (Begg & Mazumdar, 1994). The resultant correlation was not significantly different from zero (z = −0.10; p = 0.7) and the only notable asymmetry in the funnel plot consisted of one highly weighted study that reported null effects (Gionatti et al., 2013). These findings imply a very low risk of publication bias.
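
As an illustration of the rank correlation test described above, the following is a minimal sketch that assumes the standard Begg and Mazumdar formulation (Kendall's tau between standardized effect sizes and their sampling variances, with a normal approximation and no correction for ties). The function name and the input values are hypothetical; they are not the data or the software used in this review.

```python
import math

def begg_rank_correlation(effects, variances):
    """Kendall's tau between standardized effects and their variances.

    Sketch of the Begg & Mazumdar (1994) approach; assumes no tied values.
    Returns (tau, z, two_sided_p).
    """
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_w
    # Standardize each deviation from the pooled estimate by its own variance,
    # which is v_i minus the variance of the pooled estimate (1 / total_w).
    standardized = [
        (e - pooled) / math.sqrt(v - 1.0 / total_w)
        for e, v in zip(effects, variances)
    ]
    n = len(effects)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (standardized[i] - standardized[j]) * (variances[i] - variances[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    s = concordant - discordant
    tau = s / (n * (n - 1) / 2)
    # Normal approximation for Kendall's S statistic under the null hypothesis.
    z = s / math.sqrt(n * (n - 1) * (2 * n + 5) / 18)
    p = math.erfc(abs(z) / math.sqrt(2))
    return tau, z, p

# Hypothetical example: smaller (higher-variance) studies report larger effects.
tau, z, p = begg_rank_correlation(
    effects=[0.10, 0.20, 0.30, 0.40, 0.50],
    variances=[0.01, 0.02, 0.03, 0.04, 0.05],
)
```

A tau near zero, as reported above, indicates no systematic relationship between study precision and effect size; a strongly positive or negative tau would suggest funnel plot asymmetry.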

Fig. 1 Funnel plot of included studies

Analytic approach

In this review, effect sizes (ESs) were estimated using the correlation coefficient r (Card, 2012; Cooper & Hedges, 1994; Hedges & Olkin, 1985), which we converted to standardized form (Zr) before applying inverse variance weighting. For reporting purposes we back-transformed all results from Zr to the more familiar r coefficient (Field, 2005) with 95 % confidence intervals, scaled on a continuum from −1 to 1. Positive values indicate a positive correlational relationship (i.e., stronger concordance between adherence measures) and negative values indicate a negative correlational relationship. An important exception to this is viral load, which typically correlates negatively with measures of adherence. In order to maintain consistency across adherence measures and aid in comparison and interpretation, we present viral load–VAS correlations with the sign reversed. Confidence intervals containing zero reflect a nonsignificant correlation. Throughout this report we use the basic descriptors put forward by Cohen (1992) to characterize small (r = 0.1), medium (r = 0.3), and large (r = 0.5) effect sizes. All characteristics of this meta-analysis are reported using PRISMA guidelines (Moher et al., 2009).
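
The transformation steps described above can be sketched as follows. This is a minimal illustration assuming the usual large-sample variance of Zr, 1 / (n − 3); the function names, the `reverse_sign` parameter, and the example values are hypothetical and are not taken from the included studies.

```python
import math

def effect_size_from_r(r, n, reverse_sign=False):
    """Convert a correlation to Fisher's Zr with its inverse variance weight.

    reverse_sign=True flips the sign, as done here for viral load correlations
    so that larger values always mean stronger concordance.
    """
    if reverse_sign:
        r = -r
    z = math.atanh(r)           # Fisher's r-to-Zr transformation
    variance = 1.0 / (n - 3)    # large-sample variance of Zr (assumption)
    return {"z": z, "variance": variance, "weight": 1.0 / variance}

def r_confidence_interval(r, n, z_crit=1.96):
    """Build the interval on the Zr scale, then back-transform to r."""
    es = effect_size_from_r(r, n)
    half_width = z_crit * math.sqrt(es["variance"])
    return math.tanh(es["z"] - half_width), math.tanh(es["z"] + half_width)

# Hypothetical study: r = -0.30 between VAS and viral load, n = 103.
es = effect_size_from_r(-0.30, 103, reverse_sign=True)
low, high = r_confidence_interval(0.30, 103)
```

Because the interval is built on the Zr scale, it is symmetric around Zr but slightly asymmetric around r once back-transformed.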

Many studies reported using multiple adherence measures (e.g., self-report, pill count, electronic data monitoring) in addition to the VAS. To uphold the assumption of independence, a multivariate approach (Becker et al., 2000; Gleser & Olkin, 1994) was followed when more than five comparisons were available for sensitivity analysis. However, because of the large variability in comparison measures and the small number of studies per comparison measure, average ESs were calculated separately by type of comparison measure so that each study would contribute only one outcome per average ES synthesized. When multiple measurements were made within a given type of measure in the same study, such as multiple different measures of self-reported adherence other than the VAS, ESs were averaged within that study before combining ESs between studies. For analyses, correlation coefficients were standardized using the Fisher Zr transformation (Field, 2001). Inverse variance weights for each outcome were also calculated. Final meta-analytical tests of derived ESs were performed using SPSS version 21 (IBM Corp., 2012) and the R package metafor (Viechtbauer, 2010). Weighted mean ESs were calculated to estimate overall strength of association between the VAS and each of the comparison variables. ESs were analyzed using random-effects assumptions, with the magnitude of heterogeneity across ESs assessed using the I² statistic and its confidence interval (Huedo-Medina et al., 2006; Higgins, Thompson, Deeks, & Altman, 2003). Analyses using the Q statistic as a measure of variance in a meta-analytic analog to the one-way ANOVA (Wilson, 2002) assessed whether study characteristics explained variability in the ability of the VAS to predict viral load across studies. Methodological quality rankings have been identified as an under-analyzed element of the data reported in meta-analyses (Johnson et al., 2014). Derived internal and external validity scores were entered into a series of weighted least squares regression models incorporating random-effects assumptions (Schmidt, Oh, & Hayes, 2009; Wilson, 2002); we used the moving constant technique (Johnson & Huedo-Medina, 2011) to produce estimates at meaningful levels of the moderators.
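
To make the pooling and heterogeneity steps concrete, the sketch below computes a random-effects weighted mean ES and the I² statistic from Zr values and their variances. It assumes the DerSimonian-Laird estimator of between-study variance, which is one common choice; the source does not state which estimator was used, and the function name and inputs are hypothetical.

```python
import math

def random_effects_pool(z_values, variances, z_crit=1.96):
    """Random-effects mean of Fisher Zr values with Q, tau^2, and I^2.

    Uses the DerSimonian-Laird tau^2 estimator (an assumption of this sketch).
    """
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * zi for wi, zi in zip(w, z_values)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (zi - fixed_mean) ** 2 for wi, zi in zip(w, z_values))
    df = len(z_values) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I^2: percentage of total variation attributable to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled_z = sum(wi * zi for wi, zi in zip(w_re, z_values)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "q": q,
        "tau2": tau2,
        "i2": i2,
        "r": math.tanh(pooled_z),
        "ci": (math.tanh(pooled_z - z_crit * se), math.tanh(pooled_z + z_crit * se)),
    }

# Hypothetical example with two equally precise but discrepant studies.
result = random_effects_pool([0.0, 0.5], [0.01, 0.01])
```

When Q does not exceed its degrees of freedom, tau² and I² are both truncated at zero and the random-effects mean coincides with the fixed-effect mean.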

Results

Literature search outcomes

Our search yielded 235 articles. After reviewing title and abstract, 139 studies were excluded as false hits or duplicates (k = 26). Of the remaining 96 studies initially retained, 73 were studies using VAS that did not measure ART adherence, did not report analyses directly comparing VAS scores with other measures or outcomes, or relied on caregiver report. The remaining 23 full-text articles met inclusion criteria; however, eight papers did not report sufficient data to calculate effect sizes. We contacted the primary authors of these studies to request additional data. Five authors responded and their reports (Graham et al., 2012; Gionatti et al., 2013; Kagee & Nell, 2012; Mbaugbaw et al., 2012; Segeral et al., 2010) are therefore included (see Fig. 2). Listings of excluded studies with rationale are available upon request. This meta-analysis reflects a final set of 20 studies that examined the use of VAS to measure ART adherence in samples of PLWH. Table 1 reports study characteristics and findings.

Fig. 2 Literature search PRISMA flow chart

Table 1 Characteristics of included studies (k = 20)

Study characteristics

These 20 peer-reviewed publications represent samples taken over an 18-year period (1996–2014; median = 2006). Included studies were distributed across four continents, representing eight different countries from the regions of sub-Saharan Africa (k = 6), East and Southeast Asia (k = 6), North America (k = 6), and Western Europe (k = 2).

Participant characteristics

This meta-analysis represents a total of 6138 participants across included studies, with an aggregate completion rate of 94.13 % (range 71.4–100 %) among consented participants. East or Southeast Asian participants constituted 27.5 % (n = 1688) of the total sample, sub-Saharan African participants 18.9 % (n = 1162), North American participants 17.8 % (n = 1096), and European participants 35.7 % (n = 2192). The mean participant age was 38.9 years and 39.5 % of the sample identified as female. Of studies reporting socio-economic indicators (k = 13) of their samples, a majority of participants endorsed limited educational attainment (i.e., less than a high school diploma or its equivalent) and met poverty guidelines for their respective countries. Ten studies reported information on participants’ familiarity with antiretrovirals; of these, 46.8 % of the overall sample was ART naïve (i.e., within 3 months of having initiated ART).

Design characteristics

Most (k = 12) studies reported the comparison and testing of adherence measures as a primary methodological aim of the study, but 40 % of studies identified other aims (e.g., STD surveillance study, adherence intervention study). Included studies exhibited variability (mean 1.6; range 1–4) in the number of comparison measures used to study the VAS. The most frequently reported comparison was of VAS to viral load (k = 12), followed by self-report measures (k = 9). Fewer studies reported comparisons of VAS with pill count (k = 6), electronic data monitoring (k = 4), and pharmacy refill records (k = 3). A total of nine studies (Berg et al., 2012; Buscher et al., 2011; Gill et al., 2010; Gionatti et al., 2013; Graham et al., 2012; Hong et al., 2013; Oyugi et al., 2004; Segeral et al., 2010; Walsh et al., 2002) reported administering the VAS as a self-report questionnaire. Seven studies (Giordano et al., 2004; Kagee & Nell, 2012; Kerr et al., 2012; Maneesriwongul et al., 2006; Mbaugbaw et al., 2012; Peltzer et al., 2010; Wang et al., 2008) reported administering the VAS through a face-to-face interview; Audio Computer Assisted Self-Interview (ACASI) was used in four studies (Amico et al., 2006; Do et al., 2013; Kalichman et al., 2009; Pellowski et al., 2015). Four studies reported normalizing patients’ experience of missed doses to reduce reporting and social desirability biases (Amico et al., 2006; Berg et al., 2012; Kalichman et al., 2009; Walsh et al., 2002).

VAS concordance with other adherence measures and viral load

Depending on the comparison measure used, average VAS measurements varied in their strength of association; these data are summarized in Table 2. Comparisons of VAS with other subjective self-report measures (k = 9; r = 0.61; 95 % CI 0.44, 0.74), with objective pill count data (k = 6; r = 0.72; 95 % CI 0.54, 0.85), and with objective data from electronic drug monitors (k = 4; r = 0.51; 95 % CI 0.19, 0.73) each revealed a large effect size. Aggregating the three studies comparing VAS to pharmacy refill data (k = 3; r = 0.02; 95 % CI −0.54, 0.56) yielded no meaningful association. The average strength of association between VAS and viral load was small (k = 12; r = 0.25; 95 % CI 0.16, 0.36) but significant. All comparison measures exhibited sufficient variation attributable to heterogeneity across studies as reflected in their I² values. Of these, only viral load had a sufficient number of studies (k = 12) to warrant an investigation of possible moderators that might account for this observed variation. Figure 3 displays the forest plot of VAS/viral load effect sizes with 95 % confidence intervals.

Table 2 Mean effect sizes by measurement method

Fig. 3 Forest plot of VAS/viral load effect sizes (k = 12), in descending order by study sample size

Did methodological factors moderate the effect size of the VAS–viral load relationship?

In attempting to account for the heterogeneity in the VAS adherence–viral load relationship reported across studies, we examined methodological parameters as categorical predictors of effect size. These results are summarized in Table 3. First, we considered whether the researchers’ stated aims were to specifically assess VAS methodology. Measurement studies of VAS did report larger effect sizes (k = 6; r = 0.31; 95 % CI 0.17, 0.44) than other studies using the VAS (k = 6; r = 0.20; 95 % CI 0.07, 0.32). Because assessment of adherence using VAS asks participants to recall the adherence over a period of 30 days, we also compared the factor of cross-sectional versus longitudinal study design. We defined a longitudinal study as one with 30 days duration or more. Longitudinal studies reported an overall larger effect size (k = 5; r = 0.35; 95 % CI 0.20, 0.48) than cross-sectional studies (k = 7; r = 0.19; 95 % CI 0.08, 0.30).

Table 3 Categorical moderator analyses of VAS–viral load association

In addition, we considered certain elements of the VAS administration protocol as moderators of the VAS–viral load relationship. The mode of administration was one such factor. Studies using an interview format showed larger effect sizes (k = 4; r = 0.35; 95 % CI 0.22, 0.46) than studies relying on a self-report questionnaire (k = 8; r = 0.18; 95 % CI 0.07, 0.28). Another factor we assessed was researchers’ attempts to combat participants’ tendency to over-report their adherence. Studies that sought to normalize the experience of missing doses showed larger effect sizes (k = 3; r = 0.34; 95 % CI 0.16, 0.50) than those that did not (k = 9; r = 0.22; 95 % CI 0.11, 0.32).
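
For readers who want to see how a between-groups Q test of this kind works, the sketch below reconstructs approximate standard errors from the subgroup means and 95 % confidence intervals reported above and computes Q-between for two subgroups. It is only an approximation under stated assumptions (intervals symmetric on the Zr scale, rounded inputs, one degree of freedom); the function names are hypothetical and the result should not be read as the exact test statistic from this review.

```python
import math

def subgroup_from_ci(r, ci_low, ci_high, z_crit=1.96):
    """Recover a subgroup's mean Zr and approximate SE from a reported r and CI."""
    z = math.atanh(r)
    se = (math.atanh(ci_high) - math.atanh(ci_low)) / (2 * z_crit)
    return z, se

def q_between(subgroups):
    """Meta-analytic analog to one-way ANOVA: weighted dispersion of subgroup means.

    subgroups is a list of (mean_z, se) tuples; returns (Q_between, df).
    """
    weights = [1.0 / se ** 2 for _, se in subgroups]
    grand_mean = sum(w * z for w, (z, _) in zip(weights, subgroups)) / sum(weights)
    q = sum(w * (z - grand_mean) ** 2 for w, (z, _) in zip(weights, subgroups))
    return q, len(subgroups) - 1

def chi2_p_value_df1(q):
    """Upper-tail p-value of a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(q / 2))

# Mode of administration, using the rounded values reported in the text.
interview = subgroup_from_ci(0.35, 0.22, 0.46)
questionnaire = subgroup_from_ci(0.18, 0.07, 0.28)
q, df = q_between([interview, questionnaire])
p = chi2_p_value_df1(q)
```

With only two subgroups, Q-between reduces to the squared difference between the subgroup means divided by the sum of their squared standard errors.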

Other study characteristics did not account for variance in effect size across studies and thus constitute “pertinent negatives” for this analysis. Studies that sampled U.S. and Western European populations (k = 5; r = 0.28; 95 % CI 0.15, 0.34) were not significantly different in their estimates of the VAS and viral load relationship from those that sampled African or East and Southeast Asian populations (k = 7; r = 0.22; 95 % CI 0.09, 0.87). Studies abstracting viral load data from participant medical records (k = 6; r = 0.26; 95 % CI 0.12, 0.39) did not show a significantly different average effect size from those that performed blood draws and assessed viral load as part of study procedures (k = 6; r = 0.24; 95 % CI 0.09, 0.37).

This meta-analysis also operationalized methodological quality through continuous variables. These results are summarized in Table 4. In multivariate meta-regression controlling for year of data collection and study sample size, internal validity scores were positively associated (B = 0.1634; 95 % CI 0.0581, 0.2686; β = 0.5400; p = 0.0067) with VAS–viral load effect sizes. In contrast, a similar model testing external validity scores showed that this variable was not associated with effect size (B = 0.1595; 95 % CI −0.0885, 0.1378; β = 0.1618; p = 0.6691) in studies reporting an association between VAS and viral load.

Table 4 Meta-regression analyses of VAS–viral load association on methodological quality, controlling for study characteristics

Discussion

This paper summarizes the literature evaluating the use of visual analogue scales to measure ART adherence, with a meta-analysis of how study characteristics and methodological quality factors moderate the VAS/viral load relationship. Overall, the VAS exhibited large associations (r = 0.5–0.7) with other self-report measures and with objective pill count and EDM data. Significant correlations between the VAS and these comparison measures have been observed historically (Hugen et al., 2002; Liu et al., 2001). At the same time, three studies using pharmacy data provide a contrasting estimate that shows no aggregate relationship. Unfortunately, the low number of included studies analyzing the relationship between pharmacy claim data and the VAS makes further interpretation of this finding difficult. The scarcity of such studies may itself reflect the difficulty researchers have in accessing complete and valid pharmacy administration records (Farmer, 1999; Hess et al., 2006) for use in adherence research.

In medication adherence research and clinical practice, measures that reliably predict a biological outcome are particularly valuable. This meta-analysis reports an aggregate VAS–viral load effect size estimate that is statistically significant but nonetheless quite small, leaving a good deal of the residual variance (90 %+) across these two variables unexplained by this mean effect size estimate. There are likely biological and behavioral factors that impact viral load irrespective of self-reported ART adherence (Abioye et al., 2015; Maldonado-Martinez et al., 2016). It is important to explore the potentially modifiable factors that influence the utility of a self-report adherence measure to predict a biological outcome. We approached this by comparing the different ways researchers administer and study the VAS in their studies. The modifications we identified (i.e., using face-to-face interviews, normalizing non-adherence, greater study internal validity) appear to substantially strengthen the VAS–viral load relationship; though still modest, moderators approximately doubled the variance accounted for in the effect size estimates across studies.
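
The variance figures in the paragraph above follow from squaring the correlation. As a brief worked check, using the overall mean effect size (r = 0.25) and, as an example of a moderated estimate, the r = 0.35 reported for interview-administered and longitudinal studies (the helper name is hypothetical):

```python
def variance_explained(r):
    """Proportion of shared variance implied by a correlation coefficient."""
    return r ** 2

overall = variance_explained(0.25)      # about 6 % of variance shared
unexplained = 1.0 - overall             # about 94 % left unexplained
moderated = variance_explained(0.35)    # about 12 % of variance shared
ratio = moderated / overall             # close to a doubling
```

This is the sense in which the moderated estimates roughly double the variance accounted for while still leaving most of it unexplained.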

As a group, studies that set out to specifically investigate the VAS to measure ART adherence showed larger effect sizes than other studies using the VAS. Methodological research aiming to validate the VAS as a credible tool could be more likely to apply greater rigor in administering the tool. Longitudinal studies also showed larger effects (r = 0.35) than those observed in cross-sectional studies (r = 0.19). Cross-sectional research is generally limited in its ability to imply directionality or causation among variables. In this meta-analysis we found that cross-sectional studies more frequently (5:1) relied on retrospective, chart-abstracted viral load data. Such measurements can predate the typical 30-day VAS recall period, rendering the comparison less valid and perhaps explaining the decreased performance of the VAS in cross-sectional studies. In this sample of studies, type of viral load data did not appear to impact effect sizes; however, we were limited to bivariate analyses of categorical moderators. An interesting future analysis would compare the relative contributions of viral load data source and study design.

How researchers administered the VAS also appeared to influence the VAS–viral load relationship. Interviews, whether face to face or audio computer assisted (ACASI), may confer an advantage in using the VAS to measure medication adherence. One possible reason for this could be the potential to provide detailed or individually tailored instructions. As the VAS does require a solid grasp of numeracy to use, such instruction could make a difference in this population, as the modern HIV pandemic disproportionately affects those with lower educational and socio-economic attainment (Pellowski et al., 2013).

In this review, two raters independently assessed the methodological quality of included studies, with high interrater reliability. Methodological quality ratings separately addressed studies’ internal validity (e.g., accounting for covariates, using appropriate design and analysis) and external validity (e.g., sampling techniques, population descriptions). Moderator analyses showed that studies with higher internal validity scores reported larger VAS–viral load effect sizes. In contrast, external validity scores did not moderate this relationship. Taken together, these analyses support the notion that researchers and clinicians alike may be able to optimize VAS measurement through a set of best practices. They also tentatively suggest that mode of recruitment and population sampling do not influence the VAS–viral load relationship as reported in the current literature.

While these findings help provide additional information about the VAS as a medication adherence measure, there are limitations worth noting. Included studies reported a variety of measures alongside the VAS; at the same time the number of studies using any one type of comparison measure was low, with the exception of viral load (k = 12). This presented the challenge of how best to integrate and present the data yielded by this meta-analysis while upholding the necessary statistical assumptions. One approach would be to average effect sizes across measures within each study first and then estimate an overall mean effect size with the largest (k = 20) available sample. We chose against this approach here, feeling that the information on the strength of association between VAS and comparison measures was more meaningful when considered separately, avoiding an “apples and oranges” problem. We also were not able to statistically test the difference between ESs from the different comparison measures (i.e., this meta-analysis did not answer the question of whether the VAS or a questionnaire is superior in predicting viral load) due to the issue of stochastic dependence (Gleser & Olkin, 1994) in analyzing studies with multiple outcomes. Another limitation of the present study is the small number of included studies (k = 20). Our literature searches revealed studies that met inclusion criteria but did not report data sufficient to calculate an effect size. Unfortunately, the authors of these studies were not able to supply these data upon request. As more studies continue to emerge within the literature, it will be important to revisit these analyses and further refine our understanding of the VAS and its relevance in adherence measurement.

In conclusion, VAS demonstrates high levels of concordance with many other measures of adherence. Its ability to predict viral load in samples of PLWH is comparatively weak. Deceptively simple in design and easy to deploy, the VAS is nonetheless subject to cognitive biases and conceptual burdens for the respondent. VAS appears to perform differently under different methodological conditions and favors studies with longitudinal design and greater internal validity. Administration procedures also have the power to optimize VAS ability to predict viral load. Providing an interview format and informing respondents that missed doses are a normal patient experience are two elements of VAS administration that appear to further enhance its utility. Future studies should consider carefully these design and implementation factors when planning to use VAS to measure medication adherence.