Introduction

High adherence to antiretroviral therapy (ART) is a major determinant of sustained HIV virologic suppression, immune restoration, decreased development of drug resistance, improved quality of life, and reduced risk of HIV transmission [1,2,3,4]. Commonly used methods to assess ART adherence in randomized controlled trials have included patient self-report, followed by electronic monitoring, and pill count [5]. Other methods have included pharmacy refill data [6, 7] and assessing pharmacologic drug levels in blood [8] or hair [9].

Despite its paramount importance, there is no gold standard for estimating ART adherence [1, 10], and researchers often resort to feasible and cost-effective methods, which may yield biased or inaccurate results. For example, self-reported adherence is prone to recall and social desirability biases [11, 12] and may overestimate adherence [13, 14]. Yet, due to its relatively low cost, ease of administration, and specificity for detecting non-adherence, self-reported adherence is the most commonly used adherence measure in HIV clinical trials, with most trials only using one adherence measure [5]. Objective metrics of adherence have proven to be critical to the interpretation of clinical trials of pre-exposure prophylaxis [15] and are of increasing interest in ART monitoring but, to be incorporated into routine clinical care, more feasible, acceptable, cost- and time-effective metrics are needed.

In this pilot study, we collected ART adherence data to examine the correlation of self-reported adherence with three innovative methods to estimate adherence that were all implemented using remotely conducted study procedures. These methods included text messaged photographs of pharmacy refill dates to ascertain pharmacy-refill-based adherence, text messaged photographs of pills for pill-count-based adherence, and ART levels measured in home-collected hair samples for pharmacologic-based adherence. These three methods were selected because they were time-efficient for study staff and participants, were objective, may be cost-effective, and were collected remotely.

Methods

Setting and Study Participants

The methods used for this pilot study, conducted from March through October 2017, have previously been described in detail [16]. In brief, participants across the US were recruited using online social networks, as well as via advertisements in clinics and organizations serving people living with HIV (PLWH). To be eligible, PLWH over 18 years of age had to have been on an ART regimen containing tenofovir (TFV) disoproxil fumarate (TDF), emtricitabine (FTC), darunavir (DRV), or dolutegravir (DTG) for at least 3 months (based on self-report), have access to the internet to enable communication with study staff, have access to a mobile phone with capabilities to take and send photographs via text messages, and be willing to collect and send in hair samples every 2 months. Those who received automated ART refills, took renally-dosed ART due to chronic kidney disease, or were unable to provide a hair sample (e.g., due to baldness) were excluded. A total maximum incentive of $270 was offered for completion of all study activities. The study was approved by the University of California, San Francisco (UCSF) Institutional Review Board.

Data Collection

Written consent was obtained online before data collection. All data, except hair samples, were collected using text messages and online surveys. We used Mosio (a Health Insurance Portability and Accountability Act [HIPAA]-compliant clinical research text messaging software) for text message data requests and Qualtrics (HIPAA-compliant data collection software) for text message data requests and online surveys. At baseline, and monthly for 6 months, participants provided the following information when prompted by four sequential text messages from the study: (1) self-rated ART adherence [17], (2) photograph of the refill date on their currently-used ART medication bottle, and (3) photograph of the pills remaining in their currently-used ART medication bottle. Hair samples were collected at baseline and at months 2, 4 and 6 using home hair collection kits mailed to the participants by the study staff [18]. These hair samples were sent to the UCSF Hair Analytical Laboratory (HAL) to measure antiretroviral concentrations. Details about the home collection of hair samples have been described elsewhere [18] and are available on the study website (rxpix.ucsf.edu). For those taking more than one antiretroviral medication per day, we chose a single target antiretroviral for all adherence measures based on a pre-specified hierarchy (TDF > FTC > DRV > DTG > TAF). For instance, for a participant on TDF/FTC along with DTG, we used TDF as the target antiretroviral for adherence measures. This hierarchy was based on the amount of research conducted on specific antiretroviral levels in hair.
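The hierarchy rule for choosing a target antiretroviral can be illustrated with a minimal Python sketch. The function name and the list-of-abbreviations input are hypothetical and chosen only for illustration; only the ordering TDF > FTC > DRV > DTG > TAF comes from the study description.

```python
# Pre-specified hierarchy from the text: TDF > FTC > DRV > DTG > TAF.
HIERARCHY = ["TDF", "FTC", "DRV", "DTG", "TAF"]


def target_antiretroviral(regimen):
    """Return the highest-priority drug in the regimen.

    `regimen` is a hypothetical input format: an iterable of drug
    abbreviations. Returns None if no drug is in the hierarchy.
    """
    on_regimen = {drug.upper() for drug in regimen}
    for drug in HIERARCHY:
        if drug in on_regimen:
            return drug
    return None


# Example from the text: a participant on TDF/FTC along with DTG.
print(target_antiretroviral(["TDF", "FTC", "DTG"]))
```

For the example regimen in the text, the sketch selects TDF, since it ranks first in the hierarchy.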

Measures

Adherence to ART was measured in four ways, as described below:

  1. Proportion of days covered (PDC): Adherence based on pharmacy refill dates from the text-messaged photographs of the ART bottle was calculated using PDC [19]. PDC is the proportion of days between any two dates on which the participant was ‘covered by’ (i.e., had a supply of) the medication. The calculation adjusts for early refills (i.e., overlapping days) and therefore can have a maximum value of 100% (minimum 0%). For the purposes of calculating PDC, participants were considered as being ‘in the study’ from their first baseline text message date until their last text message date. Therefore, for each individual, any pills that covered days outside this period were ignored. PDC was calculated bi-monthly at months 2, 4 and 6.

  2. Pill count-based adherence (PCA): This measure was used to calculate adherence based on the number of pills on hand during each follow-up (using the text-messaged photographs of the pills remaining in the bottle during consecutive follow-ups) and the number of pills received as refills between follow-ups (using the text-messaged photographs of the medication bottle). The formula used was: [(Number of pills on hand at previous follow-up − Number of pills on hand at current follow-up + Number of pills dispensed between the previous and current follow-up) × 100 / (Number of doses prescribed between the previous and current follow-up)] [20, 21]. Its value ranged between 0 and 100%. PCA was calculated bi-monthly at months 2, 4 and 6.

  3. Hair drug concentration (HDC): Antiretroviral concentration in hair was measured by UCSF HAL from samples collected at baseline and at months 2, 4 and 6. Concentrations were measured for four antiretroviral medications (TFV, FTC, DTG, and DRV), and data were provided in nanograms (ng)/milligram (mg) hair. Details about the analysis of hair samples have been described elsewhere [22, 23].

  4. Self-rated adherence (SRA): This measure was recorded at each time point by asking participants a single item: “Thinking back over the past 30 days, please rate your ability to take all your medications as prescribed” [17]. The response options were: (1) excellent, (2) very good, (3) good, (4) fair, (5) poor, and (6) very poor. This item has been linked to a more objective adherence measure, Medication Event Monitoring System (MEMS) caps, with the response options corresponding approximately to the following MEMS-based adherence percentages: very poor = 0%, poor = 20%, fair = 40%, good = 60%, very good = 80%, and excellent = 100% [17]. Given that self-reported adherence data are generally skewed and over-estimated, and that less than 80% adherence represents a low level of adherence resulting in suboptimal virologic outcomes, we dichotomized self-rated ART adherence (0 = good through very poor, 1 = excellent or very good). For the follow-up periods, bi-monthly values were calculated at months 2, 4 and 6.
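The three computed measures described above (PDC, PCA, and dichotomized SRA) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the study's actual code: the function names and input formats are hypothetical, the PDC function assumes that surplus supply from an early refill is carried forward to the days after the previous supply runs out, and the PCA function assumes that values outside 0 to 100% are clipped to that range (the text reports the range but not how it was enforced).

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, start, end):
    """PDC (%) over the inclusive window [start, end].

    `fills` is a hypothetical format: a list of (fill_date, days_supply).
    Assumption: an early refill does not double-count days; its supply
    starts once the previous supply is exhausted.
    """
    covered = set()
    next_free = None  # first day not covered by earlier fills
    for fill_date, days_supply in sorted(fills):
        first = fill_date if next_free is None else max(fill_date, next_free)
        for offset in range(days_supply):
            covered.add(first + timedelta(days=offset))
        next_free = first + timedelta(days=days_supply)
    window_days = (end - start).days + 1
    # Days covered outside the study window are ignored, as in the text.
    in_window = sum(1 for day in covered if start <= day <= end)
    return 100.0 * in_window / window_days


def pill_count_adherence(on_hand_previous, on_hand_current,
                         dispensed, doses_prescribed):
    """PCA (%) from the formula in the text, clipped to 0-100 (assumption)."""
    taken = on_hand_previous - on_hand_current + dispensed
    return max(0.0, min(100.0, 100.0 * taken / doses_prescribed))


def dichotomize_sra(response):
    """Return 1 for 'excellent' or 'very good', otherwise 0."""
    return 1 if response.strip().lower() in {"excellent", "very good"} else 0


# Two 30-day fills, the second picked up 6 days early, over a 60-day window.
fills = [(date(2017, 3, 1), 30), (date(2017, 3, 25), 30)]
print(proportion_of_days_covered(fills, date(2017, 3, 1), date(2017, 4, 29)))
print(pill_count_adherence(10, 16, 60, 60))
print(dichotomize_sra("Very good"))
```

In the usage example, the early second refill does not push PDC above 100%, which mirrors the adjustment for overlapping days described for PDC.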

Data Analysis

First, univariate descriptive statistics such as means and frequencies were calculated to characterize the sample. To obtain measures that were comparable to the bi-monthly HDC, we calculated the bi-monthly values of SRA, PDC, and PCA for each participant.

The central analysis was to examine the degree of association between the four measures of ART adherence. We first performed a logarithmic (base 10) transformation on the PDC, PCA, and HDC measures to render normal distributions. To address HDC values of zero during the logarithmic transformation, we added the lower limit of quantification to all HDC values, based on drug category—0.02 ng/mg for FTC, 0.002 ng/mg for TDF, 0.02 ng/mg for DTG, and 0.04 ng/mg for DRV [24]. The log-transformed values for PDC, PCA, and HDC were then ranked within drug category and the resulting values were used as the inputs to examine correlations between the adherence measures. The correlations were estimated in Mplus 8.1 using full-information maximum likelihood (FIML) in order to incorporate observations with incomplete data into the analysis under the conditionally missing at random (MAR) assumption. Cluster-adjusted standard errors and test statistics were employed to properly account for the nesting of repeated observations within participants. We report the correlations and their p-values.
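The transformation and ranking steps for HDC can be sketched in Python as below. This is a minimal illustration, not the study's code: the record format and function names are hypothetical, tied values are assumed to receive average ranks (the text does not say how ties were handled), and the drug labels follow the lower limits of quantification as listed above. The FIML correlation estimation in Mplus is not reproduced here.

```python
import math
from collections import defaultdict

# Lower limits of quantification (ng/mg) as listed in the text.
LLOQ = {"FTC": 0.02, "TDF": 0.002, "DTG": 0.02, "DRV": 0.04}


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def ranked_log_hdc(records):
    """Return within-drug ranks of log10(concentration + LLOQ).

    `records` is a hypothetical format: a list of (drug, concentration).
    The output list is aligned with the input order.
    """
    by_drug = defaultdict(list)
    for idx, (drug, conc) in enumerate(records):
        # Adding the LLOQ keeps zero concentrations finite after the log.
        by_drug[drug].append((idx, math.log10(conc + LLOQ[drug])))
    result = [None] * len(records)
    for items in by_drug.values():
        ranks = average_ranks([value for _, value in items])
        for (idx, _), rank in zip(items, ranks):
            result[idx] = rank
    return result


records = [("TDF", 0.0), ("TDF", 0.05), ("FTC", 0.5), ("TDF", 0.05), ("FTC", 0.0)]
print(ranked_log_hdc(records))
```

Ranking within drug category puts concentrations of different drugs, which have different typical magnitudes, on a common ordinal scale before correlations are estimated.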

To obtain a better understanding of the four adherence measures in this sample, we performed two types of exploratory analyses by drug category. The first was to calculate the mean or proportion at baseline and months 2, 4, and 6. For the interval-type measures (PDC, PCA, and HDC), we calculated the means at baseline (only for HDC) and for months 2, 4 and 6 (for all three). For the binary measure (SRA), we calculated the proportion of participants who self-rated their adherence as excellent/very good. In the second of these exploratory analyses, we tested whether each adherence measure at months 2, 4, and 6 differed significantly from that at baseline. This analysis served to examine whether adherence changed over time, possibly due to Hawthorne effects (i.e., changes in participants’ ART adherence due to their awareness of being “observed”). For the interval-type measures (PDC, PCA, and HDC), we used the non-parametric Sign test for this purpose; for the binary SRA, we used the non-parametric McNemar’s test to test for the equality of marginal frequencies at the two time points under consideration. We report the p-values from these tests.
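Both tests reduce to a binomial test with probability 0.5, which the following Python sketch illustrates. It is not the study's code: the function names are hypothetical, the inputs are assumed to be complete paired observations, and the exact (binomial) form of McNemar's test is assumed because the text does not state whether an exact or chi-square version was used.

```python
from math import comb


def binom_two_sided_p(k, n):
    """Exact two-sided p-value for k successes out of n under p = 0.5.

    Computed by doubling the smaller tail and capping at 1.
    """
    if n == 0:
        return 1.0
    smaller = min(k, n - k)
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def sign_test(baseline, follow_up):
    """Sign test on paired values; pairs with no change are dropped."""
    diffs = [f - b for b, f in zip(baseline, follow_up) if f != b]
    positives = sum(1 for d in diffs if d > 0)
    return binom_two_sided_p(positives, len(diffs))


def mcnemar_exact(baseline, follow_up):
    """Exact McNemar's test on paired 0/1 values (discordant pairs only)."""
    one_to_zero = sum(1 for x, y in zip(baseline, follow_up) if x == 1 and y == 0)
    zero_to_one = sum(1 for x, y in zip(baseline, follow_up) if x == 0 and y == 1)
    return binom_two_sided_p(one_to_zero, one_to_zero + zero_to_one)


# Illustrative (made-up) paired values, not study data.
print(sign_test([1, 2, 3, 4, 5, 6], [2, 3, 4, 5, 6, 7]))
print(mcnemar_exact([1, 1, 1, 1, 0], [0, 0, 0, 0, 1]))
```

The Sign test uses only the direction of each within-participant change, and McNemar's test uses only the participants whose binary rating changed, which is why both can share the same binomial helper.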

Results

Of the 93 individuals enrolled in the study, two were dropped from analyses because they only had data at baseline. The average age of the analytic sample of 91 participants was 44 years (SD = 13.2); 62.6% were White and 25.3% were African-American/Black. The majority (84.6%, N = 77) identified as male; 8.8% (N = 8) identified as female, 4.4% (N = 4) as transgender female, 1.1% (N = 1) as transgender male, and 1.1% (N = 1) as genderqueer. At enrollment, most participants (90.1%) self-reported an undetectable viral load and 85.7% rated their adherence to HIV medications as ‘excellent’ or ‘very good’. Across baseline to month six, text message data were available for 80–88 participants (i.e., 89.9–96.7% of retained participants) and hair data were available for 75–88 participants (i.e., 84.3–94.6%). Detailed data on retention, missing data, feasibility, and acceptability metrics have previously been published [25].

All the measures were positively correlated with each other, with varying strengths (Table 1). The strongest correlation was between PCA and PDC (r = 0.68, p < 0.001) and the weakest correlation was between SRA and HDC (r = 0.14, p = 0.34).

Table 1 Correlations (p-values) of the four measures of adherence

For interval-type measures (PCA, PDC, and HDC), the sample mean at each time point is presented in Table 2 by specific antiretroviral medication. For SRA, the number presented is the proportion of participants who self-rated their adherence as excellent or very good. As indicated in Table 2, only three comparisons for HDC and one comparison for SRA yielded a statistically significant difference from baseline to the applicable post-baseline time points. There were no statistically significant differences found for PCA and PDC. Therefore, we believe there was minimal Hawthorne effect.

Table 2 Mean/proportion of the measures of adherence over time, by category of drug

Discussion

This study examined three remotely collected objective adherence metrics for ART. Our results indicate statistically significant correlations among pharmacy-refill-based adherence via text messaged photographs of pharmacy refill dates, pill-count-based adherence via text messaged photographs of pills, pharmacologic-based adherence via self-collected home hair samples, and self-rated adherence, with one exception: SRA and HDC were not statistically significantly correlated. PCA was strongly correlated with PDC. There were no appreciable changes in mean adherence based on the four methods of assessment over the 6 months of the study.

In a prior study, we demonstrated that there was a high degree of correlation and agreement between antiretroviral levels in hair collected by trained staff and at home by participants, without evidence of measurement bias [16]. We also noted a high degree of acceptability of home collection of hair every 2 months and feasibility and acceptability of all remotely conducted study procedures whereby 90.3% of participants reported being extremely or very satisfied with participating in a remote research project [25]. In qualitative exit interviews, many participants reported an improvement in their ART adherence [25], although there was no substantial change in adherence over the 6 months of the study in this current analysis.

In this study, SRA was not substantially correlated with HDC, a finding that is consistent with other studies demonstrating poor correlation between HDC and SRA [26, 27] and the generally poor utility of SRA to predict clinical outcomes [28]. SRA levels may be higher than those of the other measures, which may be due to participants overestimating levels of adherence, a disadvantage of SRA measures (see Table 3). However, PDC and PCA had a higher degree of correlation over the course of the study. Both pharmacy-refill- and pill-count-based adherence have been shown to be associated with HIV viral load [7, 29]; however, to our knowledge, their correlation with each other has not been examined in the literature. Even though PCA and PDC are considered to be structurally correlated measures because they are from the same data source (i.e., the pill bottle), our research [25] demonstrated that it is a misconception to believe that they are different ways to use the same information, yielding the same final result. We believe that the main reason for the discrepancy between PCA and PDC is that some participants reported stock supplies of medications; therefore, the pill bottle photographed for refill date was not necessarily the bottle used to fill their medication box. Finally, of our four adherence measures, HDC is the only marker of actual medication ingestion (i.e., direct method) and strongly predicts virologic response [30]. This pharmacologic measure had medium-sized statistically significant correlations with PDC and PCA. We believe that these correlations were not higher because PDC and PCA are not measures of medication ingestion and have certain inherent characteristics that are prone to exploitation, such as photographing the refill date on a pill bottle that is not the one in current use, using pills from an older stockpile, and using pills from multiple bottles while photographing only one.

Table 3 Advantages and disadvantages of four measures of ART adherence

Since there is no gold standard of adherence measurement, many studies use SRA as a single measure [5], even though prior research has shown that a combination of methods is usually the most suitable approach for medication adherence assessment [13]. Table 3 details advantages and disadvantages of text messaged SRA, photographed and text messaged PCA, photographed and text messaged PDC, and HDC based on home-collection of hair samples. The decision regarding optimal combinations of adherence measures relies on factors related to the setting (i.e., clinic or research); available resources (i.e., financial and staffing); and centrality of adherence measurements to the research question or clinical services. In general, it may be useful to combine complementary measures of short-term and long-term adherence, such as PCA or PDC and HDC, which can provide shorter-term feedback (e.g., via PCA or PDC) along with measures of longer-term exposure to ART (e.g., via HDC). Future research should examine optimal combinations of measurements.

Data presented in this paper represent the first hair DTG concentrations reported in the literature. Additionally, a strength of this study was our ability to recruit a national sample of participants for a completely remotely conducted study, which resulted in the participation of individuals with disabilities or busy schedules that may have prevented them from participating in non-remote research [25]. However, because this was a pilot study, we had a relatively small sample size and could not administer all possible adherence measures (e.g., various SRAs or dried blood spots). Our research is also subject to several other limitations. The fact that over 90% of participants self-reported undetectable viral load meant that the study sample likely had relatively high levels of adherence, limiting our ability to examine a range of adherence levels. We had three participants on DRV and have included their data in Table 2 for completeness. Additionally, we recruited those with access to a mobile phone with capabilities to take and send photographs via text messages and our sample was primarily male. Therefore, our results may not be generalizable to other populations, such as those of other genders and those who are not technologically savvy. Further limitations of each adherence method are highlighted in Table 3.

Conclusions

The novelty of our study lies in the fact that we (1) collected all the data entirely remotely among a national sample of PLWH, (2) examined the correlation of three novel and objective measures of ART adherence, and (3) reported data on hair DTG concentrations. Data for all measures are easy to collect remotely; PDC, PCA, and HDC are objective; PDC and PCA are inexpensive and amenable to use in clinical practice; and HDC measures actual medication ingestion. Significant levels of correlation between PDC, PCA, and HDC make all three viable candidates for further investigation and use in future HIV treatment and prevention studies.