Introduction

There is increasing recognition both of the importance of achieving good adherence to product use in clinical trials assessing efficacy of new HIV prevention methods [1] and the challenges to assessment of such adherence. Mathematical simulations have indicated that non-adherence will result in efficacy being underestimated in microbicide trials [2, 3]. This suggests lack of compliance with study protocols could have played a role in the borderline or lack of effect against HIV observed in some vaginal microbicide trials [4, 5]. The critical role of adherence to product use in microbicide trials was demonstrated in the Centre for the AIDS Programme of Research in South Africa (CAPRISA)’s phase IIb trial of 1 % tenofovir gel. Although the results provided the first evidence that a vaginal microbicide gel may offer protection against HIV acquisition [6], they also revealed HIV incidence was significantly higher for women in the trial who reported using gel less frequently than directed, that is, who had lower adherence. In the first clinical trial of oral preexposure prophylaxis (PrEP) among men and transgender individuals, incidence was also higher for those with lower adherence. Moreover, this study found that self-reports of adherence to tablet use were much higher than plasma drug levels indicated [7], evidence that adherence by self-report to an HIV prevention method in a clinical trial is significantly over-reported. This finding was replicated in FEM-PrEP, the recent oral preexposure prophylaxis clinical trial among African women [8]. Analysis of study drug detected in plasma from a subset of participants indicated that less than 40 % showed evidence of recent pill use, although 95 % reported always or almost always taking the pill; and pill count suggested 88 % adherence. Self-report and pill-count have previously been found to overestimate adherence to HIV treatment [9], but not at the magnitude demonstrated in these prevention trials to date.

Up to now, few alternatives to self-report for assessing adherence in clinical trials of microbicide gels have been available. With the exception of the Carraguard trial that included a dye stain assay to assess applicator insertion [5] and the CAPRISA trial that included counts of returned empty applicators, these trials have had to rely on self-reported data because low systemic absorption of vaginal gels presented challenges to developing biomarkers [10]. The validity of such self-reports is questionable because participants may be reluctant to indicate that they have not used the study product given the emphasis placed on adherence during counseling. As biomarker assessment for gels remains elusive, improved measurement of self-reported product use is needed to understand and interpret results from microbicide trials [11] and has relevance for future HIV prevention trials in which reliance on self-reports for assessing compliance with study protocols represents significant cost savings and reduced complexity for trial implementation.

Increasingly, audio computer-assisted self-interviewing (ACASI), a technology developed in the United States to collect data on sexual behavior and drug use, is being used to collect sensitive information from individuals in surveys and clinical studies in developing countries. This method addresses concerns about accuracy of self-reported sensitive behaviors collected in face-to-face interviews (FTFI) where the presence of an interviewer may introduce over-reporting of socially desirable behavior and under-reporting of behaviors that are sensitive or socially undesirable. Evidence suggests that the application of ACASI significantly increases the reporting of sensitive information in the United States among adolescents and young adults [12], STD patients [13], and injection drug users [14]. In developing countries, however, the effect of self-administered questionnaires, whether collected by computer or paper and pencil, has been mixed. Depending on the country and question asked, self-administered questionnaires detected either no difference or a higher rate of socially undesirable behavior [15–20]. In a few instances the reverse was found, with a higher rate of socially undesirable behavior reported in the FTFI [21–24]. In addition, several studies in developing countries have examined consistency of reporting by interview mode and found ACASI data more likely to be internally discrepant than data generated by FTFI, likely because interviewers, whether explicitly directed to or not, reconcile inconsistent data [16, 23, 25].

Interview modes have been compared extensively for reporting of sexual activity, vaginal hygiene, and contraceptive use in studies of HIV prevention methods in Africa. Findings suggest behaviors such as sexual activity and hormonal contraceptive use [17] are underreported via both ACASI and FTFI interviews [26]. In studies that were not clinical trials, ACASI has been shown to perform better than FTFI when compared to a biomarker outcome, but the evidence is not overwhelming. In Brazil, there were stronger associations between self-reported risk behaviors and STIs with ACASI compared to FTFI [27]. In a placebo gel methodological experiment in South Africa the same dye stain assay used in the Carraguard trial, to detect applicator insertion, and Rapid Stain Identification of Human Semen (RSID), to detect exposure to a partner’s ejaculate in the prior 48 hours, were used to validate reports in ACASI and FTFI; results were mixed. For most behaviors, including anal intercourse (AI), multiple partners, and forced sex, ACASI generated significantly higher reporting. In addition, ACASI participants were more likely to report having had sex without gel. However, comparison of reported and tested applicators did not indicate a propensity for more honest reporting of gel insertion with ACASI. Analyses comparing reported unprotected sex with the RSID test results revealed more agreement with ACASI than with FTFI, but differences were small [28]. These findings suggest that although ACASI does increase reporting of some sexual behaviors, it may be less effective in reducing over-reports of adherence in clinical trials.

Because no HIV prevention trial has included an experimental component comparing self-reports of product adherence in computerized interviewing with conventional FTFI, an ancillary study to assess the effect of interview mode on self-reports of gel use in a trial with an active product was conducted at sites participating in a Phase IIB trial of two candidate microbicide gels, BufferGel and 0.5 % Pro 2000 [4]. The objectives of this study were (1) to assess the acceptability of using ACASI in a clinical trial in a low literacy setting; (2) to determine whether ACASI yields higher reports of non-adherence to microbicide gel regimens and condom use and higher numbers of reported sexual partners and anal sex than FTFI; and (3) to identify predictors of consistent reporting between both FTFI and ACASI.

Methods

All women enrolled in the microbicide safety and effectiveness trial (HPTN 035) [4] at the Blantyre and Lilongwe sites in Malawi who were scheduled for a quarterly follow-up study visit that included a behavioral assessment were offered enrollment into the study. Participants in both trial arms—those assigned to use vaginal gels and those assigned to the condom-only arm—were eligible. This ancillary study, referred to as 035B, was introduced to HPTN 035 participants after they completed the quarterly “Follow-up Behavior Assessment” (FBA), which was administered as an FTFI. The protocol and materials were all reviewed and approved by the local IRBs. Upon obtaining informed consent for 035B, the women were instructed on how to use a handheld computer. After successful completion of a short series of practice questions, women in the condom-only arm answered six questions and women in the gel arm answered ten questions; all but the last question were identical to questions asked in the FBA conducted face-to-face in the regular visit. A new final question obtained the women’s preferences on mode of interview (Table 1). The ACASI questionnaires were completed in the same clinic and often the same room as the FTFI interviews. Clinic staff were available to assist participants if they required help using the device or understanding the questions.

Table 1 Questions included from HPTN-035 FBA

The questionnaire was administered via ACASI with software specifically developed for this study by the Population Council using Microsoft Windows-based development tools adapted for a handheld computer. The incorporation of graphics for non-literate participants was an important element in customizing the ACASI application. Images were employed for some multiple-choice questions, and literate women, if they desired, could read the questions on the screen while listening to them in Chichewa, a local language. They then answered the questions by touching either an image or a color-coded “yes” or “no” response block on the screen. Images (drawings, developed locally, from educational materials and instructions included with condoms sold in Malawi) consisted of a male partner (for a question on the number of sex partners), a couple having sexual intercourse (for a question on number of sex acts in the prior 7 days), a condom (for a question on number of sex acts during which a condom was used), and a gel applicator (for a question on number of sex acts during which gel was used). For the last question on interview mode preference, pictures of a computer and a woman were shown. To enable use by non-literate participants, an interactive screen depicting a man’s face was developed to allow for reporting more than one partner. Each time a participant tapped the image on the screen in response to this question, a new face appeared to indicate a different partner (see Fig. 1, which shows the handheld device with the question and images). Because of these images and the audio, a participant did not need to be able to read the question or response options. The software was also programmed with selected internal consistency checks, which verified whether answers to questions regarding the number of sexual acts when the product or condom was used/not used exceeded the total number of sexual acts reported in a prior question.
When an inconsistent or invalid answer was entered, the participant heard a recorded warning and was able to re-enter her answer. The prompt was translated, and the exact wording of the English prompt was “Answer can’t be more than the number of times you had vaginal sex in the past week. I will ask you the two questions again.” Study staff were available to answer any questions that participants had; however, staff were trained not to provide or suggest answers to the participants.
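The consistency check described above can be sketched as follows. This is a minimal illustration in Python, not the Windows-based software used in the study; the function name, the dictionary layout, and the question keys (Q2 for the weekly total, Q3 to Q6 for the itemized counts) are assumptions made for the example.

```python
# Hypothetical sketch of the ACASI internal consistency check.
# Assumption: Q2 holds the number of vaginal sex acts in the past week and
# Q3-Q6 hold itemized counts (acts with or without gel or condom).

PROMPT = ("Answer can't be more than the number of times you had "
          "vaginal sex in the past week. I will ask you the two questions again.")


def flag_inconsistent(answers, total_key="Q2",
                      part_keys=("Q3", "Q4", "Q5", "Q6")):
    """Return the itemized questions whose answer exceeds the reported total."""
    total = answers[total_key]
    # Questions that were not asked (e.g., in the condom-only arm) are skipped.
    return [k for k in part_keys if k in answers and answers[k] > total]


# Illustrative responses only, not study data.
answers = {"Q2": 3, "Q3": 2, "Q4": 4, "Q5": 1, "Q6": 0}
flagged = flag_inconsistent(answers)
if flagged:
    # In the study a recorded warning was played and the answer re-entered.
    print(PROMPT, flagged)
```

A check of this kind compares each itemized answer with the total separately; it does not test whether the itemized answers add up to the total.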

Fig. 1 Format of questions in handheld ACASI: sample question on number of partners

Surveys were implemented on Hewlett Packard iPAQ handhelds that had Lithium Ion rechargeable batteries, 3.5 inch backlit touch-screen displays, built-in SD (Secure Digital) and CF (Compact Flash) expansion card slots, integrated speakers, and headphone jacks. The hardware and software could accommodate interruptions in internet access and power that may occur during normal operations, particularly in developing countries. A web-based solution for data collection could have been more efficient and easier to implement, but it would require a constant internet connection not currently feasible in such settings as Malawi. As surveys were completed on the handheld computers, the resulting data were saved to SD cards in XML format. The data were deleted from the SD cards only after they had been successfully merged with the data on the site’s dedicated data manager computer. The data manager computers were laptops with access to the Internet; each site was assigned its own data manager laptop, which was considered the source data unit for ACASI. Data from the laptops were compressed, encrypted, and uploaded to the secure trial data management center, SCHARP, in Seattle, WA, where they were then made available for quality control checks and statistical analysis.

Only patient study identification codes—no other identifying information—were entered on the handhelds by study coordinators. The computers were password protected, and the data were backed-up at the sites on a daily basis. The computerized interview process, including administration of the consent form and instructions, lasted approximately one hour. The actual mean duration of questionnaire completion was 9.8 min for the gel users and 5.3 min for those in the condom-only arm, with a median duration of 9.5 and 4.8 min, respectively. This study involved no product or other intervention or experimental procedures.

There were 899 women eligible for the study; 722 (80 %) of them consented, and 672 completed questionnaires. Of these, 663 had matched data from both interviews and 585 completed both questionnaires on the same day (see Fig. 2 for Consort Table). To avoid issues of temporality regarding comparison of ACASI and FTFI responses, only those completing both interviews on the same day were included. Bivariate differences were analyzed with McNemar’s tests and Student t-tests, as appropriate. We constructed univariate and multivariable logistic regression models to assess the demographic and baseline behavioral characteristics associated with each behavior and with reporting the same response in both interview modes to questions on number of sex partners in the past three months, number of times had sexual intercourse in the past week, condom use at last sex, gel use at last vaginal sex act, and AI in the past three months. In addition to comparing consistency between the ACASI and FTFI among women interviewed the same day, data on different responses within the computerized interview based on the programmed checks were also analyzed. A response was considered different if the answer to Q3, Q4, Q5, or Q6 (numbers of times gel or condoms were used in the past week) exceeded the response to Q2 (in the past week, how many times did you have vaginal sex?).
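For readers unfamiliar with McNemar’s test, the sketch below shows how a paired yes/no comparison between ACASI and FTFI can be evaluated from the discordant pairs alone. It uses the exact binomial form of the test; the paper does not state whether the exact or the chi-square form was used, and the function name and the counts in the example are illustrative assumptions, not study data.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b = women answering 'yes' in ACASI but 'no' in FTFI
    c = women answering 'no' in ACASI but 'yes' in FTFI
    Under the null hypothesis of no mode effect, b ~ Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Probability of a split at least as lopsided as the one observed, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Illustrative counts only, not taken from the study.
p_value = mcnemar_exact(b=12, c=3)
print(round(p_value, 4))
```

Concordant pairs (women giving the same answer in both modes) do not enter the calculation, which is why a behavior can show high overall agreement and still differ significantly by mode.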

Fig. 2 Consort table of 035B

Results

Almost all the women were married. Their mean age was 26 years (median, 25 years; ages ranged from 18 to 53) and 78.5 % had not completed their primary school education. Although most women lived in a house they owned (67.0 %), few had electricity in their homes (9.7 %) (Table 2). There were significant differences between the two sites; women in Blantyre were more likely to be educated and have electricity but less likely to own a home (results not shown). The mean and median number of reported sex acts in the past week was three at baseline, although higher in Blantyre than Lilongwe (mean 3.5 vs. 2.7, median 3.0 vs. 2.0; results not shown), and the mean number of condoms used in the past week was 1.8 (2.0 median). More participants from Blantyre than Lilongwe reported no condom use in the past week and fewer reported 100 % condom use (results not shown) but there was no difference in the proportion reporting condom use at last sex act (54 %).

Reported gel use at last sexual act was slightly lower in ACASI than with FTFI (73.5 % vs. 77.2 %, p = 0.11). Reports of condom use in the last sex act were also lower in ACASI than in the FTFI (60.9 % vs. 65.5 %, p = 0.05). More women reported AI in ACASI than in the FTFI (5.0 % vs. 0.2 %, p < 0.0001). Although a small percentage (1.0 %) of the participants reported more than one sex partner in the past month in ACASI, no one reported more than one sex partner in the FTFI. A higher percentage of women reported no partners in ACASI than FTFI (12.3 % vs. 4.8 %) (Fig. 3).

Table 2 Demographic and baseline characteristics of women doing interviews on the same day (n = 585)

Fig. 3 Reported behaviors by mode of interview (N = 585). ** p = 0.11; * p ≤ 0.05

Analyses of differences between ACASI and FTFI responses revealed that the proportion of women providing the same response in both modes varied by question, with the highest agreement, 95.2 %, in responses to the question on AI in the past 3 months, followed by 90.1 % agreement in reporting the number of sex partners, then 82.0 % agreement for gel use at last sex, and finally the lowest agreement (72.0 %) for the number of times a participant had sex in the past week. Note that for behaviors that are infrequent, for example, AI, the proportion providing the same response will inevitably be high, even though, as observed here, the differences between ACASI and FTFI are significant. Table 3 shows the results of the univariate and multivariable logistic regression models. Each model included the following independent variables: age, educational level, home ownership, electricity in home, site, number of sex acts in last week, number of times using a condom in last week, overall condom use last week (univariate only), and condom use in the last act. For the models comparing responses to the number of sex partners in the past three months, only having higher education (some secondary) was associated with reporting the same response by mode (unadjusted odds ratio (UOR) 5.01, 95 % CI 1.54–16.34 and adjusted odds ratio (AOR) 4.96, 95 % CI 1.39–17.70). For condom use at last vaginal sex, having some secondary education (UOR 3.24, 95 % CI 1.64–6.38 and AOR 2.89, 95 % CI 1.36–6.09), home ownership (UOR 0.48, 95 % CI 0.31–0.73 and AOR 0.63, 95 % CI 0.39–1.02), and being from the Blantyre site (UOR 1.54, 95 % CI 1.04–2.29) were associated in either univariate or multivariable models with reporting the same answer. For gel use, only being from the Blantyre site was associated with the same report (UOR 2.42, 95 % CI 1.34–4.39 and AOR 2.73, 95 % CI 1.41–5.28).
Only younger age (<24 years) in univariate analysis was associated with the same reporting of AI in the past three months (UOR 0.28, 95 % CI 0.08–0.99). Finally, no characteristics or behaviors were associated with the same report of sexual frequency in the past week although more of those with some secondary education gave the same report (AOR 1.78, 95 % CI 0.91–3.49, p = 0.09).
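To make the unadjusted odds ratios (UOR) reported above easier to interpret, the following sketch computes an odds ratio and a Wald 95 % confidence interval from a 2 × 2 table. For a single binary predictor this gives the same estimate as a univariate logistic regression. The function name and the cell counts are assumptions chosen for illustration; they are not the study’s data, and the adjusted estimates (AOR) require a fitted multivariable model that is not shown here.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a = with characteristic, same response in both modes
    b = with characteristic, different response
    c = without characteristic, same response
    d = without characteristic, different response
    All four cells must be non-zero for this formula to apply.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's formula).
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: e.g., some secondary education vs. none.
uor, lower, upper = odds_ratio_ci(a=90, b=10, c=60, d=20)
print(f"UOR {uor:.2f}, 95 % CI {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1 corresponds to an association that is significant at the 5 % level, and an odds ratio below 1 indicates lower odds of giving the same response in both modes.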

Table 3 Demographic and baseline characteristics of women who reported the same response on ACASI and FTFI during interviews on the same day

In addition to comparing responses by interview mode, we analyzed consistency of responses within ACASI. In the condom-only arm (n = 153), 15 % of participants reported an inconsistent answer, that is, a response where the number of sex acts with and/or without a condom exceeded the reported number of sex acts. After being given the opportunity to change their response, 17.4 % of those with an originally inconsistent response replaced it with another inconsistent response. In the gel arm (n = 432), where there could be four inconsistent responses (comparing the answer to each of Questions 3, 4, 5, and 6 with the response given for Question 2 on the frequency of vaginal sex; see Table 1 for questions), 71.3 % of participants reported no inconsistent responses and 28.7 % reported inconsistently in at least one response (19.2 % of participants gave one inconsistent answer, another 8.1 % gave two, and 1.4 % gave three). These 124 (of 432) women generated 171 inconsistent responses. For the 171 responses that prompted a consistency query, participants changed their responses 160 times. Changed responses were in the right direction 90.6 % of the time (145/160 responses). However, even if all four pairwise comparisons were consistent, the total number of sex acts reported in Questions 3–6 could still exceed the number reported in Question 2. Indeed, for most participants in the gel arm, the total number of sex acts reported in Questions 3–6 exceeded the total reported in Question 2, both upon initial questioning (83.6 %) and after the individual consistency checks (78.8 %). Of the 359 participants who reported having sex in the past week in response to Question 2, for only 37 (10.3 %) was the number of sex acts reported equal to the total number of sex acts reported in Questions 3–6. After the checks, the number consistent increased to 52 (14.5 %).
The remaining participants reported fewer sex acts in the individual questions relative to the total (6.1 % after the initial questions and 6.7 % after the consistency checks). Note that all responses on the FTFI were consistent because interviewers were trained to detect inconsistencies and correct them before recording a response.
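The distinction between the pairwise checks and the comparison of totals can be illustrated with a short sketch. It assumes, as the comparison above implies, that the counts in Questions 3–6 are meant to add up to the Question 2 total; the function name and the example values are hypothetical.

```python
def compare_total(q2, parts):
    """Classify the sum of the itemized counts (Q3-Q6) against the Q2 total."""
    itemized = sum(parts)
    if itemized > q2:
        return "exceeds"
    if itemized < q2:
        return "fewer"
    return "equal"


# Illustrative values only: every itemized answer is within the Q2 total,
# so none of the four pairwise checks would trigger a prompt.
q2, parts = 3, [2, 2, 1, 1]
print(all(p <= q2 for p in parts))  # True
print(compare_total(q2, parts))     # exceeds
```

A respondent with these answers would pass every programmed check while still reporting six itemized acts against a total of three, which is the pattern observed for most gel-arm participants.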

Finally, a much greater preference for ACASI was reported by respondents. Among the 585 women responding to this question, 85 % preferred ACASI to an FTFI.

Discussion

This methodological study, conducted during the course of a phase IIb clinical trial of two topical microbicides, showed four important results. First, it demonstrated significant differences in responses to questions on sexual behavior and gel use when administered by ACASI compared with an interviewer. Second, it suggested that use of ACASI in international HIV prevention trials among low literacy study populations may result in slightly lower, and presumably more accurate, reporting of adherence to use of study products such as microbicides but certainly higher reporting of other sensitive and highly relevant behaviors such as AI. Given that sexual behaviors are known to be under-reported and adherence over-reported, the direction of the ACASI reports supports the assumption that in this study it produced more accurate data. Third, this study demonstrated that consistency or edit checks can be programmed within ACASI and may reduce, although not necessarily eliminate, internally discrepant responses. Finally, women participating in the study found ACASI preferable to FTFI.

Although reported adherence to coitally dependent products (gels and condoms) was lower and AI higher when reported via ACASI rather than FTFI, what is notable is the high level of inconsistency between the FTFI and ACASI and the fact that, aside from education, most characteristics and behaviors were not associated with concordant reporting for three of five outcomes: number of sex partners, number of times had sex in the last week, and condom use at last sex act. It may be that when a product is used frequently the burden of recall is too heavy for these women to accurately report at the level of detail required by study questionnaires. The behaviors may not be salient enough to be memorable and, therefore, accessible for accurate recall. Almost all women in this study were reporting marital sexual activity and behaviors surrounding that activity with their spouse. If frequent and routine enough, a behavior is recalled as a schema, i.e., a generic description of classes of events, and is stored in memory with little or no detail about specific episodes [29, 30]. Research on memory retrieval suggests schematic recall may be less accurate than recall of specific episodes for which memory retrieval is detailed and unique. Furthermore, schematic behaviors are more likely to be culturally conditioned (i.e., reported in normative terms reflecting the respondent’s community or cultural tradition [31]), also suggesting less precise memory of such behaviors regardless of interview mode.

Informing respondents that the value they provided for one question (e.g., gel use) exceeded the value they gave for another question covering the same period (e.g., frequency of vaginal sex) did not necessarily eliminate discrepancies. Respondents continued to be inconsistent even after a prompt about an inconsistency. Such an approach lengthens the duration of the interview and may cause frustration among participants if they are repeatedly told they need to correct a response for the sake of consistency. Yet, this mechanism also serves to aid recall and may result in more accurate responses for a small but significant portion of respondents. Because these data reflect the potential amount of exposure to HIV, even a small amount of correction may contribute greatly to an understanding of the efficacy of the products being tested.

Another finding of note was the significant difference in reporting on numbers of partners in ACASI compared to FTFI. The higher reporting of no partners with the computer suggests that women may be reluctant to report that they are not having sex because one of the eligibility criteria for the trial is being sexually active [32], defined as having had vaginal intercourse at least once in the past three months. This finding suggests ACASI may be useful for collection of behavioral data used at screening for trials and to determine eligibility. While risky behavior and consequently the amount of exposure to HIV would be underestimated when numbers of partners are underreported in trials, it may also be overestimated when such behavior is over-reported. Therefore, ACASI may prove useful in characterizing and estimating potential exposure to HIV that may be expected in a trial if used during trial screening procedures as well as during trial implementation.

Of even more potential impact in this study was the significantly higher reporting of AI in ACASI than in FTFI. Based on the data from the existing FTFIs, the amount of AI occurring among women had been presumed to be extremely low and not enough to represent a significant risk for transmission. The higher reporting of the practice of AI via ACASI may produce a more complete account of the actual exposure to HIV among women who participated in the study by types and amounts of exposures. Utilization of ACASI in other vaginal microbicide trials may contribute importantly to the analysis of protection from vaginal microbicide products since underreporting of AI can dilute the power to detect efficacy [33]. While there are no other published data on AI in Malawi, in South Africa slightly higher levels of AI have been reported (5–11 % reporting AI in the past three months or one month) [28, 34] than those reported with ACASI in this study. Socio-cultural differences between Malawi and South Africa and differences in the populations studied, such as proportion married and age range, suggest that the prevalence of AI found in our study is reasonable and support our interpretation that the near absence of AI reports in FTFI reflects underreporting.

When conducting cross-over studies comparing ACASI with FTFI, one concern is that the sequence in which the interview modes are administered could possibly affect the results. Previous studies have addressed this issue by randomizing groups to take either the ACASI first, followed by the FTFI, or vice versa. These studies have found that the associations between interview mode and responses to sensitive questions were not modified by the order in which the modes of interview were administered [17, 24]. Therefore, although the order in which questionnaires were administered was not randomized in this study, given the findings from other studies we do not feel this had a significant impact on our results. Moreover, because of the study design, in which several hours elapsed in a single visit before women were asked the same questions, variability in recall by mode of interviewing was not likely to be a factor. However, because women who reported differently in the ACASI could have remembered their response from the FTFI, the differences we observe are likely to be a lower bound estimate of true reporting differences by mode.

These findings provide evidence that, in HIV prevention trials conducted in international settings, assessments of self-reported adherence and behaviors involving exposure to HIV may be improved by use of ACASI and that the implementation of this approach may be preferred by trial participants. However, when considered in the context of the mixed findings reported in other studies that also used biomarkers to validate reporting by interview mode, ACASI is clearly not a cure-all for over-reporting of adherence. Concern is heightened by the results of two oral prophylaxis trials. The first clinical trial of an antiviral oral prophylaxis (iPrEx) that included biomarker validation of adherence demonstrated large discrepancies between drug levels and self-reported levels of pill taking [7]. A subsequent trial of oral prophylaxis among women in Africa, which was halted, demonstrated even greater discrepancies between self-reported pill taking and drug levels detected in blood [8]. Together these results illuminate the pitfalls of reliance on self-reported adherence in HIV prevention trials, regardless of data collection mode.

While ACASI and other methods of collecting self-reports such as coital diaries [35] may increase the validity of self-reports of adherence to biomedical HIV prevention methods over that obtained in FTFI, no method of interviewing may be sufficient to overcome participants’ reluctance to report nonuse of study product in trials where they are heavily counseled to be adherent to the product being studied and to use condoms. Adherence in future trials needs to be monitored prospectively [1] and with methods that correlate more closely to biomarkers (such as drug concentrations in blood, vaginal, or rectal fluid). A more appropriate role for participant questionnaires in such trials may be to focus on experiences with product use such as challenges and behaviors related to adherence and the context of non-use. Additionally, correlations of self-reports of product use with presumably more objective methods such as electronic monitoring devices, including the Wisepill and Wisebag, may provide important information to support adherence in future trials and eventually in the open market if and when products become licensed. Nevertheless, further research and new approaches to collection of data on self-reported adherence may be necessary to reduce the significant over-reporting that remains a challenge for HIV prevention trials.