Introduction

Antiretroviral therapy (ART) is effective for both the management and prevention of HIV [1, 2]. ART suppresses HIV viral load, increases CD4 cell count, reduces the incidence of co-infections, and decreases all-cause morbidity and mortality in HIV-infected populations [3–6]. However, there are disparities in ART access and effectiveness. Economically and socially vulnerable adults are at greater risk for becoming infected with HIV, and after infection experience sub-optimal outcomes relative to other HIV-infected populations [7–9]. For many socially marginalized individuals, their primary connection to healthcare occurs in community clinics, emergency rooms, and jails, and as a result, they are more likely to experience interruptions in care [10–13].

To deliver quality HIV care, providers require information on the clinical histories of their patients. Because marginalized adults access care inconsistently or from multiple providers who do not share medical records systems, their providers may have less access to clinical history information than they would with patients who have a more consistent source of care. In a situation of chaotic care, where access to clinical information is not assured, providers may not be able to use important information such as the duration of infection and nadir CD4 cell count, neither of which can be confirmed by a test, when making clinical decisions. In these cases, information is limited to available records and patient-provided data. Investigators studying other aspects of HIV prevention and care in marginalized populations must likewise rely on self-reported clinical data when conducting analyses [13–15].

Previous studies of self-reported clinical information relevant to HIV care fall into two broad groups: those that consider the general population of HIV-infected adults and those that consider a specific sub-population, such as unstably housed HIV-infected adults [16–22]. The first body of research suggests that patients’ own reports of CD4 cell counts and HIV viral loads are consistent with patient medical records [16–18]. The second area of study indicates that clinical information reported by sub-populations of vulnerable patients and their clinical records are not as highly correlated. Fisher explored data from individuals with histories of substance abuse and reported low reliability for whether and when patients recall having been told they were HIV-infected [20]. However, other studies of HIV-infected adults who were unstably housed show agreement between medical records and self-reports greater than 75 % for recent CD4 cell count and HIV viral load [19, 21].

A patient’s lowest ever, or nadir, CD4 cell count is strongly correlated with outcomes that occur later in the progression of HIV-disease. Lower nadir CD4 cell counts are associated with a reduction in the antibody response to pneumococcal conjugate vaccines, a recommended vaccine for all persons with immunocompromising conditions [23, 24]. Lower nadir CD4 cell counts are also associated with an increase in the risk of myocardial infarction, and an increase in the risk of neuropathic degeneration [23–26]. Nadir CD4 cell count also appears to be inversely related to the size of the HIV reservoir in persons on prolonged ART [27]. These correlations emerge in cases when the nadir CD4 cell count was realized many years before the event in question. Despite its importance in clinical decision-making, individual recollection of nadir CD4 is understudied, and whether persons living with HIV (PLHIV) can recall it accurately has yet to be established [26]. No previous studies have addressed whether this important indicator can be reliably obtained via self-report from HIV-infected individuals, with or without multiple co-morbidities.

Thus, the evidence paints an incomplete picture of reliability of patient recall: many HIV-infected adults accurately report their most recent CD4 count and viral load, including those who are homeless or who are drug users not in treatment, but inaccurately recall dates of clinical visits and initial HIV infection [19, 20, 22]. Factors that have been found to influence recall of CD4 cell count include injection drug use, years of education, and duration of insurance coverage [17, 18]. Similarly, substance use and years of education, and additionally, age, have been found to influence recall of dates of clinical visits [21].

In this paper, we discuss early evidence regarding the validity of nadir CD4 cell count, as reported by a socially and economically vulnerable incarcerated adult population and compared to data extracted from community and jail medical records. This analysis provides further data on the validity of recall of CD4 cell count and HIV viral load in a cohort of multiple co-morbid HIV-infected adults. We use sensitivity and specificity analysis to enhance the treatment of these data and present the results of logistic regression analyses that explore the relationships between patient characteristics and the odds of obtaining accurate data from self-reports.

Investigators have used an array of techniques to validate self-reported data: Pearson’s r, Spearman’s ρ, Cronbach’s α, sensitivity and specificity analysis, and, more commonly, percent agreement, Cohen’s κ, and the intra-class correlation coefficient (ICC). Bland and Altman warn in one of a classic series of notes that the oft-used ICC “measures the strength of the relation between two variables, not the agreement,” and that “it would be amazing if two methods designed to measure the same quantity were not related” [28]. Muller and Buttner are also critical of the ICC, as well as Pearson’s r, and do not identify a satisfactory test for validating self-reported data; Maclure and Willett state that “when assessing validity, there are better alternatives to κ,” among them sensitivity and specificity analysis [29, 30]. Sim and Wright note that the magnitude of κ is influenced by the prevalence of the attribute and potential non-independence of ratings, problems abated by the use of sensitivity and specificity analysis [31].
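The prevalence dependence of κ noted by Sim and Wright can be illustrated numerically. The sketch below is not part of the study's analysis: it builds expected 2×2 tables for a hypothetical sample of 1000 at a fixed, assumed sensitivity and specificity of 0.85, and shows how Cohen's κ shifts as the prevalence of the attribute changes. The function names and all input values are illustrative assumptions.

```python
def expected_table(n, prevalence, sensitivity, specificity):
    """Expected 2x2 counts (tp, fn, fp, tn) when the record is the reference."""
    pos = n * prevalence
    neg = n - pos
    tp = pos * sensitivity
    fn = pos - tp
    tn = neg * specificity
    fp = neg - tn
    return tp, fn, fp, tn


def cohens_kappa(tp, fn, fp, tn):
    """Cohen's kappa for two binary ratings (self-report vs. record)."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement is computed from the marginal totals of each rating.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((fn + tn) / n) * ((fp + tn) / n)
    expected = p_yes + p_no
    return (observed - expected) / (1 - expected)


# Hypothetical inputs: same sensitivity and specificity, different prevalence.
for prevalence in (0.05, 0.25, 0.50):
    kappa = cohens_kappa(*expected_table(1000, prevalence, 0.85, 0.85))
    print(f"prevalence={prevalence:.2f}  kappa={kappa:.2f}")
```

Under these assumptions κ is lower at low prevalence even though the accuracy of each individual report is unchanged, which is the behavior that motivates reporting sensitivity and specificity directly.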

Methods

Subject Recruitment and Selection

Participant data were drawn from individuals surveyed in an ongoing study of HIV-positive jail detainees transitioning to the community post-release [32]. The parent study was approved by the UCSF Committee on Human Research (IRB#:10-00077). Participants are incarcerated HIV-infected adults who report problematic substance use and unstable housing, and are recruited while detained in the San Francisco jail system. At the time of data collection, 207 individuals had completed a baseline survey, which is administered during detention. Follow-up interviews were scheduled for 2, 6, and 12 months after an individual’s release from jail, and 186 (90 %) individuals participated in at least one follow-up interview. These interviews took place at the study facility in downtown San Francisco, or, if the participant was re-detained, in jail. Interviews were administered through headphones in a private room with an audio computer-assisted self-interview (ACASI) system and study staff available to provide assistance. We utilized each participant’s earliest follow-up interview for comparison with EMR, and found no differences related to the timing of the interview that was used.

Survey Data

We used demographic data and information about reported drug use from the baseline survey and five clinically relevant items from the follow-up survey in this analysis. The clinical items included the duration of HIV-infection, recent and nadir CD4 cell count, and recent HIV viral load.

Electronic Medical Record (EMR) Data

We abstracted HIV related information from EMR kept by the San Francisco jails, as well as from a central database constructed for community health providers in San Francisco. The San Francisco jails have maintained an electronic database of detainee information since 1995 that includes notes from physician encounters and psychiatric consultations, detailed records of all medications dispensed, dates of laboratory tests and their results, and detention history. These records are available for all participants. Similarly, city and community health providers in San Francisco maintain EMR that are accessible in a common database, containing the information for only those adults who access health services from city and community providers. HIV test results, CD4 cell count test results, and HIV viral load test results are all available in the city-wide database, and were combined with the EMR from the jail system before being compared to survey results.

Statistical Analysis

We first compared self-reported clinical information from surveys with values recorded in medical records to determine the proportion of instances where the data agreed. Participants were asked to classify their CD4 cell counts into one of four categories (see Table 1) and, if they did not report their most recent HIV viral load as “undetectable”, were asked to classify their most recent results into four categories. The categories chosen reflect the cutoffs for CD4 cell counts that have previously been used to guide treatment: 200, 350, and 500 CD4 cells/mm3. CD4 cell counts and HIV viral load results were available in the EMRs as integer values, with the exception of viral load counts classified as less than some detection limit (75 or 50). These values were converted into corresponding categorical variables, and comparing these to the self-reported classifications allowed us to determine whether a response was accurate or inaccurate. All statistical analyses were performed with Stata 12 (StataCorp 2011, College Station, TX).
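The conversion of integer EMR values into categories, and the comparison with self-reported categories, can be sketched as follows. This is an illustration in Python rather than the Stata code used in the study. The cutoffs of 200, 350, and 500 come from the text; the category labels, the treatment of boundary values, and the example records are assumptions.

```python
from bisect import bisect_right

# Cutoffs named in the text (cells/mm3). The labels and the decision to place
# a boundary value in the higher category are assumptions.
CD4_CUTOFFS = (200, 350, 500)
CD4_LABELS = ("<200", "200-349", "350-499", ">=500")


def cd4_category(count):
    """Map an integer CD4 count from the record to one of four categories."""
    return CD4_LABELS[bisect_right(CD4_CUTOFFS, count)]


def percent_agreement(self_reports, emr_counts):
    """Percent of participants whose reported category matches the record."""
    matches = sum(
        report == cd4_category(count)
        for report, count in zip(self_reports, emr_counts)
    )
    return 100 * matches / len(self_reports)


# Hypothetical records, not study data.
reported = ["<200", "350-499", ">=500", "200-349"]
recorded = [150, 410, 480, 220]
print(percent_agreement(reported, recorded))
```

Only the third hypothetical participant is inaccurate here: a recorded count of 480 falls in the 350-499 category, not the reported category of 500 or more.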

Table 1 HIV-related questions and response codes as they appear on the survey tool

Sensitivity and Specificity Analysis

We calculated sensitivity and specificity of recall for classification of nadir CD4 cell count, most recent CD4 cell count, and most recent HIV viral load. The sensitivity and specificity of recall estimate the proportion of the population whose self-reported CD4 cell count or HIV viral load concurs with their EMR when reporting values as above or below a specific clinically relevant value. Explicitly:

  i. Was the participant able to correctly identify whether his or her nadir CD4 cell count was <200 or ≥500 cells/mm3?

  ii. Was the participant able to correctly identify whether his or her most recent CD4 cell count was <200 or ≥500 cells/mm3?

  iii. Was the participant able to correctly identify whether his or her most recent HIV viral load was undetectable?

We calculated separate sensitivity and specificity measures for the <200 cells/mm3 and ≥500 cells/mm3 cutoffs, for both nadir CD4 cell count and most recent CD4 cell count. In order to compare results from this study to previously published work, we calculated several other measures: α, κ, Pearson’s r, Spearman’s ρ, ICC, and percent agreement.
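A minimal sketch of the sensitivity and specificity calculation for one cutoff follows, with the EMR taken as the reference standard. The text does not state how the confidence intervals were computed, so the Wilson score interval used here is an assumption, as are the function names and the example data.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (an assumed choice of method)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def sensitivity_specificity(pairs):
    """pairs: list of (self_report_below_cutoff, emr_below_cutoff) booleans."""
    tp = sum(1 for rep, emr in pairs if rep and emr)
    fn = sum(1 for rep, emr in pairs if not rep and emr)
    fp = sum(1 for rep, emr in pairs if rep and not emr)
    tn = sum(1 for rep, emr in pairs if not rep and not emr)
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (fp + tn), wilson_interval(tn, fp + tn)),
    }


# Hypothetical participants, not study data.
pairs = ([(True, True)] * 8 + [(False, True)] * 2
         + [(True, False)] * 3 + [(False, False)] * 7)
result = sensitivity_specificity(pairs)
for name, (estimate, (low, high)) in result.items():
    print(f"{name}: {estimate:.2f} (95% CI {low:.2f}, {high:.2f})")
```

The same function would be applied once per cutoff (below 200, or 500 and above) and once for undetectable viral load, each time recoding the self-report and the record into a pair of booleans.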

Logistic Regression Analysis

In line with previous findings, we identified four characteristics available from the baseline survey that might impact the accuracy of self-reported data: (i) duration of HIV-infection, (ii) less than a high school education, (iii) homelessness at the time of the follow-up survey, and (iv) injection drug use [17, 18, 21]. We also included three characteristics that have not been previously reported on: (i) presence of an Axis I or Axis II psychiatric disorder (excluding substance-related diagnoses), (ii) hepatitis C co-infection, and (iii) ART adherence, rationalizing that these characteristics could be correlated with contact with the healthcare system and overall health literacy—other attributes previously identified as having a plausible connection with recall [17, 19].

Axis I disorders (such as schizophrenia) and Axis II disorders (such as autism or antisocial disorder) were combined into one indicator. Division into subgroups by disorder was not possible because of the sample size. We assume that all of these disorders are potentially correlated with an individual’s health literacy and ability to recall health-related information. Poor adherence is defined as having surpassed low level viremia, i.e., a most recent HIV viral load >1000 copies/mL. Characterizations of viremia vary throughout the literature; low level viremia commonly refers to an HIV viral load between 50 and 1000 copies/mL [33]. HIV drug resistance can also play a role in development of viremia, as non-optimal drug regimens may be harder to adhere to or more likely to cause side-effects. Hepatitis C co-infection indicates that an individual has ever been infected with Hepatitis C after their original HIV diagnosis. This includes participants with active Hepatitis C infection and participants who have undergone successful treatment and no longer have detectable Hepatitis C viral loads.

We included variables representing these characteristics in logistic regression models to assess whether any were associated with significantly different odds for self-reporting clinical values that matched EMR values.
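For intuition about the odds ratios these models produce, note that a logistic regression with a single binary characteristic yields an odds ratio equal to the cross-product ratio of the corresponding 2×2 table. The sketch below computes that unadjusted odds ratio with a Wald interval. It is a simplification: the models in this analysis are multivariable, were fitted in Stata, and report adjusted odds ratios. The function name and the counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio(with_correct, with_incorrect, without_correct, without_incorrect,
               z=1.96):
    """Unadjusted odds ratio of a correct self-report, with a Wald interval.

    "with" and "without" refer to participants with and without the
    characteristic of interest (for example, a psychiatric diagnosis).
    """
    ratio = (with_correct * without_incorrect) / (with_incorrect * without_correct)
    # Standard error of the log odds ratio from the four cell counts.
    se = sqrt(1 / with_correct + 1 / with_incorrect
              + 1 / without_correct + 1 / without_incorrect)
    lower = exp(log(ratio) - z * se)
    upper = exp(log(ratio) + z * se)
    return ratio, lower, upper


# Hypothetical counts, not study data.
ratio, lower, upper = odds_ratio(30, 20, 40, 10)
print(f"OR {ratio:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

An odds ratio below 1 with an interval that excludes 1 would indicate that participants with the characteristic were less likely to report a value matching the EMR.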

Results

Participant Characteristics

All participants had histories of incarceration and unstable housing. The population was predominantly male (80 %) and African-American (55 %), with an average age of 44. About one-third of participants were non-Hispanic whites (see Table 2). More than half (56 %) had been diagnosed with an Axis I or Axis II psychiatric disorder (excluding substance-related diagnoses), and 57 % had previously tested positive for antibodies to Hepatitis C. At the time of the first follow-up survey, 28 % reported living alone or with a spouse; 37 % reported living with a friend or relative or at a hotel, 25 % were homeless, and 11 % were in jail or residents of in-patient rehabilitation programs. Only 5 % reported that they were currently employed, and no participant reported having private insurance. While EMR indicated that all participants had previously used non-prescription drugs, only 89 % of participants reported previous use on the survey. Two-thirds of participants (67 %) reported previous injection drug use. One-third (32 %) of respondents reported that they never completed high school or obtained a GED. Almost half of participants (48 %) were detained one or more times after their initial release but before their first follow-up interview.

Table 2 Demographic characteristics of study participants

Two hundred and seven individuals completed the baseline survey. One hundred and eighty-six participants completed at least one follow-up interview, and 21 participants did not complete any follow-up surveys, for an overall response rate of 90 %. Respondents and non-respondents did not differ significantly by any demographic or health characteristics measured in the baseline survey or extracted from EMR.

Nadir CD4

Ninety-four percent of participants provided a self-reported nadir CD4 cell count. No demographic characteristics differed significantly between respondents and non-respondents, and both groups had similar mean CD4 cell counts and HIV viral loads. When classifying themselves as having a nadir count of less than 200 cells/mm3, the sensitivity of individual self-reports was 82 % [95 % confidence interval (CI) 71, 89] and the specificity was 73 % (95 % CI 63, 81), indicating that about four-fifths of respondents whose EMRs show a nadir CD4 cell count under 200 report a concurrent value, and that about three-quarters of respondents whose EMRs do not show a nadir CD4 cell count under 200 report a matching categorization. Self-reported classification of nadir count of greater than or equal to 500 cells/mm3 had markedly lower sensitivity, 56 % (95 % CI 35, 76), but very high specificity: 93 % (95 % CI 88, 97).

Logistic regression analysis, using the binary outcome of correct versus incorrect classification as the dependent variable, revealed that the presence of an Axis I or Axis II psychiatric disorder was significantly associated with a lower likelihood of reporting a nadir CD4 cell count concurrent with EMR: adjusted odds ratio (OR) 0.48 (95 % CI 0.24, 0.97; P < 0.05) (see Table 3). When participants without Axis I or Axis II disorders were considered separately, the sensitivity of self-report increased to 89 % (95 % CI 74, 96) and the specificity to 77 % (95 % CI 59, 89). The sensitivity and specificity of self-report were much lower when considering only participants with these disorders: 66 % (95 % CI 41, 86) and 43 % (95 % CI 20, 70), respectively.

Table 3 Results of logistic regression analyses

HIV Viral Load

Seventy-eight percent of participants answered questions about their most recent HIV viral load. The demographic and health characteristics of respondents were not significantly different from non-respondents. Sensitivity for reporting an undetectable HIV viral load in concurrence with the medical record data was 93 % (95 % CI 84, 97) and specificity was 77 % (95 % CI 66, 86).

Logistic regression analysis did not uncover any significant relationships between participant characteristics and the likelihood of reporting recent HIV viral load results that accord with EMRs.

Most Recent CD4

Ninety percent of participants reported information from their most recent CD4 cell count, again with no significant differences in demographic or health characteristics between respondents and non-respondents. When classifying themselves as having a most recent CD4 cell count of less than 200 cells/mm3, the sensitivity of self-reports was only 58 % (95 % CI 41, 74), but specificity was 95 % (95 % CI 89, 98). When reporting a most recent CD4 cell count of greater than or equal to 500 cells/mm3, self-reports were 70 % sensitive (95 % CI 57, 81) and 91 % specific (95 % CI 83, 95).

Only Hepatitis C co-infection was statistically significantly associated with the likelihood of correct classification. Participants who had ever tested positive for the presence of antibodies to Hepatitis C were less likely to report values for recent CD4 cell count that matched their EMR, with an adjusted OR of 0.44 (95 % CI 0.20, 0.98; P < 0.05).

Table 4 lists the results of the validation measures used in this study as well as the results of five previous studies examining the agreement of self-reported CD4 cell counts and HIV viral loads. No previous analyses address the validity of self-reported nadir CD4 cell count, thus restricting the comparison of results to only the most recent CD4 cell count and HIV viral load tests. Only the Sohler (2009) study had more responses than the current analysis, but was limited to comparing the percent agreement.

Table 4 Measures of agreement from this analysis compared to previous validation studies of self-reported HIV-related clinical variables

While we believe that sensitivity and specificity are optimal measures of agreement, we have included six other commonly used metrics to facilitate comparison between our results and previous published analyses. The lowest values of Cohen’s κ from this study suggest “moderate” to “substantial” agreement, as do the values of κ from every other study listed [34]. The Cronbach’s α realized in the current and previous studies suggest “acceptable” to “good” agreement [35]. The ICC, the related Pearson correlation coefficient, and the Spearman ρ realized in this study all fall within the rather large range of previously published values.

The interpretations of these values vary and are not as straightforward as the interpretations of the sensitivity and specificity. In the case of nadir CD4 cell count, for example, the value of κ suggests “moderate” agreement, and the value of α suggests “good” agreement, while the sensitivity indicates that 82 % of individuals were able to correctly categorize their nadir CD4 cell count as less than 200 cells/mm3. Table 4 also indicates that providers might be more cautious when relying on self-reported values from patients with confirmed Axis I or Axis II psychiatric disorders. When these individuals are removed from the analysis the value of κ suggests “substantial” agreement and the value of α still denotes “good” agreement, while the interpretation of the sensitivity is clearer: excluding patients with Axis I or Axis II disorders raises sensitivity by 7 percentage points. In this case, 89 % of individuals were able to correctly categorize their nadir CD4 cell count.

Overall, Table 4 suggests that even in marginalized populations, important clinical factors can be recalled with modest accuracy, and that this accuracy is best characterized and understood by calculating the sensitivity and specificity of correct classification into relevant categories.

Discussion

The results of this analysis reveal that in a socially and economically vulnerable population of participants, recall of nadir CD4 differs depending on its value (less than 200 vs. greater than or equal to 500) and history of psychiatric diagnosis, but can nevertheless be considered reliable. False negatives are of concern in this population, as under-treatment is a potential problem. A sensitivity of 82 % for nadir CD4 cell count less than 200 indicates that the majority of individuals are able to recall that their CD4 cell count has previously been very low, in accord with medical records. The proportion of these individuals able to recall values in accord with their medical records rises to 89 % when individuals with Axis I or Axis II psychiatric disorders are excluded. In both cases, the specificity of recall was around 75 % (73 % in the first case and 77 % in the second). Thus, about 25 % of reports by individuals who believe that they have never had a CD4 cell count under 200 may be inaccurate.

Concordance is poorer between self-reported values and medical record data regarding the most recent HIV viral load. When considering whether their most recent HIV viral load was undetectable or not, reports by respondents in our sample agreed more often with medical record data than in studies by Kalichman or Kinsler, but when considering a categorical measure of most recent HIV viral load, respondents in our sample agreed less often [17, 19]. When considering percent agreement and κ, our results showed lower accuracy than previously published results. However, when considering ICC, Pearson’s r, or Spearman’s ρ, our results showed greater concordance. Sensitivity may provide a more accurate depiction: of the 145 individuals who responded to questions about their most recent HIV viral load, 93 % of those with undetectable levels correctly recalled this fact. This is greater than the 75 % sensitivity previously reported in a pilot study in New York [36].

When categorizing their most recent CD4 cell count, study participants reported more disparate values than the previously published results would suggest, when comparing percent agreement, Pearson’s r, Spearman’s ρ, Cohen’s κ, and the ICC. Sensitivity was lower than in the only other study to calculate this measure: 58 % for categorization of most recent CD4 cell count as less than 200 and 70 % for categorization as greater than or equal to 500, compared to the previously reported but unpublished results of 80 % for both groups [36]. Specificity was much greater and similar to the CHAIN report: 95 % for CD4 cell counts of less than 200 and 91 % for counts greater than or equal to 500 CD4 cells, compared to 96 and 100 %, respectively [36]. Our study population exceeded the percent agreement reported in one previously published study, Sohler (2009). The population in Sohler shares similar characteristics with our study population, and that study obtained more than 300 responses [21]. However, categorization in that study was based on a CD4 cell count of greater or less than 350 cells/mm3. The differing results might occur because participants are more easily able to classify themselves into extreme categories.

These findings extend the body of research into the validity of self-reported clinical data among HIV-infected patient populations. Findings from Calsyn et al. suggest that homeless individuals with Axis I and Axis II psychiatric disorders have a hierarchy of preferred needs, and that the accuracy of self-reported data is better if the data is related to a need towards the top of the hierarchy [37, 38]. During a clinical examination, a treating physician may choose to emphasize different aspects of treatment for HIV-infection. The recall of participants may, therefore, give insights into which aspects are discussed more thoroughly or are given disproportionate weight in discussions with care providers. As a result, one possible explanation for the higher accuracy of recall of most recent HIV viral load compared to most recent CD4 cell count (93 % sensitivity compared to 58 and 70 %) is that the focus on HIV viral load as a measure of adherence and infectivity has translated from health care providers to patients.

There are several limitations to this analysis: the sample is not a random sample but rather a convenience sample, and the responses in this analysis were taken from a much larger survey that was not intended for this purpose. While no participants reported having private insurance, it is possible that some individuals obtained health care from private providers in San Francisco or public or private providers in a different city. EMRs that are limited to publicly provided care in San Francisco may therefore be incomplete. In the context of nadir CD4 cell count, the important missing values are any that are lower than the nadir CD4 cell count recorded from EMRs, because there is a risk of misclassifying those who correctly reported a lower nadir CD4 cell count that differs from their EMR. If all of the participants who reported a lower nadir CD4 cell count than listed in their EMR were correct (i.e., if their nadir CD4 cell counts were established at private providers or providers outside of San Francisco and their EMRs were subsequently incorrect), the sensitivity for nadir CD4 cell count under 200 would rise from 82 to 87 %, and the specificity from 73 to 100 %.
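The logic of this bounding exercise can be sketched with a small example. If every participant who reported a nadir below 200 against an EMR value of 200 or more is assumed to be correct, those apparent false positives become true positives. The counts below are round hypothetical numbers chosen for illustration, not the study's cell counts, so the output does not reproduce the percentages above.

```python
def sens_spec(tp, fn, fp, tn):
    """Sensitivity and specificity from the four cells of a 2x2 table."""
    return tp / (tp + fn), tn / (fp + tn)


def assume_low_reports_correct(tp, fn, fp, tn):
    """Reclassify apparent false positives as true positives.

    This models the case where the record missed a lower CD4 count that the
    participant reported correctly.
    """
    return tp + fp, fn, 0, tn


# Hypothetical counts, not study data.
observed = (80, 20, 25, 75)
before = sens_spec(*observed)
after = sens_spec(*assume_low_reports_correct(*observed))
print(f"sensitivity {before[0]:.2f} -> {after[0]:.2f}")
print(f"specificity {before[1]:.2f} -> {after[1]:.2f}")
```

Under this assumption specificity necessarily reaches 100 %, because no false positives remain, while sensitivity rises by an amount that depends on how many reports were reclassified.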

These limitations are counterbalanced by various strengths. The population studied in this analysis has multiple co-morbidities: in addition to HIV-infection, individuals must have multiple detentions in San Francisco, a history of unstable housing, and current or previous substance use. Given that the maximum occupancy of the San Francisco jails is 2200 people, and the prevalence of HIV-infection is approximately 4 %, there are at most 88 HIV-infected individuals detained at any time. Eligibility criteria (English-speaking, multiple detentions, unstable housing, and drug use) further reduce the number of individuals eligible for the parent study. Data collection took place for a period of 24 months, and we believe that the sample of 207 individuals used in this analysis represents a large proportion of this population. Additionally, the length of time that the San Francisco jails have been keeping EMRs, combined with the lack of private insurance in this population, suggests that the jail medical records most likely represent a high standard for the comparison of clinical data related to HIV infection.

Conclusions

Socially and economically marginalized HIV-infected adults often experience disruptions in care, and they are less likely to have access to their full medical history. Physicians are therefore dependent on self-report when assessing the risk of a range of adverse events associated with low nadir CD4 cell count. Overall, our findings indicate that despite the vulnerable nature of the population in this study, recall of two important clinical markers of HIV status (nadir CD4 cell count and undetectable HIV viral load) was largely concurrent with existing EMR data. These results suggest that researchers can rely on the recall of PLHIV in other study settings that utilize self-reported information about HIV disease.