Introduction

Between 20 and 70% of people aging with HIV report difficulties with one or more aspects of cognition even with well-controlled infection [1]. Any decline in cognitive capacity can affect everyday life, including producing difficulties at work, problems with household management, and poor medication adherence [2,3,4]. These in turn affect the quality of life [4].

Since the introduction of combined antiretroviral treatment (cART), the spectrum of HIV-associated neurocognitive disorders (HAND) has changed, with a substantial reduction in dementia (from approximately 20 to 5% of patients) but a higher prevalence of mild-to-moderate forms of cognitive impairments [5, 6]. It can be difficult to distinguish mild, but clinically relevant symptoms reported by people living with HIV, from everyday memory lapses common in the general population. As yet, there is little consensus on how to identify mild cognitive impairment in the clinical setting.

Clinicians are reluctant to rely on self-reported cognitive difficulties, fearing lack of insight in those who report no difficulties and over-reporting in those who do. This hampers communication about the very real concerns that people with HIV experience. There is a heightened interest in obtaining patients’ perspectives on their conditions in order to inform the benefits and risks of new treatment approaches. To this end, between 2013 and 2017 the Food and Drug Administration (FDA) held 21 meetings with patients with different health conditions including people with HIV. The results were published as a series of 21 “Voice of the Patient” reports [7]. Cognitive concerns were raised by many patient groups including people with HIV using words such as “confusion,” “disorientation,” “hard to concentrate, can’t focus,” “inability to process information,” “can’t find the right words,” “inability to multi-task,” and “problems with decision-making.” However, there is no consensus on how to ask about cognitive difficulties experienced by persons living with HIV. No HIV-specific voice-of-the-patient measure exists to assess these difficulties and there is no information as to whether their nature and frequency differ from those of the general population.

In HIV research, two approaches have been adopted to obtain information on cognitive concerns from patients themselves. One is to use a generic questionnaire on reported cognitive difficulties, not developed specifically for people living with HIV but with items that are likely to reflect their experiences. These include the Patient’s Assessment of Own Functioning Inventory (PAOFI) [8], the Perceived Deficits Questionnaire (PDQ) [9], and the Cognitive Failures Questionnaire (CFQ) [10].

The other approach is to use an HIV-specific measure of health-related quality of life that includes cognitive items as a subscale. However, the World Health Organization Quality of Life-BREF (WHOQOL-HIV-BREF) [11] includes only 1 cognitive item (concentration); the HIV Medical Outcomes Survey (MOS-HIV) includes only 4 cognitive items (memory, concentration, and executive function) [12], and the PROQOL-HIV includes 2 items (memory and concentration) [13].

Given the importance it places on including the voice of the patient in the evaluation of therapies, the FDA has published guidance for developing “patient-reported outcomes” for measuring the symptoms of disease and treatment outcomes [14]. Central to this approach is a strong theoretical conceptual framework adjusted to the context using patient input; pilot testing of the items, response options, and scoring method; and validation.

We previously reported on the development of a pool of items reflecting cognitive difficulties of persons living with HIV [15] following the FDA guidance. Forty-eight important and prevalent concerns, evaluated on a 3-point ordinal scale, were retained to form the test version of the Communicating Cognitive Concerns (C3Q) Questionnaire for people with HIV. The items covered the cognitive domains of memory, concentration, executive function, language, emotions, and motivation. The 48-item version is presented in Table S1. This paper describes the process to produce a total score for these items.

The objectives of this study were (i) to identify a set of items from the test version of the C3Q that best fit the Rasch model to create a hierarchically ordered set representing a mathematical quantity of cognitive concerns; (ii) to validate the stability of the item hierarchy of this reduced set of items in a separate sample of people with HIV; (iii) to estimate the extent to which the hierarchy of the final set of items from the target population of people with HIV is similar to the hierarchy from testing in people without HIV; and (iv) to estimate the extent to which the C3Q total score co-calibrated with converging constructs.

Methods

The data from this study came from two sources. The HIV sample came from participants in the Positive Brain Health Now (+BHN) Cohort [16]. Participants were ≥ 35 years old, HIV+ for at least 1 year, and able to communicate adequately in either French or English. The community sample comprised Canadians registered with a survey company, Hosted in Canada Surveys, who were > 40 years of age with no major health conditions.

Statistical methods

To identify the best set of items (Objective 1), data from the first 204 participants in the +BHN cohort who completed the C3Q were used to provide preliminary estimates of item locations and of the fit of the 48 items to the Rasch model. The Rasch partial credit model was fitted using the RUMM 2030 software. The steps taken to fit the data to the model followed those recommended by Tennant and Conaghan [17]. The explanation for the iterative steps in a Rasch analysis and the interpretation of the parameters from the Rasch model [18] are given in Table S2.
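
For readers unfamiliar with the partial credit model, the following minimal Python sketch shows how category probabilities are obtained from a person location and an item's thresholds (all in logits). It is an illustration only and not the RUMM 2030 implementation; the function name and the example threshold values are hypothetical.

```python
import math

def pcm_category_probabilities(theta, thresholds):
    """Return P(X = k) for k = 0..m under the Rasch partial credit model.

    theta      -- person location in logits
    thresholds -- the item's m threshold locations in logits
                  (m = 2 for a 3-category item)
    """
    # Cumulative sums of (theta - delta_j); the empty sum for category 0 is 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical 3-category item with thresholds at -1 and +1 logits,
# evaluated for a person located at 0 logits.
probs = pcm_category_probabilities(theta=0.0, thresholds=[-1.0, 1.0])
print([round(p, 3) for p in probs])
```

At a person location equal to the first threshold, the two lowest categories are equally likely, which is the defining property of a threshold in this model.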

The aim of Rasch analysis is to identify items that do not fit this hierarchy of low to high ability. In this context, we purposely chose all items identified by the participants, knowing in advance that many would overlap or not fit the underlying trait. Items with misfits were investigated and removed one at a time until the best model was obtained. After each deletion, item and person fit statistics were re-examined to identify improvements to the model. A plot of the distribution of people and items along the ability metric of latent trait (person-item threshold distribution plot) was used to assess whether the final set of items optimally targeted the population.

The sample size for Rasch analysis depends on the required degree of precision of the person and item estimates, and the targeting of the sample. A sample size of 64 participants is considered sufficient to give a stable item calibration within ± 0.5 logit, when the sample is well targeted, rising to 144 when the sample is poorly targeted [19]. The sample size for this first analysis was 204 people with HIV.

To test the stability of the hierarchy over time (Objective 2), the hierarchy of the reduced set of items was confirmed using data from the +BHN cohort participants (n = 703) at their last visit. A second Rasch analysis was conducted on these data, and the resulting hierarchy was compared with that of the developmental sample (n = 204) using the Wilcoxon signed-rank test.
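
As an illustration of this kind of comparison, the sketch below applies a Wilcoxon signed-rank test to paired item locations using only the Python standard library. It assumes a two-sided normal approximation without continuity or tie correction, which may differ from the software used in the study; the item locations shown are hypothetical and are not taken from Table 3.

```python
import math
from statistics import NormalDist

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (W_plus, W_minus, p_value). Zero differences are dropped
    and tied absolute differences receive average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, w_minus, p_value

# Hypothetical item locations (logits) for the same items in two samples.
developmental = [-1.20, -0.75, -0.30, 0.10, 0.45, 0.90]
validation = [-1.35, -0.60, -0.42, 0.25, 0.38, 1.05]
print(wilcoxon_signed_rank(developmental, validation))
```

A large p-value from such a test is consistent with, but does not prove, a stable item hierarchy across samples.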

To compare item hierarchy with a group of people without HIV (Objective 3), the final set of items that fit the Rasch model in the HIV group was administered to a group of 484 people registered with Hosted in Canada Surveys. Because the sample sizes were large, differences in demographic characteristics and in the response distributions across items were considered meaningful if there was more than a 10% difference between the HIV+ and HIV− respondents. The hierarchies from the HIV+ validation group and this community sample were compared using the Wilcoxon signed-rank test.

For Objective 4, the relationships between the final item set for the C3Q and known constructs were estimated. Convergent validity was assessed by evaluating the strength of the correlation between the C3Q and the total score of other measures in the cognitive domain, including the B-CAM© [20, 21] and the Perceived Deficits Questionnaire (PDQ) [9]. Data from 703 people with HIV were available and used for these analyses. Correlations with mood from the Mental Health Index subscale of the RAND-36 [22] and with the work-productivity subscale of the SPS (work impairment score) [23] were calculated; the mean C3Q score was calculated for those working and those not working. All variables were scored from 0 to 100, with 100 the best outcome.

The B-CAM [20, 21] comprises a battery (37 items) of computerized tests of processing speed, attention, memory, and executive function. Rasch analysis was used to develop B-CAM and all items fit the model, indicating that a legitimate total score can be derived for the construct, i.e., cognitive ability. B-CAM has a total score on a continuum from 0 to 33 with higher scores reflecting better cognitive ability.

The PDQ [9] covers cognitive activities that the person reports they have difficulty with (for example, “during the past 4 weeks, how often did you forget if you had already done something?”). It comprises 20 items in 4 cognitive domains including attention/concentration, retrospective memory, prospective memory, and planning/organization that are scored on a 5-point scale from 0 to 4 (higher indicates more deficits) and is scored from 0 to 80 with a cut-point to indicate cognitive impairment of 40 or more. Cronbach’s alpha of > 0.95 has been reported in two samples of people with major depressive disorder [24].

The Work Impairment Score (WIS) of the Stanford Presenteeism Scale was used as a measure of the ability to function at work despite health problems. The WIS is the sum of responses to 10 items scored on a 5-point ordinal scale for the amount of time in the past 4 weeks the individual experienced work challenges (0–100). Higher scores indicate better work function. The WIS has demonstrated good reliability (Cronbach’s alpha = 0.83) and validity (significant positive relationship with SF-36) [23].

The RAND-36 [22] is a self-report questionnaire that consists of eight subscales assessing the health domains of physical functioning, social functioning, role limitations due to physical health problems, role limitations due to emotional health problems, vitality/energy, bodily pain, general health perceptions, and mental health perceptions (MHI). MHI subscale score was used in this study. Scores range from 0 to 100, where a higher score indicates less disability. Cronbach’s alpha of 0.71–0.92 has been reported for RAND-36 [25].

Results

Of the total sample enrolled in the +BHN study, 703 participants answered the C3Q and their data were analyzed for objective 1 (n = 204 using data from first assessment) and objectives 2–4 (n = 703 using data from last assessment). The characteristics of the participants in the two HIV+ samples and the community group are presented in Table 1. There were no substantial differences between the groups on these variables, although with the large sample sizes many of these small differences were statistically significant.

Table 1 Demographic characteristics of the three samples

Table 2 shows the decision process about each item with respect to its fit to the Rasch model. Most of the items that were deleted showed misfit except for the items from the language domain (items 29 to 33) which showed a high degree of residual correlation (0.53). This often suggests a second dimension, but further testing did not support this. Iteratively deleting these items identified item 31 as the best representative of this cognitive domain; of note, it was the item with the simplest wording. One item (Item 11: forgetting the topic of a conversation that I just had) showed differential item functioning (DIF) by gender and, as other items covered similar content, it was deleted.

Table 2 Results of the Rasch analysis on each item from the developmental sample (n = 204)

After the iterative steps, the pool of 48 items was reduced to 18 by deleting six items for misfit, 10 items for correlation, and one item for DIF. The final set of items fit the model (χ2 = 44.8; df = 36; p < 0.15) with optimal fit properties.

Figure 1 shows the distributions of the items (lower part of the graph) and the members of the HIV+ group (upper part of the graph) across the measure of self-reported cognitive ability (horizontal axis), from the lowest ability on the left to the highest ability on the right. The item thresholds (location) ranged from − 2.43 logits for [lowest threshold of] item 15 to + 2.15 logits for [highest threshold of] item 22. The individual ability ranges from − 2.86 to 3.78 logits. Targeting was good; participants were mainly in the medium ability range from − 2 to 2 logits with a mean location of 0.99 (SD: 1.54) after omission of extremes. Thus, there are enough easy and difficult items to accurately estimate an individual’s self-reported cognitive ability across a wide range. Of the 204 people assessed, 21 (10.3%) were at the ceiling, and only 2 were at the floor. Ceiling effects < 15% are considered adequate [26].

Fig. 1 HIV+ person-item threshold distribution

An expanded sample of people with HIV (n = 703) answered the 18 items of the C3Q. These data were tested for fit to the Rasch model. Item 18 (I’m afraid of doing new activities) and Item 13 (I lose focus when I have to pay attention to two things at a time) did not fit the model (fit residual = 2.98 and − 2.88, respectively). However, when 10 random samples were drawn (n = 400 based on 35 thresholds) and the fit of the items was tested in each, the mean fit residuals for item 18 and item 13 were 1.7 and − 1.8, respectively, indicating that both items fit the model.

The distributions of item thresholds and individuals across the measure of cognitive ability are shown in Fig. 1. The figure illustrates that, while the C3Q items are reasonably well distributed (extending from − 2.4 to + 2.4 logits), some individuals, especially at the higher end of the spectrum (n = 33, 18%), cannot be measured by this set of items. In other words, there are not enough [cognitively] difficult self-report items for individuals with high cognitive ability.

Table 3 shows the item hierarchy for all study groups. The hierarchy was compared between the two groups of HIV+ individuals who completed the C3Q (n = 204 and n = 703). A Wilcoxon signed-rank test showed no substantive differences in the ranks between these two HIV+ samples (p = 0.5419). For the validation sample of 703 HIV+ people, the item at the lowest end of the self-reported cognitive ability spectrum (average location across the two thresholds: − 1.59) was Item 8, “I forget I have food cooking.” This indicates that people who endorse that they frequently leave food cooking have very poor self-reported cognitive ability as most people would not endorse this level. Item 14, “I lose focus and end up with too many thoughts in my head,” is at the top of the spectrum and indicates that people who endorse “rarely” would have high self-reported cognitive ability. The item threshold at location 0 is the item that 50% of people endorse.

Table 3 Item locations for the HIV+ and HIV− groups

The average location across the three groups is also presented in Table 3. Of interest is the difference between the validation sample of 703 HIV+ people and the HIV− controls, which was 0.8 logits (95% CI 0.6–1.0). As all the items for both samples fit the Rasch model, a simple score can be legitimately produced by assigning values of 0, 1, and 2 to the three response categories (frequently, sometimes, rarely) and summing them across items. The correlation of this simple score with the logit scale was 0.97. Higher scores (maximum 36) indicate more cognitive “ability” or fewer concerns. The HIV+ group scored a mean of 25.7 (SD: 8.03). The HIV− group scored a mean of 29.1 (SD: 7.26). Both the difference between the HIV+ and HIV− groups on the logit scale and the difference on the simple scale (3.4; 95% CI 2.5–4.3) are greater than ½ SD, a magnitude considered clinically relevant [27].
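
The simple scoring rule described here can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes that all 18 items were answered; how missing responses should be handled is not specified in this section.

```python
# Response values as described in the text: frequently = 0, sometimes = 1, rarely = 2.
RESPONSE_VALUES = {"frequently": 0, "sometimes": 1, "rarely": 2}
N_ITEMS = 18
MAX_RAW_SCORE = N_ITEMS * max(RESPONSE_VALUES.values())  # 36

def c3q_raw_score(responses):
    """Sum the 0/1/2 values over the 18 items (higher = fewer concerns)."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    return sum(RESPONSE_VALUES[r] for r in responses)

def rescale_to_100(raw_score):
    """Express a raw score (0 to 36) on a 0 to 100 scale."""
    return raw_score / MAX_RAW_SCORE * 100

# Hypothetical respondent: 10 items "rarely", 6 "sometimes", 2 "frequently".
answers = ["rarely"] * 10 + ["sometimes"] * 6 + ["frequently"] * 2
raw = c3q_raw_score(answers)
print(raw, round(rescale_to_100(raw), 1))
```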

Table 4 shows the response distributions of all 18 items for the HIV+ sample and for the HIV− sample. For all items, the proportion of people responding that they rarely had the problem was higher for the HIV− than the HIV+ (bold font) but for two items (#1, #7), the difference was smaller than our threshold difference of 10%.
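
The 10% criterion can be expressed as a simple filter over item-level percentages. The sketch below assumes that "10%" refers to a difference of 10 percentage points in the proportion responding "rarely"; the function name and the percentages are hypothetical and are not the values in Table 4.

```python
def flag_meaningful_differences(pct_group_a, pct_group_b, threshold=10.0):
    """Return the item numbers whose percentages differ by more than
    `threshold` percentage points between the two groups."""
    return sorted(
        item
        for item in pct_group_a
        if abs(pct_group_a[item] - pct_group_b[item]) > threshold
    )

# Hypothetical percentages responding "rarely", keyed by item number.
hiv_positive = {1: 55.0, 2: 40.0, 3: 62.0}
hiv_negative = {1: 60.0, 2: 58.0, 3: 75.0}
print(flag_meaningful_differences(hiv_positive, hiv_negative))
```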

Table 4 Frequency distribution of responses to the C3Q items for HIV+ (n = 703) and HIV− (n = 484) respondents

The extent to which the data from the C3Q were consistent with convergent constructs was assessed by the strength of the correlation between the C3Q (simple scoring) and other standard measures of cognitive ability, emotional function, and the downstream outcome of work and work productivity. The raw mean of the C3Q based on the 0, 1, 2 scale scores (range 0 to 36) was 25.7 (SD: 6.0) but when rescored to be out of 100, the mean was 71.1 (SD: 22.4). For the controls, the raw mean was 29.1 (SD: 7.2) and 80.8 (SD: 20.2) scored out of 100. The difference between the HIV+ sample and controls on the raw score was 3.4 (95% CI 2.64–4.16) and 9.7 (95% CI 7.2–12.2) when scored out of 100.
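
A confidence interval for a difference between two group means can be approximated from summary statistics alone. The sketch below assumes a normal approximation with unpooled variances, which may not be the exact method used in the analysis; the function name is hypothetical, and the inputs are the means, SDs, and sample sizes reported above for the 0 to 100 scoring.

```python
import math
from statistics import NormalDist

def mean_difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, level=0.95):
    """Return (difference, lower, upper) for mean_a - mean_b using a
    normal approximation with unpooled variances."""
    difference = mean_a - mean_b
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return difference, difference - z * standard_error, difference + z * standard_error

# Controls versus HIV+ sample, both scored out of 100 (values from the text).
diff, lower, upper = mean_difference_ci(80.8, 20.2, 484, 71.1, 22.4, 703)
print(round(diff, 1), round(lower, 1), round(upper, 1))
```

Under these assumptions the interval falls close to the reported 7.2–12.2.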

The C3Q showed a strong to moderate correlation with the PDQ (r = − 0.82; higher is more impairment) but low correlation with the B-CAM, a test of cognitive performance (r = 0.14). The correlations with mood (MHI subscale of the RAND-36) and with work productivity were moderate (0.56, 0.60, respectively). People who were working (n = 317) scored on average 73.8 (SD: 20.6) on the C3Q, while people not working scored on average 68.8 (SD: 23.4), equivalent to a difference of 5.0 (95% CI 1.6–8.3).

Discussion

The cornerstone of patient-centered care is providing care that considers patients’ goals, preferences, values, and needs. In such an important life-area as cognition, addressing patients’ concerns is paramount, but this cannot be done without directly identifying these concerns. Aging with HIV has brought the measurement of cognitive concerns to the forefront. Even though treatment success has reduced the most severe forms of cognitive impairment, milder forms of cognitive impairment are prevalent. People aging with HIV recognize cognitive challenges. When asked to nominate areas where HIV affected their quality of life, cognition was one of 34 areas spontaneously nominated. In contrast, among people with multiple sclerosis, cognition was one of 60 areas nominated [28]. A call to action to develop measures of everyday cognitive function [29] was taken up by this paper. The C3Q would fill this gap for people with HIV.

The development of this measure followed best practices as recommended by the Food and Drug Administration (FDA) guidance [14]. The initial steps of this work have been described previously. Briefly, considerable effort was made to obtain the patients’ voices about their cognitive concerns, with some 300 people living with HIV around the world providing input [15]. This process is unique in the field of HIV. The concerns voiced covered the five cognitive domains of memory, attention, executive function, visuospatial skills, and language (collectively including 14 sub-domains) which are all the domains within the traditional neurocognitive framework [30]. Two additional domains emerged: emotional consequences and motivation.

An important requirement in applying the FDA guidance is that total scores from measures with multiple domains should be supported by evidence that the total score represents the concept of interest, in its complexity. In this paper, we report on the justification for a total score across items from different cognitive domains using Rasch Measurement Theory which is a strong method of providing an evidence base for the extent to which a set of items form a real measure [31].

The process started with 48 items covering the four domains of memory, attention, executive function, language, and two additional domains related to emotions and motivation. Rasch analysis indicated that 18 items fit the Rasch model: 8 for memory (items 1–8), 6 for concentration (items 9–14), 2 items for executive function (15 and 16), one for language (item 17), and one emotional item for fear of new activities (item 18).

The fact that all items fit the Rasch model, and other model parameters were optimal, provides evidence of a mathematically legitimate total score derived from combining these items. While the Rasch model yields scores on a logit scale, a simple scoring system derived by assigning numerical values to the responses can be applied if the correlation between the original logit scale and the simple ordinal scale is high (> 0.9). In this sample, the correlation was 0.97 indicating a simple scoring system can be used. Table S3 provides C3Q items and scoring in English. The questionnaire may also be downloaded, free of charge, in English and French, at https://brainhealthnow.org.

This study also showed that the items of the C3Q had similar levels of difficulty both for HIV+ people and controls although there were some differences (see Table 3). In particular, items #8 (forget food cooking) and #15 (making decisions) had different locations for these two groups (difference in logit was > 0.5 on a scale with a mean of 0 and SD of 1) with HIV− people finding these items harder to endorse as not being problematic than HIV+ people. Item #1 (forget tasks or activities I need to do) showed the opposite effect, with HIV+ people finding this item easier to endorse as not problematic. It is not unusual with 18 items to find at least one to differ by chance. Items #1 and #7 did not meet our critical value of 10% difference between groups, but item #10 did (see Table 4). Whether or not people with HIV have a unique cognitive profile, as opposed to a different severity on the same difficulties reported by people from a general population sample, will need to be verified in a different sample. However, these data suggest that if so, it is only on a few items, 3 of 18.

The distribution of the C3Q in the validation sample was not normally distributed (mean logit 1.4; SD 1.7) as many people had higher cognitive ability than could be measured by these self-report items. This suggests that if there is a need to have a measure of cognitive ability for people with high cognitive ability, direct measurement of cognitive performance is likely needed.

The items identified by the HIV+ group had some overlap with other self-report cognitive questionnaires used for people with neurological disorders: 8 items from the 20-item Perceived Deficits Questionnaire (PDQ) [9] overlapped and 9 of 16 items in the Neuro-Qol Short Forms for executive function and general cognitive concerns [32] overlapped. This would suggest that people with HIV have some unique challenges.

Rasch analysis has been applied to cognitive performance tests previously. Hobart et al. [33] found that 10 of 11 components of the Cognitive Behavior section of the Alzheimer’s Disease Assessment Scale (ADAS-Cog) fit the Rasch model; however, the scale was not accurate enough to discriminate between people based on their cognitive performance. Koski et al. [21] found that the 24 items of the Montreal Cognitive Assessment (MoCA) (a pencil and paper test of cognitive performance) [34] fit the Rasch model but targeted people with HIV poorly, as the items were too easy and were passed by almost all people.

The observation that the C3Q correlated with work productivity (0.6) supports its interpretation as reflecting the cognition needed for work. The low correlation with measured cognitive performance (0.14) is not unexpected, as in general there are low correlations between measured and self-reported behaviors [35].

There are a number of unique features, strengths, and limitations of our approach for developing and testing the C3Q. The development was done on a large sample (n = 703) of people with HIV around the world who provided rich qualitative expressions of their cognitive concerns [15] including qualitative information obtained through secondary analyses of existing interviews [13]. Item validation was done on a separate sample. Recruitment of these samples took advantage of modern technology and used HIV-specific internet sites. This permitted a large sample of people to be accrued in a relatively short period of time, with minimal expense. A limitation of this approach is that little personal information was gathered, to ensure complete anonymity. As a result, no information on HAND was available from this test sample. A strength of this study was the acquisition of data on item responses from an HIV− group, facilitated by recruitment using web-based survey resources (Hosted in Canada Surveys). This was efficient: 484 people were enrolled within hours, at a cost of approximately $4 CAD per person. However, little personal information was obtained other than age, sex, education, and location. In addition, both the HIV+ and HIV− groups were Canadian. It is unlikely that geographical location would affect the response distribution across items. Our analysis of the C3Q items showed that demographic characteristics did not affect the ordering of the items. While we instructed the survey company that we wanted people without major health conditions, it is possible that a few people with HIV were inadvertently included in this group.

Conclusion

This study contributed evidence that the items of the C3Q reflect the everyday cognitive challenges faced by people with HIV. The items aligned hierarchically in a similar way among people with HIV as in general population controls. However, people with HIV reported more challenges than controls of similar age. Often when people voice cognitive concerns, their importance is downplayed and attributed to aging. These data indicate that age alone is not the reason for these concerns among people with HIV. C3Q can help people with HIV communicate their cognitive concerns to their health care team so that mitigating strategies can be put in place. There is evidence that a healthy lifestyle and management of cerebrovascular risk factors can improve cognition [36,37,38]. Including this measure as part of a clinical encounter can help clinicians with the management of this aspect of patient health, which is important for the quality of life [39].