Introduction

In the USA, the Centers for Disease Control and Prevention (CDC) estimates that the rate of HIV infections remained stable in recent years, with 49,273 estimated infections in 2011 [1]. However, the annual number of diagnoses among men who have sex with men (MSM) has increased to represent 65 % of diagnoses [1]. The US National HIV/AIDS Strategy has set goals to reduce new infections, increase access to care and improve health outcomes for those living with HIV, and reduce HIV-related health disparities among most at-risk populations such as MSM [2]. Testing is a key component of this strategy, because those who are unaware of their status cannot enroll in treatment or take enhanced measures to prevent transmission to partners. The CDC estimates that one in five HIV-positive Americans are unaware of their status, and that number may be more than twice as high among MSM [3]. As such, the CDC has expanded its initiatives to increase HIV testing in the USA generally [4] and specifically for MSM [5].

In the 2008 cycle of the National HIV Behavioral Surveillance System in 21 cities, 61 % of HIV-negative MSM had received a test in the prior 12 months and only 10 % had never tested [6]. However, the CDC is now recommending that all MSM get tested for HIV at least once a year, and that sexually active MSM might benefit from HIV testing every 3–6 months [7]. With this increased focus on promoting frequent and repeated HIV testing, it is important to consider the psychological and behavioral ramifications in order to prevent potential iatrogenic effects.

Psychological Reactions to HIV Testing

When testing became widely available in the early 1990s with the advent of oral fluid and rapid antibody testing, research focused on emotional and psychological reactions surrounding the receipt of HIV test results. One of these studies showed high levels of anxiety before receiving test results, with immediate relief among those who tested negative [8]. Another study found that HIV-seronegative men were less likely to correctly anticipate their results (43 % versus 78 % for seropositives), and that over the following year they had lower levels of hopelessness than untested controls [9]. These effects, found before highly active antiretroviral therapy (HAART) was available, were similar to what had been found with testing for fatal diseases without cures [10]. During this time, HIV testing was controversial due to extensive concerns about the confidentiality of test results and the potential for discrimination against those who tested positive [11]. Over time, there has been less research on psychological reactions to testing as a consensus emerged that adverse events after HIV testing (e.g., depression, suicide) are rare and justifiable in light of the individual and public health significance of knowledge of HIV serostatus.

Behavioral Reactions to Testing

In terms of research on behavioral outcomes of HIV testing, most research has sought to determine if those who test positive subsequently decrease transmission risk behaviors. Meta-analyses and recent large-scale studies find reductions in unprotected sex subsequent to receiving an HIV-positive test result, with most recent research in this area focused on heterosexuals in developing countries [12–17]. In terms of behavior change among those who test negative, meta-analyses report either inconsistent or no overall effect on risk behavior subsequent to testing negative [12, 15]. Some research has suggested increased sexual risk taking after testing HIV negative. One study found an increase in gonorrhea incidence 6 months after HIV testing among those who tested HIV negative compared to their pre-test incidence and also compared to those who tested HIV positive [18], suggesting increased engagement in sexual risk behaviors after testing negative.

In discussing research in this area, Helleringer and Reniers [19] raise the important point that there is likely to be substantial heterogeneity in how individuals respond to HIV testing: some may drastically reduce their risk behaviors, others may do so only marginally, and a third group may even increase risk taking. They point out that such heterogeneity could play an important role in sustaining an epidemic because a small number of “superspreaders” who increase their risk could contribute disproportionately to HIV transmission. However, to our knowledge, there is no psychometrically validated measure to assess these reaction patterns, nor research specifically examining this heterogeneity.

Repeated Negative Test Results

Another important consideration is that most prior research has focused on reactions to a single HIV test result, and patterns of reacting may differ upon repeated negative test results. For example, principles of operant conditioning suggest that repeated pairing of a pleasurable behavior (i.e., unprotected sex) with no punishment (i.e., negative test result) could reinforce engagement in such risk behaviors. In fact, some prior research has found repeat negative HIV test results to be correlated with increased HIV risk behaviors among MSM [20–22], with the mechanism hypothesized to be perceived invulnerability to HIV enhanced by feelings of “dodging the bullet” after multiple negative results. Some have even called for strengthened practices to reduce risk behaviors among repeated negative testers, particularly among young MSM [23, 24]. However, the impact of repeated negative test results has not received sufficient research attention given CDC recommendations to scale up frequent testing among sexually active MSM.

Current Study

In this study, we aimed to (1) describe the development and psychometric validation of a scale to measure diverse beliefs and intentions following receipt of a negative HIV test, (2) describe patterns of responding to a negative HIV test, and (3) estimate associations between scale dimensions and risky sexual behavior. If validated and if dimensions are associated with HIV risk behaviors, such a tool would be useful in testing settings to identify individuals who may be inclined to increase their HIV risk behaviors subsequent to a negative result and provide them with tailored risk reduction counseling.

Methods

Development of the Inventory of Reactions to Testing HIV Negative

Online synchronous chat room-based focus groups were used as formative research for the development of the Inventory of Reactions to Testing HIV Negative. The use of online focus groups has gained increasing acceptance in marketing, health, and education research due to their efficiency and ability to allow anonymous participation [25, 26]. The approach is particularly useful for studying stigmatized populations as it eliminates the need to meet with strangers in an unfamiliar location [26]. Comparisons of synchronous online and offline focus groups indicate both produce similar quantity and quality of data [27], but the online approach is characterized by greater dynamism and immediacy [25].

Two focus groups (N = 9) were conducted with MSM who were recruited at HIV testing clinics upon receiving an HIV-negative test result. Participants were asked (1) to describe the first experience of receiving an HIV test, how they felt afterwards, and how it influenced their later sexual behavior; (2) similar questions about their most recent HIV test; (3) what it means to get a negative test result; (4) why some individuals may engage in risk behaviors and then test negative repeatedly; and (5) what it means to receive multiple negative test results and how that might affect behavior.

Transcripts of focus groups were reviewed, and then distinct comments were identified using the constant comparison method in order to identify the range of personal and hypothetical reactions to testing negative described by participants [28]. The list of reactions obtained from the focus groups was then distilled into a set of items that asked both about reactions to a negative test result and reactions to multiple negative test results. These items were shared with 10 researchers and project staff with expertise and experience with HIV testing and counseling, who suggested item refinements.

The final version of the scale consisted of a total of 16 items which began with one of three stems: “A negative HIV test means …,” “After a negative HIV test result, I feel …,” or “The more times I test negative for HIV ….” The full text of all 16 items is presented within Table 2. The first 11 items contained the following instructions: “For each statement below, please tell us how much you agree or disagree about the effects of HIV testing on your health beliefs and sexual behavior.” The final five items contained the following instructions: “The following statements are about your feelings as a result of receiving more than one negative HIV test result in your lifetime.” Response options ranged from 1 (strongly disagree) to 5 (strongly agree).

Quantitative Procedures and Participants

As a recruitment technique for several large research projects focused on MSM, we utilized a brief, preliminary screening survey advertised on Grindr (a geospatial smartphone application for MSM to meet). For 3 days in May 2013, we advertised on Grindr, providing two methods for users to access our survey: (1) a pop-up ad with text encouraging users to click through to take our survey, and (2) a banner ad shown to users while they were logged on to the application. The pop-up ad was shown to users the first time they logged on to the application within a 24-h period, and it was displayed on three separate days for 24 h each. The banner ad was shown for an entire 24-h period coinciding with the first 24-h period of pop-up advertisement. Both the pop-up ads and the banner were shown only to Grindr users who logged on to their account in the NYC area. Although there was no incentive for participants to take our survey, they were informed that the survey would screen them for other studies for which they would be compensated if they were eligible and joined. The survey was conducted using Qualtrics and took approximately 4 min to complete. All men were at least 18 years of age, and relevant procedures were approved by the institutional review boards of the researchers’ institutions.

Deduplication

In total, the survey was opened 4,556 times by 3,490 unique IP addresses. In 2,815 (61.8 %) of cases, one survey was started per IP address and these were deemed unique and valid respondents. The remaining 1,741 started surveys resulted from only 675 IP addresses. From these 1,741 surveys, we retained one incomplete survey and removed 1,035 cases of incomplete data resulting from duplicate IP addresses (i.e., the same individual opened a survey more than once but never completed one) in order to get an accurate sense of the number of individuals who opened a survey. We only identified 25 IP addresses that were used to fully complete more than one survey. In all cases, we carefully screened the patterns of completion and responses to identify valid responses. We identified potentially duplicate cases based on shared zip codes and other demographic features, time to complete the survey, and length of time between completed surveys. Because this was an unpaid survey, we were conservative in our decisions to delete completed cases and considered the possibilities of cohabiting couples and people using shared WiFi networks and deleted only those completed cases that raised suspicion based on multiple instances of shared demographics (e.g., zip code, age), multiple completed surveys within a short time period, or successive attempts to change responses to one or two questions which might be thought to influence subsequent study eligibility. In total, we removed 10 completed responses that were suspected of being duplicates based on shared IP address, zip code, and demographic information. This resulted in removing a total of 1,045 of the 1,741 cases from repeat IP addresses, retaining 696 cases in addition to the 2,815 that originated from unique IP addresses.
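The counts in this paragraph can be reconciled with a few lines of arithmetic. The sketch below uses only the figures reported above; the variable names are ours and are purely illustrative.

```python
# Counts reported in the Deduplication paragraph.
surveys_opened = 4556
unique_ips = 3490
single_survey_ips = 2815          # one started survey per IP address
removed_incomplete = 1035         # incomplete surveys from duplicate IPs
removed_completed = 10            # completed surveys judged to be duplicates

# Surveys and IP addresses left over after the single-survey IPs.
repeat_ip_surveys = surveys_opened - single_survey_ips
repeat_ips = unique_ips - single_survey_ips

# Cases removed and retained among the repeat-IP surveys.
removed_total = removed_incomplete + removed_completed
retained_repeat = repeat_ip_surveys - removed_total
retained_total = single_survey_ips + retained_repeat

share_single = round(100 * single_survey_ips / surveys_opened, 1)

print(repeat_ip_surveys, repeat_ips, removed_total, retained_repeat, retained_total)
print(share_single)
```

Each derived value matches the corresponding figure in the text (1,741 surveys from 675 IP addresses, 1,045 removed, 696 retained, 3,511 in total).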

Data Cleaning

Removing the duplicate responses resulted in a dataset of 3,511 started surveys that were believed to be unique responses, from which 1,701 (48.4 %) individuals proceeded through the first page and provided informed consent. One participant who provided consent was skipped to the end of the survey after reporting being 17 years of age. In total, 1,198 (72.7 %) of the participants who provided informed consent reached the end of the survey. From these, we removed two individuals who identified as female, two individuals who identified as transgender, three individuals who identified as straight, 198 individuals who were HIV positive, and three individuals who had missing data. Because of the nature of the scale being investigated in these analyses, only those who had received at least one HIV test in their lifetime were asked to respond to the scale and only those who had received at least two tests in their lifetime were asked to respond to the final five items about repeat testing. As such, 82 individuals who had never received an HIV test, 134 individuals who had not been tested within the past year, and 49 individuals who had only been tested once in their lifetime were excluded from analyses as well. This resulted in a final analytic dataset of 725 HIV-negative and unknown HIV status MSM who had been tested within the past year and more than once in their lives.
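The exclusion steps applied to completed surveys can be tallied as a simple flow. This is a bookkeeping sketch that uses only the counts reported above; the labels are shorthand of our own.

```python
# Completed surveys and the exclusions listed in the Data Cleaning paragraph.
completed = 1198
exclusions = [
    ("identified as female", 2),
    ("identified as transgender", 2),
    ("identified as straight", 3),
    ("HIV positive", 198),
    ("missing data", 3),
    ("never received an HIV test", 82),
    ("not tested within the past year", 134),
    ("tested only once in lifetime", 49),
]

remaining = completed
for reason, count in exclusions:
    remaining -= count
    print(f"{reason:<35} -{count:>4}  remaining: {remaining}")

final_analytic_sample = remaining
```

The running total ends at the final analytic sample size reported in the text.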

Measures

Demographic Characteristics

Participants were asked to report their age, relationship status, sexual identity, race, ethnicity, and HIV status. Table 1 contains a list of the response options for each of the demographic characteristics.

Table 1 Demographic and behavioral characteristics of the full sample and differences in newly developed subscales

HIV Testing Behavior

All participants who responded that they were HIV negative or of unknown status were asked a series of follow-up questions. Participants were asked how long ago they received their last test, with options of “within the last 3 months,” “3–6 months ago,” “6–12 months ago,” “more than 12 months ago (1 year),” and “never.” Additionally, participants were asked how many HIV-negative test results they had received in their lifetime, with response options of 1, 2–4, 5–9, 10–14, and 15 or more. As mentioned previously, those participants who indicated they had only received one test result in their lifetime were excluded from further analyses.

Sexual Behavior

All participants who had previously indicated being in a relationship were asked how many times they had anal insertive and receptive sex with and without using condoms with their main partner with four free numerical response questions (each corresponding to a different combination of sexual position and condom use). Participants with main partners were also asked to report the HIV status of their main partners. Participants were next asked to report how many casual male sexual partners they had within the prior 3 months, with sex defined as “any sexual contact that could lead to an orgasm.” All participants who wrote in a value greater than zero were asked a series of follow-up questions—participants were asked “how many of these partners told you they were the same HIV status as you?” and “how many of these partners did not tell you their HIV status or told you they were a different HIV status than you?” Participants were then asked to separately report on their behavior with seroconcordant and serodiscordant partners with questions capturing both insertive and receptive anal sex both with and without condoms.

Inventory of Reactions to Testing HIV Negative

The newly developed scale was described above. For this study, items asking about reactions to testing HIV negative were displayed only to those who had been tested within the past year. Items asking about reactions to multiple negative tests were only administered to those who had received more than one test in their lifetime.

Analytic Plan

We began by examining basic demographic and behavioral characteristics of the sample as a whole. We next utilized SPSS version 20 to split the dataset of 725 individuals into two random subsamples of approximately equal size. This resulted in one subsample (subsample 1) of 360 individuals and a second subsample (subsample 2) of 365 individuals. We then utilized subsample 1 to conduct an exploratory factor analysis (EFA) of the newly developed Inventory of Reactions to Testing HIV Negative in Mplus version 7.11 with the default Geomin oblique rotation and maximum likelihood estimation. We requested that the software display output for models with one to six factors and utilized Cattell’s scree test as well as a comparison of standard fit indices such as the likelihood-based information criteria (i.e., AIC, BIC, ABIC) and residual-based fit indices (i.e., RMSEA and SRMR) to select the best-fitting model. We utilized item factor loadings exceeding 0.40 as evidence of a meaningful item contribution to a factor. Upon selecting the best-fitting model and examining the item loadings, we removed all items that did not load onto any factors or cross-loaded onto multiple factors and ran a second EFA to examine fit of the chosen factor solution without the poorly fitting items.
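The random split and the item-retention rule described above can be sketched as follows. This is a minimal illustration and not the SPSS or Mplus procedure itself: the function names, the seed, and the example loadings are hypothetical, and the sketch assumes each case is assigned to a subsample with probability 0.5, which yields groups of approximately (not exactly) equal size.

```python
import random

def split_sample(case_ids, seed=0):
    """Assign each case to one of two subsamples with probability 0.5."""
    rng = random.Random(seed)
    first, second = [], []
    for case_id in case_ids:
        (first if rng.random() < 0.5 else second).append(case_id)
    return first, second

def classify_item(loadings, cutoff=0.40):
    """Return the index of the single factor an item loads on.

    Returns None when the item loads on no factor or cross-loads on
    more than one factor; such items would be dropped before the second EFA.
    """
    salient = [i for i, value in enumerate(loadings) if abs(value) >= cutoff]
    return salient[0] if len(salient) == 1 else None

subsample_1, subsample_2 = split_sample(range(725))

# Hypothetical loadings on three factors, for illustration only.
print(classify_item([0.62, 0.05, -0.11]))  # loads on the first factor only
print(classify_item([0.31, 0.22, 0.18]))   # no salient loading
print(classify_item([0.45, 0.48, 0.02]))   # cross-loading
```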

After establishing a good-fitting model from EFA, we utilized subsample 2 to conduct a confirmatory factor analysis (CFA), again using Mplus version 7.11. We fit a CFA based on the results of the EFA within subsample 1, and we examined standard indicators of model fit [29–36], which included comparative fit index (CFI) greater than 0.95, root mean square error of approximation (RMSEA) less than 0.06, Tucker Lewis index (TLI) greater than 0.95, and standardized root mean square residual (SRMR) less than 0.08. As is standard in the practice of CFA [37], we also examined the modification indices to explore any potential sources of misfit in the model and then fit a final model based on theoretical assumptions and in consultation with the statistical evidence.
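The four cutoffs listed above can be written as a small lookup. The sketch below is illustrative only: the function name is ours, the example values are hypothetical rather than results from this study, and such cutoffs are generally treated as guidelines rather than strict pass/fail tests.

```python
# Cutoffs named in the analytic plan: direction of the comparison and threshold.
CUTOFFS = {
    "CFI": (">", 0.95),
    "TLI": (">", 0.95),
    "RMSEA": ("<", 0.06),
    "SRMR": ("<", 0.08),
}

def check_fit(indices):
    """Compare each fit index with its cutoff and return a dict of booleans."""
    results = {}
    for name, (direction, cutoff) in CUTOFFS.items():
        value = indices[name]
        results[name] = value > cutoff if direction == ">" else value < cutoff
    return results

# Hypothetical fit indices, for illustration only.
hypothetical = {"CFI": 0.97, "TLI": 0.96, "RMSEA": 0.05, "SRMR": 0.04}
print(check_fit(hypothetical))
```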

For the final set of analyses, we utilized the subscale structure identified within the final model to calculate subscale scores by averaging across subscale items for the full sample (N = 725). We calculated internal consistency (i.e., Cronbach’s α) statistics for each of the proposed subscales. We followed this by utilizing one-way analysis of variance (ANOVA) to examine demographic and behavioral differences in subscale scores. The final set of analyses was a series of negative binomial regressions conducted in SPSS version 20 to predict the number of unprotected anal intercourse (UAI) acts with casual partners. Five men had missing data as a result of software error and 37 men had no sexual activity with casual partners, making the analytic sample a total of 683 men. We overrode the SPSS default, which fixes the negative binomial dispersion parameter at 1.0, and allowed the parameter to be freely estimated. We then conducted models separately examining the unique role of each subscale score. We also ran separate models testing the extent to which the subscale scores’ impact on UAI was moderated by (a) whether or not participants had received an HIV test in the prior 3 months and (b) the number of HIV tests participants had received in their lifetime. Within each model, we adjusted for potentially confounding demographic variables based on bivariate associations significant at p < .05. We interpreted significant interactions by plotting subscale scores at ±1.5 standard deviations.
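Probing an interaction at ±1.5 standard deviations amounts to evaluating the model-implied rate at low, average, and high subscale scores within each level of the moderator. The sketch below assumes a log-link count model with a single subscale score, a binary moderator, and their product; the coefficients, mean, and standard deviation are hypothetical values chosen only for illustration, not estimates from this study.

```python
import math

def probe_points(mean_score, sd_score, width=1.5):
    """Subscale scores at the mean and at +/- width standard deviations."""
    return {
        "low": mean_score - width * sd_score,
        "average": mean_score,
        "high": mean_score + width * sd_score,
    }

def predicted_rate(score, moderator, coef):
    """Expected count under a log link: exp of the linear predictor."""
    linear = (
        coef["intercept"]
        + coef["score"] * score
        + coef["moderator"] * moderator
        + coef["interaction"] * score * moderator
    )
    return math.exp(linear)

# Hypothetical coefficients and descriptives, for illustration only.
coef = {"intercept": 1.0, "score": -0.5, "moderator": 2.0, "interaction": -0.5}
points = probe_points(mean_score=4.0, sd_score=0.6)

for moderator in (0, 1):
    for label, score in points.items():
        rate = predicted_rate(score, moderator, coef)
        print(f"moderator={moderator} {label:<8} score={score:.2f} rate={rate:.2f}")
```

Plotting these six predicted rates, three per moderator level, gives the kind of interaction display described in the text.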

Results

As can be seen in Table 1, the sample was diverse with regard to race/ethnicity, with more than half of the sample being men of color. A majority (59.7 %) was aged 18–29, single (77.0 %), gay-identified (88.3 %), and reported an HIV-negative (versus unknown) status (94.6 %). More than half (53.7 %) of the sample had received an HIV test in the prior 3 months. There was variability in the number of HIV-negative test results men had received in their lifetime, with approximately one-third having received 2–4 tests (36.4 %) or 5–9 tests (34.1 %). Additionally, a majority (54.1 %) had engaged in recent (past 3 months) UAI with a casual or main partner.

The results of the initial and final exploratory factor analyses are presented in Table 2. After examining the scree plot and fit indices for the models ranging from 1 to 6 factors in the first EFA, we determined that a three-factor solution was the optimal fit to the data. Both the likelihood-based statistics (i.e., AIC, BIC, ABIC) and the residual-based fit indices (i.e., RMSEA, SRMR) showed the most substantial increase in fit (evidenced by a decline in these indices) when going from two to three factors. These trends mirrored those in the scree plot, with all indicators other than the chi-squared statistic suggesting that there was little further improvement in fit by adding factors beyond the third, and this was evidenced in the interpretability of the factors, as well.

Table 2 Results of two exploratory factor analyses (EFA) with subsample 1 (n = 360)

As can be seen in Table 2, we found that all but four items (items 2, 8, 10, and 11) met our criteria for a meaningful factor loading (i.e., ≥0.40) and that none demonstrated evidence for cross-loading onto multiple factors. As such, we removed the four items that did not load onto any factor and re-ran the EFA to ensure that we achieved the same pattern of item loadings. As can be seen in the results for the second EFA in Table 2, all items maintained their original factor loadings and there were minimal changes in the magnitude of these coefficients. The first factor accounted for 27.5 % of the variance and contained items 1, 3, 4, 7, and 9, which focused on HIV testing as reinforcement of decisions to use safe sex practices; this was labeled the “Reinforced Safety” subscale. The second factor accounted for an additional 17.8 % of the variance and contained items 5, 6, and 16; these items focused on HIV testing as reinforcement of the luck involved in remaining HIV negative and the factor was labeled the “Luck” subscale. The third factor accounted for an additional 14.4 % of the variance (total cumulative variance explained = 59.7 %) and contained items 12 through 15; these items focused on the role of HIV testing in reducing perceptions of personal HIV risk and the factor was labeled the “Invulnerability” subscale.

We next conducted a CFA with subsample 2 based on the structure identified in the final EFA conducted on subsample 1. A majority of the model fit indices suggested adequate to good fit (RMSEA = 0.071, CFI = 0.942, TLI = 0.925, SRMR = 0.076) with the exception of the chi-squared statistic, which is known to be sensitive to large sample sizes, χ 2(51) = 145.65, p < .001. After examining the modification indices, we found strong evidence that item 7 shared variance with all three factors (i.e., cross-loaded) and subsequently removed this item and re-ran the CFA. Upon removal of this item, model fit indices improved and suggested good fit (RMSEA = 0.061, CFI = 0.963, TLI = 0.950, SRMR = 0.064) with the exception of the chi-squared statistic, χ 2(41) = 96.53, p < .001. The standardized results of the final CFA are presented within Fig. 1. As can be seen in the figure, the Reinforced Safety and Invulnerability subscales had a significant negative correlation (r = −0.41, p < .001), while the Luck subscale was uncorrelated with both Reinforced Safety and Invulnerability. These results suggest that the seemingly opposite subscales of Reinforced Safety and Invulnerability share less than 17 % of their variance and may co-occur in some individuals. Using a median-split on each of the variables, we found that 26 % of participants scored above the median on both the Reinforced Safety and Invulnerability subscales while 60 % were above the median on one and not the other and 14 % scored below the median on both. Further, a median split on all three subscales revealed that 56.5 % of the sample scored above the median on at least two of the three subscales, with 13.7 % scoring above the median on all three subscales.
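The degrees of freedom of the two chi-squared tests and the shared-variance figure can be reproduced from the model structure. The sketch assumes a simple-structure CFA in which each item loads on exactly one factor, all factors are allowed to correlate, and residuals are uncorrelated; the function name is ours.

```python
def cfa_degrees_of_freedom(n_items, n_factors):
    """Degrees of freedom for a simple-structure CFA with correlated factors.

    Unique variances and covariances among the items, minus the free
    parameters: one loading and one residual variance per item, plus one
    correlation per pair of factors (factor variances fixed for scaling).
    """
    observed_moments = n_items * (n_items + 1) // 2
    free_parameters = 2 * n_items + n_factors * (n_factors - 1) // 2
    return observed_moments - free_parameters

print(cfa_degrees_of_freedom(12, 3))  # initial model with 12 items
print(cfa_degrees_of_freedom(11, 3))  # final model after removing item 7

# Shared variance implied by the factor correlation r = -0.41.
shared_variance = (-0.41) ** 2
print(round(100 * shared_variance, 1))
```

Under these assumptions the counts agree with the reported χ2(51) and χ2(41), and the squared correlation is just under 17 %.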

Fig. 1

This figure graphically presents the standardized results of the final confirmatory factor analysis model with subsample 2 (n = 365). From left to right, the numbers represent the residual variance for each item, the factor loading for each item, the fixed variance of each factor to 1.0 (only done in the standardized version of the model), and the factor correlations with each other

We next sought to explore the association between the newly developed subscales and demographic and behavioral characteristics of the sample. After creating average item scores for each subscale, we found evidence of good internal consistency for the Reinforced Safety (α = 0.74; M = 4.17, SD = 0.66), Invulnerability (α = 0.79; M = 1.67, SD = 0.69), and Luck (α = 0.70; M = 3.19, SD = 0.91) subscales, with significantly higher mean scores for Reinforced Safety than Luck or Invulnerability, and significantly higher mean scores for Luck than Invulnerability (all p values <.001). As can be seen in Table 1, we found significant age differences for the Invulnerability and Luck subscales. Post hoc analyses with LSD adjustment revealed that those aged 40–49 scored significantly higher on the Invulnerability subscale (p ≤ .01 for both comparisons) and significantly lower on the Luck subscale (p ≤ .001 for both comparisons) than those aged 18 to 29 and those aged 30 to 39. Relationship status differences emerged; single men scored higher on the Reinforced Safety subscale and lower on the Invulnerability subscale than men in relationships. We found no differences with regard to sexual orientation or race/ethnicity. We found that men who identified as HIV negative scored lower on the Invulnerability and Luck subscales than those who identified as status unknown. With regard to recent testing among this sample of men who had all tested within the past year, we found differences on the Invulnerability subscale and post hoc analyses revealed that those who had tested within the past 3 months had significantly lower Invulnerability scores than those who had tested 6–12 months ago (p ≤ .01). We found no differences with regard to the number of tests participants had received in their lifetime.
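Subscale scoring and internal consistency can be sketched as follows. The item-to-subscale mapping follows the final model reported above (item 7 removed); the function names and the example ratings are hypothetical, and the alpha formula is the standard one based on item and total-score variances.

```python
from statistics import mean, variance

# Item numbers per subscale in the final model (item 7 removed after the CFA).
SUBSCALE_ITEMS = {
    "Reinforced Safety": [1, 3, 4, 9],
    "Luck": [5, 6, 16],
    "Invulnerability": [12, 13, 14, 15],
}

def subscale_scores(responses):
    """Average one respondent's 1-5 ratings within each subscale.

    responses maps item number to rating.
    """
    return {
        name: mean(responses[item] for item in items)
        for name, items in SUBSCALE_ITEMS.items()
    }

def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item ratings per respondent."""
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical respondent, for illustration only.
example = {1: 5, 3: 4, 4: 5, 9: 4, 5: 3, 6: 3, 16: 3, 12: 1, 13: 2, 14: 1, 15: 2}
print(subscale_scores(example))
```

Scoring each subscale separately, rather than summing all items, matches how the subscales are analyzed in the text.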

In the final set of analyses, we sought to explore the association of the newly developed subscales with HIV risk behavior and the extent to which these associations may be modified by HIV testing behavior. Based on the bivariate associations found in earlier analyses, we adjusted for the role of age (as a continuous variable), self-reported HIV status (1 = negative, 0 = unknown), and relationship status (1 = partnered, 0 = single). In the first three analyses, we examined the main effects of the three subscales in separate regressions adjusting for the previously mentioned variables and found that higher scores on the Reinforced Safety subscale were associated with a significant decrease in the rate of UAI with casual partners (Adj. RR = 0.36, 95 % CI[0.29, 0.45], p < .001), while increases in the Invulnerability (Adj. RR = 1.49, 95 %CI[1.20, 1.85], p < .001) and the Luck (Adj. RR = 1.23, 95 %CI[1.02, 1.47], p = .026) subscales were associated with an increased rate of UAI with casual partners in the prior 3 months. It is worth noting that, when entered simultaneously in a multivariable model, only the Reinforced Safety subscale maintained a significant main effect (Adj. RR = 0.38, p < .001) while the Invulnerability and Luck subscales became non-significant (Adj. RR = 1.07, p = .55; Adj. RR = 1.11, p = .22, respectively).
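An adjusted rate ratio from a negative binomial model is the exponentiated regression coefficient, so it can be read as a multiplicative change in the expected number of UAI acts. The sketch below converts the three single-subscale rate ratios reported above into percent changes, assuming each ratio refers to a one-point increase in the subscale score (the unit is not stated explicitly in the text); the function name is ours.

```python
import math

def percent_change(rate_ratio):
    """Percent change in the expected rate implied by a rate ratio."""
    return (rate_ratio - 1.0) * 100.0

# Adjusted rate ratios reported for the separate single-subscale models.
reported = {
    "Reinforced Safety": 0.36,
    "Invulnerability": 1.49,
    "Luck": 1.23,
}

for name, rate_ratio in reported.items():
    coefficient = math.log(rate_ratio)  # coefficient on the log-rate scale
    change = percent_change(rate_ratio)
    print(f"{name}: b = {coefficient:+.2f}, change in rate = {change:+.0f} %")
```

Read this way, a rate ratio of 0.36 corresponds to a 64 % lower expected rate, and ratios of 1.49 and 1.23 correspond to 49 % and 23 % higher expected rates, respectively.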

The next series of regressions, in which we tested the moderating role of recent (i.e., past 3 months) testing on subscale scores’ influence on UAI with casual male partners (in the prior 3 months), is presented in the upper portion of Table 3. In model 1a, the Reinforced Safety subscale retained a significant main effect and had a significant interaction with recent testing (past 3 months), though recent testing did not show evidence of a significant main effect. As can be seen in Fig. 2, average and high scores on the Reinforced Safety subscale appear protective against the otherwise increased rate of UAI in the prior 3 months after a recent (past 3 months) HIV test among those low in Reinforced Safety. In the models with the Invulnerability (2a) and Luck subscales (3a), neither subscale had a significant main effect, though recent testing emerged as a significant predictor and the interaction was marginally significant in both models. Though not significant, these interactions suggest the opposite trend to that of the Reinforced Safety subscale, with recent testing combined with high scores on the Invulnerability and Luck subscales associated with higher rates of UAI with casual partners in the prior 3 months.

Table 3 Negative binomial regressions of the moderating effect of HIV testing behavior on subscale scores’ influence on unprotected anal intercourse with casual male partners

Fig. 2

This figure plots the interaction between the Reinforced Safety subscale and recent testing in predicting unprotected anal intercourse (UAI) with casual partners. As can be seen, for those who had a recent test, low scores on the Reinforced Safety subscale were associated with substantially higher rates of UAI. Average and high scores on the Reinforced Safety subscale appear to buffer against the effect of recent testing on UAI with casual partners

The final three regression models were similar to the prior models with the exception that we were testing the role of the number of negative tests men had received in their lifetime as a moderator of the subscales’ associations with UAI with casual partners in the prior 3 months. The results of these regressions are displayed in the lower portion of Table 3, and the demographic covariates displayed similar trends as in the prior models. Unlike the prior models, all three subscale scores maintained significant main effects in their respective models, as did the role of receiving five or more tests. The interaction of these two variables was non-significant for the Reinforced Safety (model 1b) and Invulnerability (model 2b) subscales, but was significant for the Luck (model 3b) subscale and is plotted in Fig. 3. As can be seen in the figure, men who have received only two to four tests in their lives appear to have substantially higher rates of UAI in the prior 3 months as their scores on the Luck subscale increase, compared with men who have had five or more negative tests, who appear to have higher rates of UAI in the prior 3 months irrespective of their scores on the Luck subscale.

Fig. 3

This figure plots the interaction between the Luck subscale and number of lifetime HIV-negative tests predicting unprotected anal intercourse (UAI) with casual partners. As can be seen, men who had received five or more tests had high rates of UAI with casual partners regardless of their scores on the Luck subscale. However, for those who had received two to four tests in their lives, higher scores on the Luck subscale were associated with substantially higher rates of UAI with casual partners

Discussion

Enhanced efforts have recently been initiated in the USA to increase the frequency of HIV testing among sexually active MSM, with recommendations of testing as often as every 3–6 months [7]. Despite hypotheses of heterogeneity in how individuals react to testing HIV negative [19], including the possibility that some individuals may increase their risk behaviors upon multiple negative test results [19, 24], and some epidemiological evidence of such increases [20–22], there has been little research directly examining this issue. The current study reports the development and psychometric evaluation of a novel measure of reactions to testing HIV negative. Importantly, we demonstrate that there are diverse responses to negative test results beyond simply relief, and that the type of response was significantly related to engagement in HIV risk behaviors, particularly for some subscales among those who had tested recently or more frequently.

Exploratory and confirmatory factor analyses were performed to establish the factor structure and subscales of the newly developed Inventory of Reactions to Testing HIV Negative. Three subscales emerged from these analyses. The subscale with items most commonly endorsed was Reinforced Safety, with a mean score in the “agree” to “strongly agree” range. Items on this subscale represented a belief that testing negative reinforces past decisions to have safer sex and to continue safer sex in the future. Items less frequently endorsed (mean scores in the “neutral” to “disagree” range) formed two additional scales. The Luck subscale included items endorsing that testing negative was luck or represented “dodging a bullet.” The Invulnerability subscale included items indicating that multiple negative test results produce feelings of immunity or difficulty in becoming HIV infected. The Reinforced Safety and Invulnerability subscales were moderately negatively correlated, suggesting that individuals who felt that testing negative meant safer sex behaviors were wise were somewhat less likely to indicate that it also meant they were invulnerable or at low risk for HIV. Neither of these subscales was correlated with the Luck subscale, indicating that these dimensions were unrelated. The dimensions identified in the exploratory and confirmatory factor analyses and their interrelationships triangulate well with how participants described their reactions to testing negative during the online focus groups: some participants made statements that reflected endorsement of each of the subscales (e.g., “Yes, before I did something risky, a negative test was more about validating my safe sex behaviors and giving me that peace of mind. After the risky behavior, I felt like it just showed me how lucky I was.”). Given the factor structure with relatively orthogonal dimensions, we recommend that in future uses of the Inventory of Reactions to Testing HIV Negative, each subscale be scored and analyzed separately rather than combined into a total score.

Several significant differences emerged in the scales based on individual characteristics, particularly with regard to age and relationship status. Older participants scored higher on the Invulnerability subscale and lower on the Luck subscale than younger participants. Keeping in mind that the entire sample reported an HIV-negative or unknown status, this result likely reflects the fact that older participants had the potential for several decades of sexual activity (including during the height of HIV incidence in the 1980s) without receiving an HIV-positive test result. As such, it is unsurprising that they may perceive negative test results as reflecting putative invulnerability rather than chance. At the same time, one might hypothesize that young adults would be more likely to endorse these beliefs, given that adolescence is a period of particular susceptibility to the personal fable that others may suffer consequences of risky actions but not oneself [38, 39]. Such a hypothesis would appear to be refuted by the data. Single men scored higher on the Reinforced Safety subscale and lower on the Invulnerability subscale than men in relationships, which likely reflects the fact that single men, particularly young MSM, tend to have less unprotected sex than men in relationships [40, 41] and therefore see testing negative as reinforcement for their use of condoms during sex in casual relationships.

In terms of frequency and recency of testing in this sample of MSM who had all tested within the past year, we found that those who had tested within the past 3 months had significantly lower Invulnerability scores than those who had tested 6–12 months ago, but no differences were found in other subscales. Furthermore, there were no differences in scale scores based on the number of lifetime tests. In general, this suggests that beliefs about the meaning of testing negative may be more stable characteristics of the individual rather than beliefs that fluctuate as time from the most recent test increases or with more testing. Longitudinal data will be required to test this hypothesis about the stability of these subscales.

Perhaps the most important finding from this study is the association between subscale scores and HIV risk behaviors. The Reinforced Safety subscale had the highest mean agreement ratings and also showed the largest association with the rate of UAI (Adj. RR = 0.36). This scale also had a significant interaction with recent testing: among those who scored low on this scale, recent testers had a much higher rate of UAI than those who had not recently tested. This pattern suggests that it is protective to hold the belief that testing negative reinforces and confirms the value of safer sex practices. However, individuals who believe testing negative is not a reward for safe behaviors and does not support future condom use are at particular risk for UAI in the period shortly after testing. Given the high mean for this scale, such individuals are likely to be relatively rare in the population of testers, but they are good candidates for enhanced counseling that reinforces the association between safer sex practices and staying HIV negative.

The Invulnerability subscale was positively associated with the rate of UAI, but the effect size was smaller than for Reinforced Safety (Adj. RR = 1.49). This subscale showed no interaction with recency or frequency of HIV testing, which is somewhat surprising because it seems intuitive that the more negative test results one receives the more one may begin to suspect innate resistance to HIV infection. While further research should explore the source of these beliefs, it is clear that endorsing them is associated with increased engagement in HIV risk behaviors. Given that instilling a sense of personal vulnerability is a core element in many key theoretical models of HIV risk behavior change and their associated interventions (e.g., the information-motivation-behavioral skills model; [42]), delivering such interventions upon testing negative may be a particularly promising approach to HIV prevention (e.g., [24]).

The subscale with the smallest main effect on UAI was the Luck subscale, with individuals who endorse stronger feelings that testing negative was pure chance reporting more UAI (Adj. RR = 1.23). This subscale showed a significant interaction with the number of prior tests. MSM who had tested more in the past had a higher overall rate of UAI, but the rate of UAI varied considerably among those who had tested less frequently depending on their score on the Luck subscale. Those who tested less frequently but strongly endorsed the belief that testing negative was due to chance had a high rate of UAI similar to those who had tested frequently. However, those who tested less frequently and had low scores on the Luck subscale had low rates of UAI. This pattern suggests that beliefs about testing negative being up to chance are most important early in one’s history of HIV testing. As such, test counselors may find it valuable to ask clients who are relatively new to testing (<5 prior tests) about these beliefs, and if they are endorsed, then provide further counseling to instill a sense of agency in maintaining an HIV negative status.

There are a number of implications of this study for the practice of HIV testing and counseling to reduce future infection among those who test negative. First, this brief 11-item scale may be useful to administer to clients in order to identify the subset that may increase engagement in HIV risk taking secondary to their negative test result. In particular, clients who score low on Reinforced Safety (a mean score of 3 “neutral” or lower will be ∼10 % of MSM based on the current study), high on Invulnerability (a 2 “disagree” or 3 “neutral” on most items), and high on Luck (a 4 “agree” or 5 “strongly agree” on most items) may be candidates for enhanced counseling. If administering this brief questionnaire is infeasible in the clinic context, counseling staff may consider incorporating verbal questions about these beliefs during their standard counseling session. Given recent findings that risk reduction counseling does not reduce risk of acquiring future STIs [43], such resource-intensive counseling may become less frequent. Future research should examine whether such counseling may have efficacy among individuals who believe testing negative is unrelated to their protective behaviors and is instead due to luck or invulnerability. Such results may support efficient use of public health resources by utilizing a selective rather than universal prevention strategy for risk reduction counseling. The implications of our findings may extend beyond testing for HIV to other infectious diseases and to non-infectious but preventable conditions for which testing is possible (e.g., poor diet and cholesterol or blood sugar levels). Future research should explore heterogeneity in reactions to a range of test results and implications for future engagement in health behaviors.
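As an illustration only, the screening thresholds described above could be applied to a client’s responses along the following lines. This is a minimal sketch, not a validated scoring procedure: the function name, the dictionary keys, the split of the 11 items across subscales in the example, and the reading of “most items” as strictly more than half are all assumptions made here for illustration. The flags are reported per subscale and are deliberately not combined into a single decision.

```python
from statistics import mean


def screening_flags(responses):
    """Return one flag per subscale from 1-5 Likert ratings.

    `responses` maps a subscale name to a list of item ratings
    (1 = strongly disagree ... 5 = strongly agree). The keys and the
    item-to-subscale assignment are hypothetical.
    """
    def most_at_least(items, threshold):
        # Assumption: "most items" means strictly more than half.
        return sum(r >= threshold for r in items) > len(items) / 2

    return {
        # Low Reinforced Safety: mean of 3 ("neutral") or lower.
        "low_reinforced_safety": mean(responses["reinforced_safety"]) <= 3,
        # High Invulnerability: 2 ("disagree") or above on most items.
        "high_invulnerability": most_at_least(responses["invulnerability"], 2),
        # High Luck: 4 ("agree") or above on most items.
        "high_luck": most_at_least(responses["luck"], 4),
    }


# Hypothetical client; the 4/4/3 item split is illustrative only.
example = {
    "reinforced_safety": [4, 5, 4, 4],
    "invulnerability": [1, 1, 2, 1],
    "luck": [4, 5, 3],
}
flags = screening_flags(example)
```

For this hypothetical client, only the Luck flag is raised, which under the reasoning above would prompt a counselor to ask about beliefs that testing negative is a matter of chance.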

The results of this study must be considered in light of several limitations. First, the two online focus groups involved only a small number of MSM (N = 9). For the circumscribed topics that we were addressing (e.g., “how do you feel after receiving a negative HIV test result?”), we felt this provided sufficient feedback to inform questionnaire development when coupled with our own experience in this area. However, more participants might have provided a greater diversity of opinions and reactions. Second, this was an online convenience sample of Grindr-using MSM in an urban area who completed a brief online questionnaire. The scale scores and associations with UAI may not be generalizable to the larger population of MSM, particularly to MSM outside of urban areas or who do not use geospatial smartphone apps to meet other MSM. Third, the study was cross-sectional and non-experimental; therefore, we cannot make strong inferences about the direction of, or causal relationships between, variables. Despite these limitations, this study makes an important original contribution to our understanding of how MSM react to HIV testing and raises some cautionary implications as we move into an era of increased prioritization of frequent and regular HIV testing among sexually active MSM.