Introduction

In 2020, an estimated 43% of new HIV infections diagnosed in West and Central Africa were reported in Nigeria [1]. This, coupled with the highest rates of new infections in countries in sub-Saharan Africa, makes Nigeria’s HIV epidemic one of the largest in the world with an estimated 4.5% of people living with HIV (PWH). More specifically, nearly 2 million people in Nigeria were living with HIV in 2020 [1]. In the same year, Nigeria reported 86,000 new HIV infections, and recorded 49,000 AIDS-related deaths. Within Nigeria, Oyo state has an estimated HIV prevalence of 0.9% [2]. Ibadan, the capital of Oyo state and Nigeria’s third most populous city with over 3.5 million people, carries a significant HIV burden. However, compared to other major cities in Nigeria, Ibadan has fewer resources for HIV prevention [2]. Additionally, Ibadan has a blended composition of urban and rural areas. These features make Ibadan’s social dynamics and epidemiological characteristics highly complex and representative of the entire region [3, 4].

Approximately 20% of new HIV infections in Nigeria occur among youth and young men aged 15 to 24 years, yet only 34% of young men are fully aware of how to prevent HIV [1]. Accordingly, in 2017, 3.0% of all Nigerians ages 15–24 years were living with HIV. Despite wide recognition of the disproportionate impact of HIV on adolescents and young adults [5], little attention has been paid to exploring ways to address ongoing HIV transmission.

According to data from the Integrated Biological and Behavioral Surveillance Survey (IBBSS), HIV prevalence among men who have sex with men (MSM) in Nigeria has steadily increased, from 14% in 2007 to 17% in 2010 and to 25% in 2020 [1, 6]. Although HIV prevalence among 15–19 year old MSM remained stable (12–13%), there was a significant increase among MSM ages 20–24 years, from 16.2% in 2010 to 23.9% in 2020 [7]. Compared with all young men in Nigeria, young MSM (YMSM) are up to ten times as likely to be living with HIV; data from Nigeria’s National Agency for the Control of AIDS (NACA) estimate that 2.9% of all 15–19 year old men and 2.7% of all 20–24 year old men were living with HIV in 2018 [8].

The impact of HIV on MSM is compounded by the stigma and criminalization of same-sex sexual behavior. The passage of the Same-Sex Marriage Prohibition Act of 2014 has had much broader implications for MSM in Nigeria than just prohibiting marriage between same-sex individuals [9, 10]. The law also forbids any public demonstrations of “same sex amorous relationship[s]” and participation in gay clubs or societies, with punishments ranging from 10 to 14 years in prison. Consequently, these laws reinforce the widespread stigma against sexual minorities and facilitate their prosecution, which in part forces MSM in Nigeria to actively conceal same-sex demonstrations of affection in public. Further, these laws prohibit the provision of services by gay-serving organizations, which has particularly stymied the ability of HIV service organizations to provide targeted and tailored prevention and treatment services to MSM vulnerable to or living with HIV. Likely as a consequence of these factors, UNAIDS data in 2020 show a strikingly low antiretroviral therapy (ART) coverage among those with HIV at 26%.

In the United States (US) and other jurisdictions where sexual and gender minority (SGM) individuals are afforded more protections, sexual networks of MSM are likely homophilous (i.e., predominantly include other MSM), thus creating an insular transmission network. However, in locations where same-sex relationships are criminalized, including Nigeria, MSM may hide their sexual identity by maintaining female partners in public, but privately having sex with men. This leads to less bounded networks in which YMSM socially and sexually interact with the general population, generating very complex epidemiological dynamics. In order to study the HIV epidemic and devise successful public health strategies in this context, approaches that can navigate this degree of complexity are needed. Specifically, community-led and community-engaged approaches are necessary for success, but may require compromises to complex data collection in order to ensure participant safety. Research to describe HIV-1 transmission networks among YMSM and the interplay with the general population is therefore a high priority.

Network data collection is crucial for identifying sexual connections key to potential transmission acts. Data on all sexual partners, including number and type of sexual acts, condom use, perceived HIV-1 status, and concurrency, are important factors in determining disease spread within a network. Despite challenges in recall, especially for one-time partners, partner elicitation remains the primary method of identifying individuals who were potentially exposed to HIV-1 and remains a critical component of evidence-based interventions including Partner Services. As such, it is a building block for a more complex understanding of network dynamics central to public health intervention strategies.

Egocentric and whole networks comprise the two major structures in social network analysis. Egocentric networks contain information on both individuals and their immediate network (alters), while whole networks contain information from all individuals within a bounded population that can be collected via a roster. In most cases, particularly where the population and/or the outcome are stigmatized, it is infeasible to collect whole network data [11, 12]. Therefore, approaches have been developed to improve the utility of egocentric networks by constructing a macronetwork in which individuals who appear multiple times across egocentric networks are matched and consolidated. A macronetwork can be thought of as a whole network with missing data [13]. Development of a macronetwork can help to overcome some major limitations inherent in egocentric networks [13,14,15]: many network-level measures cannot be accurately assessed (e.g., distance, centrality, positional equivalence) and observed network data will be biased, as some individuals may appear as alters multiple times across egocentric networks. Despite limitations, development of a macronetwork is the optimal approach for understanding sexual transmission dynamics in settings where developing a roster of the population is impossible (e.g., a list of all YMSM in Ibadan).
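The consolidation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in this study: the record identifiers, the `matches` pairs, and the function name `build_macronetwork` are hypothetical, and the sketch assumes that duplicate records have already been identified by some other means.

```python
def build_macronetwork(ego_edges, matches):
    """Merge egocentric edge lists into one edge set (illustrative sketch).

    ego_edges: (record_a, record_b) ties pooled from all interviews.
    matches:   (record_a, record_b) pairs judged to be the same person.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in matches:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller id as the canonical label for the merged person.
            parent[max(ra, rb)] = min(ra, rb)

    edges = set()
    for a, b in ego_edges:
        u, v = find(a), find(b)
        if u != v:  # drop self-loops created by merging
            edges.add((min(u, v), max(u, v)))
    nodes = {n for e in edges for n in e}
    return nodes, edges


# Hypothetical example: two egos (E1, E2) who each named the same person.
ego_edges = [("E1", "E1_a1"), ("E1", "E1_a2"), ("E2", "E2_a1"), ("E2", "E2_a2")]
matches = [("E1_a2", "E2_a1")]
nodes, edges = build_macronetwork(ego_edges, matches)
```

In this toy case the two otherwise separate egocentric networks become connected through the shared alter, which is the property that makes path-based measures computable at all.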

Although some research on social/sexual network dynamics related to HIV-1 transmission has been conducted in Nigeria [16], none has concurrently focused on both YMSM (i.e., key population) and the general population, or leveraged advanced network surveying technologies to improve data collection [17, 18]. Therefore, we first sought to assess the feasibility and acceptability of collecting social and sexual network data in Network Canvas from individuals newly diagnosed with HIV-1 in Ibadan, Nigeria. Once this was established, we then examined whether there were significant differences in individual- and network-level characteristics between YMSM and general population participants. Then, we sought to determine HIV-1 sexual transmission dynamics in the population by combining social/sexual network data from participants. We did this by attempting to match individuals across networks to construct a macronetwork.

Methods

Researchers from Northwestern University, the Ann & Robert H. Lurie Children’s Hospital, and the University of Ibadan collaborated on a research study in Ibadan, Nigeria. The iCARE study was built on infrastructure developed through a successful US-Nigeria-HIV-centered academic partnership. The project tested two combination interventions among youth aged 15 to 24 years across the HIV care continuum. The research study described below is a subproject of these interventions, and used network and phylogenetic data to examine the HIV-1 epidemic in the YMSM and general population in Ibadan.

Procedures

Key informant interviews (KIIs) and focus groups were conducted in Ibadan to assess interest in and acceptance of the use of the Network Canvas software suite for social and sexual network data collection among individuals newly diagnosed with HIV-1 in Ibadan. Four key informants with substantial knowledge and experience working with the target population were identified and recruited by the Ibadan research team: a peer navigator, a nurse, a medical doctor, and a recruiter. In September and October 2020, KIIs were conducted by members of the Northwestern University and University of Ibadan research teams over Zoom. KIIs included sections pertaining to: experiences with discussing sexual network and behavior data, administering surveys to patients, patient privacy and data security, and Network Canvas introduction and assessment. During the software introduction, KII participants received a live demonstration of the suite’s Interviewer software before providing their feedback about the experience.

Three focus groups were convened among people living with HIV-1 in October 2020: focus group #1 consisted of 8 YMSM aged 15–24 years; focus group #2 consisted of 5 cisgender women aged 15–45 years from the general population; and focus group #3 consisted of 5 cisgender men aged 15–45 years from the general population. Focus group participants were recruited by peer navigators across the three study sites in Ibadan following confirmation of their HIV-1 diagnosis. Following the consent process and participation in the focus group, each individual was paid 5000 Naira (approximately 12–13 United States dollars (USD)). KII and focus group feedback was used to develop and refine a survey protocol to collect social and sexual network data from participants in three languages: English, Yoruba, and Pidgin. All qualitative activities were led by a community-based researcher in Ibadan.

Between March 2021 and March 2022, 151 individuals at least 15 years of age and newly diagnosed with HIV-1 (i.e., never having had a prior positive HIV test) and treatment-naive were recruited from three sites within the iCARE catchment area in Ibadan, Nigeria to assess differences in individual- and network-level characteristics between YMSM and the general population. The three sites were a private hospital and two public health facilities. Participants completed an interviewer-administered network interview. Participants received 2,500 Naira for the interview. All study activities received approval by the Northwestern University IRB.

A variety of approaches were employed to ensure data for all participants and their alters were kept protected and secure. All data were collected on password-protected, encrypted computers, and data were deleted from study computers once they had been uploaded to the secure server. All participants provided informed consent for all study procedures, could skip any questions or terminate the interview at any time, and were not asked to provide full names or exact addresses for alters to ensure privacy. Study procedures used accepted standards within social network analysis for ethically collecting and utilizing data from alters, including requesting a waiver of secondary subject consent per 45 CFR 46.116(f) and using a certificate of confidentiality provided by NIH.

Measures

Traditional methods of collecting network data are exceedingly complex and time-intensive [18]. Therefore, we used Network Canvas to collect data on social and sexual networks for this study. Network Canvas is an innovative, open-source tool designed to make the capture and management of complex network data efficient, low-barrier, and engaging for participants. Informed by the principles of participant-aided sociograms [19] and human-computer interaction, the software utilizes touchscreen interfaces and simple visual components to promote an engaging experience for participants [17]. We developed a bespoke survey protocol informed by our KIIs and focus groups which was deployed at three field sites in Ibadan to collect both ego (i.e., participant) and alter (i.e., network member) data. Interviewer staff were trained to utilize the protocol through mock interviews and were provided with Interviewer Guides, which gave a screen-by-screen description of the interview and the data to be collected at each stage. The guide provided an interviewer script, as well as definitions for terms that could be unfamiliar to participants, such as transgender, gender identity, and sexual orientation. These materials were provided in English, Pidgin, and Yoruba.

Ego-Level

Participants provided their name, age, and neighborhood of residence. Next, they were asked a series of sex, sexual orientation, and gender identity questions. Then they were asked a series of questions about their medical history.

Alter-Level

Alter-level data were elicited through three main phases used in network surveying [20,21,22,23,24,25]. First, social network connections were elicited via a name generator asking participants to name those with whom they share personal matters. Participants could add as many individuals to this prompt as they liked and were asked to provide information on each alter’s name, age, and neighborhood of residence. Unlike standard network survey flow in which all name generators come first, the survey was separated by alter type (social and sex) due to recommendations from KIIs and focus groups.

Second, sexual network connections were elicited through a name generator asking participants to name people with whom they had sex in the past six months. Participants could add as many individuals to this prompt as they liked and were asked to provide information on each alter’s name, age, and neighborhood of residence. In this name generator, participants had the opportunity to import previously named alters from their social network with the exception of members whose relationship to the participant was that of a family member, who were filtered out.
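The import rule described above amounts to a simple filter over previously named alters. The sketch below is illustrative only and does not reflect how Network Canvas implements the step internally; the field names and relationship labels are hypothetical.

```python
def importable_alters(social_alters):
    """Return the social alters that may be offered for import into the
    sexual name generator, i.e., everyone except family members."""
    return [a for a in social_alters if a["relationship"] != "family"]


# Hypothetical social alters named by one participant.
social = [
    {"name": "A", "relationship": "friend"},
    {"name": "B", "relationship": "family"},
    {"name": "C", "relationship": "coworker"},
]
offered = importable_alters(social)
```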

Following the nomination of alters, participants were asked to provide information about the relationships amongst alters. Participants placed network members on a sociogram to draw ties between alters. For social alters, participants were asked to connect all the people who know each other. For sexual alters, participants were asked to connect the people who they believe have had sex with each other in the past 6 months. Moreover, for sexual alters the participant was asked to identify all who they believed were HIV-1 seropositive. If they indicated any network members as HIV-1 seropositive, they were asked if they believe that network member has a regular medical provider who they see when they are sick.

Matching Procedure

First, a dataset was created consisting of all ego and alter observations and responses to the name, neighborhood, age, and gender questions. All individuals were compared on these four responses to determine matches using the following process:

  1. Individuals with exact matches for all four variables were immediately marked as duplicates.
  2. Then, individuals with the same or similar names (e.g., Michael and MICHEAL), the same gender, the same neighborhood of residence, and ages within 3 years were reviewed and marked as duplicates.
  3. Next, all instances with similar names, same gender, and ages within 3 years were explored. Addresses were investigated for proximity to each other, and all matches where the distance was less than 1 mile were considered duplicates.
  4. Finally, individuals with the same address, same gender, and ages within 3 years were investigated. A case-by-case determination was made by the first author about whether the names were similar enough to constitute matches. Most duplicates found this round were due to use of nicknames (e.g., Tunde and Babatunde) or use of full vs. partial name (e.g., John Smith, John, and Smith).
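The four steps above can be approximated programmatically as a tiered screen over pairs of records. The sketch below is a hedged illustration, not the procedure used in this study, which relied on manual review: the string-similarity measure (Python's `difflib.SequenceMatcher`), the 0.8 cutoff, the record field names, and the externally supplied `miles_apart` distance are all assumptions introduced here.

```python
from difflib import SequenceMatcher


def similar_name(a, b, threshold=0.8):
    """Case-insensitive fuzzy comparison; the 0.8 cutoff is an assumption."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def match_step(r1, r2, miles_apart=None):
    """Return the step (1-4) at which two records would be flagged, else None.

    Steps 2-4 only flag candidates for human review. miles_apart is a
    hypothetical, externally supplied distance between neighborhoods.
    """
    same_gender = r1["gender"] == r2["gender"]
    same_place = r1["neighborhood"] == r2["neighborhood"]
    close_age = abs(r1["age"] - r2["age"]) <= 3
    if (r1["name"] == r2["name"] and same_gender and same_place
            and r1["age"] == r2["age"]):
        return 1  # exact match on all four variables
    if not (same_gender and close_age):
        return None
    names_alike = similar_name(r1["name"], r2["name"])
    if names_alike and same_place:
        return 2
    if names_alike and miles_apart is not None and miles_apart < 1:
        return 3
    if same_place:
        return 4  # names differ; needs a case-by-case decision (e.g., nicknames)
    return None


# Hypothetical records; "N1" and "N2" stand in for neighborhoods.
a = {"name": "Michael", "gender": "male", "neighborhood": "N1", "age": 24}
b = {"name": "MICHEAL", "gender": "male", "neighborhood": "N1", "age": 26}
c = {"name": "Tunde", "gender": "male", "neighborhood": "N1", "age": 22}
d = {"name": "Babatunde", "gender": "male", "neighborhood": "N1", "age": 23}
```

With these assumptions, the Michael/MICHEAL pair is flagged at Step 2, while the Tunde/Babatunde pair falls below the similarity cutoff and surfaces only at Step 4. Nickname matches of that kind are why an automated screen would still need human judgment.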

Anyone who was not identified as a potential duplicate through these four steps was considered a unique individual. Duplicate individuals were then combined using the following process to determine demographics:

  1. If one of the individuals was an ego, all ego-level data were used for demographics.
  2. If none of the individuals were an ego, then the following methods were used for determinations:
    a. All reported ages were averaged and the mean age was used.
    b. All categorical variables were classified based on majority (e.g., two heterosexual reports and one bisexual report would be collapsed into heterosexual) where possible.
    c. If there was no majority (e.g., one heterosexual report and one bisexual report), the responses from the ego with the closest relationship to the alter were used (e.g., mother/son relationship selected over casual sex partner relationship).
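These consolidation rules can be expressed compactly in code. The following is a minimal sketch under stated assumptions rather than the study's actual implementation: the field names, the single categorical variable (`orientation`), and the numeric `closeness` ranking of relationships are hypothetical stand-ins.

```python
from collections import Counter
from statistics import mean


def consolidate(reports):
    """Combine duplicate records for one person into a single profile.

    Each report is a dict with 'is_ego', 'age', 'orientation', and
    'closeness' (higher = closer relationship between the reporting ego
    and the alter; a hypothetical ranking).
    """
    # Rule 1: self-reported (ego) data take precedence.
    ego = next((r for r in reports if r["is_ego"]), None)
    if ego is not None:
        return {"age": ego["age"], "orientation": ego["orientation"]}

    # Rule 2b: use the strict majority category where one exists.
    value, count = Counter(r["orientation"] for r in reports).most_common(1)[0]
    if count > len(reports) / 2:
        orientation = value
    else:
        # Rule 2c: otherwise defer to the report from the closest relationship.
        orientation = max(reports, key=lambda r: r["closeness"])["orientation"]

    # Rule 2a: average all reported ages.
    return {"age": mean(r["age"] for r in reports), "orientation": orientation}


# Hypothetical alter-only reports about the same person.
three = [
    {"is_ego": False, "age": 20, "orientation": "heterosexual", "closeness": 1},
    {"is_ego": False, "age": 22, "orientation": "heterosexual", "closeness": 1},
    {"is_ego": False, "age": 24, "orientation": "bisexual", "closeness": 3},
]
merged = consolidate(three)
```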

Analytic Approach

Focus groups were audio-recorded and a staff member took notes in real-time. At a later date, a staff member from Northwestern University listened to the audio recordings of each focus group and cross-referenced their notes to the notes taken live. Both staff members sought to identify recurring themes regarding participant comfort and experience reporting on sexual behavior and health information as well as their impressions of using Network Canvas for capturing these data. No formal coding processes were used for qualitative analyses.

For quantitative data, all analytic procedures were conducted in SAS v9.4 (SAS Institute, Cary, NC). First, univariate statistics were generated to describe both ego and alter demographics. Then, bivariate analyses were conducted to identify significant differences between YMSM and general population study participants. Odds ratios and 95% confidence intervals were calculated for categorical variables, and t-tests were calculated for continuous variables. Sensitivity analyses were conducted using generalized linear models (GLMs) to explore whether gender or age impacted associations with population type.
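For readers who want to see the two bivariate statistics in concrete form, the sketch below computes an odds ratio with a Wald-type 95% confidence interval and a t statistic from summary data. It is written in Python rather than SAS and is only an illustration: the text does not state which interval method or which t-test variant was used, so the Wald interval and the unequal-variance (Welch) form are assumptions, and the 2x2 counts are hypothetical.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table [[a, b], [c, d]]."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(odds ratio)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def welch_t(m1, s1, n1, m2, s2, n2):
    """Unequal-variance t statistic from group means, SDs, and sizes."""
    return (m1 - m2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


# Hypothetical 2x2 counts, for illustration only.
or_est, or_lo, or_hi = odds_ratio_wald(10, 20, 5, 40)

# Age summaries as reported in the Results; group sizes of 94 and 57 are
# inferred from the reported percentages and are an assumption.
t_age = welch_t(36.1, 7.2, 94, 23.9, 2.3, 57)
```

Under these assumptions the age comparison yields a statistic of roughly 15.2.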

Network visualization was conducted in Gephi v0.9.2 (Gephi Consortium, France). Nodes were arranged in a Fruchterman-Reingold layout. Ego nodes were distinguished from alter nodes by color and by size. Edge types were distinguished by color. Statistics were generated using Gephi’s built-in algorithm to compute modularity, modularity with resolution, and number of communities.
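As a point of reference for the modularity statistic, the sketch below scores a given partition of an undirected, unweighted network using the standard Newman formulation. It does not reproduce Gephi's community detection, which also searches for the partition; the `resolution` argument here scales the expected-edge term, which is one common convention and may not correspond to how Gephi parameterizes its resolution setting. The example graph is hypothetical.

```python
def modularity(edges, community, resolution=1.0):
    """Modularity of an undirected, unweighted graph for a given partition.

    edges:      iterable of (u, v) pairs, each undirected tie listed once.
    community:  dict mapping node -> community label.
    resolution: multiplier on the expected-edge term (1.0 = standard).
    """
    m = 0
    internal = {}  # number of edges inside each community
    degree = {}    # summed node degree of each community
    for u, v in edges:
        m += 1
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - resolution * (degree[c] / (2 * m)) ** 2
        for c in degree
    )


# Hypothetical graph: two triangles joined by a single bridging tie.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_partition = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(toy_edges, toy_partition)
```

Values closer to 1 indicate that ties are concentrated within communities; a single bridging tie between otherwise dense clusters, as in the toy graph, lowers the score only modestly.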

Results

Key Informant Interviews and Focus Groups

KIIs (n = 4) and focus groups (n = 18) informed the development of the Network Canvas protocol and recruitment. Key findings indicated that participants had concerns about the confidentiality and privacy of their data, that YMSM would feel more comfortable with a trusted community member being involved in recruitment, and that the interview materials and study instruments should be offered in multiple languages. Across all groups and interviews, community engagement was highlighted as an important priority for the success of the proposed project.

Based on the qualitative data collection activities, the following recommendations were implemented: during the consent process participants were reassured that their data would be securely handled and were given an explanation of study protocols for data access and storage, a trusted community member was included in recruitment efforts, and all interview materials were offered in English, Pidgin, and Yoruba. Further, minor edits were made to ensure that all terminology was defined within the Interviewer Guide, and that response options were worded clearly and appropriately. Changes included new phrasing and probes for eliciting living location (e.g., asking about neighborhood instead of address), updating relationship types (e.g., separating sexual partner into romantic and non-romantic), and adding healer to the list of medical providers.

Univariate

With respect to feasibility and acceptability, we were able to recruit a sample composed of more than one-third (37.7%) YMSM, and all participants reported information on at least one sexual partner. Participants were on average 31.5 years old (YMSM mean = 23.9 years) and the majority were male-identified (57.0%; Table 1). Participants reported a mean of 2.6 social alters and 2.6 sexual alters.

Table 1 Ego Demographics (n = 151)

Examining demographic statistics between social alters and sexual alters, we observe that most social alters were friends (43.8%) and family members (31.8%) and had most frequently met through social media (31.0%) followed by neighborhood (23.4%; Table 2). Most sexual alters were romantic sexual partners (46.5%) followed by non-romantic sexual partners (31.4%); similar to social alters, most were met through social media (37.1%) followed by neighborhood (25.3%, Table 2). Most social and sexual alters were cisgender males; 64.3% of social alters were cisgender males while 79.6% of sexual alters were cisgender males (Table 2). The mean age for social alters was 34.4 (SD = 12.5) and the mean age for sexual alters was 32.8 (SD = 8.2).

Table 2 Alter Demographics by Alter Type

Bivariate

Unsurprisingly, key population participants were all assigned a male sex at birth, whereas general population participants were 67.0% assigned female at birth, 31.9% assigned male at birth, and 1.1% unknown sex assigned at birth. Key population participants were significantly younger (mean = 23.9 years; SD = 2.3) than general population participants (mean = 36.1 years; SD = 7.2; t = 15.3, p < 0.0001; Table 3). Additionally, key population participants predominantly identified as gay (68.4%), with the remainder being bisexual (31.6%). General population participants were majority heterosexual (93.6%) with the rest identifying as bisexual (6.4%).

Table 3 Ego Demographics by Population

There were also significant differences between general and key population participants based on reported alters. Key population participants reported a mean of 3.2 social alters, including those who were also sex alters, (SD = 1.1) compared with 2.3 (SD = 0.9) reported by general population participants (t = -4.9; p < 0.0001). When looking only at social alters (excluding those who were both sex and social alters), there was no longer a significant difference between general and key population participants. Key population participants reported a significantly higher mean number of sexual alters, including those who were also social alters, (3.6; SD = 1.3) than general population participants (2.1; SD = 0.8; t = -8.24; p < 0.0001). When looking just at sexual alters (excluding those who were both sex and social alters), the significant difference remained, albeit at a smaller magnitude. Finally, key population participants reported twice as many alters that were both sexual and social network members (mean = 1.7; SD = 1.5) than general population participants (mean = 0.7; SD = 0.7; t = -4.61; p < 0.0001).

Sensitivity Analyses

When gender was included in the model, the magnitude of all associations between alter number and population type decreased (Supplemental Table 1). With the exception of sexual alters (only), all models that were previously significant remained significantly associated with population type. After also including age in the models, the magnitude of all associations once again decreased, but all significant associations from the prior model remained.

Matched Network

From the 151 egos and 634 alters, 85 potentially unique individuals (194 total) were identified during the initial review of egos and alters as potential matches. At Step 1, 10 individuals were collapsed into 4 unique individuals. At Step 2, 38 new individuals were identified as matches (either with new individuals or collapsed individuals) and collapsed into 20 unique individuals. At Step 3, 7 new individuals were identified as matches (either with new individuals or collapsed individuals) and collapsed into 4 unique individuals. Finally, at Step 4, 8 new individuals were identified as matches (either with new individuals or collapsed individuals) and collapsed into 4 unique individuals. Overall, 65 egos/alters were collapsed into 25 unique individuals. Although the majority of unique individuals came from a matched pair (n = 21), 2 had 3 composite individuals and 2 had 6 composite individuals. The resulting macronetwork consisted of 749 nodes with 1,225 edges (Fig. 1). Within the macronetwork, one general population participant was connected to a large community composed of 15 key population participants and their corresponding alters, providing initial evidence for intermingling between key population and general population individuals newly diagnosed with HIV.

Fig. 1 Matched social and sexual network of individuals newly diagnosed with HIV from Ibadan, Nigeria

Discussion

Our success in collecting social and sexual network data from over 150 newly HIV-1 diagnosed individuals in Ibadan, Nigeria demonstrates feasibility and acceptability of using Network Canvas to facilitate data collection and management. Engaging with local stakeholders through KIIs and focus groups increased the likelihood of this project succeeding, in that they provided valuable information on participant engagement, question wording, and interface development. Community engagement in any research project is critical for its success, but it was especially vital for this initiative, as safety and security concerns stemming from the criminalization of same-sex relationships could easily have prevented YMSM recruitment. Furthermore, we were not only able to recruit participants, but were also able to collect detailed data about their social and sexual network members, which likely would not have been possible without formative work and close partnership with recruiters and interviewers.

Although this study enrolled individuals 15 years or older (regardless of gender) who were newly diagnosed with HIV-1 during the specified time period, a main population of interest was YMSM. Due to the aforementioned stigma and criminalization impacting YMSM, messaging focused on this community needed to be more general and to rely heavily on word-of-mouth. Despite the lack of targeted recruitment, we were still able to enroll a sample that consisted of more than one-third YMSM, with a diversity of ages and backgrounds. In addition, we collected data on more than 600 alters, providing a complex story about the lives of the participants.

We saw several differences between participants in the key population (i.e., YMSM) and the general population in demographics at the individual level (e.g., YMSM were younger and more likely to identify as gay than general population participants), which was unsurprising. However, we also saw significant differences in network characteristics by population type. Key population participants reported a higher mean number of both social and sexual network members compared to general population participants. Most of these associations persisted even after controlling for age and gender; the only exception was for the number of alters who were sexual alters only. Since gender was also not associated with number of sexual alters only in the adjusted model, there seems to be a complex interplay between the role of gender and population type on sexual partnering that needs to be explored further.

Our matching procedure allowed us to identify interconnected networks within our sample. Utilizing information for alters and egos including name, age, sex, and location, we were able to identify individuals nominated in different interview sessions. Despite not collecting full names or addresses, primarily for safety concerns, we were still able to confidently confirm matches. This therefore shows that using a less precise matching protocol is feasible for identifying individuals across networks in Ibadan, Nigeria. However, future work should continue to explore the balance between collecting detailed data and ensuring participant safety to see if more accurate data could improve the matching process. Moreover, overlap between networks shows evidence for interconnectedness, including between general population and key population individuals newly diagnosed with HIV, providing insight into community detection. This provides preliminary evidence for overlapping HIV-1 epidemics between YMSM and the general population, which could spark increased public health investments in prevention among YMSM and, paradoxically, accelerate the repeal of oppressive laws. Although exploring the role of homophily in sexual connections and the potential HIV-1 transmission networks is critical for exploring the underlying dynamics at play, this investigation was beyond the scope of this manuscript.

Although there were many strengths to this study, one major limitation is that we were likely only able to enroll participants who were comfortable with disclosing their sexual identity and/or behavior, so results may not be representative of the entire YMSM population or general population in Ibadan, Nigeria. Moreover, as in any survey study, data were open to recall bias and social desirability bias. However, our team incorporated several techniques to minimize the impact. To minimize recall bias, we asked participants to name sexual partners from only the past six months; time anchoring of questions has been shown to improve recall [26]. To minimize social desirability bias, participants were assured of their anonymity throughout the process, and trusted community members were engaged to ensure participants felt comfortable throughout the process. Additionally, all information on alters was based on ego self-report; to address this limitation, participants had the opportunity to select “Don’t know” or “Prefer not to answer” if they were unsure of an answer to avoid guessing. Finally, our matching procedures were vulnerable to human error and relied heavily on the judgment of the study team, rather than an automated process to identify fuzzy matches. In particular, the use of neighborhood rather than address, and of a single name rather than separate first and last names, limited our ability to confidently match individuals. To mitigate this, rather than rely on the judgment of a single person, two study team members reviewed all potential matches, and only those which were agreed upon as matches were combined. Further, any questions about reported neighborhoods were discussed with field staff in Ibadan to identify whether two locations were close to each other.

To our knowledge, this was one of the first studies to collect detailed social and sexual network data from newly HIV-1 diagnosed individuals in Ibadan, Nigeria, with a particular focus on YMSM. Not only did we find that this approach was acceptable, but participants were willing to provide a wealth of information about themselves and their networks. Despite not prioritizing the collection of specific details on alters, we were still able to identify 65 potential matches that could be combined into 25 unique individuals.