Introduction

Gender power inequity has been an essential construct in theories of women’s sexual health across a variety of fields, including medical anthropology, psychology, sociology, feminism, social work, nursing, and public health (Beckman, Harvey, Thorburn, Maher, & Burns, 2006; Dudgeon & Inhorn, 2004; Harper, Minnis, & Padian, 2003; Maman, Campbell, Sweat, & Gielen, 2000). Numerous studies have demonstrated that power inequality within sexual relationships is linked to poor reproductive and sexual health outcomes for women worldwide (Amaro, 1995; Blanc, 2001; Campbell et al., 2009; Connell, 1987). High-risk sexual behavior and violence, which occur most often in the context of women’s primary heterosexual relationships, are often related to self-perceived low relationship power (Campbell et al., 2009; Pulerwitz, Gortmaker, & DeJong, 2000). Power inequities and high-risk sexual behavior combined with greater physiological vulnerabilities contribute to high and ever increasing rates of HIV and other sexually transmitted infections (STIs) among women (Campbell et al., 2009; Higgins, Hoffman, & Dworkin, 2010), particularly women of color and those who are socioeconomically disadvantaged (Beckman et al., 2006; Wingood & DiClemente, 2000).

The World Health Organization (WHO) and Centers for Disease Control and Prevention (CDC) have prioritized research and interventions that address factors contributing to low relationship power that place women at risk for violence, HIV/STI acquisition, and poor reproductive health outcomes (Sebelius, 2011; World Health Organization, 2009). However, methodological challenges related to the quantitative measurement of relationship power have hampered examination of its impact on health outcomes, including risk for HIV and other STIs (Blanc, 2001). Frequently, researchers evaluate the outcomes of assumed power differentials using proxy variables, such as the manifestation or threat of violence, partner age discordance, or emotional and economic dependence (Harper et al., 2003; Tschann, Adler, Millstein, Gurvey, & Ellen, 2002; Wekerle & Wolfe, 1999), without examining the power component itself (Foss, Vickerman, Heise, & Watts, 2003; Pulerwitz et al., 2000). Yet, such proxy measures do not fully capture the complexities of imbalances that may exist within sexual relationships.

The Sexual Relationship Power Scale (SRPS) is a 23-item scale developed by Pulerwitz et al. (2000) to address the need to measure relationship power among women in intimate and sexual relationships. The SRPS consists of two subscales measuring the constructs of relationship control (RC; 15 items) and decision-making dominance (DMD; 8 items). Since its development, the SRPS has been used in numerous studies exploring relationship power as a determinant of sexual risk within primary relationships, and there exists a substantial literature reporting the psychometric properties of the SRPS and subsequent modifications of the scale in various populations, cultural contexts, and research settings. Although the SRPS has been viewed as a useful tool for measuring relationship power in HIV prevention research (Blanc, 2001; Frye et al., 2007), a review of its psychometric properties has never been undertaken. In this systematic review, we describe the psychometric properties of the SRPS by study population and identify psychometric trends across populations in the HIV prevention literature from 2000 to 2012. Our aim is to help researchers better understand the strengths, limitations, and evidence-based application of this measure as well as identify future research directions to investigate the weaknesses in the construct and measurement of relationship power related to sexual risk.

Development of the Sexual Relationship Power Scale

The development of the SRPS was guided by the theories of gender and power (Connell, 1987) and social exchange (Emerson, 1972, 1981). The theory of gender and power postulates that structural factors at the societal and institutional levels operate to maintain a division (inequity) of labor, power, and normative relations that result in women’s substandard health outcomes (Connell, 1987). Social exchange theory holds that power derives from negotiated cost/reward trade-offs or exchanges embedded within the context of interpersonal relationships (Emerson, 1972, 1981). Based on these theories, in addition to a literature review of relationship power, and focus group discussions with Latina, African-American, and White women in the United States, Pulerwitz et al. (2000) developed the SRPS with 62 initial items in five domains: (1) decision-making dominance, (2) relationship control, (3) distribution of economic and emotional resources, (4) alternatives to the relationship, and (5) dependence on the relationship.

The 62 items were tested in a sample of 18–45-year-old women from a community health clinic (N = 388). The sample consisted primarily of Latina (89 %) women with a high school degree or less (79 %) who reported having a primary sexual partner (43 % were married). Initial factor analysis provided empirical support for a conceptual distinction between two domains that were the foundation of two unique subscales: (1) relationship control (RC) and (2) decision-making dominance (DMD). The other postulated dimensions were discarded due to weak psychometrics. The two retained dimensions have also emerged in the conceptual definition of relationship power proposed by other investigators (Harper et al., 2003; Harvey & Bird, 2004; Ronfeldt, Kimerling, & Arias, 1998). The two subscales have distinct response sets. The RC subscale employs a 4-point Likert scale to measure level of agreement on item statements (Strongly Agree; Agree; Disagree; Strongly Disagree). An example RC item is “Most of the time, we do what my partner wants to do.” The DMD subscale was constructed to measure the balance of decision-making power (1 = Your partner has more power; 2 = Both of you have equal power; 3 = You have more power) on each of the eight items, with higher scores indicating higher relationship power for the respondent. For example, one DMD item asks “Who usually has more say about what you do together?” Detailed instructions for scoring the SRPS were included in the original article by Pulerwitz et al. (2000). The DMD was rescaled to a range of 1–4 to correspond to the RC score range. Subscale scores were totaled separately and divided by the number of non-missing items to calculate a mean individual score for the RC and DMD scales. Mean subscale scores were added together and divided by two to produce an overall SRPS score between 1 and 4. Pulerwitz recommended standardized scoring to enable cross-sample comparison (J. Pulerwitz, personal communication, December 12, 2011). Alternatively, Pulerwitz, Amaro, DeJong, Gortmaker, and Rudd (2002) recommended trichotomizing scores into “high” (>2.82), “medium” (2.43–2.82), and “low” (<2.43) levels of power.
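The scoring steps described above can be illustrated with a short sketch. It is not the authors' scoring program. It assumes that RC items are coded 1–4 with higher values indicating more power, that missing items are marked as None, and that the DMD is rescaled linearly from 1–3 to 1–4 (the mapping is not specified in the text). All function names are hypothetical.

```python
from statistics import mean


def subscale_mean(responses):
    """Mean of the non-missing item responses (None marks a missing item)."""
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no answered items in this subscale")
    return mean(answered)


def rescale_dmd(score):
    """Map a DMD score from the 1-3 range onto 1-4 (assumed linear mapping)."""
    return 1 + (score - 1) * 3 / 2


def srps_score(rc_items, dmd_items):
    """Overall SRPS score: average of the RC mean and the rescaled DMD mean."""
    rc = subscale_mean(rc_items)
    dmd = rescale_dmd(subscale_mean(dmd_items))
    return (rc + dmd) / 2


def power_level(score):
    """Trichotomize using the cut-offs attributed to Pulerwitz et al. (2002)."""
    if score > 2.82:
        return "high"
    if score < 2.43:
        return "low"
    return "medium"
```

Because the rescaling is linear, applying it to the subscale mean gives the same result as applying it to each DMD item before averaging.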

Pulerwitz et al. (2000) also developed a modified version (SRPSm) of the original scale to be used in studies involving outcomes related to condom use. In the modified version, four items related to condom use were removed (items 1, 2, and 8 in the RC subscale and item 22 in the DMD) in order to prevent tautologous correlations with condom use outcomes. The internal consistencies of the SRPSm and modified subscales (RCm and DMDm) were similar to the original scales (alphas: SRPS = 0.84; RC = 0.85; DMD = 0.63; SRPSm = 0.86; RCm = 0.85; DMDm = 0.57). Furthermore, the SRPSm was significantly associated with condom use, and other variables hypothesized to be related to relationship power, thus supporting its construct validity (Pulerwitz et al., 2000, 2002).
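The internal consistency values reported throughout this review are Cronbach's alphas. As a reminder of what the statistic summarizes, the following minimal sketch computes alpha from a respondents-by-items table using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). It assumes complete cases and consistently keyed items; the function name is hypothetical and the code is not taken from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete-case data.

    rows: one list of item scores per respondent, all of equal length.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```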

Method

A systematic literature search was conducted using the Medline and PsycINFO databases for articles published from 2000 to 2012 for the terms “HIV” and “Sexual Relationship Power Scale.” Additionally, articles citing Pulerwitz et al. (2000, 2002) were identified using ISI Web of Science. The initial search identified 128 articles. Articles identified in the search were examined and included in the review if they (1) were published in peer-reviewed journals from 2000 to 2012, (2) were written in English, (3) used the SRPS or a modified version in primary data collection and analysis, (4) reported psychometric properties (reliability or validity) for the SRPS or subscales, and (5) included HIV sexual risk (or related) outcome measures. Review and editorial articles were excluded. We identified 54 publications that matched our inclusion criteria, based on independent assessments by two reviewers. The following data were extracted: authors and references, study purpose and design, sample characteristics and recruitment setting, SRPS scale characteristics and modifications, reliability psychometrics, and validity psychometrics. Two reviewers independently reviewed articles for data extraction, and disagreements were resolved by consensus.

Extracted data were tabulated in a standardized format (Table 1). Risk of bias was not assessed for the reviewed studies, and extracted data were equally weighted in the synthesis, as quality factors (e.g., sample size) that would influence psychometric properties were part of this review. Cross-comparisons and synthesis were performed by population (based primarily on race/ethnicity and sex) to reveal patterns in the psychometric properties of the SRPS within and across groups. Quantitative analysis for cross-population trends in the psychometric properties was performed on the sample of studies using generalized estimating equation (GEE) modeling to identify correlates of scale internal consistency reliability and construct predictive validity. GEE was used to adjust for non-independence of data points due to cases in which multiple estimates of reliability and validity were nested within studies.

Table 1 Summary of psychometric properties of the SRPS reported in published HIV/AIDS-related research studies, 2000–2012

Results

Study Characteristics

Across the included studies, the SRPS or subscales were administered to the following study populations: U.S. African-American females (n = 4), U.S. Latina females (n = 4), U.S. multiple race/ethnicity females (n = 21), South African females (n = 10), other/international female samples (n = 6), male samples (n = 10), and heterosexual couples (n = 3). Of the seven studies that recruited both female and male participants, five reported psychometric results separately, whereas two did not. Few studies included a sample with a narrow age range: nearly half of the studies (48 %) included participants 18 years of age or older, typically up to some specified maximum (e.g., 49 years); several studies (15 %) recruited young adults (typically 18 to 29 years old); a substantial proportion of studies (37 %) included adolescents, but the age limits for these samples varied widely from 14–18 years to 15–49 years, with the majority including both adolescents and adults in the same sample. A majority of studies (57 %) were conducted in the U.S., with the remainder performed either in South Africa (24 %) or other international settings (19 %).

Administration, Scale Modifications, and Scoring

Of the 54 published studies included in this review, 21 reported use of the total SRPS (either the original, SRPSm or adapted version of the scale); 35 employed the RC subscale as a separate measure (original, RCm or adapted); and 18 used the DMD (original, DMDm or adapted). Nearly half of the studies reported the use of more than one scale or subscale.

Nearly 60 % of the studies employed some adapted version of the SRPS or subscales. Adaptations to the scale and subscales included major (e.g., developing a different version of the scale), moderate (e.g., changing or deleting items), or minor revisions (e.g., item wording modifications). A major scale adaptation was the development of the South African version (Dunkle et al., 2004), which was used in 12 of the 54 studies reviewed. The South African version combined selected SRPS items with items from a different gender scale (Dunkle et al., 2007). Ten studies combined subsets of items from RC and DMD subscales to create a new scale (Amaro et al., 2007; Jones & Gulick, 2009; Kaufman, Shefer, Crawford, Simbayi, & Kalichman, 2008; Operario, Nemoto, Iwamoto, & Moore, 2011a, 2011b; Pettifor, Measham, Rees, & Padian, 2004; Younge, Salem, & Bybee, 2010). However, moderate types of adaptations, such as dropping items from the scale due to concerns about negative emotional reactions by participants (Ragsdale, Gore-Felton, Koopman, & Seal, 2009) or the irrelevance of certain items to specific populations (Ketchen, Armistead, & Cook, 2009), were found to be more common.

The SRPS was not only developed in both English and Spanish but has also been translated into numerous other languages, including African language translations (i.e., Sotho, Zulu, Tswana, Xhosa, Pedi, Venda, Tsonga, Afrikaans, Setswana, siSwati, and Runyankole), Native Creole, Chinese, French, Hindi, Urdu, and Tamil. The original Spanish language version has been used in nine studies with Latino/Hispanic participants (Bermudez, Castro, Gude, & Buela-Casal, 2010; Rocca, Doherty, Padian, Hubbard, & Minnis, 2010; Zukoski, Harvey, Oakley, & Branch, 2011). Most translations involved the use of back translations, pilot testing, and expert evaluation of cultural content validity (Dunkle et al., 2004; Kershaw et al., 2006; Ketchen et al., 2009).

Response sets were often modified from the original, such as expanding the Likert scale to include a neutral response or more responses (Beckman et al., 2006; Tietelman, Ratcliffe, Morales-Aleman, & Sullivan, 2008; Younge et al., 2010), dichotomizing responses (Parrado, Flippen, & McQuiston, 2005; Pettifor et al., 2004), or reflecting the Likert scores (Buelna, Ulloa, & Ulibarri, 2009; Filson, Ulloa, Runfola, & Hokoda, 2010). Scoring variations were reported in several articles, most often in light of non-normal sample distributions, response set modifications, and cultural contexts. Other scoring schemes were more complex. Gagnon, Merry, Bocking, Rosenberg, and Oxman-Martinez (2010) scored subjects as having low power on the DMD if respondents selected “Your partner has more power” on three or more of the eight DMD items. Six studies used tertiles of low, medium, and high relationship power variables, although in one study, this was done based on natural distribution cut-off points (Dunkle et al., 2004). The methods used for determining high–low cut-off points when creating categorical responses were inconsistent, sometimes based on the sample distribution (Salazar et al., 2011) and other times on raw scores (Harris, Grant, Pitter, & Brodie, 2009), which at times created small cells.
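The count-based rule attributed above to Gagnon et al. (2010) can be written as a one-line classification. The sketch below is illustrative only. It assumes the original DMD response coding (1 = Your partner has more power) and that missing items are marked as None and simply not counted; the constant and function names are hypothetical.

```python
PARTNER_HAS_MORE_POWER = 1  # DMD response code from the original scale


def low_dmd_power(dmd_responses, threshold=3):
    """True if "Your partner has more power" was chosen on `threshold` or more items."""
    count = sum(1 for r in dmd_responses if r == PARTNER_HAS_MORE_POWER)
    return count >= threshold
```

A rule of this kind discards the graded information in the subscale, which is one reason internal consistency cannot be estimated for the recoded variable.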

During the initial development of the SRPS, the scale was administered verbally as a way to include women of all literacy levels (Pulerwitz et al., 2000). The scale, as written, has a 90 % reading ease level on the Flesch Reading Ease scale (Flesch, 1948) and a 4.6 grade reading level according to the Flesch-Kincaid assessment. Pulerwitz et al. (2000) used trained, bilingual researchers to conduct the survey, in English or Spanish, in a private area, and participant responses were kept anonymous. Across reviewed studies, 61 % of surveys were administered exclusively by trained interviewers, mostly face-to-face, although one study employed telephone interviewing, and another used computer-assisted personal interviewing (CAPI) techniques. Surveys were self-administered in 28 % of studies, with the majority of these (n = 9) using computer-assisted self-interviewing (ACASI/CASI) techniques. Some combinations of interviewer- and self-administered survey methods were used in several studies. Two studies primarily used interviewer-administered methods but switched to ACASI for sensitive items on drug use or sexual behavior. Two studies gave participants the choice of interviewer- or self-administration of surveys. Rocca et al. (2010) randomly assigned either interviewer-administered or CASI of surveys and found no difference in SRPS scores between the two methods.
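For readers unfamiliar with the two readability indices cited above, the following sketch shows the standard published formulas. It takes word, sentence, and syllable counts as inputs rather than attempting to count syllables automatically, and the function names are hypothetical. The counts for the SRPS item text are not given here, so no attempt is made to reproduce the reported values.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level: approximate U.S. school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

Both indices depend only on average sentence length and average syllables per word, so short sentences built from short words score as easy to read.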

Most of the scales were administered at one time point for cross-sectional analysis. However, in a few studies, evaluation of power across time demonstrated sensitivity to intervention assignment (Amaro et al., 2007).

Psychometric Properties of the SRPS by Population

U.S. African-American Females

Four studies reported psychometric properties of the SRPS on samples of U.S. African-American females. Two studies involved adolescent or young adult samples spanning the ages of 14–21 years recruited from urban community health centers (Bralock & Koniak-Griffin, 2007; Salazar et al., 2011). The other two studies analyzed samples of adult females over age 18 (Harris et al., 2009; Younge et al., 2010). Each of the four studies reported reliability and validity data for different forms of the SRPS. Harris et al. (2009) used the original SRPSm in a study of adult women and reported Cronbach’s alpha of 0.89; Bralock and Koniak-Griffin (2007) employed original RC and DMD subscales in a study involving 14–20-year olds, and reported Cronbach’s alphas similar to those originally reported by Pulerwitz et al. (2000): RC = 0.89, DMD = 0.63. By contrast, studies that used modified versions of the scales with African-American samples reported lower internal consistency reliability. Younge et al. (2010) created an 8-item SRPS combining unspecified items from both subscales together with a modified response set and reported an alpha of 0.64. Salazar et al. (2011) created a 12-item SRPS of unspecified items and reported an alpha of 0.80. As pointed out by Younge et al. (2010), reducing the number of SRPS items and modifying the response set may have resulted in suboptimal psychometric scale properties.
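The expected effect of shortening a scale on its reliability can be approximated with the Spearman-Brown prophecy formula. The sketch below is an illustration under a strong assumption, namely that the retained items are psychometrically equivalent (parallel) to the items that were dropped. It does not account for changes to the response set, and the function name is hypothetical.

```python
def spearman_brown(reliability, old_items, new_items):
    """Predicted reliability after changing scale length (parallel items assumed)."""
    n = new_items / old_items  # length ratio, below 1 when the scale is shortened
    return n * reliability / (1 + (n - 1) * reliability)
```

For example, starting from the alpha of 0.84 reported for the original 23-item SRPS, `spearman_brown(0.84, 23, 8)` gives roughly 0.65 for an 8-item version. This is consistent with the general point that item reduction alone can lower alpha, although it cannot establish the cause of any single study's result.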

Despite suboptimal reliability, Younge et al. (2010) found that lower SRPS scores were associated with higher scores on a perceived HIV risk scale. Harris et al. (2009) found that lower/medium SRPSm scores were significantly associated with increased HIV risk behavior, although some caution in interpretation is warranted, because the HIV risk measure was created as a composite of the four condom-related items from the SRPS and was therefore not an independent measure of sexual risk. Bralock and Koniak-Griffin (2007) used the unmodified SRPS and found no statistically significant correlation between percentage condom use and relationship power; however, an association between SRPS and condom self-efficacy beliefs was reported.

U.S. Latina Females

Four studies reported the psychometric properties of the SRPS among Latina women in the U.S. One study was conducted with adult Mexican women aged 18–49 years (Parrado et al., 2005), two involved samples of young Hispanic women spanning ages 18–29 (Ragsdale et al., 2009; Zukoski et al., 2011), and the fourth involved Latina adolescents aged 15–19 years (Rocca et al., 2010). Three of the studies offered participants a choice of either the English or Spanish versions of the survey; the other did not specify language version. Two studies employed original SRPS or subscale versions (although Ragsdale et al. removed item 15 from RC), and both reported Cronbach’s alphas similar to those of Pulerwitz et al. (2000): SRPS = 0.87; RC = 0.88; DMD = 0.63 (Ragsdale et al., 2009); and RC = 0.90 (Zukoski et al., 2011). In contrast, Parrado et al. (2005) used the RC subscale with a binary (yes/no) response set and reported a slightly lower alpha: RC = 0.80. In this study, exploratory factor analysis (EFA) of the original 15-item RC revealed three factors within the RC subscale: relationship control, sexual negotiation, and emotional consonance (Parrado et al., 2005). Internal validation of the scale adaptation was supported with pre-post tests and focus groups demonstrating better power to discriminate than the original RC. The relationship control construct was thus reduced to five items (3, 4, 7, 10, & 12) with an alpha of 0.65.

Evidence of construct validity is provided in three of the studies. Despite the lower alpha, Parrado et al. (2005) found that the 5-item RC subscale was significantly associated with numerous outcomes, including education, age, social support, and relationship variables. Ragsdale et al. (2009) found differences in SRPS scores according to ethnicity or immigration status, but not acculturation, and reported that low SRPS scores were associated with frequency of unprotected sex. In addition, Latina adolescents with low SRPS scores had an elevated risk of pregnancy (Rocca et al., 2010).

U.S. Multiple Race/Ethnicity Females

Twenty-one of the reviewed articles involved females from multiple racial/ethnic groups and did not report psychometric properties separately for each group. Samples were predominantly African American in eight of the studies (alpha ranges: SRPS 0.84–0.88, RC 0.78–0.90, DMD 0.63–0.78), predominantly White in seven (SRPS 0.86–0.93, RC 0.76–0.92, DMD 0.61–0.83), mostly Hispanic in two (SRPS 0.82–0.88, RC 0.81–0.90, DMD 0.57–0.63), and equally balanced in three (too few data points to calculate range). Across studies, the full SRPS and RC subscale exhibited consistently good reliability; the DMD subscale was less consistent (Fig. 1). Studies reporting suboptimal DMD alphas (i.e., <0.70; Nunnally & Bernstein, 1994) tended to have younger samples (mean age range 19–27 years), whereas studies reporting higher DMD alphas (>0.70) tended to have older samples (mean age range 35–39 years). Only two studies reported RC subscale alphas below 0.80, and these were the only studies to modify the original wording of the items (Beckman et al., 2006; Tietelman et al., 2008).

Fig. 1 Distribution of reported Cronbach’s alpha by population and scale. 1 = US African American females; 2 = US Latina females; 3 = US multiple race/ethnicity females; 4 = South African females; 5 = International female samples/other; 6 = males; 7 = heterosexual couples

Higher perceived relationship power by women (as measured by SRPS) was associated with perceived lower sexual pressure (Jones & Gulick, 2009), less prevalent dating violence (Buelna et al., 2009), less intimate partner violence (IPV) (Buelna et al., 2009; Filson et al., 2010; Pulerwitz et al., 2000), less frequent unprotected anal intercourse (Knudsen et al., 2008), consistent condom use (Amaro et al., 2007; Pulerwitz et al., 2000), and fewer treated STIs (Buelna et al., 2009). In addition, relationship power was a partial mediator of the association between IPV and depression (Filson et al., 2010) and between IPV and sexual risk (Buelna et al., 2009). In other findings, the SRPS showed no evidence of association with proxy microbicide use (Mosack, Weeks, Sylla, & Abbott, 2005), injurious dating violence (Buelna et al., 2009), STI positive tests (Buelna et al., 2009), or frequency of unprotected sex (Operario et al., 2011a).

Higher relationship control (RC subscale) in women was significantly associated with less intimate partner or dating violence (Campbell, Tross, Hu, Pavlicova, & Nunes, 2012; Pulerwitz et al., 2000; Roye, Tolman, & Snowden, 2012; Tietelman et al., 2008; Volpe, Hardie, & Cerulli, 2012), less sex work (Mosack et al., 2010), higher female condom use (Weeks et al., 2010), and less frequent unprotected vaginal sex (Knudsen et al., 2008; Mosack et al., 2010; Pulerwitz et al., 2000; Roye, Krauss, & Silverman, 2010). RC was not associated with preferred contraceptive method (Beckman et al., 2006), diaphragm use satisfaction (Beckman et al., 2006), or childhood sexual abuse (Mosack et al., 2010). In contrast to the original findings of Pulerwitz et al. (2000), several studies found no significant association between RC and condom use frequency (Campbell et al., 2009; Panchanadeswaran et al., 2010; Tietelman et al., 2008). In a study of primarily White women, Knudsen et al. (2008) found that higher RC scores were associated with less frequent unprotected anal intercourse, whereas Koblin et al. (2010) found no evidence of a significant association between RC and unprotected anal sex in a study of predominantly African-American women.

Higher DMD for women was associated with lower frequency of unprotected sex in two studies (Campbell et al., 2009; Pulerwitz et al., 2000) but was not related to sexual risk behavior in two others (Knudsen et al., 2008; Panchanadeswaran et al., 2010). Further, Pulerwitz et al. (2000) found no evidence of an association between DMD and physical abuse or forced sex. Weeks et al. (2010) and Campbell et al. (2012) found that African-American women tended to have higher DMD scores compared with White women.

South African Females

Ten articles reported the use of the SRPS in samples of South African women, predominantly with young women ages 15–26 years (Dunkle et al., 2004; Jama Shai, Jewkes, Levin, Dunkle, & Nduna, 2010; Jewkes et al., 2006; Jewkes, Dunkle, Nduna, & Jama Shai, 2010; Ketchen et al., 2009; Nduna, Jewkes, Dunkle, Jama Shai, & Colman, 2010; Pettifor et al., 2004; Sayles et al., 2006). All studies were conducted in local South African languages. Five studies were conducted using data from a national intervention program (i.e., Stepping Stones: Dunkle et al., 2004; Jama Shai et al., 2010; Jewkes et al., 2006, 2010; Nduna et al., 2010). The South African version of the SRPS was originally based on 12 items from the RC subscale and expert knowledge of gender issues in South Africa (Dunkle et al., 2004). Tertiles were used as response categories. Dunkle et al. (2004) reported Cronbach’s alpha of 0.84 for this adaptation. Subsequent studies using the same survey have adopted slightly different formulations of the South African SRPS and response sets but with lower alphas (Jewkes et al., 2006: 0.73; Nduna et al., 2010: 0.68). Only two studies have used modified versions of the SRPS with South African samples. Pettifor et al. (2004) created a 4-item SRPS with revised wording and a dichotomous response (agree/disagree), and reported an alpha of 0.69. Ketchen et al. (2009) employed 22 items from the original SRPS with some revision during translation and reported an alpha of 0.70.

In these samples of South African women, significant correlations were found between lower perceived relationship power and inconsistent condom use (Dunkle et al., 2004; Pettifor et al., 2004; Jama Shai et al., 2010), HIV-positive status (Dunkle et al., 2004), HIV incidence (Jewkes et al., 2010), IPV (Dunkle et al., 2004; Jama Shai et al., 2010; Jewkes et al., 2006, 2010), and emotional stress during pregnancy (Groves, Kagee, Maman, Moodley, & Rouse, 2012). However, several studies found that relationship power was not associated with HIV-positive status (Jewkes et al., 2006; Ketchen et al., 2009; Pettifor et al., 2004) or sexual concurrency (Steffenson, Pettifor, Seage, Rees, & Cleary, 2011).

International Female Samples/Other

Six studies that used the SRPS or subscales for HIV-related research were conducted with female populations in other international settings: China, Thailand, Haiti, Mexico, Botswana, Swaziland, and Uganda. These studies recruited mostly younger adult women but included a range of participants from 15 to 49 years of age. Scale and subscale reliability estimates for these studies were largely consistent with prior research. One exception was the Ugandan study by Hatcher et al. (2012), who reported unusually high alphas (DMD = 0.92, RC = 0.95, SRPS = 0.96). In this study, only one unspecified item was removed from the original SRPS, which was administered to 270 anti-retroviral therapy (ART) naïve HIV-positive women recruited from local health clinics. It is not clear why the scale and subscale internal reliabilities were so high in this particular population and setting.

Two studies reported the psychometric properties of the SRPS for female entertainment/sex workers (Ulibarri et al., 2010; Yang & Xia, 2006). In a study of female sex workers from two Mexican/U.S. border cities (i.e., Tijuana and Ciudad Juarez), Ulibarri et al. (2010) found that lower RC scores were associated with greater odds of experiencing IPV. Yang and Xia (2006) recruited women from entertainment establishments (e.g., hair/beauty salons, bathing/massage centers, and karaoke TV halls) in Shanghai, China and used a modified version of the SRPS, which included 12 items from the RC subscale scored on a 5-point Likert scale. Results indicated that relationship power was not significantly related to consistent condom use after adjusting for cognitive/affective factors.

A study of young impoverished Thai women found that low DMD, but not RC, was associated with unprotected sex (Powwattana, 2009). In a study of pregnant Haitian women receiving prenatal care, higher DMD scores for women were associated with intention to use condoms after pregnancy but not with self-reported condom use or STIs in the prior year (Kershaw et al., 2006).

Males

Although the SRPS was designed to measure women’s perceptions of relationship power, 10 studies included in the current review administered the scale to men. None of the 10 studies reported conducting formative work to evaluate the appropriateness of administering the SRPS to men, and with the exception of minor gender-appropriate wording changes, all 10 studies used scale items as originally developed for women. Most of these studies utilized South African modified versions of the RC and DMD with adolescent and adult South African males (Dunkle et al., 2007; Jewkes, Sikweyiya, Morrell, & Dunkle, 2011; Kaufman et al., 2008; Nduna et al., 2010; Sayles et al., 2006; Steffenson et al., 2011). No adaptations of the South African scales that were unique to male participants were reported. While the DMD subscale has not been utilized with South African females due to actual or perceived poor psychometric properties, one study administered the DMD to South African males and reported Cronbach’s alpha of 0.91 (Kaufman et al., 2008). Magee, Small, Frederic, Joseph, and Kershaw (2006) also found adequate internal consistency (alpha = 0.71) of the DMD administered to a sample of adult Haitian men. Reliability estimates for the RC subscale for South African males have generally mirrored those for South African females, with one exception. Nduna et al. (2010) reported suboptimal reliability (alpha = 0.54) for a 13-item version of the RC administered to sexually active young men in Eastern Cape Province. This low alpha might be due to modifications to the response set in which higher scores indicated more equitable relationship dynamics, rather than a measure of relationship power as originally designed. A study of young sexually active men from rural communities in the U.S. reported an alpha of 0.76 using the original RC subscale (Zukoski et al., 2011). Only a single study reported psychometric properties of the SRPS administered to men (Operario et al., 2011b). The study involved 174 male partners of transgendered women, used a “brief” unspecified version of the SRPS, and reported Cronbach’s alpha of 0.87.

In a study of sexually experienced African males aged 15–26 years, attitudes toward gender relations and relationship control were measured using the same 13-item South African scale developed for female respondents (Dunkle et al., 2007). Men reporting more equitable power in relationships were less likely to engage in transactional sex with partners. Kaufman et al. (2008) surveyed male participants from an urban primary health clinic in South Africa, using a modified version of the SRPS consisting of 10 items from the RC subscale (alpha = 0.89) and 6 items from the DMD subscale (alpha = 0.91). Results indicated that higher perceived RC and DMD were associated with masculine ideology and that negative attitudes toward women were associated with higher RC (but not DMD) scores. Sayles et al. (2006) and Steffenson et al. (2011) examined aspects of relationship power among sexually active South African males aged 15–24 years and found that relationship power was not significantly associated with condom self-efficacy (Sayles et al., 2006) or self-reported partner concurrency (Steffenson et al., 2011).

Heterosexual Couples

Only three published studies report psychometric properties of the SRPS in HIV-related research involving couples, all of which were heterosexual couples. In a study of the sexual dysfunction of young rural Chinese couples, Lau et al. (2006) applied a Chinese language translation of the RC subscale to measure “the extent to which the husband controls the marital relationship” (p. 580). However, the RC was administered only to the women because the authors believed that the subscale had not been validated for men. Cronbach’s alpha for the RC was 0.82. Confirmatory factor analysis supported a one-factor solution. Lower RC scores were significantly associated with wife’s report of higher sexual dysfunction and lower sexual satisfaction. Additionally, men whose wives scored lower on the RC scale were more likely to have a sexual dysfunction.

Gagnon et al. (2010) used the DMD subscale to explore determinants of gender disparities in decision-making power among South Asian migrants residing in Montreal, Canada. The sample consisted of 87 women and 44 men (including 14 couples) born in India, Sri Lanka, Pakistan, or Bangladesh. The SRPS was translated into French, Hindi, Urdu, and Tamil. Internal consistency reliability estimates were not provided, because the DMD subscale was recoded as a dichotomous variable. Overall, about 24 % more men than women considered themselves to have high decision-making power. Among women, high decision-making power was associated with greater knowledge of STIs and higher self-perceived efficacy to ask a sexual partner to use a condom. Gagnon et al. noted that no differences in responses were found within couples but did not provide the results of this dyadic analysis, which may have been underpowered given that only 14 paired couples were included in the sample.

VanderDrift, Agnew, Harvey, and Warren (2013) administered an unspecified 8-item RC scale to both members of heterosexual couples from a multi-race sample recruited in East Los Angeles. Cronbach’s alpha was 0.99 for the sample (alphas were not reported separately for males and females). Dyadic analysis showed that relationship power moderated the effect of condom use intentions on condom use behavior. Specifically, actual condom use within the couple was correlated with the condom use intentions of the partner with the higher relationship power.

Cross-Population Trends in the Psychometric Properties of the SRPS

A systematic comparison of reliability and validity statistics across the reviewed studies revealed several patterns.

Reliability

Among the 54 studies reviewed, 41 reported internal consistency reliability (most commonly Cronbach’s alpha) for one or more study samples. These 41 studies reported a total of 63 alphas: 15 for SRPS, 31 for RC, and 17 for DMD. Less frequent use of the DMD compared with the RC might be related to its inferior reliability scores. Across all studies, the DMD subscale had substantially lower reliability scores compared to either the SRPS (B = −0.26, p < .001) or RC subscale (B = −.17, p < .01) (Table 2). The DMD subscale performed especially poorly in studies with younger females. The mean age of the sample was significantly positively correlated with DMD subscale reliability (r = 0.62, p < .001), even after adjusting for potential confounders (Fig. 2). Of the studies involving U.S. female samples, low DMD alphas (< 0.70) were reported for all five samples of adolescents or young adults, whereas all five studies with adult female samples reported adequate (≥0.70) DMD alphas. The DMD was not used in studies involving South African female samples—one study reported dropping the DMD subscale after finding inadequate psychometrics (Roye et al., 2010), but studies in other countries followed a similar pattern. International studies with young female samples all reported inadequate DMD scale reliability, whereas the one study with an adult female sample (median age 34 years) reported an acceptable DMD alpha. These results indicate that the DMD subscale may not be reliable with younger female samples. Interestingly, the two studies that administered the DMD to male samples reported alphas above 0.70 regardless of mean age.
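For readers less familiar with the statistic being compared here, the following minimal Python sketch computes Cronbach's alpha from a respondent-by-item score matrix using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the toy responses are hypothetical illustrations, not data from any reviewed study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for both the items and the
    summed scale score.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer the same k items")

    # Variance of each item (column) across respondents.
    item_vars = [variance(col) for col in zip(*rows)]
    # Variance of the summed scale score across respondents.
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 4-item responses from five respondents (illustration only).
toy_responses = [
    [4, 3, 4, 4],
    [2, 2, 1, 2],
    [3, 3, 3, 4],
    [1, 2, 1, 1],
    [4, 4, 3, 3],
]
toy_alpha = cronbach_alpha(toy_responses)

# Items that are identical for every respondent are perfectly consistent.
identical_alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]])
```

Because alpha depends on both the number of items and the inter-item covariances, subscales with fewer or more heterogeneous items can be expected to yield lower values.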

Table 2 Predictors of internal consistency reliability of SRPS and subscales across studies: Multivariable model using generalized estimating equation (GEE) analysis

Fig. 2 Scatter plot of relationship between mean sample age and alpha, after adjusting for population type, number of items per scale, and scale modifications

With few exceptions, the RC subscale performed adequately with regard to internal consistency reliability. Only a single study with U.S. females reported less than adequate RC subscale reliability. In this study, researchers administered the original 15-item RC subscale (with alpha = 0.80) to a sample of Hispanic females in North Carolina, but after conducting a factor analysis (in which three sub-factors emerged) decided to construct a 5-item RC subscale (with alpha = 0.65) for use in the analysis. Thus, the researchers accepted reduced reliability in exchange for increased construct validity. Two other RC alphas below 0.70 came from a single South African study (Nduna et al., 2010), in which the response set was substantially altered such that higher scores represented more equitable relationships, rather than higher or lower relationship power.

The full SRPS also performed adequately across studies. The one exception involved a study of unmarried sexually active African-American females from Michigan (Younge et al., 2010), which reported an alpha of 0.64. In this study, the SRPS was composed of only eight items culled from both the RC and DMD and used a modified 5-interval response set assessing which member of the couple had more power.

Multivariable linear GEE regression revealed several significant correlates of internal consistency reliability of the SRPS and subscales across HIV-related studies (Table 2). First, alphas for exclusively male samples were significantly lower (B = −0.082, p = 0.009) across studies, whereas alphas for studies involving dyads were significantly higher (B = 0.22, p < .001). Consistent with classical test theory, another independent predictor of scale reliability was the number of items composing the scale. On average, for each additional item added to a scale, the alpha increased by nearly 0.01 (p = .007). The analysis further revealed a significant moderation effect in which the relationship between item number and alpha was significantly stronger for the RC subscale compared to the SRPS (Fig. 3). This finding might reflect the fact that the relationship between item number and alpha is non-linear, with decreased gain in alpha for scales beyond about 10–15 items (Nnadi-Okolo, 1990).
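The non-linear relationship between item number and reliability described above can be illustrated with the Spearman-Brown prophecy formula from classical test theory. The sketch below is a textbook approximation that assumes parallel items (equal variances and inter-item correlations); it is not the GEE model used in this review, and the function name and starting values are illustrative assumptions.

```python
def spearman_brown(alpha, k_old, k_new):
    """Predicted reliability when a scale changes from k_old to k_new items.

    Assumes parallel items, so it is only a rough guide for real scales.
    """
    if k_old <= 0 or k_new <= 0:
        raise ValueError("item counts must be positive")
    n = k_new / k_old  # factor by which the scale is lengthened or shortened
    return n * alpha / (1 + (n - 1) * alpha)


# Hypothetical 5-item scale with alpha = 0.60, lengthened in steps of 5 items.
alpha_10 = spearman_brown(0.60, 5, 10)
alpha_15 = spearman_brown(0.60, 5, 15)
gain_5_to_10 = alpha_10 - 0.60
gain_10_to_15 = alpha_15 - alpha_10  # smaller than the first gain

# Shortening a 15-item scale with alpha = 0.80 to 5 parallel items.
alpha_shortened = spearman_brown(0.80, 15, 5)
```

Under these assumptions, each additional block of items adds less reliability than the previous one, and cutting a 15-item scale with alpha = 0.80 to 5 items is predicted to give an alpha of about 0.57. Observed values can differ from this prediction when retained items are selected rather than removed at random.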

Fig. 3 Scatter plot of relationship between number of items per scale and alpha, after adjusting for population type, mean age of sample, and scale modifications

Overall, modifications to the original scale items or response sets had a negative impact on scale reliability (Table 2). Modifications to the response set, in particular, can have a substantial impact on scale psychometric properties. As noted, Nduna et al. (2010) modified the response set for RC scale items such that high scores represented more equitable relationships and reported relatively low alphas (0.68 for women; 0.54 for men). Across studies, researchers who modified the original scale or subscales (items or responses) reported alphas that were on average 0.06 points lower compared to those who used the original scales (B = −0.059; 95 % CI −0.107, −0.012; p = .014). However, the relationship between scale modification and alpha is moderated by the type of scale: modifications to the DMD subscale resulted in slightly higher alphas. A further consequence of scale modification is loss of predictability. As seen in Fig. 4, studies using the original unmodified scales display a narrower range of alphas (after adjusting for other covariates), indicating that factors such as target population, scale item number, and mean age can account for a higher proportion of the variance in reliability scores across studies, whereas studies using modified scales have less predictable alphas with broader ranges of residuals, including high and low outliers.

Fig. 4 Distribution of reported Cronbach’s alpha by scale modification and scale type, after adjusting for population type and mean age of sample. Key: Scale: 1 = DMD; 2 = RC; 3 = SRPS. Scale Modification: 0 = no modification from original; 1 = modification from original

Validity

It is beyond the scope of this review to assess scale content validity, and few HIV-related studies have assessed criterion validity of the SRPS against alternative measures of relationship power. Hence, we will explore evidence of validity by examining construct predictive validity based on the theory that relationship power is associated with certain outcomes, particularly condom use and intimate partner violence. This task is complicated by the use of different measures of condom use and HIV risk across studies, the reliability and validity of which vary considerably. Nonetheless, several informative patterns related to construct validity emerged from our analysis.

Across reviewed studies, 32 analyses were performed that examined the association between SRPS (or subscale) scores and measures of condom use. The SRPSm (developed specifically for analyses with condom use outcomes) has not been applied consistently across studies involving condom use behavior. Only seven of the 32 analyses used the SRPSm (or modified subscale) when examining condom use outcomes; 14 analyses retained the condom-associated SRPS items with condom use outcomes; and 11 analyses used an adapted version of the scale or subscales and did not indicate if condom-items were removed.

Of the 32 analyses reporting condom use outcomes, 19 (59 %) found that higher relationship power for females predicted greater condom use, at the 0.05 significance level. Cross-study analysis indicated that use of the DMD subscale displayed decreased odds of predicting condom use compared to either the SRPS (OR = 0.29; 95 % CI 0.10, 0.83; p = .02) or the RC subscale (OR = 0.13; 95 % CI 0.10, 0.16; p < .001). There was one exception to this rule: compared to either the SRPS (OR = 0.94; 95 % CI 0.91, 0.98; p = .007) or RC subscale (OR = 0.93; 95 % CI 0.92, 0.95; p < .001), the DMD subscale showed a significant negative relationship between mean sample age and ability to predict condom use behavior. In other words, relative to the SRPS and RC, the odds of finding a significant association between DMD and condom use increased with decreasing mean age of the study sample. Hence, for the DMD subscale, younger age of the sample appears to reduce reliability but improve validity. Studies with larger samples tended to have increased odds of finding a significant association between SRPS (and subscale) scores and condom use, although this trend did not reach significance at the 0.05 level (OR = 1.26; 95 % CI 0.98, 1.63; p = .08).
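The odds ratios, confidence intervals, and p values reported above can be cross-checked against one another with a simple Wald approximation on the log-odds scale. The Python sketch below is a back-of-the-envelope consistency check that assumes a symmetric normal interval for the log odds ratio; it is not a reproduction of the GEE model, and the function name is a hypothetical illustration.

```python
import math


def wald_from_or(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z statistic and two-sided p value from an OR and its 95% CI.

    Assumes the interval is symmetric on the log scale (Wald interval).
    """
    # Standard error of log(OR), recovered from the width of the interval.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided p value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Sample-size effect reported in the text: OR = 1.26; 95 % CI 0.98, 1.63.
z_size, p_size = wald_from_or(1.26, 0.98, 1.63)

# DMD versus SRPS contrast reported in the text: OR = 0.29; 95 % CI 0.10, 0.83.
z_dmd, p_dmd = wald_from_or(0.29, 0.10, 0.83)
```

Under this approximation, the sample-size effect gives a p value of roughly .08 and the DMD versus SRPS contrast roughly .02, matching the reported values. An interval that includes 1, as for the sample-size effect, corresponds to a p value above 0.05.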

Scale modification, number of items per scale, and Cronbach’s alpha were not significantly associated with whether the SRPS or subscales predicted condom use across studies. It should be noted, however, that these factors might have an impact on validity in individual studies. For example, Roye et al. (2010) explored relationship predictors of sexual risk behavior and performed parallel analyses using two types of response sets for the RC subscale. When analyzed as a categorical variable, high relative to low RC was significantly associated with higher condom use and lower frequency of anal intercourse, but these associations were not significant when the RC subscale was analyzed as a continuous variable.

Fourteen studies reported results that examined the association between SRPS or subscale scores and measures of intimate partner violence (IPV). Of these, 12 (86 %) found an association that was statistically significant at the 0.05 alpha level, with higher relationship power for women predicting less IPV. The two reports of no association with IPV both used the DMD subscale, whereas the 12 studies reporting a statistically significant association with IPV all used either the SRPS or RC subscale.

Discussion

Since its conception, the SRPS and subscales have been used in numerous HIV/STI-related studies with a variety of populations. Numerous modifications and translations of the scales have been undertaken and used to test theoretical relationships encompassing a broad range of predictor and outcome variables. The following section highlights some of the central findings of our psychometric review of the SRPS literature from 2000 to 2012.

With few exceptions, the SRPS and RC subscale show good psychometric properties across numerous populations and research settings. The RC subscale is the most utilized measure of relationship power in the HIV/STI prevention literature. Our analysis indicates, however, that the reliability of the RC subscale can be particularly sensitive to reductions in the number of items used. Overall, studies that used the original versions of the SRPS and RC subscale displayed acceptable reliability and validity across numerous populations and settings. Thus, unless researchers have a substantive reason for revising scale content, there appears little justification for modifying items or response sets when using the SRPS or RC subscale. Indeed, even application of the SRPS and subscales to male respondents, without special adaptation or modification, appears to be valid and reliable. However, there is a need for more research to examine how men interpret SRPS items and the conditions under which the original SRPS may not be appropriate for male respondents.

Case and cross-study analyses indicate that the DMD subscale exhibits relatively weak psychometric properties across most populations and settings. In particular, the subscale is associated with lower reliability compared to the SRPS and RC subscale, especially among younger female samples. This may be due to the younger developmental stage of intimate relationships in adolescents, where dyadic decision-making has different focal points. Roye et al. (2010) reported poor factorability (KMO = 0.56) and the emergence of three separate sub-factors within the DMD, and numerous studies have removed items from the original 8-item subscale due to low factor-item correlation (Kaufman et al., 2008; Kershaw et al., 2006; Magee et al., 2006; Panchanadeswaran et al., 2010; Roye et al., 2012; Shannon et al., 2012; Weeks et al., 2010). The DMD also displays inconsistent construct validity. The weak psychometric properties of the original DMD may account for the observation that, unlike the SRPS and RC subscale, modifications to the DMD tend to increase rather than decrease reliability (Table 2). Researchers wanting to utilize the DMD subscale as a separate measure should consider the validity of the subscale in the applied context, determine whether modifications are warranted for the target population, understand its psychometric deficiencies, and validate the psychometric properties of the subscale prior to inferential analysis.

Under some circumstances, scale modifications may be required to improve scale validity. Indeed, the SRPS and RC subscale have proved to be extremely adaptable for use in a wide range of cultural settings. Major modifications such as the South African version of the SRPS and RC subscale provide one example, given the cultural differences that exist between South African populations and those for which the scale was originally designed. Yet, it is clear that modifications to item wording (Beckman et al., 2006; Tietelman et al., 2008), removal of items (Parrado et al., 2005; Younge et al., 2010), and response set modifications (e.g., Nduna et al., 2010; Pettifor et al., 2004) have been found to reduce reliability across studies. Removal of items with low factor-item correlations might or might not increase reliability, depending on the number of items in the scale before and after removal, but may be necessary to improve construct validity. SRPS and subscale scoring were standardized in the original development of the scale with the purpose of facilitating comparisons of relationship power reported across samples (Pulerwitz et al., 2000). In general, adaptations in SRPS scoring, particularly scoring based on sample distribution, make it difficult to conduct comparisons of relationship power across populations.

The use of factor analysis to assess the reliability and construct validity of the SRPS and subscales for a particular sample is recommended. Confirming the uni-dimensionality of the scale or demonstrating the existence of sub-factors within the scale could lead to increased construct and predictive validity (e.g., Parrado et al., 2005). This approach requires a proper balance between reliability and validity due to item removal and other scale modifications. Lack of consideration of this balance may be partly responsible for the observation that less than two-thirds of the studies included in this review detected an association between relationship power and condom use. Indeed, this might be an overestimate given that some of these correlations may be due to the inclusion of condom use items within the power scales. Small sample sizes and limited power also likely played a role.

As discussed, analysis of risk of bias was beyond the scope of this paper, so limitations in this review include publication and reporting bias. Researchers may not have included the SRPS in analysis or in publication if the scale demonstrated weak psychometric properties. Nor did we systematically assess study quality. Nearly all of the studies employed a cross-sectional observational design, and we did not observe a great deal of variation in methodological quality across studies. The primary design elements affecting study validity that varied across studies were sample size and number of items per scale, and we considered these elements in our interpretation of results. Another limitation was the lack of detail in publications on modifications made to scales and subscales. Finally, samples were limited; in particular, the SRPS and subscales were employed much less frequently with males and heterosexual couples, and no published studies were found to apply the SRPS to same-sex couples. Our review indicates that the utility of the SRPS may extend to these and other vulnerable populations at risk for HIV/STI acquisition.

Based on our findings, we make the following recommendations to researchers conducting HIV/STI research using the SRPS or subscales: (1) reliability and validity may vary across different populations and age groups, and factor analysis should be performed to assess the psychometric properties of the SRPS and subscales for a particular research sample and setting; (2) the full complement of items in each scale (particularly the RC subscale) should be used for data collection, and removal of items should be based on the results of post hoc item analysis; (3) the condom-modified versions (SRPSm, RCm, DMDm) should be used when models include condom use outcomes; (4) scale reliability should be reported separately for males and females; (5) the items included in the scales for analysis should be precisely specified; (6) modifications and adaptations of the scales should be described, including wording changes, altered response sets, and scoring details, and the underlying rationale for modifications should be provided; and (7) application of the SRPS and subscales to male samples and to both members of couples is recommended to explore the dyadic nature of power imbalances, perceptions of power imbalances, and their effects on sexual risk behavior. Only a single study used dyadic data analysis to examine how the scale performed across couples or which score was able to discriminate across outcomes (VanderDrift et al., 2013). Future investigations using the SRPS and subscales should examine both heterosexual and same-sex couple-level data, as well as the correlations among partner reports. Dyadic data from couples, including same-sex couples, are likely to yield many interesting and informative comparisons in terms of perceptions of relationship power. Moreover, given that the SRPS has been primarily administered at a single time point, further studies should be designed to examine SRPS scores over time, rather than cross-sectionally. Factors that influence changes in relationship power need to be identified to inform effective risk reduction interventions. Furthermore, intervention studies need to examine their effect on increasing relationship power equity and subsequent changes in HIV-related outcomes.