Introduction

The U.S. has witnessed a rapid and dramatic change in attitudes toward lesbian, gay, bisexual, and transgender (LGBT) issues. For example, public opinion regarding marriage equality has shifted by 31 % within the last 19 years (McCarthy, 2015). Reflecting this trend, the Supreme Court has reversed its position on multiple civil rights concerns pertaining to LGBT issues in the decades after the 1986 Bowers v. Hardwick ruling, most recently making a landmark decision to recognize the rights of same-sex couples to marry (Obergefell v. Hodges) (Epps, 2015). While these cases involved dramatic changes in same-sex relationship rights specifically, the common parlance acronym of LGBT and the political alliance of transgender with same-sex concerns have helped to concurrently bring transgender-related issues into greater awareness. In just the last few years, media coverage of transgender issues has soared, in part through events such as Laverne Cox, a transgender woman, appearing on the cover of Time magazine, and Caitlyn Jenner, a former Olympic athlete, coming out as transgender to Diane Sawyer in a highly publicized interview. Likewise, Web searches using the term “transgender” rose 500 % between July 2013 and July 2015 (Google Trends, 2015). Given this rapidly changing climate, it is surprising to note that only five empirical studies of U.S. attitudes toward transgender persons have been conducted in the last decade (Nagoshi et al., 2008; Nisley, 2011; Norton & Herek, 2012; Walch, Ngamake, Francisco, Stitt, & Shingler, 2012; Willoughby et al., 2010). As the U.S. experiences a dramatic increase in media exposure and legal protections for transgender individuals (Transgender Law Center, 2015), it is critical to gather accurate measures of attitudes toward this population. Here, we report the development and validation of a timely, multidimensional scale using a sample from the U.S. population. Given the high degree of Christian representation within the U.S. population and their unique attitudinal claims, the new scale was additionally designed to capture the nuances of this group.

Religiosity and Attitudes Toward Transgender Persons

To date, 14 studies examining attitudes toward transgender individuals have been conducted in the Western world (Antoszewski, Kasielska, Jedrzeczak, & Kruk-Jeromin, 2007; Devor, Kendel, & Strapko, 1997; Franzini & Casinelli, 1986; Green, Stoller, & MacAndrew, 1966; Harvey, 2002; Hill & Willoughby, 2005; Landen & Innala, 2000; Leitenberg & Slavin, 1983; Nagoshi et al., 2008; Nisley, 2011; Norton & Herek, 2012; Tee & Hegarty, 2006; Walch et al., 2012; Willoughby et al., 2010). Of the 14 studies, three specifically examined the relationship between religiosity and attitudes toward transgender persons and found religious people to hold more negative attitudes toward transgender persons compared to their nonreligious counterparts. In a study examining opposition to transgender persons’ civil rights in the UK, Tee and Hegarty (2006) found that more religious people expressed stronger opposition to transgender persons’ civil rights. In a similar study conducted in the U.S. by Nagoshi et al. (2008), “transphobia” (i.e., prejudice against gender nonconforming persons) was found to be “significantly and highly correlated with right-wing authoritarianism, religious fundamentalism, and hostile sexism,” again suggesting that religious individuals tend to hold negative views toward transgender persons. In the same study, it was found that a more restrictive view of sexuality and support of traditional gender roles—traits typically associated with religiosity—were also correlated with prejudice against gender nonconforming individuals. As well, in a more recent study conducted with a large U.S. sample, Norton and Herek (2012) found that “women held more negative attitudes toward transgender people to the extent that they said religion provided greater guidance in their daily lives,” again indicating a positive correlation between religiosity and negative evaluations of transgender persons, though, in this case, specifically applying to females.

Contextual Considerations for Scale Development

While the three studies examining the relationship between religiosity and attitudes toward transgender persons seem to suggest that religious individuals hold unambiguously negative attitudes toward transgender persons, Rosik, Griffith, and Cruz (2007) have warned that, in questionnaire research, nuances in attitude are often lost when scales are not constructed with sensitivity to religious beliefs, resulting in a failure to provide an accurate measure of the construct of interest. Specifically, in examining conservative religious people’s attitudes toward gays and lesbians, arguably a useful comparison case for attitudes toward transgender persons, researchers have found notable attitude differences depending on whether questions focus on the person or the behavior (Bassett et al., 2000; Fulton, Gorsuch, & Maynard, 1999; Wilkinson & Roys, 2005). This person-behavior distinction (the view that each person has equal dignity and value regardless of their behavior) is one made by a majority of those who hold to the Christian faith. Because approximately 71 % of all U.S. citizens identify as Christian (Pew Research Center, 2015), there is a need for instruments to capture these variances in Christians’ attitudes arising from their belief system.

A secondary contextual concern relates to the issue of timeliness. Just as researchers of attitudes toward lesbians and gays have advocated for timely, culturally relevant scales (Herek, 1994; Worthington, Dillon, & Becker-Schutte, 2005), this concern is highly salient in the measurement of attitudes toward transgender persons. Given that none of the presently available, validated transgender attitude scales include questions pertaining to civil rights of transgender persons, along with the fact that only three scales have been developed within the past decade, there is a need for new scales relevant to the present time (Hill & Willoughby, 2005; Nagoshi et al., 2008; Walch et al., 2012).

Psychometric Considerations

While there are three existing transgender attitude scales that have undergone psychometric evaluation, a significant psychometric limitation of these scales is the fact that they were normed with samples consisting largely or exclusively of college students. This narrowly defined population used for scale development undermines the validity of the instrument when it is utilized with a broader population (Henrich, Heine, & Norenzayan, 2010).

Another limitation pertains to the reductionist conceptualization of the construct of interest. In research on attitudes toward sexual minorities, scholars have increasingly moved from a conceptualization of attitude as a single construct to that of a “multi-dimensional and wide-ranging” construct (Worthington et al., 2005) to better account for its complexities (Fyfe, 1983; LaMar & Kite, 1998; McNaught, 1997; Mohr, 2002). For example, one study yielded four factors: condemnation/tolerance, morality, contact, and stereotypes (LaMar & Kite, 1998), and another yielded five factors: internalized affirmativeness, civil rights attitudes, knowledge, religious conflict, and hate as “separate, but interrelated dimensions of heterosexual knowledge and attitudes regarding LGB individuals” (Worthington et al., 2005). In contrast, two of the three extant transgender attitude scales are one-dimensional (Nagoshi et al., 2008; Walch et al., 2012) and the third is bi-dimensional (Hill & Willoughby, 2005). Findings from the research on attitudes toward sexual minorities provide support that a transgender attitudinal construct may also be multi-dimensional.

Based on the need for a contextually relevant and psychometrically sound instrument, the current two-phase study developed and validated a scale that addresses the limitations of earlier scales.

Phase 1: Scale Development and Exploratory Factor Analysis

The first phase of the study was devoted to the development of a psychometrically sound and contextually relevant transgender attitude scale as described above.

Method

Participants

A sample sufficient in size to perform an exploratory factor analysis (MacCallum, Widaman, Zhang, & Hong, 1999) was collected using Amazon Mechanical Turk (MTurk). MTurk is a 10-year-old service that has been increasingly employed as a participant-recruitment tool by social scientists and has been shown to provide samples of equal or greater quality than traditional internet and college samples, producing data that meet or exceed “psychometric standards associated with published research” (Buhrmester, Kwang, & Gosling, 2011).

For this study, participants were restricted to individuals residing in the U.S. over the age of 18 years. Because the purpose of the study was to develop and validate a scale sensitive to religious nuances, particularly those of evangelical Christians, stratified sampling was employed, using screening questions on MTurk in combination with the quota function on the survey software Qualtrics to ensure that there was adequate evangelical Christian representation in the sample. The focus was placed on evangelical Christians because they hold to distinct doctrinal views that likely affect attitudes toward transgender persons (i.e., the person-behavior distinction and a dichotomous view of sex and gender) (Frame, 2006; Ortlund, 2006).

After data screening was conducted, a sample of 295 participants (55.3 % female and 44.7 % male), ranging in age from 18 to 75 years with a mean of 36.6 (SD = 11.9), was included in the exploratory factor analysis portion of the study. Concerning ethnicity, marital status, and education, 81.4 % were Caucasian, 47.1 % were married, and 50.5 % reported holding at least a Bachelor’s degree. Participants were asked to indicate their religious affiliation using the following choices: none, evangelical Christian, Catholic, Jewish, Muslim, other. Specifically, 41.4 % of participants indicated having no religious affiliation while 54.2 % reported religious backgrounds rooted in Christianity: 36.6 % evangelical Christian, 11.5 % Catholic, and 6.1 % non-evangelical Christian. The evangelical Christian share of the sample slightly exceeds the evangelical proportion of the general U.S. population, which is estimated to be between 25 and 35 % (Institute for the Study of American Evangelicals, 2015; Pew Research Center, 2015). More details concerning demographic information are shown in Table 1.

Table 1 Demographic characteristics

Measures

As a first step in the scale development process, a thorough review of the related literature and extant questionnaires on attitudes toward sexual and gender minorities was conducted. After consulting a university reference librarian, the researchers employed six databases (Academic Search Complete, ATLA Religion Database, Gender Studies Database, PsycARTICLES, PsycINFO, and SocINDEX) to identify relevant studies, PsycTESTS to locate extant scales on attitudes toward gender and sexual minorities, and references of identified studies to determine additional studies for inclusion. Searches were conducted using cognates of terms related to sexual and gender minorities (e.g., transgender, transsexual, LGBT, lesbian, gay, homosexuality), religion (e.g., conservative, religious groups, Christianity, evangelical), and attitudes/scale (e.g., instrument, scale, measure, attitudes, beliefs). The focus of the literature review was to understand how attitudes have been conceptualized in similar studies and to determine areas where improvement was necessary in existing transgender attitudes scales. Based on the literature review, the researchers determined to use a multi-dimensional model of defining attitudes toward transgender persons, consisting of dimensions falling under the two broad conceptual categories of cognitive evaluations and affective reactions and to specifically tap religious nuances in attitudes toward transgender individuals.

Item Pool Generation

Using questions from existing scales and studies, along with novel items to adequately represent the religious nuances and the various dimensions of the target construct, the first author generated a pool of 96 items (see Footnote 1). Of the questions incorporated from existing scales, some were taken directly from extant transgender attitudinal scales (Harvey, 2002; Hill & Willoughby, 2005; Landen & Innala, 2000; Nagoshi et al., 2008; Tee & Hegarty, 2006; Walch et al., 2012) while others were modified from extant homosexuality attitudinal scales (LaMar & Kite, 1998; Worthington et al., 2005). In the item generation process, several considerations were made based on DeVellis’ (2012) work. Firstly, both positively and negatively worded items were included in order to avoid acquiescence bias (e.g., “Transgender people should not be allowed to adopt and raise children”; “Transgender individuals should be treated with the same respect and dignity as any other person”). Secondly, a Likert scale was chosen as the best item response form for this instrument as it was designed to measure attitudes. Thirdly, fairly strong language was used for each item so as not to elicit too much agreement by the use of extremely mild statements (e.g., “A person who is not sure about being male or female is mentally ill”). Fourthly, the researcher endeavored to generate statements with clarity, brevity, and appropriateness of language, which was tested through expert consultation (details below). Fifthly, based on insight from the literature, item wording was carefully considered in order to develop a scale adequate to capture religious nuances of attitudes toward transgender persons. Specifically, language from religiously sensitive scales and an expert on Christian theology were consulted to refine item wording. Finally, a level of redundancy was allowed in the item pool based on the assumption that specific wording might be found preferable through factor analysis.

Expert Consultation and Initial Test Revisions

Three experts then reviewed the initial item pool: a faculty member with expertise in sexual minorities studies, a faculty member with expertise in scale development, and a faculty member with an additional graduate degree in Christian theology as an expert in Christian thought. Each reviewer was asked to evaluate items for conceptual coherence, relevance, and appropriateness to the target subpopulation in light of their area of expertise with a focus on brevity, clarity, and singularity of each item. The primary researcher revised item wording based on written and verbal feedback from each of the three experts. At this point, all 96 items were retained as the majority of feedback pertained to item wording.

After the first round of expert consultation, the primary researcher and scale development expert were selected to conduct a close examination of the item pool. During the second round of evaluation, questions pertaining to cognitive evaluation were refocused to target underlying beliefs regarding gender and sex as dichotomous. Questions explicitly pertaining to human value (not found in existing scales) were included in order for the scale to illuminate the person-behavior differentiation highlighted by evangelical Christians. Items concerning civil rights were also included. Questions related to social/affective responses were designed with a view toward capturing interpersonal comfort in increasing social distance along the spectrum of closedness to openness, ranging from the affective states of antipathy and apathy on the one end, moving toward ambivalence, then finally to interest and acceptance on the other. Each item was worded with a personal orientation in order to avoid unnecessarily abstract statements. The two researchers discussed each item to refine item wording and to make preliminary inclusion/exclusion suggestions, after which the draft of the reduced item pool was sent to the other researchers for review and feedback.

Initial Scale

After the third round of expert evaluation and negotiation as to the appropriate length of the initial scale to be subjected to exploratory factor analysis, the item pool was reduced to 48 questions. The item pool was left sufficiently large so as not to lose its intended scope and range. The question order was randomized using a random integer set generator, and all participants were presented the items in the same randomized order. Based on findings from Meade and Craig (2012), three attention check items were included as part of the questionnaire in order to safeguard against careless participants. The survey building software, Qualtrics, was used to create the initial scale, and the “request response” function, which generates an alert to participants when there are unanswered questions, was also employed in order to minimize inadvertent item nonresponse. For the purpose of this study, participants were provided with the following definition of transgender: “a transgender person is defined as a person whose biological sex does not match their felt sense of self as male or female.” In addition to the 48 questions, eight questions were included pertaining to demographics (sex, age, ethnicity, education, marital status, and religious affiliation), gender identification, and contact with transgender persons. Contact with transgender persons included the following choices: immediate family member, relative, friend, neighbor, coworker, other (please specify), and I do not know anyone who identifies as transgender. At the end of the survey, an open-ended comment box was also provided for participants to offer additional comments.
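The fixed randomized ordering described above can be sketched as follows. This is only an illustration, assuming a seeded pseudo-random shuffle in place of the random integer set generator actually used; the function name fixed_random_order and the seed value are hypothetical.

```python
import random


def fixed_random_order(n_items, seed=2015):
    """Return one shuffled ordering of the item numbers 1..n_items.

    The seed (a hypothetical value) makes the ordering reproducible, so
    every participant can be shown the items in the same randomized order.
    """
    rng = random.Random(seed)
    order = list(range(1, n_items + 1))
    rng.shuffle(order)
    return order


# One ordering for the 48-item initial scale, reused for all participants.
order = fixed_random_order(48)
```

Because the ordering is generated once and reused, any order effects are constant across participants rather than varying from person to person.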

Procedure

After approval was obtained from the university’s Institutional Review Board, participants were recruited through MTurk. The questionnaire, along with the informed consent, was made available to participants through the MTurk interface. The study was set up in such a way that clicking on the “next” button at the end of the informed consent would indicate participants’ agreement and subsequently direct informed participants to the survey available on a secure webpage. Participants were paid $.70 to complete the 5- to 10-min questionnaire, a rate comparable to survey studies of similar lengths made available by other researchers on MTurk.

Results

After data collection, item–total correlation was first evaluated and four poorly performing items were eliminated, leaving 44 items with a Cronbach’s alpha value of .98 to undergo exploratory factor analysis. After completing data screening using SPSS (version 22.0), FACTOR (Lorenzo-Seva & Ferrando, 2015) was used to perform exploratory factor analysis (exploratory maximum likelihood) in order to evaluate the initial factor structure of the scale. Oblique rotation (normalized direct oblimin) was specified, as there was strong reason to believe that the factors would be correlated (factors being related dimensions of the underlying construct of interest), and factor loadings below .40 were suppressed based on Brown’s (2015) recommendation. The initial solution produced three factors with eigenvalues above 1, explaining 69.6 % of the variance. A total of five items that either cross-loaded or did not load were eliminated (1. “If a family member revealed that they were transgender, I would love that family member just the same”; 2. “Transgender couples should have the same rights to adopt children as any other couple”; 3. “I’m not really interested in knowing more about transgender people”; 4. “There should be a place in faith communities for transgender persons”; and 5. “A person can look like a male or female but feel like the opposite gender”), and a second rotated factor analysis (normalized direct oblimin) was performed with the remaining 39 items. The results yielded a simple solution with acceptable fit indices. In order to shorten the scale, all but one of the items with loadings below .50 were eliminated, and the resulting solution was again simple, with similar fit indices. The item, “If a transgender person identifies as female, she should have the right to marry a man,” was retained because this question loaded above the recommended value of .40 and was conceptually significant, given the current political climate in the U.S. Again, for purposes of brevity and balance of items, the question with the lowest loading on Factor 1 (.55) was eliminated and a fourth rotated factor analysis was performed with the remaining 33 questions to examine the fit indices.
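The eigenvalue criterion mentioned above (retaining factors with eigenvalues above 1) can be illustrated with a short sketch. It is not the procedure implemented in FACTOR; it simply counts eigenvalues of an item correlation matrix, and the simulated responses, loading values, and function name are assumptions made for the example.

```python
import numpy as np


def eigenvalues_above_one(responses):
    """Count eigenvalues of the item correlation matrix that exceed 1.

    responses: 2-D array with one row per participant, one column per item.
    Returns (count, eigenvalues sorted in descending order).
    """
    corr = np.corrcoef(responses, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh returns ascending order
    return int(np.sum(eigvals > 1.0)), eigvals


# Simulated data (hypothetical): 300 respondents, 3 factors, 6 items each.
rng = np.random.default_rng(0)
factors = rng.normal(size=(300, 3))
pattern = np.kron(np.eye(3), np.ones((1, 6)))  # 3 x 18 block loading pattern
responses = 0.8 * factors @ pattern + 0.6 * rng.normal(size=(300, 18))

n_retained, eigvals = eigenvalues_above_one(responses)
# Share of total variance carried by the retained eigenvalues.
variance_share = eigvals[:n_retained].sum() / eigvals.sum()
```

The eigenvalues of a correlation matrix sum to the number of items, so the share computed in the last line is the proportion of total item variance associated with the retained factors.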

The result was an interpretable, simple three-factor solution, accounting for 74.5 % of the variance. Each of the 33 items had moderate to high factor loadings, ranging from .46 to .97. The first factor, labeled interpersonal comfort, contained 16 items. The second factor, labeled sex/gender beliefs, contained 11 items. The third factor, labeled human value, contained six items. The items and their factor loadings are shown in Table 2. The reliability estimates for each factor were high (α = .97 for Factor 1, α = .95 for Factor 2, and α = .94 for Factor 3), revealing high internal consistency of each subscale. Cronbach’s alpha for the overall scale was .97, also indicating the reliability of the overall scale. No corrected item–subscale correlation values fell below .30; thus, all 33 items were retained to be included in the final scale (Nunnally & Bernstein, 1994). The fit indices for the three-factor solution also indicated a good fit with the following values: RMSEA = .06, RMSR = .02, NNFI = .94, and CFI = .95 (RMSEA, RMSR, NNFI, and CFI are statistics measuring the level of acceptable fit of the model). These values fall within the thresholds taken to indicate an acceptable fit.
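For readers unfamiliar with the reliability statistics reported above, the following sketch computes Cronbach’s alpha and corrected item–total correlations from a participants-by-items score matrix. It uses the standard textbook formulas rather than the SPSS routines used in the study; the small demo matrix and the function names are hypothetical.

```python
import numpy as np


def cronbach_alpha(scores):
    """Cronbach's alpha for a participants x items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)


def corrected_item_total(scores):
    """Correlation of each item with the sum of the remaining items."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    return np.array([
        np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
        for j in range(scores.shape[1])
    ])


# Hypothetical responses from five participants to three items.
demo = [[1, 2, 1], [2, 2, 3], [4, 3, 4], [5, 5, 4], [6, 6, 7]]
alpha = cronbach_alpha(demo)
weak_items = corrected_item_total(demo) < 0.30  # the .30 cutoff cited above
```

Items flagged in weak_items would be candidates for removal under the .30 criterion; in this demo none are flagged.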

Table 2 Factor loadings for the post-EFA initial 33-item scale

Phase 2: Scale Validation Through Confirmatory Factor Analysis

The purpose of the second phase of the study was to administer the newly developed instrument to an independent sample to test the stability of the factor structure and further analyze its reliability and validity.

Method

Participants

MTurk was employed again to obtain a large sample of participants residing in the U.S., 18 years and older, for the confirmatory factor analysis (CFA) phase of the study. The “Set Embedded Data” function in Qualtrics was used to reject participants who participated in Phase 1 of the study to ensure a sample independent from the first. The same stratified sampling procedure was used to ensure evangelical Christian representation in the sample. After data screening, a sample of 238 participants (55.5 % female and 44.5 % male), ranging in age from 19 to 66 years with a mean age of 33 (SD = 10.3), was included. Of the sample, 80.3 % were Caucasian, 37.8 % were married, and 46.6 % reported holding at least a Bachelor’s degree. Specifically, 38.7 % of participants indicated having no religious affiliation while 58 % reported religious backgrounds rooted in Christianity: 41.6 % evangelical Christian, 10.1 % Catholic, and 6.3 % non-evangelical Christian. More details of the demographic characteristics of the sample are shown in Table 1.

Measures

Participants were presented with a questionnaire consisting of the newly developed Transgender Attitude and Beliefs Scale (TABS) and four standardized self-report measures along with eight demographic questions (sex, age, ethnicity, education, marital status, religious affiliation, gender identification, and contact with transgender persons) in order to test the psychometric properties of TABS. Specifically, the Attitudes Toward Transgender Individuals Scale (ATTI) and the Genderism and Transphobia Scale (GTS) were included to test for convergent validity, and the Rosenberg Self-Esteem Scale (RSES) and the short form of the Marlowe–Crowne Social Desirability Scale (M-C SDS) were utilized to test discriminant validity (see descriptions of each scale below). As in Phase 1, participants were given the same definition of transgender.

Attitudes Toward Transgender Individuals Scale (ATTI)

The ATTI is a single-factor 20-item scale developed by Walch et al. (2012), assessing transgender stigma. Participants were asked to rate items such as “Transgender individuals should not be allowed to cross dress in public” on a five-point Likert scale, ranging from 1 = strongly agree to 5 = strongly disagree. After reversing negatively worded items, higher scores reflect greater acceptance of transgender persons. The scale has demonstrated reliability (α = .95) as well as evidence of convergent and discriminant validity.
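The reverse-scoring step described above can be sketched as follows. This is a generic illustration for a five-point response format; which items are reverse-scored is not specified here, so the item positions in the example are hypothetical, as are the function names.

```python
def reverse_code(response, scale_min=1, scale_max=5):
    """Mirror a Likert response around the midpoint of the scale."""
    if not scale_min <= response <= scale_max:
        raise ValueError("response lies outside the scale range")
    return scale_min + scale_max - response


def total_score(responses, reversed_positions, scale_min=1, scale_max=5):
    """Sum responses after reverse-coding the items at the given positions."""
    return sum(
        reverse_code(r, scale_min, scale_max) if i in reversed_positions else r
        for i, r in enumerate(responses)
    )


# Hypothetical respondent: three items, the first one reverse-scored.
score = total_score([1, 5, 3], reversed_positions={0})
```

The same helper applies to the seven-point and four-point formats of the other measures by changing scale_max.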

Genderism and Transphobia Scale (GTS)

The GTS is a two-factor 32-item scale developed by Hill and Willoughby (2005), measuring “violence, harassment, and discrimination toward cross-dressers, transgenderists, and transsexuals” without the use of explicit labels. Questions such as “I have beat up men who act like sissies,” “Feminine boys should be cured of their problems,” and “God made two sexes only” are used to measure the latent variable. Responses were rated on a seven-point Likert scale, ranging from 1 = strongly agree to 7 = strongly disagree, with a higher overall score reflecting greater tolerance (in attitude and behavior) of gender nonconforming individuals. According to Hill and Willoughby, the measure demonstrates strong internal consistency (α = .94–.96 overall) along with evidence of convergent and discriminant validity.

Rosenberg Self-Esteem Scale (RSES)

The RSES is a widely used measure of global self-esteem developed by Rosenberg (1989). It is a 10-item scale with statements such as “I am able to do things as well as most other people” and “I certainly feel useless at times,” which are rated on a four-point Likert scale, ranging from 1 = strongly agree to 4 = strongly disagree. After reverse coding negatively worded items, a higher score denotes greater self-esteem. The scale has demonstrated evidence of reliability (average Cronbach’s alpha value of .81) and validity in multiple studies across multiple cultures (Schmitt & Allik, 2005).

Marlowe–Crowne Social Desirability Scale

The original Marlowe–Crowne Social Desirability Scale (M-C SDS) is a 33-item scale developed by Crowne and Marlowe (1960), assessing participants’ tendency to provide a socially desirable response, using a true–false response format. Some questions in the scale include: “There have been occasions when I took advantage of someone” and “I’m always willing to admit it when I make a mistake.” The scale is reported to have high internal consistency (K-R20 = .88) and strong evidence of convergent and discriminant validity (Crowne & Marlowe, 1960). The study at hand utilized the 13-item short form C developed by Reynolds (1982), which strongly correlates with the original version (r = .93).

Procedure

Just as in the first phase of the study, participants were recruited through MTurk, and the five scales (in the order of TABS, GTS, ATTI, RSES, and M-C SDS), along with the informed consent, were made available to participants through the MTurk interface. Again, the study was set up in such a way that clicking on the “next” button at the end of the informed consent would indicate participants’ agreement and subsequently direct informed participants to the questionnaire available on a secure webpage. The question order in TABS was first randomized using a random integer set generator, and all participants were presented the items in the same randomized order. Based on findings from Meade and Craig (2012), a total of five attention check items were included in the questionnaire in order to safeguard against careless participants. In this phase of the study, the number of attention check items was raised from three (Phase 1) to five due to the inclusion of additional measures, which increased the overall number of items answered by the participants. The survey building software, Qualtrics, was used to create the questionnaire, and the “request response” function, which generates an alert to participants when there are unanswered questions, was also enabled in order to minimize inadvertent item nonresponse. Participants were paid $.80 to complete the questionnaire, a rate based on other survey studies of similar length made available on MTurk by other researchers. The survey took an average of 13 min to complete.

Results

Replication of Factor Structure Through Exploratory Factor Analysis (EFA)

First, data screening was performed to ensure that the collected data met key assumptions (sufficient sample size, interval-level scale, and multivariate normality) required for factor analysis using maximum likelihood (ML) estimation (Brown, 2015). Following data screening, FACTOR (Lorenzo-Seva & Ferrando, 2015) was used to run an exploratory factor analysis to examine whether the factor structure of TABS from Phase 1 of the study would be reproduced with the independent sample. The analysis yielded a three-factor structure, replicating the results of the first EFA with the exception of one item (“I have a hard time respecting transgender individuals”) cross-loading on Factors 1 and 2 instead of yielding a simple structure. Because the item loaded similarly on both factors (.42 on one and .48 on the other), the item was retained in the factor where there was a closer conceptual fit. Factor loadings (between .419 and .946), reliability estimates (Cronbach’s alpha values: Factor 1 α = .98, Factor 2 α = .97, Factor 3 α = .96), and model fit indices (RMSEA = .07, CFI = .94, NNFI = .93, RMSR = .02) were largely comparable to the results of Phase 1 of the study. RMSEA was slightly higher in the second sample than in the original sample, though still within the acceptable range (Brown, 2015), suggesting a generally stable factor structure of TABS.

Confirmatory Factor Analysis (CFA)

The researchers then conducted a CFA using SPSS Amos 22.0 with ML estimation to further assess the stability of the factor structure of the refined scale. Based on prior evidence from the original EFA and theory bearing on the multidimensionality of attitudinal scales, a model with three factors was specified in which 16 indicators loaded on Factor 1 (interpersonal comfort), 11 indicators on Factor 2 (sex/gender beliefs), and six indicators on Factor 3 (human value). In the measurement model, measurement errors were presumed to be uncorrelated and no indicator double-loadings were permitted. The three factors (interpersonal comfort, sex/gender beliefs, and human value) were permitted to correlate based on evidence of factor interrelatedness from the original EFA. Accordingly, the model was over-identified with 492 df and yielded a model fit approaching the acceptable range: RMSEA = .09, TLI = .90, CFI = .90. Modification indices and standardized residuals were then examined to identify localized areas of strain. Item Q3.25 was eliminated because of high modification index values, and CFA was rerun, which produced a better model fit. The same procedure was repeated three additional times, where, in each round, an item with a high modification index value was eliminated, each iteration producing increasingly better model fit. The item elimination process was not pursued beyond the fourth iteration, as the fifth produced a poorer fit. In this manner, a total of four items were eliminated from the scale. The eliminated items include the following: (1) “I have a hard time respecting transgender individuals”; (2) “Transgender individuals should have the same access to healthcare benefits as any other person”; (3) “If I was introduced to a transgender person at a party, I would feel comfortable having a polite conversation with that person”; and (4) “Even if someone has sex reassignment surgery, they are still the biological sex they were born as.”
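The 492 degrees of freedom reported for the initial model can be reproduced with a simple parameter count. The sketch below assumes the conventional marker-variable identification (one loading per factor fixed to 1) and a covariance-only model with no mean structure; these assumptions and the function name are ours, not details stated for the Amos specification.

```python
def cfa_degrees_of_freedom(items_per_factor, n_error_covariances=0):
    """Degrees of freedom for a simple-structure CFA with correlated factors.

    Assumes (hypothetically) one marker loading per factor fixed to 1 and
    no mean structure, so only variances and covariances are modeled.
    """
    p = sum(items_per_factor)            # number of indicators
    m = len(items_per_factor)            # number of factors
    observed_moments = p * (p + 1) // 2  # unique variances and covariances
    free_parameters = (
        (p - m)                  # free loadings (one per factor is fixed)
        + p                      # error variances
        + m                      # factor variances
        + m * (m - 1) // 2       # factor covariances
        + n_error_covariances    # any correlated errors that are freed
    )
    return observed_moments - free_parameters


# Initial model: 16, 11, and 6 indicators on the three factors.
df_initial = cfa_degrees_of_freedom([16, 11, 6])  # 561 - 69 = 492
```

Each item removed or error covariance freed changes this count, which is why the degrees of freedom differ between the initial and the revised model.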

At this point, modification indices were examined to consider possible error covariances that would further improve model fit. Two errors, e8 and e2, were permitted to correlate because similar wording and close conceptual correspondence between the two items gave reason to expect error covariance (1. “I would feel comfortable having a transgender person into my home for a meal”; 2. “If my child brought home a transgender friend, I would be comfortable having that person into my home”). A CFA with the revised parameter specifications did yield a better model fit. Two additional errors, e24 and e21, were also permitted to correlate for the same reason. The items associated with these errors were: (1) “All adults should identify as either male or female”; (2) “Humanity is only male or female; there is nothing in between.” The revised model (see Fig. 1), specifying three factors (with 14 items loading on Factor 1, 10 items on Factor 2, and five items on Factor 3), three factor covariances, and two error covariances (e8 and e2, e21 and e24), resulted in an interpretable model that sufficiently reproduced the observed relationships among indicators: χ2(df = 372, p < .001) = 897.02, RMSEA = .07 (90 % CI .07–.08), CFI = .94, TLI = .93, and SRMR = .05. Additionally, each of the 29 items had a moderate to high factor loading, ranging from .43 to .94, suggesting that the indicators were related to their purported factors (see Table 3). The factor correlations among the three subscales also supported the multidimensional conceptualization of the construct of interest: interpersonal comfort and sex/gender beliefs, r = .85; interpersonal comfort and human value, r = .77; sex/gender beliefs and human value, r = .62. The reliability estimate for each factor was high (α = .97 for Factor 1, α = .95 for Factor 2, and α = .93 for Factor 3), indicating high internal consistency of each subscale. Cronbach’s alpha for the overall scale was .98, also demonstrating the reliability of the overall scale. The possible raw range for the subscales was as follows: interpersonal comfort = 84 points (from 14 to 98); sex/gender beliefs = 60 points (from 10 to 70); human value = 30 points (from 5 to 35). The means and SDs of the factor scores for the three subscales by gender are shown in Table 4.
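The subscale ranges and reliability estimates above follow from standard formulas, which the sketch below illustrates. It is an illustration only and not the authors' analysis: the function names are hypothetical, the 1-to-7 response format is taken from the scale description in the Discussion, and the toy response matrix is invented to show how the function is called.

```python
from statistics import pvariance


def score_range(n_items, low=1, high=7):
    """Minimum, maximum, and span of a summed Likert subscale."""
    minimum, maximum = n_items * low, n_items * high
    return minimum, maximum, maximum - minimum


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Population variances are used throughout; the choice of population vs.
    sample variance cancels in the ratio as long as it is consistent.
    """
    k = len(responses[0])
    item_variances = [pvariance([row[i] for row in responses]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Subscale sizes from the final 29-item model.
for name, n_items in [("interpersonal comfort", 14),
                      ("sex/gender beliefs", 10),
                      ("human value", 5)]:
    print(name, score_range(n_items))

# Hypothetical toy data (4 respondents x 3 items), NOT study data.
toy_responses = [[1, 2, 1], [4, 4, 5], [7, 6, 7], [3, 3, 2]]
toy_alpha = cronbach_alpha(toy_responses)
```

With 14, 10, and 5 items the ranges come out as 14 to 98, 10 to 70, and 5 to 35, matching the values reported above.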

Fig. 1 Visual representation of TABS CFA model

Table 3 Factor loadings for the final 29-item TABS
Table 4 Mean and SD of subscales

Construct Validity

To evaluate the convergent validity of the new scale, Pearson correlations between TABS and two previously validated transgender attitude measures (ATTI and GTS) were examined. Because higher scores on all three scales indicate more favorable attitudes toward transgender individuals, it was expected that TABS would correlate strongly and positively with both the ATTI and the GTS. As expected, TABS correlated strongly in the predicted direction with the GTS, r(236) = .88, p < .001, and the ATTI, r(236) = .95, p < .001, thus demonstrating its convergent validity.

Discriminant validity of TABS was evaluated by examining the correlation coefficients between TABS and two scales assessing constructs that are theoretically unrelated to attitudes toward transgender persons. It was expected that TABS would not correlate significantly with either the RSES, measuring global self-esteem, or the M-C SDS, assessing social desirability. As expected, TABS was essentially uncorrelated with the RSES, r(236) = −.02, p = .715, and the M-C SDS, r(236) = .04, p = .529, thereby demonstrating discriminant validity. The near-zero correlation between scores on TABS and the M-C SDS is noteworthy, as it suggests that responses to TABS were not driven by social desirability.
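For readers less familiar with the r(df) notation used above, the sketch below shows how a Pearson coefficient and its test statistic are computed. It is a generic illustration with hypothetical function names, not the authors' analysis script. It assumes a simple bivariate correlation, for which df = n − 2, so that r(236) corresponds to 238 paired scores.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def t_statistic(r, n):
    """t statistic and degrees of freedom for testing H0: rho = 0."""
    df = n - 2
    return r * math.sqrt(df / (1 - r * r)), df


# n = 238 is inferred from the reported df of 236 (an assumption).
t_convergent, df = t_statistic(0.88, 238)      # TABS with GTS
t_discriminant, _ = t_statistic(-0.02, 238)    # TABS with RSES
print(df, round(t_convergent, 1), round(t_discriminant, 2))
```

A coefficient of .88 with 236 df gives a very large t statistic, whereas −.02 gives one close to zero, which is consistent with the p values reported for the convergent and discriminant comparisons.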

Discussion

Findings from the present study suggest that our Transgender Attitude and Beliefs Scale (TABS) is a psychometrically sound, multi-dimensional instrument with demonstrated reliability and validity. Alpha coefficients were high for each factor and for the overall scale, and factor loadings were moderate to high on all indicators. TABS also shows evidence of construct validity, performing as expected on tests of convergent and discriminant validity against theoretically related and unrelated constructs.

TABS exhibits improvements over previous transgender attitude scales in at least four ways. First, unlike previous studies, the present study was conducted with a non-college sample, improving the generalizability of the scale. Second, TABS reflects the multi-dimensional conceptualization of attitudes toward sexual minorities, which is increasingly recognized in the literature (Fyfe, 1983; Hong, 1983; LaMar & Kite, 1998; McNaught, 1997; Mohr, 2002; Worthington et al., 2005) but lacking in extant transgender attitude scales. Factor analyses established and confirmed the three-factor structure of the new scale. Third, TABS, as a three-dimensional scale consisting of 29 questions, is the briefest multi-dimensional transgender attitude measure presently available, given that the shortest scale (TS with 9 questions) is a one-factor scale and the GTS, while bi-dimensional, contains 33 questions (Hill & Willoughby, 2005; Nagoshi et al., 2008). Fourth, TABS is contextually relevant in several respects, which are discussed below.

One of the major concerns addressed by TABS is the need for an updated, timely measure, a quality deemed consequential for attitudinal measures (Herek, 1994; Worthington et al., 2005) and made even more important by the rapidly increasing awareness of transgender-related issues in the general US population. Specifically, TABS contains items pertaining to transgender civil rights, which are not found in extant validated transgender attitude scales and which begin to capture the current public discussion surrounding the civil rights of transgender persons (Currah, 2008; Davis, 2009; Gluckman & Trudeau, 2003). The inclusion of items such as “If a transgender person identifies as female, she should have the right to marry a man” and “Transgender individuals should have the same access to housing as any other person” reflects TABS’ sensitivity to the recent, focused attention given to transgender issues in the US.

Another major point of concern addressed by TABS is the necessity of taking into account the religious climate of the population, which research has corroborated as vital in attitudinal studies for a nuanced, thus more accurate, measure of the construct. More specifically, because TABS was designed for the U.S. population, the binary view of humanity and the intrinsic value of human beings—beliefs held by evangelical Christians—are incorporated in the scale. What is more, factor analyses confirmed two distinct factors reflecting the two notions: one containing items pertaining to sex/gender beliefs (e.g., “Humanity is only male or female, there is nothing in between”) and the other consisting of statements regarding human value (e.g., “Transgender individuals are valuable human beings regardless of how I feel about transgenderism”).

The contextually nuanced nature of TABS is further supported by the results of the study. For instance, while previous studies have presented a monolithic picture of religious people holding negative attitudes toward transgender persons, results from TABS show more attitudinal variability in the responses of the religious subgroup. Specifically, evangelical Christians displayed overwhelming endorsement of the fundamental value of transgender persons. Their mean rating of the statement, “Transgender individuals are valuable human beings regardless of how I feel about transgenderism,” was 6.14 (SD = 1.11) on a scale ranging from 1 to 7, with 78.8 % choosing agree or strongly agree. In contrast, for statements regarding interpersonal comfort, the mean ratings were lower, ranging between 4.13 and 5.48, and more variable (SD = 1.72–2.10). Similarly, the mean rating of a civil rights item, “If a transgender person identifies as female, she should have the right to marry a man,” was 4.82 (SD = 1.93), with nearly a third of the participants disagreeing with the statement. These preliminary findings suggest that evangelical Christians firmly hold to the intrinsic value of the person, though their ratings are lower on matters of transgender civil rights and the degree of comfort in associating with transgender individuals. This is a markedly different picture of conservative Christians’ attitudes toward the transgender population than previous transgender scales have suggested, demonstrating TABS’ ability to capture subtle but significant attitude variability while replicating some negative attitudes.

A number of limitations should be considered, particularly the relatively sparse demographic information collected from our participants and the use of non-representative samples (albeit more representative than those used in previous transgender attitude scale validation studies). For instance, the present study did not collect information about several demographic characteristics that may have been useful, such as race (only ethnicity was recorded), socioeconomic status, urban/rural distinctions, sexual orientation, or political leanings. Likewise, there was a small skew toward evangelical Christians in this study (36.6 % of the Phase 1 sample and 41.6 % of the Phase 2 sample vs. 25–35 % of the general U.S. population), which may limit generalizability and suggests the need for more specifically stratified sampling in future studies (Institute for the Study of American Evangelicals, 2015; Pew Research Center, 2015). The samples in the two phases of the current study were also slightly skewed in other ways. Specifically, 55.3 % of the first sample and 55.5 % of the second were female, which is higher than the 50.5 % of the general U.S. population, and the present samples did not include any transgender participants. The study also did not include a measure of test–retest reliability, which would have strengthened the validation process and is called for in future studies utilizing TABS. Finally, future studies should also take into account measures of religiosity (i.e., the extent of practice or perceived importance of religion) and/or behavioral markers (e.g., weekly church attendance).

Despite the limitations, TABS exhibits evidence of psychometric strength, with demonstrated improvements over prior measures in its contextual relevance and capacity to assess multiple dimensions of beliefs and attitudes. There is therefore strong indication that TABS will serve as an effective tool with various subsets of the population. For example, there is a complete absence of data-driven studies examining U.S. evangelical Christians’ beliefs and attitudes toward transgender persons, despite the fact that they have been the most politically vocal sector of Christianity and have strongly influenced the setting of social norms throughout the history of the nation. The use of TABS with this group would be particularly appropriate in that the instrument was specifically designed with sensitivity to Christian beliefs. With the increasing visibility of the transgender population and their expected need for both mental and physical care, TABS would also be appropriate for examining healthcare professionals’ attitudes toward transgender persons. Similarly, TABS has utility in educational institutions for assessing transgender receptivity at the student, faculty, and administration levels. Given the growing number of young people identifying as gender nonconforming, the demand for climate research in educational settings is likely to rise. Additionally, employing TABS to explore possible correlates of attitudes toward transgender persons, such as age, gender, education, and contact with sexual and gender minorities, is also warranted. The three-dimensional structure of TABS, tapping into interpersonal comfort, sex/gender beliefs, and human value, also lends itself to an examination of the possible relationship between a person’s views of gender and of the value of human beings and their level of comfort with transgender persons. While the definition of “transgender” in the present study was intentionally broad, it may be narrowed for research purposes. For example, there may be value in exploring possible variations in attitudes depending on stage of transition (pre- vs. postoperative), subgroup (MTF, FTM, gender-fluid, etc.), and age (children, adolescents, adults) of transgender persons.