In recent years, there has been strong clinical and research interest in the overlap between gender variance (GV) and autism spectrum disorder (ASD) (for review, see van der Miesen et al. 2016). GV is a term that is often used in the current literature to describe an individual’s variation in gender role behaviors, deviating from culturally specific gender norms (Adelson 2012). GV does not imply having distress and, thus, is different from gender dysphoria (GD; Adelson 2012). In the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), GD refers to significant levels of distress associated with the incongruence between one’s birth-assigned gender (i.e. male or female) and one’s experienced gender (American Psychiatric Association; APA 2013). Although a number of reports suggest that childhood GV does not always persist as GD later in life (de Vries et al. 2010; Steensma et al. 2011; Wallien and Cohen-Kettenis 2008), high levels of GV are often predictive of a potential GD diagnosis (Coleman et al. 2012). In the DSM-5 (APA 2013), ASD is classified as a neurodevelopmental condition. The DSM-5 criteria for an ASD diagnosis include significant impairments in social communication and interaction (i.e. difficulties in social-emotional reciprocity, nonverbal expression and understanding, understanding and maintaining social relationships) as well as restricted, repetitive patterns of behaviors, specific interests, and idiosyncratic sensory experiences (APA 2013).

Evidence suggests that the prevalence of both GD and ASD has been increasing independently in the general population (Lai et al. 2014; Loomes et al. 2017; Meerwijk and Sevelius 2017; Zucker 2005, 2017). Recent prevalence rates of GD, transgender identity, or GV, among adolescent and adult populations are estimated to be 0.2–1.1% (Flores et al. 2016; Kuyper and Wijsen 2014; Meerwijk and Sevelius 2017); however, the prevalence of individuals referred to or undergoing treatment at specialty gender identity clinics is approximately 1:14,705 in birth-assigned males and 1:38,461 in birth-assigned females (Arcelus et al. 2015; Zucker 2017). With regard to ASD, recent studies estimated a prevalence rate of approximately 1–1.5% in the general population, with males being 3–4 times more likely to be diagnosed with ASD (Christensen et al. 2016; Loomes et al. 2017; Ofner et al. 2018). Interestingly, although GD and ASD are two distinct conditions and appear to reflect a minority of the general population, various quantitative studies reported elevated ASD characteristics among those clinically referred for GD as well as elevated GV among individuals diagnosed with ASD (for review, see van der Miesen et al. 2016).

Gender Variance and ASD Characteristics in Prior Clinical Studies

Among recent quantitative studies, de Vries et al. (2010) conducted the only systematic study on the prevalence of ASD diagnoses among children (i.e. under the age of 12) and adolescents (i.e. between 12 and 18 years of age) with GD. Using the Dutch version of the Diagnostic Interview for Social and Communication Disorders-10th revision (DISCO-10; Wing 1999) to assess ASD diagnoses, this study reported an ASD rate of 6.4% and 9.4% for the pre-pubescent child and the pubescent adolescent samples, respectively. These rates were considered higher than the estimated 1–1.5% population prevalence rate (Christensen et al. 2016; de Vries et al. 2010; Ofner et al. 2018).

Another approach has been to examine the presence of ASD-related traits (i.e. ASD characteristics or autistic(-like) traits) among individuals with GD. Using the Autism-Spectrum Quotient (AQ; Baron-Cohen et al. 2001), two adult studies found elevated ASD characteristics among individuals with GD relative to typically developing adults (Jones et al. 2012; Pasterski et al. 2014), and one of these studies reported elevated ASD characteristics among birth-assigned females with GD compared to birth-assigned males with GD (Jones et al. 2012). Among children and adolescents with GD, studies examined clinical-range (mild/moderate or severe) ASD scores on the Social Responsiveness Scale (SRS; Constantino and Gruber 2005). In one sample of children and adolescents 5–18 years of age, 27.1% scored in the mild/moderate range and 27.1% scored in the severe range (Skagerberg et al. 2015). Another study of children less than or equal to 12 years of age reported that 44.9% fell within the clinical range (VanderLaan et al. 2015).

A recent clinical study used the Children’s Social Behavior Questionnaire (CSBQ; Hartman et al. 2006) to assess various domains of ASD characteristics in a sample of children and adolescents with GD in comparison to a typically developing control group and a group diagnosed with ASD (van der Miesen et al. 2018a). The domains examined included: (1) the extent of situation-appropriate behavior and emotions, (2) responses to social situations, (3) orientation in time and place, (4) understanding social information, (5) stereotyped movements and atypical responses from these senses, and (6) resistance to change. Overall, children and adolescents with GD had more ASD characteristics across all domains compared to typically developing children and adolescents but less than children and adolescents with an ASD diagnosis.

While the aforementioned studies focused on individuals with GD, other studies have examined whether individuals diagnosed with ASD exhibit increased GV. To date, three quantitative studies examined GV in clinical samples of children and adolescents 6–18 years of age with ASD using parental endorsement of the “wishes to be of the opposite sex” item (Item 110) in the most recent version of the Child Behavior Checklist (CBCL; Achenbach and Rescorla 2001). Strang et al. (2014) compared children and adolescents with ASD to those diagnosed with attention-deficit/hyperactivity disorder (ADHD) and other neurodevelopmental disorders as well as two typically developing groups. Significantly higher rates of GV were found among children with ASD (5.4%) and ADHD (4.8%) compared to two typically developing groups (0–0.7%). No significant differences were found between the other neurodevelopmental group (1.7%) and the typically developing groups. Janssen et al. (2016) found that GV was endorsed for 5.1% of children and adolescents diagnosed with ASD compared to 0.7% of the typically developing sample. A more recent study also found a significantly higher rate of GV among children and adolescents diagnosed with ASD (4.0%) compared to a typically developing group (0.7%), but not compared to those clinically referred for other mental health concerns (4.0%; May et al. 2017). These findings are paralleled by adolescent and adult studies showing greater gender and sexual orientation diversity among samples of individuals with ASD (Dewinter et al. 2017; George and Stokes 2017, 2018a, b; van der Miesen et al. 2018b).

Gender Variance and ASD Characteristics in Nonclinical Studies

Several community studies suggested that, independently, both GV and ASD characteristics are not uncommon in nonclinical populations. Nevertheless, no study has investigated the relation between GV and ASD characteristics in nonclinical, community samples to date. ASD characteristics are continuously distributed in the general population (Baron-Cohen et al. 2014; Constantino and Todd 2003; Hoekstra et al. 2007; Hurst et al. 2007). For example, Constantino and Todd (2003) used the SRS to quantitatively measure ASD characteristics in twins aged 7–15 years old from the general population and found ASD characteristics to be continuously distributed, with 1.4% of boys and 0.3% of girls scoring within the clinical range for ASD. With regard to GV, van Beijsterveldt et al. (2006) used the earlier version of the CBCL (Achenbach 1991) to measure parent-reported endorsement of Item 5 (i.e. “behaves like the opposite sex”) and Item 110 among a nonclinical population of children. They reported 3–5% of the girls and 2–3% of the boys behaved like the opposite sex, and approximately 1% of boys and girls wished to be of the opposite sex. Other child- and parent-report studies also provided prevalence estimates ranging between 2.3 and 6% for GV in the community (Coolidge et al. 2002; Martin et al. 2017). Using a similar approach in a young adult university-based sample, Lai et al. (2010) reported 7.3% of birth-assigned females and 1.9% of birth-assigned males reported “I wish I was the opposite sex.” Taken together, both GV and ASD characteristics are not specific to only clinical populations. As such, GV and ASD characteristics may be associated more generally, rather than simply within the clinical contexts in which they have been studied to date.

Gender Variance and Clinical Diagnoses Other than ASD

While a number of studies support the potential association between GD/GV and ASD/ASD characteristics, Turban and van Schalkwyk (2018) note that the specificity of this correlation has not been sufficiently established as few studies have examined GV among other clinical populations. As such, there is insufficient evidence to suggest that elevated GD/GV is specific to ASD. For example, there might be other emotional challenges (e.g. body dissatisfaction; van de Grift et al. 2017) and environmental factors associated with GV such as those relating to minority stress (e.g. poor peer relations, familial non-acceptance; e.g. Baams et al. 2013; Landolt et al. 2004) that result in elevated scores on measures of social impairment. In addition, a small number of case reports and quantitative studies point to other clinical populations in which GV may be elevated. For instance, Strang et al. (2014) explored rates of GV as measured by parental endorsement of cross-sex wishes on the CBCL in children with ASD, ADHD, or other neurodevelopmental conditions. Childhood GV was associated with ASD and ADHD, but the latter finding has not been replicated.

Further, based on past literature, one might expect an over-representation of gender-variant individuals among other clinical populations, especially those who experience internalizing conditions (e.g. anxiety or depression). Previous studies suggest children who exhibit GV are more likely to experience peer difficulties, including ridicule, corrections, and rejection (e.g. Berndt and Heller 1986; Carter and McCloskey 1984; Wallien et al. 2010; Zucker et al. 1995; Zucker 2005). Indeed, there is evidence to suggest that experienced social stigma was associated with poorer psychological well-being, including distress, depression and anxiety among transgender and/or gender-variant individuals (Baams et al. 2013; Bockting et al. 2013; Coolidge et al. 2002; de Vries et al. 2016; Landolt et al. 2004; Li et al. 2016; Lippa 2008; Shiffman et al. 2016; Toomey et al. 2010). Correspondingly, several studies found that associations between GD/GV and internalizing challenges in youth were mediated by poor peer relationships (e.g. Coolidge et al. 2002; de Vries et al. 2016; Landolt et al. 2004; Shiffman et al. 2016; van Beijsterveldt et al. 2006). Given these patterns, it may be relatively more common for individuals who express GV to be at risk for stigma-related psychological distress and, therefore, to have received a mental health diagnosis such as anxiety or depression. Based on small clinic samples and individual cases, it has also been hypothesized that conduct disorders, mood/anxiety disorders, obsessive–compulsive disorder, heightened sensory sensitivities, and learning disabilities may be associated with increased GV (e.g. Kaltiala-Heino et al. 2015; Landen and Rasmussen 1997; Tateno et al. 2008; Williams et al. 1996; Wood and Halder 2014); however, these hypotheses are best regarded as highly tentative in the absence of more rigorous empirical data.

The Present Study

The overall goal of the present study was to advance our understanding of the association between GV and developmental/mental health conditions, with particular attention to ASD. In comparison to previous studies examining the GV-ASD link, this study focused on parent-reported responses for a relatively younger sample of children between 6 and 12 years of age (N = 2445). The use of online recruitment also allowed for obtaining a large community-based sample that provided considerably higher statistical power than previous studies in this field. Another novel aspect of the present study included the use of more comprehensive measures for both GV and ASD. In particular, whereas previous studies examining GV in relation to ASD relied on a single-item measure of cross-sex wishes (e.g. Janssen et al. 2016; May et al. 2017; Strang et al. 2014), this study employed the Gender Identity Questionnaire for Children (GIQC; Johnson et al. 2004)—a well-validated and highly sensitive measure of GV that relates to the DSM criteria for GD and assesses several gender-typed domains relevant to children (e.g. toy, playmate, activity, clothing and hairstyle preferences). Likewise, the CSBQ was employed to examine potential associations between distinct domains of parent-reported ASD characteristics and GV. In addition, the present study reports on levels of GV in relation to a variety of developmental/mental health diagnoses to provide insight regarding the specificity of the association between GV and ASD.

Method

Participants and Procedure

Parents of children (6–12 years of age) were recruited to participate in a parent-report online questionnaire from June to December 2016. To participate in this study, parents were required to be a minimum age of 18 years, to be proficient in English, and to have at least one child aged 6–12 years. Participant recruitment was conducted online using Facebook and Kijiji, as well as by contacting Canadian organizations and facilities (e.g. recreational community centers). As part of a larger study on gender expression and psychological well-being, advertisements used to recruit parents for the study included the tag-line “Survey on Child Well-Being.” A majority of participants who filled out the questionnaire reported that they were recruited via Facebook (79.3%) where study advertisements were viewed by 182,887 people, and the survey link was clicked 3119 times. Of the remaining participants who completed the questionnaire, 68 reported they were recruited from Kijiji (1.1%) or other sources (i.e. word of mouth, community organizations, local advertisements; 1.1%), and 573 participants (18.5%) did not report where they learned about the study. Participants were given a link to an anonymous survey that was administered using Qualtrics Survey Software and were required to provide informed consent before completing the survey online. Once the survey was complete, participants were compensated for their time by being given the opportunity to enter a prize draw with a 1-in-100 chance of winning a $100 e-gift card of their choice (e.g., Amazon). The present study was approved by the University of Toronto Research Ethics Board.

Parent-reported responses were collected for a total sample of 3,097 children; however, 652 participants were excluded because they did not provide adequate information on the focal measure of GV investigated in this study and/or key demographic variables such as the child’s age and birth-assigned gender. Thus, the final sample included parent-reports on 2445 children. Of the total responses included in this study, 96% of the data were obtained from maternal reports (n = 2347) and the remaining 4% (n = 98) were from paternal reports; and 5.4% of the children (n = 132) lived in the same household (e.g. siblings).

Parents were asked whether their child had received a developmental/mental health diagnosis from a healthcare professional, and if they indicated “yes” then they were asked to list all such diagnoses. Children (Mean age = 8.83 years, SD = 1.99) were categorized as either typically developing (i.e. from the nonclinical subgroup; n = 2004; 1022 girls, 982 boys) or as part of a clinical subgroup of the population meaning that they had (according to parent-report) at least one of eight categories of developmental/mental health diagnoses (n = 441; 165 girls, 276 boys): autism spectrum disorder (ASD; n = 80; 23 girls, 57 boys), attention-deficit/hyperactivity disorder (ADHD; n = 247; 81 girls, 166 boys), oppositional defiant disorder (ODD; n = 39; 16 girls, 23 boys), obsessive–compulsive disorder (OCD; n = 17; 4 girls, 13 boys), sensory processing disorder (SPD; n = 28; 8 girls, 20 boys), mood and anxiety disorders (n = 140; 64 girls, 76 boys), learning disabilities (n = 51; 25 girls, 26 boys) and other neurodevelopmental conditions for which participant numbers were relatively small (n = 22; 6 girls, 19 boys). This latter category included Tourette syndrome (n = 8; 2 girls, 6 boys), developmental coordination disorder (n = 2; 2 boys), developmental delay disorder (n = 2; 2 boys), speech sound disorder (n = 1; 1 boy), fetal-alcohol spectrum disorder (n = 2; 1 girl, 1 boy), executive functioning disorder (n = 3; 3 boys), and auditory processing disorder (n = 4; 2 girls, 2 boys).

Measures

Demographic Information

Parents responded to questions regarding their own and their child’s demographic characteristics. The demographic variables of interest included the child’s birth-assigned gender (coded as females/girls = 0 and males/boys = 1), age, average school performance (based on letter grades coded as 1 = D, 2 = C, 3 = B, 4 = A), location (i.e. Western Canada, Eastern Canada, Quebec, Ontario), area of residence (i.e. rural, urban, suburban), parent marital status, ethnicity, annual household income in Canadian dollars, and predominant religious background.

Table 1 presents the demographic characteristics for boys and girls separately for the clinical and nonclinical subgroups, respectively. Overall, girls had a higher school performance, independent t(2200) = 3.07, p = 0.02, Cohen’s d = 0.13; however, the boys and girls did not significantly differ on any other demographic variable (see Table 1).

Table 1 Descriptive statistics for demographic variables for the clinical and nonclinical subgroup of children

Gender Variance

The Gender Identity Questionnaire for Children (GIQC) is a 16-item standardized parent-report questionnaire rated on a 5-point Likert scale and is used to assess gender-typed behavior in children (Johnson et al. 2004). The rating scale ranges from 1 (stereotypically opposite to birth-assigned gender) to 5 (stereotypically same birth-assigned gender). Based on the factor analysis of Johnson et al. (2004), items 8 and 16 were left out of the analysis as the one-factor solution containing the remaining 14-items was found to best fit the data, accounting for 43.7% of the variance. Typically, a lower GIQC total score is indicative of increased GV; however, for the present study, scoring for the GIQC was reversed to allow for easier interpretation of the results (i.e. higher scores reflect elevated GV). The GIQC is psychometrically valid. For the present study, the Cronbach’s alphas were found to be 0.77 and 0.64 for birth-assigned boys and girls, respectively. Additionally, the GIQC has been found to show negligible age effects and is known to effectively encompass most of the core behavioral characteristics of those with GD (Johnson et al. 2004).
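To make the scoring logic concrete, the following is a minimal sketch of reverse-scoring the 14 retained items and of the standard Cronbach's alpha formula. It is an illustration under stated assumptions, not the authors' scoring script: averaging the items (rather than summing them), the helper names, and the example ratings are all hypothetical.

```python
from statistics import mean, variance

EXCLUDED_ITEMS = {8, 16}  # items left out following Johnson et al. (2004)


def reverse_score(rating, low=1, high=5):
    """Flip a Likert rating so that higher values indicate more gender variance."""
    return low + high - rating


def giqc_score(ratings):
    """Mean of the reverse-scored retained items.

    `ratings` maps item number (1-16) to the raw 1-5 rating.
    Averaging rather than summing is an assumption of this sketch.
    """
    kept = [reverse_score(r) for item, r in sorted(ratings.items())
            if item not in EXCLUDED_ITEMS]
    return mean(kept)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical parent report: every item rated 4, except item 3 rated 2.
raw = {item: 4 for item in range(1, 17)}
raw[3] = 2
print(round(giqc_score(raw), 2))  # 2.14
```

Because reversing every item leaves the item variances and covariances unchanged, alpha is the same whether it is computed on raw or reverse-scored ratings.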

ASD Characteristics

The refined version of the Children’s Social Behavior Questionnaire (CSBQ) is a 49-item parent-report questionnaire used to assess characteristics of ASD (Hartman et al. 2006). Parents were asked to rate their child’s behaviors during the preceding 2 months on a 3-point Likert scale ranging from 0 (does not apply) to 2 (clearly or often applies). The CSBQ has six subscales measuring different domains:

1. The not tuned subscale describes a child’s difficulty adapting their emotional and behavioral reactions appropriately to social context. An example item is “quickly gets angry.”

2. The social subscale included items characterizing responses to social contact, social needs, and initiation of contact. A sample item is “does not begin to play with other children.”

3. The orientation subscale reflects the inability for a child to adequately orient self in time, place or in relation to others. For example, “takes in information with difficulty.”

4. The understanding subscale assesses whether the child experiences any difficulty understanding social information such as “takes things literally, e.g. does not understand certain expressions.”

5. The stereotyped subscale represents stereotyped behavioral reactions, a preoccupation with objects and sensory stimuli, and atypical responses to information from the senses. For instance, a sample item includes “constantly feels objects.”

6. The change subscale refers to a fear of new situations and resistance to change, which are assessed using items such as “remains clammed up in new situations or if change occurs.”

With respect to its psychometric properties, the CSBQ has good test–retest and inter-rater reliability as well as internal and discriminant validity when used on various groups of children (i.e. PDD-NOS, high functioning autism and clinical controls; Luteijn et al. 2000). In terms of the validity of the subscales, Hartman et al. (2006) performed several factor analyses to support the discrete subscales; and items were removed to increase the distinctiveness of the subscales. Each subscale also had a high conceptual validity. Pertaining to the current study, Cronbach’s alpha for the CSBQ total score was found to be 0.95 and ranged between 0.78 and 0.89 for each of the individual subscales. In addition, unlike other instruments assessing ASD characteristics, the CSBQ is not directly related to the DSM-criteria of ASD (de Bildt et al. 2009). Further, each of the subscales showed large effect size differences when comparing the neurotypical to the PDD-NOS and to high functioning autism groups, respectively. Thus, this instrument and its subscales provide a more sensitive screening measure of milder expressions of ASD and ASD characteristics that are present in the nonclinical population (de Bildt et al. 2009; Hartman et al. 2006; Luteijn et al. 2000).

Behavioral and Emotional Challenges

The Child Behavior Checklist (CBCL) is a standardized parent-report questionnaire used to assess a broad range of behavioral and emotional challenges (Achenbach and Rescorla 2001). The CBCL consists of 113 items, and parents were asked to rate their child’s behavior over the past 2 months on a 3-point Likert scale ranging from 0 (not true) to 2 (very or often true) (Achenbach and Rescorla 2001). T-scores derived from the parents’ answers were calculated as age-standardized measures of children’s total behavioral and emotional challenges. This measure has extensive normative data available, and it is a well-validated and reliable measure (Nakamura et al. 2009). The CBCL was found to have a high internal consistency (α = 0.78–0.97) and inter-rater reliability (ICC = 0.95; Achenbach and Rescorla 2001).

Statistical Analysis

All statistical analyses were conducted using SPSS version 24. To assess the relationship between GV and ASD characteristics in a nonclinical sample, relations between CSBQ total and subscale scores and GIQC scores were examined. For the nonclinical sample, the analysis was limited to children whose parents did not report a history of developmental/mental health diagnoses and completed all items pertaining to the CSBQ (n = 1851; 949 girls, 902 boys). Zero-order correlations were conducted with the CSBQ total score, the six CSBQ subscales, and demographic variables including the child’s birth-assigned gender, age, annual income and school performance as well as the CBCL T-score. Doing so conveyed the correlation structure among these variables and also identified which demographic variables were associated with the GIQC and should, therefore, be included as control variables. Second, a multiple linear regression predicting GIQC scores was conducted in which all variables identified as needing to be controlled (Step 1) and each of the CSBQ subscales (Step 2) were included as predictors. In a third step, the interactions between gender and CSBQ subscales that predicted elevations in GIQC scores were entered as predictors. Participants for whom data were missing for any of the control variables included in the regression were deleted pairwise from this analysis.

For the analyses regarding GV and developmental/mental health diagnoses, the same approach was used. Zero-order correlations were employed across the entire sample to examine whether GV was associated with each of the particular diagnoses as well as to determine whether birth-assigned gender, age, annual income, or school performance were associated with GIQC scores. The demographic variables that were significantly correlated with the GIQC scores were identified as relevant control variables. All relevant control variables were included in a step-wise multiple linear regression examining unique associations between various diagnostic categories and GIQC scores, relative to the reference group of children who did not have a developmental/mental health diagnosis. In addition, for any diagnosis that was found to have a significantly unique association with elevated GIQC scores, the possible moderating influence of gender was assessed by including their respective interactions with gender as predictor variables in an additional step in the regression model. Again, any participants for whom data were missing for any of the control variables included in the regression were deleted pairwise from this analysis.

Statistical significance was initially evaluated at a conventional critical value of 0.05. For CSBQ subscales and developmental/mental health diagnoses that were unique predictors of GIQC scores in the regression analyses, we performed the Benjamini-Hochberg (1995) procedure for multiple comparisons to inform the possibility of false discoveries (i.e. Type I error). For these analyses, we set the false discovery rate (FDR) at 5%.
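The Benjamini-Hochberg step-up procedure can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study (which was run in SPSS); the function names are hypothetical, and the example p-values are invented for illustration, only loosely patterned on the kind of values reported in the Results.

```python
def bh_critical_values(m, fdr=0.05):
    """Critical value (i / m) * fdr for each rank i = 1..m."""
    return [i / m * fdr for i in range(1, m + 1)]


def benjamini_hochberg(p_values, fdr=0.05):
    """Return booleans marking which tests survive the FDR criterion."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda idx: p_values[idx])
    crit = bh_critical_values(m, fdr)
    # Largest rank k whose ordered p-value is at or below its critical value.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= crit[rank - 1]:
            k = rank
    keep = [False] * m
    for idx in order[:k]:
        keep[idx] = True
    return keep


# Invented p-values for six tests (illustration only, not the study's data).
p = [0.0005, 0.003, 0.036, 0.20, 0.45, 0.80]
print([round(c, 3) for c in bh_critical_values(6)][:3])  # [0.008, 0.017, 0.025]
print(benjamini_hochberg(p))  # [True, True, False, False, False, False]
```

With six tests and a 5% FDR, the first three critical values round to 0.008, 0.017, and 0.025; with seven tests they round to 0.007, 0.014, and 0.021.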

Results

Gender Differences in Gender Variance

Overall, the mean GIQC score of girls (mean, M = 2.11, standard deviation, SD = 0.40) was significantly higher than that of boys (M = 1.97, SD = 0.37; p < 0.05, Cohen’s d = 0.36). In the nonclinical subgroup, the mean GIQC score of girls (M = 2.10, SD = 0.39) was significantly higher than that of boys (M = 1.96, SD = 0.37; p < 0.05, Cohen’s d = 0.37). Similar results were found in the clinical subgroup where the mean GIQC score of girls (M = 2.16, SD = 0.43) was significantly higher than that of boys (M = 2.01, SD = 0.35; p < 0.05, Cohen’s d = 0.38).
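As a rough check on the effect sizes above, Cohen's d can be recomputed from the rounded means and SDs. The sketch below assumes the simple root-mean-square form of the pooled SD; the text does not state which pooling formula was used, and a sample-size-weighted pooled SD could give slightly different values.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference using the root-mean-square of the two SDs."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# (girls' M, girls' SD, boys' M, boys' SD), as reported in the text.
groups = {
    "full sample": (2.11, 0.40, 1.97, 0.37),
    "nonclinical": (2.10, 0.39, 1.96, 0.37),
    "clinical": (2.16, 0.43, 2.01, 0.35),
}
for label, stats in groups.items():
    print(label, round(cohens_d(*stats), 2))
# full sample 0.36
# nonclinical 0.37
# clinical 0.38
```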

Gender Variance and ASD Characteristics in Nonclinical Children

For the nonclinical subgroup, the mean total CSBQ score and the mean CSBQ subscale scores for boys and girls separately are presented in Table 2. Table 3 shows the zero-order correlations between GIQC and CSBQ scores, as well as possible control variables including birth-assigned gender, income, school performance, and CBCL total T-scores. Higher CSBQ total scores and higher GIQC scores were both associated with increased emotional and behavioral challenges based on the CBCL total T-scores, and birth-assigned gender was significantly associated with the GIQC. As such, these two variables were controlled for subsequently.

Table 2 Means and standard deviations for the CSBQ total and subscales scores of birth-assigned boys and girls in the nonclinical subgroup of children
Table 3 Zero order correlations among study variables for CSBQ total and subscales among the nonclinical subgroup

Table 4 shows the multiple linear regression analysis predicting GIQC scores. Predictors included the CBCL total T-score and gender (Step 1), and the CSBQ subscales (Step 2). In Step 2, the not tuned subscale was significantly associated with GIQC scores in the negative direction; the orientation and stereotyped subscales were significant positive predictors of GIQC scores. Step 3 of the regression assessed the possible moderating effect of gender in predicting GIQC scores on the orientation and stereotyped subscales and found no significant moderating effects. Using the Benjamini–Hochberg procedure with a 5% FDR, the orientation (p < 0.001) and not tuned (p = 0.003) subscale effects remained significant relative to their respective corresponding adjusted critical alphas of 0.008 and 0.017. In contrast, the p-value for the stereotyped subscale effect (p = 0.036) was not lower than the corresponding adjusted critical alpha (α = 0.025). Thus, at a 5% FDR, it may be that this latter effect is a case of Type I error.

Table 4 Multiple linear regression analyses comparing the effect of ASD characteristics on predicting gender variance

Gender Variance and Clinical Diagnoses

Table 5 shows the zero-order correlations between GV and primary variables of interest in the clinical subgroup of children. The GIQC was found to have a significant negative correlation with birth-assigned gender. Controlling for gender (Step 1), a step-wise multiple linear regression analysis was conducted (Table 6). In Step 2, ASD, SPD, and ODD were unique significant predictors of elevated GIQC scores. Step 3 included the interaction of gender with ASD, SPD, and ODD, respectively, and found no moderating influence of gender on the associations between these diagnoses and GIQC scores. Using the Benjamini–Hochberg procedure with a 5% FDR, the effects for ASD (p < 0.001), SPD (p = 0.009), and ODD (p = 0.011) remained significant relative to their respective corresponding adjusted critical alphas of 0.007, 0.014, and 0.021. Figure 1 presents the mean GIQC score for the nonclinical sample and each of the individual diagnostic categories as well as the significant independent effects found in the multiple linear regression.

Table 5 Zero-order correlations among study variables and diagnoses for the full sample
Table 6 Multiple linear regression comparing the effect of each diagnostic category on predicting gender variance
Fig. 1 Mean (standard error) GIQC scores for the nonclinical group and each clinical diagnosis category. Asterisks indicate groups that significantly differed from the reference group (i.e. nonclinical group) in a multiple linear regression (p-value < 0.05). ODD oppositional defiant disorder, ASD autism spectrum disorder, OCD obsessive–compulsive disorder, and ADHD attention-deficit/hyperactivity disorder

Further, to discern whether GV was distinct from ASD characteristics within the SPD and ODD groups, we evaluated the correlations between the CSBQ subscales and GIQC within these two groups, respectively. For the SPD group, the change subscale was negatively associated with the GIQC (r = −0.41, p < 0.05), but no other subscale showed a significant association (range of r = −0.25 to 0.08, all ns). For the ODD group, the GIQC was positively associated with the understanding subscale of the CSBQ (r = 0.36, p < 0.05), but no other subscale was significantly associated with the GIQC (range of r = −0.07 to 0.21, all ns).

Discussion

In a large community sample of children ages 6–12 years, this study examined two important issues relevant to childhood GV and ASD characteristics. First, it examined whether ASD characteristics were associated with GV beyond clinical populations, and whether particular domains of ASD characteristics contributed to this association. Second, it examined whether clinical diagnoses other than ASD might also be associated with GV.

We found a positive association between characteristics of ASD and GV in this nonclinical sample, indicating that such associations exist beyond the clinical domain (i.e. beyond those who have clinical diagnoses of ASD and/or GD). In particular, the orientation and stereotyped subscales of the CSBQ were found to be unique predictors of increased GV; however, when a 5% FDR criterion was applied, the stereotyped subscale effect was shown to potentially be a case of Type I error. With regard to children with a developmental/mental health clinical diagnosis, we found that ASD (GIQC: M = 2.18, SD = 0.44), SPD (GIQC: M = 2.24, SD = 0.42), and ODD (GIQC: M = 2.19, SD = 0.46) all showed significantly higher levels of parent-reported GV compared to the reference group of children without reported clinical diagnoses (GIQC: M = 2.01, SD = 0.39). For the GIQC scale, as used here, a cut-off score of 2.46 shows good sensitivity (86.8%) and specificity (95%) for identifying children clinic-referred for gender-related concerns (Johnson et al. 2004). The children in our sample with ASD, SPD, and ODD were 0.63, 0.52, and 0.59 SDs from this cut-off, respectively, as compared to 1.10 SDs for the nonclinical sample of children. As such, the GV of the average child diagnosed with ASD, SPD, or ODD appears to be intermediate between that of typically developing children and those clinic-referred for GD.
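The distance of each group mean from the cut-off, in within-group SD units, can be sketched as (cut-off − M) / SD. The snippet below is an illustration using only the rounded means and SDs quoted above; the function name is hypothetical, and because the inputs are rounded (and the exact SD used in the original calculation is not stated), the recomputed values need not match the reported figures exactly.

```python
CUTOFF = 2.46  # GIQC cut-off quoted in the text (Johnson et al. 2004)

# (mean GIQC, SD) per group, as reported in the text.
groups = {
    "ASD": (2.18, 0.44),
    "SPD": (2.24, 0.42),
    "ODD": (2.19, 0.46),
    "nonclinical": (2.01, 0.39),
}


def sd_units_below_cutoff(mean, sd, cutoff=CUTOFF):
    """How many within-group SDs the group mean lies below the cut-off."""
    return (cutoff - mean) / sd


distances = {label: sd_units_below_cutoff(m, sd) for label, (m, sd) in groups.items()}
for label, value in sorted(distances.items(), key=lambda kv: kv[1]):
    print(f"{label}: {value:.2f} SDs below the cut-off")
```

Whatever the exact rounding, the ordering is the point of the comparison: the three diagnostic groups sit closer to the cut-off than the nonclinical group does.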

Possible Interpretations of the Present Findings

In the nonclinical sample, elevations in GV were associated with the CSBQ orientation subscale and, more tentatively, the stereotyped subscale. These scales measure difficulty orienting one’s attention to social cues as well as atypical responses to sensory input (i.e. over- or under-responsivity), respectively. It should be noted, however, that these findings are only partly in line with a recent study of children clinically referred for GD, which reported elevations in all six domains of ASD characteristics using the CSBQ (van der Miesen et al. 2018a). As such, GV in gender-referred children might be broadly associated with ASD characteristics, whereas elevated GV in non-referred children may not be related to all facets of ASD.

Still, the present findings were consistent with some hypotheses proposed in the prior literature. The association between GV and the stereotyped subscale of the CSBQ is consistent with the idea that strong gender-related interests in children might be related to preferences for specific kinds of sensory stimulation (e.g. silky materials, bright and shiny objects; de Vries et al. 2010; Tateno et al. 2008; Williams et al. 1996). Children who score higher on the orientation subscale may be less inclined to attend to social cues, including those about gender, which might influence development in domains like gender expression and gender-stereotyped beliefs (Martin and Ruble 2004; Strang et al. 2014). Alternatively, stigmatization of childhood GV may lead children to feel unaccepted, thus lowering their inclination to socialize. That said, the social subscale, which measures the child’s tendencies to seek out and engage in social interactions, was not significantly associated with GIQC scores, which raises some doubt about whether a relative lack of social engagement underpins the orientation subscale’s association with GV. Further, certain facets of ASD might have a different association with gender expression because the not tuned subscale (i.e. difficulties adapting one’s emotional and behavioral reactions to a social context) predicted decreases, rather than increases, in GV.

Among the parent-reported developmental/mental health diagnoses significantly associated with increased GV, ASD showed the largest effect size. This finding replicated the results of prior studies showing elevated wishes to be of the opposite gender among children with ASD (Janssen et al. 2016; May et al. 2017; Strang et al. 2014), but with a more refined measure of GV. However, Turban and van Schalkwyk (2018) noted that associations between social impairments and GD/GV may be nonspecific to ASD. In line with this argument, increased levels of GV were also found among children with a diagnosis of SPD or ODD.

SPD is not currently in the DSM but has been conceptualized as a complex condition in the child development literature (Miller and Schaaf 2008). Like the present finding regarding the stereotyped subscale, the association between SPD and elevated GV is in line with the hypothesis that some individuals experience altered sensitivity towards certain sensory input (e.g. tactile sensations) that intersects with gender expression. Yet, because there is considerable overlap in characteristics between SPD and ASD (i.e. specific interests, idiosyncratic sensory experiences; Schoen et al. 2009), it is not clear whether our finding regarding SPD and GV is simply an extension of the GV-ASD link.

The association between GV and ODD corresponds to previous reports of youth referred for GD who either had an ODD diagnosis prior to their referral (Kaltiala-Heino et al. 2015) or met the criteria for ODD based on age-appropriate diagnostic interviews (Drummond et al. 2017; Wallien et al. 2007). The understanding subscale of the CSBQ, however, was positively associated with GV in the ODD group, suggesting that elevations in GV might have stemmed from associations with ASD characteristics within the ODD group.

No associations with GV were observed for the several other clinical diagnoses examined here. Finding no association between GV and ADHD is of particular interest given prior quantitative work showing an increased co-occurrence of ADHD and GV in clinic-referred children and adolescents (de Vries et al. 2011; Strang et al. 2014; Wallien et al. 2007). This lack of replication could be due, in whole or in part, to methodological differences, including the way in which the presence of an ADHD diagnosis was determined. Also, Strang et al. used the single-item measure of GV from the CBCL, focusing more narrowly on wishes to be of the opposite gender. In contrast, the present study used the GIQC as a more comprehensive measure for distinguishing levels of gender expression (Johnson et al. 2004). Furthermore, inconsistencies between these two studies may be due to differences in sample characteristics. Strang et al. examined a clinical sample of both children and adolescents from a hospital database, whereas the present study was conducted online and consisted only of children.

It is important to note that the observed associations cannot be taken as reflecting causal relations. Other common underlying mechanisms may contribute to the purported GD/GV-ASD link. For instance, biological factors might contribute to the association between GD/GV and ASD. The extreme male brain (EMB) theory (Baron-Cohen 2002; Baron-Cohen et al. 2014), for example, posits that the cognitive characteristics of ASD reflect an ‘extreme-male’ profile based on normative gender differences in the relative discrepancy in empathizing/systemizing abilities (i.e. on average, greater systemizing over empathizing abilities in boys and vice versa in girls). At the level of potential underlying mechanisms, the fetal androgen theory hypothesizes that such differences in cognitive profiles stem from exposure to differing levels of androgens prenatally (e.g. higher levels of testosterone experienced by boys in utero), which may partly contribute to the higher prevalence of ASD in males than in females (Auyeung et al. 2013; Baron-Cohen et al. 2011). As such, the fetal androgen theory may explain the co-occurrence of ASD and GV characteristics, particularly in birth-assigned females. Some research shows that birth-assigned females who experience GD have elevated ASD characteristics compared to birth-assigned males with GD (Jones et al. 2012) and are similar to typically developing boys with regard to scores on empathizing abilities (Di Ceglie et al. 2014). Additional research examined birth weight as a biological factor relevant to the GD/GV-ASD link (VanderLaan et al. 2015). Relatively higher birth weight has been correlated with masculinized somatic features in females (Avidime et al. 2011) and lower testosterone in the pre/perinatal period in males (Carlsen et al. 2006). In gender-referred children, higher birth weight and greater GV were associated with clinical-range ASD characteristics (VanderLaan et al. 2015).

Limitations and Future Directions

This study was correlational and cross-sectional in its approach. Thus, the ability to control for confounding variables and draw conclusions regarding causation or time-related developmental pathways is limited. Other limitations relate to the method of data collection. Online recruitment may have produced coverage and self-selection biases. The use of parent-report measures may have introduced other kinds of biases (e.g. socially desirable responding). Retrospective reports of developmental/mental health diagnoses are less rigorous than diagnoses established through a diagnostic interview. As such, associations between GV and ASD characteristics as well as between GV and various developmental/mental health diagnoses should be interpreted cautiously. Moreover, it will be important for future research to replicate the current findings before any firm conclusions can be drawn.

There are also limits to the generalizability of the findings. The sample sizes for the clinical diagnosis groups were variable (e.g. ranging from n = 17 for the OCD group to n = 248 for the ADHD group) and the heterogeneous neurodevelopmental group consisted of a wide range of neurodevelopmental diagnoses that may be better assessed as different clinical groups. Larger and more even sample sizes would have provided more representative groups and better comparisons. Also, although the GIQC is a well-validated and highly sensitive measure of GV, it is not equivalent to a GD diagnosis (Johnson et al. 2004) and the present findings were obtained from a nonclinical sample. Thus, while the present findings might have some implications for the relationship between GV and ASD, their relevance to GD in clinical populations is unknown; however, they might guide future research to better understand this relationship in clinical contexts. These findings also provide some tentative evidence to suggest clinicians working with children with SPD or ODD should be mindful of possible GV, similar to the recommendations for working clinically with adolescents diagnosed with ASD (Strang et al. 2016).

This study was limited to children ages 6–12 years and parents did not report on their child’s pubertal status; however, the relationship between GV and ASD exists in clinical populations of adolescents and adults as well (de Vries et al. 2010; Dewinter et al. 2017; George and Stokes 2018a, b; Janssen et al. 2016; Jones et al. 2012; May et al. 2017; Pasterski et al. 2014; Skagerberg et al. 2015; Strang et al. 2014; van der Miesen et al. 2018b). Feelings of GD/GV or underlying motives for expressing a wish to be of the other gender are thought to be experienced differently before vs. after the onset of puberty (Burke et al. 2014; Steensma et al. 2011; Strang et al. 2016) and a number of studies suggest that GD in prepubertal children is less likely to persist beyond puberty whereas adolescents with GD are more likely to pursue treatment options to alter primary and/or secondary sex characteristics (for discussion, see de Vries et al. 2010; Wallien and Cohen-Kettenis 2008). Future research should evaluate to what extent, if any, the present findings extend to other age groups.

Conclusion

The present study found that characteristics of ASD were associated with elevations in GV in a nonclinical community child sample. This association was mainly attributable to certain facets of ASD, namely deficits in social orienting and possibly also stereotyped behaviors. Further, this study documented higher levels of GV among children with parent-reported diagnoses of ASD, SPD, or ODD. These findings suggest that the GD/GV-ASD link might reflect a more general pattern that exists beyond clinical populations. Also, elevated GV may not be unique to ASD populations, but might be found among other clinical populations as well.