Autism spectrum (AS) refers to a group of closely related neurodevelopmental conditions, characterised by impairments in social communication and interaction, repetitive behaviours, and specific interests (APA 2013). In the United Kingdom (UK), prevalence rates of AS during childhood have been estimated at 1% (Baron-Cohen et al. 2009). During recent years, the presence of AS diagnoses and/or features in children and young people (CYP) accessing specialist paediatric gender services has gained increased attention (Nordahl-Hansen et al. 2019). This is largely due to a rise in international evidence that has identified a significantly higher rate of clinically diagnosed AS in clinic referred, gender diverse CYP, compared to the general population (Shumer et al. 2016; Kaltiala-Heino et al. 2015; Skagerberg et al. 2015; De Vries et al. 2010).

The national Gender Identity Development Service (GIDS) is the only child and adolescent specialist gender service in the UK, commissioned through the National Health Service (NHS). The primary aim of the GIDS is to facilitate holistic exploration of gender identity development, to mitigate any associated behavioural, emotional and relationship difficulties, and to promote wellbeing (Di Ceglie 1998). The broader context is acknowledged at all stages of assessment and treatment, particularly in regard to the CYP’s experience of living with a diverse gender identification and in the development of their identity. A proportion of CYP attending the GIDS may present with symptoms of gender dysphoria (GD), a term used to describe clinical distress or impairment in several important areas of functioning, due to an incongruence between birth assigned sex and experienced gender (American Psychiatric Association [APA] 2013).

Consistent with studies of cis-gender samples, research has documented an overrepresentation of clinical AS diagnoses in birth assigned male (AM) compared to birth assigned female (AF) clinic referred, gender diverse CYP, with early studies proposing a ratio of 3:1 respectively (De Vries et al. 2010). However, growing evidence has identified how current diagnostic methods and screening tools may be insufficient in capturing the cis-female experience of AS which, as a result, may inflate the sex imbalance in cis-gender prevalence studies (Halladay et al. 2015; Kirkovski et al. 2013; Loomes et al. 2017). For instance, contemporary literature proposes that features associated with AS may be more frequently “camouflaged” in cis-female individuals, which may partially account for lower prevalence rates reported in this group (Dean et al. 2017; Kirkovski et al. 2013; Lai et al. 2017). Whether or not this phenomenon is observed in clinic referred, gender diverse CYP is currently unknown.

Though standardised screening tools for AS have not yet been validated for use in clinic referred, gender diverse CYP, they have been utilised by researchers to better understand this association. Several investigations have demonstrated significantly elevated scores on subscales pertaining to core AS symptomatology, such as restrictive and repetitive behaviours and interests (RRBI) (Van der Miesen et al. 2018; Skagerberg et al. 2015). Increased incidences of socially related symptoms that are commonly seen in broader AS presentations, such as differences in social cognition and communication, have also been reported (Strang et al. 2018; Van der Miesen et al. 2018). Taken together, this appears to provide support for an over-representation of key features associated with AS amongst clinic referred, gender diverse CYP.

To the best of the authors’ knowledge, two large scale studies have demonstrated differences in the presentation of features associated with AS amongst clinic referred, gender diverse CYP, according to birth assigned sex. In a sample of 248 AM (mean age = 10.1, SD = 3.79) and 242 AF (mean age = 12.1, SD = 3.39) participants diagnosed with GD, Van der Miesen et al. (2018) identified several differences in scores between the two groups using the Children’s Social Behavior Questionnaire (CSBQ). Whilst less reciprocated social behaviour and social interests were reported in AF participants, AMs appeared to score higher on the stereotyped subscale, indicating more stereotyped behaviours and sensory sensitivity in this group. In a later study of 61 clinic referred, gender diverse CYP (45 AM; 16 AF; M = 7.97 years; range = 4.08–12.95), Leef et al. (2019) found a main effect of sex on the Social Responsiveness Scale 2nd Edition (SRS-2) total scores and for each subdomain, whereby AF participants scored higher than AM participants.

Some scholars have argued that high scores on AS screening tools within this clinical population may reflect social impairments that arise due to factors including emotional and behavioural difficulty, minority stress and poor peer relationships (Turban 2018; Turban and van Schalkwyk 2018; Skagerberg et al. 2015). Descriptive studies of clinic referred, gender diverse CYP in Europe and North America suggest that 40–45% present with a psychiatric comorbidity (Kaltiala-Heino et al. 2018; Zucker et al. 2012). In addition, a wealth of international evidence has documented a high prevalence rate of comorbid difficulties including bullying, depression, anxiety, self-harm and suicidal ideation (Spack et al. 2012; Zucker et al. 2012; Holt et al. 2016). This raises the question of whether high scores exhibited by clinic referred, gender diverse CYP on AS screening tools result from gender diversity impacting upon psychosocial aspects of development and wellbeing, from difficulties associated with social stress including minority stress and poor peer relationships (Turban 2018; Turban and van Schalkwyk 2018), or whether they are indicative of a narrow AS phenotype.

For clinic referred, gender diverse CYP, pubertal suppression by means of gonadotropin-releasing hormone analogues (GnRHa) may be considered to temporarily halt physical development associated with puberty. GnRHa treatment has gained increasing acceptance internationally; its clinical rationale is to reduce distress associated with pubertal bodily changes, whilst also providing space and time to explore gender identity (Coleman et al. 2012; Hembree 2011). For CYP accessing GnRHa, pubertal stage is important and likely to affect psychological functioning. Accessing GnRHa in early puberty, and then choosing to access sex hormone treatment, is likely to result in less surgery and a closer likeness to the identified gender, which may in turn lead to improved wellbeing and/or peer relationships. Accessing GnRHa in later puberty, when sex characteristics have already developed, may not affect wellbeing in the same way.

If we consider features of AS as influenced by psychosocial wellbeing (Turban 2018), it is important to note that early studies have documented improvements in some aspects of psychological functioning after accessing GnRHa for GD (Costa et al. 2015; De Vries et al. 2011, 2014; Turban et al. 2020). In a study of 70 CYP (AM = 33, AF = 37, mean age = 13.6, range = 11.1–17, SD = 1.8), De Vries et al. (2011) described enhanced behavioural and emotional functioning measured by the Youth Self Report (YSR) and parent-report Child Behaviour Checklist (CBCL), significantly fewer depressive symptoms on the Beck Depression Inventory II (BDI-II), and improved global functioning measured by the Children’s Global Assessment Scale (CGAS) after approximately 2 years of accessing GnRHa. Thus, improvements in SRS-2 scores (if considered as features of AS) may be expected with GnRHa uptake due to improvements in psychosocial wellbeing for this younger cohort. However, it is important to note that the extent to which these findings provide support for psychological improvements resulting from GnRHa treatment is ambiguous due to the confounding effects of completing measures close to starting sex hormone treatment. If, however, AS features reflect underlying neurodevelopmental differences rather than psychosocial wellbeing, it may be assumed that scores would remain consistent over time.

Aims

To date, there is an absence of research exploring the impact of GnRHa on social communication difficulties in clinic referred, gender diverse CYP. Available literature has suggested that high scores on AS screening tools are an artefact of GD, rather than a true reflection of AS (Skagerberg, Di Ceglie and Carmichael 2015), and there remains no longitudinal evidence from either clinic referred or general population samples. The current study aimed to investigate whether scores on the SRS-2 change over time for gender diverse adolescents after accessing GnRHa alongside psychosocial support for a year at the GIDS. Owing to previous literature indicating improvements in psychosocial functioning, and due to the non-specific nature of many AS screening tools, it was hypothesised that the scores on the SRS-2 would decrease following 1 year accessing GnRHa, for both AM and AF participants. This analysis was further explored by birth assigned sex to understand whether differences were present.

Methods

Participants

Following a comprehensive psychosocial assessment at the GIDS, adolescents who met strict eligibility criteria, and who had a desire to access the medical pathway, were referred to a Paediatric Endocrine Clinic at either the University College London Hospital (UCLH) or Leeds General Infirmary (LGI). An assessment of suitability for GnRHa was established by an endocrine specialist prior to treatment prescription. The eligibility criteria for the present study were established in accordance with treatment guidelines available at the time (Coleman et al. 2012; Hembree 2011); the individual must have demonstrated an enduring pattern of gender nonconformity or gender dysphoria (APA 2013), a desire for puberty suppression, and sufficient therapeutic engagement at the GIDS. Furthermore, they must have met the minimum stage of puberty (Tanner Stage 2–3) and been assessed as having Gillick competence. Lastly, any contra-indicatory medical issues were addressed by the multidisciplinary team. Consent to access GnRHa treatment was obtained from a caregiver and the young person at UCLH as part of initial check-ups to determine Tanner Stage eligibility and other inclusion criteria. An a priori power calculation using G*Power, for a medium effect (f = 0.25) and 80% power, indicated that a sample size of 98 was required for the analysis. For further information about the eligibility and exclusion criteria, see supplementary material.
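To make the planned effect size concrete, the sketch below converts Cohen's f into the proportion of variance explained (eta squared), using the standard relation eta squared = f² / (1 + f²), and computes how far an achieved sample falls short of a planned one. It does not reproduce the G*Power calculation itself, which also depends on settings not reported here (for example, the assumed correlation between repeated measures). The function names are hypothetical and chosen only for illustration.

```python
def cohens_f_to_eta_squared(f: float) -> float:
    """Convert Cohen's f to eta squared (proportion of variance explained)."""
    return f ** 2 / (1.0 + f ** 2)


def shortfall(required_n: int, achieved_n: int) -> int:
    """Additional cases needed to reach the planned sample size (0 if met)."""
    return max(required_n - achieved_n, 0)


# f = 0.25 is the "medium" effect used in the power calculation above.
eta_sq = cohens_f_to_eta_squared(0.25)
print(f"eta squared for f = 0.25: {eta_sq:.4f}")

# 98 required versus the 95 participants with complete SRS-2 data.
print(f"shortfall: {shortfall(98, 95)} cases")
```

On this relation, a medium effect of f = 0.25 corresponds to roughly 6% of variance explained, which gives a sense of the size of change the study was planned to detect.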

Measures

The Social Responsiveness Scale 2 (SRS-2) School Age Form (Constantino and Gruber 2012) was utilised in this study to measure features typically associated with AS presentations. The screening tool has been validated for use in 4–18 year olds (Constantino and Gruber 2012) and comprises a total of 65 items, all of which contain Likert scale responses ranging from ‘0’ (not true) to ‘3’ (almost always). The subscales include social awareness, social cognition, social communication, social motivation and autistic mannerisms (restricted interests and repetitive behaviour). In addition to the five subscale scores, the SRS-2 yields a total severity T score (SRS-2 Total). The SRS-2 total T score indicates the degree of social communication difficulties. T scores of 59 and below are interpreted as being within the normal range and are not generally indicative of social communication deficiencies; T scores of 60 to 65 fall within the mild range and are generally suggestive of subtle social communication deficits; T scores that fall between 66 and 75 are thought to highlight the presence of moderate impairments in reciprocal social behaviour; and a T score of 76 or above is classified as severe and considered to be highly associated with an AS diagnosis. The SRS-2 was scored according to norm data based on birth assigned sex. Prior research has demonstrated good psychometric properties and cross-cultural validity of this measure (Frazier et al. 2014). The SRS-2 has been used to measure ‘features of AS’ in studies investigating the co-occurrence of AS and GD in young people (Leef et al. 2019; Skagerberg, Di Ceglie and Carmichael 2015).
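The T score bands described above can be expressed as a simple lookup. The following is a minimal sketch only: the function name is hypothetical, and the handling of fractional values (such as group means falling between 59 and 60) is an assumption, since the bands are stated for whole-number T scores.

```python
def classify_srs2_t_score(t_score: float) -> str:
    """Map an SRS-2 T score to the severity band described in the text.

    Bands: <= 59 normal, 60-65 mild, 66-75 moderate, >= 76 severe.
    Assumption: fractional values (e.g. a mean of 59.5) are assigned to the
    lower band until they reach the next whole-number threshold.
    """
    if t_score < 60:
        return "normal"
    if t_score < 66:
        return "mild"
    if t_score < 76:
        return "moderate"
    return "severe"


# Illustrative values only; these are not scores from the study.
for example in (52, 60, 65, 66, 75, 76):
    print(example, classify_srs2_t_score(example))
```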

Procedure

The SRS-2 assessment tool was completed by the participant’s primary caregiver at two time points. Questionnaires including the SRS-2, amongst other psychological measures, were sent via post at baseline (prior to commencing endocrine treatment) and again after accessing GnRHa for 1 year (± 3 months). If the questionnaires were not completed within 1 month of receipt, they were sent once every month for the following 2 months before being recorded as missing data. Participants were made aware of the right to withdraw their participation from completing questionnaires at any time by a cover letter included with the questionnaires. All data was anonymised prior to analysis. Owing to anonymisation of data, exemption for ethics was confirmed by external and local ethics committees affiliated with the Tavistock and Portman NHS Research and Development Department.

Statistical Analysis

To assess changes in the SRS-2 scores, each subscale T score, along with the total T score, was analysed in six separate two-factor repeated measures ANOVAs to determine differences in change over time. Birth assigned sex was included as a between-subjects factor in these analyses. Effect sizes are reported using Cohen’s d and 95% confidence intervals are stated. Standard error is stated where appropriate. Data is included for participants meeting inclusion criteria up to the point of analysis (March 2020). Data is shown for 95 participants due to incomplete measures for the remaining participants. All analyses were conducted using IBM SPSS 25.
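For readers less familiar with this design, the sketch below shows how the three F statistics of a two-group, two-time-point mixed ANOVA can be obtained from change scores and per-participant averages. It relies on the known equivalence between each effect in a 2 × 2 mixed design and a squared t statistic, and assumes unweighted marginal means for the time effect. It is an illustration only, not the SPSS procedure used in the study; the function names are hypothetical and the data are synthetic.

```python
from statistics import mean, variance


def _pooled_variance(a, b):
    """Pooled sample variance of two independent groups."""
    na, nb = len(a), len(b)
    return ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)


def mixed_anova_2x2(group_a, group_b):
    """F statistics for a 2 (group) x 2 (time) mixed design.

    Each group is a list of (baseline, follow_up) pairs, one per participant.
    Every F has (1, N - 2) degrees of freedom, where N is the total sample.
    """
    na, nb = len(group_a), len(group_b)
    scale = 1 / na + 1 / nb

    # Change over time per participant (within-subject information).
    diff_a = [post - pre for pre, post in group_a]
    diff_b = [post - pre for pre, post in group_b]
    # Average across time per participant (between-subject information).
    avg_a = [(pre + post) / 2 for pre, post in group_a]
    avg_b = [(pre + post) / 2 for pre, post in group_b]

    var_diff = _pooled_variance(diff_a, diff_b)
    var_avg = _pooled_variance(avg_a, avg_b)

    # Interaction: do the two groups change by different amounts?
    f_interaction = (mean(diff_a) - mean(diff_b)) ** 2 / (var_diff * scale)
    # Time: is the unweighted average change across groups non-zero?
    f_time = ((mean(diff_a) + mean(diff_b)) / 2) ** 2 / (var_diff * scale / 4)
    # Group: do the groups differ when averaged over both time points?
    f_group = (mean(avg_a) - mean(avg_b)) ** 2 / (var_avg * scale)

    return {
        "time": f_time,
        "group": f_group,
        "time_x_group": f_interaction,
        "df": (1, na + nb - 2),
    }


# Synthetic T scores for illustration only; not data from the study.
group_one = [(60, 63), (55, 55), (70, 72), (65, 64)]
group_two = [(58, 60), (62, 61), (66, 67), (72, 72)]
result = mixed_anova_2x2(group_one, group_two)
print(result)
```

The sketch stops at the F statistics because the Python standard library has no F distribution; p-values would need a statistics package such as SciPy.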

Results

Participants

A sample of 122 adolescents was identified as fitting the initial inclusion criteria. Age at which GnRHa consent was taken ranged from 9.9 to 15.9 years old (mean age: 13.6 ± SEM: 0.11). It is important to note that the age at consent to access GnRHa did not reflect the age at prescription and first use of blockers; prescribing was often held by local GP practices, and details regarding first injection dates are therefore not available, as they were not reported to UCLH or the GIDS. Age at second time point of completion ranged from 10.9 to 16.6 years old (mean age: 14.6 ± SEM: 0.13). Data is presented for adolescents referred as part of the under-16s pathway.

Stage of puberty was determined using Tanner Stage taken at UCLH or LGI and was available for 97 participants (see Table 1 for further breakdown by birth assigned sex). About a third of the participants within this sample (34%) were in Tanner Stage 2/3 (also known as early to mid-puberty), and more AFs were in the later stages of pubertal development, with 38.1% in Tanner Stage 3–5 (mid- to post-puberty), whilst 17.5% of AMs were at these later stages of development. This is in line with adolescent development, as AFs are more likely to complete puberty earlier than AMs (Tanner 1981). Because many of the young people in this sample were at a later stage of pubertal development (Tanner Stage 5; T5), a further breakdown of mean scores over time by Tanner Stage is presented in Supplementary Table S1. Analysis by Tanner Stage was non-significant across time for all young people, but it is important to note that this may be due to low power related to sample size.

Table 1 Tanner stage at time of consent to GnRHa in the present sample (N = 97)

Of this sample, 95 (38 AMs (40%) and 57 AFs (60%); mean age ± SEM at consent for GnRHa 13.6 ± 0.11) had completed measures of the SRS-2 at baseline and after one year (± 3 months) on the puberty blocker.

Social Responsiveness Subscales and Assigned Gender

No significant differences across time or between birth assigned sexes, and no interactions between time and birth assigned sex, were found for subscale or total scores on the SRS-2 (see Tables 2, 3, Fig. 1). Indeed, on average, AMs were more often within the normal range for t-score cut offs across all subscales and total scores (mean scores; Table 2), whilst AFs were largely within the normal range across the majority of subscales (social awareness, social cognition and autistic mannerisms) but displayed moderate scores for social communication, social motivation and total SRS-2 scores on average (Table 2). Between the two time points there were no significant improvements or deteriorations, nor were differences between birth assigned sexes noted (see Table 3).

Table 2 Subscale analyses by time (baseline and one year on GnRHa) and birth assigned sex (AMs and AFs)
Table 3 Mean (± SEM) t-scores from baseline to 1 year on GnRHa for subscales and total score of the Social Responsiveness Scale between AMs and AFs
Fig. 1 Mean (± SEM) t-scores from baseline to 1 year on GnRHa for subscales and total score of the Social Responsiveness Scale between assigned males and assigned females

Discussion

The present study sought to explore changes in features associated with AS, as measured by the SRS-2 at baseline and approximately one year after accessing GnRHa treatment, in gender diverse adolescents referred to the GIDS. This is the first study to investigate changes on the SRS-2 over time in a cohort of clinic referred, gender diverse adolescents accessing treatment to suppress pubertal hormones. The results did not support the hypothesis that scores on the SRS-2 would improve after accessing GnRHa treatment. Instead, no significant differences in SRS-2 scores over time or between birth assigned sexes were noted, and no interactions between time and birth assigned sex were found for SRS-2 subscales or total scores.

Whilst not in the clinical range, and consistent with prior empirical studies of clinic referred, gender diverse CYP (Leef et al. 2019; Skagerberg et al. 2015), participants in this cohort were rated as having a higher level of social impairment on the SRS-2, compared to general population samples of a similar age. For instance, adolescents in the present sample scored higher than those recruited to the SRS-2 standardisation study in the United States (US), whereby school age cis-males (n = 493) and cis-females (n = 518) obtained a mean score of 33.6 (SD = 25.2) and 29.0 (SD = 23.7) respectively (Constantino and Gruber 2012). To the best of the authors’ knowledge, there is an absence of UK-based publications investigating the psychometric properties of the SRS-2 instrument in a general population sample. As a result, it is not possible to assume UK norm scores. Prior investigation has reported on the reliability, validity, and factor structure of the preceding version of the SRS-2, in a general population sample of n = 247 cis-male and n = 253 cis-female children (mean age 6.2; range 5–8 years) based in North East England (Wigham et al. 2012). SRS scores documented by Wigham et al. (2012) for cis-male (Mean = 32, SD = 17.1) and cis-female (Mean = 27.7, SD = 15.9) participants were comparable with reports by Constantino and Gruber (2012), and considerably lower than those of the present cohort.

A variety of theories regarding the increased rates of features associated with AS amongst clinic referred, gender diverse CYP have been cited elsewhere in the literature (Van Der Miesen et al. 2016). However, as with other AS assessment tools, the influence of behavioural problems, language, age and cognitive ability should be considered when interpreting scores (Hus et al. 2013). Prior research has demonstrated how the SRS-2 may lack specificity in differentiating features associated with AS from other behavioural difficulties in childhood (Cholemkery et al. 2014). This is especially pertinent for clinic referred, gender diverse CYP, given research detailing considerable behavioural and emotional difficulties in this population (Turban and van Schalkwyk 2018; Zucker et al. 2014). This group may obtain high scores on SRS-2 items such as ‘get teased a lot’ and ‘is regarded by other children as odd or weird’ due to social difficulties arising from experiences of bullying and prejudice (Turban 2018; Turban and van Schalkwyk 2018). Hence, high scores on the SRS-2 may be more representative of distress resulting from GD, rather than features associated with AS alone.

The absence of significant differences in total SRS-2 mean scores between AM and AF adolescents in this cohort, both at baseline and at follow-up, contrasts with reports from general population studies. Prior research using the first and second edition of the SRS has documented how cis-males tend to, on average, score higher than their cis-female counterparts (Wigham et al. 2012; Constantino and Gruber 2012). In more detail, Wigham et al. (2012) found that cis-male participants were rated more highly on all SRS subscales apart from ‘social motivation’. Research utilising the SRS assessment tools in clinic referred, gender diverse samples has produced more variable results. Whilst findings from the present study converge with a prior investigation of CYP accessing specialist support at the GIDS (Skagerberg et al. 2015), the results diverge from international literature that has demonstrated differences according to birth assigned sex (Leef 2018; Van Der Miesen et al. 2018). These conflicting findings appear to suggest that SRS-2 scores may be influenced by diverging study methodology and geographical location.

On a day-to-day basis, parents may view their child through a ‘gendered lens’, interpreting their behaviour in relation to their birth assigned sex. In most cases, these ‘gendered lenses’ continue to be influenced by gender norms that are determined by mainstream societal standards. Research on AS has long been grounded in cis-male dominated samples (Hiller et al. 2016) and as a result, the way in which we conceptualise, measure and diagnose AS is heavily influenced by cis-male presentations. Based on this, it is possible that features associated with AS are more readily identified and reported on AS screening tools, such as the SRS-2, by parents who view their child through a more cis-male oriented ‘gendered lens’. In clinical practice, a wide range of gender identities are exhibited by CYP accessing the GIDS, many of which do not fit the cis-male–cis-female gender binary (Twist and de Graaf 2019). Therefore, exhibiting a gender identity that falls outside of the cis-gender ‘norm’ may influence the way features associated with AS are perceived by others, perhaps helping to explain why the same pattern of results seen in cis-gender samples was not observed in the present cohort. It is also worth noting that sex differences in the camouflaging effect have not been investigated in clinic referred, gender diverse samples. Depending on their gender identity, AM CYP may be just as likely to attempt to camouflage their social deficits, in a similar fashion to what has been observed in cis-females. At the present time, these hypotheses remain only speculative.

Longitudinal research exploring psychosocial functioning in clinic referred, gender diverse CYP accessing GnRHa treatment remains sparse, and the available studies differ in methodology. Findings from the present study suggest that features associated with AS in this cohort did not significantly change after approximately one year of GnRHa treatment. In the context of a limited evidence base, and within the confines of this analysis, it is not possible to ascertain whether the absence of change in SRS-2 scores over time represents a stable presentation of features associated with AS, or instead suggests that unsupportive social environments and psychosocial dysfunction persist during GnRHa treatment (Turban and van Schalkwyk 2018), influencing scores at both data collection time points.

Clinical Implications

Results from this study draw attention to the overrepresentation of social communication difficulties in clinic referred, gender diverse adolescents accessing GnRHa, when compared to general population studies (Constantino and Gruber 2012; Wigham et al. 2012). It remains unclear if a higher score on the SRS-2 reflects social communication difficulties or is instead an artefact of gender distress. Nonetheless, the present findings highlight the need for ongoing psychosocial support for adolescents accessing specialist gender services. By screening for features associated with AS at the point of referral, additional investigations can be accessed promptly, perhaps improving an individual’s ability to communicate and think about their gender needs. Furthermore, due to differences in social communication, insight and flexible thinking, current guidelines have suggested that extended diagnostic periods should be considered in cases where AS is suspected or diagnosed (Strang et al. 2018).

Future Research

Further research in this area is required in order to better understand the relationship between features associated with AS and clinic referred, gender diverse CYP. Longitudinal study of SRS-2 scores for those who choose to access sex hormone treatment is warranted to further our understanding of whether this treatment is associated with change in SRS-2 scores. Future research should also aim to explore the relationship between SRS-2 scores and other related factors such as anxiety in clinic referred, gender diverse CYP, which may have an influence on SRS-2 responses such as those observed in cis-gender samples (Cholemkery et al. 2014). This is particularly warranted given that symptoms of anxiety remained unchanged in prior research by De Vries et al. (2011). Furthermore, given the poor levels of specificity identified with current measures, an AS screening tool designed specifically for clinic referred, gender diverse CYP is necessary. Where possible, large samples should be obtained, as well as clinical and non-clinical control groups, to form a more robust and reliable evidence base. Progress in this area of research will enable the development of more thorough treatment guidelines for the management of clinic referred, gender diverse CYP exhibiting features associated with AS.

Limitations

Sufficient sample sizes of clinic referred, gender diverse CYP are difficult to obtain for longitudinal enquiry. This is reflected in the present study, which was underpowered at final analysis owing to low completion rates (a common issue found in longitudinal research). This may help to explain the non-significant findings reported and, as a consequence, the temporal relationship between features associated with AS and gender diversity in clinic referred samples requires further investigation (although it is also acknowledged that the sample was underpowered by only three cases). Clinical diagnoses of AS could not be obtained for the studied cohort; thus SRS-2 measures were used. It should be emphasised that screening tools, including the SRS-2, do not provide the level of assessment required for an AS diagnosis. As a result, the authors were not able to comment on prevalence rates of AS. In clinical practice, an extensive assessment of the young person’s developmental history and presenting behaviours, using both interview and observation methods, would be requisite (Hayes et al. 2018).

Although used widely in this research area, the SRS-2 has not been validated for use in clinic referred, gender diverse CYP. As a consequence, the reference norms for AF and AM CYP are based on birth assigned sex, as opposed to experienced gender at the time the measure was completed. It is poorly understood how this influences scores, and it makes comparing results to the general population problematic. This can also cause distress for the participant and their caregiver, potentially influencing responses on the questionnaire. Owing to the absence of a concurrent comparison group of adolescents taken from a different clinical population, as well as a non-clinical group based in the UK, the authors are unable to establish whether the elevated rates of features associated with AS are specific to clinic referred, gender diverse adolescents, or a characteristic of clinical populations in general.

Conclusions

In conclusion, the findings from the present study show no evidence of significant differences in SRS-2 scores over time or between birth assigned sex, or interactions between these amongst a group of gender diverse, clinic referred adolescents accessing GnRHa treatment. Nonetheless, inflated rates of features associated with AS were observed in this cohort across all subscales for AM and AF participants, compared to general population studies. Although non-specificity of AS screening tools in clinic referred, gender diverse populations continues to be an issue, ongoing research in this field is important for clinical and education purposes, in order to ensure optimal treatment is being provided.