Introduction

Transgender individuals have a gender identity and/or expression different from society’s expectations based on their sex assigned at birth. Although ICD-10 adopts the diagnostic category transsexualism to describe this condition (World Health Organization 1993), the suffix “-ism” implies disease (American Psychiatric Association 2013). Therefore, the WHO Working Group on Sexual Disorders and Sexual Health recommended renaming it gender incongruence in the ICD-11 (World Health Organisation 2018).

To dissociate gender identity from essentially pathological situations, we adopt the terminology of Guidelines for Psychological Practice with Transgender and Gender Nonconforming People (American Psychological Association 2015).

Not all transgender and gender nonconforming (TGNC) people have gender dysphoria, and not all will undergo gender affirmation surgery. Individuals with gender dysphoria (GD) face psychological, familial, social, and economic difficulties that compromise their quality of life (QoL). QoL is one of the essential aspects of human health and is embedded in a psychological, physical, social, and environmental context (Skevington et al. 2004).

The clinical presentation usually includes discomfort related to the original sexual characteristics and a request for medical help to alter the phenotypic expression of the body. Requests may include treatment with cross-sex hormones, hair removal in trans women, surgery to change primary and secondary sexual characteristics, and a new legal gender (Dhejne et al. 2014).

The frequency of gender affirmation surgery (GAS) has increased in recent years. It includes, but is not limited to, chest wall masculinization, hysterectomy, phalloplasty, and/or metoidioplasty for trans men, and breast augmentation, vaginoplasty, and facial contouring for trans women (Weiss and Schechter 2015). More specifically, GAS refers to the surgical procedures required to create a body phenotype that best represents one’s own identity (Selvaggi et al. 2018).

Recent research has shown that gender reassignment surgery has a positive effect on subjective well-being as well as sexual function (Gijs and Brewaeys 2007; Carolin and Gorzalka 2009; Hess et al. 2014). The quality of life frequently improves among patients after surgery, and regrets from those who decided to undergo the abovementioned procedures are seldom reported (Mattila et al. 2015).

More recently, trans men who underwent chest wall masculinization reported improved outcomes, with statistically significant changes in several different domains, including physical, psychosocial, and sexual well-being and self-esteem (Agarwal et al. 2018). In a study that analyzed results of surgery for trans women (da Silva et al. 2016), the participants showed improvements after GAS in sexual activity, freedom, physical security and safety, financial resources, and health and social care (accessibility and quality).

Five important reviews on treatments for individuals with GD evaluated the prognosis of hormonal interventions during GAS (Murad et al. 2010), the procedures and surgical techniques for sexual reassignment (Sutcliffe et al. 2009; Horbach et al. 2015), and patient-reported outcome measures following transgender surgery (Barone et al. 2017; Andréasson et al. 2018).

However, to the best of our knowledge, no previous systematic review or meta-analysis has assessed the evidence concerning the change in quality of life after transgender surgery. To fill this knowledge gap, we performed a systematic review of the literature and a meta-analysis to identify evidence on patients’ quality of life after transgender surgery.

Methods

Both the systematic review and the meta-analysis were guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) extension statement for reporting systematic reviews incorporating network meta-analyses (Hutton et al. 2015). Since this study was a review of published studies, ethical approval was not required.

Search Strategy and Selection Criteria

Data were collected from PubMed, Scielo, Google Scholar, and Science Direct from the inception of each database to 26 June 2018, in English, Spanish, and Portuguese. A multilingual combination of medical subject headings and text words was used to identify studies concerning quality of life after transgender surgery (Box 1).

Box 1 PubMed® search strategy used in the systematic review of quality of life after SRS

1# ((“quality of life”) AND ((“transgender” OR “trans-gender” OR “transvestism” OR “transvestite” OR “transsexual” OR “transsexualism” OR “trans man” OR “trans men” OR “trans women” OR “trans woman” OR “transman” OR “transmen” OR “transwomen” OR “transwoman” OR “transgendered”))) AND ((“sex change” OR “sex reassignment surgery” OR “gender adjustment surgery” OR “gender reassignment surgery” OR “gender-confirmation surgery” OR “gender affirming surgery” OR “gender reassignment” OR “female-to-male chest reconstruction” OR “male-to-female chest reconstruction” OR “vaginoplasty” OR “phalloplasty”))

  1. Search strategies for other databases used (Scielo, Google Scholar, and Science Direct) are available from the corresponding author
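The structure of the query in Box 1 (three concept blocks joined by AND, with synonyms inside each block joined by OR) can be sketched programmatically. The following is only an illustration: the helper names `or_block` and `build_query` are hypothetical, and the term lists are shortened versions of those in Box 1, not the full strategy.

```python
def or_block(terms):
    """Quote each synonym, join with OR, and wrap the block in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*concept_blocks):
    """Join concept blocks with AND; each block is a list of synonyms."""
    return " AND ".join(or_block(block) for block in concept_blocks)


# Shortened term lists for illustration; Box 1 gives the full sets.
outcome = ["quality of life"]
population = ["transgender", "transsexual", "trans women", "trans men"]
intervention = ["gender reassignment surgery", "vaginoplasty", "phalloplasty"]

query = build_query(outcome, population, intervention)
print(query)
```

Keeping each concept as a separate list makes it easier to adapt the same strategy to databases with different syntax.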

Peer-reviewed studies were eligible for inclusion if they involved transgender patients and measured quality of life after GAS. Quality of life was not necessarily the primary outcome. We excluded the following from the analysis: duplicate articles; qualitative studies; articles that did not report original research or analysis; and studies that did not report on reassignment surgery, quality of life, or quality of life based on GAS.

Two reviewers (T.S.P. and M.A.A-S) screened the identified titles/abstracts for possible inclusion, and disagreements were resolved by discussion. In the next step, the researchers independently assessed the full text of potentially eligible studies. To reduce publication bias, the authors scanned the reference lists of included articles and added further eligible articles.

Screening and Data Extraction

The search strategy above was completed on Aug. 26, 2018. One reviewer (T.S.P) extracted details of the studies into a database. The data collected were as follows: country and area; data collection period; study type and sampling method; description of study population; sample size; follow-up period of patients; questionnaire used as the outcome measurement instrument; and surgical procedures (region of surgery, male-to-female or female-to-male surgery). Another author (M.A.A-S) checked the extracted data for accuracy, and disagreements were resolved by consensus.

The authors of the articles were contacted when the mean and standard deviation for each group were not available, since this prevented the calculation of standardized coefficients and effect size estimates. We excluded articles that did not use a control group, those with a sample of fewer than ten individuals, and those that did not separate the group that underwent surgery from other groups, such as individuals who only underwent hormonal treatment.
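The three exclusion rules for the meta-analysis can be expressed as a simple filter. This is a minimal sketch under stated assumptions: the `Study` record, its field names, and the example entries are hypothetical and do not correspond to any study in the review.

```python
from dataclasses import dataclass


@dataclass
class Study:
    label: str
    has_control_group: bool
    sample_size: int
    surgery_group_separate: bool  # surgery group reported apart from, e.g., hormone-only


def is_eligible(study: Study, min_sample: int = 10) -> bool:
    """Apply the three exclusion rules described in the text."""
    return (
        study.has_control_group
        and study.sample_size >= min_sample
        and study.surgery_group_separate
    )


# Invented records for illustration only.
candidates = [
    Study("A", True, 47, True),
    Study("B", False, 120, True),   # no control group
    Study("C", True, 8, True),      # fewer than ten individuals
    Study("D", True, 35, False),    # surgery group not separated
]
eligible = [s.label for s in candidates if is_eligible(s)]
print(eligible)
```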

Meta-analysis

Conceptually, the topic of the present study posed several challenges. First, instead of the standard meta-analysis with its ubiquitous pattern of “favors treatment A” vs. “favors treatment B,” we considered it important that all the surgical interventions come to the foreground in the analysis. Second, the outcome variable, i.e., “quality of life” (measured lato sensu), differed considerably between studies on account of the different instruments and questionnaires selected. Third, some studies suffered from small-sample concerns, more so when subgroup comparisons were made. Additionally, because the comparisons are pre-post, the estimator must guard against violations of the independence assumption, hence the need for a parameter that provides reliable estimates when the measures are inherently correlated.

It was hypothesized that the results would be heterogeneous because of differences between studies in the types of QoL measures (e.g., the 36-Item Short Form Health Survey [SF-36], WHOQOL-bref), as well as in the gender identities of the participants (e.g., trans men, trans women, both). Another factor that contributes to heterogeneity is that the most frequently used QoL measures do not calculate a total score but calculate separate composite scores for mental and physical health.

Consequently, to tackle this combination of issues, we specified the outcome measure as the standardized mean difference according to the formula for Hedges’ g and applied a network meta-analysis (White and Thomas 2005; Hawkins et al. 2009; Lu et al. 2014; Salanti et al. 2011).
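For reference, the standardized mean difference with the small-sample correction known as Hedges’ g can be computed as sketched below. This is an illustration, not the authors’ Stata code: it uses the common independent-groups form with a pooled standard deviation, the function name `hedges_g` is hypothetical, and the input values are invented. It does not model the pre-post correlation discussed above.

```python
import math


def hedges_g(mean_1, sd_1, n_1, mean_0, sd_0, n_0):
    """Return (g, variance of g) for two groups, assuming independent samples."""
    df = n_1 + n_0 - 2
    # Pooled standard deviation across the two arms.
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1**2 + (n_0 - 1) * sd_0**2) / df)
    d = (mean_1 - mean_0) / pooled_sd
    # Small-sample correction factor J (approximate form).
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample variance of d, scaled by J squared.
    var_d = (n_1 + n_0) / (n_1 * n_0) + d**2 / (2 * (n_1 + n_0))
    return g, j**2 * var_d


# Invented example: post-surgery vs. reference scores on a 0-100 scale.
g, var_g = hedges_g(mean_1=60, sd_1=10, n_1=20, mean_0=55, sd_0=10, n_0=20)
print(round(g, 3), round(var_g, 3))
```

Because the correction factor J is below 1, g is always slightly closer to zero than the uncorrected difference, and the shrinkage is largest in small samples.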

The network meta-analysis process undertook the following steps:

  • Formatting the data under the “augmented” pattern, i.e., treatments are compared to a reference condition (in the present case, pre-values or the expected values for a given “standard” population).

  • Presenting a pattern for the combinations between arms within and between studies; setting up the analysis, by selecting the standardized mean difference and the pooled standard deviation across arms for the pairwise comparisons.

  • Mapping the network in order to underscore arms which are directly and indirectly compared, as well as to weight the contribution of each pair of treatments; checking the consistency of the model, for which we expect a p value > 0.05 in the inconsistency Wald test.

  • Ranking treatments according to the cumulative probability of being in the “best” arm; splitting between nodes—“node splitting”—to present partial contrasts between arms and information about indirect effects; summarizing the analysis by performing a forest plot with results pooled within design as well as overall.
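The ranking step in the list above can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not the procedure implemented in Stata: it draws each arm’s effect from an independent normal distribution (ignoring covariances between estimates), treats higher values as better, and summarizes the cumulative rank probabilities as a SUCRA-type score. The function names, arm labels, and effect values are hypothetical.

```python
import random


def rank_probabilities(effects, std_errors, n_draws=5000, seed=0):
    """Estimate P(arm has rank r) by simulation; rank index 0 is the best arm."""
    rng = random.Random(seed)
    arms = list(effects)
    counts = {arm: [0] * len(arms) for arm in arms}
    for _ in range(n_draws):
        draw = {arm: rng.gauss(effects[arm], std_errors[arm]) for arm in arms}
        # Higher simulated effect = better rank (assumes higher QoL is better).
        ordered = sorted(arms, key=lambda arm: draw[arm], reverse=True)
        for rank, arm in enumerate(ordered):
            counts[arm][rank] += 1
    return {arm: [c / n_draws for c in counts[arm]] for arm in arms}


def sucra(rank_probs):
    """Average of the cumulative rank probabilities over the first a-1 ranks."""
    n_arms = len(rank_probs)
    cumulative, total = 0.0, 0.0
    for p in rank_probs[: n_arms - 1]:
        cumulative += p
        total += cumulative
    return total / (n_arms - 1)


# Invented standardized mean differences relative to a reference condition.
effects = {"T0": 0.0, "T1": 0.4, "Norm": 0.5}
std_errors = {"T0": 0.05, "T1": 0.15, "Norm": 0.15}
probs = rank_probabilities(effects, std_errors)
for arm in effects:
    print(arm, [round(p, 2) for p in probs[arm]], round(sucra(probs[arm]), 2))
```

A score close to 1 means the arm is almost always among the best, and a score close to 0 means it is almost always among the worst.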

We preferred a frequentist approach to a Bayesian analysis. The frequentist approach assumes an asymptotic distribution, which is reasonable when the dependent variable is continuous and, more so, when the standardized mean difference is estimated. Its advantages include the relative simplicity of the analysis, its speed, and the flexibility of its graphical interface (Chaimani et al. 2013). Also, there is no need to choose among a vast array of priors, which may produce different results or fail to converge. What is more, this method is more conservative, for it tends to reduce the probability of type I error (White 2015).

A two-tailed p value < 0.05 was the criterion of statistical significance. We used the statistical package Stata 15.1 (StataCorp, College Station, TX, USA) for all estimations.

Results

Eligible Studies

According to the search strategy, 2843 records were identified. After the titles and abstracts were screened, 2821 were rejected due to the reasons listed in Fig. 1. After careful full-text screening, 14 articles proved eligible for inclusion in this review. Among the articles eligible for inclusion in the review, longitudinal studies were selected for meta-analysis. Studies that did not measure quality of life before and after surgery were not included in the meta-analysis.

Fig. 1 Flowchart for selection of articles in the systematic review of quality of life after sex reassignment surgery

Grade Assessment

All studies evaluated in this research were observational, and none was blinded or randomized. This is expected given the condition of the patients and the ethical concerns involved. Nevertheless, these design features automatically result in a low rating under GRADE criteria.

Study Quality Assessment

GRADE was used to assess the overall quality of evidence for each outcome collected (Balshem et al. 2011). GRADE specifies four categories—high, moderate, low, and very low—that are applied to a body of evidence, not to individual studies.

Systematic Review

These studies enrolled 881 participants (635 trans women, 246 trans men). All the studies used accessibility or convenience sampling. The majority of the 14 included studies originated from Europe. Study characteristics are described in Table 1.

Table 1 Study characteristics and details of the references. N/A not applicable

Instruments Used to Measure QoL

One study used self-developed questionnaires and the King’s Health Questionnaire (study 1). The other instruments identified were as follows: Short Form 36 (SF-36) and derivatives (studies 2, 3, 9); WHOQOL-100 (study 6); WHOQOL-bref (studies 4, 5, 13); Body Image-Related Quality of Life (study 7); Subjective Happiness Scale (studies 4, 12, 14); Satisfaction With Life Scale (studies 4, 12, 14); Cantril Ladder (studies 12, 14); Freiburg Personality Inventory (study 10); Rosenberg Self-Esteem Scale and Patient Health Questionnaire (study 10); Fragen zur Lebenszufriedenheit Module (studies 10, 11); and one instrument specific to plastic surgery, the BREAST-Q (study 8).

Impact of GAS on QoL

The studies enrolled in the systematic review either had quantitative comparisons with normative data (studies 1, 2, 3, 4, 5, 7, 9, 10, 11, 13, 14) or presented data before as well as after surgery (studies 4, 6, 7, 8, 9, 10, 32). Some articles also presented post-operative data within different periods of follow-up (study 9), different types of surgery in the same study (study 2), and difference in quality of life among patients satisfied and dissatisfied with the surgery (study 14).

Meta-analysis

This section presents results of the meta-analysis component of the review. Seven studies involving 420 individuals (259 trans women; 122 trans men) met the criteria for inclusion in the meta-analysis. Statistical heterogeneity of the results was significant. Pooled estimates may be biased on account of this heterogeneity, which led to a significant p value in the test of consistency (p < 0.001).

That being said, some bias in the test is expected given the inherent diversity of samples, centers, procedures, and instruments in the present study. To mitigate these problems, we specified the outcome measure as the standardized mean difference. Figure 2 shows the results of individual studies on different QoL scale modules.

Fig. 2 Forest plot of data for quality of life before GAS (A), after GAS (B), and normative data (C) included in the meta-analysis
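To make the notion of statistical heterogeneity concrete, the sketch below computes Cochran’s Q, the I² statistic, and a DerSimonian-Laird random-effects pooled estimate with a two-tailed p value. This is a simplified pairwise analogue offered for illustration only, not the network model fitted in this review; the function name `pool_random_effects` and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooling of effect sizes with heterogeneity statistics."""
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w**2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    p_value = 2 * (1 - NormalDist().cdf(abs(pooled / se)))  # two-tailed
    return {"pooled": pooled, "se": se, "q": q, "tau2": tau2, "i2": i2, "p": p_value}


# Invented standardized mean differences and their variances.
result = pool_random_effects([0.0, 1.0], [0.01, 0.01])
print({key: round(value, 3) for key, value in result.items()})
```

When I² is high, as in this invented example, the random-effects standard error widens considerably, which is why pooled estimates under strong heterogeneity should be read with caution.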

Three studies showed no statistically significant difference. One of these had the largest sample among the selected studies, a trans women population (190) who underwent GAS in general. The other two had the smallest samples: one with a trans men population (26) who underwent chest surgery and one with a trans men population (21) who underwent genital surgery.

Better overall QoL results were found in the trans men population who underwent chest surgery. The trans women population who underwent genital surgery presented better results only in specific domains of QoL (psychological, social relations, high self-esteem, and satisfaction with the body).

Before and After Surgery (T1 vs. T0)

The combination of studies shows a slight positive effect of GAS on individuals’ quality of life. The study of trans men who underwent mastectomy (study 8) obtained better results than the other studies and was the only one with significant results in all modules of the QoL questionnaire. Among them, the top results were breast satisfaction (p < 0.0001) and psychosocial well-being (p < 0.0001), although sexual satisfaction (p < 0.0001) and physical well-being (p < 0.0001) showed positive results as well.

Studies with trans women populations also showed positive effects in specific modules (studies 6, 10). Domains II (psychological) (p = 0.041) and IV (social relationships) (p = 0.007) improved significantly after GAS (study 6). Patients improved mostly in general satisfaction (p = 0.01), satisfaction with body image (p < 0.01), and self-esteem (p = 0.01) (study 10).

Nonetheless, these studies also stand out because they showed negative effects. Domains I (physical health) (p = 0.002) and III (level of independence) (p = 0.031) were significantly worse after GAS (study 6). In one of the studies, the negative effect on two modules actually reflected an improvement in quality of life, since lower scores on those instruments indicate fewer symptoms. Before surgery, the Patient Health Questionnaire-4 showed a high value of 3.95, which suggested mild depression and anxiety disorder. After surgery, this score was significantly lower (p < 0.01). Likewise, the lower Freiburg Personality Inventory score at T1 compared to T0 (p = 0.03) showed greater emotional stability.

Before and After Surgery Compared to Normative Data (T0 vs. Norm and T1 vs. Norm)

Compared with normative data, the combined scores were slightly improved after surgery but did not reach statistical significance. Self-esteem before surgery was similar to normative data. After surgery, it became significantly greater (p < 0.01) than that of a general German population (study 10). Compared with data from the German standard (p = 0.01), the emotional stability score was significantly improved after surgery. In the same study, the positive effect on the value of the Patient Health Questionnaire-4, comparing T0 with normative data, suggested mild depression and anxiety disorder before surgery. The graphical similarity between T1 and the normative values may suggest that the quality of life after surgery reached the expected value for the general population.

After Surgery and Normative Data (T1 vs. Norm)

One study, while evaluating other measures before surgery, only assessed the quality of life after surgery and compared it with normative data (study 4). It revealed similar scores in all areas except the subdomain environment, which was higher for the participants than the norm (p < 0.001).

Discussion

World literature about quality of life and GAS has increased in recent years, as more patients decide in favor of transgender surgery (Kuiper and Cohen-Kettenis 1988; Rakic et al. 1996; Carolin and Gorzalka 2009; Murad et al. 2010). Usually, after GAS, transgender individuals show additional improvement in gender incongruence and QoL, with less uncertainty about their gender role and more self-confidence about their body image (Sutcliffe et al. 2009), yet the studies rarely use standardized tools (Andréasson et al. 2018).

The systematic review already cited by Murad concluded that approximately 80% of transgender individuals reported subjective improvement in terms of GD, QoL, and psychological symptoms (Sutcliffe et al. 2009). Our results add to the body of literature demonstrating improvement after GAS, especially in some domains, although it is important to note that almost all studies used different research tools to reach their conclusions.

Factors that predict a positive outcome after GAS are not fully understood, but some criteria for a good prognosis have been identified in previous studies. Among the studies that compose our review, we found better overall QoL results in the population of trans men who underwent thoracic surgery. On the other hand, previous studies have reported that being a trans woman is a predictor of a positive outcome after GAS (Bodlund and Kullgren 1996; Gooren 2011).

QoL of Trans Men

Better post-GAS results were found in the trans men population who underwent chest surgery. More recently, trans men undergoing chest wall masculinization reported improved outcomes, with statistically significant changes in several different domains including physical, psychosocial, and sexual well-being and self-esteem (Agarwal et al. 2018).

Van de Grift et al. (2016) demonstrated improvement in body dysphoria and body satisfaction in trans men after thoracic surgery. However, they did not find changes in self-esteem or in quality of life related to body image, although the participants reported a positive or very positive effect of mastectomy on daily life, quality of life, social situations, self-esteem, and body image.

When looking at specific modules of QoL, the studies included in this research reported that GAS had a positive impact on trans men (Wierckx et al. 2011; van de Grift et al. 2016; Agarwal et al. 2018). They reported improved outcomes in several different domains, including physical, psychosocial, and sexual health (Wierckx et al. 2011; Agarwal et al. 2018); improvement in body dysphoria and body satisfaction (van de Grift et al. 2016); and self-esteem (Agarwal et al. 2018).

QoL of Trans Women

Trans women who underwent genital surgery presented better results than those who underwent other types of surgery. However, these results concern only a few specific domains of QoL (psychological, social relations, high self-esteem, and satisfaction with the body).

Genital surgery in trans women was significantly associated with a higher postoperative mental health-related quality of life (Ainsworth and Spiegel 2010). However, the results were not significantly different from those of the general population; hence, they could not be taken as conclusive evidence of a direct positive effect of surgery.

Different studies with trans women reported that GAS improved psychological aspects (Ainsworth and Spiegel 2010; da Silva et al. 2016; Papadopulos et al. 2017a), social relations (da Silva et al. 2016; Papadopulos et al. 2017a; Papadopulos et al. 2017b), and sexual activity (da Silva et al. 2016). A tentative explanation for this finding could be related to the sense of personal fulfillment with surgery and better acceptance of the body.

Negative Results After GAS

The surgical procedure is complex and involves the possibility of surgical complications, other esthetic procedures, and frustration. One study showed that, within 15 years after GAS (Kuhn et al. 2009), overall satisfaction among trans people was lower than in controls. However, the authors reflected on the interference of patients’ optimistic or pessimistic attitude towards life. According to the authors, confident patients appeared to strive to maintain a positive outlook for their health, even with some serious health problems, as well as a positive outlook for self-care activities.

Van de Grift et al. (2018) reported eight (6%) participants with dissatisfaction and/or regret after GAS, which was associated with preoperative psychological symptoms or self-reported surgical complications (OR = 6.07).

In a Brazilian study (da Silva et al. 2016), energy and fatigue, sleep and rest, negative feelings, mobility, activities of daily living, and physical environment worsened after GAS. Even beyond 1 year after GAS, trans women continued to report problems in physical health and difficulty in recovering their independence.

In an Italian experience (Castellano et al. 2015), GAS outcomes seemed to hinder psychological well-being, particularly in trans men, because of the complexity of the surgical procedure and the possibility of complications, especially with phalloplasty. The scores of the trans men subgroup were significantly lower than those of trans women and cisgender men.

Limitations

This review included cross-sectional and cohort studies that have some inherent limitations. For example, some cross-sectional studies did not calculate the “within patient” effect on quality of life. Thus, it is difficult to determine the direction of observed associations. Data on the postoperative time varied considerably between the studies.

Despite the limitations, some of them inherent to this kind of research, our findings provide useful information for the global debate on the association between GAS and the quality of life of transgender individuals.

Implications for Research

To further elucidate the strength of the association between GAS and the quality of life of transgender individuals, there is a need for high-quality follow-up studies conducted in different regions of the world, among individuals from different cultures, covering different surgeries, and distinguishing the effects of feminization and masculinization. Cultural differences should be considered when applying the results of this review. Individuals from countries that reject transgender identities or who have limited access to gender affirmation surgery will possibly have different results than the European standard.

The medical literature has focused largely on short-term outcomes concerning surgical and functional satisfaction, rather than on overall quality of life. Besides, standardized questionnaires have rarely been applied. Moreover, several studies collected data retrospectively, which introduces recall bias.

In order to fully evaluate the results of gender-affirming procedures, it would be important to create and validate a universally accepted metric, instead of relying on the panoply of instruments available nowadays, several of which were originally designed for different situations. Future research might highlight these aspects as well as focus on the assessment of long-term outcomes.

Conclusion

Evidence of low quality suggests that GAS is likely to improve the quality of life of transgender individuals. Despite the limitations of the published literature, this review concludes that better overall quality of life outcomes were found in the trans men population who underwent chest surgery. So far, there are no precise conclusions as to the guarantee of a satisfactory long-term quality of life after GAS.

Therefore, relying on the so-called external authority (the gatekeeper) to decide who meets the eligibility criteria for surgery may not be enough. Gatekeeping requires an assessment of gender dysphoria as a prerequisite for GAS and can create barriers to the required medical care. A tentative solution to this issue would be to work under the informed consent model in order to facilitate decision-making and clarify the risks and benefits involved, while preserving patients’ authority over their experiences. Public policies aimed at the health of the transgender population should recognize the need for medical and psychological support in the preoperative and postoperative periods.