Introduction

Gender-confirming surgery is a growing field, possibly due to growing acceptance and tolerance of transgender and gender nonconforming persons [1]. However, no guidelines exist on how to report surgical outcomes. Transgender and gender nonconforming people may experience gender dysphoria, either at some point in life or continuously. Gender dysphoria describes the distress a person can feel over an incongruence between the sex assigned at birth and that person’s gender identity [1]. The purpose of gender-confirming surgery is to confirm the patient’s gender identity and thereby decrease gender dysphoria. A variety of techniques are used for this purpose, and treatment algorithms have been proposed [2]. To improve the evaluation of surgical outcomes of gender-confirming chest surgery, it is useful to review which outcomes are reported and how they are evaluated, and to explore potential gaps in the literature. The World Professional Association for Transgender Health issues recommendations on standards of care for these patients, which have been adopted by professionals all over the world. The most recent edition states that: “An official audit of surgical outcomes and publication of these results would be greatly reassuring to both referring health professionals and patients” [1].

In recent years, researchers within different fields of medicine have emphasised the need for more consensus in outcome evaluation to improve the potential of research [3, 4]. As a result, a tradition of creating core outcome sets has developed. Core outcome sets are lists of consensus-based outcomes that, as a minimum, should be measured and reported [5]. Furthermore, an increased focus on methods for evaluating different types of outcomes has emerged, leading to the concept of clinical outcome assessments [6]. This scoping review aims to provide an overview of outcome measures used to evaluate gender-confirming chest surgery.

Materials and Methods

This scoping review follows the PRISMA-ScR guideline [7]. A review protocol was registered on the Open Science Framework (OSF.io; https://doi.org/10.17605/osf.io/tu7ck). We included all studies investigating non-cisgender persons, of all ages, who underwent gender-confirming chest surgery. Gender-confirming chest surgeries were defined as breast reduction, breast augmentation or revisions of one of these procedures.

The primary outcome was the outcome measures used to evaluate gender-confirming chest surgery, including both surgical and non-surgical outcome measures. Process and structure measures, such as duration of surgery, use of anaesthesia and staff qualifications, were not included in this review of outcome evaluation [8]. Only studies with five or more patients were included. We included all studies regardless of publication status or year, provided they were reported in English, German, French, Swedish, Danish or Norwegian.

The search string was created in collaboration with a research librarian. The PubMed database (from 1968) search string was: ((((((((((((((((((gender affirming surgery) OR sex reassignment surgery) OR gender-confirming surgery) OR gender confirmation surgery) OR gender affirmation surgery) OR female-to-male) OR male-to-female) OR sex reassignment procedures) OR gender change) OR sex change) OR transsexualism) OR transgendered persons) OR transgender) OR intersex) OR gender dysphoria) OR transgenderism) OR gender identity disorder)) AND (((((((((((((((breast reduction) OR chest reconstruction) OR reduction mammoplasty) OR chest wall contouring surgery) OR mastoplasty) OR mastectomy) OR mastectomies) OR mastopexy) OR mammaplasty) OR breast augmentation) OR chest wall contouring) OR subcutaneous mastectom*) OR chest wall masculinisation) OR top surgery) OR masculinising mastectom*). This search was adapted to EMBASE (from 1980), Cochrane Library (from 1996), Scopus (from 1960), CINAHL (from 1982) and PsycINFO (from 1965). Searches were performed on September 19, 2018. Furthermore, reference lists of included studies were screened for additional studies (snowball search).
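As an illustration of how a search string of this kind can be assembled and then adapted to other databases, the following minimal Python sketch builds a Boolean query from two term lists. It is not the tooling used in this review: the names `or_block` and `build_query`, the abbreviated term lists and the flat (non-nested) OR form are assumptions made for illustration, and database-specific syntax such as field tags or truncation rules is not handled.

```python
# Hypothetical helper: join search terms into a Boolean query string.
# The term lists are abbreviated excerpts of the search string reported above.

POPULATION_TERMS = [
    "gender affirming surgery",
    "sex reassignment surgery",
    "transgender",
    "gender dysphoria",
]

PROCEDURE_TERMS = [
    "breast reduction",
    "chest reconstruction",
    "mastectomy",
    "breast augmentation",
    "top surgery",
]


def or_block(terms):
    """Wrap each term in parentheses and join the terms with OR."""
    return "(" + " OR ".join(f"({term})" for term in terms) + ")"


def build_query(population, procedures):
    """Combine the population block and the procedure block with AND."""
    return f"{or_block(population)} AND {or_block(procedures)}"


query = build_query(POPULATION_TERMS, PROCEDURE_TERMS)
print(query)
```

Keeping the terms in lists rather than in one long string makes it easier to check that no term was dropped when the search is adapted to another database.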

Studies identified by the literature search were systematically and independently screened by two authors (AT and DZ), first by title and abstract, and then by full text. Conflicts were resolved by discussion. A data charting form was developed jointly by two authors (AT, DZ). Variables of relevance to the study aim were identified during full-text screening and included in the data charting form. The first author (AT) selected eight studies that together included all relevant variables. These eight studies were used for calibration of the data charting form, with two authors (AT, DZ) independently charting the data. Conflicts were resolved through discussion with the last author (JR). The final data charting form consisted of the following items: complications, reoperations, revision surgery, aesthetic outcome, nipple–areola complex sensitivity and patient-reported outcome measures. Questionnaires that could not be accessed for review without payment were not included in the data analysis.

Data synthesis was performed in three steps. First, we charted all the data and compared the categories, domains and classifications. Existing categories and domains were then supplemented with further categories or domains if necessary to report the data. Finally, the outcome categories, domains and classifications were counted for the number of studies using them.
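The final counting step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in this review: the study labels, the charted categories and the function name `count_studies_per_category` are hypothetical.

```python
from collections import Counter

# Hypothetical charted data: study label -> outcome categories it reports.
charted = {
    "study_A": {"complications", "reoperations", "revision surgery"},
    "study_B": {"complications", "aesthetic outcome"},
    "study_C": {"complications", "revision surgery"},
}


def count_studies_per_category(charted_data):
    """Return how many studies report each outcome category."""
    counts = Counter()
    for categories in charted_data.values():
        # Converting to a set counts each study at most once per category.
        counts.update(set(categories))
    return counts


counts = count_studies_per_category(charted)
for category, n_studies in counts.most_common():
    print(f"{category}: {n_studies}")
```

The unit of counting here is the study, not the individual outcome event, so a study that reports several complication types still contributes one count to the complications category.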

Results

The literature search identified 849 records, and after the screening process, 47 studies were included in the review (Fig. 1) [9–55]. An overview of included studies is shown in Table 1.

Fig. 1 PRISMA flow diagram of study selection process

Table 1 Study characteristics

The terminology used to report techniques and outcomes varied widely across studies, which made it necessary to harmonise the wording in order to report the findings. Transmasculine patients assigned female at birth were included in 39 studies, and 11 studies included transfeminine patients assigned male at birth.

Clinician-Reported Outcome Measures

Complications were reported in 40 studies, 20 studies reported on reoperations (within 30 days of primary surgery), and 26 studies reported on revision surgery. Of the 47 included studies, 17 evaluated all three of these outcome categories. Across studies, there was a tendency to assess broadly similar complications. In eight studies, the authors subcategorised complications into minor and major complications based on the necessity of reoperation [12, 17, 18, 22, 24, 25, 40, 45]. In five studies, minor and major complications were used as a subcategorisation without clearly defined distinctions between the categories [20, 27, 30, 31, 41]. Complications were not clearly subcategorised in 18 studies. Other complication subcategorisations included short- versus long-term [23, 50], medical versus aesthetic [36], in-hospital versus outpatient [29] and mastectomy-specific versus general surgical complications [21]; one study included complications within a broad category of adverse events [42]. When gender-confirming chest and genital surgery are performed simultaneously, the categorisation and classification of complications differ from when gender-confirming chest surgery is the primary focus. The division of clinician-reported outcomes into categories, domains and classifications is shown in Table 2.

Table 2 Clinician-reported outcomes divided into categories, domains and classifications

In nine of the 20 studies within the reoperation category, reoperation was not defined as being performed for major complications, but rather as an undefined requirement for reoperation. Distinctive parameters of reoperation requirement were not reported in any study. In five studies, revision surgery was divided into scar, contour and nipple–areola complex revision [12, 18, 25, 38, 45]. Other divisions of revision surgery were: planned versus unplanned in two studies [22, 52]; medical versus aesthetic indication in two studies [10, 36]; and local versus general anaesthesia in one study [39]. The studies that did not divide revision surgery into scar, contour and nipple–areola complex revision could easily be divided this way. Furthermore, an extra domain for skin revisions was created to supplement the division. Most studies did not report which revision surgeries were performed.

Aesthetic outcomes were reported in three ways: as clinician-reported outcomes in four studies [12, 25,26,27], as patient-reported outcomes in 18 studies [9, 10, 14,15,16, 18, 19, 23, 24, 28, 29, 31, 32, 40, 42, 43, 48, 49] or as both clinician- and patient-reported outcomes in three studies [27, 37, 45]. Only a few studies used questionnaires (BREAST-Q [56] or BODY-Q [57]) validated to assess patient satisfaction after breast surgery [14, 19, 35]. Self-constructed questionnaires or Likert scales were used in 14 studies [10, 12, 16, 18, 23,24,25, 27, 28, 37, 40, 43, 45, 49], and six of these studies specified which classifications the aesthetic outcome evaluation was based on [10, 12, 25,26,27, 37].

Patient-Reported Outcome Measures

Patient-reported outcomes were included in 29 studies and were used in a variety of ways to assess postoperative results. Besides measuring aesthetic outcome, patient-reported outcomes were used to evaluate functional outcome and mental health parameters (Table 3). Mental health evaluation was included in 18 studies [9, 10, 14,15,16, 19, 23, 29, 31, 32, 34, 35, 37, 39, 43, 46, 48, 55]. Of these, 12 studies included ad hoc scales or questionnaires [10, 15, 23, 29, 31, 32, 34, 37, 39, 43, 48, 55]. Five studies included questionnaires with some degree of formal validation for a transgender population [14,15,16, 32, 48]. Questionnaires validated in other patient groups or procedures, including generic tools, were used in seven studies [9, 15, 19, 28, 32, 34, 35]. Postoperative function was never assessed or reported using performance outcomes. Instead, it was most often included as a patient-reported outcome, covering nipple–areola complex sensitivity, pain, bra comfort, range of motion in the upper extremities and posture, in 14 studies [9, 10, 16,17,18,19, 25, 29, 35, 40, 43, 45, 46, 49]. Nipple–areola complex sensitivity was measured in seven studies [16,17,18, 25, 40, 43, 45]. No studies included monofilament testing or a two-point discrimination test.

Table 3 Patient-reported outcomes divided into categories, domains and classifications

Reporting Variations

There are variations not just in the type of outcomes used but also in the way the findings are reported. Some authors report outcomes per breast and others per patient, and sometimes the unit is not reported at all. Studies also differ when reporting on reoperation and revision surgery. In these outcome categories, reporting was inconsistent, being done per procedure, per patient or per indication.
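To show why the reporting unit matters, the following Python sketch computes a complication rate for the same cohort once per patient and once per breast. The cohort is entirely hypothetical and is not drawn from any included study; it assumes ten patients who all had bilateral surgery, of whom one had a complication in one breast and one had complications in both.

```python
# Hypothetical cohort: for each patient, the number of operated breasts
# that developed a complication (0, 1 or 2 after bilateral surgery).
complicated_breasts_per_patient = [0, 0, 0, 0, 0, 0, 0, 0, 1, 2]
BREASTS_PER_PATIENT = 2  # assumption: every patient had bilateral surgery

n_patients = len(complicated_breasts_per_patient)
n_breasts = n_patients * BREASTS_PER_PATIENT

# A patient counts as affected if at least one breast had a complication.
patients_affected = sum(1 for n in complicated_breasts_per_patient if n > 0)
breasts_affected = sum(complicated_breasts_per_patient)

rate_per_patient = patients_affected / n_patients
rate_per_breast = breasts_affected / n_breasts

print(f"Per patient: {patients_affected}/{n_patients} = {rate_per_patient:.0%}")
print(f"Per breast:  {breasts_affected}/{n_breasts} = {rate_per_breast:.0%}")
```

In this made-up cohort the same events give 2/10 = 20% per patient but 3/20 = 15% per breast, so two studies with identical results could appear to differ if the denominator is not stated.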

Conclusion

Evaluation and reporting of outcome measures in gender-confirming chest surgery showed a high level of heterogeneity. In light of this scoping review, it could be advisable to increase comparability of studies regarding outcome evaluation. A core outcome set in gender-confirming chest surgery could help researchers and clinicians increase comparability between studies. To compare studies and procedures, standardised evaluation methods and measures are needed. Only then can we reach a better understanding of what procedures are beneficial to whom and why.

Discussion

The results of this systematic scoping review highlight a lack of consensus on outcome evaluation and reporting, which leads to heterogeneity in outcome domains and classifications.

To our knowledge, this is the first scoping review performed for gender-confirming chest surgery. A strength of the study is the large number of included studies, comprising not only studies in English but also in French and German. A further strength is that we followed the recently published PRISMA-ScR extension guideline [7]. The inclusion of breast reduction, breast augmentation and revisions made it possible to detect variations in outcome evaluation across different procedures. Limitations include that most studies were retrospective chart reviews and therefore relied on charts being adequately completed, which can lead to reporting bias. The study designs might therefore be a factor affecting the level of heterogeneity in the literature and thereby in this study. Revision surgery was often reported as the number of revision surgeries performed rather than as indications for revision surgery. A recent study found that a majority of transgender patients report barriers to surgical care, of which financial barriers are the most prevalent [58]. Therefore, utilisation of revision surgery might not be a reliable measure in healthcare systems where patients need the financial means to obtain a revision procedure. Thus, a low number of performed revision surgeries cannot be taken as a good estimate of a low need for revision surgery. The same issue can affect outcome measures gathered in the outpatient setting, as minor complications and the need for revision surgery might go unrecorded if patients must pay for an outpatient visit. Further investigation of barriers to care within gender-confirming surgery is needed.

Additional efforts should be made to establish more comparability in outcome evaluations. One such effort could be to create a core outcome set for gender-confirming chest surgery or a common, shared terminology within this research field. With such tools, researchers could easily ensure that a study design includes the agreed outcomes and uses a terminology agreed upon by a panel of experts within the field. Furthermore, instead of reporting a large number of negative findings, researchers could refer to such a tool, state that it was followed and note that negative findings are not reported.