Introduction

Labiaplasty, or labia minora reduction, is a surgical procedure in women that usually reduces the degree of protrusion of the labia minora. The incidence of labiaplasty in the National Health Service (NHS) in the UK was 1,726 in 2010–2011 [1]. The number of labiaplasties conducted in the private sector is probably greater than in the NHS. Braun [2] and Liao et al. [3] identified up to 18 publications covering 937 case reports or series of labiaplasty worldwide up to March 2009.

The motivation for seeking labiaplasty falls into three main categories [2, 4, 5]. Women desire the procedure for: (i) Aesthetic reasons: for example, to reduce self-consciousness in public situations and feelings of ugliness and abnormality. (ii) Functional reasons: for example, to reduce discomfort, irritation or pain during (nonsexual) activities. (iii) Sexual reasons: for example, to reduce dyspareunia or fears of negative evaluation by a sexual partner or self-consciousness during intimacy. About a third of women seeking labiaplasty have been teased or had negative comments made about their genital appearance [6].

Some women seeking labiaplasty may have body dysmorphic disorder (BDD). This is characterised by a preoccupation with a perceived defect that is not observable or appears slight to others; however, the individual’s concern is markedly excessive. Crouch et al. [7] reported that the labia of women seeking labiaplasty were within normal published limits. To fulfill the diagnostic criteria for BDD, however, the perceived defect must either be significantly distressing or cause impairment in social, occupational or other important areas of functioning. The most common preoccupations in BDD are facial skin, nose, eyes, eyelids, mouth and chin, or just being ugly in general [8, 9]. In other areas of the body, a cosmetic procedure and the diagnosis of BDD may be associated with a poor outcome [10–12].

Reported rates for labiaplasty are <5 % for surgical complications [13] and 10.8 % for side effects [14]. There is only one prospective pilot study of 14 women undergoing labiaplasty [15] and no controlled studies on psychosexual outcomes. All other studies are retrospective case series that claim a high level of patient satisfaction and offer anecdotes of success in the short term. None of these studies utilized standardized outcome measures of sexual function or genital body image independent of the surgeon (although one used a general body image measure).

The lack of evidence regarding psychosexual outcome of labiaplasty, especially in the long term, has led to significant criticism [7, 16]. The objectives of this study were therefore to determine the outcome after labiaplasty with a comparison group, especially in the long term. The hypotheses were that women receiving labiaplasty would improve on specific measures of genital appearance satisfaction and sexual function.

Materials and methods

Ethics permission was granted by the Joint South London and Maudsley Trust and Institute of Psychiatry NHS Research Ethics Committee (09/H0807/33). We recruited 88 women who were categorised into two groups: those desiring, and a control group of those not desiring labiaplasty. A STROBE diagram is provided in Fig. 1 for women receiving labiaplasty.

Fig. 1 Women receiving labiaplasty

Participants

  1. Women having labiaplasty

    We recruited 49 women seeking labiaplasty from the following sources: (a) Thirty-five women (71 % of the study sample) from a private cosmetic clinic, drawn from 77 women who had labiaplasty in the recruitment period and were given information about the study. (b) Fourteen women (29 % of the study sample) from an NHS gynecology clinic, drawn from a total of 35 women who had a labiaplasty and were given information about the study.

  2. Comparison group

    We recruited 39 women for the comparison group who completed baseline and 3-month follow-up questionnaires. They were characterised by not wanting labiaplasty. Comparison participants were recruited from MindSearch, a King’s College London database containing email addresses of members of the public willing to be contacted for research participation.

    The inclusion criterion was that all women were between 18 and 60 years of age. Mann–Whitney and chi-square tests were used to check whether groups were matched; no significant differences were found between the two groups in age, sexual orientation, marital status, education, ethnicity, whether or not they had children, and in symptoms of anxiety or depression (Table 1).

    Table 1 Demographics and baseline characteristics for labiaplasty and comparison group

Procedure

Women in both groups were recruited contemporaneously between January 2010 and May 2012. Prelabiaplasty, participants signed informed consent and completed all questionnaires listed below, either online (78 % of 49 participants) or on paper. This process was repeated at the 3-month follow-up. The long-term follow-up consisted of three of the outcome questionnaires: Genital Appearance Satisfaction (GAS), Pelvic Organ Prolapse—Urinary Incontinence Sexual Function Questionnaire (PISQ), and Cosmetic Procedures Scale—Labia (COPS-L); 91 % of 23 participants completing this online. Qualitative data were collected regarding any adverse effects as a result of the procedure between 11 and 42 months postoperatively. At both follow-up stages, all participants were contacted first via email then by post, with a web link and paper versions of the questionnaires. If no response had been obtained, participants were contacted by telephone.

The comparison group signed informed consent and completed the full set of questionnaires at two time points, 3 months apart, in order to control for the effects of time on these measures. At the first time point, questionnaires were completed online by 91 % of participants and at follow-up by 92 %; the remainder were completed on paper. All were thanked with a £20 High Street voucher at each stage of the study.

Labia measurements were taken for women undergoing labiaplasty at the time of the procedure. The surgeon measured the degree of protrusion of the labia minora and width of each labium with a disposable tape measure. All measurements were made with the patient in the lithotomy position, with minimal stretching of the labia. Width was measured anteroposteriorly from the clitoral hood to the lower aspect of the labia minora. We took the average of left and right measurements. NHS patients all underwent labial trimming with cutting diathermy, following which the edges were sewn over with Vicryl 3/0 Rapide. A range of techniques were used in private patients: labial trimming (15), central wedge reduction (9), de-epithelisation technique (3) and superior pedicle flap reconstruction (2).

Measures

Participants completed the following self-report questionnaires:

  1. GAS scale [17, 18]: The GAS scale was our primary outcome measure. It contains 11 statements, and total scores range from 0 to 33. Higher scores represent greater dissatisfaction with the genitalia. To calculate reliable and significant change, we used a mean of 23.2 and standard deviation (SD) of 5.1 for a clinical sample, mean of 4.75 and SD of 5.6 in a comparison group, and a Cronbach’s alpha of 0.91 [16].

  2. Hospital Anxiety and Depression Scale (HADS) [19]: The HADS is a self-report instrument used to examine the severity of anxiety and depressive symptoms in two separate subscales, each with a range from 0 to 21.

  3. PISQ [20]: The PISQ covers a broad measure of sexual function in women (range 0–125). Higher scores represent increasing sexual function.

  4. Body Image Quality of Life Inventory (BIQLI) [21]: The BIQLI is a self-report assessment scale that measures the impact of general body image concerns on a broad range of life domains. A more negative score reflects a more negative body image affecting quality of life.

  5. COPS-L [18]: This is a modification of the original COPS questionnaire [22] and focuses on concerns about the appearance of the labia rather than general appearance. The domains follow the diagnostic criteria for BDD. Participants who scored above the cutoff score of 45 on the COPS-L were interviewed using a module for Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) disorders [23].

Statistical analysis

Data were analysed using SPSS v21. Data were not normally distributed, so Mann–Whitney and chi-square tests were used to compare clinical and comparison groups at the initial time point and at the 3-month follow-up. Wilcoxon signed-rank tests were used within each group to compare the initial time point with the 3-month follow-up, and within the labiaplasty group to compare the initial time point with the long-term follow-up, using case deletion. The GAS scale was used to identify the number of women who displayed reliable and clinically significant change following labiaplasty. The method summarises changes at the individual level in the context of observed changes for the entire sample [24, 25]. Two questions were addressed:

  1. Has the patient changed sufficiently to be confident that the change is beyond that which could be attributed to measurement error? This is termed reliable change and is measured by the Reliable Change Index (RCI). It is calculated from the standard error (SE) of the difference (before and after treatment) and takes into account the reliability of the instrument (Cronbach’s alpha).

  2. How does the end state of the patient compare with the scores observed in socially and clinically meaningful comparison groups? This is termed clinically significant change. As distributions of GAS scores for clinical and comparison populations were not overlapping, we chose to use criterion b, which examines whether the woman moves to within two SD of a normative sample mean. This is the most stringent but credible criterion when the aim is to determine whether a patient returns to a normal population. We used an Excel spreadsheet, the Leeds Reliable Change Indicator, to prepare figures (available to download) [26].
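The two quantities described above can be reproduced from the GAS reference values given under Measures. The following is a minimal Python sketch, not the Leeds Reliable Change Indicator itself. It assumes the commonly used formulas SE = SD × sqrt(1 − alpha) and SE of the difference = sqrt(2 × SE²), a 95 % criterion (z = 1.96) for reliable change, and a criterion b cutoff of the comparison mean plus two SD (because higher GAS scores indicate greater dissatisfaction). All function names are illustrative.

```python
import math

# Reference values for the GAS scale, taken from the Measures section.
CLINICAL_SD = 5.1        # SD of the clinical sample
COMPARISON_MEAN = 4.75   # mean of the comparison (normative) sample
COMPARISON_SD = 5.6      # SD of the comparison sample
CRONBACH_ALPHA = 0.91    # reliability of the GAS scale


def standard_error_of_measurement(sd, reliability):
    """SE of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def standard_error_of_difference(se_measurement):
    """SE of the difference between two scores: sqrt(2 * SE^2)."""
    return math.sqrt(2.0 * se_measurement ** 2)


def reliable_change_threshold(se_difference, z=1.96):
    """Smallest change treated as reliable (z = 1.96 is an assumption)."""
    return z * se_difference


def criterion_b_cutoff(norm_mean, norm_sd):
    """Upper limit of the normative range: mean + 2 SD (higher GAS = worse)."""
    return norm_mean + 2.0 * norm_sd


se_m = standard_error_of_measurement(CLINICAL_SD, CRONBACH_ALPHA)
se_d = standard_error_of_difference(se_m)
threshold = reliable_change_threshold(se_d)
cutoff = criterion_b_cutoff(COMPARISON_MEAN, COMPARISON_SD)

print(f"SE of measurement: {se_m:.2f}")
print(f"SE of difference: {se_d:.2f}")
print(f"Reliable change threshold: {threshold:.2f}")
print(f"Criterion b cutoff: {cutoff:.2f}")
```

Under these assumptions the SE of measurement is 1.53 and the SE of the difference is 2.16, so a change of more than about 4.24 GAS points would count as reliable, and a follow-up score of 15.95 or lower would fall within two SD of the comparison mean.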

Results

Data were not normally distributed, so medians and IQRs are reported throughout; nonparametric tests were used for analyses.
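As an illustration of the nonparametric approach, the Python sketch below computes group medians and the Mann–Whitney U statistic by direct pairwise comparison. It is not the SPSS procedure used in the study, it omits the Z approximation and p value, and the scores are invented for illustration only; `mann_whitney_u` is a hypothetical helper name.

```python
from statistics import median


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) by pairwise comparison; each tie counts as 0.5."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U statistics always sum to the number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical questionnaire scores, NOT data from the study.
scores_group_a = [22, 25, 19, 28, 24]
scores_group_b = [5, 7, 2, 9, 24]

u_a, u_b = mann_whitney_u(scores_group_a, scores_group_b)
print("median A:", median(scores_group_a))
print("median B:", median(scores_group_b))
print("U (smaller of the two):", min(u_a, u_b))
```

The smaller of the two U values is the one conventionally reported; a statistics package then converts it to Z and p.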

Group characteristics prior to intervention

Table 1 demonstrates demographics and questionnaire scores for clinical and comparison groups prior to the clinical group receiving labiaplasty procedures. Prelabiaplasty, there were no significant differences in symptom severity regarding anxiety or depression, body image, quality of life or sexual function. As expected, the labiaplasty group had significantly higher dissatisfaction towards the appearance of their genital area compared with the comparison group, as evidenced by GAS and COPS-L total scores.

Sample attrition

Twenty-six participants in the labiaplasty group completed the 3-month follow-up; 23 completed the long-term follow-up (Fig. 1). However, four in the long-term follow-up group had not completed the 3-month follow-up, so in total 30 of the 49 were followed up on at least one occasion. Loss to follow-up was due to nonresponse to our invitation, although one woman stated she found the questions too intrusive. The 19 women in the labiaplasty group lost to follow-up after completing the initial questionnaires were not significantly different to the 26 who completed either the 3-month or long-term follow-up in terms of age (U = 232.50, Z = −0.868, p = 0.386), sexual orientation (χ2 = 2.711, df = 2, p = 0.258), marital status (χ2 = 4.861, df = 3, p = 0.182), education (χ2 = 0.091, df = 1, p = 0.755), ethnicity (χ2 = 2.820, df = 2, p = 0.244) and whether or not they had children (χ2 = 0.377, df = 1, p = 0.539); there were also no significant differences in terms of severity on GAS at baseline (U = 243.50, Z = −0.883, p = 0.377), HADS depression (U = 270.00, Z = −0.108, p = 0.914), HADS anxiety (U = 251.00, Z = 0.514, p = 0.607), COPS-L (U = 273.00, Z = 0.521, p = 0.602), PISQ (U = 225.50, Z = 0.816, p = 0.414) or BIQLI (U = 274.50, Z = 0.011, p = 0.991).

All but one of the 19 women lost to follow-up were reassessed clinically by the surgeon; they reported that they were satisfied with the procedure and had no adverse side effects. We therefore used case-wise deletion for missing data in analyses at 3 months and long-term follow-up.

Comparisons to a matched-comparison sample

Table 2 reports differences between groups on the standardised measures at the 3-month follow-up. There were no significant differences between groups on GAS, COPS-L, BIQLI, HADS anxiety or HADS depression. Women in the labiaplasty group scored significantly higher on the PISQ than did comparison participants, indicating significantly higher overall sexual function at 3 months.

Table 2 Comparisons of labiaplasty and control groups: scores on standardised questionnaires at 3-month follow-up

Longitudinal comparisons for labiaplasty group

Table 3 reports before and after scores on standardised measures for women in the labiaplasty group at two time points: prelabiaplasty versus the 3-month follow-up. At the 3-month follow-up, the women scored significantly lower on GAS and COPS-L (with very large effect sizes), implying improved satisfaction and less impairment concerning the appearance of their genitalia. They also had lower levels of anxiety, as indicated by a significant change on the HADS, and higher overall sexual function, as indicated by a significant change on the PISQ (moderate effect sizes).

Table 3 Comparisons of labiaplasty group from prelabiaplasty to 3-month follow-up on standardised questionnaires (data deleted case wise, n = 26)

Scores on COPS-L and GAS remained significantly lower at long-term follow-up, with large effect sizes. GAS had a median score of 7 (IQR 2–12) at long-term follow-up, which remained significantly improved compared with prelabiaplasty (Z = −4.202, p < 0.0005, d = 2.93); COPS-L had a median score of 11 (IQR 4–18), which was also a significant improvement (Z = −4.199, p < 0.0005, d = 2.24). Median score on the PISQ was 100 (IQR 89–104), which was no longer significantly different compared with prelabiaplasty (Z = −1.787, p = 0.074, d = −0.18).

Longitudinal comparisons for control group

Significant changes were observed over 3 months for the control group on several measures. At 3 months, scores on GAS had decreased, with the median moving from 7 to 2 (Z = −3.508, p < 0.0005, d = 0.72); scores on the PISQ deteriorated, with the median changing from 100 to 97 (Z = −2.049, p = 0.041, d = 0.22). Effect sizes were, however, smaller than for the labiaplasty group over time. There were no significant changes in the four other measures over time.

Reliable and clinically significant change on GAS

Figure 2 is a visual display of outcome data at 3 months on 25 labiaplasty women who completed a GAS questionnaire at this time point (26 women provided data at 3 months, but one questionnaire was incomplete). Each point on the image is a patient; the X axis is the prelabiaplasty GAS score, and the Y axis is the postlabiaplasty GAS score. The diagonal line indicates the cutoff for reliable change, with points falling within the tramlines representing unreliable change. Horizontal and vertical marker lines show criterion b, which examines whether a participant moves to within two SD of a normative sample mean and indicates clinically significant change from assessment to follow-up. At 3 months, 24 patients (96 %) achieved reliable and clinically significant change on GAS score. One patient (4 %) had reliable improvement that was not clinically significant. Overall, the RCI was 7.58, SE mean 1.53 and SE difference 2.16.

Fig. 2 Reliable and clinically significant change on Genital Appearance Satisfaction (GAS) questionnaire for labiaplasty group at 3-month follow-up (n = 25)
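How a single point in Fig. 2 is read can be sketched in Python. This is an illustration, not the Leeds Reliable Change Indicator: the tramline width and cutoff are derived from the GAS reference values under the assumptions of z = 1.96 for reliable change and comparison mean plus two SD for criterion b, and the function name `classify_gas_change` is hypothetical. The only patient-level scores taken from the study are the change from 32 to 13 reported for one participant; the other two score pairs are invented for illustration.

```python
import math

# Derived from the GAS reference values (clinical SD 5.1, alpha 0.91,
# comparison mean 4.75, comparison SD 5.6). z = 1.96 is an assumption.
SE_DIFFERENCE = math.sqrt(2.0) * 5.1 * math.sqrt(1.0 - 0.91)
TRAMLINE_WIDTH = 1.96 * SE_DIFFERENCE   # about 4.24 GAS points
CRITERION_B_CUTOFF = 4.75 + 2.0 * 5.6   # 15.95 GAS points


def classify_gas_change(pre, post):
    """Return (reliability label, clinically_significant flag).

    Lower GAS scores are better, so improvement means post < pre.
    """
    change = pre - post
    if change > TRAMLINE_WIDTH:
        label = "reliable improvement"      # below the lower tramline
    elif change < -TRAMLINE_WIDTH:
        label = "reliable deterioration"    # above the upper tramline
    else:
        label = "no reliable change"        # between the tramlines

    # Criterion b: the score moves from outside to inside the normative range.
    moved_into_range = pre > CRITERION_B_CUTOFF and post <= CRITERION_B_CUTOFF
    clinically_significant = label == "reliable improvement" and moved_into_range
    return label, clinically_significant


print(classify_gas_change(32, 13))  # scores reported for one participant
print(classify_gas_change(25, 18))  # hypothetical scores
print(classify_gas_change(20, 18))  # hypothetical scores
```

Under these assumptions, the reported change from 32 to 13 counts as both reliable and clinically significant, whereas a reliable improvement that ends above the cutoff (the second, hypothetical case) corresponds to the category "reliable improvement that was not clinically significant".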

Figure 3 is a visual display of outcome data at long-term follow-up on the 23 labiaplasty patients who provided data at this time point. Participants are assigned the same number on Figs. 2 and 3. All 23 patients again lie below the diagonal line, indicating reliable improvement: 21 (91 %) patients achieved reliable and clinically significant change, and two patients (9 %) had reliable change but showed no clinically significant change, one of whom was in this category at the 3-month follow-up. Neither of these patients had BDD. Overall, the RCI for long-term follow-up was 6.57, SE mean 1.53, and SE difference 2.16.

Fig. 3 Reliable and clinically significant change on Genital Appearance Satisfaction (GAS) questionnaire for labiaplasty group at long-term follow-up (n = 23)

Changes in diagnosis

We were especially interested in nine women who were identified as having a diagnosis of BDD at interview prelabiaplasty. All had labia minora within normal range according to the surgeon’s measures, thus fulfilling one criterion for BDD. The preoccupation was specific to the genitalia (either the exclusive or the primary feature of concern in eight women, and a secondary concern in one woman). Seven were treated privately and two on the NHS. Three months after labiaplasty, only one woman retained the diagnosis of BDD. Six of the eight women with BDD made reliable and clinically significant improvements on the GAS scale at 3 months (with two missing data). We were only able to follow up four of the eight women with BDD in the long term. These four women continued without a diagnosis of BDD and made reliable and clinically significant changes on the GAS. One woman did not lose her BDD diagnosis; her preoccupation now focussed on her nose, not her genitalia. Her concern regarding her nose was present prelabiaplasty, but her concerns about her genitalia were primary preoperatively. Of note is that she made reliable and significant change on the GAS from 32 to 13 and was pleased with her labiaplasty.

Ratings of cosmetic and functional success

Women were asked to rate the functional success on a Likert scale. Eight (31 %) said the procedure had very much improved functioning, six (23 %) much improved, five (19 %) moderately improved, four (15 %) slightly improved and three (12 %) no change.

Side effects/complications

The 23 women followed up in the long term were asked whether they had experienced any long-term adverse effects following the procedure. Seventeen said they had no adverse side effects; six (26 %) mentioned one or more side effects: (i) problems with urination (e.g. sometimes spraying) (n = 3), (ii) aesthetic concerns, such as noticeable scarring or the labia being jagged (n = 2), (iii) slight aching on one side of vaginal entrance (n = 1), (iv) reduced sexual arousal (n = 2), (v) some discomfort while wearing tight clothes (n = 1). Only one mentioned regret about having the procedure performed.

Labia measurements

Comparison of average labia minora width in private patients (mean = 28.09 mm, SD = 6.04, n = 23, range 17–41.5) and NHS patients (mean = 40.27 mm, SD = 6.99, n = 11, range 30–52.5) in a nonparametric independent samples comparison test demonstrated that NHS patients appeared to have significantly greater labia minora width than private patients (U = 20.50, Z = −3.91, p < 0.001). However, all women were in the normal range for the general population. For example, Lloyd et al. [27] found a mean width of 21.8 mm (SD = 9.4, n = 50, range 7–50).

Discussion

We conducted the first prospective study of women undergoing labiaplasty in both the NHS and private sectors with a comparison group. We used validated questionnaires of genital body image and sexual function, which were administered independently of surgeons. Ninety-six percent of women showed reliable and clinically significant change on our primary outcome measure (GAS) at a 3-month follow-up, and 91 % fell into this group at long-term follow-up. As a group, women who underwent labiaplasty showed very large effect sizes at 3 months in genital body image and had enhanced sexual functioning compared with the comparison group. At long-term follow-up, patients maintained improvements in genital body image but no longer experienced improved sexual functioning. There were minor adverse effects reported in about a quarter of our sample, but this was not a deterrent; only one woman reported that she regretted her decision to have the procedure. Our study suggests a higher rate of minor side effects (26 %) compared with Alter [14], although our study collected all reported side effects.

The main weakness of the study is that we were only able to recruit 43 % of consecutive patients who underwent labiaplasty and do not therefore know whether our sample is representative. This recruitment or attrition rate is comparable to that of the only other prospective study of labiaplasty [15] and may reflect characteristics of the clinical population (e.g. reluctance to discuss anxieties about genitalia, general avoidant tendencies). Another possible weakness is that we did not take labia measurements for our comparison group; however, given that our clinical group had measurements within the normal range (see “Results”), this would not seem critical.
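The recruitment figure can be checked against the numbers given under Participants. The short Python sketch below does that arithmetic; the variable names are illustrative, and the counts are those reported for the private clinic (35 of 77) and the NHS clinic (14 of 35).

```python
# Counts reported under Participants: (recruited, approached) per site.
sites = {
    "private clinic": (35, 77),
    "NHS clinic": (14, 35),
}


def percentage(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


total_recruited = sum(recruited for recruited, _ in sites.values())
total_approached = sum(approached for _, approached in sites.values())

for name, (recruited, approached) in sites.items():
    rate = percentage(recruited, approached)
    print(f"{name}: {recruited}/{approached} = {rate:.1f} %")

overall = percentage(total_recruited, total_approached)
print(f"overall: {total_recruited}/{total_approached} = {overall:.2f} %")
```

This gives 49 of 112 women, or 43.75 %, which agrees with the quoted 43 % if the figure is truncated rather than rounded.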

The main strengths of the study are that we used validated questionnaires and that assessments were undertaken independently of surgeons and were conducted in the long term in the labiaplasty group. However, this may also contribute to a weakness in that it was more difficult to capture data when patients attended for their 3-month follow-up appointment. Another weakness is that we were unable to follow up 19/49 (38.8 %) of the women we recruited. However, women lost to follow-up were no different in baseline measures to women who were followed up. Furthermore, all women but one were followed up clinically and reported satisfaction to the surgeon. The study had relatively small numbers, and therefore, we cannot comment on the prevalence of adverse events. Previous case series suggest minor side effects occur in about 10 % of women, and a very large case series would be required to provide an accurate estimate of prevalence of side effects. However, it is challenging to recruit consecutive cases, especially in the private sector, to participate in such research, and there is no incentive to participate after the surgery is completed.

Women with BDD did surprisingly well at the 3-month follow-up in that eight of nine lost their diagnosis. This is a small sample and thus must be interpreted cautiously, but it suggests that a diagnosis of BDD is not a contraindication to labiaplasty in the short term. It was not possible to interpret data in the long term, as we were only able to follow up 50 % of women with BDD. This suggests that the risk in BDD is relatively low in the short term for a procedure in which there is an obvious desired change (e.g. reduction of labia minora or breast augmentation) compared with a procedure in which the change may be ambiguous (e.g. rhinoplasty) and if symptoms of BDD are in the mild range without excessive distress and shame. However, in BDD, if another body feature is also of significant concern, then the preoccupation may transfer to a different feature or a new preoccupation may emerge in the long term. Further prospective studies are required to clarify this.

Crouch et al. [7] and Michala et al. [16] recommend providing reassurance about the diversity of normal vulval appearance and counselling to explore issues leading to a request for surgery. We agree that it would be desirable to evaluate a psychological intervention, especially in women seeking labiaplasty who have been teased or received comments about the appearance of their genitals [6]. However, at present, no data are available on the psychosexual outcome or genital satisfaction of either reassurance by a surgeon or subsequent counselling. Whereas there is evidence of benefit from cognitive behavioural therapy (CBT) for body image problems or BDD [28, 29], CBT is not a generic intervention and has not yet been developed for this population. A strategy of reassurance may be similar to informing a woman seeking breast augmentation that her breast size is within normal limits and does not therefore require surgery. Equally, counselling may be difficult in those with medically unexplained symptoms. Therefore, the first step would be to evaluate the role of reassurance or a psychological intervention on a standardised scale in a consecutive case series in order to estimate an effect size for a future randomised controlled trial of labiaplasty vs a psychological intervention.

Conclusion

We provide an initial benchmark for psychosexual improvements that occur after labiaplasty. We recommend that specific measures of genital body image, sexual function and side effects be used in outcome studies of labiaplasty, or of any psychological intervention, for women dissatisfied with their genitalia. As a minimum, we recommend the use of GAS, COPS-L and either PISQ or Female Sexual Functioning Index (FSFI) [30] for future audit outcome studies including psychological interventions.