Introduction

Rhinoplasty is one of the most frequently performed procedures in facial plastic surgery. The most common patient complaint is a “dorsal hump,” followed by a “too large” nose, “bulbous tip,” and “nasal airway obstruction” [1]. Reduction of a dorsal hump alters the structure of the nose, with resulting aesthetic and functional implications. The conservation of natural anatomical relationships to prevent functional sequelae of aesthetic nasal surgery has become an integral concept in rhinoplasty [2]. Because resection of the dorsal hump removes an important portion of the osseocartilaginous dorsum, preservation or reconstruction of the middle third of the nose is imperative to prevent midvault insufficiency, nasal valve dysfunction, and/or an inverted-V deformity. Over the years, various techniques and grafts have been developed to achieve this objective in dorsal hump reduction.

Since its description by Sheen, the spreader graft has become the gold standard for midvault reconstruction after hump resection [3]. A spreader graft is a rectangular strip of cartilage placed submucosally along the superior border of the septum, between the upper lateral cartilage and the septum. It has been shown to preserve support of the nasal dorsum and function of the internal nasal valve [3]. It results in a wider dorsal roof and improved dorsal aesthetic lines, and it expands the internal valve angle by moving the lateral wall away from the septum [3]. In addition to their use in reduction rhinoplasty, spreader grafts have become an important technique to help straighten the deviated septum and nasal dorsum [4].

Release, preservation, and resuspension of the upper lateral cartilages to the dorsal septum were described by Fomon, though not in the context of dorsal hump reduction [5]. The more modern iteration, used in midvault reconstruction after dorsal hump reduction, was described by O’Neal and Berkowitz [6]. Spreader flaps, also known as autospreader flaps, are our primary method of midvault reconstruction after hump reduction.

Both the spreader graft and autospreader have been extensively studied and used throughout the years [2,3,4, 6, 9, 10, 14]. However, there is heterogeneity in the reported efficacy and outcome measures of these techniques and limited comparative data. Therefore, the aim of this study was to systematically compare the outcomes of spreader grafts and autospreader flaps in the context of midvault reconstruction after dorsal hump removal.

Materials and Methods

A systematic review was conducted in accordance with the Cochrane Handbook for Systematic Reviews of Interventions [7]. Inclusion and exclusion criteria were based on the population, intervention, comparison, and outcome (PICO) framework.

Population

Adults (≥18 years) with nasal dorsal irregularities requiring nasal dorsal reconstruction with spreader graft or upper lateral cartilage turn-in flaps.

Type of Studies

Clinical and observational studies published in peer-reviewed academic journals with abstracts available without restrictions on language or time of publication were included. Studies were excluded from the systematic review and meta-analysis when they met the following criteria: pilot reports, case reports, case series (< 5 patients), descriptive publications on surgical techniques, theses, conference proceedings, letters (except research letters and brief reports), and editorials.

Intervention

Rhinoplasty employing either spreader graft or autospreader flap techniques. Dorsal hump reduction usually involves reducing the cartilaginous dorsal septum and trimming the vertical height of the upper lateral cartilages. The spreader graft is the standard method for stabilizing the middle vault. The upper lateral cartilage turn-in flap (autospreader or spreader flap) was later introduced as a viable alternative to the spreader graft for middle nasal vault reconstruction.

Comparison

Spreader graft versus autospreader flap technique

Outcome

Difference between groups in the rates of complications and changes in nasal cosmesis and nasal obstruction severity levels before and after the surgery.

Data Sources and Searches

The MEDLINE (via PubMed), Embase, CINAHL, CENTRAL, Scopus, and Web of Science databases were searched in March 2021. To avoid missing relevant studies, broad search terms were used. The search strategy for each database was as follows:

  • PubMed ((spreader[TIAB] OR autospreader[TIAB]) AND (graft[TIAB] OR flap[TIAB])) OR ("turn in"[TIAB] AND cartilage) AND hasabstract[TW]

  • Embase (spreader:ab,ti OR autospreader:ab,ti) AND graft:ti,ab,kw AND 'human'/de AND 'article'/it AND 'human'/de

  • CINAHL ((TI spreader OR AB spreader OR TI autospreader OR AB autospreader) AND (TI graft OR AB graft OR TI flap OR AB flap)) OR ((TI "turn in" OR AB "turn in") AND (TI cartilage OR AB cartilage)) Limiters: Abstract Available Source Types: Academic Journals

  • CENTRAL ((spreader OR autospreader) AND (graft OR flap)) OR ("turn in" AND cartilage) in Title Abstract Keyword in Trials

  • Scopus ((TITLE-ABS-KEY ( spreader ) OR TITLE-ABS-KEY ( autospreader ) ) AND ( TITLE-ABS-KEY ( graft ) OR TITLE-ABS-KEY ( flap ) ) ) OR ( TITLE-ABS-KEY ( "turn-in" ) AND TITLE-ABS-KEY ( cartilage ) ) AND ( LIMIT-TO ( EXACTKEYWORD , "Rhinoplasty" ) ) AND ( LIMIT-TO ( EXACTKEYWORD , "Human" ) ) AND ( LIMIT-TO ( DOCTYPE , "ar" ) )

  • Web of Science (TS= (((spreader OR autospreader) AND (graft OR flap) ) OR ("turn in" AND cartilage) ) OR TI= (((spreader OR autospreader) AND (graft OR flap) ) OR (("turn in" AND cartilage) ))) AND DOCUMENT TYPES: (Article) Indexes=SCI-EXPANDED Timespan=All years

Study Selection

Search results were first screened based on titles and abstracts by two independent reviewers (C.M.B. and P.N.P.). The identified manuscripts were then screened on full texts according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) reporting guideline (Figure 1). Disagreements between the reviewers were resolved by consensus or by a third reviewer (C.K.K.).

Fig. 1 PRISMA flow diagram

Assessment of Risk of Systematic Bias

The methodological quality of the included studies was classified according to the Guidance for Assessing the Quality of Before–After (Pre–Post) Studies with No Control Group [58]. Twelve attributes were assessed: (1) study question or objective clearly stated; (2) study population and eligibility criteria; (3) study participants representative of clinical populations of interest; (4) all eligible participants enrolled; (5) sample size; (6) intervention clearly described; (7) outcome measures clearly described, valid, and reliable; (8) blinding of outcome assessors; (9) follow-up rate; (10) statistical analysis; (11) multiple outcome measures; and (12) group-level interventions and individual-level outcome efforts. The quality of each included study was rated as poor, fair, or good.

Data Extraction

Relevant data were extracted from the records by one reviewer (C.M.B.) using a predefined structured form and verified by a second reviewer (C.K.K.).

Statistical Methods

Seventeen studies reported complete NOSE score data. The NOSE score estimates reported by the original studies were pooled according to technique (spreader graft, autospreader flap, or neither) using a random-effects model. Results were reported as weighted raw mean differences in NOSE scores before and after surgery, with 95% confidence intervals (95% CIs). Heterogeneity was assumed to be present if the Q statistic exceeded its degrees of freedom (df). The proportion of observed variance attributable to true differences in effect was assessed using the I² statistic. Differences between treatment groups were assessed on the pooled summary data using ANOVA with the Tukey HSD post hoc test, with the confidence level for post hoc confidence intervals set at 95%. ANOVA results were reported as two-tailed p values, with p ≤ 0.05 considered statistically significant. All analyses were carried out using CMA software, version 3.3 (available from www.meta-analysis.com), and Stata/IC, release 16 (StataCorp LP, College Station, TX, USA).
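To illustrate the pooling and heterogeneity statistics described above, the following is a minimal Python sketch of a random-effects synthesis. It assumes the DerSimonian-Laird estimator of between-study variance, which is a common default but is not stated in this review; the function name `pool_random_effects` and the three example studies are hypothetical and are not data from the included studies.

```python
import math


def pool_random_effects(effects, variances, z=1.96):
    """Pool per-study mean differences with a random-effects model.

    Assumption: DerSimonian-Laird estimate of the between-study
    variance (tau^2). `effects` are raw mean differences and
    `variances` are their squared standard errors.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and mean, needed for Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and its degrees of freedom.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # DerSimonian-Laird tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I^2: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights, pooled estimate, and confidence interval.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    return {
        "pooled": pooled,
        "ci_low": pooled - z * se,
        "ci_high": pooled + z * se,
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical changes in NOSE score and their variances.
    result = pool_random_effects([-25.0, -30.0, -18.0], [4.0, 6.25, 9.0])
    for name, value in result.items():
        print(f"{name}: {value:.2f}")
```

Under the criterion used in this review, heterogeneity would be flagged whenever the returned `q` exceeds `df`.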

Results

The search yielded 1129 records (Figure 1). After excluding duplicate records, reviews, case studies, conference proceedings, letters, and editorials, 441 records were screened by 2 independent reviewers based on titles and abstracts. The remaining 71 records were further assessed based on their full texts. Fifty-two studies [2, 4, 8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57] were included in the qualitative analysis. Of the 52 included studies, 16 were conducted in Turkey, 14 in the USA, 5 in Iran, 3 in Egypt, 3 in Italy, 3 in Canada, 2 in Germany, 1 in Brazil, 1 in South Korea, 1 in Portugal, 1 in Argentina, 1 in the Netherlands, and 1 in Oman. Among them, 45 were observational in nature: 13 retrospective studies (25.0%), 30 prospective studies (57.7%), 1 descriptive analytical study (1.9%), and 1 case series (1.9%). There were 6 randomized clinical trials (11.5%) and 1 non-randomized clinical trial (1.9%). There were 34 studies (65.4%) related to spreader graft (SG) alone [8,9,10,11,12,13, 16,17,18,19,20,21,22,23, 25, 27, 28, 30,31,32,33,34, 37, 40, 41, 43, 46,47,48, 50,51,52, 54, 55], 10 studies of autospreader flap (AF) alone (21.1%) [2, 24, 26, 29, 36, 39, 42, 49, 53, 56], and 8 studies (13.5%) involving both techniques [4, 14, 15, 35, 38, 44, 45, 57]. Sample size varied from 15 to 694, and the mean age varied from 13 to 73 years (Table 1). Among the 52 identified studies, 8 [8, 9, 18, 19, 25, 32, 47, 54] included patients younger than 18 years in their cohorts. Although this deviates from the PICO framework adopted for this review, these studies were retained because their cohorts also included adult patients and their content was relevant to this review. Of the 52 studies, 19 reported NOSE data, but only 17 contained complete preoperative and postoperative data. Twenty-two studies included patients who underwent revision surgery [8, 10,11,12, 18, 19, 22, 23, 27, 30,31,32, 34, 38, 41, 43, 46, 47, 52,53,54,55].

Table 1 Basic characteristics of included studies

Risk of Systematic Bias

Of the included 52 studies [2, 4, 8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57], methodologically, 25 (48.1%) were considered to be good [2, 4, 9, 13,14,15, 18, 20, 22,23,24, 30, 31, 33, 35, 37, 41, 42, 45,46,47,48, 50, 52, 57], 6 (11.5%) were considered poor [26, 32, 34, 38, 51, 53], and 21 (40.4%) were considered fair [8, 10,11,12, 16, 17, 19, 21, 25, 27,28,29, 36, 39, 40, 43, 44, 49, 54,55,56] (eTable 1).

Patient-Reported Outcome Measures

Of the 52 studies, 19 [2, 9, 12, 15, 21,22,23, 29, 30, 33, 35, 41, 45, 47,48,49,50, 52, 56] reported Nasal Obstruction Symptom Evaluation (NOSE) [59] scores (Table 2). However, only 17 [2, 9, 12, 15, 21,22,23, 29, 30, 33, 35, 45, 47, 49, 50, 52, 56] presented complete preoperative and postoperative NOSE data. The studies by Paul et al. [41] and Talmadge et al. [48] were excluded from the quantitative synthesis because the reported data were unclear. The 17 included studies were divided into three groups (SG, AF, or none) and the pooled estimates were analyzed. Four of the 17 studies used the AF technique [2, 29, 49, 56], ten used the SG technique [9, 12, 21,22,23, 30, 33, 47, 50, 52], and three reported both [15, 35, 45]. The overall change in the NOSE score from before to after surgery was −23.9 (95% CI, −26.7 to −21.1) points. The changes in NOSE scores were similar across the three groups: −27.1 (95% CI, −36.2 to −18.0) points for AF, −26.5 (95% CI, −30.4 to −22.6) points for SG, and −19.9 (95% CI, −24.3 to −15.5) points where neither was used (Table 3). Heterogeneity was substantial: overall Q = 7182, df = 36, I² = 99%. The ANOVA for summary data (Tukey HSD post hoc test) showed no differences between groups: AF versus no graft (p = 0.7578), AF versus SG (p = 0.9948), and SG versus no graft (p = 0.6608).

Table 2 Nasal obstruction symptom evaluation score and visual analog scale score
Table 3 Change in nasal obstruction symptom evaluation score across the analyzed studies

Six studies reported results using a visual analog scale (VAS) [2, 4, 35, 37, 42, 56]. Three studies reported scores for AF [2, 42, 56], one for SG [37], and two for both [4, 35] (Table 2). One study analyzed only functional aspects [37], three studies analyzed only aesthetic aspects [2, 35, 56], and two studies analyzed both aesthetic and functional aspects [4, 42]. The study by Hassanpour et al. [4] did not report preoperative and postoperative mean scores and standard deviations, only the percentage of patients satisfied with appearance and function.

Objective Outcome Measures

Among the fifteen studies (28.9%) reporting acoustic rhinomanometry, two were AF studies [24, 42], eleven were SG studies [2, 9, 16,17,18, 20, 21, 31, 33, 37, 41], and two related to both AF and SG [4, 57] (Table 4). Eight studies (15.4%) reported both preoperative and postoperative outcomes, as well as standard deviations [16, 18, 21, 33, 37, 41, 42, 57]. Of these eight studies with complete data, six concerned SG [16, 18, 21, 33, 37, 41], one concerned AF [42], and one examined both SG and AF [57]. The seven other studies (13.5%) [2, 4, 9, 17, 20, 24, 31] reported objective outcomes but did not report data complete enough to compare preoperative and postoperative results. For this reason, a quantitative synthesis was not carried out.

Table 4 Results of acoustic rhinomanometry

Risks of Complications or Revision Surgery

Of the 52 studies included, 18 (34.6%) reported the proportion of revision surgery and details of complications [8,9,10,11, 16,17,18, 20, 26, 30, 31, 34, 36, 38, 39, 41, 43, 53]. Complications were reported in 13 of 34 SG studies (38.2%), in 4 of 11 AF studies (36.4%), and in 1 of 7 combined SG and AF studies (14.3%) (eTable 2). Revision rates were reported in 5 of 34 SG studies (14.7%), in 2 of 11 AF studies (18.2%), and in 1 of 7 combined studies (14.3%) (eTable 2). Bleeding ranged from 0 to 4.47%, infection from 0 to 5.62%, aesthetic complications excluding dorsal irregularities from 0 to 11.73%, other functional complications from 0 to 15.0%, and revision surgery from 0 to 6.12%. In the 34 SG studies (pooled n = 3326), there were 8 infections (0.24%), 9 bleeding events (0.27%), no dorsal irregularities, 29 other cosmetic complications (0.87%), and 46 other functional complications (1.38%). In the 5 studies reporting revision rates (n = 367) [8, 9, 16, 17, 34], there were 14 revision procedures (3.81%). In the 11 AF studies (pooled n = 801), there were 16 other cosmetic complications (2.00%), 10 other functional complications (1.25%), no infections, no dorsal irregularities, and no bleeding events. One study (n = 147) [38] reported revision surgery in 9 cases (6.12%). In the 7 studies involving both SG and AF (pooled n = 749), there was 1 other cosmetic complication: Manavbaşi and Başaran [38] reported that one patient had problems resulting from excessive dorsal width and excessive swelling in the supratip area and requested removal of the grafts in the second postoperative week. There were no infections, no bleeding events, no other functional complications, and no dorsal irregularities. Only one study (n = 169) [38] described a revision procedure, in 1 patient (0.60%).
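The pooled percentages quoted above follow from dividing event counts by the pooled number of patients. The short Python sketch below reproduces that arithmetic from the counts given in the text; the helper name `pooled_rate` is hypothetical and is used only for illustration.

```python
def pooled_rate(events, patients):
    """Return the event rate as a percentage, rounded to two decimals."""
    if patients <= 0:
        raise ValueError("patients must be positive")
    return round(100.0 * events / patients, 2)


# Event counts as reported in the text for each technique.
sg_events = {"infection": 8, "bleeding": 9,
             "other cosmetic": 29, "other functional": 46}
af_events = {"other cosmetic": 16, "other functional": 10}

if __name__ == "__main__":
    for name, count in sg_events.items():
        print(f"SG {name}: {pooled_rate(count, 3326)}%")  # pooled n = 3326
    for name, count in af_events.items():
        print(f"AF {name}: {pooled_rate(count, 801)}%")   # pooled n = 801
```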

When comparing SG with AF, the relative risk was 4.10 (95% CI, 0.24–70.93) for infections; 4.58 (95% CI, 0.27–78.61) for bleeding; 0.24 (95% CI, 0.0048–12.14) for nasal dorsal irregularities; 0.44 (95% CI, 0.24–0.80) for other aesthetic complications; 1.11 (95% CI, 0.56–2.19) for other functional complications; and 0.37 (95% CI, 0.16–0.86) for revision surgery.
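As an illustration of how such relative risks can be computed from pooled counts, the following Python sketch uses the standard log-scale confidence interval. The review does not state how groups with zero events were handled; the sketch assumes that 0.5 is added to every cell of the 2×2 table when either group has no events, an assumption that appears consistent with the infection and other aesthetic complication figures reported above. The function name `relative_risk` is hypothetical.

```python
import math


def relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Relative risk of group A versus group B with a log-scale CI.

    Assumption (not stated in the review): when either group has zero
    events, 0.5 is added to every cell of the 2x2 table, so each event
    count grows by 0.5 and each group total by 1.
    """
    a, b = float(events_a), float(events_b)
    na, nb = float(n_a), float(n_b)
    if a == 0 or b == 0:
        a, b = a + 0.5, b + 0.5
        na, nb = na + 1.0, nb + 1.0

    rr = (a / na) / (b / nb)
    # Standard error of ln(RR).
    se = math.sqrt(1.0 / a - 1.0 / na + 1.0 / b - 1.0 / nb)
    low = math.exp(math.log(rr) - z * se)
    high = math.exp(math.log(rr) + z * se)
    return rr, low, high


if __name__ == "__main__":
    # Infections: 8 of 3326 SG patients versus 0 of 801 AF patients.
    print(relative_risk(8, 3326, 0, 801))
    # Other aesthetic complications: 29 of 3326 versus 16 of 801.
    print(relative_risk(29, 3326, 16, 801))
```

The very wide intervals for infections and bleeding reflect the zero event counts in the AF group, where the estimate depends heavily on the chosen correction.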

Discussion

This study systematically evaluated outcomes and complications of SG and AF in 52 studies. Fewer than half of the studies included in this review, 25 of 52 (48.1%), demonstrated good methodology according to the Guidance for Assessing the Quality of Before–After (Pre–Post) Studies with No Control Group [58]. The meta-analysis included only the 17 studies that reported both preoperative and postoperative NOSE scores [59]. The pooled change in NOSE score was −26.5 (95% CI, −30.4 to −22.6) points for SG, −27.1 (95% CI, −36.2 to −18.0) points for AF, and −19.9 (95% CI, −24.3 to −15.5) points when neither was used. The confidence bounds closest to a zero effect for AF (−18.0) and for no graft (−15.5) are both smaller in magnitude than 19.4, the reported MCID for NOSE. Heterogeneity was high (I² = 99%).

Of the 52 studies included, rates of revision surgery and complications were described in 18 studies: 13 related to the SG technique [8,9,10,11, 16,17,18, 20, 30, 31, 34, 41, 43], 4 related to the AF technique [26, 36, 39, 53], and 1 related to both techniques [38]. Revision surgery rates were reported in 5 of 34 studies for SG, in 1 of 11 studies for AF, and in 1 of 7 studies that used both techniques. In the spreader graft group, other functional complications (1.38%) were the most prevalent, followed by other cosmetic complications (0.87%). In the autospreader flap group, other cosmetic complications (2.00%) were more numerous than other functional ones (1.25%) (eTable 2). Other complications, such as bleeding and infections, were not found to differ significantly between the 2 groups. Overall, these complication rates were very low, all occurring at rates of 2% or less. Revision surgery was reported more often: in 14.7% of SG-only studies (5 of 34) and 14.3% of studies using both SG and AF (1 of 7), compared with 9.1% of AF-only studies (1 of 11).

Five studies, four evaluating the spreader graft technique [21, 33, 37, 41] and one evaluating the autospreader flap technique [42], reported an increase in the minimal cross-sectional area postoperatively. Two studies using spreader grafts showed that nasal airflow during quiet inspiration improved postoperatively [16, 18]. One study [57] comparing nasal air resistance in patients who underwent rhinoplasty with spreader grafts or autospreader flaps reported a postoperative decrease in air resistance in both groups. Despite the improvements in these objective outcomes, it is difficult to compare the techniques or to state which is more effective, since the data were collected from different groups or using different parameters.

As various modifications have been suggested for the autospreader flaps, it is unclear if certain aspects like scoring the autospreader flaps have any impact on outcomes.

Limitations

We understand the complexity of middle vault management. Not all spreader grafts or flaps are the same, and each surgeon has their own modification of these techniques. These differences create an inherent problem in obtaining a standardized result in the management of the middle vault. The main limitations of this study were the lack of consistent methodology among the included studies and the heterogeneity of reported outcomes. Over half of the included studies (27 of 52) were rated as fair or poor in quality. While 52 studies were included in the qualitative analysis, most did not include complete outcome data, and the lack of standardized reporting of patient outcomes, a major shortcoming, makes it difficult to compare the two methods effectively.

Conclusion

Of the 52 studies reviewed, fewer than half were considered to have good methodology, and only 17 were included in the quantitative analysis. Discrepancies in the functional and/or aesthetic outcome measures made comparisons difficult. To increase the reliability and level of evidence, surgical outcome measures should be standardized, and improved study methodology is required. We recommend that a highly validated and extensively translated PROM, such as the standardized cosmesis and health nasal outcomes survey (SCHNOS) questionnaire, be accepted as a global standard in assessing rhinoplasty patients [60,61,62,63,64,65,66,67,68,69,70,71,72]. According to the available data, changes in NOSE scores after rhinoplasty were similar in procedures that used spreader grafts only or autospreader flaps only. Complications did not differ significantly between groups. Given that this systematic review and meta-analysis found no significant differences between the two techniques, the choice may rest on surgeon preference, and autospreader flaps may be beneficial since they limit the need for cartilage harvest.