Introduction

Mitral regurgitation (MR) affects almost 10% of people over 75 years of age [1]. With aging populations, the potential impact of symptomatic severe MR on functional status, quality of life and health care expenditure is considerable [2, 3].

Because of high surgical risk, about half of patients with symptomatic severe MR do not undergo valve surgery, which is the therapy of first choice [4]. Percutaneous mitral valve repair (PMVR) has emerged as an alternative option for many inoperable or high-risk patients with symptomatic MR [5, 6]. The MitraClip system is currently the most widely applied method, with more than 40,000 implants worldwide. So far, a mortality benefit of PMVR compared to conservative or surgical treatment has not been shown, and hence PMVR is recommended as a symptomatic therapy [7]. In the EVEREST II study, which is the only randomized study on PMVR so far, patients undergoing the MitraClip procedure showed an improvement in quality of life that was similar in magnitude to that seen after surgical repair [8]. However, these data cannot be transferred to real-world patients undergoing PMVR, who differ substantially with respect to underlying mitral valve pathology, age, left-ventricular function and comorbidity [9,10,11,12,13]. Taken together, the actual impact of PMVR on quality of life, symptom burden and functionality is poorly quantified, but it is of major interest to physicians and patients for guiding informed decisions [14, 15] and for estimating the benefit of PMVR within the health care system.

The aim of this study was to systematically evaluate and quantify the impact of PMVR using MitraClip on various measures of functionality. We used data from a prospective cohort of patients undergoing the MitraClip procedure at our center, with assessment of 6-min walking distance, heart failure-associated symptoms and health-related quality of life, and combined the results with data obtained from a systematic review on this topic.

Methods

Prospective cohort study

All patients undergoing the MitraClip procedure at our high-volume referral center between May 2014 and June 2016 were eligible for inclusion if they gave written informed consent. All patients were on optimal medical therapy, had an indication for treatment of MR according to current guidelines and were discussed in the interdisciplinary Heart Team, which reached a concordant decision for interventional treatment with the MitraClip system. There were no general clinical or morphologic criteria precluding PMVR, but the decision on feasibility largely followed the criteria defined in a recent consensus paper of the German Society of Cardiology [16]. The study was approved by the local ethics committee of the University of Cologne (14-116).

Six-minute walking distance (6MWD) [17], New York Heart Association (NYHA) class, the Minnesota Living with Heart Failure Questionnaire (MLWHFQ) [18] and the generic, validated Medical Outcomes Study Short-Form 36 [19, 20] (SF-36, Optuminsight, Life Sciences, Inc.) were assessed during hospitalization 1–5 days before the procedure, depending on the time between admission and procedure, and at follow-up about 6 weeks after the procedure. Assessments were performed by a trained medical student who was blinded to procedural and echocardiographic results.

Systematic review protocol: data sources and study selection

In accordance with the PRISMA criteria for systematic reviews [21], we conducted a systematic search of PubMed using the term “mitraclip” up to 30 April 2016, without language restriction, to identify full-length papers on prospective and retrospective studies that reported functional status or quality-of-life outcomes after MitraClip. We also manually searched the reference lists of seven published reviews [22,23,24,25,26,27,28] (Fig. 1).

Fig. 1 Flow chart of the study selection

Studies were eligible if they reported one of the following functional measures at baseline and beyond discharge following MitraClip: (1) all-cause mortality; (2) NYHA functional class; (3) the SF-12/-36 physical component summary (PCS) and mental component summary (MCS); and (4) other measures of functional capacity and quality of life, including 6MWD, EuroQol-5D [29], and the MLWHFQ. Studies were excluded if: (1) they were experimental studies on animals, reviews, commentaries, case reports, or abstracts; (2) the intervention was not MitraClip; or (3) the study lacked defined outcomes for inclusion. When more than one study originated from the same patient population, studies were prioritized according to the following hierarchy of characteristics: (1) reporting of quantitative measures of functionality and quality of life beyond NYHA class; (2) sample size; and (3) duration of follow-up. One multicenter registry [30] was included in the analysis, although parts of the registry population were also reported in several other studies [31,32,33,34,35,36] included in the systematic review. However, only the EuroQol-5D outcome, which was not reported in the individual studies, was analyzed from the registry population.

Two investigators (C.I. and R.P.) independently assessed abstracts and full-text papers for eligibility, and disagreement was resolved in all cases after a second independent re-assessment by the two investigators.

Data extraction and quality assessment

Two investigators (C.I. and S.L.) independently examined the quality of the assessed studies, and disagreement was resolved by a third reviewer (R.P.). We modified the Newcastle-Ottawa Scale [37] to evaluate studies for the following quality domains (see Appendix Method 1): (1) representativeness of source population; (2) selection of the comparison patients; (3) assessment of functional outcomes; and (4) adequacy of follow-up. Evidence for publication bias was assessed visually using funnel plots and tested statistically using Egger’s regression test of asymmetry.
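For readers unfamiliar with Egger's test, the following is a minimal sketch of the regression it is commonly based on: the standardized effect (effect divided by its standard error) is regressed on precision (one over the standard error), and an intercept far from zero suggests funnel plot asymmetry. The function name `egger_intercept` and the example numbers are hypothetical and are not taken from the included studies or from the authors' analysis.

```python
import math
from typing import Sequence, Tuple


def egger_intercept(effects: Sequence[float],
                    ses: Sequence[float]) -> Tuple[float, float, float]:
    """Return (intercept, standard error, t statistic) of Egger's regression.

    The standardized effect (effect / SE) is regressed on precision (1 / SE)
    by ordinary least squares; an intercept far from zero suggests asymmetry.
    """
    n = len(effects)
    if n != len(ses) or n < 3:
        raise ValueError("need at least three studies with matching lengths")
    x = [1.0 / se for se in ses]                   # precision
    z = [y / se for y, se in zip(effects, ses)]    # standardized effect
    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar
    # Residual variance with n - 2 degrees of freedom.
    resid_var = sum((zi - intercept - slope * xi) ** 2
                    for xi, zi in zip(x, z)) / (n - 2)
    se_int = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    # A perfect fit leaves no residual variance, so t is undefined.
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t_stat


# Hypothetical mean changes and standard errors, for illustration only.
effects = [12.0, 9.0, 15.0, 7.0, 20.0]
ses = [2.0, 1.5, 4.0, 1.0, 6.0]
intercept, se_int, t_stat = egger_intercept(effects, ses)
print(f"intercept={intercept:.2f}, SE={se_int:.2f}, "
      f"t={t_stat:.2f} (df={len(effects) - 2})")
```

A two-sided p-value would then be read from a t distribution with n − 2 degrees of freedom; that step is omitted here to keep the sketch free of external dependencies.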

A standardized form was used to extract patient-related and treatment-related characteristics. The following primary outcomes were extracted (for details of these measures see Appendix Tab. 1): change in NYHA class (≥1 class) [38], SF-12/36 PCS and MCS scores (≥2.5 points) [39], 6MWD (≥50 m) [17], EuroQol-5D (≥0.074 points) [40], and MLWHFQ (≥5 points) [41]. Cut-off values that are usually regarded as clinically meaningful for patients are shown in parentheses. These cut-off values are arbitrary and not explicitly validated in PMVR patients, but are recommended in recent consensus statements on clinical trial design of transcatheter mitral valve repair and heart failure [42,43,44]. We also extracted all-cause mortality as a secondary outcome. Where matched data were available, we used only the matched data for the clinical parameters [31, 45]. Where only the median and interquartile range (IQR) were given, we assumed an approximately normal distribution and therefore set the mean equal to the median and the standard deviation equal to IQR/1.35 [46].
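The conversion from median and interquartile range to mean and standard deviation can be sketched as follows. This is an illustration under the stated normality assumption, not the authors' extraction code; the names `Summary` and `from_median_iqr` and the example values are hypothetical.

```python
from dataclasses import dataclass

# The interquartile range of a normal distribution spans about 1.35
# standard deviations, which is the divisor quoted in the text.
IQR_TO_SD = 1.35


@dataclass
class Summary:
    mean: float
    sd: float


def from_median_iqr(median: float, q1: float, q3: float) -> Summary:
    """Approximate mean and SD from median and quartiles, assuming normality."""
    if q3 < q1:
        raise ValueError("q3 must not be smaller than q1")
    return Summary(mean=median, sd=(q3 - q1) / IQR_TO_SD)


# Hypothetical 6MWD values in metres, not taken from any included study.
example = from_median_iqr(median=250.0, q1=182.5, q3=317.5)
print(f"mean={example.mean:.1f} m, SD={example.sd:.1f} m")
```

The approximation degrades for skewed outcomes, where the median and mean diverge, so converted values are best treated as rough estimates.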

Data synthesis and analysis

Because of the poor quality of the included studies and considerable unexplained heterogeneity, we summarized the change in primary outcomes descriptively, without pooling individual study estimates. Missing data on mean change and the corresponding standard deviation were obtained as described in Chapter Seven of the Cochrane Handbook for Systematic Reviews of Interventions [46] (see Appendix Method 2 for computation of the change in primary outcomes). When there are very few high-quality studies, a DerSimonian–Laird random-effects estimate [47] can be misleading, and its 95% confidence interval underestimates the uncertainty of treatment effects. Thus, we presented the range (minimum–maximum) of observed mean changes and displayed the variation in forest plots. Since baseline values of 6MWD differed substantially across studies, we calculated percent changes from baseline to follow-up for secondary analyses (see Appendix Method 3 [48]). Because of the high heterogeneity of effect estimates, we performed meta-regression including important study baseline characteristics and study quality criteria (representativeness of patients, availability of comparison group, completeness of follow-up and outcome assessment) using the STATA command “metareg” for outcomes with 10 or more study groups available. Analyses were performed in STATA version 12.1 (StataCorp LP, USA).
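To make the heterogeneity statistics concrete, the sketch below computes Cochran's Q, the I² statistic and the DerSimonian–Laird between-study variance from study-level mean changes and their standard errors, together with the minimum–maximum range used for the descriptive summary. It uses the standard textbook formulas and is not the authors' STATA code; the function names and example values are hypothetical.

```python
from typing import Sequence, Tuple


def heterogeneity(effects: Sequence[float],
                  ses: Sequence[float]) -> Tuple[float, float, float]:
    """Return Cochran's Q, I^2 (percent) and the DerSimonian-Laird tau^2."""
    if len(effects) != len(ses) or len(effects) < 2:
        raise ValueError("need at least two studies with matching lengths")
    w = [1.0 / se ** 2 for se in ses]              # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to between-study differences.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, i2, tau2


def observed_range(effects: Sequence[float]) -> Tuple[float, float]:
    """Minimum and maximum of the observed mean changes."""
    return min(effects), max(effects)


# Hypothetical mean 6MWD changes (m) and standard errors, illustration only.
effects = [10.0, 50.0, 90.0]
ses = [10.0, 10.0, 10.0]
q, i2, tau2 = heterogeneity(effects, ses)
low, high = observed_range(effects)
print(f"Q={q:.2f}, I2={i2:.2f}%, tau2={tau2:.1f}, range={low}-{high} m")
```

With only three widely scattered studies, as in this toy input, I² is very high, which illustrates why a single pooled estimate would say little about the effect expected in any one setting.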

Results

Prospective cohort study

Of 230 patients admitted for the MitraClip procedure, 217 agreed to participate, 2 of whom died before the procedure (Appendix Fig. 1). A total of 215 patients [mean age 78 (±8) years; 57% male] with a high estimated surgical risk [mean Logistic EuroScore 22% (±16%)] were included in the study. The proportions of patients with a left-ventricular ejection fraction of <30, 30–50 and >50% were 27, 27 and 46%, respectively. The underlying pathology of MR was primary/degenerative in 35%, secondary/functional in 57%, and combined degenerative and functional in 8%. Overall, 87% of patients were in NYHA class III/IV.

In 7 of 215 patients (3%) a clip could not be implanted because of technical, procedural or morphologic reasons. Overall, 192 (89%) patients underwent successful MitraClip implantation, defined as implantation of at least 1 clip and reduction of MR grade to ≤2. No patient was lost to follow-up at 6 weeks regarding vital status. Of 208 patients with at least 1 clip implanted, 6 (2.9%) died after the procedure or during the 6-week follow-up. Paired values for baseline and follow-up in patients with at least 1 clip implanted and surviving until follow-up were available in 196 (97%) patients for NYHA class, in 142 (70%) patients for 6MWD and in 171 (85%) patients for MLWHFQ, PCS and MCS. Missing values were mainly due to incomplete assessment of individual tests during the follow-up visit.
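The patient flow and the percentages above can be retraced with a few lines of arithmetic. The sketch assumes that the denominator for the paired-data percentages is the number of patients with at least 1 clip who survived to follow-up (208 − 6 = 202), which the text implies but does not state explicitly; the variable names are chosen for illustration.

```python
def pct(numerator: int, denominator: int, digits: int = 0) -> str:
    """Format a proportion as a percentage string."""
    return f"{numerator / denominator:.{digits}%}"


admitted, consented, died_before = 230, 217, 2
enrolled = consented - died_before           # 215 patients included
no_clip = 7
implanted = enrolled - no_clip               # 208 with at least 1 clip
deaths = 6
survivors = implanted - deaths               # 202 assumed denominator

print(pct(no_clip, enrolled))                # 3%
print(pct(192, enrolled))                    # 89% successful implantation
print(pct(deaths, implanted, digits=1))      # 2.9%

# Paired baseline and follow-up values reported in the text.
paired = {"NYHA class": 196, "6MWD": 142, "MLWHFQ, PCS and MCS": 171}
for name, n in paired.items():
    print(f"{name}: {n}/{survivors} = {pct(n, survivors)}")
# NYHA class: 196/202 = 97%
# 6MWD: 142/202 = 70%
# MLWHFQ, PCS and MCS: 171/202 = 85%
```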

Mean NYHA class decreased from 2.9 (SD 0.5) to 2.4 (SD 0.6, p < 0.0001), with 55% of patients showing an improvement of 1 class or more. Mean 6MWD increased from 256 m (SD 129 m) to 299 m (SD 121 m, p < 0.0001), with 39% of patients showing an improvement of 50 m or more. Mean MLWHFQ score decreased from 34 points (SD 18) to 21 points (SD 15, p < 0.0001), with 70% of patients showing an improvement of 5 points or more. Finally, mean PCS and MCS increased from 36 points (SD 8) and 49 points (SD 11) to 42 points (SD 8, p < 0.0001) and 52 points (SD 9, p = 0.0006), respectively, with 70 and 53% of patients showing an increase of 2.5 points or more.
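As a rough orientation, the reported group means can be set against the cut-off values from the Methods. The sketch below takes the difference of the reported means as the mean change, which is a simplification, and comparing a group mean with a per-patient cut-off is illustrative only; the helper `mean_improvement` is hypothetical.

```python
def mean_improvement(baseline: float, follow_up: float,
                     higher_is_better: bool) -> float:
    """Signed mean change, oriented so that positive values mean improvement."""
    change = follow_up - baseline
    return change if higher_is_better else -change


# name: (baseline mean, follow-up mean, cut-off, higher score is better)
measures = {
    "NYHA class": (2.9, 2.4, 1.0, False),
    "6MWD (m)": (256.0, 299.0, 50.0, True),
    "MLWHFQ (points)": (34.0, 21.0, 5.0, False),
    "SF-36 PCS (points)": (36.0, 42.0, 2.5, True),
    "SF-36 MCS (points)": (49.0, 52.0, 2.5, True),
}

for name, (base, follow, cutoff, higher) in measures.items():
    gain = mean_improvement(base, follow, higher)
    verdict = "at or above" if gain >= cutoff else "below"
    print(f"{name}: mean improvement {gain:.1f}, {verdict} cut-off {cutoff}")
```

On these figures the mean improvements in NYHA class (0.5) and 6MWD (43 m) fall below their cut-offs, whereas those in MLWHFQ, PCS and MCS reach them, which matches the responder proportions reported above.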

Patient characteristics in included studies

Our systematic review identified 36 observational studies (34 studies of pre- and post-PMVR comparison [30,31,32,33,34,35, 45, 49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75], one study of PMVR versus conservative treatment [76] and one study of PMVR versus healthy controls [36]) and 1 randomized controlled trial of PMVR compared to surgical treatment (EVEREST II trial, [8]). Clinical characteristics of patients treated with PMVR varied widely across studies (Table 1; see Appendix Tab. 2 for characteristics of individual studies). In two observational studies with comparison groups, PMVR patients were well matched regarding age, gender, logistic EuroScore, ejection fraction and cause of mitral regurgitation compared to conservatively treated patients. Many observational studies included patients who were older, had lower ejection fraction, had functional mitral regurgitation and had higher procedural success than patients in the EVEREST II trial.

Table 1 Characteristics of patients in 38 included studies of MitraClip

Quality of included studies

The majority were small observational studies comparing pre- and post-PMVR values, which are prone to a high risk of bias (see Appendix Tab. 3 for quality assessment of individual studies and Appendix Fig. 2 for a summary of the quality assessment). The small number of head-to-head comparison studies (2 of 38 studies) poses major challenges in drawing definitive conclusions about the functional benefits of PMVR from the available evidence. In addition, the lack of independent assessors of functional outcomes [33 studies (87%)], especially when a subjective interpretation of patient responses is possible, may lead to a conclusion favoring PMVR. Representativeness of included patients was limited or unclear in 23 (61%) studies. The pre- and post-PMVR comparison among survivors and the presence of non-negligible loss to follow-up [defined as more than 10% loss to follow-up without report of reasons; 23 studies (61%)] may also lead to overestimation of the benefits of PMVR.

There was evidence for significant publication bias for the outcomes NYHA class (p < 0.0001), PCS of SF-12/-36 (p = 0.02) and MLWHFQ (p = 0.006, Appendix Fig. 3).

Change in NYHA class

Changes in NYHA class were reported in 34 studies. Five studies [8, 31, 54, 59, 68] reported changes in NYHA class after PMVR only qualitatively, with an improvement reported in all five. In 29 studies with quantitative data, a substantial variation in the mean change in NYHA class was observed (I² = 88.6%, p < 0.001; Fig. 2). In most studies, there was an average improvement of ≥1 NYHA class after PMVR. However, several studies ([34, 35, 51, 52, 55, 56, 62, 69, 72, 74], our cohort) showed a mean change of <1 NYHA class.

Fig. 2 Change in the New York Heart Association Functional Class. Some studies reported estimates by subgroups only and in this case 2 or 3 lines are presented for one study without repeating the author (year). CI confidence interval, NYHA New York Heart Association, SD standard deviation

SF-12/-36 physical and mental component summary scores

Changes in SF-12/-36 summary scores were reported in seven studies. The mean PCS and MCS score improved in all studies with a difference regarded as clinically meaningful, with significant heterogeneity across studies (PCS range 4.4–9.2 points, I² = 75.1%, p < 0.001; MCS range 2.6–8.9 points, I² = 82.4%, p < 0.001, Fig. 3a, b). The change in SF-12/36 scores for surgical mitral valve repair (SMVR) and conservative treatment was only assessed in one study each. Patients who underwent SMVR had improvement in PCS score of 4.4 points and in MCS of 3.8 points. Healthy controls had no change in their PCS and MCS scores.

Fig. 3 Change in the Medical Outcomes Study Short-Form Health Survey Physical (a) and Mental Component Summary Score (b). CI confidence interval, SF-PCS Short-Form Physical Component Summary, SF-MCS Short-Form Mental Component Summary, SD standard deviation

Six-minute walk distance

Changes in 6MWD were reported in 15 studies with marked heterogeneity (I² = 95.3%, p < 0.001) and a range of mean change from 2 to 336 m (Fig. 4). All studies showed an improvement, and the majority showed an improvement of more than 50 m. Analyzing 6MWD as “percent change from baseline” left the results virtually unchanged (I² = 81.8%, p < 0.0001, Appendix Fig. 4).

Fig. 4 Change in the 6-min walking distance. Some studies reported estimates by subgroups only and in this case 2 lines are presented for one study without repeating the author (year). 6MWD 6-min walking distance, CI confidence interval, SD standard deviation

MLWHFQ, other measures and mortality

Changes in MLWHFQ score were reported in eight studies with substantial heterogeneity (I² = 89.4%, p < 0.001) regarding the magnitude of improvement, which ranged from −7 to −18 points (Fig. 5). All studies showed an improvement, with the lower limit of the 95% confidence interval less than −5 points.

Fig. 5 Change in the Minnesota Living With Heart Failure Questionnaire Score. Some studies reported estimates by subgroups only and in this case 2 lines are presented for one study without repeating the author (year). MLWHFQ Minnesota Living With Heart Failure Questionnaire, CI confidence interval, SD standard deviation

One registry study reported changes in EQ-5D quality of life, with an increase in mean EQ-5D score from 0.8 to 0.9, which is regarded as clinically meaningful [40].

Mortality was reported in 37 studies, with a range from 0 to 20% (mean 7%) within up to 6 months, a range from 6 to 31% (mean 18%) for >6 to 12 months and a range from 5 to 46% (mean 24%) for more than 12 months.

Meta-regression analysis

Meta-regression analyses were performed for the outcomes NYHA class and 6MWD. Only baseline NYHA class, age and sample size were significantly associated with variation of NYHA class changes after PMVR across studies. Heterogeneity (I²) decreased from 88.6 to 83.8% when adjusting for the effects of baseline NYHA class, age and sample size, suggesting that these associations, although statistically significant, do not explain a relevant share of the heterogeneity. For 6MWD, only baseline 6MWD was significantly associated with the variation of 6MWD changes after PMVR, with a decrease of heterogeneity (I²) from 95.3 to 93.8% when adjusting for baseline 6MWD.

Discussion

This systematic review shows distinct variations across 38 studies in the clinical characteristics of patients undergoing MitraClip as well as in the amount of improvement in functional outcomes. There was a consistent trend that MitraClip ameliorated functional capacity, physical and mental functioning as well as disease-specific quality of life. However, this evidence is based on studies of low to moderate quality with small patient numbers. These findings might nevertheless be of relevance for patients with severe symptomatic mitral regurgitation with prohibitive or high surgical risk amenable to MitraClip therapy, since this procedure so far lacks data on a morbidity and mortality benefit.

A body of evidence demonstrates that MitraClip is an effective and safe treatment option for severe MR in high-risk or inoperable patients [11, 16, 30, 34, 45, 64, 77]. Technical efficacy and safety are essential to judge the feasibility of MitraClip and the eligibility of certain patient subgroups, as well as to gain safety approval. However, they do not reflect the subjective benefits from the patients’ perspective, such as functionality and quality of life, which are crucial for weighing up disease- and treatment-related burden [78]. Although earlier studies reported health-related quality of life after surgical and interventional mitral valve treatment [79], no study has so far systematically reviewed quantitative effect estimates on functional parameters after PMVR.

In our prospective cohort study, we assessed a broad spectrum of validated functional measures, including exercise performance and general and disease-specific quality of life in addition to NYHA class, as recommended by the Mitral Valve Academic Research Consortium (MVARC) [80, 81]. Overall, ours was the second-largest non-registry study providing any quantitative functional measure. This is of particular relevance given that only 7–15 of the 38 studies identified in this review reported quantitative measures of functionality. Hence, our sample of 142–171 patients with the respective measures available contributes a substantial part of the total number of patients across all studies, which ranges from about 700 to 1000. A limitation of our study is the lack of a comparison group, which, however, applies to all studies in this field. To the best of our knowledge, no study so far provides an appropriate comparison group to objectively quantify the benefit of MitraClip therapy relative to the natural course of the disease or a gold-standard alternative therapy. The EVEREST II randomized trial included patients who differ substantially from those currently treated with MitraClip in real life. Other studies used healthy controls as the comparison group or did not provide quantitative functional measures. A further limitation of our cohort study is the short follow-up of 6 weeks. Notably, we did not observe an association of follow-up duration with the effects of MitraClip in meta-regression analysis, suggesting that recovery from the procedure is fast and that patients show early and sustained benefits.

Our systematic review revealed that data on quantitative measures of functionality in patients undergoing MitraClip are sparse. The most frequently reported measure, in 34 of 38 studies, was NYHA class. However, its relation with objective measures of exercise capacity in elderly and morbid patients is poor [82], which challenges the suitability of NYHA class to capture the impact on functional capacity and quality of life in patients undergoing MitraClip treatment [83]. The majority of the 7–15 studies reporting more sophisticated functional measures showed an overall improvement regarded as clinically meaningful. For example, the mean absolute benefit regarding 6MWD was larger than that observed in heart failure patients undergoing cardiac resynchronization therapy (CRT) in all but one study, whereas the improvement in MLWHFQ was slightly lower in MitraClip studies compared to CRT, but can still be regarded as clinically relevant [84].

However, these findings have to be interpreted cautiously. As already discussed, the small sample sizes and lack of appropriate comparison groups limit accurate estimation of the effect of MitraClip, and the observed publication bias might cause overestimation of effects. A further major drawback is the substantial heterogeneity across studies regarding patient characteristics, procedural success and primary outcomes. Meta-regression analyses did not identify variables that could explain the heterogeneity of effects. Notably, functional or degenerative MR pathology was not associated with variation in changes of NYHA class or MLWHFQ. Importantly, the small number of studies with quantitative measures limits the statistical power of meta-regression analysis, and hence variable baseline characteristics as well as study quality criteria might have an individual and cumulative impact on the heterogeneity of treatment effects. The large unexplained heterogeneity precludes accurate estimation of generally applicable effect sizes of benefit, which would be important for sample size calculations in future treatment trials. Additionally, the heterogeneity of beneficial effect sizes highlights the need for better tools to identify the patients with the best clinical response to this therapy, and hence the best risk–benefit ratio for the patient as well as the best cost-effectiveness ratio for the health system.

An important step to improve evaluation of quality of life and functional parameters in patients undergoing MitraClip procedure will be the standardization of trial design, as recommended by MVARC [43, 44, 80, 81]. Crucial aspects are concomitant assessment of validated quality of life scales and exercise tests, as was done in our cohort, to confirm consistency, and longitudinally repeated assessments at predefined follow-up intervals with detailed report of frequency of lost patients and causes. However, it has to be emphasized that the MVARC recommendations are expert consensus and for example, cut-off values of functional parameters defined as clinically meaningful have not been validated in MitraClip patients. Particularly in the context of subjective endpoints such as quality of life, selection of an appropriate comparator group and blinded outcome assessors are mandatory. Several ongoing multicenter, randomized trials (COAPT, RESHAPE-HF2, MITRA-FR, MATTERHORN) evaluating MitraClip therapy in functional MR will analyze functional and quality of life outcomes and hopefully shed light on the quantification and individualization of MitraClip benefits.

There are additional limitations worth mentioning. Although we tried to avoid duplicate inclusion of patients reported in different studies, some patients might be included more than once because of the several multicenter studies and limited methodological reporting. Finally, for clarity we included outcome reports from only one follow-up time point per study, favoring the longest follow-up presented. As a consequence of this compromise, a few studies may provide data on more patients at a shorter follow-up time.

Conclusion

In summary, treatment of severe MR with MitraClip results in improvements in physical capacity, physical and mental functioning and disease-specific quality of life that are usually regarded as clinically meaningful in the majority of patients. However, considering the large between-study heterogeneity and the quality of individual studies, the quantitative benefit for the individual patient cannot currently be estimated. Future randomized controlled studies with assessment of symptom burden, functional status and quality of life are still required, not only to quantify the benefit of MitraClip therapy with reference to conservative therapy, but also to identify the patient groups who will benefit most.