Introduction

An estimated 1.2 billion people use tobacco worldwide (World Health Organization 2014). Tobacco-related deaths are one of the main causes of preventable early mortality, claiming 5 million lives annually (World Health Organization 2014). Smoking cessation is associated with significant health benefits, including reducing the risk of developing lung cancer, heart disease, and stroke (Polosa and Benowitz 2011). Despite the benefits of cessation and the desire of most smokers to quit, approximately 80 % of those who attempt to quit on their own relapse within the first month of abstinence, and only 3-5 % remain abstinent for 6 months or longer (Hughes et al. 2004).

The most common aids for smoking cessation are nicotine replacement therapies (NRTs), such as skin patches and chewing gums. However, they do not provide the additional sensory rituals that smokers seek (Caponnetto et al. 2012). It is suggested that electronic cigarettes (e-cigarettes) are a more analogous alternative (Caponnetto et al. 2012). E-cigarettes are battery-powered electronic drug delivery systems designed to provide users with a low concentration of nicotine, without the exposure to tobacco smoke and its harmful constituents. With each puff, a liquid containing nicotine is heated and vaporized to create a visible vapor without smoke or flame, while also allowing for handling and puffing actions (Yamin et al. 2010). Thus, e-cigarettes can address the psychological, cognitive, social, and behavioral elements of smoking, where most alternatives fail to do so (Caponnetto et al. 2012).

E-cigarettes are increasingly popular with smokers. The use of e-cigarettes has increased from 3 % in 2010 to 7 % in 2012 and 11 % in 2013. Furthermore, the number of people who reported having tried e-cigarettes has increased from 9 % in 2010 to 22 % in 2012 and 35 % in 2013 (ASH 2014). Given these statistics, many countries are in the process of creating regulations around e-cigarettes. Bans on e-cigarette sales have been proposed by some jurisdictions, and others have recommended that e-cigarettes should be regulated as tobacco products with lower nicotine content (Bam et al. 2014; MHRA 2014). Canadian regulations currently ban the sale, import, or advertising of e-cigarettes containing nicotine (Stanbrook 2013). However, online advertising from other countries where e-cigarettes are not prohibited is readily accessible. Nevertheless, the use of e-cigarettes among smokers is increasing rapidly in Canada because of users’ ability to use them in areas where smoking is prohibited and the perception of their safety (Stanbrook 2013).

Results from a recent UK survey concluded that e-cigarettes might be effective in helping smokers quit or reduce their smoking and avoid relapse (Etter and Bullen 2011; Fidler et al. 2011). Furthermore, two prospective cohort studies suggest that e-cigarettes can aid smoking abstinence (Polosa et al. 2011, 2014). Health-care providers, regulatory agencies, and public health decision-makers are interested in whether e-cigarettes can reduce the harm associated with smoking and achieve smoking abstinence better than current methods. The clinical evidence regarding the effectiveness, safety, and harm associated with e-cigarettes for smoking cessation recently began to undergo a more thorough examination in a manner similar to other drugs and devices (Caponnetto et al. 2012).

There have been several narrative reviews (Franck et al. 2014; Harrell et al. 2014; Orr and Asal 2014; Pepper and Brewer 2013) summarizing the available studies on e-cigarettes, though at the time, the limited existing data prohibited estimating their overall safety and efficacy through meta-analysis. However, primary research on e-cigarettes is in high demand; thus many studies are underway, two of which have been published (Bullen et al. 2013; Caponnetto et al. 2013). Recently, a systematic review and meta-analysis was published by the Cochrane Collaboration group (McRobbie et al. 2014), which reported that participants using nicotine-containing e-cigarettes were more than twice as likely (RR 2.29; 95 % CI 1.05, 4.96) to have abstained from smoking for at least 6 months as those using placebo-containing e-cigarettes. Their review included both comparative studies and cohort follow-up studies with 6 months or longer follow-up. In contrast, our review focuses only on comparative studies, both long-term behavioral studies with a follow-up period of 6 months or longer and short-term pharmacotherapy studies, in which nicotine-containing e-cigarettes are compared to placebo-containing e-cigarettes or any other NRTs. It also investigates additional patient-important outcomes not reported in previous reviews, including the desire to smoke and withdrawal symptoms. Hence, a thorough and well-conducted review investigating the efficacy and the short-term effects of e-cigarettes, compared to placebo or any other nicotine replacement therapies, would be timely and useful for public health decision-makers.

Methods

Literature search strategy

A comprehensive systematic literature search was developed by an information specialist (KC) and the primary reviewer (SK). Bibliographic databases were searched, using the OVID interface, up to May 26, 2014: MEDLINE (1946-present), EMBASE (1974-present), PsycINFO (1860-present) and the Cochrane Central Register of Controlled Trials (CENTRAL; April 2014). Terminology was used to search controlled vocabularies (MeSH and EMTREE) and keywords on the concept of “electronic cigarette” or “e-nicotine” (Supplementary Table 1). No limits on year, language, or human subjects were applied. Gray literature was identified through searching the websites of health technology assessment (HTA) and related agencies, as well as reports of major smoking cessation conference proceedings. The Google search engine was used to search for additional Web-based materials and information. These searches were supplemented by reviewing the bibliographies of key papers. All search results were imported into a Reference Manager Version 12 database for duplicate removal and reference management.

Inclusion criteria

Articles were included if they met the following criteria: randomized controlled trials (RCTs) or comparative observational studies; comparing interventions with nicotine-containing e-cigarettes (any brand, any dose) to other nicotine replacement therapies (e.g., nicotine patches, nicotine gums, nicotine inhalers etc.) or placebo-containing e-cigarettes; healthy adults (≥18 years old); current smokers (≥10 cigarettes per day) regardless of whether they were considering quitting; and reports of any of the following outcomes: smoking abstinence for at least 6 months from the start of e-cigarette use; desire to smoke for at least 1 h after e-cigarette use; number of cigarettes smoked per day; withdrawal symptoms (i.e., irritability, restlessness, poor concentration, anxiety, depression, and hunger); serious and non-serious adverse events.

Exclusion criteria

Studies were excluded if they involved non-human subjects or subjects with comorbidities or other health complications, had no comparison group, or were non-intervention publications (e.g., review, conference abstract, case report, comment, editorial, news, survey, recommendation, or expert opinion).

Study selection

The reviewers (SK, TD) independently screened study titles and abstracts based on the pre-specified inclusion and exclusion criteria. The full-text articles of potentially eligible studies were retrieved and assessed by both reviewers independently to confirm inclusion or exclusion. Disagreements were resolved through discussion and consensus. A third party (LL) was available to arbitrate but was not required.

Data abstraction

The reviewers (SK, TD) independently extracted data from included studies using predesigned and piloted forms, including details on the following: patient demographics, intervention, comparator, study outcomes, country, year and length of follow-up. Furthermore, information on the methodology of the study and the funding source(s) was also extracted for quality assessment. The authors were contacted when data were reported in graphical form, were unclear, or missing. A second reviewer verified the data abstraction.

Quality assessment

The reviewers (SK, TD) independently assessed the methodological quality of included RCTs using the Cochrane Collaboration’s Risk of Bias Tool (Higgins and Green 2011). This tool assesses the methodological quality of RCTs, assigning low, unclear, or high risk of bias for the following domains: random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessor, incomplete outcome data, selective reporting, and other sources of bias (e.g., possible funding by industry). The methodological quality of Controlled Before-After (CBA) studies was assessed using the same criteria as RCTs, except that the “random sequence generation” and “allocation concealment” domains were both rated as “high risk of bias” by both reviewers, based on the Cochrane guidelines (Higgins and Green 2011). Disagreements were resolved through discussion and consensus, and consultation with a third party (LL) when needed. Agreement was measured with the κ statistic and its 95 % confidence interval (CI).
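The κ statistic mentioned above corrects the raw proportion of agreement for agreement expected by chance. The following is a minimal sketch of Cohen's κ for two raters, not the software the reviewers used; the function name and the example judgements are hypothetical, and the confidence interval is omitted.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters' categorical judgements on the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Observed agreement: share of items rated identically by both raters.
    p_obs = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement, from each rater's marginal category frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_exp = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if p_exp == 1.0:
        # Both raters used a single identical category; kappa is undefined,
        # so report complete agreement by convention.
        return 1.0
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical risk-of-bias judgements (NOT the review's data).
reviewer_1 = ["low", "low", "unclear", "high", "low", "unclear", "low", "high"]
reviewer_2 = ["low", "low", "unclear", "high", "low", "low", "low", "high"]
print(round(cohen_kappa(reviewer_1, reviewer_2), 2))
```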

Data analysis

A meta-analysis was conducted where studies were sufficiently similar in terms of participants, intervention, and outcome measures to provide a meaningful summary of effect, using the Cochrane Collaboration’s Review Manager Analysis version 5.2 statistical software (RevMan 5.2). The pooled result of dichotomous outcomes was summarized using a relative risk (RR) and its 95 % CI, with the Mantel–Haenszel method. A random-effects model was used to conduct the meta-analyses, since some heterogeneity existed between studies, such as in the length of follow-up and the study design. For all statistical tests, a significance level of 5 % was used. In the case of continuous outcomes, the pooled data were summarized using a mean difference and its 95 % CI. In studies where the standard deviation (SD) was not reported, Buck’s regression was used to meta-analyze the data (Buck 1960). Studies reporting mean values with no SDs or median with interquartile range (IQR) were not included in the meta-analysis and were instead summarized narratively. The authors of one study (Dawkins et al. 2012) were contacted to obtain gender-stratified raw data to calculate the pooled estimates of the total mean change in Mood and Physical Symptoms Scale (MPSS) and their corresponding SDs in each study arm. A Poisson distribution was assumed to convert the number of adverse events into an average number of events experienced per subject in each study arm and their corresponding SDs.
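To illustrate how a random-effects pooled RR and its 95 % CI can be obtained, the sketch below applies the DerSimonian-Laird estimator to log relative risks with inverse-variance weights. This is an assumption made for illustration: RevMan's Mantel–Haenszel random-effects routine differs in its details, so the numbers would not match RevMan output exactly. The function names and the event counts are hypothetical.

```python
import math


def log_rr(events_t, n_t, events_c, n_c):
    """Log relative risk and its variance for one two-arm study.

    Assumes at least one event in each arm (no zero-cell correction).
    """
    rr = (events_t / n_t) / (events_c / n_c)
    var = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return math.log(rr), var


def pool_random_effects(studies):
    """DerSimonian-Laird pooling; returns (RR, CI low, CI high, tau^2)."""
    effects = [log_rr(*s) for s in studies]
    y = [e[0] for e in effects]
    v = [e[1] for e in effects]
    w = [1 / vi for vi in v]
    # Fixed-effect estimate, needed for Cochran's Q.
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se),
            tau2)


# Hypothetical counts: (events_treat, n_treat, events_control, n_control).
example = [(20, 250, 4, 70), (18, 200, 5, 100)]
rr, low, high, tau2 = pool_random_effects(example)
print(f"RR {rr:.2f} (95% CI {low:.2f}, {high:.2f}), tau^2 = {tau2:.3f}")
```

A CI that spans 1 corresponds to a difference that is not statistically significant at the 5 % level.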

Heterogeneity assessment

The heterogeneity between studies was assessed and quantified using the I² statistic, which represents the percentage of total variation across trials that is due to between-study differences rather than chance. In the case of moderate heterogeneity (I² = 30–60 %) or higher, a priori subgroup analyses were conducted, if feasible, in an attempt to explain the observed heterogeneity (Altman and Bland 2003).
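As a rough illustration of how this statistic is derived, the sketch below computes I² from Cochran's Q using inverse-variance weights. The function name and the input values are hypothetical, and RevMan's weighting for Mantel–Haenszel analyses may differ.

```python
def i_squared(effects, variances):
    """I^2 (in percent) from study effect estimates and their variances."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if df <= 0 or q <= 0:
        return 0.0
    # Share of Q in excess of its expectation under homogeneity (df).
    return max(0.0, (q - df) / q) * 100


# Hypothetical mean differences and variances from two studies.
print(i_squared([0.0, 1.0], [0.1, 0.1]))
```

With only two studies, as in most of the analyses here, I² is estimated from a single degree of freedom and is therefore very imprecise.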

Sensitivity analysis

When appropriate, a priori sensitivity analyses were conducted to assess the effect of methodological features, stratifying studies by low versus unclear or high risk of bias, as well as the effect of missing data (Akl et al. 2012).

Results

Literature search results

A total of 570 unique studies were identified. Of these, 91 full text articles were reviewed. Five studies were deemed eligible for inclusion. Eighty-six studies were excluded from the review. Seventy-one were not RCTs or comparative observational studies, one was an HTA report, three studies did not have comparators, one had inappropriate interventions, and ten did not report on outcome of interest. Study flow and reasons for exclusion are outlined in the PRISMA flowchart (Supplementary Fig. 1).

Study characteristics

Included studies

Table 1 summarizes the characteristics of the included studies. One study was a CBA, and the remaining four were RCTs. Studies were conducted in New Zealand (Bullen et al. 2010, 2013), the UK (Dawkins et al. 2012, 2013), and Italy (Caponnetto et al. 2013). In total, five studies were included in this review, including 840 participants where 581 received nicotine e-cigarettes and 259 received placebo e-cigarettes.

Table 1 Study characteristics of included studies

Participants in all studies were 18 years of age and older who smoked at least ten cigarettes per day. Participants in Bullen et al. (2013) were willing to quit, whereas those in Caponnetto et al. (2013) were not. The remaining three studies did not specify the participants’ attitudes toward quitting. All five studies compared nicotine e-cigarettes to placebo e-cigarettes in adult smokers. Nicotine dosages varied according to study, since different brands of e-cigarettes were used. Finally, the studies differed in length of follow-up: 12 months after quit date (Caponnetto et al. 2013), 6 months after quit date (Bullen et al. 2013), and 1 day follow-up (Bullen et al. 2010; Dawkins et al. 2012, 2013).

Quality of included studies

The overall measures of agreement were excellent (κ = 0.81, 95 % CI 0.05–0.1), and all disagreements between the reviewers were resolved after discussion.

Random sequence generation and allocation concealment (selection bias)

Bullen et al. (2010, 2013) and Caponnetto et al. (2013) reported adequate sequence generation and allocation concealment; hence, they were classified to be at “low risk of bias.” These domains were not addressed in Dawkins et al. (2012), and thus it was classified to be at “unclear risk of bias.” Dawkins et al. (2013), the CBA study, was classified to be at “high risk of bias” by default, based on the Cochrane guidelines (Higgins and Green 2011).

Blinding (performance bias and detection bias)

Only Bullen et al. (2013) reported blinding of participants and outcome assessors. The remaining four studies adequately blinded participants and personnel; however, they did not report on blinding of outcome assessors. Hence, they were classified to be at “low risk of bias” and “unclear risk of bias” for these domains, respectively.

Incomplete outcome data (attrition bias)

Missing outcomes were balanced between study arms, intention-to-treat (ITT) analyses were conducted, and more than 80 % of subjects completed the studies, except in the study by Caponnetto et al. (2013). All five studies were classified to be at “low risk of bias.”

Selective reporting (reporting bias)

Although the protocol of only one of the studies (Bullen et al. 2013) was published online, all outcomes listed in “Methods” of each of the five studies were reported. Hence, all were classified to be at “low risk of bias.”

Other bias (source of funding)

One of the five studies (Bullen et al. 2010) was funded by industry and was classified to be at “high risk of bias.” The remaining four (Bullen et al. 2013; Caponnetto et al. 2013; Dawkins et al. 2012, 2013) were not funded by industry, and thus were classified to be at “low risk of bias.”

Efficacy of E-cigarettes

Smoking abstinence

In two studies (Bullen et al. 2013; Caponnetto et al. 2013), there was a total of 43 (9 %) smoking abstinence events in the intervention group versus 8 (5 %) in the placebo group. The pooled relative risk (RR) was 2.02 (95 % CI 0.97, 4.21), but the difference between groups was not statistically significant (p = 0.06) (Fig. 1). The degree of heterogeneity between the included studies was 0 %. Furthermore, Bullen et al. (2013) reported a total of 17 (6 %) smoking abstinence events in the nicotine patch group (risk difference for nicotine e-cigarettes versus patches 1.51 [95 % CI −2.49, 5.51]), and had insufficient statistical power to conclude that nicotine e-cigarettes were superior to patches.

Fig. 1

Meta-analyses indicate no significant difference in smoking abstinence between the two groups

Desire to smoke

The study by Bullen et al. (2010) reported a statistically significant reduction in the desire to smoke based on the mean Mood and Physical Symptoms Scale (MPSS) scores (0.82, 95 % CI 0.25, 1.38; p = 0.006) from baseline between the nicotine (n = 39) and placebo e-cigarette groups (n = 39). It also reported a difference in MPSS score of −0.10 (95 % CI −1.16, 0.95; p = 0.99) from baseline between nicotine e-cigarettes (n = 39) and nicotine patches (n = 39). The pooled mean difference of the MPSS scores from the two studies (Dawkins et al. 2012, 2013) that reported the desire to smoke cigarettes approximately 15 min after using either nicotine- or placebo-containing e-cigarettes was −0.22 (95 % CI −0.80, 0.36; p = 0.45) (Fig. 2). The degree of heterogeneity between the included studies was 0 %.

Fig. 2

Meta-analyses indicate no significant difference in desire to smoke between the two groups

Cigarettes smoked per day

Two studies (Bullen et al. 2013; Caponnetto et al. 2013) reported the number of cigarettes smoked per day by participants in both the nicotine and placebo e-cigarette groups; however, it was not appropriate to pool the data, since Caponnetto et al. (2013) reported the median number and IQR of cigarettes smoked per day, whereas Bullen et al. (2013) reported the average number of cigarettes smoked per day. The median values (and IQR) of cigarettes smoked per day at week 52 in the Caponnetto et al. (2013) study were 12 (5.8–20.0) and 14 (6.30–20.0) for the 7.2 mg and 5.4 mg nicotine-containing e-cigarettes, respectively, compared to 12 (9.0–20.0) for the placebo e-cigarettes. Furthermore, Bullen et al. (2013) reported 2.8 versus 4.5 cigarettes smoked per day in the nicotine versus placebo-containing e-cigarette groups, respectively. Overall, both studies showed a slight decrease in the number of cigarettes used per day among participants who were using nicotine- versus placebo-containing e-cigarettes.

Short-term effects of e-cigarettes

Withdrawal symptoms

Three out of the five studies reported withdrawal symptoms (i.e., irritability, restlessness, poor concentration, anxiety, depression, or hunger) as one of their outcomes. Two studies (Dawkins et al. 2012, 2013) used the Mood and Physical Symptoms Scale (MPSS), whereas the third study (Bullen et al. 2010) used the visual analog scale (VAS).

Bullen et al. (2010) was not included in the pooled analysis, since it reported only the difference between the nicotine and placebo e-cigarette groups. This study reported no statistically significant reduction in ratings for irritability 0.26 (95 % CI −0.49, 0.99; p = 0.48), restlessness 0.53 (95 % CI −0.11, 1.18; p = 0.10), or poor concentration 0.39 (95 % CI −0.30, 1.07; p = 0.26) from baseline between nicotine (n = 39) and placebo (n = 39) e-cigarettes.

The withdrawal symptoms reported in the nicotine and placebo e-cigarette groups in the studies by Dawkins et al. (2012) and Dawkins et al. (2013) were meta-analyzed. The mean differences in the MPSS scores of participants who received nicotine versus placebo e-cigarettes were −0.16 (95 % CI −0.40, 0.07; p = 0.17) for anxiety, −0.03 (95 % CI −0.38, 0.31) for irritability, −0.03 (95 % CI −0.42, 0.35) for restlessness, −0.01 (95 % CI −0.35, 0.32) for poor concentration, −0.01 (95 % CI −0.32, 0.30) for hunger, and −0.01 (95 % CI −0.22, 0.20) for depression. None of the pooled differences was statistically significant (p > 0.05). Furthermore, the I² value was 0 % (p > 0.05) for all six pooled withdrawal symptoms, indicating no heterogeneity between the two studies (Fig. 3).

Fig. 3

Meta-analyses indicate no significant difference in withdrawal symptoms between the two groups

Adverse events

Adverse events were adequately reported in only two (Bullen et al. 2010, 2013) of the five studies included in the review. Bullen et al. (2013) classified and reported adverse events as serious and non-serious, whereas Bullen et al. (2010) reported only non-serious events. However, discussion with the author of the study revealed that participants did not experience any serious adverse events (unpublished data, 2014). As a result, only the non-serious events reported in both studies were pooled and meta-analyzed.

The pooled mean difference in the average number of non-serious adverse events experienced by participants who received nicotine (n = 329) versus placebo (n = 112) e-cigarettes was 0.09 (95 % CI −0.28, 0.46) (Fig. 4). This difference was not statistically significant (p = 0.65). The I² value was 53 %, signifying a moderate degree of heterogeneity. Because only two studies were included, the a priori subgroup analyses intended to explore whether differences in length of follow-up and study design explained the observed heterogeneity were not feasible.
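The Methods state that a Poisson distribution was assumed in order to express adverse event counts as a mean number of events per subject with an SD. A minimal sketch of that conversion follows, relying on the Poisson property that the variance equals the mean; the function names and the event counts are hypothetical and are not the counts from the included studies.

```python
import math


def poisson_mean_sd(total_events, n_subjects):
    """Mean events per subject and its SD under a Poisson assumption."""
    mean = total_events / n_subjects
    # For a Poisson variable the variance equals the mean, so SD = sqrt(mean).
    return mean, math.sqrt(mean)


def mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Difference in means with a normal-approximation 95 % CI."""
    diff = mean_t - mean_c
    se = math.sqrt(sd_t**2 / n_t + sd_c**2 / n_c)
    return diff, diff - 1.96 * se, diff + 1.96 * se


# Hypothetical single-study example: 120 events among 240 subjects on nicotine
# e-cigarettes versus 30 events among 70 subjects on placebo e-cigarettes.
m_t, sd_t = poisson_mean_sd(120, 240)
m_c, sd_c = poisson_mean_sd(30, 70)
diff, low, high = mean_difference(m_t, sd_t, 240, m_c, sd_c, 70)
print(f"MD {diff:.2f} (95% CI {low:.2f}, {high:.2f})")
```

The per-arm means and SDs obtained this way can then be entered into a standard continuous-outcome meta-analysis.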

Fig. 4

Meta-analyses indicate no significant difference in non-serious adverse events between the two groups (Canada 2015)

Serious adverse events, reported in the Bullen et al. (2013) study, were slightly higher in the nicotine e-cigarette group (27/137, 19.7 %) than in the placebo e-cigarette group (5/36, 13.9 %). Serious events by convention included death (n = 1, in nicotine e-cigarette group), life-threatening illness (n = 1, in nicotine e-cigarette group), admission to hospital or prolongation of hospital stay (n = 17 and n = 4, in nicotine and placebo e-cigarette groups, respectively), persistent or significant disability or incapacity, congenital abnormality (n = 8 and n = 1, in the nicotine and placebo e-cigarette groups, respectively).

Sensitivity analysis

Sensitivity analyses to assess the effect of missing data were not undertaken for the pooled effect estimate of smoking abstinence, since it was not statistically significant (p = 0.06). The a priori sensitivity analyses assessing the influence of methodological features (low versus high risk of bias) on effect estimates were not conducted, since a maximum of two studies were pooled for each of the outcomes investigated in this review.

Discussion

This systematic review and meta-analysis synthesized all available published evidence on the efficacy, effectiveness, and safety of nicotine-containing e-cigarettes compared to other nicotine replacement therapies or placebo in healthy adult smokers. This review differs from others recently published, because it only included comparative studies and conducted meta-analyses to achieve greater statistical power for more precise estimates of the outcomes of interest.

Results from the recent trials suggest that the use of nicotine e-cigarettes increased the proportion of patients who stopped smoking, although this change was not statistically significant. Importantly, the lower bound of the 95 % CI of our estimate of treatment effect, which is the most conservative estimate, suggested only a 3 % decrease in smoking abstinence among the intervention group. Thus, the finding of this review does not suggest that e-cigarettes are likely to be counterproductive for smoking abstinence among healthy adult smokers, whether or not they were willing to quit. The ability of an e-cigarette to mimic the psychological, cognitive, social, and behavioral elements of smoking (e.g., handling, holding and puffing actions) could be a possible explanation for the slight increase in smoking abstinence observed among nicotine and placebo e-cigarette users (Caponnetto et al. 2012). Furthermore, this review found that the desire to smoke among individuals who were using nicotine e-cigarettes versus placebo was slightly lower, though again not statistically significantly so. No trials reported increased withdrawal symptoms or incidence of adverse events (both serious and non-serious); however, there were a limited number of studies, small sample sizes, and wide confidence intervals around effect estimates. Thus, the current evidence regarding the safety of e-cigarettes is inconclusive.

Recent reviews (Franck et al. 2014; Harrell et al. 2014; Orr and Asal 2014; Pepper and Brewer 2013) have summarized the available evidence on the efficacy and safety of e-cigarettes used for smoking cessation; however, due to the limited amount of available literature and the inclusion of observational non-comparative studies, meta-analyses were not conducted. Their conclusions and the result of the recently published systematic review and meta-analysis by the Cochrane Collaboration group (McRobbie et al. 2014) are consistent with the findings of this review. First, all reviews have demonstrated a slight increase in smoking abstinence among healthy adult smokers with the intervention. Second, as in this review, they reported that all non-serious adverse events resolved on their own, and that the difference in the observed number of events between intervention and control groups was not significant (Franck et al. 2014; Harrell et al. 2014; Orr and Asal 2014; Pepper and Brewer 2013). Third, they noted that the serious adverse events experienced by the participants in the study by Bullen et al. (2013) were not associated with the use of e-cigarettes. Fourth, they suggested that no definitive conclusion can be made regarding the efficacy and safety of e-cigarettes, given that the evidence faces methodological and study design limitations (Franck et al. 2014; Harrell et al. 2014; Orr and Asal 2014; Pepper and Brewer 2013).

There are several limitations of this review. Most notably, there is a paucity of completed studies in this field, and those that have been published differ greatly in study design, comparators, outcomes, and follow-up periods. These limitations constrained the number of studies viable for pooling data in each meta-analysis. As a result, further investigations, such as subgroup analyses to explore heterogeneity and sensitivity analyses to assess the robustness of the pooled estimates, were not possible. Furthermore, the trials by Bullen et al. (2013) and Caponnetto et al. (2013) had variable usage of behavioral support, and the results only generalize to the products used in the interventions, since not all products are manufactured similarly (Farsalinos and Polosa 2014). Hence, definitive conclusions about the efficacy and short-term effects of nicotine-containing e-cigarettes cannot be confidently made. Given the small number of events, additional research is likely to reduce imprecision, to have an important impact on the smoking abstinence estimate, and as a result may change the outcome observed.

The review showed no statistically significant effect of nicotine e-cigarettes on smoking abstinence over a long period of time, desire to smoke, withdrawal symptoms, and non-serious adverse events. However, the data upon which these conclusions have been drawn were limited. Based on the results of this review, health-care providers, regulatory agencies, and public health decision-makers should consider the uncertainty surrounding this novel intervention when deciding whether e-cigarettes should be encouraged, allowed, or restricted. This review also demonstrates the need for higher-quality studies, such as RCTs with large sample sizes. For example, detecting an absolute difference of 5 % between nicotine e-cigarettes and placebo e-cigarettes would require a total of 870 individuals, assuming an alpha level of 0.05 and a power of 0.80. Additionally, there is a need to adequately report smoking abstinence over a long period of time (≥6 months) and adverse events experienced by the participants to have more robust evidence of the benefits and harms associated with the use of e-cigarettes as a smoking cessation product.
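The sample size figure above can be explored with the standard two-proportion formula for a two-sided test with equal allocation. The text does not state the abstinence rates behind the calculation, so the 5 % versus 10 % used below is an assumption (5 % being the placebo abstinence rate observed in this review); the function name is hypothetical. Under those assumed rates the formula gives 435 participants per arm, or 870 in total.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Participants per arm to detect p1 vs p2 (two-sided, equal allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    # Pooled variance under the null, unpooled variance under the alternative.
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# ASSUMED abstinence rates: 5 % on placebo versus 10 % on nicotine e-cigarettes.
per_arm = n_per_group(0.05, 0.10)
print(per_arm, 2 * per_arm)
```

Other baseline rates with the same 5 % absolute difference would yield different totals, so the required sample size is sensitive to the assumed control-group abstinence rate.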