Introduction

Composite resin restorations are considered the first treatment option for direct restorations on both posterior and anterior teeth because of their excellent aesthetic properties, good mechanical properties, conservative preparations, and satisfactory clinical performance [1].

Nonetheless, composite resin undergoes volumetric shrinkage during polymerization, which leads to the development of polymerization stresses at the interface between tooth and restoration [2,3,4,5]. As a consequence, undesirable clinical events such as postoperative sensitivity, microleakage, marginal discoloration, cusp deflection, and formation of interfacial gaps at the margins may occur [4, 6]. To overcome such limitations, composite resins are placed in incremental layers [7], which has the disadvantage of prolonging treatment time.

To reduce clinicians’ working time and to simplify the restorative technique, bulk-fill composite resins were developed and introduced to the dental market. These can be placed in increments up to 4 to 5 mm thick without compromising their mechanical properties or degree of conversion [8]. These materials have greater translucency than conventional composites placed incrementally, and they contain alternative photo-initiator systems [9] and modified monomers [9], which allow greater polymerization depth [10] and reduced polymerization shrinkage [10].

Bulk-fill composites can be categorized into low- and high-viscosity formulations. Flowable bulk-fill composites are indicated as a base in class I and class II cavities and require an additional layer of conventional composite resin on the occlusal surface. High-viscosity bulk-fill composites, also called full-body composites, can be placed in increments up to 5 mm without the need for a covering layer of regular-viscosity composite resin [11]. A third type of bulk-fill composite needs activation with a sonic handpiece to be placed into the cavity (SonicFill, Kerr). During sonic vibration, the flowability of the bulk-fill resin increases, allowing better adaptation to the cavity walls. Similar to full-body bulk-fill composite resins, these sonically vibrated materials are also indicated for class I and II cavities in a single layer (increments of up to 5 mm deep), with no need for an additional occlusal increment of regular composite resin [11].

The simplification of operative procedures achieved with the bulk-filling technique is attractive for posterior restorations. Although some randomized clinical trials (RCTs) have compared the incremental and bulk-filling placement techniques, they are underpowered and therefore more prone to false-negative conclusions. By meta-analyzing data from these RCTs into a single estimate, more precise inferences can be produced. Therefore, the purpose of this systematic review of the literature was to answer the following focused research question based on the PICO acronym (patient-intervention-comparator-outcome): “Is the performance (retention/fracture) of bulk-fill composites placed in posterior restorations of adult patients similar to that of incrementally filled restorations?” The hypothesis of this study was that both composite resin-filling techniques yield similar clinical performance.

Methodology

Registration and search strategy

This study complied with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) recommendations [12], and it was registered in the PROSPERO database (CRD42018108450). The search strategy was based on the PICOS question and prepared using controlled vocabulary terms (MeSH terms) and free keywords commonly found in the titles and abstracts of articles. The elements of the PICOS acronym are listed below:

  1. Population (P): class I and II restorations in posterior teeth of adult patients;

  2. Intervention (I): restorations performed with the bulk-filling technique;

  3. Comparison (C): restorations performed with the incremental filling technique;

  4. Primary outcome (O): retention/fracture rate. As secondary outcomes, we evaluated anatomical form, surface texture, color match, marginal adaptation, marginal discoloration, caries, and postoperative sensitivity;

  5. Study design (S): randomized clinical trials (RCTs).

The databases used for the search were PubMed, Scopus, Web of Science, Latin American and Caribbean Health Sciences Literature (LILACS), Brazilian Bibliography in Dentistry (BBO), and Cochrane Library. The search strategy was first developed for PubMed and later adapted to the other databases (Table 1). The reference lists of primary studies were manually searched for relevant additional publications, as was the related articles link (first 10 articles) of each primary study in the PubMed database. No restrictions regarding publication date or language were imposed on the search strategy. EndNote X8 software (Clarivate Analytics, Philadelphia, PA) was used to manage the retrieved studies and citations.

Table 1 PubMed search strategy

The gray literature was searched using the System for Information on Gray Literature in Europe (SIGLE) database. Abstracts from the Annual Session of the International Association for Dental Research (IADR) and its regional subgroups (1990–2020) were searched. Theses and dissertations (full texts) were searched in the ProQuest and Capes databases. Unpublished and ongoing studies were searched in clinical trial databases (Current Controlled Trials, International Clinical Trials Registry Platform, ClinicalTrials.gov, ReBEC, and EU Clinical Trials Register).

Eligibility criteria

This systematic review included only RCTs with parallel or split-mouth designs that described the clinical performance of posterior composite resin restorations (classes I and II) performed with incremental or bulk-filling techniques in permanent premolars and molars.

Studies were excluded if they (1) did not provide results for incremental and bulk-filling composite resin restorations; (2) did not present separate data for the control and intervention groups; and (3) reported data on primary teeth or class V restorations.

Selection of studies and data collection

The studies were selected by title and abstract following the eligibility criteria described. Articles indexed in more than one database were considered only once. If the information available in the title and abstract was not sufficient for a definitive decision, the full-text article was assessed.

Two researchers obtained full-text articles and classified those that met the inclusion criteria. Relevant information on the research project, participants, interventions, and outcomes was collected using extraction forms by three study authors (Table 2). Data extraction was pilot-tested using a sample of four studies to ensure that the data were consistent with the specific research question. To avoid overlapping, multiple reports of the same study with different follow-ups were extracted into a single form.

Table 2 Characteristics of the included studies

Risk of bias of individual studies

The risk of bias was classified for each of the quality assessment items according to the Cochrane Handbook for Systematic Reviews of Interventions version 1.0 [13] by two independent authors. The assessment criteria included random sequence generation, allocation concealment, blinding, incomplete outcome data, and selective outcome reporting. Each domain was classified into “high risk of bias,” “low risk of bias,” or “unclear risk of bias” (insufficient information or uncertainty over potential bias). When two researchers did not reach a consensus, a third one was consulted.

The study was classified as “high risk of bias” when at least one domain was judged as being at high risk of bias. The study was classified as “low risk of bias” when all domains were at low risk of bias, and the study was classified as “unclear risk of bias” when there were not sufficient details to provide a definite conclusion.
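The study-level rule described above can be written as a short decision function. This is a minimal sketch, not the authors' tooling: the function name `overall_risk_of_bias`, the domain keys, and the short labels "low", "unclear", and "high" are illustrative assumptions.

```python
from typing import Dict


def overall_risk_of_bias(domains: Dict[str, str]) -> str:
    """Combine per-domain judgements into a study-level rating.

    `domains` maps a domain name to "low", "unclear", or "high"
    (hypothetical labels chosen for this sketch).
    """
    judgements = set(domains.values())
    # Any single high-risk domain makes the whole study high risk.
    if "high" in judgements:
        return "high"
    # Low risk only when every domain was judged low.
    if judgements == {"low"}:
        return "low"
    # Otherwise at least one domain lacks enough detail.
    return "unclear"


# Hypothetical study with an unreported allocation concealment.
example = {
    "random sequence generation": "low",
    "allocation concealment": "unclear",
    "blinding": "low",
    "incomplete outcome data": "low",
    "selective outcome reporting": "low",
}
print(overall_risk_of_bias(example))
```

Checking for "high" first reflects the rule that one high-risk domain is enough, regardless of how the remaining domains were judged.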

Summary measures and synthesis of the results

For all the evaluated outcomes, data from the eligible studies were dichotomized into alpha vs. bravo/charlie for all meta-analyses when USPHS criteria were used [14]. When FDI criteria were used, the data were dichotomized into success (score 1) vs. failure (scores 2–5). For each study, the risk difference (RD) and the 95% confidence interval (CI) were calculated. The meta-analyses were performed using the random-effects model on all studies from which the information could be extracted. The Cochran Q test, the I² statistic, and the prediction interval (for meta-analyses with more than five studies) were used to assess statistical heterogeneity. All analyses were performed with RevMan software (Review Manager v5.4; The Cochrane Collaboration). Because the included studies reported different follow-up periods, the data were merged within similar follow-up time ranges for the purpose of the meta-analysis.
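To make the summary measures concrete, the sketch below computes a per-study risk difference with its standard error and pools studies with a generic inverse-variance DerSimonian-Laird random-effects estimator, returning Cochran's Q and I². It is an illustration under stated assumptions, not a reproduction of RevMan's computation: the function names, the 0.5 continuity correction applied only to the standard error when a cell is zero, and the input numbers are all hypothetical, and results may differ in detail from RevMan output.

```python
import math
from typing import Dict, List, Tuple


def risk_difference(events_a: int, n_a: int, events_b: int, n_b: int) -> Tuple[float, float]:
    """Return (RD, SE) for one study, computed as arm A minus arm B."""
    rd = events_a / n_a - events_b / n_b
    a, b = events_a, n_a - events_a
    c, d = events_b, n_b - events_b
    # Assumption of this sketch: if any cell is zero, add 0.5 to every
    # cell for the SE only, so the variance does not collapse to zero.
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    na, nb = a + b, c + d
    se = math.sqrt(a * b / na**3 + c * d / nb**3)
    return rd, se


def pool_random_effects(studies: List[Tuple[float, float]]) -> Dict[str, float]:
    """Pool (RD, SE) pairs with the DerSimonian-Laird random-effects method."""
    y = [rd for rd, _ in studies]
    w = [1.0 / se**2 for _, se in studies]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add the between-study variance tau^2.
    w_re = [1.0 / (se**2 + tau2) for _, se in studies]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "rd": pooled,
        "ci_low": pooled - 1.96 * se_pooled,
        "ci_high": pooled + 1.96 * se_pooled,
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical counts (failures, restorations) for bulk-fill vs. incremental.
data = [(1, 60, 1, 60), (0, 45, 0, 45), (2, 80, 3, 80)]
result = pool_random_effects([risk_difference(*row) for row in data])
print(result)
```

The second hypothetical study has no events in either arm, yet it still contributes a risk difference of zero to the pooled estimate.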

Assessment of the certainty of evidence using GRADE

The certainty of the evidence was graded for each outcome across studies (body of evidence) using the Grading of Recommendations: Assessment, Development and Evaluation (GRADE) (http://www.gradeworkinggroup.org/) to determine the overall strength of evidence for each meta-analysis. The GRADE approach is used to contextualize or justify intervention recommendations with four levels of evidence quality, ranging from high to very low.

The GRADE approach begins with the study design (RCTs or observational studies) and then addresses five reasons (risk of bias, imprecision, inconsistency, indirectness of evidence, and publication bias) to possibly rate down the quality of the evidence (one or two levels) and three to possibly rate up the quality (large effect; management of confounding factors; dose–response gradient). Each one of these topics was assessed as “no limitation,” “serious limitations,” and “very serious limitations” to allow categorization of the quality of the evidence for each outcome into high, moderate, low, and very low. The “high quality” suggests with good confidence that the true effect lies close to the estimate of the effect, whereas “very low quality” suggests poor confidence in the effect estimate, meaning the estimate reported can be substantially different from the actual effect.
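The rating logic described above can be sketched as a small function. This is a simplified illustration, not the GRADE working group's software: the starting levels (high for RCTs, low for observational studies) follow the usual GRADE convention rather than anything stated in this text, and the names `grade_certainty`, `LEVELS`, and `PENALTY` are hypothetical.

```python
from typing import Dict

LEVELS = ["very low", "low", "moderate", "high"]
# Levels to rate down for each assessment category used in the text.
PENALTY = {"no limitation": 0, "serious limitations": 1, "very serious limitations": 2}


def grade_certainty(design: str, ratings: Dict[str, str], upgrades: int = 0) -> str:
    """Return the certainty level for one outcome.

    `design` is "RCT" or anything else (treated as observational);
    `ratings` maps each of the five downgrading reasons to a PENALTY key;
    `upgrades` is the number of levels rated up.
    """
    start = LEVELS.index("high") if design == "RCT" else LEVELS.index("low")
    score = start - sum(PENALTY[r] for r in ratings.values()) + upgrades
    # Clamp to the four available levels.
    return LEVELS[max(0, min(score, len(LEVELS) - 1))]


# Hypothetical outcome from RCTs, rated down once for risk of bias.
ratings = {
    "risk of bias": "serious limitations",
    "imprecision": "no limitation",
    "inconsistency": "no limitation",
    "indirectness": "no limitation",
    "publication bias": "no limitation",
}
print(grade_certainty("RCT", ratings))
```

A body of evidence from RCTs that is rated down by one level therefore ends at "moderate".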

Results

Characteristics of the selected studies

After the database screening and removal of duplicates, 1646 studies were identified (Fig. 1). After title and abstract screening, 630 studies remained. This number was reduced to 25 after application of the eligibility criteria to the full texts. Of these, four were excluded for the following reasons: (1) in vitro study [15], (2) retrospective study [16], (3) study performed on primary teeth [17], and (4) the study did not use the bulk-filling technique [18]. Twenty-one articles were eligible for the qualitative analysis [19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39]. Of these, seven articles were longer follow-ups of previous articles [22, 25, 31,32,33, 35, 36], so that a total of 14 studies remained for evaluation.

Fig. 1 Flowchart of the study

The characteristics of the 14 selected studies are shown in Table 2. Four studies used a parallel design [19, 21, 28, 34] and ten had a split-mouth design [20, 23, 24, 26, 27, 29, 30, 37,38,39]. The age of the participants ranged from 7 [20] to 87 years [38], with an overall mean ± SD of 38.1 ± 12.3 years [21, 23, 24, 26, 27, 29, 30, 37, 38].

Two studies reported the placement of one restoration per patient [19, 28], one study reported placing at least one restoration per patient [34], nine studies performed 2 to 4 restorations per patient [20, 24, 26, 27, 29, 30, 37,38,39], one study performed 4 restorations per patient [23], and one study did not report this information [21].

Regarding the type of restoration, one study [20] included only class I cavities, six studies [19, 21, 23, 24, 29, 39] included only class II cavities, and seven other studies [26,27,28, 30, 34, 37, 38] included both class I and class II cavities. Twelve studies [19, 21, 23, 24, 26, 28,29,30, 34, 37,38,39] included premolars and molars and two studies [20, 27] included only molars. Seven studies did not report the cavity depth [19, 24, 27, 29, 37,38,39], while two studies reported moderate-sized cavities [21, 23], two reported cavities at least 3 mm deep [26, 30], one reported cavities at least 2 mm deep [28], one reported cavities 2–5 mm deep [34], and one reported cavities 4–5 mm deep [20]. As for the bulk-fill resin classification, eight studies used full-body (sculptable) resins [19, 21, 23, 24, 27, 30, 34, 39], six studies used base (flowable) composites [23, 26, 28, 29, 37, 38], and three used the sonic-activated material [19, 20, 23].

Rubber dam was used in six studies [19, 26, 28, 30, 34, 39], cotton rolls and/or saliva ejector were used in seven studies [21, 23, 24, 27, 29, 37, 38], and one study did not report the method of field isolation [20]. Regarding the adhesive systems used, five studies used etch-and-rinse adhesives [19, 26, 28, 29, 39], four used self-etch adhesives [20, 24, 37, 38], two used both etch-and-rinse and self-etch adhesives [27, 30], two used a universal adhesive [21, 34], and one used universal and self-etching systems [23]. The placement technique was described in all studies, including the thickness of the layers in each group and the light-curing time for each material. More details about the placement technique used in each study are shown in Table 2.

Eleven studies used the modified USPHS criteria to evaluate the restorations [19,20,21, 23, 24, 26, 27, 29, 37,38,39], and one study used the FDI criteria [30]. Two studies evaluated only postoperative sensitivity; one used the Likert scale [28], and the other used the numerical rating scale (NRS) and the visual analogue scale (VAS) [34].

Follow-up periods of the eligible studies varied from 7 days [34] to 10 years [27], with the majority reporting data between 1 and 3 years [19,20,21, 23, 24, 26, 29, 30, 39]. More than 10 restorations were lost to follow-up due to patient dropout in seven studies [21, 23, 27, 29, 30, 37, 39] and fewer than 10 restorations in three studies [19, 24, 38]. There were no dropouts in four studies [20, 26, 28, 34].

Risk of bias assessment

Risk of bias assessment is shown in Fig. 2. Six studies did not report the method of randomization [19, 20, 23, 27, 37, 38] and the great majority did not report the allocation concealment. Four studies [26, 28, 30, 34] were classified as at “low risk of bias,” seven studies [19,20,21, 23, 24, 37, 38] were at an “unclear risk of bias,” and three studies were classified as “high risk of bias” [27, 29, 39].

Fig. 2 Assessment of risk of bias with the Cochrane Collaboration tool

Meta-analysis

Retention/fracture

The 1–1.5-year follow-up (eleven studies [19,20,21, 23, 24, 26, 27, 29, 30, 37, 39]) showed a risk difference (RD) of 0.00 (95% CI −0.01 to 0.01; p = 0.86, Fig. 3). Heterogeneity was not detected (p = 0.96, I² = 0%), and the prediction interval was 0.00 (−0.01, 0.01, Fig. 3).

Fig. 3 Forest plots of the retention/fracture risk of composite resin restorations in posterior teeth placed with the incremental and bulk-filling techniques

In the 2–3-year follow-up (eight studies [20, 21, 27, 29, 30, 37,38,39]), the RD was 0.00 (−0.02 to 0.02; p = 0.88). Heterogeneity was not detected (p = 0.70, I² = 0%), and the prediction interval was 0.00 (−0.02, 0.02) (Fig. 3). The RD for retention/fracture at 5 or more years of follow-up (three studies [27, 37, 38]) was 0.05 (−0.08 to 0.18; p = 0.46). The data were heterogeneous (p = 0.004, I² = 82%; Fig. 3). The results showed no differences between the incremental and bulk-filling techniques at any of the follow-up periods evaluated.

Postoperative sensitivity

Postoperative sensitivity up to 30 days (two studies [28, 34]) showed an RD of 0.04 (−0.02 to 0.10; p = 0.18, Fig. 4). Heterogeneity was detected (p = 0.003, I² = 79%, Fig. 4). In the meta-analysis of the 1–1.5-year follow-up (eight studies [19,20,21, 23, 24, 27, 30, 39]), the RD was 0.00 (−0.01 to 0.02; p = 0.63). Heterogeneity was not detected (p = 0.99, I² = 0%) and the prediction interval was 0.00 (−0.02, 0.02) (Fig. 4).

Fig. 4 Forest plots of the risk of postoperative sensitivity of composite resin restorations in posterior teeth placed with the incremental and bulk-filling techniques

The RD in the 2–3-year follow-up (five studies [20, 21, 27, 30, 39]) was 0.00 (−0.01 to 0.02; p = 0.71). Heterogeneity was not detected (p = 0.49, I² = 0%), and the prediction interval was 0.00 (−0.03, 0.03; Fig. 4). The results showed no differences between the two restorative techniques at any of the follow-up periods evaluated.

Secondary outcomes

In this study, marginal adaptation, marginal discoloration, caries, anatomical form, surface texture, and color match were also evaluated as secondary outcomes. Table 3 shows the risk difference, the confidence intervals, the heterogeneity, and the prediction intervals for the comparison of posterior restorations placed with the incremental or the bulk-filling techniques. There were no statistically significant differences between the two restorative techniques for any secondary outcome at any follow-up period.

Table 3 Data and analyses for the secondary outcomes comparing posterior restorations placed with the incremental or the bulk-filling techniques

Assessment of the certainty of evidence

In the summary-of-findings table (Table 4), it is shown that the certainty of the evidence for fracture and postoperative sensitivity risk was rated as moderate for all follow-up periods, as these outcomes were downgraded by one level due to the unclear risk of bias of most of the studies.

Table 4 Summary of findings table

Discussion

The hypothesis of this study, that the clinical performance of class I and II restorations in posterior teeth placed with the incremental or bulk-filling techniques would be similar, was not rejected. No significant differences were found between incremental and bulk-filled posterior restorations when retention/fracture rate, anatomical form, surface texture, color match, marginal adaptation, marginal discoloration, caries, and postoperative sensitivity were evaluated.

The results of the present study are promising, indicating that the bulk-filling technique is an attractive alternative for posterior restorations. It is well attested that the incremental technique takes more time and is more technique-sensitive than bulk-fill placement [34]; it also carries a risk of air voids between the layers and of operative field contamination [10]. The simplification of operative procedures is desirable in daily clinical practice, as most clinicians prefer to work with easy-to-use restorative materials that allow cavity filling in larger increments and shorter chair-time in the dental office. Innovation in bulk-fill technology has made these composite materials easier to handle and reduced the chances for error. However, clinicians must be careful at all steps of the restorative procedure. Considering that larger increments are used in the bulk-filling technique and that high-irradiance light-curing units are recommended for the polymerization of these materials, it is important to understand the consequences of polymerization shrinkage and stress on the adhesive interface when using bulk-fill resins [40]. It is also important that clinicians use light-curing units that can deliver sufficient energy and the correct wavelengths of light to polymerize resin-based materials [41], especially when placing bulk-fill resins, to guarantee an adequate degree of conversion and good mechanical properties [42]. Another limiting factor in the use of bulk-fill resins is their translucency, which tends to leave the restoration grayish when compared to conventional composites [10, 43]. In the future, it would be desirable for these materials to have improved optical properties, including different opacity levels or the possibility of generating their shade based on the surrounding enamel and dentin color (single-shade resin composite).

Moisture control and the prevention of saliva contamination during adhesive application and composite placement are among the most important factors related to the success of direct composite resin restorations. Good moisture control and the absence of saliva contamination contribute to satisfactory bonding of the restorative material to the tooth and reduce the risk of infiltration and secondary caries, which can compromise the survival and/or longevity of the restorations. A recent update [44] of a Cochrane Review [45] indicated that there is low-certainty evidence that rubber dam usage in direct dental restorative procedures may result in lower failure rates of the restorations compared to restorations placed with cotton rolls and suction after six months of follow-up. At longer time periods, the evidence was found to be very uncertain. In the present review, the included studies used either suction/cotton rolls or rubber dam as moisture control methods. Only one study did not report the isolation method used [20].

It is important to note that several adhesive systems were used in the studies included in this review. This is a difficult factor to standardize in the clinical studies. The use of adhesive systems with different strategies is part of the clinical practice routine, and the inclusion of studies with different adhesives and bonding strategies allows the results of this review to be better generalized to clinical practice. Furthermore, other systematic reviews that evaluated different bonding strategies to assess retention rate, postoperative sensitivity, and other clinical parameters showed that no bonding strategy can be considered better or more clinically effective than the others [46,47,48].

All relevant aspects related to the composite placement technique are described in Table 2; however, some characteristics need to be emphasized here. Thirteen of the 14 studies included in this review used conventional composite resin in the restorations with incremental technique and bulk-fill composite resin for restorations with the bulk-filling technique. Only the study by Loguercio et al. [30] used a bulk-fill resin for the restorations performed by both techniques. It is also important to note that the study by Arhun et al. [18] was excluded from this review, since all restorations were performed with the incremental technique (both for conventional and for bulk-fill composite resins); therefore, it is not possible to compare the restorative techniques, as proposed in the PICOS question.

When reporting the findings related to clinical trials, it is important to address the evaluation criteria. The USPHS criteria, also known as Ryge criteria, are the classical system for the clinical evaluation of dental restorative materials. They have been customized slightly by several authors over the years, and the list of criteria was extended to include other characteristics of interest. In this case, they are commonly reported as modified USPHS [49]. By far, these are the most used criteria in clinical studies of direct and indirect restorative materials. However, concerns have been raised regarding their limited sensitivity and the fact that their items may not completely indicate the clinical success of the restorations [50, 51]. An alternative is the World Dental Federation (FDI) criteria, which have now been used in randomized clinical trials. In the FDI system, the criteria are divided into three groups, comprising aesthetic, functional, and biological parameters. Each criterion can be rated with five scores, three for acceptable and two for unacceptable restorations [50, 51]. In the present review, 11 studies used the modified USPHS criteria, while only the study of Loguercio et al. [30] used the FDI criteria to evaluate the restorations. Nonetheless, the results of that study were still included in the meta-analysis, adapted to the USPHS dichotomization of outcomes: FDI score 1 was considered success, whereas FDI scores 2–5 were considered failure. In the present review, USPHS criteria were dichotomized into alpha (success) vs. bravo/charlie (failure) [14] in an attempt to identify the outcomes when they were first reported, even if in a mild form. The authors acknowledge that when a restoration was scored as bravo, the outcome (marginal discoloration, for example) had already occurred but had not yet caused the restoration to fail.

A previous systematic review and meta-analysis compared the clinical performance of bulk-fill composite resins with that of conventional composite resin used for direct restorations of posterior teeth [52]. The present study, however, differs considerably: it used a more specific search strategy with MeSH terms and free keywords, included more databases and pertinent gray literature, and imposed no restrictions regarding the follow-up periods. Two other aspects are important to highlight: (i) the way the failures were accounted for in the meta-analysis; (ii) the choice between fixed-effects and random-effects models.

In the study by Veloso et al. [52], a charlie or delta score in any of the criteria items (marginal discoloration, retention, fracture of tooth/resin composite, caries, postoperative sensitivity, anatomic form, marginal adaptation) was considered restoration failure. This approach does not account for the different weight that each of these criteria has in clinical decision-making. For instance, fracture and debonding are more important features than marginal adaptation, the latter not requiring restoration replacement or repair. In contrast, in the present study, each item was analyzed separately, as a distinct outcome, in an attempt to compare the incremental and bulk-filling techniques in all criteria of the restoration evaluation.

The meta-analysis by Veloso et al. [52] used a fixed-effects model, because no statistically significant heterogeneity was found among the studies. Boaro et al. [53] also used a fixed-effects model for their meta-analyses, including the clinical performance outcome. However, heterogeneity should not be used to validate the choice of the model used in a meta-analysis. This choice must be based on aspects related to the study variables. The fixed-effects model assumes that there is one true effect size, and all the studies included in the analysis share a common effect size. Also, all differences in the observed effects are attributed to sampling error. In contrast, the random-effects model assumes that there is a distribution of true effect sizes, and the mean of this distribution has to be estimated. Another important assumption in the random-effects model is that the effect size may vary from study to study [54]. We understand that, because of the differences among the studies included in the meta-analyses (for example, type of restoration, teeth, cavity depth, isolation method, adhesive system, and placement technique), the random-effects model is the appropriate one to use. In addition, the random-effects model incorporates the heterogeneity across the studies into the analyses, and is preferred when heterogeneity is expected.
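The two sets of assumptions can be stated compactly. The notation below is an assumption of this sketch and does not appear in the cited works: y_i is the observed effect (here, the risk difference) in study i, v_i its within-study variance, θ the common true effect, μ the mean of the distribution of true effects, and τ² the between-study variance.

```latex
% Fixed-effects model: every study estimates the same true effect \theta,
% and observed differences are due to sampling error alone.
\[
  y_i = \theta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, v_i)
\]

% Random-effects model: each study has its own true effect \theta_i,
% drawn from a distribution with mean \mu and variance \tau^2.
\[
  y_i = \theta_i + \varepsilon_i, \qquad
  \theta_i \sim N(\mu, \tau^2), \qquad
  \varepsilon_i \sim N(0, v_i)
\]

% Inverse-variance weights under each model.
\[
  w_i^{\mathrm{FE}} = \frac{1}{v_i}, \qquad
  w_i^{\mathrm{RE}} = \frac{1}{v_i + \tau^2}
\]
```

When τ² is estimated as zero, the two sets of weights coincide and both models give the same pooled estimate, so the choice of model matters most when the true effects are expected to differ between studies.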

In this study, the risk difference was used to measure the effect for all the outcomes evaluated. Other meta-analyses used the risk ratio (RR) [52] or the odds ratio (OR) [53] to estimate the effects. Because the present study separated all outcomes, many of them had no events, and ratio measures such as RR and OR are undefined for a study with no events in either group. Therefore, to include all the studies in the meta-analyses, the risk difference was used. Although the risk difference is not frequently used because absolute risks may be different at baseline, this is not the case in the present study, because all criteria are rated as alpha or acceptable at baseline and differences only occur over time. The risk difference can be understood as the difference in risk of a condition between an exposed (or intervention) group and an unexposed (control) group [55].
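A short example illustrates why the risk difference remains usable when a study reports no events. This is a minimal sketch with hypothetical counts; the function name `effect_measures` is an assumption, and no continuity correction is applied, so a measure that cannot be computed from the raw counts is returned as `None`.

```python
from typing import Optional, Tuple


def effect_measures(e1: int, n1: int, e2: int, n2: int) -> Tuple[float, Optional[float], Optional[float]]:
    """Return (RD, RR, OR) for one study: group 1 vs. group 2.

    e1, e2 are event counts and n1, n2 are group sizes.
    None marks a ratio measure that is undefined for these counts.
    """
    p1, p2 = e1 / n1, e2 / n2
    rd = p1 - p2
    # RR divides by the control risk, so it needs at least one control event.
    rr = p1 / p2 if e2 > 0 else None
    # OR = e1 * (n2 - e2) / ((n1 - e1) * e2); undefined if the denominator is zero.
    denom = (n1 - e1) * e2
    odds_ratio = e1 * (n2 - e2) / denom if denom > 0 else None
    return rd, rr, odds_ratio


# Hypothetical study with no failures in either group of 40 restorations.
print(effect_measures(0, 40, 0, 40))
# Hypothetical study with 2 vs. 4 failures out of 40 restorations each.
print(effect_measures(2, 40, 4, 40))
```

For the study without events, only the risk difference is defined (it equals zero), which is why such a study can still enter a meta-analysis based on risk differences.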

The certainty of the evidence produced in the present study was graded as moderate, as most of the evidence came from RCTs with unclear risk of bias. Allocation concealment is undoubtedly what contributed the most to increasing the risk of bias of the eligible studies. Out of the 14 studies, 10 were classified as unclear in this domain. Randomization, along with allocation concealment, is considered one of the most important features of RCTs, as they prevent selection bias. Randomization was correctly reported in eight out of the 14 studies (low risk of bias). The blinding of examiners presented a low risk of bias in 12 of the 14 studies. Still, regarding the certainty of the evidence, it is worth mentioning that the evidence was also downgraded for indirectness, because the studies used different bonding strategies, adhesive systems, and bulk-fill composite resin brands. Despite this fact, the bulk-filling technique seems to present results comparable to the incremental technique in posterior restorations, regardless of the materials used.

The present study has some limitations. The quality of the studies varied and most of the included studies were at an “unclear risk of bias” [19,20,21, 23, 24, 37, 38]. Three studies [27, 29, 39] were rated as “high risk of bias” in the incomplete outcome data domain because more than 20% of the restorations were lost to recall. As for reporting bias, all included studies were judged as free of selective reporting of outcomes and premature reporting of results; however, some methodological information was underreported in some studies, such as the number of restorations per participant [21], cavity depth [19, 24, 27, 29, 37,38,39], and the isolation method used for the placement of the restorations [20]. Regarding incomplete retrieval of identified research, all data for the meta-analyses were available in the full-text articles.

Finally, we encourage the development of well-designed randomized clinical trials comparing these two restorative techniques with low risk of bias regarding methodology design, execution, and reporting of the research results. It is also important to highlight that despite the difficulties, the studies should include a large number of participants and long follow-up periods, since the number of events tends to be small in early evaluations.

Conclusion

The present systematic review and meta-analysis showed that the clinical performance of class I and II restorations in posterior teeth is similar when placed with the incremental and bulk-filling techniques, although the quality of evidence was graded as moderate. For all the outcomes evaluated, no significant differences were observed between the restorative techniques, considering short- (up to 3 years) or long-term (5 years or more) follow-up periods.