Introduction

The diagnosis and treatment of internal (IRR) or external root resorptions (ERR) often represent a challenge in the clinical practice of dentistry. Root resorptions consist of internal or external destruction of tooth structure originating from inflammation with microbial, idiopathic, traumatic (e.g., dental trauma, orthodontic movement and periodontal treatment), or chemical (e.g., internal bleaching) etiologies [1,2,3,4,5]. Due to the absence of clinical signs in the initial stages, imaging exams are essential for early diagnosis of resorptive lesions. In this sense, intraoral radiographs stand out for allowing an initial dental evaluation with a low radiation dose and easy accessibility, making them the first-choice exams for this purpose [5, 6].

In current practice, the digitization of dental radiographic systems is increasingly common, and the performance of digital exams has been shown to be equal to or greater than conventional radiography in various diagnostic tasks, including the detection of root resorption lesions [6, 7]. In part, this fact can be attributed to the development of modern radiographic systems, with high spatial and contrast resolutions, and especially to the possibility of manipulation and improvement of the image, which is performed with the purpose of achieving more accurate diagnoses [8].

Among the image manipulation tools available for the interpretation of digital radiographs, adjustment of brightness and contrast is a simple procedure and the most frequently used one [9]. In general, these adjustments are carried out freely, i.e., according to the subjective judgment of the evaluator on the image quality [10,11,12]. However, few studies have investigated the possible influence of brightness and contrast adjustments on the diagnosis of subtle dental conditions, such as IRR and ERR in early stages.

Therefore, the aim of this study was to evaluate the performance of periapical digital radiography assessed under different radiographic brightness and contrast variations on the detection of simulated internal and external root resorption lesions. Additionally, observers’ preferences related to subjective image quality for these diagnostic tasks were evaluated and related to the diagnostic accuracy.

Materials and methods

Sample preparation

After approval by the local institutional research review board (protocol number #2.057.024), 30 single-rooted extracted human teeth were collected. Disinfection and cleaning were performed with 2% glutaraldehyde for 2 h, and the teeth were kept hydrated in water, as previously described [13, 14]. Superficial root scaling was performed to remove tissue remnants and promote a uniform root surface. Exclusion criteria consisted of teeth with root canal treatment, calcification, incomplete root formation, resorptive lesions, or fractures. Teeth were decoronated at the cementoenamel junction to reduce tooth memorization bias during the evaluations. Internal and external simulated resorptive lesions were produced as follows:

  • ERR: A total of 36 cavities simulating external resorptive lesions were produced on surfaces of 15 teeth. Surfaces that did not receive preparation were used as controls. Lesions had three different sizes (small, medium, or large) and were evenly distributed among buccal, lingual, mesial, and distal surfaces of the roots. They were also evenly distributed among apical, middle, and cervical thirds of the roots. Sizes and depths of the lesions followed the methodology proposed by Neves et al. [13] and de-Azevedo-Vaz et al. [14]: small (0.26 mm diameter and 0.08 mm depth), medium (0.62 mm diameter and 0.19 mm depth), and large (1.05 mm diameter and 0.24 mm depth). Diameters were determined by the width of spherical diamond-tipped burs (KG Sorensen, Cotia, Brazil), and the depths were standardized with the aid of a cavity preparation machine [15].

  • IRR: Fifteen teeth were cut into buccal and lingual halves with a diamond disk (Isomet 1000, Buehler Ltd., Lake Bluff, USA) to expose their root canal. IRR lesions were produced by a combination of mechanical [6] and biochemical protocols [16]. Cavities 0.25 mm in diameter and depth were prepared on the internal buccal surface of the middle third of the root, using a spherical carbide bur (KG Sorensen, Cotia, Brazil). The other surfaces were not involved, to simulate an initial lesion. Thereafter, a cylindrical chamber of composite (Z250; 3M ESPE, St Paul, MN, USA) was prepared around the cavity to confine the demineralization solutions, which were applied in two 12-h cycles of 5% nitric acid, with an application of 8% sodium hypochlorite for 10 min between those cycles. The samples were kept at −1 °C (±3 °C) for the whole process. The composite cylinder was then removed, and residual solutions were rinsed away with water. Finally, the two halves were glued together with cyanoacrylate adhesive (Scotch Super Glue, 3M, Maplewood, USA).

Image acquisition

Teeth were individually inserted in an alveolar socket of a dry mandible for image acquisition with a CMOS digital radiographic sensor (Digora Toto system, Soredex, Tuusula, Finland). All radiographs were obtained using a Focus unit (Instrumentarium, Tuusula, Finland) operating at 70 kVp and 7 mA, with an exposure time of 0.063 s and a focus-receptor distance of 40 cm. The exposure time was selected from radiographs taken with exposure times ranging from 0.02 to 0.50 s. Three oral and maxillofacial radiologists, blinded to the exposure times and working in consensus, selected the images that had the best density and contrast for dental diagnosis. When there was no difference between images with different exposure times, the lowest exposure time was selected.

To standardize the initial images, all radiographs were acquired with an aluminum step wedge, positioned laterally to the phantom, to verify the gray values and measure the initial density and contrast. An acrylic device was used to simulate soft tissue attenuation and to maintain the receptor and the tooth/mandible in position, reproducing the paralleling technique [6]. A total of 45 initial images were acquired: 15 with ERR, 15 with IRR, and 15 to be used as controls for IRR, which were radiographed before the production of simulated IRR (Fig. 1). The control group for ERR consisted of the root surfaces that were not subjected to resorption simulation.

Fig. 1

Acrylic positioning device used for standardization of exposure geometry during the acquisition of the digital radiographic images

Image preparation and assessment

To standardize the initial images, they were exported in 8-bit TIFF format, and the gray values of the aluminum step wedge were measured using ImageJ software (National Institutes of Health, USA). For image density standardization, the 6-mm-thick aluminum step was set to a gray value of 150 (standard deviation of 20). For image contrast, the differences in gray values between the 6-mm step and the 8-mm and 4-mm steps were up to 40. These values were previously established in a pilot study.
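The acceptance criteria above can be sketched as a small check. This is only an illustration under stated assumptions: the function name and arguments are hypothetical, the "standard deviation of 20" is read here as a tolerance band of ±20 around the target gray value of 150, and "up to 40" is read as an absolute difference of at most 40 between adjacent steps.

```python
def check_step_wedge(gray_4mm, gray_6mm, gray_8mm,
                     target=150, tolerance=20, max_contrast=40):
    """Return (density_ok, contrast_ok) for one radiograph.

    Inputs are mean gray values (0-255, 8-bit) measured on the 4, 6 and
    8 mm aluminum steps. The tolerance reading of the text is an assumption.
    """
    # Density criterion: the 6 mm step should sit near the target gray value.
    density_ok = abs(gray_6mm - target) <= tolerance
    # Contrast criterion: neighbouring steps differ from the 6 mm step by at most 40.
    contrast_ok = (abs(gray_8mm - gray_6mm) <= max_contrast
                   and abs(gray_6mm - gray_4mm) <= max_contrast)
    return density_ok, contrast_ok


# Hypothetical measurements, not data from the study.
print(check_step_wedge(120, 150, 185))
print(check_step_wedge(100, 175, 200))
```

The first hypothetical image satisfies both criteria; the second fails both, because its 6 mm step lies 25 gray levels from the target and 75 gray levels from the 4 mm step.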

Thereafter, each original radiograph had its brightness and contrast adjusted in 4 different variations using PowerPoint software (Microsoft Corporation, Redmond, WA, USA) according to a previous study [17], resulting in 5 different images for each radiograph acquired (initial + 4 adjustments): (V1) −30% brightness and +30% contrast; (V2) −15% brightness and +15% contrast; (V3) original image; (V4) +15% brightness and −15% contrast; and (V5) +30% brightness and −30% contrast. Thus, a total of 225 images were prepared: 75 for ERR assessments, and 150 for IRR (75 with resorptions and 75 as controls). Figure 2 shows examples of ERR and IRR in radiographic images adjusted according to the different brightness and contrast variations.
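The five variations and the resulting image count can be illustrated with a short sketch. The adjustment formula below is an assumption: the text does not describe how PowerPoint computes its brightness and contrast percentages, so a common linear model (contrast as a stretch about mid-gray, brightness as an offset on the 0 to 255 scale) stands in for it. All names are hypothetical.

```python
# (brightness, contrast) as fractions, taken from the V1 to V5 definitions.
VARIATIONS = {
    "V1": (-0.30, +0.30),
    "V2": (-0.15, +0.15),
    "V3": (0.00, 0.00),   # original image
    "V4": (+0.15, -0.15),
    "V5": (+0.30, -0.30),
}


def adjust_pixel(value, brightness, contrast):
    """Apply an assumed linear brightness/contrast model to one 8-bit value."""
    stretched = (value - 128) * (1.0 + contrast) + 128   # contrast about mid-gray
    shifted = stretched + brightness * 255               # brightness as an offset
    return max(0, min(255, round(shifted)))              # clip to the 8-bit range


def adjust_image(pixels, variation):
    """Return a new image (list of rows) adjusted to the named variation."""
    brightness, contrast = VARIATIONS[variation]
    return [[adjust_pixel(v, brightness, contrast) for v in row] for row in pixels]


def total_images(n_initial=45):
    """Each initial radiograph yields one image per variation."""
    return n_initial * len(VARIATIONS)


print(total_images())  # 45 initial images x 5 variations = 225
```

Under this model V3 leaves the image unchanged, and the count reproduces the 225 images reported (15 ERR, 15 IRR, and 15 IRR control radiographs, each in five variations).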

Fig. 2

Digital radiographic images showing examples of simulated IRR and ERR lesions (arrows) in the five brightness and contrast variations tested (V1 to V5). The ERR cavities are exemplified in the three different sizes: small (S1), medium (S2), and large (S3)

ERR and IRR conditions were independently assessed in a PowerPoint slideshow, with one image per slide, by 5 oral radiologists on a 24.1-in. LCD display with a resolution of 1920 × 1200 pixels (MDRC-2124, Barco N.V., Courtrai, Belgium). They rated the presence of ERR (per surface) and IRR (per tooth) on a 5-point scale: 1 = absent; 2 = probably absent; 3 = uncertain; 4 = probably present; and 5 = present. Manipulation tools were not allowed, except for zoom. To avoid observer learning and visual fatigue, observers assessed a maximum of 25 images per day, with an interval of at least 3 days between assessments. Radiographs were presented in a random order. Thirty days after the end of the evaluations, 30% of the sample was re-assessed to determine intraobserver agreement.

To assess image quality subjectively, slideshows containing 5 images of each tooth with resorptive lesions were presented, with brightness and contrast variations randomly arranged. Observers were asked to classify images as the best, good, regular, bad or the worst for ERR or IRR diagnosis.

Statistical analysis

Data were analyzed in SPSS version 22.0 software (IBM Corp, Armonk, NY, USA). Intra- and interobserver agreement were calculated with the weighted kappa test and interpreted as follows: < 0.00, poor; 0.00–0.20, slight; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, substantial; 0.81–1.00, almost perfect [18]. Areas under the receiver operating characteristic (ROC) curve were calculated to assess the diagnostic accuracy for the presence or absence of resorption. Sensitivity and specificity were also obtained. Diagnostic values were compared by one-way analysis of variance (ANOVA; factor: brightness and contrast variation) for IRR and by two-way ANOVA (brightness and contrast variation × lesion size) for ERR, with the Tukey post hoc test. The significance level adopted was 5% (α = 0.05).
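To make the diagnostic measures concrete, the sketch below computes an empirical area under the ROC curve and sensitivity/specificity from 5-point ratings. It is not the SPSS procedure used in the study. The area is obtained as the proportion of lesion/control pairs in which the lesion received the higher score (ties count one half), which equals the trapezoidal area under the empirical ROC curve. The cutoff of 4 ("probably present" or higher) for dichotomizing the scale is an assumption, since the text does not state one, and the ratings are invented for illustration.

```python
def roc_auc(ratings_lesion, ratings_control):
    """Empirical ROC area from ordinal ratings (pairwise comparison form)."""
    wins = 0.0
    for pos in ratings_lesion:
        for neg in ratings_control:
            if pos > neg:
                wins += 1.0      # lesion correctly scored higher than control
            elif pos == neg:
                wins += 0.5      # ties contribute one half
    return wins / (len(ratings_lesion) * len(ratings_control))


def sensitivity_specificity(ratings_lesion, ratings_control, cutoff=4):
    """Dichotomize the 5-point scale; scores >= cutoff count as 'lesion present'."""
    true_pos = sum(r >= cutoff for r in ratings_lesion)
    true_neg = sum(r < cutoff for r in ratings_control)
    return true_pos / len(ratings_lesion), true_neg / len(ratings_control)


# Hypothetical ratings, not data from the study.
lesion = [5, 4, 3, 2]
control = [1, 2, 3, 1]
print(roc_auc(lesion, control))                  # 0.875
print(sensitivity_specificity(lesion, control))  # (0.5, 1.0)
```

The hypothetical example mirrors the pattern reported for ERR: controls are rarely scored as lesions (high specificity), while a share of true lesions falls below the cutoff (low sensitivity).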

Results

Table 1 details the results related to IRR evaluation, considering the different types of image adjustments. The area under ROC curve, sensitivity and specificity values did not differ between the brightness and contrast variations tested (p > 0.05). Overall diagnostic values of periapical radiography were low for detection of simulated IRR lesions.

Table 1 Diagnostic values related to detection of simulated IRR lesions for each brightness and contrast variations tested

Diagnostic values related to the detection of simulated ERR lesions are shown in Table 2. There were no statistically significant differences between the five brightness and contrast variations for the area under the ROC curve and specificity (p > 0.05). For sensitivity, V4 presented lower values than V1 in large ERR lesions (p < 0.05). Regarding ERR size, small lesions had significantly lower values of the area under the ROC curve and sensitivity than medium and large lesions (p < 0.05). In general (considering all simulated ERR lesions regardless of their size), periapical radiography showed low sensitivity and high specificity for this diagnostic task. Only one specificity value is shown because, unlike the lesion groups, which comprised several ERR sizes, there was just one control group.

Table 2 Diagnostic values related to detection of simulated ERR lesions for each size and brightness and contrast variations tested

Regarding observers’ preference on the image quality for IRR evaluation (Fig. 3a), V2 and V1 were ranked as the “best” or “good” in 84% and 62.7% of the cases, respectively. The original image (V3) was classified as “regular” in 61.3% of the responses, and V4 and V5 (higher brightness and lower contrast) received the worst classifications, ranked as “bad” or “worst” in 68% and 100% of the cases, respectively. In contrast, the evaluation of simulated ERR lesions showed that images with the greatest brightness and contrast adjustments (extremes V1 and V5) were ranked as “bad” or “worst” in most cases (Fig. 3b). The best classifications were attributed to V2, which was considered “best” or “good” in 70.6% of cases, followed by V3 (69.4%) and V4 (28%).

Fig. 3

Distribution of the observers’ responses regarding their preference for the adjustment of the brightness and contrast of digital radiographic images for the IRR (a) and ERR (b) evaluations

Table 3 shows intra- and interobserver agreement values for both IRR and ERR evaluations. Regarding IRR, mean intraobserver agreement ranged from fair (V3, 0.38, and V5, 0.35) to substantial (V4, 0.64), and mean interobserver agreement was fair for all brightness and contrast variations (0.21–0.37). For ERR, mean intra- and interobserver agreement ranged from fair (V5, 0.38) to substantial (V3, 0.65), and from slight (V5, 0.19) to moderate (V3, 0.42), respectively.
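The verbal labels attached to these kappa values follow the interpretation bands given in the statistical analysis [18]. As a minimal sketch (the function name is hypothetical, and values falling between two printed bands, such as 0.205, are assigned to the lower band by assumption), the mapping can be written as:

```python
def kappa_band(kappa):
    """Map a kappa value to the agreement label used in the text [18]."""
    if kappa < 0.0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Mean agreement values quoted in the paragraph above.
for value in (0.38, 0.35, 0.64, 0.19, 0.42, 0.65):
    print(value, kappa_band(value))
```

Applied to the mean values quoted above, the function returns fair, fair, substantial, slight, moderate, and substantial, matching the labels in the text.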

Table 3 Weighted kappa test (95% confidence intervals) for intra- and interobserver agreement related to IRR and ERR in each brightness and contrast variation

Discussion

IRR and ERR are commonly diagnosed by means of imaging exams because they only present clinical signs in advanced stages [5]. Despite the fact that the two types of root resorption present different physiopathology, both need to be correctly diagnosed to establish treatment and/or follow-up and were studied together in the present study. Although the radiographic characteristics of these lesions are well known, their early diagnosis is a challenge due to the limitations of radiographic examination, such as the overlap of structures in the image [1, 19]. Thus, evaluating the periapical digital radiography performance under different brightness and contrast conditions, and the observers’ preference related to image quality in the detection of these lesions, is important to improve detectability of initial lesions.

In the present study, diagnostic values of periapical radiography for IRR assessment were low, which is in accordance with previous reports evaluating early-stage lesions [6, 10]. According to Kamburoglu et al. [10], radiographic detection of IRR cavities improves considerably for larger lesions. In addition, the location of IRR has also been reported to influence its diagnosis, with apical lesions being more difficult to diagnose. However, the location of the lesion did not represent a variable in our study because the lesions were standardized between cervical and middle root thirds, since these sites are indicated as the most prevalent for the occurrence of this condition [3, 20]. Regarding the variations of brightness and contrast tested, all of them presented a similar performance in the detection of IRR. The effect of brightness and contrast on caries detection has been assessed previously [17] and, similarly, these enhancement tools did not influence diagnostic accuracy significantly. Previous studies evaluating IRR with digital radiographic images did not test the specific effect of brightness and contrast settings on diagnosis; instead, free adjustments of these features were allowed [6, 10]. A study that investigated the effect of other specific enhancement tools (i.e., revealer filter and contrast inversion) on this diagnostic task [10] did not find significant differences from original images.

For ERR lesions, in general, periapical radiography showed high specificity but low sensitivity, thus impairing the accuracy of the examination. Durack et al. [21] and Creanga et al. [19] found slightly higher diagnostic values, which can be attributed to the fact that the ERR cavities evaluated in these previous studies were similar only to the medium or large lesions tested in our study. As expected and shown in previous reports [7, 11, 21], accuracy and sensitivity were higher for lesions of medium and large sizes. Regarding the brightness and contrast variations, sensitivity was significantly lower in the V4 group for detection of large lesions, while the highest value was associated with the V1 group. This result suggests that, for larger lesions, higher image contrast is associated with better radiographic detection.

ERR lesions located on the lingual root surface are easier to detect [19], while apical lesions may be more difficult to diagnose [11]. Since the lesions in the present study were evenly distributed across sizes and locations, we believe that location did not interfere with the results; however, such a comparison was not performed.

The V2 group (i.e., −15% brightness and +15% contrast) was ranked best among all image quality variations for observers’ preferences. It appears that a moderate increase in radiographic contrast, from the perspective of the observers, may improve image quality for the evaluation of IRR and ERR, although the V2 group did not show better accuracy for either diagnostic task. In addition, the most extreme variations of brightness and contrast (V1 and V5 groups) received the worst image quality classifications for ERR. For these groups, although these results refer to a subjective evaluation, a certain association with the objective data can be observed, since intra- and interobserver agreements were impaired in such groups (slight and/or fair agreement). On the other hand, IRR evaluations showed that radiographs with lower contrast (V4 and V5) were classified as worse, but there was no association between this and the agreement values. Therefore, some brightness and contrast variations seem to slightly benefit intra- and interobserver agreement for the detection of ERR; however, such a tendency was not observed for IRR lesions. Direct comparisons between the agreement values found here and those from previous studies are difficult due to the variations in lesion sizes and the methods used. Intra- and interobserver agreements have been reported as varying from poor to almost perfect for ERR [5, 21,22,23,24] and from slight to almost perfect for IRR [5, 6, 22].

So far, the studies comparing different imaging modalities on the detection of IRR or ERR were performed with burs for simulation of the root resorption lesions [6, 7, 10,11,12, 19, 21, 24, 25]. IRR cavities in this study were simulated using mechanical and chemical protocols, according to previously reported methods [6, 16], with the objective of standardizing the size of the cavities and to avoid strongly marked edges. Recently, a demineralization protocol for simulation of ERR lesions was published [26]. However, only a single lesion size was tested, with approximately 3 mm in diameter and 0.45 to 0.5 mm of depth, which are larger than the ERR sizes tested in our study and do not simulate lesions at earlier stages [26]. ERR lesion was mechanically induced in our study to have greater control over the size and depth of the cavities, and to allow more direct comparisons with other studies that used similar methodology. Further investigations are necessary to indicate demineralization protocols to create lesions of other sizes.

Clinical information, such as a history of trauma or orthodontic movement, may contribute to the diagnosis of IRR and/or ERR [5]; however, it could not be taken into account in this ex vivo study. In addition, we did not use complementary imaging examinations for the evaluation of IRR and ERR lesions, such as cone beam computed tomography or parallax radiographic techniques, which are available to dentists in clinical practice and can improve the diagnosis of these conditions [1, 7, 11, 21]. However, since there is not always a significant difference between the use of one or more radiographic projections [19], and considering the increase in patient exposure to ionizing radiation with additional examinations, optimizing the interpretation of orthogonal radiographs is very important to the diagnostic process.

Besides the present study, few studies were found investigating other enhancement tools for digital radiographic images in evaluating IRR and ERR lesions. Therefore, we strongly encourage future studies that address these issues, to find ways to improve the performance of intraoral radiography in these diagnostic tasks.

Conclusion

Periapical radiography has low accuracy for the detection of simulated IRR and ERR lesions. For both types of resorption, brightness and contrast variations do not affect diagnostic accuracy significantly. Diagnostic values related to ERR were higher for larger lesions. Observers’ subjective preferences favored images with a moderate decrease in brightness and increase in contrast; however, this was not associated with an increase in the accuracy values.