Introduction

The periodontium is a complex structure composed of two mineralized tissues, cementum and alveolar bone, and two soft tissues, periodontal ligament (PDL) and gingiva. The mineralized tissues provide skeletal support for the teeth, with the PDL as an anchor between the structures. The gingiva is responsible for protecting the underlying tissues against the oral microflora [1]. Diagnosing periodontal conditions is paramount in dental care, especially in periodontitis cases. Periodontitis is a bacteria-initiated and host-mediated inflammatory process that affects 19% of the population above 15 years of age globally and up to 70% of the population above 65 years of age in the United States [2]. Untreated progression of the disease can ultimately lead to tooth mobility and cause tooth loss through alveolar bone destruction [3]. Currently, there is frequent misclassification of the initial stage of periodontitis due to measurement errors with clinical assessment, and therefore, imaging diagnosis is essential in staging and grading the disease [4].

Radiographic evaluation of the periodontium relies on conventional radiography and cone-beam computed tomography (CBCT). While conventional radiography has anatomical limitations due to the superimposition of structures, CBCT is a reference standard for imaging hard tissues of the periodontium [5]. One study has shown that CBCT measures alveolar bone height to within 0.6 mm of direct measurement [6]. However, CBCT has poor soft tissue contrast and is reported to over- or underestimate bone loss [7, 8]. Moreover, CBCT delivers a higher radiation dose to patients than conventional radiographs, so repeated assessments are not recommended in patient care [9,10,11].

Ultrasound (US) has recently received significant attention as an alternative non-ionizing imaging method for many purposes in dentistry, including assessing periodontal structures [10, 12]. It uses a transducer to emit sound waves into oral tissues and generates images of structures in B-mode (brightness mode) based on the echoes received back [13]. The advantages of US in the periodontium include real-time imaging, low cost, portability, painlessness, and soft tissue visualization. Preliminary studies have shown that US measurements have high accuracy compared to micro-CT [14, 15]. Recent human studies have suggested that US possesses diagnostic value in estimating several clinical periodontal parameters, including alveolar bone level, alveolar bone thickness, gingival height, and gingival thickness [16,17,18,19]. The present authors have previously reported good to excellent reliability for US in evaluating alveolar bone and gingival thickness [16]. A recent systematic review (SR) of studies that used US in the periodontium of live humans has shown that US has the potential to become a chairside diagnostic tool in dentistry [20].

However, the SR pointed out several knowledge gaps that require investigation before US can be considered for clinical implementation. One of these gaps is the lack of characterization of the repeatability of US imaging. Variability between repeated US measurements is expected because US is a dynamic technique that is heavily operator dependent. Therefore, the repeatability of US measurements when scanning the same patient at different times under identical conditions needs to be investigated. The objective of the present study is to characterize the repeatability of scanning the same patient at different times with the same intraoral US system and operator.

Methods

Study design and participants

This prospective study was conducted in accordance with the Declaration of Helsinki. Participants were recruited from the Graduate Periodontics Clinic at the Kaye Edmonton Dental Clinic, University of Alberta. A power analysis conducted in SPSS (IBM, NY, USA) for a one-sample t-test with a 5% significance level (95% confidence), 80% power, and an estimated effect size of 0.2 indicated that at least 199 images were required. Inclusion criteria were adult volunteers possessing all natural incisors, canines, and premolars. Exclusion criteria included edentulous patients and patients with craniofacial syndromes. This study had ethics approval from the University of Alberta (Pro00133128), and written consent was obtained from all participants.
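The sample size figure can be roughly checked by hand. The sketch below is not the SPSS procedure used in the study; it assumes a two-sided test (the text does not state sidedness) and uses the normal approximation with Guenther's correction term rather than an exact noncentral-t calculation. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def one_sample_t_sample_size(effect_size, alpha=0.05, power=0.80):
    """Approximate n for a two-sided one-sample t-test.

    Normal approximation plus Guenther's z^2 / 2 correction. This is an
    assumption-laden shortcut, not the exact noncentral-t computation
    that a statistics package performs.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    n = ((z_alpha + z_power) / effect_size) ** 2 + z_alpha ** 2 / 2
    return ceil(n)


# Inputs taken from the text: effect size 0.2, alpha 0.05, power 0.80.
print(one_sample_t_sample_size(0.2))  # prints 199
```

Under these assumptions the approximation agrees with the 199 images reported above.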

Ultrasound scan

An in-house intraoral US system was used in this study. The transducer conducted scans in B-mode at an imaging frequency of 20 MHz, a depth of 7 mm, and a gain of 50%. Real-time video of the scan was transmitted via Wi-Fi to the Clarius Scanner app (Clarius Mobile Health, BC, Canada) on an iPad Pro (Apple, CA, USA), which saved the files in DICOM format. The DICOM files were then analyzed on a laptop using the RadiAnt DICOM viewer software (Medixant, Poland). The US operator was a general dentist with three years of US experience. Participants were scanned three times (T1, T2, and T3) in a single appointment, with the US system reset between scans. This yielded three different scans of the same tooth taken an average of 15 min apart. The tooth-periodontium interface of sixteen teeth was scanned in each participant: the incisors, canines, and first premolars of the upper and lower arches. In the scan protocol, the transducer was placed buccally at the midline, with its long axis parallel to the tooth’s long axis, which yields images in the sagittal plane. An in-house gel pad was used between the transducer and tooth to maintain proper acoustic conditions for imaging. Periodontal landmarks in one US image are illustrated in Fig. 1. An example of upper incisor scanning is illustrated in Fig. 2.

Fig. 1 Periodontium anatomy of an upper central incisor in an ultrasound image. Figueredo et al. [20]

Fig. 2 Demonstration of the scanning process of a right upper central incisor with the transducer and gel pad positioned along the tooth’s long axis

Ultrasound measurements

This study investigated the repeatability of three periodontal measurements: the distance from the alveolar bone crest (ABC) to the cementoenamel junction (CEJ), denoted ABC-CEJ; gingival thickness (GT); and alveolar bone thickness (ABT). ABC-CEJ is the length of a straight line from the ABC to the CEJ, GT is the length of a straight line from the ABC to the edge of the gingival tissue, and ABT is the thickness of the alveolar bone measured 0.3 mm apical to the ABC. The measurements were conducted by the same evaluator on all T1 images, followed by all T2 images and, finally, all T3 images. The measurements retrieved from repeated scans are illustrated in Fig. 3.

Fig. 3 A ABC-CEJ [1], GT [2], and ABT [3] inter-landmark illustration in US imaging. B Examples of measurements on the same tooth. Figueredo et al. [20]

Statistical analysis

US measurements were compared with statistical analysis conducted in SPSS. The mean and standard deviation for each scan time and the mean absolute deviation (MAD) across repetitions were calculated. A two-way mixed model intraclass correlation coefficient (ICC) for absolute agreement and single measures, with a 95% confidence interval (CI), was calculated between the three measurement times.
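For readers who want to see what an absolute-agreement, single-measures ICC computes, the following is a minimal sketch built from two-way ANOVA mean squares, following the ICC(A,1) form given by McGraw and Wong. It is an illustration and not the SPSS routine used in the study; the function name and the example values are hypothetical, and the confidence interval is not computed.

```python
def icc_absolute_single(data):
    """ICC for absolute agreement, single measures, i.e. ICC(A,1).

    data: list of rows, one per tooth; each row holds the k repeated
    measurements (for example T1, T2, T3) of that tooth.
    """
    n = len(data)      # number of subjects (teeth)
    k = len(data[0])   # number of repeated scans per subject
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, scan times, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical ABC-CEJ values in mm (T1, T2, T3) for four teeth; not study data.
example = [
    [3.1, 3.0, 3.2],
    [2.4, 2.6, 2.5],
    [3.8, 3.7, 3.9],
    [2.9, 3.1, 3.0],
]
print(round(icc_absolute_single(example), 3))
```

Because the scan-time mean square appears in the denominator, a systematic shift between T1, T2, and T3 lowers this ICC, which is what distinguishes absolute agreement from consistency.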

The repeatability coefficient (RC) with 95% CI of each measurement was calculated using the formula 1.96 × √2 × within-subject SD [21]. Profile plots and scatter plots with 45-degree lines were also used to visualize examples of the samples with the best and worst reliability. To interpret the magnitude of the ICC, a score lower than 0.5 was considered poor repeatability, between 0.5 and 0.75 moderate repeatability, above 0.75 and below 0.9 good repeatability, and above 0.9 excellent repeatability [22].
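The RC formula and the ICC bands above can be sketched as follows. This is an illustration, not the study's SPSS workflow: it assumes the within-subject variance is estimated as the mean of the per-tooth sample variances (valid when every tooth has the same number of repeats), and the handling of scores falling exactly on 0.75 or 0.9 is an assumption because the text leaves those boundaries open. Function names are hypothetical.

```python
from math import sqrt
from statistics import variance


def repeatability_coefficient(data):
    """RC = 1.96 * sqrt(2) * within-subject SD.

    data: list of rows, one per tooth, each holding the repeated
    measurements. The within-subject variance is taken as the mean of
    the per-tooth sample variances (assumes equal repeats per tooth).
    """
    within_var = sum(variance(row) for row in data) / len(data)
    return 1.96 * sqrt(2) * sqrt(within_var)


def interpret_icc(icc):
    """Map an ICC onto the poor/moderate/good/excellent bands in the text.

    Boundary values (exactly 0.75 or 0.9) are assigned to the lower
    band here; that choice is an assumption.
    """
    if icc < 0.5:
        return "poor"
    if icc <= 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Hypothetical GT values in mm (T1, T2, T3) for three teeth; not study data.
example = [[0.90, 0.95, 0.85], [1.10, 1.00, 1.05], [0.70, 0.75, 0.80]]
print(round(repeatability_coefficient(example), 3))
print(interpret_icc(0.849))  # prints good
```

The factor √2 arises because the difference between two measurements of the same tooth has twice the within-subject variance, so 95% of such differences are expected to fall within the RC.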

Results

Fourteen participants, nine females and five males, aged between 24 and 37 years, underwent US scans. In total, 224 teeth were scanned, and one tooth scan was excluded from the study because it did not show the surrounding alveolar bone. As a result, 223 tooth scans were used in the investigation.

The mean ABC-CEJ distance for all measures was 3.016 mm (± 0.807) at T1, 3.056 mm (± 0.789) at T2, and 3.026 mm (± 0.790) at T3. The MAD among all ABC-CEJ measurements was 0.610 mm (± 0.508), and the RC was 0.648 mm (Table 1). The mean GT for all measures was 0.904 mm (± 0.299) at T1, 0.891 mm (± 0.301) at T2, and 0.892 mm (± 0.303) at T3. The MAD among all repetitions of the GT measurements was 0.224 mm (± 0.200), and the RC was 0.327 mm (Table 2). The mean ABT for all measures was 0.309 mm (± 0.094) at T1, 0.311 mm (± 0.090) at T2, and 0.308 mm (± 0.087) at T3. The MAD among the ABT measurements was 0.067 mm (± 0.060), and the RC was 0.121 mm (Table 3).

Table 1 ICC and means (with standard deviations) of ABC-CEJ measurements in T1, T2, and T3
Table 2 ICC and means (with standard deviations) of GT measurements in T1, T2, and T3
Table 3 ICC and means (with standard deviations) of ABT measurements in T1, T2, and T3
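As a worked check on the reported RC values, the formula from the statistical analysis (RC = 1.96 × √2 × within-subject SD) can be inverted to show the within-subject SD each RC implies. The sketch below uses only the three RC values reported in the Results; the within-subject SDs it prints are derived here and are not reported in the study, and the function name is hypothetical.

```python
from math import sqrt

# RC values in mm as reported in the Results; no other study data is used.
reported_rc_mm = {"ABC-CEJ": 0.648, "GT": 0.327, "ABT": 0.121}


def implied_within_subject_sd(rc):
    """Invert RC = 1.96 * sqrt(2) * SD_within to recover SD_within."""
    return rc / (1.96 * sqrt(2))


for name, rc in reported_rc_mm.items():
    sd = implied_within_subject_sd(rc)
    print(f"{name}: RC = {rc:.3f} mm, implied within-subject SD = {sd:.3f} mm")
```

Under this formula, the implied within-subject SDs are approximately 0.234 mm for ABC-CEJ, 0.118 mm for GT, and 0.044 mm for ABT.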

ICC results for ABC-CEJ between T1, T2, and T3 of all teeth showed excellent repeatability: 0.917 (0.897, 0.933) (Table 1). Results for the different tooth groups varied between excellent repeatability (upper incisors, upper canines, upper premolars, and lower canines) and good repeatability (lower incisors and lower premolars). The best ABC-CEJ result was seen in the lower left canine: 0.970 (0.928, 0.989) (Fig. 4), and the worst was in the lower right central incisor: 0.781 (0.555, 0.916) (Fig. 5).

Fig. 4 ABC-CEJ measurements of lower left canine (tooth 33)

Fig. 5 ABC-CEJ measurements of lower right central incisor (tooth 41)

ICC results for GT between T1, T2, and T3 of all teeth showed good repeatability: 0.849 (0.816, 0.878) (Table 2). Results for the different tooth groups varied between excellent repeatability (upper premolars), good repeatability (upper incisors, upper and lower canines, and lower premolars), and moderate repeatability (lower incisors). The best GT result was seen in the upper right premolar: 0.941 (0.864, 0.979) (Fig. 6), and the worst was in the lower right lateral incisor: 0.345 (0.027, 0.678) (Fig. 7).

Fig. 6 GT measurements of upper right premolar (tooth 14)

Fig. 7 GT measurements of lower right lateral incisor (tooth 42)

ICC results for ABT between T1, T2, and T3 of all teeth showed good repeatability: 0.790 (0.746, 0.898) (Table 3). Results for the different tooth groups varied between good repeatability (upper and lower incisors, upper and lower canines, and upper premolars) and moderate repeatability (lower premolars). The best ABT result was seen in the upper right canine: 0.899 (0.776, 0.963) (Fig. 8), and the worst was in the lower right canine: 0.507 (0.188, 0.780) (Fig. 9).

Fig. 8 ABT measurements of upper right canine (tooth 13)

Fig. 9 ABT measurements of lower right canine (tooth 43)

Discussion

The present study described the repeatability of intraoral US scanning to further validate it as a tool in periodontium assessment. Variations in measurements with US can be attributed to various sources of error. The terms “reliability,” “repeatability,” and “reproducibility” are all related in describing the sources of measurement error. According to Bartlett and Frost, reliability relates to the inherent variability due to the measurement instrument and operator error, repeatability relates to the variation in repeat measurements made on the same patient under identical conditions within short time intervals, and reproducibility refers to variation in measurements under changing conditions [21]. Several recent studies have characterized intraoral US use in anatomical evaluation around teeth [18, 23,24,25]. Two studies investigated the inter-rater reliability of US for alveolar bone level, alveolar bone thickness, and gingival thickness and concluded that US has good reliability [16, 17]. Other studies have compared measurements conducted with US against radiographic and clinical methods. One study compared US to direct measurements of soft tissue height, gingival thickness, and alveolar bone level and found good, moderate, and excellent correlations, respectively [25]. US and radiographic estimations of gingival recession and alveolar bone level were also found to have over 90% correlation [23]. Two other studies compared US and clinical assessments of gingival thickness and pocket depth, finding a high correlation [18, 24]. The present study found similar means and standard deviations across the repeated scans. The repeatability coefficients indicated that similar results are expected from repeated US assessments, and ICC results were good to excellent for all measurements. Our results contribute to the body of literature by characterizing US imaging as a repeatable method for assessing the periodontium.

The alveolar bone level was measured as the ABC-CEJ distance, which yielded excellent ICC scores between T1, T2, and T3. The RC indicated that in 95% of samples, repeated measurements of ABC-CEJ distance would result in differences of less than 0.648 mm. This suggests repeated US scans of the same tooth yield similar ABC-CEJ measurements. The ABC-CEJ measurement can be compared to the clinical attachment loss (CAL) measurement currently used in clinical periodontitis diagnosis [4]. Results showed that tooth 33 had the highest ICC score and tooth 41 had the lowest. It is worth noting that the alveolar bone around the canines is thicker than the alveolar bone around lower incisors. Thicker alveolar bone makes it easier to identify the ABC in US images, which could be a source of the difference in ICC scores between these types of teeth.

Results for gingival thickness showed good ICC results between the three times. In a previous study, gingival thickness was found to have excellent reliability in inter- and intra-rater ICC [16]. The present results suggest some variance in repeating the gingival thickness measurement in three different images of the same tooth. This could be attributed to a difference in pressure applied to the soft tissues by the operator while scanning at different times, which could compress the gingiva and lead to different gingival thickness measurements. However, the RC for GT showed that repeated measurements would have a difference of less than 0.327 mm in 95% of samples. The best repeatability for GT was found in tooth 14, and the worst repeatability in tooth 42. This could be attributed to premolars having thicker gingiva than lower incisors, making the soft tissue thickness less likely to be impacted by compression.

Alveolar bone thickness had the lowest ICC scores among the measurements investigated. However, it still showed good repeatability. The RC for ABT showed that repeated measurements would have a difference of less than 0.121 mm in 95% of samples. The difference in repeating ABT measurements with US could be due to the lack of a protocol for positioning the transducer during scanning. ABT measurement outcomes may vary based on slight changes in transducer positioning. The best ICC score for ABT was found in tooth 13 and the worst in tooth 43. This could be attributed to the difference in depth of vestibules between upper and lower arches, as the deeper vestibule of the upper arch allows easier US scanning.

The American Academy of Periodontology has suggested that an acceptable measurement error for intra-examiner reliability for probing depth and clinical attachment loss measurements with a periodontal probe is 1 mm [26, 27]. Several factors play a role in the clinical measurement error, including operator skill, probing pressure, probe type, tissue inflammation, and measurement site [27]. US assessment, similar to clinical probing, can be influenced by the aforementioned factors. However, our findings indicate that the ABC-CEJ measurement using US imaging has a MAD of 0.6 mm and an RC of 0.648 mm, both below the 1 mm threshold. This suggests that the margin of error in the measurement obtained from US imaging is acceptable. Therefore, there is potential in implementing US as a chairside diagnostic tool that could assist in, for example, alveolar bone level assessment before periodontal surgeries, which currently relies on CBCT and is subject to errors in estimation and radiation exposure [7, 8].

The main limitation of the present study is that there is currently no established protocol for determining the optimal pressure that operators should apply during US scans to maintain image quality. This is difficult to address since there is no clinically available method for quantifying the pressure applied by the operator during scans, which means that the pressure may vary between scans. Validating a US scanning protocol could improve the repeatability of gingival and alveolar bone thickness measurements. Moreover, because US is operator dependent, the present results should be interpreted considering that the operator who conducted the scanning process had three years of experience with US. The results of the current study call for further research into intraoral US capabilities and scanning protocols. This includes investigating the accuracy of US scanning compared to direct measurement of the alveolar bone level and gingival depth, as well as the standardization of operator-induced pressure.

Conclusion

The results of the present study indicated excellent repeatability for alveolar bone level measurements across three scans of the same patient and good repeatability for gingival and alveolar bone thickness measurements. ICC scores were also supported by RC results, which showed that minimal differences would be expected when these measurements are repeated. The characterization of US as a repeatable method can bring it closer to clinical implementation as an additional non-ionizing diagnostic tool. The introduction of US in routine periodontal assessment could improve diagnosis and planning and potentially reduce the risk of tooth loss in periodontitis cases, improving patient care.