Introduction

Facial beauty is a crucial factor for psychosocial well-being and success; hence, facial beautification is the main reason for dental and orthodontic treatments1,2,3,4. This warrants studies on predictors of beauty.

Esthetic preferences can be influenced by various factors related to the judges (such as their demographic, psychological, and educational variables5,6), the facial features of the judged person2,3,4,7,8,9,10, and the esthetic assessment methods, e.g., questionnaire types, laterality, sample sizes, bias reduction methods, statistical analyses, and many other methodological parameters. Of these 3 groups of esthetic determinants, the facial features of the person might be the most important to orthodontists and other clinicians such as prosthodontists and plastic or maxillofacial surgeons. An attractive face and smile result from harmony among various facial features11. Smile esthetic parameters include smile width, incisor display, smile symmetry index, buccal corridor, and the relationship between the dental midline and the facial midline12. This study simultaneously assesses, for the first time in the literature, the esthetic effects of 5 anatomical parameters: midline deviation and its direction, midline roll (angular deviation) and its direction, and facial form.

Midline deviation is one of the crucial factors in smile beauty. The dental and facial midlines should ideally be parallel and in line with each other13,14. However, some other studies have suggested that the dental midline should not be in the center of the mouth, because it creates a false smile15. It has also been stated that in the majority of the normal population, there is a difference between the dental and facial midlines16. There is no agreement in this regard, and some aspects remain unstudied (like differences between the opinions of various expert groups); besides, the side of midline deviation may matter but remains understudied. Another anatomical factor affecting smile beauty is midline roll. Identifying the angular deviation of the dental midline may be easier for laypeople than identifying its linear displacement17. In this regard, the literature is even scarcer, with more aspects remaining unstudied or understudied. The acceptability of the dental midline can be affected by other factors as well. The type of face and the symmetry of facial structures (such as the nose and chin) may affect people’s perception of the dental midline position and lower the detection threshold18,19. One of these anatomic parameters is the overall facial form, which is characterized by three basic outlines: dolichofacial (leptoprosopic), brachyfacial (euryprosopic), and mesofacial (mesoprosopic). A dolichofacial face is long and narrow, while a brachyfacial face is wide, short, and globular. A mesofacial face shape is defined by characteristics between these two20.

It seems that the overall facial form may also interact with the perception of smile esthetics. For instance, in men, the mesoprosopic facial form may be more acceptable for midline deviation, while in women, the mesoprosopic shape has the lowest acceptable threshold for midline deviation18. This warrants future studies that take into account facial form while evaluating other aspects of smile.

To date, no study has assessed the simultaneous effect of these 5 anatomical variables (dental midline and a combination of its linear or angular deviations –each with its own sidedness– together with different facial forms) on beauty. Moreover, no study has assessed midline anomalies in the context of the whole face. Furthermore, many other aspects remain unstudied, including (but not limited to) the combinations of these factors with judges’ demographics and education levels, as well as the most beautiful ranges of these 5 anatomical features according to different groups of experts or laypeople. Therefore, and considering the undeniable importance of facial and smile beauty, this very comprehensive study was conducted.

Its goal was to assess (1) the effects of several variables (the abovementioned 5 anatomical features as well as observers’ expertise [3 groups], age, sex, and experience) on facial beauty, (2) these effects within each expertise group, (3) these effects within each facial form, and (4) the ideal composite faces, using a very large sample which was unique to the literature. The null hypotheses were that none of the independent variables had any effect and that the composite photographs did not differ from one another.

Materials and methods

This was a large prospective, diagnostic and psychometric study on 15,042 esthetic preferences of 327 individuals (i.e., judges or referees of images) towards 46 facial perceptometric photographs of a female photomodel. The model had signed written informed consent allowing the researchers to show her images (either original or photo-manipulated) to the study judges and to publish them within the article without masking any part of her face1. All the participants (judges) also agreed to participate in the anonymous online survey. The protocol and its ethics were approved by the Institutional Review Board (IRB) of the Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran (ethics code: IR.AJUMS.REC.1400.564). All methods were performed in accordance with the relevant guidelines and regulations (including the Declaration of Helsinki). The first page of the questionnaire was an informed consent form that asked the participants to begin the survey only if they consented to participate; the need for signed informed consent from the survey participants was waived by the same IRB.

Sample size

The G*Power software (version 3.1.9.2, Heinrich Heine University Düsseldorf, Düsseldorf, Germany21) was used to calculate the sample size. The study had 3 groups. To obtain a 95% power, 32 subjects were required in each group (confidence interval (CI) = 95%, α = 0.05, β = 0.05) based on the parameters reported in previous studies on factors contributing to facial esthetics4,22.

We did not limit ourselves to the above number (i.e., 96 judges); the sampling continued after reaching this size in order to improve the reliability of the results and also to ensure adequate power for any additional analyses. Finally, 327 participants were enrolled (detailed below). The final sample size was about 3.4 times the size needed to obtain a power of 95%. All 3 groups exceeded the minimum number of 32 subjects per group.

Eligibility criteria

A female model who met the eligibility criteria was selected by a panel of orthodontists. The criteria included a mesoprosopic normal facial pattern (neither brachyfacial nor dolichofacial) without any problems and without skeletal class II or III malocclusions, adequate incisor display in the smile (100%), overall facial symmetry in the frontal view, an absence of any dental abnormalities, crowding, or spacing, an absence of any craniofacial syndromes, no history of any facial cosmetic procedures, and an age between 18 and 30 years. The photo-model selected in this study was 22 years old. The inclusion criteria for the participants were being above 18 years of age, being either laypeople or specialists in orthodontics, prosthodontics, or restorative dentistry, and consenting to participate. The exclusion criteria were any incomplete or inappropriately filled questionnaires. There was no age restriction for the participants.

The original photograph

The frontal view of the smile was photographed with a Canon 90D camera (Canon, Tokyo, Japan) with a 50 mm lens, F22, ISO 100, and a flash. The camera was set at the same level as the eyes, and the head of the model was placed in the natural head position (NHP). To achieve NHP, the model was asked to move her head up and down several times and then hold it in a position in which she felt comfortable. A very small black triangle between the central incisors was filled with the color of the neighboring gingiva.

Perceptometric image sets with controlled variable morphologies

A total of 45 standardized perceptometric photographs were created by editing the original image using Adobe Photoshop CC (version 20.0.2.30, San Jose, California, USA). These represented different combinations of facial indexes with midline deviations (in mm) or midline angular deviations (or rolls, in degrees). Because of the very large size of the current survey, it was not practical to add images that had both types of deviations (i.e., midline deviations in mm and midline rolls in °) simultaneously. Therefore, every image with a midline shift had either a midline deviation (in mm) or a midline roll (in °), but not both. In other words, there were no images in which the midline was deviated and angulated at the same time; including such images would have made the questionnaire so long and tiresome that it would have been impossible to answer. Proper statistical analyses were adopted to address this through evaluating the interactions of midline linear and angular deviations.

Facial index

The nasion (the point on the soft tissue where the frontonasal suture intersects the midsagittal plane) and the gnathion (the lowest point on the lower border of the mandible) were determined. One examiner identified the zygion and gonion anthropometric landmarks and the mandibular plane inclination on both digital photographs; two experienced orthodontists then confirmed them. Any disagreement between the specialists was to be settled through discussion by the two orthodontists with a third one. The bizygomatic width was measured as the distance between the outermost points on the zygomatic arches (the distance between the left and right zygions). The facial height was measured as the distance between nasion and gnathion. The facial index was defined as the ratio of the anterior facial height to the bizygomatic width, expressed as a percentage. This index was used to create two new images (Fig. 1), based on the following definitions of facial patterns: euryprosopic (brachyfacial, short and wide face): facial indexes smaller than 83.9%; mesoprosopic (mesofacial; round face): 84% to 87.9%; and leptoprosopic (dolichofacial, long face): larger than 88%20,23.

Fig. 1

The original image (b), the brachyfacial face (a), and the dolichofacial face (c). The midlines have no linear or angular shifts.

The original image of the photomodel had a facial index of 86% (i.e., mesoprosopic). It was used to create two new images, one with a facial index of 80% (euryprosopic), and another one with a facial index of 94% (leptoprosopic).
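The classification of the three facial forms can be illustrated with a minimal sketch. It assumes that the facial index is the nasion-gnathion height divided by the bizygomatic width, times 100 (the orientation under which a long face scores higher, as the cut-offs above require). The function names, the half-open interval boundaries at 84 and 88, and the millimetre measurements in the usage lines are illustrative assumptions, not values from the study.

```python
def facial_index(facial_height_mm: float, bizygomatic_width_mm: float) -> float:
    """Facial index in percent (assumed here: height / width * 100)."""
    return 100.0 * facial_height_mm / bizygomatic_width_mm


def classify_facial_form(index_percent: float) -> str:
    """Map a facial index to a facial form.

    Assumption: the reported cut-offs (83.9%, 84% to 87.9%, 88%) are
    treated as half-open intervals so that no value falls in a gap.
    """
    if index_percent < 84.0:
        return "euryprosopic"   # brachyfacial, short and wide
    if index_percent < 88.0:
        return "mesoprosopic"   # mesofacial
    return "leptoprosopic"      # dolichofacial, long


# Hypothetical measurements chosen only to give an index of 86%.
example_index = facial_index(111.8, 130.0)

# The three indexes reported for the study images.
study_forms = {fi: classify_facial_form(fi) for fi in (80, 86, 94)}
```

Under these assumptions, the three study images (80%, 86%, and 94%) fall into the euryprosopic, mesoprosopic, and leptoprosopic classes, respectively.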

Deviation of the dental midline

All the above-mentioned 3 photographs had dental midlines that were without any deviations or angular shifts, i.e., they were ‘on’ midlines. Each of them was used to create 8 new images with dental midlines that were ‘off’ (a total of 24 new images), with midlines shifted to the right or left sides of the photomodel in increments of 1 mm. For this purpose, the midline of the face was determined as a line that divides the face into two equal halves of left and right, and passes through the glabella, the tip of the nose, the middle point of the philtrum of the upper lip, and the middle point of the chin (and is perpendicular to the line between the pupils). The photomodel’s dental midline remained upright but was shifted to a side, i.e., 8 different states with 1 mm intervals: upright dental midlines with 1-, 2-, 3-, and 4-mm deviations to the right; and upright midlines with 1-, 2-, 3-, and 4-mm deviations to the left. After each midline alteration, the dentition was photo-edited to keep the right and left buccal corridors equal (Figs. 2, 3 and 4).

Fig. 2

Midline deviations on the mesofacial face. Midline deviations to the model’s left and right sides are sorted on the top and bottom rows, respectively. In the top row, the most severe left-sided midline deviation is on the left (a). The mildest left-sided deviation is image d. In the bottom row, the most severe right-side midline deviation is on the right (h). The mildest right-sided deviation is image e. The extent of alteration in deviation severity between each two images within each row is 1 mm.

Fig. 3

Midline deviations on the brachyfacial face. Midline deviations to the model’s left and right sides are sorted on the top and bottom rows, respectively. In the top row, the most severe left-sided midline deviation is on the left (a). The mildest left-sided deviation is image d. In the bottom row, the most severe right-sided midline deviation is on the right (h). The mildest right-sided deviation is image e. The extent of change in deviation severity between each two images within each row is 1 mm.

Fig. 4

Midline deviations on the dolichofacial face. Midline deviations to the model’s left and right sides are sorted on the top and bottom rows, respectively. In the top row, the most severe left-sided midline deviation is on the left (a). The mildest left-sided deviation is image d. In the bottom row, the most severe right-side midline deviation is on the right (h). The mildest right-sided deviation is image e. The alterations in midline deviation are 1 mm per image.

Angulation (roll) of the dental midline

From each of the 3 images with different facial indexes, 6 new images were created (a total of 18 new images). Again, the facial midline was determined as detailed above. The angulation of the dental midline was altered from zero (upright, original) to 6 different angulations on the frontal plane: by 5°, 10°, 15° to the model’s right and left sides compared to the facial midline, so that 6 different midline rolls were created: midlines with a tilt to the right at an angulation of 5°, 10°, and 15°; and midlines with a left tilt of 5°, 10° and 15° (Figs. 5, 6 and 7).
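The image set described so far can be enumerated as a quick consistency check on the reported counts. The sketch below is illustrative only: the tuple layout and the sign convention (negative for the model's left, positive for her right) are assumptions made here, while the facial indexes, increments, number of judges, and the single repeated question come from the text.

```python
# Each image is (facial index in %, midline deviation in mm, midline roll in degrees).
# Assumed sign convention: negative = model's left, positive = model's right.
FACIAL_INDEXES = [80, 86, 94]
DEVIATIONS_MM = [-4, -3, -2, -1, 1, 2, 3, 4]
ROLLS_DEG = [-15, -10, -5, 5, 10, 15]

images = []
for fi in FACIAL_INDEXES:
    images.append((fi, 0, 0))                          # 'on' midline
    images.extend((fi, d, 0) for d in DEVIATIONS_MM)   # linear shifts only
    images.extend((fi, 0, r) for r in ROLLS_DEG)       # angular shifts only

n_images = len(images)        # 3 * (1 + 8 + 6)
n_questions = n_images + 1    # one image is shown twice
n_judges = 327
n_ratings = n_judges * n_questions

print(n_images, n_questions, n_ratings)
```

The design is deliberately not a full factorial: no tuple has a non-zero deviation and a non-zero roll at the same time, which keeps the questionnaire at 46 questions instead of 3 × 9 × 7 = 189.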

Fig. 5

Midline angulations (rolls) on the mesoprosopic face. Midline angulations to the model’s left and right sides are sorted in the top and bottom rows, respectively. In the top row, the most severe left-sided midline roll is on the left (a). In the bottom row, the most severe right-sided midline roll is on the right (f). The extent of change in midline angulation between each two images within each row is 5°.

Fig. 6

Midline rolls on the euryprosopic face. Midline angulations to the model’s left and right sides are sorted in the top and bottom rows, respectively. In the top row, the most severe left-sided midline roll is on the left (a). In the bottom row, the most severe right-sided midline roll is on the right (f). The alteration in angulation severity between each two images is 5°.

Fig. 7

Midline rolls on the leptoprosopic face. Midline angulations to the model’s left and right sides are sorted in the top and bottom rows, respectively. In the top row, the most severe left-sided midline roll is on the left (a). In the bottom row, the most severe right-sided midline roll is on the right (f). The extent of change in midline roll between each two images is 5°.

Esthetic assessments

Raters

There were 3 ‘occupation’ groups. More than a thousand individuals were invited (either in person or online) to participate in this survey, of whom 407 agreed. After excluding the incomplete or inappropriate responses, 327 participants remained: 243 laypeople, 49 orthodontists, and 35 ‘prosthodontists or specialists in restorative dentistry’. The latter group consisted of 11 prosthodontists and 24 specialists in restorative dentistry; they were considered one group (named prostho-resto) because of their very similar education, specialties, and practices in the country.

Randomized survey

This anonymous randomized survey was designed by three orthodontists. The online questionnaire asked for the demographics (age, sex) and experience (in years) of the participants and for their rating of the beauty of each of the images explained above, plus one repeated image. The participants were asked to look at each image and select an esthetic score between 0 and 10 on an 11-point numeric rating scale (NRS), with zero as the least appealing face and 10 as the most beautiful face. The order of images was randomized. There was no time limitation. It was not possible to scroll back to the previous image. The original image was not known to the judges. One of the images/questions was randomly selected (midline deviation = 2 mm, tilt = 0°, FI = 94%) and repeated.

Survey reliability and intra-observer agreement

The internal consistency of the survey (across all the 46 questions) was determined as excellent: Cronbach Alpha = 0.972, 95% CI = 0.968–0.976, P < 0.00000005. The intraclass correlation coefficient (ICC) showed a very high intra-observer agreement between the beauty scores given to the two repeats of one of the questions by the 327 participants (ICC = 0.852, 95% CI = 0.817–0.881, P < 0.00000005).
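For readers unfamiliar with the internal-consistency statistic, Cronbach's alpha can be sketched from its textbook formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of questions. The function below is a generic illustration and not the SPSS routine used in the study; the small score matrix is invented solely to exercise the code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a rater-by-item score matrix.

    scores: list of rows, one row per rater, one score per item.
    Uses sample variances (n - 1 in the denominator).
    """
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented toy data: 4 raters scoring 3 images on the 0-10 scale.
toy_scores = [
    [7, 8, 7],
    [5, 5, 6],
    [9, 9, 8],
    [3, 4, 3],
]
toy_alpha = cronbach_alpha(toy_scores)
```

In the study itself the matrix would have 327 rows and 46 columns; values close to 1, such as the reported 0.972, indicate that raters who score one image high tend to score the others high as well.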

Statistical analysis

Descriptive statistics were computed for each of the 46 questions in various occupations. The average of the repeated question was calculated as well to be used later. The sample was checked in terms of the balance of men and women in different groups, using a chi-square test. It was also checked for the balance of age in different groups, using a one-way analysis of variance (ANOVA). The sample normality was checked and confirmed using the central limit theorem as well as histograms and q-q plots. The software in use was SPSS 25 (IBM, Armonk, NY, USA). The level of significance was set at 0.05 for the main analyses. For the 6 subgroup analyses, it was adjusted to 0.0083 using the Bonferroni method. For the t-tests and one-way ANOVAs comparing the scores of different groups of judges for each of the 45 images, the level of significance was adjusted to 0.0011 using the Bonferroni method.

It should be noted that when the level of significance is adjusted for a Bonferroni post hoc test using the Bonferroni method, the multiple comparisons problem is corrected twice for that post hoc test: once for the number of regressions (using the Bonferroni method), and a second time for the number of post hoc tests performed after each of those regression analyses (using the Bonferroni post hoc correction).
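The adjusted significance levels quoted in this section follow directly from dividing the nominal level by the number of tests. A minimal sketch of that arithmetic (the function name is an illustrative choice):

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


# 6 subgroup regressions (3 occupations + 3 facial forms).
subgroup_alpha = bonferroni_alpha(0.05, 6)

# One comparison per perceptometric image (45 images).
per_image_alpha = bonferroni_alpha(0.05, 45)

print(round(subgroup_alpha, 4), round(per_image_alpha, 4))
```

These reproduce the thresholds of 0.0083 (reported elsewhere in the text as 0.008) and 0.0011.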

Esthetic determinants and acceptable eccentricities (the zone of esthetic acceptability)

A hierarchical mixed-effects multiple linear regression followed by a Bonferroni post hoc test was used to assess the effects of each of the 5 dimensions of anatomic changes (i.e., facial forms, bidirectional midline deviation in mm, and bidirectional midline roll in °) on facial beauty, as well as the effects of the raters’ age (in years), sex (2 levels [male, female]), and occupation (3 levels [control laypersons, orthodontists, and prostho-resto group]) on the beauty scores they gave to the perceptometric photographs showing anatomic modifications. The model was optimized based on model fit and collinearity parameters.

Tolerable deviations and beauty determinants according to each specialty

In order to find out to what extent increasing the midline deviation or roll leads to significant changes in esthetic perception of the face, the hierarchical mixed-effects multiple linear regression was conducted for each of the occupation groups. The post hoc Bonferroni pairwise comparisons between the most appealing smiles and other forms would show the extent of modification that would lead to a significant drop in beauty scores. The non-significant pairwise comparisons between the modifications versus the most beautiful smile would mark the extent of deviation or roll, which judges would not mind (i.e., the zone of esthetic acceptability). The level of significance was adjusted for these 3 mixed-model regressions and their Bonferroni post hoc tests to 0.008, using the Bonferroni correction for multiple comparisons problem.

Acceptable midline deviations (angular or linear) within each facial form

This time, the hierarchical mixed-model multiple linear regression was performed separately for each of the 3 facial forms to identify the acceptable ranges for midline deviations or midline rolls within each facial type, which were determined using the post hoc Bonferroni pairwise comparisons between the most appealing smiles and other forms (as detailed above). The level of significance for these 3 mixed-model regressions and their Bonferroni post hoc tests was adjusted to 0.008, using the Bonferroni correction method.

The ideal esthetic zone: the most beautiful anatomic combinations

The ‘ideal esthetic zone’ was defined as the most appealing anatomic combination of facial forms and midline statuses (i.e., the perceptometric image with the highest average beauty score) as well as the other anatomic combinations (some other perceptometric images) that were not significantly different from the most beautiful one in terms of their esthetic scores. For determining it, first the most beautiful image was found and then its esthetic score was compared with each of the other images. This procedure was done once for the whole sample, and then for each group of occupation, separately: The most pleasing image was identified using a repeated-measures analysis of variance (RM-ANOVA). The follow-up pairwise comparisons to identify the ideal zone were performed using the Bonferroni post hoc test. In these analyses, the average of the two repeated questions was used instead of each of them. For each of the 45 perceptometric images, the scores given by men versus women were compared using an independent-samples t-test; this was repeated for different specialties. For these analyses, the level of significance was adjusted to 0.0011 using the Bonferroni method.
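The selection rule behind the ideal esthetic zone can be sketched as follows: find the image with the highest mean score, then keep every other image whose pairwise comparison against it is not significant. The function, the image labels, the mean scores, and the P values below are all invented for illustration; the study obtained its P values from Bonferroni post hoc tests after an RM-ANOVA, which this sketch does not reimplement.

```python
def ideal_esthetic_zone(mean_scores, pairwise_p, alpha=0.05):
    """Return (best_image, zone).

    mean_scores: dict mapping image label -> mean beauty score.
    pairwise_p:  dict mapping frozenset({a, b}) -> adjusted P value.
    The zone holds the best image plus every image that is not
    significantly different from it at the given alpha.
    """
    best = max(mean_scores, key=mean_scores.get)
    zone = [best]
    for image in mean_scores:
        if image != best and pairwise_p[frozenset((best, image))] >= alpha:
            zone.append(image)
    return best, zone


# Hypothetical inputs (NOT study results).
toy_means = {"on": 7.1, "1mm_right": 7.0, "4mm_left": 3.2}
toy_p = {
    frozenset(("on", "1mm_right")): 1.0,
    frozenset(("on", "4mm_left")): 0.00001,
    frozenset(("1mm_right", "4mm_left")): 0.00002,
}
toy_best, toy_zone = ideal_esthetic_zone(toy_means, toy_p)
```

With these invented inputs, the zone contains the ‘on’ image and the 1-mm right deviation, while the 4-mm left deviation is excluded.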

Results

Overall, 327 referees were included with a mean (SD) age of 30.44 ± 8.64 years (95% CI 29.50–31.38, minimum: 12, maximum: 66). Their mean experience in their respective fields was 5.24 ± 6.343 years (95% CI 4.55–5.93, minimum: 0, maximum: 34). There were 237 women and 90 men in total. The mean age (SD) of the 237 female judges was 29.83 ± 8.47 years (95% CI 28.75–30.92, minimum: 12, maximum: 62). The mean age of the 90 men was 32.06 ± 8.90 years (95% CI 30.19–33.92, minimum: 17, maximum: 66). The mean experience of the 237 women was 4.25 ± 5.747 years (95% CI 3.52–4.99, minimum: 0, maximum: 34). The mean experience of the 90 men was 7.84 ± 7.095 years (95% CI 6.36–9.33, minimum: 0, maximum: 30).
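The descriptive figures above are internally consistent, which can be checked with a short sketch. It recomputes the overall mean age as the sample-size-weighted mean of the women's and men's means, and rebuilds the 95% CI of the overall mean from the reported SD. The normal critical value of 1.96 is an assumption made here; the software may have used a t quantile, which differs only slightly at n = 327.

```python
from math import sqrt


def pooled_mean(groups):
    """Weighted mean from (n, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


def ci95(mean, sd, n, z=1.96):
    """Approximate 95% CI of a mean (normal approximation assumed)."""
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width


# Reported subgroup means: 237 women and 90 men.
overall_age = pooled_mean([(237, 29.83), (90, 32.06)])
overall_experience = pooled_mean([(237, 4.25), (90, 7.84)])

# Reported overall age: 30.44 ± 8.64 years, n = 327.
age_low, age_high = ci95(30.44, 8.64, 327)

print(round(overall_age, 2), round(overall_experience, 2))
print(round(age_low, 2), round(age_high, 2))
```

The recomputed values agree with the reported overall means (30.44 and 5.24 years) and with the reported CI of 29.50 to 31.38 years.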

There were 243 laypeople (182 women, 61 men), 49 orthodontists (29 women, 20 men), and 35 prosthodontists/restorative specialists (26 women, 9 men). The distributions of men and women in these 3 occupation groups were not significantly different from each other (chi-square, P = 0.078). The groups laypeople and orthodontists (but not prostho-resto) were balanced in terms of the similarity of the ages of men and women (P > 0.05, detailed in Table 1). In terms of experience, only men and women of the orthodontists group (but not laypeople or prostho-resto groups) had balanced years of experiences (P > 0.05, detailed in Table 1). The difference between the ages of the 3 occupation groups was significant (one-way ANOVA, P = 0.00016, Table 1); the difference between the years of experience of the 3 occupations was not significant (one-way ANOVA, P = 0.076, Table 1).
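The chi-square result for the sex distribution can be reproduced from the counts given above. The sketch assumes a Pearson chi-square test of independence without continuity correction on the 3 × 2 table; because that table has 2 degrees of freedom, the P value has the closed form exp(−χ²/2), so no statistics library is needed.

```python
from math import exp


def pearson_chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Rows: laypeople, orthodontists, prostho-resto; columns: women, men.
sex_by_occupation = [[182, 61], [29, 20], [26, 9]]

chi2 = pearson_chi_square(sex_by_occupation)
# df = (3 - 1) * (2 - 1) = 2, for which the survival function is exp(-x / 2).
p_value = exp(-chi2 / 2)

print(round(chi2, 2), round(p_value, 3))
```

Under these assumptions the statistic is about 5.11 and the P value rounds to the reported 0.078.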

Table 1 Descriptive statistics and 95% CIs for ages (years) and experiences (years) of judges in different subgroups.

Determinants of beauty; acceptable anatomic ranges

Table 2 presents descriptive statistics and 95% CIs for beauty scores given to each of the 45 photographs by female or male referees from each of the 3 specialties. The hierarchical mixed-effects multiple linear regression model’s parameters were as follows: n = 15,042, level of significance = 0.05, -2 Restricted Log Likelihood = 59,354.064; Akaike's Information Criterion (AIC) = 59,448.064; Schwarz's Bayesian Criterion (BIC) = 59,805.932. According to the mixed-model multiple linear regression, the referees’ specialty, age, or experience were not associated with their beauty scores (Fig. 8, Tables 3 and 4). However, the judges’ sex (men tended to give higher scores) and the photomodel’s midline deviation, midline roll, and facial index were all associated with beauty scores (Fig. 8, Tables 3 and 4). The interactions between the following variables were significant: the judges’ sex and occupation, the judge’s sex and the photomodel’s midline deviation, the judges’ job and the photomodel’s midline deviation, the judge’s sex and the photomodel’s midline roll, and the judge’s job and the photomodel’s midline roll (Tables 3 and 4).

Table 2 Descriptive statistics and 95% CIs for the beauty scores of each of the 45 perceptometric images according to laypersons, orthodontists, and prosthodontists combined with restorative specialists.
Fig. 8

Estimated marginal means (and 95% CIs) calculated using the hierarchical mixed-model multiple linear regression (n = 15,042) for analyzing the beauty scores affected by different independent variables. Except for the variable ‘specialty’, all other categorical variables became significant. The bars are colored on a continuum of colors from green to red to show (1) non-significant features as well as (2) more or less attractive features. Green bars indicate the most beautiful features, while red bars indicate the most unattractive ones. Within each variable, the same-color bars indicate features with estimated marginal means that are not significantly different from each other, according to the Bonferroni post hoc test. The green bars mark the zone of acceptability. Ortho, orthodontist; P-R, prosthodontist or specialist in restorative dentistry.

Table 3 The results of the optimized hierarchical mixed-effects multiple linear regression analysis (n = 15,042).
Table 4 Estimated marginal means and 95% CIs for the categorical variables within the hierarchical mixed-model multiple linear regression framework.

The zone of acceptability: According to the Bonferroni test (level of significance = 0.05), the mesoprosopic face looked the most appealing, while the euryprosopic face was the least attractive. The most beautiful smile was the one with an ‘on’ (no deviation) midline or those with only 1 mm of deviation, which were not significantly different from the ‘on’ midline. The greater the deviation, the less appealing the smile. A rather similar result was observed for smile roll, with the ‘on’ smile being perceived as beautiful, and the others as significantly less appealing than the ‘on’ smile (Fig. 8, Tables 4 and 5). Both midline deviation and midline roll seemed to be affected by the side of the deviation or roll: deviations or rolls to the right side were better tolerated by the judges compared with those to the left (Fig. 8, Tables 4 and 5).

Table 5 The Bonferroni post hoc test’s results for pairwise comparisons after the hierarchical mixed-effects multiple linear regression.

Subgroup analyses for beauty determinants and acceptable anatomical ranges

According to orthodontists

According to the hierarchical mixed-model multiple linear regression (n = 2254, level of significance = 0.008, -2 Restricted Log Likelihood = 8093.747, AIC = 8097.747, BIC = 8109.170), the orthodontists’ experience and age had a marginally significant role in their esthetic perception (Tables 6 and 7), and the effect of sex became marginally significant (Tables 6 and 7). The photomodel’s facial index, midline deviation, and midline roll were associated with her esthetics in the eyes of orthodontists (Tables 6 and 7).

Table 6 The results of the hierarchical mixed-effects multiple linear regression analyses conducted for each specialty separately.
Table 7 Estimated marginal means and 95% CIs for the categorical variables within the hierarchical mixed-model regression framework calculated for each specialty.

Esthetically acceptable zone: According to the Bonferroni test (level of significance = 0.008), the mesoprosopic face as well as the absence of any deviation or roll were the most appealing. In terms of FI, esthetics of the leptoprosopic face did not differ significantly from that of the mesoprosopic face (P = 0.242, Table 7) but the euryprosopic face was significantly less attractive than the mesoprosopic face (P = 0.0000001, Table 7) or the leptoprosopic face (P = 0.0002, Table 7). In terms of midline deviation’s esthetics, only deviations 1 mm to the right and left were not significantly different from the ‘on’ midline (both P values = 1.0, Table 7); the rest were significantly less attractive in the eye of orthodontists compared to the ‘on’ midline (all 6 P values ≤ 0.0002, Table 7). Regarding midline rolls, orthodontists found the ‘on’ midline the most attractive one; according to them, all other modifications to midline roll were significantly less attractive than the ‘on’ midline (all 6 P values ≤ 0.0005, Table 7).

According to prosthodontists and specialists in restorative dentistry

The hierarchical mixed-model multiple linear regression (n = 1610, level of significance = 0.008, -2 Restricted Log Likelihood = 5743.909, AIC = 5747.909, BIC = 5758.652) showed that sex, age, or experience of prosthodontists and restorative dentistry specialists did not matter (Tables 6 and 7). The photomodel’s facial index, midline deviation, and midline roll influenced the prostho-resto group’s esthetic judgement (Tables 6 and 7).

Acceptable zone: The mesoprosopic face as well as the absence of any deviation or roll were the most appealing features. Regarding facial index, the leptoprosopic face was not significantly different from the mesoprosopic face, although it was borderline significant (Bonferroni, P = 0.062, level of significance = 0.008, Table 7); the euryprosopic face was significantly less attractive than the mesoprosopic face (P < 0.00000005, Table 7) or the leptoprosopic face (P = 0.0007, Table 7). The prostho-resto group found only midline deviations 1 mm to the right and left as not significantly different from the ‘on’ midline (both P values ≥ 0.213, Table 7); they found the rest significantly less appealing than the ‘on’ midline (all 6 P values ≤ 0.00001, Table 7). According to them, the midline without any roll was the most attractive one; any midline rolls were significantly less appealing than the ‘on’ midline (all 6 P values ≤ 0.002, Table 7).

According to laypeople

The hierarchical mixed-model multiple linear regression (n = 11,178, level of significance = 0.008, -2 Restricted Log Likelihood = 45,352.467, AIC = 45,356.467, BIC = 45,371.107) showed that sex, age, or experience of laypersons was not a determinant of beauty judgement (Tables 6 and 7). The photomodel’s midline deviation, midline roll, and facial index affected laypeople’s beauty judgement (Tables 6 and 7).

The threshold of acceptability: The mesoprosopic face, the absence of any midline roll, and 1 mm midline deviation to the right were the most attractive. The mesoprosopic face was significantly more attractive than the other two (Bonferroni, level of significance = 0.008, both P values ≤ 0.00005, Table 7). The leptoprosopic face was better than the euryprosopic form (P < 0.00000005, Table 7). Neither the ‘on’ midline (P = 1.0, Table 7) nor the midline deviated 1 mm to the left (P = 0.060, Table 7) differed significantly from the best midline deviation (1 mm off to the right); both 1-mm deviations to the left and right sides were esthetically similar to the ‘on’ midline (both P values = 1.0, Table 7). The rest were significantly less appealing than the best image (all 6 P values < 0.00000005, Table 7) and also than the ‘on’ midline (all 6 P values < 0.00000005, Table 7). According to laypeople, the midline without any roll was the most appealing one; all rolls were significantly less beautiful than the ‘on’ midline (all 6 P values ≤ 0.010, Table 7).

For mesoprosopic faces

The hierarchical mixed-model multiple linear regression (n = 4905, level of significance = 0.008, -2 Restricted Log Likelihood = 19,555.677, AIC = 19,559.677, BIC = 19,572.665) showed that sex, age, or experience of participants did not influence their esthetic preferences (Tables 8 and 9), whereas the photomodel’s midline deviation and midline roll did (Tables 8 and 9).

Table 8 The results of the hierarchical mixed-model multiple linear regression analyses conducted for each facial form separately.
Table 9 Estimated marginal means and 95% CIs for the categorical variables within the mixed-model regression framework calculated for each facial form.

Zones of acceptability: The reviewers rated mesoprosopic faces with the ‘on’ midline (without any deviation) as the best one; the zone of acceptability also included deviations 1 mm to both sides (Bonferroni, level of significance = 0.008, both P values = 1.0, Table 9) as well as 3 mm deviation to the right (P = 0.491, Table 9). Regarding midline rolls, the zone of acceptability comprised the 5° roll to the right (the most pleasing feature), the ‘on’ midline without any roll, which was not significantly different from the top feature (P = 1.0, Table 9), and the 15° roll to the right, which differed only marginally from the best feature (P = 0.022, Table 9).

For euryprosopic faces

The hierarchical mixed-model multiple linear regression (n = 4905, level of significance = 0.008, -2 Restricted Log Likelihood = 19,392.003, AIC = 19,396.003, BIC = 19,408.991) showed that judges’ sex, age, or experience did not affect their esthetic preferences (Tables 8 and 9). However, the photomodel’s midline deviation and midline roll mattered (Tables 8 and 9).

Thresholds of tolerance: The referees rated the midline 1 mm off to the right as the most appealing one; the zone of acceptability also comprised the ‘on’ midline, the midline off to the left for 1 mm (Bonferroni, level of significance = 0.008, both P values = 1.0, Table 9) as well as 2 mm deviation to the right, which was only marginally significant (P = 0.030, Table 9). The zone of acceptability for midline roll consisted only of the ‘on’ midline, which was significantly more appealing than any midline rolls (all 6 P values ≤ 0.008, Table 9).
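The way a zone of acceptability follows from the post hoc P values can be illustrated with a minimal sketch. It assumes, as the text suggests but does not state formally, that the zone is the top-rated level plus every level whose comparison with it is not significant at the adjusted level of 0.008 (with P ≤ 0.008 counted as significant). The function name and labels are hypothetical; the P values are those reported above for midline deviations in euryprosopic faces.

```python
ALPHA = 0.008  # adjusted level of significance reported in the text

# P values reported above for euryprosopic faces, each comparing one
# midline deviation with the top-rated one (1 mm to the right).
p_vs_best = {
    "on (0 mm)": 1.0,
    "1 mm left": 1.0,
    "2 mm right": 0.030,
}


def zone_of_acceptability(best_label, p_values, alpha=ALPHA):
    """Return the best level plus every level not significantly different from it.

    Assumption: a comparison is significant when P <= alpha, so a level
    stays in the zone only when its P value is strictly above alpha.
    """
    zone = [best_label]
    zone += [label for label, p in p_values.items() if p > alpha]
    return zone


zone = zone_of_acceptability("1 mm right", p_vs_best)
print(zone)
```

Under this reading, the 2 mm deviation to the right (P = 0.030) remains in the zone because its P value exceeds 0.008, even though it would be significant at an unadjusted 0.05.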

For leptoprosopic faces

The hierarchical mixed-model multiple linear regression (n = 5232, level of significance = 0.008, -2 Restricted Log Likelihood = 21,080.125, AIC = 21,084.125, BIC = 21,097.243) did not show any effect for raters’ sex, age, or experience (Tables 8 and 9). However, the photomodel’s midline alterations affected their preferences (Tables 8 and 9).

Acceptable thresholds: The referees rated the midline 1 mm off to the right as the most appealing one; the zone of acceptability also included the ‘on’ midline (Bonferroni, level of significance = 0.008, P = 1.0, Table 9). The midline with no roll was the most appealing one; the rolls 5° to the left (P = 0.041, Table 9) and 15° to the right (P = 0.013, Table 9) were marginally significantly different from the ‘on’ midline.
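The information criteria reported for the laypeople model and the three facial-form models can be cross-checked with simple arithmetic. The sketch below assumes the common definition AIC = -2 × (restricted log-likelihood) + 2k, where k is the number of estimated parameters counted by the software; the text does not state which definition was used, so the implied k is an inference and not a reported result.

```python
# Reported (-2 restricted log-likelihood, AIC) pairs, copied from the text.
reported = {
    "laypeople": (45352.467, 45356.467),
    "mesoprosopic": (19555.677, 19559.677),
    "euryprosopic": (19392.003, 19396.003),
    "leptoprosopic": (21080.125, 21084.125),
}


def implied_parameter_count(minus_two_ll, aic):
    """Solve AIC = -2LL + 2k for k (assumed definition, see lead-in)."""
    return (aic - minus_two_ll) / 2


# Round to 3 decimals to absorb floating-point noise in the subtraction.
implied_k = {
    name: round(implied_parameter_count(m2ll, aic), 3)
    for name, (m2ll, aic) in reported.items()
}
print(implied_k)
```

Under that assumption, each reported AIC exceeds its -2 restricted log-likelihood by exactly 4, which corresponds to k = 2 in all four models.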

Ideal esthetic zone: most beautiful perceptometric images

The repeated-measures ANOVA (F = 68.150, P < 0.00000005) and the post hoc Bonferroni test showed that the esthetic zone consisted of 4 images (the best image and 3 images that were not significantly different from the best image, P > 0.05, Bonferroni): The most attractive image was a leptoprosopic face without any midline roll but with 1 mm midline deviation to the right. The second most attractive image was a leptoprosopic face without any midline roll or deviation. The third most attractive image was a mesoprosopic (mesofacial) face, without midline deviation, but with a 5° midline roll to the right. The fourth most attractive image was the original image (mesoprosopic, without midline alterations) (Fig. 9, Tables 10 and 11). The remaining images had esthetic scores significantly lower than that of the most beautiful image (P < 0.05, Bonferroni, Fig. 9, Tables 10 and 11).

Fig. 9

Estimated marginal means and 95% CIs for beauty scores pertaining to each image (n for each bar = 327) calculated using the RM-ANOVA. The green bars show the ideal esthetic zone. The yellow bars are above 6 and rather similar to each other, while at the same time not more than 1 point below the score of the best image. The red bars on the left are the least attractive image and the other 6 images that were not significantly different from it. Note: The perceptometric images are sorted in this figure only after data collection (for a better visualization); during the survey, the perceptometric images were randomized.

Table 10 Descriptive statistics and 95% CIs calculated within the framework of the RM-ANOVA. N for each image is 327 judges.
Table 11 A part of the results of the Bonferroni test comparing the most attractive image (deviation = 1 mm, roll = 0°, FI = 0.94) with all other images.

Differences between the sexes or specialties, in terms of each perceptometric image

The independent-samples t-test (with a level of significance adjusted to 0.001) did not detect any significant difference between the scores given by men and women to each of the 45 perceptometric images (Fig. 10, Table 2, all 45 P values ≥ 0.010). The one-way ANOVA (again at a significance level of 0.001) did not detect any significant difference between the scores given by the 3 specialty groups to each of the 45 perceptometric images (Fig. 10, Table 2, all 45 P values ≥ 0.006).
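The adjusted significance level used for these per-image comparisons can be related to the number of tests with a short calculation. The sketch below assumes a Bonferroni-type division of a nominal 0.05 by the 45 images, which this section does not state explicitly; the function name is hypothetical, and the smallest P values are those reported above.

```python
def bonferroni_alpha(nominal_alpha, n_tests):
    """Per-test significance level under a Bonferroni-type correction."""
    return nominal_alpha / n_tests


N_IMAGES = 45           # one comparison per perceptometric image
REPORTED_ALPHA = 0.001  # adjusted level of significance stated in the text

# Smallest P values reported above for each family of 45 comparisons.
smallest_p = {
    "sex (independent-samples t-test)": 0.010,
    "specialty (one-way ANOVA)": 0.006,
}

alpha = bonferroni_alpha(0.05, N_IMAGES)  # assumed nominal level of 0.05

# A family contains a significant result only if its smallest P value
# falls below the adjusted level.
any_significant = {name: p < REPORTED_ALPHA for name, p in smallest_p.items()}
print(round(alpha, 4), any_significant)
```

Dividing 0.05 by 45 gives roughly 0.0011, which is consistent with the reported level of 0.001; since the smallest P values in both families are well above it, no per-image difference reaches significance.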

Fig. 10

Mean beauty scores and 95% CIs for each of the images within different groups. Note: The perceptometric images are sorted in this figure only after data collection (for a better visualization); during the survey, the perceptometric images were randomized.

Discussion

Many aspects of the present study have not been examined before, such as the ‘ideal esthetic zone’, the simultaneous determination of beauty by a combination of a rather large range of bidirectional midline deviations, midline rolls, and facial indices, and the sides of midline deviations and rolls. Therefore, we are limited to discussing the aspects shared with similar previous studies; even in areas where other studies were available, their number was limited most of the time.

Our results showed that occupation, age, and level of experience had no significant effect on beauty scores. However, the referees’ sex had a small but significant effect on their overall views: men tended to give higher beauty scores to the female model. Further subgroup analyses revealed that such an effect of participants’ gender on their esthetic views might be related mostly to orthodontists, but not laypeople or prosthodontists and specialists in restorative dentistry. It should be noted that in the present study as well, when the beauty scores given by men and women to each perceptometric image were compared, the differences were not significant; the male–female difference emerged only when their opinions on all images were combined. In some other studies, the gender of judges did not affect their preferences19,24,25. In contrast, Zhang et al.26 reported that the judges' gender influenced their perception of beauty, such that female judges tended to consider a slight degree of midline deviation in their male models more attractive than no deviation26. Our results were relatively similar, as the most beautiful photograph had a slight midline deviation to the right. In general, midlines without deviation were the most beautiful in the present study, but slight deviations of 1 mm to either side did not differ significantly in beauty from the ‘on’ midline. Regarding the specialty and the level of experience of the judges, our result was not in line with other studies in this field13,19,25,27,28,29, which reported that dental professional training can affect the judgment of smile beauty28,30. We also did not find associations between the referees’ beauty preferences and their age or experience.

The present study found that overall (throughout the whole questionnaire), the best midline was the one without any deviation. The tolerable range extended to 1 mm deviation to either side. Subgroup analyses showed that in each of the 3 specialty groups (orthodontists, prosthodontists, and laypeople), a similar pattern existed and all groups identified the acceptable midlines as the one without deviation as well as the ones with 1 mm deviations to the right or left. In the expert groups, the ‘on’ midline was the most appealing of the 3; however, in laypeople, the most appealing midline was the one with 1 mm deviation to the right. Even the most beautiful perceptometric image was not the original one or a modification with zero midline deviation; it was an image with 1 mm deviation to the right. This might be in line with some authors suggesting that the dental midline should not be in the center of the mouth, since it creates a false smile15. In the literature concerning the tolerable midline deviation, there has been no general agreement on acceptable deviation. According to one study, the maximum amount accepted by orthodontists without reducing the beauty of a smile is one millimeter13. In three articles, it was also stated that this deviation is acceptable to orthodontists up to 2 mm19,25,31. In the male model of a study, the 1-mm midline deviation score was not significantly different from the baseline score; however, in their female model, in all midline deviation values, the beauty score was significantly reduced18. In another study, however, only a 4-mm deviation was associated with a decrease in the attractiveness of the smile according to orthodontists31. For laypeople, it has been suggested that they cannot detect the midline deviation up to 2.1 mm or even 4 mm13,32. Although the current study also detected significant interactions between the judges’ expertise and their perception of midline deviation, subgroup analyses showed a rather similar sensitivity of laypeople to midline deviations compared to dental experts. The controversy in the literature may be attributable to the many methodological parameters involved, such as the smaller sample sizes in previous studies as well as their different methods of detecting the sensitivity of judges.

In the current study, the side of midline deviation was also evaluated; only a few studies are available in this regard, and most previous studies examined midline deviations to one side only. Our findings showed that deviations to the right are clearly more acceptable than those to the left. The study by Mead and McLaughlin showed that right-handed people prefer photos with more important details on the right side of the image29. The effect of laterality has also been assessed in one study on facial profiles; it was found that if the profile was facing the left, referees’ zones of acceptability narrowed and also changed3. The authors explained the heightened accuracy when judging leftward profiles by the visuospatial dominance of the right hemisphere of the human brain, which results in left-lateralized attention3,33,34. Nevertheless, this contradicts our results showing that the judges became more demanding and fastidious when the flaw was on the right side of the image, i.e., the left side of the model. Perhaps the small shifts of the midline to the sides are not comparable to the very large jumps of facial profile components when flipping from the right to the left side of the image. Therefore, other factors may be at play in the case of midline shifts, including asymmetries in other facial components such as the nose, which make right-sided midline deviations more tolerable than left-sided ones. The other 2 studies evaluating both left- and right-sided deviations failed to differentiate between esthetic preferences towards midline deviations to the right and left sides28,35. Again, the reason for the dispute may be the numerous factors that potentially affect the perception of beauty and that differed among the studies. Also important are the sample size and the correctness of statistical analyses, as it has been shown that not all studies have a correct analysis.

Altering the angle of the dental midline may be esthetically acceptable only up to 10 degrees for laypeople and up to 6 degrees for orthodontists; furthermore, it is more acceptable if this angular modification is aligned with the direction of the deviation of the nose or chin36. On the other hand, a linear displacement of the dental midline of less than 2 mm (without angular roll) is esthetically accepted by laypeople32. In the present study, 5 degrees of midline roll was enough for the participants (either as a whole or within each of the occupation groups) to reduce their beauty scores. Previously, it was suggested that 10 degrees of midline deviation was considered unacceptable by 68% of orthodontists and 41% of laypeople; the average acceptance threshold of orthodontists for male and female models was 6.6 and 6.4 degrees, respectively, while according to laypeople, this value was 7.10 and 0.10 degrees, respectively37. In another study, it was stated that 5 degrees of midline rotation can be recognized by orthodontists and ordinary people38. Again, we found that midline rolls to the right side of the model, or the left side of the image, were better tolerated than left-sided midline rolls appearing on the right side of the image. No study was available with which to compare our results in this regard. Yet the explanations presented above for the better tolerance of right-sided midline deviations might hold for midline rolls as well. This warrants future studies.

Our findings showed that overall (all images combined), the mesoprosopic facial form was the most appealing one; however, the two top-rated images among all 45 were both dolichofacial (leptoprosopic). In another study, the mesoprosopic facial form had the lowest beauty score for the female model but the highest for the male model18. For their male model, the acceptance threshold for the mesoprosopic face shape was significantly higher than for the leptoprosopic and euryprosopic facial shapes, but for their female model, the acceptance threshold was significantly lower than for the other two groups18. Clinically, this indicates a higher tolerance for midline deviation in mesoprosopic men18. The influence of facial dimensions on the perception of midline deviations, or vice versa, has been assessed in one study18 that found significant effects, although it did not take into account midline roll. In our study, euryprosopic faces were the least attractive ones according to all 3 groups. Leptoprosopic faces were in the middle; there was a difference between ‘occupation’ groups in this regard: laypeople saw leptoprosopic faces as significantly less attractive than mesoprosopic faces; however, according to both groups of dental experts, the beauty of leptoprosopic faces was not significantly different from that of mesoprosopic ones.

This study was limited by some factors. Due to practical limitations, it lacked a male photomodel, like many other studies. Doubling the number of questions within this already-large questionnaire would have considerably reduced the number of judges volunteering to participate; therefore, we limited the questionnaire to female images only. Another limitation was that the study did not include smaller increments of change, such as 2 degrees or 0.5 mm; however, that as well would have considerably increased the number of photographs. Future studies can fine-tune our results by creating much smaller modifications within the zones of acceptability of our study (i.e., 1 mm of midline deviation and 5 degrees of midline angulation). The strengths of this research were its very large sample and the advanced statistical analyses adopted. Besides, we took care to strictly correct for the multiple comparisons problem at two levels. Moreover, the order of perceptometric images was randomized to reduce error caused by subjective bias and human fatigue.

Conclusions

After strictly correcting for and controlling any ‘multiple comparisons problems’, it can be concluded that:

  1. The 5 anatomic features of the photomodel all contributed to esthetic judgements, either in the whole sample or in all subgroup analyses. The zone of tolerability (or the tolerance threshold) was determined for midline deviations to include the no-deviation (‘on’) midline followed by 1 mm deviations to the right and left sides. Larger deviations would severely reduce esthetics. For midline rolls, the only tolerable form was the no-roll (‘on’) midline; even 5 degrees of midline roll would be noticed. In both of these, referees had an interesting tendency to tolerate shifts to the right side of the model more than ones to her left. The most beautiful facial form was mesoprosopic (mesofacial), followed by leptoprosopic (dolichofacial).

  2. The zones of acceptability mentioned above for the whole sample were almost identical for orthodontists, prosthodontists/restorative dentistry specialists, and laypeople. The viewer’s sex might play a small role in their perception of beauty: male judges were inclined to see the female model as more beautiful than did female judges. This effect was visible merely in the whole sample, and not in the case of each facial image separately.

  3. The referees’ specialty (or lack of it), their age, or their experience might not affect their esthetic preferences, either in the whole sample, or in any subgroup analyses.

  4. The ideal faces were identified.