Introduction

Many studies on the aetiology and prevalence of molar incisor hypomineralisation (MIH) and hypomineralised second primary molars (HSPM) have been published in the past decades (Alaluusua 2010; Jälevik 2010). Currently, the cause of MIH or HSPM is still not clearly identified (Alaluusua 2010; Ghanim et al. 2012; Elfrink et al. 2014), whilst the published prevalence values vary widely (Jälevik 2010). Apart from differences in the socio-behavioural, environmental and genetic factors of the studied populations, the wide variation in the reported data may be attributed largely to differences in the examination protocols, including the diagnostic criteria applied. Even after the establishment of the European Academy of Paediatric Dentistry (EAPD) evaluation criteria (Weerheijm et al. 2003) and the alterations made in 2009 (Jälevik 2010), some researchers continue to use the modified developmental defects of enamel (mDDE) index or develop individual criteria for identifying cases of MIH. This makes valid comparison between studies impossible. The suggestion that the prevalence of MIH is increasing may also be questioned. An increase can only be confirmed by repeating MIH and HSPM research under the same conditions, with the same inclusion criteria and in the same places. Therefore, at the EAPD congress in Sopot, Poland, in 2014, advice was provided on research into HSPM and MIH; this advice is discussed here and in the article of Ghanim et al. (2015).

It seemed advisable to incorporate MIH and HSPM judgement criteria in national epidemiological surveys. Even in existing projects where the EAPD criteria were implemented, researchers' ability to represent the results accurately varied widely, which affected the quality of evidence and made comparison between studies difficult. This greatly lowered the value of the results, as it did not allow identification of high-risk subjects and their treatment needs. Therefore, advice on how to perform a valid prevalence and/or aetiology study of MIH or HSPM, and the development of a standardised scoring form, are needed. The use of standardised scoring sheets is discussed in the article of Ghanim et al. (2015). The present paper focuses on guidelines for performing a prevalence and/or aetiology study.

Methods

A search of the PubMed online database was undertaken in November 2014 for all articles related to MIH/HSPM epidemiology in its widest sense. The following search strings were used; (n) represents the number of articles retrieved per term:

  • prevalence hypomineralisation (41)/prevalence hypomineralization (57)

  • prevalence Molar Incisor Hypomineralisation (545)/prevalence Molar Incisor Hypomineralization (549)

  • prevalence Deciduous Molar Hypomineralisation (7)/prevalence Hypomineralised Second Primary Molars (2)

  • prevalence cheese molars (2)

  • prevalence enamel opacities (167)

  • Hypomineralised first permanent molars (17)/Hypomineralized first permanent molars (33)

  • Idiopathic enamel hypomineralisation (2)/Idiopathic enamel hypomineralization (2)

  • Non fluoride hypomineralisation (7)/Non fluoride hypomineralization (15)

  • Non fluoride enamel opacities (30)

  • Idiopathic enamel opacities (4)

  • Opaque spots (42)

  • Developmental defects enamel (504)

Some search strings found the same articles, and the duplicates were removed from the selection. Studies were included if they were: (1) investigating the prevalence and/or risk factors of MIH/HSPM in either a community-based or a clinic-based setting; (2) written in English; (3) published up to the 1st of November 2014. Non-English-language articles, animal studies and articles in whose data MIH or HSPM could not be identified were excluded.

The first stage was an assessment of each of the identified articles against the inclusion criteria based on its title, abstract, design and main conclusions. In the second stage, each article was assessed independently by two reviewers (ME, KW) and discrepancies were resolved by consensus. Where there was doubt about whether an article should be included, the two reviewers reached a consensus decision.

An evaluative framework was used to summarise the background information of each of the studies included in this review. The information was summarised based on:

  1. Number of examined subjects (n);

  2. Age cohort of the participants;

  3. Criteria/index used;

  4. The setting in which the examination took place (research environment) (e.g., school, dental clinic);

  5. The selection of the participants;

  6. Geographic location (i.e., city or rural environment);

  7. Calibration of the researchers.

Results

The search identified a total of 1078 potentially relevant MIH/HSPM epidemiological studies. After the exclusion of duplicates and of irrelevant and unavailable articles, 157 studies remained for review. The full-text versions of these 157 articles were read and 60 articles were included: 52 on MIH, 5 on HSPM and 3 on both MIH and HSPM.

Table 1 summarises the selected articles on the prevalence of MIH. The majority of the prevalence studies also addressed possible aetiological factors. Most studies were performed by calibrated examiners/researchers in urban environments. The calibration methods differed widely between the studies; the use of photographs was the most common method (n = 25). Although studies from all around the world were included, Europe is over-represented (31 out of 55). Besides the variation in the reported prevalence values, there is also wide variation in the sample sizes and the selection criteria of the recruited subjects. The great majority of the studies examined children aged 8–10 years (n = 47). The reported prevalence values showed clear variation with few outliers. In most of these studies (n = 31), the prevalence of MIH varied between 10 and 20 % (see Fig. 1).

Fig. 1 MIH world-wide prevalence studies

Table 1 Summary of published data from different countries on MIH prevalence

Table 2 presents the selected studies on HSPM. Most studies were performed in a specific group of children. The reported prevalence in those studies varied between 0 and 9 %. The outlier (21.8 %) was based on a small convenience sample.

Table 2 Overview of the studies stating the prevalence of HSPM

The number of publications concerning world-wide data has increased gradually from 1 published in 1987 to 11 in the first 10 months of 2014 (see Fig. 1).

The studies were published in 22 different journals. The most frequent were the European Archives of Paediatric Dentistry (n = 12), European Journal of Paediatric Dentistry (n = 5) and the International Journal of Paediatric Dentistry (n = 10).

Discussion

The present review represents an updated overview of the available literature on the prevalence of MIH and HSPM. The number of studies has increased, especially during recent years, but not all studies have used the same criteria or interpreted the criteria in the same way (see below).

Weakness of data

Some studies have confined their sample to a specific population. For instance, an HSPM study by Elfrink et al. (2009) derived its sample from referred patients attending paediatric dental clinics. Kar et al. (2014), on the other hand, studied a group of children conceived by in vitro fertilisation (IVF) and compared them with a group of control children visiting an Institute of Dental Science and Research. In addition, the MIH studies of Soviero et al. (2009) and Wogelius et al. (2008) included specific groups of patients. This may explain the high prevalence found in these studies.

On other occasions it seems that the criteria used were not followed correctly. Wogelius et al. (2008) used the EAPD criteria but mentioned in the discussion that the opacities were probably over-reported because the scoring instructions did not state the systemic origin of the opacities. Heitmüller et al. (2013) and Kühnisch et al. (2014) devised their own subgroups within the MIH definitions. These articles have to be read carefully to find the MIH prevalence that can be compared with the other prevalence studies. The two outliers (>40 %) are both from the study of Balmer et al. (2005), which examined small, selected samples of children in Australia and Great Britain.

The examiners in eight previous studies used the mDDE index or their own criteria (n = 13), instead of or in combination with the EAPD criteria (n = 38) which makes comparison more difficult.

Although the EAPD criteria were based on the mDDE index, using the mDDE for scoring HSPM or MIH has major drawbacks, which are:

  1. mDDE is more time consuming;

  2. Scoring post-eruptive enamel loss is not possible in the mDDE index, or it is scored as hypoplasia (which is incorrect);

  3. Atypical caries, atypical restoration and atypical extraction are not taken into account in the mDDE index (Weerheijm et al. 2003).

To overcome this, it seems advisable to use the standardised assessment criteria in future research.

The suggestion that the incidence of MIH is increasing over recent years has not been proven yet. The same age cohort/population needs to be examined with the same criteria and examination conditions over time on at least three repeated occasions. This cannot be achieved at present due to the lack of a unified reporting system.

Comparability of studies

Because the number of articles on the prevalence and aetiology of MIH and HSPM has increased in recent times, it is important that future research studies are comparable. As described by Ghanim et al. (2015), this requires that the same calibration set and the same score forms are used. The studies need to be comparable in order to perform a meta-analysis. Only in this way can the highest grade of evidence following the SIGN (Scottish Intercollegiate Guidelines Network) criteria (SIGN Criteria for assignment of levels of evidence and grades of recommendation 2014) be reached.

The examination settings have changed in previous studies. In the more recent studies, most of the time children have been examined in a dental chair, ensuring more comparable conditions.

Examinations carried out at school or in a hospital also have to be described. In some studies, children were examined in a classroom. An additional light source is needed; if a dental lamp is not available, a torch or headlight can be used. Examination with natural light seems to be the least advisable because natural light changes during the day and with weather conditions.

Recently, some prevalence studies (Elfrink et al. 2009, 2012) were based on photographs of the teeth of children of 5 or 6 years of age. The sensitivity and specificity of scoring HSPM on photographs is high and intraoral photographs can be used in epidemiological studies (Elfrink et al. 2009).

Age of examination

For HSPM, it seems that the optimal age is around 5 years because children of this age are willing to cooperate with a proper oral examination. Most second primary molars will also be present and HSPM will be recognisable, although a younger age group would be preferable as gross destruction of severely hypomineralised primary molars may have occurred by this age.

Peak prevalence rates of MIH have been reported in some birth cohorts of Finnish, English, Australian, Thai, Brazilian and Danish subjects (Alaluusua et al. 1996a; Balmer et al. 2005; Wogelius et al. 2008; Soviero et al. 2009; Pitiphat et al. 2014). Wogelius et al. (2008) included children who were younger than 8 years. Alaluusua et al. (1996a), Balmer et al. (2005) and Soviero et al. (2009) included children older than 8 years. It is proposed that disturbances in amelogenesis can occur by the action of specific factors at a certain period of time (Alaluusua 2010). Therefore, the optimal age for the clinical examination recommended by the EAPD experts was 8 years; the recommendation given by the committee in Helsinki to examine both younger and older age groups should be considered in future research (Jälevik 2010).

Following groups over time (longitudinal research) can give insight into the development of the severity of HSPM and MIH and can show possible changes in treatment needs (Lygidakis et al. 2008). Examining more age groups at one time (cross-sectional research) seems to be a good approach to investigate the effect of HSPM and/or MIH in a population. But some studies report a variation in prevalence of MIH between different age cohorts (e.g., Koch et al. 1987). Therefore, it is important to report both the age of the studied children and also the prevalence per age cohort (Jälevik 2010).
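The per-cohort reporting recommended above can be illustrated with a minimal sketch. The function name `prevalence_by_age` and the input layout (one `(age, affected)` pair per examined child) are assumptions made for this illustration, and the records in the usage example are invented and do not come from any of the reviewed studies.

```python
from collections import defaultdict


def prevalence_by_age(records):
    """Return {age: (number examined, prevalence)} for each age cohort.

    `records` is an iterable of (age_in_years, affected) pairs, where
    `affected` is True when the child was scored as having MIH (or HSPM).
    """
    examined = defaultdict(int)
    affected = defaultdict(int)
    for age, is_affected in records:
        examined[age] += 1
        if is_affected:
            affected[age] += 1
    # Report the cohort size together with the prevalence so that readers
    # can judge how reliable each cohort estimate is.
    return {
        age: (examined[age], affected[age] / examined[age])
        for age in sorted(examined)
    }


# Invented example records: four 8-year-olds and two 9-year-olds.
example = [(8, True), (8, False), (8, False), (8, False), (9, True), (9, False)]
for age, (n, prevalence) in prevalence_by_age(example).items():
    print(f"age {age}: n = {n}, prevalence = {prevalence:.0%}")
```

Reporting the number examined alongside each cohort prevalence makes it visible when a cohort is too small for its estimate to be meaningful.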

Sample size and type of study

Prevalence studies need to include a minimum number of randomly selected children. In studies with only small numbers of children, the prevalence can be over- or underestimated.

To estimate the number of children needed for a valid prevalence study, Naing et al. (2006) published the following formula: $n = Z^{2} P(1-P)/d^{2}$, where $n$ is the sample size, $Z$ is the statistic for the chosen level of confidence ($Z = 1.96$ for a 95 % confidence level), $P$ is the expected prevalence and $d$ is the precision. In Table 3, the minimum calculated sample sizes are shown for estimated prevalence values of 5, 10, 15 and 20 %. For an estimated prevalence of 5 %, a minimum of nearly 300 children is calculated. Of the available published studies, 12 have used a smaller sample size (Alaluusua et al. 1996a, b; Balmer et al. 2005; Calderara et al. 2005; Kuscu et al. 2008; Elfrink et al. 2009; Kuscu et al. 2009; Soviero et al. 2009; Brogardh-Roth et al. 2011; Mahoney and Morrison 2011; Allazzam et al. 2014; Jankovic et al. 2014; Pitiphat et al. 2014).

Table 3 Sample size calculation for various estimated prevalences of MIH
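The formula of Naing et al. (2006) quoted above can be sketched in a few lines of Python. This is an illustration only: the function name `prevalence_sample_size` is hypothetical, and the precision values used in the example are assumptions, since the precision underlying Table 3 is not stated in the text. Results are rounded up because a fraction of a child cannot be examined.

```python
import math


def prevalence_sample_size(expected_prevalence, precision, z=1.96):
    """Minimum sample size n = Z^2 * P * (1 - P) / d^2, rounded up.

    expected_prevalence -- P, as a proportion between 0 and 1
    precision           -- d, the accepted margin of error, as a proportion
    z                   -- Z statistic (1.96 for a 95 % confidence level)
    """
    if not 0 < expected_prevalence < 1:
        raise ValueError("expected_prevalence must lie between 0 and 1")
    if precision <= 0:
        raise ValueError("precision must be positive")
    n = (z ** 2) * expected_prevalence * (1 - expected_prevalence) / precision ** 2
    return math.ceil(n)


# Assumed precision of 0.025 (half of the expected 5 % prevalence).
print(prevalence_sample_size(0.05, 0.025))  # 292

# Assumed precision of 0.05 for the higher expected prevalences.
for p in (0.10, 0.15, 0.20):
    print(p, prevalence_sample_size(p, 0.05))
```

With an assumed precision of 0.025, an expected prevalence of 5 % gives 292 children, which is consistent with the "nearly 300" mentioned in the text. Other precision choices give different minimum sizes, so the chosen precision should be reported together with the sample size.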

For studies on possible aetiological factors, larger sample sizes are needed, and the formula for calculating the sample size is much more complex (Hsieh et al. 1998; Demidenko 2007, 2008). To overcome this problem, there are websites where the sample size needed for logistic regression analysis can be calculated (Power/Sample Size Calculation for Logistic Regression with Binary Covariate(s) 2014). A sample size of about 1000 seems to be the minimum required for future research into possible aetiological factors of HSPM and/or MIH. Moreover, for such studies a longitudinal prospective cohort design is recommended because data collection is more accurate, especially for data collected by means of questionnaires (Crombie et al. 2009; Alaluusua 2010; Fagrell et al. 2011; Elfrink et al. 2014).

Calibration

Calibration of the investigators is also important for achieving comparable scoring between different researchers. Cohen's kappa is still the most widely used measure of inter- and intra-observer agreement (Banerjee et al. 1999). Most investigators of the prevalence of MIH and HSPM state how they calibrated the researchers and give kappa scores for inter- and intra-observer agreement.
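As an illustration of how such a kappa score is obtained, the following minimal sketch computes unweighted Cohen's kappa for two sets of scores of the same teeth. The function name `cohen_kappa` and the example scores are assumptions made for this illustration; the scores are invented and do not come from any calibration exercise in the reviewed studies.

```python
from collections import Counter


def cohen_kappa(scores_a, scores_b):
    """Unweighted Cohen's kappa for two equally long lists of category scores."""
    if len(scores_a) != len(scores_b) or len(scores_a) == 0:
        raise ValueError("both score lists must be non-empty and of equal length")
    n = len(scores_a)
    # Observed agreement: proportion of teeth given the same score twice.
    observed = sum(a == b for a, b in zip(scores_a, scores_b)) / n
    # Chance agreement: product of the two marginal proportions per category.
    freq_a = Counter(scores_a)
    freq_b = Counter(scores_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Invented scores for ten teeth (1 = affected, 0 = not affected).
examiner_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
examiner_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
# Observed agreement 0.8, chance agreement 0.5, so kappa is about 0.6.
print(round(cohen_kappa(examiner_1, examiner_2), 2))
```

The same function serves for intra-observer agreement when the two lists hold one examiner's scores from two separate sessions.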

Calibration with photographs is often performed, but the characteristics of the photographs differed between studies, so comparison between the different readings may not be valid. In addition, comparison between the different prevalence values may be biased (Kemoli 2008; Elfrink et al. 2009, 2012, 2013, 2014; Brogardh-Roth et al. 2011; Ghanim et al. 2011, 2013a, b; Ahmadi et al. 2012; Balmer et al. 2012; Biondi et al. 2012; Bhaskar and Hegde 2014; Garcia-Margarit et al. 2014; Kühnisch et al. 2014; Petrou et al. 2014). To reduce the variability in the outcome of MIH studies, dental research workers need to be trained and calibrated using a standardised method and a standardised set of photographs. This training set also needs to include teeth with enamel defects other than MIH or HSPM, to reduce the risk of overestimating the prevalence of MIH and HSPM. Examples of such training sets can be found in the article of Ghanim et al. (2015).

Worldwide data

Most prevalence studies have been performed in Europe. From North America there are no studies available, and there are only a few publications from Oceania and Africa. In Asia, the number of publications is growing quickly, but from some countries, there are no publications in English on the prevalence of MIH and/or HSPM.

Recommendations

Future prevalence studies should, besides interpreting the definitions of MIH and HSPM correctly, include at least 300 children selected at random. The optimal examination ages seem to be 5 years for HSPM studies and 8 years for MIH studies. Scoring of the teeth needs to be carried out by calibrated examiners, preferably calibrated with a standardised set of photographs. Kappa scores for inter- and intra-observer agreement need to be published. Scoring forms, for example those described by Ghanim et al. (2015), need to be used to obtain more comparable outcomes in prevalence studies. Validation studies on such scoring forms need to be performed.

For aetiological factor studies, at least 1000 children selected at random should be included. Calibration of the examiners using a standardised format, publication of the kappa scores, and the use of a standardised recording form as described by Ghanim et al. (2015) are also needed for aetiology studies.

When data are collected in this more comparable way, meta-analysis will be possible. This increases the grade of evidence [e.g., SIGN criteria (SIGN Criteria for assignment of levels of evidence and grades of recommendation 2014)] and possible causes of differences in reported prevalence could be explained better.

Conclusions

In order to achieve comparable outcomes, it is important that prevalence and aetiological factor studies on MIH (favourable examination age 8 years) and HSPM (favourable examination age 5 years) are carried out in a standardised manner worldwide. A clearly described sample of children, with a minimum of 300 children for prevalence studies and 1000 for aetiological factor studies, is recommended. Standardisation of the calibration method and publication of the kappa scores for inter- and intra-observer agreement are necessary. Use of standardised scoring criteria and score sheets is considered essential. Following these guidelines will create comparable research, with the possibility of meta-analysis.