Introduction

Since the first autism survey conducted in England (Lotter, 1966), epidemiological surveys have increased in number and complexity. Contrasting with the first studies that were simple head counts of children already diagnosed with a severe autism phenotype and residing in small, circumscribed geographical areas, current surveys now include large populations, multiple sites, stratified samples and rely on intricate sets of screening procedures followed by some form of diagnostic confirmation. However, no agreed-upon formula exists for planning and conducting a survey, and there is no standardization of autism survey methodology. As a result, differences in methodologies account for substantial heterogeneity in survey findings. Each survey has unique design features that reflect the local educational and health services infrastructure and that are influenced by current social policies for children with disabilities in the local region or country. Moreover, survey protocols vary in whether they include parents, teachers and subjects with Autism Spectrum Disorder (ASD) as participants, and rely on variable screening instruments and diagnostic confirmation procedures. As such, prevalence differences between studies are hazardous to evaluate and whether observed discrepancies are due to method factors or true differences in population parameters cannot usually be determined.

In Part I of this chapter, we first provide a review of the autism survey literature published from 1966 to 2020. A total of 141 surveys were identified and we summarize their design characteristics and main results. In Part II, we review some correlates of ASD in surveys such as age, sex, race and ethnicity and specific areas of survey methodology (surveys of parent reports, school screening) where advances have been made or progress is still needed. In Part III, we consider recent trends for expanding surveys of ASD worldwide and developing surveillance programs. Many issues were already discussed in our recent reviews (Fombonne, MacFarlane & Salem, 2021; MacFarlane et al., 2021) and commentaries (Fombonne, 2018, 2019), or in reviews from other scholars (Chiarotti & Venerosi, 2020; Zeidan et al., submitted). Of note, the scope of these reviews is restricted to prevalence studies of ASD and they do not include studies of environmental risk factors. Because published reviews have employed slightly different inclusion and exclusion criteria, we recommend that readers consult the other reviews for a fully comprehensive picture. In this chapter, unless indicated otherwise, we use the terms ‘autism’ and ‘ASD’ interchangeably to refer to a single broad diagnostic category as defined recently in DSM-5 (American Psychiatric Association, 2013). This category also encompasses the class of Pervasive Developmental Disorders (PDD) as defined in DSM-IV (American Psychiatric Association, 1994), ICD-10 (World Health Organization, 1992) and DSM-IV-TR (American Psychiatric Association, 2000).

Part I. Review of Prevalence Surveys

For this review, we identified 141 prevalence surveys of autism conducted between 1966 and 2020. Table 1 lists the 109 surveys conducted since 2000. The remaining 32 surveys conducted between 1966 and 2001 are described in a previous publication (Fombonne, 2003) to which the reader is referred for specific study details. The two sets of studies are mutually exclusive. We included all published cross-sectional surveys of population-based samples aiming at estimating the prevalence of ASD. Studies designed to test the efficacy of screening tools or screening programs were excluded. Cohort studies that yielded incidence rate estimates using person-years denominators were excluded; in a few instances, we included cohort analyses that generated cumulative incidence proportions that can often approximate prevalence. We also excluded studies with a target population size of fewer than 5000 individuals (because they lack precision and are more prone to sampling biases), studies where no diagnostic confirmation by a professional was available (e.g. surveys relying exclusively on parent reports), studies published in a language other than English or as abstracts only, and duplicate studies conducted on identical samples. We analyzed the set of 141 surveys and provide below summary statistics of their key characteristics.

Table 1 Prevalence surveys of ASDs since 2000

The interpretation of these summary statistics must be made with extreme caution in light of the substantial heterogeneity of methods across surveys and over time. Two major features must be borne in mind: case definition and case ascertainment. First, the definition of autism used in surveys has changed over time. As illustrated in Table 1 (and in Table 1 in Fombonne, 2003), numerous diagnostic criteria were employed in epidemiological studies, starting with relatively narrow definitions in Kanner-derived criteria in the first survey (Lotter, 1966), followed by ICD-9 (World Health Organization [WHO], 1977) or DSM-III (American Psychiatric Association [APA], 1980) diagnostic nomenclatures, and in the early 1990s, by the broader concepts of autism embraced in both ICD-10 (WHO, 1992) and DSM-IV (APA, 1994). The earliest diagnostic criteria reflected the more qualitatively deviant forms of the autism behavioral phenotype, usually associated with severe delays in language and cognitive skills. With ICD-10 and DSM-IV, less severe forms of autism were recognized and separate diagnostic categories (Pervasive Developmental Disorders Not Otherwise Specified [PDD-NOS]; Asperger Disorder) were introduced alongside Autistic Disorder within a broader class of autism spectrum disorders denominated “pervasive developmental disorders” (PDD). The impact of using different case definitions and diagnostic criteria was well illustrated in a Finnish study where the investigators evaluated how applying different diagnostic schemes to the same survey data influenced prevalence estimation. Thus, all else being equal, prevalence among 15–18 year olds increased three-fold from 2.3/1000 with Kanner’s criteria to 7.6/1000 with the broader ICD-10 PDD/ASD definition (Kielinen et al., 2000).
While there was generally high interrater reliability regarding diagnosis of ASDs and high agreement between ICD-10 and DSM-IV umbrella PDD diagnosis, differences existed in their operationalization of criteria for subtypes of PDDs. For example, DSM-IV had a broad category of PDD-NOS, sometimes referred to loosely as “atypical autism” whereas ICD-10 had several corresponding diagnoses including Atypical Autism (F84.1, a diagnostic category that existed already in ICD-9), Other PDD (F84.8), and PDD Unspecified (F84.9). When used in epidemiological surveys, subtyping of PDDs proved highly inconsistent across studies. For example, the overall prevalence of ASD was similar in three independent surveys performed around the same time ranging from 0.529% to 0.674% (Baird et al., 2000; Chakrabarti & Fombonne, 2001; Bertrand et al., 2001). However, the partition of the autism spectrum led to widely variable specific prevalence estimates for autistic disorder (0.308%, 0.168%, 0.405%, respectively) and for other PDD subtypes (0.271%, 0.361%, 0.270%, respectively). The recent DSM-5 (APA, 2013) recalibration of diagnostic criteria for ASD should improve future comparisons of findings across surveys although it may also impinge on the evaluation of time trends. A more detailed discussion of the impact of changing nosographies on surveys can be found below.

Second, case ascertainment approaches vary across studies and also over time (for a detailed discussion, see Meyers et al., 2019; MacFarlane et al., 2021). Some surveys (such as those conducted with administrative databases or registries) rely on children already diagnosed with ASD, and the proportion of cases missed because they are not yet diagnosed or carry non-ASD diagnoses cannot generally be estimated. The resulting loss in sensitivity varies by age, area and period of the survey. Other surveys have employed more proactive ascertainment approaches. These include screening populations for children who have diagnoses of other neurodevelopmental and psychiatric disorders (especially, language and other specific developmental disorders, global developmental delay or intellectual disability, attention deficit hyperactivity disorders) and re-evaluating them for possible misdiagnosis or for the co-occurrence of ASD, and more recently screening mainstream schools to identify children newly diagnosed (see below for a methodological discussion of these school surveys). Proactive survey techniques identify more cases than studies relying on passive approaches to case finding. However, they pose particular challenges for survey data analysis and interpretation of the results due to the heterogeneity of screening techniques employed, the uneven participation rates of informants (caregivers, teachers, health professionals) and participants in the various survey stages, and the variability in methods used to confirm case status. Adding to the survey variability in case definition and case finding approaches, increasing awareness over time among both professionals and the lay public has resulted in improved ascertainment and increasing prevalence in more recent epidemiological surveys, which further complicates the interpretation of time trends in prevalence of ASD.
Thus, it is necessary to keep in mind the methodological heterogeneity of surveys listed in Table 1 when summarizing the results of this body of data.

The 141 surveys were conducted in 37 countries and half of them were published since 2012. There was no study from any of the 28 countries classified as low-income by the World Bank (2020); seven studies originated from four of the 50 lower middle income (LMI) countries, 14 surveys from seven of the 56 upper middle income (UMI) countries, and the remaining 120 surveys from 26 of the 83 high-income (HI) countries (see Table 1). Some countries (e.g. Russia) have no prevalence data yet available; Africa and South America have very few published studies, most of which failed to meet our inclusion criterion due to their small sample size. The median size of populations surveyed was 62,000 (interquartile range (IQR): 15,350-284,536); there was a negative correlation between population size and prevalence (Spearman rank rho = −0.213; p = 0.011) consistent with case ascertainment being less sensitive in large national databases and registries. The median age of participants was 8.0 years (IQR: 6.5–9.6); only two surveys focused specifically on adults (Brugha et al., 2016; Jariwala-Parikh et al., 2019). The median number of subjects with ASD identified in studies was 177 (IQR: 58–1043), the median proportion without ID was 53.4% (IQR: 31.5–65.8%), and the median male:female ratio was 4.1:1 (117 surveys; IQR: 3.1–4.9). Over time, there was a marked increase in prevalence as indicated by a significant correlation between prevalence and year of publication (Spearman rank r = 0.58; p < 0.001; Fig. 1a) as well as of the proportion of participants without ID as indicated by a significant correlation between proportion without ID and year of publication (67 studies; Spearman rank r = 0.49; p < 0.001; Fig. 1b). Interested readers can also consult an interactive global autism prevalence map publicly available at: https://prevalence.spectrumnews.org/.
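The Spearman rank correlations reported above can be illustrated with a minimal sketch. It is not the analysis code used for this review: the function names are ours, and the six (year, prevalence) pairs are invented for illustration rather than taken from Table 1. The sketch ranks each variable, assigning tied values the mean of their positions, and then computes the Pearson correlation of the two rank vectors.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over all entries tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical (publication year, prevalence in %) pairs, invented for illustration.
years = [2001, 2005, 2009, 2012, 2016, 2020]
prevalence = [0.62, 0.60, 0.94, 1.16, 1.10, 1.85]

rho = spearman_rho(years, prevalence)
print(round(rho, 3))  # 0.886 for these invented values
```

Because the statistic depends only on ranks, a handful of surveys with extreme prevalence values cannot dominate it, which is one reason a rank-based coefficient suits a set of surveys as heterogeneous as this one.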

Fig. 1 (a) Prevalence over time. (b) IQ over time

In order to generate summary information for current estimates of ASD prevalence, we first restricted our analysis to the 47 studies published in the last 5 years (2016–2020) given the marked time trends in prevalence (Fig. 1a). A comparison of prevalence in the 35 surveys from high-income countries with that in the remaining 12 surveys from middle- and low-income countries showed that prevalence was significantly lower in the latter group (Mann–Whitney U = 87.5; p = 0.003), which was subsequently excluded. We further restricted the analysis to surveys conducted in samples aged 4–12, excluding one survey of preschoolers (Hamad et al., 2019) and three surveys of mostly adult populations (Brugha et al., 2016; Jariwala-Parikh et al., 2019; Hong et al., 2020). The final set of 31 surveys was performed in high-income countries on a total population of 14.7 million children with a median population size of 108,000 individuals (IQR: 15,800–347,000) per survey. The median age was 8 years (range: 4–12). The median number of participants with ASD was 1125 (IQR: 177–3138), the median male:female ratio was 4.07:1 (IQR: 3.65–4.50), and the median proportion of individuals with associated intellectual disability was 31.6% (IQR: 20–44.8%). Prevalence estimates ranged from 0.2% to 2.68% (IQR: 0.73–1.52%) with a median prevalence estimate of 1.14%. This figure of 1.14% can be considered a robust conservative estimate for current prevalence of ASD among children ages 4–12 living in high-income countries. It ought to be interpreted in the context of methodological heterogeneity of the surveys from which it is derived and of the large range of observed variation.

Part II. Advances and Remaining Challenges

We discuss below some methodological issues that must be borne in mind when evaluating the published literature; we outline methodological advances in recent surveys as well as persisting challenges in conducting, analyzing and interpreting these surveys.

Case Definition and Case Status Determination

An important aspect of survey methodology is how caseness is defined and how case status is determined in individual participants in each study. There is no uniform approach to case definition across published studies. Some surveys simply use diagnoses from electronic medical records, some rely on an autism special education eligibility that varies across countries and even across areas and over time within the same country, some rely on endorsement by caregivers of a single questionnaire item, while others perform in-person clinical assessments. Many, if not most, surveys use combinations of modalities. Reliance on a particular mode of defining caseness often has predictable consequences for prevalence estimation. Thus, surveys of large national registries or administrative databases usually result in downward bias in prevalence estimation since only cases already identified and diagnosed are counted. Conversely, surveys that rely on parent report in a household survey often overestimate prevalence (see below). However, in most studies, investigators have attempted to confirm directly an ASD diagnosis in a participant (or a subsample of participants) by reviewing the symptomatology and developmental history and comparing them against a set of established diagnostic criteria, such as the International Classification of Diseases (ICD) or the Diagnostic and Statistical Manual (DSM).

Here, several issues need to be considered. First, the terminology of ‘meeting diagnostic criteria’ does not magically guarantee the validity of caseness unless careful attention is paid to the quality of the data used to score these criteria and to how much clinical wisdom was infused into this process. The DSM/ICD algorithms for PDD or ASD are only guiding principles which can help organize the available information and provide final coherence to clinical data stemming from different data sources and informants. Yet, how data are collected, by whom, from which informants and using which methods, and how discrepancies between data sources are resolved, are essential features to consider in gauging the validity of case confirmation in a given survey. Guidance by nosographic definitions has the merit of increasing the reliability of symptom identification and, through that process, the reproducibility of diagnoses across investigators. However, although validity is contingent upon high reliability of measurement being achieved first, it is a separate issue that requires demonstrations other than mere agreement.

Second, even when gold standard tools such as the Autism Diagnostic Interview-Revised (ADI-R; Rutter et al., 2003) and the Autism Diagnostic Observation Schedule (ADOS; Lord et al., 2002) are employed for in-person assessments, case status confirmation based on ‘scoring above/below threshold’ results is far from sufficient. In reputable investigations such as the Simons Simplex Collection (Lord et al., 2012) or the Collaborative Programs of Excellence in Autism (Lainhart et al., 2006), scoring rules and cut-offs had to be modified to maintain adequate sensitivity of both instruments in selecting participants in these specific samples. It is worth noting that in both studies expert clinical judgment was employed to provide final confirmation of diagnosis and inclusion in the study. Validity of case status determination does not reside solely in any instrument or its scores; rather, it requires a higher-order interpretative process informed by expert clinical judgment. It is important to remember that even instruments like the ADI-R and the ADOS have been developed to be used in conjunction and that their results must be reviewed and interpreted by a clinical expert (Risi et al., 2006; Lord et al., 2012). Mechanical translation of scores into diagnosis is unwise. Similarly, it ought to be remembered that diagnostic algorithms of the ICD and DSM have been validated against a gold standard that was precisely the clinical judgment of experts (see for example Volkmar et al., 1994). The importance of expert clinical judgment in making final decisions about caseness is generally acknowledged in epidemiological investigations even when they do not rely on in-person assessments with the ADOS and the ADI-R.
Thus, the surveillance definition implemented by the Centers for Disease Control and Prevention (CDC) in its Autism and Developmental Disabilities Monitoring (ADDM) surveys guides clinicians’ evaluation of records materials along the nosographical criteria and algorithms but it also allows clinicians to rule out ASD based on insufficient or conflicting information. Unfortunately, how often this clause was used has not been reported and its influence on prevalence estimation remains therefore unknown. Similarly, quality and certainty ratings assigned by CDC clinicians to cases have not been examined with respect to their potential impact on prevalence estimation.

Third, diagnostic algorithms and ADI-R/ADOS cut-offs have been calibrated against control samples that have typically included participants with either typical development or intellectual disability and developmental delays without autism. The performance of these tools may be diminished when applied to samples enriched with varied types of psychopathology (Bastiaansen et al., 2011; Matsuo et al., 2015; Grzadzinski et al., 2016; Havdahl et al., 2016; Turban & van Schalkwyk, 2018) or with other types of neurodevelopmental and genetic disorders (e.g. Garg et al., 2013; Morotti et al., 2021). For example, Grzadzinski et al. (2016) reported that 20–30% of children with ADHD but without ASD scored over the cut-offs of standardized autism diagnostic tools (ADI-R and ADOS); likewise, in a study investigating the impact of both parent-reported and clinician-reported behavioral/emotional problems on ratings of autistic symptoms, Havdahl et al. (2016) showed that the presence of co-occurring problems increased ADOS, ADI-R and Social Responsiveness Scale (SRS; Constantino & Gruber, 2005) scores resulting in decreased specificity of ASD instruments. Moreover, epidemiological samples include school (rather than preschool) age subjects with language and intellectual skills often within the normal range. At that age, psychiatric disorders are common and frequently associated with social-communication symptoms (e.g. lack of friendships, self-centeredness, low empathy) and even restricted and repetitive behaviors (e.g. behavioral rigidity, obsessions), allowing autism symptoms to be easily ‘scored’ albeit wrongly endorsed. This concern is heightened for older children or adults with language and intellectual skills within the normal range when they are newly diagnosed as part of their participation in an epidemiological study, especially in the absence of a previous neurodevelopmental evaluation and/or of a developmental history suggestive of previous autistic abnormalities.
Differentiating autism in the context of psychiatric comorbidity presents challenges both to the performance of standardized instruments and to clinical judgment. However, in the absence of experienced clinical evaluation, simply scoring criteria and mechanical reliance on algorithms, either from record reviews or diagnostic instruments, may easily be misleading.

Fourth, surveys have incorporated in their case status definition ill-defined diagnostic subtypes such as PDD-NOS whether ICD or DSM (until recently) was used. To illustrate, PDD-NOS could be diagnosed based on the presence of only two diagnostic criteria (one social, one other), with no requirement for evidence of abnormality before age three. Therefore, contamination of cases with phenocopies of all kinds was a strong possibility. There again, false positives are more likely when mechanical rules devoid of clinical judgment are used to establish caseness.

Fifth, screening and diagnostic confirmation should rely on reasonably independent procedures. If record review is used as the main procedure to screen and to confirm diagnosis, the risk of circularity is very high as exemplified in one CDC diagnostic validation study (Bakian et al., 2015). For a child with a clinical ASD diagnosis or ASD special education eligibility, the documentation available in his or her medical or educational record will obviously contain descriptions in support of that classification, making it in turn difficult to truly evaluate its validity. CDC ADDM surveys are particularly vulnerable to this problem due to their specific record review methodology (see Van Naarden Braun et al., 2007).

The repeated changes in nosographical systems create another source of measurement uncertainty in autism studies in general. There was relatively strong parallelism between ICD-10 and DSM-IV that was unfortunately lost with the recent changes in DSM-5. Nevertheless, the new, single, unified concept of Autism Spectrum Disorder (ASD) that replaces the previous umbrella diagnostic class of Pervasive Developmental Disorder (PDD) has increased specificity that should benefit epidemiological research. Preliminary studies comparing the effects of using DSM-IV or DSM-5 on prevalence estimates have shown that, all else being equal, the shift from DSM-IV to DSM-5 leads to a decrease of 13% to 20% in prevalence within the same study datasets (Kim et al., 2014; Maenner et al., 2014). The decrease in prevalence is largely due to subjects with a DSM-IV diagnosis of PDD-NOS no longer meeting ASD criteria in DSM-5 (a 37% decrease in Kim et al., 2014). Likewise, in a recent CDC ADDM survey of children aged eight (Baio et al., 2018), the prevalence based on DSM-5 behavioral criteria for ASD was 18.1% lower than that based on DSM-IV-TR (for precise calculations, see Fombonne, 2018); a similar pattern emerged from CDC surveys of children aged four surveyed in 2010, 2012 and 2014 (Christensen et al., 2019) where the prevalence was 1.70% for the DSM-IV based definition, 20% higher than the 1.41% estimate derived from DSM-5. As it introduced a new DSM-5 based case definition, the new CDC surveillance definition provided a “grand-father” clause by which subjects with a history of a PDD diagnosis would automatically meet criteria for the new surveillance definition even though DSM-5 behavioral criteria would not necessarily be met. This practical choice was in line with the DSM-5 recommendation to provide a new DSM-5 ASD diagnosis to individuals having a “well-established” DSM-IV PDD diagnosis.
However, because the old and new surveillance definitions are embedded in each other, the net effect on prevalence due to the change from DSM-IV to DSM-5 cannot be properly evaluated in recent CDC ADDM surveys; consequently, apparent similarity between DSM-IV and DSM-5 derived prevalence (Baio et al., 2018; Maenner et al., 2020) should not be taken as evidence that the two sets of diagnostic criteria perform equally in terms of sensitivity and specificity.
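The percentages quoted above depend on which estimate serves as the reference, a point that is easy to overlook when comparing figures such as "18.1% lower" and "20% higher" across studies. The following short sketch uses only the two prevalence figures for four-year-olds cited above (1.70% and 1.41%); the function name is ours and introduced purely for illustration.

```python
def relative_difference(value, reference):
    """Relative difference of `value` with respect to `reference`, as a fraction."""
    return (value - reference) / reference


dsm_iv = 1.70  # % prevalence, DSM-IV based definition (Christensen et al., 2019)
dsm_5 = 1.41   # % prevalence, DSM-5 based definition (same source)

higher = relative_difference(dsm_iv, dsm_5)  # DSM-IV expressed relative to DSM-5
lower = relative_difference(dsm_5, dsm_iv)   # DSM-5 expressed relative to DSM-IV

print(f"DSM-IV vs DSM-5: {higher:+.1%}")  # +20.6%
print(f"DSM-5 vs DSM-IV: {lower:+.1%}")   # -17.1%
```

The same absolute gap of 0.29 percentage points therefore reads as roughly a 20% increase or a 17% decrease depending on the denominator, so the direction of comparison should be checked before percentage changes from different surveys are set side by side.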

The defunct PDD-NOS diagnostic category is not missed. It was an ill-defined diagnostic category with poor inter-rater agreement. In a review of previous surveys (Fombonne, 2003), we noted that the proportion of PDD-NOS diagnosed in epidemiological surveys was highly variable, accounting for anywhere between 20% and 70% of all spectrum diagnoses reached in surveys. As narrated by Volkmar et al. (2000), a printing mistake in the 1994 DSM-IV manual initially enforced a hyper-lax definition of PDD-NOS (one social OR communication criterion was sufficient) that was subsequently corrected (one social AND one communication criterion required) in the DSM-IV-TR edition (APA, 2000). The fact that in CDC ADDM surveys, the proportion of PDD-NOS diagnoses has revolved around 40% of the caseload adds further challenges to the interpretation of CDC ADDM survey results (Mandell & Lecavalier, 2014; Fombonne, 2018). In an attempt to increase its specificity, the CDC surveillance case definition for PDD-NOS added the requirement of the presence of at least 1 of 19 autism discriminators (see list in Baio et al., 2018). Unfortunately, the effect on prevalence of applying or not applying that requirement has not been reported. Finally, as for PDD-NOS, a poor level of reliability of the Asperger disorder diagnosis was documented in epidemiological surveys (Fombonne & Tidmarsh, 2003), favoring its removal from DSM-5 as a separate diagnostic entity. Thus, findings from epidemiological studies concurred with those from other studies (e.g. Lord et al., 2012) in showing that reliability for subtypes within the autism spectrum was mediocre whereas it was excellent for differentiating spectrum and non-spectrum diagnoses.

The Problems of Parental Reports

In our reviews, we excluded surveys that relied solely on parental responses collected in various national health surveys due to concerns about the validity of the case definition employed and of the resulting prevalence estimates. Surveys using large nationally representative samples, such as the US National Survey of Children’s Health (NSCH), have yielded prevalence estimates relying on highly problematic caseness determination. Gains in sample size, participants’ age range and representativeness were offset by reliance on simple yes/no answers by household informants to one or a few survey questions (“Did a doctor or health professional ever tell you that [child’s name] had autism, Asperger’s disorder, pervasive developmental disorder, or autism spectrum disorder?”) to establish caseness (Kogan et al., 2018). In the NSCH, where a follow-up question required parents to also report a current ASD diagnosis in order to reduce misclassification, as many as 7.1% of parents reported an ever but not a current ASD diagnosis (Kogan et al., 2018), raising the question of whether these children ever had ASD in the first place. Similar unconfirmed parent reports were used in other population surveys in the US and elsewhere (Table 2). In these surveys, non-clinically trained interviewers recorded verbatim answers from respondents without further checking, children were not seen, and additional diagnostic evaluation reports were not collected or reviewed. In each of these surveys, the prevalence was estimated to be much higher than that derived from more rigorous population surveys performed at the same time in the same country. For example, the prevalence of 1.7% reported by Russell et al. (2014) in the UK compares to prevalence figures of 1.16% and 0.94% reported in the UK by Baird et al. (2006) and Baron-Cohen et al. (2009) at the same time; likewise, in the US, the recent 2.5% prevalence estimated in the National Health Interview Survey (NHIS; Zablotsky et al., 2020) and in the National Survey of Children’s Health (NSCH; Xu et al., 2018) is higher than the latest 1.85% prevalence figure from CDC (Maenner et al., 2020). To illustrate further the limitations of this type of survey and the considerable concerns about what a ‘case’ really means, a study by Zablotsky et al. (2015) showed that changes in the wording, format and placement of the single autism question in the National Health Interview Survey resulted in a sharp prevalence increase from 1.25% in 2011 to 2.24% in 2014, a difference seen as arising purely from questionnaire design modifications. Much caution should therefore be exercised when interpreting or using these survey results.

Table 2 Surveys using parent reports for case definition

Novel Approaches to Case Finding/Ascertainment

Classically, surveys identified cases by zooming in on children already diagnosed with autism or other behavioral or developmental problems. This approach to case ascertainment did not permit researchers to identify cases without a previously recognized condition and resulted in imperfect sensitivity of case ascertainment procedures (due to false negatives). The addition of a regular school survey component in recent surveys (Kim et al., 2011; Baron-Cohen et al., 2009; Fombonne et al., 2016; Alshaban et al., 2019) and in new studies in China (Zhou et al., 2020; Sun et al., 2019) has addressed this concern from a study design perspective (Table 3). However, new issues arose with the implementation of this approach. First, screening tools such as the Social Responsiveness Scale (SRS), the Social Communication Questionnaire (SCQ; Berument et al., 1999) and others show only mediocre specificity, especially among children with elevated levels of concurrent anxiety, attention deficit or other psychiatric symptoms (Hus et al., 2013; Grzadzinski et al., 2016; Fombonne et al., 2021); moreover, their cut-offs have not been well calibrated for use in general population studies, and when both teachers and parents are used as informants, no clear rules exist for combining their often-discrepant results. As seen in Table 3, multiple screeners have been employed to survey school samples, reflecting opportunistic rather than data-derived choices. Second, and most importantly, participation in the initial screening and in the other survey phases (e.g. a stage two diagnostic confirmation session) is relatively low (30–70%). Statistical analysis of these complex survey designs was made adequate by applying a series of weights to account for different sampling fractions and participation rates at each survey phase. However, in doing so, strong, unchecked assumptions had to be made as to whether participation was associated (or not) with caseness.
In the complete absence of information about non-participants (which is the usual situation), the assumption that non-participants do not differ from participants with respect to the presence/absence of autism is a guess rather than a tested proposition. Parents of children with autism have unusually high participation in surveys (Fombonne, 2003), making it plausible that non-participants have ‘less’ autism than participants. Differential participation in that direction may have biased prevalence estimates upwards, a possibility appropriately discussed in the Korean study by its authors (Kim et al., 2011) as well as by other commentators (Pantelis & Kennedy, 2016). Conversely, prevalence could be underestimated if parents of children with ASD were less likely to participate.
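How much the participation assumption matters can be explored with a simple sensitivity sketch. This is not the weighting scheme of any specific survey cited here: the function names, strata counts, participation rate and assumed prevalence ratios are all hypothetical and chosen only to make the arithmetic transparent. The first function weights confirmed cases by the inverse of the phase-two sampling fraction in each screening stratum; the second mixes participant and non-participant prevalence under an assumed ratio between the two.

```python
def two_phase_prevalence(strata, n_screened):
    """Inverse-probability-weighted prevalence among screened children.

    Each stratum is (n_in_stratum, n_assessed, n_confirmed_cases); every
    assessed child stands for n_in_stratum / n_assessed screened children.
    """
    weighted_cases = sum(n * cases / assessed for n, assessed, cases in strata)
    return weighted_cases / n_screened


def population_prevalence(prev_participants, participation, relative_prev_nonparticipants):
    """Mix participant and non-participant prevalence under an assumed ratio."""
    prev_non = prev_participants * relative_prev_nonparticipants
    return participation * prev_participants + (1 - participation) * prev_non


# Hypothetical counts, invented for illustration only.
strata = [
    (1000, 250, 40),  # screen-positive: 1000 children, 250 assessed, 40 confirmed
    (9000, 300, 1),   # screen-negative: 9000 children, 300 assessed, 1 confirmed
]
prev_part = two_phase_prevalence(strata, n_screened=10_000)

# Assumed ratios of non-participant to participant prevalence (untestable in practice).
for ratio in (0.5, 1.0, 1.5):
    estimate = population_prevalence(prev_part, participation=0.6,
                                     relative_prev_nonparticipants=ratio)
    print(f"ratio {ratio}: {estimate:.2%}")
```

With these invented inputs, the estimate among participants is 1.90%, and the population figure moves between 1.52% and 2.28% as the assumed ratio goes from 0.5 to 1.5. Standard weighting corresponds to a ratio of 1.0; reporting estimates over a plausible range of ratios is one way to show readers how strongly a result depends on that assumption.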

Table 3 Contribution of school survey to overall prevalence estimates

Nevertheless, important findings were obtained by adding a school survey component to the study designs. As can be seen in Table 3, the prevalence estimated by the school survey alone was never nil, and ranged from 0.054% to 1.89% confirming that screening school children allows for the identification of new cases that would otherwise have been missed by previous methodology relying on children already diagnosed with some form of disability. Moreover, within each survey, the relative contribution of the school prevalence component to the overall population prevalence ranged from 13% to 72% (median: 33%, 10 surveys; see Table 3). It is probable that this wide variation reflects differences in school survey methodology across these studies although it might also reflect true differences across populations in the proportion of diagnosed/undiagnosed children. Unfortunately, there is no way to test these competing interpretations. Current preliminary evidence (Table 3) suggests nonetheless that up to a third of cases of autism in a population could be missed in studies that do not survey mainstream schools.

Execution of school surveys has confronted investigators with huge sample size and manpower issues, specifically due to the very high numbers of screen-positive, and sometimes of screen-negative, children to be assessed in second phases of diagnostic confirmation. Innovative techniques have been used to tackle this issue. For example, in a multisite Chinese study where 32.9% of school participants screened positive on an autism screening questionnaire, Zhou et al. (2020) implemented a second step screening procedure combining a brief semi-structured direct observation and group interviews in the school setting in order to eliminate a large proportion of false positives on the initial screener and thereby reduce the second phase sample to a manageable size. In the Qatar study, Alshaban et al. (2019) devised a brief semi-structured telephone interview allowing for rapid evaluation of a high number of screen-negative children, leading to a more valid and precise prevalence estimate. The need to combine informants and data sources in efficient ways has led some European investigators (e.g. Narzisi et al., 2020; Fuentes et al., 2020) to develop a nested screening procedure whereby teachers are first asked to nominate children with suspected social communication or restricted/repetitive problems with a six-item Teacher Nomination Form (Hepburn et al., 2008). Parental screening is subsequently obtained only for the small sample of participants who first screened positive on teacher measures, allowing researchers to limit the final number of screen-positive participants, here defined as those screening positive on both teacher and parent informants. The efficiency of this approach is very attractive; however, its accuracy depends heavily on the performance and properties of the initial teacher identification, the sensitivity of which remains unknown.
For example, it is possible that teacher nomination could disproportionately miss girls with autism, ‘passive’ autistic children in Wing’s nomenclature or those without behavioral problems. The method also creates another stratification layer within the screening phase that complicates the survey data analysis.

In sum, the addition of general schools to the samples surveyed in autism epidemiology was a logical improvement that has proven to be contributory. However, the methods used to screen and confirm cases in large samples of typically developing children need to be refined and adequately tested for their performance and cost-effectiveness.

The Male Preponderance in Autism

The male preponderance in autism is a well-recognized feature of the disorder, one that has been steady through decades of research. In a review of 29 surveys published up to 2001, we previously reported an average male:female ratio of 4.3:1 (Fombonne, 2003). And as indicated above, a median sex ratio of 4.1 could be derived from 117 published surveys with sex data; likewise, the sample size weighted average was 4.13:1. A sex ratio of 4.1:1 is equivalent to observing 80–81% males in surveyed samples. As illustrated in Fig. 2, this sex ratio has not changed over time as shown by the non-significant Spearman correlation between sex ratio and year of publication.

Fig. 2 Sex ratio over time

In a recent meta-analysis of 54 surveys (data collected 1990–2011), Loomes et al. (2017) found that the male:female prevalence ratio was similarly 4.2:1. The authors rightly pointed out that conventional calculations of the sex ratio (dividing the number of affected males by that of affected females) do not adequately capture the increased risk associated with male sex. Indeed, a better measure is the prevalence odds ratio (obtained by dividing the prevalence in males by the prevalence in females, hence the terminology of ‘prevalence odds ratio’ or POR). The difference between the sex ratio and the POR is that the POR adjusts for the relative sizes of the unaffected male and female populations under study. For example, in the New Zealand study (Table 1; Bowden et al., 2020), the sex ratio is 4.59:1 when calculated as the ratio of affected males to affected females (2577/561); however, the prevalence odds ratio decreases to 4.34 if the prevalence in males (2577/163,185) is divided by the prevalence in females (561/154,236). This change reflects the slightly higher proportion of males (51.4%) than of females in the underlying population. We have nonetheless kept our reporting of the conventional sex ratio because: (a) details about the male and female population denominators are not always available in published articles whereas the sex ratio is routinely reported or can be calculated; (b) when population denominators by sex are available, simple calculations of the POR (as described above) may be erroneous in complex survey designs where survey weights should be applied separately for each sex to account for unequal sampling fractions and participation rates at different survey phases; and (c) using the sex ratio will facilitate comparisons since it is a widely reported metric.
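The arithmetic behind these two measures can be reproduced in a few lines. The sketch below uses the New Zealand counts quoted in the text; the function names are ours, and the quantity the text calls the POR is computed exactly as described there, that is, as the prevalence in males divided by the prevalence in females. The helper `percent_male` converts a male:female ratio into the corresponding percentage of males among cases.

```python
def sex_ratio(male_cases, female_cases):
    """Conventional sex ratio: affected males per affected female."""
    return male_cases / female_cases


def prevalence_ratio(male_cases, male_pop, female_cases, female_pop):
    """Prevalence in males divided by prevalence in females
    (the measure referred to as the POR in the text)."""
    return (male_cases / male_pop) / (female_cases / female_pop)


def percent_male(ratio):
    """Percentage of males among cases implied by a male:female ratio."""
    return 100 * ratio / (ratio + 1)


# Counts quoted in the text for the New Zealand study (Bowden et al., 2020).
male_cases, male_pop = 2577, 163_185
female_cases, female_pop = 561, 154_236

print(round(sex_ratio(male_cases, female_cases), 2))            # 4.59
print(round(prevalence_ratio(male_cases, male_pop,
                             female_cases, female_pop), 2))     # 4.34
print(round(100 * male_pop / (male_pop + female_pop), 1))       # 51.4
print(round(percent_male(4.1), 1))                              # 80.4
```

The two measures coincide only when the male and female denominators are equal; here the larger male denominator lowers the male prevalence slightly and therefore lowers the ratio of prevalences relative to the simple ratio of case counts.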

After grouping surveys according to risk of bias, active/passive ascertainment and availability of IQ data, Loomes et al. (2017) reported a POR of 3.25 in 20 surveys with active case ascertainment, and of 3.32 in 17 surveys with low risk of bias. They concluded that the typical “4:1 male-to-female ratio is inaccurate” and that the true ratio is “lower than 3.5:1”. Furthermore, they interpreted this result as supportive of theories of female camouflage and systematic female underdiagnosis. As explained above, the analysis could not account for other characteristics that may modify the results, whether of survey design (sex-specific participation rates at different phases, survey weights) or of individual participants (e.g. sex-associated exclusion/inclusion criteria such as genetic disorders like Fragile X, other comorbidities, etc.). The subset of 20 studies had small sample sizes: there were fewer than 1900 ASD participants in the 20 surveys with active case ascertainment. Like any biological variable, sex ratio in autism studies has a sampling distribution and variability across studies is to be expected. The dispersion of sex ratios across surveys is well illustrated in Fig. 2. Further demonstrating this variability, the median sex ratio in 26 of the 31 surveys with available sex ratio data (representing ≈ 115,000 participants ages 4–12 with ASD) used to generate a current prevalence estimate (see above) is 4.07:1 with a range of 2.80:1 to 6.20:1 (IQR: 3.65–4.50). Likewise, in the most recent CDC survey (Maenner et al., 2020; Table 1), the male:female ratio ranged from 3.4:1 (Missouri) to 4.5:1 (Arkansas), with an overall value of 4.3:1. Thus, asymptotic convergence of sex ratios towards a central value matters more than any specific study estimate. Furthermore, the interpretation of Loomes et al.’s lower sex ratio as evidence of underdiagnosis in females was unsubstantiated, as are several corollary claims linking female camouflaging and underdiagnosis (Fombonne, 2020).
Besides, in school surveys of ASD where new, previously undiagnosed, ASD cases were identified, we found no evidence that more females than males were previously undiagnosed (see Table 3, right column); if anything, the trend was for even higher male:female ratio among newly diagnosed participants as would be expected in samples of school children without intellectual disability.
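To give a sense of how much a sex ratio can vary from sampling alone, the sketch below computes an approximate 95% interval for the male:female ratio from the numbers of affected males and females in a single survey. It is a simple Wald interval on the log scale that treats the cases as a simple random sample and ignores survey weights, so it is only a rough guide; the case counts are hypothetical and the function name is ours.

```python
import math


def sex_ratio_interval(males, females, z=1.96):
    """Male:female ratio with an approximate Wald interval on the log scale.
    The standard error of log(males/females) is sqrt(1/males + 1/females)."""
    ratio = males / females
    se_log = math.sqrt(1 / males + 1 / females)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical surveys with the same 4.1:1 ratio but different numbers of cases.
for males, females in [(82, 20), (820, 200), (8200, 2000)]:
    ratio, lower, upper = sex_ratio_interval(males, females)
    print(f"{males + females:>6} cases: {ratio:.2f}:1 "
          f"(approx. 95% interval {lower:.2f} to {upper:.2f})")
```

With about 100 cases, an observed ratio of 4.1:1 is compatible with values from roughly 2.5:1 to 6.7:1; with about 1,000 cases the interval narrows to roughly 3.5:1 to 4.8:1. Differences between individual surveys of modest size should therefore be interpreted cautiously.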

Our review does not therefore support the hypothesis that the male preponderance in ASD has been overestimated nor that it has changed over the last 50 years. Indeed, the ratio of 4 males to 1 female remains a robust characteristic of ASD both in epidemiological and clinical samples.

Age Considerations

When evaluating surveys, careful consideration should be paid to the age range of included participants. Surveys have generally focused on school-age children and there are reasons why this is a good sampling choice. By ages 6–10, diagnoses can be verified and validated with robust instruments and methods. At lower ages, some children will be missed since the age of diagnosis is often delayed up to primary school entry or later. At older ages, some improvements in milder forms of the autism phenotype can pose difficulties for both identification and diagnostic confirmation. Importantly, a reason to focus on primary school-age is that, in most countries, school attendance is compulsory after age six which allows comprehensive, publicly available sampling frames to be used by survey researchers. In addition, most children with autism show some impaired functioning for learning and adaptive behavior that makes them eligible for school special support services, rendering them easier to identify in surveys.

In some studies that capitalize on existing databases or registries, prevalence estimates may be biased towards lower values when denominators include infants, toddlers or older adults. For different reasons, those age groups are less likely to be diagnosed with autism: infants and toddlers simply because they have little or no likelihood of having already been diagnosed, and adults because of secular changes in awareness and ASD identification. Therefore, inclusion of very young or adult age groups in prevalence calculations is not recommended as it will bias the prevalence estimate towards lower values. For example, in their analysis of the German national health insurance database, Bachmann et al. (2018) report for 2012 a prevalence of 0.38% when considering the whole age range 0–24. However, the prevalence in age groups <1 and 18–24 was much lower (about 0.11% and 0.18%) and a more accurate population estimate was 0.60%, obtained for the 6–11 year old age group. Thus, while it may be useful for descriptive purposes to report prevalence at different ages, prevalence derived from school-age samples is likely more valid and accurate to inform service planning and public health policy.
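The dilution effect described above follows directly from the fact that a pooled prevalence is a population-weighted average of the age-specific prevalences. The sketch below illustrates this with entirely hypothetical counts (they are not the German data, for which age-specific denominators are not given here); the age bands and the function name are ours.

```python
# Hypothetical (cases, population) per age band; not data from any cited study.
age_bands = {
    "0-5":   (400, 200_000),
    "6-11":  (1_200, 200_000),
    "12-17": (1_000, 200_000),
    "18-24": (450, 250_000),
}


def pooled_prevalence(bands, include=None):
    """Pooled prevalence over the selected age bands (all bands by default)."""
    selected = bands if include is None else {k: bands[k] for k in include}
    cases = sum(c for c, _ in selected.values())
    population = sum(n for _, n in selected.values())
    return cases / population


for band, (cases, population) in age_bands.items():
    print(f"{band:>5}: {100 * cases / population:.2f}%")

print(f"all ages 0-24: {100 * pooled_prevalence(age_bands):.2f}%")
print(f"ages 6-11 only: {100 * pooled_prevalence(age_bands, ['6-11']):.2f}%")
```

Because the youngest and oldest bands contribute many individuals but few diagnosed cases, the pooled figure falls well below the school-age figure even though nothing about the school-age group has changed.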

Yet, even within the school-age range, cross sectional surveys that sample different age groups sometimes exhibit age-associated differences in prevalence that are difficult to interpret. For example, in some surveys of relatively narrow age ranges (6–12), prevalence was at its maximum in children age eight or nine and lower at older ages which is inconsistent with autism being a lifelong disorder (e.g. Yeargin-Allsopp et al., 2003; Alshaban et al., 2019). Differences in sampling frames, participation rates, access to diagnosis and services or awareness could explain these results although these age effects remain often unexplained. Thus, age trends in prevalence are best evaluated in surveys that provide age-specific lifetime prevalence rates in cohorts followed over time rather than in cross-sectional surveys of contiguous birth cohorts. Typically, and reflecting age-related patterns in the diagnosis of ASD, S-shaped curves portray low prevalence in preschoolers, followed by a steady increase through primary school age and progressive plateauing at older ages. An example of such a pattern can be found in a recent Italian study where prevalence in 2001–2003 birth cohorts rose steadily with age from 0.40% among 3–5 year olds to 0.96% among 9–11 year olds and to 1.19% among 15–17 year olds (Valenti et al., 2019). These trends in age-specific prevalence must be interpreted in the context of the specific survey methodology. Surveys that rely mostly on passive counts of already diagnosed cases will yield school-age prevalence figures that likely underestimate the population prevalence at that age. For example, in the new Canadian surveillance study (Ofner et al., 2018), only 72% of those participants diagnosed by age 17 had been diagnosed by age 8 although only 10% were diagnosed after age 12. The CDC ADDM methodology circumvents the problem related to late diagnoses of ASD by allowing new cases to be confirmed in previously undiagnosed children at age eight. 
Of note, consistent with the Canadian data, about 20% of the case load of CDC ADDM surveys correspond to such cases. In general, surveys that are designed to identify yet-undiagnosed cases should yield more accurate prevalence estimates at any age and exhibit less marked age effects.

Adult surveys are still scarce. Pioneering studies were performed in England on combined samples of adults living in typical households or in accommodations for adults with ASD and ID (Brugha et al., 2016). These authors reported a prevalence of 1.1% with no variation across different age bands. The prevalence was much higher in the subsample with moderate to severe ID, which also had a low male:female ratio compared to the usual male preponderance found in the sample without ID. This survey piloted thoughtful adult survey methodology (Brugha et al., 2012). Limitations were a low participation rate in the subsample with ID, and the small number of affected adults among those without ID. In the US, prevalence of autism among adults 18–65 years old registered in Medicaid in 39 states was reported recently (Jariwala-Parikh et al., 2019). In 2008, the overall adult prevalence was 0.37% and marked birth cohort effects were seen, as illustrated by the prevalence varying from 0.82% among 18–25 year olds down to 0.05% among 46–65 year olds. A few other studies have shown similar decreases of adult prevalence with age (e.g. Bachmann et al., 2018), the magnitude of which suggests that lack of awareness and diagnostic services for older cohorts, rather than differential mortality, accounted for this effect. Speaking to the importance of the population of adults with ASD, a simulation study by Dietz et al. (2020) estimated the national and state prevalence of ASD among US adults ages 18–84, taking into account prevalence data from the NSCH, mortality data for children and adults in the US, and the standardized mortality ratio that recapitulates the excess mortality in adults with ASD. The authors predicted that the current prevalence of ASD among adults over age 18 would be 2.21%, ranging from 1.97% in Louisiana to 2.42% in Massachusetts, and that 5.5 million adults were living with ASD at the national level.
However, such models depend on some assumptions and input data that are not necessarily correct. For example, Dietz et al. (2020) based their modeling on prevalence data for the 3–17 year old group obtained from the National Survey of Children's Health (NSCH), a survey that notably relies on unconfirmed parent reports (see above and Table 2).

There is no doubt that more surveys of adults with ASD are necessary, not only to estimate the prevalence or track time trends in prevalence but in order to identify patterns of psychiatric and medical comorbidity and unmet service needs of this growing fraction of the population (Hand et al., 2020; Fombonne et al., 2020).

Social Class, Race and Ethnic Minority Status

Studies of associations between ASDs and socioeconomic status (SES), race/ethnicity, and immigrant status have shown variable results and face numerous technical challenges. Given the broad research suggesting a relationship between socio-economic adversity and child mental health and developmental conditions (Kerker et al., 2015; Cooke et al., 2019; McDonald et al., 2019), one might expect ASD prevalence to be higher in children from population groups with less social privilege. However, the picture is complicated because studies that base diagnosis rates on developmental service utilization often undercount minority and low SES children. Such undercounting may be related to less access to health and educational services generally (Shi & Stevens, 2005) or to autism health services in particular (Kataoka et al., 2002). For instance, a recent analysis of children whose records were used in the CDC ADDM surveys showed that Black and Hispanic children were more likely to have missing health records than White children, which is problematic since these records are used to make prevalence estimates (Wiggins et al., 2020). Prevalence studies based on parent report of ASD diagnosis are also problematic, as parent report of ASD is more likely among families who have adequate access to ASD-related services. Minority and low SES families may also participate in research studies at disproportionately low rates (Rajakumar et al., 2009), and many studies do not report on sociodemographic variables at all (Broder-Fingert et al., 2019). Minority, immigrant, or low SES families also may be excluded from studies or incorrectly assessed if forms are not available in appropriate languages or if a language-congruent assessor is not available. As a result, it is likely that differences in autism prevalence by race/ethnicity and/or SES may better reflect disparate services access than true prevalence differences.

Socioeconomic Status

Socioeconomic status can be defined via parental education, income, parental occupation, or some combination of these factors. Over 20 studies have investigated associations between these factors and ASD prevalence. Recent U.S.-based studies suggest an association between higher SES and higher ASD prevalence. Using CDC ADDM Network data, Durkin et al. (2017) found a dose-response relationship between SES (defined as parental education) and ASD prevalence in all recent survey years, in White, Black, and Hispanic children. This difference remained present regardless of gender, prior ASD diagnosis, and source of records (i.e., medical only versus medical and educational). However, there was no significant difference among those with co-occurring intellectual disability, and the difference appeared to lessen in non-Hispanic children over time. Similarly, Dickerson et al. (2017) used U.S. Census tract data to show a negative association between neighborhood poverty level (as defined by median household income or proportion in poverty) and ASD prevalence at ADDM sites. A recent British study, using the Born in Bradford birth cohort, showed that children of mothers with higher educational attainment had twice the rate of autism diagnosis compared to children of mothers with lower educational attainment, but found no differences in diagnostic rates by household income or neighborhood material deprivation after controlling for maternal education (Kelly et al., 2019). An earlier British study done with a total population cohort in South Thames showed lower autism prevalence in children of parents with both lower education and lower scores on a composite SES indicator, although only education was significant after statistical adjustment (Baird et al., 2006). Of note, other international studies in areas where health care access is more equitable show conflicting results. For instance, Larsson et al. 
(2005) showed no difference in ASD prevalence according to SES, and in Sweden, higher ASD prevalence was seen among children with lower SES (Rai et al., 2012).

Race and Ethnicity

Many studies of racial/ethnic minorities show lower rates of ASD compared to White or European populations, although these differences appear to be narrowing in more current studies. Recent data from 8-year-olds in the CDC ADDM Network (Maenner et al., 2020) suggest an overall lower rate of ASD among Hispanic children (154 per 10,000) compared to White children (185 per 10,000) in the U.S. However, it is notable that prevalence increased in all groups, and that the prevalence differences previously found between non-Hispanic Black or Asian/Pacific Islander children and White children were generally no longer significant, although state-level disparities were found. Similar trends have been noted in 4-year-olds in the ADDM Network; however, the most recent waves of this survey suggest that racial/ethnic differences may be narrowing or even disappearing, particularly for non-Hispanic Black children (Christensen et al., 2019; Nevison & Zahorodny, 2019).

In studies outside of the United States, reports about racial/ethnic differences in ASD prevalence have been mixed, and most studies are not adjusted for SES, which makes it difficult to separate the unique effect of race/ethnicity from that of other confounders. In addition, what constitutes a minority race or ethnicity is quite variable by country. The Born in Bradford study showed that children of Pakistani heritage (about 50% of the cohort) were about 70% less likely to have an autism diagnosis compared to children of White British mothers, and this difference persisted after adjustment for maternal educational attainment (Kelly et al., 2019). In Israel, Davidovitch et al. (2013) and Jaber et al. (2018) both conducted studies showing a lower prevalence of ASD in ultra-Orthodox Jews than in the general Israeli population. Davidovitch et al. also found a lower prevalence in Israeli Arabs, but Jaber et al. found no difference. Levaot et al. (2019) found lower prevalence in Bedouin-Arab compared to Jewish children in southern Israel. Findings from a 1999–2003 census report in Stockholm, Sweden (Barnevik-Olsson et al., 2010), revealed that the prevalence rate of PDD with learning disability was higher in Somali than in non-Somali Swedish children. In a US study in Minneapolis, Somali children with ASD were also reported to have more frequent intellectual disability although there were no differences in overall ASD prevalence between Somali and White children (Hewitt et al., 2016). Finally, in Western Australia, children of Indigenous mothers were found less likely to carry an ASD diagnosis, and children of East-African Black mothers more likely to carry an ASD diagnosis, compared to children of White mothers.

Implications and Unmet Research Needs

Overall, the research findings related to low SES and minority status primarily point to underdiagnosis due to limited access to health care services and limited health literacy. In order to obtain an accurate depiction of ASD prevalence in underserved populations, investigators will need to specifically reach out to these populations to ensure equal participation, as well as oversample these groups so that sample sizes are adequate. In addition, there is a need for validated screening and diagnostic tools in multiple languages to ensure that diagnoses, when they occur, are accurate. Finally, key variables in these analyses such as parental education, income, and race/ethnicity need to be directly and routinely measured.

Part III. The Development of World Studies of ASD and Population Surveillance

Worldwide Studies and Cultural Issues

It is beyond the scope of this chapter to review in detail the issues raised by the world emergence of surveys of autism and by the cross-cultural questions they pose. Two matters are addressed. First, is there evidence today that autism is either very rare or very abundant in some areas in the world? This question is important as geographical variation in incidence might provide important etiological clues either on genetic or environmental causation. The second question briefly touches upon variability across cultures of the expression of the autism phenotype and of its measurement, specifically as it applies to epidemiological surveys. Readers interested in a thorough reflection about cross-cultural issues are referred to the excellent conceptual framework proposed recently by de Leeuw et al. (2020).

The last 20 years have seen a welcome expansion of ASD epidemiological surveys of child populations, worldwide. Of the 197 world countries, prevalence estimates exist for only 37 countries (see above; and Table 1). Data are still lacking in many low- and middle-income countries, especially in Africa, South America, Russia, Caucasus and Central Asia. In many countries, lack of awareness and of diagnostic and intervention expertise persist alongside social stigmatization (e.g. Alshaigi et al., 2020; Yu et al., 2020). However, with the development of the internet and social media, and of advocacy organizations, it has become more difficult for governments to ignore the individual, familial and societal problems associated with autism and neurodevelopmental disorders in general. Epidemiological surveys are a natural starting point for developing clinical and research expertise on these conditions; and governments and their decision-making agencies understandably demand local, quantitative data to guide their service planning decisions.

Everywhere it has been investigated, autism has been found. Small case series appeared in the literature in 1972 for Africa and 1982 for China. Following these seminal clinical descriptions, basic surveys followed consisting of simple head counts that underestimated the prevalence as they only included diagnosed cases in areas where diagnostic services were scarce. As services expanded, prevalence increased; for example, in Oman where specialized autism services were recently established, prevalence in Muscat rose to 0.37% (Al-Mamari et al., 2019) compared to a previous Omani estimate of 0.014% (Al-Farsi et al., 2011). And when more fully-developed survey methods are deployed, prevalence in the neighborhood of 1% has been reported in countries as diverse as India, Qatar, Mexico or China (Table 1), figures that are commensurate to those from high-income western countries. However, the variability in survey methodology from one country to the other makes it impossible to draw inferences about underlying differences, if they exist, in true population prevalence. Therefore, with today’s available published data, there is no evidence that there are countries with either very low or very high autism rates, or meaningful between-country variations in prevalence. By the same token, true differences could exist and remain undetected with current methodological limitations.

Turning to the second issue, the similarity of the autism phenotype and of its clinical presentations across cultural groups has been rather striking in our experience of conducting studies in varied cultural settings. Across countries, investigators have relied upon international diagnostic criteria and employed them without difficulty. Diagnostic tools such as the Autism Diagnostic Interview Revised (ADI-R) and the Autism Diagnostic Observation Schedule (ADOS) have now been translated into multiple languages, and implemented successfully in survey diagnostic confirmation phases (e.g. Kim et al., 2011; Fombonne et al., 2016; Alshaban et al., 2019; Zhou et al., 2020). Investigators from the Korean study specifically examined the cultural applicability of the ADOS and ADI-R in diagnosing autism in Korean children and concluded that both DSM diagnostic criteria and scores on standardized diagnostic tools performed well in that population (Kim et al., 2016). Thus, it appears that the concept of autism has some universality even though it might be labelled and named differently in some cultures (for example, “Takiwātanga” among Maoris of New Zealand which means “in his/her own time and space”; www.tepou.co.nz and Bowden et al. (2020)).

Even though a common concept of autism is identified, it remains possible that differences across cultures in the expression and measurement of its manifestations may occur. Indeed, some cultural adaptations of autism tools have been necessary here and there. In China, the birthday party task of the ADOS Module 1 needed to be replaced by an equivalent task since birthday parties are not part of the familial traditions. In South Africa, the screwdriver toy of the Toddler ADOS needed to be removed when used in townships where this particular tool is commonly associated with violence and murder (de Vries, personal communication). In several Asian countries, eye contact from children to adults is discouraged (although Kim et al. (2016) disputed that claim for Korean children) and rules for appropriate social behavior emphasize compliance in children. Chinese parents do not normally expect their child to reciprocally imitate facial expressions or to point at objects to show interest, which may reduce the predictive validity of some items of the M-CHAT (Zhang et al., 2006) or of other screening instruments. In turn, these different cultural expectations in child rearing may require an adjustment in the professional definition and evaluation of reciprocity in social interactions. For example, we previously adapted a version of the Social Communication Questionnaire in Inuktitut to use as a screening tool among Inuit communities of Northern Canada, only to discover that, to mean ‘No’ or ‘Yes’, wrinkling the nose or raising the eyebrows are often used instead of the conventional shaking and nodding of the head (Fombonne et al., 2006b). Comparisons of Indian, English and Japanese children on the Autism Quotient showed that some items perform differently in some cultural groups (Carruthers et al., 2018). The item ‘Enjoys social occasions’ performed poorly with Indian parents who typically raise their children with strong expectations for social conformity.
Likewise, compared to Greek and Italian counterparts, US toddlers endorsed social interaction difficulties at higher frequencies on a toddler autism screener (Matson et al., 2017). In pioneering observations, Lotter (1978) reported a generally lower frequency of stereotyped behaviors, rocking and hand flapping in African samples. In the US, higher frequency of endorsement of routines and rituals, preoccupations with parts of objects and sensorimotor difficulties were documented in White compared to Black autistic children in record reviews at one CDC survey site (Sell et al., 2012); in other studies, Black children were reported to have more co-occurring ADHD symptoms than White children (Jarquin et al., 2011; Jo et al., 2015). Yet, direct observations of larger samples of Black and White children in the US evaluated with the ADOS did not confirm these differences (Fombonne & Zuckerman, 2021).

Overall, the reported differences across cultural groups are inconsistent and of small magnitude; to date, reports of cultural variation in symptom expression are best viewed as preliminary and require replication in larger samples after proper adjustment for background factors such as age, gender, language and cognitive level as well as for method of data collection. Nevertheless, these preliminary observations call for appropriate cultural sensitivity in working across cultures and may necessitate the occasional change in questionnaire item wordings or testing apparatus. While a single item’s performance might change according to cultural context, it appears that tools, in their totality, maintain measurement properties comparable to those established in Western countries where they were developed. For example, when screening tools were calibrated in local samples, the score distributions and performances of the Social Communication Questionnaire (SCQ) or the Social Responsiveness Scale (SRS) were comparable in Qatar and Saudi Arabia (SCQ) and Mexico (SRS) to original UK and US studies (Fombonne et al., 2012; Aldosari et al., 2019). Finally, the magnitude of cultural effects on item or criterion endorsement and on discriminant power does not appear to be larger than that already reported for gender, age, language or intellectual level within culturally homogeneous samples, although formal comparisons of effect sizes remain to be performed.

Thus far, examples of cross-cultural differences in ASD symptom profiles remain largely anecdotal and a systematic investigation of differences in the expression and measurement of the autism phenotype across cultures remains to be conducted. Cross-cultural comparisons have been performed in other areas of psychopathology, e.g. the WHO world studies of schizophrenia in the 1970s, the US-UK comparisons of ADHD diagnostic approaches in the 1980s, and more recently, cross-national comparisons of child psychopathology measured with the Child Behavior Checklist (Rescorla et al., 2007) or the Strengths and Difficulties Questionnaire (Kovess-Masfety et al., 2016). Investigators who are embarking on autism surveys should keep in mind that their research data could be leveraged by joining international collaborations designed to test more systematically the transcultural robustness of the autism phenotype and of its measurement.

Databases, Ad Hoc Surveys and Surveillance

Prevalence studies of autism vary in their methodological complexity, feasibility, duration, generalizability and costs. The datasets used to generate prevalence estimates are not comparable across studies and their respective merits and limitations should be recognized. For convenience, we grouped them into three types: administrative databases and registries, cross-sectional surveys, and surveillance programs.

Studies that use existing databases with routinely collected health information provide an easy opportunity to generate preliminary prevalence estimates for a given population. Investigators have used health insurance databases (e.g. Segev et al., 2019; Bachmann et al., 2018), educational databases (e.g. Gurney et al., 2003; Thomaidis et al., 2020) or regional or national registries (e.g. Delobel-Ayoub et al., 2020; Valenti et al., 2019). Such data sources have distinct advantages: they do not require costly data collection efforts, they have large and representative samples, they incorporate follow-up updates to clinical information allowing estimation of cumulative incidence or prevalence at different ages, they encompass cohorts born over long periods permitting detection of secular changes, they may include well-suited comparison groups of participants without ASD, and they may sometimes be merged with other databases containing more detailed health or socio-demographic information. Their limitations include reliance on electronic diagnoses/categories that cannot be verified, case definitions that reflect prevailing professional practice rather than research-informed concepts, inability to capture undiagnosed or misdiagnosed participants, and intake that is contingent upon changing health or educational policies that in turn directly influence prevalence estimation.

A second type of survey is the cross-sectional investigation performed at one point in time in a given area or population. In the last 15 years, with increasing worldwide awareness, government authorities in low- to middle-income countries have initiated such studies, often after extensive lobbying by influential individuals and newly formed local family associations supported by advocacy organizations such as Autism Speaks, the World Health Organization or grassroots non-governmental organizations (Rosanoff et al., 2015; Hoekstra et al., 2018). The goal of these surveys is to generate an initial local prevalence estimate to gauge the magnitude of the health problem and to provide necessary information to decision-makers in charge of service planning. These ad hoc surveys provide a useful baseline against which surveys in other geographical areas or in the future can be calibrated. In addition to yielding a prevalence figure, carefully collected data can add value in describing trajectories of children with ASD in the local health and educational system, performing case-control comparisons of risk factors, developing locally validated new tools for screening and diagnosis, collecting genetic specimens (e.g. saliva samples) and biomarkers in searchable repositories, and creating an opportunity to follow up a population-based sample in order to study factors associated with later outcomes. It is not uncommon for such surveys to take 4–5 years to complete, from initial planning to final results. Idiosyncrasies of local health and educational systems, and differences in levels of awareness, engagement and expertise, result in major differences in survey design that ultimately make comparisons across surveys hazardous.

Finally, some countries have deployed programs aiming at monitoring autism occurrence in their population, often alongside surveillance of other developmental disabilities. A well-known US surveillance program, the Autism and Developmental Disabilities Monitoring (ADDM) Network (https://www.cdc.gov/ncbddd/autism/addm.html), was launched in 2000 by the CDC. The ADDM network comprises up to 16 sites that have estimated the prevalence of ASD among eight-year-old children about every two years. The methodology of the ADDM relies on a systematic review of health and education records that is relatively cost-effective (children are not assessed in person) and allows children without a prior diagnosis to be counted as cases if the behavioral pattern described in records meets criteria for the surveillance case definition (see Van Naarden Braun et al. (2007) for details of ADDM methodology; and a good summary in Baio et al., 2018). Since 2014, the ADDM has also tracked ASD among four-year-old children. The most recent ADDM survey yielded a prevalence of 1.85% among eight-year-olds (Maenner et al., 2020; Table 1), with, for the first time, similar prevalence in White and Black children (1.85% and 1.83% respectively) but still lower prevalence in Hispanic children (1.54%), a male:female prevalence ratio of 4.3:1, a 33% frequency of associated intellectual disability, and an average age at diagnosis of 4.25 years for the 74% of children diagnosed prior to the survey. ADDM surveys have been useful in tracking the prevalence and associated characteristics of ASD in the US population over time. Of note, ADDM surveys rely on convenience samples that are not nationally representative; in addition, the geographical distribution of ADDM sites has varied over time, complicating the assessment of time trends. Other limitations of the ADDM methodology have been discussed elsewhere and include particular concerns about the validity of the surveillance case definition (Mandell & Lecavalier, 2014; Fombonne, 2018).

Repeated national surveys conducted in the US (National Health Interview Survey (NHIS); National Survey of Children’s Health (NSCH)) have also been used to track prevalence in the US population over time. The strengths of national surveys lie in their sampling methodology and representativeness, and their inclusion of wider age ranges, but as discussed above, they are seriously limited by the case definition they employ (see also Table 2). The CDC maintains a visualization tool that allows comparisons of these different data sources in the US (www.cdc.gov/ncbddd/autism/data/index.html).

In Canada, a newly formed National Autism Surveillance System (NASS) has released its first results, for 2015, concerning almost two million children aged five to 17, using administrative data from seven Provinces and Territories (Ofner et al., 2018; Table 1). The NASS case definition relies on ICD- or DSM-derived ASD diagnoses provided or confirmed by licensed health care professionals. The prevalence was 1.52% with a male to female ratio of 4:1 at all ages; 56% of the 29,099 cases had been diagnosed by age six, 72% by age eight, and 92% by age 12.
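
As a rough illustration of how a prevalence figure of this kind relates to its numerator and denominator, the following Python sketch back-calculates the denominator implied by the two NASS figures quoted above (29,099 cases, 1.52%) and attaches a Wilson score interval. The back-calculated denominator is an assumption made only for this example; it is not the denominator published by NASS, and the function name `wilson_interval` is our own.

```python
import math


def wilson_interval(cases, population, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    p = cases / population
    denom = 1 + z**2 / population
    centre = (p + z**2 / (2 * population)) / denom
    half = z * math.sqrt(
        p * (1 - p) / population + z**2 / (4 * population**2)
    ) / denom
    return centre - half, centre + half


cases = 29_099       # number of cases quoted in the text
prevalence = 0.0152  # prevalence quoted in the text

# ASSUMPTION: the denominator is back-calculated from the two figures above
# for illustration only; it is not the published NASS denominator.
implied_population = round(cases / prevalence)

low, high = wilson_interval(cases, implied_population)
print(f"implied denominator: {implied_population:,}")
print(f"prevalence: {cases / implied_population:.2%} "
      f"(95% CI {low:.2%} to {high:.2%})")
```

With a denominator of this size the sampling interval is very narrow, which suggests that uncertainty in such administrative estimates is driven more by case definition and ascertainment than by sampling error.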

In Europe, 14 countries of the European Union have engaged in a large multifaceted cooperative program to develop early detection programs, validate biomarkers, train professionals, improve support for adults and propose policies (Autism Spectrum Disorders in the European Union (ASDEU); www.asdeu.eu). Another objective is to investigate the prevalence of autism in 12 countries using a methodology that focuses on school age, identifies diagnosed children as well as undiagnosed ones through school surveys, relies on common instrumentation (SCQ, ADI-R, ADOS), and uses in-person assessments. Additionally, exploration of European regional and national registries is being pursued as a complementary strategy (Delobel-Ayoub et al., 2020). Details on the methodology can be found on the ASDEU web site and in publications (Boilson et al., 2016; Narzisi et al., 2020; Fuentes et al., 2020). Although it is too early to evaluate the efficacy and success of NASS and ASDEU as surveillance programs for Canada and the EU, the increasing interest in establishing population surveillance of ASD is noticeable in several high-income countries.

Conclusions

Epidemiological studies of ASD have expanded worldwide, with a median estimate of 1.14% providing a conservative figure for ASD population prevalence. Comparisons of results across studies should be made with extreme caution due to irreducible heterogeneity pertaining to case definition and ascertainment strategies unique to each survey. Definitions of ASD used in population surveys often do not coincide with those required in rigorous clinical research protocols such as randomized clinical trials or molecular genetic investigations. Survey definitions are influenced by the need to comprehensively capture cases (optimizing sensitivity) and to estimate service needs for developmentally impaired children (which may come at the price of specificity). The addition of a survey component in which mainstream schools are surveyed has consistently shown that general school screening is required if a comprehensive picture of ASD is to be provided. However, screening schemes for school samples need to be further researched, available screeners compared, and more cost-effective approaches properly designed and evaluated.
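
The trade-off between sensitivity and specificity mentioned above can be made concrete with a minimal Python sketch. It uses the standard relationship between true and apparent prevalence for an imperfect case definition, together with the Rogan-Gladen correction. The sensitivity and specificity values are hypothetical and chosen only for illustration; the 1.14% input is the median estimate quoted in the text, used here simply as a plausible starting value. None of this reproduces the method of any particular survey.

```python
def apparent_prevalence(true_prev, sensitivity, specificity):
    """Prevalence a survey would observe with an imperfect case definition."""
    return sensitivity * true_prev + (1 - specificity) * (1 - true_prev)


def rogan_gladen(apparent_prev, sensitivity, specificity):
    """Correct an apparent prevalence for known sensitivity and specificity."""
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (apparent_prev + specificity - 1) / denom
    return min(max(estimate, 0.0), 1.0)  # clamp to the valid range [0, 1]


true_prev = 0.0114  # median estimate quoted in the text, illustration only

# HYPOTHETICAL (sensitivity, specificity) pairs, not taken from any survey.
scenarios = [(0.80, 0.999), (0.95, 0.99)]

for se, sp in scenarios:
    observed = apparent_prevalence(true_prev, se, sp)
    corrected = rogan_gladen(observed, se, sp)
    print(f"Se={se:.2f} Sp={sp:.3f} "
          f"observed={observed:.2%} corrected={corrected:.2%}")
```

Because ASD is relatively uncommon, even a small loss of specificity adds false positives drawn from the large unaffected population, so a more sensitive but less specific definition can inflate the observed figure well above the true one.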

Several countries are now considering the implementation of national registries or surveillance programs that will help track trends in prevalence and incidence of ASD in their populations in the future. To improve these programs, several additional features could be considered. First, more extensive validation of cases included in household surveys or registries would be beneficial. For example, validation of parental reports in the NSCH and NHIS, even on subsamples, could considerably augment their usefulness considering their acknowledged strengths in sampling and representativeness. Second, incorporating a follow-up of samples recruited as part of the ADDM network studies (now being planned) and of other cross-sectional surveys would provide critical information about diagnostic stability and developmental trajectories as well as their predictors. Accordingly, when designing new surveys, investigators should plan ahead and implement ethically approved policies authorizing participant re-contacting in future investigations. Third, broadening the focus of surveys to the larger realm of neurodevelopmental disorders would increase their public health relevance and would also allow examination of important questions of boundaries and overlap between the autism phenotypes and other developmental disorders (motor, language, ADHD, etc.) and genetic syndromes. Premature and arbitrary decisions on what to include in and what to exclude from the definition of autism have historically proven to be detrimental to scientific enquiry. It is true that epidemiology appreciates binary codes and states (diseased/not diseased) that are necessary for prevalence calculations. Yet, there is more to epidemiological studies than calculating a proportion; inclusion of dimensional measurements of disease-related constructs and of co-occurring phenomena and risk factors in population-based samples would go a long way to advance current debate about the autisms and overlapping phenotypes.
Fourth, systematic incorporation in survey protocols of standardized measures of behavioral problems and psychiatric disorders should be considered both at the screening stage and at the diagnostic confirmation stage. As discussed above, co-occurring behavioral problems influence the performance of autism screening and diagnostic tools in a way that can only be elucidated with contemporaneous and separate measurement of those problems. In children already diagnosed with autism, this approach will facilitate the assessment of comorbid disorders, while in children without a previous ASD diagnosis, it will provide the means to increase the specificity of a new ASD diagnosis. Surveys of school-age children, teenagers and adults would especially benefit from such additions to their instrumentation. Fifth, diagnostic criteria for ASD have changed over time and, with them, the case definitions used in epidemiological surveys and surveillance programs. In future studies, new definitions and criteria should be introduced while keeping prior criteria/definitions operational. This will allow for testing the impact on prevalence of the changes in those definitions and will preserve the possibility of evaluating time trends meaningfully. Sixth, to a large and unfortunate extent, surveys of autism and registries have failed to incorporate measures in the biological and genetic domains that are needed to tease apart the behavioral and cognitive heterogeneity of autism. Addressing the disconnect between epidemiological surveys of autism and studies of its biological mechanisms should be regarded as a priority for the future of autism epidemiology. Leveraging epidemiological investigations by systematically developing regional registries and repositories may respond to that need.

To summarize, we suggest that in planning future surveys and surveillance programs of ASD, investigators should systematically contemplate the possibility of enhancing their research protocols by expanding the scope of enquiry to include a broad array of neurodevelopmental conditions, including longitudinal follow-up extensions, collecting genetic samples, and adding neuroimaging and biological sampling so as to maximize the return of information for their professional community, the participants and their families, and their funders.