1 Introduction

According to the Centers for Disease Control and Prevention, 17% of children aged three to seventeen were diagnosed with a developmental disability between 2009 and 2017 [1]. Autism spectrum disorder (ASD) is a group of complex developmental conditions marked by ongoing difficulties in social contact, speech, and non-verbal expression, together with restricted or repetitive behavior [2]. The causes of ASD and the severity of its symptoms vary from person to person. About 1 in 270 people worldwide has been diagnosed with ASD [3]. In the United States of America alone, 1 out of every 54 children has been diagnosed with ASD. Hence, early detection of ASD creates awareness both in the family and socially, enables better care and less negligence for diagnosed individuals, and results in overall better psychological growth. Even though there may be few visible physical impairments, people with ASD suffer from significant psychological difficulties. Since there are no physical attributes quantifiable in lab tests, ASD diagnosis has been quite difficult until now. Doctors analyze communication, social, and behavioral development data to make a decision. The Diagnostic and Statistical Manual of Mental Disorders (DSM-5) [2] and the Autism Diagnostic Observation Schedule (ADOS) [4], the two most often used manuals, have made a difference in detecting ASD. DSM-5 defines two key domains of ASD in order to assess impairment: (1) communication and social interaction and (2) restricted interests and repetitive behaviors. ADOS evaluation, on the other hand, uses planned social situations to elicit target responses and interpersonal interactions, and is divided into four modules. These modules are matched to people according to their language and stage of development to guarantee that a varied range of behavioral events is covered. Nonetheless, the psychometric properties of each method are limited by their dependence on outdated diagnostic standards, the range of behaviors assessed, a focus on current functioning, and age.

The complex characteristics and symptoms of developmental and cognitive disorders complicate classification both in clinical decision making and in deterministic computational methods. Machine learning (ML) algorithms have been applied broadly to the study of developmental disorders, specifically ASD [5, 6]. Hyde et al. [7] addressed the effectiveness of utilizing ML for autism identification and reviewed several detection methods. These methods draw on behavioral and neuroimaging data, behavioral and developmental data, genetic data, and electronic health records. The reviewed methods include classifiers such as support vector machine (SVM), alternating decision tree (AD Tree), neural networks (NN), random forest (RF), logistic regression (LR), decision tree (DT), random tree (RT), Bayesian network (BN), naive Bayes (NB), and more. Our contributions in this paper are as follows:

  • We have prepared a questionnaire of 24 questions with 82 response fields, based on ASD symptoms found in previous studies;

  • We have conducted a survey in schools and communities leveraging the questionnaire and prepared a dataset;

  • We have found the most salient signs that distinguish ASD children from non-ASD children;

  • On the created dataset, we compared various machine learning classifiers.

The remainder of the paper is structured as follows: Sect. 2 reviews the literature, and Sect. 3 discusses the proposed methodology. Section 4 contains the experimental analysis, and Sect. 5 concludes the work.

2 Literature Review

Ample research has been conducted on ASD, its types, symptoms, and detection. Faras et al. [8] classified autism as a pervasive developmental disorder (PDD) and categorized ASD as autistic disorder (AD), Asperger’s syndrome (AS), childhood disintegrative disorder (CDD), pervasive developmental disorder-not otherwise specified (PDD-NOS), and Rett syndrome (RS). Biomarkers related to cognitive, behavioral, visual, and structural connectivity have demonstrated promise in several clinical screening and diagnostic procedures, like ADOS, DSM-5, Autism Diagnostic Interview-Revised (ADI-R), Developmental, Dimensional and Diagnostic Interview (3di), and Social Responsiveness Scale (SRS, SRS-2) [9]. Clinical standards, however, usually require the involvement of multidisciplinary teams in ASD diagnosis, and these processes need substantial amounts of time. Berument et al. [10] developed an autism screening questionnaire (ASQ) with 40 different ASD symptoms and tested a total of 200 individuals. The ASQ, however, distinguished less clearly between autism and other PDD types. Sadek et al. [11] investigated different categories for autism identification and analyzed various types of detection systems that use machine learning, computer vision, and neural networks. Rahman et al. [12] recommended several ways to accelerate data processing for detecting ASD using ML. They also looked into several techniques for identifying and processing imbalanced data in these detection techniques. Raj and Masood [13] combined three publicly available datasets and compared the performance of LR, SVM, NN, NB, and convolutional neural network (CNN), with a highest accuracy of 99.53%. Rule-based ML can also be used in autism screening, where it offers clinical professionals additional insight. Thabtah and Peebles [14] proposed such methods and tested them on adult, adolescent, and toddler datasets. Omar et al. [15] combined random forest-CART and random forest-ID3, evaluated the combination on a similar dataset, and then deployed the trained model in a mobile app. In more recent literature, Hossain et al. [16] tested 25 machine learning classifiers on a collected ASD dataset and concluded that SVM based on sequential minimal optimization (SMO) performs better in their experimental scenario. It is hard for health professionals to catalog all aspects of physiological and psychological activity [17]. Hence, a physiological outcome monitoring system that records continuous communication and behavioral changes provides insight into the patient’s well-being [18]. These systems also assist health professionals in monitoring development in different contexts [19].

3 Methodology

Symptoms: Seltzer et al. [20] discovered that patients with ASD tend to ask inappropriate questions, imitate spontaneously, lack interest in people, have difficulty sharing meals, and use objects repetitively. Faras et al. [8] explored red flags indicating ASD in participants and described delays in speaking, repetitive play with toys, and communication difficulties. In addition, there was a lack of facial expression, pretend play, imagination, interest in playing near peers on purpose, ability to comprehend sarcasm, and awareness of personal space. According to Baskin et al. [21], individuals with Asperger syndrome manifest an inflexible adherence to specific nonfunctional routines, schizophrenia, repetitive and stereotyped motor mannerisms, and limited fields of interest. Mirkovic and Gérardin [22] discovered that affected individuals engage with others who share similar interests, struggle to maintain and develop acceptable peer relationships, and prefer social isolation. Karabekiroglu et al. [23] researched PDD-NOS symptoms and discovered that the participants exhibit unusual non-verbal movement, lack of eye contact while interacting, hyperactivity and hostility, and inappropriate laughter. Snow and Lecavalier [24] identified a concern with rule-breaking and aggressive conduct, as well as anxiety and depression, among PDD-NOS patients. Mehra et al. [25] examined symptoms of childhood disintegrative disorder and discovered that the participants exhibited limited interest, lack of imagination, sleep problems, and decreased motor abilities. Elia et al. [26] observed that the diagnosis of autistic disorder can be based on the first REM delay, muscle twitch density, and rapid eye movement density. Repetitive behaviors may not be significant characteristics of autistic disorder; nevertheless, Militerni et al. [27] discovered that younger subjects demonstrated repetitive motor and sensory behaviors, whereas older subjects with higher IQ scores demonstrated complex repetitive behaviors. Hagberg et al. [28] discovered that the key clinical features of Rett syndrome are severe progressive dementia and unusual hand movements. Kyle et al. [29] investigated the four stages of Rett syndrome: slow head circumference growth, microcephaly, scoliosis, and wheelchair dependency.

Fig. 1

Symptoms of ASD and types differentiated by color

Questionnaire: Previous similar checklists such as M-CHAT [8] focus primarily on specific age groups. Although DSM-5 [2] gives an overview of ASD symptoms, it does not provide a direct questionnaire. Moreover, simply asking whether a symptom is present or absent may introduce human error, and the severity of the symptom may remain unclear. A scenario-based, severity-scaled question set was therefore required for determining ASD and its types. The questionnaire derived from the symptoms mentioned above is listed in detail in Fig. 1 with the relevant ASD types, which allows measurement across the whole spectrum of autism. The questionnaire consists of 24 questions, with 82 fields representing options for these questions. These questions enable discrimination of the three major components of autism. Each option was graded on a five-point scale. Age, gender, ASD type, and other miscellaneous questions were also added to the survey.

Participation and procedure: The survey’s main focus has been collecting data from people of various ages with clinically diagnosed ASD. The scenario of each question has been portrayed in such a way that it is relatable for all kinds of individuals: toddlers, children, adolescents, and adults. A Google Form has been prepared with multiple-choice options from the questionnaire. The form was sent to an autism-specialized school, doctors, and students for completion. The completed responses were each checked for anomalies. A separate form with the same questionnaire was sent to ordinary educational institutions. From these, only responses from participants with no prior disorders were retained, and they were labeled as neurotypical.

Fig. 2

Values of responses corresponding to the questionnaire

Dataset details and data distribution: The collection contains 71 data instances, each acquired from a different participant. The participants comprised 42 men and 29 women. Their ages ranged from four to twenty-seven, with an average of 18.8 years. The participants themselves filled out 38 forms, family members filled out 32 forms, and a health professional filled out one. Thirty-nine participants were neurotypical, while 32 were clinically diagnosed with ASD, including 16 AD, 4 AS, 4 CDD, 4 RS, and 4 PDD-NOS patients. Figure 2 represents the relationship of responses with the particular question, whereas color represents the value of each field. The last few questions in the proposed questionnaire delineate physical impairment, which is nonexistent for most ASD cases except for Rett syndrome. Hence, those fields mostly hold lower values. For the rest of the questionnaire, the values were evenly distributed.

Fig. 3

ASD symptoms with the highest correlation in the dataset

ASD Detection Using ML: Four machine learning techniques, namely support vector machine (SVM), k-nearest neighbors (KNN), random forest (RF), and artificial neural network (ANN), have been utilized for the classification of ASD and its types. SVM separates data into classes with hyperplanes whose position is determined by the support vectors, the data points closest to the decision boundary. One-vs-one has been selected as the decision function shape in SVM, which computes a hyperplane for two classes at a time. A radial basis function (RBF) has been utilized as the kernel. KNN, on the other hand, classifies a data point according to the labels of its nearest neighbors, as measured by a distance or similarity. The number of neighbors for ASD classification is set to 20. RF is an ensemble classifier consisting of multiple decision trees, where each tree predicts the output and the final prediction is decided by majority vote. In the experiment, the number of estimators is set to 20, with a random state of 2 and a maximum depth of 15. SVM, KNN, and RF have been implemented using the Scikit-learn library. The proposed ANN consists of one input layer, three fully connected hidden layers, two batch normalization layers, two dropout layers, and an output layer. The hidden layers contain 32, 256, and 64 neurons, respectively. The first two hidden layers utilize the rectified linear unit (ReLU) as the activation function, whereas a sigmoid is used in the last hidden layer and the output layer. Categorical cross-entropy has been used as the loss function with the Adam optimizer. All of the ML classifiers have been executed for 100 epochs.
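The three Scikit-learn classifiers described above can be sketched as follows. This is a minimal illustration and not the authors' code: the survey data are not reproduced here, so a synthetic 71 x 82 matrix of 0 to 4 scores with a binary label stands in for them, and the stratified split and its seed are assumptions. Only the hyperparameters named in the text (RBF kernel, one-vs-one decision function, 20 neighbors, 20 estimators, maximum depth 15, random state 2) are taken from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the survey data (assumption): 71 respondents,
# 82 questionnaire fields scored 0-4, label 1 = ASD, 0 = neurotypical.
rng = np.random.default_rng(0)
y = np.array([1] * 32 + [0] * 39)
X = rng.integers(0, 5, size=(71, 82)).astype(float)
X[y == 1, :10] += 1.0  # make the first ten fields weakly informative

# 80/20 split; stratification and the seed are assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hyperparameters as stated in the text.
classifiers = {
    "SVM": SVC(kernel="rbf", decision_function_shape="ovo"),
    "KNN": KNeighborsClassifier(n_neighbors=20),
    "RF": RandomForestClassifier(n_estimators=20, max_depth=15, random_state=2),
}

test_accuracy = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    test_accuracy[name] = clf.score(X_test, y_test)
    print(f"{name}: test accuracy = {test_accuracy[name]:.2f}")
```

Because the data are synthetic, the printed accuracies say nothing about the reported results. The one-vs-one setting only matters when more than two classes are present, as in the classification of ASD types.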

4 Experimental Analysis

Correlation analysis assesses the extent and orientation of the relationship between input and output variables; in this case, values of each question and ASD categories. Figure 3 shows the top 18 symptoms with the highest correlation value, where blue represents negative correlation and red represents positive correlation.
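A correlation ranking of this kind can be sketched in a few lines. The text does not name the correlation coefficient, so Pearson correlation is an assumption here, and the field names and toy responses are hypothetical placeholders rather than items from the actual questionnaire.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def rank_fields(responses, labels, top_k):
    """Return the top_k fields ordered by absolute correlation with the label."""
    scores = {field: pearson(values, labels) for field, values in responses.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_k]


# Toy data (hypothetical): 1 = ASD, 0 = neurotypical, responses scored 0-4.
labels = [1, 1, 1, 0, 0, 0]
responses = {
    "avoids_eye_contact": [4, 3, 4, 0, 1, 0],
    "plays_near_peers": [0, 1, 0, 3, 4, 4],
    "sleep_problems": [2, 1, 3, 2, 1, 3],
}

ranked = rank_fields(responses, labels, top_k=2)
for field, r in ranked:
    print(f"{field}: r = {r:+.3f}")
```

Sorting by absolute value keeps strongly negative correlations in the ranking, which matches a figure that shows both positive and negative correlations.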

The severity responses in the dataset have been transformed into numerical values ranging from 0 to 4. The occurrence of any specific value was then determined using the mean value of the symptoms in ASD and neurotypical individuals. Figure 5 depicts the seven symptoms with the highest association between ASD types and neurotypical traits. To determine the most common symptoms among ASD categories, the correlation between ASD types and questionnaire fields was evaluated individually. In Fig. 6, the symptoms with the highest correlation are depicted with their mean values. Principal component analysis (PCA) makes datasets more interpretable while preserving most of their information. It accomplishes this by generating new uncorrelated variables that successively maximize variance. A two-component PCA of the ASD dataset is depicted in Fig. 7.
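The 0 to 4 encoding and the per-group means described above can be illustrated as follows. This is a sketch under stated assumptions: the wording of the five answer options is hypothetical (the text only states that a five-point scale is mapped to 0 to 4), and the records and field name are toy placeholders.

```python
from collections import defaultdict

# Assumed wording of the five-point scale; only the 0-4 range comes from the text.
SEVERITY = {"never": 0, "rarely": 1, "sometimes": 2, "often": 3, "always": 4}


def encode(answer):
    """Map a textual severity answer to its numerical value (0-4)."""
    return SEVERITY[answer.strip().lower()]


def group_means(records, field):
    """Mean encoded severity of one questionnaire field for each group."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum, count]
    for record in records:
        entry = totals[record["group"]]
        entry[0] += encode(record[field])
        entry[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Toy records (hypothetical), one per respondent.
records = [
    {"group": "AD", "repetitive_play": "Always"},
    {"group": "AD", "repetitive_play": "often"},
    {"group": "RS", "repetitive_play": "sometimes"},
    {"group": "neurotypical", "repetitive_play": "never"},
    {"group": "neurotypical", "repetitive_play": "rarely"},
]

means = group_means(records, "repetitive_play")
print(means)
```

Comparing such means across ASD types and neurotypical respondents is one simple way to see which fields separate the groups.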

Fig. 4

Comparison of performance metrics in testing data among ML classifiers

The dataset has been split into 80% for training and 20% for testing. Four machine learning models (SVM, KNN, RF, ANN) have been trained and tested on the accumulated dataset. Accuracy and F1-score have been calculated for model performance and comparison. Accuracy is the percentage of correctly predicted classes, both positive and negative. The F1-score is the harmonic mean of precision (correct positive predictions among all positive predictions) and recall (correct positive predictions among all actual positives, that is, true positives plus false negatives). Testing accuracy for SVM, KNN, RF, and ANN was 89%, 78%, 83%, and 89.8%, respectively, with training accuracy near 100% for all classifiers. The F1-scores achieved by SVM, KNN, RF, and ANN on the testing data are 86%, 73%, 83%, and 85%, respectively. Figure 4 depicts the comparison of accuracy, recall, precision, and F1-score among the ML classifiers. The evaluation metrics show that SVM and ANN perform noticeably better than KNN and RF for ASD classification. The epoch-wise test and train AUC and loss of the ANN are depicted in Fig. 8.
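The evaluation metrics named above can be computed directly from the confusion counts. The sketch below covers the binary case only (ASD versus neurotypical), which is a simplification of the multi-class setting; the label vectors are toy values, not results from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (hypothetical): 1 = ASD, 0 = neurotypical.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case there are three true positives, one false positive, one false negative, and five true negatives, so accuracy is 0.8 while precision, recall, and F1-score are all 0.75.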

Fig. 5

Mean value of symptoms relevant to ASD types and neurotypical

Fig. 6

Mean value of symptoms relevant to ASD types

Fig. 7

PCA analysis of dataset

Fig. 8

Epoch-wise area under the curve (AUC) and categorical cross-entropy loss of the ANN

5 Conclusion

Early and quick diagnosis of ASD allows early intervention and medical treatment, which significantly reduces the associated risks. ASD refers to a broad range of psychological deficits that differ in each individual. Hence, detecting autism is complicated by the need to consider all possible physical and psychological problems. In this article, we have accumulated and analyzed a wide range of ASD symptoms, then converted these symptoms into a scenario-based questionnaire. A survey has been conducted to collect data using the questionnaire. Correlation analysis and PCA were then used to identify the most prominent symptoms. SVM, KNN, RF, and ANN classifiers have been trained and tested for the classification task. Though the ML classifiers achieved good performance, the limited dataset size is a major limitation of this study. In the future, input such as video, voice, and image data that correspond to symptoms can be collected with an open-source platform.

Ethical Approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the Biosafety, Biosecurity, and Ethical Clearance Committee of Jahangirnagar University, Savar, 1342 - Dhaka, Bangladesh and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.