Introduction

Alzheimer’s disease (AD) is the most common form of dementia. It is an organic, progressive neurodegenerative disorder characterized by deficient episodic memory and impaired cognitive functions such as sensory gating, remembering, reasoning, and planning (Dierks et al. 1997; Jeong et al. 2001; Hiele et al. 2007; Bouwman et al. 2009; Thomas et al. 2010). It is one of the most prevalent diseases in the elderly, afflicting about 5 million individuals in the US alone and more than 26.6 million individuals worldwide (Jackson and Snyder 2008). Loss of neuronal synapses (Hampel et al. 2002; Adeli et al. 2005a), axonal degeneration, and neuronal cell body death (Fjell et al. 2010) in AD result in anatomical (structural) and physiological (functional) changes in various regions of the brain (Head et al. 2005). Although autopsy is the only way to reach a definite diagnosis of AD, researchers have been trying to diagnose AD using computer imaging technologies (Thompson et al. 1998, 2001; Jones et al. 2006; Stebbins and Murphy 2009), electroencephalograms (EEGs), and magneto-encephalograms (MEGs) (Adeli et al. 2005a). Since the microscopic changes and brain dynamics cannot be detected by computer imaging techniques (which have low temporal resolution), diagnosis of AD in its early stages through EEG and MEG appears to be more promising (Dierks et al. 1997; Lee et al. 2007; Osterhage et al. 2007; Kramer et al. 2007). Although many studies have examined frequency changes in EEG and MEG signals, such as the well-known EEG slowing (Giannitrapani et al. 1991; Adler et al. 2003), increased power of EEG in the gamma band (40–70 Hz) (Deursen et al. 2008), and increased 40-Hz steady-state response measured through MEG (Osipova et al. 2006) and EEG (Deursen et al. 2009), these studies cannot describe the highly complex dynamics of the brain of AD patients, where complexity features are influenced greatly by loss of synapses and by functional and structural changes.

Adeli et al. (2005a) presented a state-of-the-art review of research performed on computational modeling of AD and its markers based on the following approaches: computer imaging, classification models, connectionist neural models, and biophysical neural models. They concluded that a mixture of markers and a combination of novel computational techniques such as neural computing, chaos theory, and wavelets can increase the accuracy of algorithms for automated detection and diagnosis of AD. In a complementary article, Adeli et al. (2005b) presented a state-of-the-art review of models of computation and analysis of EEGs for diagnosis and detection of AD using time–frequency analysis, wavelet analysis, and chaos analysis.

Chaos theory provides two types of complexity measures: correlation dimension (CD) as a measure of complexity of a trajectory in its embedding space, and fractal dimension (FD) as a measure of self-similarity and complexity of a time series computed directly without reconstruction of the embedding space (Gomez et al. 2009; Rossello et al. 2009; He et al. 2010). Besthorn et al. (1995) report lower CD in AD EEGs than control EEGs in the entire band-limited EEG (0.5–30 Hz) in eyes closed resting state. Besthorn et al. (1997) report accuracy of 69.5%, in distinguishing AD from healthy subjects, based on the CD. Also, applying Principal Component Analysis (PCA) (Wu and Ben-Arie 2008; Lopez-Rubio and Ortiz-de-Lazcano-Lobato 2009) to a feature space consisting of CD, various frequency features (related to power of EEG sub-bands), and coherence, and adding the age of subjects as an auxiliary feature, they report an accuracy of 90%. Jelles et al. (1999) found lower CD in AD EEGs compared with healthy EEGs in three conditions: eyes closed, eyes open, and during an arithmetic task (p value = 0.023). Jeong et al. (2001) report reduced CD in the entire EEG of AD brain compared with healthy EEGs in most regions.

Using Hausdorff’s FD, Woyshville and Calabrese (1994) detected decreased complexity in occipital loci of AD patients based on studies of EEGs in the resting eyes closed condition. Recently, Gomez et al. (2009) reported reduced Higuchi’s FD (HFD) (Higuchi 1988) in the MEG of AD patients compared with healthy individuals, with a detection accuracy of 87.8%. MEG, however, is very sensitive to environmental noise and hence needs an isolation room and very specific amplifiers. Although EEG has some limitations, such as the effects of neighboring electrodes, it is a more accessible diagnostic device and, with automated signal processing techniques, can be powerful in diagnostic applications.

Recently, Adeli et al. (2008) presented a multi-paradigm spatio-temporal wavelet–chaos methodology for analysis of EEGs and their sub-bands for discovering potential markers of abnormality in AD. The nonlinear dynamics of the EEG and EEG sub-bands are quantified in the form of CD, representing system complexity, and the largest Lyapunov exponent (LLE), representing system chaoticity. They found that the wavelet–chaos methodology and the sub-band analysis developed in their research accurately characterize the nonlinear dynamics of any non-stationary EEG-like signals. They also concluded that changes in the brain dynamics are not spread out equally across the spectrum of the EEG and over the entire brain, but are localized in certain frequency bands and electrode loci.

Recently, a new and simple method was introduced to convert a time series to a graph, called visibility graph (VG), and its structure was shown to be related to fractality (self-similarity) and complexity of the time series (Lacasa et al. 2008). Since quantification of complexity and self-similarity of a graph does not need many nodes in the graph, computation of complexity of a time series does not need many time samples when it is converted to a graph.

The authors investigate whether VG can be used as an effective tool for diagnosing AD. In this article, a new chaos–wavelet approach is presented for EEG-based diagnosis of AD employing VG. Following Adeli et al. (2007), the approach presented in this paper is based on the research ideology that nonlinear features, such as CD or FD, may not reveal differences between the AD and control groups in the band-limited EEG, but may reveal noticeable differences in certain sub-bands. Hence, following Adeli et al. (2007) and employing their wavelet–chaos methodology, in this study the complexity of EEGs is computed using the VGs of EEGs and EEG sub-bands produced by wavelet decomposition. Two methods are employed for computation of the complexity of the VGs: one based on the power of scale-freeness of a graph structure (Lacasa et al. 2008) and the other based on the maximum eigenvalue of the adjacency matrix of a graph (Kim and Wilhelm 2008). Analysis of variance (ANOVA) is used for feature selection. Then, two classifiers are applied to the selected features to distinguish AD and control EEGs: a Radial Basis Function Neural Network (RBFNN) and a two-stage classifier consisting of PCA and the RBFNN.

Method

Figure 1 shows a flowchart for the research approach presented in the paper which includes: wavelet decomposition of the band-limited EEGs into four sub-bands (Adeli et al. 2003), mapping the EEG sub-bands to their VGs and computing associated complexities two different ways, statistical analysis of the extracted complexities to discover the most discriminating/distinguishing features for AD diagnosis, and a classification between AD and control groups based on the set of features selected across the various loci and sub-bands in the previous step.

Fig. 1 Flowchart for the research approach presented in the paper

Wavelet analysis

Following Adeli et al. (2007), using a three-level wavelet decomposition, the band-limited EEG (0–30 Hz) is decomposed into four sub-bands: beta (15–30 Hz) (β), alpha (8–15 Hz) (α), theta (4–8 Hz) (θ), and delta (0–4 Hz) (δ). Therefore, this step results in 95 {19 (channels) × 5 [4 (EEG sub-bands) + 1 (band-limited EEG)]} signals per subject for each state (eyes open and eyes closed).

VG

Recently, Lacasa et al. (2008, 2009) presented the VG and showed that such a graph inherits some of the properties of the time series: periodic series result in regular graphs, random series result in random graphs, and fractal series convert into scale-free networks. Scale-free refers to the property that the distribution of graph degrees (the degree of a vertex is the number of edges connected to it) follows a power law independent of the number of nodes. For example, in a scale-free graph whose degrees are powers of 2, one node may have \( 2^{5} \) edges while two nodes have \( 2^{4} \) edges each, four nodes \( 2^{3} \) edges each, 8 nodes \( 2^{2} \) edges each, and 16 nodes \( 2^{1} \) edges each. The few nodes with very high degrees are called hubs. To the best of the authors’ knowledge, this is the first application of the VG in EEG analysis.

The VG of the time series x is constructed in the following manner: the ith node of the graph, \( a_{i} \), corresponds to the ith point of the time series, \( x_{i} \). Two vertices (nodes) of the graph, \( a_{m} \) and \( a_{n} \), are connected via a bidirectional edge if and only if:

$$ x_{m + j} < x_{n} + \left( {{\frac{n - (m + j)}{n - m}}} \right)(x_{m} - x_{n} )\quad \forall \;j \in Z^{ + } ;\;j < n - m. $$
(1)

Figure 2 illustrates the procedure of converting time series x (Fig. 2a) to its VG (Fig. 2b). The gray line between \( x_{i} \) and \( x_{j} \) in Fig. 2a shows that \( x_{i} \) and \( x_{j} \) can see each other. If and only if \( x_{i} \) and \( x_{j} \) can see each other, the corresponding vertices of the VG, \( a_{i} \) and \( a_{j} \), are connected through a bidirectional edge. Some dynamical features of the time series, such as complexity and self-similarity, hidden in the signal may be revealed in the structure of the resulting VG (Lacasa et al. 2008, 2009).

Fig. 2 Illustration of converting a time series \( \{x_{i}\} \) shown in a to its visibility graph with node sequence \( \{a_{i}\} \) shown in b. The gray line between \( x_{i} \) and \( x_{j} \) in a shows that \( x_{i} \) and \( x_{j} \) can see each other. If and only if \( x_{i} \) and \( x_{j} \) can see each other, the corresponding vertices of the graph, \( a_{i} \) and \( a_{j} \), are connected through an edge. Gray lines and curves show the edges
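The visibility criterion of Eq. (1) can be sketched in a few lines of Python. This is a minimal, brute-force illustration rather than the authors' implementation; the function names `visibility_graph` and `degree_sequence` are hypothetical, and nodes are indexed from 0.

```python
from typing import List, Sequence, Set, Tuple


def visibility_graph(x: Sequence[float]) -> Set[Tuple[int, int]]:
    """Return the undirected edges (m, n), m < n, of the visibility graph of x."""
    edges: Set[Tuple[int, int]] = set()
    n_samples = len(x)
    for m in range(n_samples - 1):
        for n in range(m + 1, n_samples):
            # Eq. (1): every intermediate sample must lie strictly below the
            # straight line joining (m, x[m]) and (n, x[n]).
            visible = all(
                x[k] < x[n] + (n - k) / (n - m) * (x[m] - x[n])
                for k in range(m + 1, n)
            )
            if visible:
                edges.add((m, n))
    return edges


def degree_sequence(edges: Set[Tuple[int, int]], n_samples: int) -> List[int]:
    """Number of edges attached to each node."""
    degrees = [0] * n_samples
    for m, n in edges:
        degrees[m] += 1
        degrees[n] += 1
    return degrees


if __name__ == "__main__":
    series = [1.0, 3.0, 2.0, 4.0]
    graph = visibility_graph(series)
    print(sorted(graph))                        # [(0, 1), (1, 2), (1, 3), (2, 3)]
    print(degree_sequence(graph, len(series)))  # [1, 3, 2, 2]
```

Consecutive samples are always connected because there is no intermediate sample to block the view. The nested loops cost up to cubic time in the series length, which is acceptable for short segments but would call for a faster algorithm on long recordings.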

Complexity of VG

Complexity and fractality (self-similarity) of a time series can be estimated quickly from the complexity of its VG, in a manner similar to the computation of FD, without the need to reconstruct the state space, which requires a large number of sampling points. Two methods are investigated for measuring the complexity of VG in this research.

Power of scale-freeness in visibility graph (PSVG)

Using the VG algorithm, a periodic signal is mapped into an ordered graph in which all nodes connect to the same number of edges, and a fractal time series is mapped into a graph with a scale-free structure, which is characterized by \( P(k) = k^{-r} \), where k is the degree of a node, i.e. the number of edges connected to it, P is the probability distribution of node degrees in the graph, and r is called the power of the scale-freeness. Lacasa et al. (2008, 2009) show that the power of the scale-free structure (r) of the VG indicates the amount of the signal’s fractality, and that the slope of P(k) versus 1/k in a log–log plane indicates the FD of the signal.

The adjacency matrix of an unweighted bidirectional graph is a symmetric matrix indicating which nodes are connected together. The elements of this matrix are 1 or 0, indicating that the corresponding nodes are connected or not, respectively. Figure 3a shows a sample healthy EEG with 1,024 time samples. Figure 3b shows the degree sequence of the corresponding VG, which has 1,024 nodes (the same as the number of time samples). The ith element of the sequence indicates the degree of the ith node of the VG. Figure 4 shows the 1,024 × 1,024 adjacency matrix of the VG of the EEG shown in Fig. 3a, where black dots represent 1 (existence of an edge between the corresponding nodes) and white areas represent 0 (no edge). Since two sequential points of the time series can always see each other (according to the visibility definition of the VG algorithm described earlier), all sequential nodes are connected together. These edges fill the entries immediately next to the main diagonal of the adjacency matrix. In the adjacency matrix of Fig. 4, black dots are located mostly close to the diagonal of the matrix, but there are also a few dots far away from the diagonal. Such dots indicate that the corresponding node pairs are far from each other but are able to see each other. That is, such nodes are hubs of the graph, which correspond to points with large or small values in the time series. Their largeness or smallness (compared with other points) makes them visible to many other points, which in turn causes their corresponding nodes to be the hubs of the VG.

Fig. 3 a A sample healthy EEG and b degree sequence of its VG

Fig. 4 Adjacency matrix of the VG of the EEG shown in Fig. 3a (black dots represent 1 and whites represent 0)

In Fig. 5, values of \( \log_{2}[P(k)] \) are plotted versus \( \log_{2}(1/k) \) (a small log base of 2 is used because the number of time samples, 1,024, is relatively small, which makes it more suitable than a log base of 10), and a least squares fit is used to obtain the slope, known as PSVG, as a measure of the complexity and fractality of the EEG signal.

Fig. 5 Plot of \( \log_{2}[P(k)] \) versus \( \log_{2}(1/k) \) for the VG of the sample healthy EEG shown in Fig. 3, in a log–log plane. The dashed line is the least squares (LS) fit to the dots. The corresponding PSVG is slope = 1.75
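The PSVG estimate described above can be sketched as follows. This is an illustrative sketch only: it assumes the empirical degree distribution is fitted over every observed degree (the source does not state a fitting range), and the helper names `vg_degrees` and `psvg` are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def vg_degrees(x: Sequence[float]) -> List[int]:
    """Degree of each node of the visibility graph of x (criterion of Eq. 1)."""
    size = len(x)
    degrees = [0] * size
    for m in range(size - 1):
        for n in range(m + 1, size):
            if all(x[k] < x[n] + (n - k) / (n - m) * (x[m] - x[n])
                   for k in range(m + 1, n)):
                degrees[m] += 1
                degrees[n] += 1
    return degrees


def psvg(degrees: Sequence[int]) -> float:
    """Least squares slope of log2 P(k) versus log2(1/k)."""
    counts = Counter(k for k in degrees if k > 0)
    total = sum(counts.values())
    if len(counts) < 2:
        raise ValueError("need at least two distinct degrees to fit a line")
    xs = [math.log2(1.0 / k) for k in counts]
    ys = [math.log2(c / total) for c in counts.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic degree list obeying P(k) proportional to k^-2:
    # one node of degree 8, four of degree 4, sixteen of degree 2.
    synthetic = [8] + [4] * 4 + [2] * 16
    print(psvg(synthetic))
    # Degrees of a short toy series, fitted the same way.
    print(psvg(vg_degrees([1.0, 3.0, 2.0, 4.0, 1.5, 2.5, 0.5, 3.5])))
```

For the synthetic list the three points lie exactly on a line of slope 2, so the fit recovers the exponent of the power law. On real signals the tail of the distribution is sparse, and restricting the fit to a range of degrees is a common alternative.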

Graph index complexity (GIC)

As another measure of complexity of a graph, GIC (Kim and Wilhelm 2008) is investigated in this research for EEG-based diagnosis of AD. Let \( \lambda_{\max } \) be the largest eigenvalue of the adjacency matrix of a graph with n nodes. GIC is defined as follows:

$$ C_{{\lambda_{\max } }} = 4c(1 - c) $$
(2)

where

$$ c = {\frac{{\lambda_{\max } - 2\,{ \cos }(\pi /(n + 1))}}{{n - 1 - 2\,{ \cos }(\pi /(n + 1))}}}. $$
(3)

It can be shown \( C_{{\lambda_{\max } }} \) varies between 0 and 1 because the following inequality is true for all unweighted bidirectional graphs (Kim and Wilhelm 2008):

$$ 2\,{ \cos }(\pi /(n + 1)) \le \lambda_{ \max } \le n - 1. $$
(4)

The more complex the graph’s structure, the larger will be \( C_{{\lambda_{\max } }} \). In this research, application of this complexity measure to VG is called VGIC.
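Equations (2) to (4) can be evaluated with the short sketch below. It is not the authors' code: the largest eigenvalue is obtained here by power iteration on A + I, a choice made only to keep the example dependency-free and which assumes a connected graph (a VG is always connected because consecutive nodes are linked). The function names are hypothetical, and a library eigenvalue solver would be preferable for 1,024-node graphs.

```python
import math
from typing import List, Sequence


def largest_eigenvalue(adj: Sequence[Sequence[float]],
                       max_iter: int = 10000, tol: float = 1e-12) -> float:
    """Largest adjacency eigenvalue of a connected graph via power iteration."""
    n = len(adj)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(max_iter):
        # Multiply by (A + I); the shift keeps the iteration convergent even
        # for bipartite graphs, whose spectrum is symmetric about zero.
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        w = [c / norm for c in w]
        # Rayleigh quotient w^T A w estimates lambda_max.
        aw = [sum(adj[i][j] * w[j] for j in range(n)) for i in range(n)]
        new_lam = sum(w[i] * aw[i] for i in range(n))
        if abs(new_lam - lam) < tol:
            return new_lam
        v, lam = w, new_lam
    return lam


def graph_index_complexity(adj: Sequence[Sequence[float]]) -> float:
    """GIC of Eqs. (2) and (3) for a 0/1 symmetric adjacency matrix."""
    n = len(adj)
    if n < 3:
        raise ValueError("Eq. (3) needs at least three nodes")
    lower = 2.0 * math.cos(math.pi / (n + 1))   # lower bound in Eq. (4)
    c = (largest_eigenvalue(adj) - lower) / (n - 1 - lower)
    return 4.0 * c * (1.0 - c)


def star_graph(n: int) -> List[List[int]]:
    """Adjacency matrix of a star: node 0 linked to every other node."""
    return [[1 if (i == 0) != (j == 0) else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    print(graph_index_complexity(star_graph(5)))
```

The two extremes of Eq. (4) both give a complexity of zero: a path graph attains the lower bound (c = 0) and a complete graph attains the upper bound (c = 1), while intermediate structures such as the star graph fall strictly between 0 and 1.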

Statistical analysis and feature selection

One-way ANOVA is used to evaluate how well each PSVG or VGIC feature discriminates the groups, based on the variation both between and within groups, and to reduce the input dimension for the final classification step.

One-way ANOVA produces a number called p value to evaluate the discriminating ability of the feature. It varies between 0 and 1 where a p value close to 0 indicates the high ability of the feature to discriminate the groups and a p value close to 1 indicates high similarity and closeness of distribution of the groups. Using this statistical tool, the most discriminating PSVGs and VGICs to distinguish the AD patients from control subjects are found in both eyes closed and eyes open conditions using a threshold p value of 0.001.
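As an illustration of the statistic behind the p value, the sketch below computes the one-way ANOVA F ratio from between-group and within-group variation. It is a minimal sketch with a hypothetical function name and toy numbers, not study data. The p value itself is the upper-tail probability of the F distribution with the returned degrees of freedom, which is normally obtained from a statistics library and is not computed here.

```python
from typing import Sequence, Tuple


def one_way_anova_f(*groups: Sequence[float]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for two or more groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own group mean.
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


if __name__ == "__main__":
    # Toy feature values for two groups (illustrative numbers only).
    group_a = [1.0, 2.0, 3.0]
    group_b = [2.0, 3.0, 4.0]
    print(one_way_anova_f(group_a, group_b))  # (1.5, 1, 4)
```

A large F ratio means the group means differ by much more than the scatter within each group, which corresponds to a small p value and hence to a feature that would pass the selection threshold.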

PCA

PCA is a statistical method of mapping data, linearly, into a space with a lower dimension. The mapping is usually done through a linear singular value decomposition (Lipovetsky 2009). The main goal of applying PCA to an input data set and using the outcome of the PCA analysis as input to a classifier is dimensional reduction of the input space without compromising the accuracy of the classification (Ghosh-Dastidar et al. 2008). Principal components are eigenvectors of the covariance matrix of the data with the highest eigenvalues, which show directions of maximum variations in the input data. The principal components are considered as axes of the new input space.

In this research, the input data of the PCA are the discriminative features obtained from the ANOVA. The number of eigenvectors used to construct the new feature space is obtained by trial and error so that the number of features is sufficiently smaller than the number of data points. A very rough guideline is about 10% of the number of available data points.

Classification

In this step, the EEG data are classified into control and AD groups. The classifier used in this research is the RBFNN (Karim and Adeli 2003; Pedrycz et al. 2008; Anand et al. 2009; Savitha et al. 2009; Junfei and Honggui 2010; Ahmadlou and Adeli 2010; Wu et al. 2010) with PCA (PCA-RBFNN) and without PCA.

PCA-RBFNN consists of two stages: PCA and RBFNN. Inputs of the RBFNN are outputs of the PCA. The RBFNN consists of an input layer, one hidden radial basis function layer, and an output layer. The number of nodes in the input layer is equal to the number of principal components produced by PCA. The number of nodes in the hidden layer is found based on the data used for training the network to yield the most accurate results, as described in the Results section. The neuron of the output layer has a hard limiter function that determines the group to which the data belong. The outputs 1 and 0 correspond to the AD group and the control group, respectively. The weighted input to the radial basis transfer function of each hidden node is the Euclidean distance between the input vector and the weight vector of the links connecting the input neurons to that hidden node, multiplied by a bias value. The bias changes the spread of the radial basis functions, which can be tuned for different distributions of the data set in order to obtain the best classification results. A decrease in the distance between the weight vector and the input vector results in an increase in the output. The weights and biases are obtained by training the neural network to minimize the misclassification error in a mean square error (MSE) sense.
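The forward pass of such a network can be sketched as below. Several details are assumptions not stated in the text: the radial basis transfer function is taken to be the Gaussian exp(-z^2), the output neuron is taken to be a weighted sum followed by a hard limiter with threshold 0, and all names and numeric values are hypothetical. Training of the weights and biases is not shown.

```python
import math
from typing import List, Sequence, Tuple


def rbf_forward(x: Sequence[float],
                centers: Sequence[Sequence[float]],
                biases: Sequence[float],
                out_weights: Sequence[float],
                out_bias: float) -> Tuple[int, List[float]]:
    """Return (label, hidden activations); label 1 = AD, 0 = control."""
    hidden: List[float] = []
    for center, bias in zip(centers, biases):
        # Weighted input: Euclidean distance to the weight vector times the bias.
        dist = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, center)))
        z = dist * bias
        # Assumed Gaussian transfer function: largest when the distance is zero.
        hidden.append(math.exp(-z * z))
    net = sum(w * h for w, h in zip(out_weights, hidden)) + out_bias
    label = 1 if net >= 0.0 else 0   # hard limiter
    return label, hidden


if __name__ == "__main__":
    # Two hidden nodes with made-up weight vectors in a 2-D feature space.
    centers = [[0.0, 0.0], [3.0, 3.0]]
    biases = [1.0, 1.0]
    label, hidden = rbf_forward([0.2, -0.1], centers, biases, [1.0, -1.0], 0.0)
    print(label, hidden)
```

A larger bias narrows each radial basis function and a smaller bias widens it, which is how the spread is tuned to the distribution of the data.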

Data acquisition

The data set includes 19-channel EEGs (according to the 10–20 standard system) from two different groups of subjects: 7 healthy elderly subjects (control group; average age of 71, range 61–83) with no history of neurological or psychiatric disorder and 20 probable AD patients (average age of 74, range 53–85) diagnosed through the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA) and Diagnostic and Statistical Manual of Mental Disorders (DSM)-III-R criteria (Pritchard et al. 1991). The sampling rate of the EEGs is 128 Hz. EEGs are collected under two rest states: eyes open and eyes closed. Eight-second EEG segments (1,024 time samples) free from eye blink, motion, and myogenic artifacts are extracted from the EEG recordings (one segment for each subject). The EEGs are band-limited to the range of 1–30 Hz during the EEG recording and preprocessing stages.

Results

95 PSVGs and 95 VGICs corresponding to 19 (channels) × 5 (4 EEG sub-bands and a band-limited EEG) EEGs per condition (eyes open and eyes closed) per subject (7 elderly normal and 20 AD) are obtained after the wavelet analysis and VG computations.

Statistical analysis and feature selection

For neither the eyes closed nor the eyes open condition did the extracted PSVGs show significant discrimination (p value < 0.01) between the AD and normal aging groups in the band-limited EEG or in the beta and theta EEG sub-bands. However, the one-way ANOVA test showed the ability of PSVGs in the alpha and delta sub-bands to distinguish the AD and healthy classes. Table 1a and b present discriminative PSVGs and the corresponding p values (p value < 0.01) in the alpha and delta sub-bands, respectively. Figures 6 and 7 show the loci presented in Table 1a and b, respectively. The discriminative PSVGs in both sub-bands (alpha and delta) represent decreased FD in AD EEGs compared with elderly normal EEGs.

Table 1 Discriminating EEG channels obtained from PSVG (p value < 0.01)
Fig. 6 Loci with discriminative PSVGs of alpha sub-band for distinguishing AD and elderly normal EEGs (p value < 0.01), in both states noted by ( ), for eyes open only ( ) and for eyes closed only ( )

Fig. 7 Loci with discriminative PSVGs of delta sub-band for distinguishing AD and elderly normal EEGs (p value < 0.01), in both states noted by ( ), for eyes open only ( ) and for eyes closed only ( )

The extracted VGICs show significant discrimination (p value < 0.01) between the AD and normal aging groups in all EEG sub-bands except one. Table 2a and b present discriminative VGICs and the corresponding p values (p value < 0.01) in the eyes open (in all sub-bands) and eyes closed (in all sub-bands except beta) conditions, respectively. Table 2a shows that, in the eyes open condition, the discriminative VGICs are increased in the beta and theta sub-bands and decreased in the alpha and delta sub-bands in AD EEGs compared with elderly normal EEGs. Figures 8, 9, 10, and 11 show the loci presented in Table 2a and b for the beta, alpha, theta, and delta sub-bands, respectively.

Table 2 Discriminating EEG channels obtained from VGIC (p value < 0.01)
Fig. 8 Loci with discriminative VGICs of beta sub-band for distinguishing AD and elderly normal EEGs (p value < 0.01), in both states noted by ( ), for eyes open only ( ) and for eyes closed only ( )

Fig. 9 Loci with discriminative VGICs of alpha sub-band for distinguishing AD and elderly normal EEGs (p value < 0.01), in both states noted by ( ), for eyes open only ( ) and for eyes closed only ( )

Fig. 10 Loci with discriminative VGICs of theta sub-band for distinguishing AD and elderly normal EEGs (p value < 0.01), in both states noted by ( ), for eyes open only ( ) and for eyes closed only ( )

Fig. 11 Loci with discriminative VGICs of delta sub-band for distinguishing AD and elderly normal EEGs (p value < 0.01), in both states noted by ( ), for eyes open only ( ) and for eyes closed only ( )

Based on ANOVA test

When the one-way ANOVA test is used as an open loop feature selection method, a threshold on the p value is needed to find discriminative features so that the number of features is significantly smaller than the number of data points for a meaningful classification. A smaller p value means a higher ability to discriminate the AD and non-AD groups. Using a p value less than \( 10^{-3} \), five PSVGs (in loci C3 and F8 in the alpha band in eyes closed and T4, T6, and Cz in the delta band in eyes closed) and three VGICs (in loci P3 and Fz in the theta band in eyes open and T6 in the delta band in eyes closed) were found. The chosen threshold of \( 10^{-3} \) resulted in a sufficiently low number of features (8) relative to the 27 available data points (7 elderly normal and 20 AD) to yield statistically meaningful classification results. These features are identified in boldface in Table 2a and b.

Classification

To evaluate the accuracy of the classification, 85% of the data (23 subjects) were selected randomly and used for training and the remaining 15% (4 subjects) were used for testing. This random selection was repeated 100 times and the average accuracy is reported. For the PCA-RBFNN classifier, using the rough guideline mentioned earlier, three principal components (roughly 10% of the 27 available data points) are selected as the output of the PCA for the subsequent step of RBFNN classification with three neurons in the input layer. Four and five principal components were also considered, but no improvement in classification accuracy was achieved. Two nodes in the hidden layer of the RBFNN classifier (both with and without PCA) were found to yield the most accurate results (a smaller number would yield less accurate results and a larger number would not improve the accuracy any further). A spread parameter (which tunes the spread of the radial basis neurons) of 3.7 was found to yield sufficient convergence in minimizing the training and testing errors for both classifiers (RBFNN and PCA-RBFNN). Table 3 presents the accuracy, sensitivity (percent of AD patients diagnosed as AD patients), and specificity (percent of healthy subjects diagnosed as healthy) for both classifications. It shows that the PCA-RBFNN classifier with the input data selected by ANOVA yields the highest accuracy of 97.75 ± 3.2%, with sensitivity of 100% and specificity of 91.08%, for diagnosis of AD. This accuracy is considerably higher than the accuracy of 87.8% reported recently by Gomez et al. (2009) based on MEG records and FD.
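The evaluation protocol and the three reported measures can be sketched as follows. This is a hedged illustration with hypothetical function names; the labels in the usage example are made up, and the classifier itself is omitted.

```python
import random
from typing import List, Sequence, Tuple


def random_split(n_items: int, train_fraction: float,
                 rng: random.Random) -> Tuple[List[int], List[int]]:
    """Shuffle the indices and split them into training and testing sets."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_train = round(train_fraction * n_items)
    return indices[:n_train], indices[n_train:]


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (accuracy, sensitivity, specificity); label 1 = AD, 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # A measure is undefined (NaN) when its class is absent from the test set.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


if __name__ == "__main__":
    rng = random.Random(0)
    train_idx, test_idx = random_split(27, 0.85, rng)
    print(len(train_idx), len(test_idx))  # 23 4
    # Made-up labels for one test fold of four subjects.
    print(classification_metrics([1, 1, 1, 0], [1, 1, 0, 0]))
```

With only four test subjects and seven controls in total, some random folds will contain no control subject at all, in which case specificity is undefined for that fold. How such folds are treated when averaging over the 100 repetitions is not stated in the text, so the sketch simply returns NaN for them.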

Table 3 Results of classifications

Conclusion

Complexity in EEG and MEG records is an effective criterion to reveal changes in the dynamics of AD brain caused by loss of neuronal synapses and cortical changes (Geula and Mesulam 1996; Lerch et al. 2005; Fjell et al. 2009; Kuczynski et al. 2010; Stephen et al. 2010). Decreased complexity of the AD EEGs in various regions of the brain in the band-limited EEGs has been reported by a number of researchers based on CD and FD but not with high accuracy. Two reasons can be cited for this: (1) lack of an effective measurement tool for complexity of EEG and MEG, and (2) lack of comprehensive assessment of the measurements. For example, up to now, no results were reported on the investigation of FD in EEG and MEG sub-bands.

Recently, the VG algorithm has been introduced as a simple but powerful algorithm to convert time series to graphs while preserving dynamic characteristics such as complexity. The research presented in this paper confirmed the authors’ hypothesis that the complexity of VGs in certain EEG sub-bands presents significant differences between AD patients and control individuals.

A new chaos–wavelet approach is presented for EEG-based diagnosis of AD employing VG. After comprehensive statistical studies, effective classification features and mathematical markers were discovered. Finally, using a two-stage classifier (PCA-RBFNN), a high accuracy of 97.7% was obtained using the selected features obtained via ANOVA. Interestingly, the nonlinear features selected by ANOVA are from the delta, theta, and alpha sub-bands, whose powers were recently reported as promising EEG markers of disease progression in Mild Cognitive Impairment (MCI) and mild AD (Jackson and Snyder 2008). Increased fractality in theta and delta and decreased fractality in alpha also show correlation with the conventional linear characteristic of AD called EEG slowing. The brain damage was found mostly in the temporal lobes (T4 and T6), some loci in the central regions (C3 and Cz), some loci in the frontal lobe (F8 and Fz), and P3 in the left parietal lobe. The temporal lobes have the most critical role in episodic memory, which is seriously affected in AD. Recently, an MRI-based study suggested medial temporal lobe atrophy as a suitable characteristic for distinguishing Alzheimer’s disease from other dementias (dementia with Lewy bodies and vascular dementia) (Burton et al. 2009). The deficit in the dynamics of the temporal lobes discovered in this research is consistent with the episodic memory deficits of AD. The deficiencies found in the central regions may be related to sensory-motor coordination in AD. Deficits in the dynamics of the frontal lobes generally may lead to deficits in different kinds of high-level processing such as planning and managing emotions. Damage to the left parietal lobe may impair integration and processing of the sensory information associated with language production and object perception.

The local and global structural changes and impairments in a complex system such as the brain affect its dynamical characteristics, especially its complexity. Recent MRI-based findings report differences in structural impairments and atrophy patterns among different dementias (Barber et al. 2001; Meyer et al. 2007; Whitwell et al. 2007; Burton et al. 2009). The research presented in this paper offers a comprehensive method for detecting impaired loci and frequency bands in the context of complexity that can be used as an alternative way to distinguish AD from the other dementias based on EEG. Also, in addition to their diagnostic value, the discovered biomarkers would be more accurate in monitoring and evaluating long-term treatment methods of AD than quantitative EEG markers such as the mean absolute theta and beta 1 (12–15.5 Hz) powers (Kogan et al. 2001).

The authors have, in fact, presented a general methodology for EEG-based diagnosis of neurological disorders that can also be used for automated diagnosis of other neurological and psychiatric disorders such as attention deficit hyperactivity disorder (ADHD) (Ahmadlou and Adeli 2010) and autism spectrum disorder (ASD).