1 Introduction

Attention-deficit hyperactivity disorder (ADHD) is a long-term mental health condition marked by inattention and/or hyperactivity-impulsivity. It affects how people work and their intellectual growth. ADHD is estimated to affect 5% of children globally and is one of the most common mental disorders in children and adults [1]. Affected individuals have difficulty following brief conversations and acting in a goal-oriented manner. As a result, despite having an above-average IQ, their educational and work performance is below average. The constant feeling of failure causes most affected people to develop further psychological problems, such as anxiety disorders, depression, and drug use. Mutations in two genes, the dopamine transporter (DAT1) and the dopamine D4 receptor, are linked to the ADHD phenotype [11]. Dopamine and noradrenaline metabolism and neurotransmission in the prefrontal cortex and other subcortical areas are dysfunctional in adults with ADHD. Three clinical presentations of ADHD are distinguished by their most prevalent features: ADHD-I, the predominantly inattentive type; ADHD-H, the predominantly hyperactive-impulsive type; and ADHD-C, the mixed type [2, 3].

In the last two decades, several studies have explored different types of QEEG features from the frequency, spatial, temporal, and spectral domains to distinguish between ADHD and healthy individuals [22, 23]. Absolute power and relative power of an EEG signal were used as QEEG features in [18], with the t-test and Principal Component Analysis (PCA) employed for feature selection. The NEURAL toolbox, which includes range EEG (rEEG) features, was proposed by Toole et al. to classify EEG signals [17]. Approximate entropy (ApEn), a nonlinear information-theoretic metric, and spectral analysis of each band were used to analyze the EEG signal in [4]. That investigation showed that the mean approximate entropy of ADHD patients was considerably lower than that of healthy individuals over the right frontal regions (Fp2 and F8) during a cognitive activity, but not while the subjects were at rest. Ghassemi et al. investigated EEG signals in adults performing a Continuous Performance Test (CPT) using three nonlinear features, namely wavelet entropy (WE), correlation dimension (CD) and Lyapunov exponent (LE), and classified them using the K-Nearest Neighbor (KNN) classifier [5]. Another study by the same authors explored several frequency domain features from Event Related Potentials obtained from Independent Components of EEG (ERPIC) for adults performing a CPT task [12]. They revealed a significant correlation between the clinical situation of the ADHD and normal adult participants, and several features were selected from independent components of EEG signals for the classification. The chaotic nonlinear dynamics of EEG signals were quantified using the multifractal singularity spectrum, the maximum Lyapunov exponent, and approximate entropy in [6]. Highly associated features were extracted using PCA. The authors also found that the largest Lyapunov exponent (LE) over the left frontal-central cortex differed significantly between the ADHD and age-matched control groups. In addition, mean approximate entropy was considerably lower in the prefrontal cortex of ADHD patients. Finally, they found that nonlinear characteristics were more effective than band power features in distinguishing between ADHD and normal behavior. In another study, fractal dimension (FD), ApEn and LE nonlinear features were classified using a multilayer perceptron (MLP) neural network [7]. Double input symmetrical relevance (DISR) and minimum Redundancy Maximum Relevance (mRMR) approaches were used to select the best features for classification. Region-wise nonlinear properties (LE, Higuchi fractal dimension, Katz fractal dimension, and Sevcik fractal dimension) of EEG signals were classified using a multilayer perceptron neural network in [8].

TaghiBeyglou et al. proposed combining nonlinear EEG features with temporal and spectral analysis [9]. Their work used a combination of filter banks, time windowing techniques, common spatial pattern (CSP) and nonlinear features for the analysis of ADHD data. Rezaeezadeh et al. developed two classification methods, one based on univariate features derived from individual EEG recording channels and one based on multivariate features collected from brain lobes, for distinguishing ADHD children from normal children [10]. Entropy measurements were employed as nonlinear univariate and multivariate characteristics in [10]; the authors showed that entropy mapping could be a useful tool to visually track changes in brain regions. Nonlinear features and decomposition methods were recently combined to extract features from EEG signals for ADHD detection [13]. The EEG signals were decomposed using empirical mode decomposition (EMD) and discrete wavelet transform (DWT), and the autoregressive modelling coefficients and relative wavelet energy were calculated. A number of nonlinear features were then extracted from the decomposed coefficients for classification. In short, EEG features from several different domains have been used to distinguish ADHD from healthy individuals. In their previous work, the authors used a mix of amplitude, spectral, range and connectivity QEEG features for alcoholism diagnosis [32].

Several studies have used EEG to explore ADHD and its impact on brain functioning in children. ADHD children show a significant difference in the direct information transfer from one electrode to another compared with healthy children [20]. Coherence features have been used to assess the functional connectivity and synchronization between brain regions of ADHD and healthy subjects [29,30,31]. Direct phase transfer entropy was used to measure the flow of information between the brain regions of ADHD and healthy children [21]. Graph signal processing and graph learning techniques have recently become popular for investigating the structural and functional information of ADHD subjects [24]. Deep learning and convolutional neural networks (CNN) are also among the recent techniques applied to distinguish between ADHD and healthy children [25, 26]. Our paper proposes a framework that extracts a set of discriminative quantitative EEG features from the spectral, amplitude, rEEG and functional connectivity domains to design an automated computer-aided diagnosis system for ADHD children. In this paper, Sect. 2 describes the materials and method for the proposed framework. Section 3 discusses the experimental results, and Sect. 4 concludes the paper.

2 Materials and Method

2.1 Dataset

We used a recently introduced ADHD dataset of raw EEG recordings of ADHD and healthy children, available online at [35]. A total of 121 children participated in this study; of these, 61 were diagnosed with ADHD and 60 were healthy. The 61 ADHD participants comprised 48 boys and 13 girls, with a mean age of 9.62 ± 1.75 years. Similarly, the 60 healthy children comprised 50 boys and 10 girls, with a mean age of 9.85 ± 1.77 years. An experienced child and adolescent psychiatrist used the DSM-IV criteria listed in [36] to diagnose the children with ADHD. DSM-IV lists rating scales for inattention and hyperactivity-impulsivity symptoms. Criteria for inattention include failing to pay attention to schoolwork, paying less attention during play, lacking awareness when listening, having difficulty organizing a task, avoiding tasks that require attention, forgetting daily activities, and being easily distracted by extraneous stimuli. Criteria for hyperactivity-impulsivity include difficulty awaiting a turn in any task, frequently interrupting or intruding on others, fidgeting with hands or feet or squirming in the seat, often leaving the seat in the classroom, and talking excessively.

The patients were referred to the Roozbeh Hospital’s psychiatric clinic in Tehran, Iran, for an ADHD evaluation. EEG signals were acquired using a digital instrument (SD-C24, Sholeh Danesh Co., Tehran, Iran). The EEG recording procedure was built around a visual attention task. The children were presented with 20 photos of various characters and were instructed to count them. The images were picked at random and were large enough to be seen clearly, and the number of characters in each picture was chosen at random between 5 and 16. Each image was presented immediately after the child’s response to the previous one to ensure constant stimulation during the EEG recording. As a result, the length of the EEG recording is determined by the child’s performance. Correct and incorrect replies were not taken into account, and the activity was not designed with rewards in mind. During this experiment, 19 electrodes (Fz, Cz, Pz, C3, T3, C4, T4, Fp1, Fp2, F3, F4, F7, F8, P3, P4, T5, T6, O1, O2) were placed on the scalp according to the 10–20 system, as displayed in Fig. 1. The data were recorded at a 128 Hz sampling frequency with 16-bit resolution. Electrodes A1 and A2 on the earlobes were used as references.

Fig. 1. 19 electrode positions on the scalp according to the 10–20 system.

2.2 Proposed Method

The proposed method is divided into four steps. In the first step, we pre-process the raw EEG data and decompose the EEG signal into five frequency bands. In the second step, we extract 540 features from the amplitude domain, 714 features from the spectral domain, 720 rEEG features and 450 functional connectivity QEEG features for classification. In the spectral domain, four features (absolute power, relative power, approximate entropy and permutation entropy) were extracted using the common average reference montage with respect to the entire signal, and the remaining features were extracted using the bipolar montage. The feature vector used for classification therefore has 2424 features from the spectral, amplitude, rEEG and functional connectivity domains. In the third step, we identify the 600 most discriminative features by averaging the feature ranks given by four popular feature selection techniques. The final step is to classify the selected QEEG features using Random Forest, SVM, KNN and ANN classifiers. Figure 2 shows the pipeline of the proposed framework. Each step is described in detail below.

Pre-processing.

All EEG signals were digitized at a 128 Hz sampling frequency in the pre-processing stage. Each signal was then filtered using a bandpass filter to generate five frequency bands corresponding to delta (δ: 0–3 Hz), theta (θ: 4–7 Hz), alpha (α: 8–12 Hz), beta (β: 13–30 Hz), and gamma (γ: 30–100 Hz) [14, 14].
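The band decomposition described above can be sketched as follows. This is a minimal illustration rather than the filter used in this work: the Butterworth design, the filter order, the use of SciPy, and the names `band_filter` and `decompose` are all assumptions. Two band edges need special handling in any implementation. A lower edge of 0 Hz cannot be used in a bandpass design, so the sketch uses a low-pass filter for delta. A signal sampled at 128 Hz only carries frequencies below 64 Hz (the Nyquist limit), so the sketch treats the 100 Hz upper edge of gamma as "everything above 30 Hz" and uses a high-pass filter.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 128.0  # sampling frequency in Hz, as stated in the text

# Band edges in Hz, copied from the text.
BANDS = {
    "delta": (0.0, 3.0),
    "theta": (4.0, 7.0),
    "alpha": (8.0, 12.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 100.0),
}


def band_filter(signal, low, high, fs=FS, order=4):
    """Zero-phase Butterworth filter that keeps roughly [low, high] Hz."""
    nyquist = fs / 2.0
    if low <= 0.0:
        # Assumption: a 0 Hz lower edge is implemented as a low-pass filter.
        sos = butter(order, high, btype="lowpass", fs=fs, output="sos")
    elif high >= nyquist:
        # Assumption: an upper edge at or above Nyquist becomes a high-pass filter.
        sos = butter(order, low, btype="highpass", fs=fs, output="sos")
    else:
        sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    # Forward-backward filtering avoids phase distortion.
    return sosfiltfilt(sos, np.asarray(signal, dtype=float), axis=-1)


def decompose(signal, fs=FS):
    """Return a dict mapping each band name to the filtered signal."""
    return {name: band_filter(signal, low, high, fs)
            for name, (low, high) in BANDS.items()}
```

Calling `decompose(channel)` on a one-dimensional array holding a single channel returns five arrays of the same length, one per band.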

Fig. 2. Proposed framework.

The topographic map represents the activity across the scalp. Figure 3 displays the band-wise topographic maps of an ADHD participant and a healthy participant. In this representation, blue represents low activity and red represents high activity. As observed from Fig. 3, the delta band (Fig. 3(a)) shows low activity for the ADHD participant while performing the task, whereas the gamma band (Fig. 3(e)) shows high activity.

Fig. 3. Topographic map of ADHD and healthy participants. ADHD band-wise topographic map is displayed from (a) to (e), and the healthy participant band-wise topographic map is displayed from (f) to (j). The blue colour represents less activity and the red colour represents high activity across the scalp. (Color figure online)

Feature Extraction.

In this step, we extract twenty-six QEEG features from each of the 19 channels in each frequency band. Tables 1, 2, 3 and 4 list the features extracted from the four quantitative measurement domains. Table 1 lists all the features extracted from the spectral domain. Spectral features are obtained by converting the EEG signal into the frequency domain, with frequencies ranging from 0 to 100 Hz at a resolution of 0.5 Hz, using the fast Fourier transform (FFT). The features are then extracted from each of the five bands for the 19 EEG channels.
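As an illustration of two of the spectral features named in this work, absolute and relative band power, the sketch below estimates a power spectrum with the FFT at 0.5 Hz resolution and sums it inside each band. The details are assumptions made for the example: the signal is cut into non-overlapping 2 s segments (which gives 0.5 Hz bins at 128 Hz), the segment periodograms are averaged, no window function is applied, and relative power is taken relative to the sum over the five bands. The function name `band_powers` is hypothetical.

```python
import numpy as np

# Band edges in Hz as listed in the pre-processing step.
BANDS = {
    "delta": (0.0, 3.0),
    "theta": (4.0, 7.0),
    "alpha": (8.0, 12.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 100.0),
}


def band_powers(signal, fs=128.0, resolution=0.5, bands=BANDS):
    """Absolute and relative power per band from averaged FFT periodograms."""
    signal = np.asarray(signal, dtype=float)
    seg_len = int(round(fs / resolution))  # 0.5 Hz bins need 2 s segments
    n_seg = len(signal) // seg_len
    if n_seg == 0:
        raise ValueError("signal is shorter than one analysis segment")
    segments = signal[: n_seg * seg_len].reshape(n_seg, seg_len)
    # Periodogram of each segment; constant scale factors cancel in relative power.
    periodogram = np.abs(np.fft.rfft(segments, axis=1)) ** 2 / (fs * seg_len)
    psd = periodogram.mean(axis=0)
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)  # stops at fs / 2
    absolute = {}
    for name, (low, high) in bands.items():
        in_band = (freqs >= low) & (freqs <= high)
        absolute[name] = float(psd[in_band].sum() * resolution)
    total = sum(absolute.values())
    relative = {name: power / total for name, power in absolute.items()}
    return absolute, relative
```

Because the frequency axis of a 128 Hz recording ends at 64 Hz, the gamma mask in this sketch only covers 30 to 64 Hz, and the shared 30 Hz bin is counted in both beta and gamma.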

Table 1. Spectral measures.

Table 2 lists all the range EEG (rEEG) measurements [16, 16], which are similar to amplitude-integrated EEG and measure the peak-to-peak amplitude of the EEG signal.
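The peak-to-peak idea behind rEEG can be sketched as below. The window length, the overlap, the particular summary statistics and the function names `range_eeg` and `reeg_summary` are assumptions chosen for illustration; they are not taken from Table 2.

```python
import numpy as np


def range_eeg(signal, fs, window_s=2.0, overlap=0.5):
    """Peak-to-peak amplitude in successive, overlapping windows."""
    signal = np.asarray(signal, dtype=float)
    win = int(round(window_s * fs))
    step = max(1, int(round(win * (1.0 - overlap))))
    starts = range(0, len(signal) - win + 1, step)
    return np.array([signal[s:s + win].max() - signal[s:s + win].min()
                     for s in starts])


def reeg_summary(reeg):
    """A few summary statistics of the rEEG series (illustrative selection)."""
    return {
        "mean": float(np.mean(reeg)),
        "median": float(np.median(reeg)),
        "lower_margin": float(np.percentile(reeg, 5)),
        "upper_margin": float(np.percentile(reeg, 95)),
        "sd": float(np.std(reeg)),
    }
```

Applied to one band-filtered channel, `reeg_summary(range_eeg(channel, fs=128.0))` yields one small group of features for that channel and band.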

Table 2. rEEG measures.
Table 3. Functional connectivity measures.

Similarly, Table 3 lists all the functional connectivity measurements. Functional connectivity (FC) refers to features that measure how the neural activity in one brain area interacts with other brain regions. To find the FC features of a brain, we have to arrange the EEG signal in a particular montage. So here, we use the bipolar montage to extract the FC features. Table 4 lists all the amplitude features computed from signal power and signal envelope.
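To make the bipolar montage and a functional connectivity measure concrete, the sketch below derives bipolar channels as differences between electrode pairs and then computes the mean magnitude-squared coherence between two such channels within a band. Coherence is used here only as one common example of a connectivity measure, and the electrode pairs, segment length and function names are hypothetical; the measures actually used are those of Table 3.

```python
import numpy as np
from scipy.signal import coherence


def bipolar(eeg, channel_names, pairs):
    """Build bipolar channels as differences between named electrode pairs.

    eeg has shape (n_channels, n_samples); pairs is a list of (a, b) names.
    """
    index = {name: i for i, name in enumerate(channel_names)}
    return np.stack([eeg[index[a]] - eeg[index[b]] for a, b in pairs])


def mean_band_coherence(x, y, fs, band, nperseg=256):
    """Mean magnitude-squared coherence of x and y inside band = (low, high)."""
    freqs, cxy = coherence(x, y, fs=fs, nperseg=nperseg)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return float(cxy[in_band].mean())
```

For example, with hypothetical pairs `[("F3", "C3"), ("F4", "C4")]`, calling `mean_band_coherence(bip[0], bip[1], 128.0, (8.0, 12.0))` gives one alpha-band connectivity value between the left and right derivations.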

Table 4. Amplitude measures.

Feature Selection.

Feature selection is a technique to reduce the dimension of the input vector for classification. In the proposed framework, we used an ensemble feature ranking method to select the most discriminative features. Four feature selection techniques (ANOVA, Chi-square test, Information Gain and Gini Index (GI)) were used to obtain feature ranks, which were then averaged and sorted in ascending order. The top-600 features were obtained from the averaged feature ranks.
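The rank-averaging step can be sketched as follows. The sketch assumes that each selection technique has already produced one score per feature and that a higher score means a more discriminative feature; how the four scores are computed is left out. The function names are hypothetical.

```python
import numpy as np


def ranks_from_scores(scores):
    """Convert scores to ranks, where rank 1 is the highest score."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    ranks = np.empty(len(order), dtype=float)
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks


def ensemble_top_k(score_table, k):
    """Average the per-method ranks and return the k best feature indices.

    score_table holds one row of scores per selection method.
    """
    rank_table = np.vstack([ranks_from_scores(row) for row in score_table])
    mean_rank = rank_table.mean(axis=0)
    # Ascending order: the smallest mean rank is the most discriminative feature.
    top = np.argsort(mean_rank, kind="stable")[:k]
    return top, mean_rank
```

With four rows of scores (one per technique) and `k=600`, `top` holds the column indices of the selected features in the original feature matrix.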

Classifiers.

In our experiments, four classifiers are used to evaluate the classification accuracy of the various methods. These are explained below. The Random Forest (RF) classifier is a collection of individual decision trees, each of which makes its own prediction; the class with the most votes is the final prediction of the classifier. The KNN classifier assigns class labels to test samples based on their similarity to samples in the training set, using a distance function to find the nearest neighbours. The Support Vector Machine (SVM) classifier separates the classes by finding the best hyperplane for dividing the multidimensional space into categories. The Artificial Neural Network (ANN) is a brain-inspired network that consists of an input layer, one or more hidden layers, and an output layer. A weight is associated with each connection, and the performance of the network is improved iteratively by adjusting these weights.
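As a small worked example of the nearest-neighbour rule described above, the sketch below labels each test sample by majority vote among its k closest training samples. It uses the Manhattan distance, which is the distance reported in the experiments section; the function name and the toy data in the usage note are hypothetical.

```python
import numpy as np


def knn_predict(train_x, train_y, test_x, k=3):
    """Majority vote among the k nearest training samples (Manhattan distance)."""
    train_x = np.asarray(train_x, dtype=float)
    test_x = np.asarray(test_x, dtype=float)
    train_y = np.asarray(train_y)
    # Pairwise L1 distances, shape (n_test, n_train).
    dists = np.abs(test_x[:, None, :] - train_x[None, :, :]).sum(axis=2)
    nearest = np.argsort(dists, axis=1, kind="stable")[:, :k]
    predictions = []
    for row in nearest:
        labels, counts = np.unique(train_y[row], return_counts=True)
        predictions.append(labels[np.argmax(counts)])
    return np.array(predictions)
```

For two well-separated clusters labelled 0 and 1, a test point near the first cluster receives label 0 because all of its nearest neighbours carry that label.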

3 Results and Discussions

The experiments were run on a machine with an Intel(R) Core(TM) i5-8265U CPU running at 1.80 GHz and 8 GB of RAM, with Windows 10 Professional K 64-bit installed. We used MATLAB R2019a to conduct the experiments and analysis.

This section summarizes the findings of this study. A comprehensive evaluation was conducted to assess the proposed framework for classifying the EEG signals of ADHD and healthy children. The accuracies of the proposed method and three state-of-the-art methods are shown in Table 5. Using the Random Forest classifier, we achieved the best accuracy of 81.82% for the proposed method.

Our feature vector comprises 2424 features obtained from the five frequency bands of the 19 EEG channels, ranked separately using the Information Gain, Chi-square, ANOVA and GI feature selection techniques. We then averaged the feature ranks to obtain the top-600 discriminative features, which are passed as the input to an array of classifiers. We performed five-fold cross-validation (each fold splits the data in an 80–20 ratio) to train the model. The k-value determines the number of nearest neighbours in KNN; we used k = 8 and the Manhattan function to calculate the distance. We used an RF classifier having 500 trees of depth 8 with six nodes in each subtree. For SVM, we employed an RBF kernel with a penalty value of 10 and an eps of 0.1. In the ANN classifier, we used a rectified linear unit (ReLU) activation function and two hidden layers with 90 and 10 neurons, respectively. The Adam optimizer, a stochastic gradient-based optimizer for network weights, was run for 100 iterations. Table 5 presents the results of the proposed technique for all four classifiers (KNN, SVM, RF, and ANN) in terms of accuracy, F1-score, precision, and recall. We compare our results with those of [17,18,19], which are state-of-the-art techniques for QEEG-based classification.
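For readers who want to try a comparable setup, the configuration above can be approximated with scikit-learn as sketched below. This is not the implementation used in this work, and several choices are assumptions: the use of scikit-learn itself, feature standardization before KNN, SVM and ANN, stratified and shuffled folds, and the fixed random seed. The "six nodes in each subtree" setting of the RF and the "eps of 0.1" setting of the SVM have no obvious one-to-one counterpart among these estimators' parameters and are therefore left at library defaults.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_classifiers(seed=0):
    """Four classifiers configured with the hyperparameters reported in the text."""
    return {
        "KNN": make_pipeline(
            StandardScaler(),  # assumption: scaling is not mentioned in the text
            KNeighborsClassifier(n_neighbors=8, metric="manhattan")),
        "RF": RandomForestClassifier(
            n_estimators=500, max_depth=8, random_state=seed),
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0)),
        "ANN": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(90, 10), activation="relu",
                          solver="adam", max_iter=100, random_state=seed)),
    }


def five_fold_accuracy(features, labels, seed=0):
    """Mean five-fold cross-validated accuracy for each classifier."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return {
        name: float(cross_val_score(model, features, labels,
                                    cv=cv, scoring="accuracy").mean())
        for name, model in build_classifiers(seed).items()
    }
```

Here `features` would be the 121 × 600 matrix of selected features and `labels` the ADHD/healthy labels. With only 100 iterations the MLP may emit a convergence warning, which does not stop the evaluation.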

Table 5. Comparison of the classification accuracy of the proposed method and existing work

We also counted the number of features from each band among the top-600 features selected for EEG classification. As seen in Fig. 4, the delta, alpha and gamma bands are equally important for ADHD classification. Similarly, we computed the participation of each electrode in the top-600 selected features. Figure 5 displays a detailed analysis of the electrode participation. For a feature computed from a pair of electrodes in the bipolar montage, the feature count is incremented by one for both electrodes, for visualization purposes only.
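The electrode counting rule for bipolar features can be illustrated with a short sketch. It assumes, purely for the example, that each selected feature is tagged with a channel label and that bipolar channels are written as two electrode names joined by a hyphen; the function name is hypothetical.

```python
from collections import Counter


def electrode_participation(selected_feature_channels):
    """Count how often each electrode appears among the selected features.

    A bipolar label such as "F7-T3" adds one to both F7 and T3.
    """
    counts = Counter()
    for channel in selected_feature_channels:
        for electrode in channel.split("-"):
            counts[electrode] += 1
    return counts
```

For the labels `["Fp2", "F7-T3", "Fp2-F8", "O2"]`, Fp2 is counted twice and F7, T3, F8 and O2 once each.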

Fig. 4. Number of features belonging to the five frequency bands in the top-600 feature subset.

Although all 19 electrodes contribute to the shortlisted set of features, Fp2, O2, F7 and F8 are slightly more important, as observed from Fig. 5. In [28], it is reported that in the eyes-open resting state the frontal and central regions, especially electrode Fp2, show significant activity, and that in the eyes-closed state the O1 electrode shows significant changes. Similarly, both the frontal and parietal regions are implicated in ADHD and are involved in brain networks for attention [27]. This agreement with prior findings suggests that the features selected by our framework are informative. Our work has some limitations, however; for example, it requires a fixed setup in which the number of bands is fixed. We may try adaptive sub-bands with more advanced features in future.

Fig. 5. The number of features associated with different electrodes in the top-600 feature subset.

We also tried to identify the parts of the brain most affected in ADHD children by examining a reduced feature set comprising the top-100 ranked features. Figure 6 displays the electrodes contributing the largest number of features to the reduced feature set. The O2, P8 and T7 electrodes are related to the right occipital region, the right parietal lobe and the left temporal lobe, respectively. Two electrodes are selected from the frontal lobe: F7, associated with the left frontal region, and Fp2, associated with the right frontal region. Our findings indicate that the frontal, parietal, and occipital areas of the brain discriminate between ADHD and healthy children and thus contribute most to accurate EEG signal classification.

Fig. 6. The most discriminative electrodes for ADHD detection.

4 Conclusion

In this paper, we designed and implemented a QEEG feature-based classification framework for distinguishing ADHD children from healthy children. After pre-processing the data, the EEG signals of the 19 electrodes were divided into five frequency bands. The spectral, amplitude, rEEG and functional connectivity features were extracted from each frequency band. The average feature rank was used to select the top-600 significant QEEG features for classification. Among the four classifiers, Random Forest gave the best accuracy of 81.82%, compared with 78.51%, 76.86% and 76.93% for KNN, SVM and ANN, respectively. In future, we will explore the inclusion of more QEEG features with different feature extraction techniques in the proposed framework. Ensemble-based heterogeneous classifiers and spiking neuron-based classifiers for ADHD diagnosis are also within the future scope of this work [33, 33].