1 Introduction

The main challenges in clinical decision making are imprecision, uncertainty, and vagueness. Medical practitioners rely on their accumulated expertise, from which they must reason logically and infer correctly before making a decision regarding a disease. Problem-specific clinical decision support systems have been developed with the aid of artificial intelligence (AI) to support and improve this medical decision-making process. AI-based prediction in clinical diagnosis, especially in the field of psychology, has gained much research interest [1, 2]. Psychological disorders are usually assessed by observing the symptoms or features present in a person, and quantitative tests play only a minor role in the diagnosis. Hence, the differential diagnosis and grading of a disorder is more difficult than that of a disease. Because the diagnosis rests on qualitative assessment, this chapter refers to a decision support system as an assessment support system.

AI systems use classification or clustering techniques, which group individuals having the same characteristics into a set. A classifier assigns a class label to an object based on its description. Likewise, classifiers are applied to assign a grade to a disorder based on the symptoms present. These needs have prompted research into hybrid models, in which combining techniques enhances classification results. The objectives of this research article are:

  • To illustrate the usage of some complementary soft computing techniques in streamlining the autism diagnostic process with higher accuracy or with lower misdiagnosis rate.

  • To propose a parallel neural fuzzy possibilistic classifier model that gives better accuracy and grades without uncertainty, compared with an individual neural network that gives vague grading.

1.1 Artificial Neural Networks (ANN) in Decision Making

Various branches of science and technology use neural networks for different applications. The processing capability of an ANN allows diverse clinical data to be integrated to classify the output. An ANN can process problem-specific diverse data in the context of its previous training history to produce clinically relevant output that supports a clinician in making an accurate decision [3].

ANN belongs to the family of AI techniques because of its learning and generalization capabilities. An ANN can model a highly nonlinear complex system in which the relationships between the variables are unknown. An ANN is formed as a series of nodes organized in layers and connected through weighted connections. The input layer receives the input data and transfers it to the hidden layer through the weighted links for mathematical processing. The intermediate results are then transferred to the next layer, and finally the last layer provides the output. Thus, the network can be represented as a black box with ‘x’ inputs and ‘y’ outputs.
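The layer-by-layer flow described above can be sketched as a single forward pass. This is only an illustration: the layer sizes (16 inputs, 8 hidden units, 4 outputs), the tanh and softmax functions, and the random weights are assumptions made for the example and are not taken from the chapter.

```python
import numpy as np

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Propagate one input vector through a single hidden layer."""
    # Hidden layer: weighted sum of the inputs followed by a nonlinearity.
    hidden = np.tanh(w_hidden @ x + b_hidden)
    # Output layer: weighted sum of the hidden activations.
    scores = w_out @ hidden + b_out
    # Softmax turns the scores into values in [0, 1] that sum to 1.
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

# Placeholder data and weights (assumptions, not trained values).
rng = np.random.default_rng(0)
x = rng.uniform(size=16)
w_hidden, b_hidden = rng.normal(size=(8, 16)), np.zeros(8)
w_out, b_out = rng.normal(size=(4, 8)), np.zeros(4)

y = forward(x, w_hidden, b_hidden, w_out, b_out)
```

Here `y` holds one value per output unit, which is the "black box" view of the network: 16 numbers in, 4 numbers out.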

1.2 Fuzzy Systems in Decision Making

Fuzzy classification is the process of grouping individuals having the same characteristics into a fuzzy set. The truth value of a fuzzy propositional function defines the membership function of this fuzzy set. Thus, a fuzzy classification corresponds to a membership function (µ) that indicates whether an individual is a member of a class, given its classification predicate (Π):

µ: PF × U → T, where

  • PF = Propositional function

  • U = Universe of Discourse

  • T = Set of truth values

A fuzzy rule-based system contains fuzzy if-then rules of the form R i :

  • If x is normal then the class is 1

  • If x is low then the class is 2

  • If x is medium then the class is 3

  • If x is high then the class is 4.

The individual votes of the rules are aggregated to find the output of a fuzzy classifier. The purpose of fuzzy classifiers in medical decision making is to mimic the behavior of an expert physician who is able to diagnose the disease satisfactorily. To automate the entire diagnosis process in support of a physician, the fuzzy classifier has to be implemented as computer software.
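A minimal sketch of such a rule-based fuzzy classifier is given below. The four rules mirror the if-then rules listed above, but the triangular membership functions, their breakpoints, and the choice of summing the rule votes are assumptions made for illustration only.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy sets over a symptom score; each rule maps a set to a class.
RULES = [
    ("normal", (-1.0, 0.0, 1.5), 1),
    ("low", (0.5, 1.5, 2.5), 2),
    ("medium", (1.5, 2.5, 3.5), 3),
    ("high", (2.5, 4.0, 5.5), 4),
]

def classify(symptom_scores):
    """Sum the vote (firing strength) of every rule and pick the best class."""
    votes = {cls: 0.0 for _, _, cls in RULES}
    for x in symptom_scores:
        for _, params, cls in RULES:
            votes[cls] += triangular(x, *params)
    return max(votes, key=votes.get), votes

label_low, _ = classify([0.0, 0.2])
label_high, _ = classify([4.0, 3.8])
```

With these placeholder sets, scores near 0 vote for class 1 and scores near 4 vote for class 4; other aggregation schemes (for example taking the maximum vote) would fit the same structure.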

1.3 Classifier Combination Techniques in Decision Making

Multiple classifier fusion may generate a more accurate classification than its constituent classifiers [4]. The outputs of homogeneous classifiers are combined to form an ensemble for classifying novel patterns. The performance of the ensemble depends strongly on the accuracy of the individual classifiers.

One of the most widely used ensemble structures is the ensemble network. Its members are neural networks that have the same structure but different initializations and are applied to the same classification problem [5]. Such an ensemble network is a homogeneous classification system, and the decisions of the individual networks are fused using a decision fusion scheme. Such ensembles of homogeneous NN classifiers can be applied in developing a decision support system.
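As one possible decision fusion scheme, the sketch below uses simple majority voting over the labels returned by the ensemble members. Majority voting is an assumption chosen for illustration; the chapter does not prescribe a particular scheme at this point.

```python
from collections import Counter

def majority_vote(decisions):
    """Fuse the class labels produced by the individual networks.

    Ties are resolved in favor of the label that appears first in
    `decisions`, which is how Counter.most_common orders equal counts.
    """
    return Counter(decisions).most_common(1)[0][0]

# Labels from three hypothetical networks with different initializations.
fused = majority_vote(["G2", "G3", "G2"])
```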

2 Problem Description and Related Works in Autistic Disorder

Childhood autism is a psychological disorder that impairs a child’s verbal and nonverbal communication skills, especially in social interaction. The differential grading of this disorder is highly challenging, as it depends fully on the knowledge and expertise of the clinician. Autism expresses itself in diverse ways and hence is prone to misdiagnosis. Early intervention and grading of the disorder are necessary, because therapies such as speech therapy and psychotherapy are the only methods to alleviate the problems caused by the disorder. Conventionally, autism diagnosis and grading are based on assessment tools that are normally administered by a medical expert such as a developmental pediatrician, psychologist, or speech pathologist. An experienced medical practitioner can easily spot an autistic child and hence rarely depends on diagnostic tools for an initial screening for the presence of the disorder. Others usually seek another opinion when they are uncertain about their diagnosis. However, grading the severity of autistic disorder in early childhood is not straightforward, and even expert clinicians experience difficulty and uncertainty [6]. The clear and vague grades are represented in a set ‘S’, where Gi is as represented in Table 1.

Table 1 Grade representation

S = {Normal (G1), Probably autistic (G23), Autistic (G4)} or

S = {Normal (G1), Mild to Moderate (G23), Moderate to Severe (G34), Severe (G4)}

The steps that lead to a diagnosis are as follows:

Step 1: The child’s caretaker notices an abnormality in the language or behavior of the child and brings it to the notice of a medical practitioner

Step 2: Based on his or her expertise, the clinician makes an initial diagnosis of an autistic disorder

Step 3: The child is then referred for an autism assessment that relies on standard diagnostic tools

Step 4: Based on the observations made using one of the tools, the clinician sums the scores obtained for each qualitative symptom to calculate a total score. This total score is then compared with the threshold of each grade, and the case is classified accordingly

Step 5: The diagnosis ends with a prediction that the child is either normal, probably autistic, or severely autistic
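The score-and-threshold grading of Steps 4 and 5 can be sketched as follows. The threshold values are hypothetical placeholders chosen for the example; the actual cut-offs depend on the diagnostic tool in use and are not stated in this chapter.

```python
# Hypothetical lower bounds for each grade, in ascending order (assumption).
HYPOTHETICAL_THRESHOLDS = [
    (0.0, "Normal"),
    (30.0, "Probably autistic"),
    (37.0, "Severely autistic"),
]

def grade_from_scores(symptom_scores, thresholds=HYPOTHETICAL_THRESHOLDS):
    """Sum the per-symptom scores and map the total to a grade."""
    total = sum(symptom_scores)
    label = thresholds[0][1]
    for lower_bound, name in thresholds:
        # Keep the last grade whose lower bound the total reaches.
        if total >= lower_bound:
            label = name
    return total, label

total, label = grade_from_scores([2.0] * 15)
```

Note that the grade depends on the total alone, so two children with very different symptom profiles but the same total receive the same grade.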

The main problem here is to confidently grade autism as Normal (G1), Mild (G2), Moderate (G3), or Severe (G4), since a correct assessment is needed to schedule the frequency of therapy or other treatments.

This challenging uncertainty in conventional grading and the improved predictive ability of hybrid soft computing techniques are the motivations behind this study. The performance of an automated diagnostic system depends on two factors. The first is the identification of the relevant symptoms involved in a disease or disorder. The second is the formulation of an appropriate function that relates these symptoms to the correct disease or disorder.

Soft computing techniques such as fuzzy logic and neural networks have proven their applicability in clinical decision support systems [3, 7, 8]. Many researchers have studied artificial intelligence techniques and their application in expert systems [9]. The use of NNs for the diagnosis of autism started in the early 1990s, and on average a backpropagation neural network achieved an accuracy of 95 % [2]. Likewise, a multilayer perceptron provided a classification accuracy of 92 %, which was higher than that of a logistic regression model in autism diagnosis [6]. The combination of fuzzy techniques with neural networks has succeeded in improving classification in diagnostic applications [1, 10].

3 Parallel Neural Fuzzy Classifier Model: An Overview

The parallel neural fuzzy model is based on an architecture that integrates a neural network and fuzzy logic in a parallel structure. This joint classification mechanism involves two parallel classifiers: a non-knowledge-based neural network classifier and a knowledge-based fuzzy logic classifier. The model consists of three layers: an input layer, a parallel neural fuzzy layer, and a joint classification layer (the possibilistic fuser). The neural network has already been trained with a set of training data and is able to classify an input to a vague grade. Similarly, the fuzzy system is built with problem-specific theoretical knowledge for a unique grading. To diagnose a new patient, the input layer sends the input data to the trained neural network and the fuzzy system in parallel. The independent neural network and fuzzy system work in parallel and output their support or belief toward the grades. The supports of corresponding classes are then fused using a possibilistic classifier for a combined diagnosis.

Figure 1 represents a Parallel Neural Fuzzy (PNF) decision support system model for autism diagnosis, in which an LVQ neural network and a local fuzzy system are used in parallel to classify the grade of childhood autism. The PNF algorithm for this problem-specific classifier is shown in Table 2. The output units of the neural network are G1, G23, G34, and G4, and those of the fuzzy system are G1, G2, G3, and G4, where Gi is as represented in Table 1. The output grades of the fuser are also G1, G2, G3, and G4, with an improved accuracy.

Fig. 1 PNF model

Table 2 Algorithm PNF

4 Implementation Details

4.1 Knowledge Acquisition, Feature Selection, and Dataset Building

Accurate diagnosis of a disorder using any soft computing technique depends on the selection of input features. Knowledge acquisition was done through a group elicitation phase that included a developmental pediatrician, a psychologist, and a speech therapist. The major autistic features are addressed in the Childhood Autism Rating Scale (CARS), and suitable features were carefully selected using the CARS tool. Here, the features are represented through the strength of the symptoms that are relevant in determining the grade of the disorder. These provide the information needed to discriminate between different grades of childhood autism. Thus, a clinical dataset containing the CARS scores of 100 autistic children whose diagnoses had already been made by clinicians was collected and evaluated to form training and testing samples (Figs. 2, 3 and 4).

Fig. 2 LVQ model

Fig. 3 Fuzzy model

Fig. 4 The possibilistic fuser

The dataset properties are: 100 instances, 16 attributes, and 4 grades (G1, G23, G34, and G4). The prediction and generalization abilities of neural networks depend strongly on the quality of the input data and the training method. Thus, the training sample is structured as a 100 × 17 matrix, where each row refers to one autistic patient. The first 16 elements in a row represent the input features and the last element represents the grade. This dataset has been used for network learning and verification.
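The row layout just described can be illustrated with a small sketch. The values below are random placeholders generated only to show the shapes; they are not the clinical data, and the integer grade coding 0 to 3 is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder scores for 15 symptoms per child (not clinical data).
symptoms = rng.uniform(1.0, 4.0, size=(100, 15))
# The 16th feature is the total score over the 15 symptoms.
total = symptoms.sum(axis=1, keepdims=True)
# Placeholder grade codes: 0=G1, 1=G23, 2=G34, 3=G4 (assumed coding).
grades = rng.integers(0, 4, size=(100, 1))

dataset = np.hstack([symptoms, total, grades])   # shape (100, 17)

X = dataset[:, :16]              # first 16 columns: input features
y = dataset[:, 16].astype(int)   # last column: grade label
```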

4.2 LVQ Neural Network

In LVQ, the input vectors are quantized to codebook values, which are then used for pattern classification. It assumes that a set of codebook vectors W = {w i | i = 1, 2 … q} and a set of labeled training samples X = {x i | i = 1, 2, 3 … n} are available. Decision regions and boundaries are defined using a similarity measure, i.e., the Euclidean distance [11].

For each iteration ‘k’, until the stop criterion is satisfied, do steps 1–4:

  1. For each x i , find the codebook vector w j that is closest to x i . Denote it as w c .

  2. If the label of x i matches the label of w c , i.e., x i is correctly classified, then update w c (k + 1) = w c (k) + alpha(x i  − w c (k)). This moves w c closer to x i .

  3. Otherwise, if x i is incorrectly classified, then update w c (k + 1) = w c (k) − alpha(x i  − w c (k)). This moves w c away from x i .

  4. Consider the next element in X.
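Steps 1–4 can be sketched in a few lines of NumPy. This is an illustrative sketch, not the MATLAB implementation used in the chapter: the fixed learning rate `alpha`, the fixed number of epochs used as the stop criterion, and the toy two-cluster data are all assumptions.

```python
import numpy as np

def train_lvq1(X, y, W, w_labels, alpha=0.1, epochs=20):
    """Train codebook vectors W (one label each) with the LVQ update rule."""
    W = W.astype(float).copy()
    for _ in range(epochs):  # stop criterion: fixed epoch count (assumption)
        for x_i, label in zip(X, y):
            # Step 1: find the closest codebook vector w_c.
            c = np.argmin(np.linalg.norm(W - x_i, axis=1))
            step = alpha * (x_i - W[c])
            if w_labels[c] == label:
                W[c] += step   # Step 2: correct, move w_c toward x_i
            else:
                W[c] -= step   # Step 3: incorrect, move w_c away from x_i
    return W

def predict_lvq(X, W, w_labels):
    """Assign each row of X the label of its nearest codebook vector."""
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    return w_labels[np.argmin(d, axis=1)]

# Toy data: two well-separated clusters (placeholder values).
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y = np.array([0, 0, 1, 1])
w_labels = np.array([0, 1])
W = train_lvq1(X, y, np.array([[1.0, 1.0], [4.0, 4.0]]), w_labels)
pred = predict_lvq(X, W, w_labels)
```

In practice the learning rate is usually decreased over the iterations; a constant value is kept here only to keep the sketch short.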

The Euclidean distances of all output units show the similarity between the input and the output units. The trained LVQ classifies an input to the output unit for which the distance is minimum. However, the joint decision model adds one more stage that modifies the output of the LVQ, because the intention is not to find a single output unit. The distances between the input and the outputs are normalized to form a degree of support \(\mu \left( i \right)\), or belief, for every output unit.

The normalization for a possibilistic support is as follows:

For each instance:

  1. For every output unit ‘j’, \(j = 1, \ldots, 4\):

  2. Calculate \(d_j\), where \(d_j\) = Euclidean similarity measure

  3. Create vector \(D = \left[ {d_1, \ldots, d_4} \right]\)

  4. Find \(E_j = {\text{abs}}\left( {d_j - \hbox{max} \left( D \right)} \right)\)

  5. Calculate \({\text{Sum}} = \sum\nolimits_{j = 1}^{4} {E_j}\)

  6. Bel(j) = \(\mu \left( j \right) = \frac{E_j}{\text{Sum}}\)

\(\mu \left( j \right)\) represents the possibilistic value, or degree of support, of the LVQ for the jth class. Its value lies in [0, 1], and the values form a possibilistic decision vector ‘V’, as given in the algorithm.
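The six normalization steps above translate directly into code. The sketch below follows them as written; the handling of the degenerate case in which all distances are equal (returning uniform support) is an assumption, since the text does not cover it, and the example distances are placeholders.

```python
import numpy as np

def possibilistic_support(distances):
    """Turn the distances to the output units into supports that sum to 1."""
    d = np.asarray(distances, dtype=float)
    e = np.abs(d - d.max())      # step 4: E_j = abs(d_j - max(D))
    total = e.sum()              # step 5: Sum of all E_j
    if total == 0.0:
        # All distances equal: no unit is preferred (assumed fallback).
        return np.full(d.shape, 1.0 / d.size)
    return e / total             # step 6: mu(j) = E_j / Sum

# Placeholder distances to the units G1, G23, G34, G4.
V = possibilistic_support([1.0, 2.0, 3.0, 4.0])
```

The closest unit receives the largest support and the farthest unit receives zero, so the ordering of the LVQ decision is preserved while every unit keeps a graded belief.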

The conventional CARS-based assessment calculates a total score from the symptoms without considering the relationship between the 15 input symptoms and their contribution to the overall disorder. In other words, grading is based on a single variable, the total score. Hence, the LVQ is trained with the 15 symptoms along with the total score (16th feature) for better accuracy. The outputs are vague grades: Normal (G1), MildModerate (G23), ModerateSevere (G34), and Severe (G4). The results show that, rather than giving an accurate unique grading, the LVQ performs better for vague grading, similar to a clinician’s diagnosis. It is unable to separate overlapping classes such as MildModerate and ModerateSevere to give a unique grading of Normal, Mild, Moderate, or Severe.

LVQ uses clustering, which is the process of grouping similar data points into the same group rather than across groups. Thus, the LVQ is implemented with 16 input units and 4 output units. Each input ‘x i ’ represents the strength of a symptom and each output ‘y i ’ represents a grade.

4.3 Fuzzy Rule-Base Design

To support and improve the accuracy of the LVQ and to refine the overlapped grades, a fuzzy rule-based system is run in parallel on the same input data. This subsection describes the design of a knowledge-based autism diagnosis system that uses fuzzy logic. The knowledge obtained from the domain experts during the group elicitation phase is embedded as rules, mostly in the form of if-then-else statements. For example, if there is a history of seizures and its frequency is given, then a warning about proneness to autism is generated.

A problem-specific local fuzzy model that uses Takagi-Sugeno-Kang-type rules has been developed. Local fuzzy rules find the relationship between the input (x i ) and the output (y i ), and hence the consequent parts are represented as functions. The fuzzy model thus tries to find the contribution of individual symptoms to the overall grade of the disorder, and so the rules have a single-input, single-output structure. The outputs are clear grades: Normal (G1), Mild (G2), Moderate (G3), and Severe (G4). The model uses a triangular fuzzifier that fuzzifies the input symptoms individually, and the inference mechanism uses a first-order function to map each input feature to a confidence value for a grade. The confidence values of each symptom are mapped to the respective grades, and the cumulative confidence obtained for each grade is calculated. Then, the confidence values of the 4 output grades are normalized to possibilistic values to form a possibilistic decision vector ‘U’, as follows:

For an instance:

  1. For every output grade ‘j’, \(j = 1, \ldots, 4\):

  2. Let \(c_j\) represent the cumulative confidence

  3. Create vector \(C = \left[ {c_1, \ldots, c_4} \right]\)

  4. Find \(E_j = {\text{abs}}\left( {c_j - \hbox{max} \left( C \right)} \right)\)

  5. Calculate \({\text{Sum}} = \sum\nolimits_{j = 1}^{4} {E_j}\)

  6. Degree of support (j) = \(\mu \left( j \right) = \frac{E_j}{\text{Sum}}\)

Since this system can give a clear grading, it is used to support the LVQ and to separate the overlapped grades it decides, much like a doctor’s second opinion. Thus, for a given case, if the LVQ classifies it as MildModerate (G23), then the fuzzy system helps, through a possibilistic classifier, to refine it to an exact grade with improved accuracy, i.e., either Mild (G2) or Moderate (G3).
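The single-input, single-output rules with a triangular fuzzifier and first-order consequents described in this subsection might be sketched as follows. Every number here is a hypothetical placeholder: the triangle breakpoints, the consequent coefficients `p` and `q`, and the example scores are assumptions, since the chapter does not list the actual rule parameters.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# grade -> (triangle (a, b, c), first-order consequent (p, q)); all assumed.
HYPOTHETICAL_RULES = {
    "G1": ((0.0, 1.0, 2.0), (0.1, 0.5)),
    "G2": ((1.0, 2.0, 3.0), (0.1, 0.5)),
    "G3": ((2.0, 3.0, 4.0), (0.1, 0.5)),
    "G4": ((3.0, 4.0, 5.0), (0.1, 0.5)),
}

def cumulative_confidence(symptom_scores):
    """Accumulate, per grade, firing strength times the consequent p*x + q."""
    confidence = {grade: 0.0 for grade in HYPOTHETICAL_RULES}
    for x in symptom_scores:
        for grade, (tri, (p, q)) in HYPOTHETICAL_RULES.items():
            confidence[grade] += triangular(x, *tri) * (p * x + q)
    return confidence

c = cumulative_confidence([1.8, 2.0, 2.2, 2.4])
best = max(c, key=c.get)
```

Each symptom is fuzzified on its own and contributes to every grade whose triangle it falls under, which is what lets the model separate neighboring grades such as Mild and Moderate.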

4.4 Possibilistic Classifier—The Fuser

The decision vectors of the neural network and the fuzzy system contain supports for (G1, G23, G34, G4) and (G1, G2, G3, G4), respectively, and are passed to the last layer, which contains the fuser.

The fuser treats each value in a decision vector as the belief or support for a grade by that individual classifier. In possibility theory, the belief potentials of nested sets are called consonant evidences. Here, the overlapped grades G23 and G2 are consonant evidences supported by the neural network and the fuzzy system, respectively. The fuser checks the supports of a grade by the neural network and the fuzzy system, and possibilistic rules are applied to the corresponding classes accordingly. Consider the nested sets in Fig. 5, where G2 \(\subset\) G23. The belief of G2 based on the consonant evidence is given in Eq. 1.

Fig. 5 Nested sets

$${\text{Bel}}\left( {{\text{G}}2} \right) = {\text{Bel}}\left( {{\text{G}}2 \, \cap \, {\text{G}}23} \right)$$
(1)

Thus, evidence for the grade “Mild” is obtained from G23 and G2, and the combined evidence is calculated using the min operator. Similarly, the evidence for “Moderate” is given through G23 and G34 by the ANN and through G3 by the fuzzy system. Its combined evidence is calculated as:

$${\text{Bel}}\left( {{\text{G}}3} \right) = {\text{Bel}}\left( {\left( {{\text{G}}23 \, \cap \, {\text{G}}3} \right) \, \cup \, \left( {{\text{G}}34 \, \cap \, {\text{G}}3} \right)} \right)$$
(2)

Thus, Possibilistic rules (Pr i ) for consonant evidences are as follows:

Pr1: Bel(Mild) = min[Bel(Mild), Bel(Mild–Moderate)]

Pr2: Bel(Moderate) = max(min[Bel(Moderate), Bel(Mild–Moderate)], min[Bel(Moderate), Bel(Moderate–Severe)])

Pr3: Bel(Severe) = max(min[Bel(Severe), Bel(Severe)], min[Bel(Severe), Bel(Moderate–Severe)])

Pr4: Bel(Normal) = min[Bel(Normal), Bel(Normal)]

The above rules are applied to all V i and U i , and the G i having the maximum value is taken as the grade of the disorder.
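Rules Pr1 to Pr4 can be sketched as a small fusion function. The reading that the first argument of each min is the fuzzy-system support (vector U) and the second is the neural-network support (vector V) is an assumption drawn from the surrounding text, and the support values in the example are invented placeholders, not the values reported in Table 5.

```python
def fuse(v_nn, u_fs):
    """Combine LVQ supports (G1, G23, G34, G4) with fuzzy supports (G1..G4)."""
    fused = {
        # Pr4: Normal is supported directly by both classifiers.
        "G1": min(u_fs["G1"], v_nn["G1"]),
        # Pr1: Mild is nested in Mild-Moderate.
        "G2": min(u_fs["G2"], v_nn["G23"]),
        # Pr2: Moderate is nested in both overlapped grades.
        "G3": max(min(u_fs["G3"], v_nn["G23"]),
                  min(u_fs["G3"], v_nn["G34"])),
        # Pr3: Severe is supported by Severe and by Moderate-Severe.
        "G4": max(min(u_fs["G4"], v_nn["G4"]),
                  min(u_fs["G4"], v_nn["G34"])),
    }
    # The grade with the maximum fused support is the final decision.
    return max(fused, key=fused.get), fused

# Placeholder decision vectors (assumed values for illustration).
V = {"G1": 0.05, "G23": 0.55, "G34": 0.30, "G4": 0.10}
U = {"G1": 0.05, "G2": 0.60, "G3": 0.30, "G4": 0.05}
grade, fused = fuse(V, U)
```

In this example the LVQ favors the overlapped grade G23 and the fuzzy system favors G2, so the fuser resolves the case to G2 with support min(0.60, 0.55) = 0.55.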

5 Experimental Results and Discussions

This section contains three subsections: LVQ ANN-based autistic grading, its improvement through PNF-based autistic grading, and a chart-based comparison.

5.1 LVQ ANN-Based Autistic Grading

The proposed model is implemented and tested using a MATLAB parallel processing pool. Table 6 shows sample MATLAB code of the implemented PNF classifier model. To select a neural network for this application, both SOM and LVQ networks were designed and trained. The LVQ performs better than the SOM because of its supervised form of clustering; the results show that the LVQ gives a vague classification/grading accuracy of almost 94 %, similar to a clinician, during resubstitution testing using 100 samples. The confusion matrix of the LVQ, calculated from the experimental results, is given in Table 3.

Table 3 Confusion matrix of LVQ ANN

Other performance parameters are also calculated from this confusion matrix and are shown in Table 4. To improve the accuracy of the ANN diagnosis and to separate the vague or overlapped grades, the diagnosis of the LVQ is supported with a parallel fuzzy system.

Table 4 LVQ ANN performance
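As an illustration of how such performance parameters follow from a confusion matrix, the sketch below computes accuracy, per-class recall, and per-class precision. The matrix is an invented placeholder and does not reproduce the values of Table 3; the convention that rows are true grades and columns are predicted grades is also an assumption.

```python
import numpy as np

# Placeholder 4x4 confusion matrix for G1, G23, G34, G4 (not from Table 3).
# Rows: true grade, columns: predicted grade (assumed convention).
cm = np.array([
    [10,  1,  0,  0],
    [ 1, 38,  2,  0],
    [ 0,  2, 30,  1],
    [ 0,  0,  1, 14],
])

correct = np.diag(cm)
accuracy = correct.sum() / cm.sum()      # fraction of all cases on the diagonal
recall = correct / cm.sum(axis=1)        # per true grade (sensitivity)
precision = correct / cm.sum(axis=0)     # per predicted grade
error_rate = 1.0 - accuracy              # misdiagnosis rate
```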

Although the results of the LVQ ANN were acceptable, it was unable to separate uncertain grades such as G23 (Mild–Moderate) and G34 (Moderate–Severe). The parallel neural fuzzy possibilistic classifier not only achieves this separation but also reduces the error rate, or misdiagnosis.

5.2 Parallel Neural Fuzzy Based Autistic Grading

The similarity measures given by the LVQ are converted to possibilistic supports, and a possibilistic decision vector ‘V’ is constructed, in which some grades are overlapped. Similarly, the local fuzzy model generates a possibilistic decision vector ‘U’, in which the grades are certain. The result of the joint decision is illustrated with an example.

In all three vectors, μ(1) and μ(4) represent the grades “Normal” and “Severe”, respectively. However, μ(2) and μ(3) represent “MildModerate” and “ModerateSevere” for the NN, whereas for the FS and PNF they represent “Mild” and “Moderate”, respectively. Table 5 contains the possibilistic support for each grade by the LVQ ANN (NN), the fuzzy system (FS), and the possibilistic fuser (PNF) for Case No: 58 of the dataset, in which μ(i) represents the possibilistic support for Grade ‘i’. It is clear that the LVQ gives maximum support to G23 and the local fuzzy system to G2. The possibilistic classifier, i.e., PNF, takes the decisions of the NN and FS and joins the consonant evidences using max-min operators.

Table 5 Possibilistic vectors

The LVQ diagnoses Case No: 58 as “MildModerate” because μ(2) has the maximum possibilistic support, and the fuzzy system calculates the maximum possibilistic support for “Mild”, which is also μ(2). The fuser calculates the support for “Mild” within “MildModerate”, which is again the maximum, i.e., μ(2).

5.3 Chart-Based Comparison

Figure 6 is a chart representing around 100 cases and the grades diagnosed by the LVQ ANN. It shows that the LVQ clearly grades only G1: “Normal” and G4: “Severe”, while the majority of cases are graded as G23: “MildModerate” and G34: “ModerateSevere”. Figure 7 shows the chart for the PNF classifier, in which all the cases have been graded clearly and separately.

Fig. 6 Bar chart representing the vague grading of 100 cases by LVQ ANN

Fig. 7 Bar chart representing the clear grading of 100 cases by PNF classifier