
1 Introduction

Chikungunya is an emerging vector-borne disease caused by the chikungunya virus (CHIKV) [3], a mosquito-borne alphavirus of the Togaviridae family. It was first identified during an outbreak of a dengue-like disease in Tanzania between 1952 and 1953 [11]. The disease spreads when a mosquito feeds on a viremic (infected) person; the virus then replicates in the mosquito before it can be transmitted, and the infected mosquito passes the virus to the next person it feeds on. CHIKV is transmitted to a new host faster than other virus-caused diseases. The vectors involved in the transmission of Chikungunya are Aedes aegypti and Aedes albopictus, which also infect people with the dengue and Zika viruses. Fever and joint pain, together with headache, muscle pain, joint swelling, and rash, are the characteristic manifestations of Chikungunya. These signs and symptoms may last for a few days, weeks, months, or even years, in which case the disease is chronic. Because the common symptoms of Chikungunya are also found in dengue and Zika illness, it is often misdiagnosed and CHIKV remains unrecognized. Moreover, no vaccine is available for immunization against Chikungunya, so treatment depends solely on the observation of signs and symptoms [15]. Since signs and symptoms are unreliable for diagnosing CHIKV, an expert system capable of handling the uncertain nature of the signs and symptoms occurring in patients with Chikungunya needs to be developed.
In addition, signs and symptoms are uncertain because they are difficult to observe, owing to factors such as miscommunication between patient and doctor, the patient's inability to express his or her current state of health, or inadequate probing by the physician. A rule-based system built on such observations would lead to an ineffective and unreliable decision-making process. Hence, to address this uncertainty in clinical data, a Belief Rule-Based Expert System (BRBES) is proposed in this research. Moreover, a BRBES can deal with 'what if' scenarios, facilitating accurate decisions about CHIKV infection. The BRBES is trained with an optimal learning model that minimizes the error between the observed and expected level of Chikungunya.

This section has introduced the research problem. Related work is reviewed in Sect. 2. Section 3 describes the methodology. The experimental results are evaluated in Sect. 4. Section 5 concludes the paper.

2 Literature Review

In [1], a knowledge-based expert system is developed for the diagnosis of Chikungunya by analyzing the symptoms reported by patients. The system uses an input-output matrix generated from patient questionnaires. However, uncertainty factors were not considered; as a result, the output matrix yields unreliable deductions.

In [16], a fuzzy-based expert system consisting of five layers is proposed to represent knowledge with uncertainty. Nonetheless, fuzzy-based systems depend on assumptions that do not hold in many cases.

A survey in [17] evaluates different deep learning techniques on biological data from different domains. It inspects the behavior of different deep learning architectures when applied to various patterns of complex biological data [23]. However, implementing such models is difficult because of the errors that have to be debugged in the code. Also, the behavior of a deep learning model on a dataset cannot be predicted prior to development, because data processing inside the model is neither transparent nor explainable.

Another survey [18] gauges the impact of deep learning (DL), reinforcement learning (RL), and deep RL when mining features in different types of biological datasets. It found that DL and RL consume immense computing power and storage capacity, so these methods are not a good choice for a dataset of moderate size. It also noted that DL is not free from the problem of misclassification. RL, in turn, requires a large dataset to produce an accurate result. Deep RL is complex and unstable, especially when a nonlinear function such as a neural network is used to represent the action value.

A mobile application is developed in [14] to help industries identify probable COVID-19 cases among their staff so that early treatment can be provided. A fuzzy neural network based on the industry employee database is embedded in the application. The application uses Bluetooth sensors together with K-nearest neighbor and K-means modules to track, trace, and notify users of COVID-19 infection risk when they are in contact with other employees. In addition, logistic regression and a Bayesian decision tree model are used to evaluate the current health state of a COVID-19 patient. Since the app is based on industry health data, the credibility of such data cannot be guaranteed to address all uncertainties. Moreover, the use of geo-location by the app raises concerns about the privacy of individuals.

In [13], air pollution is evaluated using a BRB-DL model. Two datasets obtained from sensors are used to train the model. However, BRB-DL was not trained on diverse datasets such as biological data, which are complex and uncertain.

Different types of optimization models, namely single-objective and multiple-objective nonlinear models, are proposed in [27] for locally training the BRB. To train the initial BRB systems, optimization models are developed in which attribute weights, rule weights, and belief degrees serve as the learning parameters. These are combined in a nonlinear objective function to reduce the gap between the initial system and the BRB that has been implemented. However, the error gap was not reduced much because the method used few learning parameters and did not perform any fine-tuning.

The study in [20] investigates deep learning algorithms used in different domains to accomplish specific tasks. Deep Belief Networks (DBN) are also explored by pre-training them. Here, significant weights are learned layer by layer, with the top two hidden layers. Although a DBN is able to gauge the difference between erroneous and real data, its hidden layers correspond to implicit training parameters that are not explainable. Furthermore, apart from its high computational cost, the model becomes ineffective because the autoencoders are present in the first layer: any error from uncertain data affects the rest of the layers, so the model has to be reconstructed [25].

Furthermore, BRBES [2, 4, 5, 6, 7, 8, 10, 12, 19, 21, 22, 23] works efficiently with uncertain clinical data. Thus, it is adopted as the main module for detecting CHIKV in patients.

3 Methodology

This section describes the methodology used to develop an expert system that handles uncertainty in assessing Chikungunya. It describes the components and tools deployed to build the expert system. HTML, JavaScript, and PHP have been used to develop this BRBES-based expert system [10]. The training module, which forms a core part of the system, has been implemented in MATLAB. To decrease the error between the observed and estimated results, an optimal learning model is built. Three distinct combinations of training parameters have been considered in developing this model. To construct the optimization model, an objective function is defined and constraints are set on the training parameters. Three kinds of training parameters are used, namely rule weights, belief degrees, and attribute weights. This work presents the design, development, and application of a BRBES that may help physicians identify CHIKV accurately and provide early treatment.

3.1 BRB Expert System Methodology

An inference mechanism is employed to construct a Belief Rule Base (BRB) system. The steps of the inference procedure are described below.

3.1.1 Input Transformation

Input transformation distributes each input value over the referential values of the corresponding antecedent attribute. For example, a belief rule for Chikungunya assessment is Rk: IF Fever is Medium AND Joint Pain is High AND Muscle Pain is High AND Headache is Medium AND Joint Swelling is Medium THEN Chikungunya is {(High, 0.8), (Medium, 0.2), (Low, 0.0)}, that is, the belief degrees are “High” = 80%, “Medium” = 20%, and “Low” = 0%. Since the belief degrees sum to (0.80 + 0.20 + 0.00) = 1.00, the belief rule is complete.
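The transformation step can be illustrated with a minimal Python sketch. The paper does not give the transformation formula, so the linear split between neighbouring referential values used here is an assumption (it is a common choice in the BRB literature), and the function names and the 0-10 severity scale are hypothetical.

```python
def transform_input(value, referential_points):
    """Distribute a crisp input over referential values.

    referential_points: list of (label, numeric_value) pairs sorted in
    ascending order of numeric_value (hypothetical scale, not from the paper).
    Returns a dict label -> matching degree; the degrees sum to 1.
    """
    labels = [label for label, _ in referential_points]
    points = [point for _, point in referential_points]
    degrees = {label: 0.0 for label in labels}

    # Inputs outside the scale are assigned fully to the nearest end point.
    if value <= points[0]:
        degrees[labels[0]] = 1.0
        return degrees
    if value >= points[-1]:
        degrees[labels[-1]] = 1.0
        return degrees

    # Split the belief linearly between the two neighbouring referential values.
    for i in range(len(points) - 1):
        low, high = points[i], points[i + 1]
        if low <= value <= high:
            upper_share = (value - low) / (high - low)
            degrees[labels[i]] = 1.0 - upper_share
            degrees[labels[i + 1]] = upper_share
            break
    return degrees


def is_complete(consequent_beliefs, tol=1e-9):
    """A belief rule is complete when its consequent belief degrees sum to 1."""
    return abs(sum(consequent_beliefs.values()) - 1.0) <= tol


# Hypothetical 0-10 severity scale for the attribute "Fever".
FEVER_SCALE = [("Low", 0.0), ("Medium", 5.0), ("High", 10.0)]
fever_degrees = transform_input(7.0, FEVER_SCALE)

# Consequent of the example rule Rk from the text.
rule_k_consequent = {"High": 0.8, "Medium": 0.2, "Low": 0.0}
rule_k_is_complete = is_complete(rule_k_consequent)
```

With this scale, an input of 7 is split between "Medium" and "High" only, and the example rule passes the completeness check because its belief degrees sum to 1.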

3.1.2 Activation Weight Calculation

The activation weight is calculated for individual rules in the BRB using the following formula:

$$\begin{aligned} \alpha _{k} = \prod _{i=1}^{T_k} (\alpha _{i}^{k})^{\delta '_{ki}} \end{aligned}$$

Where \(\alpha _k\) is the joint matching degree of the k-th rule, \(\alpha _i^k\) is the matching degree of its i-th antecedent attribute, and \(T_k\) is the number of antecedent attributes of the k-th rule [10]. Once the k-th rule becomes active, its activation weight is calculated using the formula below, where \(\theta _k\) is the rule weight, \(\delta _{ki}\) is the weight of the i-th antecedent attribute, and L is the number of rules.

$$\begin{aligned} \omega _k = \frac{\theta _k \alpha _k}{\sum _{j=1}^{L}\theta _j \alpha _j}= \frac{\theta _k\prod _{i=1}^{T_k}(\alpha _i^k)^{\delta '_{ki}}}{\sum _{j=1}^{L}\theta _j\left[ \prod _{i=1}^{T_j}(\alpha _i^j)^{\delta '_{ji}}\right] }, \quad \delta '_{ki} = \frac{\delta _{ki}}{\max _{i=1,\ldots ,T_k}{\delta _{ki}}} \end{aligned}$$
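As a rough illustration, the activation weight formula can be sketched in Python as follows. The authors implemented their system in MATLAB and PHP, so this is not their code; the function name and the toy values are hypothetical.

```python
import math


def activation_weights(matching, rule_weights, attribute_weights):
    """Compute the activation weight omega_k of every rule.

    matching[k][i]          : matching degree alpha_i^k of attribute i in rule k
    rule_weights[k]         : rule weight theta_k
    attribute_weights[k][i] : attribute weight delta_ki
    """
    combined = []
    for alphas, theta, deltas in zip(matching, rule_weights, attribute_weights):
        max_delta = max(deltas)
        # Normalised attribute weights: delta'_ki = delta_ki / max_i delta_ki.
        normalised = [d / max_delta for d in deltas]
        # Joint matching degree: alpha_k = prod_i (alpha_i^k) ** delta'_ki.
        alpha_k = math.prod(a ** d for a, d in zip(alphas, normalised))
        combined.append(theta * alpha_k)

    total = sum(combined)
    if total == 0.0:
        # No rule is matched at all, so no rule is activated.
        return [0.0] * len(combined)
    return [c / total for c in combined]


# Toy example with two rules and two antecedent attributes (made-up numbers).
toy_weights = activation_weights(
    matching=[[0.6, 0.5], [0.4, 0.5]],
    rule_weights=[1.0, 1.0],
    attribute_weights=[[1.0, 1.0], [1.0, 1.0]],
)
```

In the toy example the joint matching degrees are 0.3 and 0.2, so the activation weights are 0.6 and 0.4 and sum to 1.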

3.1.3 Belief Degree Update

When input data for an antecedent attribute are missing or ignored, the belief degrees of every rule in the rule base should be updated using the following formula:

$$\begin{aligned} \beta _{ik} = \bar{\beta }_{ik} \frac{\sum _{t=1}^{T_k}\left( \tau (t,k)\sum _{j=1}^{J_t}\alpha _{tj}\right) }{\sum _{t=1}^{T_k}\tau (t,k)} \end{aligned}$$

$$\begin{aligned} \tau (t,k) = \begin{cases} 1, &amp; \text {if } P_{t} \text { is used in defining } R_{k}\ (t = 1, \ldots , T_{k}) \\ 0, &amp; \text {otherwise} \end{cases} \end{aligned}$$

Here, \(\beta _{ik}\) is the updated belief degree and \(\bar{\beta }_{ik}\) is the initial belief degree of the i-th consequent in the k-th rule. \(P_{t}\) is the t-th antecedent attribute, \(J_t\) is the number of its referential values, and \(\alpha _{tj}\) denotes the degree to which the input value of the t-th attribute belongs to its j-th referential value. The updated belief degrees replace the initial ones in the activated rules of the rule base.
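A small Python sketch may help to see the effect of the update. It assumes that a missing input is represented by matching degrees that are all zero; the function name and the numbers are hypothetical.

```python
def update_belief_degrees(initial_beliefs, attribute_used, input_degrees):
    """Scale the initial belief degrees of one rule by the available evidence.

    initial_beliefs : initial belief degrees of the rule, one per consequent
    attribute_used  : tau(t, k) flags (1 or 0), one per antecedent attribute
    input_degrees   : for each attribute t, its matching degrees alpha_tj
                      (all zeros when the input is missing, by assumption)
    """
    used = sum(attribute_used)
    if used == 0:
        return list(initial_beliefs)
    # Numerator: sum over used attributes of their total matching degree.
    coverage = sum(
        tau * sum(alphas) for tau, alphas in zip(attribute_used, input_degrees)
    )
    factor = coverage / used
    return [beta * factor for beta in initial_beliefs]


# Five attributes are used by the rule, but the input of the last one is missing.
updated = update_belief_degrees(
    initial_beliefs=[0.8, 0.2, 0.0],
    attribute_used=[1, 1, 1, 1, 1],
    input_degrees=[
        [0.0, 0.6, 0.4],
        [0.0, 0.0, 1.0],
        [0.0, 0.0, 1.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0],  # missing input
    ],
)
```

Here four of the five inputs are available, so every belief degree is scaled by 4/5 and the rule becomes incomplete: the remaining 20% of belief stays unassigned.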

3.1.4 Rule Aggregation

Here all rules are aggregated. Using the analytical ER algorithm [26], the final belief degree \(\beta _{j}\) is calculated using the following expressions.

$$\begin{aligned} \beta _{j}= \frac{\mu \left[ \prod _{k=1}^L\left( \omega _{k}\beta _{jk} + 1 - \omega _{k}\sum _{i=1}^N \beta _{ik}\right) -\prod _{k=1}^L\left( 1-\omega _{k} \sum _{i=1}^N \beta _{ik}\right) \right] }{1-\mu \left[ \prod _{k=1}^L \left( 1 - \omega _{k}\right) \right] } \end{aligned}$$
$$\begin{aligned} \mu = \left[ \sum _{j=1}^N \prod _{k=1}^L\left( \omega _{k} \beta _{jk} + 1 - \omega _{k}\sum _{i=1}^N \beta _{ik}\right) - (N-1) \prod _{k=1}^L\left( 1-\omega _{k}\sum _{i=1}^N \beta _{ik}\right) \right] ^{-1} \end{aligned}$$

\(\beta _{j}\) is the aggregated belief degree associated with the j-th consequent value, calculated with the analytical form of the ER algorithm. \(\omega _{k}\) is the activation weight of the k-th rule, \(\beta _{jk}\) is the belief degree of the j-th consequent value in the k-th rule, L is the number of rules, and N is the number of consequent values.
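The analytical ER aggregation can be sketched in Python as below. This is only an illustration of the two expressions above, not the authors' MATLAB implementation; the function name and the toy rule base are hypothetical.

```python
import math


def aggregate_rules(activation, beliefs):
    """Aggregate all rules with the analytical ER algorithm.

    activation[k] : activation weight omega_k of rule k
    beliefs[k][j] : belief degree beta_jk of consequent j in rule k
    Returns the list of aggregated belief degrees beta_j.
    """
    n_consequents = len(beliefs[0])
    rule_totals = [sum(rule) for rule in beliefs]  # sum_i beta_ik per rule

    # prod_k (omega_k * beta_jk + 1 - omega_k * sum_i beta_ik), one value per j.
    per_consequent = [
        math.prod(
            w * rule[j] + 1.0 - w * total
            for w, rule, total in zip(activation, beliefs, rule_totals)
        )
        for j in range(n_consequents)
    ]
    # prod_k (1 - omega_k * sum_i beta_ik)
    unassigned = math.prod(
        1.0 - w * total for w, total in zip(activation, rule_totals)
    )
    # prod_k (1 - omega_k)
    residual = math.prod(1.0 - w for w in activation)

    mu = 1.0 / (sum(per_consequent) - (n_consequents - 1) * unassigned)
    return [mu * (p - unassigned) / (1.0 - mu * residual) for p in per_consequent]


# Toy rule base: two activated rules over (High, Medium, Low), made-up numbers.
aggregated = aggregate_rules(
    activation=[0.6, 0.4],
    beliefs=[[0.8, 0.2, 0.0], [0.0, 0.5, 0.5]],
)
```

Because both toy rules are complete, the aggregated belief degrees again sum to 1; a single rule with activation weight 1 simply returns its own belief degrees.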

3.1.5 Optimal Learning Model

An optimal learning module is introduced to determine the optimal values of the rule weights, attribute weights, and consequent belief degrees of a BRB system. These learning parameters can be obtained from domain experts or generated randomly, but such values may not be fully accurate. To minimize the error gap between the observed and estimated outputs, the BRB is therefore trained using historical data [9]. Figure 1 illustrates the optimization model framework.

Fig. 1. Optimization model

The MATLAB function fmincon is used to solve the single-objective optimization model. The optimal learning model is constructed in three steps: (1) an objective function, “ObjBetaOneAll.m”, is constructed; (2) constraints are set for the training parameters; (3) a training module is developed to find the optimal parameter set (Fig. 2).
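The first two steps can be sketched in Python as follows. The paper does not list the objective or the constraints explicitly, so the mean squared error and the bounds below are assumptions based on common practice in the BRB literature; the contents of "ObjBetaOneAll.m" are not reproduced here, and all names in the sketch are hypothetical.

```python
def mean_squared_error(params, training_data, predict):
    """Objective: mean squared gap between observed and estimated outputs.

    predict(params, inputs) is a hypothetical callable that runs the BRB
    inference for one patient; training_data holds (inputs, observed) pairs.
    """
    gaps = [
        (observed - predict(params, inputs)) ** 2
        for inputs, observed in training_data
    ]
    return sum(gaps) / len(gaps)


def within_constraints(rule_weights, attribute_weights, belief_degrees, tol=1e-9):
    """Assumed constraints: all weights and belief degrees lie in [0, 1],
    and the belief degrees of each rule sum to at most 1."""

    def in_unit_interval(x):
        return -tol <= x <= 1.0 + tol

    if not all(in_unit_interval(w) for w in rule_weights):
        return False
    if not all(in_unit_interval(w) for w in attribute_weights):
        return False
    for rule in belief_degrees:
        if not all(in_unit_interval(b) for b in rule):
            return False
        if sum(rule) > 1.0 + tol:
            return False
    return True


# Toy stand-in for BRB inference, used only to exercise the objective.
def toy_predict(params, inputs):
    return params["scale"] * inputs


toy_data = [(0.5, 0.5), (1.0, 0.9)]
toy_error = mean_squared_error({"scale": 1.0}, toy_data, toy_predict)
```

A constrained solver such as fmincon would then search for the parameter values that minimize the objective while keeping the constraint check satisfied.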

Fig. 2. Flowchart of BRB learning module

Fig. 3. BRBES architecture

3.1.6 BRBES Architecture

The system architecture defines the organization of the system components. The architecture of the BRBES consists of an input module, a BRB module, and a training module, as shown in Fig. 3. Data on the signs and symptoms of Chikungunya are collected from different sources, supplied to the input module, and processed in the BRB main module to predict the risk of Chikungunya. The training module receives training data and initial parameter values from the input module. The trained learning parameters generated by the training module are stored in the knowledge base and are afterwards used to generate the rule base. The optimal values of the training parameters are thus obtained starting from their original values. MATLAB is used because it integrates calculation, visualization, and programming in a high-performance environment.

4 Experimental Result

To validate the results generated by the BRBES, data were collected from several hospitals in Dhaka and Chittagong, Bangladesh. In total, 250 patients were interviewed about their signs and symptoms of CHIKV, and a database was formed from their recorded answers. Table 1 presents samples of the collected data. For example, the patient in column 1, Rubi Akter, a 35-year-old housewife, suffers from high fever, high joint pain, medium headache, medium muscle pain, and high joint swelling. On the basis of these symptoms, the BRBES outputs a 75.443% chance of CHIKV (Table 1), which closely matches the opinion of the physician/expert, who states a risk of 75% for this patient.

Table 1. Collected data from patients

Three distinct sets of training parameters, namely R1, R2, and R3, are developed to train the BRB module [24]. R1 comprises rule weights, antecedent attribute weights, and consequent belief degrees. R2 comprises rule weights and antecedent attribute weights. R3 comprises antecedent attribute weights and consequent belief degrees.

Optimal learning procedures are applied to obtain optimal values for the three learning parameters in R1. Data from 200 patients are used for training. The aim is to transform the initial BRB into a trained BRB, so as to increase the accuracy of the model and the correctness of its prediction of CHIKV infection in patients. When the R1 set of training parameters is applied, the total number of learning parameters is 243 + 243 + (243 * 3) = 1215 (Table 2).

Table 2. Training with rule weights, antecedent attribute weights and consequent belief degrees
Table 3. Training with rule weights and antecedent attribute weights

Similarly, the learning parameters of R2, which consists of rule weights and antecedent attribute weights, are optimized by applying the optimal learning procedure. The same training data as for R1 are used for R2. The optimal rule weight and antecedent attribute weight obtained for each of the 243 rules are given in Table 3. For rule 1, the rule weight and antecedent attribute weight, both initially 1, become 0.056 and 0.4667 after training. The total number of learning parameters in R2 is 243 + 243 = 486. Likewise, the training parameter set R3 comprises antecedent attribute weights and consequent belief degrees, and Table 4 gives the optimal values of its learning parameters. In Table 4, the antecedent attribute weight and belief degree of rule 1 change to 0.578 and 0.466, respectively. The total number of learning parameters in R3 is 243 + (243 * 3) = 972.
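The parameter counts for the three sets can be checked with a few lines of Python. The figures (243 rules, three belief degrees per rule, one trained attribute weight counted per rule) are taken from the text; the variable names are hypothetical. Note that 243 equals 3^5, which would be consistent with five antecedent attributes having three referential values each, although the text does not state this explicitly.

```python
# Figures taken from the text: 243 rules and 3 consequent belief degrees per rule.
N_RULES = 243
N_CONSEQUENTS = 3

rule_weights = N_RULES
attribute_weights = N_RULES  # counted once per rule, as in the text
belief_degrees = N_RULES * N_CONSEQUENTS

parameter_counts = {
    "R1": rule_weights + attribute_weights + belief_degrees,
    "R2": rule_weights + attribute_weights,
    "R3": attribute_weights + belief_degrees,
}
```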

Table 4. Training with antecedent attribute weights and consequent belief degrees
Fig. 4. Reliability comparison among R1, R2 and R3

4.1 Reliability of Trained BRBES

The ROC curves for each of the training parameter sets (R1, R2, and R3), as well as for the initial BRB, are developed from the data of 100 patients (Fig. 4). The area under the curve (AUC) for the R1, R2, and R3 training parameter sets and for the BRBES that uses the initial BRB is given in Table 5. The table shows that the AUC is highest (0.837) for the R1 training parameter set, as it uses the largest number of learning parameters (1215). The AUC for the R3 set is the second largest (0.808) because it uses fewer learning parameters (972). The AUC for the R2 set is the lowest (0.785), as it uses the fewest learning parameters (486). Thus, it can be deduced that a larger number of learning parameters increases accuracy. Moreover, the BRBES that uses the initial BRB obtained a lower AUC than the trained BRBES, because the initial BRB, which is based on the survey, is not very reliable. Thus, the BRBES should continue to learn and train in order to generate more accurate predictions.
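For readers unfamiliar with the metric, the AUC can be computed directly from predicted risk scores and binary outcomes. The sketch below uses the rank-based definition (the probability that a positive case receives a higher score than a negative case). How the authors obtained binary outcomes and which tool produced their ROC curves is not stated, so the function and the toy data are purely illustrative.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores : predicted risk for each patient (e.g. BRBES output)
    labels : 1 for a confirmed case, 0 otherwise (assumed binary outcome)
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: four patients with made-up scores and outcomes.
toy_auc = auc(scores=[0.9, 0.8, 0.4, 0.3], labels=[1, 0, 1, 0])
```

In the toy data three of the four positive-negative pairs are ranked correctly, which gives an AUC of 0.75; a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.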

Table 5. Parameters R1, R2 and R3 for trained BRB

4.2 ROC for Trained BRB

The data of 250 patients were obtained from various hospitals located in Dhaka and Chittagong. First, the patients were interviewed about their signs and symptoms of Chikungunya (Table 6).

Table 6. Reliability comparison among R1, R2 and R3

4.3 Comparison of Accuracy of Trained and Non-trained BRBES Using Test Data

The 250 records are divided in a ratio of 8:2, so that 200 records are allocated for training and 50 for testing. The test data are applied to both the initial BRBES and the trained BRBES to enable a comparison of accuracy. Figure 5 compares the original observed output with the output of the initial BRBES; the original data are significantly greater than the output of the initial BRBES. Figure 6 compares the real output data with the output of the trained BRBES; here the original output lies closer to the output of the BRBES. Thus, from the two figures, it is deduced that the accuracy of the trained BRBES is greater than that of the non-trained BRBES. For instance, the trained BRBES places one patient at a 95% risk of Chikungunya, in agreement with the real output data, whereas the initial BRBES gives the same patient a 90% chance of Chikungunya. Therefore, it can be argued that the trained BRBES gives better results than the non-trained BRBES.
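An 8:2 split of this kind can be sketched in Python as follows. Whether the authors shuffled the records before splitting is not stated, so the seeded shuffle is an assumption, and the function name and placeholder records are hypothetical.

```python
import random


def split_records(records, train_fraction=0.8, seed=0):
    """Shuffle the records reproducibly and split them into train and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # assumption: random split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder identifiers standing in for the 250 patient records.
patient_ids = list(range(250))
train_ids, test_ids = split_records(patient_ids)
```

With 250 records and a training fraction of 0.8, the split yields 200 training records and 50 test records, matching the allocation described above.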

Fig. 5. Comparison among real system observed output and BRB (Before training)

Fig. 6. Comparison among real system observed output and BRB (After training)

4.4 Comparison of Deep Learning and Other Machine Learning Algorithms with BRBES

A convolutional neural network (CNN), a support vector machine (SVM), and Random Forest (RF) are applied for comparison with the BRBES. Each layer of a CNN has two kinds of parameters, weights and biases, whereas the BRBES uses three kinds of learning parameters, namely belief degrees, attribute weights, and rule weights. The BRBES produces better results than the CNN. SVM finds the optimal separating hyperplane for the training data. It uses the input data directly for prediction, without distributing them over referential values; as a result, SVM is unable to handle any kind of uncertainty. RF, on the other hand, resulted in an overfitted model, because the large number of trees grown from uncertain data hindered its performance.

Table 7 shows that the AUC is 0.891 for the trained BRBES, 0.878 for the non-trained BRBES, 0.820 for CNN, 0.810 for SVM, and 0.744 for Random Forest. The 95% confidence intervals (CI) of the AUC are 0.825-0.950 for the trained BRBES, 0.770-0.921 for the non-trained BRBES, 0.728-0.887 for CNN, 0.716-0.878 for SVM, and 0.630-0.758 for Random Forest.

Table 7. Comparison of AUC of distinctive learning techniques

4.5 ROC for Trained BRB

Figure 7 represents the ROC curves for Non-Trained BRBES, Trained BRBES, ANN, SVM, and Random Forest. Hence, from ROC it is observed that Trained BRBES gives more accurate output than ANN, SVM, and Random Forest. Not only that, but it also performs better than Non-trained BRBES.

Fig. 7. Reliability comparison among BRBES, and other ML algorithms

5 Conclusion and Future Work

Chikungunya remains a concern worldwide because no vaccine is available. It is often misdiagnosed, as its signs and symptoms are similar to those of other mosquito-borne diseases. The BRBES proposed in this research will assist countries such as Bangladesh, where the doctor-to-patient ratio is abysmally low. The goal of this research was to build a reliable system that accurately diagnoses Chikungunya from uncertain clinical data so that early treatment can be provided. This work demonstrates the design, development, and application of a BRBES that assists patients and physicians in detecting CHIKV early so that patients can receive accurate treatment. Moreover, the research yields a trained BRBES that is able to handle the various uncertainties associated with this disease. It was also observed that the optimal learning model minimizes the error between the observed and expected level of Chikungunya. The trained BRBES was compared with the initial BRBES and with other deep learning and machine learning algorithms, and it provided more accurate results than these methods.

In the future, this research aims to build a larger real-time dataset gathered from wider geographical zones and to increase the number of referential values. More referential values will also support stronger system validation and better performance of the system. Furthermore, to add a new dimension, other methods such as transfer learning will be introduced.