Abstract
In this work, Weighted Bayesian Association Rules are combined with fuzzy set theory to introduce Fuzzy Weighted Bayesian Association Rules, which are used to design and develop a Clinical Decision Support System built on a Bayesian Belief Network. The Bayesian Belief Network is well suited to the clinical domain, which is characterized by a high degree of uncertainty and causality. Weighted Bayesian Association Rules for constructing a Bayesian network have already been proposed. However, the "sharp boundary" problem that arises when quantitative attribute domains are discretized may cause erroneous predictions in diagnosis and treatment. To address the sharp boundary problem, fuzzy theory is applied to the attributes so that they better reflect real-life situations. A new algorithm is designed and implemented to build a Bayesian belief network from fuzzy weighted association rules under the predictive modeling paradigm. The resulting model, named the Fuzzy Weighted Bayesian Belief Network, is evaluated on several clinical datasets and achieves strong results.
1 Introduction
Strong Bayesian association rules can be extracted with the Weighted Bayesian Association Rule (WBAR) mining algorithm, which was designed and implemented in earlier work with strong results [1, 2]. In this paper, predictive modeling concepts play a crucial role in developing a new algorithm for a medical support system that works with large volumes of medical records [3]. Unfortunately, patients' records are often not mined thoroughly enough to discover the hidden patterns needed for effective decision-making [4]. Advanced data mining approaches for analyzing medical records have shown significant results in research and contribute to more accurate, higher-performing medical decision support systems. Clinical and treatment decisions are sometimes taken on the basis of a doctor's experience and knowledge rather than on the insight that can be extracted from a rich, substantial medical database [5]. In addition, because symptoms in medical diagnosis are often redundant and interrelated, physicians may fail to diagnose a disease accurately. Accurate diagnosis at an early stage is particularly challenging because of the interdependence among various features [6].
A fuzzy clinical decision support system (CDSS) based on a Bayesian belief network (BBN) is proposed. It can support medical staff and other experts who hold patient-specific information by intelligently extracting and representing hidden information when required [7]. Uncertainty, however, arises in every phase of building the decision support process. Its sources include patients who cannot describe their symptoms accurately, errors in laboratory reports, and doctors or nurses who sometimes fail to interpret examination results precisely, all of which make it harder to determine a patient's prognosis. More advanced and accurate decision support systems should therefore be built with machine learning techniques so that they can adapt to new environments and learn implicitly from instances. Various methodologies can be incorporated into a CDSS to predict, assess, and extract information, including statistical methods, data mining techniques, and soft computing techniques, and significant research is still needed in both academic and practical settings. Several issues can nevertheless compromise the accuracy of a CDSS in the medical field, such as the representation and interpretation of clinical attributes under uncertainty, which call for refined methodologies and techniques. To handle this uncertainty, the current work proposes a new model, the Fuzzy Weighted Bayesian Belief Network (FWBBN) CDSS, with new formulas and algorithms. The main contributions of the proposed framework are as follows:
- Use of fuzzy logic to deal with sharp boundaries, vagueness, and imprecision in medical attributes [8].
- A weight assignment method for medical dataset attributes [2].
- Application of association rule mining to find the interdependence among attributes and to generate well-built rules.
- A novel hybrid approach that incorporates fuzzy weighted association rules to build a Bayesian belief network.
The remainder of the paper is organized as follows. Sect. 2 summarizes the related work in tabulated form. Sect. 3 presents the research methodology with new formulas and the Fuzzy Weighted Bayesian Association Rule (FWBAR) algorithm. Sect. 4 covers results and discussion, Sect. 5 presents the comparative study, and Sect. 6 concludes the work and outlines future scope.
2 Background work
Various soft computing and data mining techniques are surveyed, in particular fuzzy logic, weight assignment methods, association rule mining, and Bayesian belief networks. Table 1 summarizes the relevant findings from the literature on the use of these techniques for building predictive models in the clinical domain.
The literature survey identifies a gap concerning the attributes of the dataset: attributes carry considerable importance, suffer from sharp boundary problems, and are interdependent at some level of association. To capture the impact of attributes and their interdependencies, a novel idea is proposed in the following section.
3 Methodology
The proposed method is elaborated through the algorithm outlined in Fig. 1.
This approach incorporates fuzzy theory into the WBAR mining algorithm [1]. The earlier paper discussed the basic concepts of the Bayesian belief network, association rule mining, and types of weight assignment [1]. In this paper, a fuzzy approach is incorporated to enhance accuracy. The fuzzy model is a valuable technique for capturing imprecision in data patterns and for understanding data semantics [30]. The study and experiments use a breast cancer dataset and other clinical datasets from the University of California Irvine (UCI) machine learning repository, obtained via the LUCS-KDD DN software [2, 31].
3.1 Fuzzy property of quantitative attribute
The Association Rule Mining (ARM) model plays a significant role in applications that involve quantitative data, such as temperature and pressure, which are very common [32]. In ARM, discretization is needed to convert quantitative data into the nominal domain, and an Apriori-type method is used for this purpose. An association rule P → Q thus expresses a relationship between nominal values of data items, for example "(FamilyHistory, yes), (Obesity, severe) → (Diabetics, yes)" [9]. The mined results are affected by the partitioned intervals, particularly for data values near interval boundaries; this is known as the "sharp boundary" problem. Many quantitative parameters in the medical field suffer from it. Consider the attribute Smoking in a patient record where the smoking frequency per day is 11, together with the following discretization rules: Smoking [1,2,3] → LungCancer = "Low", Smoking [2,3,4,5] → LungCancer = "Moderate", Smoking [4,5,6,7,8,9,10] → LungCancer = "High", Smoking [9-*] → LungCancer = "Severe". Under sharp boundaries, the patient falls entirely in the severe zone, which does not reflect the actual situation. With fuzzy logic, the patient instead belongs partially to several fuzzy sets. The patient's membership values could be, for example, µ(LungCancer, "low") = 0.01, µ(LungCancer, "moderate") = 0.02, µ(LungCancer, "high") = 0.3, and µ(LungCancer, "severe") = 0.67. Because of the impact of the sharp boundary problem on quantitative attributes in the ARM model [4], a new Fuzzy Weighted ARM algorithm is proposed. The framework is redefined in terms of Fuzzy Weighted Support (FWS) and Fuzzy Weighted Confidence to adapt it to a fuzzy environment. In this paper, the fuzzy membership value for each fuzzy set is calculated using the trapezoidal membership function shown in Eq. 1.
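As a minimal sketch of how a trapezoidal membership function softens the sharp boundary, the Python below uses the standard four-parameter trapezoid (a, b, c, d). The breakpoints in `SMOKING_SETS` and the helper names `trapezoid` and `fuzzify` are assumptions chosen for illustration; they are not the parameters used in the paper, so the resulting degrees differ from the example values quoted above.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and shoulders b, c (a <= b <= c <= d)."""
    if b <= x <= c:
        return 1.0                    # plateau: full membership
    if a < x < b:
        return (x - a) / (b - a)      # rising edge
    if c < x < d:
        return (d - x) / (d - c)      # falling edge
    return 0.0                        # outside the support


# Hypothetical breakpoints (cigarettes per day); NOT taken from the paper.
SMOKING_SETS = {
    "low":      (0, 0, 2, 4),
    "moderate": (2, 4, 5, 7),
    "high":     (5, 7, 9, 12),
    "severe":   (9, 12, 60, 60),
}


def fuzzify(x, fuzzy_sets):
    """Return the membership degree of x in every labelled fuzzy set."""
    return {label: trapezoid(x, *params) for label, params in fuzzy_sets.items()}


memberships = fuzzify(11, SMOKING_SETS)
print(memberships)
```

With these assumed breakpoints, a patient smoking 11 cigarettes per day belongs partly to "high" and partly to "severe" instead of being assigned to "severe" alone.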
Table 2 shows the fuzzy values obtained for the attributes using the trapezoidal membership function; the resulting fuzzy dataset is named D1. Only a few attributes and five records are tabulated.
These tabulated fuzzy values of attributes remove the sharp boundary problems present in the medical world. They can further be used to assign different weights using the automatic weight assignment method.
3.2 Weight assignment using maximum likelihood estimation method
After the fuzzification of attributes, the next step is to calculate automated weights for each fuzzified value. The weights are computed using the Maximum Likelihood Estimation (MLE) method [33]. MLE is a statistical method that estimates the parameters of a probability distribution from observed data. Applied to a training dataset, it finds the parameter estimate that maximizes the probability of the observed values under the model. The likelihood function is defined in Eq. 2:
where P is the initial probability of occurrence of a particular event,
L(P) is the likelihood value for the probability value P, and
x1, x2, …, xn are the n instances of a given sample.
The calculation starts by finding the prior probability of the class label value "yes" from the training dataset. The likelihood is then evaluated at different probability values in the neighbourhood of this prior, varied by small offsets, to find the value at which the likelihood of the observed data is highest. The probability value that maximizes the likelihood is assigned as the weight of that particular attribute. All the weights calculated using the MLE technique are shown in Table 3.
In this proposal, novel modifications are made in the medical domain to construct a BBN with improved prediction accuracy by fuzzifying quantitative medical attributes and then applying weights. The core problem is therefore to define the terms and new concepts needed to build the Fuzzy Weighted BBN.
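The search described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's implementation: the likelihood is assumed to be the Bernoulli form L(P) = Π P^xi (1 − P)^(1 − xi), the sample x1, …, xn is assumed to be the binary class labels of the records that contain a given fuzzy item, and the names `log_likelihood` and `mle_weight` as well as the offset and step count are hypothetical.

```python
import math


def log_likelihood(p, outcomes):
    """Bernoulli log-likelihood of binary outcomes (1 = class 'yes') for probability p."""
    return sum(math.log(p) if x == 1 else math.log(1.0 - p) for x in outcomes)


def mle_weight(outcomes, prior, offset=0.01, steps=20):
    """Evaluate probabilities near the prior in small offsets; return the likelihood maximiser."""
    candidates = [round(prior + k * offset, 10) for k in range(-steps, steps + 1)]
    candidates = [p for p in candidates if 0.0 < p < 1.0]   # keep valid probabilities only
    return max(candidates, key=lambda p: log_likelihood(p, outcomes))


# Hypothetical data: class labels of the records that contain one fuzzy item.
outcomes = [1, 1, 1, 0, 1, 0, 1, 1]
weight = mle_weight(outcomes, prior=0.6)
print(weight)
```

For this toy sample the grid search settles on 0.75, the sample proportion of "yes" labels (6 of 8), which is the closed-form Bernoulli maximum likelihood estimate; the grid is only needed when the search is deliberately restricted to the neighbourhood of the prior.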
3.3 Fuzzy weighted approach
Consider a fuzzy relational database D = {r1, r2, r3, …, rk, …, rn} with a set of attributes A = {a1, a2, …, am}. Each attribute ai is related to a set of linguistic labels L = {l1, l2, …, lL}, for example L = {high, low, moderate}, and is associated with the fuzzy set Fi = {(ai, l1), (ai, l2), (ai, l3), …, (ai, lL)}. In a given record rk, each attribute ai belongs to each of these fuzzy sets to some degree, expressed as a membership degree in the range [0, 1]. For a fuzzy attribute ai (written as item Ii in the formulas) with label lj in record rk, the degree of membership is denoted rk[µ(Ii, lj)] in dataset D1. The following definitions and formulas are offered to generate association rules and strong rules between attributes.
Definition 1
Weight of Fuzzy Attribute: Table 3 exhibits the automated weights computed for the fuzzy attributes of the breast cancer dataset [14]. This approach assigns a weight W(Ii, lj) to each fuzzy item (Ii, lj), where 1 ≤ i ≤ n, 1 ≤ j ≤ L, and 0 ≤ W(Ii, lj) ≤ 1.
Definition 2
Weight of Fuzzy Attribute Set Record: rk[FASRW(X)] is calculated as the product, over the fuzzy attributes in the set X, of the weight of each fuzzy attribute and its membership degree in the given fuzzy set in transaction rk, as formulated in Eq. 3.
Definition 3
Weight of Fuzzy Attribute_Set: FA_SW(X) is calculated as the sum of FASRW over all clinical records, as formulated in Eqs. 4 and 5.
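Definitions 2 and 3 can be read as the small sketch below. It assumes that the record weight is the product of (membership degree × item weight) over the fuzzy items in X and that the set weight is the plain sum of record weights; the function names, the attribute names, and all numeric values are hypothetical and used only for illustration.

```python
from math import prod


def record_weight(record, weights, itemset):
    """FASRW for one record: product of (membership * weight) over the fuzzy items in itemset."""
    return prod(record.get(item, 0.0) * weights[item] for item in itemset)


def itemset_weight(records, weights, itemset):
    """FA_SW: sum of the record weights of itemset over all records."""
    return sum(record_weight(r, weights, itemset) for r in records)


# Hypothetical fuzzy items, weights, and membership degrees.
weights = {("Age", "old"): 0.8, ("TumorSize", "large"): 0.5}
records = [
    {("Age", "old"): 0.5, ("TumorSize", "large"): 1.0},
    {("Age", "old"): 1.0, ("TumorSize", "large"): 0.5},
]
X = [("Age", "old"), ("TumorSize", "large")]

per_record = [record_weight(r, weights, X) for r in records]
total = itemset_weight(records, weights, X)
print(per_record, total)
```

A fuzzy item that is absent from a record is treated here as having membership 0, so the record contributes nothing to the set weight; that choice is also an assumption of the sketch.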
Definition 4
Support with Fuzzy_Weighted Concept: A generalized formula is framed for the fuzzy weighted support of two attributes, of multiple attributes, and of the class label.
The SupportOfFuzzy_Weight of a rule X → Y, where X and Y are non-empty subsets of fuzzy weighted attributes, is calculated as the sum of the weights of all records in which the given Y is true, divided by the total number of records. It is denoted SupportOfFuzzy_Weight(X → Y) and given by Eq. 6,
where rk ranges over all transactions for which the given class_label is true.
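The support computation described in Definition 4 can be sketched as follows. The sketch assumes that each record carries its fuzzy memberships together with a class label, and that the record weight is the product of (membership × item weight) as in Definition 2; the function names and the toy data are hypothetical.

```python
from math import prod


def record_weight(memberships, weights, itemset):
    """Product of (membership * weight) over the fuzzy items in itemset for one record."""
    return prod(memberships.get(item, 0.0) * weights[item] for item in itemset)


def fuzzy_weighted_support(records, weights, itemset, class_label):
    """Sum the record weights of X over records where the class label holds,
    then divide by the total number of records."""
    matching = sum(
        record_weight(memberships, weights, itemset)
        for memberships, label in records
        if label == class_label
    )
    return matching / len(records)


# Hypothetical data: (fuzzy memberships, class label) per record.
weights = {("Age", "old"): 0.8}
records = [
    ({("Age", "old"): 0.5}, "yes"),
    ({("Age", "old"): 1.0}, "yes"),
    ({("Age", "old"): 0.25}, "no"),
    ({("Age", "old"): 0.0}, "yes"),
]
X = [("Age", "old")]

support_rule = fuzzy_weighted_support(records, weights, X, class_label="yes")
print(support_rule)
```

Note that the denominator is the number of all records, not only those with the matching class label, which follows the wording of the definition.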
Definition 5
Confidence with Fuzzy_Weight Concept: A generalized formula is framed for the fuzzy weighted confidence of two attributes, of multiple attributes, and of the given class label. The ConfidenceFuzzy_Weight of a rule X → Y, where X is a non-empty set of attributes and Y is an attribute, is defined as the ratio of the SupportOfFuzzy_Weight of (X ∪ Y) to the SupportOfFuzzy_Weight of (X), as given in Eq. 7.
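The ratio in Definition 5 can be illustrated with a short sketch that starts from precomputed record weights. It assumes that Y is the class label, that the support of (X ∪ Y) sums the weights of X over records where the label holds, and that the support of X sums them over all records; the function name and the numbers are hypothetical.

```python
def fuzzy_weighted_confidence(record_weights, class_label):
    """record_weights: list of (weight of X in the record, class label of the record).
    Returns support(X U Y) / support(X), or 0.0 when X has no support."""
    n = len(record_weights)
    support_xy = sum(w for w, label in record_weights if label == class_label) / n
    support_x = sum(w for w, _ in record_weights) / n
    return support_xy / support_x if support_x > 0 else 0.0


# Hypothetical record weights of X with their class labels.
record_weights = [(0.4, "yes"), (0.8, "yes"), (0.2, "no"), (0.0, "yes")]

confidence = fuzzy_weighted_confidence(record_weights, "yes")
print(confidence)
```

Because both supports share the same denominator, the confidence reduces to the weight of X in the "yes" records divided by the weight of X in all records, so a rule scores high only when most of the weighted evidence for X sits in records of the target class.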
A new concept known as fuzzy_weighted_bayes_confidence is proposed to construct a fuzzy_weighted Bayesian belief network, i.e. FWBBN.
Definition 6
To define FuzzyWeighted_BayesianConfidence (FW_BC), consider a rule X → Y, which is framed as P(Y|X) as in Eq. 6 and used to assess the BN as given in Eq. 8.
The above algorithm and formulas are applied to various clinical datasets, and the results are reported in the following sections.
4 Result and discussion
The model is developed using the proposed methodology and formulas: the dataset's attributes are processed with the fuzzy weighted approach, and the strong rules generated from them are used to build Bayesian networks for the medical domain with the aim of higher accuracy. Table 4 reports the experimental setup, the rules generated, and the strong rules extracted on the basis of Fuzzy Weighted Bayesian Confidence (FWBC), using minimum threshold values of fuzzy weighted support and fuzzy weighted confidence to avoid overfitting and underfitting [34]. FWAR mining is applied to generate the strong rules from which the Bayesian model, termed FWBBN, is designed as an efficient and more accurate predictive model in the form of a clinical decision support system.
The experiment shows that the model developed with 70% training data and 30% test data, using strong rules based on fuzzy weighted Bayes confidence, achieves an accuracy of 99% on the breast cancer dataset.
5 Comparative analysis
The model is applied to numerous clinical datasets from the UCI repository for a rigorous comparative analysis. The Heart disease, Pima Indian diabetes, Hepatitis, and Liver disorder datasets are downloaded from the LUCS-KDD collection in .num format [31]. FWBBN performs with noteworthy accuracy across these diverse clinical datasets, as shown in Table 5. The analysis reports the highest accuracy obtained by setting different minimum threshold values for fuzzy weighted support and fuzzy weighted confidence and by varying the ratio of training to testing data. The proposed model thus performs well on a variety of clinical datasets, which indicates that Bayesian networks are well suited to the clinical domain.
The proposed FWBBN model is compared with existing fuzzy classification models on various medical datasets. Table 6 presents the comparison of the proposed model with state-of-the-art systems such as the Fine-Tuning Fuzzy KNN classifier [35], the Sparse Bayesian Random Weight Fuzzy Neural Network (RWFNN) [36], the Fuzzy Decision Tree (FDT) [37], the Fuzzy Random Forest (FRF) technique [38], the Neuro-Fuzzy Classifier [39], and the Fuzzy Temporal rule-based classification model [40].
These comparisons indicate that FWBBN outperforms some of the existing fuzzy models and is on par with others. The experimental results suggest that FWBBN is a reliable and well-justified alternative to the existing models and can be used for various disease diagnoses and their refinement.
6 Conclusions and future scope
A new methodology and algorithm that improve WBAR are proposed and termed FWBAR, an efficient algorithm for constructing a CDSS with a BBN, referred to as FWBBN. The proposed algorithm, with its new formulas and concepts, is implemented on datasets from the UCI machine learning repository, in particular the breast cancer and heart disease data, along with other benchmark datasets. The fuzzy approach is applied to reduce the sharp boundary problem in WBAR, so that stronger rules are obtained from the datasets using the weighted and fuzzy method. For prediction, FWBBN-CDSS can be used effectively and accurately, with high performance, low error, and low time complexity compared to the conventional Bayesian model. In future work, fuzzy weighted Bayesian rules can be used to generate synthetic datasets, which are in high demand in the clinical world for research and in-depth analysis, and which will be validated using the FWBBN model.
Data availability
The datasets used in this work are taken from the University of California Irvine machine learning repository. For example, the UCI machine learning breast cancer dataset is available from "https://csc.liv.ac.uk/~frans/KDD/software/LUCS-KDDDN/datasets/dataSet.html".
References
1. Kharya S, Soni S, Swarnkar T (2019) Weighted Bayesian association rule mining algorithm to construct Bayesian belief network. In: Proceedings - 2019 International Conference on Applied Machine Learning, ICAML 2019, pp 27–33. https://doi.org/10.1109/ICAML48257.2019.00013
2. Kharya S et al (2022) Weighted Bayesian belief network: a computational intelligence approach for predictive modeling in clinical datasets. Comput Intell Neurosci 2022:1–8. https://doi.org/10.1155/2022/3813705
3. Jameel R, Ashish MS, Mourya K (2022) Predictive modeling and cognition to cardio-vascular reactivity through machine learning in Indian adults with sedentary and physically active lifestyle. Int J Inf Technol 14(4):2129–2140. https://doi.org/10.1007/s41870-021-00721-y
4. Tech GSM (2011) Decision support in heart disease prediction system using Naive Bayes 2(2):170–176
5. Yadav DC, Pal S (2022) Thyroid prediction using ensemble data mining techniques. Int J Inf Technol 14(3):1273–1283. https://doi.org/10.1007/s41870-019-00395-7
6. Anooj PK (2012) Clinical decision support system: risk level prediction of heart disease using weighted fuzzy rules. J King Saud Univ Comput Inf Sci 24(1):27–40. https://doi.org/10.1016/j.jksuci.2011.09.002
7. Sharma A (2022) Performance analysis of machine learning based optimized feature selection approaches for breast cancer diagnosis. Int J Inf Technol 14(4):1949–1960. https://doi.org/10.1007/s41870-021-00671-5
8. Dhyani M, Singh G (2022) A novel intuitionistic fuzzy inference system for sentiment analysis. Int J Inf Technol. https://doi.org/10.1007/s41870-022-01014-8
9. Ibrahim D (2016) An overview of soft computing. Procedia Comput Sci 102:34–38. https://doi.org/10.1016/j.procs.2016.09.366
10. Gambhir S, Malik SK, Kumar Y (2016) Role of soft computing approaches in healthcare domain: a mini review. J Med Syst. https://doi.org/10.1007/s10916-016-0651-x
11. Susmita Mishra MP (2018) Study of fuzzy logic in medical data analytics. Int J Pure Appl Math 119(12):16321–16342. https://acadpubl.eu/hub/2018-119-12/articles/6/1515.pdf
12. Mokeddem SA (2018) A fuzzy classification model for myocardial infarction risk assessment. Appl Intell 48(5):1233–1250. https://doi.org/10.1007/s10489-017-1102-1
13. Fazel Zarandi MH, Seifi A, Ershadi MM, Esmaeeli H (2018) An expert system based on fuzzy bayesian network for heart disease diagnosis. Adv Intell Syst Comput 648:191–201. https://doi.org/10.1007/978-3-319-67137-6-21
14. Fan CY, Chang PC, Lin JJ, Hsieh JC (2011) A hybrid model combining case-based reasoning and fuzzy decision tree for medical data classification. Appl Soft Comput J 11(1):632–644. https://doi.org/10.1016/j.asoc.2009.12.023
15. Paul AK, Shill PC, Rabin MRI, Murase K (2018) Adaptive weighted fuzzy rule-based system for the risk level assessment of heart disease. Appl Intell 48(7):1739–1756. https://doi.org/10.1007/s10489-017-1037-6
16. Adeli A, Neshat M (2010) A fuzzy expert system for heart disease diagnosis. In: Proc. Int. MultiConference Eng. Comput. Sci. 2010, IMECS 2010, pp 134–139
17. Soni S, Vyas OP (2013) Building weighted associative classifiers using maximum likelihood estimation to improve prediction accuracy in health care data mining. J Inf Knowl Manag. https://doi.org/10.1142/S0219649213500081
18. Alwidian J, Hammo BH, Obeid N (2018) WCBA: Weighted classification based on association rules algorithm for breast cancer disease. Appl Soft Comput J 62:536–549. https://doi.org/10.1016/j.asoc.2017.11.013
19. Ramasamy S, Nirmala K (2017) Disease prediction in data mining using association rule mining and keyword based clustering algorithms. Int J Comput Appl 7074:1–8. https://doi.org/10.1080/1206212X.2017.1396415
20. Horný M (2014) Bayesian networks: A Technical report. Commun ACM 53(5):15. http://www.bu.edu/sph/files/2014/05/bayesian-networks-final.pdf
21. Xie J, Liu Y, Zeng X, Zhang W, Mei Z (2017) A Bayesian network model for predicting type 2 diabetes risk based on electronic health records. Mod Phys Lett B 31(19–21):1–6. https://doi.org/10.1142/S0217984917400553
22. Topuz K, Zengul FD, Dag A, Almehmi A, Yildirim MB (2018) Predicting graft survival among kidney transplant recipients: a Bayesian decision support model. Decis Support Syst 106:97–109. https://doi.org/10.1016/j.dss.2017.12.004
23. Agrahari R et al (2018) Applications of Bayesian network models in predicting types of hematological malignancies. Sci Rep 8(1):1–12. https://doi.org/10.1038/s41598-018-24758-5
24. Ershadi MM, Seifi A (2020) An efficient Bayesian network for differential diagnosis using experts’ knowledge. Int J Intell Comput Cybern 13(1):103–126. https://doi.org/10.1108/IJICC-10-2019-0112
25. Setiawan NA, Venkatachalam PA, Hani AFM (2009) Diagnosis of coronary artery disease using artificial intelligence based decision support system. In: Proceedings of the International Conference on Man-Machine Systems (ICoMMS), October, pp 11–13
26. AdelAzar KD (2019) A method for modelling operational risk with fuzzy cognitive maps and Bayesian belief networks. Expert Syst Appl 115:607–617. https://doi.org/10.1016/j.eswa.2018.08.043
27. Kingsley C (2020) Adaptive neuro fuzzy inference system for diagnosing coronavirus disease 2019 (COVID-19). Int J Intell Comput Inf Sci 20(2):1–31. https://doi.org/10.21608/ijicis.2020.40518.1027
28. Amadin FI, Bello ME (2019) A Bayesian belief network approach for predicting kernicterus. Niger J Technol 38(2):416. https://doi.org/10.4314/njt.v38i2.18
29. Simsek S, Dag A, Tiahrt T, Oztekin A (2020) A Bayesian belief network-based probabilistic mechanism to determine patient no-show risk categories. Omega. https://doi.org/10.1016/j.omega.2020.102296
30. Sunita Soni OPV (2012) Fuzzy weighted associative classifier: a predictive technique for health care data. Int J Comput Sci Eng Inf Technol 2(1):11–22. https://doi.org/10.5121/ijcseit.2012.2102
31. UCI machine learning breast cancer dataset. http://csc.liv.ac.uk/~frans/KDD/software/LUCS-KDDDN/datasets/dataSet.html
32. Dutta P (2022) A new association coefficient measure for the conflict management and its application in medical diagnosis. Int J Inf Technol. https://doi.org/10.1007/s41870-022-01000-0
33. Kaur I, Kumar V, Kavitha NT, Mohan P (2022) Maximum likelihood based estimation with quasi oppositional chemical reaction optimization algorithm for speech signal enhancement. Int J Inf Technol. https://doi.org/10.1007/s41870-022-01032-6
34. Manogaran G, Varatharajan R (2018) Hybrid recommendation system for heart disease diagnosis based on multiple kernel learning with adaptive neuro-fuzzy inference system. Multimed Tools Appl, pp 4379–4399
35. Salem H, Shams MY, Elzeki OM, Elfattah MA, Al-amri JF, Elnazer S (2022) Fine-tuning fuzzy KNN classifier based on uncertainty membership for the medical diagnosis of diabetes. Appl Sci 12(3):1–26. https://doi.org/10.3390/app12030950
36. Altilio R, Rosato A, Panella M (2018) A sparse Bayesian model for random weight fuzzy neural networks. IEEE Int Conf Fuzzy Syst 2018:1–7. https://doi.org/10.1109/FUZZ-IEEE.2018.8491645
37. Maheshwari V et al (2021) Nanotechnology-based sensitive biosensors for COVID-19 prediction using fuzzy logic control. J Nanomater. https://doi.org/10.1155/2021/3383146
38. Zeinulla E, Bekbayeva K, Yazici A (2020) Effective diagnosis of heart disease imposed by incomplete data based on fuzzy random forest. Conf Fuzzy Syst IEEE Int. https://doi.org/10.1109/FUZZ48607.2020.9177531
39. Tarle B, Akkalaksmi M (2019) Improving classification performance of neuro-fuzzy classifier by imputing missing data. Int J Comput 18(4):495–501. https://doi.org/10.47839/ijc.18.4.1619
40. Kanimozhi U, Ganapathy S, Manjula D, Kannan A (2019) An intelligent risk prediction system for breast cancer using fuzzy temporal rules. Natl Acad Sci Lett 42(3):227–232. https://doi.org/10.1007/s40009-018-0732-0
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kharya, S., Soni, S. & Swarnkar, T. Fuzzy weighted Bayesian belief network: a medical knowledge-driven Bayesian model using fuzzy weighted rules. Int. j. inf. tecnol. 15, 1117–1125 (2023). https://doi.org/10.1007/s41870-022-01153-y
DOI: https://doi.org/10.1007/s41870-022-01153-y