1 Introduction

The liver is one of the largest and most vital internal organs in the human body, playing an integral part in metabolism and various other processes, such as the decomposition of red blood cells. It performs numerous essential activities, including digestion, metabolism, immunity, and nutrient storage. These tasks establish the liver as a vital organ; without it, body tissues would quickly perish from a lack of energy and nutrients. A variety of factors contribute to the development of liver disease. Even when the liver is partially damaged, it continues to function as usual, so doctors can generally detect liver disease only once the liver is severely affected. Detecting liver disease at an early stage is therefore difficult.

Liver disease accounts for approximately two million deaths per year worldwide: one million due to complications of cirrhosis and one million due to viral hepatitis and hepatocellular carcinoma [1]. Cirrhosis and liver cancer account for 3.5% of all deaths worldwide and are currently the 11th and 16th most common causes of mortality, respectively [1]. Hence, liver testing should be readily available, and the cost of testing should be low.

Machine Learning (ML) applications are making a considerable impact on healthcare. ML is a subtype of Artificial Intelligence (AI) technology that aims to improve the speed and accuracy of physicians' work [2]. Many countries are currently dealing with overburdened healthcare systems and a shortage of skilled physicians, where ML offers great promise. It is crucial in medicine because it can identify patterns in large data sets and improve the process of identifying risk factors or disease-related diagnostic indicators. ML techniques can assist clinical management and specialists and achieve excellent performance in many medical applications, such as medical image analysis [2,3,4,5], language processing [6, 7], and tumor or cancer cell detection [8,9,10,11,12]. Classification algorithms from ML have begun to be used in clinical treatment, and knowledge can be extracted using such algorithms. Classification algorithms applied to large datasets can aid clinicians in making better judgments and improving patient outcomes through accurate liver disease prediction. Many ML-based techniques have been employed in the early identification of liver disease.

Onwodi Gregory used two real liver patient datasets to build classification algorithms for predicting liver illness [13]. Eleven different data mining classification techniques were applied to the datasets. According to the experimental results, the FT Tree algorithm's classification accuracy is superior to that of the other algorithms, with 78% accuracy, 77.5% precision, 86.4% sensitivity, and 38.2% specificity. Ebenezer Obaloluwa Olaniyi et al. proposed Backpropagation Neural Networks and Radial Basis Function Neural Networks to diagnose disorders and avoid misdiagnosis of liver condition patients [14]. The algorithms were compared with C4.5, CART, Naive Bayes, and Support Vector Machine (SVM), and the Radial Basis Function Neural Network was found to be the best model, with a recognition rate of 70%, making it more accurate and efficient than the other algorithms. Tapas Ranjan Baitharu et al. concentrated on medical diagnosis through learning from collected liver illness data to construct intelligent medical decision support systems that assist clinicians [15]. Their work compares the effectiveness and correct classification rate of multiple algorithms (J48, SVM, Random Forest) on these disorders, conducts a comparative analysis of classification accuracy on liver disorder data in various situations, and quantitatively compares the prediction abilities of standard classifiers. In their analysis, the Multilayer Perceptron provides the best overall classification result, with an accuracy of 71.59%, compared to the other classifiers.

In another study, Ramana et al. introduced a Modified Rotation Forest model using a Multilayer Perceptron (MLP) and a Random Subset feature selection technique for liver disease classification on the UCI dataset [16]. With MLP and the random subset approach, the accuracy on the UCI liver dataset is 74.78%. On the Indian liver illness dataset, the KStar model reaches 73.07% accuracy with Correlation-based Feature Selection (CFS). Alfisahrin et al. developed the NBTree algorithm by combining Decision Tree and Naive Bayes algorithms [17]. The accuracy of the NBTree algorithm is 67.01%, while Decision Tree and Naive Bayes achieve 66.14% and 56.14%, respectively; the Naive Bayes algorithm, however, has the quickest runtime of all the algorithms. The features of the UCI dataset are determined using a ranking approach. Dhamodharan examined data mining approaches for treating liver illness [18], comparing two models, FT Growth and Naive Bayes, and found that Naive Bayes (75.54%) outperforms FT Growth (72.66%) in terms of accuracy. A total of 29 datasets with 12 different attributes were compared.

Gulia et al. investigated intelligent algorithms for classifying liver patients using UCI datasets [19]. Different algorithms, such as J48, MLP, Random Forest, SVM, and Bayesian Network, were applied in the WEKA tool in that study. After feature selection, J48 scored 70.669%, MLP scored 70.8405%, SVM scored 71.3551%, Random Forest scored 71.8696%, and Bayes Net scored 69.1252% in accuracy. Vijayarani et al. employed Support Vector Machine and Naive Bayes classification methods to predict liver disorders [20], using MATLAB to analyze the data. Naive Bayes achieved 61.28% accuracy in 1670.00 ms, whereas SVM achieved 79.66% accuracy in 3210.00 ms. To predict fatty liver disease, Islam et al. developed four classification models (Random Forest, Support Vector Machine (SVM), Artificial Neural Network (ANN), and Logistic Regression) [21]. The Logistic Regression technique outperforms the other ML algorithms (accuracy 76.30%, sensitivity 74.10%, and specificity 64.90%).

Singh et al. created computer programs based on classification methods (such as Logistic Regression, Random Forest, and Naive Bayes) to estimate the likelihood of developing liver disease from a dataset that included liver function test results [22]. Particle Swarm Optimization (PSO) combined with SVM has identified the most crucial features for liver disease detection with higher accuracy than SVM, Random Forest, a Bayesian network, and an MLP Neural Network. SVM has also outperformed Bayesian and other earlier models in predicting drug-induced hepatotoxicity with fewer molecular descriptors [23]. A Convolutional Neural Network (CNN) model has identified liver cancer in hepatitis-infected individuals with an accuracy of 0.980 [24]. When applied to imaging datasets, Neural Network techniques can aid in differentiating between various forms of liver tumors [25].

Due to the adverse effects of liver disease on society, significant efforts have been undertaken to improve its diagnosis and treatment. It is therefore important to determine the most significant attributes for liver disease prediction. Most prior studies have looked into how reliably classifiers can detect liver disease cases, but only a few have examined all of the patient's conditions and pinpointed the most significant variables required for liver disease prediction. Studies in related fields have shown that selecting the critical features is crucial for healthcare professionals to understand how the risk factors for liver disease interact and how each affects the precision of liver disease prediction. The current work presents a high-performance paradigm for efficiently finding the most significant features of liver disease by using sensitivity analysis. Sensitivity analysis is essential in the medical field for discovering the attributes most responsible for the prevalence of a disease. It is a strategy for modifying model inputs in a controlled manner and evaluating the impact of these changes on the model output, revealing the model's sensitivity to such changes and the impact of particular features on the model's performance. It is crucial for determining how reliable the findings from clinical trials are and plays a vital role in interpreting or proving the integrity of the findings [26]. This study uses a standard deviation-based formula for the sensitivity analysis. Seven attributes (Age, Gender, Total Bilirubin (TB), Direct Bilirubin (DB), Alanine Aminotransferase (sgpt), Aspartate Aminotransferase (sgot), and A/G (Albumin/Globulin ratio)) are used as features to detect liver disease.

Data have been collected from several clinics and hospitals in different districts of Bangladesh. The patients' data are pre-processed and analyzed, arranged by gender (male and female), and the affected rates for men and women are calculated. This research conducts experiments employing ML algorithms for prediction and compares them on the liver disease patient dataset using several assessment criteria. Four different ML algorithms (Bagged Tree, SVM, K-Nearest Neighbor (KNN), and Fine Tree) are utilized in this study. Bagged Tree has been successfully applied in a variety of medical fields, including disease prediction [27,28,29], medical image recognition [30, 31], and gene selection [32,33,34]. SVM has a large number of applications in the medical industry, such as breast cancer [35,36,37] and skin cancer [38, 39] detection, and many other issues relating to disease prognosis; it can also attain greater generalization ability in small-sample classification tasks and is widely utilized in other domains, including handwritten character recognition, text classification, and image classification and recognition [40,41,42,43,44]. The supervised learning algorithm KNN is primarily employed for classification tasks and has been extensively applied to disease prediction [45, 46]; Almustafa et al. used KNN to classify a heart disease dataset [47]. Fine Tree models aid in the early detection of cancer [48, 49], diagnosing cardiac arrhythmias [50, 51], forecasting stroke outcomes [52,53,54], and assisting with chronic disease management [55, 56].
To the best of our knowledge, no prior work performs liver disease prediction on the same raw data using these four algorithms and selects the best one based on accuracy. The effectiveness of each model is assessed using the confusion matrix and all pertinent metrics, such as the ROC curve, True Positives, True Negatives, False Positives, False Negatives, error rate, accuracy, True Positive Rate, and False Positive Rate. This study focuses on ML algorithms that produce improved results and can aid physicians in making correct diagnoses. Our best-performing approach consistently provides an accuracy of 81.3% on the collected data.

The remainder of the paper is laid out as follows. The phases and strategies used in this suggested system are described in Sect. 2. Section 3 offers experimental results for each classifier, displays a comparison chart to investigate which algorithms are more accurate, and conducts sensitivity analysis experiments on the dataset. Finally, Sect. 4 concludes the paper by laying out future guidelines.

2 Methodology

This research aims to develop a model that can predict liver disease automatically, accurately, and at an early stage. The research methodology of this study is divided into the following distinct parts: data collection, pre-processing, ML techniques, training and testing, performance analysis, comparative analysis, and sensitivity analysis. In this section, the different ML classification algorithms and their implementations used for predicting liver disease are discussed in detail, and the whole research process is shown step by step. The analysis of liver disease in this research is summarized in Fig. 1. It starts with collecting patient details. The patient data are then analyzed and normalized, and the normalized data are used for training, testing, and validation. Once the training, testing, and validation output is acceptable after comparing all algorithms, the final model is selected for predicting the disease.

Fig. 1
figure 1

Workflow of the liver disease prediction model

2.1 Data collection and processing

The first stage in developing a model is gathering and analyzing data. The details of data collection and processing are discussed below.

2.1.1 Attributes

For a proper diagnosis, evaluating the main attributes of liver disease is necessary. Attributes reported for liver disease include Gender, Age, Total Bilirubin, Direct Bilirubin, Total Protein, Albumin, Alanine Aminotransferase, Aspartate Aminotransferase, Albumin/Globulin ratio, Sodium, Potassium, White Blood Cell count, Hemoglobin, Body Mass Index, and Red Blood Cell count [13,14,15,16,17,18]. Seven parameters are selected from these attributes in this work, and their details are shown in Table 1.

Table 1 Dataset description

2.1.2 Data collection and normalization

The dataset is collected from Dhaka Medical College Hospital and two other private hospitals in Bangladesh. It contains 203 samples from patients with liver disease and 101 samples from healthy individuals. Because the variables differ considerably in range, it is essential to normalize the data. Normalization is accomplished according to the following equation:

$$X_{\text{normalized}} = \frac{x - \min(x)}{\max(x) - \min(x)}$$
(1)

where Xnormalized is the updated normalized value.

The lowest value of each feature is mapped to '0', the highest value to '1', and all other values are converted to a value between '0' and '1'.
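As a minimal illustration of Eq. (1) in MATLAB (which the prediction snippet in Sect. 3.3 suggests is the working environment), the scaling can be applied column-wise as sketched below; the random matrix X is only a stand-in for the raw attribute values, since the collected dataset is not reproduced here.

% Min-max normalization of each attribute column to [0, 1], following Eq. (1).
% X stands in for the N-by-7 matrix of raw attribute values
% (Age, Gender, TB, DB, sgpt, sgot, A/G); the real data are not public.
rng(0);                                          % reproducible stand-in data
X = rand(304, 7);                                % placeholder raw data: 304 samples, 7 attributes
Xmin = min(X, [], 1);                            % per-column minimum ('0' after scaling)
Xmax = max(X, [], 1);                            % per-column maximum ('1' after scaling)
Xnormalized = (X - Xmin) ./ (Xmax - Xmin);       % every value now lies between 0 and 1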

2.2 Machine learning techniques

The structure of the liver disease prediction model used in this proposed system is given in Fig. 2. It starts with the input parameters of the dataset. After pre-processing, the dataset is fed into the proposed ML models. This study uses the following four ML classification algorithms for comparative analysis:

  a. Bagged Tree

  b. Support Vector Machine (Linear SVM)

  c. K-Nearest Neighbor (K-NN)

  d. Fine Tree

a. Bagged Tree: The term "bagging" is short for "bootstrap aggregating," which uses the original data n times with replacement (a bootstrap, or resampling, strategy) to produce training sets. It is an algorithm for improving the accuracy and stability of ML algorithms used in statistical classification and regression; it also lowers variance and helps prevent overfitting. Assume a training data set S. Bootstrap sampling creates a training sample Si by randomly selecting m examples from S with replacement, so examples may be repeated in Si. After T bootstrap samples are created, bagging trains one classifier on each sample, yielding T classifiers. A new instance is classified by a weighted majority vote of the T learned classifiers, so the result is an ensemble of classifiers.
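A minimal MATLAB sketch of such a bagged-tree ensemble is given below. The placeholder training data, the number of bootstrap samples (30), and the use of fitcensemble are illustrative assumptions, not the exact configuration used in this study.

% Illustrative bagged-tree ensemble (bootstrap aggregating).
% Xtrain/ytrain are placeholders for the normalized 70% training split of Sect. 2.3.
rng(1);
Xtrain = rand(213, 7);                            % placeholder predictors (about 70% of 304 samples)
ytrain = double(rand(213, 1) > 0.33);             % placeholder labels: 0 = healthy, 1 = liver disease
bagModel = fitcensemble(Xtrain, ytrain, ...
    'Method', 'Bag', ...                          % T bootstrap samples drawn with replacement
    'NumLearningCycles', 30, ...                  % T = 30 trees (assumed value)
    'Learners', templateTree());                  % one decision tree trained per bootstrap sample
yhatTrain = predict(bagModel, Xtrain);            % majority vote over the T learned trees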

b. Support Vector Machine: The SVM is a comparatively recent development in supervised ML. Here the SVM is implemented with the kernel Adatron technique, which maps inputs to a high-dimensional feature space and then optimally separates the data into the appropriate classes by isolating the inputs that lie near the data's boundaries. As a result, the kernel Adatron is particularly good at separating data sets with complex boundary relationships. In this form, the SVM is used only for classification rather than function approximation.
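A corresponding MATLAB sketch of a linear SVM is shown below; it is a generic fitcsvm call on the placeholder data from the previous sketch, not the kernel Adatron implementation itself.

% Illustrative linear SVM classifier; Xtrain/ytrain as in the bagged-tree sketch above.
svmModel = fitcsvm(Xtrain, ytrain, ...
    'KernelFunction', 'linear', ...               % linear kernel, matching the "Linear SVM" model
    'Standardize', true);                         % standardize predictors before training
yhatTrain = predict(svmModel, Xtrain);            % class given by the side of the separating hyperplane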

c. K-Nearest Neighbor: KNN is a supervised ML technique that can be used for both classification and regression problems and is straightforward to apply. It is a form of instance-based learning, often known as lazy learning, in which computation is deferred until a query is evaluated and the function is only approximated locally. Because it uses distance for classification, normalizing the training data can significantly increase its accuracy.
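A MATLAB sketch of a KNN classifier follows; the neighbour count and distance metric are assumptions, since they are not reported above.

% Illustrative KNN classifier; Xtrain/ytrain as in the sketches above.
knnModel = fitcknn(Xtrain, ytrain, ...
    'NumNeighbors', 10, ...                       % k (assumed; not stated in the text)
    'Distance', 'euclidean');                     % distance used to find the nearest neighbours
yhatTrain = predict(knnModel, Xtrain);            % label of the majority of the k nearest neighbours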

d. Fine Tree: Decision tree learning is one of the predictive modeling techniques used in statistics, data mining, and ML. To move from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves), it employs a decision tree (here termed a Fine Tree) as a predictive model. Classification trees are tree models in which the target variable takes a discrete set of values: the leaves correspond to class labels, and the branches represent conjunctions of attributes that lead to those class labels. When the target variable can take continuous values (usually real numbers), the trees are called regression trees.
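In MATLAB, a "Fine Tree" is commonly understood as a single decision tree allowed a large number of splits; the sketch below assumes that reading, with a maximum of 100 splits as an illustrative setting.

% Illustrative "Fine Tree": one decision tree with a generous split budget;
% Xtrain/ytrain as in the sketches above.
fineTreeModel = fitctree(Xtrain, ytrain, ...
    'MaxNumSplits', 100);                         % many splits -> fine-grained leaves (assumed setting)
yhatTrain = predict(fineTreeModel, Xtrain);       % class label read off the leaf the sample reaches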

Fig. 2
figure 2

Liver disease prediction model structure

2.3 Training and testing

Different ML techniques are developed and trained, including Bagged Tree, SVM, KNN, and Fine Tree. The complete data stream is divided into three subsets: training, validation, and testing, as illustrated in Table 2. All of the models are trained using 70% of the data. The validation subset is particularly useful for preventing overfitting, so 15% of the data are used for validation. A comparative analysis of the models is conducted to determine which model performs best. Finally, new and unused samples (the remaining 15%) are used to test the models.

Table 2 Proportions of data in each dataset
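One simple way to realize the 70/15/15 proportions of Table 2 is a random permutation of the sample indices, sketched below. It assumes Xnormalized (Sect. 2.1.2) and a label vector y covering all 304 samples; the study's actual partitioning tool is not specified beyond these proportions, and this split replaces the placeholder training data of the earlier sketches.

% Random 70/15/15 split into training, validation, and test sets (Table 2 proportions).
% Xnormalized: 304-by-7 normalized predictors; y: 304-by-1 labels (placeholder here).
rng(2);
y = double(rand(size(Xnormalized, 1), 1) > 0.33); % placeholder labels: 0 = healthy, 1 = liver disease
n = size(Xnormalized, 1);
idx = randperm(n);                                % shuffle the sample indices once
nTrain = round(0.70 * n);                         % 70% for training
nVal   = round(0.15 * n);                         % 15% for validation
trainIdx = idx(1 : nTrain);
valIdx   = idx(nTrain + 1 : nTrain + nVal);
testIdx  = idx(nTrain + nVal + 1 : end);          % remaining ~15% for testing
Xtrain = Xnormalized(trainIdx, :);  ytrain = y(trainIdx);
Xval   = Xnormalized(valIdx,   :);  yval   = y(valIdx);
Xtest  = Xnormalized(testIdx,  :);  ytest  = y(testIdx);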

2.4 Performance measurement

This study has used different evaluation metrics to evaluate the efficacy and usefulness of classification algorithms for liver disease prediction. A confusion matrix and all relevant metrics, including the ROC curve, True Positives, True Negatives, False Positives, False Negatives, error rate, accuracy, True Positive Rate (TPR), and False Positive Rate (FPR), etc., are used to evaluate a model's performance.

Confusion matrix: One of the most straightforward methods for assessing a model's efficacy and accuracy is the confusion matrix. It is used for classification problems, where an outcome can be assigned to one of several classes. The confusion matrix is a table with two dimensions, "Actual class" and "Predicted class": rows correspond to the actual classes and columns to the predicted classes. Two classes, Class 0 and Class 1, are present in the dataset. Table 3 shows the resulting confusion matrix:

Table 3 Confusion matrix

True Positives (TP): True Positives are the cases when the actual class of the data point is True, and the predicted is also True.

True Negatives (TN): True Negatives are the cases when the actual class of the data point is False, and the predicted is also False.

False Positives (FP): False Positives are the cases when the actual class of the data point is False, and the predicted is True.

False Negatives (FN): False Negatives are the cases when the actual class of the data point is True, and the predicted is False.

True Positive Rate (TPR): It is calculated as the number of correct positive predictions of liver disease divided by the total number of positives. It is also called recall (REC) or sensitivity.

$$\mathrm{Sensitivity }=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}$$
(2)

True Negative Rate (TNR): It is calculated as the number of correct negative predictions of liver disease divided by the total number of negatives. It is also called specificity.

$$\mathrm{Specificity }= \frac{\mathrm{TN}}{\mathrm{TN}+\mathrm{FP}}$$
(3)

Accuracy (ACC): Accuracy is calculated as the number of all correct predictions of liver disease divided by the total number of the dataset. Accuracy comparison is based on the performance among the four classification algorithms.

$$\mathrm{Accuracy }=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{FP}+\mathrm{TN}+\mathrm{FN}}$$
(4)

ROC: ROC stands for Receiver Operating Characteristics, a visual illustration of performance evaluation for classification problems. The ROC graph is constructed with TPR on the Y-axis and FPR on the X-axis.
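These quantities can be read directly off a MATLAB confusion matrix. The sketch below treats Class 1 (disease present) as the positive class and reuses the hold-out split and bagged-tree model from the earlier sketches; the class-column indexing of the score matrix assumes labels 0 and 1.

% Confusion-matrix metrics and ROC for a 0/1 classification, Class 1 as positive.
bagModel = fitcensemble(Xtrain, ytrain, 'Method', 'Bag');  % retrain on the split of Sect. 2.3
[yhat, score] = predict(bagModel, Xtest);            % predicted labels and per-class scores
cm = confusionmat(ytest, yhat);                      % rows = actual class (0 then 1), columns = predicted
TN = cm(1, 1);  FP = cm(1, 2);                       % actual Class 0 row
FN = cm(2, 1);  TP = cm(2, 2);                       % actual Class 1 row
sensitivity = TP / (TP + FN);                        % Eq. (2): True Positive Rate / recall
specificity = TN / (TN + FP);                        % Eq. (3): True Negative Rate
accuracy    = (TP + TN) / (TP + TN + FP + FN);       % Eq. (4)
errorRate   = 1 - accuracy;                          % overall proportion of wrong predictions
[fpr, tpr, ~, auc] = perfcurve(ytest, score(:, 2), 1);  % ROC points and area under the curve
plot(fpr, tpr); xlabel('False Positive Rate'); ylabel('True Positive Rate');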

2.5 Sensitivity analysis

The objective of sensitivity analysis is to figure out how input and target factors interact. In liver disease prediction, seven attributes (Age, Gender, Total Bilirubin (TB), Direct Bilirubin (DB), Alanine Aminotransferase (sgpt), Aspartate Aminotransferase (sgot), and A/G (Albumin/Globulin ratio)) of the dataset have been used as input variables. The output variable is categorized into two classes: absence (num = 0) and presence (num = 1) of the liver disease. The statistical characteristics of the dataset (inputs and output) are presented in Table 4.

Table 4 Statistical characteristics of the inputs and outputs

This study has used a standard deviation-based formula for calculating the sensitivity. In this method, each input and the output are first evaluated at their means. Then each parameter is evaluated at its mean plus or minus some multiple of its standard deviation (Mean ± Std) [57]. In this study, one variable at a time is changed from its mean value to (Mean ± 4 Std) while all other parameters are held constant at the reference condition; the reference condition is taken to be the means of the experimental values [58]. If the input variable's mean and standard deviation are Xmean and Xstd, respectively, the percentage change of the input variable is determined using the following formula:

$$\Delta X\,(\%) = \frac{\left(X_{\text{mean}} \pm 4\,X_{\text{std}}\right) - X_{\text{mean}}}{X_{\text{mean}}} \times 100$$
(5)

Sensitivity is then determined by dividing the percentage change in the input by the percentage change in the output. These steps are repeated for each independent variable to obtain the output's sensitivity to it.
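A rough MATLAB sketch of this standard deviation-based procedure is given below. It assumes a trained model (here the bagged tree from the earlier sketches) whose Class 1 score serves as the model output, and it perturbs one normalized input at a time to Mean + 4·Std while holding the others at their means; the exact response function and sign convention used by the authors are not specified, so this is only one plausible reading of Eq. (5).

% Standard deviation-based sensitivity sketch: perturb one input at a time to
% Mean + 4*Std (Eq. (5)) with the other inputs fixed at their means, and relate
% the percentage change in the input to the percentage change in the model output.
model = bagModel;                                     % assumed trained model (see earlier sketches)
Xmean = mean(Xnormalized, 1);                         % reference condition: all inputs at their means
Xstd  = std(Xnormalized, 0, 1);
[~, refScore] = predict(model, Xmean);                % model output (Class 1 score) at the reference
sens = zeros(1, size(Xnormalized, 2));
for j = 1:size(Xnormalized, 2)
    Xpert = Xmean;
    Xpert(j) = Xmean(j) + 4 * Xstd(j);                % perturb only attribute j
    [~, pertScore] = predict(model, Xpert);
    pctIn  = 100 * (Xpert(j) - Xmean(j)) / Xmean(j);  % Eq. (5): percentage change of the input
    pctOut = 100 * (pertScore(2) - refScore(2)) / refScore(2);  % percentage change of the output
    sens(j) = pctIn / pctOut;                         % sensitivity as defined in the text above
end                                                   % (guard against a zero output change in practice)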

3 Implementation and analysis

In this study, various ML techniques are employed to predict liver disease. Their effectiveness has been assessed, and a comparison between them has been carried out. The following section analyzes the data and presents the findings before moving on to the performance evaluation of the various classification techniques.

3.1 Data analysis

The collected data are analyzed and arranged by gender (male and female), as shown in Fig. 3 and Table 5. Out of 304 samples, 203 people were diagnosed with liver disease. According to the analysis, the affected rates for men and women are 22.17% and 77.83%, respectively; that is, more of the diagnosed patients are female than male.

Fig. 3
figure 3

Relationship of the gender feature to liver disease

Table 5 Analysis of liver disease dataset

3.2 Performance analysis

The confusion matrix and all relevant metrics, including the ROC curve, True Positives, True Negatives, False Positives, False Negatives, error rate, accuracy, TPR, and FPR, are used to evaluate and examine the effectiveness of the algorithms. The following section presents the performance analysis of each algorithm.

3.2.1 Bagged tree

Figure 4 depicts the confusion matrix created from the Bagged Tree's training and testing on the gathered data. The training and testing results are also included in Table 6. In the confusion matrix, the green cells indicate that the output matches the target, while the red cells indicate that it does not.

Fig. 4
figure 4

Confusion matrix of bagged tree

According to Fig. 4 and Table 6, the Bagged Tree correctly predicts 69 samples for the negative class (Class 0) and incorrectly predicts 33 samples, giving a True Positive Rate of 67.6% and a False Positive Rate of 32.4%. For the positive class (Class 1), this classifier correctly identifies 178 instances, with an 88.1% True Negative Rate and an 11.9% False Positive Rate. In total there are 247 correct predictions and 57 incorrect predictions, so the overall percentages of right and wrong predictions are 81.3% and 18.7%, respectively.

Table 6 Accuracy table of Bagged Tree

Figure 5 depicts the ROC curve of the Bagged Tree. The X-axis represents the False Positive Rate, whereas the Y-axis represents the True Positive Rate. The area under the ROC curve is 0.86 for both Class 0 and Class 1.

Fig. 5
figure 5

ROC curve of Bagged Tree

3.2.2 Support vector machine

The confusion matrix produced from the results of training and testing SVM on the collected data is displayed in Fig. 6. Table 7 also displays the training and testing results. The red cells in the confusion matrix show where the output does not match the target, whereas the green cells show where it does.

Fig. 6
figure 6

Confusion matrix of SVM

Table 7 Accuracy table of SVM

According to Fig. 6 and Table 7, SVM correctly predicts 49 samples for the negative class (Class 0) and incorrectly predicts 53 samples; the True Positive Rate is 48.0%, and the False Positive Rate is 52.0%. For the positive class (Class 1), SVM predicts 162 samples accurately and 40 samples inaccurately; the True Negative Rate is 80.2%, and the False Negative Rate is 19.8%. In total there are 211 correct predictions and 93 incorrect predictions, resulting in overall percentages of right and wrong predictions of 69.4% and 30.6%, respectively.

Figure 7 exhibits the ROC curve. In Fig. 7, the Y-axis represents the True Positive Rate, and the X-axis presents the False Positive Rate. The area under ROC is 0.69 for both Class 0 and Class 1.

Fig. 7
figure 7

ROC curve of SVM

3.2.3 K-nearest neighbor

The collected data are trained and tested using KNN. The training and testing results are shown in Table 8, and the resulting confusion matrix of KNN is shown in Fig. 8. Here, the green cells represent output classes that match the target class, and the red cells represent output classes that do not.

Table 8 Accuracy table of K-NN
Fig. 8
figure 8

Confusion matrix of K-NN

For the negative class (Class 0), this classifier predicts 72 samples correctly and 30 samples incorrectly, according to Fig. 8 and Table 8; the True Positive Rate is 70.6%, and the False Positive Rate is 29.4%. For the positive class (Class 1), KNN determines 171 samples correctly, with an 84.7% True Negative Rate, and 31 instances incorrectly, giving a 15.3% False Positive Rate. The total number of correct predictions is 243 and the total number of incorrect predictions is 61, so the overall percentages of right and wrong predictions are 79.9% and 20.1%, respectively.

Figure 9 exhibits the ROC curve. In Fig. 9, the Y-axis represents the True Positive Rate, and the X-axis presents the False Positive Rate. The area under ROC is 0.78 for both Class 0 and Class 1.

Fig. 9
figure 9

ROC curve of K-NN

3.2.4 Fine tree

The confusion matrix produced by Fine Tree's training and testing on the collected data is displayed in Fig. 10. Table 9 also includes the training and test results. In the confusion matrix, the green cell represents a match between the output and the target, whereas the red represents a mismatch.

Fig. 10
figure 10

Confusion matrix of Fine Tree

Table 9 Accuracy table of Fine Tree

The Fine Tree correctly predicts 102 samples for the negative class (Class 0) and wrongly predicts 48 samples, as shown in Fig. 10 and Table 9. The True Positive Rate is 52.9%, and the False Positive Rate is 47.1%. For the positive class (Class 1), this classifier correctly identifies 163 instances, with a True Negative Rate of 80.7% and a False Positive Rate of 19.3%. Overall, there are 211 correct predictions and 93 incorrect predictions, resulting in percentages of correct and incorrect predictions of 69.4% and 30.6%, respectively.

The ROC curve of the Fine Tree is shown in Fig. 11. The Y-axis shows the True Positive Rate, and the X-axis presents the False Positive Rate. Figure 11 shows that the area under the ROC curve for Class 0 and Class 1 is 0.77.

Fig. 11
figure 11

ROC curve of Fine Tree

3.3 Comparative analysis

A comparison of the different classifiers (Bagged Tree, SVM, KNN, and Fine Tree) is carried out. Comparisons of their confusion matrices and ROC curves are shown in Figs. 12 and 13, respectively. Table 10 compares the performance of Bagged Tree, SVM, KNN, and Fine Tree in terms of prediction accuracy, error rate, modeling time, and ROC. Comparisons of accuracy and error among the four classifiers are shown in Figs. 14 and 15. The accuracy rates for Bagged Tree, SVM, KNN, and Fine Tree are 81.3%, 69.4%, 79.9%, and 69.4%, respectively. The Bagged Tree classifier exhibits the highest accuracy (81.3%) of the four techniques, as shown in Table 10.

Fig. 12
figure 12

Confusion matrix of classifiers; a Bagged Tree, b Support Vector Machine, c K-Nearest Neighbor, d Fine Tree

Fig. 13
figure 13

ROC curve of classifiers; a Bagged Tree, b Support Vector Machine, c K-Nearest Neighbor, d Fine Tree

Table 10 Comparison of performance among different algorithms
Fig. 14
figure 14

Comparison chart of accuracy

Fig. 15
figure 15

Comparison chart of error

Figures 14 and 15 show that the Bagged Tree algorithm has the highest accuracy rate (81.3%) and the lowest error rate (18.7%). Figure 16 shows that the areas under the ROC curve for Class 0 and Class 1 are 0.86, 0.69, 0.78, and 0.77 for Bagged Tree, SVM, KNN, and Fine Tree, respectively; the highest ROC value (0.86) is found for the Bagged Tree.

Fig. 16
figure 16

Comparison chart of ROC

The time (in seconds) needed to build each classifier's model is shown in Fig. 17. Building a model using Bagged Tree, SVM, KNN, and Fine Tree takes 50.27, 16.17, 10.94, and 24.26 s, respectively. Figure 17 demonstrates that the Bagged Tree model has the longest build time, 50.27 s.

Fig. 17
figure 17

Comparison chart of the time to build the model

The Bagged Tree, SVM, KNN, and Fine Tree algorithms are then used to test new samples that have never been seen before. The prediction process consists of the following steps:

i. Test the dataset with new instances.

ii. After the training process, export the selected trained model from the app to the workspace for the prediction process.

iii. Then import the new sample dataset, which is also normalized. The attribute fields in this dataset are the same as in the full dataset used for training; only the values of the target class are not included.

iv. In the working window, call the prediction function of each exported trained model: 'yfit = trainedModel.predictFcn(T)', where trainedModel is the name of the exported compact model and T is the name of the test dataset (see the sketch after this list).

v. Run the test dataset. Then apply different classifier algorithms for testing purposes.
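The steps above can be condensed into a few lines of MATLAB; trainedModel is assumed to be the struct exported from the app (its predictFcn field wraps the compact model), and 'newSamples.csv' is a hypothetical file name for the normalized, label-free test table.

% Prediction on new, previously unseen samples using the exported model.
% 'newSamples.csv' is a hypothetical file containing the same seven attribute
% columns used for training (already normalized), without the target class.
T = readtable('newSamples.csv');            % import the new sample dataset
yfit = trainedModel.predictFcn(T);          % trainedModel: struct exported from the app
disp(yfit);                                 % predicted class labels (0 = healthy, 1 = disease)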

The main goal of this work is to find the algorithm that gives the best accuracy for the early prediction of liver disease. To this end, the prediction output is shown in Table 11.

Table 11 Comparisons chart of target and predicted value of new samples

3.4 Comparisons with earlier studies

Some distinctions between this proposed system and past studies are listed below. Most of the studies, like [13, 15, 17], have focused on how classifiers can correctly recognize cases of liver disease. Our suggested approach not only predicts liver disease cases but also considers the impact of each characteristic on the prediction process by employing sensitivity analysis to identify the most critical component that causes liver disease in most cases.

Our suggested approach outperforms various previously published works. Table 12 compares the suggested model with past research and shows that the performance of our model exceeds that of other existing models.

Table 12 Comparative analysis of the proposed system with existing works

3.5 Sensitivity analysis

Sensitivity analysis is used to determine which attributes have the most influence on the diagnosis of liver disease; it estimates each attribute's worth by calculating its sensitivity with respect to the class. Sensitivity analysis has been performed on the seven attributes of the dataset using three different methods to determine the most significant attribute responsible for liver illness. Table 13 presents the result of the sensitivity analysis for the dataset of this liver disease prediction system, and Fig. 18 shows the graphical representation of the sensitivity analysis using the standard deviation-based method.

Table 13 Sensitivity analysis of the proposed system
Fig. 18
figure 18

Graphical representation of sensitivity analysis using the standard deviation-based method

It is critical to understand the relative importance of the various factors contributing to liver disease occurrence in order to choose the best way to reduce the number of positive cases. The sensitivity analysis indices in this study show how important each parameter is to the prevalence of liver disease. Figure 18 demonstrates that Alanine Aminotransferase (sgpt), Aspartate Aminotransferase (sgot), and Age are the most influential parameters in the sensitivity analysis.

  • Alanine Aminotransferase (sgpt) is the most significant parameter in the sensitivity analysis of liver disease. The normal range of sgpt is 7 to 56 units per liter of blood serum. High levels of certain liver enzymes can be a significant sign of disease or injury: liver illnesses including fatty liver or non-alcoholic fatty liver disease (NAFLD), viral hepatitis, autoimmune hepatitis, and liver cancer cause an increase in Alanine Aminotransferase (sgpt) levels [59,60,61]. About 35% of Americans have fatty liver disease, which frequently co-occurs with diabetes and obesity [62].

  • Aspartate Aminotransferase (sgot) is the second most important parameter. A sgot/sgpt ratio higher than 2:1 (where the sgot is more than twice as high as the sgpt) is a sign of alcoholic liver disease [63]. Every year 493,300 people die from alcoholic liver disease, which is 47.9% of all liver cirrhosis deaths [64].

  • Age ranks as the third most crucial parameter. With age, the liver's blood flow and volume gradually decline. Studies employing ultrasound have shown that as people age, their liver capacity reduces by 20–40%. These alterations are caused by a decrease in blood flow to the liver, as evidenced by the fact that those over 65 had a 35% lower blood volume than those under 40. Hepatic sinusoidal endothelial cells and other liver cells, as well as gradually changing hepatic shape and function, are all connected with aging. Additionally, aging might increase the risks for several liver illnesses and act as an adverse prognostic factor, increasing the death rate [65].

  • The development of diseases is greatly influenced by gender. Women are more frequently diagnosed with acute liver failure and toxin-mediated liver diseases, such as alcohol- and drug-induced liver disease. Even though males misuse or depend on alcohol more than women do at a ratio of 2:1 in adults over the age of 26, women are more vulnerable than men to the toxic effects of alcohol on the liver for any given dose of alcohol [66].

  • The serum Albumin/Globulin ratio (A/G) can predict the prognosis of liver illness. A kind of pyogenic infection in the liver called a pyogenic liver abscess (PLA) can be fatal if it is not appropriately treated. Monitoring A/G has significant clinical implications for assessing PLA patients' progress [67].

  • Numerous predictive models have been developed to forecast outcomes and categorize risk in liver cirrhotic patients. Total Bilirubin (TB) is a component of the most widely used predictive models, including the Child–Pugh score and the Model for End-stage Liver Disease (MELD) score. In particular, serum bilirubin level accurately reflects hepatic synthesis and excretory function. Direct Bilirubin (DB) levels rise in liver cirrhosis due to portal flow distortion, intrahepatic cholestasis, and impaired hepatic bilirubin clearance. Meanwhile, splenomegaly and portosystemic shunting cause hemolysis, which raises indirect bilirubin levels. Due to the different pathophysiologies of high DB and indirect bilirubin levels, patients with primarily indirect bilirubin may have different prognoses and predisposing variables than those with Direct Bilirubin. Several studies have found that Direct Bilirubin is more valuable than Total Bilirubin for predicting prognosis in patients with liver cirrhosis [68].

The variation of the target with respect to the input parameters is depicted in Fig. 19. It is apparent from Fig. 19 that as the values of Alanine Aminotransferase (sgpt), Aspartate Aminotransferase (sgot), and Age grow, the risk of liver disease also increases; in most cases, increasing values of these three parameters have a substantial effect on the development of liver disease. These results appear reasonable compared to the findings of other studies conducted in this field [59, 62, 66].

Fig. 19
figure 19

Effect of distinct variables on liver disease a Alanine Aminotransferase (sgpt). b Aspartate Aminotransferase (sgot). c Age. d Gender. e A/G ratio. f Direct Bilirubin. g Total Bilirubin

3.6 Impact of the model on healthcare

The liver has several vital functions that keep the body healthy, including the production of bile, which allows the body to use protein, fat, and carbohydrates; the use and storage of fats, sugar, iron, and vitamins; and the detoxification of drugs, alcohol, and other potentially harmful substances. Cirrhosis occurs when liver tissue is destroyed, reducing blood flow to the liver and preventing the liver from performing processes vital to human health. Acute liver failure (ALF) occurs in about 2000 cases yearly, accounting for 6% of all liver-related deaths and 6% of liver transplants [17]. Although ALF is uncommon, it is linked to a high mortality rate. As a result, to avoid acute problems and limit the likelihood of long-term complications, liver disease necessitates ongoing medical care and self-management education. Our proposed model will improve disease diagnosis and benefit the medical profession; such tools will assist clinicians in accurately determining whether or not a patient has liver disease.

It is vital to understand the relative relevance of the various factors that contribute to the occurrence of liver disease in order to choose the best way to reduce the number of positive cases. This study's sensitivity analysis indices indicate how vital each parameter is to liver disease prevalence. Age, Gender, Total Bilirubin, Direct Bilirubin, Alanine Aminotransferase, Aspartate Aminotransferase, and the Albumin/Globulin ratio are all factors that clinicians consider when making initial liver disease diagnoses. Through sensitivity analysis of the dataset, the suggested system discovered that the Alanine Aminotransferase (sgpt) attribute has a considerable impact on the cause of liver disease. It can be expected that the proposed sensitivity analysis-based approach will aid clinicians in detecting liver disease at an early stage by evaluating Alanine Aminotransferase (sgpt) levels.

4 Conclusion

The number of patients with liver disease is constantly rising, and identifying its symptoms has become challenging. Accurate detection is required to aid the medical professional in prescribing the proper medications and medical care. This study highlights the application of various supervised classification approaches to detect liver disease at an early stage. Data have been collected from several clinics and hospitals in different districts of Bangladesh. Age, gender, TB, DB, Alanine Aminotransferase (sgpt), Aspartate Aminotransferase (sgot), and Albumin/Globulin ratio have been used as attributes to identify liver diseases. The patients’ data are pre-processed and examined. The diagnosing rates for men and women are also calculated. The diagnosis rate for men is 22.17%, while it is 77.83% for women. This research has conducted experiments employing four ML techniques (Bagged Tree, SVM, KNN, and Fine Tree) for prediction and compared them to the data set of liver disease patients using some assessing criteria. The results of these approaches are evaluated with the confusion matrix, TPR, FPR, ROC Curve, and accuracy. Bagged Tree provides 81.3% accuracy, 18.7% error rate, 0.86 ROC, and 50.27 s to create the model. SVM yields 16.17 s to build the model, with an accuracy of 69.4%. KNN achieves the model building time of 10.94 s, an accuracy of 79.9%, an error rate of 20.1%, and a ROC of 0.78. Fine Tree offers 69.4% accuracy, 30.6% error rate, 0.77 ROC, and 24.26 s to build the model. The experimental results conclude that the Bagged Tree classifier can be considered the best algorithm among other algorithms because of its highest classification accuracy of 81.30%. This study also examines the impact of each attribute on the prediction process by conducting sensitivity analysis to find the most significant factors responsible for most cases of liver disease. Age, Gender, TB, DB, Alanine Aminotransferase (sgpt), Aspartate Aminotransferase (sgot), and Albumin/ Globulin ratio attain 218.93%, 124.17%, 38.53%, 39.99%, 451.62%, 295.61%, and 42.37% sensitivity, respectively. It has been found that Alanine Aminotransferase (sgpt) is the most significant parameter in the sensitivity analysis of liver disease. The proposed approach could benefit physicians in making final predictions about liver patients. Physicians can make very accurate decisions if they use such a tool. More data exploration can lead to more exciting outcomes. It will be our main focus in the future.