1 Introduction

Machine learning is an artificial intelligence technology that allows a system to learn automatically from its surroundings and to use what it has learned to make intelligent decisions. Over the last few years, machine learning has been widely used in several sectors such as retail, media, agriculture, finance, and healthcare. Figure 1 presents the application of machine learning in different areas. The media market has the highest share, but healthcare is expected to dominate the market within a few years [1]. In 2017, the healthcare market was valued at $1806 million, and it is expected to reach around $8464 million by 2025, growing at a compound annual growth rate (CAGR) of 21.2% from 2018 to 2025 [2].

Fig. 1 Scenario of application of machine learning market 2018 [1]

Machine learning, which is considered a part of artificial intelligence, has a significant role in healthcare 4.0. The global machine learning market was worth $6.9 billion in 2018 and is expected to grow at a CAGR of 43.8% until 2025 [1]. The market size of artificial intelligence in healthcare is expected to reach around $31.3 billion by 2025 [3]. The market size of IoT for healthcare was $147.1 billion in 2018 and is expected to grow at a CAGR of 19.9% [4]. Figure 2 shows the application of IoT in different areas of healthcare. The use of IoT in healthcare reduces waiting time in emergencies; simplifies the tracking of inventory, staff, and patients; strengthens drug management; simplifies monitoring and reporting; sends alert messages to doctors in an emergency; reduces cost; and enables remote medical assistance and faster disease diagnosis. At home, it is very difficult to care for a patient around the clock, and caregivers sometimes forget to give medicine on time, but IoT devices can monitor the patient for 24 h and raise an alarm or send a message notification when medicine is due [5].

Fig. 2 Scenario of application of Internet of Things market 2017 [4]

People suffer from various disorders as a result of unintentional behavior and lifestyle. Early prediction of disease is difficult because analyzing a patient's data takes time, and it becomes even more time-consuming when practitioners try to predict manually. Artificial intelligence-based machine learning techniques make prediction early, accurate, and easy. IoT sensors collect data in real time and communicate with other physical devices, which reduces the time spent gathering patient data from different sources. Medical practitioners are now improving their computer skills so that they can provide better diagnoses using machine learning techniques.

Healthcare 4.0, the technological growth in healthcare from 2015 to 2021, uses machine learning, IoT, fog computing, and cloud computing and has created a revolution in healthcare [6]. The performance and accuracy of healthcare models improve when machine learning techniques are combined with IoT and run on fog and cloud computing. Fog computing, cloud computing, machine learning, and IoT are all growing technologies that have attracted the attention of researchers.

Cloud computing offers large storage capacity along with processing and computation capabilities, and the same facilities are also available in fog computing. Fog computing works as a catalyst for cloud computing, not as a substitute for it. The major difference between the two is storage space: fog computing has less storage space than cloud computing. Because of this smaller storage capacity, processing data takes less time in fog computing than in cloud computing. For this reason, we use fog computing for the processing of data. In the proposed model, IoT sensors collect the data (such as temperature, heart rate, and blood pressure), machine learning classifies the collected data, fog computing provides fast computation, and cloud computing provides storage [7, 35, 36]. The machine learning techniques classify the collected data and distinguish between healthy and unhealthy people. Manual analysis of patient data takes a lot of time and makes it difficult for the physician to predict the disease early. The main challenge of fog computing is effectively managing the massive amount of data produced by the exponential growth of IoT sensors.

In this work, we use seven machine learning classification techniques, namely DT, SVM, NB, AB, RF, ANN, and K-NN, to develop the healthcare model. We consider nine fatal diseases: heart disease, diabetes, breast cancer, hepatitis, liver disorder, dermatology, surgery data, thyroid, and SPECT heart. These diseases are serious and affect many people and their relatives. Motivated by their impact, we developed a healthcare model that can be used in hospitals for the early prediction of disease, so that people can save their lives and money. The aims of this work are to:

  1. Develop a fog computing-based healthcare model using machine learning and IoT.

  2. Evaluate the performances of the developed healthcare model for different diseases.

  3. Compare the performance of the developed model with prior developed models.

The remainder of the paper is organized as follows. Section 2 reviews prior work in this area, and Sect. 3 describes the proposed work. Results and discussion are presented in Sect. 4, and Sect. 5 concludes the work.

2 Related Work

Perveen et al. [8] presented a healthcare model to predict diabetes using AdaBoost and decision tree classifiers. They considered three patient age groups: 18–35, 36–55, and more than 55 years. The experimental results show that AdaBoost performs better than the decision tree. Wu et al. [9] designed a model for predicting liver disease using four machine learning algorithms (RF, NB, ANN, and linear regression). The model achieved its highest accuracy of 87.48% with the RF classifier. Shankar et al. [10] proposed a model for classifying thyroid data and used feature selection techniques to enhance its performance. The model achieved an accuracy of 97.49%, sensitivity of 99.05%, and specificity of 94.5%. Sisodia and Sisodia [11] presented a machine learning-based model to predict diabetes. The model used three classification algorithms (NB, SVM, and DT) and achieved an accuracy of 76.30% with the NB classifier.

Kumar and Vigneswari [12] designed a model for the prediction of hepatitis. The work used five machine learning classifiers: multilayer perceptron, RF, DT, C4.5, and logistic regression. The experimental results show that RF achieved an accuracy of 90.32% and took 0.14 s to execute. Parisi et al. [13] proposed a hybrid model for the prediction of hepatitis in patients. The model used Lagrangian SVM and MLP to classify the hepatitis data and achieved perfect accuracy and AUC. Hameed et al. [14] presented an e-healthcare model based on cloud computing. A service-oriented architecture is used to design the e-healthcare model, store patients' details, and direct patients to the correct specialist.

Vijayarani and Dhayanand [15] developed a model for the prediction of kidney disease using two classification algorithms (SVM and NB). The performance of the classifiers was analyzed in terms of accuracy and execution time, and SVM achieved higher accuracy than the NB classifier. Harimoorthy and Thangavelu [16] developed a model to predict multiple diseases (heart disease, diabetes, and kidney disease) using several machine learning classifiers: SVM-linear, SVM-radial, RF, and DT. The system achieved an accuracy of 89.9% for heart disease, 98.7% for diabetes, and 98.3% for kidney disease.

Jahangir et al. [17] designed an automatic multilayer perceptron (Auto MLP) application for the prediction of diabetes. The technique tunes its parameters automatically during training and also uses enhanced class outlier detection, which is carried out during data pre-processing. Verma et al. [18] proposed a CAD method for determining risk factors using particle swarm and K-means algorithms. Several learning algorithms were deployed: a multilayer perceptron (MLP), multinomial logistic regression (MLR), a fuzzy unordered rule induction algorithm, and C4.5. The data was collected from the Department of Cardiology, Indira Gandhi Medical College, Shimla, India. The dataset has 26 features and 335 instances. The experimental results show that MLR was the most accurate, at 88.4%.

Amin et al. [19] developed a hybrid intelligent healthcare model for the prediction of heart disease. Three feature selection algorithms (mRMR, Relief, and Lasso) were used to enhance the classifiers' performance. The model achieved accuracies of 89% and 88% with logistic regression and SVM, respectively. Muhammad et al. [20] designed a healthcare model for early and accurate prediction of heart disease using K-NN, AB, DT, RF, NB, LR, ANN, and SVM classifiers and three feature selection algorithms (mRMR, Relief, and Lasso). The model achieved an accuracy of 94.41%. Alkeshuosh et al. [21] applied new diagnostics to the study of heart disease and obtained an overall accuracy of 87%. Samuel et al. [22] proposed a model for the prediction of heart failure that achieved an accuracy of 91.10%.

Haq et al. [23] proposed a framework to predict Parkinson's disease using the SVM classification technique. Mathur et al. [24] described the use of AI applications in the prediction of cardiovascular disease, discussed the importance of ML techniques in its detection, and showed the relationship between AI, ML, and cardiovascular disease. Khourdifi and Bahaj [25] proposed a method for the prediction of heart disease using ML techniques, with nature-inspired optimization techniques to obtain optimized features. Zou et al. [26] applied ML techniques to predict diabetes mellitus. The work used 689,994 instances (healthy and diabetic patients) with RF, J48, and neural network techniques, and the results show that RF performs better than the other techniques. Joloudari et al. [27] developed a model to predict liver disease using several classification techniques together with particle swarm optimization. The model achieved an accuracy of 87.37% with the RF classification technique.

Analysis of existing work with identification of gap

A review of existing research shows that considerable work has already been done on predicting diseases in the healthcare system. However, there is still room to improve the effectiveness of disease prediction so that doctors can predict and diagnose patients' conditions at an early stage.

3 Materials and Methodology

The IoT is a network of a vast number of physical devices that collect and communicate data over the Internet. It combines different sensors and software, and it uses wireless communication to connect remotely located devices, mobile devices, and other physical devices. IoT plays a significant role in the enhancement of healthcare models. Many body-implanted and external sensors are used to collect patient data. Body-implanted sensors collect the patient's internal data, and external sensors collect the patient's environmental and external data. Doctors analyze the received data to predict disease. In the developed model, various machine learning classification algorithms classify the collected data to differentiate between healthy and ill people, so that disease is predicted early and accurately. The proposed model collects three kinds of patient data through IoT:

  1. Home patient data: The patients are equipped with readily available, low-cost IoT sensors. These sensors collect the patient's health data and send it to the IoT agent for further processing.

  2. Laboratory or clinical patient data: The patient visits a clinic or laboratory where all resources are available but the relevant doctor is not. The medical support staff collect the patient's data.

  3. Remotely located patient data: The patient lives in a remote area or very far from a hospital. IoT sensors collect the patient's data and send it to the doctors in real time so that the patient gets better treatment.

After collection, the data is sent to a fog server through an IoT device for further analysis. The fog server analyzes the data using classification algorithms and sends the results to the cloud server for storage and to the doctors for early diagnosis. The healthcare model is implemented with machine learning classification algorithms, namely Decision Tree (DT), Support Vector Machine (SVM), Naïve Bayes (NB), AdaBoost (AB), Random Forest (RF), Artificial Neural Network (ANN), and K-Nearest Neighbor (K-NN), as an application of AI [37]. These algorithms are applied to datasets for heart disease, diabetes, hepatitis, dermatology, thyroid, breast cancer, and liver disorder collected from the UCI Machine Learning Repository. Figure 3 presents the architecture of the proposed system model. AI and IoT both have a significant role in the proposed model. IoT is the first component: it connects everything to the Internet, collects and processes patient data in real time, and delivers the processed data to the people concerned without delay. AI is the second component: it works on the collected data to provide outcomes on time. Together, IoT and AI handle a huge volume of data and process it effectively.

Fig. 3 The architecture of the proposed AI and IoT based healthcare model

Working of the proposed system model

The work is divided into three phases. The first phase is the collection of data, the second is the pre-processing and computation of data, and the third is the presentation of results to doctors or end users and their storage on the cloud server.

Collection of data

In this phase, patient data is collected from different sources: the home, the laboratory or clinic, and remote locations. Different sensors and IoT devices collect the patient data in real time. Home patients are equipped with the required sensors. Lab technicians send the laboratory and clinical data to the IoT agent. Remotely located patients are also equipped with sensors, which collect the data and send it to the IoT agent for further processing.

Pre-processing and computation of data

In pre-processing, the received data is filtered and checked for missing values. Once pre-processing is complete, the data is sent to the fog server for computation. There, seven machine learning classifiers (DT, SVM, NB, AB, RF, ANN, and K-NN) process and classify the data.
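The missing-value check described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the field names and the choice to drop incomplete records (rather than impute them) are assumptions, since the source does not specify either.

```python
# Hypothetical field names; the paper only says that sensors report values
# such as temperature, heart rate, and blood pressure.
REQUIRED_FIELDS = ("temperature", "heart_rate", "blood_pressure")


def preprocess(records: list) -> list:
    """Keep only records in which every required field is present and not None."""
    clean = []
    for record in records:
        if all(record.get(name) is not None for name in REQUIRED_FIELDS):
            clean.append({name: float(record[name]) for name in REQUIRED_FIELDS})
    return clean


readings = [
    {"temperature": 36.8, "heart_rate": 72, "blood_pressure": 118},
    {"temperature": 37.9, "heart_rate": None, "blood_pressure": 131},  # missing value
    {"temperature": 38.4, "blood_pressure": 140},                      # missing field
]
clean_readings = preprocess(readings)  # only the first record survives
```

Dropping records is the simplest policy; a deployment that cannot afford to lose readings might impute missing values instead.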

Decision Tree (DT)

DT is a supervised machine learning classification technique. It uses a tree-like structure with three kinds of nodes: leaf nodes, internal (branch) nodes, and non-leaf nodes. These nodes act on different attributes and are used to evaluate conditional probabilities. The topmost node is the root of the tree, leaf nodes define the class labels, branch nodes carry the outcomes of a test, and non-leaf nodes denote the tests [28]. The decision tree technique does not require domain knowledge, and it handles both numerical and categorical data in a way that is easy to interpret. On the other hand, its performance depends on the dataset, and it is limited to one output attribute.
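To illustrate how a test at a non-leaf node might be chosen, the sketch below scans candidate thresholds on one numeric attribute and keeps the one with the lowest weighted Gini impurity. The Gini criterion is an assumption made for illustration; the source does not state which split criterion was used, and the toy data is invented.

```python
from collections import Counter


def gini(labels: list) -> float:
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_threshold(values: list, labels: list) -> tuple:
    """Return (threshold, weighted impurity) of the best 'value <= threshold' test."""
    best = (values[0], float("inf"))
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy data (invented): resting heart rate and a healthy (0) / unhealthy (1) label.
heart_rate = [62.0, 70.0, 74.0, 95.0, 101.0, 110.0]
label = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_threshold(heart_rate, label)
```

On this toy data the test `heart_rate <= 74.0` separates the two classes completely, so both child nodes are pure.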

Support Vector Machine (SVM)

SVM is a supervised learning approach based on statistical learning theory. It is used for binary and multi-class classification problems. The SVM method constructs a hyperplane in a high-dimensional space that maximizes the distance (margin) to the nearest data points of each class; these nearest points, the support vectors, define the hyperplane. SVM can achieve high accuracy, but it has a high computation time [29].
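For reference, the textbook hard-margin formulation for linearly separable data is given below. The notation is introduced here and is not from the source: training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, weight vector $w$, and bias $b$. The source does not state which kernel or soft-margin settings were used, so this is only the basic form.

$$\min_{w,\,b}\ \frac{1}{2}\lVert w \rVert^{2} \quad \text{subject to} \quad y_i\left(w^{\top}x_i + b\right) \ge 1, \quad i = 1,\dots,n$$

The margin between the two classes is $2/\lVert w \rVert$, so minimizing $\lVert w \rVert$ maximizes the margin. The training points for which the constraint holds with equality are the support vectors.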

Naïve Bayes (NB)

NB is a supervised learning approach based on Bayes' theorem that is used to solve classification problems. It is primarily used for classification on high-dimensional training datasets. NB is a probabilistic classifier: it predicts the class of an object from the probability that the object belongs to each class. NB has good accuracy and low computational cost [30].
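In its usual form, the classifier applies Bayes' theorem under the "naïve" assumption that the attributes are conditionally independent given the class. The notation below is introduced here and is not from the source: $c$ is a class label and $x_1,\dots,x_d$ are the attribute values of one record.

$$P(c \mid x_1,\dots,x_d) \propto P(c)\prod_{j=1}^{d} P(x_j \mid c), \qquad \hat{c} = \arg\max_{c}\ P(c)\prod_{j=1}^{d} P(x_j \mid c)$$

The low computational cost follows from this factorization: each $P(x_j \mid c)$ is estimated separately from the training data.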

Adaptive Boosting (AB)

Yoav Freund and Robert Schapire developed the AdaBoost classification technique. Adaptive boosting, also known as AdaBoost, is a machine learning meta-algorithm that uses the ensemble method and the principle of boosting to convert weak learners into a strong learner. AB builds a sequence of decision trees or other models. Records that the first model classifies incorrectly are given more priority [8], and these records carry more weight when the second model is trained. The process is repeated until the model reaches its target. AB is a very useful classification technique, but it has a high computation time.
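The priority given to misclassified records can be sketched as one round of the standard AdaBoost weight update for labels in {-1, +1}. In this standard formulation every record stays in the training set and the misclassified ones receive larger weights. The function name and toy data are illustrative, and the sketch assumes the incoming weights sum to 1.

```python
import math


def adaboost_round(weights: list, y_true: list, y_pred: list) -> tuple:
    """One AdaBoost round for labels in {-1, +1}.

    Returns the weak learner's vote weight (alpha) and the re-normalized
    sample weights, in which misclassified records count for more.
    """
    # Weighted error of the weak learner (weights are assumed to sum to 1).
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
    alpha = 0.5 * math.log((1 - error) / error)
    # t * p is +1 for a correct prediction and -1 for a wrong one.
    updated = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


y_true = [1, 1, -1, -1, 1]
y_pred = [1, -1, -1, -1, 1]              # the second record is misclassified
start = [1 / len(y_true)] * len(y_true)  # uniform initial weights
alpha, new_weights = adaboost_round(start, y_true, y_pred)
```

After this round the single misclassified record holds half of the total weight, so the next weak learner is pushed to classify it correctly.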

Random Forest (RF)

RF is a well-known supervised machine learning classification technique. The random forest algorithm builds decision trees on samples of the data, obtains a prediction from each tree, and selects the final result by voting. It performs better than a single decision tree because averaging the results reduces over-fitting [31], which is the major advantage of RF. RF has high accuracy and low computational cost [32].
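The voting step can be sketched as below. The three one-rule "trees" are hypothetical stand-ins for fitted decision trees, and the feature order is an assumption; only the majority vote itself reflects the description above.

```python
from collections import Counter
from typing import Callable, Sequence

# A fitted tree is modeled as a function from a feature vector to a class label.
Tree = Callable[[Sequence[float]], int]


def forest_predict(trees: Sequence[Tree], features: Sequence[float]) -> int:
    """Return the class label predicted by the largest number of trees."""
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical rules; assumed feature order: [temperature in °C, heart rate in bpm].
trees = [
    lambda f: 1 if f[0] > 38.0 else 0,
    lambda f: 1 if f[1] > 100.0 else 0,
    lambda f: 1 if f[0] > 37.5 and f[1] > 90.0 else 0,
]
prediction = forest_predict(trees, [38.6, 95.0])  # two of three trees vote 1
```

In a real forest each tree is trained on a different bootstrap sample of the data, which is what makes the individual votes differ.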

Artificial Neural Networks (ANN)

ANN is a widely used machine learning classification technique based on feed-forward neural networks. An ANN consists of three kinds of layers: input, hidden, and output. The input layer takes the attribute values, and the hidden layer processes them and passes the result to the output layer [9]. During training, the error at the output layer is propagated back to the hidden layer and the weights are modified, and this is repeated until the desired output is achieved.
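The flow from the input layer through the hidden layer to the output layer can be sketched as a single forward pass. The sigmoid activation, the layer sizes, and the weights are all illustrative assumptions; the source does not describe the network used, and the training step that adjusts the weights is omitted here.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, hidden_bias, output_weights, output_bias):
    """Forward pass: input layer -> one hidden layer -> single output in (0, 1)."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Illustrative, untrained weights: 3 inputs, 2 hidden units, 1 output.
x = [0.5, -1.2, 0.3]
hidden_weights = [[0.4, -0.6, 0.1], [-0.3, 0.8, 0.5]]
hidden_bias = [0.0, 0.1]
output_weights = [1.2, -0.7]
output_bias = 0.05
probability = forward(x, hidden_weights, hidden_bias, output_weights, output_bias)
```

The output can be read as the predicted probability of the positive class; training would compare it with the true label and adjust the weights to reduce the error.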

K-Nearest Neighbor (K-NN)

K-NN is a supervised learning approach. It classifies an unlabeled data point by finding its neighboring data points and using a voting system. To predict the class label of a new input, K-NN uses the similarity between the new input and the training samples. K-NN is simple to implement, but it requires massive storage, is sensitive to noise, and has a high computation time [33].
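A minimal sketch of the neighbor search and vote is shown below. Euclidean distance and k = 3 are assumptions, since the source gives neither the distance measure nor the value of k, and the training points are invented.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Invented data; assumed feature order: [temperature in °C, heart rate in bpm].
train_x = [[36.6, 68.0], [36.9, 75.0], [37.0, 72.0],
           [38.5, 104.0], [39.1, 110.0], [38.8, 98.0]]
train_y = [0, 0, 0, 1, 1, 1]
predicted = knn_predict(train_x, train_y, [38.7, 101.0], k=3)
```

Every prediction scans the whole training set, which is the storage and computation cost noted above. In practice the features would also be scaled first so that one attribute does not dominate the distance.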

Visibility of results and storage at cloud

In this phase, the fog server sends the computed results to doctors or end users for early treatment of patients and to the cloud server [34]. Once doctors receive the outcome, they respond to the patient with the treatment. The cloud server stores the received record for future use, such as billing and future treatment of the patient.
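The hand-off from the fog server to the doctor and the cloud can be sketched as below. All class names, field names, and the placeholder classification rule are hypothetical; the sketch only shows the order of operations described above, not the authors' system.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CloudStore:
    """Stand-in for cloud storage: keeps every classified record."""
    records: list = field(default_factory=list)

    def save(self, record: dict) -> None:
        self.records.append(record)


@dataclass
class FogServer:
    """Classifies a record, then forwards the result to the doctor and the cloud."""
    classify: Callable[[dict], str]
    notify_doctor: Callable[[dict], None]
    cloud: CloudStore

    def handle(self, record: dict) -> dict:
        result = {**record, "prediction": self.classify(record)}
        self.notify_doctor(result)  # result goes to the doctor or end user
        self.cloud.save(result)     # and is stored for billing or later treatment
        return result


inbox: list = []  # stands in for the doctor's notification channel
cloud = CloudStore()
fog = FogServer(
    # Placeholder rule; a trained classifier would be used here instead.
    classify=lambda r: "unhealthy" if r["temperature"] > 38.0 else "healthy",
    notify_doctor=inbox.append,
    cloud=cloud,
)
outcome = fog.handle({"patient_id": "P-001", "temperature": 38.6})
```

Passing the classifier and the notification channel in as functions keeps the fog server independent of which of the seven classifiers is deployed.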

Evaluation of classifier’s performance

Four metrics are used to evaluate the performance of the seven classifiers. The details of the performance evaluation metrics are as follows.

  1. Accuracy: It is the overall performance of the classifier, evaluated as

    $$Accuracy = \left( \frac{TP + TN}{TP + FP + TN + FN} \right) \times 100$$
    (1)

  2. Sensitivity: It is the ratio between the true positive cases and the total number of cases affected by the disease. Sensitivity is also known as recall. It is evaluated as

    $$Sensitivity/Recall = \left( \frac{TP}{TP + FN} \right) \times 100$$
    (2)

  3. Specificity: It is the ratio between the true negative cases and the total number of cases not affected by the disease. It is also known as the true negative rate. It is evaluated as

    $$Specificity = \left( \frac{TN}{TN + FP} \right) \times 100$$
    (3)

  4. AUC: It is the area under the receiver operating characteristic (ROC) curve, which plots the true positive rate against the false positive rate. A higher AUC indicates a better classifier.

Here, TP and TN denote the true positive and true negative predictions of the healthcare model, and FP and FN denote its false positive and false negative predictions.
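As a worked illustration of Eqs. (1) to (3), the sketch below computes the three ratios from confusion-matrix counts and estimates AUC as the fraction of (positive, negative) pairs that the classifier scores in the right order. The counts and scores are invented for illustration and are not results from this work; the AUC function returns a fraction between 0 and 1 rather than a percentage.

```python
def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, sensitivity, and specificity as percentages (Eqs. 1-3)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn) * 100,
        "sensitivity": tp / (tp + fn) * 100,
        "specificity": tn / (tn + fp) * 100,
    }


def auc(y_true: list, scores: list) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, y_true) if y == 1]
    negatives = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Invented counts: 55 patients with the disease, 45 without.
metrics = confusion_metrics(tp=45, tn=40, fp=5, fn=10)
# Invented classifier scores for five records (1 = disease present).
area = auc([1, 1, 0, 0, 1], [0.9, 0.4, 0.3, 0.6, 0.8])
```

With these counts, sensitivity is 45/55 and specificity is 40/45, which shows that the two metrics are computed over different groups of patients.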

4 Results and Discussion

This section presents the experimental outcomes of the classification algorithms DT, SVM, NB, AB, RF, ANN, and K-NN. We used datasets for heart disease, diabetes, breast cancer, hepatitis, liver disorder, dermatology, surgery data, thyroid, and SPECT heart. The datasets were collected from “https://archive.ics.uci.edu/ml/datasets.php”. Table 1 lists the datasets used with their numbers of samples. The implementation was carried out in Python on an Intel(R) Core(TM) i7 CPU M60 @ 2.80 GHz. For the experiments, each dataset is divided in an 80:20 ratio: 80% is used to train the classification algorithms, and the remaining 20% is used for testing. Accuracy, specificity, sensitivity, and area under the curve are evaluated for the seven classifiers.
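The 80/20 division can be sketched with the standard library as below. The shuffling and the fixed seed are assumptions; the source does not say whether the split was random, stratified, or repeated.

```python
import random


def train_test_split(rows: list, test_fraction: float = 0.2, seed: int = 42) -> tuple:
    """Shuffle a copy of `rows` and return (train, test) with the given test fraction."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))  # stand-in for 100 patient records
train, test = train_test_split(rows)
```

For imbalanced disease datasets, a stratified split that preserves the class ratio in both parts is a common alternative.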

Table 1 Experimental work dataset

Accuracy of the seven classifiers for different diseases

The accuracy of the seven classifiers is presented in Table 2. For heart disease, the RF classifier achieves the maximum accuracy of 95.82%, the ANN classifier achieves the second-highest accuracy of 94.61%, and the NB classifier achieves the minimum accuracy of 84.2%. For diabetes, the RF classifier performs best with 94.1%, and K-NN has the lowest accuracy of 84.63%. For breast cancer, RF achieves the highest accuracy of 96.56%, SVM is marginally behind with 96.22%, and NB achieves the lowest accuracy of 90.37%. For the hepatitis dataset, RF obtains the maximum accuracy of 96.8%, followed by SVM with 96.65% and AB with 96.2%, while NB obtains the lowest accuracy of 91.45%. For liver disorder, the highest accuracy of 76.9% is obtained with the RF classifier and the lowest accuracy of 70.11% with the NB classifier. For dermatology, the model achieves the maximum accuracy of 97.62% with the RF classifier and the lowest accuracy of 88.35% with the NB classifier. For the surgery data, the highest accuracy of 90.23% is achieved by the SVM classifier and the lowest accuracy of 82.37% by the K-NN classifier. For the thyroid and SPECT heart datasets, the highest accuracies of 86.15% and 86.3%, respectively, are obtained with the RF classifier, and the lowest accuracies of 81.82% and 81.34% are obtained by the NB and DT classifiers, respectively. The developed model took 132 ms for the computation. Figure 4 presents a graphical comparison of the accuracy of the seven classification algorithms for the different diseases.

In most cases, the RF classifier achieves the highest accuracy in the prediction of disease. The RF classifier is made up of a large number of DTs and produces its result by a voting strategy, which is why it provides the best results compared with the other classification algorithms.

Table 2 Accuracy of the seven classifiers for different diseases

Fig. 4 Accuracy of the seven classifiers for different diseases

Sensitivity of the seven classifiers for different diseases

The sensitivities of the seven classifiers for the different diseases are presented in Table 3. For the heart disease dataset, the RF classifier achieves the highest sensitivity of 98.83% and the NB classifier the lowest of 87.41%. For diabetes, RF achieves the maximum sensitivity of 97.7% and NB the lowest of 88.62%. For breast cancer, sensitivities of 99.62% and 92.25% are achieved by the RF and NB classifiers, respectively. For the hepatitis dataset, the maximum sensitivity of 99.31% is provided by the RF classifier and the minimum of 89.83% by NB. For the liver disorder dataset, the model achieves the maximum sensitivity of 83.18% with the RF classifier and the minimum of 72.26% with the DT classifier. For dermatology, the maximum sensitivity of 99.67% is provided by RF and the minimum of 88.63% by the DT classifier. For the surgery data, the highest sensitivity of 93.78% is achieved with the SVM classifier and the lowest of 80.36% with the K-NN classifier. For the thyroid dataset, the SVM classifier achieves the highest sensitivity of 89.83% and the AB classifier the minimum of 84.83%. For the SPECT heart dataset, the ANN classifier achieves the highest sensitivity of 90.71% and the DT classifier the lowest of 81.4%. Figure 5 compares the sensitivity achieved by the seven classifiers for the different diseases.

Table 3 Sensitivity of the seven classifiers for different diseases

Fig. 5 Sensitivity of the seven classifiers for different diseases

Specificity of the seven classifiers for different diseases

Table 4 shows the specificity of the seven classifiers for the different diseases. For heart disease, the RF classifier achieved the maximum specificity of 93.25% and K-NN the minimum of 80.31%. For the diabetes dataset, SVM achieved the maximum specificity of 93.45% and K-NN the minimum of 81.47%. For breast cancer, the maximum specificity of 97.81% is achieved by the RF classifier and the minimum of 92.14% by the NB classifier. For the hepatitis dataset, the RF classifier achieved the highest specificity of 97.72% and the DT classifier the lowest of 90.15%. For liver disorder, the maximum specificity of 74.82% is achieved by the SVM classifier and the minimum of 66.42% by the DT classifier. For the dermatology dataset, the RF classifier provided the maximum specificity of 98.34% and the DT classifier the minimum of 90.37%. For the surgery data, the highest specificity of 90.52% is achieved by the RF classifier and the lowest of 78.45% by the K-NN classifier. For the thyroid dataset, the SVM classifier achieved the highest specificity of 86.62% and the NB classifier the minimum of 77.38%. For the SPECT heart dataset, the RF classifier provides the maximum specificity of 88.25% and the NB classifier the minimum of 82.55%. Figure 6 presents a graphical comparison of the specificity achieved by the seven classifiers for the different diseases.

Table 4 Specificity of the seven classifiers for different diseases

Fig. 6 Specificity of the seven classifiers for different diseases

AUC of the seven classifiers for different diseases

The AUC values of the seven classifiers for the different diseases are presented in Table 5. For heart disease, the model provides the maximum AUC value of 98.34% with the RF classifier and the minimum of 88.35% with the NB classifier. For the diabetes dataset, the SVM classifier provides the highest AUC value of 97.3% and the NB classifier the minimum of 89.72%. For breast cancer, the maximum AUC value of 95.15% is achieved by the SVM classifier and the minimum of 84.41% by the DT classifier. For the hepatitis dataset, the RF classifier achieved the maximum AUC value of 98.35% and the K-NN classifier the minimum of 91.26%. For liver disorder, the SVM classifier provides the maximum AUC value of 84.34% and the NB classifier the minimum of 78.26%. For the dermatology dataset, the highest AUC value of 99.32% is achieved with the RF classifier and the lowest of 92.36% with the DT classifier. For the surgery dataset, the RF classifier achieved the maximum AUC value of 94.25% and the K-NN classifier the minimum of 85.82%. For the thyroid and SPECT heart datasets, the maximum AUC values of 90.37% and 89.52%, respectively, are achieved by the RF classifier, and the minimum AUC values of 78.52% and 85.18% are achieved by the DT and NB classifiers, respectively. Figure 7 presents a graphical comparison of the AUC values achieved by the seven classifiers for the different diseases.

Table 5 AUC of the seven classifiers for different diseases

Fig. 7 AUC of the seven classifiers for different diseases

Comparison of developed models with prior developed models

Table 6 compares the developed work with previously developed healthcare models based on machine learning algorithms. The developed work is compared with Amin et al. [19] (accuracy of 89%), Sisodia and Sisodia [11] (76.3%), Kumar and Vigneswari [12] (90.23%), Muhammad et al. [20] (94.41%), Alkeshuosh et al. [21] (87%), Samuel et al. [22] (91.10%), and Harimoorthy and Thangavelu [16] (89.9%). The developed model has an accuracy of 97.62%, which is 8.62, 21.32, 7.39, 3.21, 10.62, 6.52, and 7.72 percentage points higher than Amin et al. [19], Sisodia and Sisodia [11], Kumar and Vigneswari [12], Muhammad et al. [20], Alkeshuosh et al. [21], Samuel et al. [22], and Harimoorthy and Thangavelu [16], respectively. Figure 8 presents a graphical view of the comparative analysis of the developed model with existing models.
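The differences quoted above are absolute differences in accuracy, that is, the accuracy of the developed model minus the accuracy of each prior model. The short check below reproduces them from the figures in this paragraph; the variable names are illustrative.

```python
# Accuracies (%) as quoted in the comparison paragraph above.
prior_accuracy = {
    "Amin et al. [19]": 89.0,
    "Sisodia and Sisodia [11]": 76.3,
    "Kumar and Vigneswari [12]": 90.23,
    "Muhammad et al. [20]": 94.41,
    "Alkeshuosh et al. [21]": 87.0,
    "Samuel et al. [22]": 91.10,
    "Harimoorthy and Thangavelu [16]": 89.9,
}
developed_accuracy = 97.62

# Absolute difference in percentage points, rounded to two decimals.
improvement = {
    model: round(developed_accuracy - accuracy, 2)
    for model, accuracy in prior_accuracy.items()
}
```

These are differences in percentage points, not relative improvements; a relative improvement would divide each difference by the prior model's accuracy.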

Table 6 Comparative analysis between proposed work and previous work

Fig. 8 Comparative analysis of the accuracy

5 Conclusion

Implementing machine learning classification algorithms for the prediction of disease is an emerging field. In this work, we developed a healthcare model based on seven classification algorithms: DT, SVM, NB, AB, RF, ANN, and K-NN. These classifiers were applied to datasets for heart disease, diabetes, breast cancer, hepatitis, liver disorder, dermatology, surgery data, thyroid, and SPECT heart. The classifiers' performance was evaluated with four metrics: accuracy, sensitivity, specificity, and AUC. The accuracy of the developed healthcare model varies by disease, with a maximum accuracy of 97.62% achieved by the RF classifier and a minimum accuracy of 70.11% by the NB classifier. The model achieves the maximum sensitivity of 99.67% with the RF classifier and the minimum sensitivity of 72.26% with the DT classifier. The maximum specificity of 97.81% is achieved by the RF classifier and the minimum specificity of 66.42% by the DT classifier. The performance of the model is also evaluated by AUC: the maximum AUC of 99.32% is achieved by the RF classifier and the minimum AUC of 78.26% by the NB classifier. The RF classifier achieves the maximum accuracy, sensitivity, specificity, and AUC. For most of the datasets, RF provides more accurate results than the other classifiers. In the future, this work can be extended to other applications such as weather forecasting, military applications, and flood prediction.