Abstract
Respiratory diseases such as asthma and COVID-19 require preventive and precautionary measures. Because medical treatment is not available to the masses, researchers are currently focusing on clinical decision support systems (CDSS). CDSS for respiratory diseases use Machine Learning (ML) techniques to classify symptoms into possible diseases. This approach has attracted the attention of researchers worldwide and also assists medical doctors in the early diagnosis of disease. In this review paper, PRISMA guidelines are used to conduct a detailed overview of the early detection of respiratory diseases using ML techniques. Among various ML techniques, Artificial Neural Networks (ANN), Support Vector Machine (SVM), Decision Tree (D-Tree), Logistic Regression (LR), K Nearest Neighbor (KNN), Random Forest (RF), and AdaBoost are discussed. Then the respiratory diseases for which ML-based CDSS are available are identified, along with possible future directions for their improvement. Furthermore, the tools and ML techniques are compared with each other to give researchers clarity for future use. The paper concludes with the future direction of ML in the successful implementation of CDSS in the field of respiratory disease.
1 Introduction
Respiratory diseases (also known as pulmonary or lung disorders) are pathological conditions of the lungs’ tissues and cells that make breathing difficult. More specifically, when any part of the respiratory system, such as the bronchi, trachea, bronchioles, alveoli, pleurae, pleural cavity, or the nerves and muscles of respiration, is affected by an ailment, the condition falls under the category of respiratory diseases [1]. Mild respiratory diseases include the common cold, influenza, and pharyngitis, while serious ones include bacterial pneumonia, pulmonary embolism, tuberculosis (TB), severe asthma, lung cancer, and severe acute respiratory syndromes, such as those caused by coronaviruses. Respiratory illnesses can be classified by the affected part of the respiratory system (organ or tissue), by nature and pattern, or by cause of disease [2]. The study of respiratory disease is called pulmonology, and a physician in this field is known as a pulmonologist or respirologist. International respiratory societies such as the American College of Chest Physicians (ACCP), International Respiratory Societies (IRS), European Respiratory Society (ERS), Asian Pacific Society of Respirology (APSR), International Union against Tuberculosis and Lung Disease, and other relevant societies and associations meet regularly to control and monitor respiratory diseases worldwide [2, 3]. These societies have 70,000 professional members worldwide who draw attention to the global burden of respiratory diseases. Respiratory diseases may be preventable if diagnosed at an early stage. If one takes precautions and adheres to the Standard Operating Procedures (SOPs), the disease can be mitigated and, in many cases, avoided [3,4,5].
Therefore, basic treatment guidelines and medical supervision are vital to control respiratory diseases at early stages. If patients are unable to find proper guidance and treatment in time, the disease may progress and become life-threatening. To close this gap, several researchers have proposed a system typically called a “Clinical Decision Support System (CDSS)”. A CDSS is a type of software that takes patient information and symptoms, processes this information using advanced artificial intelligence algorithms in line with SOPs, and then, after predicting the disease, suggests possible treatment for the patient [6,7,8,9].
Since a CDSS is computer-based software that also records and keeps patients’ data from previous visits, it may also improve patient care by providing detailed prescriptions based on patient history. Since a CDSS is primarily a decision support system, it assists clinicians in making better decisions and hence reduces errors in medical treatment. In the US, an analysis of 439 quality indicators found that adults receive only half of the prescribed treatment [10]. The US Institute of Medicine estimated that each year 98,000 US residents die due to preventable medical errors [11]. Similarly, a study of UK hospitals observed that 11% of patients experienced adverse events, of which 48% were preventable and 8% led to death [12]. To reduce such cases and increase the efficiency of medical treatment, healthcare organizations are turning to CDSS. Compared with manual approaches, CDSS have been shown to be more effective and more likely to result in lasting improvement in clinical practice. Recently, 66% of CDSS were reported to work well while 34% did not [13]. To improve CDSS efficiency, researchers are working on enhancing the rules embedded in the software. For this purpose, Artificial Intelligence techniques such as Machine Learning are well suited to enhancing the efficiency of CDSS [6, 7].
Artificial Intelligence (AI) emerged as a field in 1956 and is described as the study of “intelligent agents”: tools that perceive their environment and take actions that maximize the probability of successfully achieving their goals [14]. AI in the healthcare industry started with the development of expert systems, in which rules were acquired from interviews with medical experts and then programmed into a software system [15]. An early expert system named “MYCIN”, built on roughly 450 rules and SOPs, was developed in 1976 to suggest antibiotics for bacterial diseases. Due to the large volume of rules, the expert system was never used in clinical practice. To overcome the limitations of expert systems, Machine Learning (ML) techniques were developed and adapted for CDSS [15]. In ML, manual rules are replaced by system-generated rules: the ML algorithm learns from its environment and then applies the learned rules in the future, which is more helpful in a CDSS. A well-known and effective ML technique is deep learning, which is based on artificial neural networks [16]. ML depends on the availability of data in sufficient volume and quality; consequently, ML-based systems are sometimes also known as data-intensive systems. Healthcare data is increasing rapidly, especially after the emergence of the Internet of Medical Things (IoMT) and the availability of inexpensive internet-connected electronic devices. Most CDSS are still expert systems, but ML-based CDSS are becoming more popular due to efficient learning and prediction [17]. ML is widely used in healthcare expert systems, for example with X-ray and MRI machines, for locating cancerous regions in lung nodules and tissues, and in many other fields. In this article, we focus only on clinical decision support systems (CDSS) for respiratory illnesses. It may therefore be concluded that ML techniques can enhance CDSS for respiratory diseases.
Currently, researchers are using different ML techniques to diagnose and treat respiratory diseases, such as artificial neural networks, deep reinforcement learning systems, and convolutional neural networks (CNN). From the results point of view, ML for CDSS needs more work to improve efficiency and accuracy. To the best of the authors’ knowledge, there is no survey paper on machine learning CDSS for respiratory diseases. The following are the contributions of the proposed research work:
- The article gives an overview of the ML techniques used by CDSS for respiratory diseases.
- The latest research work on respiratory disease diagnosis using clinical decision support systems is presented, followed by recent research work on early detection of respiratory disease using machine learning, together with its pros and cons.
- A systematic comparison and discussion of the results of the state-of-the-art ML techniques used for respiratory disease treatment and precautionary measures is provided. Detailed analysis and future directions, as drawn from the literature, are given in the discussion section.
The rest of the paper is organized as follows. Section 1 gives a brief background of the topic and states the aim of this paper. Section 2 describes the methods used to select the studies included in this systematic review, the guideline that was followed, and the inclusion and exclusion criteria. Section 3 presents the results, in which the studies are summarized in the form of a table. Section 4 presents a comprehensive overview and detailed discussion of the machine learning techniques used for the early detection of respiratory diseases. In Sect. 5, the authors discuss the future directions in detail. Section 6 concludes the systematic review paper.
2 Materials and Methods
In recent decades, there has been a significant increase in the number of studies published in the biomedical literature, particularly in tropical medicine and health. Nonetheless, the available studies are often heterogeneous in design, operational quality, and subject matter, and can address the research question in different ways, which adds to the complexity of the evidence and of synthesizing findings [18]. The highest level of evidence in the evidence-based pyramid is the systematic review and meta-analysis (SR/MA). A well-conducted SR/MA is thus considered a viable approach to keep healthcare practitioners aware of current evidence-based medicine. In addition, despite increased guidance on how to perform a systematic review successfully, we found that the key steps still consist of framing the question, identifying relevant studies (which involves defining criteria and searching for articles), evaluating the quality of the included studies, summarizing the data, and interpreting the findings. In practice, researchers can run into many difficulties at these steps for which no detailed guidance exists [19]. This review was planned, conducted, and reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [20,21,22].
2.1 Search Engine and Keyword Strategy
Using Boolean operators, synonyms and alternative phrases were combined to form a set of search strings: AND focuses and narrows the search, while OR extends and widens it. We refined our search with a Boolean query such as (CDSS) AND (asthma OR COPD OR Cystic Fibrosis OR lung cancer) AND (machine learning OR computer vision OR neural network OR pattern recognition OR artificial intelligence).
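As a minimal sketch of how such a search string can be assembled, the snippet below joins alternative terms with OR and then joins the resulting groups with AND. The helper names `or_group` and `build_query` are hypothetical and introduced only for illustration; the term lists are copied from the query quoted above.

```python
# Hypothetical helpers for illustration; the terms come from the search string in the text.
def or_group(terms):
    """Join alternative terms with OR (widens the search) and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(groups):
    """Join the OR-groups with AND (narrows the search)."""
    return " AND ".join(or_group(group) for group in groups)


system_terms = ["CDSS"]
disease_terms = ["asthma", "COPD", "Cystic Fibrosis", "lung cancer"]
method_terms = [
    "machine learning",
    "computer vision",
    "neural network",
    "pattern recognition",
    "artificial intelligence",
]

query = build_query([system_terms, disease_terms, method_terms])
print(query)
```

Keeping each concept in its own list makes it easy to add a synonym to one group without touching the others.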
PubMed, IEEE Xplore, and ScienceDirect were used to search for peer-reviewed papers. The results in ScienceDirect were limited to review articles, peer-reviewed research publications, and conference abstracts. All three sources were searched with the specified keywords for publications up to June 2020 (Fig. 1).
2.2 Inclusion Criteria
- Studies predicting respiratory diseases.
- Studies that make predictions about the diseases using respiratory measures/parameters or images of the lungs obtained from different imaging techniques.
- Studies based on machine learning methods.
- Studies that identify diseases from images using image segmentation algorithms, software, or applications.
- Studies written in English.
- Studies published in a journal or in conference proceedings.
2.3 Exclusion Criteria
- Studies that do not include the prediction of respiratory disease.
- Studies that did not use respiratory parameters or images of the lungs as data to identify the condition.
- Research articles that do not involve machine learning or detection software.
- Studies published in any language other than English.
- Articles that were not published in an academic journal or in conference proceedings.
3 Results
Some of the studies selected by following the PRISMA guidelines are briefly summarized in Table 1. Using the search approach, the authors independently assessed the quality of all papers that met the inclusion criteria. Research quality was evaluated, and relevant data relating to study design and statistical methodology were identified. The details extracted from the chosen studies, each placed in its own column of Table 1, are the essentials of any machine learning study. Column 3 of Table 1 gives the name of the disease addressed in each study. The most important insight from this column is which disease has been chosen most often and has been the center of focus for most researchers. For a machine learning article, we look specifically for details about the algorithm, the feature extraction method, and the accuracy. These three are interrelated. For example, the author of [23] achieved an accuracy of 96% using a CNN, whereas the author of [24] achieved 92%. Both authors worked on the same disease, but the neural network architecture differs between the two studies.
4 Discussion
In this section, a detailed discussion is provided on the machine learning (ML) techniques used by researchers around the world for the early detection of respiratory diseases. A typical ML process starts with the collection of relevant data; this data is then processed, analyzed, and examined for useful information [41, 42]. This information consists of the distinctive features exhibited by the collected data. The ML algorithm is then trained on these features. Finally, after training, the model is deployed and tested in real time. The complete process is summarized in Fig. 2.
4.1 Machine Learning Techniques
Numerous Machine Learning techniques can be used to detect respiratory diseases, but only a few of them have attracted researchers’ attention (Fig. 3). In this section, these ML techniques are discussed in detail:
4.1.1 Logistic Regression
Logistic regression is a statistical model that uses a logistic function to model a binary dependent variable, that is, a variable with two possible values such as pass or fail. It applies a linear model followed by a link function and therefore produces a linear classifier. Logistic regression is a straightforward machine learning technique that can be applied to detect respiratory diseases by examining affected areas of the body [43,44,45,46].
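The "linear model followed by a link function" can be sketched in a few lines. This is a minimal illustration rather than any surveyed study's implementation: the function names, weights, and bias are hypothetical, and the weights would normally be learned from training data.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real-valued score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Linear model (weighted sum plus bias) followed by the logistic function."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict(features, weights, bias, threshold=0.5):
    """Binary decision: 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical, hand-picked parameters for a single feature (not fitted to real data).
weights, bias = [1.0], -1.0
print(predict([2.0], weights, bias), predict([0.0], weights, bias))
```

Because the decision depends only on the sign of the linear score relative to the threshold, the resulting decision boundary is linear in the features.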
4.1.2 K Nearest Neighbor (KNN)
KNN is an ML technique that is one of the standard classification methods in pattern recognition. It is an instance-based, memory-based learning model: it first stores the available training dataset, and when a new sample arrives, it classifies the sample by comparing it with the stored instances. KNN finds the K training points closest to the query point using a similarity function based on the Euclidean distance. It can be used to detect lung cancer and other respiratory diseases [47,48,49].
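A minimal sketch of this procedure is given below, assuming a majority vote among the K nearest neighbours. The function names and the toy two-dimensional points are hypothetical and carry no clinical meaning.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to the query."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data for illustration only.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
labels = ["normal", "normal", "normal", "abnormal", "abnormal", "abnormal"]
print(knn_predict(points, labels, (1.1, 1.0)))
```

There is no training step beyond storing the data, which is why the approach is called memory-based; all the work happens at query time.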
4.1.3 Decision Tree (DTREE)
A decision tree, one of the important ML techniques, uses a hierarchical arrangement of nodes and sub-nodes in the form of a tree. It has three types of nodes: the root node, internal nodes, and terminal nodes. The provided dataset is classified according to the tree’s rules while maintaining the hierarchical order. This is an efficient ML model and can be used to detect respiratory diseases [50,51,52].
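The traversal from the root node through internal nodes to a terminal node can be sketched as follows. The tree is written by hand for illustration; the feature names, thresholds, and class labels are hypothetical placeholders, not clinical values, and a real tree would be learned from data.

```python
# Hand-written toy tree; feature names and thresholds are hypothetical placeholders.
tree = {
    "feature": "feature_a", "threshold": 0.5,         # root node
    "left": {"label": "class_0"},                     # terminal node
    "right": {
        "feature": "feature_b", "threshold": 2.0,     # internal node
        "left": {"label": "class_1"},                 # terminal node
        "right": {"label": "class_2"},                # terminal node
    },
}


def classify(node, sample):
    """Apply the rule at each node until a terminal node (one holding a label) is reached."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]


print(classify(tree, {"feature_a": 0.9, "feature_b": 1.0}))
```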
4.1.4 Artificial Neural Networks (ANN)
ANN is a parallel computing technique built from many interconnected neurons whose function is determined by the network architecture, the connection strengths, and the processing performed at each node. The model acquires knowledge from the available data and stores it in the connection weights [1]. It has multiple architectures, of which the multilayer perceptron (MLP) is the most popular, as shown in Fig. 4. The ANN model can be used for respiratory disease detection and treatment [53].
4.1.5 Support Vector Machines (SVM)
A support vector machine is an ML technique rooted in statistical learning theory that is used for prediction and is widely applied in clinical decision support systems (CDSS). It is used for two-class classification problems, and its basic form is a linear classifier, which performs classification by constructing a hyperplane that optimally separates the classes [2, 3]. It can also be extended to nonlinear classification, and SVM can be used for early detection of respiratory diseases.
4.1.6 Random Forest (RF)
Random forest is an ensemble learning model that builds and combines many DTREEs. Each tree is fitted to the dataset and divides it into a hierarchy, as shown in Fig. 5. Every node is split further until a final decision is reached. This technique was created by Breiman and Cutler and builds on two fundamental ideas: bagging and the random selection of features. “Bagging” is short for “bootstrap aggregating” [54,55,56,57,58,59].
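The bagging idea (bootstrap, then aggregate) can be sketched as below. To keep the example short, a one-feature threshold "stump" stands in for a full decision tree, and the random feature selection that a real random forest also performs is omitted; all names and the toy data are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Bootstrap step: draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Stand-in base learner: threshold halfway between the two class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    if not pos or not neg:
        # A bootstrap sample may contain a single class; predict that class.
        constant = 1 if pos else 0
        return lambda x: constant
    pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (pos_mean + neg_mean) / 2
    pos_is_high = pos_mean > neg_mean
    return lambda x: int((x > threshold) == pos_is_high)


def bagged_predict(models, x):
    """Aggregation step: majority vote over all base learners."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy one-feature dataset of (value, label) pairs, for illustration only.
data = [(0.9, 0), (1.0, 0), (1.2, 0), (2.8, 1), (3.0, 1), (3.3, 1)]
rng = random.Random(0)
models = [fit_stump(bootstrap_sample(data, rng)) for _ in range(25)]
print(bagged_predict(models, 3.1), bagged_predict(models, 1.0))
```

Each base learner sees a slightly different resampled dataset, so individual mistakes tend to differ and are outvoted in the aggregate.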
4.1.7 AdaBoost
AdaBoost is also a Machine Learning technique that can be used to predict future events by training the algorithm on available datasets. It works by combining weak classifiers into a single robust classifier, and it can be used on top of any classifier to make the classification process more efficient [60,61,62,63,64]. It works in stages: it begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset, with the weights of incorrectly classified instances adjusted each time. A weight is also calculated and assigned to each classifier to maximize global performance. A classifier with 50% accuracy receives zero weight, and a classifier with accuracy below 50% receives a negative weight. The final prediction combines the weighted outputs of all the weak classifiers. With proper rules/SOPs, this ML technique can be trained to detect respiratory diseases and other diseases.
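The weighting behaviour described above matches the standard binary AdaBoost formula, in which a classifier with error rate e receives the weight 0.5 * ln((1 - e) / e). The sketch below assumes that formula and labels in {-1, +1}; the function names are hypothetical.

```python
import math


def classifier_weight(error_rate):
    """Weight of a weak classifier: zero at 50% error, negative above 50% error."""
    return 0.5 * math.log((1.0 - error_rate) / error_rate)


def update_sample_weights(weights, y_true, y_pred, alpha):
    """Increase the weights of misclassified samples, then renormalize to sum to 1.

    Labels and predictions are assumed to be -1 or +1.
    """
    scaled = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, y_true, y_pred)]
    total = sum(scaled)
    return [w / total for w in scaled]


# Four equally weighted samples; the weak classifier gets the second one wrong.
y_true = [1, 1, -1, -1]
y_pred = [1, -1, -1, -1]
alpha = classifier_weight(0.25)  # one error out of four samples
new_weights = update_sample_weights([0.25] * 4, y_true, y_pred, alpha)
print(new_weights)
```

After the update, the single misclassified sample carries half of the total weight, so the next weak classifier is pushed to get it right.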
4.2 Machine Learning Techniques Used for Respiratory Diseases Through Clinical Decision Support System
In this section, a detailed overview is provided of the ML techniques and research work in the area of respiratory disease detection and prevention. Two popular methods are used for testing respiratory diseases: “Spirometry” and “Forced oscillation”. The ML techniques used for respiratory diseases are divided into models used for spirometry, models used for forced oscillation, and models used for other pulmonary function methods, as shown in Fig. 6.
5 Research Work Done for “Spirometry”
In [32], three parameters, namely forced expiratory volume in the first second (FEV1), forced vital capacity (FVC), and the FEV1/FVC ratio, are used with a multiclass Support Vector Machine (SVM) combined with Error-Correcting Output Codes (ECOC) to classify spirometry patterns as obstructive, restrictive, or normal. According to the authors, the proposed model was able to diagnose respiratory diseases in simulation with an accuracy of 97.32%. Its main limitation was that the model was not evaluated on a real-world dataset, so the reported accuracy reflects simulated data only. Similarly, in [4] the authors used normal and abnormal FEV1 information collected from one hundred patients to diagnose respiratory diseases using an ANN. The proposed model focused on FEV1 and ignored other elements such as FVC and ECOC. In [5], the authors developed a second-order transfer function to model airflow reduction in COPD. A total of 336 patients with COPD were studied, and five ML techniques were used to predict COPD: KNN, SVM, Linear Bayes, DTREE, and RBFNN. The accuracy achieved in the diagnosis of COPD was 88.2%, with a sensitivity of 85% and a specificity of 98.1%. The model performed well, but its time complexity was too high because several techniques were used in one algorithm.
In [35], SVM classification was used with a model similar to the one described in [34] for COPD classification in cases where formal spirometry criteria were discordant. Other diseases were also classified by the proposed mechanism. The proposed model correctly classified and diagnosed 68% of the discordant subjects (n = 53). The system also considered non-discordant subjects (n = 370) and was able to correctly detect and diagnose COPD in 95% of them. In [36] the author proposed an ML technique for early diagnosis of asthma that combines clinico-epidemiological and spirometry knowledge. A total of 42 features were considered for diagnosis. MLP neural networks, as well as the sporadic decision tree method, were used for classification. Seventeen features were found to be effective in detecting asthma. The proposed classifier achieved an accuracy of 96% on only seven features selected iteratively and spontaneously. The main drawback of this model was the linear execution of features; if a parallel feature extraction model were used, the accuracy might be increased further. Topalovic et al. [37] offered an ML method to detect lung diseases using clinical variables with an overall accuracy of 68%. The proposed model was built on DTREE and reported positive predictive value and sensitivity for diseases such as COPD (78/83), asthma (66/82), neuromuscular disorder (54/100), and interstitial lung disease (52/59).
6 Research Work Done for “Forced Oscillation”
Researchers from the State University of Rio de Janeiro are working on ML techniques for enhancing the forced oscillation method to detect respiratory diseases [65]. Initially, they developed a model to build classifier systems that assist in the analysis of respiratory diseases. The proposed model was able to detect COPD using the forced oscillation technique. Several ML techniques, namely Support Vector Machines (SVM), KNN, ANN, Bayes Normal Classifier, and DTREE, were compared using performance parameters such as the area under the ROC curve (AUC), sensitivity (Se), and specificity (Sp). Of these techniques, SVM, KNN, and ANN performed well and reached high diagnostic values (AUC > 0.95, Se > 87%, and Sp > 94%) [66]. Similarly, the above methodology was employed to develop an automatic classifier that improves the accuracy of the forced oscillation technique for detecting respiratory diseases in smoking patients as early as possible. The author in [6] used a genetic algorithm (GA) with ten-fold cross-validation, evaluated with the help of the average AUC. The enhanced model achieved a higher average AUC, which indicates more accurate detection of respiratory disease.
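For readers less familiar with these performance parameters, the sketch below computes sensitivity, specificity, and AUC from first principles. The AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function names, labels, and scores are hypothetical toy values, not data from the cited studies.

```python
def sensitivity_specificity(y_true, y_pred):
    """Se = TP / (TP + FN); Sp = TN / (TN + FP). Labels are 1 (disease) or 0 (healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

se, sp = sensitivity_specificity(y_true, y_pred)
print(se, sp, auc(y_true, scores))
```

Sensitivity and specificity depend on the chosen decision threshold, whereas the AUC summarizes performance across all thresholds, which is why studies often report all three.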
ML techniques were used to enhance clinical decision support systems and improve the accuracy of the forced oscillation technique (FOT) in classifying airway obstruction levels in patients with COPD [7]. The study proposed a two-step solution. The first step showed that FOT parameters alone did not provide sufficient accuracy in identifying COPD. In the second step, several ML techniques were analyzed and applied in the cases where the accuracy of the FOT diagnosis process was insufficient. The results showed that the KNN and RF classifiers provided higher accuracy than the other ML techniques used, and that in the second step the accuracy reached the required level. Its limitation was that it was not evaluated on a large real-world dataset. Similarly, ML techniques have been used for the early detection of asthma through airway obstruction measured with FOT. During the study, it was noticed that the best parameter of the FOT method was the resonance frequency, which achieved sufficient accuracy [8]. The second phase of that study applied various ML techniques. All of them improved detection accuracy to an acceptable level; however, a properly trained model is required for future detection of respiratory diseases.
7 Research Work Done for “Miscellaneous Pulmonary Function Methods”
This section covers research work on respiratory disease detection using methods other than spirometry and forced oscillation. In 2013, the authors of [9] proposed the use of a neural network to match formal lung properties using airflow. This was a very basic model from the early era of ML in CDSS. By the proposed system the author enhanced IT. Similarly, in [10] the authors proposed an ML-based model that detects exacerbation and performs subsequent triage in COPD patients. The proposed model uses a clinical predictive physician rule to train supervised prediction algorithms. Of the ML techniques used, Logistic Regression (LR) and Gradient-Boosted DTREEs (GB DTREEs) achieved the best performance. According to the analysis, the system outperformed individual physicians. The proposed model also showed better performance in terms of sensitivity, specificity, and positive predictive value when predicting a patient’s need for emergency care. The author also noted that the proposed system is not a substitute for physicians; however, it can be used at home whenever a physician is not available in an emergency.
The difference between asthma and various wheezing subtypes in childhood was studied in [11]. In the proposed method, subjects are not allocated to one particular class but are assigned probabilistically to all classes, which gives different perspectives on the desired output. The author also concluded that the proposed model disambiguates the complex patterns of symptoms expressed by these distinct diseases. Furthermore, in [12] the authors proposed an ML-based model using the ANN method, with social capabilities in evaluating and detecting respiratory disease. The proposed system is based on 27 key questions prepared for respiratory diseases such as COPD, asthma, tuberculosis, and pneumonia, and the dataset consists of 60 cases. The proposed system achieved more than 90% accuracy, indicating that it can be useful in clinical decision support systems for the early detection of respiratory diseases. In [13] the author proposed an ML-based model to differentiate between normal lung function, asthma, and COPD. The system works on the provided knowledge and symptoms of the diseases. The proposed model increased the accuracy up to 98.71%.
7.1 Latest ML Technique Used for the Early Detection of Respiratory Diseases
In this section, we present the most recent work in the field of early detection of respiratory diseases through ML techniques. In [14] the author provided a CDSS model for the detection of COVID-19 using chest X-ray images. They proposed a three-layer architecture to analyze the disease. In the first part, preprocessing is performed on the provided images, supported by data augmentation. The second part comprises learning and feature extraction. The third part generates predictions and classifications using different classifiers. The results reported in the paper were an AUC of 0.97 for internal validation and 0.95 for external validation on the chest X-ray images. In [15] the author proposed an early detection system for predicting COVID-19 mortality from five features, namely neutrophils, hs-CRP, age, lymphocytes, and LDH, which help to accurately forecast the mortality of COVID-19 patients. Different ML techniques were used to predict mortality from COVID-19. The neural network predicted mortality with 96 percent accuracy over the entire duration of the illness, and with 90 percent accuracy sixteen days before the outcome. The model was evaluated in three different scenarios, and the results show that more than 90% accuracy was obtained. The main limitation of the model is that it should be analyzed on more diverse real-world data to check whether the system maintains the same accuracy.
In [16] the author proposed a new ML-based technique, named the “BOMLA” indicator, for respiratory disease patients. The dataset was collected from asthma patients in Khulna, Bangladesh. The proposed technique applied various ML classifiers and evaluated them with measures such as accuracy (ACC), sensitivity (SE), kappa index, and MCC, alongside ROC analysis. The proposed model detected asthma with an accuracy of 94.35 percent when using the ADASYN oversampling technique. In [67] the author presented a user-friendly and low-cost early detection tool for asthma. Based on the proposed technique, a practical application named DSS was developed to help clinical staff with asthma detection. Using the proposed method, one can build a recommender system that can easily be incorporated into a mobile system. The limitation of the proposed ML technique for the early detection of asthma is that the dataset used is very small. The model needs to be deployed in a real environment so that its accuracy can be checked.
In [17] the authors used three classifiers for detecting lung cancer: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Convolutional Neural Network (CNN). The proposed model is used to detect lung cancer at an early stage. They used the WEKA tool to implement the proposed model. The results show that SVM, CNN, and KNN achieved accuracies of 95.56 percent, 92.11 percent, and 88.40 percent, respectively. If this model is used for other respiratory diseases, the results may be better than those of the existing models. The authors of [68] developed an ML-based CDSS technique that can be used in the clinic for respiratory disease detection. They presented a biomarker signature that could classify SSc patients with and without PAH. The findings of the study were evaluated in an external cohort. Serum samples from patients with SSc and PAH (n = 77) and with SSc but without pulmonary hypertension (non-PH) (n = 80) were randomly chosen from the clinical DETECT study and underwent proteomic screening using the Myriad RBM Discovery platform, which consists of 313 proteins. Samples from an independent validation SSc cohort (PAH n = 22 and non-PH n = 22) were obtained from the University of Sheffield [68].
7.2 Case Findings to Validate the ML Techniques on CDSS
Min et al. used the Geisinger Health System's medical claims data from January 2004 through September 2015 to compare and contrast the efficacy of the various methods under consideration. Both knowledge-driven features, which are derived from clinical knowledge possibly connected to COPD readmission, and data-driven features, which are extracted from the patient data itself, were used as the basis for the machine learning models developed. Based on the one-year claims history before discharge, the investigation indicated that the prediction performance may be improved from roughly 0.60 using knowledge-driven features alone to 0.653 by integrating both knowledge-driven and data-driven features. The authors also showed that the best AUC for these predictions is about 0.65, indicating that even the most complex deep-learning models bring no further improvement [69]. Karthikeyan and colleagues (2021) recommend using data from blood tests and machine learning (ML) techniques to estimate the mortality associated with COVID-19. Mortality may be predicted with 96% accuracy using a combination of neutrophils, lymphocytes, lactate dehydrogenase (LDH), high-sensitivity C-reactive protein (hs-CRP), and age. The authors trained and evaluated many ML models (neural networks, logistic regression, XGBoost, random forests, support vector machines, and decision trees) to identify the model with the highest accuracy over the whole duration of the illness. The optimal approach, based on XGBoost feature importance and neural network classification, achieves an accuracy of 90% as early as 16 days prior to the outcome. The suggested model's high predictive performance and applicability were verified by robust testing with three cases based on days to outcome. Using these primary biomarkers, a thorough analysis and trend detection were undertaken to provide actionable insights.
The findings of this research provide methods that might improve the timeliness, precision, and dependability of healthcare system decisions for targeted medical treatments [70]. In prospective research conducted in a real-world inpatient environment, Segal et al. assessed the accuracy, validity, and therapeutic value of medication mistake warnings provided by a new system using outlier detection screening algorithms. In one hospital's tertiary care unit, they included an innovative outlier system in the current electronic medical record system. During those 16 months, the system tracked every prescription medication that was filled. All warnings were evaluated for precision, clinical relevance, and practicality by the department's personnel. All doctors' instantaneous reactions to alarms were captured. For just 0.4% of all pharmaceutical orders, the system issued an alert, resulting in a very light alert load. Adjustments in patient status leading to medication changes accounted for 60% of warnings that were raised after the drug had already been delivered (eg, changes in vital signs). The clinical validity of 85% of the warnings was verified, and 80% were deemed clinically valuable. There were corrections to subsequent medical instructions in 43% of warnings [65].
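The AUC values reported for the COPD readmission models can be read as the probability that a randomly chosen positive case (for example, a readmitted patient) receives a higher risk score than a randomly chosen negative case. The following is a minimal sketch of that computation; the function name `auc_score` and the example labels and scores are hypothetical and are not taken from [69].

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: sequence of 0/1 values (1 = positive class, e.g. readmitted)
    scores: sequence of predicted risk scores, same length as labels
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical example data (not from the reviewed study).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.3, 0.6, 0.7]
print(auc_score(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a value of about 0.65 is only modestly better than chance.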
7.3 Limitations
If one class contains far more samples than another, the resulting model can be overfitted or biased toward the majority class. Ideally every class would have the same number of samples, but this is rarely possible in the dynamic environment of diseases. For example, in the multi-class classification of COVID-19, pneumonia, and normal lung images, pneumonia images typically outnumber COVID-19 images. Classifying and evaluating images at their original size is desirable, but it requires more storage and computational power [71, 72]. Researchers therefore usually compress or downsample images, because large datasets of high-resolution images are complex and difficult to process. Achieving high accuracy with an efficient ML-based respiratory disease detection mechanism requires datasets with thousands of images: the more data provided, the more accurate the classifier that can be built. However, because available datasets are limited, no sufficiently accurate respiratory disease detection algorithm exists yet. This limitation is a major challenge for ML-based methods and pushes research toward alternative solutions. A set of classifiers performs best when its members make different errors. The base classifiers should therefore be only minimally correlated, which ensures that their errors also differ. The best-performing systems combine several classifiers to obtain more accurate and efficient results. However, most surveyed studies only combine classifiers trained on similar features, which leads to highly correlated errors among the base classifiers [73].
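One common mitigation for the class imbalance described above is to weight each class inversely to its frequency during training. The sketch below computes such weights with the heuristic n_samples / (n_classes * class_count), which is the same heuristic scikit-learn applies for `class_weight="balanced"`. The image counts are hypothetical, and the function name `balanced_class_weights` is introduced here for illustration only; it is not taken from the surveyed studies.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical dataset: pneumonia images outnumber COVID-19 images.
labels = ["pneumonia"] * 600 + ["normal"] * 300 + ["covid19"] * 100
weights = balanced_class_weights(labels)
for cls in sorted(weights):
    print(cls, round(weights[cls], 3))
```

Rare classes receive larger weights, so misclassifying a COVID-19 image costs more during training than misclassifying a pneumonia image. Weighting does not replace collecting more data, which remains the underlying limitation.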
8 Future Directions
The following major future directions for the early detection of respiratory diseases using machine learning techniques are drawn from the reviewed literature:
- Dataset availability is essential for ML techniques that detect respiratory diseases early. For example, without a sufficiently large lung cancer image dataset, even an efficient ML technique will not reach the required accuracy, because the dataset is too weak. More data reduces generalization error, because a model generalizes better when trained on additional examples. Healthcare data is difficult to obtain; consequently, making datasets public would give researchers access to additional data.
- Most researchers used CNNs for automatic feature extraction. Some other features have also been studied, such as HOG, Gabor, GIST, and SIFT. However, many other features remain to be explored, such as quadtrees and image histograms. Efforts can be directed to different types of features, which can address the problem of highly correlated errors in ensemble techniques. More features bring more variation, and combining many variants usually gives better results. Feature engineering allows additional information to be extracted from existing data that can better describe the variance in the training data.
- Advances in Natural Language Processing (NLP), i.e., the ability of computers to analyze human language, may help integrate the open medical literature into future decision systems [74]. However, where data are lacking, as in some regions or resource-limited settings, developing ML-CDSS that use a minimal set of variables may be of interest. It is also essential to pay close attention to the variables an ML-CDSS uses to predict its outcome; for instance, we found an ML-CDSS that relied on the administration of antibiotics in the intensive care unit to make predictions.
- The quality and availability of the clinical data used to develop and validate ML-CDSS are limitations. Future ML methods will only be helpful to physicians if large datasets containing relevant clinical data are made available.
- Future ML-CDSS in infectious diseases (ID) should be integrated into clinical settings through a systematic process and should be developed across a variety of health settings, including primary care and low- and middle-income countries (LMICs), which are presently underrepresented. Future research should focus on clinical outcomes after long-term implementation in routine clinical care.
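To make the feature-engineering direction above more concrete, the sketch below computes a normalized intensity histogram, one of the hand-crafted image features named as still to be explored. It is an illustration under stated assumptions rather than a method from the surveyed studies: the function name `intensity_histogram`, the bin count, and the toy image are all hypothetical, and the image is assumed to be an 8-bit grayscale array given as nested lists.

```python
def intensity_histogram(image, bins=8, max_value=255):
    """Normalized grayscale intensity histogram as a fixed-length feature vector.

    image: 2D list of integer pixel intensities in [0, max_value]
    """
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            if not 0 <= pixel <= max_value:
                raise ValueError(f"pixel value {pixel} outside [0, {max_value}]")
            # Map the intensity to a bin index; max_value lands in the last bin.
            index = pixel * bins // (max_value + 1)
            counts[index] += 1
            total += 1
    if total == 0:
        raise ValueError("image is empty")
    # Dividing by the pixel count makes images of different sizes comparable.
    return [c / total for c in counts]


# Hypothetical 3x4 toy "image"; real chest X-rays would be far larger.
toy_image = [
    [0, 10, 200, 255],
    [30, 31, 128, 129],
    [64, 64, 64, 250],
]
features = intensity_histogram(toy_image, bins=4)
print(features)
```

Because the vector has a fixed length regardless of image size, it can be concatenated with other descriptors or fed to a classifier that differs from a CNN, which is one way to obtain base classifiers whose errors are less correlated.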
9 Conclusion
In this article, a review is conducted following PRISMA guidelines on the early detection of respiratory diseases through clinical decision support systems that use machine learning techniques. First, an overview is given of the machine learning techniques used for the early detection of respiratory diseases such as asthma and other lung diseases. Then different ML techniques, such as KNN, D-Tree, SVM, and ANN, are explained. The pros and cons of existing work in respiratory disease diagnosis are also discussed. Furthermore, detailed work on the early detection of respiratory diseases using machine learning, such as COVID-19 mortality prediction and early-stage lung cancer detection, is presented. Finally, the reviewed literature is discussed, and open issues in early detection through ML techniques are highlighted along with future directions.
Data Availability
The manuscript has no associated data.
Code Availability
The manuscript has no associated code.
References
Villegas, P. (1998). Viral diseases of the respiratory system. Poultry Science, 77(8), 1143–1145.
Ferkol, T., & Schraufnagel, D. (2014). The global burden of respiratory disease. Annals of the American Thoracic Society, 11(3), 404–406.
Pappas, G., Bosilkovski, M., Akritidis, N., Mastora, M., Krteva, L., & Tsianos, E. (2003). Brucellosis and the respiratory system. Clinical Infectious Diseases, 37(7), e95–e99.
Prevention of transmission of respiratory illnesses in disaster evacuation centers. Cdc.gov, 2021. [Online]. Available: https://www.cdc.gov/disasters/disease/respiratoryic.html.
Tips to keep your lungs healthy. Lung.org, 2022. [Online]. Available: https://www.lung.org/lung-health-diseases/wellness/protecting-your-lungs.
Kanbay, M., Kanbay, A., & Boyacioglu, S. (2007). Helicobacter pylori infection as a possible risk factor for respiratory system disease: A review of the literature. Respiratory Medicine, 101(2), 203–209.
Makker, H. (2010). Obesity and respiratory diseases. International Journal of General Medicine, 3, 335. https://doi.org/10.2147/IJGM.S11926
Kawamoto, K., Houlihan, C. A., Balas, E. A., & Lobach, D. F. (2005). Improving clinical practice using clinical decision support systems: A systematic review of trials to identify features critical to success. BMJ, 330(7494), 765.
Wright, A., et al. (2016). Analysis of clinical decision support system malfunctions: A case series and survey. Journal of the American Medical Informatics Association, 23(6), 1068–1076.
Patel, P. (2019). EHRs + clinical decision support = better healthcare. Perficient.com. [Online]. Available: https://blogs.perficient.com/2012/04/24/ehrs-clinical-decision-support-better-healthcare/.
Charatan, F. (1999). Medical errors kill almost 100000 Americans a year. BMJ, 319(7224), 1519.
Donaldson, L. J., Panesar, S. S., & Darzi, A. (2014). Patient-safety-related hospital deaths in England: Thematic analysis of incidents reported to a national database, 2010–2012. PLoS Medicine, 11(6), e1001667.
Sutton, R. T., Pincock, D., Baumgart, D. C., Sadowski, D. C., Fedorak, R. N., & Kroeker, K. I. (2020). An overview of clinical decision support systems: Benefits, risks, and strategies for success. NPJ Digital Medicine, 3, 17.
Crevier, D. (1993). AI: The tumultuous history of the search for artificial intelligence. New York: Basic Books Inc.
Mitchell, T. M. (1997). Does machine learning really work? AI Magazine, 18(3), 11.
Hinton, G. (2018). Deep learning-a technology with the potential to transform health care. JAMA, 320(11), 1101–1102.
Rawson, T. M., Ahmad, R., Toumazou, C., Georgiou, P., & Holmes, A. H. (2019). Artificial intelligence can improve decision-making in infection management. Nature Human Behaviour, 3(6), 543–545.
Bello, A., Wiebe, N., Garg, A., & Tonelli, M. (2015). Evidence-based decision-making 2: Systematic reviews and meta-analysis. Methods in Molecular Biology, 1281, 397–416.
Tawfik, G. M., et al. (2019). A step by step guide for conducting a systematic review and meta-analysis with simulation data. Tropical Medicine and Health, 47(1), 46.
Liberati, A., et al. (2009). The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: Explanation and elaboration. BMJ, 339, b2700.
Page, M. J., et al. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. Systematic Reviews, 10(1), 89.
PRISMA. Prisma-statement.org. [Online]. Available: http://www.prisma-statement.org/. Accessed 22 May 2022.
Liz, H., Sánchez-Montañés, M., Tagarro, A., Domínguez-Rodríguez, S., Dagan, R., & Camacho, D. (2021). Ensembles of convolutional neural network models for pediatric pneumonia diagnosis. Future Generation Computer Systems, 122, 220–233.
Victor Ikechukwu, A., Murali, S., Deepu, R., & Shivamurthy, R. C. (2021). ResNet-50 vs VGG-19 vs training from scratch: A comparative analysis of the segmentation and classification of Pneumonia from chest X-ray images. Global Transitions Proceedings, 2(2), 375–381.
Min Kim, H., Ko, T., Young Choi, I., & Myong, J.-P. (2021). Asbestosis diagnosis algorithm combining the lung segmentation method and deep learning model in computed tomography image. International Journal of Medical Informatics, 158(104667), 104667.
Kavya, R., Christopher, J., Panda, S., & Lazarus, Y. B. (2021). Machine learning and XAI approaches for allergy diagnosis. Biomedical Signal Processing and Control, 69(102681), 102681.
Sills, M. R., Ozkaynak, M., & Jang, H. (2021). Predicting hospitalization of pediatric asthma patients in emergency departments using machine learning. International Journal of Medical Informatics, 151(104468), 104468.
Do, Q., Son, T. C., & Chaudri, J. (2017). Classification of asthma severity and medication using TensorFlow and multilevel databases. Procedia Computer Science, 113, 344–351.
Isaac, A., Nehemiah, H. K., Isaac, A., & Kannan, A. (2020). Computer-aided diagnosis system for diagnosis of pulmonary emphysema using bio-inspired algorithms. Computers in Biology and Medicine, 124(103940), 103940.
Isaac, A., Nehemiah, H. K., Dunston, S. D., Elgin Christo, V. R., & Kannan, A. (2022). Feature selection using competitive coevolution of bio-inspired algorithms for the diagnosis of pulmonary emphysema. Biomedical Signal Processing and Control, 72, 103340.
Madero Orozco, H., Vergara Villegas, O. O., Cruz Sánchez, V. G., de Ochoa Domínguez, H. J., & de Nandayapa Alfaro, M. J. (2015). Automated system for lung nodules classification based on wavelet feature descriptor and support vector machine. Biomedical Engineering Online, 14(1), 9.
Sahin, D., Ubeyli, E. D., Ilbay, G., Sahin, M., & Yasar, A. B. (2010). Diagnosis of airway obstruction or restrictive spirometric patterns by multiclass support vector machines. Journal of Medical Systems, 34(5), 967–973.
Waghmare, K., & Chatur, D. (2014). Spirometry data classification using self organizing feature map algorithm. International Journal for Research in Emerging Science and Technology, 1, 35–38.
Abadia, A. F., et al. (2022). Diagnostic accuracy and performance of artificial intelligence in detecting lung nodules in patients with complex lung disease: A noninferiority study. Journal of Thoracic Imaging, 37(3), 154–161.
Li, L., et al. (2020). Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: Evaluation of the diagnostic accuracy. Radiology, 296(2), E65–E71.
Topalovic, M., Aerts, J.-M., Decramer, M., Troosters, T., & Janssens, W. (2017). Artificial intelligence detects lung diseases using pulmonary function tests. C47. COPD: Physiologic assessment (pp. A5678–A5678). American Thoracic Society.
Bharati, S., Podder, P., & Mondal, M. R. H. (2020). Hybrid deep learning for detecting lung diseases from X-ray images. Informatics in Medicine Unlocked, 20(100391), 100391.
Zhu, J., Shen, B., Abbasi, A., Hoshmand-Kochi, M., Li, H., & Duong, T. Q. (2020). Deep transfer learning artificial intelligence accurately stages COVID-19 lung disease severity on portable chest radiographs. PLoS ONE, 15(7), e0236621.
Topalovic, M., Das, N., Troosters, T., Decramer, M., & Janssens, W. (2017). Late breaking abstract–applying artificial intelligence on pulmonary function tests improves the diagnostic accuracy. Respiratory Function Technologists/Scientists (p. 4561). Lausanne: Eur Respiratory Soc.
Das, D. K., Chakraborty, C., & Bhattacharya, P. S. (2016). Automated screening methodology for asthma diagnosis that ensembles clinical and spirometric information. Journal of Medical and Biological Engineering, 36(3), 420–429.
De Ramón Fernández, A., Ruiz Fernández, D., Gilart Iglesias, V., & Marcos Jorquera, D. (2021). Analyzing the use of artificial intelligence for the management of chronic obstructive pulmonary disease (COPD). International Journal of Medical Informatics, 158, 104640.
Trovato, G., & Russo, M. (2021). Artificial intelligence (AI) and lung ultrasound in infectious pulmonary disease. Frontiers in Medicine (Lausanne), 8, 706794.
Abu-Mostafa, Y., Magdon-Ismail, M., & Lin, H. (2012). Learning from data. New York: AMLBook.
Hu, X., Luo, H., Guo, M., & Wang, J. (2022). Ecological technology evaluation model and its application based on logistic regression. Ecological Indicators, 136(108641), 108641.
Meyners, M., & Hasted, A. (2022). Reply to Bi and Kuesten: ANOVA outperforms logistic regression for the analysis of CATA data. Food Quality and Preference, 95(104339), 104339.
Cabero-Almenara, J., Guillén-Gámez, F. D., Ruiz-Palmero, J., & Palacios-Rodríguez, A. (2022). Teachers’ digital competence to assist students with functional diversity: Identification of factors through logistic regression methods. British Journal of Educational Technology, 53(1), 41–57.
Kuncheva, L. I., & Alpaydin, E. (2007). Combining pattern classifiers: Methods and algorithms. IEEE Transactions on Neural Networks, 18(3), 964.
Lin, J., et al. (2022). Ultrahigh energy harvesting properties in temperature-insensitive eco-friendly high-performance KNN-based textured ceramics. Journal of Materials Chemistry A. Materials for Energy and Sustainability, 10(14), 7978–7988.
Zheng, T., et al. (2022). Compositionally graded KNN-based multilayer composite with excellent piezoelectric temperature stability. Advanced Materials, 34(8), e2109175.
Azuaje, F. (2006). Witten IH, Frank E: Data mining: Practical machine learning tools and techniques 2nd edition: San Francisco: Morgan Kaufmann publishers. Biomedical Engineering Online, 5(1), 51.
Magazzino, C., Mele, M., Schneider, N., & Shahzad, U. (2022). Does export product diversification spur energy demand in the APEC region? Application of a new neural networks experiment and a decision tree model. Energy and Buildings, 258(111820), 111820.
Wang, K., Lu, J., Liu, A., Song, Y., Xiong, L., & Zhang, G. (2022). Elastic gradient boosting decision tree with adaptive iterations for concept drift adaptation. Neurocomputing, 491, 288–304.
Guhathakurata, S., Saha, S., Kundu, S., Chakraborty, A., & Banerjee, J. S. (2022). A new approach to predict COVID-19 using artificial neural networks. Cyber-physical systems (pp. 139–160). Amsterdam: Elsevier.
Hernández-Pereira, E. M., Álvarez-Estévez, D., & Moret-Bonillo, V. (2015). Automatic classification of respiratory patterns involving missing data imputation techniques. Biosystems Engineering, 138, 65–76.
Widder, S., et al. (2022). Association of bacterial community types, functional microbial processes and lung disease in cystic fibrosis airways. ISME Journal, 16(4), 905–914.
Uegami, W., et al. (2022). Mixture of human expertise and deep learning-developing an explainable model for predicting pathological diagnosis and survival in patients with interstitial lung disease. Modern Pathology, 35, 1083–1091.
Amini, N., & Shalbaf, A. (2022). Automatic classification of severity of COVID-19 patients using texture feature and random forest based on computed tomography images. International Journal of Imaging Systems and Technology, 32(1), 102–110.
Gaur, D., & Dubey, S. K. (2022). Impact of environmental concern factors on lung diseases using machine learning. Computational intelligence in pattern recognition (pp. 719–730). Singapore: Springer.
El-Askary, N. S., Salem, M.A.-M., & Roushdy, M. I. (2022). Features processing for random forest optimization in lung nodule localization. Expert Systems with Applications, 193(116489), 116489.
Schapire, R. E. (2013). Explaining AdaBoost. Empirical inference (pp. 37–52). Berlin Heidelberg: Springer.
Vaishnaw, G. K. (2022). A method of micro pixel similarity for lung cancer diagnosis using adaboost. Algorithms for intelligent systems (pp. 75–90). Singapore: Springer.
Sevinç, E. (2022). An empowered AdaBoost algorithm implementation: A COVID-19 dataset study. Computers & Industrial Engineering, 165(107912), 107912.
Venkatesh, S. P., & Raamesh, L. (2022). Predicting lung cancer survivability: A machine learning ensemble method on seer data. Research Square, 45, 4789. https://doi.org/10.21203/rs.3.rs-1490914/v1
Mary, S. R., Kumar, V., Venkatesan, K. J. P., Kumar, R. S., Jagini, N. P., & Srinivas, A. (2022). Vulture-based AdaBoost-feedforward neural frame work for COVID-19 prediction and severity analysis system. Interdisciplinary Sciences, 14(2), 582–595.
Segal, G., Segev, A., Brom, A., Lifshitz, Y., Wasserstrum, Y., & Zimlichman, E. (2019). Reducing drug prescription errors and adverse drug events by application of a probabilistic, machine-learning based clinical decision support system in an inpatient setting. Journal of the American Medical Informatics Association, 26(12), 1560–1565.
Amaral, J. L. M., Lopes, A. J., Jansen, J. M., Faria, A. C. D., & Melo, P. L. (2012). Machine learning algorithms and forced oscillation measurements applied to the automatic identification of chronic obstructive pulmonary disease. Computer Methods and Programs in Biomedicine, 105(3), 183–193.
Abdullah, D. M., Abdulazeez, A. M., & Sallow, A. B. (2021). Lung cancer prediction and classification based on correlation selection method using machine learning techniques. Qubahan Academic Journal, 1(2), 141–149.
Bauer, Y., et al. (2021). Identifying early pulmonary arterial hypertension biomarkers in systemic sclerosis: Machine learning on proteomics from the DETECT cohort. European Respiratory Journal, 57(6), 2002591.
Min, X., Yu, B., & Wang, F. (2019). Predictive modeling of the hospital readmission risk from patients’ claims data using machine learning: A case study on COPD. Scientific Reports, 9(1), 2362.
Karthikeyan, A., Garg, A., Vinod, P. K., & Priyakumar, U. D. (2021). Machine learning based clinical decision support system for early COVID-19 mortality prediction. Frontiers in Public Health, 9, 626697.
Tiwari, S., Chanak, P., & Singh, S. K. (2022). A review of the machine learning algorithms for covid-19 case analysis. IEEE Transactions on Artificial Intelligence. https://doi.org/10.1109/TAI.2022.3142241
Ajaz, F., Naseem, M., Sharma, S., Shabaz, M., & Dhiman, G. (2022). COVID-19: Challenges and its technological solutions using IoT. Current Medical Imaging Review, 18(2), 113–123.
Sharma, M., Prakash, U., Kumari, A., & Singla, K. (2022). Early detection of covid-19 based on preliminary features using machine learning algorithms. Advances in intelligent systems and computing (pp. 391–402). Singapore: Springer.
Andrade, D. S. M., et al. (2021). Machine learning associated with respiratory oscillometry: A computer-aided diagnosis system for the detection of respiratory abnormalities in systemic sclerosis. Biomedical Engineering Online, 20(1), 1–18. https://doi.org/10.1186/s12938-021-00865-9
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Author information
Contributions
SWA: Conceptualization, Validation, Methodology, Writing - Original draft preparation, Investigation. MA: Supervision, Project administration, Methodology. YIZ: Writing - Review & Editing, Validation. MR: Supervision, Project administration. SA: Resources, Formal analysis. EN: Writing - Review & Editing, Visualization.
Ethics declarations
Conflicts of interest
The authors declare no conflict of interest.
About this article
Cite this article
Ali, S.W., Asif, M., Zia, M.Y.I. et al. CDSS for Early Recognition of Respiratory Diseases based on AI Techniques: A Systematic Review. Wireless Pers Commun 131, 739–761 (2023). https://doi.org/10.1007/s11277-023-10432-1
DOI: https://doi.org/10.1007/s11277-023-10432-1