1 Introduction

Health care access, affordability and quality remain problems in the developing countries of the world, where a large number of individuals do not receive the quality of care that they need [1]. Generally, rural public health systems face difficulty in attracting, retaining, and ensuring the regular presence of highly trained medical professionals. Many doctors are unwilling to work in rural areas because of the lack of facilities, even when they are paid high salaries. Nevertheless, many health problems that rural people suffer from are preventable and easily treatable [2]. In this respect, a simple computer-aided, symptom-based diagnostic application is needed to improve the delivery of health services in rural areas: it can guide the paramedic or nurse in handling common ailments directly by administering simple remedies, and in referring only the more complex problems to secondary care. Such applications are increasingly feasible owing to advances in information and medical technologies. Over the last decades, there have been numerous implementations of computational intelligence in medicine [3–5] for diagnosing diseases. Computer technology can be used to organize, store and retrieve a medical knowledge base (KB) as needed and, thus, can improve the accessibility of curative, preventive and promotive health services [6] for poor and unreached groups.

In spite of the increasing scientific evolution of both information technology and medical science, the inherent imprecision of the latter makes the amalgamation of these technologies a rather difficult task. The main sources of this natural uncertainty are insufficient understanding of biological systems and their interconnections, and the ambiguity of medical results and measurements [7]. As a complex decision-making process, disease diagnosis involves the analysis of a number of symptoms, and in this process we have to face the uncertainty that essentially lies in human expressions. For example, suppose that at a consultation session [8] held to evaluate a set of symptoms for diagnostic purposes, a doctor asks a patient about his or her condition. When the patient describes the condition, there may be inexactness in the information about the source of the disease, the period of suffering or the symptoms of the disease. The doctor has to deal with this kind of incomplete and imperfect information. Therefore, in medical diagnosis problems, the primary task may be regarded as the proper management of incomplete information. The natural evolution of various diseases and the uncertain nature of medical data require a consistent framework that can model the uncertainty or imprecision intrinsically connected to the medical problem, by using variables and multiple class memberships, and that can facilitate approximate reasoning. This makes fuzzy logic (FL) a valuable tool for representing medical concepts by treating them as fuzzy sets [9]. Researchers have made several advancements in applying the fuzzy approach to medical diagnosis problems [10–14]. A fuzzy set assigns to each member of the universe of discourse a degree of membership between zero and one. By also considering the degree of non-membership, Atanassov [15] introduced the intuitionistic fuzzy set (IFS), which reflects the fact that the non-membership degree is not always the complement of the membership degree: some hesitation may remain. Thus, there are cases where IFS theory is more suitable for dealing with the incomplete or inexact information present in real-world applications [16–18]. In view of this, IFS theory has been employed successfully to cope with uncertainty in medical diagnosis problems [16, 19–21].

Based upon the observation that medical data are often imprecise, incomplete and vague so that using standalone IFS may not improve the accuracy of diagnosis, the aim of this present study is to propose a new algorithmic approach for solving medical diagnosis problem by utilizing both fuzzy and intuitionistic fuzzy frameworks. We will provide a brief overview of the relevant works for the medical diagnosis problems in Section 2, including the motivation and ideas of the proposed approach.

The paper is planned as follows: Section 3 describes some useful results that are used in our proposed medical diagnostic support system (MDSS). The proposed MDSS is presented in Section 4. In Section 5, a numerical example is presented. In Section 6, a discussion about significance and usefulness of the proposed approach is described. Further, a similarity measurement is applied to validate the diagnostic results of our approach. A sensitivity analysis is also conducted in this Section to examine the impact of small perturbations in the values of confidence level of the doctor regarding evaluation of patients’ symptoms, on the diagnostic results. A web-based tool for MDSS is introduced in Section 7 while Section 8 concludes the discussion.

2 Related works and motivation

In this section, a brief review of the related works is carried out followed by a discussion about the motivation behind this work.

2.1 Related works

Nowadays, fuzzy rule-based models are widely and successfully used in medical reasoning [14, 22–24]. To the best of our knowledge, several rule-based medical diagnosis systems have been published; some of these works are summarized below.

By utilizing fuzzy set theory and fuzzy production rules, a weighted fuzzy reasoning algorithm for handling medical diagnostic problems was introduced in [22]. In 2001, Yao and Yao [23] proposed a fuzzy decision-making method for medical diagnosis on the basis of fuzzy numbers and the compositional rule of inference. Moreover, by utilizing fuzzy classification trees, Chiang et al. [25] proposed a medical decision support system for colon polyp screening. For the management of tropical diseases, Obot and Uzoka [24] used a fuzzy rule base framework. A neuro-fuzzy rule base network methodology was applied to medical diagnosis in [26]. In 2009, Gadaras and Mikhailov described a methodology for medical diagnosis using a fuzzy rule-based classification technique [7]. Maio et al. [27] presented a medical diagnosis system based on an ontology model, which represents the relation between a disease and its symptomatology in a qualitative manner by using fuzzy labels. Fenza et al. [28] developed an architecture for context-aware service discovery in the health care domain that exploits synergy among intelligent agent technology, semantic web models and computational intelligence techniques. Aribarg et al. [29] introduced a classification method using simulated annealing for efficient medical diagnosis. An effective heart disease diagnosis was proposed in [30] by framing fuzzy rules with the help of support sets. By using a fuzzy inference system, Singh et al. [31] diagnosed arthritis. A web-based decision support system using a fuzzy rule base for the diagnosis of typhoid fever was developed by Samuel et al. [14]. Caponetti et al. [32] proposed an approach based on fuzzy mathematical morphology to delineate the segment images of human oocyte from the entire image. Furthermore, fuzzy rule base systems have been used to diagnose lung disease [33], breast cancer [34], abdominal pain [35], diseases of the human brain [36], malaria [37], vision defects [38], etc.

On the other hand, as mentioned in the introduction, among all the higher order fuzzy sets, the IFS has been used effectively in medical diagnosis problems owing to its capability to model the hesitation of the human mind. For example, in 2001, De et al. [16] formulated a medical diagnosis system by using the composition of intuitionistic fuzzy relations (IFRs). Kharal [19] presented a homeopathic drug selection method using the same composition rule. Szmidt and Kacprzyk [21] proposed a medical diagnosis process based on distance measures of IFSs. Further, they used a similarity measure [39] of IFSs to support medical diagnostic reasoning. In [40–42], several medical diagnostic approaches were developed based on similarity measures in an intuitionistic fuzzy environment. Applying an improved intuitionistic fuzzy cross-entropy approach, Hung [43] solved a medical diagnosis problem. In 2012, Zhang et al. [44] presented a new medical diagnosis process by utilizing score functions on IFSs with double parameters. Hung and Tuan [20] pointed out some shortcomings of [16] by providing appropriate examples. Recently, a few research articles have been published in this domain in which disease diagnosis is performed using operations based on intuitionistic fuzzy soft sets [45, 46] and type-2 fuzzy sets [47, 48].

2.2 The motivation and ideas

As is evident from the above literature review, research articles on medical diagnosis problems are available for purely fuzzy rule-based methods and for purely IFS-based models, but the amalgamation of these two approaches remains to be looked into. A limitation of IFS-based medical diagnosis is that information from previous diagnoses of patients is not utilized in most of the works. On the other hand, the intrinsic uncertainty of medical problems cannot be tackled solely by fuzzy rule-based models. Thus, the next step in solving medical diagnosis problems requires a paradigm that combines both fuzzy rule-based and IFS-based approaches. The problem of integrating these two approaches in a medical diagnosis setting has not yet received much attention, and this is the aspect that has motivated the present work. In other words, our idea in this article is to propose a hybrid of the fuzzy rule-based system and the IFS-based approach for solving medical diagnosis problems, with the aim of improving both the accuracy of the diagnosis and the effectiveness of medical treatment.

For this purpose, a medical KB is first required in order to utilize the previous diagnoses of patients. In the process of building the medical KB, it is realized that, when evaluating a set of symptoms for diagnostic purposes, a medical expert may feel comfortable expressing his or her opinion in natural language. The confidence level [49] of the doctor in his or her own opinion is also an important aspect. For example, suppose two doctors in a hospital are consulting to assess a patient’s current physical condition. One expert says “I have ‘full confidence’ that (s)he will recover, albeit slowly”. Another says “I am ‘not so confident’ how his or her recovery will be”. Note that the diagnosis of a patient may depend on the confidence level of a doctor. In such a scenario, the generalized fuzzy number (GFN) [50] is a more appropriate tool for representing the linguistic opinion of the doctor, as it accounts for the degree of confidence in that opinion. With this viewpoint, in this paper a medical KB is constructed in the form of a set of fuzzy decision rules. To form the KB, five symptoms (Temperature, Headache, Stomach pain, Cough and Chest pain) and five diseases (Viral Fever, Malaria, Typhoid, Stomach problem and Chest problem) are considered. The antecedent part of each rule consists of linguistic evaluations of the patient’s symptoms with different degrees of confidence of the doctor, and the consequent part represents the degrees of association and non-association of the diseases with the patient. To model this type of situation, we propose a KB in which the variables in the antecedent part are linguistic and are quantified by GFNs. This means that the values of the patients’ symptoms are expressed in natural-language terms, which easily represent the knowledge of the medical experts. The variables in the consequent part are quantified by IFSs, which consist of membership and non-membership degrees. The membership degree captures the degree of association of the disease with the patient, and the non-membership degree represents the degree of non-association of the disease with the patient. Thus, both FL and intuitionistic fuzzy logic (IFL) provide a rigorous framework for the verbal representation of medical concepts, which can then be embedded in meaningful fuzzy rules. Such rules can be easily absorbed, verified, adapted and possibly expanded by medical experts, and used for the development of an MDSS, which could be valuable in the process of medical diagnosis. New input data are then collected from patients as their observed symptoms. To determine a proper diagnosis for each patient from the given values of the tested symptoms, an appropriate intuitionistic fuzzy inference system (IFIS) based on the proposed medical KB is used.

Finally, this research proposes a web-based MDSS that uses the proposed algorithm for the primary diagnosis of diseases such as Malaria, Viral fever, Typhoid, Stomach problem and Chest problem in patients having Temperature, Headache, Stomach pain, Cough and Chest pain as symptoms. The system is developed with the aim of improving the accessibility of primary health care facilities, particularly for the poor and unreached groups in the rural areas of developing countries, where there are shortages of doctors and a lack of proper diagnostic infrastructure. The proposed system is intended to assist medical personnel, especially in rural areas, in providing quality at all levels of health and medical care services.

3 Basic concepts

Due to inherent vagueness in medical diagnosis problems, medical experts always feel comfortable to express their opinions in natural languages. So, the first task is to quantify these linguistic expressions. Since, in this present study, during development of medical KB the other aim is to capture the degree of confidences of experts’ opinions, naturally the most suitable approach to quantify the opinions is with the help of GFNs. Further, in this work IFSs are used to model the possibility and impossibility of diseases into the patient (a brief justification and deeper discussion for using GFNs and IFSs are provided in Section 2.2). Thus, a primary discussion of various concepts, including fuzzy number, GFN, generalized triangular fuzzy number (GTFN) and IFS are mathematically introduced in Appendix A so as to facilitate further study.

The following section establishes a convenient way of constructing generalized triangular membership function [51] for linguistic values of decision variables involved in the antecedent part of the rule base.

3.1 Membership functions for the linguistic value of decision variable

In this section, we review the concept of a linguistic variable and its representation. In this article, the medical KB is designed in terms of if-then rules, where the decision variables in the antecedent part of the rules are expressed by linguistic evaluations of patients’ symptoms, and these variables can be treated as linguistic variables [52]. A linguistic variable may be regarded either as a variable whose value is a fuzzy number or as a variable whose values are defined in linguistic terms.

Let u be the name of the linguistic variable and T(u) the term set of u, i.e., the set of names of linguistic values of u, with each value being a fuzzy number defined on U. In this work, it is assumed that the values of each of the linguistic variables, namely \(u_{1},u_{2},...,u_{n}\), are defined in the interval \([a,b]\subset \mathbb {R}\). Let U=[a,b] and let the term set T(u) consist of K+1 (K≥2) terms as follows:

$$\begin{array}{@{}rcl@{}} T(u) &=& \{low_{1}, around(a+\beta), around(a+2\beta),....,\\ &&around(a+(K-1)\beta), high_{K} \} \end{array} $$

where β=(b−a)/K and each term may be represented [51] with the help of generalized triangular membership functions \(\{\mu _{\widetilde {A}_{1}}(u),\mu _{\widetilde {A}_{2}}(u),...,\mu _{\widetilde {A}_{K+1}}(u)\}\) (Fig. 1) of the following form:

$$ \mu_{\widetilde{A}_{1}}(u)= \mu_{low_{1}}(u)=\left\{ \begin{array}{ll} \displaystyle \frac{(b-u)w_{1}}{(b-a)},& \text{for}~ a\leq u \leq b, \\ \displaystyle 0,& \text{otherwise} \\ \end{array} \right. $$
(1)
Fig. 1 Membership function for the linguistic variable with (K + 1) terms when w = 1

The GTFN \(\widetilde {A}_{1}\) can be expressed as \(\widetilde {A}_{1}=[(a, a, b); w_{1}]\).

$$\begin{array}{@{}rcl@{}} \mu_{\widetilde{A}_{k}}(u)&=& \mu_{around(a+k\beta)}(u)\\ &=&\left\{ \begin{array}{ll} \displaystyle \frac{(u-a)w_{k}}{k\beta},& \text{for}~ a \leq u \leq a+k\beta, \\ \displaystyle w_{k},& \text{for}~ u=a+k\beta,\\ \displaystyle \frac{(b-u)w_{k}}{(b-a-k\beta)},& \text{for}~ a+k\beta \leq u \leq b, \\ \displaystyle 0,& \text{otherwise} \\ \end{array} \right. \end{array} $$
(2)

for 1≤k≤K−1, where each of the corresponding GTFNs \(\widetilde {A}_{k}\) is denoted as \(\widetilde {A}_{k}=[(a, a+k\beta , b); w_{k}]\), and

$$ \mu_{\widetilde{A}_{K+1}}(u)=\mu_{high_{K}}(u)=\left\{ \begin{array}{ll} \displaystyle \frac{(u-a)w_{K+1}}{(b-a)},& \text{for}~ ~a\leq u \leq b, \\ \displaystyle 0,& \text{otherwise} \\ \end{array} \right. $$
(3)

The GTFN \(\widetilde {A}_{K+1}\) is denoted as \(\widetilde {A}_{K+1}=[(a, b, b); w_{K+1}]\). Here \(w_{1},w_{2},...,w_{K+1}\) (∈[0,1]) denote the corresponding confidence levels of the expert.

The novelty of this term set arrangement is that while very little knowledge is available about the boundaries of individual term, each term is stretched over the whole domain, though the mid values of each of the terms are situated at a fixed distance apart.

For example, we consider a term set as T(u)= {Very low, Low, Medium, High, Very high} where each term in T(u) is defined by a fuzzy number in the universe of discourse [0,1]. By using (1)–(3), each term of T(u) can be transformed to associate fuzzy number as given in Table 1 (for the sake of simplicity, in Table 1 and Fig. 1, the confidence level of the expert is assumed as one).

Table 1 The term set
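The construction in (1)–(3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `build_term_set` and `membership`, the tuple layout `((left, peak, right), w)` and the zero-based indexing of terms are assumptions made here and are not part of the original formulation.

```python
from typing import List, Tuple

# A GTFN is stored as ((left, peak, right), w); this layout is an
# assumption made for illustration only.
GTFN = Tuple[Tuple[float, float, float], float]


def build_term_set(a: float, b: float, K: int, weights: List[float]) -> List[GTFN]:
    """Return the K + 1 terms of T(u) on [a, b] following Eqs. (1)-(3)."""
    if K < 2 or len(weights) != K + 1:
        raise ValueError("need K >= 2 and exactly K + 1 confidence levels")
    beta = (b - a) / K
    # Every term is supported on the whole interval [a, b]; only the peak
    # moves: a (low), a + k*beta for k = 1..K-1 (around), and b (high).
    return [((a, a + k * beta, b), weights[k]) for k in range(K + 1)]


def membership(term: GTFN, u: float) -> float:
    """Evaluate the generalized triangular membership function at u."""
    (left, peak, right), w = term
    if u < left or u > right:
        return 0.0
    if u == peak:
        return w  # the height of a GTFN is the confidence level w
    if u < peak:
        return w * (u - left) / (peak - left)
    return w * (right - u) / (right - peak)


# Five-term set {Very low, Low, Medium, High, Very high} on [0, 1] with
# confidence level one, as in the example in the text.
names = ["Very low", "Low", "Medium", "High", "Very high"]
for name, term in zip(names, build_term_set(0.0, 1.0, 4, [1.0] * 5)):
    print(name, term)
```

With a = 0, b = 1 and K = 4, the peaks fall at 0, 0.25, 0.5, 0.75 and 1, while every term keeps the full support [0, 1].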

The next section describes the aggregation operators involved in the proposed IFIS.

3.2 Logical connective operators

In a fuzzy reasoning scheme, conjunction and disjunction are the two basic operations used to find the matching degree between the new fuzzy input data and each given rule. Conjunction and disjunction are realized via the triangular norm (t-norm) and the triangular conorm (t-conorm), respectively, as they offer a very wide class of binary functions for conjunction and disjunction in a fuzzy context [53]. Moreover, the selection of suitable t-norm and t-conorm operators depends on the user's attitude towards the aggregation and on the specific requirements of the decision scenario [54]. The definitions of t-norm and t-conorm are given as follows.

Definition 1

[51, 55] A t-norm T is a function from [0,1]×[0,1]→[0,1] such that it is symmetric, associative, non-decreasing in each argument and T(a,1) = a for all a∈[0,1].

Definition 2

[55] A t-conorm S is a function from [0,1]×[0,1]→[0,1] such that it is symmetric, associative, non-decreasing in each argument and S(a,0) = a for all a∈[0,1].

However, in the computation process of our proposed IFIS, we utilize minimum and maximum operators as these two operators are basic t-norm and t-conorm respectively.
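As a quick sanity check, the properties in Definitions 1 and 2 can be verified numerically for the minimum and maximum operators. The sketch below tests them only on a finite grid of [0,1], so it illustrates the axioms rather than proving them; the helper name `satisfies_axioms` is hypothetical.

```python
from itertools import product


def t_norm_min(a: float, b: float) -> float:
    return min(a, b)


def t_conorm_max(a: float, b: float) -> float:
    return max(a, b)


GRID = [i / 10 for i in range(11)]  # sample points 0.0, 0.1, ..., 1.0


def satisfies_axioms(op, neutral: float) -> bool:
    """Check symmetry, associativity, monotonicity and the boundary
    condition op(a, neutral) == a on the sample grid."""
    symmetric = all(op(a, b) == op(b, a) for a, b in product(GRID, repeat=2))
    associative = all(
        op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(GRID, repeat=3)
    )
    # Monotone in the second argument; symmetry extends this to the first.
    monotone = all(
        op(a, b) <= op(a, c) for a, b, c in product(GRID, repeat=3) if b <= c
    )
    boundary = all(op(a, neutral) == a for a in GRID)
    return symmetric and associative and monotone and boundary


print(satisfies_axioms(t_norm_min, 1.0))    # neutral element 1 for a t-norm
print(satisfies_axioms(t_conorm_max, 0.0))  # neutral element 0 for a t-conorm
```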

Further, to obtain the inferred outcome of each rule and the overall output of the system, we need the operations of multiplication of an IFS value by a scalar and aggregation of several IFS values. In this work, we utilize the addition and multiplication rules based on continuous Archimedean t-norms and t-conorms proposed by Beliakov et al. [56], as they produce more consistent results when aggregating IFS values. The IFS operations and aggregation operators based on continuous Archimedean t-norms and t-conorms, which are used extensively in the proposed IFIS, are briefly reviewed in Appendix B.

4 Proposed medical diagnosis approach

In this section, we present the proposed approach for medical diagnosis. First, we shall describe how we interpret medical diagnosis problem mathematically.

Consider two sets \(X=\{x_{1},x_{2},...,x_{n}\}\) and \(Y=\{y_{1},y_{2},...,y_{m}\}\), where X is a set of symptoms and Y is a set of diseases. The values n and \(m \in \mathbb {N}\) are the numbers of symptoms and diseases, respectively. The interrelations among the symptoms and diseases are characterized by a set of decision rules that constitute the medical KB. Subsequently, consider a new set \(P=\{P_{1},P_{2},...,P_{t}\}\) of t patients with symptoms \(\{x_{1},x_{2},...,x_{n}\}\). The medical diagnosis problem aims to provide an accurate and timely diagnosis of the diseases \(\{y_{1},y_{2},...,y_{m}\}\) for the new set of patients.

4.1 Medical knowledge base

In our proposed approach, a medical KB is required, which can be expressed as a set of inference rules of the form ‘if antecedent then consequent’ specifying a relationship between the inputs (symptoms) and outputs (diseases). Consequently, it can provide useful information to physicians and medical experts for better diagnosis of diseases when a new set of symptoms is observed. The medical KB can be carefully formulated after a detailed discussion of the various aspects of Viral Fever, Malaria, Typhoid etc. with the concerned medical experts.

During the discussion with the medical experts, it is realized that when a doctor asks a patient about his or her condition in order to evaluate a set of symptoms, the patient may not be confident enough to describe that condition. The confidence level of the patient is inherently involved in the linguistic expressions used to describe the condition. As a result, when the medical expert provides judgments based on the patient’s symptoms, the confidence level of the patient is intrinsically connected to the expert’s advice. Such linguistic expressions can be suitably captured by a GFN, as it accounts for the degree of confidence in the expert’s opinions. It is also worth mentioning that an expert is sometimes hesitant to provide judgments regarding the possibility of a disease in a patient. These observations form the background of the present study, where the medical KB is established by using a set of decision rules based on FL and IFL and, thus, the expertise of the doctor is captured when diagnosing the patient.

In the decision rules that constitute the medical KB, symptoms are considered in the antecedent part and diseases in the consequent part. The antecedent part of each rule involves linguistic evaluations of the patient’s symptoms associated with the degree of confidence of the doctor, and the consequent part reveals the state of the patient in terms of diagnosis. For example, consider the following system of p fuzzy rules

$$\begin{array}{@{}rcl@{}} && R_{i}:\mathit{If}~x_{1}~\mathit{is}~\widetilde{A}_{i1}~\mathit{and}~ x_{2}~ \mathit{is}~ \widetilde{A}_{i2},..,x_{n}~\mathit{is}~ \widetilde{A}_{in} \\ &&~\mathit{then}~ \mathit{possibility~ of}~y_{1}~\mathit{is}~ C_{1i},y_{2}~\mathit{is}~ C_{2i},..,y_{m}~ \mathit{is}~ C_{mi}. \end{array} $$
(4)

where i=1,2,...,p (number of rules), and the GTFNs \(\widetilde {A}_{ij}(j=1,2,...,n)\), representing the values of the linguistic variables \(\{x_{1},x_{2},...,x_{n}\}\), are defined in the universe [0,1]. The computation procedure for constructing the membership functions of GTFNs is described in Section 3.1. The set \(\{y_{1},y_{2},...,y_{m}\}\) represents the set of diseases, and the IFSs \(C_{ki}=(\mu _{y_{ki}},\nu _{y_{ki}}) (k=1,2,...,m)\) represent the degrees of possibility and impossibility of disease \(y_{k}\) described in the i-th rule. More specifically, \(\mu _{y_{ki}}\) represents the degree of association of disease \(y_{k}\) with the patient and \(\nu _{y_{ki}}\) represents the degree of non-association of disease \(y_{k}\) with the patient.

4.2 Intuitionistic fuzzy inference system

This section presents intuitionistic fuzzy inference engine (IFIE) which is the core of the proposed IFIS. In this part, the discussed medical KB is applied to diagnose the disease for a new patient and the process is described below.

Inputs: It is assumed that the medical KB, consisting of p fuzzy rules (as given in (4)), receives a new set of linguistic values of symptoms as fuzzy inputs, denoted as \(\{\widetilde {U}_{1},\widetilde {U}_{2},...,\widetilde {U}_{n}\}\).

Matching degree: The degree to which the fuzzy input data \(\widetilde {U}_{j}(j=1, 2,...,n)\) match the i-th rule \(R_{i}\) is computed by using a conjunction operator (t-norm T) and a disjunction operator (t-conorm S) as follows [57]:

$$ \alpha_{i}= T(\displaystyle S_{x \in \widetilde{U}}(T_{j=1,2,...,n} (\mu_{\widetilde{A}_{ij}}(x), \mu_{\widetilde{U}_{j}}(x)))),~ i=1,2,...,p $$
(5)
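One plausible reading of (5), with minimum as the t-norm and maximum (supremum) as the t-conorm, can be sketched as follows. The sketch assumes that each symptom is matched on its own universe [0,1], so that a sup-min degree is computed per symptom and the per-symptom degrees are then combined by minimum. The supremum is approximated on a uniform grid, and the input values are hypothetical, chosen only for illustration.

```python
from typing import Sequence, Tuple

# ((left, peak, right), w): tuple layout assumed for illustration.
GTFN = Tuple[Tuple[float, float, float], float]


def membership(term: GTFN, x: float) -> float:
    """Generalized triangular membership function with height w."""
    (left, peak, right), w = term
    if x < left or x > right:
        return 0.0
    if x == peak:
        return w
    if x < peak:
        return w * (x - left) / (peak - left)
    return w * (right - x) / (right - peak)


def sup_min(rule_term: GTFN, input_term: GTFN, steps: int = 1000) -> float:
    """Approximate sup over x in [0, 1] of min(mu_A(x), mu_U(x)) on a grid."""
    grid = (i / steps for i in range(steps + 1))
    return max(min(membership(rule_term, x), membership(input_term, x)) for x in grid)


def matching_degree(rule_terms: Sequence[GTFN], input_terms: Sequence[GTFN]) -> float:
    """Combine the per-symptom sup-min degrees with the minimum t-norm."""
    return min(sup_min(a, u) for a, u in zip(rule_terms, input_terms))


# Antecedents "High with CL 0.7" and "Medium with CL 0.8" against
# hypothetical inputs "High" and "Medium" given with full confidence.
rule = [((0.0, 0.75, 1.0), 0.7), ((0.0, 0.5, 1.0), 0.8)]
observed = [((0.0, 0.75, 1.0), 1.0), ((0.0, 0.5, 1.0), 1.0)]
print(matching_degree(rule, observed))
```

When the input term coincides with the antecedent term, the per-symptom degree reduces to the smaller of the two confidence levels, so the rule's matching degree is bounded by the lowest confidence level among its antecedents.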

Output of each rule: Subsequently, the inferred outcome of each rule is defined as an intuitionistic fuzzy value based on the concept of Larsen’s [58] product implication operator and computed by using Definition B.1 (Appendix B) as follows:

$$ \alpha_{i}(\mu_{y_{ki}},\nu_{y_{ki}})=\left( h^{-1}\left( \alpha_{i} h(\mu_{y_{ki}})\right), g^{-1}\left( \alpha_{i} g(\nu_{y_{ki}})\right)\right).$$

Larsen’s [57, 58] product implication operator is used for finding the association and non-association of diseases into the patient. The reason for using such kind of implication is that, it is the most commonly used implication operator and is simpler in nature.

4.3 Aggregation of the rule outcomes

Eliciting the corresponding output obtained from each rule R i for the fuzzy inputs \(\{\widetilde {U}_{1},\widetilde {U}_{2},...,\widetilde {U}_{n}\}\), the overall system output is computed as follows:

We apply the intuitionistic fuzzy arithmetic mean (IAM) (defined in (13)), which combines the outcomes (calculated in the previous step) of all rules in the rule base. Thus, the aggregated outcome \(C_{k}\) (k=1,2,...,m) is computed as

$$\begin{array}{@{}rcl@{}} C_{k}\!\!&=&\!\!IAM\!\left( \alpha_{1}(\mu_{y_{k1}},\nu_{y_{k1}}),\!\alpha_{2}(\mu_{y_{k2}},\nu_{y_{k2}}),\!...,\alpha_{p}\!(\mu_{y_{kp}},\!\nu_{y_{kp}})\right)\\ \!\!&=&\!\!\left( 1-\frac{1}{p}\displaystyle\sum\limits_{i=1}^{p}\alpha_{i}(1-\mu_{y_{ki}}), \frac{1}{p}\displaystyle\sum\limits_{i=1}^{p}\alpha_{i}\nu_{y_{ki}}\right) \\ \!\!&=&\!\!(\mu_{y_{k}},\nu_{y_{k}})(k=1,2,...,m) \end{array} $$
(6)

where,

$$ \mu_{y_{k}}= 1-\frac{1}{p}\displaystyle\sum\limits_{i=1}^{p}\alpha_{i}(1-\mu_{y_{ki}}) $$
(7)

and

$$ \nu_{y_{k}}= \frac{1}{p}\displaystyle\sum\limits_{i=1}^{p}\alpha_{i}\nu_{y_{ki}} $$
(8)

Here, p is the total number of rules in the rule base and \(C_{k}\) is the aggregated outcome for the k-th disease \(y_{k}\) (k=1,2,...,m).

More specifically, \(\mu _{y_{k}}\) and \(\nu _{y_{k}}\) are the averages of strength and non-strength of the k-th (k=1,2,...,m) disease in a patient, computed by (7) and (8), respectively. Hence, \(\mu _{y_{k}}\) reveals the degree of association of disease \(y_{k}\) with a patient and \(\nu _{y_{k}}\) reveals the degree of non-association of disease \(y_{k}\) with a patient.
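Equations (7) and (8) translate directly into code. The following sketch implements them exactly as written; the function name `aggregate` is hypothetical, and the matching degrees used in the example are invented for illustration (only the two consequent pairs are taken from the Viral fever entries of rules R1 and R2 in the case study).

```python
from typing import Sequence, Tuple


def aggregate(
    alphas: Sequence[float], consequents: Sequence[Tuple[float, float]]
) -> Tuple[float, float]:
    """Aggregated outcome (mu_yk, nu_yk) for one disease, per Eqs. (7)-(8).

    alphas[i] is the matching degree of rule i and consequents[i] is the
    pair (mu_yki, nu_yki) that rule i assigns to the disease.
    """
    if not alphas or len(alphas) != len(consequents):
        raise ValueError("need one consequent pair per matching degree")
    p = len(alphas)
    mu = 1.0 - sum(a * (1.0 - m) for a, (m, _) in zip(alphas, consequents)) / p
    nu = sum(a * n for a, (_, n) in zip(alphas, consequents)) / p
    return mu, nu


# Toy example with two rules only; the alphas are hypothetical.
mu, nu = aggregate([0.7, 0.6], [(0.9, 0.0), (0.8, 0.0)])
print(mu, nu)
```

Since \(\mu _{y_{ki}}+\nu _{y_{ki}}\leq 1\) for every rule and each \(\alpha _{i}\in [0,1]\), the aggregated pair also satisfies \(\mu _{y_{k}}+\nu _{y_{k}}\leq 1\), so it remains a valid intuitionistic fuzzy value.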

Thus, the association and non-association of the diseases \(y_{1},...,y_{m}\) with the patient are obtained. The final diagnosis of the patient then needs to be evaluated. This requires a process that translates the output of the IFIE into crisp values, since a crisp diagnostic result is what medical experts mostly expect for proper analysis and explanation. To this end, we introduce the ‘most possible disease’ and apply a relative closeness function F to the aggregated outcome of each disease. This function determines the relative closeness of each disease to the ‘most possible disease’.

4.4 Final crisp output for finding the accurate diagnostic result

First, we shall describe how we interpret the ‘most possible disease’ mathematically. If the possibility of a disease is represented by the IFS \(I^{+}=(1,0)\), then the possibility of that disease is one hundred percent, i.e., the patient is suffering from that particular disease. Similarly, if the possibility of a disease is represented by the IFS \(I^{-}=(0,1)\), then the non-possibility of that disease is one hundred percent, i.e., the patient is not suffering from that particular disease.

In order to find out the final diagnosis, the relative closeness coefficient [59] for each disease with respect to ‘most possible diseases’ can be computed as follows:

$$ F(C_{k})=\frac{d(C_{k},I^{-})}{d(C_{k},I^{-})+d(C_{k},I^{+})} (k=1,2,...,m) $$
(9)

where \(d(C_{k},I^{+})=\sqrt {(\mu _{y_{k}}-1)^{2}+(\nu _{y_{k}}-0)^{2}}\) and \(d(C_{k},I^{-})=\sqrt {(\mu _{y_{k}}-0)^{2}+(\nu _{y_{k}}-1)^{2}}\) are the Euclidean distances of \(C_{k}\) from \(I^{+}=(1,0)\) and \(I^{-}=(0,1)\), respectively.

The patient’s disease is then determined as the one with the greatest relative closeness coefficient defined in (9). However, if for any two diseases \(y_{1}\) and \(y_{2}\) the corresponding relative closeness coefficients are equal and maximal, i.e., \(F(C_{1}) = F(C_{2})\), then the patient may have both diseases \(y_{1}\) and \(y_{2}\).
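The relative closeness coefficient in (9) and the selection of the final diagnosis can be sketched as below. The function names `relative_closeness` and `diagnose`, the tolerance used to detect ties, and the aggregated outcomes in the example are assumptions made for illustration; they are not results from the case study.

```python
import math
from typing import Dict, List, Tuple


def relative_closeness(mu: float, nu: float) -> float:
    """Relative closeness F(C_k) of C_k = (mu, nu), per Eq. (9)."""
    d_plus = math.hypot(mu - 1.0, nu - 0.0)   # distance from I+ = (1, 0)
    d_minus = math.hypot(mu - 0.0, nu - 1.0)  # distance from I- = (0, 1)
    # d_plus + d_minus is at least sqrt(2), so the denominator is never zero.
    return d_minus / (d_minus + d_plus)


def diagnose(outcomes: Dict[str, Tuple[float, float]], tol: float = 1e-9) -> List[str]:
    """Return every disease whose coefficient attains the maximum (ties kept)."""
    scores = {name: relative_closeness(mu, nu) for name, (mu, nu) in outcomes.items()}
    best = max(scores.values())
    return [name for name, score in scores.items() if best - score <= tol]


# Hypothetical aggregated outcomes for one patient.
example = {"Viral fever": (0.8, 0.1), "Malaria": (0.6, 0.3), "Typhoid": (0.5, 0.4)}
print(diagnose(example))
```

The coefficient equals 1 at \(I^{+}\), 0 at \(I^{-}\), and 0.5 whenever the membership and non-membership degrees are equal, which gives a convenient reference point when reading the crisp scores.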

A brief flow-chart of the proposed medical diagnosis process is shown in Fig. 2.

Fig. 2 Flowchart of the inference system

5 Case study

Suppose in a given pathology, there are four patients, namely Arka, Bumba, Chandra and Dip whose symptoms are recorded by a routine case taking practice. We use the proposed medical diagnostic approach for diagnosing some diseases, such as, Malaria, Viral fever, Typhoid, Stomach problem and Chest problem. For these four patients, we also consider Temperature, Headache, Stomach pain, Cough and Chest pain as symptoms for the aforementioned diseases.

Our next aim is to construct a medical KB in the form of rule base. In literature, there are many ways of framing the rules. One of the popular ways to formulate the rules is using expert’s knowledge [14, 35, 37] and this process is widely used in many applications. In this context, to construct the rule base, the required data set is collected via survey from a local government hospital at Patna, India, after a consultation with the concerned medical experts about the various aspects of the aforementioned diseases. We consider this data as a primary level of our survey. However, in future our aim is to modify this data via survey in sub-centres, rural family welfare-centre, block primary health-centre and other health-centres in the state.

Based on the collected data, a medical KB is constructed by using ten fuzzy decision rules (\(R_{1}\)–\(R_{10}\)), which are described below.

R 1 If Temp is H with confidence level (CL) 0.7 and Hdc is M with CL 0.8 and Stp is VL with CL 0.8 and Cough is VH with CL 0.5 and Chp is L with CL 0.9 then possibility of Vfev is (0.9,0.0) and Mal is (0.5,0.3) and Typ is (0.4,0.6) and Stpr is (0.3,0.7) and Chpr is (0.1,0.8)

R 2 If Temp is H with CL 0.6 and Hdc is L with CL 1.0 and Stp is VL with CL 0.8 and Cough is M with CL 0.8 and Chp is VH with CL 0.3 then possibility of Vfev is (0.8,0.0) and Mal is (0.3,0.7) and Typ is (0.8,0.2) and Stpr is (0.1,0.8) and Chpr is (0.1,0.9)

R 3 If Temp is VH with CL 0.9 and Hdc is M with CL 0.4 and Stp is VL with CL 1.0 and Cough is H with CL 1.0 and Chp is L with CL 0.7 then possibility of Vfev is (0.6,0.3) and Mal is (1.0,0.0) and Typ is (0.2,0.6) and Stpr is (0.1,0.7) and Chpr is (0.1,0.9)

R 4 If Temp is VH with CL 0.7 and Hdc is L with CL 0.8 and Stp is VL with CL 1.0 and Cough is M with CL 1.0 and Chp is H with CL 0.7 then possibility of Vfev is (0.6,0.3) and Mal is (0.8,0.0) and Typ is (0.2,0.5) and Stpr is (0.1,0.8) and Chpr is (0.2,0.8)

R 5 If Temp is M with CL 0.9 and Hdc is VH with CL 0.7 and Stp is H with CL 0.8 and Cough is L with CL 0.9 and Chp is VL with CL 0.9 then possibility of Vfev is (0.5,0.4) and Mal is (0.5,0.3) and Typ is (0.9,0.0) and Stpr is (0.1,0.7) and Chpr is (0.0,0.9)

R 6 If Temp is M with CL 0.7 and Hdc is VH with CL 0.5 and Stp is L with CL 0.8 and Cough is H with CL 0.4 and Chp is VL with CL 0.8 then possibility of Vfev is (0.4,0.4) and Mal is (0.3,0.5) and Typ is (0.8,0.1) and Stpr is (0.3,0.7) and Chpr is (0.1,0.9)

R 7 If Temp is VL with CL 0.8 and Hdc is L with CL 0.9 and Stp is VH with CL 0.9 and Cough is H with CL 0.3 and Chp is M with CL 0.5 then possibility of Vfev is (0.1,0.7) and Mal is (0.3,0.5) and Typ is (0.4,0.4) and Stpr is (1.0,0.0) and Chpr is (0.3,0.5)

R 8 If Temp is VL with CL 0.7 and Hdc is M with CL 0.4 and Stp is VH with CL 0.8 and Cough is L with CL 0.5 and Chp is H with CL 0.4 then possibility of Vfev is (0.2,0.7) and Mal is (0.1,0.7) and Typ is (0.2,0.7) and Stpr is (0.8,0.1) and Chpr is (0.5,0.4)

R 9 If Temp is L with CL 0.7 and Hdc is VL with CL 1.0 and Stp is M with CL 0.5 and Cough is H with CL 0.4 and Chp is VH with CL 0.9 then possibility of Vfev is (0.1,0.6) and Mal is (0.2,0.6) and Typ is (0.3,0.6) and Stpr is (0.2,0.8) and Chpr is (1.0,0.0)

R 10 If Temp is L with CL 0.5 and Hdc is VL with CL 0.9 and Stp is H with CL 0.6 and Cough is M with CL 0.7 and Chp is VH with CL 0.7 then possibility of Vfev is (0.2,0.7) and Mal is (0.1,0.6) and Typ is (0.2,0.6) and Stpr is (0.4,0.5) and Chpr is (0.8,0.1)

where VL = Very Low, L = Low, M = Medium, H = High, VH = Very High; Temp = Temperature, Hdc = Headache, Stp = Stomach pain, Chp = Chest pain; Vfev = Viral fever, Mal = Malaria, Typ = Typhoid, Stpr = Stomach problem, Chpr = Chest problem; and CL = Confidence level.
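As an illustration of how such a rule base might be held in software, the following minimal sketch encodes rules R 1 and R 7 exactly as listed above. The class name `Rule`, the dictionary layout and the helper functions are assumptions made for this example and are not taken from the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    # symptom -> (linguistic term, confidence level CL)
    antecedent: dict
    # disease -> (possibility, non-possibility), i.e. an intuitionistic fuzzy value
    consequent: dict


# Values copied from rules R 1 and R 7 of the rule base above.
RULE_BASE = {
    "R1": Rule(
        antecedent={"Temp": ("H", 0.7), "Hdc": ("M", 0.8), "Stp": ("VL", 0.8),
                    "Cough": ("VH", 0.5), "Chp": ("L", 0.9)},
        consequent={"Vfev": (0.9, 0.0), "Mal": (0.5, 0.3), "Typ": (0.4, 0.6),
                    "Stpr": (0.3, 0.7), "Chpr": (0.1, 0.8)},
    ),
    "R7": Rule(
        antecedent={"Temp": ("VL", 0.8), "Hdc": ("L", 0.9), "Stp": ("VH", 0.9),
                    "Cough": ("H", 0.3), "Chp": ("M", 0.5)},
        consequent={"Vfev": (0.1, 0.7), "Mal": (0.3, 0.5), "Typ": (0.4, 0.4),
                    "Stpr": (1.0, 0.0), "Chpr": (0.3, 0.5)},
    ),
}


def is_valid_ifv(mu, nu, tol=1e-9):
    """An intuitionistic fuzzy value needs mu, nu in [0, 1] and mu + nu <= 1."""
    return 0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu + nu <= 1.0 + tol


def hesitation(mu, nu):
    """Hesitation margin of an intuitionistic fuzzy value: 1 - mu - nu."""
    return 1.0 - mu - nu


all_valid = all(
    is_valid_ifv(mu, nu)
    for rule in RULE_BASE.values()
    for mu, nu in rule.consequent.values()
)
```

Storing the consequent as (possibility, non-possibility) pairs keeps the hesitation margin recoverable for every disease; for example, Malaria in R 1 has a hesitation of about 0.2.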

As mentioned earlier, the rules that constitute the rule base in the present study were carefully formulated with the assistance of medical experts from a local government hospital, namely Mahavir Vaatsalya Aspatal, Patna, India. The fuzzy rules can be further refined by incorporating the domain knowledge of doctors/medical experts, who may decide to modify or delete some rules, or even add new ones [60].

In our rule base, the antecedent part of each rule is expressed with a linguistic term set of five terms: Very Low, Low, Medium, High and Very High. The antecedent parts of rule R 1 are represented graphically in Fig. 3, where the blue line represents the antecedent clauses, the black dotted line represents the input, and the green line represents the matching degree.

Fig. 3 Matching degree calculation from Rule-1

Fuzzy input data (i.e., linguistic evaluations of new symptoms) for four patients are given in Table 2.

Table 2 Input data

To compute the matching degree, we consider t-norm T and t-conorm S as the minimum operator and maximum operator, respectively. The calculated matching degree for each rule corresponding to the aforementioned patients’ symptoms is shown in Table 3.

Table 3 Matching degree of the rules
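The matching-degree step can be sketched in code under some explicit assumptions. The sketch below assumes that each GFN is a generalized triangular fuzzy number [(a, b, c); w] whose membership peaks at height w, that the matching degree of one clause is the highest point of the pointwise minimum of the antecedent and the input (a sup-min match), and that the clause-wise degrees are combined with the minimum t-norm. The peak positions used for the linguistic terms are hypothetical, and the paper's own definition of the matching degree takes precedence over this illustration.

```python
def membership(x, gfn):
    """Membership of x in a generalized triangular fuzzy number [(a, b, c); w]."""
    a, b, c, w = gfn
    if x < a or x > c:
        return 0.0
    if x == b:
        return w
    if x < b:
        return w * (x - a) / (b - a)
    return w * (c - x) / (c - b)


def matching_degree(antecedent, observed, steps=1000):
    """Sup-min match: highest point of the pointwise minimum, found on a grid."""
    grid = (i / steps for i in range(steps + 1))
    return max(min(membership(x, antecedent), membership(x, observed)) for x in grid)


def firing_strength(rule_clauses, inputs):
    """Combine the clause-wise matching degrees with the minimum t-norm."""
    return min(matching_degree(a, p) for a, p in zip(rule_clauses, inputs))


# Hypothetical GFNs, all stretched over [0, 1]; the peak positions are assumptions.
high_cl07 = (0.0, 0.75, 1.0, 0.7)         # "High" with confidence level 0.7
medium_cl08 = (0.0, 0.5, 1.0, 0.8)        # "Medium" with confidence level 0.8
observed_high = (0.0, 0.75, 1.0, 0.9)     # input close to the first clause
observed_very_low = (0.0, 0.0, 1.0, 1.0)  # input far from the second clause

d_same_term = matching_degree(high_cl07, observed_high)
d_far_term = matching_degree(medium_cl08, observed_very_low)
strength = firing_strength([high_cl07, medium_cl08],
                           [observed_high, observed_very_low])
```

Because both GFNs in each pair share the base [0, 1], even the mismatched pair ("Medium" against "Very Low") yields a non-zero degree, which is the property that lets every rule fire.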

The average of each rule’s output for every patient is computed by using (6) and the results are given in Table 4.

Table 4 Average output

The relative closeness coefficient value with respect to each disease for each patient is calculated by using (9) and given in Table 5.

Table 5 Relative closeness coefficient value

The normalized relative closeness coefficient value of each disease for each patient is shown in Table 6.

Table 6 Normalized relative closeness coefficient value

Decision

From Tables 5 and 6, it is clear that Arka suffers from malaria, whereas Bumba, Chandra and Dip suffer from viral fever.

6 Further analysis

6.1 Some important observations towards the proposed approach

Some important observations on the novelty and significance of the proposed method are summarized below.

  • This study combines the ideas of fuzzy rule-based and IFS-based medical diagnosis models to enhance the accuracy of the final diagnostic result. Furthermore, the proposed method can handle the issues of the standalone fuzzy rule-based and IFS-based methods.

  • From the literature survey (Section 2.1), it is observed that no existing work on medical KBs has considered capturing the confidence level of medical experts in their opinions. The present work uses this idea for the first time: the medical experts’ confidence levels are taken into consideration in the formulation of the medical KB. Further, the proposed method makes the diagnosis more flexible by modeling the decision part (i.e., the consequent part) with IFSs. An IFS can model the chance of a disease occurring and not occurring in a patient, which is very close to real-life situations. Thus, the proposed method can also be regarded as a consistent and meaningful framework for addressing the medical diagnosis problem.

  • In this work, to capture the confidence levels of medical experts, the values of each linguistic variable in the antecedent are represented by GFNs. It is worth noting that each GFN [51] in a term set is stretched over the whole domain [0,1]. Because all the GFNs share the same base, every rule yields a non-zero matching degree (for a non-zero fuzzy input), which ensures that every rule can be fired and, consequently, improves the efficiency of the output. This research intends to address this issue in an effort to provide a more accurate diagnosis.

  • In this paper, we have used a t-norm, a t-conorm and an intuitionistic fuzzy aggregation operator in the IFIS, which means that our method has a solid theoretical foundation. The method is also computationally simple compared to other existing methods.

  • The computational complexity of the proposed method is evaluated as follows: if there are n symptoms, p rules and m diseases, then our algorithm requires (3n−1)p+(5p+1)m+10m arithmetic operations to produce a diagnostic result. For a fixed number of rules, the computational burden therefore grows linearly with the number of symptoms and diseases. Hence, handling more diseases and symptoms adds only a proportional cost, and the complexity of our algorithm remains of polynomial order.
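The operation count stated in the last observation can be checked numerically. The short sketch below evaluates the formula for the case study (n = 5 symptoms, p = 10 rules, m = 5 diseases); the function name `operation_count` is introduced only for this illustration.

```python
def operation_count(n, p, m):
    """Arithmetic operations for n symptoms, p rules and m diseases,
    following the formula (3n - 1)p + (5p + 1)m + 10m from the text."""
    return (3 * n - 1) * p + (5 * p + 1) * m + 10 * m


case_study = operation_count(n=5, p=10, m=5)  # 140 + 255 + 50 = 445

# With the number of rules fixed at p = 10, each extra symptom or disease
# adds a constant number of operations, i.e. the growth is linear.
per_extra_symptom = operation_count(6, 10, 5) - case_study  # 3p = 30
per_extra_disease = operation_count(5, 10, 6) - case_study  # 5p + 1 + 10 = 61
```

For the case study this gives 445 operations, with 30 more for each additional symptom and 61 more for each additional disease.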

6.2 Validation of results using similarity measure

The proposed method makes the diagnosis flexible by accommodating the idea of confidence level and incorporating the concept of IFS in the diagnostic process. To the best of our knowledge, no existing method has considered these ideas; therefore, a direct comparison with any other method is not possible. However, to assess the accuracy of the results of our proposed method, we apply similarity measurement in this section.

Researchers have introduced various similarity measures [50, 61–63] for GTFNs. One such measure, employed in this work to validate the results of our case study, is Hwang and Yang’s [62] similarity measure.

In our medical KB, the variables in the antecedent part of each rule are represented as GTFNs. The new inputs (symptoms of new patients) are also modeled as GTFNs. First, the similarity between the antecedent of each rule and the input data is computed by using the similarity measure. For instance, suppose we have a rule

$$\begin{array}{@{}rcl@{}} && R_{1}: If~X_{1}~\mathit{is}~\widetilde{A}_{11}, X_{2}~\mathit{is} ~\widetilde{A}_{12},..., X_{5}~\mathit{is}~\widetilde{A}_{15}~\mathit{then}~Y_{1}~\mathit{is}\\ &&\quad C_{11},..., Y_{5}~\mathit{is}~C_{15} \end{array} $$

and one new input with five clauses P 11, P 12, ..., P 15. We first compute the similarity values \(S(\widetilde {A}_{11},P_{11}),S(\widetilde {A}_{12},P_{12}),...,S(\widetilde {A}_{15},P_{15})\) using the measure of [62]. The overall similarity between the set of new inputs and the antecedent of rule R 1 is then computed as

$$ S(R_{1})=\frac{S(\widetilde{A}_{11},P_{11})+S(\widetilde{A}_{12},P_{12})+...+S(\widetilde{A}_{15},P_{15})}{5} $$
(10)

In this way, we can compute the overall similarity between the set of new inputs and the antecedent of each rule R i (i=1,2,...,p). The rule with the maximum overall similarity is then selected, and its consequent part is used to diagnose the disease for the new set of inputs. For this purpose, the relative closeness coefficient value for each disease is computed from the consequent part of the selected rule, in a way similar to the method described in Section 4.4.
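The rule-selection step built on (10) can be sketched as follows. The clause-wise similarity is passed in as a function, because the exact form of Hwang and Yang's measure [62] is not reproduced here; `placeholder_similarity` is a deliberately simple stand-in and is not that measure. The GTFN peak positions and the helper names are likewise assumptions for illustration, while the terms and confidence levels of R 1 and R 7 are taken from the rule base.

```python
from statistics import mean

# Hypothetical peak positions for the five linguistic terms on [0, 1].
PEAK = {"VL": 0.0, "L": 0.25, "M": 0.5, "H": 0.75, "VH": 1.0}


def gtfn(term, cl):
    """Build a GTFN (a, b, c, w) over the whole domain [0, 1] (assumed shape)."""
    return (0.0, PEAK[term], 1.0, cl)


def placeholder_similarity(a, p):
    """Stand-in similarity in [0, 1]; NOT the measure of Hwang and Yang [62]."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, p)) / len(a)


def overall_similarity(antecedent, inputs, similarity):
    """Equation (10): the mean of the clause-wise similarities."""
    return mean(similarity(a, p) for a, p in zip(antecedent, inputs))


def select_rule(antecedents, inputs, similarity):
    """Return the rule with maximum overall similarity, plus all scores."""
    scores = {name: overall_similarity(ant, inputs, similarity)
              for name, ant in antecedents.items()}
    return max(scores, key=scores.get), scores


antecedents = {
    "R1": [gtfn("H", 0.7), gtfn("M", 0.8), gtfn("VL", 0.8),
           gtfn("VH", 0.5), gtfn("L", 0.9)],
    "R7": [gtfn("VL", 0.8), gtfn("L", 0.9), gtfn("VH", 0.9),
           gtfn("H", 0.3), gtfn("M", 0.5)],
}
# A hypothetical patient whose symptoms coincide with the antecedent of R 1.
new_inputs = list(antecedents["R1"])
best_rule, scores = select_rule(antecedents, new_inputs, placeholder_similarity)
```

Keeping the similarity measure as a parameter means the measure of [62] can be substituted without touching the selection logic.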

After applying the similarity measurement method to our case study (Section 5), we obtain the diagnostic results given in Table 7.

Table 7 Results by similarity measure

From Table 7, it is observed that Arka suffers from malaria, Bumba suffers from stomach problem, and Chandra and Dip suffer from viral fever. This agrees with the result of our proposed approach for Arka, Chandra and Dip; only the diagnosis for Bumba differs (viral fever under the proposed approach).

Note

The similarity measurement method allows us to validate the results of the medical diagnosis problem given in the case study. However, this method cannot compute the strength of activation of the if-part of each rule for new input data. Thus, the standalone similarity measurement method may not improve the accuracy of the final diagnosis. With this observation, the similarity measurement method can be viewed as a validation approach, in which the aim is to estimate the possibility of diseases intuitively by observing the maximum similarity value between the new input data and the given rules.

6.3 Sensitivity analysis

Sensitivity analysis answers the question ‘How sensitive is the overall decision to small changes in the input values?’ In other words, it is the systematic investigation of the impact of potential changes in the input values on the final decision. Here we conduct a sensitivity analysis to explore the effect of varying the confidence levels of medical experts about the symptoms of patients on the final diagnostic result. Without loss of generality, we assume that the linguistic values (which express the strengths of the symptoms) are quantified by GTFNs of the form \(S_{i}=[(a_{i},b_{i},c_{i});w_{i}]\), where \(w_{i}\) represents the confidence level of doctors/medical experts with respect to the corresponding symptom. A slight variation in the original confidence level provided by medical experts is modeled as follows:

\(S_{i}=[(a_{i},b_{i},c_{i});w_{i}-{\Delta}_{i}]\), where \({\Delta}_{i}\) is the variation value, whose range is \([0,\min_{i} w_{i}]\), with \(0\leq w_{i}-{\Delta}_{i}\leq 1\).
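The perturbation of the confidence levels can be sketched as below. For simplicity, the sketch assumes that one common variation value is subtracted from every symptom's confidence level, whereas the formulation above allows a separate value per symptom; the sample GTFNs and the function names are hypothetical.

```python
def max_variation(symptoms):
    """Upper end of the admissible range [0, min_i w_i] for the variation."""
    return min(w for _, w in symptoms)


def perturb(symptoms, delta):
    """Return copies of the GTFNs with every confidence level lowered by delta."""
    if not 0.0 <= delta <= max_variation(symptoms):
        raise ValueError("delta must lie in [0, min_i w_i]")
    return [(shape, w - delta) for shape, w in symptoms]


# Hypothetical symptom evaluations, each stored as ((a, b, c), w).
symptoms = [
    ((0.0, 0.75, 1.0), 0.9),
    ((0.0, 0.5, 1.0), 0.6),
    ((0.0, 0.25, 1.0), 0.8),
]
upper = max_variation(symptoms)   # smallest confidence level, here 0.6
shifted = perturb(symptoms, 0.1)  # confidence levels become about 0.8, 0.5, 0.7
```

Bounding the variation by the smallest confidence level guarantees that no perturbed confidence level becomes negative, so each perturbed GTFN remains well defined.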

Sensitivity analysis is performed to determine the influence of the doctors’ confidence levels in evaluating the symptoms of patients on the final diagnostic result. The results, i.e., the variations in the strengths of the diseases for the patients Arka, Bumba, Chandra and Dip, are depicted in Figs. 4, 5, 6 and 7, respectively.

Fig. 4 Diagnostic result of Arka sensitive to confidence level of symptoms provided by Doctors

Fig. 5 Diagnostic result of Bumba sensitive to confidence level of symptoms provided by Doctors

Fig. 6 Diagnostic result of Chandra sensitive to confidence level of symptoms provided by Doctors

Fig. 7 Diagnostic result of Dip sensitive to confidence level of symptoms provided by Doctors

From Fig. 4, we observe that when the doctors’ confidence levels in evaluating the five symptoms of the patient Arka vary from 0 to 0.02, the strengths of two diseases, Viral fever and Malaria, change. Over the range [0,0.4], however, the strengths of the other three diseases, Typhoid, Stomach problem and Chest problem, remain constant. This demonstrates that Viral fever and Malaria are more sensitive than the other three diseases for the patient Arka.

From Fig. 5, we observe the following. When the doctors’ confidence levels in evaluating the symptoms of the patient Bumba vary in the range [0,0.05], the strengths of Stomach problem and Chest problem vary, whereas in the range [0.05,0.2] the strengths of Malaria and Typhoid vary. Throughout the range [0,0.8], however, the strength of Viral fever remains constant. Hence, for the patient Bumba, Malaria, Typhoid, Stomach problem and Chest problem are more sensitive than Viral fever.

From Fig. 6, it is observed that if the confidence levels of the five symptoms of the patient Chandra vary from 0.08 to 0.12, then the strengths of Malaria and Typhoid change. In the range [0.1,0.16], the strengths of Viral fever, Malaria and Typhoid vary, whereas throughout the range [0,0.7] the strength of Chest problem remains fixed.

From Fig. 7, it is clear that if the confidence levels regarding the five symptoms of the patient Dip vary from 0 to 0.06, then the strengths of Viral fever and Malaria change. Again, the other three diseases, Typhoid, Stomach problem and Chest problem, remain unchanged over this range. This indicates that Viral fever and Malaria are more sensitive than the other three diseases for the patient Dip.

Finally, in order to explore the applicability of the proposed method, a web-based MDSS for diagnosing diseases is developed in the following section.

7 A web-based tool for supporting medical diagnosis

One of the key challenges in today’s world is to provide primary healthcare for people living in rural areas. As mentioned in the introduction, delivering health services in such areas is quite difficult due to the lack of adequate medical infrastructure and the scarcity of trained doctors. The advancement of web technologies and wide access to the internet open up a new avenue for delivering primary medical services via web-based applications. The major advantages of a web-based system are that it can be accessed from anywhere via the internet and can perform specific complex decision-making tasks to assist users. In view of this, we develop a web-based tool to diagnose five common diseases based on the proposed medical diagnosis approach. The system is designed to support primary health workers in rural health service centers in diagnosing diseases at the initial level. Thus, the system can be regarded as a way of complementing the inadequate number of medical experts in developing countries. The architecture of the web-based MDSS is depicted in Fig. 8.

Fig. 8 Architecture of the web based system

The system consists of three components: a user interface, a decision-making module and a data management module. The user interface lets primary health workers interact with the system via interactive web pages. First, the health workers examine the patients and collect data related to the signs and symptoms of diseases. Then, they provide their assessments for each of the symptoms (Fig. 9). These assessments are GFNs and are used as new inputs of the system for diagnosing the diseases of the new patients.

Fig. 9 Interactive web page for diagnosis

The decision-making module is responsible for processing each set of inputs by using the proposed medical KB and IFIS. A health worker feeds the system with values representing the signs and symptoms of diseases, based on his or her observations of a patient, and directs the system to diagnose the disease by clicking the button ‘Diagnosis patient’ (Fig. 9). The decision-making module then runs on a web server in the background of the system and finally produces the diagnosis result (Fig. 10). Figure 10 shows that the patient with ID ‘01’ has viral fever with 81 % possibility. The data management module is responsible for storing all the data generated during this process.

Fig. 10 Diagnosis result of the system

The system is developed in the NetBeans integrated development environment (IDE) using HyperText Markup Language (HTML), Java (J2SE), JavaServer Pages (JSP) and JavaScript, with MySQL as the database management system.

To demonstrate the method that drives the proposed system, we evaluate it by using the collected medical data of fifteen patients as presented in Table 8. For each of the patients, the diagnostic result is shown in the last column of Table 8.

Table 8 Web based diagnostic result

We would like to share the following observations on the proposed web-based MDSS with our readers.

In most rural areas of developing countries, an insufficient number of doctors has increased the mortality of patients suffering from various diseases [5]. As an illustration, in rural areas of India, specialist doctors may sometimes come to the health centre only once a week. The waiting time for treatment in rural health centres is sometimes a few days or weeks. By that time, the condition of patients gets worse, as the disease may already have progressed. Although many of the diseases could have been cured at an early stage, the patients may have to suffer for the rest of their lives.

With this observation in the background, a web-based MDSS is developed by emulating human intelligence, which could be used to assist health workers in predicting diseases without consulting the specialists directly. Health workers will be trained to use such a system for diagnosing patients as a way of complementing the insufficient number of medical experts/specialists in rural areas [14]. The software will not replace the specialist or doctor [5], as it is developed to assist medical practitioners and health workers in diagnosing and predicting the patient’s condition from certain rules without spending several hours. The proposed system can assist the medical workers of a health centre in predicting the disease, providing an early diagnosis before proper medical treatment starts. The system can reduce the chances of developing serious illness as well as avoid unnecessary tests. If patients’ diseases are not diagnosed at an early stage, they do not receive the appropriate medical treatment and may have to buy several drugs, undergo several medical tests, etc., which may imply excess cost for the patients [10].

In this respect, the proposed web-based MDSS can save time, avoid unnecessary drugs and tests, reduce extra cost and help patients.

8 Conclusion

This study has presented a novel algorithm to solve the medical diagnosis problem by integrating fuzzy and intuitionistic fuzzy frameworks. The research contribution has two phases: the first phase executes a new diagnostic approach and validates the final outcome; the second phase develops a web-based MDSS for the diagnosis of diseases by adapting the proposed approach. This method has considered a set of five related diseases with a set of common symptoms. The advantages of this approach can be pointed out as follows: (i) the proposed methodology presents an attempt to develop a medical diagnostic approach considering medical experts’ confidence levels, (ii) the proposed medical KB has a higher utility due to its ability to include possibility and non-possibility of diseases during the diagnostic process, (iii) the method is simple in terms of computation, (iv) the membership construction method allows each rule to be fired, providing better efficiency to the output, (v) the proposed algorithm is used for developing a web-based MDSS.

In general, it may be concluded that the combination of GFN and IFS in the medical knowledge base makes the medical diagnostic model developed in the present work more realistic. This is because the use of GFN and IFS implies that both the confidence level of patients in expressing their conditions and the hesitation of medical experts in providing the degree of possibility of diseases are taken into consideration simultaneously, which enhances the capability of handling uncertainty in medical diagnosis problems. It is important to remark that this is a reasonable starting point, since we could not find any ready-made benchmark data that fits the proposed MDSS, because the rule base of our medical diagnostic model is framed by using not only fuzzy numbers but also IFSs. Thus, in the present study, we have fitted the approach with a small amount of collected data (with the assistance of doctors) and validated the result of our case study by the similarity measurement method.

In future, we would like to extend the proposed MDSS with a larger set of samples and symptoms collected via surveys in sub-centres, rural family welfare centres, block primary health centres and other health centres in our state, and to analyze the results in comparison with conventional approaches.