1 Introduction

The Bayesian belief network (BBN) was introduced by Judea Pearl in 1985 and is based on Bayes' theorem, proposed by Rev. Thomas Bayes [3]. A BBN combines graph theory with probability theory: together they describe the uncertainty of a system and expose the independence structure of its probability distribution, which allows the distribution to be broken down into more manageable components [12]. Cooper [4] has shown that inference in a BBN is an NP-Hard problem; answering a query by brute force takes O(d^n) steps for a network of n nodes with d states each, and storing the full joint distribution likewise requires O(d^n) space.

BBNs have been used effectively in a number of fields, including software development, medical diagnostic systems, weather forecasting, agriculture, stakeholder engagement, project management, reliability and safety engineering, signal processing, and other engineering domains. In the last twenty years, BBNs have been used widely in software engineering, for example in software quality prediction [18], software defect prediction [6, 8, 14, 23, 27], and software risk assessment [32]. It has been established that a BBN can express uncertainty wherever it is used. However, there are two major obstacles to the construction of a substantial BBN: (i) construction of the causal model and (ii) construction of the node probability table (NPT).

Nadkarni and Shenoy [19] describe two methods for constructing a causal model, the data-based approach and the knowledge-based approach, both of which are widely used. This paper focuses purely on the challenge of constructing the NPT.

An NPT specifies the probability distribution of a child node for every possible combination of its parent nodes' states. The literature offers several approaches for constructing the NPTs of a BBN, such as expert elicitation [22, 29], data analysis [20, 30], combined data and expert judgment [35], survey and weighted functions [25], Noisy-OR [9, 24, 34], Noisy-MAX [5], Recursive Noisy-OR (RNOR) [17], Extended Recursive Noisy-OR (ERNOR) [26], ranked nodes [7], and improved versions of ranked nodes [15, 16]. Constructing NPTs manually with domain experts is often the most difficult aspect of building a BBN, and utmost care is needed because the size of an NPT grows exponentially with the number of participating nodes. Even a single NPT may require tens or hundreds of probabilities. Data-analysis-based methods [20, 30] have been suggested to address the elicitation difficulty of expert-based NPT construction. However, real-world applications often lack sufficient sample data, in which case a purely data-based approach may not work well. Zhou et al. [35] proposed combining partial data with expert judgment to reduce the complexity of both expert-based and data-based NPT construction. Perkusich et al. [25] proposed a technique that creates NPTs from weighted expressions built from information gathered from subject-matter experts through a survey; its benefit over purely expert-based methods is that the NPTs are generated from these weighted expressions. Pearl [24] proposed the Noisy-OR technique; however, it considers only Boolean nodes and ignores the interactions among variables [13]. Diez and Galan [5] proposed Noisy-MAX to overcome this drawback. However, its heavy independence assumptions and approximations might not be accurate if standard deviations are large and the observed evidence deviates greatly from the expected values. Lemmer and Gossink [17] proposed an improved version of Noisy-OR, known as RNOR; however, RNOR has an asymmetry issue when there are more than three causes. Quintanar-Gago and Nelson [26] proposed ERNOR to overcome this issue. Fenton et al. [7] developed the ranked nodes strategy, which uses a probability distribution as a weighting function over the parent nodes and describes continuous values in discretized intervals, enabling the estimation of very large conditional probability tables. This approach has been improved by Laitila and Virtanen [15, 16], and the viability of applying it to real-world problems has been established.

According to the literature, all NPT generation techniques currently in use are problem-specific and cannot be applied to all kinds of problems. A BBN typically falls into the category of NP-Hard problems, which require an efficient solution; however, no distinct solution is available in the literature. Therefore, this paper proposes a generalized approach to constructing the NPT using fuzzy logic to provide a near-optimal solution.
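The exponential growth of an NPT mentioned above can be made concrete with a small sketch. The helper name `npt_size` and the choice of three states per node are illustrative assumptions, not part of the cited methods.

```python
from math import prod

def npt_size(child_states, parent_states):
    """Return (parent-state combinations, total probability entries) of one NPT.

    child_states: number of states of the child node.
    parent_states: list with the number of states of each parent.
    """
    combinations = prod(parent_states)  # one NPT column per parent combination
    return combinations, combinations * child_states

# Illustrative assumption: every node has 3 states (e.g. low, medium, high).
for k in range(1, 7):
    combinations, entries = npt_size(3, [3] * k)
    print(f"{k} parents: {combinations} combinations, {entries} entries")
```

With three parents of three states each, a single NPT already has 27 columns, so an expert would have to supply 81 probabilities.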

The rest of the paper is organized as follows: Sect. 2 briefly presents the BBN. Sect. 3 describes the proposed method for constructing the NPT of a BBN model. An illustrative example of the proposed methodology is given in Sect. 4. The validation of the proposed method is presented in Sect. 5, followed by the conclusion in Sect. 6.

2 Background

A BBN gives a visual depiction of the causal links among variables and uses conditional probabilities to convey the uncertainty in their dependences. It is defined on a directed acyclic graph (DAG) G(V, E), where V denotes the vertices and E the edges. Each vertex denotes a variable, and each edge denotes a causal connection between two variables. Given the available evidence, the posterior probability in a BBN is computed using Bayes' theorem. Figure 1 shows a simple BBN with causal relationships among three variables, used here to investigate the effect of programmer experience (PE) and programmer capability (PC) on quality of code (QC).

Fig. 1 Example of BBN

The node probability tables of all variables are also presented in Fig. 1, where each node has two states, high and low: P(PE) denotes the probability that PE is high and P(~ PE) the probability that it is low, and similarly for P(PC), P(~ PC), P(QC), and P(~ QC). With the values available in the NPTs, the next step is to calculate the probabilities of interest. For example, the probability of programmer capability (PC) given quality of code (QC) in Fig. 1 is P (PC|QC). Based on Bayes' theorem, it can be calculated using Eq. 1.

$$P({\text{PC}} \mid {\text{QC}}) = \frac{{P({\text{QC}} \mid {\text{PC}})*P({\text{PC}})}}{{P({\text{QC}})}}$$
(1)

Here, P(QC) is obtained by summing over all state combinations of the parent nodes:

$$\begin{aligned} P({\text{QC}}) \,=\, & P({\text{QC}} \mid {\text{PE}},{\text{PC}})*P({\text{PE}})*P({\text{PC}}) \\ & + P({\text{QC}} \mid {\text{PE}},\sim {\text{PC}})*P({\text{PE}})*P(\sim {\text{PC}}) \\ & + P({\text{QC}} \mid \sim {\text{PE}},{\text{PC}})*P(\sim {\text{PE}})*P({\text{PC}}) \\ & + P({\text{QC}} \mid \sim {\text{PE}},\sim {\text{PC}})*P(\sim {\text{PE}})*P(\sim {\text{PC}}) \\ \end{aligned}$$
(2)

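Since P (QC|PE, PC), P (QC|PE, ~ PC), P (QC|~ PE, PC), and P (QC|~ PE, ~ PC) are read directly from the node probability table, P(QC) follows from Eq. 2. P (QC|PC) is obtained in the same way by summing over the states of PE only, and thus the diagnostic probability P (PC|QC) can be determined.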
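The calculation in Eqs. 1 and 2 can be sketched in a few lines of Python. The numerical values below are hypothetical placeholders, not the NPT entries of Fig. 1, and the sketch assumes that PE and PC are independent root nodes, as the product form of Eq. 2 implies.

```python
# Hypothetical priors and NPT entries (NOT the values shown in Fig. 1).
p_pe = 0.6  # P(PE), i.e. programmer experience is high
p_pc = 0.7  # P(PC), i.e. programmer capability is high

# P(QC | PE, PC), keyed by (PE is high, PC is high)
p_qc_given = {
    (True, True): 0.9,
    (True, False): 0.5,
    (False, True): 0.6,
    (False, False): 0.1,
}

def prior(p_high, is_high):
    """Probability of the high state, or of its complement."""
    return p_high if is_high else 1.0 - p_high

# Eq. 2: marginal P(QC), summing over all parent-state combinations
p_qc = sum(
    p_qc_given[(pe, pc)] * prior(p_pe, pe) * prior(p_pc, pc)
    for pe in (True, False)
    for pc in (True, False)
)

# P(QC | PC): sum over the states of PE only
p_qc_given_pc = sum(p_qc_given[(pe, True)] * prior(p_pe, pe) for pe in (True, False))

# Eq. 1: Bayes' theorem gives the diagnostic probability P(PC | QC)
p_pc_given_qc = p_qc_given_pc * p_pc / p_qc

print(f"P(QC) = {p_qc:.3f}, P(PC|QC) = {p_pc_given_qc:.3f}")
```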

A BBN has several other advantages [12, 19]: the method is probabilistic; it can be created from a modest dataset; it handles situations in which some data entries are inaccessible or missing; it can model causal links; it can readily include expert data; it supports group model building; and updating is simple whenever new information becomes available.

3 Proposed approach

This section presents a generalized approach for constructing the NPT using fuzzy logic. The approach has been validated by applying it to a BBN model of software design and development and evaluating it with best-case and worst-case software metrics.

The proposed approach starts with identifying the causal concepts. Causal concepts are the variables or factors within a specific domain that are believed to have causal relationships with each other. These concepts represent the cause–effect relationships among different elements of the system being modelled. Each causal concept is typically represented as a node in the BBN. Identifying the causal concepts requires domain knowledge along with an understanding of the system under consideration. It is an iterative process that combines domain knowledge, empirical evidence, and expert input, and it may require adjustments and refinements as a deeper understanding of the system and its causal dynamics is obtained.

In the next step the causal relationships are generated, i.e., the cause–effect connections between the variables represented as nodes in the network. These relationships indicate how changes in one variable can influence or cause changes in another variable within the system being modelled. As suggested by Nadkarni and Shenoy [19], there are mainly two ways to generate causal relationships: the data-based approach and the knowledge-based approach. However, generating causal relationships in a BBN sometimes requires a combination of domain knowledge, expert input, and empirical evidence.

In the next step it is necessary to define a membership function for each vertex (input and output node), which basically assigns a degree of membership to an element in the given fuzzy set. It represents the extent to which an element fits in a particular fuzzy set or category. Membership functions are essential in fuzzy logic because they allow for the representation of uncertainty and partial truth. It is important to note that defining membership functions in fuzzy logic involves both subjective and objective considerations. Subjective aspects include expert opinions and linguistic interpretations, while objective aspects can involve statistical analysis or data-driven approaches to determining membership values.
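As a minimal sketch of this step, triangular and trapezoidal membership functions can be written as follows. The function names and the breakpoints of the three linguistic states are illustrative assumptions; in the proposed approach the actual shapes are defined by domain experts.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], flat on [b, c], falls on [c, d].

    a == b or c == d produces a left or right shoulder.
    """
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Hypothetical fuzzy sets for one node on a [0, 1] scale.
NODE_SETS = {
    "L": lambda x: trapezoidal(x, 0.0, 0.0, 0.2, 0.4),
    "M": lambda x: triangular(x, 0.2, 0.5, 0.8),
    "H": lambda x: trapezoidal(x, 0.6, 0.8, 1.0, 1.0),
}

def fuzzify(x):
    """Degree of membership of a crisp value x in each linguistic state."""
    return {state: mu(x) for state, mu in NODE_SETS.items()}

print(fuzzify(0.3))
```

A crisp value can belong partially to more than one state at once; this partial truth is what the membership functions are meant to express.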

In the following step, the fuzzy “IF–Then” rules are designed. Such rules are the fundamental component of a fuzzy logic system and express the relationship between input (antecedent) and output (consequent) variables in fuzzy logic terms. These rules help in making decisions or performing actions based on the input conditions. Designing fuzzy “IF–Then” rules requires a combination of empirical data, domain understanding, and expert knowledge. It is important to ensure that the rules adequately capture the relationship between input and output variables and reflect the desired behavior of the fuzzy logic system.

In the subsequent step, fuzzy inference and defuzzification are performed using the MATLAB tool. Fuzzy inference and defuzzification are the two key steps by which a fuzzy logic system converts fuzzy inputs into crisp outputs. Fuzzy inference determines the fuzzy output from the fuzzy “IF–Then” rules and the input variables: it evaluates the rules and combines their activations to obtain a fuzzy output. Defuzzification converts this fuzzy output into a crisp numerical value that represents the final output of the fuzzy logic system.
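The two steps can be illustrated with a small pure-Python sketch. It is not the MATLAB toolbox procedure used in this work: it assumes Mamdani-style inference (minimum for rule clipping, maximum for aggregation), hypothetical membership functions on a [0, 1] scale, and a sampled centroid. The three single-input rules are those given for the PDD node in Sect. 4.3.1.

```python
def tri(x, a, b, c):
    """Triangular membership; a == b or b == c gives a shoulder."""
    if x < a or x > c:
        return 0.0
    if x <= b:
        return 1.0 if b == a else (x - a) / (b - a)
    return 1.0 if c == b else (c - x) / (c - b)

# Hypothetical fuzzy sets, shared by the input (ODPE) and the output (PDD).
SETS = {"L": (0.0, 0.0, 0.5), "M": (0.0, 0.5, 1.0), "H": (0.5, 1.0, 1.0)}

# Rules for PDD: antecedent state of ODPE -> consequent state of PDD.
RULES = {"H": "L", "M": "M", "L": "H"}

def infer(odpe, steps=1000):
    """Return (rule firing strengths, centroid of the aggregated output)."""
    # Firing strength: membership of the crisp input in each antecedent set.
    strength = {ante: tri(odpe, *SETS[ante]) for ante in RULES}
    zs = [i / steps for i in range(steps + 1)]
    # Clip every consequent at its firing strength (min), then aggregate (max).
    mu = [
        max(min(strength[ante], tri(z, *SETS[cons])) for ante, cons in RULES.items())
        for z in zs
    ]
    # Sampled centroid (center of area) of the aggregated fuzzy output.
    centroid = sum(z * m for z, m in zip(zs, mu)) / sum(mu)
    return strength, centroid

strength, crisp = infer(0.9)
print(strength, round(crisp, 3))
```

With a high ODPE input the centroid moves toward the low end of the PDD scale, which matches the negative influence encoded in the rules.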

Finally, the NPT is obtained from the defuzzified area. The defuzzified shape produced by the fuzzy inference and defuzzification process is used to calculate the NPT by either the geometrical method or the definite integration method.

Algorithm NPT_fuzzy_logic summarizes the steps described above in a formal way.

figure a

For simplicity, the next section illustrates the entire process of the proposed method with an example.

4 A descriptive example

This section demonstrates the procedure of the proposed algorithm with the help of an example. The BBN model of software design and development [8], presented in Fig. 2, is used to explain the working of the proposed approach.

Fig. 2 BBN model of software design and development

4.1 Identify the causal concepts

One of the most crucial steps in BBN creation is the identification of the causal concepts. A causal concept takes part in the cause-and-effect relationships among the development activities and endpoints in a graphical model. It may be an attribute, issue, factor, assumption, or variable of a domain and is typically represented by a node in the BBN. Fenton et al. [8] have identified nine causal concepts for the BBN model of software design and development: RDSE (Relevant Development Staff Experience), DSM (Development Staff Motivation), CP (Capability of Programmer), QDS (Quality of Development Staff), DPF (Defined Process Followed), QDP (Quality of Development Process), EDP (Effort in Development Process), ODPE (Overall Development Process Effectiveness), and PDD (Probability of Defects in Development).

  • RDSE (Relevant Development Staff Experience): RDSE is the first input metric of QDS. Staff with strong technical backgrounds and experience have a significant impact on QDS.

  • DSM (Development Staff Motivation): DSM is the second input metric of QDS. Motivated development staff do their utmost to produce high-quality design and code.

  • CP (Capability of Programmer): CP is the last input metric of QDS. The ability of a programmer is influenced by their education, background, intelligence, and domain expertise.

  • QDS (Quality of Development Staff): QDS is one of the output metrics of the BBN model of software design and development. The output of QDS depends on the evidence of RDSE, DSM, and CP.

  • DPF (Defined Process Followed): DPF is an input metric of QDP. The defined process must be followed to achieve a good-quality development process.

  • QDP (Quality of Development Process): QDP is the second output metric of the BBN model of software design and development. The output of QDP depends on the evidence of DPF and QDS.

  • EDP (Effort in Development Process): EDP is an input metric of ODPE. More effort spent in the development process increases the chances of overall development process effectiveness.

  • ODPE (Overall Development Process Effectiveness): ODPE is the third output metric of the BBN model of software design and development. The output of ODPE depends on the evidence of EDP and QDP.

  • PDD (Probability of Defects in Development): PDD is the desired, final output metric of the BBN model of software design and development. The outcome of PDD depends on the evidence of ODPE.

4.2 Generate the causal relationships

Causal relationships among nodes are expressed through causal connections. A causal connection is a link between two causal concepts and is drawn as a unidirectional arrow. Causal relationships can be positive or negative. A positive connection means that increasing the cause concept increases the effect concept, whereas a negative connection means that increasing the cause concept decreases the effect concept. For example, in Fig. 2, “RDSE,” “DSM,” and “CP” have a positive influence on “QDS.” Thus, the higher RDSE, DSM, and CP, the higher the QDS. On the other hand, “ODPE” has a negative influence on “PDD.” Thus, the higher the ODPE, the lower the PDD.
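The causal structure described for Fig. 2 can be written down as a small data structure and checked for acyclicity. This is only a sketch: the dictionary layout is an assumption, and the signs for the parents of QDP and ODPE are inferred from the fuzzy rules in Sect. 4.3.1 rather than stated explicitly in the text.

```python
from graphlib import TopologicalSorter

# child -> {parent: sign of influence}; "+" positive, "-" negative.
PARENTS = {
    "QDS": {"RDSE": "+", "DSM": "+", "CP": "+"},
    "QDP": {"QDS": "+", "DPF": "+"},    # signs inferred from the QDP rules
    "ODPE": {"QDP": "+", "EDP": "+"},   # signs inferred from the ODPE rules
    "PDD": {"ODPE": "-"},
}

# TopologicalSorter expects node -> predecessors; static_order() raises
# graphlib.CycleError if the graph is not a DAG.
order = list(TopologicalSorter({c: set(p) for c, p in PARENTS.items()}).static_order())

# Input (root) nodes are those that never appear as a child.
roots = sorted(n for n in order if n not in PARENTS)

print("evaluation order:", order)
print("input nodes:", roots)
```

Any valid topological order places each output node after all of its parents, which is also the order in which the NPTs of QDS, QDP, ODPE, and PDD are constructed later in this section.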

4.3 Outline the membership function

Membership functions can be built either by domain experts or from actual data [28, 33]. Numerous shapes, including trapezoidal, triangular, and polygonal, are possible for membership functions [28]. However, triangular and trapezoidal forms are preferred because they provide a useful depiction of domain expert knowledge and make the calculation easier [10, 31]. The membership functions of all input and output metrics of the BBN model in Fig. 2 were defined by domain experts and are shown in Figs. 3–11.

Fig. 3 Relevant development staff experience

Fig. 4 Development staff motivation

Fig. 5 Capability of programmer

Fig. 6 Quality of development staff

Fig. 7 Defined process followed

Fig. 8 Quality of development process

Fig. 9 Effort in development process

Fig. 10 Overall development process effectiveness

Fig. 11 Probability of defects in development

4.3.1 Design “IF–THEN” fuzzy rules

Fuzzy “IF–THEN” rules can be created from different sources, including subject matter experts, knowledge engineering, historical data analysis, and the existing literature [28]. Typically, domain specialists assist in the formulation of fuzzy rules. The “IF–THEN” fuzzy rules designed for each output node of the BBN model are explained below:

  • Quality of Development Staff (QDS): If RDSE, DSM, and CP are high, then QDS will be high. Similarly, if RDSE, DSM, and CP are low, then QDS will be low. This output node has three input nodes, each with three linguistic states: low (L), medium (M), and high (H). Consequently, there are 3^3 = 27 rules in total, some of which are listed below.

Rule 1:- If RDSE is H and DSM is H and CP is H, then QDS is H.

Rule 2:- If RDSE is H and DSM is H and CP is M, then QDS is H.

Rule 26:- If RDSE is L and DSM is L and CP is M, then QDS is L.

Rule 27:- If RDSE is L and DSM is L and CP is L, then QDS is L.

  • Quality of Development Process (QDP): This output node has two input nodes, each with three linguistic states: low (L), medium (M), and high (H). Consequently, there are 3^2 = 9 rules in total, some of which are listed below.

Rule 1:- If QDS is H and DPF is H, then QDP is H.

Rule 2:- If QDS is H and DPF is M, then QDP is H.

Rule 8:- If QDS is L and DPF is M, then QDP is L.

Rule 9:- If QDS is L and DPF is L, then QDP is L.

  • Overall Development Process Effectiveness (ODPE): This output node also has two input nodes, each with three linguistic states: low (L), medium (M), and high (H). Thus, there are 9 rules in total, some of which are listed below.

Rule 1:- If QDP is H and EDP is H, then ODPE is H.

Rule 2:- If QDP is H and EDP is M, then ODPE is H.

Rule 8:- If QDP is L and EDP is M, then ODPE is L.

Rule 9:- If QDP is L and EDP is L, then ODPE is L.

  • Probability of Defects in Development (PDD): This output node has only one input node with three linguistic states: low (L), medium (M), and high (H). Thus, only 3 fuzzy rules are needed.

Rule 1:- If ODPE is H, then PDD is L.

Rule 2:- If ODPE is M, then PDD is M.

Rule 3:- If ODPE is L, then PDD is H.
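The rule counts above follow from enumerating every combination of linguistic states of the input nodes. The sketch below generates only the antecedents, in the same H, M, L order as the listed rules; the helper name `antecedents` is hypothetical, and the consequents of the QDS, QDP, and ODPE rules still have to be supplied by domain experts. Only the three PDD rules are fully stated in the text and are reproduced here.

```python
from itertools import product

STATES = ("H", "M", "L")  # same ordering as the rules listed above

def antecedents(inputs):
    """All combinations of linguistic states for the given input nodes."""
    return [dict(zip(inputs, combo)) for combo in product(STATES, repeat=len(inputs))]

qds_rules = antecedents(["RDSE", "DSM", "CP"])   # 27 antecedents
qdp_rules = antecedents(["QDS", "DPF"])          # 9 antecedents
odpe_rules = antecedents(["QDP", "EDP"])         # 9 antecedents

# The PDD rules as stated in the text: ODPE state -> PDD state.
PDD_RULES = {"H": "L", "M": "M", "L": "H"}

for number, rule in enumerate(qds_rules, start=1):
    condition = " and ".join(f"{node} is {state}" for node, state in rule.items())
    print(f"Rule {number}:- If {condition}, then QDS is ?")
```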

4.4 Fuzzy inference and defuzzification

The fuzzy inference engine evaluates the output of each fuzzy rule and combines the results; in effect, fuzzy inference converts one fuzzy set into another. Many applications require a crisp value as output, and the fuzzy set is mapped to a crisp value using defuzzification techniques such as centroid, max–min, and bisection. The proposed approach calculates the crisp value using the centroid method of defuzzification, commonly known as center of area or center of gravity. This is the most popular and physically appealing defuzzification method available [28]. The output of defuzzification for the first evidence of node QDS is shown in Fig. 12. The MATLAB Fuzzy Logic Toolbox is used to perform the fuzzy inference and defuzzification process.

Fig. 12 Output of defuzzification for first evidence of node QDS
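For reference, the centroid (center of area) method is usually written as follows, where µ(z) is the aggregated output membership function and z* is the crisp value. This is the standard textbook definition; the symbol z* is introduced here only for illustration.

$$z^{*} = \frac{\int \mu \left( z \right) z\,{\text{d}}z}{\int \mu \left( z \right){\text{d}}z}$$

The denominator is the total area under the aggregated output, which is the quantity that Sect. 4.5 splits into the contributions of the low, medium, and high states.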

4.5 Construct NPT with the help of defuzzified area

The NPT of a node can be constructed from the defuzzified shape obtained from the fuzzy inference and defuzzification process. For better visualization, the output for the first evidence of node QDS is redrawn as a graph in Fig. 13. In our case, triangular and trapezoidal membership functions have been used, so the defuzzified shape is also made up of triangular and trapezoidal parts. The defuzzified area of triangular and trapezoidal shapes can be calculated by either the geometrical method or the definite integration method. For membership functions such as Gaussian, sigmoidal, Z curves, S curves, and Pi curves, only the definite integration method can be applied. In our case, therefore, both methods (geometrical and definite integration) can be applied.

Fig. 13 Graph of defuzzified area of first evidence

First method: geometrical method.

  • Defuzzified area of low (DVL): From Fig. 13, the shape is triangular.

    $$\begin{aligned} {\text{DV}}_{{\text{L}}} = & \, 0.5*{\text{Base}}*{\text{Height}} \\ = & \, 0.5*0.07*0.17 \\ = & \, 0.00595 \\ \end{aligned}$$
  • Defuzzified area of medium (DVM): From Fig. 13, the shape is trapezoidal, but it overlaps the area of low, which must be subtracted.

    $$\begin{aligned} {\text{DV}}_{{\text{M}}} = & {\text{ Total defuzzified area }} - {\text{ overlapping area of low}} \\ = & \, \left[ {\left( {\text{Area of trapezium}} \right) - \left( {\text{Area of triangle}} \right)} \right] \\ = & \, \left[ {\left( {0.5*\left( {\text{Sum of parallel sides}} \right)*{\text{Height}}} \right) - \left( {0.5*{\text{Base}}*{\text{Height}}} \right)} \right] \\ = & \, \left[ {\left( {0.5*\left( {0.27 + 0.07} \right)*0.47} \right) - \left( {0.5*0.07*0.17} \right)} \right] \\ = & \, 0.0799 - 0.00595 = 0.07395 \approx 0.07 \\ \end{aligned}$$
  • Defuzzified area of high (DVH): From Fig. 13, the shape is trapezoidal.

    $$\begin{aligned} {\text{DV}}_{{\text{H}}} = & \, 0.5*\left( {\text{Sum of parallel sides}} \right)*{\text{Height}} \\ = & \, 0.5*\left( {0.3 + 0.4} \right)*0.47 \\ = & \, 0.1645 \\ \end{aligned}$$

Now, the probability of low, medium, and high can be calculated using the formula given in Eq. 3.

$${\text{Probability of }}L{\text{ or }}M{\text{ or }}H = \frac{{{\text{Defuzzified area of }}L{\text{ or }}M{\text{ or }}H}}{{\text{Total defuzzified area}}}$$
(3)

Probability of low:

$$= \frac{0.00595}{0.00595+0.07 +0.1645} \approx 0.025 \approx 0.03$$

Probability of medium:

$$= \frac{0.07 }{0.00595+0.07 +0.1645} \approx 0.29$$

Probability of high:

$$= \frac{0.1645}{0.00595+0.07 +0.1645} \approx 0.6841 \approx 0.68$$
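The geometrical method and the normalization of Eq. 3 can be reproduced with a short sketch. The function names are hypothetical; the side lengths and heights are the ones read from Fig. 13 in the text. The text rounds the medium area to 0.07 before normalizing, so the sketch does the same and also keeps the unrounded value for comparison.

```python
def triangle_area(base, height):
    """Area of a triangle: 0.5 * base * height."""
    return 0.5 * base * height

def trapezium_area(side_a, side_b, height):
    """Area of a trapezium: 0.5 * (sum of parallel sides) * height."""
    return 0.5 * (side_a + side_b) * height

def normalize(areas):
    """Eq. 3: divide each defuzzified area by the total defuzzified area."""
    total = sum(areas.values())
    return {state: area / total for state, area in areas.items()}

dv_low = triangle_area(0.07, 0.17)
dv_medium = trapezium_area(0.27, 0.07, 0.47) - dv_low  # remove overlap with low
dv_high = trapezium_area(0.3, 0.4, 0.47)

# Medium area rounded to two decimals, as in the worked example above.
probs = normalize({"L": dv_low, "M": round(dv_medium, 2), "H": dv_high})
# Same calculation without the intermediate rounding, for comparison.
probs_unrounded = normalize({"L": dv_low, "M": dv_medium, "H": dv_high})

print({state: round(p, 4) for state, p in probs.items()})
print({state: round(p, 4) for state, p in probs_unrounded.items()})
```

The intermediate rounding shifts the medium and high probabilities by about one percentage point, which is worth keeping in mind when the NPT entries are reported to two decimals.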

Second method: definite integration method (DIM).

The graph in Fig. 13 plots the possibility (µ) against the point (z), and each of its segments is a straight line. The equation of a straight line is given in Eq. 4.

$$\mu -{\mu }_{1}=m\left(z-{z}_{1}\right)$$
(4)

Here m is the slope of the straight line, given by Eq. 5.

$$m=\frac{{\mu }_{2}-{\mu }_{1}}{{z}_{2}-{z}_{1}}$$
(5)

where (z1, µ1) and (z2, µ2) are two known points on the straight line.

From this graph (Fig. 13), the defuzzified areas of low, medium, and high can be calculated using the definite integration method (DIM) by eliminating the overlapping area. The general expression for the defuzzified area is given in Eq. 6, where the second bracket collects the overlapping areas to be removed.

$${\text{DIM}} = \left[ {\mathop \int \limits_{{a_{1} }}^{{b_{1} }} \mu_{1} \left( z \right){\text{d}}z + \mathop \int \limits_{{a_{2} }}^{{b_{2} }} \mu_{2} \left( z \right){\text{d}}z + \ldots + \mathop \int \limits_{{a_{n} }}^{{b_{n} }} \mu_{n} \left( z \right){\text{d}}z} \right] - \left[ {\mathop \int \limits_{{c_{1} }}^{{d_{1} }} \mu_{{k_{1} }} \left( z \right){\text{d}}z + \mathop \int \limits_{{c_{2} }}^{{d_{2} }} \mu_{{k_{2} }} \left( z \right){\text{d}}z + \ldots + \mathop \int \limits_{{c_{n} }}^{{d_{n} }} \mu_{{k_{n} }} \left( z \right){\text{d}}z} \right]$$
(6)

where \({a}_{i},{b}_{i}, {c}_{i}, {d}_{i} \forall 1\le i \le n\) are the limits of integration.

  • Defuzzified area of low (DVL): The defuzzified area of low lies in the interval z ∈ [0.43, 0.5], which is divided into two parts, z ∈ [0.43, 0.47] and z ∈ [0.47, 0.5]. Putting z1 = 0.43, µ1 = 0, z2 = 0.47, µ2 = 0.17 in Eqs. 4 and 5 gives the equation of the straight line for z ∈ [0.43, 0.47]:

    $$\mu =4.25z-1.827$$
    (7)

Similarly, putting z1 = 0.5, µ1 = 0, z2 = 0.47, µ2 = 0.17 in Eqs. 4 and 5 gives the equation of the straight line for z ∈ [0.47, 0.5]:

$$\mu =-5.67z+2.835$$
(8)

The DVL can be calculated using Eqs. 6, 7, and 8 as:

$$\begin{aligned} {\text{DV}}_{{\text{L}}} = & \int_{0.43}^{0.47} \left( {4.25z - 1.827} \right){\text{d}}z + \int_{0.47}^{0.5} \left( { - 5.67z + 2.835} \right){\text{d}}z \\ = & \, 0.00597 \\ \end{aligned}$$
  • Defuzzified area of medium (DVM): The defuzzified area of medium lies in the interval z ∈ [0.43, 0.7], which is divided into four parts, z ∈ [0.5, 0.47], z ∈ [0.47, 0.53], z ∈ [0.53, 0.6], and z ∈ [0.6, 0.7]. The first part is written with reversed limits so that its integral is negative and removes the area that overlaps with low. Putting z1 = 0.5, µ1 = 0, z2 = 0.47, µ2 = 0.17 in Eqs. 4 and 5 gives the equation of the straight line for z ∈ [0.5, 0.47]:

    $$\mu =-5.67z+2.835$$
    (9)

Similarly, putting z1 = 0.43, µ1 = 0, z2 = 0.47, µ2 = 0.17 for z ∈ [0.47, 0.53], z1 = 0.53, µ1 = 0.47, z2 = 0.6, µ2 = 0.47 for z ∈ [0.53, 0.6], and z1 = 0.7, µ1 = 0, z2 = 0.6, µ2 = 0.47 for z ∈ [0.6, 0.7] in Eqs. 4 and 5 gives the equations of the straight lines for z ∈ [0.47, 0.53], z ∈ [0.53, 0.6], and z ∈ [0.6, 0.7], respectively:

$$\mu =4.25z-1.827$$
(10)
$$\mu =0.47$$
(11)
$$\mu =-4.7z+3.29$$
(12)

The DVM can be calculated using Eqs. 6, 9, 10, 11, and 12 as:

$$\begin{aligned} {\text{DV}}_{{\text{M}}} = & \left[ {\int_{0.5}^{0.47} \left( { - 5.67z + 2.835} \right){\text{d}}z + \int_{0.47}^{0.53} \left( {4.25z - 1.827} \right){\text{d}}z + \int_{0.53}^{0.6} 0.47\,{\text{d}}z + \int_{0.6}^{0.7} \left( { - 4.7z + 3.29} \right){\text{d}}z} \right] \\ = & \, 0.0717 \\ \end{aligned}$$
  • Defuzzified area of high (DVH): The defuzzified area of high lies in the interval z ∈ [0.5, 1], which is divided into two parts, z ∈ [0.5, 0.58] and z ∈ [0.58, 1]. Putting z1 = 0.5, µ1 = 0, z2 = 0.58, µ2 = 0.47 in Eqs. 4 and 5 gives the equation of the straight line for z ∈ [0.5, 0.58]:

    $$\mu =5.875z-2.937$$
    (13)

Similarly, putting z1 = 0.58, µ1 = 0.47, z2 = 1, µ2 = 0.47 in Eqs. 4 and 5 gives the equation of the straight line for z ∈ [0.58, 1]:

$$\mu =0.47$$
(14)

The DVH can be calculated using Eqs. 6, 13, and 14, where the second bracket removes the overlapping area:

$$\begin{aligned} {\text{DV}}_{{\text{H}}} = & \left[ {\int_{0.5}^{0.58} \left( {5.875z - 2.937} \right){\text{d}}z + \int_{0.58}^{1} 0.47\,{\text{d}}z} \right] - \left[ {\int_{0.5}^{0.58} \left( {5.875z - 2.937} \right){\text{d}}z + \int_{0.58}^{0.6} 0.47\,{\text{d}}z + \int_{0.6}^{0.7} \left( { - 4.7z + 3.29} \right){\text{d}}z} \right] \\ = & \, 0.1645 \\ \end{aligned}$$

Now, the probability of low, medium, and high can be calculated using the formula given in Eq. 3.

Probability of low:

$$= \frac{0.00597}{0.00597+0.0717 +0.1645} \approx 0.025 \approx 0.03$$

Probability of medium:

$$= \frac{0.0717 }{0.00597+0.0717 +0.1645} \approx 0.29$$

Probability of high:

$$= \frac{0.1645}{0.00597+0.0717 +0.1645} \approx 0.679 \approx 0.68$$
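The definite integrals above can be checked with a short sketch that integrates each straight-line segment in closed form. The helper name `integrate_line` is hypothetical; the slopes, intercepts, and limits are taken directly from Eqs. 7 to 14, including their rounding.

```python
def integrate_line(slope, intercept, lower, upper):
    """Exact integral of mu = slope * z + intercept from lower to upper."""
    return slope * (upper**2 - lower**2) / 2.0 + intercept * (upper - lower)

# Low: Eqs. 7 and 8
dv_low = (integrate_line(4.25, -1.827, 0.43, 0.47)
          + integrate_line(-5.67, 2.835, 0.47, 0.5))

# Medium: Eqs. 9 to 12; the first term has reversed limits, so it is
# negative and removes the part that overlaps with low.
dv_medium = (integrate_line(-5.67, 2.835, 0.5, 0.47)
             + integrate_line(4.25, -1.827, 0.47, 0.53)
             + integrate_line(0.0, 0.47, 0.53, 0.6)
             + integrate_line(-4.7, 3.29, 0.6, 0.7))

# High: Eqs. 13 and 14, minus the overlapping area in the second bracket.
high_total = (integrate_line(5.875, -2.937, 0.5, 0.58)
              + integrate_line(0.0, 0.47, 0.58, 1.0))
high_overlap = (integrate_line(5.875, -2.937, 0.5, 0.58)
                + integrate_line(0.0, 0.47, 0.58, 0.6)
                + integrate_line(-4.7, 3.29, 0.6, 0.7))
dv_high = high_total - high_overlap

# Eq. 3: normalize the three areas into probabilities.
total = dv_low + dv_medium + dv_high
probabilities = {"L": dv_low / total, "M": dv_medium / total, "H": dv_high / total}

print(round(dv_low, 5), round(dv_medium, 4), round(dv_high, 4))
print({state: round(p, 3) for state, p in probabilities.items()})
```

Because every segment is linear, the closed-form integral is exact, so any small difference from the geometrical method comes only from the rounded line coefficients and from the different treatment of the medium segment.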

The probabilities of low, medium, and high for the first evidence of node QDS have thus been calculated with both methods. For the remaining 26 evidence combinations, only the first method is applied, because triangular and trapezoidal membership functions have been used in our experiment. The complete NPT of node QDS is shown in Table 1. The numerical values of the NPTs are expressed as percentages (on a scale of 100). Similarly, the NPTs of nodes QDP, ODPE, and PDD have been constructed and are shown in Tables 2, 3, and 4.

Table 1 NPT of node QDS
Table 2 NPT of node QDP
Table 3 NPT of node ODPE
Table 4 NPT of PDD

5 Validation of proposed method

To validate the constructed NPTs the following steps have been applied:

Step 1 Use of BBN tool Netica [21].

Step 2 Model construction in Netica.

Step 3 Apply the constructed NPTs to all the output nodes.

Step 4 Compilation of BBN model.

Step 5 Apply evidence on output of step 4, i.e., compiled mode of BBN model.

Step 6 Analyze obtained result from BBN model.

The BBN model of the software design and development subnet shown in Fig. 2 has been constructed with the Netica tool. After model construction, the constructed NPTs are inserted into all the output nodes, and the BBN model is compiled using Netica. The compiled mode of the BBN model of the software design and development subnet obtained from Netica is shown in Fig. 14.

Fig. 14 Compile mode of BBN

Next, the best-case and worst-case software metrics shown in Table 5 are applied to the compiled BBN model to validate the correctness and applicability of the proposed approach. The outcomes are shown in Figs. 15 and 16.

Table 5 Qualitative value of software metrics
Fig. 15 Outcome of best-case scenario

Fig. 16 Outcome of worst-case scenario

5.1 Result analysis

The outcomes of the BBN model for the best- and worst-case scenarios are shown in Figs. 15 and 16. The probability of defects in design and development is low (53.2%) in the best-case scenario, whereas it is high (59.5%) in the worst-case scenario. Further, when the evidence of RDSE, DSM, and CP is applied, the probabilistic outcome of QDS is high (68.0%) in the best-case scenario, whereas it is low (89%) in the worst-case scenario. Similarly, when the evidence of DPF is applied, the outcome of QDP is high (58.9%) in the best-case scenario, whereas it is low (85.2%) in the worst-case scenario. Finally, when the evidence of EDP is applied, the outcome of ODPE is high (53.5%) in the best-case scenario, whereas it is low (71.8%) in the worst-case scenario.

These observations indicate that the constructed NPTs behave correctly, because the outcomes of all the output nodes (QDS, QDP, ODPE, and PDD) are favorable in the best-case scenario and adverse in the worst-case scenario.

6 Conclusion and future scope

Generating a node probability table (NPT) in Bayesian belief networks (BBN) has been considered an NP-Hard problem, as the time complexity grows exponentially with the number of variables. Many articles in the scientific literature suggest methods to solve this issue; however, most are problem-specific and designed for special cases. This paper presented a novel universal approach, NPT_fuzzy_logic(), to generate an NPT in a BBN using fuzzy logic. The method proceeds by identifying the causal concepts among the nodes of a given graph G(V, E), generating the causal relationships among the nodes, defining membership functions for the input and output nodes, defining fuzzy “IF–THEN” rules, performing fuzzy inference and defuzzification to find the defuzzified area, and finally calculating the NPT from the computed defuzzified area. Fuzzy inference and defuzzification have been performed using the Fuzzy Logic Toolbox™ provided in MATLAB®. For simplicity and ease of understanding, the proposed method has been demonstrated with an illustrative example based on the well-known BBN model of software design and development [8]. The proposed method has been validated using the BBN tool Netica®. The result analysis section shows the significance of the proposed method by considering the outcome of the BBN model under best- and worst-case scenarios, followed by the probabilistic outcomes among the causal concepts. The correctness of the constructed NPTs has been shown by considering the outcomes of the output nodes. The proposed approach will also be useful in domain-expert-based NPT construction, because fuzzy logic represents qualitative, perception-based reasoning through “IF–THEN” fuzzy rules, which makes it easier for experts to express their judgment of the NPT. As the proposed approach is not problem-specific, it may be applied universally in other problem domains.

In future, the work can be extended to other problem domains, such as software development, agriculture, and the environment, subject to the availability of real-time datasets. Given the wide application of BBNs in decision-making systems, nature-based solutions, stakeholder engagement, reliability and safety engineering, etc., the work will be extended to analyze and support various circumstances, such as analyzing pipe failure in water/oil distribution systems [11], stakeholders' knowledge to support nature-based solution implementation [2], and software project management [1].