1 Introduction

Cost overrun, defined as the excess of actual cost over budgeted cost [3, 4], has become a global phenomenon in construction projects [15]. All project participants are affected by cost overruns: clients earn a lower return on investment, end-users bear additional charges, and contractors are unable to earn sufficient profit [2]. Cost overruns are also very common in the Indian construction industry, which is the second largest industry in the country after agriculture and accounts for some 6.5% of GDP [10]. The problem has not received the attention it requires, and its impact on the Indian construction industry has been significant, to the extent that almost 60% of projects have experienced time and cost overruns. According to a report published in May 2016 by the programme implementation division of the Ministry of Statistics and Programme Implementation (MOSPI) [48], the total original cost of 86 projects, when sanctioned, was of the order of Rs. 179,120.81 crore, but the anticipated cost was subsequently revised to Rs. 230,344.92 crore, implying a cost overrun of 28.6%. The expenditure incurred on these projects up to May 2016 was Rs. 30,149.3 crore, which is 13.1% of the revised cost. Hence it is a matter of great concern to assess the causes of cost overrun in order to alleviate the problem.

Cost overrun occurs during the project cycle because of various factors, each associated with some form of risk arising from a different source. Hence, risk assessment of the factors causing cost overrun has become a necessary task for the construction industry. Managers can identify the importance of each risk factor causing cost overrun and deploy more resources on the critical ones to eliminate or mitigate the expected consequences. Risk assessment is therefore considered a necessary part of the decision-making process.

The main objective of risk assessment is to estimate risk by identifying the factors causing it, the likelihood of occurrence of each risk factor, and the severity of its consequences. Existing approaches to risk analysis are either qualitative, based on the fully subjective judgment of competent personnel, or quantitative, based on precise data, such as Monte Carlo simulation, sensitivity analysis, fault tree analysis, event tree analysis, and failure mode, effects and criticality analysis. The exact data required by these quantitative methods are very difficult to obtain in complex real situations such as construction projects, which involve many factors with a high level of uncertainty.

A model based on the qualitative approach can be developed by incorporating linguistic variables to address the uncertainties associated with construction activities. Fuzzy set theory (FST) has the potential to deal with incomplete, imprecise and uncertain data, which makes it well suited to the kind of vagueness found in complex construction projects, and it can also be used with inexact inputs. Baloi and Price [5] compared different theories used for dealing with uncertainty within the construction industry and recommended FST as a vital solution for assessing construction uncertainty. Very few studies applying fuzzy techniques to cost overrun assessment in India were found during the literature survey, which motivated the application of fuzzy techniques in this paper.

Therefore, the objective of this study is to propose a model based on fuzzy techniques to estimate the magnitude of the risk factors causing cost overrun in the Indian construction industry, and to prioritize these factors on the basis of their risk magnitude. To assess the magnitude of each risk factor, a probability index (PI) and a severity index (SI) are expressed as fuzzy linguistic variables, and the risk magnitude, namely the fuzzy index for cost overrun (FIC), is calculated for each factor using the risk matrix originating from PMBOK (2004 version).

This study will help participants in the Indian construction industry to understand the factors causing cost overrun in construction projects, and it will provide practitioners with a solid overview of the knowledge domain.

2 Related work

Fuzzy set theory has been used as a decision support tool and for evaluation, assessment, forecasting and modelling of risks in construction projects. Kangari and Riggs [22] proposed a risk assessment model based on fuzzy set theory (FST) by incorporating natural language computation, fuzzy set evaluation of risk, and linguistic approximation [41]. The linguistic approximation technique deals with subjectivity in construction risk assessment by determining the nearest natural language expression for the estimated fuzzy numbers. Paek et al. [35] developed a risk-pricing algorithm using FST to calculate the bid price of a construction project, in which fuzzy arithmetic operations are applied to compute the risk contingency value. Wirba et al. [47] proposed a fuzzy-based risk management system that identified risks, checked for dependence amongst them and assessed their likelihood of occurrence using linguistic variables. Tah and Carr [42] also presented a model incorporating linguistic variables to analyse risk. Kuchta [27] used fuzzy numbers to assess the risks of construction projects. Cho et al. [8] proposed a methodology for risk assessment that combines uncertainty (using fuzzy concepts) with traditional frameworks of risk assessment. Knight and Fayek [25] used fuzzy theory to predict potential cost overruns on engineering design projects. Wang and Liang [44] proposed a multiple fuzzy goals programming model to minimize total project costs, total completion time and total crashing costs. Choi et al. [9] presented a risk assessment methodology for underground construction projects. Zheng and Ng [51] analysed cost and time in construction project management and provided methods and tools based on fuzzy sets to evaluate and manage risks in underground construction projects. The proposed methodology used a fuzzy risk assessment approach to assess the priority of risks in terms of extra costs over the budget.

Shang et al. [39] developed an FST-based risk model to assess risk probability and impact using linguistic variables in the design and conceptual stages. Dikmen and Birgonul [11] presented a fuzzy risk assessment methodology for international construction projects, which uses effect diagrams to build the risk model and a fuzzy risk assessment approach to estimate the priority of risks in terms of extra costs over the budget. Dikmen et al. [12] proposed a fuzzy risk assessment methodology for assessing the risk of cost overrun in international construction projects. Wang and Elhag [45] presented a fuzzy group decision-making approach for risk assessment in bridge construction projects. Lee and Lin [28] presented a new method for fuzzy risk assessment that uses fuzzy numbers directly instead of linguistic variables. Chen and Wang [7] proposed a fuzzy AHP method for evaluating risks in global construction projects. Karimiazari et al. [23] and Nieto-Morote and Ruz-Vila [33] introduced FST-based risk modelling and analysis methods to deal with ill-defined, vague, imprecise and complex risk analysis problems. Yeung et al. [49] adopted fuzzy set theory to measure the performance of relationship-based construction projects in Australia. San Cristobal [38] proposed the use of the PROMETHEE method under fuzzy environments to determine the critical path of a network, considering not only time but also cost, quality and safety criteria. Gunduz et al. [18] proposed a fuzzy assessment model to estimate the probability of delay in Turkish construction projects. Shaktawat and Vadhera [40] presented a fuzzy tool for determining the cost overrun of a river-type hydro power plant. Khalek et al. [24] developed risk models for the global construction market, using the analytic hierarchy process (AHP) to evaluate risk factor weights (likelihood) and a fuzzy logic approach to evaluate risk factor impact (risk consequences), with software aids such as Excel and MATLAB. Manoliadis [31] applied the concepts of fuzzy association and fuzzy composition to identify relationships between risks and their consequences for project performance measures.

3 Proposed approach for developing the model

This section discusses the proposed methodology for developing the model for assessing the risk factors causing cost overrun in detail. The following steps are performed to develop the model.

  1. Cost overrun factor identification: The risk factors causing cost overrun in construction projects are identified through literature review.

  2. Model development: A model is developed to assess the risk factors causing cost overrun using a fuzzy inference system.

3.1 Cost overrun factor identification

Risk factors causing cost overrun in the construction industry are identified through literature review. Owing to the seriousness of the problem, the literature survey reveals a considerable number of studies that have investigated cost overrun factors in both developed and developing countries. Table 1 lists studies on cost overrun factors from different parts of the world.

Table 1 Studies regarding risk factors causing cost overrun from different parts of the world

On the basis of the above studies and the opinion of experts from the Indian construction industry, 55 commonly occurring risk factors causing cost overrun in Indian construction projects are considered for assessment. Table 2 shows these factors.

Table 2 Commonly occurring risk factors causing cost overrun in Indian construction projects

3.2 Model development

A model to assess the risk magnitude of the factors causing cost overrun is developed using fuzzy theory. Some fundamentals of fuzzy set theory (FST) and the fuzzy inference process are therefore described here.

3.2.1 Basic concept of fuzzy set theory

3.2.1.1 Fuzzy set

Fuzzy set theory was first introduced by Zadeh [50]. Formally, a fuzzy set A in a universe of discourse X is characterized by a membership function µA(x): X → [0, 1] and can be defined as

$${\text{A}}=\{ ({\text{x}},\ {\upmu_{A}}({\text{x}})) \mid {\text{x}} \in {\text{X}},\ {\upmu_{A}}({\text{x}}) \in [0,1] \}$$

where µA(x) is the membership function, which states the degree to which any element x in X is a member of the fuzzy set A. This definition associates each element x in X with a value µA(x) in the interval [0, 1].

3.2.1.2 Membership functions

In fuzzy logic, a membership function (MF) is a curve that defines the degree of membership of a linguistic variable as a value between 0 and 1, giving a numerical meaning to each label. Membership functions come in different shapes, viz. triangular, trapezoidal, Gaussian, bell-shaped, piecewise-linear, etc. The membership function of a triangular fuzzy number A = (a1, am, a2) (see Fig. 1) can be expressed by Eq. (1):

Fig. 1 Triangular membership function

$${\upmu_{A}}({\text{x}})=\left\{ {\begin{array}{*{20}{c}} {({\text{x}} - {{\text{a}}_1})/({{\text{a}}_{\text{m}}} - {{\text{a}}_1})}&{{{\text{a}}_1} \leq {\text{x}} \leq {{\text{a}}_{\text{m}}}} \\ {({{\text{a}}_2} - {\text{x}})/({{\text{a}}_2} - {{\text{a}}_{\text{m}}})}&{{{\text{a}}_{\text{m}}} \leq {\text{x}} \leq {{\text{a}}_2}} \\ 0&{{\text{otherwise}}} \end{array}} \right.$$
(1)
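
As a minimal sketch, Eq. (1) can be evaluated in Python as follows. The function name `triangular_mf` and the example triple (0.25, 0.5, 0.75) are illustrative assumptions and are not taken from Tables 3 and 4. The explicit peak check is an added convention so that shoulder-shaped sets with a1 = am or am = a2 do not cause a division by zero.

```python
def triangular_mf(x: float, a1: float, am: float, a2: float) -> float:
    """Degree of membership of x in the triangular fuzzy number (a1, am, a2)."""
    if not (a1 <= am <= a2):
        raise ValueError("expected a1 <= am <= a2")
    if x == am:
        return 1.0  # peak of the triangle (also covers a1 == am or am == a2)
    if a1 < x < am:
        return (x - a1) / (am - a1)  # rising edge of Eq. (1)
    if am < x < a2:
        return (a2 - x) / (a2 - am)  # falling edge of Eq. (1)
    return 0.0  # outside the support, or exactly at a foot of the triangle


# Hypothetical "medium" label on a [0, 1] scale (an assumption for illustration).
MEDIUM = (0.25, 0.5, 0.75)

print(triangular_mf(0.375, *MEDIUM))  # 0.5 (halfway up the rising edge)
print(triangular_mf(0.5, *MEDIUM))    # 1.0 (peak)
print(triangular_mf(0.9, *MEDIUM))    # 0.0 (outside the support)
```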
3.2.1.3 Fuzzy operators

The fuzzy operators “AND”, “OR” and “NOT” are used to build fuzzy rules. These “fuzzy combination” operators can be explained as follows.

Consider two fuzzy sets A and B with membership functions µA(x) and µB(x), respectively.

The intersection operation (which corresponds to the logical ‘AND’) can be defined as:

$${\upmu_{A \cap B}}({\text{x}})=\min \left[ {\upmu_{A}}({\text{x}}),\ {\upmu_{B}}({\text{x}}) \right]$$

The union operation (which corresponds to the logical ‘OR’) can be defined as:

$${\upmu_{A \cup B}}({\text{x}})=\max \left[ {\upmu_{A}}({\text{x}}),\ {\upmu_{B}}({\text{x}}) \right]$$
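
The two definitions above translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the complement used for ‘NOT’ (one minus the membership degree) is the standard choice but is an assumption here, since the text does not define it.

```python
def fuzzy_and(mu_a: float, mu_b: float) -> float:
    """Intersection (logical AND): the smaller of the two membership degrees."""
    return min(mu_a, mu_b)


def fuzzy_or(mu_a: float, mu_b: float) -> float:
    """Union (logical OR): the larger of the two membership degrees."""
    return max(mu_a, mu_b)


def fuzzy_not(mu_a: float) -> float:
    """Standard complement (assumed; not defined in the text)."""
    return 1.0 - mu_a


# Illustrative membership degrees of one element x in fuzzy sets A and B.
mu_a, mu_b = 0.25, 0.75

print(fuzzy_and(mu_a, mu_b))  # 0.25
print(fuzzy_or(mu_a, mu_b))   # 0.75
print(fuzzy_not(mu_a))        # 0.75
```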
3.2.1.4 IF-THEN rules

A fuzzy system is a collection of IF-THEN rules that connect input linguistic variables to an output value. Each fuzzy rule consists of an antecedent and a consequent, both of which contain fuzzy propositions. These propositions are statements that join linguistic variables with the linguistic operators “and”, “or” and “not”. In most fuzzy modelling, only the operator “and” is used to join the linguistic labels of the antecedent, whereas the consequent is formed by a single linguistic label. For this reason, IF-THEN rules with the connective “and” are considered in this study. The general form of a fuzzy rule used in a fuzzy logic control system is given by relation (2):

$${\text{If }}{{\text{x}}_1}\;{\text{is}}\;{{\text{A}}_1}\;{\text{AND}}\;{{\text{x}}_2}\;{\text{is}}\;{{\text{A}}_2}\;{\text{then}}\;{\text{y}}\;{\text{is}}\;{\text{B}}$$
(2)

where x1, x2 are input linguistic variables with A1, A2 being their corresponding fuzzy values and y is the output linguistic variable with B as its fuzzy value.

3.2.1.5 Fuzzy inference system

Fuzzy inference is the process of mapping a given input to an output using fuzzy logic. The Mamdani fuzzy inference system and the Takagi-Sugeno fuzzy model (TS method) are two important types of fuzzy inference system (FIS). The Mamdani fuzzy inference system (Mamdani and Assilian [29]) is widely used; therefore the basic steps of this process are discussed here:

  1. A set of fuzzy rules is established.

  2. The crisp inputs are fuzzified using the membership functions.

  3. The fuzzified inputs are combined according to the fuzzy rules to establish the rule strength.

  4. The outcome of each rule is found by combining the rule strength with the output membership function.

  5. The outcomes are combined to obtain an output distribution.

  6. The output distribution is defuzzified to obtain a crisp output.

A detailed description of this process is shown in Fig. 2.

Fig. 2 Mamdani fuzzy inference system
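
The six steps listed above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the MATLAB model used in the study: the three-level label sets, the 3 × 3 rule table and the function names `tri` and `mamdani` are hypothetical, and centroid defuzzification over a sampled output universe is assumed because the text does not name the defuzzification method.

```python
def tri(x, a1, am, a2):
    """Triangular membership function of Eq. (1); the peak check handles shoulder sets."""
    if x == am:
        return 1.0
    if a1 < x < am:
        return (x - a1) / (am - a1)
    if am < x < a2:
        return (a2 - x) / (a2 - am)
    return 0.0


# Assumed three-level label sets on [0, 1]; NOT the values of Tables 3 and 4.
SETS = {"low": (0.0, 0.0, 0.5), "medium": (0.0, 0.5, 1.0), "high": (0.5, 1.0, 1.0)}

# Step 1: rule base, (PI label, SI label) -> FIC label. Illustrative only,
# not the PMBOK-derived risk matrix of Table 5.
RULES = {
    ("low", "low"): "low", ("low", "medium"): "low", ("low", "high"): "medium",
    ("medium", "low"): "low", ("medium", "medium"): "medium", ("medium", "high"): "high",
    ("high", "low"): "medium", ("high", "medium"): "high", ("high", "high"): "high",
}


def mamdani(pi, si, steps=1000):
    """Return a crisp output in [0, 1] for crisp inputs pi and si in [0, 1]."""
    if not (0.0 <= pi <= 1.0 and 0.0 <= si <= 1.0):
        raise ValueError("inputs must lie in [0, 1]")
    # Step 2: fuzzify the crisp inputs.
    mu_pi = {label: tri(pi, *abc) for label, abc in SETS.items()}
    mu_si = {label: tri(si, *abc) for label, abc in SETS.items()}
    # Step 3: rule strength is the minimum of the antecedent degrees (AND).
    strengths = [(min(mu_pi[p], mu_si[s]), out) for (p, s), out in RULES.items()]
    # Steps 4 and 5: clip each output set at its rule strength, aggregate with max.
    # Step 6: centroid of the aggregated distribution (assumed method).
    num = den = 0.0
    for k in range(steps + 1):
        y = k / steps
        mu = max(min(w, tri(y, *SETS[out])) for w, out in strengths)
        num += y * mu
        den += mu
    return num / den


print(round(mamdani(0.5, 0.5), 3))  # close to 0.5 by symmetry of the assumed sets
print(mamdani(0.9, 0.9) > mamdani(0.1, 0.1))  # True: higher inputs give a higher output
```

Clipping with `min` and aggregating with `max` reuses the intersection and union operators defined earlier, which is why those two operators are sufficient for a rule base that uses only the connective “and”.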

3.3 Development of model to calculate the risk magnitude of the factors causing cost overrun using Fuzzy Logic Toolbox of the MATLAB Program Software

Based on the above concepts, the following steps are performed to calculate the risk magnitude of the factors causing cost overrun using Fuzzy Logic Toolbox of the MATLAB Program Software [17].

3.4 Define input and output

The risk magnitude can usually be assessed from two main risk parameters: risk probability and risk severity. Therefore, the PI and SI of the risk factors causing cost overrun are used as input variables in this model, and the FIC, which signifies the risk magnitude of a given factor, is the output, as shown in Fig. 3.

Fig. 3 Input and output of the model

3.4.1 Membership function

A triangular membership function is used in this study for the input and output variables. The triangular fuzzy values of the linguistic variables for PI, SI and FIC are shown in Tables 3 and 4. The membership functions for these variables are presented graphically in Figs. 4, 5 and 6, respectively.

Table 3 Fuzzy value of linguistic variables for input

Table 4 Fuzzy value of linguistic variables for output

Fig. 4 Graphical presentation of linguistic variables for probability index

Fig. 5 Graphical presentation of linguistic variables for severity index

Fig. 6 Graphical presentation of linguistic variables for fuzzy index for cost overrun

3.4.2 Formation of rules

In the present study, the probability index and severity index form the antecedent part and the FIC forms the consequent part, as shown in Fig. 6, to assess the magnitude of the risk factors. Logical rules are needed to relate the two inputs (probability index and severity index of each factor) to the output FIC. For this purpose, the fuzzy values given in the risk matrix shown in Table 5, which originates from PMBOK (2004 version) [36], are used.

Table 5 Risk matrix

There are two input variables, each consisting of five fuzzy sets. In general, if n is the number of fuzzy sets representing one input variable and m is the number of fuzzy sets representing the second input variable, then the maximum number of propositions that can be written is m × n. Therefore, there are 5 × 5 = 25 propositions. Some of the valid propositions are as follows:

  1. If probability index is very low and severity index is very low then fuzzy index for cost overrun is low.

  2. If probability index is low and severity index is very low then fuzzy index for cost overrun is low.

  3. If probability index is medium and severity index is very low then fuzzy index for cost overrun is medium.

  4. If probability index is high and severity index is very low then fuzzy index for cost overrun is high.

  5. If probability index is very high and severity index is very low then fuzzy index for cost overrun is high.

The above rules are generated using the Rule Editor of the MATLAB Fuzzy Logic Toolbox, as shown in Fig. 7. Based on the descriptions of the input and output variables defined with the FIS Editor, the Rule Editor allows the rule statements to be constructed automatically.

Fig. 7 Fuzzy rules of the model
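
The size of the rule base described in this subsection can be checked with a short enumeration. The sketch below is illustrative: it lists all 25 antecedent pairs and fills in only the five consequents quoted in the text. The remaining 20 consequents come from Table 5, which is not reproduced here, so they are deliberately left as `None` rather than guessed. The variable names are hypothetical.

```python
from itertools import product

LEVELS = ["very low", "low", "medium", "high", "very high"]

# Consequents quoted in the text, keyed by (probability index, severity index).
KNOWN_RULES = {
    ("very low", "very low"): "low",
    ("low", "very low"): "low",
    ("medium", "very low"): "medium",
    ("high", "very low"): "high",
    ("very high", "very low"): "high",
}

# Every combination of one PI label with one SI label: m x n = 5 x 5 antecedents.
antecedents = list(product(LEVELS, LEVELS))

# Rule table; None marks a consequent that would have to be read from Table 5.
rules = {pair: KNOWN_RULES.get(pair) for pair in antecedents}

print(len(rules))  # 25
for (pi_label, si_label), fic_label in rules.items():
    if fic_label is not None:
        print(f"If probability index is {pi_label} and severity index is "
              f"{si_label} then fuzzy index for cost overrun is {fic_label}.")
```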

3.4.3 Defuzzification

Finally, the model defuzzifies the combined fuzzy output to generate a crisp output value. The magnitude of the FIC is obtained as an exact number in the interval from zero to one. The complete procedure is shown in the Rule Viewer window of the MATLAB Fuzzy Logic Toolbox (Fig. 8). The Rule Viewer displays a roadmap of the whole fuzzy inference process, allowing the entire process to be interpreted at once, and also shows how the shape of certain membership functions influences the overall result.

Fig. 8 Defuzzification

4 Application of developed model

The application of the developed fuzzy risk assessment model is illustrated by means of a practical case study from India. The developed methodology is applied to determine the risk magnitude of the factor “Fluctuation in price material”. The following steps are performed to determine the FIC of this factor.

4.1 Data collection

Data were collected from experts in the Indian construction industry to capture their perceptions of the probability index and severity index of the factor “Fluctuation in price material”. For this purpose, interviews were conducted with a panel of ten experts. The experts had vast experience in construction projects such as water and waste water, roads and railways, and public buildings. Table 6 shows the experience of the experts.

Table 6 Experience of the experts

A five-point scale of 1–5 was adopted to obtain the experts’ opinions on the probability and severity index of the factor. The numerical responses were assigned linguistic values (1 = very low; 2 = low; 3 = medium; 4 = high; 5 = very high) for both the probability index and the severity index. The fuzzy values used in the developed model were then assigned to these linguistic variables. Table 7 shows the responses of the experts and their respective fuzzy values.

Table 7 Responses of experts

4.2 Average of experts’ opinions

To obtain the average of the experts’ opinions, the fuzzy averaging operation known as the “Triangular Average Formula” [6] is used.

Consider n experts and triangular fuzzy numbers \({{\text{A}}_{\text{i}}}=({\text{a}}_{1}^{({\text{i}})},{\text{a}}_{\text{m}}^{({\text{i}})},{\text{a}}_{2}^{({\text{i}})}),\quad {\text{i}}=1,2,3, \ldots ,{\text{n}}\).

The average of two fuzzy numbers A1 and A2 can be calculated as

$$\begin{aligned} ({{\text{A}}_{1}}+{{\text{A}}_{2}})/2 & =(({\text{a}}_{1}^{(1)},\ {\text{a}}_{\text{m}}^{(1)},\ {\text{a}}_{2}^{(1)})+({\text{a}}_{1}^{(2)},\ {\text{a}}_{\text{m}}^{(2)},\ {\text{a}}_{2}^{(2)}))/2 \\ & =(({\text{a}}_{1}^{(1)}+{\text{a}}_{1}^{(2)}),\ ({\text{a}}_{\text{m}}^{(1)}+{\text{a}}_{\text{m}}^{(2)}),\ ({\text{a}}_{2}^{(1)}+{\text{a}}_{2}^{(2)}))/2 \\ \end{aligned}$$

The average of n fuzzy numbers can be calculated as

$$\begin{aligned} {{\text{A}}_{{\text{avg}}}} & =({{\text{A}}_{1}}+ \cdots +{{\text{A}}_{\text{n}}})/{\text{n}} \\ & =(({\text{a}}_{1}^{(1)},\ {\text{a}}_{\text{m}}^{(1)},\ {\text{a}}_{2}^{(1)})+ \cdots +({\text{a}}_{1}^{({\text{n}})},\ {\text{a}}_{\text{m}}^{({\text{n}})},\ {\text{a}}_{2}^{({\text{n}})}))/{\text{n}} \\ & =\left( \sum\limits_{{\text{i}}=1}^{\text{n}} {\text{a}}_{1}^{({\text{i}})},\ \sum\limits_{{\text{i}}=1}^{\text{n}} {\text{a}}_{\text{m}}^{({\text{i}})},\ \sum\limits_{{\text{i}}=1}^{\text{n}} {\text{a}}_{2}^{({\text{i}})} \right)/{\text{n}} \\ \end{aligned}$$

Using the above formula the fuzzy values for probability and severity index are determined as follows:

Average fuzzy value for probability index (PI) = (0.325, 0.575, 0.8).

Average fuzzy number for severity index (SI) = (0.65, 0.9, 1).
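
The triangular average is a component-wise mean, which the following sketch makes explicit. The function name `triangular_average`, the two label triples and the two-expert panel are hypothetical and chosen so that the result is easy to verify by hand; they are not the responses of Table 7 or the values of Table 3.

```python
def triangular_average(numbers):
    """Component-wise mean of triangular fuzzy numbers given as (a1, am, a2) tuples."""
    n = len(numbers)
    if n == 0:
        raise ValueError("at least one fuzzy number is required")
    return tuple(sum(t[j] for t in numbers) / n for j in range(3))


# Assumed triangular values for two linguistic labels (illustrative only).
HIGH = (0.5, 0.75, 1.0)
VERY_HIGH = (0.75, 1.0, 1.0)

# Hypothetical panel of two experts, one answering "high" and one "very high".
panel = [HIGH, VERY_HIGH]

print(triangular_average(panel))  # (0.625, 0.875, 1.0)
```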

To obtain a crisp value, the fuzzy value (a1, am, a2) is then converted into a best non-fuzzy performance (BNP) value using the following formula:

$${\text{BNP}}=\left( {\left( {{{\text{a}}_{2}} - {{\text{a}}_{1}}} \right)+\left( {{{\text{a}}_{\text{m}}} - {{\text{a}}_{1}}} \right)} \right)/3+{{\text{a}}_{1}}$$

Best non fuzzy performance (BNP) value for probability index = 0.566.

Best non fuzzy performance (BNP) value for severity index = 0.85.
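
The two BNP values can be reproduced from the averaged fuzzy values reported above. The sketch below is a plain evaluation of the BNP formula; only the function name `bnp` is an assumption. Expanding the formula shows that it equals the arithmetic mean (a1 + am + a2)/3 of the three vertices.

```python
def bnp(a1: float, am: float, a2: float) -> float:
    """Best non-fuzzy performance value of the triangular fuzzy number (a1, am, a2)."""
    return ((a2 - a1) + (am - a1)) / 3 + a1


pi_avg = (0.325, 0.575, 0.8)  # average fuzzy value for the probability index
si_avg = (0.65, 0.9, 1.0)     # average fuzzy value for the severity index

print(f"{bnp(*pi_avg):.4f}")  # 0.5667
print(f"{bnp(*si_avg):.4f}")  # 0.8500
```

The first result, 0.5667 to four decimal places, corresponds to the value quoted as 0.566 in the text, which appears to be truncated rather than rounded.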

Using the probability and severity index, the FIC of the risk factor “Fluctuation in price material” is determined as 0.694, which indicates the magnitude of the risk factor.

Similarly, the risk magnitude of the other factors is calculated by collecting data on the perceptions of their probability index and severity index. The risk magnitude of the other factors causing cost overrun is shown in Table 8.

Table 8 Risk magnitude of the factors causing cost overrun

5 Important factors causing cost overrun in Indian construction industry

On the basis of the experts’ perceptions of the probability index and severity index of each factor, the model identified the top ten causes of cost overrun in Indian construction projects: fluctuation in price material, lowest bid procurement policy, inflation, inappropriate govt. policy, inaccurate time and cost estimate, mistakes and discrepancies in contract document, additional work, frequent design change, unrealistic contract duration and financial difficulty faced by the contractor.

6 Conclusion

Cost is one of the fundamental criteria for measuring the success of a project. The risk factors causing cost overrun in the construction industry should therefore be identified and assessed so that managers can deploy more resources on critical factors to eliminate or mitigate their expected consequences; this is a necessary part of the decision-making process. The Indian construction industry also suffers from the problem of cost overrun. Hence, in this study, 55 important risk factors causing cost overrun in Indian construction projects are identified through an intensive literature review and expert opinion. This paper also proposes a new fuzzy-based model to assess the risk magnitude of these factors, as fuzzy set theory has the potential to deal with the vagueness, uncertainty and subjective nature of such problems, which makes it well suited to complex construction projects. To assess the risk factors causing cost overrun, PI and SI are considered and a cost overrun factor index, namely FIC, is calculated, which indicates the risk magnitude of a given factor. Interviews were conducted with a panel of ten experts from the Indian construction industry to capture their perceptions of the probability index and severity index of the risk factors, and the FIC of each risk factor was determined. The factors are ranked according to their risk magnitude. The model is useful for project managers in taking appropriate action against these risk factors.

The top ten factors causing cost overrun in the Indian construction industry are identified as fluctuation in price material, lowest bid procurement policy, inflation, inappropriate govt. policy, mistakes and discrepancies in contract document, inaccurate time and cost estimate, additional work, frequent design change, unrealistic contract duration and financial difficulty faced by the contractor.

6.1 Limitation of approach and future scope

The model is based on a fuzzy approach and cannot be accepted as a universal model. It should be regarded as an example of how the magnitude of risk factors causing cost overrun in construction projects may be determined. Different types of membership function could be used to develop the fuzzy model. The fuzzy rules are based on a risk matrix, and no standard matrix is available in the literature. The defuzzification method can also be changed. Methods based on fuzzy logic can reduce subjectivity to an acceptable level by converting linguistic values to quantitative values, but they do not eliminate it completely. The risk factors causing cost overrun in the Indian construction industry are based on the results of the literature survey and the opinion of experts from the industry. The number of respondents in the study could have been larger, and the number of experts consulted was very small in comparison with the size of the industry. The lack of data and information was one of the main limitations.

The fuzzy inference system used to develop the model can be extended using other fuzzy membership functions, and the performance of the resulting models can be compared. In addition, the results of the fuzzy inference system can be compared with other predictive tools. The developed model may also be extended to other industrial sectors for assessing risks.