Introduction

The construction business is one of the most crucial areas for any country's development and an important source of employment and investment. It makes a substantial contribution to the economic and social growth of both industrialized and developing countries. The construction industry has always been unique and dynamic, dealing with a variety of uncertainties and ad hoc teams with multilingual and multicultural backgrounds [1, 2]. Cost overrun is a common occurrence in construction projects in all countries, industrialized and developing alike. It is a key issue for project success because it affects all phases of the project life cycle, from start to finish. As a result, all participants in construction projects must pay close attention to ensure that projects are safe and are completed within budget, time and quality constraints [3, 4]. Cost overrun is the amount by which actual costs exceed the baseline or approved costs; that is, the difference between a construction project’s final cost at completion and the contract price agreed upon by the customer and contractor when the contract was signed. Explaining why cost overruns happen has long been a subject of scientific interest, and several studies have investigated the factors that contribute to cost overruns in construction projects. However, research that assembles, synthesizes and analyzes results using the soft computing technique known as fuzzy logic, to produce an overview of the phenomenon that enables the search for global remedies with significant effect, is lacking [1, 5].

Construction project performance is negatively impacted by cost overruns. Cost overrun is one of the most serious issues that can stymie the progress of a construction project, as it affects profits, resulting in massive losses and putting the project in jeopardy. Construction cost is one of the most important determinants of a project's performance throughout its lifecycle, and it is of great concern to individuals in the construction business [4]. A cost overrun, also known as a cost rise or budget overrun, occurs when unanticipated costs exceed budgeted amounts due to underestimating actual costs during budgeting. Cost overrun is the difference between the actual cost incurred during the building phase and the originally predicted cost, and it is one of the most significant elements impacting the entire project's success. Cost overruns can disrupt construction activities and potentially lead to project abandonment in the worst-case scenario, which occurs when the owner lacks the financial ability to cover unanticipated costs [6]. One factor for determining whether a construction project is successful is whether it is completed on time and on budget. There are numerous projects, project activities and limitations in the construction sector that are completed with major cost and time overruns. Delays have a detrimental impact on project cost, time, quality and safety. The goal of every construction firm is to make money and profit at the end of each project. This is accomplished through executing projects within budgetary, time and quality constraints. Construction delays have an impact not only on the construction business, but also on the general economy of the country [7]. Cost and schedule overruns in the construction sector are caused by a number of variables. The rising complexity of construction projects places more pressure on construction managers to complete projects on schedule, on budget and to a high standard. 
Construction projects face challenges such as limited resources, budget, quality and time, as well as issues with construction processes and administration. Failure to finish projects on schedule and within budget is the most serious of these issues. This challenge, in turn, creates problems with project funding, timely public use of the facility and the relationship between the stakeholders participating in the development process. This research is carried out in response to this issue [8, 9].

Throughout the lifecycle of a project, cost is one of the most important factors to consider. Unfortunately, the majority of projects do not finish on time or within budget. Cost overrun, in addition to time overrun, is a critical issue in the construction business. In Nigeria, the trend is more pronounced, with cost overruns exceeding 100% of the project's original budget. Many construction projects still experience cost overruns, despite the widespread availability and use of many project management methods and software packages. Cost overruns range from 50 to 100% of project costs in around 90% of projects worldwide [9, 10]. Jackson and Steven [11] used a questionnaire survey to investigate the causes of cost overruns in Ilorin building projects and found that the major causes were fluctuating material and labor prices, variation orders, delays in honoring certificates, lack of proper tender analysis, selection of incompetent contractors, lack of proper project appraisal and unrealistic representation of clients' needs. According to Omoregie and Radford [12], the prevailing causes are price instability, financing of and payment for completed works, poor contract management, schedule interruptions, changes in site conditions, inaccurate estimates, shortages of materials, imported materials and plant items, additional works, design changes, subcontractors and nominated suppliers, weather, non-adherence to contract conditions and mistakes.

Fuzzy logic is an analytical computation system that describes fuzziness and calibrates vagueness. It also makes it straightforward to reach a firm judgment based on vague, imprecise, ambiguous, noisy or missing input data. A set whose elements belong to it with a given degree of membership is referred to as a fuzzy set. Not only does fuzzy logic provide a meaningful and strong representation of uncertainties, but it also allows for the expression of imprecise concepts in normal language. Human perception of approximation has never been more important than it is now for retrieving information and answering deduction questions. As a result, the realm of crisp relational database models has given way to the arena of fuzzy real-world database models [13, 14]. The easiest way to incorporate world knowledge, which is grossly imprecise and inaccurate, is to present a database model that stores and processes crisp entries but allows for fuzzy queries through an interactive graphical user interface program. Classical query languages are designed to extract data that fully satisfy a given constraint. The use of fuzzy logic to treat information has grown in popularity in recent years, with this mathematical theory proving to be a useful tool for solving complicated scientific and technical challenges. Research on advanced fuzzy logic applications has delivered new achievements over the years [15]. Industry, commerce, finance, medicine and a variety of scientific domains such as machine learning, big data technologies, fuzzy control, expert systems, dynamic fuzzy neural networks and others have all benefited from fuzzy logic. Fuzzy logic provides an innovative way of dealing with calculus difficulties in mathematics. In fuzzy logic, traditional algorithms are replaced with a set of conditional linguistic rules of the form If (condition) then (conclusion). As a result, a heuristic algorithm is created, and human experience can be included in the calculation's subject matter [16].

The purpose of this research is to identify the factors that influence project cost overruns and to determine the most severe of them in order to improve project cost management in the Nigerian construction industry. The objectives of this work are, first, to investigate cost overrun factors in construction projects through a literature review and to identify the project stakeholders responsible for the occurrence of cost overrun. Second, the cost variable will be correlated with other parameters such as project team position (of the respondents), academic qualifications, years of experience, project type, project funding and historical project obstacles. Lastly, the effects of cost overrun factors on construction performance will be evaluated using a fuzzy logic artificial intelligence model.

Causes of delays and cost overrun

Some causes for delays and cost overruns are prevalent in all places, but some may vary owing to changing culture and customs within the country, according to the factors evaluated by numerous researchers across different countries. Some of the factors and reasons for schedule and cost overruns in residential construction projects, according to Bhatia et al. [17], are delays in decision making, poor time estimation of project tasks and activities, unforeseen circumstances, internal conflicts within the project team and poor work organization and planning. As a result, the key causes of time and cost excess are discussed as follows:

Lack of experience

Low bidders are encouraged for public construction projects, according to Frimponga et al. [18], which can lead to a lack of management expertise and abilities, as well as a lack of sufficient resources, resulting in schedule failure. Client experience is also necessary for the right selection of qualified and experienced contractors in order to avoid numerous reworks and variations, which cause project delays [19, 20]. Poor contractor technical performance is frequently linked to a contractor's lack of competent forecasting and management experience, which leads to errors and rework throughout the project's construction phase, raising project costs [18, 21].

Poor scheduling, planning and management

It is critical to plan the work before beginning the project to ensure that it is completed successfully [19]. Contractors frequently fail to give practical building programs and work plans from the outset, making project progress monitoring a time-consuming chore [22]. Sunjka and Jacob [20] find that clients' inaccurate contractual duration projections have an impact on time performance because project managers are given insufficient time to plan and estimate. In addition, Sambasivan and Soon [22] find that the contractor's inept site management causes delays in the remediation of on-site concerns. According to Abdussalam et al. [23], the causes of cost overruns in construction projects are poor cost management, a lack of effective vital success factors such as appropriate planning in the early stages, contractor and architect abilities, and regular coordination between contractor and client.

Orders for design variations and changes

During the planning stage, the scope of work is usually not finalized, and it is frequently subject to change depending on the interests of the parties involved. Due to the uniqueness of the project, the limited timeline and the funds set aside during the planning stage, changes in scope and design are likely to occur [24]. Sunjka and Jacob [20] identify design variation and change orders as causing a temporary pause in the project, which can add to the project's delay. According to Memon et al. [25], project cost is primarily affected by inadequate and delayed design. Such changes frequently result in disputes, affecting the customers' costs and the contractors' income.

Poor labor productivity and resource deficiency

Sunjka and Jacob [20] argue that low on-site worker productivity leads to errors and rework, which leads to delays and cost overruns. Poor productivity is caused by a lack of communication, protective labor policies and ineffective organizational management, and is mostly influenced by the level of appreciation and recognition rather than compensation issues [26]. According to Frimponga et al. [18], material procurement can take weeks to months, causing delays in the construction timeline. Contractors are primarily accountable for appropriate material estimation, according to Le-Hoai et al. [27], but clients and consultants also play an important role in ensuring that the estimation is carried out correctly. Poor material estimation, price variations, supplier inefficiencies, transit delays, organizational payment procedures and confirmation delays all contribute to resource shortages.

Inflation, market competition and unsteady economic and political conditions

Material price inflation occurs when demand for an item exceeds supply, influencing construction costs and resulting in project cost overruns. According to Le-Hoai et al. [27], increased demand for common construction resources such as steel, cement and other components has resulted in increased material costs. Unstable political conditions, such as strikes, military attacks and border closures, have an impact on the availability of resources at the appropriate moment, leading to an increase in the market price of materials. Due to the high security situations at the borders, importing materials can be difficult in such cases [28].

Financial limitations and payment delays

Payment delays for completed work are sometimes attributed to the presence of bureaucracy in the company. Disbursement postponements to contractors will stymie the project's development by delaying the delivery of materials and equipment on-site, as well as the payment of workers' salaries [28]. For a project to go smoothly, both contractors and clients must have sufficient financial resources [27].

Unfavorable site and weather conditions

On-site, unpredictable factors such as bad weather and other topographic conditions might cause delays. Furthermore, the dry season, combined with the humidity, can decrease worker productivity, resulting in less output and cooperation among workers, thereby affecting project time [18]. To avoid conflicts and delays, Le-Hoai et al. [27] advise that the risks associated with unforeseen site and weather characteristics should be evenly allocated and indicated in the contract between the participants.

Fuzzy logic (FL)

Zadeh introduced fuzzy systems in the mid-1960s to express uncertain and imperfect knowledge. After a real-world application by Mamdani in 1975, fuzzy notions and systems drew attention. Fuzzy logic gives a rough but useful way of explaining the behavior of systems that are too complex, ill-defined or quantitatively difficult to analyze [29, 30]. It also offers a simple and natural solution to problems where the source of ambiguity is the lack of well-specified criteria rather than random variables. The introduction of fuzzy sets was primarily motivated by the need to describe imprecise concepts. Fuzzy theory can be used to solve both linear and nonlinear problems. It does not require time-consuming mathematical models; instead, it requires only a basic control mechanism based on engineering experience [31]. The main characteristic of FL is that it allows any element to belong partially to distinct subsets, which is made feasible by specifying membership functions [32].

Fuzzy set theory

Fuzzy set theory provides a systematic calculus for dealing with imprecise information linguistically, and it executes numerical computations with the use of linguistic labels defined by membership functions. The pertinence (degree of membership), first presented by Zadeh [33], is a key term in fuzzy set theory. In classical (Boolean) set theory, the pertinence of the variable x in the set A is defined by Eq. 1:

$$\mu_{A} \left( x \right) = \begin{cases} 1, & x \in A \\ 0, & x \notin A \end{cases}$$
(1)

where μA(x) is the pertinence of the variable x in the set A. Thus, a variable x either "belongs" (μA(x) = 1) or "does not belong" (μA(x) = 0) to the set A. In fuzzy set theory, however, this concept of pertinence is presented as follows (Eq. 2) [34].

$$A = \left\{ {\left( {x,\mu_{A} \left( x \right)} \right)|x \in U} \right\}$$
(2)

where μA(x) is the degree of pertinence of the variable x in the set A; A is the fuzzy set formed by the ordered pairs (x, μA(x)); x is the variable of interest; and U is the universe of discourse. As a result, a variable x belongs to a set A gradually, with a degree of pertinence in the interval [0, 1]. This means that a variable can be part of multiple fuzzy sets, each with a different degree of pertinence. The linguistic variable is another essential concept. A linguistic variable in fuzzy set theory takes natural language values, which give the membership functions their names. The idea of a linguistic variable is supplemented by the concept of a membership function, which assigns fuzzy pertinence values to the discrete values within the linguistic variable's universe of discourse [35]. The fuzzy set operations are given in Eqs. 3–5 and are illustrated in Fig. 1.

$${\text{Union}}:\;\mu_{A \cup B} \left( x \right) = \max \left( \mu_{A} \left( x \right),\mu_{B} \left( x \right) \right)$$
(3)
$${\text{Intersection}}:\;\mu_{A \cap B} \left( x \right) = \min \left( \mu_{A} \left( x \right),\mu_{B} \left( x \right) \right)$$
(4)
$${\text{Complement}}:\;\mu_{\overline{A}} \left( x \right) = 1 - \mu_{A} \left( x \right)$$
(5)
Fig. 1 Fuzzy set operation
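The three set operations can be illustrated with a minimal Python sketch. It assumes fuzzy sets over a small discrete universe stored as dictionaries mapping each element to its membership degree; the sets A and B and their degrees are illustrative values only and are not taken from the study data.

```python
# Fuzzy sets over a discrete universe, stored as {element: membership degree}.
# The universe and the degrees are illustrative assumptions, not study data.

def fuzzy_union(a, b):
    """Union: pointwise maximum of the two membership degrees."""
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in sorted(set(a) | set(b))}

def fuzzy_intersection(a, b):
    """Intersection: pointwise minimum of the two membership degrees."""
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in sorted(set(a) | set(b))}

def fuzzy_complement(a):
    """Complement: one minus the membership degree."""
    return {x: 1.0 - mu for x, mu in a.items()}

A = {1: 0.0, 2: 0.25, 3: 0.75, 4: 1.0}
B = {1: 0.5, 2: 0.5, 3: 0.5, 4: 0.0}

union_ab = fuzzy_union(A, B)
intersection_ab = fuzzy_intersection(A, B)
complement_a = fuzzy_complement(A)

print(union_ab)         # {1: 0.5, 2: 0.5, 3: 0.75, 4: 1.0}
print(intersection_ab)  # {1: 0.0, 2: 0.25, 3: 0.5, 4: 0.0}
print(complement_a)     # {1: 1.0, 2: 0.75, 3: 0.25, 4: 0.0}
```

Elements missing from one of the sets are treated as having a membership degree of 0 in that set, which is a convention of this sketch.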

Fuzzy inference system (FIS)

As shown in Fig. 2, a fuzzy inference system (FIS) consists of four components: fuzzification, a fuzzy rule base, a fuzzy inference engine and defuzzification. The associated input membership function fuzzifies the crisp input before passing it to the fuzzy inference block, which is a decision-making unit that generates fuzzy output through fuzzy reasoning. The defuzzification block generates crisp output from the fuzzy output. The knowledge base, which is made up of a database and a rule base, defines the membership functions used in the fuzzification and defuzzification blocks and supplies the fuzzy rules to the fuzzy inference block. The decision-making unit performs the inference operations on the fuzzy rules. The fuzzy values within a fuzzy rule are combined using connective operators such as intersection (AND), union (OR) and complement (NOT). The fuzzy inference engine considers all of the fuzzy rules in the fuzzy rule base and determines how to transform a set of inputs into the related outputs. Defuzzification is the process of converting the fuzzy outputs of the fuzzy inference engine to a crisp numerical value. It also entails interpreting the logical implications of results translated from the fuzzy to the real domain [36].

Fig. 2 Fuzzy inference system (FIS) structure (source: Alaneme and Mbadike [36])
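The four stages of an FIS can be sketched in a few lines of Python. The sketch below is a generic Mamdani-style system with centroid defuzzification and is not the model built in this study: the two inputs (frequency and severity on a 1 to 5 scale), the 0 to 100 output score, the triangular terms in INPUT_TERMS and OUTPUT_TERMS and the three rules are all hypothetical choices made for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership: feet at a and c, peak at b (a <= b <= c)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical linguistic terms for the two Likert-scale inputs (1 to 5).
INPUT_TERMS = {"low": (1, 1, 3), "medium": (1, 3, 5), "high": (3, 5, 5)}
# Hypothetical linguistic terms for the output, an impact score from 0 to 100.
OUTPUT_TERMS = {"low": (0, 0, 50), "medium": (0, 50, 100), "high": (50, 100, 100)}

def infer(frequency, severity, steps=101):
    """Return a crisp impact score for crisp frequency and severity inputs."""
    # 1. Fuzzification: degree of each linguistic term for each crisp input.
    f = {name: tri(frequency, *p) for name, p in INPUT_TERMS.items()}
    s = {name: tri(severity, *p) for name, p in INPUT_TERMS.items()}

    # 2. Rule evaluation (hypothetical rule base): AND = min, OR = max.
    strength = {
        "low": min(f["low"], s["low"]),           # IF both are low THEN impact is low
        "medium": max(f["medium"], s["medium"]),  # IF either is medium THEN impact is medium
        "high": min(f["high"], s["high"]),        # IF both are high THEN impact is high
    }

    # 3. Implication (clip each output term) and aggregation (pointwise max)
    #    on a sampled output universe.
    ys = [100.0 * i / (steps - 1) for i in range(steps)]
    aggregated = [
        max(min(strength[name], tri(y, *OUTPUT_TERMS[name])) for name in OUTPUT_TERMS)
        for y in ys
    ]

    # 4. Defuzzification: centroid of the aggregated output set.
    total = sum(aggregated)
    if total == 0.0:
        return None  # no rule fired for these inputs
    return sum(y * m for y, m in zip(ys, aggregated)) / total

print(infer(1, 1), infer(3, 3), infer(5, 5))
```

With these assumed terms and rules, the crisp score increases as frequency and severity move from 1 to 5; a different rule base or different membership parameters would give different values.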

Methodology and research design

The research methodology combines quantitative and qualitative information on the causes of cost overrun, identified through an examination of the literature and through field research with specialists in building projects from a number of building firms based in Rivers State, Cross River State and Abia State in Nigeria. In this paper, we offer a hybrid modeling method based on factorial analysis and fuzzy logic systems to help decision makers assess cost overrun in building construction so as to enhance quality management and save cost. The qualitative analysis aids in acquiring a better understanding of the established research objectives and identifies essential methods that could improve poor time and cost performance. The quantitative strategy, which employs a broad questionnaire survey, analyzes the research objectives statistically, whereas the qualitative approach, which employs semi-structured interviews with highly experienced construction professionals, does not rely on statistical analysis. The quantitative data were gathered using a questionnaire distributed via personal email and professional networking sites to 120 randomly selected construction professionals from various project entities, such as clients, design consultants, contractors and other project participants, in the three states under study. The data collected from numerous project participants provide a better understanding and a balanced perception of the research subject [37].

Through a snowball (chain referral) strategy, personal interview sessions were held concurrently with well-experienced construction professionals from numerous well-known and established firms. A semi-structured interview was used in this study: a covering letter and the questions of interest were communicated to the interviewees prior to the interview, and the session was left open for discussion of any question pertaining to the research problem. The interviews offered a descriptive critical assessment for the research goal, providing expert detail on the factors responsible for cost overruns in the building industry. Fuzzy logic, a soft computing technique (SCT), was used to analyze the primary data derived from the survey. Soft computing seeks to develop approximations that result in a robust, computationally efficient, speedy and cost-effective solution that saves computational time.

Study area

This research will be conducted in Calabar, Cross River State, Nigeria. The decision was made based on commercial viability, social standing, economic considerations and accessibility, all of which create chances for a variety of businesses including construction, consultancy, manufacturing and technology growth. The study area map is shown in Fig. 3.

Fig. 3 Map of study area

Questionnaire design

To develop the questionnaire, demographic data relevant to the interviewees' profiles had to be obtained, such as the position held, the number of years of experience, the number of works done, the categories of work, the types of contracts and the percentage of cost increase on those projects. The second section of the questionnaire is concerned with determining the root causes of cost overruns. In this study, frequency and severity were each scored on a 5-point Likert scale from 1 to 5. The frequency of the cause event was classified as follows: "does not occur" was assigned a score of "1", low frequency "2", medium frequency "3", high frequency "4" and extreme frequency "5". Severity was likewise divided into five categories: no severity "1", low severity "2", medium severity "3", high severity "4" and extreme severity "5". The collected data were then compiled using the method outlined below. Table 1 shows the 32 primary attributes that were examined.

Table 1 Cost overrun factors

Test of hypothesis

The assumption regarding the population parameter under statistical test is set according to the nature of the data and in line with the study objectives. The credibility of a hypothesis is assessed using sample data, which provide the evidence for examining its plausibility. Statistical analysts put a theory to the test by measuring a random sample of the population under consideration. The random sample is used to test two separate hypotheses concerning the equality between population parameters: the null hypothesis and the alternative hypothesis. They are mutually exclusive, and only one of them can be correct [39].

Research hypotheses

Null Hypothesis (H0)

Cost overrun attributes are not significant in inhibiting construction project performance.

Alternative Hypothesis (H1)

Cost overrun attributes are significant in inhibiting construction project performance.

Data analysis techniques

The average ranking score was derived using the weighted scoring approach based on the responses received. The following statistical formulas were used to analyze the data.

(FI) Frequency index

As indicated in Eq. 6, this formula is used to rank the identified causes by their frequency of occurrence in the respondents' opinion.

$$FI = \frac{{\sum {Af} }}{A \times N}$$
(6)

where ∑Af = sum of each interviewee's weighted frequency reported from 1 to 5 for the stated cause; A = greatest weight, which in this case is equal to 5, highest category of the Likert scale used in this study; and N = total number of responses to that cause. This formula is also used for each of the causes [40].

(RII) index of relative importance or (SI) severity index

This method is used to rank the discovered causes in terms of severity or impact based on the opinion of each respondent, as shown in Eq. 7.

$$SI = \frac{\sum W }{{A \times N}}$$
(7)

where ∑W = sum of each interviewee's impact weight from 1 to 5 for the provided cause; A = highest weight, which in this case is equal to 5 on the Likert scale used in this study; and N = total number of replies to that reason. For each cause, this formula must be used [40].

(IMPI) importance index

This formula is used to rank the identified causes by overall importance in the respondents' opinion and is obtained by multiplying the FI by the SI, as given in Eq. 8.

$${\text{IMPI}} = {\text{FI}} \times {\text{SI}}$$
(8)
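As a worked illustration of Eqs. 6 to 8, the short Python sketch below computes FI, SI and IMPI for three causes and ranks them by IMPI. The cause names and the Likert responses in the `responses` dictionary are hypothetical and serve only to show the arithmetic; they are not survey data from this study.

```python
def likert_index(scores, top=5):
    """Eq. 6 / Eq. 7: sum of the scores divided by (highest weight x number of responses)."""
    return sum(scores) / (top * len(scores))

# Hypothetical responses: cause -> (frequency scores, severity scores), each on a 1-5 scale.
responses = {
    "cause A": ([5, 4, 5, 4], [5, 5, 4, 4]),
    "cause B": ([2, 3, 2, 1], [3, 2, 2, 3]),
    "cause C": ([4, 3, 4, 3], [4, 4, 5, 3]),
}

table = {}
for cause, (freq, sev) in responses.items():
    fi = likert_index(freq)   # frequency index, Eq. 6
    si = likert_index(sev)    # severity index, Eq. 7
    table[cause] = {"FI": fi, "SI": si, "IMPI": fi * si}  # importance index, Eq. 8

# Rank causes from the highest to the lowest importance index.
ranking = sorted(table, key=lambda c: table[c]["IMPI"], reverse=True)

for rank, cause in enumerate(ranking, start=1):
    row = table[cause]
    print(rank, cause, round(row["FI"], 2), round(row["SI"], 2), round(row["IMPI"], 2))
```

For cause A, for example, the frequency scores sum to 18, so FI = 18 / (5 × 4) = 0.9; the severity scores also sum to 18, so SI = 0.9 and IMPI = 0.81. As written, Eqs. 6 and 7 bound each index between 0.2 and 1 on a 5-point scale.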

Spearman correlation coefficient (r s)

It is a nonparametric measure with a value ranging from −1 to +1, with +1 signifying complete agreement and −1 indicating complete disagreement between the rankings of the groups compared. A value close to +1 implies that the rankings of the investigated groups are strongly related. This coefficient was applied to the outcomes of the three pairwise combinations of the respondent groups (project managers, contractors and consultants), using the relationship expressed in Eq. 9 [41].

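$$r_{s} = 1 - \frac{{6\sum {d^{2} } }}{{n(n^{2} - 1)}}$$
(9)

where d = difference between the ranks assigned to the same cause by the two groups being compared; and n = number of ranked causes.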
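The computation in Eq. 9 can be sketched in Python as follows. The two rankings are hypothetical and cover only five causes; the function name `spearman_rs` is likewise an illustrative choice. Note that this closed form is exact only when neither ranking contains tied ranks.

```python
def spearman_rs(ranks_x, ranks_y):
    """Eq. 9: Spearman rank correlation from two rankings of the same items (no ties)."""
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("both rankings must cover the same items (at least two)")
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical ranks given to the same five causes by two respondent groups.
consultant = [1, 2, 3, 4, 5]
project_manager = [2, 1, 3, 5, 4]

rs = spearman_rs(consultant, project_manager)
print(round(rs, 4))  # 0.8
```

Here the squared rank differences sum to 4, so r_s = 1 − (6 × 4) / (5 × 24) = 0.8. Identical rankings give +1 and fully reversed rankings give −1.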

Membership function

The fuzzy set membership function generalizes the classical set indicator function. It extends the indicator function to degrees of truth and describes the extent to which a particular input variable belongs to a set. In the fuzzification and defuzzification processes of a fuzzy logic system, membership functions are employed to translate non-fuzzy input values to fuzzy linguistic terms and vice versa. Whether the elements of a fuzzy set are discrete or continuous, the membership function characterizes its fuzziness (i.e., all the information in the fuzzy set). Membership functions are a way of solving practical problems based on experience rather than knowledge [42]. The membership function that represents a fuzzy set \(A\) is usually denoted by \(\mu_{A}\). For an element x of X, the value \(\mu_{A} \left( x \right)\) is called the membership degree of x in the fuzzy set \(A\). The membership degree \(\mu_{A} \left( x \right)\) quantifies the grade of membership of the element x in the fuzzy set \(A\). The value 0 means that x is not a member of the fuzzy set; the value 1 means that x is fully a member of the fuzzy set. Values between 0 and 1 characterize fuzzy members, which belong to the fuzzy set only partially, as shown in Fig. 4 [43].

Fig. 4 Membership function of a fuzzy set

The basic concepts of a fuzzy membership function, namely the support, core, \(\alpha\)-cut and height, are illustrated in Fig. 5.

Fig. 5 Fuzzy membership function basic concepts

Support: set of elements with a nonzero degree of membership. The support of a membership function is the region of the universe characterized by nonzero membership in the set.

Core: set of elements with a degree of membership of 1. The core of a membership function is the region of the universe characterized by full membership in the set.

α-Cut: set of elements with a degree of membership ≥ α.

Height: maximum degree of membership.
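These four concepts can be made concrete with a small Python sketch on a discrete fuzzy set. The set `fuzzy_set` and its membership degrees are illustrative assumptions, not values from the study.

```python
# A discrete fuzzy set stored as {element: membership degree} (illustrative values).
fuzzy_set = {1: 0.0, 2: 0.25, 3: 1.0, 4: 1.0, 5: 0.5, 6: 0.0}

def support(fs):
    """Elements with a nonzero degree of membership."""
    return {x for x, mu in fs.items() if mu > 0}

def core(fs):
    """Elements with full membership (degree equal to 1)."""
    return {x for x, mu in fs.items() if mu == 1}

def alpha_cut(fs, alpha):
    """Elements whose degree of membership is at least alpha."""
    return {x for x, mu in fs.items() if mu >= alpha}

def height(fs):
    """Largest degree of membership in the set."""
    return max(fs.values())

print(support(fuzzy_set))         # {2, 3, 4, 5}
print(core(fuzzy_set))            # {3, 4}
print(alpha_cut(fuzzy_set, 0.5))  # {3, 4, 5}
print(height(fuzzy_set))          # 1.0
```

The core is always contained in every α-cut with α ≤ 1, and every α-cut with α > 0 is contained in the support, as the example output shows.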

Membership function types

The membership function (MF) types deployed for the generalization of the indicator classifier in this study with associated descriptions and mathematical relationships are presented in Table 2.

Table 2 Membership functions
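For readers who want to experiment with membership function shapes, the Python sketch below implements three widely used types: triangular, trapezoidal and Gaussian. These are standard textbook forms chosen here as an assumption; they are not necessarily the exact set or parameterization listed in Table 2, and the parameter names are illustrative.

```python
import math

def triangular(x, a, b, c):
    """Rises linearly from a to the peak at b, then falls linearly to c (a < b < c)."""
    return max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))

def trapezoidal(x, a, b, c, d):
    """Rises from a to b, equals 1 between b and c, falls from c to d (a < b <= c < d)."""
    return max(0.0, min((x - a) / (b - a), 1.0, (d - x) / (d - c)))

def gaussian(x, mean, sigma):
    """Bell-shaped curve centered on mean with spread sigma (sigma > 0)."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)

# Example: degree to which a Likert score of 2 is "medium" under each assumed shape.
print(triangular(2, 1, 3, 5))        # 0.5
print(trapezoidal(2, 1, 2.5, 3.5, 5))
print(gaussian(2, 3, 1))
```

The triangular and trapezoidal forms are piecewise linear and reach zero outside their support, whereas the Gaussian form is smooth and never reaches exactly zero, which is one practical consideration when choosing among them.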

Results discussion and analysis

Demographical characteristics of respondents

The field survey results obtained from the three study areas through purposive and random sampling techniques were used to evaluate the study objectives. The demographic characteristics of the respondents, showing the frequency and percentage distribution, are presented in Table 3. The results were sorted according to the respondents' roles and occupations, namely contractor, project manager and consultant, so as to obtain their expert views. A total of 150 questionnaires were administered for the study, and the results showed 84% male and 16% female respondents; this is because the greater percentage of personnel in construction work, both skilled workers and artisans, are men. Respondents with 0–10 years of experience constitute 24.67%, those with 11–20 years 28% and those with 21–30 years 32.67%, while the remaining 14.67% have more than 30 years of experience. It is imperative to obtain details from professionals with more years of experience in the field for a better assessment of the cost overrun factors. Finally, 50% of the responses concerned building construction projects, and 68.67% of the projects were in the size range of 241,000–2,280,000 US dollars.

Table 3 Respondents’ demographical characteristics

Respondents’ evaluation of cost overrun attributes

The survey results, which capture the respondents’ experiences and opinions on the attributes through a structured quantitative analysis on a 5-point Likert scale, are presented in Tables 4–6, which show the frequency and severity computations with respect to the respondents’ positions, namely consultant, project manager and contractor. For the consultant responses, attributes 23, 19, 1 and 11, corresponding to contractor’s financial constraints, poor communication among stakeholders, design variation from client/consultant and claims and disputes among stakeholders, were the highest ranked variables. Contractor’s financial constraints ranked first, with a severity index (SI) of 4.46 and an importance index (IMPI) of 3.98. Poor communication among stakeholders came second, with an SI of 4.38 and an IMPI of 3.84. Design variation from client/consultant ranked third, with an SI of 4.14 and an IMPI of 3.43. Claims and disputes among stakeholders ranked fourth, with SI and IMPI values of 4.1 and 3.36, respectively. The lowest ranked was attribute 8, delay in client decision-making process, which came 32nd with SI and IMPI values of 1.72 and 0.59, respectively. The scores are the sum of the impact weight reported by the respondents for each cost overrun attribute [44, 45].

Table 4 Ranking of cost overrun factors by the respondents (consultant)
Table 5 Ranking of cost overrun factors by the respondents (project manager)
Table 6 Ranking of cost overrun factors by the respondents (contractor)

Table 5 presents the responses from project managers. The significant attributes are 19, 11, 26 and 23, corresponding, respectively, to poor communication among stakeholders, claims and disputes among stakeholders, deficiency of materials, equipment and tools and contractor’s financial constraints. Attribute 19 ranked first, with SI and IMPI values of 4.46 and 3.98, followed by attribute 11, which ranked second with SI and IMPI values of 4.34 and 3.77, respectively. Although attribute 26 was ranked third, it is pertinent to note that it has the same weight as attribute 11, since they share common SI and IMPI values of 4.34 and 3.77. Attribute 23 ranked fourth, with SI and IMPI values of 4.22 and 3.56, respectively. The lowest ranked was attribute 8, which corresponds to delay in client decision-making process. It ranked 32nd, with SI and IMPI values of 1.82 and 0.66, respectively. The higher the severity and importance indices, the greater the negative impact and hence the greater the contribution to cost overrun.

Table 6 shows the responses from the contractors. The attributes that contributed most to cost overrun are 11, 23, 5 and 26, corresponding to claims and disputes among stakeholders, contractor’s financial constraints, failures in design and inadequate project planning, and deficiency of materials, equipment and tools, respectively. Attribute 11 ranked first as a cause of cost overrun, with SI and IMPI values of 4.34 and 3.77, followed by attribute 23, which ranked second with SI and IMPI values of 4.18 and 3.49, respectively. Attribute 5 ranked third, with SI and IMPI values of 4.12 and 3.39, and attribute 26 ranked fourth, with SI and IMPI values of 4.08 and 3.33, respectively. The lowest-ranked attribute was attribute 7, financial constraints and inadequate fund allocation from client. It ranked 32nd with SI and IMPI values of 1.76 and 0.62, respectively, so its contribution to cost overrun is comparatively insignificant: higher severity and importance indices correspond to a greater contribution to cost overrun, and vice versa.
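The SI and IMPI pairs quoted above for the consultant group are numerically consistent with IMPI ≈ SI²/5, that is, the severity index multiplied by an index on a 0–1 scale equal to SI divided by the Likert maximum. This relation is an observation about the reported numbers, not a definition stated in this section, so the following Python check should be read as a sketch under that assumption; `impi_from_si` and `LIKERT_MAX` are illustrative names.

```python
# Reported (SI, IMPI) pairs for the consultant group, copied from the text.
reported = {
    "contractor's financial constraints": (4.46, 3.98),
    "poor communication among stakeholders": (4.38, 3.84),
    "design variation from client/consultant": (4.14, 3.43),
    "claims and disputes among stakeholders": (4.10, 3.36),
    "delay in client decision-making process": (1.72, 0.59),
}

LIKERT_MAX = 5  # 5-point Likert scale used in the survey


def impi_from_si(si, scale_max=LIKERT_MAX):
    """Assumed relation (not stated in the text): IMPI = SI * (SI / scale_max)."""
    return si * si / scale_max


# Largest gap between the assumed relation and the reported IMPI values.
max_abs_diff = max(
    abs(impi_from_si(si) - impi) for si, impi in reported.values()
)

for name, (si, impi) in reported.items():
    print(f"{name}: reported {impi:.2f}, SI^2/5 = {impi_from_si(si):.3f}")
```

Under this assumption every reported IMPI is reproduced to within two-decimal rounding.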

Spearman correlation computation results

The Spearman correlation between two variables is equal to the Pearson correlation between the rank values of those two variables. While Pearson's correlation assesses linear relationships, Spearman's correlation assesses monotonic relationships (whether linear or not). If there are no repeated data values, a perfect Spearman correlation of +1 or −1 occurs when each of the variables is a perfect monotone function of the other. The ranked responses from the three respondent classifications were compared in this process, as shown in Tables 7, 8 and 9. The computation gave a maximum Spearman correlation coefficient, \({r}_{\mathrm{scp}}\), of 0.9344 between consultant and project manager, an indication of similar ranking by the two parties. The correlation between consultant and contractor gave an \({r}_{\mathrm{scc}}\) value of 0.8343, while the correlation between contractor and project manager gave an \({r}_{\mathrm{scp}}\) value of 0.838. These correlation values can be classified as indicating moderately similar rankings. This is in line with the rank correlation results obtained by Alda and Assed [2] and Senouci et al. [45].

Table 7 Spearman correlation for consultant versus project manager
Table 8 Spearman correlation for consultant versus contractor
Table 9 Spearman correlation for contractor versus project manager (PM)

From Eq. (9), \(r_{{\text{s}}} = 1 - \frac{{6\sum d^{2} }}{{n\left( {n^{2} - 1} \right)}}\) and computing Spearman’s correlation coefficient between consultant and project manager \(r_{{{\text{scp}}}}\), we obtain

$$r_{{{\text{scp}}}} = 1 - \frac{6 \times 358}{{32\left( {32^{2} - 1} \right)}} = 0.934384$$

Similarly, the coefficient of correlation between consultant and contractor is computed below:

$$r_{{{\text{scc}}}} = 1 - \frac{6 \times 904}{{32\left( {32^{2} - 1} \right)}} = 0.834311$$

In the same manner, the coefficient of correlation between contractor and project manager is computed below:

$$r_{{{\text{scp}}}} = 1 - \frac{6 \times 886}{{32\left( {32^{2} - 1} \right)}} = 0.83761$$
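The three coefficients above can be reproduced directly from the sums of squared rank differences (358, 904 and 886) and n = 32. The Python sketch below assumes untied ranks, which is the condition under which this closed-form expression holds; the function names are illustrative.

```python
def spearman_from_d2(sum_d2, n):
    """Spearman's coefficient for untied ranks: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))


def spearman_from_ranks(ranks_a, ranks_b):
    """Same coefficient, starting from two equal-length lists of untied ranks."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("rank lists must have the same length")
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return spearman_from_d2(sum_d2, len(ranks_a))


N_FACTORS = 32  # number of ranked cost overrun attributes

# Sums of squared rank differences as used in the equations above.
results = {
    "consultant vs project manager": spearman_from_d2(358, N_FACTORS),
    "consultant vs contractor": spearman_from_d2(904, N_FACTORS),
    "contractor vs project manager": spearman_from_d2(886, N_FACTORS),
}

for pair, r in results.items():
    print(f"{pair}: {r:.6f}")
```

`spearman_from_ranks` is included so the same check can be run from the rank columns of Tables 7–9 rather than from the precomputed sums.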

Fuzzy logic model development

Table 10 summarizes a holistic assessment of the respondents’ responses with respect to the cost overrun attributes in construction projects. The data presented were used as the system database for the fuzzy logic model developed to evaluate the overrun factor rankings given by the professionals: consultants, contractors and project managers. The respondents’ observations indicated poor communication among stakeholders and contractor’s financial constraints as the most severe attributes, with SI values of 4.29 each, while delay in client decision-making process was the least severe attribute, with an SI value of 1.79 [46]. Furthermore, the frequency histogram plots of the model variables are presented in Fig. 6, which reports the mean and standard deviation of each variable and shows the shape of its frequency distribution. The plots show the data values of each variable on the x-axis and the frequency on the y-axis. The computed results showed skewed distributions for the model variables, due to a natural limit that constrains the possible outcomes on one side. Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The range (upper and lower limits) of the input variables is clearly illustrated by the histogram plots [45].

Table 10 System Database

Figure 6 presents the histogram plots of the estimated variables. The histogram of SI shows a mean of 3.347 with a standard deviation of 0.7338, while the histogram of FI has a mean of 0.6695 and a standard deviation of 0.1467. The histogram of score has a mean of 502.1 and a standard deviation of 110.0, and lastly, the histogram of rank has a mean and standard deviation of 16.5 and 9.381, respectively.

Fig. 6 Frequency histogram plot for the model variables
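The mean and standard deviation reported for the rank variable (16.5 and 9.381) match what one would expect if the 32 factors are ranked 1 to 32 without ties and the sample (n − 1) standard deviation is used. A quick Python check, under exactly those two assumptions:

```python
import statistics

# Assumption: the 32 attributes carry the distinct ranks 1..32.
ranks = list(range(1, 33))

mean_rank = statistics.mean(ranks)   # arithmetic mean of the ranks
sd_rank = statistics.stdev(ranks)    # sample standard deviation (n - 1 divisor)

print(f"mean = {mean_rank}, sample standard deviation = {sd_rank:.3f}")
```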

Model processing parameters

The fuzzy logic model processing parameters, variable interdependencies and connections between the input and output variables are presented in Fig. 7, which shows score, FI and SI as the input variables and the ranking of the cost overrun attributes as the model output. A Mamdani fuzzy inference system was deployed, with the minimum function used for the AND method and for implication, and the maximum function used for the OR method and for aggregation. The centroid of area method was used for defuzzification [47].

Fig. 7 Fuzzy model processing parameters

The centroid of area method is the defuzzification method used for this operation; its mathematical expression is given in Eq. (15):

$${\text{C.A}} = \frac{{\sum\nolimits_{i} {\mu \left( {M_{i} } \right)} \times O_{i} }}{{\sum\nolimits_{i} {\mu \left( {M_{i} } \right)} }}$$
(15)

where \({\text{C.A}}\) is the centroid of area defuzzified output result, \(\mu \left( {M_{i} } \right)\) is the membership value for the output results in the ith subset and \(O_{i}\) is the output results in the ith subset.
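Eq. (15) is a membership-weighted average of the output values. A minimal Python sketch of the discrete form is shown below; the sampled output values and membership degrees are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def centroid_of_area(outputs, memberships):
    """Discrete centroid: sum(mu_i * O_i) / sum(mu_i), as in Eq. (15)."""
    if len(outputs) != len(memberships):
        raise ValueError("outputs and memberships must have the same length")
    total_membership = sum(memberships)
    if total_membership == 0:
        raise ValueError("all membership values are zero; centroid is undefined")
    weighted_sum = sum(mu * o for mu, o in zip(memberships, outputs))
    return weighted_sum / total_membership


# Hypothetical aggregated output set sampled at five points.
sample_outputs = [0.0, 8.0, 16.0, 24.0, 32.0]
sample_memberships = [0.0, 0.5, 1.0, 0.5, 0.0]

crisp = centroid_of_area(sample_outputs, sample_memberships)
print(crisp)  # (0.5*8 + 1.0*16 + 0.5*24) / 2.0 = 16.0
```

Because the hypothetical set is symmetric about 16, the centroid lands on its peak; an asymmetric set would pull the crisp value toward the side carrying more membership.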

Membership function

The membership functions (MFs) used for the model variables are as follows: triangular MFs for the score and frequency index (FI) variables, and trapezoidal MFs for the severity index (SI). The fuzzy variables are each assigned a degree of membership based on expert judgment and details from the system database. Five membership functions were used for each fuzzy variable, namely does not occur (mf1), low frequency (mf2), medium frequency (mf3), high frequency (mf4) and extreme frequency (mf5). The membership function plots for the model variables, showing the data range (universe of discourse) on the x-axis and the degree of membership (0–1) on the y-axis, are presented in Fig. 8 [48, 49].

Fig. 8 Membership function plots
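For readers unfamiliar with the two shapes, the sketch below gives the standard triangular and trapezoidal membership functions in Python. The breakpoints used in the example calls are hypothetical; the parameters actually used in the model are those shown in Fig. 8 and are not reproduced here.

```python
def triangular_mf(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge


def trapezoidal_mf(x, a, b, c, d):
    """Trapezoidal membership with feet at a and d and plateau on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    if x <= c:
        return 1.0                 # plateau
    return (d - x) / (d - c)       # falling edge


# Hypothetical "medium frequency" set for FI on a 0-1 scale.
fi_medium = triangular_mf(0.375, 0.25, 0.5, 0.75)

# Hypothetical "high frequency" set for SI on a 1-5 scale.
si_high = trapezoidal_mf(3.25, 3.0, 3.5, 4.0, 4.5)

print(fi_medium, si_high)
```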

Graphical expression of the fuzzy variables relationships

The if–then rules are formulated from the relevant literature and expert judgment, in line with the relationships between the variables. These dependencies are characterized using the membership function parameters to derive the degree of belongingness between the independent factors. The formulated fuzzy rules are then aggregated using the maximum function, which enhances data generalization for better prediction accuracy. The relationships between the fuzzy model variables and the target response parameter are presented as surface plots in Fig. 9, which provide a 3-D view for assessing the effects of variations in the fuzzy variables for better generalization of the data. Finally, defuzzification is carried out using the centroid of area method. This is the last stage of the fuzzy inference system operation, in which the fuzzy results obtained from the fuzzy inference engine are mapped to a crisp numeric value corresponding to the appropriate output membership function [30, 42].

Fig. 9 3-D surface plots
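The inference steps described in this section (minimum for AND and implication, maximum for OR and aggregation, centroid defuzzification) can be illustrated with a deliberately reduced Python sketch. It is not the model of this study: it uses two inputs instead of three, two fuzzy sets per variable instead of five, triangular sets throughout, and two hypothetical rules with hypothetical breakpoints.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Hypothetical fuzzy sets (not the parameters used in the paper).
SI_SETS = {"low": (0.0, 1.0, 4.0), "high": (2.0, 5.0, 6.0)}        # SI on 1-5
FI_SETS = {"low": (-1.0, 0.0, 0.8), "high": (0.2, 1.0, 2.0)}       # FI on 0-1
RANK_SETS = {"top": (0.0, 1.0, 17.0), "bottom": (16.0, 32.0, 33.0)}  # rank 1-32


def mamdani_rank(si, fi, steps=311):
    """Two-rule Mamdani inference returning a crisp rank estimate."""
    # Rule 1: IF SI is high AND FI is high THEN rank is top (AND -> min).
    w_top = min(tri(si, *SI_SETS["high"]), tri(fi, *FI_SETS["high"]))
    # Rule 2: IF SI is low OR FI is low THEN rank is bottom (OR -> max).
    w_bottom = max(tri(si, *SI_SETS["low"]), tri(fi, *FI_SETS["low"]))

    # Sample the output universe 1..32.
    xs = [1 + 31 * k / (steps - 1) for k in range(steps)]
    # Implication clips each output set with min; aggregation combines with max.
    aggregated = [
        max(min(w_top, tri(x, *RANK_SETS["top"])),
            min(w_bottom, tri(x, *RANK_SETS["bottom"])))
        for x in xs
    ]
    total = sum(aggregated)
    if total == 0:
        raise ValueError("no rule fired for these inputs")
    # Centroid of area defuzzification.
    return sum(m * x for m, x in zip(aggregated, xs)) / total


severe = mamdani_rank(si=4.5, fi=0.9)   # severe, frequent factor
minor = mamdani_rank(si=2.0, fi=0.3)    # mild, infrequent factor
print(f"severe factor -> {severe:.1f}, minor factor -> {minor:.1f}")
```

With these hypothetical sets, a severe and frequent factor is mapped toward the top of the ranking (a small rank number) and a mild, infrequent one toward the bottom, which is the qualitative behavior a surface plot of this kind makes visible.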

Evaluation of the fuzzy logic model performance

Following model development, the fuzzy logic model predictions are compared with the corresponding actual values using performance measures such as root mean squared error (RMSE), mean absolute percentage error (MAPE) and coefficient of determination (COD) to rate the model's prediction accuracy. The computation was carried out in Microsoft Excel and is presented in Table 11. The statistical evaluation shows an RMSE of 0.321%, a MAPE of 0.115% and a coefficient of determination of 0.995. This indicates strong agreement between the actual and predicted outcomes [50, 51].

Table 11 Performance measure for fuzzy logic model
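The three measures can be computed as in the Python sketch below. The definitions are the common textbook forms, with the coefficient of determination taken as 1 − SS_res/SS_tot; the study does not state which variant was used in the spreadsheet, and the sample values are hypothetical rather than data from Table 11.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]

print(rmse(actual, predicted), mape(actual, predicted), r_squared(actual, predicted))
```

Note that the coefficient of determination is dimensionless, whereas RMSE carries the units of the predicted quantity and MAPE is a percentage.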

Conclusion

Cost overrun factors in construction projects were evaluated in this study using fuzzy logic modeling, and from the results the following conclusions can be drawn:

  • A research survey was adopted as the methodology, using a structured questionnaire distributed to specialists in the construction industry to assess the effects of overrun factors on building projects.

  • The survey results indicated that delay in client decision-making process recorded the lowest severity index, 1.79, while contractor’s financial constraints recorded the highest severity index, 4.29.

  • The data generated were expertly sorted and used as the system database for the fuzzy logic model development. The model involves three input variables, FI, SI and score, and one output, the ranking of the cost overrun factors.

  • The Fuzzy Logic Toolbox in MATLAB, with a Mamdani-type fuzzy inference system (FIS), was used in the modeling process to achieve generalization of the derived datasets.

  • The performance of the developed model was evaluated using statistical measures, namely coefficient of determination (COD), MAPE and RMSE. The computed results indicated robust model performance. The fuzzy logic model is able to predict the ranking of the overrun factors and thus gives project managers a guide for efficient decision-making to achieve optimal results.

Recommendations

Based on this research work, the following recommendations are made in order to guard against cost overruns in construction projects.

  • Government should set up a legal body responsible for monitoring and controlling construction cost from the project inception phase through to project completion.

  • Mitigation of cost overrun in construction projects should be undertaken from the early stages, because several of the top-ranked causal factors with the greatest influence relate to processes that belong to the initial stage of projects.

  • Proper communication between all project participants should be a major practice.

  • Further work could focus on the following:

    (a) Identifying cost overrun factors in construction projects from geographical areas where there are no studies.

    (b) Analyzing the relationship between reported factors and geographic locations.