Introduction

Time overruns in construction projects may have substantial consequences for the project's overall success. These consequences may include escalated costs, prolonged completion timelines, and diminished customer satisfaction (Kamaruddeen et al., 2020). Hence, the precise assessment and anticipation of schedule overruns are crucial for efficient project management. In recent times, scholars have investigated different methodologies and strategies to tackle this predicament, including utilizing optimization algorithms and deep learning methodologies (Gómez-Cabrera et al., 2023).

According to Arabiat et al. (2023), the issue of time overruns is a persistent and prevalent difficulty within the construction sector worldwide. The integration of Machine Learning (ML) into the construction industry has emerged as a recent development, despite ML's widespread use in numerous other fields. That work aimed to construct a prognostic model to anticipate the occurrence of these time overruns. A literature analysis identified measurable risk indicators particular to building projects. On a dataset of completed projects, the K-nearest neighbor (KNN) model demonstrated an accuracy of 83.76%, while the Artificial Neural Network (ANN) model produced a much higher accuracy of 99.28%. This performance was supported by three essential data mining techniques: clustering, feature selection, and prediction.

Sambasivan and Soon (2007) examined the factors contributing to delays in the Malaysian building sector and their subsequent impacts, offering significant insights into the determinants of schedule overruns in that specific setting. Ghazali and Wong (2014), in an examination of risk ratings for tunneling construction projects in Malaysia, likewise noted that a notable proportion of public projects in Saudi Arabia experienced delays in meeting their scheduled timelines. These studies underscore the need to understand the origins and hazards linked to schedule overruns in building endeavors.

In order to tackle the issue of assessing and forecasting time overruns, scholars have examined the use of sophisticated methodologies, such as deep learning and optimization algorithms. Eltoukhy and Nassar (2021) developed a support vector machine (SVM) model to analyze time and expense overruns in building projects within the context of Egypt. The model exhibited notable precision in forecasting time overruns, highlighting the promising prospects of using machine learning methodologies in this field.

In conjunction with machine learning, optimization methods have been employed to tackle the problem of time overruns effectively. Bin Seddeeq et al. (2019) conducted research that specifically examined the oil and gas construction business in Saudi Arabia, finding that a notable proportion of construction projects within this industry experienced delays in their completion. This research offers significant contributions by shedding light on the frequency of time overruns within the sector. In addition, the use of optimization techniques, such as the Coral Reefs Optimization (CRO) algorithm, has shown potential in mitigating time delays encountered in building projects.

In summary, the assessment and anticipation of time overruns in construction projects is a multifaceted undertaking that requires a thorough comprehension of the underlying factors and accompanying hazards linked to these delays. Scholars have investigated many approaches, such as deep learning and optimization algorithms, to tackle this issue (Sharma & Gupta, 2021; Gómez-Cabrera et al., 2020). According to Gharaibeh et al. (2020), it is feasible to devise efficient solutions for reducing time overruns and enhancing project results using these methodologies and considering the unique circumstances of building projects.

Within the dynamic and ever-changing realm of the construction sector, the issue of time overruns has become a prevailing obstacle, often disrupting project schedules and influencing various parties involved. With an understanding of the predictive capabilities of Machine Learning (ML), this research aims to develop a robust model that can effectively anticipate time overruns. Based on a thorough examination of relevant literature, this study used measurable risk indicators to develop a model. To verify the model, real-world project datasets were employed, resulting in notable accuracies when using both K-nearest neighbor (KNN) and Artificial Neural Network (ANN) approaches. This undertaking not only addresses the current lack of knowledge regarding utilizing machine learning in construction timelines but also introduces a novel approach that combines clustering, feature selection, and prediction techniques. This sets a precedent for future research and implementing proactive project management in practical settings.

Methodology

Data collection

The analyzed dataset contains a comprehensive record of 190 projects carried out from 2008 to 2022. The primary source of this data is the Ministry of Public Works and Housing, with additional information obtained from Daabes Contract EST. Every record in the collection provides the projected and realized expenses, marked as PC and AC, respectively, and the expected and actual durations, represented as PD and AD. In addition, the dataset captures time and cost disparities through the 'Time Overrun (Days)' and 'Cost Overrun (JOD)' measures: the first quantifies the deviation, in days, from the planned schedule, where negative values indicate delays, and the second quantifies the difference, in Jordanian dinars, between estimated and realized costs. The dataset is extensive, offers valuable insights into the dynamics of project management, and plays a crucial role in facilitating empirical investigations within the discipline.
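To make the record layout concrete, the following is a minimal sketch of how these fields might be loaded and the overrun measures derived; the file name, exact column names, and sign conventions are illustrative assumptions rather than the actual data pipeline.

```python
import pandas as pd

# Hypothetical file assembled from the description above; the real dataset
# (190 projects, 2008-2022) is not publicly specified.
df = pd.read_csv("projects_2008_2022.csv")

# PC/AC: planned and actual cost; PD/AD: planned and actual duration.
# Sign convention follows the text: negative values indicate delays/overruns.
df["Time Overrun (Days)"] = df["PD"] - df["AD"]
df["Cost Overrun (JOD)"] = df["PC"] - df["AC"]

print(df[["PC", "AC", "PD", "AD",
          "Time Overrun (Days)", "Cost Overrun (JOD)"]].describe())
```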

Within the complex realm of building projects, many factors intertwine to exert an effect on time overruns. Our objective is to carefully choose characteristics that effectively capture the fundamental aspects of these undertakings to comprehend the intricacies of this phenomenon.

As depicted in Fig. 1, the Planned Cost (PC) and Actual Cost (AC) function as financial indicators, providing valuable information regarding the financial aspects of a project. Divergences between these two elements can indicate unanticipated obstacles, alterations in scope, or inaccuracies in financial projections, all of which have the potential to impact project schedules. The plots reveal considerable fluctuations in these expenses, which frequently align with notable delays, emphasizing their interdependence. Planned Duration (PD) and Actual Duration (AD) serve as temporal markers, delineating the projected and achieved durations of undertakings. Comparing these durations provides insight into the precision of initial project estimations and the effectiveness of implementation. The significance of these elements is heightened by the narrative structures, in which deviations in timeframes frequently parallel instances of time exceeding expectations, illustrating the dynamic relationship between strategic planning and practical implementation (Shoar et al., 2022).

Fig. 1 Interplay of financial and temporal metrics: analyzing the symbiotic relationship between costs, durations, and time overruns

Finally, the Cost Overrun (JOD) measures the difference, in Jordanian dinars, between projected and actual costs, and it functions as a financial indicator. This characteristic aligns with the proverbial saying that time holds economic value; delays in financial matters, whether caused by obstacles in procurement or payment delays, have the potential to prolong project timelines. The graphs effectively illustrate this association, as shown by the prominent occurrence of cost overruns closely aligning with substantial time overruns.

The selected characteristics, as evidenced by their theoretical foundations and visual associations, highlight the complex and diverse nature of time overruns. Together, they develop a narrative that connects the financial, temporal, and execution aspects of construction projects, providing a comprehensive perspective to unravel the mystery surrounding delays in project timelines.

Data preprocessing

Within the domain of data-driven research, the maxim "garbage in, garbage out" underscores the pivotal significance of data preprocessing. The dependability and robustness of the subsequent model depend heavily on the quality and consistency of the input data. The preprocessing phase was initiated by evaluating the dataset's integrity, explicitly focusing on detecting any missing or abnormal values. The dataset utilized in this investigation demonstrated notable completeness, hence obviating the necessity for data imputation or exclusion (Sohrabi & Noorzai, 2022).

After performing an integrity check, normalization was undertaken to mitigate the heterogeneity observed in the scales of various variables (Yin et al., 2022). This stage holds significant importance, particularly when utilizing algorithms that are sensitive to feature scale, such as deep learning models. According to Ashtari et al. (2022), scaling each feature to have a mean of zero and a standard deviation of one prevents any individual feature from exerting disproportionate influence on the model as a result of its scale. Normalization plays a dual role in improving model convergence and the interpretability of feature importance. After normalization, the dataset was divided into separate sets for training and testing. The division of data into training and evaluation sets is a fundamental aspect of machine learning, since it allows for the assessment of a model's ability to generalize to new, unseen data after being trained. In our study, 80% of the data was designated for training, while the remaining 20% was set aside exclusively for testing. This standard partitioning guarantees a training set of sufficient size for model learning while maintaining a test set suitable for assessment.
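As a minimal sketch of this preprocessing, assuming the dataframe from the data-collection sketch above and an illustrative choice of predictor columns, the zero-mean, unit-variance scaling and 80/20 split could look as follows (the paper does not specify its tooling; scikit-learn is used here for convenience):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# df comes from the data-collection sketch above; the predictor set is an
# illustrative assumption, not the paper's exact feature list.
X = df[["PC", "AC", "PD", "Cost Overrun (JOD)"]].to_numpy()
y = df["Time Overrun (Days)"].to_numpy()

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)  # the paper's 80/20 split

# Fit the scaler on the training split only, so that no test-set
# statistics leak into training.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```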

In summary, the preprocessing steps were carefully selected to produce a dataset that is both representative and suitable for deep learning, establishing a solid basis for the subsequent modeling phase.

Neural network architecture

In our endeavor to model the complex dynamics that govern time overruns in building projects, we utilized the capabilities and adaptability of neural networks. The architecture selected for our study was a multi-layer perceptron (MLP), a feed-forward artificial neural network. The MLP was designed with an input layer that aligns with the number of features in our dataset, several hidden layers, and an output layer that provides the projected time overrun (Natarajan, 2022).

The hidden layers, the central component of our network, consist of a sequence of densely connected neurons. Each neuron processes its input through a weighted sum followed by a non-linear activation function. In our study, the Rectified Linear Unit (ReLU) activation function was selected for the hidden layers, based on its demonstrated efficacy in introducing non-linearity while maintaining computational efficiency, as highlighted by Gondia et al. (2020). A linear activation function in the output layer was deemed appropriate for our regression-oriented objective. These architectural decisions were justified by empirical evidence and computational considerations. Deep architectures, distinguished by several hidden layers, have exhibited exceptional proficiency in capturing intricate correlations within datasets. Nevertheless, the model's depth was limited to prevent overfitting and maintain computational feasibility. The adoption of ReLU was further driven by its efficacy in addressing the vanishing gradient problem, a prevalent obstacle in deep neural networks, which facilitates smoother training (Sanni-Anibire et al., 2021).
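A minimal sketch of such an architecture is given below, continuing from the preprocessing sketch and using Keras for illustration; the paper specifies ReLU hidden layers and a linear output, but the layer widths, optimizer, and training settings shown here are assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_mlp(n_features: int) -> keras.Model:
    """Feed-forward MLP: ReLU hidden layers, linear regression output."""
    model = keras.Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(64, activation="relu"),   # hidden-layer widths are assumptions
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="linear"),  # linear output for the regression target
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

mlp = build_mlp(X_train.shape[1])
mlp.fit(X_train, y_train, epochs=200, batch_size=16,
        validation_split=0.1, verbose=0)
```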

The neural network architecture employed in our study was carefully crafted to strike a delicate balance between model complexity, interpretability, and computational efficiency. This design choice renders our approach highly suitable for comprehending the intricate aspects of time overruns in building projects in Jordan.

Coral reefs optimization

The Coral Reefs Optimization Algorithm (CROA) is an innovative optimization technique that draws inspiration from biological processes, namely the reproduction and colonization behaviors observed in coral reefs. The basis of this approach is to replicate the processes of coral development, competition, and regeneration, enabling it to navigate intricate optimization terrains effectively (Kaveh, 2014).

In the present implementation, the corals, the principal entities in the CROA, represent prospective solutions to the optimization problem at hand. The coral reef, the habitat in which these corals flourish and compete, resembles the solution space. Corals engage in a range of reproductive mechanisms, including sexual reproduction through broadcast spawning and brooding, as well as asexual reproduction. Broadcast spawning generates novel candidate solutions by amalgamating characteristics from multiple parental corals, while brooding involves a more limited scope of exploration, in which progeny are generated with minor alterations from their progenitors (Kaveh & Khavaninzadeh, 2023; Kaveh et al., 2023). In contrast, asexual reproduction emulates the biological phenomenon of budding, generating almost identical corals with minimal variations. The interaction between the CROA and our neural network centered on the optimization of hyper-parameters and weights. Each coral inside the reef carries its own arrangement of neural network parameters. The fitness of each coral, which defined its probability of survival and reproductive success, was assessed based on the performance of the corresponding neural network configuration on the validation dataset. The objective of the CROA was to identify the configuration that minimizes prediction error, achieved through repeated cycles of coral reproduction and competition, as discussed by Kaveh et al. (2008).

The appeal of employing the CROA in conjunction with deep learning stems from its capacity to effectively navigate the extensive and complex parameter space of neural networks (Kaveh et al., 2021, 2023; Rofooei et al., 2011). In contrast to conventional optimization methods, which may become trapped in local optima, the bio-inspired mechanisms of the CROA offer a well-balanced strategy that encompasses both exploration and exploitation, guaranteeing a more comprehensive search. The combination of the CROA and our neural network architecture provided a strong and flexible approach to identifying the optimal model for forecasting time overruns in construction projects.
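The following is a deliberately simplified, self-contained sketch of a CROA-style search over a real-valued parameter vector (such as flattened network weights or encoded hyper-parameters), retaining only the mechanisms named above: broadcast spawning, brooding, budding, and fitness-based competition for space on the reef. Reef size, rates, and the toy fitness function are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def croa_minimize(fitness, dim, reef_size=30, iters=100,
                  spawn_frac=0.7, brood_sigma=0.1, bud_frac=0.1):
    reef = rng.uniform(-1.0, 1.0, size=(reef_size, dim))  # initial corals
    fit = np.array([fitness(c) for c in reef])

    for _ in range(iters):
        larvae = []
        n_spawn = int(spawn_frac * reef_size)
        # Broadcast spawning: offspring mix traits of two parent corals.
        for _ in range(n_spawn // 2):
            p1, p2 = reef[rng.choice(reef_size, 2, replace=False)]
            mask = rng.random(dim) < 0.5
            larvae.append(np.where(mask, p1, p2))
        # Brooding: small perturbations of a single parent.
        for _ in range(reef_size - n_spawn):
            parent = reef[rng.integers(reef_size)]
            larvae.append(parent + rng.normal(0.0, brood_sigma, dim))
        # Budding: near-copies of the currently fittest corals (a design choice).
        n_bud = max(1, int(bud_frac * reef_size))
        for idx in np.argsort(fit)[:n_bud]:
            larvae.append(reef[idx] + rng.normal(0.0, 1e-3, dim))

        # Competition: each larva may displace the worst-fit coral on the reef.
        for larva in larvae:
            f = fitness(larva)
            worst = int(np.argmax(fit))
            if f < fit[worst]:
                reef[worst], fit[worst] = larva, f

    best = int(np.argmin(fit))
    return reef[best], fit[best]

# In the paper's setting, fitness would be the validation error of the
# network decoded from the coral vector; a toy quadratic stands in here.
best_coral, best_fit = croa_minimize(lambda v: float(np.sum(v**2)), dim=10)
```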

Evaluation metrics

In the field of predictive modeling, the careful selection of suitable assessment metrics is of utmost importance in order to thoroughly evaluate the performance of a model. In order to address the issue of forecasting delays in construction projects, a set of metrics was utilized, with each statistic providing a distinct viewpoint on the accuracy of predictions and the distribution of errors.

The Mean Absolute Error (MAE) is a metric that quantifies the average magnitude of prediction errors, irrespective of their direction (Moon et al., 2020). It is defined as:

$$ \text{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| $$

where \(y_i\) represents the actual value, \(\hat{y}_i\) is the predicted value, and \(n\) is the total number of observations.

In contrast, the Mean Squared Error (MSE) places greater emphasis on larger errors by taking their squared values (Akinosho et al., 2020). The concept is articulated as:

$$ \text{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 $$

The Root Mean Squared Error (RMSE) is a metric that enhances the interpretability of the Mean Squared Error (MSE) by calculating its square root. This transformation allows for the representation of error magnitude in the original unit of measurement (Hu et al., 2021):

$$ \text{RMSE} = \sqrt{\text{MSE}} $$

The coefficient of determination, commonly referred to as R-squared (R2), quantifies the fraction of variability in the dependent variable that can be accounted for by the independent variables (Uddin et al., 2022). Its value ranges from 0 to 1 and is determined by the following equation:

$$ R^2 = 1 - \frac{\text{SS}_{\text{res}}}{\text{SS}_{\text{tot}}} $$

where \({\text{SS}}_{{\text{res}}}\) is the sum of squares of residuals and \({\text{SS}}_{{\text{tot}}}\) is the total sum of squares.

The Mean Absolute Percentage Error (MAPE) is a metric that offers a relative assessment of error expressed in percentage terms (Moghayedi & Windapo, 2020).

$$ \text{MAPE} = \frac{1}{n}\sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100 $$

The Mean Bias Deviation (MBD) metric, as described by Maltbie et al. (2021), quantifies the average bias present in the forecasts.

$$ \text{MBD} = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i - \hat{y}_i}{y_i} \times 100 $$

The Relative Absolute Error (RAE) is defined as the quotient of the Mean Absolute Error (MAE) for the specific model under consideration and the MAE of a baseline naive model, as described by Gondia et al. (2020):

$$ \text{RAE} = \frac{\text{MAE of the model}}{\text{MAE of a naive model}} $$

Similarly, the Relative Squared Error (RSE) is defined as the quotient of the Mean Squared Error (MSE) for the specific model under consideration and the MSE of the same baseline naive model (Gondia et al., 2020):

$$ \text{RSE} = \frac{\text{MSE of the model}}{\text{MSE of a naive model}} $$

Each of these indicators provides a nuanced perspective for evaluating the predictions made by the model, thus ensuring a comprehensive and diverse assessment of its performance.
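For reference, the metrics above can be computed directly from prediction vectors, as in the following sketch; the naive baseline is left as a caller-supplied reference prediction, since the paper does not specify the baseline model used for RAE and RSE:

```python
import numpy as np

def evaluate(y_true, y_pred, y_naive):
    """Compute the evaluation metrics defined above (MAPE/MBD assume y_true != 0)."""
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    mse = np.mean(err**2)
    rmse = np.sqrt(mse)
    ss_res = np.sum(err**2)
    ss_tot = np.sum((y_true - np.mean(y_true))**2)
    r2 = 1.0 - ss_res / ss_tot
    mape = np.mean(np.abs(err / y_true)) * 100
    mbd = np.mean(err / y_true) * 100
    rae = mae / np.mean(np.abs(y_true - y_naive))   # MAE of model / MAE of naive model
    rse = mse / np.mean((y_true - y_naive)**2)      # MSE of model / MSE of naive model
    return dict(MAE=mae, MSE=mse, RMSE=rmse, R2=r2,
                MAPE=mape, MBD=mbd, RAE=rae, RSE=rse)
```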

Results and discussion

Visual diagnostics and algorithm behavior analysis

As depicted in Fig. 2, the Convergence Plot offers a visual depiction of the iterative advancement of the optimization method, highlighting the evolution of the best fitness or objective value. It is a crucial diagnostic technique for evaluating an algorithm's effectiveness and convergence speed; a steeper slope typically signifies more rapid convergence toward an optimal or sub-optimal solution.

Fig. 2 Evolution of best fitness value: a convergence analysis of the optimization algorithm over iterations

The Diversity Plot, shown in Fig. 3, is a graphical representation that illustrates the level of variety or heterogeneity observed in the population of solutions across iterations. A high level of diversity suggests an extensive investigation of the search space, whereas a decrease in diversity may imply premature convergence. Monitoring diversity can contribute to understanding how the algorithm achieves a trade-off between exploration, which involves probing new regions, and exploitation, which involves refining known good solutions.

Fig. 3 Exploration versus exploitation: a diversity analysis across iterations of the optimization algorithm
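Diagnostics such as Figs. 2 and 3 can be regenerated from logged optimizer state, as in the following sketch; it assumes the search records the best fitness value and the full reef at every iteration (logging not shown in the paper), and uses the mean per-dimension standard deviation as a simple diversity proxy:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_diagnostics(best_per_iter, reefs_per_iter):
    """Convergence (best fitness) and diversity (population spread) per iteration."""
    # Simple diversity proxy: mean per-dimension standard deviation of the reef.
    diversity = [np.mean(np.std(reef, axis=0)) for reef in reefs_per_iter]

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(best_per_iter)
    ax1.set(xlabel="Iteration", ylabel="Best fitness", title="Convergence")
    ax2.plot(diversity)
    ax2.set(xlabel="Iteration", ylabel="Diversity", title="Exploration vs. exploitation")
    plt.tight_layout()
    plt.show()
```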

The Fitness Distribution Plot, commonly depicted as a histogram and shown in Fig. 4, provides a visual representation of the distribution of fitness values within a population at a specific iteration. It facilitates the identification of the spectrum and concentration of solution qualities: a more even distribution implies wider exploration, while a skewed distribution indicates a concentration on particular solution attributes.

Fig. 4 Distribution of solution qualities: a fitness landscape analysis at a specific iteration

As depicted in Fig. 5, the Solution Space Exploration plot is a scatter diagram that visually represents individual solutions within the objective space. This plot is particularly advantageous in the context of multi-objective optimization problems, as it offers valuable insights regarding the dispersion and allocation of solutions, effectively highlighting regions of concentrated exploration as well as prospective areas of interest or neglect.

Fig. 5 Landscape of solutions: visualizing the distribution and density in the objective space

Figure 6 compares the average and worst fitness values throughout a series of iterations. Through the examination of these metrics, scholars can acquire valuable knowledge regarding the general well-being of the population as well as the prevalence of inadequate or below-average solutions, allowing an evaluation of the algorithm's resilience and reliability.

Fig. 6 A comparative analysis of average and worst fitness values across iterations

The Parameter Sensitivity Plot, seen in Fig. 7, plays a crucial role in algorithm tuning and parameter analysis. By graphically representing the relationship between the performance outcome and different parameter values, this visualization provides a clear understanding of how changes in parameters affect the effectiveness of the algorithm. It assists in determining the parameter settings that yield optimal or nearly optimal results, and also reveals the algorithm's robustness to fluctuations in parameters.

Fig. 7 Impact of parameter variations: visual analysis of algorithm performance across different parameter settings

Model performance

The practicality and deployment potential of predictive models heavily rely on their performance. In our pursuit of forecasting delays in building projects in Jordan, the Multi-Layer Perceptron (MLP) has emerged as a prominent candidate, showcasing impressive predictive capabilities across various evaluation metrics.

As depicted in Fig. 8, the Mean Absolute Error (MAE) for the Multilayer Perceptron (MLP) was recorded as 0.1215, indicating an average prediction error of approximately 12.15%. The Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) yielded values of 0.0395 and 0.1986, respectively. These metrics are particularly useful because they give more weight to larger errors, providing a comprehensive evaluation of the model's performance. Together, these measures indicate that although the model's predictions are often precise, there may be sporadic instances where predictions stray significantly from the actual values. The R-Squared (R2) metric, which is widely used in regression research, yielded a noteworthy value of 0.9540 for the MLP model. This finding indicates that a significant proportion, specifically 95.40%, of the variability observed in the dependent variable (time overruns) can be accounted for by the independent factors included in our dataset, highlighting the robust predictive capability of the model. Upon further examination of the relative metrics, the Relative Absolute Error (RAE) and Relative Squared Error (RSE) yielded values of 0.6104 and 0.6857, respectively. By comparing the performance of the MLP with that of a naive model, these metrics provide a comparative perspective, indicating that the average errors of our model are approximately 61.04% and 68.57% of the errors observed in a simplistic baseline.

Fig. 8 Quantitative and visual assessment of MLP’s predictive proficiency: a deep dive into key performance metrics

The provided bar chart effectively represents the performance measures of the MLP, offering a visual overview that emphasizes the effectiveness of the model. Each statistic, which is visually depicted as separate bars, provides a concise overview of the model's strengths and areas that could benefit from further improvement.

Within the domain of predictive analytics, the effectiveness, resilience, and applicability of a model are reflected in its performance. This work employed the Coral Reefs Optimization Algorithm (CROA) in conjunction with the Multi-Layer Perceptron (MLP) to offer a novel methodology for forecasting time overruns in building projects. The findings, as explicated by the performance indicators, are both encouraging and enlightening.

The Mean Absolute Error (MAE) of the MLP-CROA exhibited a value of 0.0707, suggesting that the model's average deviation from the true values is around 7.07%. In addition, the Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) yielded values of 0.0115 and 0.1072, respectively. These metrics, particularly the RMSE, offer valuable information on the size and dispersion of the prediction errors, indicating that the model's forecasts are generally close to the actual values. The R-Squared (R2) metric, widely regarded as highly informative in regression tasks, exhibited a remarkable value of 0.9914 for the MLP-CROA. This figure indicates that a remarkable 99.14% of the variability in time overruns can be explained by the predictors included in our model, which serves as evidence of the model's strong predictive ability.

Upon delving into the relative metrics, the Relative Absolute Error (RAE) and Relative Squared Error (RSE) exhibited values of 0.3549 and 0.1997, respectively. These metrics, which provide a comparative perspective against a simplistic benchmark, highlight the superior performance of the MLP-CROA model in terms of predictive accuracy and precision. Figure 9 depicts the bar chart that accompanies this analysis, serving as visual evidence to support these findings and succinctly summarizing the model's performance. Each metric, represented by a separate bar, highlights a specific aspect of the model's predictive capacity, collectively portraying a model that is both strong and skilled.

Fig. 9 Comprehensive performance assessment of MLP-CROA: an integration of quantitative metrics and visual insights

In summary, based on the evaluation of metrics and visual representations, the results indicate that the combination of deep learning and bio-inspired optimization, as demonstrated by the MLP-CROA model, exhibits considerable potential in the field of construction project forecasting. The findings derived from this analysis provide a foundation for additional enhancements and future implementation in practical situations.

Comparison between the two models

The field of predictive modeling relies heavily on the practice of comparing different models, as this allows for valuable insights about their respective strengths, areas for potential improvement, and suitability for deployment. Within this particular setting, a notable comparison arises between the independent Multi-Layer Perceptron (MLP) and its enhanced variant, the MLP augmented with the Coral Reefs Optimization Algorithm (MLP-CROA).

The bar chart in Fig. 10 depicts a comparative assessment of the two models, showcasing their performance across several metrics side by side. Upon initial examination, it is evident that the MLP-CROA consistently demonstrates superior performance compared to the standalone MLP. This finding is further supported by a comprehensive analysis of specific measures. The Mean Absolute Error (MAE) of the MLP-CROA model demonstrates a significant decrease, indicating that the improved model provides predictions that are, on average, in closer proximity to the true values. In a similar vein, the Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the relative metrics Relative Absolute Error (RAE) and Relative Squared Error (RSE) serve to demonstrate the MLP-CROA's exceptional predictive capabilities. The R-Squared (R2) metric emerges as a particularly informative measure in this comparative analysis. Although both models exhibit high R2 values, showing robust predictive capacity, the MLP-CROA model's R2 value of 0.9914 implies that it can account for an impressive 99.14% of the variability in time overruns, surpassing the standalone MLP model. This comparative analysis emphasizes the strengths of the MLP-CROA approach, while also shedding light on the transformative potential of integrating bio-inspired optimization techniques with deep learning models. By optimizing the model parameters, the Coral Reefs Optimization Algorithm significantly improves the performance of the Multilayer Perceptron, resulting in more precise and resilient predictions.

Fig. 10 Comparative performance analysis: MLP versus MLP-CROA, a deep dive into metrics and model efficacy
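A comparison chart in the spirit of Fig. 10 can be reconstructed from the metric values reported in the text, as in the following sketch; only the chart layout is assumed:

```python
import numpy as np
import matplotlib.pyplot as plt

# Metric values as reported in the text for both models.
metrics  = ["MAE", "MSE", "RMSE", "R2", "RAE", "RSE"]
mlp      = [0.1215, 0.0395, 0.1986, 0.9540, 0.6104, 0.6857]
mlp_croa = [0.0707, 0.0115, 0.1072, 0.9914, 0.3549, 0.1997]

x = np.arange(len(metrics))
plt.bar(x - 0.2, mlp, width=0.4, label="MLP")
plt.bar(x + 0.2, mlp_croa, width=0.4, label="MLP-CROA")
plt.xticks(x, metrics)
plt.ylabel("Metric value")
plt.title("MLP versus MLP-CROA across evaluation metrics")
plt.legend()
plt.show()
```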

Nevertheless, it is imperative to recognize that although the MLP-CROA exhibits enhanced performance, it also involves a more intricate training procedure owing to the inclusion of an additional optimization phase. This added complexity could potentially result in extended training durations and heightened computational requirements, factors that both academics and practitioners should weigh when implementing these models.

The synergistic relationship between deep learning and bio-inspired optimization, exemplified by the MLP-CROA, makes predictive modeling in construction projects an attractive prospect. The undeniable improvement in model performance nevertheless necessitates careful consideration of the trade-off between accuracy gains and computational expense, ensuring the model's practicality for real-world implementations.

Discussion

The evaluation of our models, specifically the MLP-CROA, offers significant insight into forecasting time delays in building endeavors. By thoroughly examining the obtained outcomes and comparing them with the findings from the literature review, a comprehensive understanding of the research field and the potential ramifications of our study can be achieved.

The results of our investigation indicate that the MLP-CROA model demonstrated a higher level of performance in accurately forecasting instances of time overruns. The improved performance can be ascribed to the collaborative effect of the deep learning model's intrinsic capabilities and the optimization proficiency of the Coral Reefs Optimization Algorithm. The impressive performance of our approach necessitates a comprehensive study of the underlying factors contributing to its efficacy. The optimization phase is a crucial step in refining the model parameters. It plays a significant role in navigating the solution space to locate either optimal or close to ideal solutions. This process ultimately leads to an improvement in the accuracy of predictions. The results obtained from our study align with the existing body of literature about time overruns in building projects. Sambasivan and Soon (2007) highlighted the noteworthy role of contractor-related and labor-related factors in contributing to schedule overruns. The predictive capabilities of our model make it a valuable tool for proactively addressing the reasons for time overruns, potentially reducing the negative consequences identified in their research. Mishra et al. (2018) highlighted the significance of time overruns in project performance. This viewpoint is supported by Siyabi and Khaleel (2021), who established a direct correlation between time overruns and project costs and quality. Utilizing our model, which offers precise forecasts, holds significant importance in facilitating proactive project management, thereby ensuring the adherence of projects to predetermined schedules and budgets.

In their study, Ghazali and Wong (2014) emphasized the widespread occurrence of time overruns in building projects, specifically focusing on the setting of Saudi Arabia. The significance of prediction tools, as exemplified by the ones produced in our work, is underscored by the widespread nature of these difficulties. The research conducted in our study utilizes sophisticated procedures that are in line with the prevailing academic trajectory. This is exemplified by Eltoukhy and Nassar's (2021) investigation, where they applied Support Vector Machines to forecast time overruns. Their findings reinforce the considerable promise of machine learning techniques within this field. Our study adds to the expanding corpus of literature that underscores the importance of resolving schedule overruns in building projects. Numerous factors, including contractor performance and labor shortages, have been recognized as influential in this widespread issue. However, incorporating machine learning and optimization techniques, exemplified by our MLP-CROA model, presents a promising pathway for applying predictive analytics in this domain. Our study emphasizes the transformative potential of advanced predictive models in achieving timely and cost-effective project completion, drawing upon the foundational works of researchers, such as Sambasivan and Soon (2007), Mishra et al. (2018), Siyabi and Khaleel (2021), Ghazali and Wong (2014), and Eltoukhy and Nassar (2021).

Conclusion

The construction sector, within the project management domain, finds itself at a pivotal juncture where innovation intersects with tradition. The main aim of this study was to utilize deep learning, in conjunction with the Coral Reefs Optimization Algorithm (CROA), to forecast time overruns in construction projects in Jordan. This study used the Multi-Layer Perceptron (MLP) as the foundational deep learning model to analyze a dataset comprising diverse construction projects, their respective timeframes, and financial parameters. The results of our study were both enlightening and encouraging. The standalone MLP demonstrated notable predictive capability, which was further improved by incorporating the CROA. The optimized MLP-CROA model consistently demonstrated superior performance compared to its non-optimized counterpart across evaluation measures ranging from Mean Absolute Error to R-squared. This performance highlights not only the precision of the model but also its viability for implementation in practical situations.

These findings have significant consequences for the construction industry in Jordan. The issue of time overruns, which frequently lead to cost overextensions, has consistently posed a significant difficulty. The research provides a comprehensive set of tools, including a predictive model, enabling the proactive identification of probable time overruns so that project managers can implement corrective measures promptly. Furthermore, the incorporation of bio-inspired optimization methods, shown by the demonstrated effectiveness of the CROA, suggests a broader scope for further investigation: methods such as the CROA can play a crucial role in enhancing forecasting accuracy and optimizing multiple aspects of project management. Upon reflection, and notwithstanding the hopeful outcomes, it is crucial to recognize the complexity introduced by the optimization process, which might result in significant computational requirements, a factor that stakeholders need to be aware of. Nevertheless, based on our research findings, the trade-off favors improved accuracy and predictive capacity. In Jordan's construction industry, integrating conventional methodologies with emerging technologies, such as deep learning and optimization algorithms, can serve as a guiding beacon, facilitating the timely and cost-efficient execution of projects.