1 Introduction

Many existing reinforced concrete (RC) structures were built to older design guidelines, in which the design demand on the structure was evaluated mostly from gravity loads. Such structures are highly vulnerable to deterioration caused by accidental damage, earthquakes, poor maintenance, and corrosion. Retrofitting is therefore essential to avoid demolition and the disruption of daily services. About 28% of the local road bridges and 25% of the rural highway bridges in Canada were built more than 50 years ago, and approximately 15% of these bridges were found to be in the worst condition [19]. Fiber reinforced polymer (FRP) has gained popularity because of its favorable properties: low weight, high strength, durability, corrosion resistance, and ease of installation [10]. FRP is now used for the shear strengthening of structures and is being studied for further improvements in serviceability.

Compared to rectangular beams, T-beams have higher resistance to shear cracks [16]. As the main objective of beam shear strengthening is to bridge the cracks, FRPs are one of the most effective strengthening solutions. The typical wrapping schemes used in externally bonded FRP retrofitted beams are U-wrap, closed wrap, and side bonded wrap. Although regarded as the best option, the closed wrap is mostly avoided because it is difficult to wrap the FRP laminates around the flange in T-beams. A reliable design method is necessary to increase the longevity of the retrofitted beams and to make full use of the properties of FRP laminates. Several experimental studies in the literature focus on the shear strengthening of RC beams using externally bonded FRP layers [7, 8, 11]. Design codes and guidelines, namely ACI 440.2R-17 [1], CSA S6:19 [4], and CSA S806 [3], are widely followed in designing such retrofitting systems. However, the precision obtained from the design guidelines is inadequate, and the design equations rely heavily on several parameters whose calculation is often tedious. Furthermore, a number of studies developed semi-empirical equations for determining the shear strength of externally bonded FRP retrofitted beams [2, 6, 15]. Various prediction models have also been developed to identify the shear capacity of the beams [14, 20, 21]. These prediction models, however, lack accuracy when applied to inputs outside the range of the data used to develop them. A more comprehensive database is thus required to develop a high-accuracy prediction model for the shear capacity of RC T-beams strengthened with FRP.

The application of artificial intelligence (AI) in structural engineering has allowed the community to obtain reliable prediction models. Machine learning (ML) algorithms are able to estimate the shear capacity of structural components with satisfactory accuracy [5]. The ensemble learning models, namely random forest (RF), XGBoost (XGB), CatBoost (CB), and AdaBoost (AB), are found to be excellent tools for providing estimates with high precision. The study in [17] examined the shear capacity estimation of steel fiber reinforced concrete beams using ML models and identified that XGB gave the highest accuracy. This paper aims to develop an interpretable ML model using the ensemble learning models (RF, XGB, CB, and AB) to estimate the shear strength of externally bonded FRP retrofitted RC T-beams. To evaluate the accuracy, the results are compared with the equations provided by design guidelines as well as with empirical equations from past studies. The study is unique in that it covers the largest database of T-beams, which can be used to increase the accuracy of the prediction model.

2 Database Collection

A total of 302 data points are collected for RC T-beams from experimental studies conducted between 1997 and 2021. The data include details of the cross-sectional dimensions: width (b) and effective depth (d), shear span to effective depth ratio (a/d), transverse steel ratio (Asv), concrete compressive strength (\(f^{\prime}_c\)), type of fiber, total thickness of FRP (n*tf), width of FRP strips (Wf), elastic modulus of FRP (Ef), ultimate strain of FRP (εfrp,u), tensile strength of FRP (ffrp,u), and shear capacity contribution by FRP (Vf), along with the total experimental shear capacity (Vexp). The types of wrapping included in the database are U-wrap (UW), side bonded (SB), and closed wrap (CW). The types of fibers in the database are carbon (CFRP), aramid (AFRP), basalt (BFRP), and glass (GFRP). Figure 1 shows the schematic representation of an externally bonded FRP strengthened RC T-beam.

Fig. 1 (a) Typical T-beam under study, shown with reinforcement and sections A-A and B-B, and (b) orientation of FRP laminates at an angle β to the horizontal

The statistical properties of the database are summarized in Table 1. The distribution of the database collected from literature in terms of the type of fiber and wrapping scheme is presented in Fig. 2. From Fig. 2, it can be seen that CFRP and UW are the most common type of fiber and wrapping scheme, respectively. The shear contribution by FRP is calculated by deducting the shear strength of RC T-beam without FRP (control specimen) from the total shear strength of the FRP retrofitted T-beams.

Table 1 Statistical measures of input parameters

Fig. 2 Distribution of data with respect to (a) type of fiber (CFRP 85%, GFRP 14%, AFRP 0.66%, BFRP 0.33%) and (b) wrapping scheme (UW 89%, SB 10.32%, CW 0.66%)

3 Selection of Input Features

Selecting proper input features is essential for satisfactory model performance. The input parameters chosen for ML model development in this study are based on the guidelines CSA S6:19 [4] and ACI 440.2R [1] as well as on empirical equations developed in earlier studies [2, 6]. The parameters considered as input are the width of the beam (b); effective depth of the beam (d); shear span to effective depth ratio (a/d); height of FRP strips (hf); modulus of elasticity of the FRP material (Ef); ultimate strain of FRP (εfrp,u); ratio of transverse reinforcement (Asv); concrete compressive strength (\(f^{\prime}_c\)); total thickness of FRP layers (n*tf); ultimate strength of FRP (ffrp,u); type of wrapping scheme; and type of fiber. The variation in the total experimental shear capacity of T-beam specimens retrofitted with externally bonded FRP laminates (Vexp) with respect to the a/d ratio, \(f^{\prime}_c\), n*tf, and ffrp,u is illustrated in Fig. 3. Figure 3 also includes the variation in the experimental shear contribution by FRP (Vfexp) with respect to Asv.

Fig. 3 Variation in shear capacity with respect to selected material parameters (five panels, a to e)

From Fig. 3, it is seen that the total shear strength increased as the a/d ratio increased up to 2.3 and then decreased as the ratio increased further. At an a/d ratio less than 2.0, the angle between the principal direction of the fibers and the critical shear crack is typically observed to be larger, which reduces the tensile stress in the fibers and consequently lowers the shear strength of the specimen [13]. The bond between FRP and concrete is vital to the shear strengthening mechanism. It is observed in Fig. 3b that as \(f^{\prime}_c\) increased, the shear capacity of the beams increased as well. On the other hand, as Asv increased, Vfexp decreased. According to [12], the reason for such behavior could be that the transverse steel stirrups and the FRP share the shear force. When there is less transverse steel, FRP contributes most of the shear resistance in the beams [9]. In Fig. 3d, it is observed that the shear capacity increased with the total thickness of FRP up to a thickness of 1.4 mm and remained constant beyond it. No particular trend is observed in the variation of Vexp with respect to ffrp,u (Fig. 3e), although the beams are found to have the highest capacity when ffrp,u is between 3000 and 4300 MPa.

4 Ensemble Machine Learning Model Overview and SHAP Feature

This section provides a brief introduction to the models generated in this paper, including a description of SHapley Additive exPlanations (SHAP) for identifying feature importance. The k-fold cross-validation technique is applied to all the models analyzed in order to improve model performance; the details are described in the next section.

The RF is a supervised ensemble learning algorithm that combines multiple decision trees to learn the mapping between input and output. A decision tree is a supervised learning algorithm that uses a flowchart-like tree structure to predict target variables from the training data. In RF, each decision tree is trained on a random selection of the training dataset and a random subset of features to avoid overfitting in the individual trees. Eq. 1 shows how the input variable (x) is mapped to the output, where \(y_n\) denotes the prediction of the \(n\)th individual decision tree, \(X^{\prime}\) the random selection of features, and \(N\) the total number of decision trees.

$$\widehat{y} = \frac{1}{N}\sum_{n = 1}^{N} y_n \left( X^{\prime} \right)$$
(1)
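The averaging in Eq. 1 can be illustrated with a minimal sketch. It is not the model used in this study: each "tree" is reduced to a single-split stump, the feature subset has size one, and the function names and the numbers in the usage note are hypothetical.

```python
import random
from statistics import mean

def fit_stump(X, y, feature):
    """Fit a one-split regression tree (stump) on a single feature."""
    values = sorted({row[feature] for row in X})
    if len(values) == 1:
        # The sampled feature is constant in this bootstrap sample:
        # fall back to predicting the sample mean on both sides.
        m = mean(y)
        return (feature, values[0], m, m)
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for row, t in zip(X, y) if row[feature] <= threshold]
        right = [t for row, t in zip(X, y) if row[feature] > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, (feature, threshold, left_mean, right_mean))
    return best[1]

def predict_stump(stump, row):
    feature, threshold, left_mean, right_mean = stump
    return left_mean if row[feature] <= threshold else right_mean

def fit_forest(X, y, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]  # bootstrap
        feature = rng.randrange(n_features)  # random feature subset of size 1
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Eq. 1: the forest output is the average of the individual tree outputs."""
    return sum(predict_stump(stump, row) for stump in forest) / len(forest)
```

As a usage example with made-up values, `fit_forest([[150, 300], [200, 350], [250, 400]], [180.0, 240.0, 310.0])` returns a list of stumps, and `predict_forest(forest, [220, 380])` returns their mean prediction.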

XGB is a gradient boosting algorithm that also builds on decision trees. Gradient boosting minimizes the model error with a gradient descent procedure, in which each new tree is fitted to reduce the error of the current ensemble. XGB is an optimized implementation of this technique that achieves superior performance on diverse datasets. The CB algorithm, whose name combines “Category” and “Boosting,” applies gradient boosting to oblivious trees. In an oblivious tree, all nodes at the same level use the same splitting condition. CB yields state-of-the-art performance without extensive hyperparameter tuning or training. The AB algorithm uses adaptive boosting, in which weak learners are combined to improve the overall model performance.
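The boosting idea described above can be sketched as follows for a squared-error loss. This is a plain gradient boosting loop on one input variable with stumps as weak learners; it is not XGBoost, CatBoost, or AdaBoost themselves, which add regularization, oblivious trees, and sample reweighting, respectively. The function names and default settings are illustrative assumptions.

```python
from statistics import mean

def fit_stump_1d(x, r):
    """Least-squares single-threshold split of x for targets r.

    Assumes x contains at least two distinct values.
    """
    values = sorted(set(x))
    best = None
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [ri for xi, ri in zip(x, r) if xi <= thr]
        right = [ri for xi, ri in zip(x, r) if xi > thr]
        lm, rm = mean(left), mean(right)
        sse = (sum((ri - lm) ** 2 for ri in left)
               + sum((ri - rm) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]

def boost(x, y, n_rounds=50, learning_rate=0.3):
    """Fit stumps sequentially to the residuals of the current ensemble."""
    base = mean(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        thr, lm, rm = fit_stump_1d(x, residuals)
        stumps.append((thr, lm, rm))
        pred = [pi + learning_rate * (lm if xi <= thr else rm)
                for xi, pi in zip(x, pred)]
    return base, learning_rate, stumps

def predict(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(lm if xi <= thr else rm
                                      for thr, lm, rm in stumps)
```

Each round moves the prediction a fraction (`learning_rate`) of the way toward the residuals, so the training error cannot increase from one round to the next.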

SHAP is used to explain a prediction by computing the contribution of each feature to the predicted target. The concept is based on the game-theoretic Shapley values, where each feature of an instance is considered a “player” and the output prediction the “payout.” The technique indicates how the payout is distributed among the features. The summary plot function (summary_plot) from SHAP has been used in this study to explain the feature importance of the best prediction model.
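To make the “player” and “payout” analogy concrete, the sketch below computes exact Shapley values for a small model by averaging marginal contributions over all feature orderings. It assumes that an absent feature is represented by a baseline value; the SHAP library uses more efficient estimators, and `toy_model` with its coefficients is entirely hypothetical.

```python
from itertools import permutations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline input."""
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)      # start with every feature "absent"
        previous = model(current)
        for j in order:
            current[j] = x[j]         # feature j joins the coalition
            value = model(current)
            phi[j] += value - previous  # marginal contribution of feature j
            previous = value
    n_orders = factorial(n)
    return [p / n_orders for p in phi]

def toy_model(features):
    """Hypothetical capacity model of (hf, a/d, n*tf); not from the paper."""
    hf, a_d, tf = features
    return 0.5 * hf - 40.0 * a_d + 30.0 * tf + 0.1 * hf * tf
```

The values returned by `shapley_values(toy_model, [400.0, 2.0, 1.0], [300.0, 3.0, 0.5])` sum to the difference between the prediction for the instance and the prediction for the baseline, which is the sense in which the payout is distributed among the features. The cost grows factorially with the number of features, so this form is only practical for very small examples.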

5 Results and Discussion

The dataset is divided into training and testing sets, with a randomly chosen 80% used as the training set to obtain the initial model hyperparameters. The remaining 20% is used as the testing data for model performance evaluation. A tenfold cross-validation technique is applied, in which the dataset is divided equally into 10 subsets, with one subset used for validation and the remaining nine for model training. The procedure is repeated 10 times so that each subset is used once as validation data, and the average is taken as the final result.
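The fold construction described above can be sketched as follows. The shuffling seed and the function name are assumptions, and the sample count of 302 in the usage note is taken from the database size only for illustration; the source does not state whether the folds are drawn from the full dataset or from the training set.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation
```

For example, `list(k_fold_indices(302, k=10))` gives ten splits in which every sample appears in exactly one validation fold; a model would be trained and scored once per split and the ten scores averaged.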

5.1 Cross-Validation and Hyperparameter Tuning

The cross-validation accuracy results are shown in Fig. 4, where the box represents the interquartile range and the straight line in the middle of the box represents the median. The whiskers extend to the most extreme values within 1.5 times the interquartile range below the first quartile and above the third quartile, respectively. The median cross-validation accuracy of the ML models developed in this study ranged from 75% (AB) to 86% (RF). The median accuracy of CB (85.5%) is close to that of RF. Moreover, the interquartile range of RF is the smallest of the four models, showing the least variation among the folds. The interquartile ranges of AB and XGB are similar, with the lowest accuracy shown by AB. Therefore, the best performing models developed in this study are RF and CB.

Fig. 4 Cross-validation results (box plots of cross-validation accuracy for the four models)

Table 2 summarizes the optimized model hyperparameters used in this study. Table 3 presents the coefficient of determination (\(R^2\)), root mean square error (RMSE), and mean absolute error (MAE) values obtained from the models developed. The ensemble models RF and CB outperformed XGB and AB, as evident from their highest \(R^2\) values (0.897 and 0.899, respectively) and relatively lower MAE values (0.128 kN and 0.127 kN, respectively). Table 4 presents the equations used to calculate the statistical measures used in monitoring the model performance.

Table 2 List of optimized hyperparameters
Table 3 Performance measures of the developed ML models
Table 4 Formula to calculate the model performance measures
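The three performance measures named above can be computed as in the sketch below. The formulas are the standard definitions of the coefficient of determination, RMSE, and MAE, and are assumed to match those listed in Table 4, which is not reproduced here.

```python
from math import sqrt

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

RMSE and MAE carry the units of the target variable, whereas the coefficient of determination is dimensionless.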

5.2 Interpreting Model Results Using SHAP

The effect of the input features on the CB model for the FRP wrapped T-beams is presented in Fig. 5. Features are ranked by importance along the y-axis, where a larger SHAP value magnitude indicates a more important feature. Each point plotted horizontally along a feature represents one specimen, and its color indicates whether the feature value is high (red) or low (blue). It can be seen that the height of the FRP layer (hf) plays the most important part in predicting the shear capacity of the T-beams, and the associated SHAP value increases with hf. The a/d ratio is at the mid-rank in importance; its SHAP values are higher at low a/d values, implying that the a/d ratio has a greater impact on the shear strength prediction when it is low. The least important parameter is observed to be the type of fiber.

Fig. 5 Feature importance explanation using SHAP

6 Comparison with Design Code and Empirical Equations

In order to identify the accuracy of the design guidelines and empirical equations proposed in literature, the results by these equations are compared with the experimental data collected in this study. One of the common approaches in calculating the total shear capacity (Vtotal) of FRP retrofitted RC beams is the summation of shear contribution by transverse steel (Vs), concrete (Vc), and FRP (Vf) as shown in Eq. 2.

$$V_{\text{total}} = V_c + V_s + V_f$$
(2)

Deducting the shear contributions of the transverse stirrups and concrete to obtain the FRP contribution does not provide accurate information, as identified by Rousakis et al. [18]. Following this observation, the total shear capacity of externally bonded FRP strengthened RC T-beams is used for comparison purposes in this study. A total of three design guidelines and three empirical equations are used to calculate the shear contribution by FRP in the T-beams. The shear crack inclination is an important factor in shear calculation, but determining the angle of inclination (\(\theta\)) is difficult. The ACI 440.2R [1] guideline takes \(\theta\) as 45°, whereas CSA S6:19 [4] suggests 42°. This paper considers a 45° angle of shear crack inclination in the subsequent calculations.

To assess the guidelines and equations chosen for the comparison, the experimental shear capacity is plotted against the shear capacity predicted by each equation, together with a 45° line that identifies the conservativeness of the formulations (Fig. 6). Points below the line indicate a safe (conservative) prediction, whereas points above the line represent an unsafe (unconservative) prediction. In Fig. 6, CSA S6:19 shows the most conservative predictions, with approximately 30% of the data above the line. Among the empirical equations, D’Antino and Triantafillou [6] gave the fewest unsafe predictions. Most of the data points lie above the 45° line for the Mofidi and Chaallal [15] equation, which implies poor performance. Except for CSA S6:19, the trend of the results in Fig. 6 diverges significantly from the diagonal line. The best models developed in this paper are also included in the comparison, and their data points lie close to the 1:1 diagonal line for both the best and second-best performing models in terms of the tenfold cross-validation results (CB and RF, respectively).
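The share of unsafe predictions read from such a plot can also be computed directly. The sketch below assumes that a prediction is unsafe when the predicted capacity exceeds the experimental capacity, which is the usual meaning of an unconservative estimate; the function name is hypothetical.

```python
def unsafe_fraction(v_exp, v_pred):
    """Fraction of specimens whose predicted capacity exceeds the measured one."""
    if len(v_exp) != len(v_pred):
        raise ValueError("v_exp and v_pred must have the same length")
    unsafe = sum(1 for e, p in zip(v_exp, v_pred) if p > e)
    return unsafe / len(v_exp)
```

With made-up capacities in kN, `unsafe_fraction([200, 300, 400, 500], [180, 330, 390, 520])` counts the second and fourth specimens as unsafe.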

Fig. 6 Comparison of the design code and empirical equations with the experimental results (eight scatter plots)

The distribution of the predicted to experimental ratio with respect to a/d for all the equations chosen for this study is illustrated in Fig. 7. CSA S806 [3] gives fairly conservative estimates with the lowest standard deviation (SD). Among the empirical equations, the worst prediction is observed from Mofidi and Chaallal [15], where the mean is 1.372 and the SD is 0.87, showing high dispersion in the data. Highly scattered data points are prevalent in all the formulas adopted for comparison up to an a/d ratio of 3.40. The CB and RF models outperformed the rest of the methods chosen for comparison, with a mean predicted to experimental ratio of 1.00 and relatively low SD values of 0.13 and 0.12, respectively.
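The mean and SD of the predicted to experimental ratio can be computed as in the sketch below. It uses the sample standard deviation; whether the reported SD values are sample or population statistics is not stated, so this choice is an assumption.

```python
from statistics import mean, stdev

def ratio_stats(v_exp, v_pred):
    """Mean and sample SD of the predicted-to-experimental capacity ratio."""
    ratios = [p / e for e, p in zip(v_exp, v_pred)]
    return mean(ratios), stdev(ratios)
```

A mean close to 1.00 indicates that the method is unbiased on average, and a small SD indicates low scatter across specimens. For the made-up values `ratio_stats([100, 200, 400], [110, 200, 360])`, the ratios are 1.1, 1.0, and 0.9.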

Fig. 7 Variation in predicted to experimental ratio with respect to a/d (eight plots)

It is evident from the above observations that the models developed in the current study are superior in prediction accuracy and can be applied to a wide range of data. It can also be noted that the equations chosen for comparison do not show a consistent trend across the performance measures analyzed in this study. For instance, Fig. 6 shows that CSA S6:19 [4] performs best in terms of the fewest data points in the unsafe zone. On the other hand, the results in Fig. 7 imply that CSA S806 [3] performs best, with low SD values and a mean closest to 1.00. The CB and RF models, identified as the best and second-best models in this paper, are the only ones that show consistent performance in all the measures analyzed. Therefore, the ensemble models CB and RF outperformed XGB and AB in estimating the shear capacity of FRP-strengthened T-beams.

7 Conclusions

This study is based on an extensive database on the shear strengthening of reinforced concrete T-beams with externally bonded FRP layers. The database is used to develop four ensemble learning models and to identify the best models, yielding an estimation method with higher accuracy and lower computational effort that can be applied easily in practice. The effect of concrete and FRP properties and beam cross-section details on the shear strength of the specimens is studied. SHapley Additive exPlanations (SHAP) is used to interpret the importance of the input features of the models. Finally, the experimental data are compared with predictions from formulations widely used in research and practical design to verify the accuracy of the guidelines. The following conclusions can be drawn from the study:

  1. The shear capacity increased with the a/d ratio up to a point, beyond which it decreased with increasing a/d values. The highest shear contribution by FRP is observed in specimens with no transverse reinforcement. Moreover, no particular trend is observed in shear capacity due to changes in FRP tensile strength.

  2. From the tenfold cross-validation, the coefficients of determination obtained from the RF and CB models were close to 1.00 (0.897 and 0.899, respectively), and the mean absolute error was less than 0.25 kN.

  3. The most important feature as explained by SHAP is the height of the FRP layers, and the least important feature is the type of fiber. It was also noted that a lower a/d ratio has a greater impact on the prediction of shear strength.

  4. The design guidelines and empirical equations do not perform satisfactorily when applied to data outside the range considered in developing the corresponding equations.

  5. The predictions obtained from CB and RF have low scatter and cluster at the 1:1 line when plotted against the experimental capacity.

  6. The mean predicted to experimental ratio from the CB and RF models is very close to 1.00, with standard deviations of only 0.12 and 0.13, respectively.

The results summarized above indicate that the CB and RF models perform with satisfactory accuracy in estimating the shear strength of externally bonded FRP retrofitted RC T-beams. The models developed can be implemented in design and strengthening solutions in practical field applications.