1 Introduction

Water loss has received worldwide attention because of the continuing concern of the water industry regarding rapidly increasing population demands.

The IWA states that NRW was 48.6 billion cubic meters per year divided between residential and commercial losses [1] and that number grew to 126 billion cubic meters according to a 2019 study [2]. While there are many components of NRW, the biggest losses come from the pipelines, reaching 30% of the water pumped by the WDS [3]. About 10% of the supplied water is wasted in residential areas [4]. Therefore, it is important to detect and manage leaks as soon as possible to either repair or replace pipelines as a preventive measure [5]. The negative effects of leaks cover many aspects such as public health, in which water contamination is likely to happen through pathogen intrusion [6], the economy, in which utilities lose considerable revenue [7], dissatisfaction among the consumers from supply interruption [8], and serious environmental damage [9].

Avoiding water leaks in WDSs is challenging; most distribution systems in the USA are older than 50 years, which is over the typical design period. However, minimizing damages caused by leaks and improving the performance of the water supply system are doable [9] by detecting leaks as timely as possible and reducing NRW. Lambert categorizes water leaks into three classifications: reported, unreported, and background [10]. Reported leaks are easily repaired because consumers report water that rises to the surface of the ground. The situation differs for reported and background leaks because the utility, or a third-party company, needs to continually look for anomalies. There are situations where some small leaks are unidentified and may run unnoticed for months.

One of the most commonly used methods by water utilities is MNF [11], which compares flow rates with flow thresholds. Unusual jumps that signify leaks trigger an alarm as the flow goes beyond the threshold; flows should be minimal at night, which makes the process of identification smoother and valid. This cumulative pressure on water availability is a motive to consider the optimal management of clean water resources [12].

The IWA launched DMA to help with constant supervision of water consumption [13]. In collaboration with the AWWA, they initiated an effort to develop an effective water balance-based method under the name “The Water Audit Method.” The method helps utilities manage supply efficiently under the ongoing pressure of a growing population, urbanization, industrialization, and other global pressures [14] without having to be weighed down by financial liabilities produced by physical losses. Since The Water Audit Method launched in the latter years of the past century, many methods have been developed that consider how the technical infrastructure is constantly evolving and how utilities collect massive amounts of data from the SCADA or other telemetry systems.

Many hardware- and software-based methods have been developed to detect bursts and leaks [15]. Hardware methods depend on sensors and devices [16] and are split between acoustic detection methods (e.g., leak correlator, leak noise loggers, and listening rods) and non-acoustic detection methods (e.g., ground penetrating radar and gas injection). While hardware-based methods have proven efficient and accurate in locating leaks and bursts, they are time and labor-intensive, in addition, such manual methods lose effectiveness as the depth increases [11, 17]. On the other hand, software-based methods are faster and less expensive [18] and are divided into two categories: numerical or model-based (e.g., time-domain analysis) and non-numerical or data-driven modeling methods (e.g., artificial neural networks and support vector regression).

Much effort has been invested in constructing systems capable of detecting leaks and precisely locating them in their distribution networks through a combination of mathematical models and machine learning algorithms; these techniques and methods fall under either transient-based, model-based, or data-driven approaches. Transient-based approaches find leaks by measuring pressure signals with frequent tests [19] and can locate some leaks. Model-based approaches employ mathematical formulas and first principles modeling to analyze data and discover links between the metrics and process states [16, 20] and are successful in finding leaks. Data-driven approaches do not demand a full understanding of water networks or the deciphering of the micro-interactions that take place in normal WDSs. Data-driven modeling can deliver results based on signal processing or statistical analysis, sometimes without needing deterministic models [12, 18].

Leak detection in WDSs is challenging given the current technologies. This review presents some techniques and approaches for novelty detection so that the utility acts as quickly as possible to alleviate the adverse impact of leaks and provide uninterrupted supply to consumers. The rest of the review is outlined as follows: Sect. 2 discusses the methodology used. Section 3 goes into depth into data-driven approaches, and Sect. 4 discusses model-based methods. Section 5 assesses the performance of many discussed methods. Section 6 summarizes problems with the approaches, and Sect. 7 concludes the review.

2 Research methodology

The research methodology follows PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [21]. This systematic review reflects and considers peer-reviewed research papers from 2002 through 2021. The identification phase was completed using Web of Science and Google Scholar databases from different journals and conferences. The keywords used in this literature research are “water,” “leak detection,” and “data-driven” combined with “deep learning” and “model-based” in different combinations. The initial identification from different search engines identified 319 possibilities. Twenty-six duplicates were removed, leaving 293. We further excluded 89. Then, we were unable to retrieve 120 studies, leaving 84 for additional screening. Finally, 18 studies were eliminated for some reason, leaving 66 to be reviewed in detail.

Figure 1 shows the exact procedure followed from the identification of relevant papers to the final step of inclusion. Figure 2 shows the distribution of the papers included in this section against the publication year, and Fig. 3 shows that 71.1% are focused on data-driven approaches. The exclusion criteria were as follows:

  1. I.

    Any article that is not related to the body of the review

  2. II.

    Papers in languages other than English

Fig. 1
figure 1

Research methodology diagram based on the PRISMA framework

Fig. 2
figure 2

Number of publications included in the research methodology section

Fig. 3
figure 3

Number of data-driven approaches and model-based methods included in the research methodology section

3 Data-driven methods and techniques used for burst and leak detection

Data-driven approaches are divided into four main categories. Figure 4 illustrates a generic framework for data-driven approaches.

Fig. 4
figure 4

Basic framework of data-driven approaches

One of the earliest researches this century related to burst and leak detection was by Mounce in [22]. Their study used a neural network to detect bursts in the WDS. Since then, there has been a dramatic growth in the number of studies relating to the topic. Mounce and all used ANNs to identify leaks by building an empirical model that classifies and predicts when such an event occurs from time-series data produced by sensors; this automated computing structure acknowledged the complexity and non-linearity of the water distribution system and solved the problem by decomposing the network and constructing more than one neural network to supervise zones in the DMA [22].

In a study published in 2006, Mounce et al. tested static and time delay architectures of neural networks to classify bursts that were simulated through hydrant flushing. The results showed that time delay NNs are more accurate than static neural networks. Time delay NNs feature a memory element and can learn patterns through time. That feature is lacking in static NNs. The outperformance was solidified by the results that showed that time delay NNs identified 75% of bursts, while the static architecture could detect only 4% of the simulated events [23].

The theoretical basis of applying FLS and engaging ANNs in the form of MDNs is to draw conclusions about abnormalities from a data stream. This implementation of FLS requests flow data, and it results in an error of 10%, which is smaller than the error resulting from MNF [24]. SOMs employed a leak function to help ease detection [25]. Mounce and others presented an online application of artificial intelligence to detect leaks and other abnormal events, and they verified the ability to detect small and medium leaks with an exceptionally small number of ghosts [26].

Mounce et al. [27] used SVM, a supervised statistical learning machine learning model, to model and predict from time-series data. When comparing the SVM model to ANNs, SVMs appear to have less sensitivity to noise, making it more practical according to [28].

Romano et al. [84, 85] used artificial neural networks, statistical process control, and a Bayesian network to forecast pressure and flow values, compare the actual and expected values, and finally classify the signals as normal or leak. Two years later, Romano et al. [29] improved their methodology to approximately locate bursts by using geostatistical techniques and a multivariate Gaussian graphical model and overcame the shortcoming of the probability that accompanies the increasing number of accessible data. In that paper, they used four geostatistical techniques, OC, OK, LP, and IDW, and of the four techniques, OC provided the smallest root-mean-square errors in all the five scenarios studied.

The LKF method developed by Ye and Fanner is based on regression and Kalman filtering of flow and pressure measurements, and the results obtained show that flow data are more sensitive to bursts [30]. Researchers argue that LKF has the advantage of efficient computations that make it an effective and practical method to expose leaks [14].

Palau et al. [31] performed PCA to detect severe outliers. Pattern matching methods and auto-associative neural networks for novelty detection showed promising results in exploring enormous time-series datasets very quickly despite the limitations in pattern libraries [12]. Fourier series and CUSUM are used for leak detection. The former learns the flow dynamics, and the latter is used for detection. Eliades et al. [32] presented their solution in addition to the night flow analysis for comparison, Fourier series and CUSUM outperformed the night flow analysis and scored a TPR of 94.5%, whereas the highest TPR the NF analysis achieved is 79.5%. The combined Fourier series and CUSUM method may miss small leaks unless a proper detection threshold is chosen to weed out background noises. The method, however, does not require sensors that sample pressures frequently, which most systems do not have. It also avoids misclassifying existing small leaks when the analysis starts.

Weighted least squares with the EM algorithm is an unbiased unsupervised procedure, and it detects a leak based on the difference between the measured flow and the expected flow [33]. Bekker et al. [34] developed an investigative approach where the setup starts with producing the expected values of demand and pressure, and then compares the generated values with the measured values. The deviations indicate a burst or leak, and to continuously supervise the network, a threshold value is established based on the previous year’s data. This method was applied on three networks in three different areas, and the performance varied based on area. As the area covered by a flow meter gets larger, the size of the leak must grow to be detected. For example, the Rhine area in the Netherlands was studied in the paper, and only leaks that exceeded 150 m3/h were detected. To move past such an obstacle, investment in more flow meters to divide into smaller areas is recommended.

The probabilistic demand forecasting model distinguishes abnormalities, e.g., bursts, based on previously documented data under the assumption of being normal (no bursts). The methodology identifies a burst based on the forecasted and actual values, and when the probability of the actual demand is larger than that of the forecasted one by a probability distribution function (PDF) then a burst exists [8, 35]. The methodology worked better at night with bursts greater than 5% of the mean flow.

NKF overcame the drawback of LKF by addressing the current system’s condition. Jung and Lansey [36] compared their methodology to two statistical process control methods, i.e., CUSUM and Hotelling T2. NKF, which uses SIP and nodal group demand estimates, had a detection probability close to SPC-Hotelling T2 in identifying large bursts, but the former proved to be better when it came to small leaks and instant detection. Anomalous event detection produced by [37] is a four-step statistical methodology, which relies on the desired TPR and FPR values and the window size. This modified SPC has proven its flexibility and is used by many water utilities. Karray et al. [38] proposed a hybrid method to detect leaks in long pipes, and it showed promising results: fast detection speed, acceptable reliability, full-time availability, good sensitivity, and medium complexity.

Evolutionary polynomial regression has been proven to be an effective leak detection method with no prerequisites, such as replacing the metering devices in the DMA [39]. BNL is an ANN approach that overcame the problem of assuming mutual relationships between the factors that cause leaks and the problem of prediction and uncertainty. It used the algorithm ExSEM and achieved an accuracy of almost 85% when applied to a water system. However, the aforementioned technique cannot manage continuous data, and discretization must take place for it to operate, such limitations must be overcome to ensure a better performance for future uses [40].

BPNN, one of the most used NNs, contains at least three layers (input, output, and one hidden layer) and is integrated with five hoop strain variations that can recognize patterns and predict the leak point when choosing the right hidden layers. Its RMSE, which can be as small as 1%, proves the success of the method [41]. As opposed to the model-based genetic algorithm used by [42], the dynamic time wrapping and supervised learning algorithm proposed by [43] performs better at night with great accuracy and sensitivity and a low number of false alarms.

MLPNN was investigated by [44]. Another data-driven approach that employs two different classifiers individually, namely, LDA and neural network, finds leaks based on pressure data. This AI-based technique achieves 80% accuracy; the approach suggests that increasing the number of sensors [45], Rayaroth Sivaradje [46]) and considering the optimal positioning in the WDS improve the performance of the classifiers [45].

A hybrid method that incorporates KPCA and Cas-SVDD detected small leaks with an accuracy of 90% by using historical or real-time data depending on if it is offline modeling or online detection, then denoising and signal reconstruction based on LMD take place before announcing a verdict [47].

Research conducted by [48], which located leaks by using Kriging spatial interpolation, shows that many parameters factor into the outcome, which is the same as concluded by [45]. Bayesian reasoning is then employed to better the accuracy [48]. Like many techniques that rely on pressure data to manage leaks effectively, disturbance extraction and isolation forest techniques by [49] showed great promise for application to large WDSs by detecting leaks from noisy pressure measurements without the introduction of extra flow sensors. The performance evaluation measures are summarized in Table 1.

Table 1 Data-driven approaches

3.1 Clustering strategies

In 2016, [50] came up with a new strategy to manage uncertainty: the clustering algorithm. That algorithm eliminates the uncertainty that results from the prediction and classification stages.

There are two steps before reaching a verdict; the first is a clustering algorithm for outlier detection, and the second identifies the leak. The method is promising given the low FPR compared to other studies. Differentiating from the huge amount of historical data needed for classification and prediction methods, [51] created a method that needs only “1 day’s worth of time-series data” but can consider the variations throughout the year. They employed the temporally fluctuating correlation among flow sensors to evaluate the discrepancies between flow data coming from cosine distances to lower the false-positive rate brought on by cyclical changes in water demand, weather, and season. However, this did not result in better TPR results when compared to the clustering algorithm.

Monitoring support by Geleen et al. [52] is a proactive leakage control method that employs feature-based clustering that looks for cluster patterns to identify a leak based on pressure anomalies. This application is expanded to include oil, electricity, or gas leaks.

Lu and Sela [53] proposed a method that offers a quick and effective solution to find hidden information in WDSs by analyzing high-frequency pressure signals from scattered sensors [53].

Quiñones-Grueiro et al. [54] provided a method to manage large WDSs. They used topological clustering for network partitioning to locate leaks. CtL-SSL is a machine learning-based framework for WDS leak detection and localization. This framework lessens the data requirement, shows very effective results, and is a tool for water management in any WDS [55].

3.2 Methods implementing deep learning

Deep learning offers a framework to solve detection problems and a wider range in overcoming/issues that basic traditional techniques find challenging and has the potential to separate the variabilities more effectively [56]. Quiñones-Grueiro et al. [57] used a model-based approach to link a deep neural network, a data-driven approach, and Gaussian process regression to detect and locate leaks, a flowchart of the proposed approach is shown in Fig. 5. Wang et al. [58] succeeded in employing an LSTM neural network, which is an advanced version of the RNN structure, in leak detection because the method showed high accuracy and sensitivity.

Fig. 5
figure 5

Flowchart of the methodology combining a data-driven approach with a model-based method

Kang et al. [59] combined CNNs with support vector machines. That architecture uses a one-dimensional CNN for local feature detection and the latter classifies leaks, they proposed an algorithm to approximately localize the leaks. A novel framework developed by [60] locates the leak by narrowing down the locations to specific pipes and has proven applicability in practice. BLIFF uses pressure data in a mixed data-driven and model-based approach, a method that has not received much attention in other studies, to locate leaks. The framework proved useful because the locations were not as generalized and narrowed down to pipes. To escape the disorder of pressure signals, linear connections were used instead of the usual convolutional layers in DenseNet [61], which also eased calculations. With all the advantages that the method presents, there is a constraint, which requires pressure meters instead of flow meters.

A study by [62] employed ISLMD and CNN to detect and locate leaks. ISLMD eliminates noise from nonlinear and nonstationary signals. Their sensitivity results varied depending on whether the AlexNet or LSSVM model was used. AlexNet detected 79.15% of the leaks, while LSSVM found that 92.5% CNN was integrated with a variational autoencoder in a paper by [63], and the combination achieved great accuracy and could detect leaks as small as 0.25 L/s, which proves its success.

Liao et al. [64] used DL to identify leaks in the water pipeline system. The accuracy of their model relies on calibrated parameters (e.g., pipe roughness), and it could reach 99.7%. However, their method does not cover the grounds where several leaks occur simultaneously. Lee et al. [5] introduced an LSTM-RNN, which accounts for numeral conditions in the system and provides a summary of the network’s performance. Table 2 summarizes the performance of some of the methods mentioned in the literature. It shows the techniques used, the type of data needed for the models, and some metrics found.

Table 2 Data-driven approaches that implement DL

4 Model-based methods and techniques used for burst and leak detection

Many researchers have preferred model-based methods over the years before they fully realized their full potential or limitations. These approaches are represented by mathematical formulas. Table 3 classifies the methods under three main categories.

Table 3 Model-based methods

Pudar and Liggett et al. [65] introduced a sensitivity matrix to guide measurement locations, but they suggested the method could not replace traditional leak surveys. That was the case until research conducted by (Perez et al. [66]) took place and adopted a mathematical model. Consequently, from 2011 to 2019, much research was invested into introducing leak detection and localization methods based on pressure sensitivity analysis [67,68,69,70,71].

Perez et al. [67] focused on determining a reasonable number of sensors to obtain more accurate locations based on an optimal sensor placement methodology that gave successful results under ideal conditions, but its accuracy was adversely affected by noise. Casillas et al. [70] located leaks using pressure measurements. Their LSS method assigns a key signature to every leak and then applies a linear model to approximate the location. In a case study that employed 15 pressure sensors and five flowmeters, the success rate reached 87%, better than the other six cases studied [71]. The change that resulted from replacing pressure data with flow measurements for leak detection in pipelines was covered by [72, 73].

A series of model-based methods such as genetic algorithms, least squares, and multi-objective ant colony optimization fall under the optimization-calibration category [42, 74,75,76]. Wu et al. [42] found leak hotspots by optimizing their model before calibration. They used genetic algorithms to detect hard-to-find leaks. However, Hu et al. [20] revealed Wu’s method congregates to the local optima due to the huge solution space.

Sanz et al. [76] established a methodology that requires calibration to solve the long-term issue of the leak location. Multi-objective ACO succeeded in detecting leaks by using two stages: dividing the WDS into pressure zones and then calibrating the nodal demands to quantify and locate the leaks [74]. Many methodologies build on error-domain model falsification where a model is falsified based on the thresholds, and the remaining models describe the leak region. The falsification method proposed by [77] paved the way for work by [78,79,80,81,82].

5 Performance evaluation

Most authors assessed their methodologies using different measures. A confusion matrix is a machine learning concept [83], which represents visually the performance of an algorithm. A confusion matrix shows important measures: TPR, TNR, FPR, and FNR. Such measures are mentioned in various papers [23, 51, 59, 84, 85]. A true positive refers to the occurrence of an actual leak in the WDS, a false positive is an alarm for a leak when there is none. Other common measures for evaluation are accuracy, precision, and F-score. Figure 6 shows a confusion matrix with measures and their respective mathematical equations.

Fig. 6
figure 6

Confusion matrix

6 Results and Discussion

Data-driven approaches can be exploited to detect and locate leaks without deep knowledge of every component or relationship in the WDS. However, the approaches come with the price of obtaining high FPR. Wu et al. [8] summarized the most frequent limitations of classification, prediction-classification, and statistical methods.

The requirement of data and the absence of hydraulic data labels are two limitations in classification methods. Mounce et al. [23] used the same simulated events in training and testing the method. MLPNN by [44] is immune to noise, shows a parallel structure to the real-life WDS, and has speedy processing and sorting abilities. However, the performance evaluation rates were not explicitly mentioned as was the case with many other papers [30, 35, 48, 84, 85].

Prediction-classification methods could provide misleading results at times because of the uncertainty that results from measurement inaccuracies, absent data, or variations over the year. Many methods may not be able to identify small leaks [33, 36,37,38, 50, 51, 84,85,86,87,88,89] or be based on impractical assumptions that may not reflect reality [36, 90, 91].

Background noises often cause an error that leads to inaccuracy in conclusions [32, 88].

Many methodologies require the optimal placement of sensors or the investment in more sensors [45, 46, 84,85,86,87,88]. The presence of large leaks overshadows smaller leaks in the EPR modeling paradigm [39].

Model-based methods are widely applied and studied for leak detection but similar to data-driven approaches, there are some limitations and weak spots that must be addressed. Generally, their effectiveness is questioned because of uncertainties generated by full-scale water distribution systems. Many papers address this in the literature. Hu et al. [20] summarized the weaknesses, the hydraulic models must be well-calibrated and reality simulating for model-based approaches to be applicable, which requires a huge amount of data and investment. Well-calibrated hydraulic models are not common in water utilities.

Perez et al. [68] stated that the pressure sensitivity matrix is very applicable and provides adequate results. However, that method delivers satisfactory outcomes when the sensors are optimally placed. This methodology by (Perez et al. [68]) assumes that leaks occur at nodes and neglects the leaks that happen at the pipeline level. Consequently, the model locates leaks at the nearest node.

Casillas et al. [70] continued and said that pressure sensitivity matrices’ accuracy is affected by noise inference in the system. The genetic algorithm used by [42] operates well when working on large WDSs in the mornings, but it finds the local optima. Multi-objective ACO can be used with either pressure or demand, but the performance relies on the number of pressure measurements [74]. Error-domain falsification models show potential and have been tested on a real full-scale WDS [82], but the method of [77] cannot detect small leaks.

7 Conclusions and future research

Water loss has received worldwide attention over the years. Loss from leaks could be controllable and should be minimized. In this review, many methodologies are discussed under two distinct categories: data-driven and model-based approaches. Classification, prediction-classification, statistical methods, and clustering algorithms fall under the former while sensitivity matrices, optimization-calibration, and error-domain falsification-based methods fall under the latter. There is an overlap between the two categories with methods that mix the two [16, 20, 46, 89,90,91]. The performance of the proposed methodologies is summarized as well as their weaknesses. Many studies provided effective or promising detection techniques, and many papers provided approaches to locate leaks. Future work should focus on minimizing false-positive rates and maximizing true-positive rates in methodologies and techniques that water utilities can easily access. Some techniques can be used in other fields such as oil or electricity. Also, some techniques with high detection rates should be tested in real-life WDSs and compared with the theoretical results.