1 Introduction

This paper is motivated by the need to better understand the temporal and spatial structure of anomalous traffic in a nationwide internet network. Characterizing the stochastic structure of the arrivals of Internet traffic anomalies, such as Distributed Denial of Service attacks or link failures, have various practical applications. One is to facilitate the design of network simulators, which are used to validate computer networks before deployment. Another application is to predict the arrival of future traffic anomalies and to plan the provisioning of resources. There has been extensive research on anomaly detection. A Google search produces hundreds of papers on anomaly detection in internet or local networks. Chandola et al. (2009) provide a comprehensive survey of anomaly detection methods in various applications. Tsai et al. (2009) review 55 studies on intrusion detection in internet networks. Bhuya et al. (2014) comprehensively survey general network anomaly detection methods, systems, and tools, in terms of the underlying computational techniques, while Liao et al. (2013) summarize the network intrusion detection with respect to different network scenarios, from the perspective of system deployments, timeliness requirements, data sources, and detection strategies. The anomaly detection techniques and systems in specific network scenarios, e.g., wireless sensor networks, Xie et al. (2011), and internet of things, Zarpelao et al. (2017), have been thoroughly reviewed with respect to the distinct characteristics of their network anomalies and detection requirements. Relevant to our research is the work of Paschalidis and Smaragdakis (2009) who consider a spatio-temporal framework for anomaly detection and of Kallitsis et al. (2016) who describe a hardware–software framework for attack detection that operates on live internet traffic.

In contrast to a great deal of attention devoted to anomaly detection, there is practically no work focusing on quantitative description of the propagation of anomalies through the network. In fact, the fundamental quantitative understanding of network anomaly via statistical and probabilistic analysis could significantly enhance the network intrusion performance and shed light on network intrusion detection system design. This low level of understanding of the stochastic structure of anomalous traffic must also be contrasted with a profound understanding of the structure of regular traffic over the internet and its subnetworks. The groundbreaking work of Leland et al. (1994) pointed out to the self-similar nature of such traffic, many elaborations are presented in Park and Willinger (2000). Most models for regular traffic over relatively short time interval postulate a fractal or multi–fractal structure with normal marginal distributions. More recent references and a comprehensive network-wide predictive model are given in e.g. Vaughan et al. (2013). We show that in contrast to the self-similar, hence strongly dependent, Gaussian time series models used to describe regular traffic, important aspects of anomalous traffic can be well described by practically independent, but highly non-Gaussian models. We use the framework of heavy-tailed random variables which have been used in many fields of science and economics, see e.g. Adler et al. (1998), Resnick (2007), and Peng and Qi (2017).

Our broad objective in this paper is to contribute to a better understanding of the stochastic structure of anomalous traffic by focusing on a perhaps its most important aspect: the distribution of the arrival times of anomalies. We deploy a set of statistical tools to understand the stochastic structure of the arrival time of Internet2 anomalies over a period of 50 weeks. These tool can be used to study similar problems in other networks. Another question we explore is the distribution of the waiting time for the arrival of the next anomaly. This waiting time is stochastically longer than the time between the arrivals. This is known as the “inspection” or “length-biased sampling” paradox. Intuitively, if t is an arbitrary time instant, it is more likely to fall into a long period between two arrivals than a short period between two arrivals. So the waiting time, starting from time t, until the next arrival can be expected to be longer than the expected time between any two arrivals. We provide a quantitative description of this phenomenon in the context of internet anomaly arrivals.

For the purpose of designing efficient networks, the knowledge of stochastic characteristics of the waiting time may be more important than statistical summaries related to interarrival times. To efficiently design devices or protocols that operate at link exit points, it is useful to know probabilities of an anomaly arriving at any specified future time interval. As explained above, such information is not directly available from physical measurements, as the interarrival times are stochastically shorter than the waiting times for the arrival of the next anomaly. To gain insight about the latter, probabilistic modeling is required.

The statistical methods presented in this paper are different from the usual approaches based on exponentially distributed, memoryless interarrival times, which correspond to Poisson process arrivals. We will demonstrate that the interarrival times are heavy-tailed, i.e. their tail probabilities decay like a power function rather than like an exponential function. This means that the tools based on classic renewal theory must be employed with care, as we attempt to do. We will estimate and examine the characteristics of waiting time distribution and perform hypothesis tests to check if the waiting time distributions are equal among network links and whether waiting time distributions are indeed different from the interarrival time distributions. We will see that they are different, and will explain that this means that the Poisson model cannot be used to describe the anomaly arrival times.

The remainder of the paper is organized as follows. In Sect. 2, we introduce the database of Internet2 anomalies we study. Section 3 is dedicated to exploration of statistical properties of the anomaly interarrival times. Building on the insights obtained in Sect. 3, we estimate in Sect. 4 the distribution of the waiting time for the arrival of the next anomaly. We do it using the tools of renewal theory, whose relevant results are also explained in Sect. 4. The main contributions and finding of the paper are summarized in Sect. 5. The paper contains two appendices which elaborate on the statistical analyses presented in Sects. 3 and 4.

2 Database of anomalies

Fig. 1
figure 1

A map showing 14 two-directional links of the Internet2 network. Source: www.Internet2.edu

We use the database of anomalies constructed by Bandara et al. (2014) who applied a simple Fourier transform filter to extract time periods of unusually high traffic. Bandara et al. (2014) used traffic measured at the links of the Internet2 network shown in Fig. 1 over the period of 50 weeks starting October 16, 2005. Their approach treats periodic and noise components of the measured traffic as normal. To extract the anomalies, the 20 largest Fourier components that capture about \(80\%\) of the energy and represent the periodic component, are removed from the time-series. Then a threshold, between 2 and 3 times the standard deviation of the detrended time-series, is applied. The deviations of the detrended data beyond this threshold are considered anomalous. A group of consecutive anomalous impulses is treated as a single anomaly. The application of their algorithm produces, among other characteristics, a database of anomalies in each link, each described by its start and end times (in 5 min. resolution). In this paper, we will work with interarrival times defined as the time difference between the starts of two consecutive anomalies. We will use 5 min. as the unit lag in the analysis of all time series we consider. In other words, the time separation between generic data points \(x_t\) and \(x_{t+1}\) is 5 min. An interarrival time is computed as the time difference between the starts of two consecutive anomalies. A typical interarrival time is about 100 lags, i.e. 500 min, but as we shall see in Sect. 3, their lengths vary considerably.

Following Bandara et al. (2014), we use the following four letter abbreviations: Atlanta (atla), Chicago (chin), Denver (dnvr), Houston (hstn), Indianapolis (ipls), Kansas City (kscy), Los Angeles (losa), New York (nycm), Sunnyvale (snva), Seattle (sttl) and Washington D.C. (wash).

3 Statistical properties of anomaly interarrival times

Denote by \(S_1, S_2, S_3, \ldots \) the anomaly arrival times in any of the 28 unidirectional links of the Internet2 network shown in Fig.  1. The interarrival times are defined as \(X_i = S_i - S_{i - 1}, \ i \ge 2\). Table 2 shows quantiles of the distribution of these interarrival times, the numbers in parentheses. Figure 2 shows the histograms for selected four links. The visual inspection of the histograms shows that the probability tails decay slower than might be suggested by an exponential distribution; after a peak around zero, the histograms remain fairly flat up to \(x=300\). Simple, rough calculations also show that an exponential decay is not a reasonable assumption, as we now explain. Consider the atla-hstn link in Table 2. The median interarrival time is \(m=53\) and the 95-th percentile is \(q=1056\). Under the exponential model, \(e^{-\lambda m} =0.5\), which gives \(\lambda =0.0131\). Then q must satisfy \(e^{-\lambda q}= 0.05\), which gives \(q=229\), a value about five time smaller than the observed value of 1056. Similar findings hold for other links.

Fig. 2
figure 2

Histograms of the interarrival times for two ranges with the density of the mixture model considered in Appendix A superimposed: left chin-ipls; right dnvr-snva

The considerations presented above suggest that the interarrival times might have power-law, or heavy, tails defined by the condition

$$\begin{aligned} P(X_i > x) \sim c x^{-\alpha }, \ \ \ \mathrm{as} \ x \rightarrow \infty , \end{aligned}$$
(3.1)

rather than the exponential tails defined by \(P(X_i > x) \sim e^{-\lambda x}\). The most common approach to detect heavy tails is to examine a Hill plot, see e.g. Chapter 4 of Resnick (2007). Examples of Hill plots for the interarrival times are shown in Fig. 3. The interpretation of these plots is as follows. If for small values of the “Order Statistics” index the plot levels off, this indicates that relation (3.1) holds. This is what we have observed in the plots for all links. The Hill plot can also be used to estimate the tail index \(\alpha \) in (3.1). The value of \(\alpha \) is found as the ordinate (alpha) corresponding to the level–off range. Examination of the Hill plots of the interarrival times reveals that it is reasonable to assume that relation (3.1) holds with \(\alpha \) generally in the range \(1.5< \alpha < 2\). It is easy to check, using the relation \(EY = \int _0^\infty P(Y> y ) dy\), that if relation (3.1) holds, then for arbitrarily small \(\delta >0\), \(E X_i^{\alpha + \delta } = \infty \) and \(E X_i^{\alpha - \delta } < \infty \). Thus, the Hill plots suggest that the expected interarrival times are generally finite, because \(\alpha > 1\), but their variance may be infinity because \(\alpha < 2\). The latter observation may impact the application of the standard statistical techniques when working with the anomaly interarrival times.

Fig. 3
figure 3

Hill plots for the interarrival times: left chin-ipls, right dnvr-snva

There are several ways of selecting the optimal (in various senses) value of the “Order Statistic” or “Threshold”, which should be used to estimate \(\alpha \). A method that is often used was introduced by Hall (1990). It employs a bootstrap procedure to choose a threshold that minimizes the asymptotic mean square error. This procedure is implemented by the function hall in the R package tea. An example of a Hill plot with the “Order Statistic” k selected by this method is shown in the left panel of Fig. 4. Table 1 shows the estimated \(\alpha \)s. In addition to the Hill estimator, we included three other estimators recommended by Resnick (1997). The values of Hill estimator and the QQ estimator are similar and indicate that the values of alpha are broadly close to \(\alpha = 1.5\). The smooth Hill estimator produces smaller values and the moment estimator larger values. All these estimates however support the conjecture that the expected value exists, but the variance may not exist.

Fig. 4
figure 4

Hill plot (left) and QQ plot (right) for link 5, chin-nycm

Table 1 Estimated tail indexes \(\alpha \) of the anomaly interarrival times estimated using the data-driven threshold of Hall (1990)

We have also explored the validity of model (3.1) by another set of plots, which complement the Hill plots. An example of such a plot is shown in the right panel of Fig. 4, for the link from Chicago to New York (link 5). This is the QQ plot of the log transformed \(X_i\) matched against exponential quantiles beyond the exceedance threshold corresponding to the optimal k. We should get approximately a line whose slope is \(1/\alpha \) if the data had a Pareto tail with index \(\alpha \), see Section 4.6.4 of Resnick (2007). The QQ plot looks linear with the fit of a straight line whose slope is 1 / 1.53, which tells us that it is reasonable to assume a Pareto tail (relation (3.1)) with index 1.53. The QQ plots for other links, with the estimates taken from Table 1, also look linear.

Having established that model (3.1) with \(1< \alpha < 2.5\) is reasonable, we turn to the examination of dependence between the \(X_i\). We will demonstrate the it is reasonable to assume that, for every link, the \(X_i\) are independent and identically distributed (iid) random variables. Such assertions are often checked by the examination of the sample autocorrelation function, the ACF. We will follow this approach, but first we emphasize two important points, see e.g. Chapter 1 of Shumway and Stoffer (2017):

  1. 1.

    If the autocorrelations are not significant, this indicates that the observations are uncorrelated rather than independent.

  2. 2.

    The significance bands in the ACF plots assume that the fourth moments exists, \(EX_i^4 < \infty \).

As explained above, for the anomaly interarrival times, it is even questionable that the second moment exists. To overcome these two difficulties, we will examine ACF plots for transformed \(X_i\), i.e. for \(g(X_i)\). The rationale is as follows. If the \(X_i\) are independent, then for any g, the \(g(X_i)\) are independent as well, and hence uncorrelated. So if we do not see significant autocorrelations for several functions g, it is reasonable to assume that the \(X_i\) are independent; otherwise significant autocorrelations would show up for some function g. Second, we can chose functions g such that \(E g^4(X_i) < \infty \), so the existence of the fourth moment is not a problem. We recall that if the \(X_i\) are independent, we expect \(95\%\) of the ACF values for the \(g(X_i)\) to be within the \(\pm 2n^{-1/2}\) bands.

Fig. 5
figure 5

ACF plots (zero lag excluded) for selected transformations of the interarrival times in the chin-ipls link: left \(\ln (X_i)\), right \(X_i^{-3}\)

We consider the following transformations:

$$\begin{aligned} g(x) = \ln (x), \ \ \ g(x) = x^{-p}, \ p=1, 2, 3. \end{aligned}$$

Since \(X_i \ge 1\), the above transformations are well defined. Direct verifications show that (3.1) implies that \(P(\ln X_i > t) \sim c e^{-\alpha t}\), so all moments of \(\ln X_i\) are finite. For \(p > 0\), the random variables \(X_i^{-p}\) values in the interval (0, 1], so to ensure the existence of the fourth moment, we must examine their behavior in neighborhoods of 0. By (3.1),

$$\begin{aligned} P(X_i^{-p} < x) = P(X_i > x^{-1/p}) \sim c x^{\alpha /p}, \ \ \ \mathrm{as} \ x\rightarrow 0. \end{aligned}$$

Therefore, as \(x\rightarrow 0\), the density is proportional to \(x^{\alpha /p -1}\). Hence all moments of \(X_i^{-p}\) are finite.

We found that for all 28 links, the ACF plots of \(\log (X_i)\), \(X_i^{-1}\), \(X_i^{-2}\) and \(X_i^{-3}\) show at least \(95\%\) correlations within the \(\pm 2n^{-1/2}\) bands. Figure 5 shows the ACF plots for the transformed interarrival times at the chin-ipls link. The plots for all links are very similar. The ACF at lag zero is always 1, so it is not plotted.

We have performed a similar analysis using the cross–correlation function (CCF), i.e. we examined the CCF values of \(g(X_i(\ell ))\) and \(g(X_i(\ell ^\prime ))\), where \(\ell \) and \(\ell ^\prime \) are any two links. We have found that at least 95% of these CCF’s were within the corresponding confidence bands.

The findings of this section can be summarized as follows:

  1. 1.

    The anomalies interarrival times have approximately Pareto tails; it is reasonable to assume that relation (3.1) holds with \(1< \alpha < 2.5\).

  2. 2.

    It is reasonable to assume the these interarrival times form a sequence of iid random variables.

  3. 3.

    The arrival times in different links form independent sequences.

We emphasize that these findings pertain to the database created by Bandara et al. (2014), but the statistical methodology we presented can be applied to other databases. In the next section, we use these findings to describe the distribution of the waiting time for the arrival of the next anomaly.

4 Distribution of the waiting time for the arrival of the next anomaly

As explained in the introduction, a typical waiting time for the arrival of the next anomaly can be expected to be longer than a typical time between the arrivals of the anomalies. This assertion can be quantified by comparing quantiles of the two distributions. As seen in Table 2, for our database, the difference is quite large. For all links, the quantiles of the waiting time distribution are several times larger than the corresponding quantiles of the distribution of the interarrival times. Intuitively, this is due to the heavy-tailed distribution of the interarrival times established in Sect. 3. Since there are many very long interarrival times, an arbitrary time instant t is likely to be in a long interarrival time, so the the time until the arrival of the next anomaly is more likely to be long than short. We emphasize that unlike the interarrival times, the waiting times are not observed directly. Their distribution must be computed using suitable tools of probability theory. Such tools are provided by the renewal theory, which is explained in many textbooks on stochastic processes, e.g. in Kulkarni (2017). Before presenting the computations leading to the estimates in Table 2, we introduce some results of the renewal theory, which are relevant to our task.

Table 2 Estimated 25th, 50th, 75th, 90th and 95th percentiles of the waiting time distribution (first columns) and the interarrival time distribution (in parentheses)

Consider a fixed link and recall that \(S_1, S_2, \ldots \) are the times of the anomaly arrivals. We have demonstrated in Sect. 3 that the interarrival times \(X_i = S_i - S_{i-1}\) are iid random variables with finite expectation (their distribution may very from link to link). These assumptions are enough to apply the results of the renewal theory stated below.

Denote by \(\{N(t), t \ge 0\}\) the renewal process with interarrival times \(X_i\), i.e. N(t) is the count of arrivals up to and including time t, i.e. \(N(t) = \max \{n: S_n \le t\}\). Notice that the random time \(S_{N(t)+1}\) is the time of the arrival of the next anomaly after time t. Therefore, the waiting time is

$$\begin{aligned} B(t) = S_{N(t)+1} - t. \end{aligned}$$

We will work with the complementary cdf of B(t) defined by

$$\begin{aligned} H_t(x) = P(B(t) > x), \ \ \ x \ge 0, \ t \ge 0. \end{aligned}$$

Using the key renewal theorem, one can show that as t increases, which corresponds, to an asymptotic equilibrium, the probabilities \(H_t(x)\) converge to a limit given by

$$\begin{aligned} H^{*}(x) := \lim _{t \rightarrow \infty } H_{t}(x) = \frac{1}{\tau } \int _{x}^{\infty } (1 - G(u))du, \end{aligned}$$
(4.1)

where

$$\begin{aligned} \tau = EX_i,\ \ \ G(u) = P(X_i \le u). \end{aligned}$$

We note that the parameters \(\tau \) and \(G(\cdot )\) do not depend on i because the \(X_i\) are assumed to have the same distribution. The equilibrium cdf of B(t), \(1 - H^{*}(x)\), is therefore given by

$$\begin{aligned} F_B(x) = 1 - H^{*}(x) = 1 - \frac{1}{\tau } \int _{x}^{\infty } (1 - G(u))du. \end{aligned}$$

This leads to the following formula for the density of the waiting time

$$\begin{aligned} f_B(x) = \frac{d(1 - \tau ^{-1} \int _{x}^{\infty }(1-G(u)du)}{dx} =\frac{1}{\tau }(1 - G(x)). \end{aligned}$$
(4.2)

Denoting suitable estimators by \(\hat{\tau }\) and \(\widehat{G}(\cdot )\), we can estimate the cdf and the density of the waiting time, respectively, by

$$\begin{aligned} \widehat{F}_B(x) = 1 - \frac{1}{\hat{\tau }} \int _{x}^{\infty } (1 - \widehat{G}(u))du \end{aligned}$$
(4.3)

and

$$\begin{aligned} \hat{f}_{B}(x) = \frac{1}{\hat{\tau }}(1 - \widehat{G}(x)). \end{aligned}$$
(4.4)

A central issue is therefore to determine which estimators to use. Essentially the only consistent estimator of the cdf \(G(\cdot )\) is the empirical cdf \(\widehat{G}(\cdot )\) defined by

$$\begin{aligned} \widehat{G}(x) = \frac{1}{N} \sum _{i=1}^N \mathbf{1}\left\{ X_i \le x\right\} , \end{aligned}$$

where N is the count of interarrival times in a given link, and \( 1\left\{ \cdot \right\} \) is the set indicator function. The Glivenko–Cantelli theorem asserts that, as \(N\rightarrow \infty \), \(\sup _x | \widehat{G}(x) - G(x)|\) tends to zero with probability 1. The problem of the estimation of the mean, \(\tau \) in our context, has been given a lot of attention. A textbook estimator is the sample average, but as we demonstrate in Appendix A, this is not a good choice for the \(X_i\) we study. Numerical comparisons of various commonly used estimators presented in Appendix A reveals that a very good choice for our data is the estimator which can be derived directly from the empirical cdf \(\widehat{G}(\cdot )\) via

$$\begin{aligned} \hat{\tau }= \int _0^\infty (1 - \widehat{G}(x)) dx. \end{aligned}$$

This estimator is based on the relation \(EX = \int _0^\infty P(X> x) dx\), valid for any nonnegative random variable X. It very desirable property is that it lead to an estimator of \(f_B(\cdot )\), which is a valid density:

$$\begin{aligned} \int _{0}^{\infty }\widehat{f}_B (x)dx = \frac{1 }{\int _{0}^{\infty }(1 - \widehat{G}(x))dx} \int _{0}^{\infty }(1 - \widehat{G}(x))dx = 1. \end{aligned}$$

Using estimator (4.3) we computed the quantiles shown in Table 2. Using (4.4), we computed estimates of the densities of the waiting time. All these densities have qualitatively very similar shapes, so we show just two of them in Fig. 6.

Fig. 6
figure 6

Estimated densities of the waiting time for the arrival of the next anomaly: left chin-ipls, right dnvr-snva

Probabilities of very long waiting times can be computed in an alternative way. Using L’hopital’s rule together with the assumptions that \(1 - G(X) \approx cx^{-\alpha }\), we obtain

$$\begin{aligned} \lim _{x \rightarrow \infty }\frac{1 - F_B(x)}{cx^{-(\alpha -1)}/(\tau (\alpha -1))} = \lim _{x \rightarrow \infty }\frac{f_B(x)}{cx^{-\alpha }/\tau } = \lim _{x \rightarrow \infty }\frac{(1-G(x))/\tau }{cx^{-\alpha }/\tau } =1, \end{aligned}$$

which yields

$$\begin{aligned} 1 - F_B(x) \approx (\tau (\alpha -1))^{-1} cx^{-(\alpha -1)}. \end{aligned}$$
(4.5)

For large x, the constant c can be estimated by \(x^\alpha (1 - \widehat{G}(x))\), so (4.5) leads to

$$\begin{aligned} P(B(t) > x) \approx \frac{1}{(\hat{\alpha }-1)\hat{\tau }x}(1 - \widehat{G}(x)). \end{aligned}$$

We conclude this section by presenting a simple argument showing that the waiting time distribution is equal to the interarrival distribution if and only if one of them is exponential. Under (3.1) the distribution of the waiting time thus cannot be equal to the distribution of the interarrival time. The postulated equality is equivalent to the equality of the densities, i.e., by (4.2), to

$$\begin{aligned} \frac{dG(x)}{dx} = \frac{1}{\tau }(1 - G(x)). \end{aligned}$$

The above linear differential equation has only one solution \(G(\cdot )\) which is a cdf; it is

$$\begin{aligned} G(x) = 1 - \tau ^{-1} e^{-x/\tau }, \ \ \ x > 0. \end{aligned}$$

The following conclusions can be drawn from the research presented in this section, including some details presented in Appendices A and B.

  1. 1.

    For every link, the quantiles of the distribution of the waiting time are generally several times larger then the corresponding quantiles of the distribution of the interarrival times. Very roughly, the interquartile range for the interarrivals is (10, 300), but for the waiting times it is (100, 700) (in units of 5 min). Looking at the medians, a typical time separation between two anomalies is about 8 h, but the typical waiting time for the arrival of the next anomaly is about 25 h. This paradox is a consequence of the heavy-tailed interarrival times.

  2. 2.

    The waiting time density functions are decreasing. They have large probability masses at both ends, close to zero and close to infinity. Unlike the exponential distribution, they do not touch the vertical line \(x=0\). They decay very slowly as \(x\rightarrow \infty \), much slower than the exponential distribution.

  3. 3.

    There is a noticeable variation in the distributions of waiting time among the links. Some distributions, e.g. links atla-ipls, sttl-dnvr, wash-nycm, show heavier tails than others, e.g. links dnvr-snva, ipls-kscy, snva-losa. Appendix B presents formal significance testing procedures that confirm that these distributions are not the same for links.

  4. 4.

    Mathematically, the distribution of the waiting time has a heavier tail than the distribution of the interarrival time. Under model (3.1), relation (4.1), implies that in equilibrium, i.e. for large t, \(P(B(t) > x) \sim c^\prime x^{-(\alpha -1)}\), as \(x\rightarrow \infty \). (By (4.5), \(c^\prime = (\tau (\alpha -1))^{-1} c\).) The bounds \(1< \alpha < 2.5\), imply \(0< \alpha -1 < 1.5\). This means that the random variable describing the waiting time is stochastically so large that it has infinite variance, and may even have infinite expected value (if \(\alpha < 2\)). In the latter case, for a device placed at a link, the expected waiting time until the arrival of the next anomaly is infinite, but an anomaly will arrive with probability 1.

5 Summary and main conclusions

We studied the arrival times of anomalies in 28 unidirectional links of the Internet2 network. Our objective was to provide information about the distribution of two random variables defined for each of these links: the interarrival time and the waiting time for the arrival of the next anomaly. The statistical characteristics of the interarrival time are accessible directly from physical measurements. The distribution of the waiting time is not directly accessible. It must be estimated using renewal theory.

We have shown that it is reasonable to assume that the interarrival times have heavy-tailed distributions, and they are independently and identically distributed random variables. We derived a parametric model for the distribution of the time between arrivals of two anomalies, see Appendix A, which can be used in various simulations. We have found that the density functions of the waiting time are decreasing with large probability masses for short and very long waiting times. Mathematically, the expected waiting time for the arrival of the next anomaly can be infinite. Practically, we expect the next anomaly to arrive within 6 to 13 days. We have also found that waiting time has heavier tail than interarrival and has infinite variance. There is a significant variation in the waiting time distributions among the 28 links. With formal hypothesis testing, we have confirmed that the interarrival and waiting time distributions among the 28 links are not identical.

We have performed a similar analysis for the gaps between the anomalies, which are defined as time separation between the end of the latest anomaly and the beginning of the next anomaly. The gaps are always shorter the the interarrival times, but the differences are very small as the duration of an anomaly is small compared to the large separation times. The modifications thus occur only in the left tail of the distributions, close to zero, and relation (3.1) is unaffected. We conclude that the distribution of these gaps is also heavy–tailed, in particular it is not exponential.

While the conclusions of our statistical and probabilistic analyses pertain to a specific network over a specific time period, it is hoped that the approaches presented in this paper will prove useful in other similar analyses.