1 Introduction

Description of ocean waves is affected not only by the wave data and models adopted but also by the uncertainties associated with them. These uncertainties can be classified into aleatory and epistemic, where the latter can be grouped into data, model, and statistical uncertainty (Bitner-Gregersen and Hagen 1990; Bitner-Gregersen et al. 2014a). The statistical uncertainty, also called sampling variability, is due to the limited number of observations and brings several challenges in an analysis of wave field data as well as in modelling and forecasting of ocean waves.

Measured wave data, either in situ or remotely sensed, remain important for the development, calibration, and validation of numerical and theoretical wave models, and specification of more detailed wave description such as wave spectra and individual wave characteristics, e.g., wave height, crest elevation, wave periods. These data are even more important in coastal areas where prediction of waves is further complicated by shallow-water and coastal boundary effects.

Traditionally, wave measurements have been recorded at single point locations by buoys, wave staffs, lasers, or radars, and restricted to record durations of 20 or 30 min. This duration has allowed the assumption of stationarity of a sea state, on which most wave models are based. A strong focus on spatial wave data was initiated by Krogstad et al. (2004), who introduced the Piterbarg (1996) theorem to oceanography, demonstrating that single-point temporal measurements may greatly underestimate (especially in short-crested seas) the actual maximum wave displacements that can occur on sea surface areas even smaller than the typical size of a single wave. Recent installations of stereo video camera systems (e.g., Fedele et al. 2011, 2013; Benetazzo et al. 2015, 2017) for collecting space-time ensembles of sea surface elevation have allowed for gathering of spatial data, which were limited in the past. In situ temporal measurements are affected by the duration of the wave records, while spatial data are also affected by the size of an instrument’s footprint. Both the duration of measurements and an instrument’s footprint limit the amount of data that can be collected.

Investigations addressing sampling variability associated with in situ point measurements have a long history in oceanography, e.g., Longuet-Higgins (1952), Lipa et al. (1981), Donelan and Pierson (1983), Bitner-Gregersen and Hagen (1990), Tucker (1992), Forristall et al. (1996), and Bitner-Gregersen (2003). For a review, see Bitner-Gregersen and Magnusson (2014). Monaldo (1988) showed, for two buoys situated 100 m apart, a 7% RMS (root mean squared) error for significant wave height. Fewer studies have been dedicated to spatial field data and the associated uncertainties, including sampling variability, e.g., Forristall (2011), Benetazzo et al. (2015, 2017), and Gemmrich et al. (2016). However, the impact of sampling variability on parameters specific to nonlinear effects of wave fields has not been addressed in these investigations. This impact is important for understanding the role wave measurements play in confirming the existence of exceptionally large waves such as rogue waves.

Rogue waves, much steeper and larger than the surrounding waves in a wave record, were for a long time believed to be mostly anecdotal, although always part of maritime folklore, see, e.g., Kharif et al. (2009) and Olagnon and Kerr (2015). The investigations carried out in the last two decades have shown that rogue waves may occur in low, intermediate, and high sea states (e.g., Bitner-Gregersen and Hagen 2004) and can be generated by linear focusing, modulational instability, crossing seas, current, shallow water effects and wind; for review, see, e.g., Osborne (2010), Onorato et al. (2013), Adcock and Taylor (2014), and Bitner-Gregersen and Gramstad (2016). A question which arises is, how much information about these exceptionally large waves, and the sea states in which they occur, can single 20–30-min wave records provide, and how does sampling variability affect sea state characteristics derived from such records?

To answer this question, we investigate rogue-prone sea states recorded in the North Sea using wave data simulated by linear, second-, and third-order numerical wave models. The third-order data are simulated by the nonlinear wave model HOSM (higher order spectral method), which includes nonlinear free-wave modulation as well as higher order bound harmonics. Unidirectional and directional sea states are addressed. The Pierson-Moskowitz and the JONSWAP spectrum with different values of the spectral peakedness parameter γ and different directional energy spreading functions are used in the sensitivity analysis. The study includes wave steepness, maximum wave crest, as well as the skewness and kurtosis of the surface elevation, often used as indicators of sea surface nonlinearity. Effects of sampling variability on these parameters, as well as on the correlation between them, are discussed in the context of 20–30-min temporal and spatial field measurements. Further, we demonstrate the supporting role numerical simulations can play in the analysis of field data.

The aim of the present study is to demonstrate, through a few examples, different challenges sampling variability brings when single 20- or 30-min wave records are used in an analysis. To quantify systematically the effects of sampling variability for combinations of wave parameters describing a rogue-prone sea state is outside the scope of the study. These investigations, being an extension of the study of Bitner-Gregersen and Gramstad (2018), have been inspired by the paper of Donelan and Magnusson (2017) addressing the Andrea rogue wave, where the authors argue that this extreme wave event is a result of linear focusing, showing at the same time that the distribution of crest heights deviates significantly from the Rayleigh distribution (linear theory, Longuet-Higgins 1952) and the second-order Forristall distribution (2000).

The paper is organized as follows. The rogue-prone sea states used in the analysis are presented in Sect. 2, and the setup of the numerical simulations is described in Sect. 3. The nonlinear characteristics of surface elevation are given in Sect. 4. Section 5 is dedicated to unidirectional wave fields and Sect. 6 to directional waves. The paper closes with discussion and conclusions.

2 Analyzed sea states

A common definition of a rogue wave is the criterion expressed by Haver (2000): Hmax/Hs > 2 and/or Cmax/Hs > 1.25, where Hmax denotes the maximum zero-crossing individual wave height, Cmax is the maximum crest height, and Hs is the significant wave height, defined as four times the standard deviation of the surface, typically calculated from a 20-min measurement of the surface elevation. Wave models, laboratory experiments, and field observations have shown that rogue waves have occurred in sea states with wave steepness kpHs/2 > 0.07 (kp denotes the wavenumber associated with the spectral peak period Tp) if a wave spectrum is sufficiently narrow in frequency and direction, see, e.g., Onorato et al. (2013) and Bitner-Gregersen and Gramstad (2016). For wave steepness of rogue-prone sea states observed in nature, see also Bitner-Gregersen and Magnusson (2004).
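The criterion above can be written as a short check. The following is a minimal sketch, not the authors' code; the function names are hypothetical, and the example values (Hs = 4.04 m, Cmax = 5.2 m) are those reported for the Justine sea state later in the paper.

```python
import math

def significant_wave_height(eta):
    """Hs as four times the standard deviation of the surface elevation."""
    n = len(eta)
    mean = sum(eta) / n
    variance = sum((e - mean) ** 2 for e in eta) / n
    return 4.0 * math.sqrt(variance)

def is_rogue(hs, hmax=None, cmax=None):
    """Haver (2000) criterion: Hmax/Hs > 2 and/or Cmax/Hs > 1.25."""
    by_height = hmax is not None and hmax / hs > 2.0
    by_crest = cmax is not None and cmax / hs > 1.25
    return by_height or by_crest

# Justine sea state values quoted in the paper: Cmax/Hs is about 1.29.
print(is_rogue(hs=4.04, cmax=5.2))
```

Either ratio alone is sufficient, which is why both arguments are optional in this sketch.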

To demonstrate the effects of sampling variability on sea state characteristics of rogue-prone sea states, as examples, we have selected sea states with kpHs/2 ≥ 0.08 in which famous rogue waves were recorded in the North Sea: the Draupner wave, 1 January 1995 at 15:20 UTC (Haver 2000), the Andrea wave 9 November 2007 at 00:54 UTC (Magnusson and Donelan 2013), and the Justine Three Sisters, 30 November 2018, 18:20 UTC (Magnusson et al. 2019). The selected sea states are listed in Table 1. The Draupner (case 1) and Andrea (case 2) waves were single rogue waves while the Justine Three Sisters (case 3) was a triple rogue wave group. These rogue waves were recorded at single-point locations in 20-min wave time series which were available for the present study.

Table 1 Analyzed sea states

Lacking directional information for case 1 and case 2, unidirectional waves are assumed herein. Note that Donelan and Magnusson (2017) pointed out that the WDM (wavelet directional method) analysis showed that, during the passage of the Andrea wave, large wave groups were propagating in nearly the same direction. In the case of the Justine Three Sisters, we have used the frequency-directional wave spectrum from the operational wave forecast model WAM of the Norwegian Meteorological Institute (MET Norway), which has 4 km resolution (MWW4) and provides output every hour, but not at the exact time of the Justine triple wave group’s occurrence. Therefore, the spectrum at 18 UTC has been adopted in the analysis. The significant wave height provided by the forecast model at 18 UTC, Hs = 3.4 m, is lower than the one obtained from the WaveRadar REX radar time series, reducing wave steepness from kpHs/2 = 0.12 to kpHs/2 = 0.10.

In addition, a rogue-prone sea state (case 4 in Table 1) with Hs = 5.66 m, spectral peak period Tp = 10 s (corresponding wave steepness kpHs/2 = 0.11) and different directional wave energy spreading is used in the sensitivity analysis.
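The quoted steepness values can be reproduced approximately from Hs and Tp. The sketch below assumes the deep-water dispersion relation ωp² = g·kp and g = 9.81 m/s²; both are assumptions made here for illustration, since the paper does not state how kp was computed. The Draupner and Andrea values of Hs and Tp are those given in Sect. 5.1.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def deep_water_steepness(hs, tp):
    """Return kp*Hs/2, with kp from the deep-water relation omega_p^2 = g*kp."""
    omega_p = 2.0 * math.pi / tp
    kp = omega_p ** 2 / G
    return kp * hs / 2.0

cases = [("Draupner", 11.2, 16.7), ("Andrea", 9.2, 13.2), ("case 4", 5.66, 10.0)]
for name, hs, tp in cases:
    print(name, round(deep_water_steepness(hs, tp), 2))
```

Under these assumptions, the rounded results agree with the steepness values quoted in the text for the three cases.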

The sea states listed in Table 1 were selected to demonstrate that independent of the value of wave steepness and wave energy spreading in frequency and direction, 20- or 30-min wave records will be affected by sampling variability. Both temporal and space-time numerical data are studied.

3 Description of setup of numerical simulations

The investigations carried out herein are based on the numerical simulations performed using a numerical solver based on the HOSM, independently proposed by Dommermuth and Yue (1987) and West et al. (1987). Unidirectional and directional wave fields have been simulated in a spatial domain with periodic boundary conditions. The nonlinear order in the HOSM simulations in this study was set to M = 3, which includes the leading order nonlinear dynamical effects, including the effect of modulational instability.

For the unidirectional simulations, the spatial domain was discretized by nx = 1024 grid points, while in the short-crested simulations, the horizontal plane was discretized using nx × ny = 512 × 512 grid points. Note that these values are for the fully de-aliased grid; the corresponding values before de-aliasing are nx = 2048 in the unidirectional case and nx × ny = 1024 × 1024 in the directional case. The domain spans 32 × 32 λp in the directional case and 64 λp in the unidirectional case, where λp denotes the peak wavelength. For example, for a directional wave field with Tp = 10 s in infinite water depth, the computational domain is approximately 5 × 5 km. A weak dissipation of high wavenumbers is included to model the energy dissipation due to wave breaking, using the wavenumber filter suggested in Xiao et al. (2013). Random phases and amplitudes were assigned to the initial spectrum in all cases.
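The domain size quoted above follows from the peak wavelength. A minimal sketch, assuming the deep-water relation λp = g·Tp²/(2π) with g = 9.81 m/s² (the helper names are hypothetical):

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def peak_wavelength(tp):
    """Deep-water peak wavelength for peak period tp (in seconds)."""
    return G * tp ** 2 / (2.0 * math.pi)

def domain_summary(tp, n_wavelengths, n_points):
    """Domain length and resolution along one horizontal axis."""
    lam = peak_wavelength(tp)
    length = n_wavelengths * lam
    return {
        "peak_wavelength_m": lam,
        "domain_length_m": length,
        "points_per_wavelength": n_points / n_wavelengths,
        "grid_spacing_m": length / n_points,
    }

# Directional case: 32 peak wavelengths resolved by 512 de-aliased points per axis.
print(domain_summary(tp=10.0, n_wavelengths=32, n_points=512))
```

With these numbers, both the directional grid (512 points over 32 λp) and the unidirectional grid (1024 points over 64 λp) resolve the peak wavelength with 16 de-aliased points.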

In the simulations, the initial condition was chosen as a wave system with the Pierson-Moskowitz (PM) or the JONSWAP spectrum and with a cos^N(φ − φp) directional spreading function. Thus, the wave spectrum was defined as E(k) = F(k)D(φ), where k = (kx, ky) = k(sin φ, cos φ) and

$$ F(k)=\frac{\alpha }{2{k}^3}\exp \left[-\frac{5}{4}{\left(k/{k}_p\right)}^{-2}\right]{\gamma}^{\exp \left[-\frac{{\left(\sqrt{k/{k}_p}-1\right)}^2}{2{\sigma}^2}\right]} $$
(1)

and

$$ D\left(\varphi \right)=\begin{cases}\dfrac{1}{k\sqrt{\pi }}\dfrac{\varGamma \left(N/2+1\right)}{\varGamma \left(N/2+1/2\right)}{\cos}^N\left(\varphi -{\varphi}_p\right), & \text{if } \left|\varphi -{\varphi}_p\right|\le \dfrac{\pi }{2},\\ 0, & \text{otherwise} \end{cases} $$
(2)

where Γ is the gamma function, and the parameter σ has the standard values 0.07 for k ≤ kp and 0.09 for k > kp. The other spectral parameters α, γ, kp, φp, and N were chosen to give the desired spectral shape, significant wave height Hs, and peak period Tp. N denotes the directional spreading coefficient and φp the peak direction. The following spectrum peakedness parameters γ = 1, 2, 3.3, 6, and directional spreading coefficients N = 4, 16, 100, and unidirectional waves, are used in the analysis. Note that for γ = 1, the JONSWAP spectrum reduces to the Pierson-Moskowitz spectrum.
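Equations (1) and (2) translate directly into code. The sketch below is an illustration rather than the solver's implementation; the function names and the value of α are hypothetical. It also checks numerically that k·D(φ) integrates to one over direction, which is what the gamma-function prefactor in Eq. (2) implies.

```python
import math

def jonswap_k(k, kp, alpha, gamma):
    """Eq. (1): wavenumber spectrum F(k) with the standard sigma values."""
    sigma = 0.07 if k <= kp else 0.09
    pm_part = alpha / (2.0 * k ** 3) * math.exp(-1.25 * (k / kp) ** -2)
    exponent = math.exp(-((math.sqrt(k / kp) - 1.0) ** 2) / (2.0 * sigma ** 2))
    return pm_part * gamma ** exponent

def spreading(phi, k, n, phi_p=0.0):
    """Eq. (2): cos^N directional distribution D(phi)."""
    d = phi - phi_p
    if abs(d) > math.pi / 2.0:
        return 0.0
    norm = math.gamma(n / 2.0 + 1.0) / (k * math.sqrt(math.pi) * math.gamma(n / 2.0 + 0.5))
    return norm * math.cos(d) ** n

def directional_integral(k, n, steps=2000):
    """Midpoint-rule estimate of the integral of k*D(phi) over (-pi/2, pi/2)."""
    dphi = math.pi / steps
    return sum(k * spreading(-math.pi / 2.0 + (i + 0.5) * dphi, k, n) * dphi
               for i in range(steps))

for n in (4, 16, 100):
    print(n, directional_integral(k=0.04, n=n))
```

Setting gamma = 1 makes the peak-enhancement factor equal to one, so the function returns the Pierson-Moskowitz form, as noted in the text.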

In the case of the Justine Three Sisters (case 3 in Table 1), the input to the numerical simulations is the wave frequency-directional spectrum from the operational wave forecast model WAM of the Norwegian Meteorological Institute (MWW4) at 18:00 UTC.

The wave fields are simulated in time for a total duration of tmax = 1800–3600 s for unidirectional waves and tmax = 1800 s for directional seas. The number of runs of the unidirectional simulations varies from 1 to 1000 to demonstrate different effects of sampling variability, while 20 runs are carried out for the directional sea state case 4, which requires more CPU time. For the Justine Three Sisters (case 3), 500 runs have been carried out to be able to capture this very rare event (for details, see Bitner-Gregersen et al. 2020b), but only a few selected 30-min wave records are utilized in the present study.

The DNV GL HOSM solver includes linear and second-order wave model solvers, which are also applied in the investigations.

4 Nonlinear sea state characteristics

In the random linear wave model, the surface elevation η(x, y, t) is Gaussian distributed and, under the assumption that the wave field is narrow-banded, wave heights and crests follow the Rayleigh distribution (Longuet-Higgins 1952). Taking nonlinear effects into account, deviations from the Gaussian distribution should be expected (Longuet-Higgins 1963). This was confirmed by analysis of deep water field data, e.g., by Longuet-Higgins (1963) and shallow water data by Bitner (1980). Consequently, wave crests and heights will not be Rayleigh distributed, although wave heights are less sensitive to nonlinearities.
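To give a feel for how rare the Haver thresholds are under the linear narrow-band reference, the Rayleigh exceedance probabilities can be evaluated directly. This is a sketch under stated assumptions: Hs = 4σ, statistically independent waves, and a hypothetical count of 120 waves in a 20-min record. None of these numbers is taken from the paper's simulations.

```python
import math

def crest_exceedance(c_over_hs):
    """P(C > c) for Rayleigh-distributed crests, with Hs = 4*sigma."""
    return math.exp(-8.0 * c_over_hs ** 2)

def height_exceedance(h_over_hs):
    """P(H > h) for Rayleigh-distributed wave heights, with Hs = 4*sigma."""
    return math.exp(-2.0 * h_over_hs ** 2)

def prob_at_least_one(p_single, n_waves):
    """Probability that at least one of n independent waves exceeds the level."""
    return -math.expm1(n_waves * math.log1p(-p_single))

n_waves = 120  # hypothetical number of waves in a 20-min record
print("crest  > 1.25 Hs:", prob_at_least_one(crest_exceedance(1.25), n_waves))
print("height > 2.00 Hs:", prob_at_least_one(height_exceedance(2.0), n_waves))
```

The per-record probabilities are small, which is one way to see why a single 20-min record says little about the tail of the distribution.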

Wave steepness kpHs, the skewness coefficient κ3, the kurtosis coefficient κ4, and the ratios Hmax/Hs and Cmax/Hs can be used to characterize wave field deviations from Gaussianity. Further, for a Gaussian wave field, the skewness coefficient is equal to 0 and the kurtosis coefficient is equal to 3, while κ3 is typically larger than 0 and κ4 larger than 3 in a non-Gaussian wave field. It should be noted that within wave theory, skewness is primarily a second-order effect while kurtosis is a third-order effect (Longuet-Higgins 1963). Therefore, kurtosis has been used as an indicator of occurrence of rogue waves (Mori et al. 2011).

In the present study, we demonstrate, using numerical simulations, effects of sampling variability on the maximum wave crest, skewness, and kurtosis when single 20–30-min temporal or spatial wave records are used in an analysis. Note that the maximum surface elevation ηmax, both in time and space, represents the maximum wave crest height Cmax.

Wave data collected in nature, in laboratory experiments, or by numerical simulation include a limited number of observations, and therefore allow only sample estimators of maximum surface elevation, skewness, and kurtosis to be provided. For a surface snapshot from the numerical simulations ηi,j = η(xi, yj), where i = 1, …, nx and j = 1, …, ny, the sample skewness coefficient κ3 and the sample kurtosis coefficient κ4 can be defined as follows:

$$ {\kappa}_3=\frac{\frac{1}{n_x{n}_y}\sum \limits_{i=1}^{n_x}\sum \limits_{j=1}^{n_y}{\left({\eta}_{i,j}-\overline{\eta_{i,j}}\right)}^3}{{\left(\frac{1}{n_x{n}_y}\sum \limits_{i=1}^{n_x}\sum \limits_{j=1}^{n_y}{\left({\eta}_{i,j}-\overline{\eta_{i,j}}\right)}^2\right)}^{3/2}} $$
(3)
$$ {\kappa}_4=\frac{\frac{1}{n_x{n}_y}\sum \limits_{i=1}^{n_x}\sum \limits_{j=1}^{n_y}{\left({\eta}_{i,j}-\overline{\eta_{i,j}}\right)}^4}{{\left(\frac{1}{n_x{n}_y}\sum \limits_{i=1}^{n_x}\sum \limits_{j=1}^{n_y}{\left({\eta}_{i,j}-\overline{\eta_{i,j}}\right)}^2\right)}^2} $$
(4)

The mean value of sea surface \( \overline{\eta_{i,j}} \) is always 0.
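Equations (3) and (4) can be evaluated on a snapshot as follows. This is a minimal sketch with a hypothetical function name; the test snapshot consists of independent Gaussian samples, which is an assumption made only to exercise the estimators, since points on a real wave surface are spatially correlated.

```python
import random

def sample_skewness_kurtosis(eta):
    """Eqs. (3)-(4) for a 2-D snapshot given as a list of rows."""
    values = [v for row in eta for v in row]
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

rng = random.Random(0)
snapshot = [[rng.gauss(0.0, 1.0) for _ in range(256)] for _ in range(256)]
k3, k4 = sample_skewness_kurtosis(snapshot)
print(f"skewness = {k3:.3f}, kurtosis = {k4:.3f}")
```

The mean is subtracted explicitly even though it is zero for the simulated surface, so the same function can be applied to measured records.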

Accuracy of the estimators of the maximum wave crest, and of the skewness and kurtosis given by Eqs. (3) and (4), will depend on the number of observations used in an analysis (low values of nx and ny may result in poor estimates). More importantly, the accuracy of the estimates depends on the size of the domain over which the average is taken, relative to the number of waves it contains. Herein, we compare the values of κ3, κ4, and ηmax (Cmax) derived from a single 30-min temporal and spatial wave record with the corresponding values calculated as averages over all random realizations of the same sea state when a large number of runs is performed.
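The comparison between a single record and an ensemble average can be illustrated with a small linear experiment. The sketch below is not the HOSM solver: it sums random Fourier components at a single point, uses a hypothetical Gaussian-shaped spectrum rather than the JONSWAP form, and adopts an assumed 20-min record sampled every 2 s. It is meant only to show that the kurtosis of individual Gaussian records scatters around the ensemble mean.

```python
import math
import random

def linear_record(seed, hs=9.2, tp=13.2, duration=1200.0, dt=2.0):
    """One linear realization of eta(t) at a point, built from Fourier components."""
    rng = random.Random(seed)
    omega_p = 2.0 * math.pi / tp
    d_omega = 2.0 * math.pi / duration  # Fourier spacing for this record length
    j_lo = int(0.7 * omega_p / d_omega)
    j_hi = int(1.5 * omega_p / d_omega)
    omegas = [j * d_omega for j in range(j_lo, j_hi + 1)]
    # Hypothetical Gaussian-shaped spectrum around the peak, NOT the JONSWAP form.
    weights = [math.exp(-0.5 * ((w / omega_p - 1.0) / 0.12) ** 2) for w in omegas]
    scale = (hs / 4.0) ** 2 / sum(weights)  # total variance equals (Hs/4)^2
    comps = []
    for w, wt in zip(omegas, weights):
        s = math.sqrt(scale * wt)
        # Gaussian coefficients give random amplitudes and uniform random phases.
        comps.append((w, rng.gauss(0.0, s), rng.gauss(0.0, s)))
    steps = int(duration / dt)
    return [sum(a * math.cos(w * i * dt) + b * math.sin(w * i * dt) for w, a, b in comps)
            for i in range(steps)]

def kurtosis(x):
    """Sample kurtosis coefficient of a 1-D record (temporal analogue of Eq. 4)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2

kurts = [kurtosis(linear_record(seed)) for seed in range(30)]
mean_k = sum(kurts) / len(kurts)
std_k = math.sqrt(sum((k - mean_k) ** 2 for k in kurts) / (len(kurts) - 1))
print(f"single record (seed 0): {kurts[0]:.2f}")
print(f"ensemble mean: {mean_k:.2f}, run-to-run std: {std_k:.2f}")
```

Changing the seed changes the single-record value while leaving the ensemble statistics essentially unchanged, which is the distinction drawn in the paragraph above.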

It should be mentioned that the sample skewness and kurtosis coefficients described by Eqs. (3) and (4), as well as the alternative commonly used “unbiased” estimators for independent, identically distributed samples found in the literature, are biased estimators of the true population values of κ3 and κ4 when the samples are not independent (Joanes and Gill 1998; Bai and Ng 2005), as pointed out for wave surface applications by Gramstad et al. (2018). Therefore, for a numerically simulated linear Gaussian surface, the skewness is not exactly equal to zero, nor is the kurtosis exactly equal to three. This effect is more pronounced for small simulation domains. Due to the relatively large computation area/duration considered in this study, the presented numerical results are little affected by this bias, although it remains observable.

5 Unidirectional wave field

5.1 General

It has been shown that the occurrence of modulational instability in deep and intermediate water is characterized by high wave steepness and a narrow wave spectrum, both in frequency and direction, and is more pronounced in a unidirectional wave field (Onorato et al. 2006; Toffoli et al. 2010; Toffoli and Bitner-Gregersen 2011). It can be parameterized by the Benjamin-Feir Index (BFI); see Onorato et al. (2006, 2013). The BFI is a measure of the relative importance of nonlinearity and dispersion and is defined as BFI = (kpHs/2)/(Δω/ωp), where Δω/ωp is the frequency spectral bandwidth (Δω is often measured as the half-width at half-maximum of the spectrum and ωp is the spectral peak frequency). Provided the wave field is sufficiently steep and narrow banded in frequency and direction, rogue waves are expected to be generated when BFI = O(1).
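The BFI as defined above can be estimated from a sampled frequency spectrum. The following is a minimal sketch, assuming a single-peaked spectrum and taking Δω as the half-width at half-maximum found by linear interpolation; the function names and the Gaussian-shaped test spectrum are hypothetical.

```python
import math

def half_width_half_max(omega, spec):
    """Half-width at half-maximum of a single-peaked sampled spectrum."""
    i_peak = max(range(len(spec)), key=lambda i: spec[i])
    half = 0.5 * spec[i_peak]

    def crossing(step):
        # Walk away from the peak until the spectrum drops below half-maximum,
        # then interpolate linearly between the two bracketing samples.
        i = i_peak
        while 0 <= i + step < len(spec) and spec[i + step] >= half:
            i += step
        j = i + step
        if not 0 <= j < len(spec):
            raise ValueError("spectrum does not fall below half-maximum on the grid")
        frac = (spec[i] - half) / (spec[i] - spec[j])
        return omega[i] + frac * (omega[j] - omega[i])

    return 0.5 * (crossing(+1) - crossing(-1))

def benjamin_feir_index(steepness, omega, spec):
    """BFI = (kp*Hs/2) / (delta_omega/omega_p), as defined in the text."""
    i_peak = max(range(len(spec)), key=lambda i: spec[i])
    return steepness / (half_width_half_max(omega, spec) / omega[i_peak])

# Hypothetical Gaussian-shaped spectrum, used only to exercise the functions.
omega_p, width = 0.5, 0.05
omega = [0.2 + 0.001 * i for i in range(601)]
spec = [math.exp(-0.5 * ((w - omega_p) / width) ** 2) for w in omega]
print(benjamin_feir_index(0.11, omega, spec))
```

Other bandwidth measures are in use, so a BFI computed this way is comparable only with values based on the same definition of Δω.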

Consequently, we would expect that a JONSWAP spectrum with peakedness parameter γ = 6 will generate higher maximum wave crest, skewness, and kurtosis than the PM spectrum having γ = 1, given that the two spectra have the same wave steepness. Furthermore, for the same wave steepness and spectrum peakedness parameter γ, we would expect to see higher surface elevation, skewness, and kurtosis in third-order HOSM simulations than in the linear and second-order ones.

For a single 20–30-min realization of a sea state, this may not always be the case due to sampling variability. We demonstrate this below using as examples the Draupner and Andrea sea states (cases 1 and 2 in Table 1) and assuming unidirectional seas. The Draupner sea state was characterized by Hs = 11.2 m and Tp = 16.7 s (adopted from Bitner-Gregersen and Magnusson 2004), the Andrea sea state by Hs = 9.2 m and Tp = 13.2 s with the corresponding wave steepness kpHs/2 = 0.08 and kpHs/2 = 0.11, respectively.

5.2 Surface elevation

Surface oscillations in time ηi,j(t) at one selected grid point (xi, yj) for the Andrea sea state for HOSM and linear simulations are plotted for seed = 1 and seed = 100 in Fig. 1 a and b, respectively. The 30-min wave records in Fig. 1 show, consistent with wave theory, that the HOSM simulations provide the highest maximum wave crest both for seed = 1 and seed = 100. In contrast, the linear maximum wave trough is slightly deeper than the HOSM one (see Fig. 1a). Further, by selecting a 20-min wave record (vertical blue dashed lines in Fig. 1a; 600–1700 s) within the 30-min wave time series, we can see that the HOSM maximum wave trough is deeper than the linear one, while the maximum wave crests are equally high for the HOSM and linear simulations. These effects are due to sampling variability resulting from the inherent randomness of surface elevation. Note that the earlier findings of Toffoli and Bitner-Gregersen (2011), where the 20-min HOSM simulations were repeated 150 times, gave higher wave crests and deeper troughs compared to the linear wave model.

Fig. 1

Surface elevation ηi(t) as a function of simulation time. HOSM and linear simulations; Andrea sea state Hs = 9.2 m and Tp = 13.2 s, kpHs/2 = 0.11. a Seed = 1. b Seed = 100

Although due to sampling variability, wave crests and troughs may have similar values for the HOSM and linear wave model in a single 20–30-min wave record, the evolution of the linear and nonlinear wave train ηi,j(t) is different, as illustrated in Fig. 2. This will affect response calculations of marine structures.

Fig. 2

An extract of the HOSM and linear surface elevation time series from 600 to 1200 s; Andrea sea state Hs = 9.2 m and Tp = 13.2 s, kpHs/2 = 0.11; seed = 1

Temporal 20- or 30-min wave records will always be more affected by sampling variability than space-time ones as space-time records will include more observations, if the sampling rate of the data is the same (see Bitner-Gregersen et al. 2020a). Further, the spatial observations will provide a higher maximum surface elevation (maximum wave crest), an important parameter for design.

5.3 Kurtosis and skewness

Estimators of skewness and kurtosis derived from 30-min wave records discussed below are calculated over the computational domain at every sampling time step.

The spatial kurtosis (Eq. 4) as a function of time in 1800-s HOSM simulations with the same seed for the PM (γ = 1) and JONSWAP spectrum with γ = 2, 3.3, and 6 is shown for the Andrea and Draupner unidirectional sea states in Fig. 3 a and b, respectively. Figure 3 demonstrates that a single spatial 30-min realization of a rogue-prone sea state may produce higher kurtosis for the PM spectrum (γ = 1) than for the JONSWAP spectrum with γ = 6, 3.3, and 2 due to the inherent randomness of waves, in contrast to what we would expect from averaging over very large amounts of data. However, in accordance with wave theory, the kurtosis reaches higher values for the steeper Andrea sea state than for the less steep Draupner sea state.

Fig. 3

Estimated spatial kurtosis as a function of simulation time. HOSM simulations for the selected seed. a Andrea sea state Hs = 9.2 m and Tp = 13.2 s, kpHs/2 = 0.11. b Draupner sea state Hs = 11.2 m and Tp = 16.7 s, kpHs/2 = 0.081; unidirectional waves

As mentioned above, for a Gaussian distributed wave population, the skewness coefficient is equal to 0 and the kurtosis coefficient is equal to 3. In Fig. 4, spatial skewness and kurtosis as a function of time for a single 30-min HOSM and linear simulation are plotted for the Andrea sea state with the JONSWAP spectrum with γ = 6 and seed = 100.

Fig. 4

Spatial skewness (a) and kurtosis (b) as a function of simulation time. HOSM and linear simulations; Andrea sea state Hs = 9.2 m and Tp = 13.2 s, kpHs/2 = 0.11; JONSWAP spectrum with γ = 6, seed = 100

As seen in Fig. 4, the skewness derived from the linear simulations is not equal to zero and the kurtosis is not equal to three. Generally, the skewness derived from the linear simulations is much lower than the one obtained from the HOSM simulations. However, at the time step 484–488 s, the linear skewness κ3 = 0.20–0.24 exceeds the HOSM skewness κ3 = 0.17–0.20 by 26%. The HOSM skewness reaches the value κ3 = 0.43, more than twice as high as expected from second-order wave theory for this particular wave steepness (Longuet-Higgins 1963; Srokosz and Longuet-Higgins 1986), but both the HOSM and linear skewness coefficients also take negative values, as also observed in field data (e.g., Bitner 1980).

The corresponding spatial kurtosis derived from the HOSM and linear simulations is plotted in Fig. 4b. The linear kurtosis at the time step 1431 s is reaching κ4 = 3.79, which is similar to the maximum HOSM kurtosis κ4 = 3.86. Thus, inherent randomness of wave field is also manifested in the spatial 30-min wave records. As expected, when the number of 30-min HOSM runs is increased to 4 and the mean kurtosis over them is calculated, the variations of kurtosis in time are reduced, as illustrated in Fig. 5.

Fig. 5

Spatial kurtosis as a function of simulation time for 4 different seeds and the mean of 4 runs. HOSM simulations; Andrea sea state Hs = 9.2 m and Tp = 13.2 s, kpHs/2 = 0.11; JONSWAP spectrum with γ = 6

In order to improve significantly the accuracy of the skewness and kurtosis estimators, the duration of the HOSM and linear simulations has been increased to 60 min and the 60-min simulations of the surface domain were repeated 100 times, using a different seed for each run. The calculated skewness (Eq. 3) and kurtosis (Eq. 4) for the Andrea sea state are shown in Fig. 6 a and b, respectively. The outcomes from the linear and HOSM simulations are shown for the spectral peakedness parameter γ = 1.0, 2.0, 3.3, and 6.0.

Fig. 6

Spatial a skewness and b kurtosis as a function of simulation time calculated as an average over 100 repetitions of the 60-min simulation. HOSM and linear simulations; Andrea sea state Hs = 9.2 m and Tp = 13.2 s, kpHs/2 = 0.11. M denotes the order of simulations: M = 1 refers to the linear simulations and M = 3 to the nonlinear third-order HOSM simulations

The estimated skewness and kurtosis calculated from the linear simulations have values close to the theoretical ones; skewness is approximately equal to 0, while the kurtosis is slightly lower than 3. This is consistent with the bias of the skewness and kurtosis estimators for serially correlated data pointed out by Bai and Ng (2005); see Gramstad et al. (2018) for the discussion of it in the context of ocean waves. Further, the HOSM skewness is approximately equal to 0.17, which is consistent with second-order wave theory and should be expected, as skewness is primarily a second-order effect. Note that the HOSM skewness is affected also by the estimator bias mentioned above. It is interesting to see that skewness has little sensitivity to the spectrum peakedness parameter γ. The largest skewness is obtained for γ = 1 (PM spectrum) and the lowest for γ = 6. This slight difference is difficult to explain at present and needs further investigations.
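The value of about 0.17 can be checked against the narrow-band, deep-water second-order estimate of skewness usually associated with Longuet-Higgins (1963). The relation below is quoted here as an assumption for illustration, with σ_η denoting the standard deviation of the surface elevation (a symbol introduced here, distinct from the spectral width parameter σ in Eq. 1) and Hs = 4σ_η:

$$ {\kappa}_3\approx 3{k}_p{\sigma}_{\eta }=3{k}_p\frac{H_s}{4}=\frac{3}{2}\cdot \frac{k_p{H}_s}{2}\approx 1.5\times 0.11\approx 0.17 $$

On the same estimate, the single-record maximum κ3 = 0.43 quoted for Fig. 4 is roughly 2.6 times the second-order value.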

Figure 6b shows that the kurtosis derived from the nonlinear HOSM simulations deviates significantly from three, consistent with dynamical nonlinear effects (see, e.g., Toffoli et al. 2010). The spectrum peakedness parameter γ = 6 gives the highest value of kurtosis, up to 3.4, while the PM spectrum results in the lowest kurtosis, consistent with the well-known result that a narrower spectrum (larger BFI) gives higher kurtosis. Also consistent with previous works, an initial rapid increase of the kurtosis is observed, which can be attributed to the effect of modulational instability. As seen in Fig. 6b, there are also slow variations in the kurtosis after this initial increase. In order to identify whether these variations are due only to sampling variability or to some physical effect, the number of runs is increased to 1000 and the results are plotted in Fig. 7.

Fig. 7

Spatial a skewness and b kurtosis as a function of simulation time calculated as an average over 1000 repetitions of the 60-min simulation. HOSM and linear simulations; Andrea sea state Hs = 9.2 m and Tp = 13.2 s, kpHs/2 = 0.11. M denotes the order of simulations: M = 1 refers to the linear simulations and M = 3 to the nonlinear third-order HOSM simulations

As demonstrated in Fig. 7, small variations of skewness and kurtosis for the linear and HOSM simulations are reduced significantly, although showing the same effects as in Fig. 6. We can say that, after about 500 s, small variations of kurtosis are due to sampling variability (i.e., that kurtosis is practically constant after the initial increase from 3).

The average skewness and kurtosis estimators plotted in Figs. 6 and 7 are characterized by spreading which is dependent on the spectrum peakedness parameter γ and can be described in terms of standard deviation, a measure of sampling variability. The mean and standard deviation (std) of the linear (M = 1) and HOSM (M = 3) kurtosis for γ = 1, 2, 3.3, and 6 are listed in Table 2. The results for 100 runs and for 1000 runs are mostly identical. As seen, the HOSM standard deviation is the highest for the JONSWAP spectrum with γ = 6, and the lowest for the PM spectrum (γ = 1). While the mean kurtosis is only 4% higher, the standard deviation is approximately twice as high for the JONSWAP spectrum compared to the PM spectrum. Further, the mean kurtosis predicted by the linear model is approximately equal to 3 (slightly lower due to the estimator’s bias mentioned earlier) while the average standard deviation over the spectra considered is equal to 0.22, but similarly to the HOSM model, the highest standard deviation is obtained for the JONSWAP spectrum with γ = 6.
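The reduction of scatter with the number of runs can be estimated with the usual standard-error formula. The sketch below assumes that runs with different seeds are statistically independent and reads the value 0.22 as the run-to-run standard deviation of the kurtosis from a single linear record; that reading is an interpretation made here, not something the text states explicitly.

```python
import math

def standard_error(std_single, n_runs):
    """Standard deviation of the mean of n independent runs."""
    return std_single / math.sqrt(n_runs)

# 0.22 is the average linear standard deviation quoted in the text.
for n_runs in (1, 4, 100, 1000):
    print(n_runs, round(standard_error(0.22, n_runs), 3))
```

Under these assumptions, averaging over 4 runs halves the scatter, and 1000 runs reduce it by a factor of about 32, which is in line with the smoother curves in Fig. 7 compared with Fig. 6.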

Table 2 Mean and standard deviation of kurtosis, the Andrea sea state kpHs/2 = 0.11, unidirectional wave field

The distributions of kurtosis for the unidirectional linear and nonlinear simulations are shown in Fig. 8. Although the probability of occurrence of high values of kurtosis is significantly higher for the HOSM simulations with γ = 6 than for the linear simulations, the kurtosis coefficient derived from individual linear realizations reaches values far beyond 4.

Fig. 8

Distribution of kurtosis derived from unidirectional linear M = 1 (left) and nonlinear HOSM M = 3 (right) simulations, kpHs/2 = 0.11, 1000 runs

It should be noted that the standard deviation of kurtosis will depend not only on the spectrum peakedness parameter γ and the degree of sea surface nonlinearity but also on wave steepness, on whether temporal or space-time records are used in an analysis, and on their duration, as systematically demonstrated by Bitner-Gregersen et al. (2020a). We illustrate this below using case 1 in Table 1 as an example.

Figure 9 shows histograms (frequency of occurrence) of temporal (subscript “t”) and spatial (subscript “x,y,t”) skewness, kurtosis, and ηmax/Hs for the unidirectional linear, second-order, and HOSM models for the Draupner sea state with kpHs/2 = 0.08. The 30-min numerical simulations for the PM spectrum (γ = 1) and the JONSWAP spectrum with γ = 6 are repeated 1000 times to provide accurate average estimators of the considered wave parameters. As for the runs of the Andrea case discussed above, large variability of κ3, κ4, and ηmax is observed. The spreading of these parameters is highest for the HOSM simulations, followed by the second-order and linear ones. As already pointed out by Bitner-Gregersen et al. (2020a), Fig. 9 illustrates that the nonlinear wave field including dynamical effects is more sensitive to sampling variability than the second-order and linear ones, particularly affecting ηmax/Hs. Thus, we should expect that wave fields in which rogue waves are a result of linear focusing will be less affected by sampling variability.

Fig. 9

Temporal (blue) and space–time (brown) histograms of skewness, kurtosis, and ηmax/Hs for unidirectional linear, second-order, and HOSM simulations for the Draupner sea state kpHs/2 = 0.08 and a γ = 1.0 and b γ = 6.0

6 Directional wave fields

The same effects of sampling variability as shown above for the unidirectional wave fields are observed also in directionally spread waves, although wave directionality suppresses modulational instability, generally reducing the maximum wave crest, skewness, and kurtosis. Single realizations of rogue-prone sea states show large variability and extreme events do not occur in all of them. How many rogue waves will be observed in single realizations of a rogue-prone sea state will naturally depend on the duration of the wave records, the size of the spatial domain the measurements cover, and whether temporal or spatial data are considered.

Figure 10 shows an example of two single 30-min temporal HOSM realizations of surface elevation for the sea state case 3 in Table 1 (Justine Three Sisters), when the directional wave spectrum from the operational forecast model of the Norwegian Meteorological Institute MWW4 is used as input to the HOSM code and two different seeds are applied; for details see Bitner-Gregersen et al. (2020b). The Justine Three Sisters (a triple rogue wave group) were recorded at 18:20 November 30, 2018 in the central North Sea at a single point by the WaveRadar REX in the intermediate water depth of 70 m (Magnusson et al. 2019). The Justine sea state had Hs = 4.04 m, Tp = 8.4 s, corresponding to wave steepness kpHs/2 = 0.12, and maximum wave crest Cmax = 5.2 m, while the wave model MWW4 produced values of Hs = 3.4 m, Tp = 8.4 s, and kpHs/2 = 0.10. The fitted JONSWAP spectrum gave γ = 1.84 (see Bitner-Gregersen et al. 2020b) while the directional spreading of the MWW4 spectrum was 35.8° (Magnusson et al. 2019).

Fig. 10

a Time series of surface elevation of the Justine Three Sisters sea state, Hs = 3.4 m, Tp = 8.4 s, kpHs/2 = 0.10. HOSM simulations using the directional wave spectrum from the operational forecast model of the Norwegian Meteorological Institute as input, seeds = 5, 7. b The corresponding maximum spatial surface elevation as a function of simulation time, seeds = 5, 7

The time series of surface elevation from HOSM simulations presented in Fig. 10a do not reach the Justine crest height of 5.2 m and deviate significantly from each other, clearly manifesting the effect of sampling variability. The corresponding maximum spatial elevation (Fig. 10b) exceeds the Justine crest of 5.2 m several times in the 30-min wave records.

The impact of sampling variability on directional wave fields, and in particular the effect of the directional spreading parameter, is further investigated using case 4 in Table 1, a deep-water sea state with Hs = 5.66 m and Tp = 10 s, and corresponding wave steepness kpHs/2 = 0.11. The case 4 sea state has the same steepness as the Andrea sea state but lower Hs and Tp. Three cases listed in Table 3 are considered in a sensitivity analysis, with different directional spreading, N = 4, 16, 100 (see Eq. 2), and JONSWAP spectrum peakedness parameter, γ = 1, 3.3, 6.0, but the same wave steepness. Note that N = 100 represents a wave field that is close to unidirectional but still directionally spread. The space-time directional 30-min simulations were carried out by the HOSM code with M = 3.
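The steepness values quoted for the sea states in this section can be checked from Hs, Tp, and water depth using the linear dispersion relation. The sketch below is an illustration under stated assumptions: g = 9.81 m/s², a simple fixed-point iteration for finite depth, and function names chosen here for convenience.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def peak_wavenumber(tp, depth=math.inf, tol=1e-12, max_iter=500):
    """Solve omega^2 = g k tanh(k h) for the peak wavenumber k_p."""
    omega = 2.0 * math.pi / tp
    k = omega ** 2 / G                      # deep-water solution, also the first guess
    if math.isinf(depth):
        return k
    for _ in range(max_iter):
        k_new = omega ** 2 / (G * math.tanh(k * depth))
        if abs(k_new - k) < tol:
            return k_new
        k = k_new
    return k

def steepness(hs, tp, depth=math.inf):
    """Wave steepness k_p * Hs / 2."""
    return peak_wavenumber(tp, depth) * hs / 2.0

# Case 4 (deep water) and the Justine sea state (70 m depth): measured and MWW4 values.
print(round(steepness(5.66, 10.0), 2))
print(round(steepness(4.04, 8.4, 70.0), 2))
print(round(steepness(3.4, 8.4, 70.0), 2))
```

At 70 m depth and Tp = 8.4 s the finite-depth correction to kp is well below 1%, so the deep-water value would give the same rounded steepness.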

Table 3 Directional sea states considered in the analysis

Figure 11 shows single realizations of spatial skewness and kurtosis as a function of time during the 30-min simulations for a given seed. Large variations of skewness and kurtosis are seen in the figure. As fully expected for single runs of limited domain size, at some time steps of the simulations, skewness and kurtosis reach larger values for the most directionally spread sea state with N = 4 and γ = 1.0 than for the wave field with N = 100 and γ = 6.0, although the opposite is expected on average, e.g., when averaging over many independent runs.

Fig. 11

Spatial skewness a and kurtosis b as a function of 30-min simulation time, HOSM simulations; Hs = 5.66 m and Tp = 10 s, kpHs/2 = 0.11

Indeed, when the 30-min HOSM simulations are repeated 20 times for every sea state, the average spatial skewness and kurtosis over the 20 runs are highest for the sea state with N = 100 and γ = 6.0, followed by the sea states with N = 16 and γ = 3.3 and with N = 4 and γ = 1.0. For all analyzed sea states, the average skewness over all runs is slightly below 0.2, while the average spatial kurtosis exceeds 3, lying between 3.05 and 3.15 depending on the directional spreading (see Fig. 12). However, the kurtosis is lower than for the unidirectional waves presented in the previous sections, as directionality suppresses the effect of modulational instability (e.g., Toffoli and Bitner-Gregersen 2011).

Fig. 12

Estimated kurtosis as a function of simulation time, 20 repetitions of the 30-min HOSM simulations; Hs = 5.7 m and Tp = 10 s, kpHs/2 = 0.11; γ = 1.0, 3.3, 6.0, N = 4, 16, 100

The mean value and standard deviation of kurtosis predicted by the HOSM model with M = 3, shown in Fig. 12, are listed in Table 4. The sampling variability of kurtosis, expressed in terms of the standard deviation, is highest for the JONSWAP spectrum with γ = 6 and N = 100, representing nearly unidirectional waves, followed by the sea states with γ = 3.3, N = 16, and γ = 1, N = 4. Because the kurtosis is calculated over a large 2D domain, the standard deviations for the directional wave field in Table 4 are significantly lower than those for the unidirectional waves listed in Table 2.

Table 4 Mean and standard deviation of kurtosis, directional wave field

7 Discussion

Reduction of sampling variability in field data is challenging. In situ 20- or 30-min temporal and spatial wave records include a limited number of observations. Today, several wave measurement campaigns record sea surface elevation continuously, theoretically allowing the duration of observations to be extended, but the stationarity of sea states, on which most wave models are based, is an issue. Further, stereo video camera systems have limited footprints, which limits the number of waves recorded. Increasing the sampling rate alone will increase the number of data points, but not necessarily the accuracy of the estimators of wave characteristics derived from them.

Figure 13 shows the variability of significant wave height, mean (zero-crossing) wave period, skewness, and kurtosis during 1 day of the Andrea storm (8–9 November 2007) at the point location in the central North Sea where the Andrea rogue wave was recorded by a sensor consisting of 4 lasers in an array (LASAR) on 9 November at 00:54 UTC (Magnusson and Donelan 2013). The wave parameters plotted in Fig. 13 are derived from one of the lasers’ wave profile measurements, using record lengths of 20 min, 60 min, and 3 h, given in 20-min intervals. As discussed in the previous sections, the variability of estimators of rogue-prone sea state characteristics will be largest for the shortest wave records, as can be seen in Fig. 13 for Hs, the mean wave period, skewness (SK), and kurtosis (KU).

Fig. 13

Variability of a significant wave height, b mean (zero-crossing) wave period, c skewness, and d kurtosis in the Andrea storm in the Ekofisk point location of the North Sea in the period 8 November 2007, 15:00 UTC–9 November 2007 15:00 UTC. The wave parameters were calculated from the 20-min (blue color), 60-min (red color) and 3-h (cyan color) temporal time series of surface elevation recorded by the laser

Extension of the duration of temporal measurements to 3 h or beyond may not always be possible, as stationarity of the weather conditions needs to be satisfied. Figure 13 clearly demonstrates that Hs, one of the main characteristics of sea state stationarity, may vary significantly within 1 day.
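The dependence of estimator variability on record length can be illustrated with synthetic data. The sketch below is not the analysis applied to the laser measurements: it uses a single long, strictly stationary, linear (Gaussian) record, which is exactly the condition that real storms such as Andrea do not satisfy, and it splits the record into non-overlapping windows rather than windows given in 20-min intervals. The Gaussian-shaped spectrum, the peak frequency of 0.1 Hz, the 1-s sampling interval, the 240-h duration, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_sea(n, dt, fp, rng):
    """Stationary linear record with an assumed Gaussian-shaped spectrum around fp."""
    f = np.fft.rfftfreq(n, dt)
    s = np.exp(-0.5 * ((f - fp) / (0.15 * fp)) ** 2)
    s[0] = 0.0                                    # no mean-level component
    coef = rng.standard_normal(f.size) + 1j * rng.standard_normal(f.size)
    eta = np.fft.irfft(coef * np.sqrt(s), n)
    return eta / eta.std()                        # unit variance; kurtosis is scale-free

def window_kurtosis(eta, samples_per_window):
    """Kurtosis of each non-overlapping window of the record."""
    m = eta.size // samples_per_window
    w = eta[: m * samples_per_window].reshape(m, samples_per_window)
    z = w - w.mean(axis=1, keepdims=True)
    return np.mean(z ** 4, axis=1) / np.mean(z ** 2, axis=1) ** 2

def spread_by_record_length(hours=240, dt=1.0, seed=1):
    """Standard deviation of kurtosis for 20-min, 60-min, and 3-h windows."""
    rng = np.random.default_rng(seed)
    eta = gaussian_sea(int(hours * 3600 / dt), dt, 0.1, rng)
    return {
        minutes: window_kurtosis(eta, int(minutes * 60 / dt)).std(ddof=1)
        for minutes in (20, 60, 180)
    }

for minutes, std in spread_by_record_length().items():
    print(f"{minutes:3d}-min records: std of kurtosis = {std:.3f}")
```

In this idealized setting the spread shrinks as the record length grows. In field data the same gain is available only while the sea state remains stationary over the longer window.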

Data recorded by stereo video camera systems also provide, in addition to temporal variations of surface elevation at a point, the spatial variability of a wave field (see Fedele et al. 2011; Fedele 2012; Benetazzo et al. 2015, 2017; Watanabe et al. 2019), but the stationarity of investigated sea states remains an issue. The spatial evolution of wave profiles obtained from these systems, however, may be of help in deciding the degree of wave field nonlinearity (see Babanin 2019).

Stereo video camera systems may cover relatively large observation domains, up to O(100–1000) m². The size of the effective domain within which the wave field is reliably reconstructed depends on hardware and installation properties of the stereo camera system such as lens focal length, image resolution, image acquisition frame rate, height of the stereo cameras above the sea surface, their viewing angle, and baseline distance. Environmental factors such as ambient lighting, precipitation, cloud cover, and sea-surface roughness also affect the quality and effective spatial domain size of wave field reconstructions from visual stereo video systems (Jähne et al. 1994; Benetazzo 2006; Benetazzo et al. 2017; Guimarães et al. 2019). Depending on the duration of recording, derived sea state characteristics of rogue-prone sea states may be affected by a bias. Gramstad et al. (2018) showed by numerical simulations that, when the computation domain was reduced from 5 × 5 km to 1.25 × 1.25 km, the estimated average spatial kurtosis of the linear wave field with Hs = 5 m and Tp = 10 s was underestimated by 4% if the number of runs was equal to 50 and by 2% if the number of runs was increased to 500. Further, the distribution of kurtosis became significantly narrower for 500 runs.

Sampling variability also brings challenges in establishing functional relations between nonlinear wave characteristics and in forecasting of rogue waves. Wave parameters identifying occurrence of rogue waves derived from 20- to 30-min field measurements, e.g., kurtosis and wave steepness, show a large scatter which does not allow the calculation of any regression lines between them, see, e.g., Olagnon and Magnusson (2004). Bitner-Gregersen et al. (2020a) demonstrated that, by use of numerical simulations, such relations can be provided if the duration of simulations is sufficiently long, the considered space domain sufficiently large, and the average estimators of wave parameters over all runs are considered. Alternatively, a coupling of a spectral wave model with the HOSM model can be a solution for prediction of rogue waves (Bitner-Gregersen et al. 2014b). Application of surrogate models using machine learning systems trained with output from a spectral wave model and HOSM simulations has also shown promising results (Gramstad and Bitner-Gregersen 2019).

Further, numerical simulations can also be used to validate theoretical and semi-theoretical wave models where, due to sampling variability, field data fail to do so. We illustrate this using numerical HOSM data from Bitner-Gregersen et al. (2020a) to validate the expressions for kurtosis suggested by Janssen (2009) and Mori et al. (2011). Mori and Janssen (2006) demonstrated theoretically that, for a nonlinear wave field, the kurtosis of sea surface elevation is affected by bound waves (κ4(bound)) and by dynamical effects due to the free waves (κ4(dyn)). Janssen (2009) later showed that, for a narrow-band wave field in deep water, κ4(bound) = 4.5(kpHs/2)². Further, by assuming unidirectional narrow-band waves and a Gaussian wave spectrum, and based on numerical simulations of a modified nonlinear Schrödinger equation (MNLS), Mori et al. (2011) proposed approximating the dynamical kurtosis as κ4(dyn) = (π/√3)BFI², where BFI is the Benjamin-Feir Index. Closed-form expressions for BFI were suggested for long-crested and short-crested sea states (see also Janssen and Janssen 2019). Use of these formulas for prediction of rogue waves was not always successful.

κ4(bound) and the total κ4(total) = 3 + κ4(bound) + κ4(dyn) for a unidirectional wave field following the proposed expressions are plotted in Fig. 14, together with the average kurtosis over all runs derived from space-time unidirectional HOSM simulations for the JONSWAP spectrum with γ = 1.0, 3.3, 6.0. The theoretical expressions derived for the narrow-band waves and the Gaussian spectrum deviate significantly from the HOSM kurtosis when the more realistic JONSWAP spectra are used. However, both curves show similar trends, even though the theoretical expressions predict much larger kurtosis with increase of wave steepness. Thus, the main idea of Janssen (2009), Mori et al. (2011), and Janssen and Janssen (2019) behind the proposed expressions is valid: wave steepness and BFI are important parameters for the description of kurtosis. Note that Fedele et al. (2016), when simulating with the HOSM method the sea states in which the Draupner, Andrea, and Killard rogue waves occurred, showed the limitation of the expression for κ4(bound) resulting from the narrow band assumption.
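For reference, the narrow-band expressions quoted above are straightforward to evaluate. The sketch below simply codes κ4(bound) = 4.5(kpHs/2)² and κ4(dyn) = (π/√3)BFI² and sums them with the Gaussian value of 3. BFI is treated as a user-supplied input, because the closed-form BFI expressions are not reproduced in this text, and the function names are illustrative.

```python
import math

def kurtosis_bound(steepness):
    """Bound-wave contribution, narrow-band deep water: 4.5 (k_p Hs / 2)^2."""
    return 4.5 * steepness ** 2

def kurtosis_dynamic(bfi):
    """Dynamical contribution for unidirectional narrow-band waves: (pi / sqrt(3)) BFI^2."""
    return math.pi / math.sqrt(3.0) * bfi ** 2

def kurtosis_total(steepness, bfi):
    """Total kurtosis: 3 + bound + dynamic."""
    return 3.0 + kurtosis_bound(steepness) + kurtosis_dynamic(bfi)

# Bound-wave term for the steepness kpHs/2 = 0.11: 4.5 * 0.0121 = 0.05445
print(kurtosis_bound(0.11))
```

For kpHs/2 = 0.11 the bound-wave term alone raises the kurtosis by only about 0.054, so in these expressions most of any larger departure from 3 has to come from the BFI term.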

Fig. 14

Space-time kurtosis as a function of wave steepness kpHs/2: unidirectional HOSM simulations for the JONSWAP spectrum with γ = 1.0, 3.3, 6.0, the fitted second-order polynomials, and theoretical expressions from Janssen and Janssen (2019)

Figure 14 shows that the second-order polynomials described by Eq. (5) approximate the unidirectional HOSM data well.

$$ \kappa_4 = A\left(k_p H_s/2\right)^2 + B\left(k_p H_s/2\right) + C $$
(5)

where the parameters A, B and C are functions of the spectrum peakedness parameter γ, as illustrated in Fig. 15.
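A fit of the form of Eq. (5), followed by a linear fit of each coefficient against γ, can be sketched with ordinary least squares. The numbers below are SYNTHETIC placeholders chosen only to exercise the code; they are not the HOSM results or the fitted coefficients of this study, and the function names are illustrative.

```python
import numpy as np

def fit_eq5(steepness, kurtosis):
    """Least-squares fit of kurtosis = A s^2 + B s + C; returns (A, B, C)."""
    a, b, c = np.polyfit(steepness, kurtosis, deg=2)
    return a, b, c

def fit_linear_in_gamma(gammas, values):
    """Straight-line fit of one coefficient against gamma; returns (slope, intercept)."""
    slope, intercept = np.polyfit(gammas, values, deg=1)
    return slope, intercept

# SYNTHETIC placeholder coefficients (A, B, C) per gamma; NOT values from the paper.
placeholder = {1.0: (10.0, -0.2, 3.0), 3.3: (20.0, -0.4, 3.0), 6.0: (30.0, -0.6, 3.0)}
steepness = np.linspace(0.02, 0.14, 7)

fitted = {}
for gamma, (A, B, C) in placeholder.items():
    # Stand-in for the run-averaged kurtosis at each steepness.
    kurt = A * steepness ** 2 + B * steepness + C
    fitted[gamma] = fit_eq5(steepness, kurt)

gammas = sorted(fitted)
slope_A, intercept_A = fit_linear_in_gamma(gammas, [fitted[g][0] for g in gammas])
print(fitted)
print(slope_A, intercept_A)
```

With real data, the input to `fit_eq5` would be the kurtosis averaged over all runs at each steepness, since single-run values scatter too widely to support a regression.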

Fig. 15

The parameters A, B, and C from Eq. (5) as a function of the spectrum peakedness parameter γ and fitted linear polynomials

8 Conclusions

This study demonstrates, using numerical simulations, the different effects that sampling variability can have on estimators of nonlinear characteristics of the wave field, such as surface elevation, skewness, and kurtosis, when single 20- or 30-min wave field records are used in an analysis. It shows, using selected rogue-prone sea states as examples, that independent of the value of sea state wave steepness, wave frequency and directional spreading, and water depth, 20- or 30-min wave temporal and spatial measurements will be affected by sampling variability.

Although single 20–30-min wave field records, temporal or spatial, provide description of individual characteristics of rogue waves such as, e.g., their wave heights, crest heights, wave periods, and steepness, these measurements give limited information about the wave fields in which such rogue events occur. Having 20–30-min in situ temporal or spatial wave records, it may be challenging to conclude on the importance of the nonlinearity of surface elevation, because the sampling variability may dominate over the nonlinear effects. This brings challenges for description of rogue-prone sea states using field measurements and for engineering applications. We demonstrate that nonlinearity of a wave field can be easily shown by applying numerical linear, second- and HOSM third-order simulations.

To provide accurate estimators of nonlinear sea surface characteristics, a sufficient amount of data is needed. However, increasing the duration of temporal and spatial measurements may not always be possible, as stationarity of sea states is an issue and should be considered with care.

Single 20- or 30-min realizations of rogue-prone sea states show large variability of surface elevation and its characteristics, larger for the temporal data than for the spatial ones. It is interesting to note that, due to sampling variability, wave crests and troughs in a single 20–30-min wave record may have similar values for the HOSM and the linear wave model, but the evolution of the linear and nonlinear wave trains will be different. This will affect response calculations of marine structures.

To get more complete information about sea states in which rogue waves occur, wave parameters characterizing them should be presented as average values with corresponding standard deviations. The average values of nonlinear wave parameters and associated standard deviations can be derived directly from field data using a simplified approach suggested by Bitner-Gregersen and Magnusson (2014), when the duration of 20-min wave records is extended to 1–6 h, satisfying at the same time the assumption of stationarity. Alternatively, numerical simulations, together with laboratory experiments, represent good supporting tools for field measurements, as demonstrated in the present study. The average nonlinear estimators of a rogue-prone sea state’s characteristics with plus/minus three standard deviations should be considered in engineering applications, to capture their inherent variability. The inherent variability of sea surface oscillations also brings challenges in establishing functional relations between nonlinear wave characteristics, forecasting of rogue waves and validation of theoretical and semi-theoretical wave models. Numerical simulations are also of great support here.

Further, rogue events have typically been recorded at single point locations by in situ measurements which lack information about frequency-directional wave spectra. Often wave spectral models may be the only source of 2-dimensional wave spectra, but their accuracy has been questioned within the wave community, e.g., Ardag and Resio (2019) and Cavaleri et al. (2020). Some concern exists that the wave spectral models may provide spectra which are too wide compared to those derived from wave measurements. Improving the availability of directional measurements is essential for the description of rogue waves in the future and enhancing safety at sea.