6.1 Introduction

The diagnosis of atrial fibrillation (AF) is based on the finding of an irregular ventricular rhythm, further strengthened when f waves are discernible. Since no information beyond the presence of f waves is considered when making the diagnosis, f wave characterization has yet to find its way into clinical practice. At the same time, f wave characterization is receiving considerable attention in the scientific community, driven by the need for noninvasive information on electropathological alterations in the atria, which may facilitate patient-tailored treatment of AF.

Invasive measurements, acquired during electrophysiological examination or open thorax surgery, can be used to characterize the atrial activity. While invasive measurements obviously provide a much more local characterization of the atrial activity than the surface ECG, the acquisition of invasive signals must take place inside the hospital, the required equipment is expensive, and the procedure is associated with increased risk of patient complication. Moreover, the acquisition duration is limited by the procedural duration, which may last for a few minutes only, while the surface ECG may be acquired over weeks or even months.

There are several aims of f wave characterization, many of them related to the prediction of treatment outcome [1,2,3]. For example, a low f wave amplitude predicts AF recurrence in patients with persistent AF undergoing catheter ablation [4], and, conversely, a large amplitude predicts termination of persistent AF during catheter ablation [5]. For patients with persistent AF undergoing cardioversion, a low atrial fibrillatory rate (AFR) predicts successful outcome [6], and, conversely, a fast rate predicts AF recurrence [7]. Monitoring of the effect of antiarrhythmic drug therapy is another application where f wave characterization provides valuable information, particularly in the developmental phase of the drug when the complications of invasive electrophysiological testing can to some extent be avoided [8]. For example, different f wave characteristics, including the AFR, have been studied in patients receiving either a drug under development or placebo, with the aim of determining what characterizes patients converting to normal sinus rhythm, as well as patients not converting [9]. In all these applications, ECG-derived information may be considered for optimizing AF management and supporting therapeutic decisions at substantial cost savings.

Yet another, more general aim of f wave characterization is to investigate the structural changes and the electrophysiological remodeling that take place in the atria as AF progresses from self-terminating paroxysms to a more sustained or permanent state [10]. The outcome of such investigations may turn out to be instrumental in preventing the progression of AF.

From an engineering perspective, the problems of detecting AF and extracting f waves, treated in Chaps. 4 and 5, respectively, are considerably more clear-cut than the problem of characterizing f waves. The main reason is that methods for detection and extraction lend themselves to performance evaluation which can be expressed in technical terms, e.g., evaluation based on annotated or simulated ECG signals, whereas methods for f wave characterization, at least so far, rest on phenomenological observations which may link a certain f wave characteristic to the clinical issue at hand, be it related to prediction or evaluation of treatment outcome. As a result, research on f wave characterization involves more groping in the dark than does research on AF detection and f wave extraction. On the other hand, more room is available for investigating different techniques for signal characterization, with implications for clinical management.

The characterization of f waves has for many years revolved around f wave amplitude and AFR, the two fundamental signal characteristics which are relatively straightforward to determine [11]. However, as signal processing techniques have grown more sophisticated and diversified, research on f wave characterization has become increasingly multifaceted. Different techniques have been investigated for analyzing nonstationary f wave signals with respect to spatiotemporal organization and nonlinear dynamics [2, 12, 13], as well as for analyzing the spatial distribution of different f wave characteristics on the body surface [14].

The majority of parameters proposed for f wave characterization are well-known in the realm of signal processing. Indeed, few parameters have been developed with reference to a statistical signal model accounting for certain specific electrophysiological phenomena. The lack of tailored, model-based parameters is most likely due to the difficulty of associating a particular f wave characteristic with a certain local electrophysiological property of the atria. This lack may be remedied using computational modeling and simulation to obtain a better understanding of the genesis of f waves [15,16,17].

Given the extensive work on f wave extraction, one would expect most studies on f wave characterization to be based on the extracted f wave signal—an expectation which remains to be fulfilled. With easy-to-implement methods such as average beat subtraction (ABS), the presence of QRS-related residuals will, to various extents, worsen the reliability of f wave characterization. For example, measurements of f wave amplitude are likely to be more vulnerable to such residuals than measurements on AFR. To evade this problem, several authors have confined characterization to f waves contained in TQ intervals [18,19,20,21]. However, as already pointed out in Chap. 5, the availability of fewer samples implies less accurate results, and, therefore, it is hoped that well-performing extraction methods will find their way into studies on f wave characterization.

This chapter reviews different approaches to f wave characterization, together forming a smorgasbord of “dishes” rather than a coherent body of methods. First, the two fundamental characteristics f wave amplitude and AFR are considered in Sects. 6.2 and 6.3, respectively, followed by a description of linear and nonlinear techniques for characterizing f wave morphology and regularity (Sect. 6.4). Techniques for quantifying f wave signal quality in individual leads are described in Sect. 6.5, needed to ensure that f wave characterization is performed on signals with sufficient quality. The analysis of spatial ECG information, manifested as a vectorcardiographic loop or a body surface potential map, is reviewed in Sect. 6.6. The chapter concludes with a brief overview of popular clinical applications where the herein described approaches to f wave characterization are explored.

6.2 f Wave Amplitude

In clinical studies, f wave amplitude has been manually analyzed after quantization into either fine or coarse, defined as less than or greater than 50 \(\upmu \)V [22,23,24,25,26]. As caliper measurements of f wave amplitude now belong to history, such quantization has once and for all been shelved in favor of continuous-valued measurements. Based on the extracted f wave signal x(n),Footnote 1 but with the QRS intervals excluded to avoid the influence of QRS-related residuals, a straightforward definition of f wave amplitude is the average of the four largest peak-to-peak amplitudes of individual f waves in a 10-s recording [27, 28]. Given that the f wave amplitude often varies over time, it may be necessary to average all peak-to-peak amplitudes contained in the recording to produce a representative measurement. Determination of the peak-to-peak amplitude requires that a search interval is first delineated so that the extrema of the f wave can be located. The length of the search interval depends on the AFR, and thus the AFR needs to be estimated.

The f wave amplitude does not necessarily have to rely on the amplitude of local extrema, but can just as well be computed as a root mean square (RMS) amplitude of the extracted f wave signal, or, as in [29], without the square root to instead measure signal power. Another approach would be to employ classical envelope detection based on the Hilbert transform [30], where the f wave amplitude can be determined as an average of the envelope in the time interval of interest.

Envelope detection based on local extrema has also been proposed for the measurement of f wave amplitude [31], see also [32]. Once x(n) has been centered, i.e., its mean \(m_x\) has been removed, the lower envelope \(e_{l}(n)\) is obtained by connecting successive local minima of x(n) using polynomial interpolation, and the upper envelope \(e_{u}(n)\) by connecting successive local maxima of x(n); a piecewise cubic Hermite interpolating polynomial was used in [31].Footnote 2 The sample-to-sample difference between \(e_{u}(n)\) and \(e_{l}(n)\) is taken as a measure of the local amplitude, which, when averaged over the entire N-sample signal,

$$\begin{aligned} A_\text {f} = \frac{1}{N} \sum _{n=0}^{N-1} |e_{u}(n) - e_l(n)|, \end{aligned}$$
(6.1)

is a measure of global f wave amplitude. The different signals involved with the computation of \(A_\text {f}\) are illustrated in Fig. 6.1.
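The envelope-based measurement in (6.1) can be sketched as follows. This is a minimal illustration rather than the implementation of [31]: linear interpolation (`np.interp`) is swapped in for the piecewise cubic Hermite polynomial, the extremum detector is a simple sign test on the first difference, and the function name `envelope_amplitude` and the synthetic test signal are assumptions.

```python
import numpy as np

def envelope_amplitude(x):
    """Global amplitude A_f of (6.1); envelopes by linear interpolation (assumption)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # center the signal (remove m_x)
    n = np.arange(x.size)
    d = np.diff(x)
    # Interior local extrema: the first difference changes sign.
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    if maxima.size < 2 or minima.size < 2:
        raise ValueError("need at least two local maxima and two local minima")
    e_u = np.interp(n, maxima, x[maxima])  # upper envelope e_u(n)
    e_l = np.interp(n, minima, x[minima])  # lower envelope e_l(n)
    return float(np.mean(np.abs(e_u - e_l)))

fs = 50.0                                  # Hz, illustrative sampling rate
t = np.arange(int(10 * fs)) / fs           # 10-s recording
x = 0.05 * np.sin(2 * np.pi * 6.0 * t)     # synthetic 6-Hz wave, 0.05 mV amplitude
A_f = envelope_amplitude(x)                # somewhat below the 0.1 mV peak-to-peak value
```

Because the sampled extrema rarely coincide with the true peaks, the result slightly underestimates the peak-to-peak amplitude of the underlying continuous-time wave.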

Fig. 6.1 Envelope-based measurement of f wave amplitude using polynomial interpolation: the f wave signal after 0.5–30 Hz bandpass filtering (solid line), the lower and upper envelopes \(e_{l}(n)\) and \(e_{u}(n)\) (dotted lines) obtained by connecting successive local minima and maxima (marked by “+” and “o,” respectively), and the difference \(e_{u}(n)-e_{l}(n)\) (dashed–dotted line) used to compute the amplitude in (6.1). (Reprinted from [31] with permission)

The methods in [27, 31] require that the extrema of the f wave signal are determined before the amplitude can be measured. The method in [27] produces measurements which are more intuitive since the samples between peaks are not taken into account. However, as the noise level increases, peak-to-peak measurements become increasingly less reliable than those obtained from the envelope [31]. To reduce the influence of baseline wander and muscular noise, the original ECG signal is bandpass filtered before the amplitude is measured, using a passband of either 1–50 Hz [27] or 0.5–30 Hz [31].

Neither of the two methods in [27, 31] has been applied to f waves extracted in the QRS interval. In fact, the envelope-based method analyzes an f wave signal resulting from the concatenation of consecutive TQ intervals, thus making f wave extraction superfluous [31]. Since concatenation sometimes leads to jumps at the interval boundaries, peaks located near the boundaries are excluded from polynomial interpolation. Moreover, some TQ intervals are so short that only a partial f wave is available for amplitude measurement.

The repeatability of f wave amplitude was investigated on a data set of 20 clinically stable patients with AF, using the average of the four largest peak-to-peak amplitudes [27]. For each patient, 10 ECGs of 10-s length were recorded at regular intervals over the course of 24 h. The results showed that the interpatient differences were substantial, with f wave amplitudes ranging from 60 to 350 \(\upmu \)V (mean±standard deviation equal to 131±54 \(\upmu \)V). On the other hand, the intrapatient differences were significantly smaller during the 24 h, ranging from 4 to 53 \(\upmu \)V when determined over the 10 intrapatient ECGs, with an average standard deviation of 21 \(\upmu \)V.

Assuming that f waves can be approximated by a sinusoid, the peak-to-peak amplitude can be compared to the RMS amplitude, since, for a sinusoid, the former is \(2\sqrt{2}\approx 2.8\) times the latter. Using this approximation, a qualitative comparison can be made between the results reported in [27] and the histogram of f wave RMS amplitude displayed in Fig. 3.4b. The results are in fairly good agreement with each other, since, following multiplication by 2.8, the f wave amplitudes in Fig. 3.4b range from 35 to 340 \(\upmu \)V (117±48 \(\upmu \)V), to be compared with 60 to 350 \(\upmu \)V (131±54 \(\upmu \)V).
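The conversion factor between peak-to-peak and RMS amplitude can be checked numerically. The short sketch below uses a synthetic 5-Hz sinusoid sampled over an integer number of periods; the sampling rate and amplitude are arbitrary illustrative choices.

```python
import numpy as np

fs = 1000.0
t = np.arange(1000) / fs                    # 1 s = five full periods at 5 Hz
x = 0.05 * np.sin(2 * np.pi * 5.0 * t)      # amplitude 0.05 mV (illustrative)
rms = np.sqrt(np.mean(x ** 2))              # RMS amplitude, A / sqrt(2)
peak_to_peak = x.max() - x.min()            # peak-to-peak amplitude, 2A
ratio = peak_to_peak / rms                  # 2 * sqrt(2) for a pure sinusoid
```

For real f waves, which are not pure sinusoids, the ratio should be expected to deviate from this value, so the comparison remains qualitative.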

6.3 Atrial Fibrillatory Rate and Beyond

Atrial fibrillatory rate, being the other fundamental f wave characteristic, has received considerable clinical attention during the last two decades [3]. A spectral approach is commonly used to estimate the AFR, since estimation based on the occurrence times of the f waves is complicated by the difficulty of defining a consistent fiducial point. Another reason is that the signal-to-noise ratio (SNR) may be poor. In contrast, when invasively recorded signals are subject to analysis, the AFR is often determined from the occurrence times of the local activations, with complex wavefront morphologies and low SNR as the factors which have the most influence on the accuracy of AFR estimation. Considerable research effort has been spent on developing techniques for estimation of local activation times [34,35,36,37,38,39,40] as an alternative to using spectral analysis [41,42,43].

Spectral analysis of the extracted f wave signal plays an important role not only in AFR estimation, but also in the characterization of f wave morphology. When changes in the spectral content of the f wave signal are of interest, whether spontaneous or due to intervention, time–frequency analysis is better suited for quantifying such changes.

In the engineering oriented literature, the term dominant atrial frequency (DAF) is usually substituted for AFR, where “dominant” refers to the largest spectral peak. In the clinical literature, the term dominant atrial cycle length (DACL) is sometimes substituted for AFR. Atrial fibrillatory rate, DAF, and DACL convey the same information, though they are expressed in units of fibrillations per minute (fpm), Hertz, and milliseconds, respectively. Since the DAF estimate is used to determine both AFR and DACL, DAF is the preferred terminology in the following.
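Since the three quantities differ only in units, the conversion from a DAF estimate is simple arithmetic. The helper names below are hypothetical and serve only to illustrate the relations.

```python
def daf_to_afr_fpm(daf_hz):
    """AFR in fibrillations per minute from a DAF in Hz."""
    return 60.0 * daf_hz

def daf_to_dacl_ms(daf_hz):
    """DACL in milliseconds from a DAF in Hz."""
    return 1000.0 / daf_hz

afr = daf_to_afr_fpm(7.0)    # 420 fpm
dacl = daf_to_dacl_ms(7.0)   # about 143 ms
```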

6.3.1 Dominant Atrial Frequency

The position of the largest peak in the power spectrum of the extracted f wave signal defines the DAF, denoted \(\omega _0\). Nonparametric spectral estimation is typically employed, which, in most cases, is synonymous with Welch’s method, where the signal is divided into shorter, overlapping segments, followed by windowing of each segment [44].Footnote 3 The power spectrum is obtained by averaging the power spectra (periodograms) of the segments. Each segment is padded with zeros so that the position of the spectral peak can be determined more accurately; however, zero padding does not improve spectral resolution in the sense that two closely spaced spectral peaks are better resolved when the original signal is padded with zeros. A signal length of a few seconds is needed to produce an acceptable variance of the power spectrum. If better spectral resolution is needed, longer segments need to be analyzed. For example, a 10-s segment yields, at best, a frequency resolution of 0.1 Hz depending on the window chosen.
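A minimal NumPy sketch of DAF estimation with Welch's method is given below. The Hann window, 50% overlap, zero-padded FFT length, and the 3–12 Hz search band are assumptions made for illustration; the function names `welch_psd` and `dominant_frequency` are hypothetical, and the input is a synthetic sinusoid in noise rather than an extracted f wave signal.

```python
import numpy as np

def welch_psd(x, fs, seg_len, overlap=0.5, nfft=None):
    """Average of windowed, zero-padded periodograms of overlapping segments."""
    x = np.asarray(x, dtype=float)
    nfft = nfft or seg_len
    step = max(1, int(seg_len * (1 - overlap)))
    w = np.hanning(seg_len)
    psd = np.zeros(nfft // 2 + 1)
    count = 0
    for s in range(0, x.size - seg_len + 1, step):
        seg = x[s:s + seg_len]
        seg = (seg - seg.mean()) * w                      # remove mean, apply window
        psd += np.abs(np.fft.rfft(seg, n=nfft)) ** 2      # zero padding to nfft
        count += 1
    if count == 0:
        raise ValueError("signal shorter than one segment")
    psd /= count * fs * np.sum(w ** 2)
    return np.fft.rfftfreq(nfft, d=1 / fs), psd

def dominant_frequency(freqs, psd, f_lo=3.0, f_hi=12.0):
    """Position of the largest peak within an assumed search band."""
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return freqs[band][np.argmax(psd[band])]

rng = np.random.default_rng(0)
fs = 50.0
t = np.arange(int(60 * fs)) / fs
x = np.sin(2 * np.pi * 6.3 * t) + 0.5 * rng.standard_normal(t.size)
freqs, psd = welch_psd(x, fs, seg_len=500, nfft=4096)     # 10-s segments
daf = dominant_frequency(freqs, psd)
```

The zero padding to 4096 points refines the frequency grid on which the peak is located, but, as noted above, the ability to resolve two nearby peaks is still set by the 10-s segment length.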

Figure 6.2 displays the power spectra computed from extracted f wave signals in leads V\(_1\), V\(_2\), and V\(_3\). The largest spectral peak occurs at approximately the same position in all three leads, where the f waves of V\(_1\) have the largest amplitude. In this example, the position of the next largest peak in V\(_1\) and V\(_2\) is not harmonically related to the position of the largest peak; the next largest peak is likely the expression of a time-varying DAF, discussed below.

Fig. 6.2 Power spectra of extracted f wave signals in leads V\(_1\), V\(_2\), and V\(_3\). The dominant peak is marked with “*”

Since f waves are mostly characterized by frequencies up to 25 Hz, a sampling rate much lower than that required for f wave extraction can be used. Thus, the original ECG sampling rate can be decimated to 50 Hz without loss of clinical information. Although sampling rate decimation is not a critical operation when performing nonparametric spectral analysis, it is critical when performing parametric spectral analysis based on autoregressive modeling due to the risk of producing spectra with spurious peaks for too high a sampling rate [45].

Instead of performing spectral analysis of the extracted f wave signal, the analysis may be confined to the samples of successive TQ intervals [46]. In such cases, a technique must be employed which can handle unevenly sampled signals. Using iterative singular spectrum analysis (SSA), cf. Sect. 5.7, an atrial subspace is first determined from several, consecutive TQ intervals, after which the f wave signal of the QRST intervals is estimated by projecting the QRST samples on the atrial subspace. The resulting signal, composed of interpolated samples in the QRST intervals and observed samples in the TQ intervals, is then subject to spectral analysis using, for example, Welch’s method.

Using simulated f wave signals, all with 1-min duration and a 7-Hz DAF, iterative SSA was used to estimate the DAF [46]. The results showed that the estimation error rarely exceeded 1.0 Hz at heart rates up to 130–140 beats per minute (bpm) and relatively low SNRs. Recalling general results on the variance of frequency estimators [47, Chap. 3], the spectral estimation error is lower at higher frequencies, but higher at lower frequencies. Thus, not surprisingly, the best performing scenario for the iterative SSA is one with a slower heart rate, i.e., the TQ intervals are longer, and a higher DAF. The SSA-based technique was developed for estimating the DAF, whereas information on other harmonics, needed to compute some of the spectral parameters described below, is not captured.

Lomb’s periodogram is another technique for estimating the power spectrum of an unevenly sampled signal [30, 48]. This periodogram is determined by minimizing the squared error between the observed samples and a sinusoidal model signal composed of different frequencies. The accuracy of DAF estimates obtained from Lomb’s periodogram is similar to that of estimates obtained from iterative SSA, although the latter method tended to produce lower errors at lower heart rates [46].
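The following sketch implements the classical Lomb periodogram in NumPy and applies it to a synthetic 7-Hz sinusoid observed only in simulated TQ intervals. The TQ mask (the last 45% of each randomly drawn RR interval), the noise level, and the frequency grid are assumptions for illustration and are not taken from [46].

```python
import numpy as np

def lomb_periodogram(t, x, freqs_hz):
    """Classical (unnormalized) Lomb periodogram of unevenly sampled data."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    p = np.empty(len(freqs_hz))
    for i, f in enumerate(freqs_hz):
        w = 2 * np.pi * f
        # The time offset tau makes the sine and cosine terms orthogonal.
        tau = np.arctan2(np.sum(np.sin(2 * w * t)), np.sum(np.cos(2 * w * t))) / (2 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        p[i] = 0.5 * ((x @ c) ** 2 / (c @ c) + (x @ s) ** 2 / (s @ s))
    return p

rng = np.random.default_rng(1)
fs = 50.0
t_all = np.arange(int(30 * fs)) / fs
x_all = np.sin(2 * np.pi * 7.0 * t_all) + 0.2 * rng.standard_normal(t_all.size)

# Hypothetical TQ mask: keep the last 45% of each irregular RR interval.
keep = np.zeros(t_all.size, dtype=bool)
beat = 0.0
while beat < t_all[-1]:
    rr = rng.uniform(0.6, 1.0)
    keep |= (t_all >= beat + 0.55 * rr) & (t_all < beat + rr)
    beat += rr

freqs = np.arange(3.0, 12.0, 0.02)
p = lomb_periodogram(t_all[keep], x_all[keep], freqs)
daf = freqs[np.argmax(p)]
```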

As a rule, spectral analysis of multi-lead ECGs is performed on a lead-by-lead basis, resulting in a set of parameters characterizing the spatial distribution of spectral information. Another, less common approach is provided by the spectral envelope method [49] which combines spectral information of the different leads into a single power spectrum, where periodic components are emphasized and noise is suppressed [50, 51].

6.3.2 Spectral Parameters

The parameter spectral organization (SO) describes the harmonic structure of the f wave signal [52, 53]. A more organized signal, manifested by a harmonic spectrum with a dominant spectral peak, is hypothesized to reflect fewer wavelets circulating within the atria. Conversely, a less organized signal, manifested by “more frequency components added to the atrial signal,” is hypothesized to reflect more wavelets. Spectral organization is defined by

$$\begin{aligned} P_{\text {SO}} = \frac{\displaystyle \sum _{k=1}^{K} \int _{-\varDelta \omega }^{\varDelta \omega } S_{x}(\hat{\omega }_{k-1}+\omega ) d\omega }{\displaystyle \int _{\omega _{\text {min}}}^{\omega _{\text {max}}} S_{x}(\omega ) d\omega }, \end{aligned}$$
(6.2)

where \(S_x(\omega )\) is the power spectrum of x(n), and \(\omega _0,\dots ,\omega _{K-1}\) denote the positions of the K harmonics, i.e., the k-th harmonic is associated with \(\omega _{k-1}\). Four harmonics were analyzed in [52, 53], whereas two harmonics were analyzed in [51, 54]. The integration limits \(\varDelta \omega \) and \(\omega _{\text {min}}\) were set to 0.5 Hz and 2.5 Hz, respectively, and \(\omega _{\text {max}}\) was set to \(((K+1)\hat{\omega }_0-\varDelta \omega )\). Since the actual positions of the second and higher harmonics often differ slightly from the expected positions at \(k\hat{\omega }_0,\) \(k=2,\ldots ,K\), \(\hat{\omega }_{k-1}\) is determined by a grid search restricted to an interval centered around \(k\hat{\omega }_0\). A time-varying version of \(P_{\text {SO}}\) has been proposed in [54], involving an adaptive algorithm for tracking of the harmonics, see Sect. 6.4.1.
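A discrete approximation of (6.2) on a uniformly sampled spectrum may look as follows, with the integrals replaced by sums over frequency bins (the bin width cancels in the ratio). The width of the harmonic search interval (`search`) is not specified in the text and is an assumption, as are the function name and the synthetic spectra used for the demonstration.

```python
import numpy as np

def spectral_organization(freqs, psd, f0, K=2, delta=0.5, f_min=2.5, search=0.5):
    """Discrete approximation of P_SO in (6.2); freqs in Hz on a uniform grid."""
    freqs = np.asarray(freqs, dtype=float)
    psd = np.asarray(psd, dtype=float)
    f_max = (K + 1) * f0 - delta
    num = 0.0
    for k in range(1, K + 1):
        f_k = k * f0
        if k > 1:
            # Grid search for the k-th harmonic around k*f0 (interval width assumed).
            near = np.abs(freqs - k * f0) <= search
            f_k = freqs[near][np.argmax(psd[near])]
        band = (freqs >= f_k - delta) & (freqs <= f_k + delta)
        num += psd[band].sum()
    total = psd[(freqs >= f_min) & (freqs <= f_max)].sum()
    return num / total

freqs = np.arange(0.0, 25.0, 0.05)
peaks = (np.exp(-0.5 * ((freqs - 6.0) / 0.15) ** 2)
         + 0.3 * np.exp(-0.5 * ((freqs - 12.2) / 0.15) ** 2))
p_organized = spectral_organization(freqs, peaks + 0.01, f0=6.0)    # low noise floor
p_disorganized = spectral_organization(freqs, peaks + 0.3, f0=6.0)  # high noise floor
```

In this synthetic example, raising the broadband floor lowers the value, in line with the interpretation of a less organized signal.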

Another approach to characterizing the harmonic structure is based on the spectral line model, where the decay of the amplitude of the harmonics constitutes the crucial parameter [55]. The model is defined by the magnitude \(a_0\) of the dominant spectral peak at \(\omega _0\), the exponential decay \(\gamma \), referred to as the harmonic decay (HD), and the harmonic frequencies \(\omega _0,\ldots ,\omega _{K-1}\),

$$\begin{aligned} S_{\text {HD}}(\omega ) = a_0 e^{-\gamma k} \delta (\omega -\omega _k), \quad k=0,\ldots ,K-1, \end{aligned}$$
(6.3)

where \(a_0\) and \(\gamma \) are unknown parameters, whereas \(\omega _0,\ldots ,\omega _{K-1}\) may be determined as described above. By taking the logarithm of \(S_{\text {HD}}(\omega )\), the estimation of \(a_0\) and \(\gamma \) is transformed into a problem of fitting a line to \(\ln S_{x}(\omega )\). Using the least squares (LS) method, joint minimization of the cost function

$$\begin{aligned} J(\ln a_0,\gamma ) = \sum _{k=0}^{K-1} \left( \ln S_{x}(\hat{\omega }_k) - (\ln a_0 - \gamma k) \right) ^2 \end{aligned}$$
(6.4)

with respect to \(a_0\) and \(\gamma \) yields the following two estimators:

$$\begin{aligned} \hat{a}_0&= \exp \left[ \frac{2(2K-1)}{K(K+1)} \sum _{k=0}^{K-1} \ln S_{x}(\hat{\omega }_k) - \frac{6}{K(K+1)} \sum _{k=0}^{K-1} k \ln S_{x}(\hat{\omega }_k) \right] , \end{aligned}$$
(6.5)
$$\begin{aligned} \hat{\gamma }&= \frac{6}{K(K+1)} \sum _{k=0}^{K-1} \ln S_{x}(\hat{\omega }_k) - \frac{12}{K(K^2-1)} \sum _{k=0}^{K-1} k \ln S_{x}(\hat{\omega }_k), \end{aligned}$$
(6.6)

where exponentiation is used to transform back to the original model parameters in (6.3). A wide range of f wave morphologies can be represented by the spectral line model, spanning from sawtooth-like waves, observed at an early stage of AF, to sinusoidal-like waves, observed in permanent AF, illustrated in Fig. 6.3. Since a slower AFR is usually associated with sawtooth-like waves, i.e., characterized by several harmonics, and a faster AFR with more sinusoidal-like waves, i.e., characterized by the fundamental frequency, it is plausible to assume that \(\omega _0\) and \(\gamma \) are positively correlated as AF progresses [55].
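The line fit underlying the estimation of \(a_0\) and \(\gamma \) can be illustrated by minimizing the cost function (6.4) with a generic least squares routine. The sketch below uses `np.polyfit` instead of closed-form expressions; the function name and the noise-free test values are assumptions.

```python
import numpy as np

def harmonic_decay_fit(peak_powers):
    """LS fit of ln S_x(w_k) = ln a_0 - gamma * k, k = 0, ..., K-1, as in (6.4)."""
    y = np.log(np.asarray(peak_powers, dtype=float))
    k = np.arange(y.size)
    slope, intercept = np.polyfit(k, y, 1)   # highest-order coefficient first
    return float(np.exp(intercept)), float(-slope)   # (a_0, gamma)

a0_true, gamma_true = 2.0, 0.8
powers = a0_true * np.exp(-gamma_true * np.arange(4))   # K = 4 harmonics, noise-free
a0_hat, gamma_hat = harmonic_decay_fit(powers)
```

Note that \(\gamma \) is the negative of the fitted slope, so that a positive \(\gamma \) corresponds to harmonics that decay in magnitude.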

Fig. 6.3 Simulated f waves with different morphologies, obtained by varying the parameters \(f_0\) and \(\gamma \) of the spectral line model in (6.3), assuming that \(\omega _k=(k+1)2\pi f_0\)

The logarithm of the spectral power ratio (SPR), defined by the harmonics positioned at \(\hat{\omega }_0\) and \(\hat{\omega }_1\), is yet another parameter for harmonic characterization [56],

$$\begin{aligned} P_{\text {SPR}} = \ln \left( \frac{S_x(\hat{\omega }_0)}{S_x(\hat{\omega }_1)}\right) . \end{aligned}$$
(6.7)

A large value of \(P_{\text {SPR}}\) reflects a spectrum with less pronounced harmonic structure, and vice versa.

The spectral parameters \(P_{\text {SO}}, \gamma \), and \(P_{\text {SPR}}\) require that at least two harmonics are present. Using spectral entropy (SE), less emphasis is put on the harmonic structure of \(S_x(\omega )\) and more on the complexity of the f wave signal [50, 51]. The spectral entropy of a narrowband signal is lower than that of a broadband signal. Since the entropy definition involves a probability mass function with unit area, the spectrum needs to be converted into such a function by normalizing each frequency component \(S_{x}(\omega _l)\) with the sum of all L components,

$$\begin{aligned} \overline{S}_{x}(\omega _l) = \frac{S_{x}(\omega _l)}{\displaystyle \sum _{i=1}^{L} S_{x}(\omega _i) }, \quad l=1,\ldots ,L, \end{aligned}$$
(6.8)

where \(\omega _1\) and \(\omega _L\) denote the lower and upper frequency limits, respectively, and \(\omega _2,\ldots ,\omega _{L-1}\) are equidistantly spaced frequencies between \(\omega _1\) and \(\omega _L\); thus, \(\omega _l\) does not denote a harmonic frequency in (6.8). The SE is defined by [57]

$$\begin{aligned} I_{\text {SE}} = -\sum _{l=1}^{L} \overline{S}_{x}(\omega _l) \log _2 \overline{S}_{x}(\omega _l). \end{aligned}$$
(6.9)
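The normalization in (6.8) and the entropy in (6.9) amount to a few lines of code. In the sketch below, zero-valued bins are skipped, following the usual convention that \(0 \log _2 0 = 0\); the function name and the two test spectra are illustrative assumptions.

```python
import numpy as np

def spectral_entropy(psd):
    """Spectral entropy (6.9) of a spectrum sampled at L equidistant frequencies."""
    p = np.asarray(psd, dtype=float)
    p = p / p.sum()                    # normalization (6.8)
    nz = p > 0                         # skip empty bins: 0 * log2(0) is taken as 0
    return float(-np.sum(p[nz] * np.log2(p[nz])))

flat = np.ones(64)                     # broadband: equal power in all 64 bins
line = np.zeros(64)
line[10] = 1.0                         # narrowband: all power in a single bin
se_flat = spectral_entropy(flat)       # log2(64) = 6 bits, the maximum for L = 64
se_line = spectral_entropy(line)       # 0 bits
```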

The spectral width of the largest peak is yet another parameter which has been investigated in a few studies [56, 58, 59]. However, this measurement is influenced by the spectral leakage effect, manifested by the power of a sinusoid leaking into adjacent frequencies within a bandwidth of approximately \(4\pi /N\), where N is the length of x(n) [44]. Moreover, the temporal variation often observed in DAF has profound influence on the spectral width. Together, these two factors explain why the spectral width has had very limited significance in clinical studies.

6.3.3 Time–Frequency Analysis

Power spectral analysis reflects the average signal behavior of the analyzed interval, and the position of the largest spectral peak represents the main carrier of clinically significant information. A bi- or multimodal spectral peak does not necessarily reflect the simultaneous presence of several frequencies; it may just as well reflect a DAF that varies within the analyzed interval. Using time–frequency analysis in patients with permanent AF [60], the variation in the DAF was found to be as large as 2.5 Hz during just a few seconds, suggesting that temporal variation in the DAF is a characteristic of the underlying, complex electrical activation patterns in the atria. Another reason to pursue time–frequency analysis is the wish to characterize changes in the DAF due to intervention, e.g., drug administration and tilt table testing.

A plethora of techniques have been developed for time–frequency analysis, of which the simplest, and the most common, is the short-term Fourier transform (STFT), being a linear, nonparametric method. The STFT results from modifying the one-dimensional discrete-time Fourier transform to include a sliding time window w(n) which extracts a segment from x(n) for analysis, resulting in a two-dimensional function \(X(n,\omega )\) defined by

$$\begin{aligned} X(n,\omega ) = \sum _{l=-\infty }^{\infty } x(l)w(l-n)e^{-j\omega l}. \end{aligned}$$
(6.10)

The length of w(n) determines the resolution in time and frequency: a short window yields good time resolution but poor frequency resolution, and vice versa. By analogy with the computation of the periodogram, the spectrogram is obtained by computing the squared magnitude of the STFT,

$$\begin{aligned} S_x(n,\omega ) = |X(n,\omega )|^2. \end{aligned}$$
(6.11)

In certain clinical applications, it may be desirable to track changes in the DAF as small as 0.1 Hz, thus calling for a segment length of at least 10 s. On the other hand, the DAF may change so rapidly over time that a time resolution of 10 s is insufficient. These conflicting demands have proven difficult to reconcile with the STFT, and, therefore, depending on the AF application at hand [59,60,61], other techniques for time–frequency analysis with better resolution in both time and frequency have been investigated.
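The spectrogram in (6.11) and the time–frequency trade-off can be illustrated with a short NumPy sketch. The Hann window, the 2.5-s window length, the 0.5-s hop, and the 3–12 Hz search band are assumptions, and the test signal is a synthetic sinusoid whose frequency steps from 5.5 to 7 Hz halfway through.

```python
import numpy as np

def spectrogram(x, fs, win_len, hop, nfft):
    """Squared magnitude of the STFT, cf. (6.10) and (6.11), on a frame grid."""
    x = np.asarray(x, dtype=float)
    w = np.hanning(win_len)
    starts = np.arange(0, x.size - win_len + 1, hop)
    S = np.empty((starts.size, nfft // 2 + 1))
    for i, s in enumerate(starts):
        S[i] = np.abs(np.fft.rfft(x[s:s + win_len] * w, n=nfft)) ** 2
    times = (starts + win_len / 2) / fs          # frame centers in seconds
    freqs = np.fft.rfftfreq(nfft, d=1 / fs)
    return times, freqs, S

fs = 50.0
n = np.arange(int(20 * fs))
inst_f = np.where(n < n.size // 2, 5.5, 7.0)     # step change in frequency
x = np.sin(2 * np.pi * np.cumsum(inst_f) / fs)   # phase-continuous test signal
times, freqs, S = spectrogram(x, fs, win_len=125, hop=25, nfft=2048)
band = (freqs >= 3.0) & (freqs <= 12.0)
daf_track = freqs[band][np.argmax(S[:, band], axis=1)]   # DAF per frame
```

A longer window would locate each frequency more precisely but would smear the transition over a longer time span.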

The Wigner–Ville distribution (WVD) is a well-known quadratic, nonparametric transform offering better resolution than the STFT [30, 62, 63]. Unfortunately, the quadratic structure also means the introduction of cross-terms in the time–frequency domain, arising between different signal components as well as between signal and noise components. Although the influence of cross-terms can be reduced by including a kernel function, the practical use of the WVD is still limited when multicomponent signals are encountered. Since the tracking of changes in the DAF is an important aspect of time–frequency analysis, the cross Wigner–Ville distribution (XWVD) is an attractive choice as it integrates the estimation of a varying frequency with the computation of the WVD [64]. The XWVD is initiated by the frequency series \(\hat{\omega }_{0,0}(n)\), determined from the STFT, where the two indices denote harmonic number and iteration number. The XWVD is computed between x(n) and a sinusoid defined by \(\hat{\omega }_{0,0}(n)\), from which an improved \(\hat{\omega }_{0,1}(n)\) can be estimated using peak detection of the XWVD. Based on \(\hat{\omega }_{0,1}(n)\), a new XWVD is computed, and so on, until the frequency series no longer changes from iteration to iteration.

Using the XWVD, spontaneous temporal variation can be uncovered in the DAF, illustrated in Fig. 6.4 where the XWVD of a 1-min extracted f wave signal is analyzed, obtained from a patient with permanent AF. The presence of such variation most likely explains why the dominant peak of the power spectrum is broad or bimodal as is the case in Fig. 6.2 [60].

Fig. 6.4 The cross Wigner–Ville distribution of a 1-min f wave signal obtained from a patient with permanent AF, using a 2.5-s Hanning window. The distribution is displayed for leads (a) V\(_1\), (b) V\(_2\), and (c) V\(_3\). The DAF is centered around 6 Hz in all three leads, with considerable variation ranging from about 5 to 7 Hz

The spectral profile method [55] was developed to address the limitation that the DAF is the focus of the XWVD, while other harmonics are ignored. The spectral profile results from averaging of frequency-aligned spectra of successive signal segments. By using a logarithmic frequency scale, rather than the conventional linear scale, spectra with different harmonic frequencies can be properly aligned and averaged. The resulting spectral profile exhibits a more distinct harmonic pattern than the spectra of separate segments, and, therefore, lends itself better to f wave characterization. In this method, the time–frequency distribution is similar to that produced by the STFT, except that a nonuniform, discrete-time Fourier transform is employed. The spectrum of each segment is aligned to the spectral profile by finding the frequency shift that minimizes the weighted LS error, after which the spectral profile is updated with the aligned spectrum.

In this method, each spectrum \(\mathbf {q}_p\) of the time–frequency distribution is obtained by computing the nonuniform, discrete-time Fourier transform of \(\mathbf {x}_p\),

$$\begin{aligned} \mathbf {q}_p = \mathbf {FWx}_p, \end{aligned}$$
(6.12)

where the column vector \(\mathbf {x}_p\) contains the N samples of the p-th signal segment; the computation is made in either overlapping or nonoverlapping segments. The resulting column vector \(\mathbf {q}_p\) contains L different frequencies, the \(N \times N\) diagonal matrix \(\mathbf W\) defines the window function w(n) applied to \(\mathbf {x}_p\), and the \(L \times N\) transform matrix \(\mathbf {F}\) is defined by L nonuniformly sampled frequencies,

$$\begin{aligned} \mathbf {F}=\left[ \begin{array}{ccccc} e^{-j0\pmb {\omega }} &{} e^{-j1\pmb {\omega }} &{} e^{-j2 \pmb {\omega }} &{} \cdots &{} e^{-j(N-1)\pmb {\omega }} \\ \end{array} \right] , \end{aligned}$$
(6.13)

where \(\pmb {\omega }=\begin{bmatrix} \nu _0&\cdots&\nu _{L-1}\end{bmatrix}^T\) is a column vector with logarithmically spaced frequencies \(\nu _l\), defined by

$$\begin{aligned} \nu _l= \nu _{\text {low}} \pi ^{\frac{\eta l}{L}}, \quad l=0,\ldots ,L-1. \end{aligned}$$
(6.14)

The two parameters \(\nu _{\text {low}}\) and \(\eta \) determine together the frequency interval relevant for f wave characterization, and L determines the sampling rate of the logarithmic frequency scale. Using \(\nu _{\text {low}}=0.31\) and \(\eta =2\), together with a 50-Hz sampling rate of the extracted f wave signal, the nonuniform Fourier transform is computed for frequencies ranging from 2.5 Hz to about 25 Hz [55].
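The transform in (6.12)–(6.14) can be sketched directly from the definitions. The parameter values \(\nu _{\text {low}}=0.31\), \(\eta =2\), and the 50-Hz sampling rate follow the text; the choices L = 256, N = 128, the Hann window, and the function name are assumptions, and the segment is a synthetic 6-Hz cosine.

```python
import numpy as np

def nonuniform_dft(x_p, nu, window=None):
    """q_p = F W x_p with F defined on the frequencies nu (rad/sample), cf. (6.12)."""
    x_p = np.asarray(x_p, dtype=float)
    N = x_p.size
    w = np.hanning(N) if window is None else np.asarray(window, dtype=float)
    F = np.exp(-1j * np.outer(nu, np.arange(N)))   # L x N transform matrix (6.13)
    return F @ (w * x_p)                           # W is diagonal: elementwise product

L = 256
nu = 0.31 * np.pi ** (2.0 * np.arange(L) / L)      # logarithmic grid (6.14)
fs = 50.0
n = np.arange(128)
x_p = np.cos(2 * np.pi * 6.0 * n / fs)             # synthetic segment
q_p = nonuniform_dft(x_p, nu)
f_grid_hz = nu * fs / (2 * np.pi)                  # about 2.5 Hz up to about 24 Hz
f_peak = f_grid_hz[np.argmax(np.abs(q_p))]
```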

Thanks to the logarithmic frequency sampling in (6.14), two spectra with different harmonic structures can be aligned. For example, a spectrum with harmonic frequencies at 5 and 10 Hz can be aligned to another spectrum with harmonic frequencies at 6 and 12 Hz, since the number of samples between the two harmonics is the same for logarithmically sampled spectra. Using linear frequency sampling, these two spectra cannot be aligned since the number of samples between the harmonics differs.
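The alignment property can be verified numerically by counting grid samples between the harmonics on a logarithmic and on a linear frequency grid. The grid size L = 1000 and the helper `nearest` are illustrative assumptions; on the logarithmic grid, the two counts can differ by at most one sample due to rounding to the nearest grid point.

```python
import numpy as np

fs = 50.0
L = 1000
nu = 0.31 * np.pi ** (2.0 * np.arange(L) / L)      # logarithmic grid (6.14)
f_log = nu * fs / (2 * np.pi)                      # in Hz
f_lin = np.linspace(f_log[0], f_log[-1], L)        # linear grid over the same range

def nearest(grid, f):
    """Index of the grid frequency closest to f (Hz)."""
    return int(np.argmin(np.abs(grid - f)))

# Number of samples between the first and second harmonic of each spectrum.
gap_log_a = nearest(f_log, 10.0) - nearest(f_log, 5.0)
gap_log_b = nearest(f_log, 12.0) - nearest(f_log, 6.0)
gap_lin_a = nearest(f_lin, 10.0) - nearest(f_lin, 5.0)
gap_lin_b = nearest(f_lin, 12.0) - nearest(f_lin, 6.0)
```

On the logarithmic grid the spacing depends only on the frequency ratio (here 2), which is why a pure shift suffices to align the two harmonic patterns.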

The magnitude of the spectrum, i.e., \(|\mathbf q_p|\), is assumed to be described by a frequency-shifted (\(\theta _p\)) and amplitude-scaled \((a_p)\) version of the \(L \times 1\) spectral profile vector \({\pmb {\phi }_p}\), given by \(a_p \, \mathbf J_{\theta _p} \, {\pmb \phi }_p\). The shift matrix \(\mathbf J_{\theta _p}\), defined in (5.25), takes care of the frequency shifting needed when updating \({\pmb {\phi }_p}\) with new information. The weighted LS error criterion

$$\begin{aligned} J(\theta _p,a_p)&= (|{\mathbf q}_p| - a_p \, \mathbf J_{\theta _p} \, {\pmb \phi }_p)^T \, {\mathbf D} \, (|{\mathbf q}_p| - a_p \, \mathbf J_{\theta _p} \, {\pmb \phi }_p) \end{aligned}$$
(6.15)

is employed to estimate the unknown parameters \(\theta _p\) and \(a_p\). The primary purpose of the diagonal weight matrix \(\mathbf D\) is to correct for the oversampling at lower frequencies due to the logarithmic sampling. However, the weight matrix \(\mathbf D\) also makes it possible to emphasize frequencies which may be of special interest. Minimization of \(J(\theta _p,a_p)\) with respect to \(\theta _p\) and \(a_p\) yields the following estimators:

$$\begin{aligned} \hat{\theta }_p&= \arg \max _{\theta _p} \left( |\mathbf {q}_p^T| \mathbf D^{\frac{1}{2}} \mathbf J_{\theta _p} \mathbf D^{\frac{1}{2}} {\pmb \phi }_p \right) , \end{aligned}$$
(6.16)
$$\begin{aligned} \hat{a}_p&= |\mathbf {q}_p^T| \mathbf D^\frac{1}{2} \mathbf J_{\hat{\theta }_p} \mathbf D^{\frac{1}{2}} {\pmb \phi }_p. \end{aligned}$$
(6.17)

Design considerations on \(\mathbf D\), as well as details on the minimization of \(J(\theta _p,a_p)\), are described in [55].
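
A minimal sketch of the search implied by (6.16) and (6.17) is given below. Several assumptions are made that are not taken from the source: the shift matrix of (5.25) is emulated by a hypothetical helper `shift` which moves entries by an integer number of positions with zero fill (the sign convention is also assumed), the shift is searched over a symmetric integer grid, and the toy example uses identity weights and a unit-norm profile.

```python
import math

def shift(v, theta):
    """Shift the entries of v by theta positions with zero fill (stand-in for J_theta)."""
    L = len(v)
    out = [0.0] * L
    for l in range(L):
        if 0 <= l - theta < L:
            out[l] = v[l - theta]
    return out

def estimate_shift_and_amplitude(q_mag, phi, d, max_shift):
    """Grid search for (6.16), followed by (6.17) evaluated at the best shift."""
    g = [math.sqrt(dl) * pl for dl, pl in zip(d, phi)]       # D^(1/2) phi
    best_theta, best_val = 0, -math.inf
    for theta in range(-max_shift, max_shift + 1):
        gs = shift(g, theta)                                  # J_theta D^(1/2) phi
        val = sum(ql * math.sqrt(dl) * gl
                  for ql, dl, gl in zip(q_mag, d, gs))        # |q|^T D^(1/2) J D^(1/2) phi
        if val > best_val:
            best_theta, best_val = theta, val
    return best_theta, best_val

# Toy profile with two "harmonics" and unit norm (0.8^2 + 0.6^2 = 1)
phi = [0.0] * 32
phi[10], phi[20] = 0.8, 0.6
q_mag = [3.0 * v for v in shift(phi, 4)]   # profile shifted by 4 and scaled by 3
d = [1.0] * 32                             # identity weights (assumed)
theta_hat, a_hat = estimate_shift_and_amplitude(q_mag, phi, d, max_shift=8)
```

In this toy case the search recovers the shift of 4 positions and the scale factor of 3; with a non-identity weight matrix the amplitude would only be meaningful for a suitably normalized profile.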

Since the spectral profile \({\pmb \phi }_p\) is not known a priori, it can be estimated using exponential averaging of \(|\mathbf {q}_p|\) once shifted to the position of the first harmonic in the spectral profile,

$$\begin{aligned} \hat{\pmb \phi }_{p+1} = (1-\alpha _p)\hat{\pmb \phi }_p + \alpha _p \frac{\mathbf J_{-\hat{\theta }_p} |\breve{\mathbf {q}}_p|}{\Vert \mathbf J _{-\hat{\theta }_p} |\breve{\mathbf {q}}_p| \Vert }, \quad p\ge 0, \end{aligned}$$
(6.18)

where \(\alpha _p\) is set to a positive value \((0< \alpha _p < 1)\), unless \(\mathbf {x}_p\) contains an ectopic beat or large QRS-related residuals, or is judged to be unreliable for some other reason, in which case \(\alpha _p\) is set to zero. The spectral profile \(\hat{\pmb {\phi }}_{0}\) is initialized by setting one frequency equal to one at a position where the DAF is likely to occur, whereas all other frequencies are set to a value close to zero. The notation \(\breve{\mathbf {q}}_p\) signifies that \(\mathbf {q}_p\) has been pre- and appended with a sufficient number of samples to allow for frequency shifting; these additional samples are also set to a value close to zero. Normalization by \(\Vert \mathbf J _{-\hat{\theta }_p} |\breve{\mathbf {q}}_p| \Vert \) in (6.18) is necessary to ensure that the spectral profile allows for meaningful estimation of \(a_p\) in (6.17).

In the spectral profile, the first harmonic has a fixed position throughout the analysis of \(\mathbf {x}_p\), and, therefore, the spectral profile needs to be properly shifted before it can be interpreted as a spectrum. In particular, the first harmonic of the p-th segment, denoted \(\hat{\omega }_{0,p}\), is obtained as

$$\begin{aligned} \hat{\omega }_{0,p} = \hat{\omega }_{0,0} - \hat{\theta }_p. \end{aligned}$$
(6.19)

It should be noted that \(\hat{a}_p\) is a measure of f wave amplitude, thus providing yet another definition in addition to those described earlier in Sect. 6.2. The amplitude estimate \(\hat{a}_p\) may also be used as a normalization factor when evaluating the model error \(J(\hat{\theta }_p,\hat{a}_p)\) in successive signal segments [55].

Fig. 6.5

Time–frequency analysis using the spectral profile method applied to a 60-s extracted f wave signal which either (a) contains a large second harmonic or (b) lacks a second harmonic. The time–frequency distribution, the DAF series, and the spectral profile (solid line) are displayed from left to right. The spectral profile obtained at the end of the 60-s interval is the one which is displayed. For comparison, the conventional amplitude spectrum (dotted line) is shown in the rightmost diagrams. In both (a) and (b), the variation in the DAF is considerable

Figure 6.5 shows that the harmonics of the spectral profile are considerably less smeared than are those of the amplitude spectrum obtained by Welch’s method. This property can be ascribed to the frequency shifting which is part of the update of the spectral profile in (6.18).

The STFT, the XWVD, and the spectral profile method provide various degrees of insight into the time-varying nature of the DAF, as well as the harmonic composition of the f wave signal. Although time–frequency analysis provides more information than power spectral analysis, its impact on clinical studies has been rather limited. One reason may be the lack of a hypothesis connecting a certain property of the time–frequency distribution to an electrophysiological mechanism. Another reason may be that parameters are largely lacking for characterizing properties which are intrinsic to the time–frequency distribution, one of the few exceptions being the parameter tailored to investigate whether controlled respiration, mediated through the autonomic nervous system, influences the DAF in patients with permanent AF [65]. In that study, the frequency components in the interval 0.15–0.40 Hz of the power spectrum of the DAF series were quantified, since these components are known to reflect modulation of vagal tone, primarily through respiration, and are therefore related to parasympathetic activation [66].

6.3.4 Frequency Tracking

When the time-varying characteristics of the harmonic components represent the main focus of investigation, time–frequency analysis may be replaced by single frequency tracking or harmonic frequency tracking, depending on whether one or more harmonic frequencies are of interest to analyze. Of the numerous techniques developed for single frequency tracking, the adaptive line enhancer is probably the most well-known [67, 68], composed of a time-varying bandpass filter \(H(z;n)\) to enhance the harmonic component of the input signal x(n), and an adaptive algorithm to estimate the instantaneous frequency \(\omega _0(n)\) of the output signal y(n). The resulting estimate \(\hat{\omega }_0(n)\) is used to update the center frequency of the bandpass filter. Single frequency tracking can be extended to harmonic frequency tracking by assigning a time-varying bandpass filter and an adaptive algorithm to each of the harmonic components, resulting in a tracker with filter bank structure.

In the context of f wave characterization, single frequency tracking is part of a method developed for the purpose of selecting suitable patient candidates for restoration of sinus rhythm using catheter ablation [54]. The single frequency tracker, belonging to the class of adaptive line enhancers, assumes that the input signal is modeled by [54, 69]

$$\begin{aligned} x(n) = A_0 e^{j\omega _0 n} + v(n), \end{aligned}$$
(6.20)

where \(A_0\) and \(\omega _0\) denote amplitude and fundamental frequency, respectively; the noise v(n) is assumed to be white. Although the quantities in (6.20) are complex-valued, the model is still relevant to a real-valued signal since its complex-valued analytic representation can be used, defined by the observed, real-valued signal and its Hilbert transform, see Sect. 6.4.1.

A time-varying, first-order bandpass filter with complex-valued coefficients enhances the sinusoidal component in x(n), defined by

$$\begin{aligned} H(z;n) = \frac{1-\beta }{1-\beta e^{j\omega (n)} z^{-1} }, \end{aligned}$$
(6.21)

where \(\omega (n)\) is the time-varying center frequency, and \(\beta \) \((0\ll \beta <1)\) defines the bandwidth. The filter \(H(z;n)\) has unit gain and zero phase delay at \(\omega (n)\), ensuring that the harmonic component is undistorted.
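
The unit-gain, zero-phase property follows by evaluating (6.21) on the unit circle at the center frequency, i.e., at \(z=e^{j\omega (n)}\):

$$\begin{aligned} H(e^{j\omega (n)};n) = \frac{1-\beta }{1-\beta e^{j\omega (n)} e^{-j\omega (n)}} = \frac{1-\beta }{1-\beta } = 1. \end{aligned}$$

At a frequency offset \(\Delta \) from the center frequency, the same substitution gives the magnitude \((1-\beta )/|1-\beta e^{-j\Delta }|\), which is less than one for \(\Delta \ne 0\) and falls off more quickly the closer \(\beta \) is to one.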

The center frequency \(\omega (n)\) is estimated by an adaptive algorithm which, at each time instant n, tries to minimize the mean square error (MSE)

$$\begin{aligned} J(n) = E\left[ | y(n) - e^{j\omega (n+1)} y(n-1) |^2 \right] , \end{aligned}$$
(6.22)

where y(n) denotes the output of \(H(z;n)\). When \(y(n)=e^{j\omega _0n}\), J(n) is minimized for \(\omega (n)=\omega _0\), thus motivating the definition of J(n) in (6.22). The MSE estimator of \(\omega _0(n)\) is determined by differentiating J(n) with respect to \(\omega (n)\) and setting the result equal to zero, yielding

$$\begin{aligned} \hat{\omega }_0(n+1) = \arg (E\left[ y(n) y^{*}(n-1) \right] ). \end{aligned}$$
(6.23)
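
The step from (6.22) to (6.23) can be spelled out as follows, writing \(\omega \) for \(\omega (n+1)\). Expanding the squared magnitude gives

$$\begin{aligned} J(n) = E\left[ |y(n)|^2 \right] + E\left[ |y(n-1)|^2 \right] - 2\, \text {Re}\left\{ e^{-j\omega } E\left[ y(n) y^{*}(n-1) \right] \right\} . \end{aligned}$$

Only the last term depends on \(\omega \). Expressing the expected value in polar form, \(E\left[ y(n) y^{*}(n-1) \right] = r e^{j\psi }\) with \(r \ge 0\), the last term becomes \(-2r\cos (\psi - \omega )\), which is minimized for \(\omega = \psi \), i.e., for the argument of the expected lag-one product.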

Similar to the derivation of the well-known least mean square (LMS) algorithm [68], the expected value may be replaced by its instantaneous estimate at time n,

$$\begin{aligned} \hat{\omega }_0(n+1) \approx \arg ( y(n) y^{*}(n-1)). \end{aligned}$$
(6.24)

Since this estimator is sensitive to noise, exponential averaging is performed so that a smoothed estimate Q(n) of the expected value in (6.23) is produced, while, at the same time, making sure that slow changes in \(\omega (n)\) can be tracked. Hence, together with (6.21), the single frequency tracker is defined by the following two equations:

$$\begin{aligned} Q(n)&= Q(n-1) + \alpha (y(n) y^{*}(n-1) - Q(n-1) ), \end{aligned}$$
(6.25)
$$\begin{aligned} \hat{\omega }_0(n+1)&= \arg (Q(n)), \end{aligned}$$
(6.26)

where \(\hat{\omega }_0(n)\) is an estimate of the DAF and \(\alpha \) \((0<\alpha <1)\) is a weight factor determining the speed of tracking. The estimate \(\hat{\omega }_0(n)\) is inserted in \(H(z;n)\) so that the next filtered sample can be computed, and so on. Single frequency tracking is illustrated in Fig. 6.6 for an extracted, bandpass filtered f wave signal, where the changes in the DAF are relatively small, oscillating at around 7 Hz, except for a marked increase to 9 Hz after 21 s due to an artifact; after a few seconds, however, \(\hat{\omega }_0(n)\) returns to the earlier estimate.
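
The recursion defined by (6.21), (6.25), and (6.26) can be sketched as below. This is an illustrative sketch, not the implementation of [54]: the function name, the initialization of Q(n) on the unit circle at the initial frequency guess, and the noise-free complex exponential used as input are assumptions made for the example.

```python
import cmath

def track_single_frequency(x, omega_init, alpha=0.05, beta=0.95):
    """Single frequency tracker for a complex-valued (analytic) input signal."""
    omega = omega_init
    y_prev = 0j
    q = cmath.exp(1j * omega_init)   # assumed initialization of Q(n)
    estimates = []
    for sample in x:
        # First-order bandpass filter centered at the current estimate, cf. (6.21)
        y = beta * cmath.exp(1j * omega) * y_prev + (1 - beta) * sample
        # Exponentially averaged lag-one product, cf. (6.25)
        q += alpha * (y * y_prev.conjugate() - q)
        # Frequency estimate used for the next sample, cf. (6.26)
        omega = cmath.phase(q)
        estimates.append(omega)
        y_prev = y
    return estimates

# Noise-free test input at 0.9 rad/sample, tracker started at 0.8 rad/sample
omega_true = 0.9
x = [cmath.exp(1j * omega_true * n) for n in range(4000)]
est = track_single_frequency(x, omega_init=0.8)
```

The estimates are in radians per sample; multiplying by the sampling rate and dividing by \(2\pi \) converts them to Hz. For a real-valued f wave signal, the analytic representation would be supplied as input.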

Fig. 6.6

(a) Extracted, bandpass filtered (4–12 Hz) f wave signal x(n), and (b) related dominant atrial frequency (DAF), estimated using single frequency tracking (\(\alpha =0.05\), \(\beta =0.95\))

The interest in harmonic analysis, which spurred the development of the spectral profile method, was also part of the motivation to extend the single frequency tracker to the handling of several harmonic frequencies. The starting point is the signal model with K harmonics [69],

$$\begin{aligned} x(n) = \sum _{k=1}^{K} A_k e^{jk\omega _0 n} + v(n). \end{aligned}$$
(6.27)

This model implies that the tracker should have a filter bank structure, consisting of K bandpass filters \(H_k(z;n)\), where each filter has its center frequency at an integer multiple of \(\omega (n)\),

$$\begin{aligned} H_k(z;n) = \frac{1-\beta }{1-\beta e^{jk\omega (n)} z^{-1} }, \quad k=1,\ldots ,K, \end{aligned}$$
(6.28)

see Fig. 6.7. The adaptive algorithm used in single frequency tracking is also employed in harmonic frequency tracking, except that an estimate of the fundamental frequency is computed for each of the K harmonic components \(y_k(n)\),

$$\begin{aligned} Q_k(n)&= Q_k(n-1) + \alpha (y_k(n) y^{*}_k(n-1) - Q_k(n-1) ), \end{aligned}$$
(6.29)
$$\begin{aligned} \hat{\omega }_{0,k}(n+1)&= \frac{\arg (Q_k(n))}{k}. \end{aligned}$$
(6.30)
Fig. 6.7

Block diagram of the harmonic frequency tracker, composed of a filter bank with K bandpass filters \(H_k(z;n)\) with harmonically coupled center frequencies and an adaptive algorithm for updating the center frequencies of the filters

A global estimate of \(\omega _0(n+1)\) is obtained as a linear combination of the different estimates \(\hat{\omega }_{0,k}(n+1)\),

$$\begin{aligned} \hat{\omega }_0(n+1) = \sum _{k=1}^{K} w_k(n) \hat{\omega }_{0,k}(n+1). \end{aligned}$$
(6.31)

The choice of the weights \(w_k(n)\) is based on the same principle as that of weighted averaging, namely that \(w_k(n)\) are inversely proportional to the noise variance, cf. (5.12). Since the noise variance is not defined for the harmonic model in (6.27), the minimum MSE \(J_{k,\text {min}}(n)\) has been proposed as a surrogate measure of the noise variance [69]. Thus, before \(\hat{\omega }_0(n+1)\) can be computed, \(w_k(n)\) is computed using the following equations:

$$\begin{aligned} \hat{J}_{k,\text {min}}(n)&= \hat{J}_{k,\text {min}}(n-1) \nonumber \\&\quad + \alpha (| y_k(n) - e^{jk\hat{\omega }_0(n)} y_k(n-1) |^2 - \hat{J}_{k,\text {min}}(n-1) ), \end{aligned}$$
(6.32)
$$\begin{aligned} \hat{E}_{k}(n)&= \hat{E}_{k}(n-1) + \alpha (| y_k(n)|^2 - \hat{E}_{k}(n-1) ), \end{aligned}$$
(6.33)
$$\begin{aligned} \hat{\sigma }_{\omega ,k}^2(n)&= \frac{\hat{J}_{k,\text {min}}(n)}{\hat{E}_{k}(n)}, \end{aligned}$$
(6.34)
$$\begin{aligned} w_k(n)&= \frac{\displaystyle \frac{1}{\hat{\sigma }_{\omega ,k}^2(n)}}{\displaystyle \sum _{i=1}^{K}\frac{1}{\hat{\sigma }_{\omega ,i}^2(n)}}, \end{aligned}$$
(6.35)

where \(\hat{E}_{k}(n)\) is a smoothed estimate of the energy of \(y_k(n)\) which is used to normalize \(\hat{J}_{k,\text {min}}(n)\) so that \(w_k(n)\) reflects the local SNR. It should be noted that the second and higher harmonics are defined as integer multiples of the fundamental frequency, although these frequencies are actually estimated by \(\arg (Q_k(n))\) in (6.30). In contrast to time–frequency analysis, the harmonic frequency tracker produces harmonic signal components as a by-product, useful for various purposes such as the analysis of phase differences, which is the topic of Sect. 6.4.
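
The weighting in (6.34), (6.35), and (6.31) can be illustrated for a single time instant as follows. The numerical values of the smoothed errors, energies, and per-harmonic frequency estimates are made up for the example, and the function names are hypothetical.

```python
def harmonic_weights(j_min, energy):
    """Weights of (6.35) from smoothed errors (6.32) and energies (6.33)."""
    sigma2 = [j / e for j, e in zip(j_min, energy)]   # normalized error, cf. (6.34)
    inverse = [1.0 / s for s in sigma2]
    total = sum(inverse)
    return [v / total for v in inverse]

def combine(omega_k, weights):
    """Global estimate of (6.31): weighted sum of the per-harmonic estimates."""
    return sum(w * o for w, o in zip(weights, omega_k))

# Two harmonics: the first has a four times smaller normalized error
w = harmonic_weights(j_min=[0.01, 0.04], energy=[1.0, 1.0])
omega = combine([0.90, 0.95], w)
```

Here the first harmonic receives the weight 0.8 and the second 0.2, so the global estimate is pulled toward the estimate of the more reliable component.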

A precursor to single and harmonic frequency tracking is the DAF-controlled bandpass filter, designed to produce the first harmonic component, sometimes referred to as the main atrial wave [70, 71]. The DAF-controlled approach to bandpass filtering was introduced to reduce the effect of noise, being of critical importance to the computation of the sample entropy [70], but also as part of a method for characterizing f wave morphology [71]. While single and harmonic frequency tracking update the center frequency of the bandpass filter(s) on a sample-by-sample basis, the center frequency of the DAF-controlled filter is updated on a segment-by-segment basis, estimated in each segment from the power spectrum of the f wave signal.

6.4 f Wave Morphology and Regularity

Certain information on f wave morphology is provided by power spectral analysis and time–frequency analysis, for example, conveyed by the harmonic decay \(\gamma \) which reflects whether f waves have a sinusoid- or a sawtooth-looking morphology, cf. (6.3). However, the phase information is discarded in both these types of analysis, and, consequently, much of the morphologic information is discarded. By decomposing the extracted f wave signal into its harmonic components and comparing the respective phases, information on morphology can be retrieved (Sect. 6.4.1). Since phase analysis requires a relatively high SNR and relatively well-organized f waves, there is a need for robust approaches to morphologic characterization. One such approach considers the few largest eigenvalues of the correlation matrix of the f wave signal as a measure of regularity (Sect. 6.4.2). Another approach considers pairwise similarity of individual f waves, using a robust similarity measure (Sect. 6.4.3). These approaches have in common that they produce a parameter which characterizes the morphology of several, consecutive f waves, rather than the morphology of individual f waves. Hence, “f wave regularity” may be a more appropriate notion than “f wave morphology.” Nonlinear techniques have also been considered for characterizing f wave regularity, including different measures of entropy (Sect. 6.4.4).

6.4.1 Phase Analysis

The classical approach to phase analysis of a lowpass signal x(n) is based on its analytic representation, defined by

$$\begin{aligned} x_A(n) = x(n) + j\tilde{x}(n), \quad n = 0,\ldots ,N-1, \end{aligned}$$
(6.36)

where \(\tilde{x}(n)\) denotes the Hilbert transform of x(n). This transform shifts the phase of the positive frequency components by \(-90^{\circ }\) and the negative ones by \(90^{\circ }\) [72, 73]. Since the analytic signal \(x_A(n)\) is complex-valued, it can alternatively be represented by its magnitude and phase,

$$\begin{aligned} x_A(n) = a(n) e^{j\psi (n)}, \end{aligned}$$
(6.37)

where

$$\begin{aligned} a(n) = \sqrt{x^2(n) + \tilde{x}^2(n)}, \end{aligned}$$
(6.38)
$$\begin{aligned} \psi (n) = \arctan \left( \frac{\tilde{x}(n)}{x(n)}\right) . \end{aligned}$$
(6.39)

Here, the function \(\psi (n)\) defines the notion “phase” in a broad sense, without referring to sinusoidal phase. To interpret \(\psi (n)\) as sinusoidal phase, the polar representation in (6.37) of a narrowband signal y(n), obtained by bandpass filtering of x(n) with center frequency \(\omega _0\), is considered:

$$\begin{aligned} y_A(n)&= y(n) + j \tilde{y}(n) = a(n) e^{j\psi (n)} \nonumber \\&= a(n) e^{j \phi (n)} e^{j \omega _0 n}. \end{aligned}$$
(6.40)

It is easily shown that the real-valued part of \(y_A(n)\), i.e., the part of practical interest, can be expressed as

$$\begin{aligned} y(n) = a(n) \cos (\omega _0 n + \phi (n)). \end{aligned}$$
(6.41)

Before computing the sinusoidal phase \(\phi (n)\), x(n) needs to be bandpass filtered to ensure that it is not a multi-component or broadband signal [74]. Even with the inclusion of bandpass filtering, \(\phi (n)\) is still an instantaneous measurement which is vulnerable to noise.
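
As an illustration of (6.36)–(6.39), the analytic signal can be computed with the discrete Fourier transform by zeroing the negative frequency components and doubling the positive ones. The sketch below uses a direct, quadratic-time transform for brevity and a noise-free cosine as test signal; both are choices made for the example, and the function name is hypothetical. The four-quadrant `cmath.phase` is used in place of the arctangent in (6.39) to avoid the quadrant ambiguity.

```python
import cmath
import math

def analytic_signal(x):
    """Analytic signal x(n) + j*Hilbert{x}(n) via the DFT (O(N^2), illustration only)."""
    N = len(x)
    X = [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
         for k in range(N)]
    for k in range(N):
        # Keep DC (and the Nyquist bin for even N) unchanged
        if k == 0 or (N % 2 == 0 and k == N // 2):
            continue
        # Double positive frequencies, zero negative frequencies
        X[k] = 2 * X[k] if k < (N + 1) // 2 else 0j
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

# Cosine with 4 full cycles in 32 samples
N = 32
x = [math.cos(2 * math.pi * 4 * n / N) for n in range(N)]
xa = analytic_signal(x)
amplitude = [abs(z) for z in xa]           # a(n) in (6.38)
phase = [cmath.phase(z) for z in xa]       # psi(n) in (6.39), four-quadrant
```

For this test signal the amplitude is constant at one and the phase advances by \(\pi /4\) per sample, wrapped to \((-\pi , \pi ]\).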

Another approach to phase analysis is based on statistical modeling of the harmonic signal components [71]. As a first step, the observed signal x(n) is decomposed into K different harmonic components \(y_k(n), k=1,\ldots ,K\), using a filter bank of linear, time-invariant bandpass filters. The center frequency of the filter \(H_1(z)\), producing \(y_1(n)\), is determined by the position of the largest spectral peak of \(S_x(\omega )\), i.e., \(\hat{\omega }_0\). Since the second and higher harmonic frequencies often differ slightly from their expected positions at \(k\hat{\omega }_0\) due to changes in f wave morphology, the center frequencies of \(H_2(z), H_3(z), \ldots \) are determined by searching for the respective peaks in intervals centered around \(k\hat{\omega }_0\), cf. the computation of \(P_{\text {SO}}\) in (6.2). Figure 6.8 illustrates a harmonic power spectrum and the passbands of the bandpass filter bank determined from the spectrum.

Fig. 6.8

Harmonic power spectrum and related passbands of the bandpass filters (dotted lines) for producing three harmonic components. The passbands are centered around the spectral peaks (marked with circles) and are increasingly wider at higher harmonic frequencies

In a second step, the bandpass filtered signals \(y_k(n)\) are subject to analysis in nonoverlapping segments with L samples,

$$\begin{aligned} y_{k,p}(n)=y_k(n+pL), \quad n=0,\ldots ,L-1, \end{aligned}$$
(6.42)

where p is the segment number. In [71], the lengths N and L were set to 10 and 0.5 s, respectively, where the latter setting implies that the analysis of f wave morphology is performed almost on a wave-to-wave basis (though the boundaries between f waves are not taken into consideration). In each segment, \(y_{k,p}(n)\) is modeled by a sinusoid in Gaussian, white noise \(v_{k,p}(n)\),

$$\begin{aligned} y_{k,p}(n) = a_{k,p} \sin (\omega _{k,p} n+ \phi _{k,p}) + v_{k,p}(n), \end{aligned}$$
(6.43)

where \(a_{k,p}\), \(\omega _{k,p}\), and \(\phi _{k,p}\) are unknown parameters. The maximum likelihood (ML) estimators of these three parameters are given by [47]

$$\begin{aligned} \hat{\omega }_{k,p}&= \arg \max _{\omega } \left| \frac{1}{L}\sum _{n=0}^{L-1}y_{k,p}(n)e^{-j\omega n}\right| ^2, \end{aligned}$$
(6.44)
$$\begin{aligned} \hat{a}_{k,p}&= \frac{2}{L} \left| \sum _{n=0}^{L-1}y_{k,p}(n)e^{-j\hat{\omega }_{k,p}n}\right| , \end{aligned}$$
(6.45)
$$\begin{aligned} \hat{\phi }_{k,p}&= \arctan \left( \frac{\displaystyle \sum _{n=0}^{L-1}y_{k,p}(n)\cos (\hat{\omega }_{k,p}n)}{\displaystyle \sum _{n=0}^{L-1}y_{k,p}(n)\sin (\hat{\omega }_{k,p}n)} \right) . \end{aligned}$$
(6.46)

Thus, \(\hat{\omega }_{k,p}\) is determined by the position of the largest peak of the periodogram of \(y_{k,p}(n)\), required before estimation of \(a_{k,p}\) and \(\phi _{k,p}\). The accuracy of sinusoidal modeling can be quantified by the MSE \(\varepsilon _p\) between \(x_p(n)\) and its reconstructed, noise-free counterpart \(\hat{x}_{p}(n)\),

$$\begin{aligned} \varepsilon _p = \frac{1}{L} \sum _{n=0}^{L-1} (x_p(n) - \hat{x}_{p}(n))^2, \end{aligned}$$
(6.47)

where

$$\begin{aligned} \hat{x}_{p}(n) = \sum _{k=1}^{K} \hat{a}_{k,p} \sin (k\hat{\omega }_{0,p} n+ \hat{\phi }_{k,p}). \end{aligned}$$
(6.48)

Alternatively, \(\hat{x}_{p}(n)\) may be obtained by replacing \(k\hat{\omega }_{0,p}\) in (6.48) with \(\hat{\omega }_{k,p}\) as suggested by the model in (6.43).
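
The estimators in (6.44)–(6.46) and the error in (6.47) can be sketched for a single harmonic component as follows. The frequency search is carried out on a discrete grid, the arctangent in (6.46) is evaluated with the four-quadrant `math.atan2`, and the segment is a noise-free sinusoid whose frequency lies exactly on the grid; these are simplifying assumptions for the example, and the function name is hypothetical.

```python
import cmath
import math

def ml_sinusoid(y, grid):
    """Estimates of (6.44)-(6.46) with the frequency searched over a grid."""
    L = len(y)

    def dtft(omega):
        return sum(v * cmath.exp(-1j * omega * n) for n, v in enumerate(y))

    omega = max(grid, key=lambda w: abs(dtft(w)) ** 2)        # (6.44)
    amp = 2.0 / L * abs(dtft(omega))                          # (6.45)
    c = sum(v * math.cos(omega * n) for n, v in enumerate(y))
    s = sum(v * math.sin(omega * n) for n, v in enumerate(y))
    phase = math.atan2(c, s)                                  # (6.46)
    return omega, amp, phase

# Toy segment: amplitude 2, phase 0.5 rad, 5 full cycles in L = 100 samples
L = 100
omega0 = 2 * math.pi * 5 / L
y = [2.0 * math.sin(omega0 * n + 0.5) for n in range(L)]
grid = [2 * math.pi * m / L for m in range(1, L // 2)]
omega_hat, a_hat, phi_hat = ml_sinusoid(y, grid)

# Reconstruction and mean square error, cf. (6.47) with one component
y_hat = [a_hat * math.sin(omega_hat * n + phi_hat) for n in range(L)]
mse = sum((u - v) ** 2 for u, v in zip(y, y_hat)) / L
```

In the noise-free case the amplitude and phase are recovered and the reconstruction error is essentially zero; with real f wave segments the error would instead reflect how well the sinusoidal model fits.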

In a third step, the phase parameters characterizing f wave morphology are computed, defined by the differences between \(\hat{\phi }_{2,p}\) and \(\hat{\phi }_{1,p}\), \(\hat{\phi }_{3,p}\) and \(\hat{\phi }_{1,p}\), and so on. A straightforward comparison of two phase estimates is, however, not meaningful since the estimates relate to different frequencies, i.e., \(\hat{\omega }_{k,p}\) and \(\hat{\omega }_{1,p}\), and are therefore not comparable. To solve this problem, \(\hat{\phi }_{k,p}\) is converted to the same scale as \(\hat{\phi }_{1,p}\) by division with k. Moreover, since the k-th harmonic completes about k periods when the first harmonic completes one period, the k-th harmonic is periodic by \(2\pi \) in its own scale, and approximately periodic by \(2\pi /k\) in the scale of \(\hat{\phi }_{1,p}\). Therefore, the phase difference \(\hat{\theta }_{k,p}\) is computed using the following expression:

$$\begin{aligned} \hat{\theta }_{k,p} = \frac{\hat{\phi }_{k,p}}{k} - \hat{\phi }_{1,p} \pm \frac{l \cdot 2\pi }{k}, \quad k=2,\ldots ,K, \end{aligned}$$
(6.49)

where \(\hat{\theta }_{k,p}\) is adjusted with an integer multiple l of \(2\pi /k\) to become unique within the interval \([-\pi /k, \pi /k)\).
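
The adjustment in (6.49) amounts to wrapping the scaled phase difference into an interval of length \(2\pi /k\) centered at zero. A minimal sketch is given below, where a half-open interval is used so that the result is unique; the function name and the numerical phases are chosen for illustration only.

```python
import math

def phase_difference(phi_k, phi_1, k):
    """Phase difference of (6.49), wrapped to the interval [-pi/k, pi/k)."""
    period = 2 * math.pi / k
    raw = phi_k / k - phi_1
    # Subtract the integer multiple of the period that centers the result at zero
    return raw - period * math.floor((raw + period / 2) / period)

theta2 = phase_difference(phi_k=0.4, phi_1=2.0, k=2)
```

For these values the unwrapped difference is \(-1.8\) rad, and adding one period of \(\pi \) brings it to about 1.34 rad, which lies within \([-\pi /2, \pi /2)\).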

Characterization of f wave morphology using phase information is illustrated in Fig. 6.9, where f waves are positioned according to \(\hat{\theta }_{2,p}\). The phase difference \(\hat{\theta }_{3,p}\) usually plays a much more subordinate role, since \(\hat{a}_{3,p}\) is usually much smaller than \(\hat{a}_{2,p}\), and thus \(\hat{\theta }_{3,p}\) has much less influence on f wave morphology. It is noted that a change of \(\hat{\theta }_{2,p}\) by \(\frac{\pi }{2}\) results in reversed wave polarity. Moreover, Fig. 6.9 shows that f waves positioned at about \(-\frac{\pi }{8}\) have a steeper upslope than downslope, whereas f waves positioned at the opposite position, i.e., about \(\frac{3\pi }{8}\), have a downslope steeper than the upslope.

Fig. 6.9

Morphologic f wave characterization based on the phase difference \(\hat{\theta }_{2,p}\), defined in (6.49) and confined to the interval \([-\frac{\pi }{2},\frac{\pi }{2})\). The diagram is a variant of the well-known phasor diagram whose range is here adjusted to suit \(\hat{\theta }_{2,p}\). The f waves are generated using the sawtooth model in (3.1)

Clustering of f wave segments is an application where the phase differences \(\theta _{k,p}\) have been explored, with the aim of determining a representative, reconstructed f wave signal better suited for morphologic characterization than the observed f wave signal itself, see Fig. 6.10 [71].

Fig. 6.10

(a) Extracted 10-s f wave signals obtained from six different patients with persistent AF, and (b) reconstructed f waves judged to be representative of the corresponding signals in (a). Nonoverlapping 0.5-s segments of x(n) are clustered based on \(\theta _{2,p}\) and \(\theta _{3,p}\), after which the f waves belonging to the largest cluster are reconstructed; for details, see [71]

Considering that the spectral characteristics of the f waves can change over time, there is a risk that the harmonic frequencies wander outside the passbands of the time-invariant bandpass filters \(H_1(z), H_2(z),\ldots ,H_{K}(z)\)—a risk that increases with increasing length of the signal segment used for designing the filter bank. When such a situation arises, the harmonic components \(y_{k,p}(n)\) become less reliable, with repercussions on the reliability of \(\hat{\theta }_{k,p}\). This problem can be addressed by adaptively tracking the harmonic frequencies, using, for example, the algorithm described in Sect. 6.3.4 [54]. With such tracking, the filter passbands are updated on a sample-to-sample basis, implying that the phase differences can be estimated on a sample-to-sample basis from the harmonic components \(y_k(n)\). Thus, the segment-based estimate \(\hat{\theta }_{k,p}\) is replaced by \(\hat{\theta }_{k}(n)\).

Once \(y_k(n)\) is available, the instantaneous phase \(\phi _k(n)\) is computed by

$$\begin{aligned} \hat{\phi }_k(n) = \arctan \left( \frac{\tilde{y}_k(n)}{y_k(n)}\right) , \quad k=1,\ldots ,K, \end{aligned}$$
(6.50)

followed by computation of the instantaneous phase difference \(\hat{\theta }_{k}(n)\). Since \(\hat{\phi }_k(n)\) is vulnerable to noise, lowpass filtering of the phase difference \(\hat{\theta }_{k}(n)\) has been suggested [54]. The filtering was accompanied by the hypothesis that a change in f wave morphology is reflected by a change in the slope of a straight line which, in a sliding window, is fitted to \(\hat{\theta }_{2}(n)\); higher-order phase differences were not analyzed. Morphologic regularity was quantified by the variance of the resulting slopes: a variance close to zero indicated a strong coupling between the first and the second harmonics, and vice versa.
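
The regularity measure described above, i.e., the variance of slopes of straight lines fitted in a sliding window, can be sketched as follows. The window length, the step between windows, the use of an ordinary least-squares fit, and the population variance are assumptions for this example; the settings used in [54] are not given here.

```python
def sliding_slopes(theta, window, step):
    """Least-squares slope of a straight line fitted to theta in each window."""
    t_mean = (window - 1) / 2.0
    t_var = sum((t - t_mean) ** 2 for t in range(window))
    slopes = []
    for start in range(0, len(theta) - window + 1, step):
        seg = theta[start:start + window]
        seg_mean = sum(seg) / window
        cov = sum((t - t_mean) * (v - seg_mean) for t, v in enumerate(seg))
        slopes.append(cov / t_var)
    return slopes

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Phase difference with a constant slope versus one whose slope changes sign
steady = [0.01 * n for n in range(200)]
changing = [0.01 * n for n in range(100)] + [1.0 - 0.01 * (n - 100) for n in range(100, 200)]
var_steady = variance(sliding_slopes(steady, window=50, step=10))
var_changing = variance(sliding_slopes(changing, window=50, step=10))
```

The constant-slope series yields a slope variance of essentially zero, whereas the series with a slope change yields a clearly larger variance.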

6.4.2 PCA-Based Characterization of Regularity

Since phase analysis is only suitable for f wave signals with a relatively high SNR, PCA-based approaches have been investigated which, to some extent, trade morphologic detail for robustness. In particular, the mapping of estimated parameters to wave morphology offered by phase analysis, cf. the signal model in (6.43), is traded for a more robust, data-driven characterization of f wave regularity where the link to a signal model is lost.

The starting point of PCA is the data matrix \(\mathbf {X}\), formed by dividing the extracted f wave signal into M nonoverlapping segments containing N samples each,

$$\begin{aligned} \mathbf {X}= \begin{bmatrix} \mathbf {x}_{1}&\mathbf {x}_{2}&\cdots&\mathbf {x}_{M} \end{bmatrix}, \end{aligned}$$
(6.51)

where each column \(\mathbf {x}_p\) has been centered. The signal segments, forming the columns in \(\mathbf {X}\), have not been aligned relative to any fiducial point. Thus, the definition of \(\mathbf {X}\) in (6.51) differs from the one in (5.100), where the columns have been aligned relative to the occurrence times of the QRS complexes. Time alignment was also involved in the study which first pursued PCA-based characterization of AF signals [75]; in that study, the occurrence times of the atrial activations in the intracardiac electrogram were used for alignment.

The principal components are associated with the variances given by the eigenvalues \(\lambda _1,\ldots ,\lambda _N\) of the \(N \times N\) sample correlation matrix \(\hat{\mathbf {R}}_x \!=\! \frac{1}{M} \mathbf {X} \mathbf {X}^T\), cf. Sect. 5.6.1. When f wave morphology is regular across the analyzed signal segments, only a few eigenvectors are required to represent the f waves. A measure of how well the K most significant eigenvectors represent, on average, the M signals in \(\mathbf {X}\) is provided by the normalized, cumulative sum of the K largest eigenvalues [30, 75,76,77]:

$$\begin{aligned} R_K = \frac{\displaystyle \sum _{i=1}^K \lambda _i}{\displaystyle \sum _{i=1}^{N} \lambda _i}, \quad 0<R_K\le 1, \end{aligned}$$
(6.52)

where \(\lambda _i\) are sorted in decreasing order \(\lambda _1> \lambda _2> \cdots > \lambda _N\) and \(K\ll N\). Interestingly, \(R_5\) has been used to quantify the overall quality of ECG signals in various types of arrhythmia [79], though not on extracted f wave signals.

Fig. 6.11

(a) Regular and (b) irregular f wave signals characterized by \(R_3= 0.40\) and 0.25, respectively [77]

Figure 6.11 illustrates \(R_3\) for two different signals: one with regular f wave morphology, and another with more irregular morphology and higher noise level. The difference in signal characteristics is well reflected by \(R_3\). Since the f wave signals \(\mathbf {x}_p\) are not aligned, the ensemble \(\mathbf {X}\) is heterogeneous, leading to much lower values of \(R_3\) than what is often reported in studies on ECG analysis.

A minor variation on \(R_K\) as a measure of regularity is to determine the number of eigenvalues K needed to make \(R_K\) exceed a certain preset level, and then use that particular value of K as a measure of regularity [80]. Obviously, a smaller K indicates a more regular signal since fewer eigenvectors are, on average, required to reconstruct the analyzed signal.
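
The computation of \(R_K\) in (6.52) can be sketched with the standard library alone by estimating the largest eigenvalues with power iteration and deflation, and using the trace of the correlation matrix as the sum of all eigenvalues. This is an illustrative substitute for a proper eigendecomposition routine: the number of iterations, the seeded random starting vector, and the tiny toy ensemble (two zero-mean, mutually orthogonal segments) are assumptions for the example, and convergence is slow when eigenvalues are close to each other.

```python
import math
import random

def correlation_matrix(X):
    """Sample correlation matrix (1/M) X X^T for a list of M segments of length N."""
    M, N = len(X), len(X[0])
    return [[sum(x[i] * x[j] for x in X) / M for j in range(N)] for i in range(N)]

def top_eigenvalues(R, K, iters=500, seed=0):
    """K largest eigenvalues of a symmetric, positive semidefinite matrix."""
    N = len(R)
    R = [row[:] for row in R]                 # work on a copy
    rng = random.Random(seed)
    eigenvalues = []
    for _ in range(K):
        v = [rng.gauss(0.0, 1.0) for _ in range(N)]
        lam = 0.0
        for _ in range(iters):
            w = [sum(R[i][j] * v[j] for j in range(N)) for i in range(N)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm < 1e-300:
                lam = 0.0
                break
            v = [c / norm for c in w]
            lam = norm
        eigenvalues.append(lam)
        # Deflation: remove the component that was just found
        for i in range(N):
            for j in range(N):
                R[i][j] -= lam * v[i] * v[j]
    return eigenvalues

def regularity(X, K):
    """R_K of (6.52): sum of the K largest eigenvalues over the trace."""
    R = correlation_matrix(X)
    trace = sum(R[i][i] for i in range(len(R)))
    return sum(top_eigenvalues(R, K)) / trace

# Toy ensemble: eigenvalues of the correlation matrix are 4, 3, and 0
X = [[2.0, -2.0, 0.0], [1.0, 1.0, -2.0]]
r1 = regularity(X, 1)
r2 = regularity(X, 2)
```

For the toy ensemble, \(R_1 = 4/7\) and \(R_2 = 1\); the smallest K for which \(R_K\) exceeds a preset level can be found by evaluating `regularity` for increasing K.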

As a complement to \(R_K\) which characterizes the overall regularity of all signal segments in \(\mathbf {X}\), the reconstruction error associated with \(\mathbf {x}_p\), using the K most significant eigenvectors, may serve as a measure of regularity in individual segments. The reconstruction error of the p-th segment is defined by

$$\begin{aligned} \varepsilon _{p,K} = \frac{1}{N} (\mathbf {x}_p - \hat{\mathbf {s}}_{p,K})^T (\mathbf {x}_p - \hat{\mathbf {s}}_{p,K}), \end{aligned}$$
(6.53)

where the reconstructed signal \(\hat{\mathbf {s}}_{p,K}\) results from projecting \(\mathbf {x}_p\) on the K most significant eigenvectors of \(\hat{\mathbf {R}}_x\),

$$\begin{aligned} \hat{\mathbf {s}}_{p,K} = \pmb {\varPhi }_K \pmb {\varPhi }_K^T \mathbf {x}_p, \end{aligned}$$
(6.54)

with

$$\begin{aligned} \pmb {\varPhi }_K = \begin{bmatrix} \pmb {\varphi }_1&\pmb {\varphi }_2&\cdots&\pmb {\varphi }_{K} \\ \end{bmatrix}. \end{aligned}$$
(6.55)

It is noted that the expected value of \(\varepsilon _{p,K}\) is related to \(R_K\) through the following expression [30]:

$$\begin{aligned} E\left[ \varepsilon _{p,K}\right]&= \frac{1}{N} \sum _{i=K+1}^{N} \lambda _i = \frac{1}{N} (1 - R_K ) \sum _{i=1}^{N} \lambda _i. \end{aligned}$$
(6.56)

Early on in the history of automated ECG analysis, \(\varepsilon _{p,K}\) was used to exclude noisy QRS complexes and artifacts from classifying QRS complexes in single-lead ECGs. However, a set of Gaussian functions were then used instead of the eigenvectors in (6.55) [81]. More recently, related to f wave characterization, the definition in (6.53) has been generalized so that it applies to multi-lead ECGs, with the aim of characterizing stationarity of atrial wavefront patterns during AF [80, 82], see Sect. 6.6.

6.4.3 Similarity-Based Characterization of Regularity

Morphologic similarity is a crucial feature when clustering QRS complexes [83,84,85], which may be quantified by the correlation between two QRS complexes once they have been properly aligned in time, cf. (5.159). The correlation-based approach requires that QRS detection and QRS delineation have been performed. This approach can be applied to f waves as well, but then requiring that detection and delineation of individual f waves have been performed [86]. Compared to QRS detection and QRS delineation, the conditions under which the corresponding f wave algorithms should operate are much more challenging since f waves wax and wane and sometimes completely disappear. Moreover, since there is no clinical consensus on what defines f wave onset and end, delineation performance cannot be evaluated on annotated databases. For the algorithms proposed in [86], the occurrence time and onset of each f wave are determined using mathematical morphology operators [87,88,89]. It should be noted that only f wave onset needs to be determined since f wave end is identical to the onset of the subsequent f wave.

The main idea behind the correlation-based approach is to first assess morphologic similarity for all pairwise combinations of the M different f waves \(x_i(n),i=1,\ldots ,M,\) contained in the analyzed segment. The resulting correlation coefficients are then merged into one single parameter describing morphologic regularity. Since Pearson’s correlation coefficient suffers from the disadvantages of being invariant to changes in amplitude and vulnerable to impulsive noise, the signed correlation coefficient (SCC) has been proposed, avoiding these disadvantages by coarse quantization of the observed signal \(x_i(n)\) into three parts (“trichotomization”) [90]:

$$\begin{aligned} x_{t,i}(n) = \left\{ \begin{array}{ll} 1, &{} \quad x_i(n) \in S_p,\\ 0, &{} \quad x_i(n) \in S_z, \\ -1, &{} \quad x_i(n) \in S_n. \end{array} \right. \end{aligned}$$
(6.57)

The signal space is divided into the positive subspace \(S_p\), the zero subspace \(S_z\), and the negative subspace \(S_n\), which are mutually disjoint. Each subspace is defined by a set of signal-dependent thresholds which can be fixed or variable over time. Before trichotomization, \(x_i(n)\) is normalized by its maximum amplitude or some other suitable signal feature.

The products computed in Pearson’s correlation coefficient are replaced by signed products of the two trichotomized signals \(x_{t,i}(n)\) and \(x_{t,j}(n)\), denoted \(\otimes \) and defined by

$$\begin{aligned} x_{t,i}(n) \otimes x_{t,j}(n) = \left\{ \begin{array}{ll} 1, &{} \quad x_{t,i}(n) = x_{t,j}(n),\\ -1, &{} \quad x_{t,i}(n) = -x_{t,j}(n) \text{ and } x_{t,i}(n) \ne 0, \\ 0, &{} \quad \text {otherwise}, \end{array} \right. \end{aligned}$$
(6.58)

where \(i,j=1,\ldots ,M\). Hence, the SCC is given by

$$\begin{aligned} P_{\text {SCC},i,j}&= \frac{\displaystyle \sum _{n=0}^{N-1} x_{t,i}(n) \otimes x_{t,j}(n)}{\displaystyle \sqrt{\sum _{n=0}^{N-1} x_{t,i}(n) \otimes x_{t,i}(n)}\sqrt{\sum _{n=0}^{N-1} x_{t,j}(n) \otimes x_{t,j}(n)}} \nonumber \\&= \frac{1}{N} \sum _{n=0}^{N-1} x_{t,i}(n) \otimes x_{t,j}(n). \end{aligned}$$
(6.59)

Similar to Pearson’s correlation coefficient, the signed correlation coefficient is limited to \(-1 \le P_{\text {SCC},i,j} \le 1\), where 1 and \(-1\) correspond to identical morphology but with equal or opposite polarity, respectively. Due to trichotomization, the product of the square root terms in the denominator of (6.59) equals N. Since the length of \(x_{t,i}(n)\) typically varies from f wave to f wave, the shortest signal of \(x_{t,i}(n)\) and \(x_{t,j}(n)\) determines N; the length is determined after alignment.
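The steps in (6.57) to (6.59) can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the symmetric fixed threshold of 0.1, and the normalization by the maximum absolute amplitude after truncation to the common length are assumptions made here, not choices taken from [90].

```python
import numpy as np

def trichotomize(x, threshold=0.1):
    """Map a signal to {-1, 0, 1} as in (6.57).

    Assumption: the signal is normalized by its maximum absolute
    amplitude and a fixed, symmetric threshold separates the subspaces.
    """
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak
    xt = np.zeros(x.shape, dtype=int)
    xt[x > threshold] = 1
    xt[x < -threshold] = -1
    return xt

def signed_product(a, b):
    """Samplewise signed product of two trichotomized signals, cf. (6.58)."""
    out = np.zeros(a.shape, dtype=int)
    out[a == b] = 1                     # equal values, including two zeros
    out[(a == -b) & (a != 0)] = -1      # opposite, nonzero values
    return out

def scc(xi, xj, threshold=0.1):
    """Signed correlation coefficient, cf. (6.59).

    The shorter of the two (already aligned) signals determines N.
    """
    n = min(len(xi), len(xj))
    a = trichotomize(xi[:n], threshold)
    b = trichotomize(xj[:n], threshold)
    return signed_product(a, b).mean()
```

Note that, with the definition in (6.58), samples where both trichotomized signals are zero contribute \(+1\), so an inverted copy of a waveform reaches exactly \(-1\) only when none of its samples falls in \(S_z\).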

In a simplified version of the SCC, the trichotomization in (6.57) is omitted, i.e., \(x_{t,i}(n) \equiv x_i(n)\), and the signed product is redefined so that dichotomization is performed on the difference between \(x_{t,i}(n)\) and \(x_{t,j}(n)\) [86],

$$\begin{aligned} x_{t,i}(n) \otimes x_{t,j}(n) = \left\{ \begin{array}{ll} 1, &{} \quad |x_{t,i}(n) - x_{t,j}(n)| \le \eta ,\\ -1, &{} \quad |x_{t,i}(n) - x_{t,j}(n)| > \eta . \end{array} \right. \end{aligned}$$
(6.60)

The threshold \(\eta \) can be taken as a percentage of the combined peak-to-peak amplitudes of \(x_{t,i}(n)\) and \(x_{t,j}(n)\).

Based on \(P_{\text {SCC},i,j}, i,j=1,\ldots ,M\), whether determined using dicho- or trichotomization, morphologic regularity can be quantified by the following function [86]:

$$\begin{aligned} \kappa (r) = \frac{2}{M(M-1)} \sum _{i=1}^M \sum _{j=i+1}^M \exp \left[ -\frac{(P_{\text {SCC},i,j} -1)^2}{r^2} \right] , \end{aligned}$$
(6.61)

where \(0\le \kappa (r)\le 1\). The function \(\kappa (r)\) reaches its maximum when all f waves have identical morphology, i.e., \(P_{\text {SCC},i,j}=1\) for all combinations of i and j. The parameter r \((r>0)\) can be viewed as a threshold determining whether pairs of f waves are similar, i.e., fewer pairs are similar when r is set to a value close to zero. While exponents other than two in (6.61) have been investigated, this choice has been found to yield good overall performance [86, 91]. Thus, the three parameters \(M, \eta \), and r need to be set in the correlation-based approach to characterizing f wave regularity.
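As a rough sketch of how \(\kappa (r)\) in (6.61) may be computed with the simplified signed product in (6.60), consider the following Python code. The function names are hypothetical, and \(\eta \) is assumed here to be a fraction of the sum of the two peak-to-peak amplitudes, which is only one possible reading of "combined"; the f waves are assumed to be already detected, delineated, and aligned.

```python
import numpy as np

def dichotomized_scc(xi, xj, eta_fraction=0.15):
    """Simplified SCC based on the signed product in (6.60)."""
    n = min(len(xi), len(xj))           # shortest f wave determines N
    a = np.asarray(xi[:n], dtype=float)
    b = np.asarray(xj[:n], dtype=float)
    # Assumption: eta is a fraction of the summed peak-to-peak amplitudes.
    eta = eta_fraction * (np.ptp(a) + np.ptp(b))
    return np.where(np.abs(a - b) <= eta, 1.0, -1.0).mean()

def kappa(waves, r=1.0, eta_fraction=0.15):
    """Morphologic regularity of M >= 2 f waves, cf. (6.61)."""
    m = len(waves)
    total = 0.0
    for i in range(m):
        for j in range(i + 1, m):
            p = dichotomized_scc(waves[i], waves[j], eta_fraction)
            total += np.exp(-((p - 1.0) ** 2) / r ** 2)
    return 2.0 * total / (m * (m - 1))
```

For M identical f waves, every pairwise coefficient equals one and the function returns \(\kappa (r)=1\), in agreement with the maximum stated above.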

Figure 6.12 illustrates the use of \(\kappa (r)\) for an extracted f wave signal exhibiting substantial variation in both amplitude and morphology. It is noted that \(\kappa (r)\) approaches zero in intervals with waning f waves, but is close to one in intervals with waxing f waves.

Fig. 6.12 a Extracted f wave signal and b related regularity function \(\kappa (r=1)\) computed in a sliding window, using \(M=5\) and \(\eta =0.15\)

6.4.4 Entropy-Based Characterization of Regularity

Entropy measures provide information on nonlinear characteristics of a signal which is complementary to the information provided by linear transformation methods such as spectral analysis and PCA. The signal characteristic quantified by entropy is usually referred to as complexity, with regularity, predictability, repeatability, and self-similarity as alternative descriptions. For f wave signals, entropy may also be viewed as a measure of “AF organization” [92]—a term originating from electrogram-based analysis where the aim is to quantify the organization of local activity as well as the spatial organization (coordination) between different regions of the atria [93]. However, a widely accepted definition of “AF organization” is unfortunately missing.

A large number of entropy measures have been proposed, most of them resulting from different approaches to estimation [94, 95]. Shannon entropy \(I_{\mathrm {ShEn}}\) [96], approximate entropy \(I_{\mathrm {ApEn}}\) [97], sample entropy \(I_{\mathrm {SampEn}}\) [98], spectral entropy \(I_{\text {SE}}\) [99], wavelet entropy [100], conditional entropy [101], and fuzzy entropy [91] have all been investigated in the realm of AF, either to characterize RR interval irregularity in AF detection and AF management (Chaps. 4 and 7, respectively) or f wave regularity, i.e., the topic of this section.

In an early study, \(I_{\mathrm {SampEn}}\) was used to predict the termination of AF episodes in ambulatory ECG recordings [78]. The results showed that \(I_{\mathrm {SampEn}}\) could not distinguish terminating from nonterminating AF, probably due to the often poor signal quality which precluded reliable computation of \(I_{\mathrm {SampEn}}\). In a later study, it was shown that both \(I_{\mathrm {SampEn}}\) and \(I_{\mathrm {ApEn}}\) are sensitive to the presence of spike artifacts [102], i.e., QRS-related residuals, which would lead to improper characterization of f wave regularity. Thus, the accuracy of \(I_{\mathrm {SampEn}}\) depends on the prevailing signal quality.

A means to reduce the influence of noise is to bandpass filter the extracted f wave signal, implemented either by reconstructing the signal from the wavelet coefficients of the scale containing the DAF [103, 104], or using the output of a bandpass filter whose center frequency is defined by the DAF [70], i.e., the approach employed in phase analysis, cf. Sect. 6.4.1. Interestingly, when computing \(I_{\mathrm {SampEn}}\) from a DAF-controlled bandpass filtered signal with a 3-Hz bandwidth [70], termination of paroxysmal AF could be predicted in the database that had previously been analyzed without success in [78]. This result demonstrates that AF termination is associated with a change in f wave regularity, with the f waves becoming increasingly regular just before termination. It also demonstrates that entropy-based prediction calls for bandpass filtering of the f wave signal.

The idea to use a DAF-controlled bandpass filter was later expanded into a DAF-controlled filter bank, composed of harmonically-related bandpass filters, cf. Sect. 6.4.1, thus making it possible to compute \(I_{\mathrm {SampEn}}\) for each harmonic component [105]. Since \(I_{\mathrm {SampEn}}\) does not in itself convey any information on the strength of a harmonic component, a measure of strength is needed to judge the significance of the harmonics. In [105], strength was quantified by the relative energy of the second and the third harmonic components.

Before computation of \(I_{\mathrm {SampEn}}\), three parameters need to be set: the length m of the two subsequences to be compared, the similarity tolerance r, and the number of samples N, cf. the definition in (4.12). With respect to m and r, an early recommendation was to use \(m=1\) or 2 together with \(0.1\le r/\sigma _x \le 0.2\), where \(\sigma _x\) denotes the standard deviation of the analyzed signal [106,107,108]. This recommendation, which was based on biomedical signals with relatively slow dynamics, was later found to be less appropriate for signals with fast dynamics [109], thus motivating an investigation of how to choose optimal values of m and r in applications where f wave characterization is required. Using \(I_{\mathrm {SampEn}}\) to predict termination of paroxysmal AF and outcome of electrical cardioversion in persistent AF, the choice of m and r was found to have significant influence on prediction performance [110]. In particular, when optimizing the performance of a predictor or classifier, the results suggested that a wider range of values of m and r should be considered than suggested by the early recommendation.

The sampling rate of the f wave signal influences the computation of \(I_{\mathrm {SampEn}}\): the probability that two subsequences are judged similar, i.e., that the maximum norm in (4.11) is below r, increases with the sampling rate because the sample-to-sample changes become smaller. To mitigate the problem that oversampling can produce misleading values of \(I_{\mathrm {SampEn}}\), a lag of L samples may be introduced between successive samples in the two subsequences for comparison, where L is related to the degree of oversampling [111]. When counting the number of similar subsequences in (4.12), only those which are L samples apart are considered. The lag may be determined from the properties of the autocorrelation function of the analyzed signal, e.g., its first zero-crossing [111]. Using simulated signals, the lag-based definition of \(I_{\mathrm {SampEn}}\) was found to produce consistent results at different sampling rates, while the original definition did not.
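To make the role of the lag concrete, the following Python sketch computes sample entropy from subsequences whose successive samples are `lag` samples apart, together with a lag chosen as the first zero-crossing of the autocorrelation function. It is one possible reading of the lag-based definition and is not taken from [111]; the function names, the tolerance \(r = 0.2\sigma _x\), and the use of "less than or equal to r" as the similarity test are assumptions.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2, lag=1):
    """Sample entropy with templates x(i), x(i+lag), ..., x(i+(m-1)lag).

    lag=1 gives the conventional definition. The tolerance is
    r = r_factor * std(x) (an assumed, commonly used choice).
    """
    x = np.asarray(x, dtype=float)
    r = r_factor * np.std(x)
    n_templates = len(x) - m * lag      # same count for lengths m and m+1
    if n_templates < 2:
        raise ValueError("signal too short for the chosen m and lag")

    def count_similar_pairs(length):
        idx = np.arange(n_templates)[:, None] + lag * np.arange(length)[None, :]
        templates = x[idx]
        count = 0
        for i in range(n_templates - 1):
            # Maximum norm between template i and all later templates.
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= r))
        return count

    b = count_similar_pairs(m)
    a = count_similar_pairs(m + 1)
    if a == 0 or b == 0:
        return float("inf")             # no matches: entropy undefined
    return -np.log(a / b)

def first_zero_crossing_lag(x):
    """First lag at which the sample autocorrelation is not positive."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    nonpositive = np.nonzero(acf <= 0)[0]
    return int(nonpositive[0]) if nonpositive.size else 1
```

A typical call would first determine `lag = first_zero_crossing_lag(x)` and then evaluate `sample_entropy(x, m=2, lag=lag)`; whether this particular lag is suitable for a given f wave signal has to be judged from the data.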

A straightforward approach to choosing the sampling rate is to rely on knowledge from spectral analysis of the f wave signal, suggesting that frequencies up to about 25 Hz are relevant and thus a sampling rate of at least 50 Hz should be used. However, higher frequencies may still be relevant to the computation of \(I_{\mathrm {SampEn}}\), therefore motivating the use of a sampling rate higher than 50 Hz. Yet another approach is to choose the sampling rate which offers the best performance, for example, when the aim is to predict AF termination or to predict the outcome of electrical cardioversion [110]; for these two prediction problems, the best-performing sampling rate was found to be as high as 250 Hz.

The number of samples N should be chosen large enough so that the dynamics of several f waves is captured, where at least one second of the f wave signal is used to compute \(I_{\mathrm {SampEn}}\) [110]. While the choice of N is related to the sampling rate, there seems to be general consensus that N should not be less than 200–250 samples, irrespective of sampling rate, to provide reasonably accurate estimates of \(I_{\mathrm {SampEn}}\) [110, 112, 113].

6.5 Signal Quality Control

Several indices have been proposed for assessing the overall quality of ECG signals, e.g., the relative power of baseline variation, signal kurtosis, and the ratio of the number of beats detected by two different QRS detectors where one detector is tuned to be more sensitive to noise than the other [79, 114]. Unfortunately, these indices do not provide information on whether f wave characterization can be reliably performed. Therefore, a few methods have been developed for assessment of the signal quality, operating either in the time domain (Sect. 6.5.1) or the frequency domain (Sect. 6.5.2). Segments are discarded if the signal quality index (SQI) fulfills certain criteria. A completely different approach to dealing with poor signal quality is to postprocess the series of DAF estimates resulting from time–frequency analysis of the f wave signal [115].

6.5.1 Time Domain Analysis

Model-based assessment of signal quality explores basic information of the f wave signal, such as the variational patterns of amplitude and repetition rate. The harmonic model in (6.27), but with phase also included, is useful for such assessment [116]. Building on the observation that the variation in the DAF is restricted in short signal segments, a model signal can be reconstructed accounting for local variation in frequency and amplitude. The SQI is defined by the error between the observed signal and the reconstructed model signal.

The f waves are modeled by a complex signal defined by the sum of K harmonically related, complex exponentials with fundamental frequency \(\omega _0\), corrupted by additive, white, complex Gaussian noise v(n),

$$\begin{aligned} x(n)&= \sum _{k=1}^{K} A_k e^{j(k\omega _0 n + \phi _k)} + v(n), \quad n=0,\ldots ,N-1, \end{aligned}$$
(6.62)

where \(A_k\) and \(\phi _k\) denote the amplitude and phase, respectively, of the k-th harmonic. The parameters \(A_1, \phi _1, \ldots , A_{K}, \phi _{K}\), contained in the \(2K \times 1\) vector

$$\begin{aligned} \pmb {\theta }&= \begin{bmatrix} A_1&\phi _1&\cdots&A_{K}&\phi _{K} \end{bmatrix}^T, \end{aligned}$$
(6.63)

and \(\omega _0\) are assumed to be deterministic, but unknown. In matrix format, the model in (6.62) is given by

$$\begin{aligned} \mathbf {x}&= \mathbf {Z}(\omega _0)\mathbf {a}(\pmb {\theta }) + \mathbf {v}, \end{aligned}$$
(6.64)

where \(\mathbf {a}(\pmb {\theta })\) is a \(K \times 1\) vector,

$$\begin{aligned} \mathbf {a}(\pmb {\theta }) = \begin{bmatrix} A_1e^{j\phi _1}&\cdots&A_{K} e^{j\phi _{K}} \end{bmatrix}^T, \end{aligned}$$
(6.65)

and \(\mathbf {Z}(\omega _0)\) is an \(N \times K\) Vandermonde matrix containing the frequency information,

$$\begin{aligned} \mathbf {Z}(\omega _0) = \begin{bmatrix} 1&1&\cdots&1\\ e^{j \omega _01}&e^{j2\omega _01}&\cdots&e^{jK\omega _01}\\ \vdots&\vdots&\ddots&\vdots \\ e^{j \omega _0(N-1)}&e^{j2\omega _0(N-1)}&\cdots&e^{jK\omega _0(N-1)}\\ \end{bmatrix}. \end{aligned}$$
(6.66)

Unfortunately, joint ML estimation of \(\mathbf {a}(\pmb {\theta })\) and \(\omega _0\), defined by [116],

$$\begin{aligned}{}[\hat{\omega }_0,\hat{\pmb {\theta }}] = \arg \min _{\omega _0,\pmb {\theta }} \Vert \mathbf {x}-\mathbf {Z}(\omega _0) \mathbf {a}(\pmb {\theta })\Vert ^2, \end{aligned}$$
(6.67)

does not result in closed-form expressions of the estimators \(\hat{\omega }_0\) and \(\hat{\pmb {\theta }}\). Therefore, a suboptimal, two-step approach is considered in which \(\mathbf {a}(\pmb {\theta })\) is first determined by LS estimation, followed by insertion of the resulting \(\hat{\mathbf {a}}(\pmb {\theta })\) into the ML estimator of \(\omega _0\). For a given \(\omega _0\), the LS estimator is given by [117], see also (5.53):

$$\begin{aligned} \hat{\mathbf {a}}(\pmb {\theta }) = (\mathbf {Z}(\omega _0)^H\mathbf {Z}(\omega _0))^{-1}\mathbf {Z}(\omega _0)^H\mathbf {x}. \end{aligned}$$
(6.68)

Inserting \(\hat{\mathbf {a}}(\pmb {\theta })\) in (6.67), the ML estimator of \(\omega _0\) is defined by

$$\begin{aligned} \hat{\omega }_0 = \arg \min _{\omega _{0,\mathrm {min}} \le \omega _0 \le \omega _{0,\mathrm {max}}} \Vert \mathbf {x}-\mathbf {Z}(\omega _0) (\mathbf {Z}(\omega _0)^H\mathbf {Z}(\omega _0))^{-1} \mathbf {Z}(\omega _0)^H\mathbf {x} \Vert ^2, \end{aligned}$$
(6.69)

where minimization is performed using a grid search over the frequency interval \([\omega _{0,\mathrm {min}},\omega _{0,\mathrm {max}}]\) in which the DAF is likely to be found. The estimate \(\hat{\omega }_0\) represents a global frequency estimate as it is based on the segment with N samples, having a length of several seconds.

Variation in the DAF is allowed by dividing \(\mathbf {x}\) into P overlapping subsegments \(\mathbf {x}_p\), \(p=1,\ldots ,P\). Each subsegment contains L samples, with L chosen so that the subsegment contains at least one f wave. For each subsegment, a local frequency estimate \(\hat{\omega }_{0,p}\) is determined, using

$$\begin{aligned} \hat{\omega }_{0,p} = \arg \underset{ |\omega _{0,p}-\hat{\omega }_0| \le \varDelta \omega _0}{\min } \Vert \mathbf {x}_p-\mathbf {Z}_L(\omega _{0,p})(\mathbf {Z}_L(\omega _{0,p})^H\mathbf {Z}_L(\omega _{0,p}))^{-1}\mathbf {Z}_L(\omega _{0,p})^H\mathbf {x}_p \Vert ^2, \end{aligned}$$
(6.70)

where \(\mathbf {Z}_L(\omega _{0,p})\) consists of the first L rows of \(\mathbf {Z}(\omega _{0,p})\) and \(\varDelta \omega _0\) is the maximum deviation from \(\hat{\omega }_0\) in any of the P subsegments. This implies that \(\hat{\omega }_{0,p}\) accounts for short-time variation as long as it does not deviate more than \(\varDelta \omega _0\) from \(\hat{\omega }_0\).
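The two-step estimation in (6.68) and (6.69), followed by the constrained local search in (6.70), can be sketched as follows. The code is an illustration under stated assumptions rather than the implementation in [116]: the function names, the grid resolution, the hop size between subsegments, and the use of `numpy.linalg.lstsq` (which solves the same LS problem as (6.68) without forming the inverse explicitly) are choices made here.

```python
import numpy as np

def vandermonde(omega0, n_samples, n_harmonics):
    """N x K matrix Z(omega0) of (6.66)."""
    n = np.arange(n_samples)[:, None]
    k = np.arange(1, n_harmonics + 1)[None, :]
    return np.exp(1j * omega0 * n * k)

def ls_amplitudes(x, omega0, n_harmonics):
    """LS estimate of the complex amplitudes for a given omega0, cf. (6.68)."""
    z = vandermonde(omega0, len(x), n_harmonics)
    a_hat, *_ = np.linalg.lstsq(z, x, rcond=None)
    return a_hat

def estimate_fundamental(x, omega_grid, n_harmonics):
    """Grid search minimizing the residual energy, cf. (6.69)."""
    errors = []
    for omega0 in omega_grid:
        z = vandermonde(omega0, len(x), n_harmonics)
        a_hat, *_ = np.linalg.lstsq(z, x, rcond=None)
        errors.append(np.linalg.norm(x - z @ a_hat) ** 2)
    return omega_grid[int(np.argmin(errors))]

def local_fundamentals(x, omega0_hat, delta, seg_len, hop, n_harmonics, n_grid=41):
    """Local estimates within +/- delta of the global estimate, cf. (6.70).

    Overlapping subsegments of seg_len samples start every hop samples
    (hop is an assumed parameter controlling the overlap).
    """
    grid = np.linspace(omega0_hat - delta, omega0_hat + delta, n_grid)
    starts = range(0, len(x) - seg_len + 1, hop)
    return np.array([estimate_fundamental(x[s:s + seg_len], grid, n_harmonics)
                     for s in starts])
```

Frequencies are expressed in radians per sample, so a search interval given in Hz has to be multiplied by \(2\pi \) and divided by the sampling rate before it is passed as `omega_grid`.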

Reconstruction in terms of the signal part in (6.62) has the disadvantage of yielding a fixed amplitude and a fixed phase within the analyzed N-sample segment, thus motivating the use of a basis vector approach which can produce a signal with time-varying amplitude. The local DAF estimates \(\hat{\omega }_{0,p}\) are used to create constant-amplitude basis vectors \(\mathbf {b}_k\), \(k=1, \ldots ,K\), describing the phase variation of the signal. The vector \(\hat{\mathbf {a}}_p(\pmb {\theta }_p)\), containing local amplitude and phase information, is obtained using the LS estimator in (6.68), but with \(\mathbf {Z}_L(\hat{\omega }_{0,p})\) replacing \(\mathbf {Z}(\omega _{0})\) and \(\mathbf {x}_p\) replacing \(\mathbf {x}\). The vector \(\mathbf {y}_{k,p}\) is then computed from \(\phi _{k,p}\), i.e., the phase of the k-th element of \(\hat{\mathbf {a}}_p({\pmb {\theta }_p})\), and \(\hat{\omega }_{0,p}\),

$$\begin{aligned} \mathbf {y}_{k,p} = \begin{bmatrix} k\hat{\omega }_{0,p}0+\phi _{k,p} \\ \vdots \\ k\hat{\omega }_{0,p}(L-1)+\phi _{k,p} \end{bmatrix}, \quad p=1,\ldots ,P. \end{aligned}$$
(6.71)

Since the related phase vectors \(\mathbf {y}_{k,p}\) are overlapping, the overlapping parts are averaged to produce a global \(N \times 1\) phase vector \(\mathbf {y}_k\) which then is used to construct the constant-amplitude basis vector \(\mathbf {b}_k\),

$$\begin{aligned} \mathbf {b}_k = \cos (\mathbf {y}_k), \end{aligned}$$
(6.72)

capturing the phase variation in \(\mathbf {x}\).

The time-varying amplitude of the reconstructed signal is described by the \(N \times 1\) vectors \(\pmb {\alpha }_k\),

$$\begin{aligned} \pmb {\alpha }_k = \begin{bmatrix} \alpha _k(0)&\alpha _k(1)&\cdots&\alpha _k(N-1) \end{bmatrix}^T, \end{aligned}$$
(6.73)

where the maximum sample-to-sample variation in \(\alpha _k(n)\) is limited by \(\varDelta \alpha _k\),

$$\begin{aligned} |\alpha _k(n)-\alpha _k(n-1)| \le \varDelta \alpha _k. \end{aligned}$$
(6.74)

The tolerance \(\varDelta \alpha _k\) should be chosen so that the variation in f wave amplitude is captured, but not the variation due to noise. The model signal \(\hat{\mathbf {s}}\) is obtained by summing the elementwise product of the basis vectors and the amplitude estimates of the harmonic components,

$$\begin{aligned} \hat{s}(n) = \sum _{k=1}^{K} \hat{\alpha }_k(n) b_k(n), \end{aligned}$$
(6.75)

where \(b_k(n)\) denotes the n-th element of \(\mathbf {b}_k\). The amplitude estimator \(\hat{\pmb \alpha }_k\) is obtained by minimizing the following expression:

$$\begin{aligned} \hat{\pmb \alpha }_k = \arg \underset{\pmb \alpha _k}{\min } \sum _{n=0}^{N-1} \Vert \alpha _k(n) b_k(n)- \mathrm {Re}\left[ x(n)\right] \Vert ^2, \end{aligned}$$
(6.76)

which, along with the \(N-1\) constraints in (6.74), defines a convex optimization problem that is solved numerically; the notation “Re” denotes the real part.

The SQI is defined by the normalized RMS of the model error \(\hat{\mathbf {e}}=\mathbf {x}-\hat{\mathbf {s}}\),

$$\begin{aligned} S = 1-\frac{\displaystyle \sigma _{\hat{e}}}{\displaystyle \sigma _x}, \end{aligned}$$
(6.77)

where \(\sigma _{\hat{e}}\) and \(\sigma _x\) denote the RMS of \(\hat{\mathbf {e}}\) and \(\mathbf {x}\), respectively. For any reasonable estimate of \(\hat{\mathbf {s}}\), S is restricted to the interval [0, 1], where 0 indicates poor signal quality and 1 indicates perfect modeling of \(\mathbf {x}\). A fixed threshold \(\eta _S\) can be used to indicate whether the f waves in the analyzed segment have sufficient quality for characterization, see Fig. 6.13.
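Once a model signal is available, the index in (6.77) is simple to evaluate, as the following Python sketch shows. The function names are hypothetical, and the model signal `s_hat` is assumed to have been obtained beforehand, e.g., from the constrained amplitude fit described above, which is not reproduced here.

```python
import numpy as np

def rms(v):
    """Root mean square of a real or complex vector."""
    return np.sqrt(np.mean(np.abs(v) ** 2))

def signal_quality_index(x, s_hat):
    """S = 1 - rms(x - s_hat) / rms(x), cf. (6.77)."""
    e_hat = np.asarray(x) - np.asarray(s_hat)
    return 1.0 - rms(e_hat) / rms(x)

def has_sufficient_quality(x, s_hat, eta_s):
    """Compare S with a fixed threshold eta_s supplied by the user."""
    return signal_quality_index(x, s_hat) >= eta_s
```

A perfect model gives \(S=1\), whereas a model signal that is identically zero gives \(S=0\); no particular value of \(\eta _S\) is implied by this sketch.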

Fig. 6.13 Illustration of signal quality assessment. a ECG signal obtained from a patient with AF, b extracted f wave signal containing a noisy episode, c signal quality index S (solid line) and threshold \(\eta _S\) defining acceptable signal quality (dashed line), and d extracted f wave signal where the low-quality segment has been removed based on the information in c. The segment lengths N and L were set to 5 and 0.5 s, respectively

6.5.2 Frequency Domain Analysis

A disadvantage with the spectral profile method is its lack of control of what goes into the update of the spectral profile: the spectrum of a segment with large QRS-related residuals is just as influential as is the spectrum of a segment with noise-free f waves. Although the spectral profile can have a slow adaptation rate which limits the sensitivity to occasional noisy segments, several consecutive noisy segments will cause the spectral profile to lose its structure and, accordingly, the DAF estimates can no longer be trusted. Once the spectral profile has lost its structure, the recovery time may become unacceptably long, even if subsequent segments are associated with a harmonic structure. This limitation can be remedied by adopting a spectral modeling approach in which the spectrum of each segment is checked before entering the update of the spectral profile [118]. A harmonic spectrum is modeled as a sum of Gaussian functions (cf. (5.68)),

$$\begin{aligned} S_x(\omega ,\pmb {\theta }_p) = \sum _{k=1}^{K} A_{k,p} \exp \left[ -\frac{(\omega -k\omega _{0,p} - \varDelta _{k,p})^2}{2\sigma _{k,p}^2} \right] , \end{aligned}$$
(6.78)

where K is the number of Gaussians, \(A_{k,p}\) is the spectral magnitude, \(\sigma _{k,p}\) is the width, and \(\varDelta _{k,p}\) is the frequency jitter associated with the second and higher frequencies \(k\omega _{0,p}, \ k=2,\ldots ,K\); thus, \(\varDelta _{1,p}=0\). The model parameter vector \(\pmb {\theta }_p\), containing

$$\begin{aligned} \pmb {\theta }_p = \begin{bmatrix} A_{1,p}&\cdots&A_{K,p}&\sigma _{1,p}&\cdots&\sigma _{K,p}&\varDelta _{2,p}&\cdots&\varDelta _{K,p}&\omega _{0,p} \end{bmatrix}^T, \end{aligned}$$
(6.79)

is estimated by minimizing the following weighted LS error criterion with respect to \(\pmb {\theta }_p\) :

$$\begin{aligned} J(\pmb {\theta }_p) = (|\mathbf {q}_p| - \mathbf {s}(\pmb {\theta }_p))^T \mathbf {D} \mathbf {E}_p (|\mathbf {q}_p| - \mathbf {s}(\pmb {\theta }_p)), \end{aligned}$$
(6.80)

where \(\mathbf {q}_p\) is the nonuniform, windowed Fourier transform of the analyzed signal segment, defined in (6.12). The vector \(\mathbf {s}(\pmb {\theta }_p)\) is obtained by sampling the Gaussian model in (6.78) at the logarithmic frequencies \(\nu _l\) defined in (6.14), yielding

$$\begin{aligned} \mathbf {s}(\pmb {\theta }_p) = \begin{bmatrix}S_x(\nu _0,\pmb {\theta }_p)&\cdots&S_x(\nu _{L-1},\pmb {\theta }_p) \end{bmatrix}^T. \end{aligned}$$
(6.81)

The matrices \(\mathbf {D}\) and \(\mathbf {E}_p\) are both diagonal, but handle different aspects of spectral weighting. Identical to the spectral profile method, \(\mathbf {D}\) corrects for the oversampling at lower frequencies due to the logarithmic sampling. The matrix \(\mathbf {E}_p\), on the other hand, is designed so that the frequency intervals in \(|\mathbf {q}_p|\) with harmonic components are weighted with one, whereas the remaining intervals are weighted with a value close to zero; thus, this matrix is segment-dependent, while \(\mathbf {D}\) is not. Details on the design of the matrices \(\mathbf {D}\) and \(\mathbf {E}_p\), as well as the multidimensional optimization procedure associated with \(J(\pmb {\theta }_p)\), can be found in [118].
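The Gaussian model in (6.78) and the criterion in (6.80) can be evaluated as in the following Python sketch. Since \(\mathbf {D}\) and \(\mathbf {E}_p\) are diagonal, they are represented here by vectors holding their diagonal elements, and the quadratic form reduces to a weighted sum of squared differences. The function names are hypothetical, and neither the design of the weights nor the optimization procedure in [118] is reproduced.

```python
import numpy as np

def harmonic_spectrum(omega, amplitudes, widths, jitters, omega0):
    """Sum of K Gaussians centered at k*omega0 + jitter_k, cf. (6.78).

    jitters[0] should be zero, since the first harmonic has no jitter.
    """
    omega = np.asarray(omega, dtype=float)[:, None]
    amplitudes = np.asarray(amplitudes, dtype=float)[None, :]
    widths = np.asarray(widths, dtype=float)[None, :]
    jitters = np.asarray(jitters, dtype=float)[None, :]
    k = np.arange(1, amplitudes.shape[1] + 1)[None, :]
    centers = k * omega0 + jitters
    gaussians = amplitudes * np.exp(-((omega - centers) ** 2) / (2 * widths ** 2))
    return gaussians.sum(axis=1)

def weighted_ls_error(q_mag, model, d_weights, e_weights):
    """J = (|q| - s)^T D E (|q| - s) for diagonal D and E, cf. (6.80)."""
    diff = np.asarray(q_mag, dtype=float) - np.asarray(model, dtype=float)
    return float(np.sum(np.asarray(d_weights) * np.asarray(e_weights) * diff ** 2))
```

In practice, `harmonic_spectrum` would be sampled at the logarithmic frequencies \(\nu _l\) to form \(\mathbf {s}(\pmb {\theta }_p)\) in (6.81), and `weighted_ls_error` would serve as the objective passed to a numerical optimizer.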

A set of parameters characterizing the harmonic pattern is introduced to decide whether \(\mathbf {q}_p\) should be excluded from the spectral profile update, i.e., whether or not \(\alpha _p\) in (6.18) should be set to zero. The following three parameters, of which the first two relate to the model in (6.78), are used to exclude spectra which do not exhibit a harmonic structure [118]:

  1. The minimized error \(J(\hat{\pmb {\theta }}_p)\), quantifying the similarity between \(\mathbf {q}_p\) and the model spectrum \(\mathbf {s}(\hat{\pmb {\theta }}_p)\).
  2. The width \(\hat{\sigma }_{1,p}\), characterizing the spectral peak of the first harmonic.
  3. The ratio of the maximum magnitude between the first and the second harmonics and the magnitude of the first harmonic, picking up the occurrence of spurious peaks between the first and the second harmonic.

For poor-quality signals, Fig. 6.14a, b present the spectral profile when computed without and with application of the exclusion criteria. It is obvious that the dominant peak becomes much more distinct when noisy segments are excluded from the update of the spectral profile. For good-quality signals, the spectral profile remains essentially unchanged after application of the exclusion criteria, see Fig. 6.14c, d.

Fig. 6.14 Spectral profiles a before (dashed line) and b after application of exclusion criteria (solid line), obtained from extracted f wave signals containing large-amplitude QRS residuals. Spectral profiles c before and d after application of exclusion criteria, obtained from f wave signals with good quality

6.6 Spatial Characterization

Most parameters proposed for f wave characterization are defined with reference to single-lead analysis, and extended to multi-lead ECG analysis by simply computing the parameters on a lead-by-lead basis. This approach has the disadvantage of ignoring intrinsic spatial information resulting from joint analysis of available leads. The vectorcardiographic f wave loops, defined by the orthogonal leads X, Y, and Z, provide basic spatial information (Sect. 6.6.1), whereas body surface potential mapping (BSPM) can provide much more comprehensive spatial information on AF activation patterns (Sect. 6.6.2). For example, the regions which are responsible for AF maintenance may be localized from such maps, with potential implications on AF treatment since regional information may contribute to improve the planning of an ablation procedure [119]. From an engineering viewpoint, spatial characterization of body surface maps is still in its infancy, leaving much room for the development of robust, tailored signal processing algorithms. So far, most types of spatial analysis are extended versions of single-lead analysis, e.g., estimation of the DAF and phase analysis.

Body surface potential mapping is also the starting point for reconstruction of the potentials on the epicardial surface of the heart—a technique known as ECG imaging (ECGI). From the time sequence of epicardial potentials, electrograms can be constructed at different locations on the epicardium. Since ECG imaging involves several advanced aspects which are far outside the scope of this book, such as techniques for solving the inverse problem and imaging techniques to obtain subject-specific information on the geometries of the heart and the torso surfaces (based on computer tomography or magnetic resonance imaging), the interested reader is referred to the literature in this area [20, 120,121,122,123,124].

6.6.1 Vectorcardiogram Loop Analysis

The precursor to vectorcardiogram (VCG) loop analysis of f waves was a study which investigated the characteristics of loops in atrial flutter [125]. Since the reentry circuit of isthmus-dependent atrial flutter is known to contribute significantly to the VCG, it was hypothesized that flutter loops would be mostly contained in a two-dimensional plane whose orientation is approximately parallel to the reentry circuit. To corroborate this hypothesis, the planarity of each flutter loop was determined, as well as the orientation of the plane, described by the azimuth and elevation angles relative to the frontal plane. By analyzing the VCG synthesized from the 12-lead ECG (see Footnote 5), recorded in patients before undergoing catheter ablation of atrial flutter, it was shown that flutter loops were mainly planar and had orientations concentrated to a narrow region of azimuth and elevation angles, likely corresponding anatomically with the expected flutter circuit. Atrial flutter waves in intervals without ventricular activity were analyzed on a wave-by-wave basis, i.e., each flutter wave was delineated manually.

This study laid the foundation for a number of studies investigating f wave loops [128,129,130], see also [131]. In contrast to flutter waves, f waves are less organized, and, therefore, spatial f wave analysis is more difficult to pursue. Spatial analysis can either be based on individual f waves in TQ intervals [128, 129] or an extracted signal containing multiple f waves. The latter case is preferable when low-amplitude f waves and noise, in combination with short TQ intervals, are to be analyzed [130]. Moreover, in the latter case, there is no need to delineate individual f waves, but a segment of the extracted signal can be analyzed. The data matrix is formed by the three orthogonal leads X, Y, and Z,

$$\begin{aligned} \mathbf {X}= \begin{bmatrix} \mathbf {x}_{\text {X}}&\mathbf {x}_{\text {Y}}&\mathbf {x}_{\text {Z}} \end{bmatrix}^T, \end{aligned}$$
(6.82)

where each column vector, i.e., lead, contains N samples. Segment lengths of 1 s and 60 s were analyzed in [130].

The plane-of-best-fit is defined as the plane for which the two-dimensional projection of the loop produces the minimum MSE with respect to the original loop. The plane is determined from eigenanalysis of the sample correlation matrix of the data in \(\mathbf {X}\), resulting in the three eigenvectors \(\pmb {\varphi }_1, \pmb {\varphi }_2\), and \(\pmb {\varphi }_3\) associated with the eigenvalues \(\lambda _1 \ge \lambda _2 \ge \lambda _3\). The eigenvector \(\pmb {\varphi }_1\) defines the principal axis, i.e., the axis with the largest correlation among the data, \(\pmb {\varphi }_2\) spans the plane-of-best-fit together with the principal axis, and \(\pmb {\varphi }_3 = \begin{bmatrix} \varphi _{3,\text {X}}, \varphi _{3,\text {Y}}, \varphi _{3,\text {Z}}\end{bmatrix}^T\) is the perpendicular axis which defines the azimuth and elevation angles of the plane-of-best-fit:

$$\begin{aligned} \phi _{\text {AZ}}&= \arctan \left( \frac{\varphi _{3,\text {Z}}^{}}{\varphi _{3,\text {X}}^{}} \right) , \end{aligned}$$
(6.83)
$$\begin{aligned} \phi _{\text {EL}}&= \left| \arctan \left( \frac{\varphi _{3,\text {Y}}^{}}{\sqrt{\varphi _{3,\text {X}}^2+\varphi _{3,\text {Z}}^2} } \right) \right| , \end{aligned}$$
(6.84)

where \(-90^{\circ }< \phi _{\text {AZ}} < 90^{\circ }\) and \(0< \phi _{\text {EL}} < 90^{\circ }\). Loop planarity is defined as [132]

$$\begin{aligned} \psi _{\text {PL}} = \frac{\lambda _3}{\lambda _1+\lambda _2+\lambda _3}, \end{aligned}$$
(6.85)

which is close to zero when the loop is essentially planar. Thus, the characterization of a segment containing several f waves embraces the three parameters \(\phi _{\text {AZ}},\phi _{\text {EL}}\), and \(\psi _{\text {PL}}\) [130].
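The three loop parameters in (6.83) to (6.85) can be computed along the following lines. The sketch assumes that the sample correlation matrix is formed without mean removal as \(\mathbf {X}\mathbf {X}^T/N\), and it resolves the sign ambiguity of \(\pmb {\varphi }_3\) by requiring a nonnegative X component so that the azimuth falls between \(-90^{\circ }\) and \(90^{\circ }\); the function name and these conventions are assumptions made for illustration.

```python
import numpy as np

def loop_parameters(x_lead, y_lead, z_lead):
    """Azimuth (deg), elevation (deg), and planarity of a VCG loop segment."""
    data = np.vstack([x_lead, y_lead, z_lead]).astype(float)   # 3 x N, cf. (6.82)
    corr = data @ data.T / data.shape[1]     # assumed: no mean removal
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    lam3 = eigvals[0]                        # smallest eigenvalue
    phi3 = eigvecs[:, 0]                     # axis perpendicular to the plane
    if phi3[0] < 0:
        phi3 = -phi3                         # fix sign so that phi3_X >= 0
    azimuth = np.degrees(np.arctan2(phi3[2], phi3[0]))                       # (6.83)
    elevation = np.degrees(abs(np.arctan2(phi3[1],
                                          np.hypot(phi3[0], phi3[2]))))     # (6.84)
    planarity = lam3 / eigvals.sum()                                         # (6.85)
    return azimuth, elevation, planarity
```

For a loop confined to the plane spanned by the Y and Z axes, the perpendicular axis coincides with the X axis, and the sketch returns an azimuth and an elevation of \(0^{\circ }\) together with a planarity of essentially zero.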

Although the results from VCG loop analysis have had few implications on AF treatment, they have still provided certain qualitative information. Notably, varying degrees of organization have been observed, where the more organized cases have their plane-of-best-fit near the sagittal plane [128]. Moreover, a relatively weak coupling between loop morphology and the DAF was observed, suggesting that both these parameters may have a place in AF classification [130]. Analysis of the pseudo-VCG, defined by the leads V\(_5\), aVF, and V\(_1\), suggests that changes in loop morphology may be used to predict conversion from AF to atrial tachycardia, information which in turn may be used to establish when the therapy is on an effective path [133].

6.6.2 Body Surface Potential Mapping

Noninvasive, spatiotemporal analysis of electrical activation patterns may be performed on a body surface map constructed from a large number of leads which are placed on the anterior and posterior thorax. In the context of AF, such analysis was first considered in [134], with the overall aim of establishing whether single wavefronts as well as multiple simultaneous wavefronts, previously observed in intracardiac maps [135,136,137], could also be observed in body surface maps. Of the 56 leads, recorded during four minutes, 40 were arranged in matrix format on the anterior thorax and 16 on the posterior thorax. The traditional approaches to cardiac mapping of invasive data, i.e., isopotential mapping and isochronal mapping, were adopted for visualizing and analyzing cardiac activation [134]. The isopotential map displays the voltage for different electrode positions on the body surface at a given time instant, with contour lines connecting points of equal voltage. The isochronal map displays contour lines which connect points of equal activation time, often accompanied by one or several arrows to indicate the major propagation path. While it is straightforward to construct an isopotential map from the samples of the multi-lead f wave signal, the isochronal map requires that the activation time is determined for each electrode position. Each isochronal contour line is identified from the isopotential map as the line for which the voltage is equal to zero; to have a single representation of each activation wavefront, instead of having both forward and backward movement of the wavefront (i.e., atrial de- and repolarization), only points with a positive slope should be used for identification of the contour line. To improve spatial resolution, interpolation can be applied to the isopotential map, which in turn implies improved resolution of the isochronal map.
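The positive-slope zero-crossing rule described above may be sketched as follows for a multi-lead signal stored as an \(L \times N\) array. This is an illustration only, not the procedure of [134]: the function name `activation_times` is hypothetical, and the linear interpolation between adjacent samples is an added assumption for obtaining sub-sample timing.

```python
import numpy as np

def activation_times(x, fs):
    """Times (s) of positive-slope zero crossings in each lead.

    x is an L x N array (one row per lead) and fs the sampling rate in Hz.
    Returns a list with one array of activation times per lead.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    times = []
    for lead in x:
        # A crossing lies between samples n and n + 1 when x[n] < 0 <= x[n + 1]
        idx = np.flatnonzero((lead[:-1] < 0) & (lead[1:] >= 0))
        # Linear interpolation between the two samples (assumption)
        frac = -lead[idx] / (lead[idx + 1] - lead[idx])
        times.append((idx + frac) / fs)
    return times
```

Negative-slope crossings are ignored, so each wavefront is represented once per lead. The resulting activation times at the different electrode positions form the basis for drawing isochronal contour lines.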

The information conveyed by noninvasive isochronal maps has been assessed qualitatively by classifying maps into the following three types [134], originally developed for electrogram-based analysis [138]: Type I (single wavefront), Type II (single wavefront with wave breakages and splitting), or Type III (multiple simultaneous wavefronts or none at all). On a data set consisting of 14 patients with persistent AF, all three types were represented, leading the authors to conclude that isochronal mapping has the potential to characterize activation patterns in AF. However, no comparison was made to invasively recorded activation maps. Figure 6.15 illustrates isopotential and isochronal maps, in both cases determined from a subinterval of an f wave.

Fig. 6.15

(Reprinted from [134] with permission)

a Isopotential maps obtained at three time instants of an f wave, using a 56-lead system for body surface potential mapping. Each isopotential map is composed of two submaps: one based on the anterior leads and another, smaller one based on the posterior leads. The solid, black line in each map connects the points with zero voltage. b Isochronal map of the f wave in (a), where contour lines are drawn every 2 ms. Note that the three zero-voltage lines in (a) are also part of the isochronal map, indicated by the numbers 1, 2, and 3.

Accurate identification of isochronal contour lines calls for high-quality signals, which in BSPM analysis implies the use of bandpass filtering to reduce the influence of baseline wander (particularly critical when finding the time for zero voltage) and myoelectrical noise. So far, TQ-based f wave analysis has been performed instead of f wave extraction to avoid the risk of analyzing QRS-related residuals [14, 80, 119, 134, 139]. Even when these precautions are taken, f wave amplitude may be so low that accurate determination of the activation times is not possible, especially for leads positioned far away from the atria. Since an isochronal map displays only one activation, variation in f wave amplitude and morphology may call for multiple maps, rendering the interpretation more complex [140].

Noninvasive isofrequency mapping in AF means the construction of a map displaying the spatial distribution of the DAF (“DAF map”), where the DAF is estimated in each lead using any of the techniques described in Sect. 6.3.1. Since the DAF map does not require the determination of activation times, its computation is much more straightforward. An important application of the DAF map is the identification of high-frequency sources which play an important role in the maintenance of AF [119]. Knowledge of the location of such sources is expected to improve the planning and outcome of ablation, an expectation supported by results obtained from invasive DAF maps showing that ablation guided by the identification of high-frequency sources increases the likelihood of long-term maintenance of sinus rhythm [141]. A comparison of the locations of the highest frequency source in the surface and invasive DAF maps, where the latter map served as the reference, demonstrated statistically significant correlation [119]. The agreement between these two types of DAF map is illustrated in Fig. 6.16, where the highest frequency source has a similar location in both types of map.

Fig. 6.16

(Reprinted from [119] with permission)

a Electrograms recorded at different atrial sites and related power spectra, with the dominant atrial frequency (DAF) indicated, and b surface ECG leads and related power spectra. c Invasive DAF map obtained by electroanatomical mapping. The arrow points to the right atrial (RA) region with highest DAF. d Noninvasive DAF map with superimposed locations of the electrodes used in (b). The following acronyms are used: coronary sinus (CS), left atrial (LA), left inferior pulmonary vein (LIPV), left superior pulmonary vein (LSPV), right superior pulmonary vein (RSPV), surface left (SL), surface posterior (SP), surface right (SR), and superior vena cava (SVC).
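A noninvasive DAF map may be sketched as a per-lead search for the dominant spectral peak. The code below is a simplified stand-in for the techniques of Sect. 6.3.1: it uses a Hann-windowed, zero-padded periodogram, and the search band of 3–12 Hz, the zero-padding factor, and the function name `daf_per_lead` are assumptions made for illustration.

```python
import numpy as np

def daf_per_lead(x, fs, band=(3.0, 12.0), nfft=None):
    """Dominant atrial frequency (Hz) in each lead of an L x N array.

    The DAF is taken as the location of the largest periodogram peak
    within `band` (an assumed search interval).
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    x = x - x.mean(axis=1, keepdims=True)      # remove the DC level
    n = x.shape[1]
    nfft = nfft or 8 * n                       # zero padding for a finer grid
    spec = np.abs(np.fft.rfft(x * np.hanning(n), n=nfft, axis=1)) ** 2
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return freqs[in_band][np.argmax(spec[:, in_band], axis=1)]
```

The lead with the highest DAF, obtained as `np.argmax(daf_per_lead(x, fs))`, then indicates the region of the map closest to the highest frequency source. Assigning the per-lead values to electrode positions and interpolating between them is left out of the sketch.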

Phase mapping is a tool particularly well-suited for characterizing temporal changes in spatial activation patterns in cardiac fibrillation, notably rotor activity [142]. The term “rotor” refers to an activation wavefront circulating in an organized fashion around a center of rotation (“phase singularity point”). The engine in phase mapping is the Hilbert-based instantaneous phase computation, defined in (6.39), performed at regular time intervals in all the available leads to produce a time sequence of phase maps (“phase movie”). From this movie, the presence of a phase singularity point is identified as the site where the curved activation wavefront and wavetail of the rotor meet each other, i.e., a point where the phase of the rotating waves progresses through a complete cycle from \(-\pi \) to \(\pi \) [142, 143], see also [144, 145].
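The two steps described above may be sketched as follows: computing the instantaneous phase from the analytic signal, and locating phase singularity points in a single phase map. The sketch assumes that the leads are arranged on a regular rectangular grid and detects a singularity as a \(2 \times 2\) cell around which the wrapped phase differences sum to \(\pm 2\pi \); this closed-path criterion and all function names are illustrative assumptions rather than the procedures used in [142, 143].

```python
import numpy as np

def instantaneous_phase(x):
    """Phase of the analytic signal along the last axis (FFT-based)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    h = np.zeros(n)                  # weights producing the analytic signal
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(np.fft.fft(x, axis=-1) * h, axis=-1)
    return np.angle(analytic)

def wrap(d):
    """Wrap phase differences to the interval [-pi, pi)."""
    return (d + np.pi) % (2.0 * np.pi) - np.pi

def topological_charge(phase_map):
    """Charge (-1, 0, or +1) of each 2 x 2 cell of a rows x cols phase map."""
    p = np.asarray(phase_map, dtype=float)
    d1 = wrap(p[:-1, 1:] - p[:-1, :-1])    # top-left     -> top-right
    d2 = wrap(p[1:, 1:] - p[:-1, 1:])      # top-right    -> bottom-right
    d3 = wrap(p[1:, :-1] - p[1:, 1:])      # bottom-right -> bottom-left
    d4 = wrap(p[:-1, :-1] - p[1:, :-1])    # bottom-left  -> top-left
    return np.rint((d1 + d2 + d3 + d4) / (2.0 * np.pi)).astype(int)
```

Applying `topological_charge` to each frame of the phase movie yields the cells with nonzero charge, whose positions over time describe candidate rotor trajectories. The sign of the charge distinguishes the two directions of rotation.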

Identification of phase singularities is important since they pinpoint where the tissue is capable of supporting rotors which drive AF. Hence, such points represent potential targets for ablation. The significance of rotor-guided ablation has been studied in patients with persistent AF, mostly with promising results [146,147,148], although poor efficacy has also been reported [149]. In these studies, the instantaneous phase map was computed from intracardiac electrograms.

As noted in Sect. 6.4.1, stable, one-dimensional phase analysis requires that the f wave signal is bandpass filtered before phase computation—an operation which is equally needed in phase mapping. It has been demonstrated that bandpass filtering, with center frequency defined by the highest DAF of all available ECG leads, provides more accurate identification of phase singularity points than when bandpass filtering is omitted [14], see also [150]. By performing bandpass filtering, rotors were found to be more long-lasting, thereby facilitating the study of rotor characteristics such as trajectory, stability, and life span, and promoting atrial sites as potential targets for ablation.

The isopotential, isochronal, isofrequency, and phase maps have in common that they provide a basis for identification of features with electrophysiological interpretation. An overall approach to noninvasive BSPM analysis, disregarding map-specific features, is based on PCA of the temporal sequence of isopotential maps, proposed for quantifying spatial complexity of atrial wavefronts [80], see also [51]. In this approach, spatial complexity is linked to dimensionality reduction: a map which can be approximated by a few eigenvectors is considered less complex (more organized) than a map which requires several eigenvectors. The starting point for analysis is the \(L \times N\) data matrix

$$\begin{aligned} \mathbf {X} = \begin{bmatrix} \mathbf {x}(0)&\mathbf {x}(1)&\cdots&\mathbf {x}(N-1) \end{bmatrix} \end{aligned}$$
(6.86)

whose columns \(\mathbf {x}(n)\) contain L leads at time n,

$$\begin{aligned} \mathbf {x}(n) = \begin{bmatrix} x_1(n) \\ x_2(n) \\ \vdots \\ x_L(n) \end{bmatrix}, \quad n=0,\ldots ,N-1, \end{aligned}$$
(6.87)

where N is the number of samples subject to analysis. Each column \(\mathbf {x}(n)\) contains a spatial map, and thus \(\mathbf {X}\) contains the entire temporal sequence of maps. Each row of \(\mathbf {X}\), i.e., \(x_l(0),\ldots ,x_l(N-1)\), contains the samples of successive, concatenated TQ intervals of the l-th lead (see Footnote 6). The onset and end of each TQ interval are determined either by the intervals related to the occurrence times of the surrounding QRS complexes [80], or by delineation of T wave end and QRS onset [139]. As already noted on page 155, the presence of f waves makes delineation challenging, especially when using a delineation algorithm not designed for, nor evaluated on, ECG signals in AF [139, 153].

The normalized cumulative sum \(R_K\) of the K largest eigenvalues \(\lambda _i\), defined in (6.52), obtained from the sample correlation matrix of \(\mathbf {X}\), cf. (5.106), provides a statistical measure of how well \(\mathbf {X}\) is approximated by \(\tilde{\mathbf {X}}\), obtained as a truncated series expansion of separable matrices resulting from SVD of \(\mathbf {X}\),

$$\begin{aligned} \tilde{\mathbf {X}} = \sum _{k=1}^K \sigma _k \mathbf {u}_k \mathbf {v}_k^T, \end{aligned}$$
(6.88)

where \(\sigma _k\) are the ordered singular values and \(\mathbf {u}_k\) and \(\mathbf {v}_k\) are the associated left and right singular vectors, respectively. Thus, for a fixed K, \(\mathbf {X}\) is considered less complex when \(R_K\) is close to one, and vice versa; K was set to 3 in [80, 139]. Alternatively, K can be set to the smallest value for which \(R_K\) exceeds 0.95 [80], and thus \(K_{0.95}\) replaces \(R_K\) as the main information carrier; a larger \(K_{0.95}\) implies higher spatial complexity. To smooth out the influence of temporal variation, \(R_3\) and \(K_{0.95}\) were computed in six consecutive 10-s segments and averaged.
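As a minimal sketch, \(R_K\) and \(K_{0.95}\) may be computed directly from the singular values of \(\mathbf {X}\), since the eigenvalues of the sample correlation matrix, assumed here to be \(\mathbf {X}\mathbf {X}^T/N\), equal \(\sigma _k^2/N\). The function name `spatial_complexity` is hypothetical.

```python
import numpy as np

def spatial_complexity(X, K=3, threshold=0.95):
    """Return (R_K, K_threshold) for an L x N data matrix X."""
    X = np.asarray(X, dtype=float)
    s = np.linalg.svd(X, compute_uv=False)      # ordered singular values
    lam = s ** 2 / X.shape[1]                   # eigenvalues of X X^T / N
    R = np.cumsum(lam) / lam.sum()              # R_1, R_2, ..., R_L
    K_thr = int(np.argmax(R > threshold)) + 1   # smallest K with R_K > threshold
    return R[K - 1], K_thr
```

To mimic the averaging over segments described above, the function would be applied to each 10-s segment of concatenated TQ intervals and the resulting values averaged.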

It should be emphasized that the approximation in (6.88) is identical to the one earlier encountered in (6.54). This is realized by forming a data matrix with the reconstructed signals \(\hat{\mathbf {s}}_{p,K}\), i.e., \(\tilde{\mathbf X}=\begin{bmatrix} \hat{\mathbf {s}}_{1,K}&\cdots&\hat{\mathbf {s}}_{P,K} \end{bmatrix}\), so that (6.54) can be expressed as \(\tilde{\mathbf X}= \pmb {\varPhi }_K \pmb {\varPhi }_K^T \mathbf X\). Since \(\pmb \varPhi = \mathbf U\) and \( \mathbf X = \mathbf U \pmb \Sigma \mathbf V^T \), cf. page 184, then

$$\begin{aligned} \tilde{\mathbf X}&= \mathbf U_K \mathbf U_K^T \mathbf X = \mathbf U_K \pmb \Sigma _K \mathbf V_K^T = \sum _{k=1}^K \sigma _k \mathbf u_k \mathbf v_k^T. \end{aligned}$$
(6.89)
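The chain of equalities in (6.89) can be checked numerically. The following sketch uses a random matrix in place of ECG data; the dimensions and the value of K are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
L, N, K = 8, 200, 3
X = rng.standard_normal((L, N))                  # stand-in for the data matrix

U, s, Vt = np.linalg.svd(X, full_matrices=False)

X_proj = U[:, :K] @ U[:, :K].T @ X               # U_K U_K^T X
X_svd = (U[:, :K] * s[:K]) @ Vt[:K, :]           # U_K Sigma_K V_K^T
X_sum = sum(s[k] * np.outer(U[:, k], Vt[k]) for k in range(K))  # rank-one sum

print(np.allclose(X_proj, X_svd), np.allclose(X_svd, X_sum))    # True True
```

The first equality holds because the columns of \(\mathbf U\) are orthonormal, so that \(\mathbf U_K^T \mathbf U\) selects the first K rows of \(\pmb \Sigma \mathbf V^T\).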

For overall characterization of spatial complexity, the number of leads is not as critical as it is for the maps which offer an electrophysiological interpretation. Using PCA, this aspect was investigated by computing a complexity measure closely related to \(R_K\) for a 64-lead map, as well as for 32- and 10-lead maps, where the latter two maps were subsets of the 64-lead map. In particular, the 10-lead map was chosen such that it closely approximated the standard 12-lead ECG [139]. The results demonstrated that similar information can be derived from all three maps, suggesting that the standard 12-lead ECG is actually useful for determining spatial complexity.

6.7 f Wave Characterization in Clinical Applications

This section provides a brief overview of popular clinical applications, where f wave characteristics are explored with the goal of monitoring, detecting, or predicting changes in the atrial activity, either due to procedural intervention or spontaneous in origin. These applications, having emerged during the last decade, call for advances in methodological development as well as for further clinical studies to better establish the significance of f wave characteristics.

Whether monitoring, detection, or prediction is of interest, a single-parameter approach is usually pursued first, involving measurements from the lead with the most prominent f waves. The natural extension of this approach is to consider multi-lead measurements of a single parameter. In decision-oriented applications, for example, the prediction of catheter ablation outcome, a multi-parameter approach is likely to achieve better performance than a single-parameter approach. However, the more parameters involved in the decision-making, the larger the data set needs to be to adequately characterize performance.

6.7.1 Monitoring of Drug Response

The use of antiarrhythmic drugs is one of several approaches to long-term AF management which aims at restoring and maintaining sinus rhythm, an approach known as “rhythm-control therapy,” cf. Sect. 1.8.3. Since antiarrhythmic drugs are only moderately effective and may have serious side effects including life-threatening ventricular arrhythmias, it is important to develop ECG-based tests for quantifying the feasibility and dosage of a selected drug by monitoring various f wave characteristics. Such tests may also prove useful for drug development as they avoid the complexity of invasive electrophysiological testing, and offer a valuable complement to pharmacokinetic studies.

The dominant atrial frequency has been extensively studied for a great number of antiarrhythmic drugs designed to increase refractoriness and/or delay conduction of the atrial myocardium [154]. Most studies report a substantial decrease in the DAF in patients responding to the drug [155,156,157,158,159,160]. This is a desirable result since a lower DAF usually means a more favorable outcome of rhythm-control therapy as it may lead to conversion to sinus rhythm. A decrease in the DAF is illustrated in Fig. 6.17 for an antiarrhythmic drug administered on several occasions during a time span of almost three days; the largest decrease in the DAF took place during the first day.

Fig. 6.17

Response of the dominant atrial frequency (DAF) to an antiarrhythmic drug (flecainide). The drug was administered at the onset of the recording and repeated after 16, 27, 42, 52, and 66 h (indicated by dashed lines)

For a drug under development, administered to patients with persistent AF, the short-term dynamics of the DAF were studied using the spectral profile method [161]. The results showed that the “baseline” DAF, i.e., the DAF determined just before the time of the first drug administration, was not predictive of conversion to sinus rhythm. On the other hand, the decrease in the DAF was significantly more rapid in patients converting to sinus rhythm than in those not converting. A similar rapid decrease was observed in the harmonic decay and the standard deviation of the DAF, computed in 1-min intervals, suggesting that drug treatment increases AF organization, as reflected by more pronounced harmonics, and stabilizes the DAF.

So far, entropy and other nonlinear measures have not been considered for non-invasive monitoring and evaluation of drug response.

6.7.2 Prediction of Catheter Ablation Outcome

Outcome prediction performed before catheter ablation can prevent unnecessary procedural risk in patients with a low chance of successful AF termination [162]. Conversely, outcome prediction can be useful for selecting patients who require more aggressive ablation techniques than what is offered by catheter ablation. The significance of preoperative outcome prediction applies particularly to patients with persistent AF, since catheter ablation in patients with paroxysmal AF is associated with a better success rate. The time span of prediction may differ from study to study: short-term prediction concerns successful AF termination in direct connection with catheter ablation, i.e., intraprocedural outcome [31, 163], while long-term prediction concerns maintenance of sinus rhythm a few months or longer following catheter ablation [19, 56, 164, 165]. Short-term prediction usually represents a simpler task than long-term prediction and is therefore associated with better performance, an observation which should be kept in mind when comparing the results of different studies on outcome prediction.

The significance of f wave amplitude in prediction of catheter ablation outcome has been investigated in patients with persistent AF [19, 31, 56, 163]. Clinical studies have shown that patients with lower f wave amplitude are less likely to benefit from catheter ablation [4, 5] (see Footnote 7). The lower amplitude may be related to a more disorganized (complex) form of AF, characterized by several activation wavefronts propagating in different directions which lead to wavefront collisions and a lower f wave amplitude.

Outcome prediction can be restricted to analyzing only the lead with the most prominent f waves, typically lead V\(_1\) [56], or can be extended to all available leads so that lead-dependent measurements can be produced [19, 31, 163]. In [19, 56], both addressing long-term prediction and applying traditional amplitude measures, i.e., peak-to-peak amplitude and spectral power \(|\hat{S}_x(\hat{\omega }_0)|^2\), to the preoperative ECG, no statistically significant difference was found in f wave amplitude between terminating and nonterminating AF. Thus, these two studies, using automated amplitude measurements, do not support the results of the above-mentioned clinical studies [4, 5] which showed that a lower f wave amplitude is predictive of AF recurrence.

Alternatively, amplitude measurements can be derived from a PCA-based rank-one approximation of the data matrix containing the preoperative 12-lead ECG [31], cf. (6.88) with \(K=1\). The main reason for performing PCA-based dimensionality reduction is to retain the main f wave characteristics, while at the same time making amplitude measurements less sensitive to noise due to, for example, loosely attached electrodes. The envelope-based definition of f wave amplitude, illustrated in Fig. 6.1, is applied to the rank-one approximated data matrix. Using this approach in short-term prediction, f wave amplitude was found to differ significantly between terminating and nonterminating AF.

Invasive studies have shown that a low DAF is predictive of long-term catheter ablation outcome in patients with persistent AF [166,167,168]. Similar results have been reported in noninvasive studies, where the DAF also differed significantly between terminating and nonterminating AF, either in lead V\(_1\) [164] or in leads I, aVR, and V\(_5\) [163]; however, no such difference was reported in [56]. Out of several spectral parameters, including the DAF, the position of the second harmonic, the harmonic decay, the spectral concentration, and the spectral power, it was only the harmonic decay that differed significantly between the two groups [56]. The results suggested that patients with more organized AF, reflected by more harmonics, are less likely to relapse to AF following catheter ablation.

Sample entropy could not predict AF termination, irrespective of whether DAF-controlled bandpass filtering was performed [163] or not [31]. Neither could spectral entropy predict AF termination [163].

While the results reported from single-parameter prediction may not be particularly striking, it has been noted that the performance of ECG-derived parameters to predict AF termination and long-term success of catheter ablation in patients with persistent AF is at least as good as that achieved by clinical parameters [163].

6.7.3 Prediction of Cardioversion Outcome

Electrical cardioversion is a well-established, noninvasive procedure with which AF is converted to sinus rhythm by delivering a high-energy electrical shock, usually by placing two electrodes on the chest [169], cf. page 14. The shock is synchronized with the QRS complex to avoid delivery during ventricular repolarization, i.e., the T wave, which can induce ventricular fibrillation. Electrical cardioversion is usually accompanied by administration of an antiarrhythmic drug to increase the likelihood of conversion.

Unfortunately, as many as 35% of patients with persistent AF who undergo cardioversion relapse to AF, most of them within two weeks [170]. Consequently, in the same way as prediction of catheter ablation outcome can provide better selection of patients who will maintain sinus rhythm after ablation, prediction of cardioversion outcome can provide better selection of patients. From an engineering perspective, however, there is little difference between the problems of predicting catheter ablation and cardioversion outcome.

Early studies on ECG-based predictors in patients with persistent AF suggest that a lower DAF may be used as a long-term predictor of maintenance of sinus rhythm [7, 171]. Subsequent studies demonstrated the significance of a lower DAF for maintenance, especially when prediction was performed in AF of short duration [172] or when prediction was based on the DAF computed after an unsuccessful shock [173]. However, one study found the harmonic decay, being faster in patients relapsing to AF than in patients maintaining sinus rhythm, to be a more powerful predictor than the DAF, although the DAF was also a statistically significant predictor [174]. In all these studies, the spectral parameters were determined from the extracted f wave signal.

Rather than focusing on the dominant spectral peak, some studies have proposed predictive parameters for quantifying the spectral content of certain scales of the wavelet transform. Using the original ECG, rather than an extracted f wave signal or TQ intervals, the wavelet entropy was proposed as a predictor, computed from the scales containing 20–30 Hz components [175]. Using instead the extracted f wave signal, the sample entropy [103] and the central tendency [176], i.e., a measure describing the degree of signal variability, were computed from the scale containing the DAF and used as independent predictors. The results of these three studies showed that wavelet-based parameters may be used to predict maintenance of sinus rhythm following cardioversion.

6.7.4 Prediction of Spontaneous AF Termination

The question whether it is possible to predict spontaneous termination of an AF episode was highlighted to the engineering community in the PhysioNet/Computing in Cardiology Challenge in 2004 [177, 178]. As a result, several subsequent papers addressed this question using the AF Termination Database (AFTDB) which was made available for this challenge. Prediction of spontaneous termination relates to the hypothesis that subtle changes in f wave characteristics precede AF termination. With successful prediction, the parameters employed to characterize the f wave signal may help to explain why AF is terminating in certain individuals, but not in others. Such information may not only lead to more effective therapy, but also to avoidance of ineffective therapeutic intervention and reduced patient risk.

Early experimental studies, analyzing intracardiac electrograms, showed that prolongation of the DACL is a significant determinant of spontaneously terminating AF episodes in many patients [179, 180]. This result has been shown to carry over to the analysis of the surface ECG, where spontaneous termination is also preceded by a decrease in the DAF [59, 78, 178, 181]. The time course of the decrease before termination differs from study to study, where periods of about 5 to 10 minutes have been reported. In one study, a decrease in the DAF was only observed in patients who converted to sinus rhythm during morning hours, but not in those who converted in the afternoon or evening—results suggesting that electrophysiological mechanisms of termination may be different depending on the time of day [181]. For studies using the AFTDB, the decrease occurred immediately before spontaneous termination [78, 178].

Using parameters derived from the spectral profile, spontaneous termination in AFTDB was best predicted by a low DAF, a slow harmonic decay, and a stable DAF, while f wave amplitude, defined by (6.17), sample entropy, and spectral entropy could not discriminate between terminating and nonterminating AF [78]. Using parameters derived from the STFT, the DAF and the average heart rate were the best-performing predictors [59], while f wave amplitude, defined by \(|\hat{S}_x(\hat{\omega }_0)|^2\), and spectral width did not contribute to better prediction.

Introducing DAF-controlled bandpass filtering of the extracted f wave signal, a decrease in sample entropy was observed before termination [70]. Interestingly, the prediction performance achieved using sample entropy was identical to that achieved using the DAF [78], thus emphasizing the importance of prefiltering to reduce the sensitivity of sample entropy to noise. Similar prediction performance was achieved when the sample entropy was computed from a filtered f wave signal, obtained by reconstructing the signal from the wavelet coefficients of the scale containing the DAF [103, 104]. Wavelet decomposition was later considered for prediction of spontaneous AF termination [182], but then accompanied by computation of the wavelet entropy, defined by the Shannon entropy of the relative energies of the different scales, cf. page 207. However, nonlinear parameters other than entropy have been found to offer better prediction performance on AFTDB; for details, see [92].

6.7.5 Detection and Characterization of Circadian Variation

It is well-known that heart rate and blood pressure increase during daytime and decrease during night-time in healthy subjects. However, many other bodily functions also exhibit circadian variation. Information on circadian rhythms can help to establish proper timing of drug administration so that the effect of a drug can be maximized (chronotherapy) [183, 184]. The attenuation or absence of circadian variation may be indicative of certain risk conditions.

Circadian variation is driven by various external factors, e.g., sleep–wake routine, meal consumption, and emotional state, as well as by the intrinsic activity of the autonomic nervous system. The latter type of activity is well-studied in the literature, with results demonstrating that sympathetic tone dominates during daytime activity, while vagal tone dominates during night-time sleep.

Detection and characterization of circadian variation usually involve a sinusoidal model which is fitted to the observed data using LS techniques [184, 185]. In this approach, the offset, commonly referred to as the “midline estimating statistic of rhythm” (MESOR), the amplitude, and the phase of the sinusoid, whose period is 24 h, constitute the model parameters. Detection can be based on a comparison of the MSE associated with two different models, namely (1) the MSE between the observed data and the non-circadian model defined by the MESOR only, and (2) the MSE between the observed data and the sinusoidal model. The more relevant of these two models is determined using a statistical test, for example, a paired bootstrap hypothesis test [77].
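The LS fit of the sinusoidal model and the two MSEs may be sketched as follows. The model is written as \(M + A\cos (2\pi t/24 + \theta )\), which becomes linear in its parameters after expanding the cosine; this parametrization, the symbols \(M\), \(A\), and \(\theta \), and the function name `cosinor_fit` are assumptions made for illustration. In the MESOR-only model, the LS estimate of the offset is taken to be the sample mean. The statistical test used for deciding between the two models is not included.

```python
import numpy as np

def cosinor_fit(t_hours, y, period=24.0):
    """LS fit of y(t) = M + A*cos(2*pi*t/period + theta).

    Returns (M, A, theta, mse_constant, mse_sinusoid).
    """
    t = np.asarray(t_hours, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 2.0 * np.pi / period
    # Linear-in-parameters form: y = b0 + b1*cos(w t) + b2*sin(w t)
    H = np.column_stack([np.ones_like(t), np.cos(w * t), np.sin(w * t)])
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    mesor = beta[0]
    amplitude = np.hypot(beta[1], beta[2])
    theta = np.arctan2(-beta[2], beta[1])       # b1 = A cos(theta), b2 = -A sin(theta)
    mse_sinusoid = np.mean((y - H @ beta) ** 2)
    mse_constant = np.mean((y - y.mean()) ** 2)  # non-circadian (MESOR-only) model
    return mesor, amplitude, theta, mse_constant, mse_sinusoid
```

A large reduction from `mse_constant` to `mse_sinusoid` suggests circadian variation, but the decision itself should rest on a hypothesis test such as the one referred to above.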

With respect to f wave characteristics and circadianity, the DAF was the first parameter to be investigated, determined every sixth hour from 24-h ambulatory recordings in patients with persistent AF [186]. A significant decrease in the DAF was observed at night, and an increase during the morning hours, reaching its maximum during the afternoon hours. To a large extent, these results were reproduced in subsequent studies on patients with persistent or permanent AF, although circadian variation was not detected in all patients [187, 188]. It has been pointed out that the short-term variation often observed in the DAF, uncovered by time–frequency analysis, may exceed the circadian variation, with implications for the accuracy of detecting circadian variation [188].

These studies share the limitation of a short recording duration, ranging from 15 to 24 h [186,187,188]. Hence, at most one sinusoidal period was available for parameter estimation, implying a large variance of the resulting estimates. To address this limitation, the DAF was studied on 7-day recordings in patients with persistent AF [77]. The results showed that the circadian variation detected in a 7-day recording was not always detected in all seven 24-h periods of the same recording, thus casting doubt on the validity of the conclusions made in [186,187,188]. In addition to the DAF, the eigenvalue-based parameter \(R_3\), defined in (6.52), and the sample entropy were also studied. These parameters exhibited circadian variation although not always in the same patient.