3.1 Public ECG Databases

The availability of public databases is essential as it enables researchers to establish whether a novel method performs better than the existing ones. Many of the public ECG databases relevant to engineering-oriented research on atrial fibrillation (AF) are available for download at PhysioNet (www.physionet.org), a free web resource with a huge collection of physiological signals and software [1]. The PhysioNet databases have played, and continue to play, a crucial role in the development of AF detectors and the evaluation of their performance (Chap. 4), whereas they hardly play any role in the development of methods for f wave extraction (Chap. 5) and f wave characterization (Chap. 6).

The PhysioNet databases include beat-based annotations such as occurrence time and type of beat, but often also arrhythmia-based annotations such as type and onset/end of arrhythmia. Annotations on beat occurrence time may be automated and provided by a well-performing QRS detector, whereas arrhythmia-based annotations are usually provided by one or several experts, implying a considerable work effort to annotate a database consisting of long-term continuous ECG recordings. Unfortunately, information on the annotation process is usually scarce, and details are almost invariably missing on the number of annotators involved, the level of expertise among the annotators, and how consensus was reached in cases of disagreement. Considering that some ECG databases have evolved into virtually becoming standards, information on the annotation process should, preferably, be transparent to the user.

In the following, the most popular public databases employed in engineering-oriented research are briefly described.

The MIT–BIH Atrial Fibrillation Database (AFDB) consists of 25 10-h, two-lead ambulatory ECG recordings of patients with AF, mostly paroxysmal [2]. The signals were acquired using an analog device with a bandwidth of approximately 0.1–40 Hz, sampled at a rate of 250 Hz, and quantized with 12-bit resolution over a range of ±10 mV. Two of the 25 recordings contain only the RR interval series, but no ECG signal, and can therefore only be used in RR-based analysis. Information on lead placement is missing.

The database was manually annotated with respect to type of beat, type and onset/end of arrhythmia, resulting in a total of 297 AF episodes with durations ranging from as few as 3 beats to tens of thousands of beats.

The distributions of AF episode duration and RR intervals provide interesting information on the properties of AFDB. Figure 3.1a presents the histogram of episode duration, with an exponential-like decay, except that 29 episodes have durations exceeding 2000 beats. Together, these 29 episodes account for as much as 82% of the total time the patients are in AF; when computed in individual patients, this percentage is commonly referred to as “AF burden.” The fact that a small number of episodes can dominate the total time a patient is in AF highlights an important limitation of the commonly used detection performance measures, to be further discussed in Sect. 4.5.
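The dominance of a few long episodes can be illustrated with a minimal sketch. The episode list below is made up for illustration and is not taken from AFDB; the function names and the (onset, end) format in seconds are assumptions as well.

```python
def af_burden(episodes, record_duration):
    """Fraction of the record spent in AF.

    episodes: list of (onset, end) times in seconds (hypothetical format).
    record_duration: total record length in seconds.
    """
    time_in_af = sum(end - onset for onset, end in episodes)
    return time_in_af / record_duration


def share_of_longest(episodes, n_longest):
    """Fraction of the total AF time contributed by the n longest episodes."""
    durations = sorted((end - onset for onset, end in episodes), reverse=True)
    return sum(durations[:n_longest]) / sum(durations)


# Made-up example: one long episode and three brief ones in a 10-h record.
episodes = [(100, 130), (2000, 2060), (5000, 23000), (30000, 30010)]
burden = af_burden(episodes, 10 * 3600)
dominance = share_of_longest(episodes, 1)
print(f"AF burden: {100 * burden:.1f}%, longest episode share: {100 * dominance:.1f}%")
```

In this example a detector that misses all three brief episodes would still account for almost all of the time in AF, which is the limitation referred to above.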

Fig. 3.1 Histograms of (a) AF episode duration and (b) RR intervals in AF, determined from the MIT–BIH AF Database

Figure 3.1b presents the histogram of all RR intervals in AFDB, with most RR intervals ranging from 0.3 to 1.5 s. As many as 25% of all RR intervals are shorter than 0.5 s, thus imposing an important constraint on methods exploring f waves in the TQ interval; this constraint applies especially to methods for f wave extraction, see Chap. 5. For an RR interval of 500 ms and a QT interval with a typical length of 350 ms, the TQ interval is only 150 ms, which for a dominant atrial frequency (DAF) of 5 Hz implies that less than one f wave is contained in the TQ interval.
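The arithmetic behind this constraint can be written out as a short sketch; the function name is hypothetical, and the QT interval is treated as a fixed value, as in the example above.

```python
def f_wave_cycles_in_tq(rr_s, qt_s, daf_hz):
    """Return the TQ interval length (s) and the number of f wave cycles it holds."""
    tq_s = rr_s - qt_s          # TQ interval = RR interval minus QT interval
    cycles = tq_s * daf_hz      # one f wave cycle lasts 1/DAF seconds
    return tq_s, cycles


# Values from the text: RR = 500 ms, QT = 350 ms, DAF = 5 Hz.
tq, cycles = f_wave_cycles_in_tq(0.500, 0.350, 5.0)
print(f"TQ = {1000 * tq:.0f} ms, f wave cycles in TQ = {cycles:.2f}")
```

With a cycle length of 200 ms at 5 Hz, the 150-ms TQ interval holds only three quarters of an f wave.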

The Long-Term AF Database (LTAFDB) consists of 84 two-lead ambulatory ECG recordings obtained in patients with paroxysmal or persistent AF, lasting from 24 to 25 h [3]. The signals were sampled at a rate of 128 Hz and quantized with 12-bit resolution over a range of ±10 mV. Information on bandwidth and lead placement is missing.

The beat-based annotations were automated, whereas the arrhythmia-based annotations resulted from manual review of the output of a commercial system for ECG analysis. More than 7000 AF episodes are contained in LTAFDB, and therefore it is the public database with the largest number of episodes.

The temporal occurrence pattern of AF episodes is presented in Fig. 3.2 for four different patients; the onset and end of an episode are given by manual annotations. These four examples illustrate that the temporal occurrence pattern can differ dramatically between patients.

Fig. 3.2 Examples of temporal occurrence patterns of episodes in paroxysmal AF, obtained from four patients monitored over a 24-h period, being part of the Long-Term AF Database. (a) A few long episodes which together extend virtually the entire monitoring period, (b) numerous, often short episodes which together extend virtually the entire monitoring period, (c) many short episodes aggregated in a 5-h period, and (d) a short episode followed by a much longer 3-h episode

The AF Termination Database (AFTDB) is a subset of LTAFDB composed of 80 1-min excerpts from patients with spontaneously terminating or persistent AF [4]. The database was compiled for the purpose of predicting spontaneous termination of AF. The 80 records are divided into a training set with 30 records and two test sets with 30 and 20 records, respectively.

The Short Single-Lead AF Database (SSAFDB) consists of 12,186 single-lead ECG recordings obtained from a smartphone-based device, lasting from 9 to 60 s [5]. The signals were sampled at a rate of 300 Hz, quantized with 16-bit resolution over a range of ±5 mV, with a bandwidth from 0.5 to 40 Hz. Although the lead is not specified, the vast majority is lead I since it is the simplest to record with the device.

The database is divided into a training set with 8,528 recordings and a test set with 3,658 recordings. Each recording is manually annotated using the following four categories: 1. Normal sinus rhythm, 2. AF, 3. other rhythm, and 4. too noisy to classify, with 5076, 758, 2415, and 279 recordings in each of the categories of the training set. A category applies to the entire ECG recording, even if an arrhythmia is only partially present. No beat-based annotations are provided.

Since the smartphone-based device is used for home-based screening, and thus operated by the patient, the quality of the recording is generally much lower than, for example, in long-term continuous recordings. In addition, f wave amplitude is generally lower in lead I than in lead V\(_1\), which is the preferred lead for f wave analysis. Signal quality can be quantified using an index which determines the suitability of analyzing f waves in 5-s signal segments [6], see also Sect. 6.5 for a brief description. The signal quality index is normalized to the interval [0, 1], where 1 represents the highest quality; a suitable cut-off value for acceptable signal quality is 0.25. Figure 3.3a presents the histogram of the signal quality index, computed in nonoverlapping, 5-s segments of all recordings of SSAFDB annotated as AF. Using 0.25 as the cut-off value, 83% of all recordings in SSAFDB have a signal quality which is too low for f wave analysis.

Fig. 3.3 Signal quality assessed on all AF recordings in (a) the Short Single-Lead AF Database and (b) the Lund AF Database (lead V\(_1\)), using an index (S) which determines the suitability of analyzing f waves [6]. The results are presented as relative histograms

The original purpose of compiling SSAFDB was to evaluate the performance of classifiers designed to handle short ECG segments, whereas long-term ambulatory ECG databases such as AFDB and LTAFDB have primarily been used to evaluate performance in terms of how accurately AF episodes can be detected. Thus, different types of algorithms are evaluated on SSAFDB and AFDB/LTAFDB.

The MIT–BIH Arrhythmia Database (MITDB) contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects [7]. The signals were sampled at a rate of 360 Hz and quantized with 11-bit resolution over a range of ±10 mV. Information on bandwidth and lead placement is missing.

Since only eight recordings contain AF, with a total of 105 episodes, the main value of this database is to investigate detection performance in the presence of non-AF arrhythmias such as atrial flutter, bigeminy, and trigeminy.

The MIT–BIH Normal Sinus Rhythm Database (NSRDB) includes 18 long-term ECG recordings of subjects without significant arrhythmias. Hence, only the specificity of an AF detector can be investigated with this database, for example, in the presence of respiratory sinus arrhythmia.

3.2 Non-public ECG Databases

Although public databases have eliminated much of the time-consuming work involved with data collection, the need to collect databases which are well-matched to a particular research problem nevertheless remains. This will ensure that methods development and performance evaluation are carried out on relevant data. For example, the development of methods for f wave characterization calls for databases obtained with ECG leads which are more relevant than those of the above-mentioned public databases. In fact, the collection of matched databases promotes diversity in research in a way which public databases historically have not done. Although most matched databases are non-public at the outset, either proprietary or available at a cost, it can be hoped that they sooner or later become public to benefit a larger group of researchers.

Considering that many public databases were collected using old recording technology, where MITDB is one of the oldest, dating to 1980, another important motivation for collecting databases is to benefit from modern recording technology, offering higher sampling rate, larger bandwidth, lower noise level, more leads, and longer acquisition period.

The Lund AF Database exemplifies the numerous non-public databases collected over the years, with the purpose of developing and evaluating methods for f wave characterization [8]. The database contains 211 12-lead extended ECG recordings obtained at rest from patients with AF, mostly persistent (in some studies, a 1-min segment was extracted from each patient in this database to ensure AF presence throughout the segment). The signals were sampled at a rate of 1000 Hz, quantized with 16-bit resolution over a range of ±10 mV, with a bandwidth from 0.1 to 300 Hz. No annotations are provided.

Figure 3.4a presents the RR interval histogram of the Lund AF Database, resembling the RR interval histogram of AFDB shown in Fig. 3.1b. Since the histogram in Fig. 3.4a is obtained from signals recorded at rest, it would likely have been shifted leftwards towards shorter intervals had the database been recorded during physical activity, with implications on the length of the TQ interval and related analysis.

The histogram of f wave amplitude in lead V\(_1\) is presented in Fig. 3.4b. Here, amplitude is defined as the root mean square (RMS) value of the samples contained in the TQ interval, beginning 350 ms after a QRS complex and ending 50 ms before the subsequent QRS complex; no amplitude measurement was made in TQ intervals shorter than 250 ms. Section 6.2 provides an overview of different approaches to measuring f wave amplitude.
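A minimal sketch of this amplitude measurement is given below, using a window that starts 350 ms after one QRS complex and ends 50 ms before the next one. The function name and input format are hypothetical, and it is an assumption of this sketch that the 250-ms minimum length applies to the analysis window itself.

```python
import math


def tq_rms_amplitudes(signal, qrs_samples, fs,
                      start_s=0.350, end_s=0.050, min_len_s=0.250):
    """RMS amplitude in each TQ window of a single-lead signal.

    signal: list of samples; qrs_samples: QRS occurrence times as sample
    indices; fs: sampling rate in Hz. Windows shorter than min_len_s are
    skipped (assumption: the limit refers to the analysis window).
    """
    amplitudes = []
    for r0, r1 in zip(qrs_samples[:-1], qrs_samples[1:]):
        begin = r0 + round(start_s * fs)
        end = r1 - round(end_s * fs)
        if end - begin < round(min_len_s * fs):
            continue  # TQ window too short for a reliable measurement
        window = signal[begin:end]
        amplitudes.append(math.sqrt(sum(v * v for v in window) / len(window)))
    return amplitudes


# Toy input: constant 0.1-mV signal at 1000 Hz with RR intervals of 1.0 s and 0.5 s.
toy_signal = [0.1] * 2000
toy_rms = tq_rms_amplitudes(toy_signal, [0, 1000, 1500], fs=1000)
```

In the toy input, the 0.5-s RR interval leaves a window of only 100 ms and is therefore excluded.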

Fig. 3.4 Histograms of (a) RR intervals and (b) f wave amplitude in lead V\(_1\), determined from 1-min segments of the Lund AF Database

Figure 3.3b shows that the signal quality in lead V\(_1\) of the Lund AF Database is superior to that of SSAFDB. This result is, of course, expected since the former database was recorded during rest, under the supervision of a technician who made sure that the electrodes were properly attached. Using a cut-off value of 0.25, 11% of all recordings have signal quality which is too poor for f wave analysis, to be contrasted with the above-mentioned 83% of SSAFDB.

3.3 Simulation of Atrial Fibrillation

Although databases with ECG signals are central to methodological development and evaluation, model-based simulation offers certain advantages such as the possibility to investigate conditions which are difficult to deal with experimentally and the possibility to control the properties of the simulated signal by a set of parameters. As a result, the agreement between simulated and estimated signals can be quantitatively assessed and expressed in terms of suitable performance measures. If desired, these measures can be computed for simulated signals with different signal-to-noise ratios (SNRs). The simulation advantages were first exploited in the context of f wave extraction, since none of the public ECG databases lend themselves well to performance evaluation, and later in the context of detection of brief AF episodes, since annotated ECG databases with such episodes are largely missing.

Three f wave simulation models with widely different complexity are briefly described below. Since none of these models produce a signal with ventricular activity, the simulated f wave signal is usually added to ECG signals obtained from subjects in normal sinus rhythm, provided that the P waves have been first cancelled. In doing so, the inherent variation in QRS morphology, e.g., due to respiration, is transferred from the recorded to the simulated ECG signal—a transfer which is important in f wave extraction since morphologic variation can have substantial influence on performance. The RR intervals of normal sinus rhythm are also transferred to the simulated ECG signal—a transfer which may be acceptable when the simulated ECG signal is investigated for f wave extraction, but clearly unacceptable for AF detection.

The f wave sawtooth model is widely used in algorithmic development, first introduced in [9] and later employed in, e.g., [10,11,12,13,14]. This signal model is defined by a sum of K amplitude- and frequency-modulated sinusoids with harmonically related frequencies,

$$\begin{aligned} d(n)=\sum _{k=1}^{K}a_{k}(n) \sin \left( k \omega _0 n + \frac{\varDelta f}{f_{f}} \sin (\omega _f n) \right) , \quad n=0,\ldots ,N-1, \end{aligned}$$
(3.1)

where \(\omega _0=2\pi f_{0}\) is the fundamental frequency, i.e., the model counterpart to the DAF. The fundamental frequency \(\omega _0\) is modulated by \(\omega _f=2\pi f_{f}\) with a maximum deviation of \(\varDelta f\). The time-varying amplitude \(a_k(n)\) is defined so that d(n) exhibits a sawtooth characteristic,

$$\begin{aligned} a_k(n) = \frac{2}{k \pi }\left( a + \varDelta a \sin (\omega _a n) \right) , \end{aligned}$$
(3.2)

where a is the sawtooth amplitude, \(\varDelta a\) is the maximum modulation amplitude, and \(\omega _a=2\pi f_{a}\) is the modulation frequency of the amplitude. The model in (3.1) offers certain flexibility since both f wave amplitude and frequency are modulated.
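The sawtooth model in (3.1) and (3.2) can be sketched in a few lines. The parameter values below are illustrative only and are not taken from [9]; it is also an assumption of this sketch that all frequencies are given in Hz and normalized by a sampling rate fs, so that n is a sample index.

```python
import math


def sawtooth_f_waves(n_samples, fs, f0, K, delta_f, f_f, a, delta_a, f_a):
    """Simulate d(n) in (3.1) with the time-varying amplitude a_k(n) in (3.2)."""
    w0 = 2 * math.pi * f0 / fs    # fundamental (DAF) frequency, rad/sample
    wf = 2 * math.pi * f_f / fs   # frequency of the frequency modulation
    wa = 2 * math.pi * f_a / fs   # frequency of the amplitude modulation
    d = []
    for n in range(n_samples):
        amp = a + delta_a * math.sin(wa * n)            # common factor of a_k(n)
        phase_mod = (delta_f / f_f) * math.sin(wf * n)  # frequency modulation term
        d.append(sum((2 / (k * math.pi)) * amp * math.sin(k * w0 * n + phase_mod)
                     for k in range(1, K + 1)))
    return d


# Illustrative values: 2 s at 1000 Hz, 6-Hz DAF, five harmonics, amplitudes in mV.
d = sawtooth_f_waves(n_samples=2000, fs=1000, f0=6.0, K=5,
                     delta_f=0.2, f_f=0.1, a=0.05, delta_a=0.01, f_a=0.08)
```

Since the harmonic amplitudes decay as 1/k, increasing K sharpens the sawtooth shape at the cost of more high-frequency content.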

An important limitation of the sawtooth model was brought to light when the problem of f wave extraction was addressed using an artificial neural network [15]: the network could learn the predictable changes in amplitude and frequency of the simulated f wave signal, leading to exaggerated performance figures.

The f wave replication model produces a signal based on the observed samples of the TQ intervals [16]; no mathematical modeling is involved. Interpolation between two successive TQ intervals fills in the intermediate QT interval with f wave samples, using the approach originally described in [9]. The f waves of the first TQ interval are replicated in the QT interval and subjected to linear weighting, and the f waves in the second, subsequent TQ interval are replicated in the same way, but time-reversed. The interpolated samples of the intervening QT interval result from summation of the two replicated and weighted signals. Other techniques for TQ-based interpolation are described in Sect. 5.3.

While the f wave replication model can produce realistic signals, neither the repetition rate nor the amplitude of f waves can be controlled. Another major limitation is that the length of the TQ intervals decreases as the heart rate increases, implying that the risk of producing unrealistic f wave signals becomes increasingly higher at higher heart rates.

A much more sophisticated approach to simulating f wave signals is based on a biophysical model of the atria [17], see also [18, 19]. The model is based on anatomical information derived from magnetic resonance imaging, accounting for the entries and exits of the vessels, the locations of the valves connecting the atria to the ventricles, as well as several other aspects. The electrical activity of the atria is modeled in terms of membrane kinetics, where the presence of heterogeneities in action potential duration creates the substrate for sustained AF. Volume conduction theory is employed to describe the propagation of currents from the electrical sources of the atria through the passive body tissues to the body surface, influencing the amplitude and morphology of the simulated multi-lead f wave signals.

Since none of the three above-mentioned simulation models account for switching between non-AF rhythms and AF, they cannot be used when addressing the problem of detecting AF. To fill this void, a model of paroxysmal AF has been proposed [20], including not only rhythm switching but also the possibility to choose whether the simulated signal should be composed of synthetic or real components, described in Sects. 3.4 and 3.5, respectively.

Fig. 3.5 Simulation of ECG signals using synthetic components. The same model of QRST complexes is employed in sinus rhythm (SR) and AF

3.4 Simulation of Paroxysmal AF Using Synthetic Components

The simulation of multi-lead ECGs in paroxysmal AF is based on phenomenological, mathematical modeling of ventricular rhythm, ventricular morphology, atrial morphology, and rhythm switching, whereas the noise added to the simulated signal derives from a public database, see Fig. 3.5. Thus, the resulting signal is composed of synthetic components whose properties are controlled by a set of parameters defining, e.g., episode duration, variability of the RR interval series in sinus rhythm and AF, f and P wave morphology, QRST complex morphology, and percentage of atrial premature beats (APBs). For each new realization of the simulated signal, the model parameters are generated randomly from uniform distributions in predefined ranges so that realistic ECG signals with unique intersubject morphologies can be produced.

The simulation model assumes a vectorcardiogram (VCG) lead system initially, consisting of the orthogonal leads X, Y, and Z. Once suitably processed, these leads are transformed to the standard 12-lead ECG system. A detailed description of the simulation model is found in [20], together with a list of the default model parameter values.

3.4.1 Atrial Fibrillation

Ventricular rhythm. A statistical model of the atrioventricular (AV) node with dual pathways is used to generate RR intervals in AF [21]. In this model, the ventricles are assumed to be activated by atrial impulses arriving at the AV node according to a Poisson process with mean arrival rate \(\lambda _a\), which is closely related to the DAF. The joint probability density function (PDF) of the consecutive RR intervals \(x_{0}, x_{1},...,x_{N-1}\) is given by

$$\begin{aligned} p_x(x_{0}, x_{1},...,x_{N-1}) = \prod _{n=0}^{N-1}(\epsilon p_{x,s}(x_{n})+(1-\epsilon )p_{x,f}(x_{n})), \end{aligned}$$
(3.3)

where \(\epsilon \) is the probability that an atrial impulse is conducted through the slow pathway, whose refractory period is defined by a deterministic part \(\tau _{s}\) and a stochastic part \(\tau _{s,p}\). Hence, the probability that an atrial impulse takes the fast pathway, whose refractory period is defined by \(\tau _{f}\) and \(\tau _{f,p}\), is \(1-\epsilon \). For an atrial impulse taking the slow pathway, the interval x between two successive ventricular activations, i.e., the RR interval, is described by the following PDF [21]:

$$\begin{aligned} p_{x,s}(x)=\left\{ \begin{array}{ll} {\displaystyle 0}, &{} \ 0<x<\tau _s,\\ {\displaystyle \frac{\lambda _a(x -\tau _s)}{\tau _{s,p}} \exp \left[ -\frac{\lambda _a(x-\tau _s)^2}{2 \tau _{s,p}} \right] }, &{} \ \tau _s\le x<\tau _s+\tau _{s,p}, \\ {\displaystyle \lambda _a \exp \left[ -\frac{\lambda _a \tau _{s,p}}{2} -\lambda _a (x-\tau _s-\tau _{s,p})\right] }, &{} \ x\ge \tau _s+\tau _{s,p}. \end{array} \right. \end{aligned}$$
(3.4)

The PDF of the fast pathway is described by \(p_{x,f}(x)\), being identical to (3.4) except that \(\tau _{s}\) is replaced with \(\tau _{f}\) and \(\tau _{s,p}\) with \(\tau _{f,p}\). Chapter 7 provides a comprehensive overview of AV node models for simulation of RR intervals in AF, including the statistical AV node model in [21].
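Since the PDF in (3.4) has a closed-form distribution function, RR intervals can be drawn by inverse transform sampling, with the pathway selected according to the mixture in (3.3). The following is a minimal sketch of that idea and not the implementation used in [20, 21]; the function names and all parameter values are illustrative assumptions.

```python
import math
import random


def pathway_pdf(x, lam, tau, tau_p):
    """PDF in (3.4) for one pathway (tau, tau_p in s; lam in Hz)."""
    if x < tau:
        return 0.0
    if x < tau + tau_p:
        return lam * (x - tau) / tau_p * math.exp(-lam * (x - tau) ** 2 / (2 * tau_p))
    return lam * math.exp(-lam * tau_p / 2 - lam * (x - tau - tau_p))


def sample_pathway(rng, lam, tau, tau_p):
    """Draw one RR interval by inverting the distribution function of (3.4)."""
    s = 1.0 - rng.random()              # survival probability in (0, 1]
    knee = math.exp(-lam * tau_p / 2)   # survival at x = tau + tau_p
    if s > knee:
        return tau + math.sqrt(-2 * tau_p * math.log(s) / lam)
    return tau + tau_p + (-math.log(s) - lam * tau_p / 2) / lam


def sample_rr_series(n, lam, eps, tau_s, tau_sp, tau_f, tau_fp, seed=0):
    """Draw n independent RR intervals from the two-pathway mixture in (3.3)."""
    rng = random.Random(seed)
    rr = []
    for _ in range(n):
        if rng.random() < eps:          # slow pathway with probability eps
            rr.append(sample_pathway(rng, lam, tau_s, tau_sp))
        else:                           # fast pathway with probability 1 - eps
            rr.append(sample_pathway(rng, lam, tau_f, tau_fp))
    return rr


# Illustrative parameter values only (not taken from [21]).
rr = sample_rr_series(1000, lam=6.0, eps=0.3,
                      tau_s=0.25, tau_sp=0.20, tau_f=0.40, tau_fp=0.15)
```

Because (3.3) is a product of identical factors, the simulated RR intervals are statistically independent, so drawing them one at a time is sufficient.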

f waves. The f wave sawtooth model in (3.1) is supplemented with a stochastic component so that more complex, less predictable f waves can be produced [15]. Using, for convenience, a continuous-time framework, the f wave model signal \(f_l(t)\) of the l-th vectorcardiographic lead is composed of two components,

$$\begin{aligned} f_{l}(t) = d_{l}(t) + s_{l}(t),\quad l \in \{\text {X,Y,Z}\}, \end{aligned}$$
(3.5)

where \(d_l(t)\) is defined similarly to (3.1),

$$\begin{aligned} d_{l}(t) = \sum _{k=1}^{K} a_{l,k}(t) \sin \left( k \varOmega _{l,0} t + \frac{\varDelta F}{F_k} \sin (2\pi F_k t) \right) , \end{aligned}$$
(3.6)

but with the difference that lead dependence is introduced, i.e., \(\varOmega _{l,0} = 2\pi F_{l,0}\) and

$$\begin{aligned} a_{l,k}(t) = \frac{2}{k \pi }\left( a_l + \varDelta a_l \sin (\varOmega _{a,l} t) \right) , \quad k=1,\ldots ,K. \end{aligned}$$
(3.7)

In paroxysmal AF, the DAF (corresponding to \(F_{l,0}\)) is typically contained in the interval 3–7 Hz [3], while, in persistent and permanent AF, it is typically higher and contained in the interval 5–12 Hz. Moreover, it is well-known that the DAF depends on anatomical location [22], which in the model is accounted for by setting \(F_{\text {X},0}\) to a value 5% larger than \(F_{\text {Y},0}\), and \(F_{\text {Z},0}\) to a value 5% smaller than \(F_{\text {Y},0}\). The mean arrival rate \(\lambda _a\) of atrial impulses arriving to the AV node is taken as the average of the frequencies \(F_{\text {X},0},F_{\text {Y},0}\), and \(F_{\text {Z},0}\).

The stochastic f wave component \(s_{l}(t)\) results from multi-bandpass filtering of white noise, with two passbands symmetrically related to \(F_{l,0}\) by \([0.65F_{l,0}, 0.95F_{l,0}]\) and \([1.05F_{l,0},1.35F_{l,0}]\). The variance of the input white noise \(\sigma _{l,s}^2\) is taken as a fraction of the sawtooth amplitude \(a_{l}\) in (3.7).

The first minutes after AF onset and the last minute before AF termination are associated with more organized f waves and a lower DAF [23,24,25], which in the model is accounted for by using bandpass filters with narrower passbands for the first three minutes and the last minute of the episode. A set of bandpass filters is used with gradually wider passbands, starting at \([0.8F_{l,0}, 0.95F_{l,0}]\) and \([1.05F_{l,0},1.2F_{l,0}]\) and ending at \([0.65F_{l,0}, 0.95F_{l,0}]\) and \([1.05F_{l,0},1.35F_{l,0}]\), respectively. To account for the lower DAF, \(F_{l,0}\) is multiplied with a factor which increases linearly from 0.8 to 1 during the first three minutes of an AF episode. Conversely, \(F_{l,0}\) is multiplied with a factor which decreases linearly from 1 to 0.8 during the last minute of an AF episode. Figure 3.6 illustrates simulated f waves at the onset, the midpoint, and the end of an AF episode.
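The time-varying scaling of \(F_{l,0}\) within an episode can be sketched as a piecewise-linear function. The function name is hypothetical, and taking the smaller of the two factors when the ramps overlap in episodes shorter than four minutes is an assumption of this sketch, not a rule stated in [20].

```python
def daf_scale(t, episode_duration, onset_s=180.0, end_s=60.0, low=0.8):
    """Scaling factor applied to the DAF at time t (s) after AF onset."""
    factor = 1.0
    if t < onset_s:
        # Linear increase from `low` to 1 during the first three minutes.
        factor = min(factor, low + (1.0 - low) * t / onset_s)
    time_left = episode_duration - t
    if time_left < end_s:
        # Linear decrease from 1 to `low` during the last minute.
        factor = min(factor, low + (1.0 - low) * time_left / end_s)
    return factor


# Factor at onset, mid-ramp, plateau, and termination of a 10-min episode.
scales = [daf_scale(t, 600.0) for t in (0.0, 90.0, 300.0, 570.0, 600.0)]
```

For a DAF of 6 Hz, for example, the simulated f waves would start at 4.8 Hz, reach 6 Hz after three minutes, and return to 4.8 Hz at termination.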

A further generalization of the sawtooth model, to make the f wave signal even less regular, is to employ an adaptive non-harmonic model in which amplitude and frequency modulation is described by a random walk whose steps are sampled from a zero-mean Gaussian distribution [26].

Fig. 3.6 Simulated f waves at the onset, the midpoint, and the end of an AF episode, produced by the sawtooth-based model in (3.5)

QRST complexes. The three-dimensional, single-dipole ECG model proposed in [27] is used for simulating QRST complexes, building on the dynamical model based on three coupled, ordinary differential equations [28]. The three orthogonal leads are obtained by projecting the dipole vector onto the recorded leads. The dipole vector, defined by \(q_{\text {X}}(t), q_{\text {Y}}(t)\), and \(q_{\text {Z}}(t)\), is modeled as a summation of P different Gaussian functions,

$$\begin{aligned} q_l(t) = \sum _{p=1}^{P} \alpha _{l,p} \exp \left[ - \frac{(t - \mu _{l,p})^2}{2\sigma _{l,p}^2}\right] , \quad l \in \{\text {X,Y,Z}\}, \end{aligned}$$
(3.8)

where each Gaussian is appropriately scaled in amplitude and time with \(\alpha _{l,p}\) and \(\sigma _{l,p}\), respectively, and shifted in time with \(\mu _{l,p}\). To allow for a wide variety of QRST morphologies, \(\alpha _{l,p}, \sigma _{l,p}\), and \(\mu _{l,p}\) are assigned uniform distributions [20]. In contrast to the models in [27, 28], where the aim was to simulate a signal with recurrent heartbeats, the aim of the paroxysmal AF simulation model is to produce a single QRST complex, and, therefore, the VCG loop defined by the orthogonal leads \(q_{\text {X}}(t), q_{\text {Y}}(t)\), and \(q_{\text {Z}}(t)\) is traversed only once. Amplitude variation is introduced by letting \(\alpha _{l,p}\) vary according to a sinusoidal function whose frequency is randomly chosen in the interval [0.05, 0.15] Hz to mimic Mayer waves.
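The sum of Gaussians in (3.8) is straightforward to evaluate for one lead. In the sketch below, the four Gaussians loosely mimic the Q, R, S, and T waves; all amplitudes (mV), positions (s), and widths (s) are made-up values for illustration and are not the default parameters in [20].

```python
import math


def dipole_lead(t, alphas, mus, sigmas):
    """Evaluate q_l(t) in (3.8) for one lead at time t (s)."""
    return sum(a * math.exp(-(t - m) ** 2 / (2 * s ** 2))
               for a, m, s in zip(alphas, mus, sigmas))


# Hypothetical parameters for one lead: Q, R, S, and T wave Gaussians.
alphas = [-0.1, 1.0, -0.2, 0.3]
mus = [-0.03, 0.0, 0.03, 0.25]
sigmas = [0.010, 0.012, 0.010, 0.050]

# One QRST complex sampled at 1000 Hz from -0.2 s to 0.6 s around the R wave.
ts = [i / 1000 - 0.2 for i in range(800)]
qrst = [dipole_lead(t, alphas, mus, sigmas) for t in ts]
peak_time = ts[qrst.index(max(qrst))]
```

Drawing the three parameter sets from uniform distributions, one set per lead, then yields a different QRST morphology for each realization.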

The resulting three-lead QRST complex \(q_{\text {X}}(t), q_{\text {Y}}(t)\), and \(q_{\text {Z}}(t)\) is placed at the occurrence time produced by the AV node model, accompanied by resampling of the T wave to ensure that the duration fits into the current RR interval. Since the QT interval is usually shorter in AF than in sinus rhythm, it is set to a fixed value (360 ms) based on observations reported in [29, 30].

3.4.2 Sinus Rhythm

Ventricular rhythm. The RR intervals in sinus rhythm are simulated according to the technique described in [28], where parasympathetic stimulation (respiratory sinus arrhythmia) and baroreflex regulation are modeled by a bimodal power spectrum of the RR interval series, defined by two Gaussian functions

$$\begin{aligned} S_{\text {RR}}(\varOmega ) = \frac{P_{1}}{\sqrt{2 \pi \sigma ^{2}_{\text {RR},1}}} \exp \left[ -\frac{(\varOmega -\varOmega _{1})^2}{2 \sigma ^{2}_{\text {RR},1}} \right] + \frac{P_{2}}{\sqrt{2 \pi \sigma ^{2}_{\text {RR},2}}} \exp \left[ -\frac{(\varOmega -\varOmega _{2})^2}{2 \sigma ^{2}_{\text {RR},2}} \right] , \end{aligned}$$
(3.9)

where \(\varOmega _{1}\) and \(\varOmega _{2}\) \((\varOmega _{1}<\varOmega _{2})\) are the mean frequencies with related “variance” \(\sigma ^{2}_{\text {RR},1}\) and \(\sigma ^{2}_{\text {RR},2}\) and spectral power \(P_{1}\) and \(P_{2}\), respectively. The low- to high-frequency power ratio is determined by \(P_{1}/P_{2}\). The higher frequency \(\varOmega _2\) is usually related to the respiratory rate.

The resulting RR interval series is obtained by computing the inverse Fourier transform of the spectrum \(S_{\text {RR}}(\varOmega )\). The desired heart rate and heart rate variability are set by scaling the RR interval series and adding an offset value. Very low frequency oscillations are modeled by a zero-mean component added to the output of the model in [28]. This component is produced by a third-order autoregressive model, identified from a lowpass filtered (cut-off frequency 0.001 Hz) RR interval series taken from NSRDB [20].
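The first part of this procedure, going from the spectrum in (3.9) to a scaled RR interval series, can be sketched as follows. The sketch assigns random phases to the spectral amplitudes and evaluates the inverse discrete Fourier transform by direct summation. Frequencies are expressed in Hz rather than as angular frequencies, and all parameter values are illustrative assumptions; the very low frequency component is not included.

```python
import math
import random


def bimodal_spectrum(f, f1, f2, sd1, sd2, p1, p2):
    """Power spectrum in (3.9), here as a function of frequency f in Hz."""
    g1 = p1 / math.sqrt(2 * math.pi * sd1 ** 2) * math.exp(-(f - f1) ** 2 / (2 * sd1 ** 2))
    g2 = p2 / math.sqrt(2 * math.pi * sd2 ** 2) * math.exp(-(f - f2) ** 2 / (2 * sd2 ** 2))
    return g1 + g2


def rr_from_spectrum(n, fs, mean_rr, std_rr, seed=0, **spec):
    """RR series with the prescribed spectral shape, mean, and standard deviation."""
    rng = random.Random(seed)
    x = [0.0] * n
    for k in range(1, n // 2):                   # positive-frequency bins
        amp = math.sqrt(bimodal_spectrum(k * fs / n, **spec))
        phase = rng.uniform(0.0, 2 * math.pi)    # random phase per bin
        for m in range(n):
            x[m] += amp * math.cos(2 * math.pi * k * m / n + phase)
    mu = sum(x) / n
    sd = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    # Scale and offset to obtain the desired heart rate variability and heart rate.
    return [mean_rr + std_rr * (v - mu) / sd for v in x]


# Illustrative values: peaks at 0.1 Hz and 0.25 Hz, P1/P2 = 0.5.
rr_sr = rr_from_spectrum(256, fs=1.0, mean_rr=1.0, std_rr=0.05,
                         f1=0.1, f2=0.25, sd1=0.01, sd2=0.01, p1=0.5, p2=1.0)
```

A fast Fourier transform would normally replace the double loop; direct summation is used here only to keep the sketch free of external dependencies.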

P waves. A linear combination of Hermite functions is used to model P waves in the orthogonal leads,

$$\begin{aligned} p_l(t) = \sum ^{3}_{i=1} w_{l,i}\phi _{i}(t),\quad l \in \{\text {X,Y,Z}\}, \end{aligned}$$
(3.10)

where \(w_{l,i}\) are lead-dependent weights. The first three Hermite functions are defined by

$$\begin{aligned} \phi _{1}(t)&= \frac{1}{\sqrt{\sigma _{\text {P},1} \sqrt{\pi }}} \cdot \exp \left[ -\frac{t^2}{2 \sigma _{\text {P},1}^2} \right] , \end{aligned}$$
(3.11)
$$\begin{aligned} \phi _{2}(t)&= -\frac{\sqrt{2}}{\sqrt{\sigma _{\text {P},2}\sqrt{\pi }}} \frac{t}{\sigma _{\text {P},2}} \cdot \exp \left[ -\frac{t^2}{2\sigma _{\text {P},2}^2} \right] , \end{aligned}$$
(3.12)
$$\begin{aligned} \phi _{3}(t)&= \frac{1}{\sqrt{2\sigma _{\text {P},3}\sqrt{\pi }}} \left( \frac{2t^2}{\sigma _{\text {P},3}^2}-1\right) \cdot \exp \left[ -\frac{t^2}{2\sigma _{\text {P},3}^2} \right] , \end{aligned}$$
(3.13)

with mono-, bi-, and triphasic morphology, respectively. The width of \(\phi _{i}(t)\) is determined by \(\sigma _{\text {P},i}\), which is treated as a lead-independent parameter. The Hermite functions were originally proposed in [31] for modeling of QRS complex morphology, and later explored for different purposes in ECG analysis, see, e.g., [32,33,34,35].

Depending on polarity and morphology, P waves may be classified into three different types [36], of which P waves of Type 2 are the ones which are considered for simulation, characterized by positive, monophasic morphology in leads X and Y, and biphasic morphology in lead Z with a transition from negative to positive polarity. This type of P wave is predominant in patients with paroxysmal AF [36, 37]. Since P waves are monophasic in leads X and Y, larger values are assigned to \(w_{\text {X},1}\) and \(w_{\text {Y},1}\), whereas a larger value is assigned to \(w_{\text {Z},2}\) to emphasize the biphasic morphology in lead Z. To account for the fact that P wave morphology varies over time, \(w_{l,i}\) and \(\sigma _{\text {P},i}\) vary according to a sinusoidal function whose frequency is randomly chosen in the interval [0.05, 0.15] Hz.
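The P wave model in (3.10) to (3.13) can be sketched as below. The weights and the width are hypothetical values chosen only to reflect the qualitative description above, i.e., a dominant first function in lead X; they are not the default values in [20], and the sign of the biphasic component depends on the chosen weight.

```python
import math


def hermite(i, t, sigma):
    """Hermite function phi_i(t), i = 1, 2, 3, as defined in (3.11)-(3.13)."""
    g = math.exp(-t ** 2 / (2 * sigma ** 2))
    c = 1.0 / math.sqrt(sigma * math.sqrt(math.pi))
    if i == 1:
        return c * g                                              # monophasic
    if i == 2:
        return -math.sqrt(2) * c * (t / sigma) * g                # biphasic
    if i == 3:
        return c / math.sqrt(2) * (2 * t ** 2 / sigma ** 2 - 1) * g  # triphasic
    raise ValueError("i must be 1, 2, or 3")


def p_wave(t, weights, sigmas):
    """Linear combination in (3.10) for one lead."""
    return sum(w * hermite(i, t, s)
               for i, (w, s) in enumerate(zip(weights, sigmas), start=1))


# Hypothetical lead X: mostly monophasic, widths of 20 ms for all functions.
w_x = [0.02, 0.002, 0.001]
sig = [0.020, 0.020, 0.020]
p_x = [p_wave(k / 1000 - 0.1, w_x, sig) for k in range(200)]
```

When the three functions share the same width, they are orthonormal, which makes the weights easy to interpret as the contribution of each morphology.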

QRST complexes. The technique used for simulating QRST complexes in AF is also used in sinus rhythm. Resampling of the T wave is based on the well-known Bazett’s formula, setting the corrected QT interval to 420 ms [38]. Immediately after AF termination, T wave duration increases linearly over the next seven beats to produce a smooth QT interval transition from AF to sinus rhythm. The choice of a seven-beat transition is ad hoc, since the QT interval transition in AF has not been much investigated in the literature.
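Bazett's formula relates the corrected QT interval to the measured one by QTc = QT/√RR, with RR in seconds, so that the target QT interval follows as QT = QTc·√RR. The sketch below combines this with the seven-beat transition; interpolating the QT interval linearly from the fixed 360-ms AF value is an assumption made here for illustration, since the model itself acts on T wave duration.

```python
import math


def bazett_qt(rr_s, qtc_s=0.420):
    """QT interval (s) implied by Bazett's formula for a given RR interval (s)."""
    return qtc_s * math.sqrt(rr_s)


def qt_after_af(rr_series_s, qt_af_s=0.360, n_transition=7):
    """QT interval of each sinus beat following AF termination."""
    qts = []
    for i, rr in enumerate(rr_series_s):
        target = bazett_qt(rr)
        weight = min(1.0, (i + 1) / n_transition)   # reaches 1 at the seventh beat
        qts.append(qt_af_s + weight * (target - qt_af_s))
    return qts


# Ten sinus beats at 60 beats/min following an AF episode.
qts = qt_after_af([1.0] * 10)
```

At a heart rate of 60 beats/min, the QT interval thus grows from about 369 ms in the first sinus beat to 420 ms from the seventh beat onwards.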

3.4.3 Atrial Premature Beats

Since APBs are frequent in AF patients [39,40,41,42], it is important to account for their presence in the simulation model. Using a simple two-state Markov chain, a certain percentage of APBs is introduced, chosen from the following four types of unifocal APBs [43]:

  1. APBs with reset of the sinus node. The sum of the length of the preceding and the subsequent RR intervals is less than twice the normal RR interval, simulated by 20% shortening of the preceding RR interval and by leaving the subsequent RR interval unchanged.

  2. Interpolated APBs occur in between two adjacent sinus beats, simulated by splitting an RR interval into two intervals with 60/40 proportions.

  3. APBs with delayed reset of the sinus node, simulated by 20% shortening of the preceding RR interval and 20% prolongation of the subsequent RR interval.

  4. APBs with full compensatory pause, simulated by 20% shortening of the preceding RR interval, and subtracting the shortened RR interval from twice the normal RR interval to obtain the subsequent RR interval.

The likelihood of generating consecutive APBs, i.e., couplets, triplets, and short runs, is increased by setting the percentage of APBs to a large value. To account for the fact that P waves associated with APBs often deviate in amplitude and morphology from normal P waves in sinus rhythm, a new set of parameter values is generated and used to simulate P waves preceding APBs. The QRST complexes are generated in the same way as is done in sinus rhythm. Figure 3.7 illustrates simulated ECGs with different types of APBs.
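The RR interval manipulations associated with the four APB types can be sketched as a single function operating on an RR interval series. The function name and indexing convention are hypothetical, and placing the 60% part first for interpolated APBs is an assumption, since the order of the two parts is not stated above.

```python
def insert_apb(rr, i, apb_type, rr_normal=None):
    """Return a copy of the RR series with an APB introduced at interval i.

    rr[i] is the interval ending at the APB (the "preceding" interval) and
    rr[i + 1] the "subsequent" one. rr_normal is needed for type 4 only.
    """
    out = list(rr)
    if apb_type == 2:
        # Interpolated APB: split one interval 60/40 (order assumed).
        out[i:i + 1] = [0.6 * rr[i], 0.4 * rr[i]]
        return out
    if apb_type not in (1, 3, 4):
        raise ValueError("apb_type must be 1, 2, 3, or 4")
    out[i] = 0.8 * rr[i]                       # 20% shortening, types 1, 3, 4
    if apb_type == 3:
        out[i + 1] = 1.2 * rr[i + 1]           # delayed reset: 20% prolongation
    elif apb_type == 4:
        out[i + 1] = 2 * rr_normal - out[i]    # full compensatory pause
    return out


base = [1.0, 1.0, 1.0]
type1 = insert_apb(base, 0, 1)
type2 = insert_apb(base, 1, 2)
type3 = insert_apb(base, 0, 3)
type4 = insert_apb(base, 0, 4, rr_normal=1.0)
```

Note that types 3 and 4 give the same result for a perfectly regular series; they differ once the subsequent RR interval deviates from the normal RR interval.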

Fig. 3.7 Simulated ECGs containing (a) atrial premature beats (APBs) with reset of the sinus node (type 1), (b) interpolated APBs (type 2), (c) APBs with delayed reset of the sinus node (type 3), and (d) APBs with full compensatory pause (type 4)

3.4.4 Respiration

To account for the fact that respiration influences QRST morphology through changes in the electrical axis of the heart, the simulated VCG signal is transformed by a rotation matrix \(\mathbf {Q}(t)\), composed of three successive rotations around each of the axes [44],

$$\begin{aligned} \mathbf {Q}(t) = \mathbf {Q}_{\text {X}}(t) \mathbf {Q}_{\text {Y}}(t) \mathbf {Q}_{\text {Z}}(t). \end{aligned}$$
(3.14)

The three rotation matrices are defined by the time-varying angles \(\varphi _{\text {X}}(t), \varphi _{\text {Y}}(t)\), and \(\varphi _{\text {Z}}(t)\),

$$\begin{aligned} \mathbf {Q}_{\text {X}}(t)&= \left[ \begin{array}{ccc} 1 &{} 0 &{} 0 \\ 0 &{} \cos {\varphi _{{\text {X}}}(t)} &{} \sin {\varphi _{{\text {X}}}(t)}\\ 0 &{} -\sin {\varphi _{{\text {X}}}(t)} &{} \cos {\varphi _{{\text {X}}}(t)} \end{array}\right] , \end{aligned}$$
(3.15)
$$\begin{aligned} \mathbf {Q}_{\text {Y}}(t)&= \left[ \begin{array}{ccc} \cos {\varphi _{\text {Y}}(t)} &{} 0 &{} \sin {\varphi _{\text {Y}}(t)} \\ 0 &{} 1 &{} 0\\ -\sin {\varphi _{\text {Y}}(t)} &{} 0 &{} \cos {\varphi _{\text {Y}}(t)} \end{array}\right] , \end{aligned}$$
(3.16)
$$\begin{aligned} \mathbf {Q}_{\text {Z}}(t)&= \left[ \begin{array}{ccc} \cos {\varphi _{\text {Z}}(t)} &{} \sin {\varphi _{\text {Z}}(t)} &{} 0 \\ -\sin {\varphi _{\text {Z}}(t)} &{} \cos {\varphi _{\text {Z}}(t)} &{} 0\\ 0 &{} 0 &{} 1 \end{array}\right] . \end{aligned}$$
(3.17)

It is assumed that angular variation is proportional to the amount of air in the lungs during a respiratory cycle, a property modeled as the product of two sigmoidal functions reflecting inspiration and expiration,

$$\begin{aligned} \psi (t) = \frac{1}{1+e^{-\gamma _{\text {in}}t}} \frac{1}{1+e^{\gamma _{\text {ex}}(t-\delta )}}, \end{aligned}$$
(3.18)

where \(\gamma _{\text {in}}\) and \(\gamma _{\text {ex}}\) define the duration of inspiration and expiration, respectively, and \(\delta \) defines the delay between inspiration and expiration. In lead X, the angular variation across successive respiratory cycles is defined by

$$\begin{aligned} \varphi _{\text {X}}(t) = \sum _{i=0}^{\infty } \xi _{\text {X}} \psi (t-iT_r), \end{aligned}$$
(3.19)

where \(T_r\) is the duration of a respiratory cycle (inversely related to the fixed respiratory frequency, i.e., \(T_r=2\pi /\varOmega _r\)), and \(\xi _{\text {X}}\) is the maximum angular variation. The angular variation in leads Y and Z is determined in a similar way, defined by \(\xi _{\text {Y}}\) and \(\xi _{\text {Z}}\), respectively. The choice of realistic model parameter values is discussed in [45], as well as an extension of the model in (3.19) so that a time-varying respiratory frequency can be accounted for.
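A minimal sketch of (3.14)–(3.19) is given below, using only the Python standard library. The matrix entries and the sigmoidal product follow the equations above; the function names, the truncation of the infinite sum to `n_cycles` terms, and any parameter values used when calling the functions are assumptions made for illustration (realistic values are discussed in [45]).

```python
import math


def sigmoid(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def psi(t, gamma_in, gamma_ex, delta):
    """Product of inspiration and expiration sigmoids, cf. (3.18)."""
    return sigmoid(gamma_in * t) * sigmoid(-gamma_ex * (t - delta))


def angle(t, xi, t_r, gamma_in, gamma_ex, delta, n_cycles):
    """Angular variation, cf. (3.19), with the sum truncated to n_cycles terms."""
    return xi * sum(psi(t - i * t_r, gamma_in, gamma_ex, delta)
                    for i in range(n_cycles))


def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]      # (3.15)


def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]      # (3.16)


def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]      # (3.17)


def matmul(a, b):
    """Product of two 3-by-3 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def rotate_vcg(sample, ang_x, ang_y, ang_z):
    """Apply Q = Q_X Q_Y Q_Z, cf. (3.14), to one VCG sample [x, y, z]."""
    q = matmul(matmul(rot_x(ang_x), rot_y(ang_y)), rot_z(ang_z))
    return [sum(q[r][c] * sample[c] for c in range(3)) for r in range(3)]
```

Since each factor is a rotation, the transformation leaves the magnitude of the VCG sample unchanged and alters only its direction.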

In sinus rhythm, the respiratory frequency \(\varOmega _2\) in (3.9), influencing the ventricular rhythm through the autonomic system, should, preferably, be set to \(\varOmega _r\). In AF, the autonomic influence of respiration on ventricular rhythm is not modeled since the cardiorespiratory interaction is negligible [46].

3.4.5 Additive Noise

Three types of noise frequently encountered in ambulatory recordings—baseline wander, muscle noise, and electrode motion artifacts—can be added to the simulated ECG. These types of noise are extracted from the MIT–BIH Noise Stress Test Database, composed of a number of 30-min recordings which predominantly contain baseline wander, electromyographic noise, and electrode motion artifacts [47]. The two leads of the recordings in this database are labeled leads X and Y, whereas the noise in lead Z is constructed by computing the square root of the sum of squares of leads X and Y (an offset value is added before squaring, and the mean is subtracted after taking the square root).
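The construction of the lead Z noise may be sketched as follows. The text states only that an offset is added before squaring and that the mean is subtracted after the square root; adding the same offset to both leads, the default offset value, and the function name are assumptions made for illustration.

```python
import math


def noise_lead_z(noise_x, noise_y, offset=1.0):
    """Construct lead Z noise from the noise in leads X and Y.

    Each output sample is sqrt((x + offset)^2 + (y + offset)^2), after
    which the mean over the record is subtracted. Applying the offset to
    both leads, and its value, are assumptions.
    """
    raw = [math.hypot(x + offset, y + offset)
           for x, y in zip(noise_x, noise_y)]
    mean = sum(raw) / len(raw)
    return [r - mean for r in raw]
```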

3.4.6 Transformation from VCG to 12-Lead ECG

Different transformation matrices are applied to f waves, P waves, and QRST complexes when computing the standard 12-lead ECG from the VCG. The f wave transformation is based on the inverse of the P wave optimized transformation matrix [48], multiplied with a diagonal scaling matrix determining the tendency of f wave amplitude in the 12-lead ECG [20]. The diagonal matrix accounts for the fact that f wave amplitude is typically largest in V\(_1\) and then gradually decreases as the leads move away from the atria. The decrease in amplitude can be explained by a much more scattered electrical vector in AF than in sinus rhythm, combined with increased distance to the electrode site. The resulting simulated 12-lead ECG with f waves, but not QRST complexes, is illustrated in Fig. 3.8a, and a real 12-lead ECG, whose f waves resemble the simulated ones, is illustrated in Fig. 3.8b.

Fig. 3.8 (a) Simulated f waves produced by the model in (3.5), and (b) f waves extracted from a real ECG using an echo state network [15]

The inverse of the P wave optimized transformation matrix in [48] is used to reconstruct P waves in the 12-lead ECG, see Fig. 3.9.

Fig. 3.9 Ten superimposed realizations of P waves in the standard 12-lead ECG, modeled as a linear combination of the first three Hermite functions using randomly generated weights

The Dower matrix [49, 50] is used to compute the QRST complexes, as well as the noise, in the 12-lead ECG. However, the transformation of the QRST complexes and the noise is done separately so that the noise can be scaled in each lead to the desired RMS value before being added to the 12-lead signal composed of both atrial and ventricular activity.
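The per-lead noise scaling can be sketched as below, assuming that the noise has already been transformed to the lead in question. The function names are illustrative, and the Dower transformation itself is not reproduced here.

```python
import math


def rms(signal):
    """Root mean square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def scale_to_rms(noise, target_rms):
    """Scale the noise in one lead so that its RMS value equals target_rms."""
    current = rms(noise)
    if current == 0.0:
        return list(noise)  # all-zero noise cannot be scaled
    gain = target_rms / current
    return [gain * v for v in noise]


def add_scaled_noise(clean_lead, noise_lead, target_rms):
    """Add noise, scaled to the desired RMS value, to a noise-free lead."""
    scaled = scale_to_rms(noise_lead, target_rms)
    return [c + n for c, n in zip(clean_lead, scaled)]
```

Keeping the noise separate until this step is what allows the noise level to be set independently in each lead.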

3.4.7 Switching Between Atrial Fibrillation and Sinus Rhythm

The switching between sinus rhythm and AF is modeled by a two-state continuous-time Markov chain, where the time d spent in a state, also referred to as episode duration, is determined by the exponential PDF

$$\begin{aligned} p(d) = \left\{ \begin{array}{ll} \beta _d e^{-\beta _d d}, &{} \ d\ge 0,\\ 0, &{} \ d < 0. \end{array} \right. \end{aligned}$$
(3.20)

The parameter \(\beta _d\) defines the rate of episodes. The median duration of an AF episode is given by

$$\begin{aligned} \bar{d}_{\text {AF}} = \frac{\ln 2}{\beta _{\text {AF}}}, \end{aligned}$$
(3.21)

where \(\beta _{\text {AF}}\) denotes the rate of AF episodes, cf. (3.4). The median duration of an episode with sinus rhythm is assumed to be given by

$$\begin{aligned} \bar{d}_{\mathrm {SR}} = \frac{1-B}{B} \cdot \bar{d}_{\mathrm {AF}}, \end{aligned}$$
(3.22)

where B \((0<B<1)\) determines the fraction of the total time during which AF is present, i.e., \(B = \bar{d}_{\mathrm {AF}}/(\bar{d}_{\mathrm {AF}}+\bar{d}_{\mathrm {SR}})\), and thus B can be viewed as a descriptor of mean AF burden. Given B, the sole parameter controlling episode duration is \(\bar{d}_{\mathrm {AF}}\), and no minimum episode duration is specified.
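The switching model may be sketched as follows. Episode durations are drawn from the exponential PDF in (3.20), with rates obtained from the median durations as in (3.21). The sketch assumes that B is the long-run fraction of time spent in AF, so that the median duration of sinus rhythm episodes is \(\bar{d}_{\mathrm {AF}}(1-B)/B\). The function name, the choice of sinus rhythm as the initial state, and the truncation of the last episode are also assumptions.

```python
import math
import random


def simulate_episodes(total_duration, median_af, burden, seed=0):
    """Alternate between sinus rhythm (SR) and AF episodes.

    Durations are exponentially distributed with rate ln(2) / median.
    Returns a list of (state, start, end) tuples covering total_duration,
    expressed in the same unit as median_af (e.g., beats or seconds).
    """
    rng = random.Random(seed)
    # Assumption: burden is the long-run fraction of time spent in AF
    median_sr = median_af * (1.0 - burden) / burden
    rate = {"AF": math.log(2) / median_af, "SR": math.log(2) / median_sr}
    state, t, episodes = "SR", 0.0, []  # starting in SR is an assumption
    while t < total_duration:
        d = min(rng.expovariate(rate[state]), total_duration - t)
        episodes.append((state, t, t + d))
        t += d
        state = "AF" if state == "SR" else "SR"
    return episodes
```

Over a long simulated record, the fraction of time labeled as AF approaches the chosen burden, which offers a simple check that the two rates have been set consistently.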

A more advanced, non-Markovian switching model has been proposed which accounts for aspects of AF progression related to genetic disposition as well as to age-related and AF history-related remodeling [51]. The model can simulate individual AF episodes as well as the natural progression of AF in patients over a period of decades.

The possibility to generate episodes with varying duration is valuable when simulating arrhythmia progression. Evidence shows that brief episodes progress to longer episodes [52, 53], implying that it is of interest to evaluate detection performance as a function of episode duration. Moreover, brief but rare episodes have been observed in patients after cryptogenic stroke and transient ischemic attack [54,55,56,57]. Such signals can be simulated with the model described in this section, using, for example, a median episode duration of 30 beats and a low AF burden of 0.001.

Fig. 3.10 Simulation of ECG signals using real components, taken from the Long-Term AF Database (LTAFDB), the MIT–BIH Normal Sinus Rhythm Database (NSRDB), and the PTB Diagnostic ECG Database (PTBDB)

3.5 Simulation of Paroxysmal AF Using Real Components

Alternatively, the simulator can produce signals based on real ECG components, randomly selected from the three databases which are used to characterize ventricular rhythm, atrial activity (f or P waves), and QRST complexes, see Fig. 3.10. These components, together with the above-described noise types, are added to produce the standard 12-lead ECG.

Ventricular rhythm. The Long-Term AF Database (LTAFDB) was used for creating a set of AF rhythms. A total of 69 different RR interval series were extracted from the 84 long-term ECG recordings; the 15 remaining recordings were excluded due to their relatively short duration with AF (<5000 beats). Similarly, the entire NSRDB, consisting of 18 long-term ECG recordings, was used to create a set of sinus rhythms. Switching between paroxysmal AF and sinus rhythm is modeled in the same way as for synthetic components, cf. Sect. 3.4.7.

For each simulated signal, the RR interval series of the prevailing rhythm is randomly selected from the proper rhythm set, and repeated by concatenation until the desired length is attained. While heart rate is often higher in AF than in sinus rhythm, this may not be the case when concatenating randomly selected RR intervals in sinus rhythm and AF. Therefore, whenever the mean RR interval is shorter in sinus rhythm than in AF, the mean RR interval in sinus rhythm is adjusted to become identical to the mean RR interval in AF.
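The concatenation and the mean RR interval adjustment can be sketched as below. The text does not state how the adjustment is performed; multiplicative scaling of all sinus rhythm RR intervals is an assumption, as are the function names and the use of a beat count as the desired length.

```python
def repeat_to_length(rr_series, n_beats):
    """Repeat an RR interval series by concatenation until n_beats are obtained."""
    repeats = -(-n_beats // len(rr_series))  # ceiling division
    return (list(rr_series) * repeats)[:n_beats]


def match_mean_rr(rr_sinus, rr_af):
    """Adjust sinus rhythm RR intervals when their mean is shorter than in AF.

    If the mean RR interval in sinus rhythm is shorter than in AF, all sinus
    rhythm RR intervals are scaled so that the two means become identical
    (scaling is an assumption); otherwise the series is returned unchanged.
    """
    mean_sinus = sum(rr_sinus) / len(rr_sinus)
    mean_af = sum(rr_af) / len(rr_af)
    if mean_sinus < mean_af:
        gain = mean_af / mean_sinus
        return [gain * rr for rr in rr_sinus]
    return list(rr_sinus)
```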

It should be noted that when simulating ECGs using real components, the atrial and ventricular rates are unrelated since the f waves and the RR interval series are extracted from different databases.

f and P waves. A set of 20 segments with real, multi-lead f waves is extracted from the Lund AF database with 12-lead ECGs, acquired from patients with persistent AF [8]. An echo state network was applied for f wave extraction [15], see also Sect. 5.5.3. Lead V\(_{6}\) was used as reference lead when extracting f waves in the remaining 11 ECG leads, whereas lead V\(_{5}\) was used when extracting f waves in lead V\(_{6}\), see Fig. 3.8b.

In sinus rhythm, the original, real P wave, along with the subsequent QRST complex, is retained, while, in AF, only the QRST complex is retained and a continuous f wave signal added.

QRST complexes. A set of 100 15-lead ECGs (12 standard leads plus Frank leads) with sinus rhythm, selected from the Physikalisch–Technische Bundesanstalt Database, serves as the basis for modeling QRST complexes. Following baseline removal and QRST delineation [58], the original T wave is resampled to have a fixed width and then adjusted to the prevailing heart rate according to the procedure described in Sect. 3.4.1. Since the ECGs of this database last for only about two minutes, the QRST complexes are repeated by concatenation until the desired duration is achieved. The TQ interval is interpolated using cubic spline interpolation. All other steps required to generate QRST complexes are similar to those described in Sect. 3.4.1.

Simulated signals composed of either synthetic or real ECG components are illustrated in Fig. 3.11.

Fig. 3.11 Simulated 12-lead ECGs containing a brief AF episode, composed of (a) synthetic components and (b) real components. Using synthetic components, the 12-lead ECG is obtained from the simulated signals in leads X, Y, and Z, following linear transformations. Using real components, the original 12-lead ECG is taken from the PTBDB; in AF, the P waves are removed and f waves extracted from the Lund AF database are added

3.6 Relevance of Simulated Signals

The question of whether a simulation model produces realistic signals is not easily answered since the term “realistic” is difficult to quantify. Historically, this question has received little, if any, attention in papers describing simulation models of the ECG, see, e.g., [28, 31, 59, 60], although the models have turned out to be most valuable in the development of signal processing algorithms, an observation which applies particularly to the simulation model in [28]. To provide a quantitative answer, the idea of letting expert cardiologists blindly assess the realism of ECG signals was first realized in [20], where the assessment involved not only simulated ECG signals produced by the model in Sect. 3.4, but also real ECG signals. The results showed that the simulated signals were, for the most part, realistic, but they also showed that the approach to modeling of the QT interval in AF needed improvement. Since only two cardiologists took part, the assessment would have supported more far-reaching conclusions had more experts been involved.

In the context of AF detection, an indirect approach to evaluating signal realism is to analyze simulated signals using some suitable detector, and then compare the results with those obtained using the same detector on an existing database containing real ECGs [20]. This approach has not been considered in the past either, although it may provide valuable insight into whether the simulated signals are too “doctored” to be used for the development of AF detectors.

The degree of sophistication of a simulation model is another way to judge model relevance, hinted at in [17] where the f wave replication model was labeled as “primitive” and the above-mentioned model of normal sinus rhythm [28] as “simple,” whereas the biophysical model proposed by the authors themselves was labeled as “more sophisticated” in producing ECG signals. Considering that the biophysical model accounts for detailed electroanatomical information, whereas the other two models do not, such labeling seems reasonable. But does a higher degree of sophistication imply that the model is better suited for the development of signal processing algorithms and performance evaluation? The fact that biophysical models have hardly been considered at all for such purposes provides an answer to this question, with implementational and computational complexity, difficulty in controlling basic signal characteristics such as f wave amplitude and repetition rate, and the lack of rhythm switching models as probable reasons. From an algorithmic viewpoint, it is not obvious why biophysical models would necessarily produce ECG signals which are more relevant than those of phenomenological models, such as the ones described in Sects. 3.4 and 3.5.