1 Introduction

The significance of brief episodes (<30 s) of paroxysmal atrial fibrillation (PAF), also referred to as occult PAF, is currently receiving considerable attention in clinical research [30]. Recent results from prolonged rhythm monitoring support an independent association between brief episodes and future risk of stroke. It has been suggested that brief episodes may be coupled to the formation of atrial thrombus, and that brief episodes may be viewed as biomarkers of prolonged episodes occurring outside of the monitoring period [30].

The impact of brief PAF episodes on thrombus formation is the subject of an ongoing debate, see, e.g., [14, 27], which prompts the need for detection techniques that could help to establish the clinical value of such episodes. Long-term, continuous noninvasive monitoring is likely to improve the AF detection rate [6], but considering the often poor signal quality, it is important to develop robust detectors which minimize the time for manual review of the data.

Both noninvasive and invasive recording technologies have been employed for prolonged rhythm monitoring, exemplified by the following two clinical studies. Using mobile cardiac outpatient telemetry, 56 patients with presumed cryptogenic ischemic stroke were monitored [34]. In 23 % of the patients with atrial fibrillation (AF), 85 % of all episodes were brief. Using implantable cardiac monitoring, 11 % of all patients with cryptogenic ischemic stroke had new onset PAF with 5-s episodes or longer [10]. The authors argued that subsequent strokes may be prevented if patients are monitored during their first month after stroke.

The poor agreement between methods for AF detection has recently been pointed out as an important limitation of clinical studies [30]. Although monitoring devices have been on the market for some time, no information is available on their accuracy in detecting occult PAF. Thus, there is a need not only for validation of commercial devices, but also for the development of methods for detection of occult PAF.

The vast majority of AF detectors explore RR interval irregularity through parameters which reflect randomness, variability, and complexity, e.g., [9, 12, 20, 29, 33]. While these detectors offer satisfactory performance with respect to longer episodes, detection of occult PAF is precluded since a window length of at least 30 s is usually needed. An interesting RR-based detector was recently proposed where the coefficient of sample entropy was employed to find episodes with as few as 12 beats [18]. When evaluating the performance of this coefficient on short duration ECGs, an area under the receiver operating characteristic (ROC) curve of 90.2 % was achieved when a 5-s window was used [19].

It is well known that AF detectors relying on information on RR irregularity are prone to produce false alarms in rhythms with atrial premature beats (APBs) [1, 18]. In order to reduce the number of false alarms, information on P-wave absence and f-wave presence appears natural to include in the decision process. However, very few detectors which explore atrial information have been described in the scientific literature; one of the few combines information on RR irregularity with PR interval variability and P-wave morphology [1]. Its performance was only slightly better than that achieved by the same detector but without use of atrial information; all episodes shorter than 1 min were excluded.

The AF detector proposed by Carvalho et al. [4, 8] appears to be the first with an architecture that jointly processes information on RR irregularity, P-wave absence, and f-wave presence. An artificial neural network (ANN) was used as classifier, first trained on a huge dataset and then used with fixed values for detection. Similar to other detectors, this detector requires that ventricular premature beats (VPBs) are first located and excluded. Using the MIT–BIH AF database, the performance was not better than that achieved by the RR-based detector in [33]. A possible explanation for this result is that the decision process did not account for the prevailing noise level.

In the present study, a novel AF detector is proposed that embraces four parameters which characterize RR irregularity, P-wave absence, f-wave presence, and noise level. All parameters, except for RR irregularity, are determined from a signal produced by the echo state network (ESN) described in [26]. This type of network offers a unified solution to the problem of QRST cancelation in the presence of VPBs and large variation in normal beat morphology; thus, no dedicated algorithm is needed for the handling of VPBs. The four parameters constitute the total information fed to the classifier based on fuzzy logic. Detector performance is studied on a large set of ECG test signals whose properties are easily controlled, e.g., with respect to episode duration, percentage of APBs, and noise level.

The paper is organized as follows. The detector is described in Sect. 2, followed by a description of the ECG database and the performance measures in Sect. 3. The results characterizing performance are presented in Sect. 4 and compared to a detector which explores RR irregularity. The generation of test signals is described in the “Appendix”.

2 Methods

The main processing steps of the proposed detector are shown in Fig. 1. The detector requires two ECG leads as input of which one needs to be positioned away from the atria, e.g., \(\hbox {V}_6\). A sliding window approach is taken to PAF detection: The window length is defined by the number of beats \(M\), rather than by a time period, since a beat-based definition seems more natural when detecting brief episodes.

Fig. 1

Block diagram of the proposed PAF detector. The echo state neural network is used for PQRST cancelation in the target lead \(x(n)\), here given by \(\hbox {V}_1\); the reference lead \(x_{r}(n)\) is \(\hbox {V}_6\). The output \(\hat{s}(n)\) of the block labeled “PQRST cancelation” contains f-waves during AF, and otherwise noise and PQRST residuals. The ESN inputs and output are normalized and denormalized, respectively, according to standard procedure. See the text for definitions of signals and parameters

2.1 Atrial activity characterization

Similar to other techniques for atrial activity extraction during AF, the ESN-based technique was developed under the assumption that AF is present and, accordingly, a signal with f-waves is fed to the ESN [26]. That assumption is not valid here since the input signal may just as well contain P-waves. However, preliminary tests showed that the ESN is not only suited for cancelation of QRST complexes but also for P-waves. Therefore, the ESN is briefly described below, followed by the parameters characterizing P-wave absence and f-wave presence, both computed from the ESN output.

In the present application, the ESN can be viewed as an adaptive filter which produces an output signal \(\hat{s}(n)\) with the f-waves from the target signal \(x(n)\) when AF is present, whereas \(\hat{s}(n)\) mostly contains the noise of \(x(n)\) and PQRST residuals when AF is absent. The reference signal \(x_r(n)\) is filtered by a time-variable transfer function, see Fig. 1. The output signal \(\hat{s}(n)\) is defined as the error \(e(n)\) between the target signal \(x(n)\) and the ESN output \(\hat{y}(n)\), being an estimate of the PQRST or the QRST complex, i.e.,

$$\begin{aligned} \hat{s}(n) \overset{\triangle }{=} e(n) = x(n) - \hat{y}(n). \end{aligned}$$
(1)

The estimate \(\hat{y}(n)\) is obtained by

$$\begin{aligned} \hat{y}(n) = g_o(\mathbf {w}^T_{\mathrm{out}}(n-1)\mathbf {z}(n)), \end{aligned}$$
(2)

where \(g_o(\cdot )\) denotes the output neuron activation function and \(\mathbf {w}_{\mathrm{out}}(n-1)\) the \(N\times 1\) time-varying output weight vector. The number of neurons in the reservoir is denoted \(N.\) The vector \(\mathbf {z}(n)\) is the concatenation of the \(N\times 1\) reservoir state vector \(\mathbf {r}(n)\) with the reference signal \(x_r(n)\), its first derivative \(x_r^{\prime }(n),\) and an impulse-like signal \(x_r^{s}(n)\),

$$\begin{aligned} \mathbf {z}(n) = \begin{bmatrix} \mathbf {r}(n)&x_r(n)&x_r^{\prime }(n)&x_r^{s}(n) \end{bmatrix}^T. \end{aligned}$$
(3)

The signal \(x_r^{s}(n)\) is identical to \(x(n)\) in a short interval of length \(2D\) centered around the fiducial point \(n_i\) of the \(i\)th beat; outside this interval \(x_r^{s}(n)\) is set to 0 (the fiducial point is here defined by the QRS center-of-mass). Thus, \(x_r^{s}(n)\) can be viewed as a variant of the impulse correlated reference input to the adaptive filter [16]. It should be noted that the definition in (3) differs from the one in [26] since the second derivative of \(x_r(n)\) is replaced with \(x_r^{s}(n)\) in order to achieve better noise immunity.

The output weights \(\mathbf {w}_{\mathrm{out}}(n)\) of the ESN are updated using the recursive least squares (RLS) algorithm in combination with least squares prewhitening. Prewhitening is defined by

$$\begin{aligned} \mathbf {v}(n)&= \mathbf {P}(n-1) \mathbf {z}(n), \end{aligned}$$
(4)
$$\begin{aligned} \mathbf {u}(n)&= \mathbf {P}^{T}(n-1) \mathbf {v}(n), \end{aligned}$$
(5)

where \(\mathbf {P}(n)\) denotes the inverse of the correlation matrix of \(\mathbf {z}(n)\). The update of \(\mathbf {P}(n)\) is given by the following two equations:

$$\begin{aligned} k(n)&= \frac{1}{\lambda + \Vert \mathbf {v}(n)\Vert ^2+\sqrt{\lambda (\lambda +\Vert \mathbf {v}(n)\Vert ^2)}}, \end{aligned}$$
(6)
$$\begin{aligned} \mathbf {P}(n)&= \frac{\mathbf {P}(n-1)-k(n) \mathbf {v}(n) \mathbf {u}^{T}(n)}{\sqrt{\lambda }}, \end{aligned}$$
(7)

where \(\mathbf {P}(0) = \delta ^{-1} \mathbf {I}, \delta\) is a small positive constant, \(\mathbf {I}\) the identity matrix, and \(\lambda\) a forgetting factor. The RLS part of the algorithm produces an update of the output weights,

$$\begin{aligned} \mathbf {w}_{\mathrm{out}}(n) = \mathbf {w}_{\mathrm{out}}(n-1)+ \frac{e(n) \mathbf {u}(n)}{\lambda + \Vert \mathbf {v}(n)\Vert ^{2}}, \end{aligned}$$
(8)

where \(\mathbf {w}_{\mathrm{out}}(0) = \mathbf {0}\). The vector \(\mathbf {r}(n)\) is updated by

$$\begin{aligned} \mathbf {r}(n) = \alpha \mathbf {r}(n-1)+(1-\alpha )(g_r(\mathbf {W} \mathbf {r}(n-1) + \mathbf {W}_{\mathrm{in}} \mathbf {u}(n))), \end{aligned}$$
(9)

where \(\mathbf {W}_{\mathrm{in}}\) is a \(3\times N\) input weight matrix, \(\mathbf {W}\) an \(N\times N\) weight matrix of the internal network connections, \(g_r(\cdot )\) a reservoir neuron activation function, and \(\alpha\) a forgetting factor. The recursion in (9) is initialized with \(\mathbf {r}(0) = \mathbf {0}\).
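As an illustration only, the output-weight update in (4)–(8) may be sketched in Python as follows. The sketch assumes an identity output activation \(g_o(\cdot )\) and takes the regressor \(\mathbf {z}(n)\) and the weight vector to have the same length; the function names `rls_prewhitened_step` and `identify`, and the use of random regressors in place of the reservoir state, are hypothetical choices made for the example and are not taken from [26].

```python
import math
import random


def rls_prewhitened_step(P, w, z, d, lam=0.999):
    """One update of Eqs. (1) and (4)-(8), assuming an identity output activation.

    P: square matrix as a list of rows, w: weight list, z: regressor list,
    d: target sample x(n). Returns (P_new, w_new, e), e being the a priori error.
    """
    n = len(z)
    # Eq. (2) with identity g_o, then Eq. (1): e(n) = x(n) - y_hat(n)
    e = d - sum(w[i] * z[i] for i in range(n))
    # Eq. (4): v(n) = P(n-1) z(n)
    v = [sum(P[i][j] * z[j] for j in range(n)) for i in range(n)]
    # Eq. (5): u(n) = P^T(n-1) v(n)
    u = [sum(P[j][i] * v[j] for j in range(n)) for i in range(n)]
    v2 = sum(vi * vi for vi in v)
    # Eq. (6): scalar gain
    k = 1.0 / (lam + v2 + math.sqrt(lam * (lam + v2)))
    # Eq. (7): update of P
    root = math.sqrt(lam)
    P_new = [[(P[i][j] - k * v[i] * u[j]) / root for j in range(n)] for i in range(n)]
    # Eq. (8): update of the output weights
    w_new = [w[i] + e * u[i] / (lam + v2) for i in range(n)]
    return P_new, w_new, e


def identify(w_true, steps=300, delta=0.01, seed=0):
    """Hypothetical check: recover known weights from noise-free linear data."""
    rng = random.Random(seed)
    n = len(w_true)
    # P(0) = delta^{-1} I and w_out(0) = 0, as stated in the text
    P = [[(1.0 / delta if i == j else 0.0) for j in range(n)] for i in range(n)]
    w = [0.0] * n
    for _ in range(steps):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        d = sum(a * b for a, b in zip(w_true, z))
        P, w, _ = rls_prewhitened_step(P, w, z, d)
    return w


w_true = [0.5, -1.0, 2.0, 0.25]
w_est = identify(w_true)
```

In this form \(\mathbf {P}(n)\) plays the role of a square-root factor, so that \(\mathbf {P}^T(n)\mathbf {P}(n)\) follows the conventional RLS recursion for the inverse correlation matrix.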

P-wave absence \(({{\mathcal {P}}})\) is quantified by first computing the squared error between two different PR intervals,

$$\begin{aligned} e_{ij} = \sum _{n=n_P}^{n_R} \left( \hat{s}(n_i-n) - \hat{s}(n_{j}-n)\right) ^2, \end{aligned}$$
(10)

where \(n_P\) and \(n_R\) denote the onset and end of the PR interval, respectively, both located at fixed distances from the fiducial points \(n_i\) and \(n_j, i\ne j\). Then, the squared error is averaged for all pairwise combinations of the \(M\) beats in the detection window,

$$\begin{aligned} {{\mathcal {P}}} = \sum _{i=1}^{M-1}\frac{1}{M-i}\sum _{j=i+1}^{M} e_{ij}. \end{aligned}$$
(11)

The parameter \({{\mathcal {P}}}\) is close to 0 in rhythms with P-waves, but increases when f-waves are present. Since the F-waves of atrial flutter are largely canceled by the ESN, thanks to their much more stable pattern than the f-waves, the corresponding value of \({{\mathcal {P}}}\) is close to 0. In contrast to [4], this approach to characterizing P-wave absence requires no P-wave template, nor is it sensitive to variations in morphology since P-waves have already been canceled by the ESN.
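A minimal sketch of (10) and (11) is given below, under the assumption that the PR-interval limits are expressed in samples with \(n_R < n_P\) and that every fiducial point lies at least \(n_P\) samples into the signal. The function name `p_wave_absence` and the toy residual signal are hypothetical.

```python
def p_wave_absence(s_hat, fiducials, n_r, n_p):
    """Eqs. (10)-(11): pairwise squared error between PR-interval segments.

    s_hat: residual signal (list), fiducials: sample indices n_i of the M beats,
    n_r, n_p: distances in samples from the fiducial point to the end and the
    onset of the PR interval (n_r < n_p assumed).
    """
    # Samples s_hat(n_i - n) for n = n_r..n_p, stored in chronological order
    segments = [s_hat[n_i - n_p: n_i - n_r + 1] for n_i in fiducials]
    m = len(segments)
    total = 0.0
    for i in range(m - 1):
        inner = 0.0
        for j in range(i + 1, m):
            # Eq. (10): squared error e_ij between beats i and j
            inner += sum((a - b) ** 2 for a, b in zip(segments[i], segments[j]))
        # Eq. (11): weight 1/(M - i) for 1-based i, i.e. 1/(m - 1 - i) here
        total += inner / (m - 1 - i)
    return total


# Toy residual: identical (all-zero) PR intervals, then one deviating sample
s_flat = [0.0] * 1000
beats = [100, 300, 500]
p_flat = p_wave_absence(s_flat, beats, n_r=12, n_p=62)
s_bump = list(s_flat)
s_bump[280] = 1.0  # falls inside the PR interval of the second beat
p_bump = p_wave_absence(s_bump, beats, n_r=12, n_p=62)
```

With 250-Hz sampling, the sample offsets 12 and 62 roughly correspond to 50 and 250 ms; the rounding is an assumption of this example.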

f-wave presence \(({{\mathcal {F}}})\) is quantified by the parameter known as spectral concentration [5, 21],

$$\begin{aligned} {{\mathcal {F}}} = \frac{1}{E_{\hat{s}}} \int _{\varOmega _p} P_{\hat{s}}(\omega )\ \hbox {d}\omega , \end{aligned}$$
(12)

where \(P_{\hat{s}}(\omega )\) and \(E_{\hat{s}}\) denote the power spectrum and energy, respectively, of \(\hat{s}(n)\) in the \(M\) beat long detection window. The integration interval \(\varOmega _p\) is centered around the dominant spectral peak located within the interval \([\omega _{a,0},\omega _{a,1}]\) [5]. When f-waves are present, the dominant peak reflects AF frequency and \({{\mathcal {F}}}\) becomes closer to 1, whereas it is closer to 0 for sinus rhythm (SR). The power spectrum \(P_{\hat{s}}(\omega )\) is obtained using Welch’s method (1-s cosine window with 50 % segment overlap).
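The computation in (12) may be sketched as follows. Several details are assumptions made for this example rather than values taken from [5]: a Hann (raised-cosine) window stands in for the cosine window, the width of \(\varOmega _p\) is set by a hypothetical `half_width` of 1 Hz, the band \([3,12]\) Hz is used for the peak search, and \(E_{\hat{s}}\) is approximated by the sum of the spectrum over all bins. A naive DFT is used only to keep the sketch free of dependencies.

```python
import cmath
import math
import random


def welch_psd(x, fs, seg_len):
    """Averaged periodogram of Hann-windowed segments with 50 % overlap (unscaled)."""
    if len(x) < seg_len:
        raise ValueError("signal shorter than one segment")
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (seg_len - 1)) for n in range(seg_len)]
    n_bins = seg_len // 2 + 1
    psd = [0.0] * n_bins
    count = 0
    for start in range(0, len(x) - seg_len + 1, seg_len // 2):
        seg = [x[start + n] * window[n] for n in range(seg_len)]
        for k in range(n_bins):
            # Naive DFT; an FFT routine would normally be used instead
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / seg_len)
                      for n in range(seg_len))
            psd[k] += abs(acc) ** 2
        count += 1
    freqs = [k * fs / seg_len for k in range(n_bins)]
    return freqs, [p / count for p in psd]


def spectral_concentration(s_hat, fs, band=(3.0, 12.0), half_width=1.0):
    """Eq. (12): fraction of spectral power around the dominant peak in `band`."""
    freqs, psd = welch_psd(s_hat, fs, int(fs))  # 1-s segments
    in_band = [k for k, f in enumerate(freqs) if band[0] <= f <= band[1]]
    k_peak = max(in_band, key=lambda k: psd[k])
    peak_power = sum(p for f, p in zip(freqs, psd)
                     if abs(f - freqs[k_peak]) <= half_width)
    total = sum(psd)
    return peak_power / total if total > 0 else 0.0


fs = 250
# A 6-Hz tone as a crude stand-in for f-waves, and white noise for comparison
tone = [math.sin(2 * math.pi * 6 * n / fs) for n in range(5 * fs)]
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(5 * fs)]
f_tone = spectral_concentration(tone, fs)
f_noise = spectral_concentration(noise, fs)
```

In this toy setting the narrowband tone yields a value near 1 and the white noise a value near 0, in line with the behavior described for \({{\mathcal {F}}}\).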

2.2 Ventricular activity characterization

RR interval irregularity \(({{\mathcal {R}}})\) is quantified by the coefficient of sample entropy, defined by

$$\begin{aligned} {{\mathcal {R}}} = -\ln \left( \frac{A}{B} \right) + \ln (2r) - \ln (\bar{m}_{r}), \end{aligned}$$
(13)

where \(A\) and \(B\) denote the total number of RR interval patterns of length \(m+1\) and \(m\), respectively, that match within a certain tolerance \(r\); for details, see the PAF detector described in [18]. The mean length of the RR intervals in the detection window is denoted \(\bar{m}_{r}\).
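For illustration, (13) may be sketched as below. The template length \(m=1\), the tolerance \(r=0.03\) s, the matching criterion (maximum absolute difference not exceeding \(r\)), and the handling of windows without matches are assumptions of this example; the settings actually used are those given in [18]. The function name `cosen` is hypothetical.

```python
import math


def cosen(rr, m=1, r=0.03):
    """Eq. (13) for a list of RR intervals in seconds; None if no matches exist."""
    n_templates = len(rr) - m

    def count_matches(length):
        # Number of template pairs (i < j) whose samples all differ by at most r
        matches = 0
        for i in range(n_templates):
            for j in range(i + 1, n_templates):
                if all(abs(rr[i + k] - rr[j + k]) <= r for k in range(length)):
                    matches += 1
        return matches

    b = count_matches(m)       # matches of length m
    a = count_matches(m + 1)   # matches of length m + 1
    if a == 0 or b == 0:
        return None            # the logarithm in Eq. (13) is undefined
    mean_rr = sum(rr) / len(rr)
    return -math.log(a / b) + math.log(2 * r) - math.log(mean_rr)


regular = [0.8] * 8
irregular = [0.60, 0.90, 0.61, 0.75, 0.60, 0.50, 0.62, 0.88]
r_regular = cosen(regular)
r_irregular = cosen(irregular)
```

A perfectly regular series gives \(A=B\), so that only the last two terms of (13) remain, whereas the irregular series produces a larger value.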

2.3 Noise level estimation

The noise level is estimated by the root mean square (RMS) value \(R_{\hat{s}}\) of \(\hat{s}(n)\), weighted by a ratio of spectral entropies. The numerator and denominator are computed in spectral bands dominated by noise and f-waves, respectively, defined by the respective frequencies \(\omega _{n}\) and \(\omega _{a}\). The noise parameter \({{\mathcal {N}}}\), defined by

$$\begin{aligned} {{\mathcal {N}}} = R_{\hat{s}} \cdot \frac{\displaystyle \int _{\omega _{n,0}}^{\omega _{n,1}} P_{\hat{s}}(\omega ) \cdot \log _{2}P_{\hat{s}}(\omega )\ \hbox {d}\omega }{\displaystyle \int _{\omega _{a,0}}^{\omega _{a,1}} P_{\hat{s}}(\omega ) \cdot \log _{2}P_{\hat{s}}(\omega )\ \hbox {d}\omega }, \end{aligned}$$
(14)

is small when \(P_{\hat{s}}(\omega )\) reflects AF, whereas it is large when motion artifacts and/or electromyographic (EMG) noise is present. The properties of \({{\mathcal {N}}}\) are further investigated in Sect. 4.

2.4 AF detection based on fuzzy logic

A Mamdani-type fuzzy inference method is employed for AF detection [22]. With fuzzy logic, numerical and linguistic knowledge are combined, which makes it particularly useful in applications where subjective knowledge is available about the process. The present design comes with four inputs, i.e., \({{\mathcal {P}}}, {{\mathcal {F}}}, {{\mathcal {R}}}, {{\mathcal {N}}}\), a set of “if–then” rules, and one output \({{\mathcal {O}}}\). By means of an input membership function, each input value is mapped (“fuzzified”) to a value that indicates the degree of belonging to a certain fuzzy set. For \({{\mathcal {P}}}, {{\mathcal {F}}}\), and \({{\mathcal {R}}}\), the fuzzy sets relate to SR and AF, and the following two input membership functions are employed:

$$\begin{aligned} \mu _{\mathrm{SR}}(x) = \left\{ \begin{array}{ll} 1, &{} \quad x \le a \\ 1-2\left( \frac{x-a}{b-a}\right) ^{2}, &{} \quad a \le x \le \frac{a + b}{2} \\ 2\left( \frac{x-b}{b-a}\right) ^{2}, &{} \quad \frac{a + b}{2} \le x \le b \\ 0, &{} \quad x \ge b, \end{array} \right. \end{aligned}$$
(15)

and

$$\begin{aligned} \mu _{\mathrm{AF}}(x) = \mu _{\mathrm{SR}}(a+b-x). \end{aligned}$$
(16)

The shape of \(\mu _{\mathrm{SR}}(x)\) and \(\mu _{\mathrm{AF}}(x)\) is defined by the parameters \(a\) and \(b\). For \({{\mathcal {N}}}\), the same type of input membership function is employed, but the fuzzy set relates instead to the noise level which is judged either to be low or high.
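The membership functions in (15) and (16) translate directly into code; the following sketch uses the hypothetical function names `mu_sr` and `mu_af` and, as an example, the range \((a,b)=(0,0.6)\).

```python
def mu_sr(x, a, b):
    """Eq. (15): Z-shaped membership, 1 for x <= a and 0 for x >= b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    if x <= (a + b) / 2:
        return 1.0 - 2.0 * ((x - a) / (b - a)) ** 2
    return 2.0 * ((x - b) / (b - a)) ** 2


def mu_af(x, a, b):
    """Eq. (16): mirror image of mu_sr about the midpoint (a + b) / 2."""
    return mu_sr(a + b - x, a, b)


a, b = 0.0, 0.6
sr_values = [mu_sr(x, a, b) for x in (0.0, 0.3, 0.6)]
af_values = [mu_af(x, a, b) for x in (0.0, 0.3, 0.6)]
```

Because of the symmetry of (15), the two memberships sum to 1 for every input value.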

The set of if–then rules is then activated: In each rule, the antecedent is the fuzzified input value and the consequent is the linguistic output that reflects the degree of confidence of SR and AF. Each rule is composed of the four fuzzified parameters and combined with the AND operator. The output of each rule is defined by the Gaussian membership function,

$$\begin{aligned} \mu _{k}(y) = \exp \left[ -\frac{\displaystyle (y-c_k)^{2}}{\displaystyle 2\sigma ^{2}}\right] , \quad k=0,\ldots ,C-1, \end{aligned}$$
(17)

where \(c_k\) and \(\sigma ^2\) determine location (output specific) and width, respectively, and \(C\) is the number of linguistic outputs. For each rule, the degree of activated output is determined by the minimum value of each member. For simplicity, all rules are assigned a weight equal to 1.

The inference of a fuzzy block is based on all rules, and therefore the output of the individual rules \(\mu _k(y)\) is combined using the maximum method for accumulation to produce the overall fuzzy output \(\mu _o(y)\). The output value is obtained using the centroid defuzzification method, defined by

$$\begin{aligned} {{\mathcal {O}}} = \frac{\displaystyle \int _{y_{\mathrm{min}}}^{y_{\mathrm{max}}} y \mu _o(y) \hbox {d}y}{\displaystyle \int _{y_{\mathrm{min}}}^{y_{\mathrm{max}}} \mu _o(y)\hbox {d}y}, \end{aligned}$$
(18)

where \(y_{\mathrm{min}}\) and \(y_{\mathrm{max}}\) are the lower and upper limits, respectively, of the overall fuzzy output. The output \({{\mathcal {O}}}\) is a value between 0 and 1 which reflects the likelihood that the detection window contains AF.
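The inference chain described above (minimum for AND, clipping of the Gaussian output set in (17), maximum for accumulation, and the centroid in (18)) may be sketched as follows. The sketch assumes that each linguistic output is clipped at the firing strength of its rule, approximates the integrals by sums on a uniform grid, and uses eight equidistant output locations between 0 and 1 with a width of 0.061 and integration limits \((-0.2,1.2)\); the function names and the firing strengths are hypothetical, and no actual rule table is reproduced.

```python
import math


def rule_strength(memberships):
    """AND of the fuzzified inputs of one rule, taken as their minimum."""
    return min(memberships)


def mamdani_centroid(firing, centers, sigma=0.061, y_min=-0.2, y_max=1.2, steps=1400):
    """Eqs. (17)-(18): clip, accumulate with max, and defuzzify by the centroid.

    firing[k] is the activation assigned to the linguistic output located at
    centers[k]; outputs not triggered by any rule have activation 0.
    """
    h = (y_max - y_min) / steps
    num = 0.0
    den = 0.0
    for i in range(steps + 1):
        y = y_min + i * h
        # Overall fuzzy output: max over the clipped Gaussian output sets
        mu_o = max(min(f, math.exp(-(y - c) ** 2 / (2 * sigma ** 2)))
                   for f, c in zip(firing, centers))
        num += y * mu_o
        den += mu_o
    return num / den if den > 0 else None


centers = [k / 7 for k in range(8)]
strength = rule_strength([0.9, 0.7, 0.8, 1.0])          # one rule, four inputs
o_af = mamdani_centroid([0, 0, 0, 0, 0, 0, 0, strength], centers)
o_sr = mamdani_centroid([strength, 0, 0, 0, 0, 0, 0, 0], centers)
o_tie = mamdani_centroid([0.6, 0, 0, 0, 0, 0, 0, 0.6], centers)
```

When only the outermost output is activated, the centroid lies close to the location of that output, whereas equal activation of the two outermost outputs gives a value midway between them.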

2.5 Detection threshold

Since a short detection window is likely to cause more false alarms, median filtering (whose length is equal to that of the sliding window, i.e., \(M\)) is applied to the output \({{\mathcal {O}}}\) for the purpose of suppressing outlier values (it is recalled that \({{\mathcal {O}}}\) is a signal that results from the sliding window computation). Paroxysmal AF is detected whenever the output of the median filter exceeds a fixed threshold \(\eta\) \((0<\eta <1)\).
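The post-processing step may be sketched as follows, assuming a centered median window that is truncated at the beginning and end of the output sequence (the text does not specify the edge handling). The function name `detect_paf` and the example values are hypothetical.

```python
from statistics import median


def detect_paf(o, m=5, eta=0.5):
    """Median-filter the fuzzy output o with length m and compare with eta."""
    half = m // 2
    # Centered window, truncated at the edges of the sequence (an assumption)
    filtered = [median(o[max(0, i - half): i + half + 1]) for i in range(len(o))]
    return [value > eta for value in filtered]


# An isolated outlier (third value) followed by a sustained rise
o_example = [0.1, 0.1, 0.9, 0.1, 0.1, 0.8, 0.9, 0.9, 0.8, 0.9]
flags = detect_paf(o_example)
```

In this example the isolated outlier is suppressed by the median filter, while the sustained rise at the end of the sequence is flagged.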

2.6 Parameter settings

All parameter values of the detector were determined through experimentation on ECG data which were not part of the performance evaluation. In some cases, the parameter values were identical to those used in previous studies.

Since the goal of the present study is to detect occult PAF, the length of the sliding window was set to only \(M=5\) beats. The ESN was implemented using \(N=100, \lambda =0.999, \alpha =0.8\), and \(D=50\) ms. The PR interval was set to \((n_R, n_P)=(50,250)\) ms when computing \({{\mathcal {P}}}\). The parameters \({{\mathcal {F}}}\) and \({{\mathcal {R}}}\) were computed using the values given in [5] and [18], respectively. The parameter \({{\mathcal {N}}}\) was computed with the integration interval \([\omega _{a,0},\omega _{a,1}]\) set to \([3,12]\,\hbox {Hz}\), reflecting that the AF frequency is usually contained in this interval [28], whereas the noise interval \((\omega _{n,0},\omega _{n,1}]\) was disjoint from it and set to \((12,125]\,\hbox {Hz}\).

A total of 16 fuzzy rules were used. The input membership functions in (15) and (16) are defined by the parameters \(a\) and \(b\), determining the extreme values of the functions. The following values were used: \((a,b)=(-3,0.2)\) for \({{\mathcal {R}}}, (a,b) =(0,0.6)\) for \({{\mathcal {F}}}, (a,b)=(0,0.015)\) for \({{\mathcal {P}}}\), and \((a,b)=(0,2)\) for \({{\mathcal {N}}}\). Equidistant locations were assigned to the Gaussian output membership functions in (17): \(c_k=c_0+k\varDelta c, c_0=0, \varDelta c=0.143\), and \(C=8\); the motivation for choosing \(C\) is presented below. The set of linguistic outputs was defined by four values of SR and four values of AF, i.e., \(\{0,1,2, 3\}\) that reflect the likelihood of SR or AF. For example, the output is labeled SR0 when SR is present with low likelihood, and AF2 when AF is present with rather high likelihood. The width \(\sigma\) was set to 0.061. The integration interval in (18) was set to \((y_{\mathrm{min}}, y_{\mathrm{max}}) =(-0.2,1.2)\). The complete set of fuzzy rules is presented in Table 1. It should be noted that the guiding principle when designing the fuzzy rules is simple: More weight is assigned to \({{\mathcal {R}}}\) and less weight to \({{\mathcal {P}}}\) and \({{\mathcal {F}}}\) when the noise level \({{\mathcal {N}}}\) is high, and vice versa when low.

The detection threshold \(\eta\) was fixed and set to 0.5, a choice based on the distributions of \({{\mathcal {O}}}\) for SR and AF, see the results below.

Table 1 The set of 16 fuzzy rules used for AF detection

3 Performance evaluation

3.1 Development and test datasets

The dataset used for developing the proposed detector was a database previously described in [32], with standard 12-lead ECGs from 211 patients clinically diagnosed with paroxysmal or persistent AF.

Due to the lack of annotated databases with occult PAF, test signals were generated for performance evaluation. The starting point was a set of 100 ECGs selected from the PTB Diagnostic ECG Database [3, 11], containing signals from 50 healthy subjects and 50 patients with myocardial infarction, all with SR, and lasting for about 2 min. The signals, originally sampled at 1,000 Hz, were decimated to 250 Hz to alleviate the computational demands of the ESN [26]. Leads \(\hbox {V}_{1}\) and \(\hbox {V}_{6}\) were selected as target and reference signals, respectively. The original ECG was then subjected to repeated concatenation until at least 1,000 beats were included.

In order to generate signals with PAF episodes, the concatenated ECGs were altered with respect to rhythm and morphology. In PAF episodes, the signal was produced by adding the ventricular activity of the ECG and synthetic f-waves produced by a sawtooth model (once P-waves had been blanked). During SR, the original P-waves were modified to produce a more challenging test signal with larger morphologic beat-to-beat variability. The original RR interval series was replaced by a series produced by a model of either SR or AF. Finally, EMG noise was added at different RMS values to produce the test signal. The “Appendix” provides more information on signal generation.

The capability of \({{\mathcal {N}}}\) to characterize noise, but not f-waves, was investigated using 100 5-s segments each of f-waves extracted from the AF database in [32], and EMG noise extracted from the MIT–BIH Noise Stress Test Database [24]. All 5-s segments were normalized with respect to their RMS value.

3.2 Performance measures

In the present study, the principal performance measure is detection accuracy, denoted \(A\), defined as the number of correctly detected AF and SR episodes divided by the total number of episodes in a signal. Sensitivity is the number of correctly detected AF episodes divided by the total number of AF episodes, whereas specificity is the number of correctly detected SR “episodes” divided by the total number of SR episodes. An episode is considered to be correctly detected if the overlap between annotation and detector output is at least 50 %. The statistical results are expressed as mean \(\pm\) two-sided confidence interval (95 %). All statistical results are based on 100 test signals.
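The episode-based measures may be sketched as follows. The sketch assumes that the overlap is counted in beats, relative to the length of the annotated episode, and that the detector output is available as one label per beat; the function name `episode_scores` and the toy data are hypothetical.

```python
def episode_scores(episodes, detected):
    """Accuracy, sensitivity, and specificity with a 50 % overlap criterion.

    episodes: list of (start, end, label) with end exclusive and label
    'AF' or 'SR'; detected: per-beat list of 'AF'/'SR' detector labels.
    """
    correct = {"AF": 0, "SR": 0}
    total = {"AF": 0, "SR": 0}
    for start, end, label in episodes:
        hits = sum(1 for beat in detected[start:end] if beat == label)
        total[label] += 1
        if hits >= 0.5 * (end - start):  # at least 50 % overlap
            correct[label] += 1
    accuracy = (correct["AF"] + correct["SR"]) / (total["AF"] + total["SR"])
    sensitivity = correct["AF"] / total["AF"] if total["AF"] else None
    specificity = correct["SR"] / total["SR"] if total["SR"] else None
    return accuracy, sensitivity, specificity


annotation = [(0, 10, "SR"), (10, 16, "AF"), (16, 30, "SR")]
output = ["SR"] * 13 + ["AF"] * 13 + ["SR"] * 4
acc, se, sp = episode_scores(annotation, output)
```

In the toy example the AF episode is counted as detected since exactly half of its beats are labeled AF, whereas the second SR episode is missed because most of its beats are labeled AF.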

4 Results

Figure 2 illustrates the performance of the proposed detector: The two AF episodes are correctly detected, including the second episode immediately preceded by APBs and corrupted with EMG noise that drowns the f-waves. It can be noted that \({{\mathcal {N}}}\) is large when noise is present, while it is close to zero when PQRST residuals and f-waves are present (as is the case during the first 15 s of the example).

Fig. 2

The performance of the proposed detector is illustrated on an ECG with two brief episodes of PAF. The first 15 s of the signal is noise-free, then followed by a 10-s burst of EMG noise. The second episode is preceded by two APBs. The output signal \({{\mathcal {O}}}\) is displayed with a thick line whenever the detection threshold is exceeded

To shed further light on how noise is characterized by the parameter \({{\mathcal {N}}}\), it was not only computed for EMG noise but also for f-waves to determine the extent to which f-waves influence \({{\mathcal {N}}}\). Figure 3 shows that \({{\mathcal {N}}}\) is proportional to the noise level, while it is essentially independent of f-wave amplitude.

Fig. 3

a Example of EMG noise and extracted f-waves. b The parameter \({{\mathcal {N}}}\) computed for segments with EMG noise and f-waves

The range of each input membership function was determined by the distributions displayed in Fig. 4a–d, obtained from the AF database in [32]. While none of the parameters \({{\mathcal {R}}}, {{\mathcal {F}}}\), and \({{\mathcal {P}}}\) can individually discriminate AF from SR, Fig. 4f shows that their combination into \({{\mathcal {O}}}\), with \({{\mathcal {N}}}\) taken into account, offers excellent discrimination for \(\eta =0.5\). Figure 4e indicates that the detection accuracy \(A\) is only mildly dependent on the number of linguistic outputs. Eight outputs were used since no further improvement was obtained with additional outputs.

Fig. 4

a–d Distribution of the four fuzzy input parameters during SR and AF. e Beat-by-beat detection accuracy \(A\) as a function of the number of linguistic outputs \(C\). f The resulting distribution of the output \({{\mathcal {O}}}\) for the number of linguistic outputs set to \(C=8\)

Figure 5a displays \(A\) as a function of noise level when episodes with random length are analyzed. In order to show the added value of different features, the following combinations were compared: \({{\mathcal {R}}}, ({{\mathcal {R}}},{{\mathcal {P}}}), ({{\mathcal {R}}},{{\mathcal {P}}},{{\mathcal {F}}})\), and \(({{\mathcal {R}}},{{\mathcal {P}}},{{\mathcal {F}}}, {{\mathcal {N}}})\), i.e., \({{\mathcal {O}}}\).

The results show that the decrease in \(A\) for \({{\mathcal {O}}}\) is just 0.01 when the noise level increases from 20 to \(100\,\upmu \hbox {V}\), and \({{\mathcal {O}}}\) performs better than \({{\mathcal {R}}}\) for all noise levels. The accuracy of \({{\mathcal {R}}}\) is constant because the noise does not influence the RR interval pattern through falsely detected or missed heartbeats. While \({{\mathcal {P}}}\) improves detection performance only for low noise levels (\(<30\,\upmu \hbox {V}\)), the contribution of \({{\mathcal {F}}}\) remains significant up to a noise level of \(90\,\upmu \hbox {V}\). Figure 5b presents \(A\) as a function of noise level, but with 5 % of all beats being APBs. When comparing to the results in Fig. 5a, it is obvious that the performance of all detectors deteriorates when APBs are present; however, the deterioration is more pronounced for \({{\mathcal {R}}}\) as \(A\) drops from 0.97 to 0.88. The performance of \({{\mathcal {O}}}\) remains superior to \({{\mathcal {R}}}\), especially at low noise levels.

The requirement of a reference lead with negligible f-waves may seem to be a major limitation of the proposed method. The results in Fig. 5c indicate, though, that increased f-wave amplitude in the reference lead \(\hbox {V}_{6}\) does not deteriorate \(A\) when the amplitude in the target lead \(\hbox {V}_{1}\) is \(30\,\upmu \hbox {V}\). When the amplitude in \(\hbox {V}_1\) is very small, i.e., \(10\,\upmu \hbox {V}, A\) drops from 0.99 to 0.94.

Fig. 5

Detection accuracy \(A\) as a function of noise level when a no APBs are present, and b when 5 % of all beats are APBs. c Detection accuracy \(A\) as a function of f-wave amplitude in the reference lead \(V_{6}\), presented for two f-wave amplitudes in the target lead \(V_{1}\)

Table 2 displays the performance of the proposed detector for an increasing number of beats in the PAF episodes. The proposed detector was compared to the RR-based detector in [18], using the coefficient of sample entropy as decision parameter, here denoted \({{\mathcal {O}}}_{R}\); the detection threshold used in [18] was also used here. The results of Table 2 show that both \({{\mathcal {O}}}\) and \({{\mathcal {O}}}_{R}\) are capable of detecting all AF episodes for the chosen threshold settings since the sensitivity is equal to 1. When no APBs are present, the accuracy of \({{\mathcal {O}}}\) remains high (0.88) even for episodes with as few as 5 beats. When APBs are present, \({{\mathcal {O}}}_{R}\) has much lower specificity than \({{\mathcal {O}}}\).

Table 2 The influence of episode length on detection accuracy (\(A\)), sensitivity (\(Se\)), and specificity (\(Sp\)) in the absence of APBs, and when 5 % of all beats are APBs

The above results, obtained from a large set of test signals, are complemented by a number of ECG examples. Figure 6a illustrates that \({{\mathcal {O}}}\) has a shorter delay than \({{\mathcal {O}}}_{R}\) when detecting an AF episode. Figure 6b, c illustrate that \({{\mathcal {O}}}\) is more robust to false alarms caused by sudden changes in the RR interval series, here associated with either APBs or respiratory sinus arrhythmia.

Fig. 6

Detection performance on ECGs with a a brief PAF episode, b several APBs (marked with arrows), and c respiratory sinus arrhythmia. Note that b and c do not contain PAF episodes. A thick line of the output indicates that AF is detected

5 Discussion

The goal of this work is to develop a reliable method for detection of occult PAF. With such a detector in long-term monitoring, information on episode pattern can be produced which may help to shed light on clinical challenges such as cryptogenic ischemic stroke. The synergy of the four parameters and the a priori knowledge built into the decision model (cf. Table 1) is the main reason why the proposed detector performs well. Yet, the structure of the present detector is simple since RR irregularity, P-waves, and f-waves are characterized by just one parameter each.

Both the detector in [4] and the proposed detector make use of atrial information, though in quite different ways. Firstly, an f-wave signal can be extracted with the ESN when physiological disturbances such as VPBs are present, thereby precluding the need for ectopic beat detection. Secondly, the inclusion of noise level in the decision process allows the proposed detector to determine whether \({{\mathcal {P}}}\) and \({{\mathcal {F}}}\) can be relied on. The detection of brief episodes was not addressed in [4] since most episodes of the MIT–BIH AF database are much longer than 30 beats, nor was the performance evaluated at different noise levels.

The proposed detector assumes that P-wave absence, f-wave presence, and noise can be quantified from \(\hat{s}(n)\). The feasibility of this assumption is illustrated by the following two examples. Noise appearing in the target signal is not canceled by the ESN, but remains in \(\hat{s}(n)\), see Fig. 7a. On the other hand, noise present in the reference lead does not deteriorate f-wave extraction, see Fig. 7b. Techniques other than the ESN may be considered for handling VPBs, e.g., averaged beat subtraction or spatiotemporal QRST cancelation. These cancelation techniques suffer, however, from the disadvantage of requiring many beats for averaging, and therefore do not perform well when occasional VPBs occur. For this reason, we promote the ESN for PQRST cancelation since accurate f-wave extraction is required when the feature \({{\mathcal {F}}}\) is used.

Fig. 7

Examples of f-wave extraction from an ECG when a the target lead or b the reference lead is noisy

The results show that the proposed detector is robust to noise (Fig. 5a), performs well in the presence of APBs (Fig. 5b), and can detect occult PAF reliably (Table 2). The example in Fig. 2 suggests that the delay in detection is about three beats, and that an episode length of at least five beats is needed for detection. This example also suggests that the detector is operational as early as five beats after the onset of the recording, and thus a lengthy initialization period is not required.

In a recent, interesting paper on ECG signal quality during arrhythmias, Behar et al. [2] explore skewness and kurtosis for noise quantification. These two parameters are not suitable, though, for signals with canceled ventricular activity, and therefore a novel noise parameter \({{\mathcal {N}}}\) was proposed and tested. Still, the main insight of [2] is valid also here, namely that signal quality parameters should be rhythm-specific.

The use of fuzzy logic is attractive since basic knowledge about AF can be easily translated to a set of linguistic rules. The Mamdani-type fuzzy logic does not require training, and its implementation is easily reproduced. On the other hand, the performance of an ANN-based detector depends on the training dataset and, as a consequence, its performance is likely to drop when noisy data are fed to the ANN. The main challenge with fuzzy logic is the selection of appropriate membership functions and rules. Although the present choice of membership functions and rules was heuristic, the performance of \({{\mathcal {O}}}\) was still superior to that of \({{\mathcal {O}}}_{R}\). The number of linguistic outputs \(C\) and the detection threshold \(\eta\) are crucial parameters and were given special attention, cf. Fig. 4e, f; the remaining parameters were determined heuristically from the development dataset.
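To make the notion of Mamdani-type inference concrete, a minimal sketch is given below. It uses min for rule conjunction, max for aggregation, and centroid defuzzification. The triangular membership functions, the three rules, the normalization of all inputs to [0, 1], and the threshold value 0.5 are hypothetical choices made for illustration only; they are not the membership functions, rules, or \(\eta\) of the proposed detector.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b.

    a == b or b == c gives a left or right shoulder, respectively.
    """
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def low(x):
    return tri(x, 0.0, 0.0, 1.0)


def high(x):
    return tri(x, 0.0, 1.0, 1.0)


def mamdani_output(p_absence, f_presence, noise, steps=101):
    """Crisp output in [0, 1] from three inputs normalized to [0, 1]."""
    # Hypothetical linguistic rules; firing strength uses min for AND.
    w_high = min(high(p_absence), high(f_presence), low(noise))  # AF likely
    w_low = min(low(p_absence), low(f_presence), low(noise))     # AF unlikely
    w_med = high(noise)                                          # unreliable
    num = den = 0.0
    for i in range(steps):
        y = i / (steps - 1)
        # Clip each output set at its rule strength, aggregate with max.
        mu = max(min(w_low, tri(y, 0.0, 0.0, 0.5)),
                 min(w_med, tri(y, 0.0, 0.5, 1.0)),
                 min(w_high, tri(y, 0.5, 1.0, 1.0)))
        num += y * mu
        den += mu
    return num / den if den > 0 else 0.5  # centroid defuzzification


ETA = 0.5  # hypothetical detection threshold


def is_af(p_absence, f_presence, noise):
    return mamdani_output(p_absence, f_presence, noise) > ETA


print(is_af(0.9, 0.9, 0.1), is_af(0.1, 0.1, 0.1))
```

No training step is involved: the behavior is fully determined by the membership functions and the rule base, which is why their selection is the main design challenge.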

Other decision techniques may be employed as well, e.g., linear discriminant analysis or artificial neural networks. However, a much larger dataset must then be used for training, especially when the noise level constitutes one of the input parameters, and therefore such techniques were not considered.

A limitation of the present study is that the proposed detector is not evaluated on an ECG database with occult PAF. Since the database must also have at least two ECG leads (one with negligible atrial activity, and the other containing atrial activity), and no such database is yet available with annotations, an approach with test signals has been pursued which still provides valuable insight into performance. For example, the influence of noise can be investigated in situations when the noise level exceeds the f- and P-wave amplitudes. Although noise immunity is a central aspect in long-term monitoring of AF, it has not received much attention in the literature. It should be noted that the present type of test signal preserves the morphologic QRST variability of the original ECG and the relationship between different leads. An alternative approach to performance evaluation may be to consider a database with PAF and manually “edit” all signals so that shorter (occult) episodes are created. However, the present approach offers better control of different signal properties and can produce signals with very challenging properties.

It is obviously desirable to involve more than two detectors in a performance comparison; however, detectors in the literature use window lengths of at least 30 s and are thus unsuitable for occult PAF. The detector by Dash et al. employed a window of 128 beats, implying that PAF episodes shorter than 64 beats could not be detected [9]. A similar observation applies to the detector developed by Huang et al. [12] which employed a window of 100 beats. Hence, a comparison of performance with these two detectors, not designed to detect brief PAF episodes, would be unfair and favor the present detector.

Furthermore, it should be noted that the proposed detector is developed exclusively for analysis of ECG signals. It is not applicable to PAF detection in intracardiac signals, e.g., as studied in [25], since the detector relies on P- and f-wave information.

6 Conclusions

This study shows that the combination of parameters characterizing atrial activity, ventricular activity, and prevailing noise level offers reliable detection of occult PAF. The results show that AF episodes as short as five beats can be detected, and the performance is essentially unchanged for noise levels up to \(100\,\upmu \hbox {V}\) RMS.

The detector is expected to have clinical relevance since brief AF episodes can be reliably detected in asymptomatic cases and trigger an event recorder. The detector should also be suitable for integration in eHealth services where analysis of long-term recordings is offered.