1 Introduction

Characterizing the cardiac response to increased activity or adverse situations is key in the analysis of Heart Failure (HF) etiologies. To this end, the clinical value of stress testing is well established: in clinical practice, standardized stress-invoking protocols (e.g. dobutamine challenge [15], cycling/running with controlled heart rate, exercise time or generated power [3]) are adopted. Because these tests are strictly protocolized, collecting measurements of cardiac function at only a few well-defined time-points/stress levels of the test (e.g. 4 in the case of dobutamine challenges) is sufficient to analyze how the heart copes with the induced stress. These tests are, however, difficult to implement at a large scale (i.e. in all patients), due to the required time, equipment and staff (and thus cost). Handgrip [7] or cold pressor testing [14] are, on the other hand, cheap, fast, and easy ways to invoke stress. However, they are difficult to standardize, and thus not reproducible with regard to the timings/intensity levels of the stress challenge. This makes the classical 'single time-point' measurements at different stress levels infeasible, and calls for continuous data acquisition throughout the test (e.g. 40–60 cycles). In this context, the assessment of cardiac response to stress needs to be based on the quantification of trends rather than on amplitude differences. Naturally, this kind of acquisition raises other challenges, such as the considerably increased amount of data involved and well-known processing issues, such as image artifacts caused by breathing motion in echocardiographic sequences. In this paper, we propose an approach based on multiview dimensionality reduction for the analysis of response to stress in such contexts. We start by defining features of interest to be collected at each consecutive cycle, for each patient, such as heart rate and deformation features. Then, the approach projects several patients onto a space where the response to stress of each patient is compactly represented as a low-dimensional trajectory that encodes change patterns in the defined features. The main objective, once this space is obtained, is to discriminate healthy and pathological responses, and to reconstruct the patterns in the features of interest that characterize them.

As a first step, a synthetic dataset was generated, so as to obtain a sufficient number of patients to evaluate the performance of the approach in terms of the proposed objectives, i.e. its capability of clustering different types of response and reconstructing the main patterns that characterize them. Then, we used a real echocardiographic sequence, acquired on a healthy volunteer during a cold pressor test, to illustrate the applicability of the proposed approach in a real context.

2 Synthetic Data

The importance of heart rate (HR) and left-ventricular (LV) deformation patterns for the analysis of response to stress has been demonstrated in several clinical studies [6, 15]. In particular, the longitudinal deformation function of the LV is presumed to be one of the earliest to be reduced in several cardiovascular pathologies [8]. For these reasons, we selected HR and the global longitudinal strain (GLS) curve as features of interest to monitor over each patient’s stress test. GLS is here defined as the change in longitudinal size of the LV during the cardiac cycle, relative to the end-diastolic size. The features were collected for each consecutive cycle, namely one HR value and one vector holding the evolution of GLS throughout the cycle.

2.1 Model

The model for generating synthetic GLS curves is based on 4 control points adjusted in time and amplitude, and piecewise cubic spline interpolation. Let us first consider 6 keypoints \(\varvec{p_i} = (t_i, A_i), i = 0,..,5\), of the GLS curve, as illustrated in Fig. 1.

Fig. 1. Synthetic GLS curve model.

Here, \(\varvec{p_0} = (0,0)\) and \(\varvec{p_5} = (t_5,0)\) correspond, respectively, to the start and end points of the cardiac cycle, \(\varvec{p_0}\) coinciding with mitral valve closure. Furthermore, we define \(t_1\) as the instant when GLS slope effectively becomes negative, \(t_2\) as the instant of aortic valve closure (AVC), \(t_3\) as the instant when GLS slope turns positive (start of relaxation), and \(t_4\) as the start of atrial contraction (AC). Since we capture the variation of the HR as a distinct feature, all GLS curves were normalized in time and all \(t_i\) ranged between 0 and 1 for all cycles. Timings and amplitudes of \(\varvec{p_1}\) to \(\varvec{p_4}\) can be adjusted to reflect stress-induced changes as described in the literature [15]. A few additional points were defined to maintain key features of the GLS curves, whose coordinates were defined proportionally to \(\varvec{p_1}\), \(\varvec{p_2}\), \(\varvec{p_3}\) and \(\varvec{p_4}\), and thus passively followed these active control points. Each synthetic GLS curve results then from the piecewise cubic spline interpolation of the set of points defined by the extremes of the cycle \(\varvec{p_0}\) and \(\varvec{p_5}\), the active points \(\varvec{p_1}\) to \(\varvec{p_4}\), and the remaining passive points. Since the extremes are fixed and the passive points change passively with the active points, the GLS curve needs only 8 parameters to be defined, which are the timings and amplitudes of \(\varvec{p_1}\) to \(\varvec{p_4}\).
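As an illustration, the construction above can be sketched in Python as follows. This is a minimal sketch rather than the implementation used in this work: it interpolates only the six keypoints \(\varvec{p_0}\) to \(\varvec{p_5}\) (the passive points are omitted), it assumes natural boundary conditions for the cubic spline, and the keypoint values and the function name `natural_cubic_spline` are hypothetical, chosen only to produce a plausible curve shape. The 75 samples per cycle correspond to the GLS dimensionality reported in Sect. 2.2.

```python
import numpy as np

def natural_cubic_spline(t_knots, a_knots, t_eval):
    """Evaluate a natural cubic spline through (t_knots, a_knots) at t_eval."""
    t = np.asarray(t_knots, dtype=float)
    a = np.asarray(a_knots, dtype=float)
    t_eval = np.asarray(t_eval, dtype=float)
    n = len(t)
    h = np.diff(t)

    # Linear system for the second derivatives m at the knots.
    # Natural boundary conditions (assumption): m[0] = m[-1] = 0.
    M = np.zeros((n, n))
    rhs = np.zeros(n)
    M[0, 0] = M[-1, -1] = 1.0
    for i in range(1, n - 1):
        M[i, i - 1] = h[i - 1]
        M[i, i] = 2.0 * (h[i - 1] + h[i])
        M[i, i + 1] = h[i]
        rhs[i] = 6.0 * ((a[i + 1] - a[i]) / h[i] - (a[i] - a[i - 1]) / h[i - 1])
    m = np.linalg.solve(M, rhs)

    # Locate the interval of each evaluation point and apply the cubic formula.
    idx = np.clip(np.searchsorted(t, t_eval, side="right") - 1, 0, n - 2)
    dl = t_eval - t[idx]          # distance to the left knot
    dr = t[idx + 1] - t_eval      # distance to the right knot
    hi = h[idx]
    return ((m[idx] * dr**3 + m[idx + 1] * dl**3) / (6.0 * hi)
            + (a[idx] / hi - m[idx] * hi / 6.0) * dr
            + (a[idx + 1] / hi - m[idx + 1] * hi / 6.0) * dl)

# Hypothetical keypoints (t_i normalized to [0, 1], A_i in % strain).
t_keys = [0.00, 0.05, 0.38, 0.42, 0.85, 1.00]   # p0 .. p5 timings
a_keys = [0.0, -0.5, -19.0, -19.5, -5.0, 0.0]   # p0 .. p5 amplitudes

cycle_time = np.linspace(0.0, 1.0, 75)
gls_curve = natural_cubic_spline(t_keys, a_keys, cycle_time)
```

Changing the timings and amplitudes of the four active points (the 8 model parameters) and re-evaluating the spline yields a new synthetic cycle; a pathological signature such as PSS would correspond, in this sketch, to moving the strain minimum later than \(t_2\).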

2.2 Dataset Characteristics

To generate physiologically consistent GLS curves, we took as reference previous clinical studies where physiological and pathological responses to stress are characterized [15]. Three types of response to stress were recreated: one normal (Fig. 2a), and two pathological with the following signatures: post-systolic shortening (PSS, i.e. continuation of shortening after AVC – Fig. 2b) and the combination of PSS and prolonged early relaxation/delayed AC (Fig. 2c).

Fig. 2. Three types of responses to stress were considered in the generated synthetic GLS curves: (a) normal, (b) pathological with PSS, (c) pathological with PSS and prolonged early relaxation/delayed AC.

For each type of response, 5 synthetic patients were generated. To include inter-patient and inter-acquisition variability, the parameters of the model, the total number of cycles, the point where stress was introduced, and the time it took to reach peak stress were slightly varied among patients. In terms of HR, the response was considered normal for all patients (\(\approx \)60 beats per minute (bpm) at baseline; \(\approx \)120 bpm at peak stress). The responses were considered to be approximately linear in time from baseline to peak stress. An example of a synthetic sequence of GLS curves and corresponding HR values is illustrated in Fig. 3.
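The linear HR response described above could be generated, for instance, as in the following sketch. The function name `linear_hr_response` and the specific onset and ramp lengths are assumptions made for illustration; only the baseline and peak values (about 60 and 120 bpm) and the linear transition come from the text.

```python
import numpy as np

def linear_hr_response(n_cycles, onset_cycle, ramp_cycles,
                       hr_rest=60.0, hr_peak=120.0):
    """Per-cycle HR: constant at rest, linear ramp, then constant at peak."""
    cycles = np.arange(n_cycles)
    # np.interp holds the first/last value outside the ramp interval.
    return np.interp(cycles,
                     [onset_cycle, onset_cycle + ramp_cycles],
                     [hr_rest, hr_peak])

# Hypothetical patient: 15 cycles, stress introduced at cycle 5,
# peak stress reached 5 cycles later.
hr = linear_hr_response(n_cycles=15, onset_cycle=5, ramp_cycles=5)
```

Inter-patient variability can then be mimicked by drawing `n_cycles`, `onset_cycle` and `ramp_cycles` from small ranges around these values for each synthetic patient.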

Fig. 3. The 2-feature sample representation of a synthetic patient: sequence of GLS curves and respective HR values, from rest to peak stress (blue to green). (Color figure online)

In summary, we generated a synthetic dataset consisting of 15 patients with a 2-feature sample representing each of their (\(\approx \)15) consecutive cycles. The dimensionalities associated with those 2 features are 1 (HR) and 75 (GLS curve). For analyzing the main trends and modes of variation in the data, and how they are clustered, it is convenient to obtain a more compact representation of the data. For that reason, we performed dimensionality reduction.

3 Dimensionality Reduction Methodology

Within dimensionality reduction approaches, unsupervised methods are particularly suited for analyzing the main trends and modes of variation in the data, and for discovering how they are clustered. Furthermore, a non-linear method was preferred, for the sake of robustness to data distribution geometries where linear methods such as Principal Component Analysis [5] might perform poorly. Within non-linear methods, those categorized as graph embedding algorithms (e.g. Isomap [13], Laplacian Eigenmaps [1] (LEM), Locally Linear Embedding [10]) are particularly popular. However, all of the above-mentioned methods are designed for a single multivariate input. Given that the GLS is a multivariate feature with a functional structure, concatenating our two features into a single multivariate input does not seem to be the most appropriate way to deal with our data. Instead, a multiview approach was considered better suited. We thus selected the unsupervised formulation of the Multiple Kernel Learning (MKL) algorithm for dimensionality reduction [9], which can be seen as a multiview generalization of the non-linear method LEM. In addition, MKL has been shown to perform well in several multiview dimensionality reduction problems [9], including cardiovascular applications [11].

3.1 Formalism of Unsupervised MKL

Let us consider N input samples, each one consisting of F uni/multivariate features. For each feature \(f = 1,..., F\), an affinity matrix \(W^f \in \mathbb {R}^{N \times N}\) is computed using the Gaussian kernel, which encodes the similarity among samples. Let us now express \(W^{f}\) as the set of its columns, \(W^{f} = [W_{1}^{f}, ..., W_{N}^{f}], \ W_i^f \in \mathbb {R}^{N }, \ i = 1, ..., N\), and let matrix \(\mathbb {K}^i \in \mathbb {R}^{N \times F}\) be defined as \(\mathbb {K}^{i} = [W_{i}^{1}, ..., W_{i}^{F}]\). The mapping of a sample i to the output space is expressed as

$$\begin{aligned} y_{i} = A^T \mathbb {K}^i \beta \; , \end{aligned}$$
(1)

where A is the projection matrix to the output space and \(\beta \in \mathbb {R}^F\) contains the normalized weights of each feature f in the mapping. Let \(\mathbb {W}\) be a linear or non-linear combination of all the feature-wise similarity matrices \(W^f\) (we used \(\mathbb {W} = \frac{1}{F}\sum _{f} W^{f}\)). The entry \(\mathbb {W}_{ij}\) corresponds then to a similarity coefficient between samples i and j in the input space based on their F features. The goal is to map the data onto a lower-dimensional space where samples that are close in the input space remain close in the output space. Extending the idea of Laplacian Eigenmaps [1], the optimal embedding can be obtained by finding A and \(\beta \) which minimize

$$\begin{aligned} \sum _{ij} \Vert A^T \mathbb {K}^i \beta -A^T \mathbb {K} ^j \beta \Vert ^2 \mathbb {W}_{ij}\; . \end{aligned}$$
(2)

Thus, samples that are close in the input space (high \(\mathbb {W}_{ij}\)) are forced to remain close in the output space, so as to minimize the product \(\Vert y_{i} -y_{j}\Vert ^2 \mathbb {W}_{ij}\). Matrix \(\mathbb {W}\) is often made sparse, so that pairs of samples that are very distant do not contribute to the final projection.
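The construction of the feature-wise affinities \(W^f\) and of the sparse global affinity \(\mathbb {W}\) can be sketched as follows. This is an illustrative sketch under stated assumptions: the text only specifies a Gaussian kernel, so the form \(\exp (-d^2/2\sigma ^2)\) with Euclidean distance d, the row-wise "keep the highest entries" sparsification followed by symmetrization, and the function names are all choices made here for illustration.

```python
import numpy as np

def gaussian_affinity(X, sigma):
    """Gaussian-kernel affinity W^f between the N rows of one feature X."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)   # scalars become (N, 1)
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(sq_dist, 0.0)
    sq_dist = np.maximum(sq_dist, 0.0)                   # guard against round-off
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def combine_and_sparsify(affinities, keep_fraction=0.1):
    """Average the W^f into W, then keep only the highest entries of each row."""
    W = np.mean(affinities, axis=0)                      # W = (1/F) sum_f W^f
    n = W.shape[0]
    n_keep = max(1, int(round(keep_fraction * n)))
    W_sparse = np.zeros_like(W)
    for i in range(n):
        top = np.argsort(W[i])[-n_keep:]                 # indices of largest W_ij
        W_sparse[i, top] = W[i, top]
    # Symmetrization (assumption): keep an edge if either endpoint selected it.
    return np.maximum(W_sparse, W_sparse.T)
```

For two views, a call such as `combine_and_sparsify([W_gls, W_hr], keep_fraction=0.1)` returns a symmetric matrix in which each sample is connected only to its most similar samples.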

Lin et al. [9] proposed an iterative two-step approach that alternately solves the minimization for \(\beta \) and for A. To better control and understand the effects of weighting the features in the obtained projections, we removed \(\beta \) from the minimization arguments, tuned its value, and solved the minimization of (2) for A through a generalized eigenvalue problem (first step of the minimization strategy proposed by Lin et al. [9]). The first dimensions of the obtained space correspond to the eigenvectors with the lowest associated eigenvalues, and encode the main modes of variation of the data.
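With \(\beta \) fixed, \(\mathbb {K}^i \beta \) is simply column i of the combined kernel \(K_\beta = \sum _f \beta _f W^f\), and (2) reduces to a trace minimization involving the graph Laplacian \(L = D - \mathbb {W}\), where D is the diagonal degree matrix of \(\mathbb {W}\). The sketch below solves the resulting generalized eigenvalue problem with NumPy only. It is a hedged illustration, not the code used for the experiments: the scale constraint \(A^T K_\beta D K_\beta ^T A = I\) follows the usual Laplacian Eigenmaps convention and is an assumption here, as are the small regularization term `reg` and the function name `mkl_projection`.

```python
import numpy as np

def mkl_projection(affinities, beta, W, n_dims=3, reg=1e-6):
    """Solve (K L K^T) a = lambda (K D K^T) a for fixed beta; return A, Y, eigenvalues."""
    beta = np.asarray(beta, dtype=float)
    # Combined kernel: column i equals K^i beta.
    K = np.tensordot(beta, np.asarray(affinities, dtype=float), axes=1)
    D = np.diag(W.sum(axis=1))
    L = D - W                                   # graph Laplacian of W
    S_W = K @ L @ K.T
    S_D = K @ D @ K.T
    # Regularization (assumption) so that the Cholesky factorization is stable.
    S_D = S_D + reg * np.trace(S_D) / len(S_D) * np.eye(len(S_D))

    # Reduce to a standard symmetric problem: C^-1 S_W C^-T z = lambda z.
    C = np.linalg.cholesky(S_D)
    M = np.linalg.solve(C, np.linalg.solve(C, S_W).T).T
    M = 0.5 * (M + M.T)                         # remove round-off asymmetry
    eigvals, Z = np.linalg.eigh(M)              # eigenvalues in ascending order

    A = np.linalg.solve(C.T, Z[:, :n_dims])     # back-transform: a = C^-T z
    Y = K.T @ A                                 # row i is y_i = A^T K^i beta
    return A, Y, eigvals[:n_dims]
```

Row i of `Y` holds the output-space coordinates of sample i, ordered by increasing eigenvalue. Depending on the kernel, the very first eigenvector may be close to a constant embedding; whether to discard it is left open in this sketch.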

3.2 Multiscale Kernel Regression

After the optimal mapping is obtained, multiscale kernel regression (MKR) [2, 4] can be used to associate an output-space sample with its corresponding form in the input feature space. This is done based on the similarity of that output-space sample to all others in the output space, and on their known representation in the input space. By studying how the input-space representation of a sample changes as it moves along a dimension of the output space, we can analyze the modes of variation of each input feature encoded in that dimension, and relate output-space trajectories to specific patterns in the input features.
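The idea can be illustrated with the simplified sketch below, which is not the exact algorithm of [2, 4]: it fits the residual of the input feature with Gaussian kernels of successively halved bandwidth, using a small ridge term for stability. The bandwidth schedule, the ridge term, the function names, and the choice of holding the other output dimensions at their mean when extracting a mode are all assumptions made for illustration.

```python
import numpy as np

def _gauss(P, Q, sigma):
    """Gaussian kernel between the rows of P and the rows of Q."""
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_multiscale_regression(Y, X, sigma0, n_scales=6, ridge=1e-6):
    """Fit X (N x d_in) from output coordinates Y (N x d_out), coarse to fine."""
    residual = np.asarray(X, dtype=float).copy()
    levels = []
    for s in range(n_scales):
        sigma = sigma0 / 2 ** s                      # halve the bandwidth
        G = _gauss(Y, Y, sigma)
        coeffs = np.linalg.solve(G + ridge * np.eye(len(Y)), residual)
        residual = residual - G @ coeffs             # what is left to explain
        levels.append((sigma, coeffs))
    return levels

def predict(levels, Y_train, Y_new):
    """Reconstruct the input feature at new output-space points."""
    return sum(_gauss(Y_new, Y_train, sigma) @ c for sigma, c in levels)

def mode_along_dimension(levels, Y_train, d, steps=(-2, -1, 0, 1, 2)):
    """Input-space patterns at multiples of the std along output dimension d."""
    center = Y_train.mean(axis=0)
    probes = np.tile(center, (len(steps), 1))
    probes[:, d] += np.asarray(steps) * Y_train[:, d].std()
    return predict(levels, Y_train, probes)
```

Calling `mode_along_dimension` for a given dimension returns one reconstructed feature (for example one GLS curve) per step, which is the kind of sequence displayed as a mode of variation.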

3.3 Generalizing for Multiple Views and Multiple Samples per Subject

MKL has been applied before in echocardiography by Sanchez-Martinez et al. [11] for the analysis of the main modes of variation in the myocardial velocity traces among healthy patients and patients suffering from HF with preserved ejection fraction, under rest and stress conditions. In [11], each patient was represented by one sample with 6 different views, which included 4 cycle-wise velocity curves (basal/septal regions of the LV at rest/submaximal exercise) and 2 vectors providing information on the timing of cardiac phases.

In this paper, in addition to multiple patients, we consider multiple temporal samples per patient, which draw a trajectory from rest to stress. In this context, we are interested in analyzing the modes of variation in the input features both among patients and over time, or, in other words, in analyzing and comparing patient trajectories. For that, we need to map all the patients onto the same space. Assuming that all patients lie on a common manifold, one possible approach is to apply MKL using as inputs affinity matrices \(W^f\) that compare all samples (i.e. cardiac cycles) of all patients. More specifically, each input sample consists of two views (GLS and HR) from a cycle c of a patient p, and it is indexed according to \(i(p,c) = \sum _{q=1}^{p-1} N_{q} + c\), where \(N_q\) represents the total number of cycles of patient q. Conveniently, this approach does not require an equal number of samples from the different patients, nor identical sampling grids.
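The indexing \(i(p,c)\) and its inverse can be written in a few lines, as in the sketch below. The function names and the example cycle counts are hypothetical; the indices are 1-based, as in the formula above.

```python
def sample_index(p, c, cycles_per_patient):
    """Global 1-based index i(p, c) = sum_{q<p} N_q + c."""
    if not 1 <= p <= len(cycles_per_patient):
        raise ValueError("patient index out of range")
    if not 1 <= c <= cycles_per_patient[p - 1]:
        raise ValueError("cycle index out of range")
    return sum(cycles_per_patient[:p - 1]) + c

def patient_and_cycle(i, cycles_per_patient):
    """Inverse mapping: recover (p, c) from the global index i."""
    if i < 1:
        raise ValueError("sample index out of range")
    for p, n_cycles in enumerate(cycles_per_patient, start=1):
        if i <= n_cycles:
            return p, i
        i -= n_cycles
    raise ValueError("sample index out of range")

# Hypothetical example: three patients with different numbers of cycles.
cycles = [15, 14, 16]
first_of_second = sample_index(2, 1, cycles)
```

The inverse mapping is what allows the rows of the projected data to be regrouped into one trajectory per patient, even when the patients contribute different numbers of cycles.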

In this context, the proposed approach comprises the following steps:

  1. Collecting the 2-view (HR value and GLS curve) sample corresponding to each of the consecutive cycles of each of the patients;

  2. Building feature-wise affinity matrices \(W^f\), \(f = \{GLS, HR\}\) comparing all samples of all patients;

  3. Tuning Gaussian kernel bandwidths (\(\sigma ^f\)) and sparsity of \(\mathbb {W}\) to adjust the sensitivity of the algorithm to the order of amplitudes of the sought modes of variation;

  4. Normalizing the affinity matrices by variance before being fed to the MKL algorithm;

  5. Tuning \(\beta \) and finding the projection matrix A of the data which minimizes the objective function in (2);

  6. Applying data projection;

  7. Performing MKR to obtain the modes of variation encoded in each dimension of the new space.

Once all samples are projected to the output space, the temporal trajectory of each patient in the new space can be obtained by connecting that patient's samples over time. These trajectories can be analyzed dimension-wise. Then, within the first dimensions (which encode the main variations in the data), we can perform a combined analysis of the trajectories and the corresponding modes to search for those most relevant for the characterization/discrimination of responses to stress.

4 Experiments with Synthetic Data

In the experiments with synthetic data, HR and GLS curves for each of the (\(\approx \)15) consecutive cycles of each of the 15 synthetic patients were collected as described in Sect. 2.2. We then applied our MKL extension described in Sect. 3.3, each HR value and GLS curve per cycle/patient being considered as an input sample. MKL projects these input samples to a low-dimensional space, where trajectories over time can be reconstructed. Figure 4 shows, in the right column, these trajectories for some dimensions of the MKL output space (#1, #4, #5). In these plots, each curve represents one patient, and the color corresponds to the patient class (healthy/PSS/PSS+AC label, see Sect. 2.2). To interpret what each dimension of the output space relates to in the input samples, we applied MKR as described in Sect. 3.2. Through MKR, we were able to reconstruct the modes of variation in GLS curves and HR associated with each dimension of the output space. In the left and middle columns of Fig. 4 we show the results regarding three GLS and HR modes that we considered important in the characterization/discrimination of different types of response to stress. Trajectories over time (right column) for each mode, combined with a physiological interpretation of the mode, can reveal important trends in the data. For example, in the first dimension, the trajectory plot shows that the 3 groups of 5 patients follow a similar upwards trajectory over consecutive cycles. The HR and GLS modes associated with this dimension indicate that it encodes a mix of the different pathological responses included in the database (PSS, AC amplitude and timing). The colorbar on the right maps the trajectory to the corresponding GLS curves and HR values. By looking at the colorbar, we can relate the shift in the output coordinate to the color shift of the GLS curves and HR values: it reveals an increase in GLS peak amplitude and in HR over time as the main factor of response to stress. As this mode is common to both healthy and pathological patients, further modes are needed to differentiate the populations. In the \(4^{th}\) and \(5^{th}\) dimensions of the output space, trajectories diverge over time between healthy and pathological populations. The \(4^{th}\) dimension represents increasing levels of PSS, whereas the \(5^{th}\) dimension corresponds to increasing AC delays in the pathological trajectories. These results show that, while considerably reducing the dimensionality of the data (we moved from a space where each sample consisted of 2 views, represented by a scalar and a 75-sized vector, to a space where each sample is represented by a single 3-coordinate vector), we can reconstruct important patterns of response to stress.

Fig. 4. Synthetic data: modes of variation of GLS curves and HR and patient trajectories over 3 dimensions of the output space. MKL parameters: \(\beta _{GLS}\) = 0.98; \(\beta _{HR}\) = 0.02; \(\sigma ^f\): average f-wise 4-NN distance; sparsity of \(\mathbb {W}\): for each sample i, the 10% highest \(\mathbb {W}_{ij}\) entries were preserved. MKR parameters: The GLS and HR modes were obtained at increments of the standard deviation \(\sigma _d\) (from \(-2\sigma _d\) to \(2\sigma _d\)) for each dimension d. The colorbars link output coordinates to GLS curves and HR values. (Color figure online)

Fig. 5. Mapping of synthetic patients based on their output-space k-NN distances to each patient group (\(k=3\)).

Furthermore, to investigate how patients were clustered in the output space, we computed the distance of each patient to each group in a leave-one-out experiment. The trajectories of each patient were first averaged in the output space to obtain a single output point per patient. Then, for each patient, the averages of the distances to the k nearest neighbors (k-NN) within each group were used as patient-group distance estimates. A scatter plot of these distances is shown in Fig. 5, suggesting that the output space is able to discriminate the 3 groups defined in Sect. 2.2.
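The patient-group distance estimate described above can be sketched as follows. This is an illustrative sketch: the Euclidean distance in the output space and the function name `patient_group_distances` are assumptions, and the leave-one-out step is implemented simply by excluding the patient itself from its own group before selecting the nearest neighbors.

```python
import numpy as np

def patient_group_distances(trajectories, labels, k=3):
    """Mean distance from each patient to its k nearest neighbors in each group.

    trajectories: list of (n_cycles_p x n_dims) arrays, one per patient.
    labels: group label of each patient.
    Returns the sorted group names and an (n_patients x n_groups) array.
    """
    # One output-space point per patient: the average of its trajectory.
    points = np.array([np.mean(t, axis=0) for t in trajectories])
    labels = np.asarray(labels)
    groups = sorted(set(labels.tolist()))
    dists = np.zeros((len(points), len(groups)))
    for i, point in enumerate(points):
        d = np.linalg.norm(points - point, axis=1)
        for g_idx, g in enumerate(groups):
            members = labels == g
            members[i] = False                  # leave the patient itself out
            nearest = np.sort(d[members])[:k]   # k nearest within the group
            dists[i, g_idx] = nearest.mean()
    return groups, dists
```

Each row of the returned array gives one patient's distance to every group, which is the quantity plotted in a scatter plot such as Fig. 5; a patient is well clustered when its smallest distance is to its own group.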

In conclusion, these results show that the proposed approach succeeds in meeting the initially set objectives: it allows a compact representation of responses to stress in terms of multiple features as low-dimensional trajectories, the clustering of different types of response, and the reconstruction of the patterns in the input features that characterize them.

5 Application to Patient Data

5.1 Collection of the Features of Interest

The proposed approach was then tested on echocardiographic data acquired from one volunteer during a cold pressor stress test, provided by Centre Hospitalier Universitaire de Caen (CHUC). The immersion of a subject's arm in iced water is known to trigger responses in the cardiovascular system, including arteriolar constriction and increased HR [14]. Consequently, blood pressure increases, posing an afterload challenge to the LV. The echocardiographic recording consisted of over 4000 apical 4-chamber view frames corresponding to about 60 consecutive cycles, and the respective ECG traces (Fig. 6a). The LV myocardium was segmented on the first frame and its deformation was tracked over consecutive frames using the Sparse Demons registration algorithm [12]. GLS was computed as the relative change in longitudinal size of the LV during the cardiac cycle. The start and end points of cardiac cycles were defined by the timings of the R-peaks of the ECG. An inter-cycle registration was first performed (i.e. among the initial frames of all cycles), followed by the intra-cycle registration (i.e. among consecutive frames within each cycle), so as to limit error accumulation. Motion artifacts were addressed through drift correction. It is worth noting that, given the considerable length of the frame sequence and the breathing motion artifacts that are strongly amplified with stress, achieving good-quality tracking over the whole sequence is a major challenge. HR information was extracted from the ECG. We assume that stress was introduced around the \(30^{th}\) cycle, when HR shows a sudden sharp increase (Fig. 6b).

Fig. 6. Patient data. (a) Echo frame and ECG from a cold pressor test acquisition (an animated version is available at http://goo.gl/WGCJpt). (b) Extracted GLS curves and corresponding HR values, from rest to peak stress (blue to green). (Color figure online)

Fig. 7. Patient data: modes of variation of GLS curves and HR and patient trajectories over the first 2 dimensions of the output space. MKL parameters: \(\beta _{GLS}\) = 0.9; \(\beta _{HR}\) = 0.1; \(\sigma ^f\): average f-wise 6-NN distance; sparsity of \(\mathbb {W}\): for each sample i, the 26% highest \(\mathbb {W}_{ij}\) entries were preserved. MKR parameters: The GLS and HR modes were obtained at increments of the standard deviation \(\sigma _d\) (from \(-2\sigma _d\) to \(2\sigma _d\)) for each dimension d. The colorbars link output coordinates to GLS curves and HR values. (Color figure online)

5.2 Experiments

After the collection of the 2-view samples for each consecutive cycle of the acquisition, the methodological steps in Sect. 3.3 were applied. Given that we had data from one single patient, we sought modes, and trajectories over time in such modes, that correlated with the timing of stress and/or known physiological GLS patterns. In Fig. 7, we observe that the trajectory in the first dimension of the output space is strongly correlated with the timing of stress, as a clear upwards motion starts around the \(30^{th}\) cycle. Interestingly, the corresponding GLS mode of variation reveals two pathological signatures of stress response that had been introduced in the synthetic dataset: PSS and late AC. Looking at the colorbar, an upwards trajectory corresponds to increasing HR and to a reinforcement of these GLS signatures. Indeed, to cope with the acute afterload challenge, the patient's heart developed inotropic mechanisms similar to some typically observed in hypertensive patients (chronic afterload challenge). Given that this is a quite demanding challenge for the heart, it is not uncommon to find traces of these mechanisms even in normal patients. In this context, it is rather how accentuated the pathological signatures are, or their combination with abnormal changes in other features, that distinguishes physiological adaptations from pathological responses. With the second dimension, we illustrate how data artifacts/tracking errors can affect some of the modes: while the trajectory is clearly affected at the time of stress, it oscillates in end-systolic and AC peak amplitudes, preventing trend analysis. Thus, although an acquisition from a single healthy patient was insufficient to recreate the type of analysis carried out with the synthetic data, with this experiment we (i) confirmed that the features used in the synthetic case can be extracted from a real ultrasound acquisition, i.e. the simulated features can be realistically extracted; and (ii) were able to recover patterns of response in the GLS curve that have a clear physiological interpretation.

6 Conclusions

Results suggest that multiview dimensionality reduction may be useful for representing patient response to stress over time as a low-dimensional trajectory encoding fundamental modes of variation in the features of interest, such as global left-ventricular deformation and heart rate. Moreover, it can be used to characterize and discriminate different types of response, as illustrated with a synthetic population. Results of experiments with real data were consistent with typical patterns of response, although some modes of variation and trajectories are naturally disturbed by artifacts in the input data (e.g. breathing). Further work will target reducing their impact on the analysis.