
1 Introduction

Deep brain stimulation (DBS) is a surgical therapy used mainly to treat a variety of neurological disorders in patients who do not respond well to medication. DBS consists of an electrode inserted into the neural tissue of the patient to modulate neural activity with applied electric pulses, whose effect depends on the amplitude, pulse width, frequency, and electrode characteristics [2]. Although the physiological mechanism of DBS remains unclear, its clinical effectiveness is evident [5]. A measure of the effects of DBS resides in the estimation of the volume of tissue activated (VTA), namely, the spatial spread of the direct neural response to external electrical stimulation, represented as a high-dimensional 3D binary (active/non-active) image. Nonetheless, setting the DBS parameters is not a straightforward task, since it depends on both the specialist's expertise and the brain tissue properties of each patient [6].

The VTA and its visualization, jointly with reconstructions of the brain structures surrounding the implanted electrode, have been proposed as an alternative to accelerate the adjustment of stimulation parameters while minimizing the adverse side effects that can occur when they are not carefully tuned [1]. Such a system allows the medical specialist to observe the brain structures that respond directly to the electrical stimulation, so that the clinician can anticipate the possible effects of a given stimulation configuration on the patient. Nonetheless, this approach still involves a heuristic (trial-and-error) search, requiring a high computational load and appropriate expertise from the specialist [6]. The inverse problem of computing a set of specific neuromodulation parameters given a target VTA has received little attention in the DBS literature; in contrast, there is extensive work on estimating the VTA from the stimulation parameters [1, 10]. The authors in [7] introduced a machine learning system to predict the VTA from the DBS parameter space. It allows the user to select a target and computes the correlation between the calculated and the desired VTA; once this correlation is obtained, the algorithm provides a possible configuration of neurostimulation parameters. However, that approach only operates under isotropic conditions, and the system cannot represent high stimulation parameter values and/or more than two active contacts. In [4], our group presented an alternative strategy for DBS parameter estimation from a previously specified VTA, employing a framework based on support vector machines for multi-output regression and classification. However, that strategy was also developed only under isotropic conditions.

In this work, we develop a kernel-based approach for DBS parameter estimation from VTA data. Our data-driven, kernel-based scheme comprises two main stages: (i) kernel-based principal component extraction from VTA samples, and (ii) DBS parameter estimation using kernel-based multi-output regression and classification. Moreover, our technique is developed under both ideal (isotropic tissue conductivities) and realistic (anisotropic tissue conductivities) assumptions. Our aim is to estimate neuromodulation parameters from the planned VTA to support DBS-based treatments. The results obtained show a remarkable reduction of the input VTA dimensionality after applying our feature extraction scheme, ensuring suitable DBS parameter estimation accuracy and avoiding over-fitting. The remainder of this paper is organized as follows: Sect. 2 describes the materials and methods of the introduced approach. Sections 3 and 4 describe the experimental set-up and the results obtained, respectively. Finally, concluding remarks are outlined in Sect. 5.

2 Materials and Methods

Let \({\varvec{X}}\in \{0,1\}^{N\times P}\) and \({\varvec{Y}}\in \mathbb {R}^{N\times Q}\) be a given pair of matrices coding the VTA and the DBS parameter spaces, respectively, holding P axons, Q stimulation parameters, and N samples. The i-th VTA \({\varvec{x}}_i\in \{0,1\}^P\) (\(i\in \{1,2,\dots ,N\}\)) is computed from the DBS configuration in row vector \({\varvec{y}}_i\in \mathbb {R}^Q\). In particular, a six-dimensional DBS parameter vector \({\varvec{y}} = [D_A\, D_W\, c_0\, c_1\, c_2\, c_3]\) is considered (\(Q = 6\)), where \(D_A\in \,\mathbb {R}^+\) refers to the amplitude, \(D_W\in \mathbb {R}^+\) to the pulse width, and \(c_r\) to the r-th contact condition, with \(r\in \{0, 1, 2, 3\}\). Regarding the VTA simulation, given a DBS parameter vector \({\varvec{y}}_i\), a finite element method (FEM) is employed to compute the spatial distribution of the extracellular potential [10]. Then, a model of multicompartment myelinated axons is implemented to determine the axonal response to electric stimulation, and the VTA is computed as the volume generated by the active axons [1]. The i-th VTA \({\varvec{x}}_i\) is built by concatenating the axon states, where the element \({x}_{ip} = 1\) (\(p\in \{1,\dots ,P\}\)) if the p-th axon is activated by the DBS, otherwise, \({x}_{ip} = 0\). Figure 1 shows the VTA estimation sketch.
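To make the data layout concrete, a minimal sketch of the two matrices follows. The sizes and the random "activation" are hypothetical placeholders, not the paper's FEM and axon-model pipeline; only the encoding (binary VTA rows paired with six-dimensional DBS configurations) matches the description above.

```python
import numpy as np

# Hypothetical sizes: P axons per VTA, N simulated samples.
P, N = 1000, 8
rng = np.random.default_rng(0)

# DBS configuration y = [D_A, D_W, c0, c1, c2, c3]: amplitude in volts,
# pulse width in microseconds, and one state per contact.
Y = np.column_stack([
    rng.uniform(0.5, 5.5, N),        # D_A  [V]
    rng.uniform(60.0, 450.0, N),     # D_W  [us]
    rng.integers(-1, 2, (N, 4)),     # contact states c0..c3 in {-1, 0, 1}
])

# Binary VTA matrix X: x_ip = 1 iff axon p is activated under configuration y_i.
# Activation here is random; in the paper it comes from the FEM + axon model.
X = (rng.random((N, P)) < 0.3).astype(np.uint8)
```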

Fig. 1.

VTA estimation main sketch. The VTA is composed of elements labeled as active ‘1’ (green dots) and non-active ‘0’ (red dots). (Color figure online)

With the aim to estimate the neuromodulation parameters from a planned VTA in DBS-based treatments, we introduce a data-driven kernel-based scheme to highlight the relevant relations between the VTA and the DBS parameter spaces. Our approach comprises two main stages (i) VTA feature extraction through kernel-based eigendecomposition, (ii) DBS parameter estimation using kernel-based multi-output regression and classification.

Kernel-Based VTA Feature Extraction. Let \(\phi \negthickspace :\negthickspace \{0,1\}^P\negthickspace \rightarrow \negthickspace \mathcal {H}\) be a nonlinear mapping function that embeds any \({\varvec{x}}\in {\varvec{X}}\) into the element \(\phi ({\varvec{x}})\in \mathcal {H}\) of the Reproducing Kernel Hilbert Space (RKHS) \(\mathcal {H}.\) By assuming that the elements in \(\mathcal {H}\) are centered (\(\sum \nolimits _{i = 1}^{N}{\phi }({\varvec{x}}_{i}) = 0\)), the covariance matrix in the RKHS can be computed as follows: \({\varvec{S}} = ({1}/{N})\sum \nolimits _{i = 1}^{N}{\phi }({\varvec{x}}_{i})\phi ({\varvec{x}}_{i})^\top \), where its eigenvalues \(\lambda _m\in \mathbb {R}^+\) and eigenvectors \({\varvec{v}}_m\in \mathcal {H}\) satisfy \({\varvec{S}}{\varvec{v}}_m = \lambda _m{\varvec{v}}_m\) (\(m\in \{1,2,\dots ,M_P\},\) \({\varvec{v}}_m\ne 0\)). Since all solutions \({\varvec{v}}_m\) lie in the span of \(\{\phi ({\varvec{x}}_i)\}^N_{i = 1}\), we may consider the equivalent system \(\lambda _m\phi ({\varvec{x}}_i)^\top {\varvec{v}}_m = \phi ({\varvec{x}}_i)^\top {\varvec{S}}{\varvec{v}}_m,\) and assume that there exists a coefficient vector set \(\{{\varvec{\alpha }}_m\in \mathbb {R}^N\}^{M_P}_{m = 1},\) such that \({\varvec{v}}_m = \sum \nolimits _{i = 1}^N{{\alpha }^m_i\phi ({\varvec{x}}_i)}\) for all \(\alpha ^m_i\in {\varvec{\alpha }}_m\). Then, an eigenvalue problem is solved to find \({\varvec{\alpha }}_m\) as [9]: \(N\lambda _m{\varvec{\alpha }}_m = {\varvec{K}}{\varvec{\alpha }}_m,\) where \({\varvec{K}}\in \mathbb {R}^{N\times N}\) is a kernel matrix holding elements \(k_{ij} = \kappa _x({\varvec{x}}_i,{\varvec{x}}_j) = \phi ({\varvec{x}}_i)^\top \phi ({\varvec{x}}_j),\) being \(\kappa _x\negthickspace :\negthickspace \{0,1\}^P\times \{0,1\}^P\negthickspace \rightarrow \negthickspace \mathbb {R}^+\) a positive definite kernel. Due to the binary structure of the VTA, a Gaussian kernel is computed from a Hamming-based distance:

$$\begin{aligned} \kappa _x({\varvec{x}}_i,{\varvec{x}}_j) = \exp {\left( \frac{-\mathrm{{d}}^2_h({\varvec{x}}_i,{\varvec{x}}_j)}{2\sigma ^2_x}\right) }, \end{aligned}$$
(1)

where \(\mathrm{{d}}^2_h({\varvec{x}}_i,{\varvec{x}}_j) = \sum \nolimits ^P_{p = 1}{(1\negthickspace -\negthickspace \delta (x_{ip}\negthickspace -\negthickspace x_{jp}))}\) is the Hamming distance operator, \(\delta (\cdot )\) stands for the delta function, and \(\sigma _x\in \mathbb {R}^+\) is the kernel bandwidth. Afterwards, the kernel principal component analysis (KPCA) projection \(z^m_i\in \mathbb {R}\) of \(\phi ({\varvec{x}}_i)\) onto the m-th basis vector in \(\mathcal {H}\) is computed as:

$$\begin{aligned} {z^m_i} = {\varvec{v}}^\top _m\phi ({\varvec{x}}_i) = \sum \nolimits ^N_{j = 1}{\alpha _j^m\kappa _x({\varvec{x}}_j,{\varvec{x}}_i)} \end{aligned}$$
(2)

and the feature extraction matrix \({\varvec{Z}}\in \mathbb {R}^{N\times M_P},\) holding row vectors \({\varvec{z}}_i\in \mathbb {R}^{M_P},\) is built by concatenating the \(M_P\) principal components.
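The feature extraction stage above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: kernel centering is applied explicitly (the derivation assumes centered features in \(\mathcal {H}\)), the bandwidth defaults to the median heuristic used later in the experimental set-up, and for binary vectors the Hamming count serves directly as the squared distance in Eq. (1).

```python
import numpy as np

def hamming_kernel_pca(X, n_components=3, sigma=None):
    """KPCA on binary VTA vectors with a Hamming-based Gaussian kernel (Eqs. 1-2).

    X : (N, P) binary array, one VTA per row.
    Returns Z : (N, n_components) matrix of projected features.
    """
    # d^2_h(x_i, x_j): number of differing axon states per sample pair.
    D2 = (X[:, None, :] != X[None, :, :]).sum(axis=2).astype(float)
    if sigma is None:
        # Median heuristic over pairwise distances (off-diagonal entries).
        d = np.sqrt(D2)
        sigma = np.median(d[np.triu_indices_from(d, k=1)])
    K = np.exp(-D2 / (2.0 * sigma ** 2))
    # Double-center the kernel matrix, since phi(x_i) are assumed centered in H.
    N = K.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N
    Kc = J @ K @ J
    # Solve N * lambda * alpha = Kc @ alpha; keep the leading eigenpairs.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam, A = eigvals[idx], eigvecs[:, idx]
    # Scale so each eigenvector v_m in H has unit norm (alpha^T Kc alpha = 1).
    A = A / np.sqrt(np.maximum(lam, 1e-12))
    # Projections z_i^m = sum_j alpha_j^m k(x_j, x_i)  (Eq. 2).
    return Kc @ A
```

Because the kernel matrix is centered, the extracted components are zero-mean by construction.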

Kernel-Based DBS Parameter Estimation. Given \({\varvec{Z}}\), we build two kinds of vector-valued functions. The former, \(f^R\negthickspace :\negthickspace \mathbb {R}^{M_P}\negthickspace \rightarrow \negthickspace \mathbb {R}^2\), estimates the amplitude and pulse-width DBS parameter vector \({\varvec{y}}^R_i\in \mathbb {R}^2\) in \({\varvec{Y}}^R\in \mathbb {R}^{N\times 2}\) as: \(\hat{{\varvec{y}}}^R_i = f^R({\varvec{z}}_i) = {\varphi ^R}({\varvec{z}}_i){\varvec{W}}\negthickspace +\negthickspace {\varvec{b}}\), where \({\varvec{W}}\in \mathbb {R}^{M_R\times 2},\) \({\varvec{b}}\in \mathbb {R}^2,\) and \(\varphi ^R\negthickspace :\negthickspace \mathbb {R}^{M_P}\negthickspace \rightarrow \negthickspace \mathbb {R}^{M_R}\). Then, a multi-output support vector regression (MSVR) optimization problem can be defined as follows [8]:

$$\begin{aligned} {\varvec{W}}^*,{\varvec{b}}^* = \arg \min _{{\varvec{W}},{\varvec{b}}}{\frac{1}{2}\sum \nolimits _{m = 1}^{2}\Vert {\varvec{w}}_m\Vert ^2 +\gamma _R\sum \nolimits _{i = 1}^N \varsigma (u_i)}, \end{aligned}$$
(3)

where \({\varvec{w}}_m\in \mathbb {R}^{M_R}\) is the m-th column vector of \({\varvec{W}},\) \(\gamma _R\in \mathbb {R}^+\) is a regularization parameter, \(u_i = ({{\varvec{e}}_i^{\top }{\varvec{e}}_i})^{1/2}\), \({\varvec{e}}_i = {\varvec{y}}^R_i\negthickspace -\negthickspace \hat{{\varvec{y}}}^R_i\), and \(\varsigma (u_i) = (u_i\negthickspace -\negthickspace \epsilon )^2\) if \(u_i\negthickspace \ge \negthickspace \epsilon \), otherwise, \(\varsigma (u_i) = 0\) (\(\epsilon \in \mathbb {R}^+\)). Writing Eq. (3) in terms of \({\varvec{\xi }}_m\in \mathbb {R}^{N}\), with \(\pmb {\varphi } = [{\varphi ^R}({\varvec{z}}_1), \ldots ,{\varphi ^R}({\varvec{z}}_N)]^{\top }\in \mathbb {R}^{N\times M_R}\) and \({\varvec{w}}_m = \pmb {\varphi }^{\top }{\varvec{\xi }}_m\), the dual-form prediction is obtained as:

$$\begin{aligned} \hat{{\varvec{y}}}^R_i = {\varvec{k}}^R_i{\varvec{\Xi }}\negthickspace +\negthickspace {\varvec{b}} \end{aligned}$$
(4)

where \({\varvec{\Xi }}\in \mathbb {R}^{N\times 2}\) is a weighting matrix with column vectors \({\varvec{\xi }}_m\) and \({\varvec{k}}^R_i\in \mathbb {R}^N\) is a row vector holding elements: \(k^R_{ij} = \kappa _z({\varvec{z}}_i,{\varvec{z}}_j),\) \((i,j\in \{1,2,\dots ,N\}),\) being \(\kappa _z\negthickspace :\negthickspace \mathbb {R}^{M_P}\negthickspace \times \negthickspace \mathbb {R}^{M_P}\negthickspace \rightarrow \negthickspace \mathbb {R}^+\) a Gaussian kernel function:

$$\begin{aligned} \kappa _z({\varvec{z}}_i,{\varvec{z}}_j) = \exp {\left( \frac{-\mathrm{{d}}^2_e({\varvec{z}}_i,{\varvec{z}}_j)}{2\sigma ^2_z}\right) }; \end{aligned}$$
(5)

notation \(\mathrm{{d}}_e(\cdot ,\cdot )\) stands for the Euclidean distance and \(\sigma _z\in \mathbb {R}^+\). Then, an iteratively reweighted least squares procedure is used to find \({\varvec{\Xi }},{\varvec{b}}\) [8].
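The dual-form prediction of Eq. (4) can be illustrated with a kernel ridge regression stand-in: it shares the same prediction structure \(\hat{{\varvec{y}}}^R_i = {\varvec{k}}^R_i{\varvec{\Xi }} + {\varvec{b}}\) with a Gaussian kernel over the projected features, but replaces the \(\epsilon \)-insensitive MSVR loss and its iteratively reweighted least squares solver [8] with a closed-form ridge solution. A hedged sketch, not the paper's solver:

```python
import numpy as np

def fit_kernel_ridge(Z, Y, sigma_z=None, gamma=1.0):
    """Kernel ridge regression stand-in for the MSVR of Eqs. (3)-(4).

    Z : (N, M_P) projected VTA features; Y : (N, 2) amplitude/pulse-width targets.
    Returns a predict(Z_new) closure implementing y_hat = k^R Xi + b (Eq. 4).
    """
    D2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    if sigma_z is None:
        # Median heuristic, matching the bandwidth choice in the set-up.
        d = np.sqrt(D2)
        sigma_z = np.median(d[np.triu_indices_from(d, k=1)])
    K = np.exp(-D2 / (2.0 * sigma_z ** 2))
    b = Y.mean(axis=0)
    # Ridge-regularized dual weights Xi, one column per output dimension.
    Xi = np.linalg.solve(K + np.eye(len(Z)) / gamma, Y - b)

    def predict(Z_new):
        d2 = ((Z_new[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
        k = np.exp(-d2 / (2.0 * sigma_z ** 2))
        return k @ Xi + b  # Eq. (4): k^R Xi + b
    return predict
```

Unlike the MSVR, this stand-in penalizes all residuals quadratically; the \(\epsilon \)-insensitive loss of Eq. (3) is what yields the sparse support-vector solutions reported in the results.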

Regarding the latter function, \(f^C\negthickspace :\negthickspace \mathbb {R}^{M_P}\negthickspace \rightarrow \negthickspace \{-1,0,1\},\) which computes the DBS contact configuration vector \({\varvec{y}}^C_i\in \{-1,0,1\}^4\) in \({\varvec{Y}}^C\in \{-1,0,1\}^{N\times 4}\), we build a soft-margin support vector classifier (SVC) over \({\varvec{Z}}\) to compute the r-th contact value as:

$$\begin{aligned} \hat{y}^C_{ir} = f^C_r({\varvec{z}}_i) = \sum \nolimits _{j = 1}^{N}\varrho ^r_{j}{y}^C_{jr}\kappa _z({\varvec{z}}_i,{\varvec{z}}_j)+{a}_r, \end{aligned}$$
(6)

where \(\varrho ^r_{j}\in \mathbb {R}\) is the weight of training sample j for the r-th classifier and \({a}_r\in \mathbb {R}\) is a bias term. Thus, each classifier is solved as a quadratic optimization problem from the well-known SVC dual formulation (for details, see [9]).
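A per-contact classification stage in this spirit can be sketched with scikit-learn's `SVC` as a stand-in: one classifier per contact, with the three contact states handled by scikit-learn's built-in one-vs-one scheme rather than by the single dual formulation of Eq. (6). The toy labels and parameter values are illustrative only.

```python
import numpy as np
from sklearn.svm import SVC

def fit_contact_classifiers(Z, Yc, gamma_c=10.0):
    """One soft-margin SVC per DBS contact, a stand-in for Eq. (6).

    Z  : (N, M_P) projected VTA features.
    Yc : (N, 4) contact states in {-1, 0, 1}, one column per contact c0..c3.
    gamma_c plays the role of the SVC regularization parameter.
    """
    clfs = [SVC(kernel="rbf", C=gamma_c).fit(Z, Yc[:, r])
            for r in range(Yc.shape[1])]

    def predict(Z_new):
        # Stack the per-contact predictions into an (n, 4) configuration matrix.
        return np.column_stack([clf.predict(Z_new) for clf in clfs])
    return predict
```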

3 Experimental Set-Up

We built two VTA databases, each generated from 1000 randomly selected combinations of realistic stimulation parameters, with \(c_r\in \{-1,0,1\}\), \(D_A\in [0.5, 5.5]\) [V], and \(D_W\in [60,450]\) \([\mu s].\) Such parameter ranges are relevant in the context of VTA estimation for the Medtronic ACTIVA-RC stimulator. The first database is built for both monopolar and bipolar conditions (one or two active contacts), under the assumption of an isotropic tissue medium (ITM), which is the setting most commonly used in clinical practice. The second database comprises both isotropic and anisotropic tissue medium conditions (IATM), namely, 500 VTAs are computed for each of them. An extracellular potential model is solved for both databases using the COMSOL Multiphysics 4.2 FEM toolbox. For ITM, the electric conductivity of the brain tissue is assumed to be homogeneous and isotropic, while an anisotropic conductivity model is used for IATM. The anisotropic conductivities are obtained from magnetic resonance imaging by means of diffusion tensors: for concrete testing, a DTI30 dataset is processed with the RESTORE (Robust Estimation of Tensors by Outlier Rejection) algorithm and then linearly transformed to conductivity tensors. After that, a model of multicompartment myelinated axons is implemented using NEURON 7.3 as a Python module to determine the axonal response to the electric stimulation. Nevertheless, solving this gold-standard approach for VTA estimation is computationally expensive, so we use a Gaussian process classifier (GPC) to emulate the multicompartment myelinated axon model [3]. In this sense, the multicompartment axon model is executed on a random sample set drawn from the whole axonal population, aiming to simulate the axonal response to the electric stimulation by training the GPC.
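The GPC emulation step can be sketched with scikit-learn's Gaussian process classifier. This is a hedged illustration of the surrogate idea in [3], not the paper's implementation: the per-axon descriptors (`features`) are hypothetical placeholders, and the binary labels would come from NEURON runs on the axon subsample.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

def fit_axon_emulator(features, activated):
    """GPC surrogate for the multicompartment axon model (sketch, after [3]).

    features  : (n, d) per-axon descriptors from the sampled subpopulation
                (hypothetical: e.g. distance to the active contact, D_A, D_W).
    activated : (n,) booleans from running the NEURON axon model on the sample.
    The fitted classifier then predicts activation for the remaining axons.
    """
    gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0))
    return gpc.fit(features, activated)
```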

For all the experiments, we performed a training-testing validation scheme with 30 repetitions, where \(80\%\) of the samples are used as the training set and the remaining \(20\%\) as the testing set. Two kinds of systems are tested. The former, high-dimensional kernel learning (HDKL), does not include the KPCA stage for learning relevant components from the VTAs; instead, the MSVR and the SVC are applied directly to the input set \({\varvec{X}}\) using a Gaussian kernel over the Hamming-based distance. The latter, low-dimensional kernel learning (LDKL), includes the KPCA stage as feature extraction. For both systems, the MSVR algorithm follows an open-source implementation. Furthermore, the kernel bandwidth values are fixed as \(\sigma _x = \mathrm{{med}}(\mathrm{{d}}_h({\varvec{x}}_i,{\varvec{x}}_j))\) and \(\sigma _z = \mathrm{{med}}(\mathrm{{d}}_e({\varvec{z}}_i,{\varvec{z}}_j)),\) respectively, where \(\mathrm{{med}}(\cdot )\) stands for the median operator. Moreover, the number of projected KPCA features is selected from the set \(\{3,4,\dots ,30\}.\) Additionally, the MSVR free parameters are tuned within the following ranges based on the system performance: \(\sigma _z \in [0.5\,\mathrm{{med}}(\mathrm{{d}}_e({\varvec{z}}_i,{\varvec{z}}_j)),\, 1.5\,\mathrm{{med}}(\mathrm{{d}}_e({\varvec{z}}_i,{\varvec{z}}_j))],\) \(\epsilon \in [0.5,2],\) and \(\gamma _R\in [0.5,2].\) Likewise, the SVC regularization parameter \(\gamma _C\) is tuned over \(\{1,10,50,100,1000\}\). For all the datasets provided, we uniformly subsample each VTA vector to obtain the input dimensionalities \(P\in \{17405, 8703, 3481\}\) for ITM and \(P\in \{17924, 8897, 3559\}\) for IATM.

4 Results and Discussion

Figure 2(a) and (b) show the kernel-based eigendecomposition projections for the two VTA datasets studied. Each dot represents a different neurostimulation configuration, where its size is given by the contact condition and its color by the amplitude value. After visual inspection of the ITM results, we note that the kernel-based feature extraction allows differentiating between active and non-active points. Since ITM is computed under isotropic conditions, a smooth data structure is revealed by the KPCA projection. In fact, the projected space encodes the DBS amplitude information, which demonstrates its capability to capture high-dimensional sample relations. Nevertheless, some overlaps appear because different combinations of amplitude and pulse-width values lead to similar VTAs. Regarding the IATM results, the achieved projection exhibits more overlap in terms of contact state and amplitude value in comparison with the ITM results. The fact that both isotropic and anisotropic conditions are included in the same dataset leads to complex relations between VTAs; however, a bottom-to-top amplitude increase is still observed.

Fig. 2.

Kernel-based eigendecomposition results. Dot color represents the amplitude value and dot size the contact setting: active (large dot), inactive (small dot). (a) ITM dataset-Contact 0. (b) IATM dataset-Contact 0. (Color figure online)

Table 1. DBS parameter estimation results for the ITM dataset. Percentage of training-set selected as support vector is shown in parentheses.
Table 2. DBS parameter estimation results for the IATM dataset. Percentage of training-set selected as support vector is shown in parentheses.

Tables 1 and 2 summarize the DBS parameter estimation accuracies obtained for both kernel-based approaches (HDKL and LDKL). The embedding dimensionality in LDKL is varied over \(M_P = \{5, 10, 15, \dots , 70\}\) and the best result is presented for each provided input dimensionality value. As seen, the HDKL approach obtains slightly better results in terms of system accuracy in comparison with the LDKL methodology. However, LDKL extracts a representation space of only \(M_P = 40\) dimensions for both ITM and IATM without significantly affecting the DBS parameter estimation results. Hence, our kernel-based eigendecomposition is able to encode the high-dimensional VTA space in a small number of relevant features, and LDKL reduces the number of required support vectors, avoiding over-fitting. Overall, LDKL performance is above 94\(\%\) and 82\(\%\) in ITM and IATM, respectively, where the highest results correspond to the \(c_0\) and \(c_3\) contacts and the lowest to \(c_1\) and \(c_2\). The latter can be explained by their central position along the DBS lead: the activation of the volume around the \(c_1\) and \(c_2\) positions can be affected by the activity of \(c_0\) and \(c_3\). In this sense, similar VTAs can correspond to different DBS configurations, especially for axon positions around the center of the stimulation device.

5 Conclusions

In this study, we proposed a novel kernel-based approach to estimate DBS parameters from VTA data. The introduced data-driven estimation comprises two main stages: (i) kernel-based feature extraction from VTA samples, and (ii) DBS parameter estimation using kernel-based multi-output regression and classification. In this sense, we carried out a KPCA algorithm over pair-wise Hamming distances between VTAs to extract relevant features from high-dimensional, binary (activated/non-activated) data. Then, an MSVR and an SVC are trained in the projected space to learn the DBS configuration. The problem we describe in this paper has not been studied in depth in the literature: as mentioned in the introduction, there is extensive work on estimating the VTA from stimulation parameters, whereas the inverse problem of computing a set of specific neuromodulation parameters given a desired VTA has received much less attention. The proposed approach is tested under both isotropic and anisotropic conditions to validate its performance under realistic clinical environments. According to the results achieved, a significant reduction of the VTA space is obtained through the kernel-based eigendecomposition analysis, which avoids system over-fitting and ensures stable estimations. As future work, the authors plan to develop other kernel-based eigendecompositions, besides KPCA over Hamming distances, aiming to enhance the system performance in challenging VTA configurations. Moreover, a pre-image extension of the introduced kernel-based extraction will be developed for VTA reconstruction tasks.