1 Introduction

Resting-state functional magnetic resonance imaging (rs-fMRI) has been widely applied as a non-invasive imaging technique for studying the functional organization of the human brain. It was originally designed to detect the variations and covariations of the blood-oxygenation-level-dependent (BOLD) signals mostly related to spontaneous neural activity [1]. The majority of rs-fMRI studies focus on the gray matter (GM), while the rs-fMRI signals in white matter (WM) pathways are treated as noise and artifacts. However, recent studies indicate that WM may also contain meaningful BOLD signals, which carry potentially valuable information complementary to GM-based rs-fMRI studies. Nevertheless, utilizing WM BOLD signals for basic and clinical neuroscience studies is challenging, because the WM vasculature is much less dense than that of GM and the BOLD signal in WM is significantly weaker than in GM [2].

Despite the challenges, attempts have been made to investigate WM fMRI. Early task-based fMRI studies have revealed consistent, reliable task activations in several corpus callosal WM areas linking activated GM structures [3, 4]. Recently, Ding et al. [5] found functional anisotropic patterns in WM by measuring local functional connectivity (FC) with rs-fMRI; these patterns grossly resemble the anisotropic diffusivity reflected by diffusion tensor imaging (DTI) in several major WM structures. They employed the functional correlation tensor (FCT) to capture such anisotropy, allowing functional WM tractography based on rs-fMRI data of a small group of healthy subjects. However, this approach is difficult to apply to larger cohorts, owing to the limited signal-to-noise ratio (SNR) of the WM BOLD signals. Moreover, the FCT estimation method proposed in [5] does not leverage any prior knowledge of DT data that can help overcome the SNR issue. Thus, a robust and reliable FCT estimation technique is important for greater utility of WM anisotropy in neuroscience studies and also as a biomarker for disease diagnosis.

In this paper, we propose a robust FCT estimation technique to address the aforementioned issues. First, we develop a novel patch-based correlation measurement strategy to suppress noise. Second, we propose to leverage the underlying WM fiber orientation information as prior knowledge when calculating the FCT. This is based on the finding that the dominant direction of the local WM FC anisotropic pattern, extracted from rs-fMRI, is roughly consistent with that of the diffusion tensors (DTs) from DTI [5] in major WM fiber structures. Thus, we can improve FCT estimation by increasing the weighting along the dominant directions of the DTs. Ideally, the DTs can be obtained from DTI [6]. When DTI is not available, we employ a learning-based method to predict the DTs from the rs-fMRI data. This is achieved by using random forest regression with a cascaded learning strategy [7] to learn the FC-to-DT mapping [8, 9] from a training dataset containing both rs-fMRI and DTI. For a testing rs-fMRI scan, the learned mapping can then be applied to predict DTs. Also note that, to account for between-tissue differences, the tissue probability features of GM/WM/cerebrospinal fluid (CSF) from T1-weighted MRI are also used to guide the FC-to-DT mapping process.

2 Materials and Methods

Two datasets are employed in this paper: (1) The Human Connectome Project (HCP) [10] dataset and (2) the Alzheimer’s Disease Neuroimaging Initiative Phase-II (ADNI2) dataset [11]. The HCP dataset contains high spatial and temporal resolution rs-fMRI, multi-shell diffusion MRI data, and T1-weighted MRI for each subject. It is hence suitable for training the regression model. The ADNI2 dataset focuses on capturing the progression of mild cognitive impairment (MCI) and early Alzheimer’s disease (AD) with both rs-fMRI and T1 MRI. It contains data for early MCI patients, which are used for validation of the improved FCTs in enhancing AD diagnosis.

2.1 Data Preprocessing

HCP Dataset:

We randomly select 96 subjects from the dataset, all of whom were scanned with a customized Siemens Skyra 3T scanner using the same imaging parameters (rs-fMRI: voxel size = 2 × 2 × 2 mm³, 1200 volumes; DTI: voxel size = 1.25 × 1.25 × 1.25 mm³; T1: voxel size = 0.7 × 0.7 × 0.7 mm³). Note that the first 30 frames of each rs-fMRI scan are discarded to allow for magnetization equilibrium. The first 600 frames (7 min and 12 s) of the remaining data are used to estimate FCTs. The preprocessing of the rs-fMRI and DTI data is based on the HCP pipeline (https://github.com/Washington-University/Pipelines), but modified for our requirements as follows:

(1) The DTs are computed using dtifit in FSL [12]. An average \( b0 \) image is used for inter-modality registration to rs-fMRI using flirt in FSL.

(2) The tissue probability maps for GM/WM/CSF segmentation are obtained from the T1 MRI by using fast in FSL, and are linearly warped to each subject’s own rs-fMRI space using flirt.

(3) FCT computation is performed in the native space of the rs-fMRI per subject. The DTs are warped to each subject’s own rs-fMRI space.

(4) The minimally preprocessed rs-fMRI (in native space) are further band-pass filtered (\( 0.01 \le f \le 0.08 \) Hz). No spatial smoothing is applied. All subjects’ head motion profiles are checked to ensure that they are within an acceptable range.
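To illustrate the band-pass step in (4), the following is a minimal sketch of a hard-cutoff filter in the Fourier domain. It is not the filter used by the HCP or DPARSFA pipelines, whose exact design is not stated here; the function name `bandpass_fft` is hypothetical, and the repetition time (for example 0.72 s, as implied by 600 frames spanning 7 min 12 s) must be supplied by the caller.

```python
import numpy as np

def bandpass_fft(ts, tr, low=0.01, high=0.08):
    """Keep only Fourier components within [low, high] Hz along the last axis.

    ts : array (..., n_frames) of BOLD time courses
    tr : repetition time in seconds (an input assumed by this sketch)
    """
    ts = np.asarray(ts, dtype=float)
    n = ts.shape[-1]
    freqs = np.fft.rfftfreq(n, d=tr)          # frequency of each rFFT bin in Hz
    keep = (freqs >= low) & (freqs <= high)   # pass-band mask
    spec = np.fft.rfft(ts, axis=-1)
    spec[..., ~keep] = 0.0                    # zero everything outside the band
    return np.fft.irfft(spec, n=n, axis=-1)
```

A hard cutoff is the simplest choice and removes the mean (0 Hz) as a side effect; a windowed or Butterworth design would reduce ringing at the cost of a softer transition band.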

ADNI2 Dataset:

Thirty-nine early-stage MCI (eMCI) patients and 42 age- and gender-matched normal controls (NC) are included. The rs-fMRI (TR = 3000 ms, 140 frames, voxel size = 3.3 × 3.3 × 3.3 mm³, eyes open) and T1 MRI (voxel size = 1 × 1 × 1 mm³) are obtained using 3T Philips Achieva scanners. Data preprocessing is conducted using the SPM8 (https://www.fil.ion.ucl.ac.uk/spm/software/spm8/), REST (http://www.restfmri.net/forum/REST_V1.8), and DPARSFA (http://rfmri.org/DPARSF) toolboxes, with procedures similar to those used for the HCP data. T1 MRI is also segmented and coregistered to each subject’s native rs-fMRI space. No subject’s head motion exceeds 2 mm or 2°.

2.2 Regression Forest for FC-to-DT Mapping

We describe here how the DT-like tensors can be estimated from the HCP rs-fMRI data, and how the learned DT-like tensors can be used to guide FCT estimation using the ADNI2 rs-fMRI data. In the training stage, we extract features from randomly selected 3D patches. Using the obtained patch feature vectors, the regression forest is trained to predict the corresponding DT at the center voxel of each patch. In the testing stage, the trained regression model is applied patch-wise to the input image to estimate DT-like tensors.

The feature vector is composed of two types of features: (1) local FC from rs-fMRI and (2) tissue probability maps of WM/GM/CSF from T1 MRI. For rs-fMRI, we follow [5] in computing the local FC as the correlation features. Specifically, we compute the Pearson’s correlation coefficients between the center voxel and its neighboring voxels. Note that, unlike [5], we also include voxels beyond the 26 immediate neighbors. For each of the three probability maps obtained from T1 MRI, we use the 3D Haar-like operators [9] to compute tissue-probability features. The two types of features extracted from the two modalities are then concatenated into a single feature vector.

The process of training the regression forest generally follows the steps in [8, 9]. The major difference here is in the splitting function that is used to split the patch samples in the current node into the left and right child nodes. The splitting criterion is based on one feature, selected by exhaustive search within the feature subset, that maximizes the information gain of the split groups of training patches with respect to their corresponding target values. Specifically, the target DT can be written as a 3 × 3 symmetric matrix, which has six independent components and can be reshaped into a DT vector; therefore, the splitting function produces six estimates of the information gain corresponding to the six elements of the DT target, which are then averaged into an overall information gain to guide the splitting. In this way, the forest can take all the information in the target vector into account when training the regressor. It is worth noting that, by including tissue-probability features, DT-like tensors can be estimated more accurately, because the local FC patterns in GM and WM could be different, and accordingly the “FC-to-DT” mapping for GM voxels could also differ from the “FC-to-DT” mapping for WM voxels. Our experiments show that adding tissue-specific features yields much better DT-like tensor maps for the testing rs-fMRI data.
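The averaged splitting criterion described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of [8, 9]: the per-component information gain is taken to be the variance reduction commonly used in regression forests, candidate thresholds are midpoints between sorted feature values, and the names `split_gain` and `best_split` are hypothetical.

```python
import numpy as np

def split_gain(feature, targets, threshold):
    """Variance reduction per DT component, averaged over all components.

    feature : (n_samples,) values of one candidate feature
    targets : (n_samples, 6) DT vectors (six independent tensor elements)
    """
    feature = np.asarray(feature, dtype=float)
    targets = np.asarray(targets, dtype=float)
    left = feature < threshold
    right = ~left
    if left.sum() == 0 or right.sum() == 0:
        return 0.0                                   # degenerate split
    n = len(feature)
    parent_var = targets.var(axis=0)
    child_var = (left.sum() * targets[left].var(axis=0)
                 + right.sum() * targets[right].var(axis=0)) / n
    gains = parent_var - child_var                   # one gain per DT element
    return float(gains.mean())                       # averaged overall gain

def best_split(features, targets):
    """Exhaustive search over the given feature columns and thresholds.

    The caller is assumed to pass only the (random) feature subset of a node.
    Returns (gain, feature_index, threshold).
    """
    features = np.asarray(features, dtype=float)
    best = (-np.inf, None, None)
    for f in range(features.shape[1]):
        values = np.unique(features[:, f])
        for thr in (values[:-1] + values[1:]) / 2.0:  # midpoints
            g = split_gain(features[:, f], targets, thr)
            if g > best[0]:
                best = (g, f, float(thr))
    return best
```

Averaging the six gains treats all DT elements equally; since diagonal and off-diagonal elements can differ in scale, normalizing each target column beforehand may be worth considering.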

We also incorporate the auto-context model [7, 9] as a cascaded learning strategy to help improve the mapping performance. Specifically, we refine the mapping by cascading multiple stages of regression. The first-stage regressor uses only the correlation and tissue-probability features, while, in the subsequent stages, the context features obtained from the DTs predicted in the previous stage are also considered. Since each DT consists of six elements, the context features are computed using Haar-like operators for each DT element and then concatenated.

2.3 FCT Estimation

We use the ADNI2 data to calculate the FCT with the guidance from the DTs predicted from the rs-fMRI data with the learned mapping model (using the HCP rs-fMRI data). For each voxel \( V_{i} \) from the input rs-fMRI data, the FCT \( \varvec{T}_{i} \) is represented using a 3 × 3 symmetric matrix, which is in the same mathematical form as the DT:

$$ \varvec{T}_{i} = \left[ {\begin{array}{*{20}c} {T_{xx} } & {T_{xy} } & {T_{xz} } \\ {T_{xy} } & {T_{yy} } & {T_{yz} } \\ {T_{xz} } & {T_{yz} } & {T_{zz} } \\ \end{array} } \right]. $$
(1)

To estimate it, the first step is to compute the Pearson’s correlation coefficient \( C_{ij} \) between the center voxel \( V_{i} \) and each of its 26 neighboring voxels \( V_{j} \). To make this step more robust to the noise and artifacts in rs-fMRI, we use a patch-based strategy for the correlation measurement. Let \( Q_{i} \) and \( Q_{j} \) denote the two \( k \times k \times k \) patches centered at voxels \( V_{i} \) and \( V_{j} \), respectively. Here, we set \( k = 3 \), which suits the spatial resolution of commonly used rs-fMRI data such as those in the ADNI dataset. The correlation coefficient \( C_{ij} \) is therefore given as

$$ C_{ij} = \frac{{\mathop \sum \nolimits_{x = 1}^{k} \mathop \sum \nolimits_{y = 1}^{k} \mathop \sum \nolimits_{z = 1}^{k} b\left( {x,y,z} \right) f_{\text{corr}} \left( {Q_{i} \left( {x,y,z} \right),Q_{j} \left( {x,y,z} \right)} \right)}}{{\mathop \sum \nolimits_{x = 1}^{k} \mathop \sum \nolimits_{y = 1}^{k} \mathop \sum \nolimits_{z = 1}^{k} b(x,y,z)}}, $$
(2)

where \( Q\left( {x,y,z} \right) \) is the voxel at location \( \left( {x,y,z} \right) \) of the patch \( Q \), \( f_{\text{corr}} (V_{i} ,V_{j} ) \) is the Pearson’s correlation comparing the time courses of \( V_{i} \) and \( V_{j} \), \( b\left( {x,y,z} \right) = { \exp }\left( { - \frac{{\left( {x - \mu } \right)^{2} + \left( {y - \mu } \right)^{2} + \left( {z - \mu } \right)^{2} }}{{2\rho^{2} }}} \right) \) is the Gaussian kernel used for weighting the correlations, with \( \mu = (k + 1)/2 \) and \( \rho \) as a scaling coefficient. In our study, \( \rho^{2} = 1.25 \) gives the optimal results.
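Eq. (2) can be sketched directly in code. In this minimal sketch the rs-fMRI data are assumed to be stored as a 4D array indexed (x, y, z, time), both patches are assumed to lie fully inside the volume, and the function names are hypothetical. Patch offsets run over \( -1, 0, 1 \) for \( k = 3 \), which matches \( x - \mu \) with \( \mu = (k + 1)/2 \).

```python
import itertools
import numpy as np

def pearson(a, b):
    """Pearson's correlation between two time courses (0 if either is constant)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def patch_correlation(data, vi, vj, k=3, rho2=1.25):
    """Eq. (2): Gaussian-weighted mean of voxel-wise correlations of two patches.

    data   : 4D array (x, y, z, time)
    vi, vj : integer (x, y, z) coordinates of the two patch centers
    """
    half = k // 2
    num = 0.0
    den = 0.0
    for off in itertools.product(range(-half, half + 1), repeat=3):
        # Gaussian kernel b(x, y, z), written in offsets from the patch center
        w = float(np.exp(-sum(o * o for o in off) / (2.0 * rho2)))
        p = tuple(c + o for c, o in zip(vi, off))   # voxel in patch Q_i
        q = tuple(c + o for c, o in zip(vj, off))   # matching voxel in Q_j
        num += w * pearson(data[p], data[q])
        den += w
    return num / den
```

Because corresponding voxels of the two patches are compared pairwise, each \( C_{ij} \) averages \( k^{3} \) correlations measured along the same direction, which is what suppresses voxel-level noise.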

Next, we compute a unit vector \( {\mathbf{n}}_{ij} = \{ n_{ij, 1} ,n_{ij, 2} ,n_{ij, 3} \} \) describing the direction from the center voxel \( V_{i} \) to each of its neighbors \( V_{j} \). The corresponding dyadic tensor \( \varvec{D}_{ij} \) is given as

$$ \varvec{D}_{ij} = \left( {\begin{array}{*{20}c} {n_{ij,1} \cdot n_{ij,1} } & {n_{ij,1} \cdot n_{ij,2} } & {n_{ij,1} \cdot n_{ij,3} } \\ {n_{ij,2} \cdot n_{ij,1} } & {n_{ij,2} \cdot n_{ij,2} } & {n_{ij,2} \cdot n_{ij,3} } \\ {n_{ij,3} \cdot n_{ij,1} } & {n_{ij,3} \cdot n_{ij,2} } & {n_{ij,3} \cdot n_{ij,3} } \\ \end{array} } \right). $$
(3)
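Since the 26 neighbor directions are the same for every voxel, the dyadic tensors of Eq. (3) can be computed once and reused. The sketch below assumes isotropic voxels, so that offsets in voxel units give the correct directions; the function name `neighbor_dyads` is hypothetical.

```python
import itertools
import numpy as np

def neighbor_dyads():
    """Unit vectors to the 26 neighbors and their dyadic tensors n n^T (Eq. 3)."""
    offsets = np.array(
        [o for o in itertools.product((-1, 0, 1), repeat=3) if any(o)],
        dtype=float,
    )                                                  # (26, 3), center excluded
    units = offsets / np.linalg.norm(offsets, axis=1, keepdims=True)
    dyads = np.einsum('na,nb->nab', units, units)      # (26, 3, 3) outer products
    return units, dyads
```

Each dyad is symmetric with unit trace, and by the symmetry of the neighborhood the unweighted sum over all 26 directions is a multiple of the identity, so any anisotropy in the final tensor comes from the correlations and weights alone.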

Third, the orientation information derived from the DT-like tensors is calculated by applying an orientation distribution function (ODF) [13] to obtain the weighting function \( \beta \left( {{\mathbf{n}}_{ij} } \right) = 1/\left( {4\pi Z\left| \varvec{B} \right|^{{\frac{1}{2}}} \left( {{\mathbf{n}}_{ij}^{\text{T}} \varvec{B}^{ - 1} {\mathbf{n}}_{ij} } \right)^{{\frac{1}{2}}} } \right) \), where \( Z \) is a normalization constant and \( \varvec{B} \) is the learned DT represented using a 3 × 3 symmetric matrix.
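The weighting function can be evaluated for all 26 directions at once, as in the sketch below. Two assumptions are made that the text does not specify: \( Z \) is chosen so that the weights sum to one over the 26 neighbor directions, and \( \varvec{B} \) is positive definite and expressed in the same voxel coordinate frame as the directions. The function names are hypothetical.

```python
import itertools
import numpy as np

def unit_directions():
    """Unit vectors from a voxel to its 26 neighbors (isotropic voxels assumed)."""
    offsets = np.array(
        [o for o in itertools.product((-1, 0, 1), repeat=3) if any(o)],
        dtype=float,
    )
    return offsets / np.linalg.norm(offsets, axis=1, keepdims=True)

def odf_weights(units, B):
    """beta(n) = 1 / (4 pi Z |B|^(1/2) (n^T B^-1 n)^(1/2)) for each row n of units."""
    B = np.asarray(B, dtype=float)
    quad = np.einsum('na,ab,nb->n', units, np.linalg.inv(B), units)  # n^T B^-1 n
    raw = 1.0 / (4.0 * np.pi * np.sqrt(np.linalg.det(B)) * np.sqrt(quad))
    return raw / raw.sum()   # assumption: Z normalizes the 26 weights to sum to 1
```

For an isotropic \( \varvec{B} \) all directions receive the same weight, while for an elongated tensor the directions closest to its principal axis receive the largest weights.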

Finally, we compute the robust FCT \( \varvec{T}_{i} \) by summing up all the dyadic tensors \( \varvec{D}_{ij} \) with their respective correlation coefficients \( C_{ij} \) and corresponding weighting coefficients \( \beta \left( {{\mathbf{n}}_{ij} } \right) \):

$$ \varvec{T}_{i} = \sum\nolimits_{j} {C_{ij} \varvec{D}_{ij} \,\beta ({\mathbf{n}}_{ij} )} . $$
(4)

In this way, the dyadic tensors aligned with the main directions of the DT-like tensor receive higher weights \( \beta \left( {{\mathbf{n}}_{ij} } \right) \) than those in other directions. The overall framework of FCT computation is summarized in Fig. 1.
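The summation in Eq. (4) reduces to a single tensor contraction once the correlations, weights, and directions are available. The following minimal sketch assumes they are given as arrays ordered consistently over the 26 neighbors; the function names are hypothetical.

```python
import itertools
import numpy as np

def unit_directions():
    """Unit vectors from a voxel to its 26 neighbors (isotropic voxels assumed)."""
    offsets = np.array(
        [o for o in itertools.product((-1, 0, 1), repeat=3) if any(o)],
        dtype=float,
    )
    return offsets / np.linalg.norm(offsets, axis=1, keepdims=True)

def fct_from_neighbors(corrs, weights, units):
    """Eq. (4): T_i = sum_j C_ij * beta(n_ij) * n_ij n_ij^T.

    corrs, weights : (26,) patch correlations C_ij and ODF weights beta(n_ij)
    units          : (26, 3) unit direction vectors n_ij
    """
    return np.einsum('n,n,na,nb->ab', corrs, weights, units, units)
```

With uniform weights this reduces to the unweighted sum of correlation-scaled dyads; the prior therefore only reshapes the tensor where the DT-like tensor is anisotropic.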

Fig. 1. The overall pipeline for robust FCT computation.

3 Experimental Results

We demonstrate the validity of our proposed framework by evaluating both the learned DT-like tensors and the final FCTs. For the HCP dataset, which is used to learn the regression model, we first assess the accuracy of the learned DT-like tensors by comparing them with the actual DTs derived from DTI. This is done using 4-fold cross-validation on the HCP dataset. The parameters for training the regression model are identical in all folds. From each rs-fMRI scan, we extract 20,000 patches of size 11 × 11 × 11 voxels. The number of correlation features for each patch is set to 1000, and the number of tissue-probability features for each segmented tissue map is also set to 1000. The trained regression forest has 20 trees, and the minimum number of samples in a leaf node is set to 8. Note that when implementing the cascaded learning strategy, we connect three regression models. The maximum tree depth is 30 in the first regression model, as it is trained without context features, and 33 in each of the later stages.

We evaluate the similarity between the predicted DTs at different stages of the cascade and the actual DTs by measuring the Pearson’s correlation between their fractional anisotropy (FA) maps. The overall correlation coefficient without the cascade is 0.877 ± 0.015, which improves to 0.894 ± 0.015 with the cascade. This shows the validity of the mapping and the effectiveness of the cascade. Furthermore, Fig. 2 shows the FA maps computed from the predicted DTs using the two different configurations, as well as the actual FA map from DTI for reference.
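For reference, the FA of a tensor with eigenvalues \( \lambda_{1}, \lambda_{2}, \lambda_{3} \) and mean \( \bar{\lambda} \) is \( \sqrt{3/2}\,\sqrt{\sum_{m} (\lambda_{m} - \bar{\lambda})^{2}} / \sqrt{\sum_{m} \lambda_{m}^{2}} \). The sketch below applies this standard definition to a stack of symmetric tensors and correlates two FA maps; the function names and the optional mask argument are assumptions, since the voxels included in the reported correlation are not specified.

```python
import numpy as np

def fractional_anisotropy(tensors):
    """FA for an array of symmetric 3 x 3 tensors with shape (..., 3, 3)."""
    tensors = np.asarray(tensors, dtype=float)
    lam = np.linalg.eigvalsh(tensors)                  # eigenvalues, (..., 3)
    mean = lam.mean(axis=-1, keepdims=True)
    num = np.sqrt(((lam - mean) ** 2).sum(axis=-1))
    den = np.sqrt((lam ** 2).sum(axis=-1))
    # a zero tensor has num == 0 as well, so dividing by 1 there yields FA = 0
    return np.sqrt(1.5) * num / np.where(den > 0, den, 1.0)

def fa_map_correlation(fa_a, fa_b, mask=None):
    """Pearson's correlation between two FA maps, optionally within a mask."""
    a = np.asarray(fa_a, dtype=float)
    b = np.asarray(fa_b, dtype=float)
    if mask is not None:
        a, b = a[mask], b[mask]
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])
```

The same FA definition applies to DTs and FCTs because both are 3 × 3 symmetric matrices.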

Fig. 2. The FA maps from the DTI-like tensors and actual DTI (used as reference).

In the second experiment, we show the generalizability of the trained regression model (based on the HCP dataset), by directly applying it to the ADNI2 dataset for robust FCT estimation. Figure 3 shows the FA maps using the original FCT calculation method proposed in [5] and using our proposed FCT estimation method. It can be observed that noise is significantly reduced with our method, and the estimated FA map is more reasonable, i.e., with high FA values in the major WM structures (such as the genu and splenium parts of corpus callosum) compared with the FA in the GM regions.

Fig. 3. The FA maps of the obtained FCTs using the method of Ding et al. (left) and our proposed method (right).

In the third experiment, we further evaluate the validity of our method by using the resultant FCTs from both eMCI and NC subjects in ADNI2 as features for early AD diagnosis. Specifically, the FA maps computed in the native space from the FCTs based on the ADNI2 rs-fMRI are non-rigidly registered to the standard MNI-152 space using SPM8. Next, an in-house WM fiber bundle probability template, consisting of 359 major WM segments linking 359 pairs of Automated Anatomical Labeling (AAL) brain regions and generated from the DTI data of 500 HCP subjects, is applied to each subject’s registered FA map. The fiber-probability-weighted average FA and the weighted variance of FA values in each of the 359 WM segments are computed as features for subsequent classification. In this way, each subject has two 359-by-1 feature vectors (corresponding to the weighted mean FA and the weighted FA variance obtained from FCTs). LASSO-based feature selection [14] is applied to the two feature vectors separately. Two support vector machine (SVM) classifiers [15] are then trained, one on each selected feature set. The prediction scores from the two classifiers are fused to give the final classification result. Leave-one-out cross-validation is used to evaluate classification performance.
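The two features per WM segment can be sketched as below. The weighted variance is assumed here to be the probability-weighted mean squared deviation from the weighted mean, normalized by the sum of weights; the exact estimator is not stated in the text, and the function name is hypothetical.

```python
import numpy as np

def weighted_fa_features(fa, prob):
    """Fiber-probability-weighted mean and variance of FA in one WM segment.

    fa   : FA values of the registered FA map (any shape)
    prob : fiber-bundle probabilities of the segment, same shape as fa
    """
    fa = np.asarray(fa, dtype=float).ravel()
    w = np.asarray(prob, dtype=float).ravel()
    total = w.sum()
    if total <= 0:
        return 0.0, 0.0                       # segment not covered by the map
    mean = (w * fa).sum() / total
    var = (w * (fa - mean) ** 2).sum() / total
    return float(mean), float(var)
```

Applying this function to each of the 359 segment probability maps yields the two 359-element feature vectors per subject.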

Experiments show that, with FCTs from rs-fMRI, even when features are extracted from only several major WM structures and fed into a simple classifier, the accuracy (ACC) and the area under the ROC curve (AUC) for eMCI classification reach a satisfactory level (72.84% and 73.63%, respectively). In contrast, with the original FCT calculation method [5], the performance is lower (ACC = 67.90% and AUC = 64.53%). The improvements achieved by our proposed FCT calculation method are also visualized using ROC curves in Fig. 4.

Fig. 4. The ROC curve for the eMCI-NC classification using Ding et al.’s method and our proposed method, respectively.

4 Conclusion

In this work, we have presented a novel framework for robust FCT estimation. First, based on high-resolution rs-fMRI and DTI data, we employ a regression forest to predict DTs by using both local temporal correlation features from rs-fMRI and tissue-probability features from T1 MRI. Then, the predicted DTs are used as a prior to improve FCT estimation. In the experiments, we have also demonstrated that the resulting FCTs can be used as features for the diagnosis of eMCI.