
1 Introduction

Normative modelling is a popular method for studying heterogeneous brain disorders. Normative models assume disease cohorts sit at the tails of a healthy population distribution and quantify individual deviations from healthy brain patterns. Typically, a normative analysis constructs one normative model per variable, e.g., using Gaussian Process Regression (GPR) [9]. Recently, to model complex non-linear interactions between features, deep-learning approaches using adversarial autoencoder (AAE) and variational autoencoder (VAE) models have been proposed [8, 11]. These models have a uni-modal structure with a single encoder and decoder network. So far, almost all deep-learning normative models have modelled only one modality. However, many brain disorders show deviations from the norm in features of multiple imaging modalities, to varying degrees, and it is often unknown which modality will be the most sensitive. Thus, it is advantageous to develop normative models suitable for multiple modalities.

Most previous deep-learning normative and anomaly detection models measure deviations in the feature space [4, 7, 11]. However, for multi-modal models built from modalities containing highly different but complementary information (e.g., the T1 and DTI features used here), we may not expect to see significantly greater deviations in the feature space compared to uni-modal methods. Indeed, previous work has shown that, when using VAEs, even for one modality, measuring deviation in the latent space outperforms metrics in the feature space [8] and provides a single measure of abnormality. As such, we develop a latent deviation metric suitable for measuring deviations in multi-modal data.

There are many approaches to extending VAEs to integrate information from multiple modalities and learn informative joint latent representations. Most multi-modal VAE frameworks learn separate encoder and decoder networks for each modality and aggregate the encoding distributions to learn a joint latent representation. Wu and Goodman [16] introduced a multi-modal VAE (mVAE) where each encoding distribution is treated as an ‘expert’ and the Product-of-Experts (PoE), which takes a product of the experts’ densities, is used to approximate a joint encoding distribution. The PoE approach treats all experts as equally credible, taking a uniform contribution from every modality. In practice, however, different modalities carry different levels of noise, complexity and information. Furthermore, if we have an overconfident, miscalibrated expert, i.e. one with a sharp but shifted probability distribution, the joint distribution will have low density in the region supported by the other experts and a biased mean prediction. This can result in a suboptimal latent space and data reconstruction. Shi et al. [13] address this problem by combining latent representations across modalities using a Mixture-of-Experts (MoE) approach. For MoE, the joint distribution is given by a mixture of the experts’ densities, so that the density is spread over all regions covered by the experts and overconfident experts do not monopolize the resulting prediction. However, MoE is less sensitive to consensus across modalities and will assign lower probability than PoE to regions where the experts agree. Alternatively, we propose an mVAE modelling the joint encoding distribution as a generalised Product-of-Experts (gPoE) [2]. We optimise modality-specific weightings to account for different information content between experts and enable the model to down-weight experts which cause erroneous predictions. Depending on the application, either MoE or gPoE may be more appropriate, and so we consider both methods for normative modelling.

As far as we are aware, only one other multi-modal VAE normative modelling framework, which uses PoE (PoE-normVAE), has been proposed in the literature [7]. However, Kumar et al. [7] rely on measuring deviations in the feature space, which we argue does not leverage the benefits of multi-modal models. Here, we present an improved factorisation of the joint representation by modelling it as a weighted product (gPoE) or a mixture (MoE) of the individual encoding distributions.

Our contributions are two-fold. Firstly, we present two novel multi-modal normative modelling frameworks, MoE-normVAE and gPoE-normVAE, which capture the joint distribution between different imaging modalities. Our proposed models outperform baseline methods on two neuroimaging datasets. Secondly, we present a deviation metric, based on the latent space, suitable for detecting deviations in multi-modal normative distributions. We show that our metric better leverages the benefits of multi-modal normative models compared to feature space-based metrics.

2 Methods

Multi-modal Variational Autoencoder (mVAE). Let \({\textbf {X}}=\{{\textbf {x}}_m\}^M_{m=1}\) be the observations of M modalities. We use an mVAE to learn a multi-modal generative model of the form \(p_{\theta }\left( {\textbf {X}}, {\textbf {z}}\right) = p (\textbf{z}) \prod _{m=1}^{M} p_{\theta _{m}}\left( {\textbf {x}}_{m} \mid {\textbf {z}}\right) \), in which the modalities are conditionally independent given a common latent variable \({\textbf {z}}\) (Fig. 1c). The likelihood distributions \(p_{\theta _{m}}\left( {\textbf {x}}_{m} \mid {\textbf {z}}\right) \) are parameterised by decoder networks with parameters \(\theta = \{ \theta _{1}, \ldots , \theta _{M} \}\). The goal of VAE training is to maximise the marginal likelihood of the data. However, as this is intractable, we instead optimise an evidence lower bound (ELBO):

$$\begin{aligned} \mathcal {L} = \mathbb {E}_{q_{\phi }({\textbf {z}} \mid {\textbf {X}})}\left[ \sum _{m=1}^{M} \log p_{\theta }\left( {\textbf {x}}_{m} \mid {\textbf {z}}\right) \right] -D_{KL}\left( q_{\phi }({\textbf {z}} \mid {\textbf {X}}) \,\|\, p({\textbf {z}})\right) \end{aligned}$$
(1)

where the second term is the KL divergence between the approximate joint posterior \(q_{\phi }\left( {\textbf {z}} \mid {\textbf {X}}\right) \) and the prior \(p({\textbf {z}})\). We model the posterior, likelihood, and prior distributions as isotropic Gaussians.

Approximate Joint Posterior. To train the mVAE, we must specify the form of the approximate joint posterior \(q_{\phi }\left( {\textbf {z}} \mid {\textbf {X}}\right) \). Wu and Goodman [16] choose to factorise the joint posterior as a Product-of-Experts (PoE); \(q_{\phi }\left( {\textbf {z}} \mid {\textbf {X}}\right) = \frac{1}{K} \prod _{m=1}^{M} q_{\phi _{m}}\left( {\textbf {z}} \mid {\textbf {x}}_{m}\right) \), where the experts, i.e., the individual posterior distributions \(q_{\phi _{m}}\left( {\textbf {z}} \mid {\textbf {x}}_{m}\right) \), are parameterised by encoder networks with parameters \(\phi = \{ \phi _{1}, \ldots , \phi _{M} \}\), and K is a normalisation term. Assuming each encoder network outputs a Gaussian distribution \(q\left( {\textbf {z}} \mid {\textbf {x}}_{m}\right) =\mathcal {N}(\boldsymbol{\mu }_m, \boldsymbol{\sigma }_{m}^{2} {\textbf {I}})\), the parameters of the joint posterior distribution can be computed in closed form [5]; \( \boldsymbol{\mu } = \frac{\sum _{m=1}^{M} \boldsymbol{\mu }_{m} / \boldsymbol{\sigma }_{m}^{2}}{\sum _{m=1}^{M} 1 / \boldsymbol{\sigma }_{m}^{2}} \quad \text{ and } \quad \boldsymbol{\sigma }^{2} =\frac{1}{\sum _{m=1}^{M} 1 / \boldsymbol{\sigma }_{m}^{2}} \) (see Supp. for proofs).
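For illustration, the PoE fusion amounts to precision-weighted averaging and can be written in a few lines. The following is a minimal PyTorch sketch, assuming each encoder outputs a per-modality mean and log-variance; it is an illustrative implementation rather than our released code.

```python
import torch

def poe_fusion(mus, logvars):
    """Fuse M Gaussian experts N(mu_m, sigma_m^2 I) into the PoE joint posterior.

    mus, logvars: tensors of shape (M, batch, latent_dim).
    The joint precision is the sum of the expert precisions, and the joint
    mean is the precision-weighted average of the expert means.
    """
    precision = torch.exp(-logvars)                     # 1 / sigma_m^2
    joint_var = 1.0 / precision.sum(dim=0)              # sigma^2
    joint_mu = joint_var * (precision * mus).sum(dim=0)
    return joint_mu, joint_var.log()
```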

However, overconfident but miscalibrated experts may bias the joint posterior distribution (see Fig. 1b), which is undesirable for learning informative latent representations between modalities [13].

Shi et al. [13] instead factorise the approximate joint posterior as a Mixture-of-Experts (MoE); \(q_{\phi }\left( {\textbf {z}} \mid {\textbf {X}}\right) = \sum _{m=1}^{M} \frac{1}{M} q_{\phi _{m}}\left( {\textbf {z}} \mid {\textbf {x}}_{m}\right) ,\) which, as a uniform mixture of normalised densities, requires no further normalisation.

In the MoE setting, each uni-modal posterior \(q_{\phi _m}({\textbf {z}} \mid {\textbf {x}}_m)\) is evaluated against the generative model \(p_{\theta }\left( {\textbf {X}}, {\textbf {z}}\right) \) such that the ELBO becomes:

$$\begin{aligned} \mathcal {L} = \sum _{m=1}^{M}\left[ \mathbb {E}_{q_{\phi _m}({\textbf {z}} \mid {\textbf {x}}_{m})}\left[ \sum _{n=1}^{M} \log p_{\theta }\left( {\textbf {x}}_{n} \mid {\textbf {z}}\right) \right] -D_{KL}\left( q_{\phi _m}({\textbf {z}} \mid {\textbf {x}}_m) \,\|\, p({\textbf {z}})\right) \right] . \end{aligned}$$
(2)

However, this approach only takes each uni-modal encoding distribution into account separately during training. Thus, there is no explicit aggregation of information from multiple modalities in the latent representation used by the decoder networks for reconstruction. For modalities with a high degree of modality-specific variation, this enforces an undesirable upper bound on the ELBO, potentially leading to a sub-optimal approximation of the joint distribution [3].
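To make Eq. 2 concrete, the following minimal sketch computes the MoE objective, assuming Gaussian likelihoods with fixed unit variance (so each reconstruction term reduces to a squared error up to constants); `encoders` and `decoders` are illustrative per-modality networks, not our released implementation.

```python
import torch
import torch.nn.functional as F

def moe_elbo(encoders, decoders, X):
    """MoE ELBO of Eq. 2 (up to additive constants). X: list of M tensors.

    Each uni-modal posterior q_{phi_m}(z | x_m) is sampled in turn and
    evaluated against the decoders of all modalities.
    """
    elbo = 0.0
    for encode, x_m in zip(encoders, X):
        mu, logvar = encode(x_m)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterise
        log_lik = sum(-0.5 * F.mse_loss(decode(z), x_n, reduction="sum")
                      for decode, x_n in zip(decoders, X))
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        elbo += log_lik - kl
    return elbo / len(X)  # uniform mixture weights 1/M
```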

Generalised Product-of-Experts Joint Posterior. We propose an alternative approach to mitigate the problem of overconfident experts by factorising the joint posterior as a generalised Product-of-Experts (gPoE) [2]; \(q_{\phi }\left( {\textbf {z}} \mid {\textbf {X}}\right) = \frac{1}{K} \prod _{m=1}^{M} q_{\phi _{m}}^{\alpha _{m}}\left( {\textbf {z}} \mid {\textbf {x}}_{m}\right) \), where \(\alpha _{m}\) is a weighting for modality m such that \(\sum _{m=1}^{M}\alpha _{m} = 1\) for each latent dimension and \(0<\alpha _{m}<1\). We optimise \(\alpha \) during training, allowing the model to weight experts in such a way as to learn an approximate joint posterior \(q_{\phi }\left( {\textbf {z}} \mid {\textbf {X}}\right) \) under which the likelihood \(p_{\theta }\left( {\textbf {X}} \mid {\textbf {z}}\right) \) is maximised. This provides a means to down-weight overconfident experts. Furthermore, as \(\alpha \) is learnt per latent dimension, different modality weightings can be learnt for different latent dimensions, thus explicitly incorporating modality-specific variation in addition to shared information in different dimensions of the joint latent space. Similarly to the PoE approach, we can compute the parameters of the joint posterior distribution; \( \boldsymbol{\mu } = \frac{\sum _{m=1}^{M} \boldsymbol{\mu }_{m}\boldsymbol{\alpha }_{m} / \boldsymbol{\sigma }_{m}^{2}}{\sum _{m=1}^{M} \boldsymbol{\alpha }_{m} / \boldsymbol{\sigma }_{m}^{2}} \quad \text{ and } \quad \boldsymbol{\sigma }^{2} = \frac{1}{\sum _{m=1}^{M} \boldsymbol{\alpha }_{m} / \boldsymbol{\sigma }_{m}^{2}} \).
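A sketch of the gPoE fusion is given below, with the per-dimension weights parameterised via a softmax over the modality axis so that the constraints \(\sum _m \alpha _m = 1\) and \(0< \alpha _m <1\) hold by construction. The softmax parameterisation is one convenient choice of ours for this sketch; the model itself only requires that \(\alpha \) be optimised under these constraints.

```python
import torch
import torch.nn as nn

class GPoEFusion(nn.Module):
    """gPoE joint posterior with learnable per-latent-dimension weights."""

    def __init__(self, n_modalities, latent_dim):
        super().__init__()
        # Free parameters; a softmax over the modality axis yields alpha.
        self.alpha_logits = nn.Parameter(torch.zeros(n_modalities, latent_dim))

    def forward(self, mus, logvars):
        # mus, logvars: (M, batch, latent_dim)
        alpha = torch.softmax(self.alpha_logits, dim=0).unsqueeze(1)  # (M, 1, D)
        weighted_prec = alpha * torch.exp(-logvars)     # alpha_m / sigma_m^2
        joint_var = 1.0 / weighted_prec.sum(dim=0)
        joint_mu = joint_var * (weighted_prec * mus).sum(dim=0)
        return joint_mu, joint_var.log()
```

Initialising all logits to zero starts the model at the uniform weighting \(\alpha _m = 1/M\), from which training can down-weight unreliable experts.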

Recently, a gPoE mVAE was proposed for learning joint representations of hand-poses and surgical videos [6]. However, we emphasize that our approach differs in application and offers a more lightweight implementation (Joshi et al. [6] require training of auxiliary networks to learn \(\alpha \) per sample).

Multi-modal Normative Modelling. We propose two mVAE normative modelling frameworks, shown in Fig. 1a: MoE-normVAE, which uses a MoE joint posterior distribution, and gPoE-normVAE, which uses a gPoE joint posterior distribution. For both models, the encoder \(\phi \) and decoder \(\theta \) parameters are trained to characterise a healthy population cohort. normVAE models assume that abnormality due to disease effects can be quantified by measuring deviations in the latent space [8] or the feature space [11]. At test time, the clinical cohort is passed through the encoder and decoder networks, and deviations of test subjects from the multi-modal latent space of the healthy controls and data reconstruction errors are measured. We compare our methods to the previously proposed PoE-normVAE [7] and to three baseline VAEs with a uni-modal architecture: two trained on a single modality each and one trained on the concatenated multi-modal input.

To compare our normVAE models to a classical normative approach, we trained one GPR model (using the PCNToolkit) per feature on a subset of 2000 healthy UK Biobank individuals and used extreme value statistics to calculate a subject-level abnormality index [9]. We used a top-5% abnormality threshold (set using the healthy training cohort) to calculate a significance ratio (see Eq. 6).

Fig. 1.

(a) The gPoE-normVAE and MoE-normVAE normative framework. All normVAE models were implemented using the following parameter settings: maximum epochs = 2000, batch size = 256, learning rate = \(10^{-4}\), early stopping = 50 epochs, encoder layers = [20, 40], decoder layers = [20, 40]. A ReLU activation function was applied between layers. Models were trained with a range of latent space sizes (\(L_{\text {dim}}\)) from 5 to 20. Models with \(L_{\text {dim}}\)=10 were fine-tuned (maximum 100 epochs) using the ADNI healthy cohort. Learnt \(\alpha \) values are given in Supp. Table 1. (b) Example PoE and gPoE joint distributions. (c) Graphical model.

Multi-modal Latent Deviation Metric. Previous works using autoencoders as normative models have mostly relied on feature space-based deviation metrics [7, 11]. That is, they compare the input value \(x_{ij}\) for subject j at the i-th brain region to the value reconstructed by the autoencoder, \(\widehat{x}_{ij}\): \(d_{ij}=\left( x_{ij}-\widehat{x}_{ij}\right) ^{2}\). Kumar et al. [7] propose the following normalised z-score metric on the data reconstruction (a univariate feature space metric):

$$\begin{aligned} D_{\text {uf}}=\frac{d_{ij}-\mu _{\text {norm}}\left( d_{ij}^{\text {norm}}\right) }{\sigma _{\text {norm}}\left( d_{ij}^{\text {norm}}\right) } \end{aligned}$$
(3)

where \(\mu _{\text {norm}}\left( d_{ij}^{\text {norm}}\right) \) is the mean and \(\sigma _{\text {norm}}\left( d_{ij}^{\text {norm}}\right) \) the standard deviation of the deviations \(d_{ij}^{\text {norm}}\) of a holdout healthy control cohort.
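In code, Eq. 3 is a per-region z-scoring of the squared reconstruction errors against the healthy holdout statistics. A minimal NumPy sketch (array names are illustrative):

```python
import numpy as np

def d_uf(x, x_hat, d_holdout):
    """Univariate feature-space deviation (Eq. 3).

    x, x_hat:  (n_subjects, n_regions) inputs and reconstructions.
    d_holdout: (n_holdout, n_regions) squared errors of a healthy holdout cohort.
    """
    d = (x - x_hat) ** 2
    return (d - d_holdout.mean(axis=0)) / d_holdout.std(axis=0)
```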

However, in the multi-modal setting, feature space-based deviation metrics may not highlight the benefits of multi-modal models over their uni-modal counterparts. The goal of the joint latent representation is to capture information from all modalities. Thus, decoders for each modality must extract the information from the joint latent representation, which now carries information from all other modalities as well. Therefore, data reconstructions capture only information relevant to a particular modality and may also be poorer compared to uni-modal methods. As such, particularly when incorporating modalities with a high degree of modality-specific variation, we believe latent space deviation metrics would better capture deviations from normative behaviour across multiple modalities. Then, once an abnormal subject has been identified, feature space metrics can be used to identify deviating brain regions (e.g. Supp. Fig. 3).

We propose a latent deviation metric to measure deviations from the joint normative distribution. To account for correlations between latent dimensions and to derive a single multivariate measure of deviation, we measure the Mahalanobis distance from the encoding distribution of the healthy training cohort:

$$\begin{aligned} D_{\text {ml}}=\sqrt{\left( z_{j}-\mu (z^{\text {norm}})\right) ^T \varSigma (z^{\text {norm}})^{-1}\left( z_{j}-\mu (z^{\text {norm}})\right) } \end{aligned}$$
(4)

where \(z_j \sim q\left( {\textbf {z}}_j \mid {\textbf {X}}_j\right) \) is a sample from the joint posterior distribution for subject j, and \(\mu (z^{\text {norm}})\) is the mean and \(\varSigma (z^{\text {norm}})\) the covariance of the healthy cohort latent positions. We use robust estimates of the mean and covariance to account for outliers within the healthy control cohort. For closer comparison with \(D_{\text {ml}}\), we derive the following multivariate feature space metric:

$$\begin{aligned} D_{\text {mf}}=\sqrt{\left( d_{j}-\mu (d^{\text {norm}})\right) ^T \varSigma (d^{\text {norm}})^{-1}\left( d_{j}-\mu (d^{\text {norm}})\right) } \end{aligned}$$
(5)

where \(d_j = ( d_{1j}, \ldots , d_{Ij} )\) is the vector of reconstruction errors for subject j across the brain regions \((i=1,\ldots ,I)\), and \(\mu (d^{\text {norm}})\) is the mean and \(\varSigma (d^{\text {norm}})\) the covariance of the healthy cohort reconstruction errors.
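Both Mahalanobis metrics can be computed with a robust covariance estimator. The sketch below uses scikit-learn's Minimum Covariance Determinant, which is one reasonable choice for the unspecified robust estimator; Eq. 5 is obtained by passing reconstruction-error vectors instead of latent samples.

```python
import numpy as np
from sklearn.covariance import MinCovDet

def mahalanobis_deviation(v_test, v_train):
    """Multivariate deviation (Eqs. 4 and 5).

    v_train: (n_healthy, dim) latent samples (or reconstruction errors)
             of the healthy training cohort.
    v_test:  (n_test, dim) the same quantity for the test cohort.
    """
    robust = MinCovDet().fit(v_train)        # robust mean and covariance
    # mahalanobis() returns squared distances, so take the square root.
    return np.sqrt(robust.mahalanobis(v_test))
```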

Assessing Deviation Metric Performance. For each model, we calculated \(D_{\text {ml}}\) and \(D_{\text {mf}}\) for a healthy holdout cohort and a disease cohort. For each deviation metric, we identified individuals whose deviations were significantly different from the healthy training distribution (\(p<0.001\)) [15]. Ideally, we want a model which correctly identifies disease individuals as outliers and healthy individuals as sitting within the normative distribution. As such, we use the following significance ratio (a positive likelihood ratio) to assess model performance:

$$\begin{aligned} \text {significance ratio}=\frac{\text {proportion of the disease cohort with significant deviations}}{\text {proportion of the healthy holdout cohort with significant deviations}} \end{aligned}$$
(6)

To calculate significance ratios for the univariate metric, we calculated \(D_{\text {uf}}\) relative to the training cohort for the healthy holdout and disease cohorts (Bonferroni-adjusted p=0.05/\(N_{\text {features}}\)) [7].
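Given per-subject p-values under a deviation metric, the significance ratio can then be computed as in the sketch below (illustrative, with the thresholds as described above):

```python
import numpy as np

def significance_ratio(p_disease, p_holdout, alpha=0.001):
    """Eq. 6: proportion of the disease cohort flagged as significant
    outliers divided by the same proportion in the healthy holdout cohort."""
    flagged_disease = (np.asarray(p_disease) < alpha).mean()
    flagged_holdout = (np.asarray(p_holdout) < alpha).mean()
    return flagged_disease / flagged_holdout
```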

3 Experiments

Data Processing. To train the normVAE models, we used 10,276 healthy subjects from the UK Biobank [14] (application number: 70047). We used pre-processed (provided by the UK Biobank [1]) grey-matter volumes for 66 cortical (Desikan-Killiany atlas) and 16 subcortical brain regions, and Fractional Anisotropy (FA) and Mean Diffusivity (MD) measurements for 35 white matter tracts (Johns Hopkins University atlas). At test time, we used 2,568 healthy controls from a holdout cohort and 122 individuals with one of several neurodegenerative disorders: motor neuron disease, multiple sclerosis, Parkinson's disease, dementia/Alzheimer's disease/cognitive impairment, and other demyelinating diseases.

We also tested the models using an external dataset. We extracted 213 subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) [10] dataset with significant memory concern (SMC; N=27), early mild cognitive impairment (EMCI; N=63), late mild cognitive impairment (LMCI; N=34), and Alzheimer's disease (AD; N=43), as well as healthy controls (HC; N=45). We used the healthy controls to fine-tune the models in a transfer learning approach. The same T1 and DTI features as for the UK Biobank were extracted for the ADNI dataset.

Rather than conditioning on covariates as done in some related work [7, 8], we adjusted for confounding effects prior to analysis. Non-linear age and linear intracranial volume (ICV) effects were removed from the DTI and T1 MRI features of both datasets [12]. Each brain ROI was normalised by removing the mean and dividing by the standard deviation of the healthy control cohort brain regions.
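For illustration, the confound-adjustment step can be sketched as fitting a confound model on healthy controls only and applying it to all subjects. The quadratic age term below is an assumption standing in for the non-linear age effect; the exact basis follows [12].

```python
import numpy as np

def deconfound(X, age, icv, hc_mask):
    """Remove age (here, quadratic) and linear ICV effects, then z-score
    each ROI using healthy-control statistics. X: (n_subjects, n_rois)."""
    C = np.column_stack([np.ones_like(age), age, age ** 2, icv])
    beta, *_ = np.linalg.lstsq(C[hc_mask], X[hc_mask], rcond=None)
    resid = X - C @ beta                     # remove confound effects
    mu = resid[hc_mask].mean(axis=0)
    sd = resid[hc_mask].std(axis=0)
    return (resid - mu) / sd
```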

UK Biobank Results. As expected, we see greater significance ratios for all models when using \(D_{\text {ml}}\) rather than \(D_{\text {mf}}\) (Table 1). When using \(D_{\text {mf}}\) or \(D_{\text {uf}}\), all models perform similarly, whereas using \(D_{\text {ml}}\) over \(D_{\text {mf}}\) leads to a 4-fold increase in the significance ratio. Further, our proposed models give the best overall performance across different \(L_{\text {dim}}\), with the highest significance ratio achieved by gPoE-normVAE with \(L_{\text {dim}}\)=10. Generally, all multi-modal normVAE models performed better than the uni-modal models, suggesting that by modelling the joint distribution between modalities we can learn better normative models.

Table 1. Significance ratio calculated from \(D_{\text {ml}}\), \(D_{\text {mf}}\), and \(D_{\text {uf}}\) for the UK Biobank. See Supp. for results in figure form. Using GPR, we observed a significance ratio of 6.01, poorer performance than our models (using \(D_{\text {ml}}\)).

ADNI Results. Previous work [8] explored the ability of a uni-modal T1 normVAE to detect deviations in the ADNI cohorts. Figure 2a shows the latent deviation \(D_{\text {ml}}\) for the different diagnoses in the ADNI cohort for the T1 normVAE, DTI normVAE, PoE-normVAE and gPoE-normVAE models. All models reflect increasing disease severity with increasing disease stage. The gPoE-normVAE model showed greater sensitivity to disease stage, as indicated by the higher F statistic (and correspondingly lower p-value) from an ANOVA. We measured the Pearson correlation with composite measures of memory and executive function (Fig. 2b) and found that our proposed model exhibited greater correlation with both cognition scores than the baseline approaches. Finally, we see that the sensitivity to disease severity for the gPoE-normVAE model extends to the feature space, where we see a general increase in average \(D_{\text {uf}}\) from the LMCI to the AD cohort (Supp. Figs. 3a and 3b respectively).

Fig. 2.

(a) \(D_{\text {ml}}\) by disease label (\(L_{\text {dim}}\)=10). Statistical annotations were generated using Welch's t-tests between pairs of disease groups: \(\text {ns}: 0.05 < p \le 1\); \(*: 0.01 < p \le 0.05\); \(**: 0.001 < p \le 0.01\); \(***: 0.0001 < p \le 0.001\); \(****: p \le 0.0001\). Robust estimates of the mean and covariance were not used to calculate \(D_{\text {ml}}\) due to the small healthy cohort size. (b) Pearson correlation between \(D_{\text {ml}}\) and patient cognition represented by age-adjusted memory and executive function composite scores.

4 Discussion and Further Work

We have built on recent works [7, 8, 11] and introduced two novel mVAE normative models, which provide an alternative method of learning the joint normative distribution between modalities to address the limitations of current approaches. Our models provide a more informative joint representation compared to baseline methods as evidenced by the better significance ratio for the UK Biobank dataset and greater sensitivity to disease staging and correlation with cognitive measures in the ADNI dataset. We also proposed a latent deviation metric suitable for detecting deviations in the multivariate latent space of multi-modal normative models which gave an approximately 4-fold performance increase over metrics based on the feature space.

Further work will involve extending our models to more data modalities, such as genetic variants, to better characterise the behaviour of a physiological system. We note that, for fair comparison across models, we removed the effects of confounding variables prior to analysis. However, confounding effects could instead be removed during analysis via conditioning variables [8]. Another limitation of the normVAE models introduced here is the use of ROI-level data. Data processing software, such as FreeSurfer, may fail to accurately capture abnormality in images, particularly if large lesions are present. Future work also includes creating normative models designed for voxel-level data to better capture disease effects.

Normative models have been successfully applied to the study of a range of heterogeneous diseases. Diseases often present abnormalities across a range of neuroimaging, biological and physiological features which provide different information about the underlying disease process. Normative systems that incorporate features from different data modalities offer a holistic picture of the disease and will be capable of detecting abnormalities across a broad range of different diseases. Furthermore, multi-modal normative modelling captures the relationship between different modalities in healthy individuals, with disruption to this relationship potentially leading to a disease signal. Code is publicly available at https://github.com/alawryaguila/multimodal-normative-models.