1 Introduction

Left ventricular (LV) cardiac anatomy and function are widely used for diagnosis and monitoring disease progression in cardiology, and to assess the patient's response to cardiac surgery and interventional procedures. Cardiac ultrasound (US) and cardiac magnetic resonance (CMR) imaging are arguably the most widespread techniques for clinical diagnostic imaging of the heart. For population imaging studies, however, CMR remains the modality of choice and provides one-stop-shop access to cardiac anatomy and function non-invasively. The quantification of LV anatomy and function from large population imaging studies or patient cohorts from large clinical trials requires automatic image quality assessment and image analysis tools. A basic criterion for cardiac image quality is LV coverage and the detection of missing apical and basal CMR slices [7]. Due to the rapid mechanical motion of the heart, breathing motion, and imperfect triggering, CMR can display incomplete LV coverage, which hampers quantitative LV characterization and diagnostic accuracy [12]. For example, missing basal slices have an important impact on LV volume calculation and on several derived LV functional measures such as ejection fraction and cardiac output. Even if scout images are acquired to center the LV in the field of view and minimize this problem, incomplete coverage can occur at any point in the cardiac cycle due to patient breathing and cardiac motion. Automatic quality assessment is important in large-scale population imaging studies, where data are acquired across different imaging sites, from subjects with diverse constitutions, and under strict time constraints on scanner availability [4].

Few guidelines exist, clinical or otherwise, that objectively establish what constitutes a good medical image or a good CMR study [6]. To ensure consistent quantification of CMR data, automatic assessment of complete LV coverage is a first step. LV coverage is still assessed by visual inspection of CMR image sequences, which is subjective, repetitive, error-prone, and time-consuming [2]. Automatic coverage assessment should intervene early to correct data acquisition, and/or promptly discard images with incomplete LV coverage whose analysis would otherwise impair any aggregated statistics over the cohort.

In medical imaging, quality-labelled image databases are hard to obtain, given the diversity of image characteristics and artifacts across anatomical locations and image modalities. It is therefore essential to devise techniques that do not require manual labelling of visual image quality. Image synthesis models provide a unique opportunity for unsupervised learning: they build a rich prior over natural image statistics that classifiers can leverage to improve predictions on datasets for which few labels exist [11]. Among them, generative adversarial networks (GANs) can synthesize adversarial examples, which increase the loss incurred by a machine learning model [13]. Moreover, GANs can perform unsupervised learning by simply ignoring the component of the loss arising from class labels when a label is unavailable for a training image [5].

In this paper, we focus on the analysis of short-axis (SA) cine MRI. We aim to identify missing apical slices (MAS) and/or missing basal slices (MBS) in cardiac MRI volumes. In previous research, Le [14] used a convolutional neural network (CNN) on single-slice images and processed them sequentially, but this solution needs a large amount of labelled data and lacks the ability to correctly classify perturbed examples. Here, we exploit semi-coupled GANs (SCGANs), a semi-supervised approach, for incomplete LV coverage detection. To alleviate the lack of sufficient CMR datasets with MBS or MAS, the proposed SCGANs use two generative models to synthesize adversarial examples. Learning from adversarial examples improves not only robustness to adversarial examples, but also generalization performance on original examples. To our knowledge, this is the first work to use adversarial examples to improve the robustness of an attribute learning model.

2 Methodology

We present a novel technique for LV coverage assessment in CMR imaging using SCGANs. The motivation behind our method is as follows: in medical image quality assessment, we are always faced with a lack of quality-labelled data, especially images with artifacts, and several deep learning models cannot correctly classify perturbed examples. Our semi-supervised SCGANs use adversarial examples as outlying observations for discriminative model training. We generate adversarial samples with two separate generators, which confuse the discriminator into mistaking them for genuine images. We then obtain robust attribute classifiers by learning from both the original and the synthetic data. The proposed SCGANs represent a strategy to better handle the typical LV coverage assessment problem.

2.1 Generative Adversarial Learning

Recently, GAN [5] was proposed as a novel approach to adversarial learning. It consists of a generative model and a discriminative model, both realized as multilayer perceptrons [9]. The aim of the discriminator is to correctly classify the original examples and the adversarial examples. By learning from adversarial examples, the network not only becomes robust to adversarial examples, but its generalization to unmodified examples also improves. GANs do not need label information when training the generator; the discriminator estimates the probability that a sample came from the original data rather than from the generator.

We assume a probability distribution \( \textit{M} \), which is a black box to us. To understand how the black box works, we construct two 'adversarial' models: a generative model \(\textit{G}\) that captures the data distribution, and a discriminative model \(\textit{D}\) that estimates the probability that a sample came from the training data rather than from \(\textit{G}\). Both \(\textit{G}\) and \(\textit{D}\) can be non-linear mapping functions, such as multilayer perceptrons. Our objective is to learn a feature representation that handles a wide range of visual appearances in cardiac MRI and identifies images with incomplete LV coverage. We regard adversarial examples as outlying observations with respect to the other samples in the training data. The generative model constantly produces new adversarial samples, and the discriminative model classifies positive and negative samples by continually learning from the newly produced adversarial samples. Given a particular describable visual attribute, say 'MBS', an outlier image is expected to be mapped to negative values, which indicates the absence of a basal slice. This can happen for two reasons: (1) the image does not belong to the basal slice, or (2) the image belongs to the adversarial examples. We consider both cases to be outliers.

2.2 Semi-coupled GANs

Here we introduce our model based on the above discussion. Our model, illustrated in Fig. 1, is designed as a semi-coupled GAN for attribute learning. It consists of a pair of generators, \(G_1\) and \(G_2\), which share the same discriminator. The generators synthesize the adversarial samples \(Y_1\) and \(Y_2\) for the positive and negative data, respectively.

Fig. 1. The proposed semi-coupled-GANs framework.

Generative Models: We first feed noise data \(\varvec{z}\) to the two generators \(G_1\) and \(G_2\); \(G_1\) and \(G_2\) learn probability distributions from the original positive and negative images, respectively, and generate the corresponding adversarial samples. These adversarial samples are then passed to the discriminator D. Denote the distributions of \(G_1(\varvec{z})\) and \(G_2(\varvec{z})\) by \(p_{G_1}\) and \(p_{G_2}\). Both \(G_1\) and \(G_2\) are realized as multilayer perceptrons:

$$\begin{aligned} \left\{ \begin{matrix} {G_1}(\varvec{z}) = G_1^{({m_1})}(G_1^{({m_1} - 1)}(...G_1^{(2)}(G_1^{(1)}(\varvec{z})))) \\ {G_2}(\varvec{z}) = G_2^{({m_2})}(G_2^{({m_2} - 1)}(...G_2^{(2)}(G_2^{(1)}(\varvec{z})))) \end{matrix}\right. \end{aligned}$$
(1)

where \(G_1^{(i)}\) and \(G_2^{(i)}\) are the ith layers of \(G_1\) and \(G_2\), and \(m_1\) and \(m_2\) are the numbers of layers in \(G_1\) and \(G_2\). In our training process, \(m_1\) and \(m_2\) need not be the same. In a traditional discriminative deep neural network, features are extracted from low-level features in the first layers to high-level features in the last layers. In contrast, through multilayer perceptron operations, our two generator models decode information in the opposite direction, from abstract concepts to material details.
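For concreteness, the following minimal TensorFlow/Keras sketch shows how one such generator could be realized as a stack of transposed-convolution ('deconvolution') layers producing the \(120 \times 120\) images used in Sect. 3; the layer widths and depth are our own illustrative assumptions, not the paper's exact architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers

def make_generator(z_dim=100):
    """One of the two generators G_1/G_2: noise z -> 120x120 image.
    Widths/depth are assumptions; the paper only states that each
    generator stacks 'deconvolution' (transposed-convolution) layers."""
    z = tf.keras.Input(shape=(z_dim,))
    h = layers.Dense(15 * 15 * 128, activation="relu")(z)
    h = layers.Reshape((15, 15, 128))(h)
    h = layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu")(h)  # 30x30
    h = layers.Conv2DTranspose(32, 4, strides=2, padding="same", activation="relu")(h)  # 60x60
    x = layers.Conv2DTranspose(1, 4, strides=2, padding="same", activation="tanh")(h)   # 120x120
    return tf.keras.Model(z, x)

G1, G2 = make_generator(), make_generator()  # independent weights; m_1 and m_2 may differ
```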

Discriminative Models: Every generated sample has a corresponding class label, and the discriminator gives both a probability distribution over the data and a probability distribution over the class labels. We feed both the original samples and the adversarial samples to D for discriminator training, and D outputs multiple values between 0 and 1. If a training sample \(\varvec{x}\) is positive (or real) data, D should produce an output close to the corresponding trained value, indicating that the input is positive (or real), whereas an output close to 0 indicates that the input is negative (or fake). In the supervised setting, D thus acts as a classifier returning 1 or 0. Let D be the discriminative model given by:

$$\begin{aligned} {D}({\varvec{x}}) = D^{({n})}(D^{({n} - 1)}(...D^{(2)}(D^{(1)}({\varvec{x}})))) \end{aligned}$$
(2)

where \(D^{(i)}\) is the ith layer of D and n is the number of layers. The discriminator maps each input image to a probability score indicating whether the input is drawn from the positive or the negative data. In this process, the first layers of the discriminative model extract low-level features, while the last layers extract high-level features.
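A matching sketch of the shared discriminator follows; only the deep-CNN structure and the Leaky ReLU nonlinearity are taken from Sect. 3, while the depth and filter counts are assumptions.

```python
def make_discriminator():
    """Shared discriminator D: 120x120 image -> probability in (0, 1)."""
    x = tf.keras.Input(shape=(120, 120, 1))
    h = x
    for filters in (32, 64, 128):  # low-level features first, high-level last
        h = layers.Conv2D(filters, 4, strides=2, padding="same")(h)
        h = layers.LeakyReLU(0.2)(h)
    h = layers.Flatten()(h)
    p = layers.Dense(1, activation="sigmoid")(h)  # P(input drawn from positive/real data)
    return tf.keras.Model(x, p)

D = make_discriminator()
```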

Learning: The Semi-Coupled-GANs framework corresponds to a constrained minimax game given by

$$\begin{aligned} \mathop {\max }\limits _D \mathop {\min }\limits _{{G_1},{G_2}} V({G_1},{G_2},D)&= {E_{\varvec{x}\sim {p_{data}}}}[\log {D}({\varvec{x}\mid \varvec{y}})] + {E_{\varvec{z}\sim {p_z}}}[\log (1 - {D}(G_1(\varvec{z})))]\\&+\,{E_{\varvec{z}\sim {p_z}}}[\log (1 - {D}(G_2(\varvec{z})))] \end{aligned}$$
(3)

The objective in (3) contains two generator terms; each has an independent generator, but both share the same discriminator. The two generative models synthesize pairs of adversarial samples to confuse the discriminative model. The discriminator gives both a probability distribution over the image data and a probability distribution over the class labels, \({D}({\varvec{x}\mid \varvec{y}})\). Four kinds of samples are used to train the discriminator: the positive and negative samples from the original images, and their corresponding adversarial samples computed by the two generators. The inputs to the discriminative model are the data and their corresponding labels. As with a standard GAN, our SCGANs can be trained by backpropagation with alternating gradient update steps.
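A minimal alternating update for Eq. (3), assuming the G1, G2, and D sketches above, could look as follows. We substitute the common non-saturating generator loss for the literal \(\log (1 - D(G(\varvec{z})))\) term, a standard practical choice [5] rather than something the paper specifies, and omit the class-label conditioning for brevity.

```python
bce = tf.keras.losses.BinaryCrossentropy()
opt_d = tf.keras.optimizers.Adam(2e-4)
opt_g1 = tf.keras.optimizers.Adam(2e-4)
opt_g2 = tf.keras.optimizers.Adam(2e-4)

@tf.function
def train_step(x_real, z_dim=100):
    """One alternating update of Eq. (3): D ascends on real data vs. both
    sets of fakes, then G1 and G2 each descend on their own fooling term."""
    z = tf.random.normal([tf.shape(x_real)[0], z_dim])

    # Discriminator step: maximize log D(x) + log(1-D(G1(z))) + log(1-D(G2(z))).
    with tf.GradientTape() as tape:
        d_real = D(x_real, training=True)
        d_fake1 = D(G1(z, training=False), training=True)
        d_fake2 = D(G2(z, training=False), training=True)
        loss_d = (bce(tf.ones_like(d_real), d_real)
                  + bce(tf.zeros_like(d_fake1), d_fake1)
                  + bce(tf.zeros_like(d_fake2), d_fake2))
    opt_d.apply_gradients(zip(tape.gradient(loss_d, D.trainable_variables),
                              D.trainable_variables))

    # Generator steps: non-saturating surrogate for min log(1 - D(G(z))).
    for G, opt in ((G1, opt_g1), (G2, opt_g2)):
        with tf.GradientTape() as tape:
            d_fake = D(G(z, training=True), training=False)
            loss_g = bce(tf.ones_like(d_fake), d_fake)
        opt.apply_gradients(zip(tape.gradient(loss_g, G.trainable_variables),
                                G.trainable_variables))
```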

2.3 Quality Estimation

For a given cardiac volume, a dissimilarity score is computed for each representative visual attribute, MAS and MBS. Any visual attribute with a score below an optimal threshold is classified as an artifact. After computing the visual attributes, we can verify the cardiac MRI quality based on the corresponding attribute scores. Let \( x_{target}=P_{MAS}(\varvec{X}_{target}) \) and \( y_{target}=P_{MBS}(\varvec{X}_{target}) \) be the outputs of the discriminator. If the quality of the target cardiac volume \( \varvec{X}_{target} \) is good, the values \( P_{MAS}(\varvec{X}_{target}) \) and \( P_{MBS}(\varvec{X}_{target}) \) should be similar to the trained corresponding positive attribute values. We combine the output values so that the verification classifier Q can make sense of the data: the concatenation of the MAS and MBS attribute classifier outputs forms the input to the verification classifier Q [8]. Putting both terms together yields the tuples \( q(S_{target}) \):

$$\begin{aligned} q(S_{target})=Q({<}p_{MAS},p_{MBS}{>}) \end{aligned}$$
(4)

Training Q requires pairs of positive and negative examples. For the classification function, we use an SVM with an RBF kernel on \( \varvec{X} \), trained using libsvm [3] with the default parameters \( C = 1 \) and \( \gamma = 1/ndims \), where ndims is the dimensionality of \({<}p_{MAS}, p_{MBS}{>}\).
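Since libsvm also underlies scikit-learn's SVC, an equivalent sketch of the verification classifier Q, with hypothetical \({<}p_{MAS}, p_{MBS}{>}\) scores for illustration, is:

```python
import numpy as np
from sklearn.svm import SVC  # scikit-learn's SVC wraps libsvm [3]

# Hypothetical attribute-classifier outputs <p_MAS, p_MBS> per training volume.
features = np.array([[0.91, 0.88],   # complete coverage
                     [0.85, 0.12],   # missing basal slice
                     [0.07, 0.90]])  # missing apical slice
labels = np.array([1, 0, 0])         # 1 = full LV coverage, 0 = incomplete

# RBF kernel with libsvm's defaults: C = 1, gamma = 1/ndims ('auto' in scikit-learn).
Q = SVC(kernel="rbf", C=1.0, gamma="auto").fit(features, labels)
q_target = Q.predict([[0.89, 0.93]])  # q(S_target) for a new cardiac volume
```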

3 Experiments and Analysis

Data specifications: The UK Biobank (UKBB) dataset contains 3400 subjects, each with 50 time points covering the heart from base to apex. We use the endocardial contour as the main characteristic to identify the apical, middle, and basal slices. For example, the Left Ventricular Outflow Tract (LVOT) is visible in the basal slice and absent from all other slices. We define the apical slice as the slice in which the LV cavity is still visible at end-systole. All slices other than the basal and apical slices are considered middle slices, and middle slices serve as the negative samples for each attribute learning task.

Experimental set-up: All experiments used TensorFlow [1] on GPUs. Considering all 50 time points for each subject, we obtain 170,000 images, which we regard as the ground truth in our experiments. The two generators \(G_1\) and \(G_2\) consist of several 'deconvolution' layers that transform the noise \(\varvec{z}\) and class c into an image [11]. We train the model architecture to generate images at \(120 \times 120\) spatial resolution. The discriminator D is a deep convolutional neural network with a Leaky ReLU nonlinearity [10]. Ten-fold cross-validation is used to evaluate the final performance of our attribute classifiers. To evaluate the classification algorithms, we use Accuracy = (TP + TN)/(TP + FP + TN + FN), Precision Rate = TP/(TP + FP), and Recall Rate = TP/(TP + FN), where TP, TN, FP, and FN are the numbers of true positive, true negative, false positive, and false negative samples, respectively.
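For reference, the three metrics can be computed from raw confusion counts as below; the example counts are hypothetical, not taken from Table 1.

```python
def evaluate(tp, tn, fp, fn):
    """Accuracy, precision, and recall from raw confusion counts (Sect. 3)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall

# Hypothetical counts for one fold, for illustration only:
acc, prec, rec = evaluate(tp=820, tn=790, fp=60, fn=70)
print(f"accuracy={acc:.3f}, precision={prec:.3f}, recall={rec:.3f}")
# accuracy=0.925, precision=0.932, recall=0.921
```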

Performance and Discussion: We evaluate the quality of our semi-supervised representation learning algorithm by applying it as a feature extractor on supervised datasets. Table 1 shows the test performance on the UK Biobank dataset against state-of-the-art deep learning methods. The supervised 2D CNN achieved accuracies of 77.5% and 74.9%, whereas our SCGANs achieved significantly higher accuracies of 92.5% and 89.3%. The state-of-the-art models have no ability to discriminate adversarial samples; our model, by training the generative models to produce adversarial examples, correctly classifies both unmodified and adversarial samples. This improves not only robustness to adversarial examples but also generalization performance on original examples. Meanwhile, our SCGANs achieved a result comparable to the 3D CNN, which indicates an opportunity for future 3D image synthesis models.

Table 1. The accuracy, precision rate, and recall rate of state-of-the-art deep learning approaches and our method.

Our attribute classifiers are trained on nine folds and evaluated on the remaining fold, cycling through all ten folds. Receiver Operating Characteristic (ROC) curves are obtained by saving the classifier outputs for each test pair in all ten folds and then sliding a threshold over all output values to obtain different false positive/detection rates. Figure 2 shows ROC curves demonstrating that our adversarial training (SCGANs) method achieves strong results, reinforcing that adversarial examples are powerful samples for attribute learning. Figure 2 also shows that our method correctly classifies several challenging samples (true positives) and adversarial samples (false negatives). The experimental results confirm that the adversarial training approach makes the model more robust to adversarial examples and improves generalization performance on original examples. Although the accuracy of the proposed method is slightly lower than, but comparable to, that of the 3D CNN, our SCGANs reduce the computation cost, which is especially important in population imaging.
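A sketch of this pooled-threshold ROC construction, using scikit-learn and hypothetical fold-pooled scores:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Classifier outputs pooled over all ten test folds (hypothetical values);
# y_true marks whether the attribute (e.g. a basal slice) is truly present.
y_true = np.array([1, 1, 0, 1, 0, 0, 1, 0])
scores = np.array([0.95, 0.88, 0.30, 0.72, 0.70, 0.10, 0.66, 0.52])

# roc_curve slides a threshold over all pooled outputs, as described above.
fpr, tpr, thresholds = roc_curve(y_true, scores)
print(f"AUC = {auc(fpr, tpr):.3f}")  # -> AUC = 0.938 for these toy scores
```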

Fig. 2. MAS and MBS detection performance (top), and sample test slices with their probability values (bottom). 'PA' denotes the probability of being an apical slice; 'PB' the probability of being a basal slice.

4 Conclusion

In this paper, we tackled the problem of detecting missing apical and basal slices in large imaging databases. We illustrated the concept by applying SCGANs to CMR image studies from the UK Biobank pilot dataset. By training the classifier with adversarial examples, our model achieves a significant improvement in attribute representation, and the well-trained attribute classifiers assign candidate slices to their corresponding categories. We also validated our model by comparing it with traditional deep learning methods on the UK Biobank datasets. The proposed model shows high consistency with human perception and compares favourably with state-of-the-art methods, demonstrating its high potential. Our proposed semi-coupled-GANs can also be easily applied to, and may boost the results of, other detection and segmentation tasks in medical image analysis.