Scalable balanced training of conditional generative adversarial neural networks on image data

Lupo Pasini, Massimiliano; Gabbi, Vittorio; Yin, Junqi; Perotto, Simona; Laanait, Nouamane

doi:10.1007/s11227-021-03808-2


Published: 26 April 2021

Volume 77, pages 13358–13384, (2021)
The Journal of Supercomputing

Massimiliano Lupo Pasini ORCID: orcid.org/0000-0002-4980-6924¹,
Vittorio Gabbi²,
Junqi Yin³,
Simona Perotto⁴ &
…
Nouamane Laanait⁵


Abstract

We propose a distributed approach to train deep convolutional conditional generative adversarial neural network (DC-CGANs) models. Our method reduces the imbalance between generator and discriminator by partitioning the training data according to data labels, and enhances scalability by performing a parallel training where multiple generators are concurrently trained, each one of them focusing on a single data label. Performance is assessed in terms of inception score, Fréchet inception distance, and image quality on MNIST, CIFAR10, CIFAR100, and ImageNet1k datasets, showing a significant improvement in comparison to state-of-the-art techniques for training DC-CGANs. Weak scaling is attained on all four datasets using up to 1000 processes and 2000 NVIDIA V100 GPUs on the OLCF supercomputer Summit.


1 Introduction

Generative adversarial neural networks (GANs) [1,2,3,4] are deep learning (DL) models in which an agent, called the generator, is trained on a dataset to map white noise sampled from a latent space into new (fake) data that resemble the original data it has been trained on. Another agent, called the discriminator, has to correctly discern between the original data (provided by the external environment for training) and the fake data (produced by the generator). The generator prevails over the discriminator if the latter can no longer distinguish the original data from the fake data. The discriminator prevails over the generator if the fake data created by the generator is categorized as fake, and the original data is still categorized as original. An illustration that describes a GANs model is shown in Fig. 1. Originally, GANs were used on image data to improve the generalizability of DL models for object recognition. In particular, the goal was to use GANs for data augmentation to generate new data (similar to the original data), and use the augmented dataset to improve the accuracy of the object classifier. Since GANs were originally introduced, the scope of applications that rely on GANs to overcome computational limitations has broadened. For instance, recent applications of GANs include video synthesis [5], to improve the resolution of videos recorded with video cameras, and face image alteration [6], to help national security agencies identify fugitive criminals who have undergone mild facial surgery.

The training of GANs is driven by the values of the cost functions associated with the agents [7]. The cost functions used to evaluate the performance of the discriminator and the generator are related to the number of false positives (original images identified by the discriminator as fake) and false negatives (fake images identified by the discriminator as original). The task of the discriminator is relatively simple in that it only has to assign a Boolean value to an image, according to whether the image is predicted as original or fake. On the contrary, the generator needs to map white noise sampled from the latent space into newly created images, and the images created must reproduce relevant features that belong to each data category represented in the training data.

This imbalance between the difficulty of the computational tasks of discriminator and generator is natural in GANs, and the cost functions currently used to measure the performance of the generator do not retain information about the disparity in computational tasks between discriminator and generator. As a result, the precision attained by the discriminator in performing its task (deciding whether an image is fake or original) is always higher than the precision with which the generator performs its own (creating a whole set of fake images from white noise). Recent game theoretic results show that the unbalanced training of GANs can cause the generator to cycle [8] or converge to a (potentially bad) local optimum [9], which causes the generator to get stuck reproducing only one specific data point (this phenomenon is known in the DL literature as mode collapse). It is thus important to balance the training of GANs models in order to improve the performance of the generator and obtain fake images with similar features to the ones contained in the original data, but this task is challenging [10,11,12,13,14].

Some recent approaches have tackled the imbalance of the two agents by changing the numerical optimization used to train the GANs model [15]. Other recent approaches have tackled the imbalance between discriminator and generator by increasing the complexity of the GANs model [16,17,18,19,20,21,22,23,24]. However, datasets with a large number of categories still pose a non-trivial challenge: the large variability between classes prevents the generator from attaining good performance in creating new fake images.

In addition, all the existing approaches to train GANs are characterized by a limited parallelizability, in that existing parallel techniques for GANs are based on data parallelization that distributes large-scale data to multiple replicas of the same model via ensemble learning, and do not further enhance the scalability of GANs by attempting any model parallelization [11]. Therefore, state-of-the-art GANs approaches do not fully leverage high-performance computing (HPC) facilities to attain a better performance.

We propose a novel distributed approach to train CGANs through a nonzero-sum game formulation that uses data categories to address the performance disparity between discriminator and generator and improve the scalability of the CGANs training via model parallelization. We use the labels to split the data and process each class independently, using a generator for each class. Our distributed approach relies on a factorization of the data distribution where each factor is associated with a single data category. This factorization of the probability distribution distinguishes our approach from standard CGANs, where the joint data probability is never decomposed into simpler factors and a single generator is still assigned the task of creating new images that span all the data categories. The data splitting performed according to the labels removes the variability between classes and thus corrects the imbalance of standard GANs training. Because each generator is independent of the others, the generators can be trained concurrently, and this enhances scalability.

2 Related work

Some approaches presented in the literature address the imbalance of the two agents by changing the numerical optimization used to train the GANs model. An example is the Competitive Gradient Descent method (CGD) [15], which recasts the GANs training as a zero-sum game, whereby the discriminator and the generator compete against each other, and the goal is to identify an equilibrium between the agents. However, the zero-sum formulation does not accurately reflect the interaction between generator and discriminator during the GANs training, since the loss of one agent does not directly translate into the gain of the other agent, as is assumed in a zero-sum game. Other recent approaches have tackled the imbalance between discriminator and generator by increasing the complexity of the GANs model [16,17,18,19,20,21,22,23,24]. Among these approaches, Conditional GANs (CGANs) [25,26,27,28,29,30] proceed by expanding the latent space used as input for the generator by adding information about the data categories. The role of CGAN models is to reconstruct a joint data distribution defined on the expanded latent space that combines the image data with the corresponding labels. The inclusion of data labels as an additional latent space variable helps the generator discern the relevant features that make an image more likely to belong to one data category than to another. However, datasets with a large number of categories still pose a non-trivial challenge: the large variability between classes prevents the generator from attaining good performance in creating new fake images.

Other approaches presented in the literature aim at improving the scalability of the training, such as ensemble learning [11]. Ensemble learning methods combine several machine learning models into one predictive model to decrease variance, bias, or improve predictions. In the context of GANs, one main advantage provided by ensemble learning is data parallelization, which accelerates the processing of large data by distributing it across different model replicas. Different model replicas exchange the portions of data in a round-robin fashion throughout consecutive iterations to ensure that the entire dataset is visited by each model replica. Moreover, the updates of the model parameters computed locally are exchanged between the model replicas at a tunable frequency to guarantee consistency between different replicas of the model.

3 Background on conditional generative adversarial neural networks

We first define the following input and output spaces, each with an associated probability distribution:

Z is a noise space used to seed the generative model. $Z = \mathbb {R}^{d_Z}$, where $d_Z$ is a hyperparameter. Values $\mathbf {z} \in Z$ are sampled from a noise distribution $p_\mathbf {z}(\mathbf {z})$. In our experiments $p_\mathbf {z}$ is a white noise model.
Y is an embedding space used to condition the generative model on additional external information, drawn from the training data. $Y = \mathbb {R}^{d_Y}$, where $d_Y$ is a hyperparameter. Using conditional information provided in the training data, we define a density model $p_\mathbf {y}(\mathbf {y})$.
X is the data space, which represents an image output by the generator or input to the discriminator. Values are normalized pixel values: $X = [0,1]^{W \times C}$, where W represents the resolution of the input images, and C is the set of distinct color channels in the input images. Using the images in the training data and their associated conditional data, we can define a density model $p_{\text {data}}(\mathbf {x})$ of images. This is exactly the density model we wish to replicate with the overall model in this paper.

We now define two functions:

$G: Z \times Y \rightarrow X$ is the conditional generative model (or generator), which accepts noise data $\mathbf {z}\in Z$ and produces an image $\mathbf {x}\in X$ conditional to the external information $\mathbf {y}\in Y$.
$D: X \times Y \rightarrow [0, 1]$ is the discriminative model (or discriminator), which accepts an image $\mathbf {x}$ and condition $\mathbf {y}$ and predicts the probability under condition $\mathbf {y}$ that $\mathbf {x}$ came from the empirical data distribution rather than from the generative model.

The goal of CGANs is to provide a model that estimates the probability distribution $p_\text {model}(\mathbf {x},\varvec{\theta },\mathbf {y})$, parameterized by the parameters $\varvec{\theta }$ that describe the DL model. We then refer to the likelihood as the probability that the model assigns to the training data: $\prod _{i=1}^m p_\text {model}(\mathbf {x}_i,\varvec{\theta },\mathbf {y})$, for a dataset containing m training samples $\mathbf {x}_i$. Among the different types of generative models, GANs are a type of model that works via the principle of maximum likelihood. The principle of maximum likelihood aims at choosing the parameters $\varvec{\theta }$ for the DL model that maximize the likelihood of the training data

$$\begin{aligned} \varvec{\theta }^* = \underset{\varvec{\theta }}{{\text {argmax}}}(p_\text {model}(\mathbf {x},\varvec{\theta },\mathbf {y})) \end{aligned}$$

(1)

Using the conditional information provided in the training data, we define a density model $p_\mathbf {y}(\mathbf {y})$. CGANs use the product rule of probability (the identity underlying Bayes' theorem) to combine the conditional probability $p_\text {model}(\mathbf {x},\varvec{\theta }|\mathbf {y})$ and the density model $p_\mathbf {y}(\mathbf {y})$ to yield the joint model probability $p_\text {model}(\mathbf {x},\varvec{\theta },\mathbf {y})$:

$$\begin{aligned} p_\text {model}(\mathbf {x},\varvec{\theta },\mathbf {y}) = p_\text {model}(\mathbf {x},\varvec{\theta }|\mathbf {y})p_\mathbf {y}(\mathbf {y}). \end{aligned}$$

(2)

While (2) cannot be expressed in closed analytical form, CGANs can be trained without needing to explicitly define a density function, because this type of generative model offers a way to train the model while interacting only indirectly with $p_\text {model}(\mathbf {x},\varvec{\theta },\mathbf {y})$, usually by sampling from it.

The two players in the game are represented by two functions. The discriminator is defined by a function D that takes $\mathbf {x}$ as input and uses $\varvec{\theta }^{(D)}$ as parameters. The generator is defined by a function G that takes $\mathbf {z}$ as input and uses $\varvec{\theta }^{(G)}$ as parameters. Both players have cost functions that are defined in terms of both players’ parameters. A consensus has been reached in the literature about which cost functions fully describe the performance of the discriminator due to the simplicity of the discriminator’s task [1, 31], whereas the complexity of the computational task of the generator still keeps different options open as to the cost function that better describes the generator’s performance. In the zero-sum game formulation, the discriminator minimizes a cross-entropy and the generator maximizes the same cross-entropy. The generator’s gradient tends to vanish when the discriminator successfully rejects the generator’s samples with high confidence, and vanishing gradients reduce the effectiveness of the updates computed during the training. To avoid vanishing gradients, an approach widely used in the literature is to transform the GANs training into a nonzero-sum game. In the context of nonzero-sum games, the generator maximizes the log-probability of the discriminator being mistaken instead of minimizing the log-probability of the discriminator being correct. Following the nonzero-sum game formulation, the cost used for the discriminator is

$$\begin{aligned} J^{(D)}(\varvec{\theta }^{(D)}, \varvec{\theta }^{(G)}) = - \frac{1}{2} \mathbb {E}_{\mathbf {x}\sim p_{\text {data}}}\log D(\mathbf {x}) - \frac{1}{2} \mathbb {E}_{\mathbf {z}} \log (1-D(G(\mathbf {z}))) \end{aligned}$$

(3)

The cost function we choose for the generator is

$$\begin{aligned} J^{(G)}(\varvec{\theta }^{(D)}, \varvec{\theta }^{(G)}) = \frac{1}{2}\mathbb {E}_{\mathbf {z}}\log D(G(\mathbf {z})). \end{aligned}$$

(4)

This cost function $J^{(G)}(\varvec{\theta }^{(D)}, \varvec{\theta }^{(G)})$ quantitatively describes the ability of the generator to trick the discriminator into mistaking fake images for real ones. This choice of $J^{(G)}(\varvec{\theta }^{(D)}, \varvec{\theta }^{(G)})$ better reflects the goal of the generator as an individual agent than alternatives that force a strong dependence of $J^{(G)}(\varvec{\theta }^{(D)}, \varvec{\theta }^{(G)})$ on $J^{(D)}(\varvec{\theta }^{(D)}, \varvec{\theta }^{(G)})$, such as the zero-sum game formulation where $J^{(G)}(\varvec{\theta }^{(D)}, \varvec{\theta }^{(G)})=-J^{(D)}(\varvec{\theta }^{(D)}, \varvec{\theta }^{(G)})$. The definition in (4) ignores the false positives, because the original images are not a product of the generator (only fake images are).
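As an illustration of (3) and (4), the following minimal sketch estimates both costs by averaging over a minibatch of discriminator outputs. It assumes that the values $D(\mathbf {x})$ and $D(G(\mathbf {z}))$ have already been computed and lie strictly between 0 and 1; the function names and the sample values are hypothetical and are not taken from the original implementation.

```python
import math

def discriminator_cost(d_real, d_fake):
    """Minibatch estimate of J^(D) in (3).

    d_real: discriminator outputs D(x) on original images, each in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated images, each in (0, 1)
    """
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return -0.5 * real_term - 0.5 * fake_term

def generator_cost(d_fake):
    """Minibatch estimate of J^(G) in (4), which the generator maximizes."""
    return 0.5 * sum(math.log(d) for d in d_fake) / len(d_fake)

# Hypothetical discriminator outputs for one minibatch.
d_real = [0.9, 0.8, 0.95]   # original images, mostly recognized as original
d_fake = [0.1, 0.2, 0.05]   # fake images, mostly recognized as fake
j_d = discriminator_cost(d_real, d_fake)
j_g = generator_cost(d_fake)
```

In this sketch, $J^{(G)}$ depends only on the outputs for fake images, and it increases toward zero as the discriminator assigns higher probabilities to the generated samples.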

The discriminator wishes to minimize $J^{(D)}(\varvec{\theta }^{(D)}, \varvec{\theta }^{(G)})$ and must do so while controlling only $\varvec{\theta }^{(D)}$. The generator wishes to maximize $J^{(G)}(\varvec{\theta }^{(D)}, \varvec{\theta }^{(G)})$ and must do so while controlling only $\varvec{\theta }^{(G)}$. The solution of this mini-max game is a Nash equilibrium. Here, we use the terminology of local differential Nash equilibria [32]. In this context, a Nash equilibrium is a tuple $(\varvec{\theta }^{(D)}, \varvec{\theta }^{(G)})$ that is a local minimum of $J^{(D)}$ with respect to $\varvec{\theta }^{(D)}$ and a local maximum of $J^{(G)}$ with respect to $\varvec{\theta }^{(G)}$.

The training process consists of a numerical optimization scheme that iteratively updates $\varvec{\theta }^{(D)}$ and $\varvec{\theta }^{(G)}$ [11]. On each step, two minibatches are sampled: a minibatch of $\mathbf {x}$ values from the dataset and a minibatch of $\mathbf {z}$ values drawn from the model’s prior over latent variables. The standard choice of a numerical optimization algorithm used to update $\varvec{\theta }^{(D)}$ and $\varvec{\theta }^{(G)}$ is a gradient-based optimization algorithm called Adam [33].
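For reference, a single Adam update can be sketched as follows for a list of scalar parameters. The default values of `beta1`, `beta2` and `eps` are the ones commonly suggested for Adam and are an assumption here, since the text does not report them; the function name `adam_step` and the toy quadratic objective are hypothetical. The update is written as a descent step, so a player that maximizes its cost would pass the negated gradient.

```python
import math

def adam_step(theta, grad, m, v, t, lr=2e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam descent update; t is the 1-based iteration counter."""
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m_i / (1.0 - beta1 ** t)             # bias correction
        v_hat = v_i / (1.0 - beta2 ** t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v

# Toy usage: minimize f(theta) = theta^2, whose gradient is 2 * theta.
# The learning rate 0.1 is chosen only to make the toy problem move quickly.
theta, m, v = [1.0], [0.0], [0.0]
for t in range(1, 51):
    grad = [2.0 * theta[0]]
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
```

In a GANs training loop, one such update would be applied to $\varvec{\theta }^{(D)}$ and one to $\varvec{\theta }^{(G)}$ on each step, each using the gradient of the corresponding player's cost.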

4 Distributed conditional generative adversarial neural networks

Our novel approach aims at implementing a distributed version of CGANs as a nonzero-sum game. Figure 2 describes how our approach relates to the other generative models presented in the literature. Our distributed approach to train CGANs relies on the equality

$$\begin{aligned} p_\text {model}(\mathbf {x},\varvec{\theta }) = \sum _{k=1}^K p_\text {model}(\mathbf {x},\varvec{\theta }|\mathbf {y}_k)p_\mathbf {y}(\mathbf {y}_k) \end{aligned}$$

(5)

to distribute the computation of each term $p_\text {model}(\mathbf {x},\mathbf {y}_k)$ by training K distributed CGANs, one per class, and then combining the results at the end of each training to yield $p_\text {model}(\mathbf {x})$. The numerical examples presented in this paper are characterized by a one-to-one mapping between $\mathbf {y}_k$ and the labels in the image dataset. The splitting of the total probability performed in (5) is possible only under the assumption that the input data contains labels, which are used to perform the splitting. The advantage of our approach is that all the K distributed CGANs can be trained concurrently and independently of each other, thus exposing the model to a higher level of parallelism. If the complexity of the representation of the objects in each category is comparable, the training time for each distributed CGANs model is approximately the same, which in turn translates into promising performance in terms of weak scalability (the time-to-solution is constant for an increasing number of processors used to solve problems of increasing size so that the computational workload per processor is unchanged). An illustration that describes the distribution of CGANs is provided in Fig. 3.
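The label-based splitting behind (5) can be sketched as below. This is a minimal illustration under the assumption that each sample carries exactly one integer label; the helper names `partition_by_label` and `class_priors` and the toy data are hypothetical. Each resulting subset would feed one generator–discriminator pair, and the relative class frequencies serve as an empirical estimate of $p_\mathbf {y}(\mathbf {y}_k)$.

```python
from collections import defaultdict

def partition_by_label(samples, labels):
    """Group samples by class label; returns {label: [samples of that class]}."""
    subsets = defaultdict(list)
    for x, y in zip(samples, labels):
        subsets[y].append(x)
    return dict(subsets)

def class_priors(labels):
    """Empirical estimate of p_y(y_k) as relative class frequencies."""
    counts = defaultdict(int)
    for y in labels:
        counts[y] += 1
    return {y: c / len(labels) for y, c in counts.items()}

# Toy dataset: strings stand in for images.
samples = ["img0", "img1", "img2", "img3", "img4", "img5"]
labels = [0, 1, 0, 2, 1, 0]
subsets = partition_by_label(samples, labels)
priors = class_priors(labels)
```

Because the subsets are disjoint, the K training runs share no data and need no communication during training.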

The parameters $\varvec{\theta }$ for each distributed CGANs are independent. Therefore, they are updated independently using Adam on each separate generator–discriminator pair. When the trained model is deployed for production, a random number generator provides the white noise and the label of the object whose image has to be generated. The randomly selected label determines which GANs pair to call, and the white noise is passed to the selected GANs pair to generate a new fake image for the specific object category associated with the label. The fact that our approach never enforces an exchange of updates across different agent pairs distinguishes it from ensemble learning, where the different model replicas continuously exchange local updates of $\varvec{\theta }^{(D)}$ and $\varvec{\theta }^{(G)}$ with each other to guarantee global consistency of the parameters. Ensemble learning mainly resorts to parallelization as a means to stabilize the training of GANs, and to accelerate the processing of large datasets. However, this stable training and faster data processing do not necessarily result in faster convergence. In fact, ensemble learning requires each model replica to span the entire dataset in a round-robin fashion, so each model replica is required to reconstruct the data distribution associated with the entire dataset, and the difficulty of the modeling task is not addressed. Our distributed approach confines each model replica to be trained on data associated with a single label, thereby accelerating the training, because the number of data batches processed by each GANs pair is significantly reduced. The partition of the data according to the classes helps our approach scale, as confirmed by the weak scaling tests presented at the end of the numerical section.
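The deployment step described above can be sketched as follows. The text does not state how the label is drawn; sampling it according to the class probabilities $p_\mathbf {y}(\mathbf {y}_k)$ is an assumption made here for consistency with (5). The function name `sample_fake_image`, the stand-in generators, and the prior values are all hypothetical.

```python
import random

def sample_fake_image(generators, priors, noise_dim=100, rng=None):
    """Pick a class label at random, then call the generator trained on that class."""
    if rng is None:
        rng = random.Random()
    labels = sorted(generators)
    weights = [priors[k] for k in labels]
    label = rng.choices(labels, weights=weights, k=1)[0]   # random label
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]    # white noise
    return label, generators[label](z)

# Stand-in "generators": each one only records its class and the noise length.
generators = {k: (lambda z, k=k: {"class": k, "noise_dim": len(z)}) for k in range(3)}
priors = {0: 0.5, 1: 0.3, 2: 0.2}
label, image = sample_fake_image(generators, priors, rng=random.Random(0))
```

Only the selected pair is evaluated for each request, so the cost of generating one image does not grow with the number of classes.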

Our approach can be changed so that one generator handles multiple classes together and creates fake images associated with multiple data classes. This variation of our approach would allow the code to run on small clusters (where there are not enough resources to instantiate several generators that can work independently). However, this would defeat the statistical motivation behind the way we split the data. In fact, we remind the reader that our distributed approach aims at reducing the variability in the portion of data that is handled by each generator separately. In situations like the ones described in this paper, the variability between classes is much larger than the variability between data of the same class. When multiple data classes are simultaneously handled by the same generator, the variability of the data portion remains large, and this hinders the generator from thoroughly exploring the data space. Because of the lack of benefit in grouping multiple classes under the same generator, our distributed approach (with one generator per data class) is designed to work on large-scale computers, because it inherently assumes that the number of processors available is at least equal to the total number of data classes.

In terms of neural network architectures to model the agents, our distributed approach aims at leveraging the parallelization of the training to enhance the predictive performance of simple (and thus faster to train) neural networks as opposed to state-of-the-art GANs that focus on building more complex (and thus more expensive to train) neural networks. To this end, the numerical results presented in Sect. 5 focus on using our distributed training to improve the performance of DC-CGANs, a relatively small and simple neural network architecture compared with the larger and more complex ones that have been recently proposed to improve the accuracy [19, 20, 22, 34].

5 Numerical results

In this section, we present numerical tests where we compare the performance of standard deep convolutional GANs [2] (DC-GANs) with deep convolutional conditional GANs (DC-CGANs) [35] and our distributed approach to train DC-CGANs on image data. Image data are represented as pixels in a Cartesian structure. For each pixel, a set of values called channels is assigned to describe the local graphic properties. Black and white images have one channel per pixel, and colored images have three channels (red, green and blue). In general, the graphic variability between classes is more pronounced than the graphic variability within a specific class, because objects of the same type generally resemble each other more than objects of a different nature. The benchmark datasets we consider are characterized by labels that clearly separate images according to their category, and the category is related to the type of object represented in the image.

The comparison between standard DC-GANs, standard DC-CGANs and our novel method for distributed DC-CGANs is performed on a quantitative level by measuring the Inception Score (IS) [36] and the Fréchet Inception Distance (FID) [37]. The IS takes a list of images and returns a single floating point number, the score. The score is a measure of how realistic a GAN’s output is. IS is an automatic alternative to having humans grade the quality of images. The score measures two things simultaneously: the image variety (e.g., each image is a different breed of dog), and whether each image distinctly looks like a real object. If both things are true, the score will be high. If either or both are false, the score will be low. A higher score describes better performance for GANs, as it means that the GAN model can generate many distinct and realistic images. Because IS is the exponential of an average Kullback-Leibler divergence, the lowest possible score is 1, and the score is bounded above by the number of classes predicted by the Inception classifier. FID is another metric used to assess the quality of images created by the generator of a generative adversarial network (GAN). Unlike IS, which evaluates only the distribution of generated images, the FID compares the distribution of generated images with the distribution of real images that were used to train the generator. Lower values of FID correspond to the distribution of generated images approaching the distribution of real images, and this is interpreted as an improvement of the generator in creating more realistic images.
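To make the definition of IS concrete, the sketch below computes it from a table of class probabilities, using the standard formula $\exp \left( \mathbb {E}_{\mathbf {x}}\, \mathrm {KL}(p(y|\mathbf {x}) \,\Vert \, p(y)) \right)$. In practice the probabilities $p(y|\mathbf {x})$ come from a pretrained Inception classifier applied to the generated images; here they are supplied directly, and the function name `inception_score` and the toy inputs are hypothetical.

```python
import math

def inception_score(probs):
    """IS from class probabilities: probs[i][j] = p(y = j | x_i)."""
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    total_kl = 0.0
    for p in probs:
        # KL(p(y|x) || p(y)); terms with zero probability contribute nothing.
        total_kl += sum(pj * math.log(pj / marginal[j])
                        for j, pj in enumerate(p) if pj > 0.0)
    return math.exp(total_kl / n)

# Images that all look equally like every class: no distinct objects.
score_uniform = inception_score([[0.25, 0.25, 0.25, 0.25]] * 3)
# One confidently classified image per class: distinct and varied.
score_varied = inception_score([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
```

The first toy input yields a score of 1 and the second a score of 3, the number of classes in that example.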

In DC-GANs, the input of the generator has size 100 (size of the white noise) and the output of the generator has one channel for black–white images and three channels for colored images. The specifics of the architectures for generator and discriminator used to build DC-GANs models are described in Tables 1 and 2. DC-CGANs differ from DC-GANs because they use additional information about the labels to improve the training of the generative model. In DC-CGANs, the input of the generator has size $100 + K$ (100 is the size of the white noise and K is the number of data classes) and the output of the generator has two channels (one channel for the color and one for the label) for black–white images and four channels (three channels for the color and one for the label) for colored images. The specifics of the architectures for generator and discriminator used to build DC-CGANs models are described in Tables 3 and 4. The architectures of generator and discriminator for our approach that implements distributed DC-CGANs are the same as for DC-GANs, because each generator–discriminator pair focuses only on one data class, so the conditional information about the data label is inherently retained in the selection of the data portion used for training. The training is performed using the Adam optimizer with a learning rate of $2 \times 10^{-4}$ and a total of 1,000 epochs for all the types of GANs we consider on each dataset.
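The generator input of size $100 + K$ can be illustrated as follows. The text gives only the sizes; encoding the label as a one-hot vector appended to the noise is an assumption (a common choice for CGANs), and the function name `generator_input` is hypothetical.

```python
import random

def generator_input(label, num_classes, noise_dim=100, rng=None):
    """Concatenate white noise (size noise_dim) with a one-hot label (size K)."""
    if rng is None:
        rng = random.Random()
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    one_hot = [1.0 if k == label else 0.0 for k in range(num_classes)]
    return z + one_hot

# Example with K = 10 classes (as in MNIST or CIFAR10) and label 3.
vec = generator_input(label=3, num_classes=10, rng=random.Random(0))
```

For the distributed DC-CGANs the label part is dropped, since each generator serves a single class and only the noise of size 100 is needed.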

In order to support our claim that the variability of the features between image classes is much larger than the variability of the features between images of the same class, we run the Analysis of Variance (ANOVA) test [38] on each dataset. ANOVA is a procedure to measure the “variation” among and between groups, and it is used to analyze the differences among means. ANOVA is based on the law of total variance, where the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal. The null hypothesis of the ANOVA test assumes that the means of all the groups of the population are equal, whereas the alternative hypothesis assumes that at least one group of the population has a mean different from the others.

The test statistic used in the hypothesis test is the F-statistic, which follows a Fisher (F) distribution under the null hypothesis. High values of the F-statistic correspond to small p values (i.e., p values lower than 0.05) of the ANOVA test, and this results in statistical evidence to reject the null hypothesis and accept the alternative hypothesis, meaning that the variability between groups is much larger than the variability within groups. Statistical evidence to reject the null hypothesis is used as a supporting argument that different classes can be treated separately.
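The one-way ANOVA F-statistic can be computed as the ratio of the between-group mean square to the within-group mean square, as in the sketch below. The text does not specify which image feature is tested, so the example assumes one scalar feature per image (for instance a mean pixel intensity); the function name and the toy values are hypothetical.

```python
def anova_f_statistic(groups):
    """One-way ANOVA F-statistic for a list of groups of scalar observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variability between groups (k - 1 degrees of freedom).
    ss_between = sum(len(g) * (mu - grand_mean) ** 2 for g, mu in zip(groups, means))
    # Variability within groups (n_total - k degrees of freedom).
    ss_within = sum((x - mu) ** 2 for g, mu in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Toy feature values for three classes, three images each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = anova_f_statistic(groups)
```

Converting the statistic into a p value requires the cumulative distribution function of the F distribution with $(K-1, n-K)$ degrees of freedom; a library routine such as `scipy.stats.f_oneway` returns both quantities.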

5.1 Hardware description

The numerical experiments are performed using Summit [39], a supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory. Summit has a hybrid architecture, and each node contains two IBM POWER9 CPUs and six NVIDIA Volta GPUs all connected together with NVIDIA’s high-speed NVLink. Each node has over half a terabyte of coherent memory (high bandwidth memory + DDR4) addressable by all CPUs and GPUs plus 1.6 TB of non-volatile memory (NVMe) storage that can be used as a burst buffer or as extended memory. To provide a high rate of communication and I/O throughput, the nodes are connected in a non-blocking fat-tree using a dual-rail Mellanox EDR InfiniBand interconnect.

5.2 Software description

The numerical experiments are performed using Python 3.7 with the PyTorch v1.3.1 package [40] for autodifferentiation to train the DL models on GPUs, and the mpi4py v3.0.2 package is used for distributed computing.

For the DC-GANs and the DC-CGANs approaches, generator and discriminator are mapped to the same MPI process. For the distributed DC-CGANs, there are multiple discriminator–generator pairs, each one associated with a specific data class, and every discriminator–generator pair is mapped to an MPI process. Each MPI process instantiated in the distributed DC-CGANs is linked to two GPUs, one dedicated to training the discriminator and one dedicated to training the generator. Therefore, the total number of GPUs used with distributed DC-CGANs amounts to twice the number of MPI processes instantiated.
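The mapping between MPI processes, data classes and GPUs can be sketched as a pure function of the process rank. Several details are assumptions made for illustration: that rank k handles class k, that ranks are placed on nodes in consecutive blocks, that a six-GPU node hosts three processes, and that the first GPU of each pair serves the discriminator. The function name `assign_resources` is hypothetical; with mpi4py the rank would typically be obtained from `MPI.COMM_WORLD.Get_rank()`.

```python
def assign_resources(rank, gpus_per_node=6, gpus_per_process=2):
    """Map an MPI rank to a data class, a node index, and two local GPU ids."""
    procs_per_node = gpus_per_node // gpus_per_process   # 3 with the defaults
    local_rank = rank % procs_per_node
    first_gpu = local_rank * gpus_per_process
    return {
        "class_index": rank,                 # assumption: one class per rank
        "node": rank // procs_per_node,      # assumption: block placement
        "discriminator_gpu": first_gpu,      # assumption: order within the pair
        "generator_gpu": first_gpu + 1,
    }

# With 1000 processes and two GPUs each, 2000 GPUs are used in total.
num_processes = 1000
total_gpus = num_processes * 2
example = assign_resources(4)
```

Under these assumptions, rank 4 lands on the second node and uses local GPUs 2 and 3.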

Table 1 Architecture of the generator in DC-GANs
