1 Introduction

Generative adversarial networks (GANs) are a sub-class of generative models with the ability to produce new data and to judge whether a given sample is real or generated. The generative adversarial network was introduced in 2014 by Ian J. Goodfellow et al. [28] in a paper presented at the Neural Information Processing Systems (NIPS) conference.

Most neural networks learn from a limited data set and therefore often suffer from misclassification and overfitting. The GAN model is a powerful architecture that generates its own samples and learns from them, which helps it overcome these limitations of traditional networks.

According to Goodfellow et al. [28], a GAN is structured as a two-player min–max game with the value function V(D,G), whose solution is a Nash equilibrium. The mathematical description given by Goodfellow et al. is shown in Formula 1.

$$\begin{gathered} \mathop {\min }\limits_G \mathop {\max }\limits_D V(D,G) \hfill \\ V(D,G) = {E_{x \sim p_{data}(x)}}[\log D(x)] + {E_{z \sim p(z)}}[\log (1 - D(G(z)))] \hfill \\ \end{gathered}$$
(1)
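As a rough illustration of Formula 1, the sketch below estimates V(D,G) from a handful of discriminator outputs. The function name `value_function` and the sample values are assumptions made for this example; they do not come from [28].

```python
import math

def value_function(d_real, d_fake):
    """Monte Carlo estimate of V(D, G) from Formula 1.

    d_real: discriminator outputs D(x) on real samples, each in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1)
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A confident, correct discriminator pushes the value towards 0 (its maximum).
v_strong = value_function([0.99, 0.98], [0.01, 0.02])

# A discriminator that cannot separate real from generated data outputs 0.5
# everywhere, which gives V = -log 4.
v_confused = value_function([0.5, 0.5], [0.5, 0.5])
```

The discriminator tries to raise this value, while the generator tries to lower it by making D(G(z)) approach the scores given to real data.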

In 2015, a new variant of GAN was proposed by ABC, and this work has become the basis for subsequent GAN variants. In this work, the GAN is broken down into two modules, the Generator G(A) and the Discriminator D(A). The generator produces data similar to the training dataset, and the discriminator is a network that tries to distinguish real data from generated data. The GAN model works on the principles of game probability. The idea is to generate a random variable (A) whose properties are similar to those of the actual variable. Specifically, the random-variable generation experiment is repeated N times until it yields the actual variable value; the chance of each outcome is its probability (P), and the set of all possible outcomes is the sample space, represented by Ω. Overall, this defines a probability distribution function P(A), which assigns to every outcome a result R, as shown in Formula 2.


$$P:\Omega \to Z \quad \left( \text{assuming the probability of the generated random variable is always positive, } P\left( A \right) \geqslant 0 \right)$$
(2)

Hence, the probabilities of all outcomes sum to one, i.e. Σ_{A ∈ Ω} P(A) = 1. A simple real-life analogy for GAN is two people playing a guess-the-number-in-my-mind game. R. Chang et al. 2023 [143] and Z. Pan et al. 2019 [105] are some of the experimental works that support the above hypothesis. The simple GAN model based on game probability is shown in Fig. 1.

Fig. 1 Architecture of GAN

1.1 Basic Modules of GAN

The GAN deep learning model is mainly made of two adversarial network modules, a Generator and a Discriminator.

Generator: The generator is an unsupervised model in GAN that generates new samples in the input space based on a learned summary of the real input distribution. It takes fixed-length random vectors drawn from a Gaussian distribution as input; after training, points in this multi-dimensional vector space correspond to a compressed representation of the data distribution. The architecture of the generator is shown in Fig. 2.

Fig. 2 Architecture of Generator in GAN

Discriminator: The discriminator is a supervised model in GAN that classifies its input based on a class label. It takes samples from the real and generated datasets and predicts a binary label, 0 or 1, classifying the received data as fake or real, respectively. The architecture of the discriminator is shown in Fig. 3.

Fig. 3 Architecture of Discriminator in GAN
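The data flow between the two modules can be sketched with a deliberately tiny example. The one-dimensional `generator` and `discriminator` below, and their weights, are hypothetical stand-ins for the neural networks described above; they only show how Gaussian noise becomes a generated sample and how that sample receives a binary label.

```python
import math
import random

def generator(z, weight=2.0, bias=1.0):
    """Toy generator: maps a noise value z to a synthetic sample."""
    return weight * z + bias

def discriminator(x, weight=1.5, bias=-0.5):
    """Toy discriminator: logistic score in (0, 1); near 1 means 'real'."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(5)]   # Gaussian input vector
fake_samples = [generator(z) for z in noise]         # generated data
scores = [discriminator(x) for x in fake_samples]    # probability of 'real'
labels = [1 if s >= 0.5 else 0 for s in scores]      # binary label: 1 real, 0 fake
```

In an actual GAN both functions are trained networks, and the weights are updated in alternation rather than fixed as they are here.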

1.2 Applications of GAN

A generative adversarial network is a widely used neural network model with applications in many domains. Using the GAN model in these applications has noticeably improved results and system accuracy. In this study, we discuss some of the well-known applications of GAN as follows.

1.2.1 Application of GAN in Cyber and Network Security

Anomalies in security systems damage both the systems and our privacy. The GAN approach is valuable for improving cyber security and building a safer system environment that protects against various attacks. GAN is also one of the latest ideas applied to self-driving cars to enhance their safety during navigation and the collection of specific sensor data. Applying GAN in cyber security has become an active field among researchers, and a large body of research using the GAN approach can be observed in this area.

The GAN model can be used to detect various cyber intrusions such as distributed denial of service attacks, botnet attacks etc. [1]. To detect cyber-physical system attacks, FID-GAN, an unsupervised intrusion detection system, was designed [2]. Imbalanced data set problems during intrusion detection are addressed using simple GAN and GAN with Earth-Mover distance in [6, 7]. To enhance the accuracy of the GAN model, the labelled sample set is expanded using an advanced binary classification model [3]. In Yixuan Wimu et al., a mining approach based on fuzzy rough sets, CNN and GAN is presented to enhance intrusion detection based on feature extraction [4, 5]. GAN and modified versions of GAN, like PAC-GAN, have notably contributed to detecting malware and standard packets in cyber security [8, 9].

Overall, GAN can be used in most studies related to threat detection [10,11,12], false data injection attacks, imbalanced data problems etc., in the cyber and network security domain.

1.2.2 Application of GAN in Healthcare Industry

GAN is one of the notable inventions of AI and has contributed to most domains in today's research environment. Many of the surprising and impressive tasks performed by humans and AI bots rely on GAN. The healthcare industry is one of the fields that has benefited most from GAN. The resolution of radiology images such as CT, MRI, ultrasound, radiography and elastography can be enhanced by GAN. The small data set problem during the training phase is one of the major healthcare issues addressed by GAN.

To understand the role of GAN in healthcare, we have gone through different research works. Most of the work was observed in enhancing image clarity. In Yuhui Ma et al. [13], a versatile novel approach, Still-GAN, is introduced to enhance low and high-quality images. The Lesion Focused Multiscale approach in [14] and the enhancement of low-resolution CT images by the GAN-Circle approach [15] are a few other enhancement techniques noted. To generate high-resolution 3D medical images, a hierarchical amortized GAN is used in the research work presented in [16].

The other notable application of GAN is image generation and synthesis. Chikato Yamasoba et al. [17] presented an approach to generate images of different modalities using DCGAN and Cycle GAN. In [18], DC-GAN is used for medical data synthesis, and MR image generation using GAN is observed in [19]. In image augmentation, the works reviewed include GAN augmentation for liver lesion classification [20], the fund-GAN approach to augment fundus images for retinal image classification [21], pseudo-3D cycle GAN for lumbar spine data synthesis [22] and 3D multi-conditional GAN for image augmentation in lung nodule classification [23]. Finally, we noticed a few more applications such as medical image segmentation using MS-GAN [24] and U-net based GAN [26], image fusion based on GAN [25] and tumour classification [27]. In conclusion, GAN has become a boon for the growth of the medical field.

1.2.3 Application of GAN in Computer Vision

In this survey, we have considered some of the applications of GAN, which have made revolutionary improvements in computer vision. The application of GAN in computer vision can be classified into the generation of image datasets, super-resolution, creating human face photographs, image-to-image translation, generating realistic pictures, face frontal view generation and generating new human poses.

Generating image datasets is an approach to creating new plausible images from existing images. This approach was first designed by Ian Goodfellow et al. in 2014 [28]. In this paper, the authors generated plausible images for the MNIST, CIFAR-10 and Toronto Face Database datasets. In 2015 [29], Alec Radford et al. designed an approach to stabilize GAN training. This approach was beneficial in overcoming the overfitting problem of CNN and ML models on small datasets.

To enhance image resolution, SRGAN is one of the most widely used approaches. In this approach, the generated image has a higher pixel resolution; some of the known works using SRGAN were conducted in 2016 by Christian Ledig et al. [30] and in 2017 by Huang Bin et al. [31]. In 2018, Subeesh et al. [32] presented an approach to creating high-resolution photographs using the SR network.

The GAN model can also be applied to generate pictures of human faces. In 2017, Tero Karras et al. [33] published a work where celebrity faces are generated from input samples, and the generated output is quite similar to those samples. Later, many works were published using the work of Tero Karras et al. as a base paper.

Image-to-image translation is another vital application of GAN. The first paper on image translation was published in 2016 by Phillip Isola et al. [34]; the work was based on a conditional adversarial network and the pix2pix approach. In 2018, Andrew Brock et al. [35] proposed a work to generate realistic photographs using BigGAN, and the generated images are noted to be very similar to real photographs. Face frontal view generation by GAN came to light in 2017 with Rui Huang et al. [36], who used a global and local GAN. Face photos taken from various angles are used to generate the frontal view and different human poses.

To analyze the growth and advancement of GAN in various fields, we queried different journals with the keywords "GAN" and "Generative Adversarial Network", filtered by publication year from 2016 to 2023. This search aims to give a detailed, comprehensive overview for researchers and practitioners by answering the research questions on the growth of GAN shown in Table 1. In Table 2, the abbreviations are CONF: Conference, JOR: Journal, EAA: Early Access Article, MAG: Magazine, BOK: Book, RA: Review Article, RSA: Research Article, BOC: Book Chapter, COP: Conference Proceeding, RWE: Reference Work Entry and RW: Reference Work.

Table 1 Defined research questions to analyze the growth of GAN in various fields
Table 2 Distribution and growth of research works on GAN across various journals that satisfy the defined research questions

After analyzing research questions, we understood that the progress of GAN in various domains is increasing exponentially, especially in computer vision, as observed in RQ5 in Table 2. This paper aims to analyze and understand current practices, approaches and ground truth of GAN in computer vision and image enhancement techniques. Our contribution to this paper is as follows:

  • A detailed literature survey on GAN and its variants is carried out. A detailed report on the techniques and current tools is outlined by framing the research questions.

  • A detailed review of existing work on image enhancement techniques in GAN is presented. An in-depth analysis of the evaluation metrics, datasets, methodologies and tools of various methods is given by carrying out a systematic literature review.

  • We highlighted some of the gaps and challenges in the spectrum of image enhancement techniques using GAN, which can be helpful for future research work.

Overall, this paper is structured as follows. In Sect. 2, the detailed review process is presented by defining the research questions. In Sect. 3, variants of GAN in computer vision and the outcomes of the research questions are outlined. In Sect. 4, gaps and challenges are discussed, and Sect. 5 concludes the paper.

2 Taxonomy of Systematic Literature Review

To perform a detailed and systematic literature survey, we have referred to a few benchmark review works proposed by Bugen et al. [37], B. Kitchenham et al. [38] and M. A. Barbar et al. [39] in the area of software engineering. Throughout this paper, we have adopted their approaches to design our review and organized our survey into three significant steps: planning, conducting, and reporting, as shown in Fig. 4.

Fig. 4 Taxonomy of Systematic Literature Review

2.1 Planning

The primary aim of this stage is to provide sufficient information and a systematic path for the conducting and reporting stages. This phase consists of three steps.

  • Identifying the need for a Survey

    Before a systematic survey, the research scholar must understand how important the survey is. The researcher should go through the existing survey work available; we have read a good number of works to perform this step.

  • Formulate Research Question

    A well-structured research question helps to steer the identified study in a proper direction. We have drawn up all possible research questions in this phase to match our study.

  • Review Protocol

    Generally, protocols are a critical element in most literature surveys. This step checks whether the described research questions, planned strategy and background context meet the needs of the designed survey. In this study, we have followed a hierarchical approach to the review protocol.

2.2 Conducting

Conducting is the next step after planning. This phase has four steps.

  • Search Strategy

    It is a predefined approach that aims to find possible primary research papers related to our work. In this step, we designed a search technique based on a specific keyword, a synonym of a keyword or a constructed string using possible keywords.

  • Selection of Study on Criteria Basis

    Various challenges are encountered during the literature selection process, like language, author, journal etc. The presented work follows a well-defined protocol to decrease bias and ensure fairness.

  • Study Quality Assessment

    This process's primary goal is to ensure the quality and relevance of selected papers from the previous steps. Here, we have fixed a set of quality metrics to appraise the quality of this study.

  • Data Extraction and Monitoring

    In this phase, the sources and forms used to collect the data required for the study are designed. We have carefully selected the necessary references and entities for our research and recorded them thoroughly.

2.3 Reporting

In this phase, all the extracted and analyzed data is summarized well. This phase consists of two steps.

  • Data Synthesis

    In this step, data synthesis and summarization are achieved using a graphical and tabular approach, which is more suitable for understanding.

  • Reporting Finding

    In this stage, the synthesized data is reported, together with the supporting evidence, through a proper channel that targets research scholars.

2.4 Implementation of Systematic Literature Review

2.4.1 Identifying the Need for a Survey

To identify the importance of the study, we tried to analyse the current research trend, especially in GAN. We have searched various journals, and it is observed there has been a steady growth in the count of papers published over the years, as shown in Table 2.

2.4.2 Formulate Research Question

Picking a research question is an essential first step in defining the overall purpose of a specific study. In this paper, we have established stable research questions (RQ) to guide researchers, increase confidence in the domain and understand the recent practices and trends of GAN in computer vision. The established RQs and SRQs are given in Table 3.

Table 3 Defined research question to perform systematic literature survey

2.4.3 Review Protocol

After defining the RQs, the research questions are sent to the research guide, research supervisor and co-supervisor to check the depth and correctness of the RQs. The research guide has also evaluated the protocol design of this study. After the supervisor reviewed the protocol, we proceeded further in our research.

2.4.4 Search Strategy

We started our research with the intent to compile as many studies and works related to our research domain as possible. In this phase of the collection, we included all possible keywords and also phrases that match the keywords. The keywords used are shown in Table 4.

Table 4 Various keywords used in search strategy

To collect the study papers, we looked into several journal repositories. Although many digital journals are available these days, the repositories selected for this paper are listed below.

  • Web of Science

  • IEEE digital library

  • ACM digital library

  • Springer

  • Semantic Scholar

This search is restricted to the period from 2014 to 2023, including journals, conferences and archives.
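The keyword-based search string described above could be assembled as in the sketch below. The helper `build_query` and the `year:[… TO …]` filter syntax are hypothetical; each repository has its own query language, and the two keywords are only the ones named earlier in this paper, not the full content of Table 4.

```python
def build_query(keywords, start_year, end_year):
    """Join keyword variants with OR and attach a publication-year range."""
    quoted = [f'"{k}"' for k in keywords]
    return f"({' OR '.join(quoted)}) AND year:[{start_year} TO {end_year}]"

# Keywords and synonyms are combined into one constructed search string.
query = build_query(["GAN", "Generative Adversarial Network"], 2014, 2023)
print(query)
```

Keeping the keywords in a list makes it straightforward to add synonyms or variant names without rewriting the query by hand.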

2.4.5 Selection of Study on Criteria Basis

In selecting the relevant work after the search and collection process, we established two inclusion criteria to pick the most relevant study, as listed below.

  • The keyword should be part of the abstract, keywords and title.

  • A few papers work on GAN but do not include the keyword in the abstract, title or keywords. In such cases, we went through the complete article to complete the selection process.

To skip some studies that do not support the objective and aim of the study, we have defined three exclusion criteria as follows.

  • Studies which are not in English.

  • GAN papers related to healthcare, cyber security, networks and other domains unrelated to computer vision.

  • Conference proceedings are not considered for the study.

The detailed inclusion process is shown using the PRISMA approach in Fig. 5.

Fig. 5 PRISMA Inclusion Process for Systematic Literature Review
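The inclusion and exclusion criteria of this section can be read as a simple filter over paper records. The sketch below is one possible interpretation: the record fields (`title`, `abstract`, `keywords`, `language`, `domain`, `type`, `full_text`), the keyword pattern and the sample records are all hypothetical, and the first inclusion criterion is assumed to be satisfied when the keyword appears in any of the title, abstract or keyword list.

```python
import re

# Matches GAN, GANs and compound names such as CycleGAN or DCGAN, plus the full phrase.
KEYWORD = re.compile(r"\b\w*GANs?\b|(?i:generative adversarial networks?)")

def is_selected(paper):
    """Apply the exclusion criteria first, then the two inclusion criteria."""
    if paper["language"] != "English":
        return False                      # exclusion: not in English
    if paper["domain"] != "computer vision":
        return False                      # exclusion: unrelated domain
    if paper["type"] == "conference proceeding":
        return False                      # exclusion: conference proceedings
    searchable = " ".join([paper["title"], paper["abstract"], *paper["keywords"]])
    if KEYWORD.search(searchable):
        return True                       # inclusion 1: keyword in metadata
    return bool(KEYWORD.search(paper.get("full_text", "")))  # inclusion 2: full text

base = {"abstract": "", "keywords": [], "language": "English",
        "domain": "computer vision", "type": "journal"}
papers = [
    {**base, "title": "CycleGAN for night image enhancement"},
    {**base, "title": "GAN-based CT synthesis", "domain": "healthcare"},
    {**base, "title": "Unpaired translation of night images",
     "full_text": "We train a generative adversarial network on unpaired data."},
    {**base, "title": "SRGAN for documents", "type": "conference proceeding"},
]
selected = [is_selected(p) for p in papers]
```

Checking the exclusion criteria first avoids reading the full text of papers that would be discarded anyway.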

2.4.6 Study Quality Assessment

After the selection process, assessing the quality of the selected studies is crucial to conducting a proper systematic review. The result obtained from the survey should be firm and avoid all sorts of bias. This paper uses the criteria stated in research work [40] for the quality assessment.

2.4.7 Data Extraction and Monitoring

In this phase, we extract the data required for the study. After going through six journal repositories to answer the defined RQs, we set some rules and the minimal entities required from each paper. In this paper, we extracted author details, publication details, journal details, dataset, features, methods, and metrics used.
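The entities extracted from each paper map naturally onto a small record type. The sketch below is only an assumed layout: the class name `ExtractedStudy`, its field names and the placeholder values are hypothetical and do not correspond to any paper in the comparison tables.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ExtractedStudy:
    """One row of extracted data per reviewed paper."""
    authors: list
    publication_year: int
    journal: str
    dataset: str
    methods: list = field(default_factory=list)
    features: list = field(default_factory=list)
    metrics: list = field(default_factory=list)

# Placeholder values for illustration only.
record = ExtractedStudy(
    authors=["A. Author"],
    publication_year=2020,
    journal="Example Journal",
    dataset="Example-Set",
    methods=["DCGAN"],
    metrics=["PSNR"],
)
row = asdict(record)  # plain dict, ready to be written to a table or CSV file
```

A fixed record layout of this kind makes it easier to check that every reviewed paper contributes the same minimal set of entities.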

2.4.8 Data Synthesis and Reporting

The data synthesis and reporting is the last phase of the systematic review, where the findings from the data extraction stage are segregated and presented as a supportive definition for RQs. In this phase, we have used graphs and tables to visualize the summarized data.

3 Outcomes

3.1 RQ-1: What are the Well-known Variants of GAN?

3.1.1 Deep Convolutional Generative Adversarial Networks (DCGAN)

The DCGAN model was proposed by Radford et al. in 2015; it consists of two CNN models, namely a discriminator and a generator, with convolution transpose layers, as shown in Fig. 6.

Fig. 6 Proposed DCGAN Model in [66]

The principal aim of DCGAN is to support unsupervised learning, using strided and transposed convolutions for downsampling and upsampling [66].

The essence of DCGAN is as follows:

  • Fully connected hidden layers are eliminated.

  • Max pooling layers are replaced with strided convolution layers in the discriminator and fractional-strided convolution layers in the generator.

  • Batch normalisation is used, except for the generator's output layer and the discriminator's input layer.

  • LeakyReLU is applied in all layers of the discriminator.

  • ReLU is used in the generator except in the output layer. In the generator output layer, tanh is applied.
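To make the role of strided and fractional-strided convolutions concrete, the sketch below computes the spatial size after each layer. The kernel size 4, stride 2, padding 1 and the 4-to-64 pixel range are assumptions commonly seen in DCGAN implementations; they are not values taken from [66].

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Output size of a strided convolution (downsampling, discriminator)."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_transpose_out(size, kernel=4, stride=2, padding=1):
    """Output size of a fractional-strided (transposed) convolution
    (upsampling, generator)."""
    return (size - 1) * stride - 2 * padding + kernel

# Generator: each transposed convolution doubles the feature-map size.
up = [4]
while up[-1] < 64:
    up.append(conv_transpose_out(up[-1]))

# Discriminator: each strided convolution halves it, replacing max pooling.
down = [64]
while down[-1] > 4:
    down.append(conv_out(down[-1]))

print(up)    # [4, 8, 16, 32, 64]
print(down)  # [64, 32, 16, 8, 4]
```

With these settings the generator and discriminator mirror each other, which is why no pooling or unpooling layer is needed.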

In this paper, some of the works based on DCGAN are presented. In the survey process, our foremost aim is to identify the methodologies, models and applications where DCGAN can be applied. In [41], Yurika Sagawa et al. presented a model for facial image generation from attributes and labels using DCGAN, and a few more works where the researchers' primary motivation was to generate facial images using DCGAN are noticed in [44, 46, 53, 58, 61].

DCGAN contributes strongly to data augmentation, enhancing the accuracy of a target CNN model by increasing the dataset's size or building a training model, as seen in [52, 59]. However, the most noticeable work of DCGAN is in creating and analysing anime characters [61, 63]. It is noticed that using DCGAN with a CNN model or a well-known algorithm such as self-learning [58], SVM [46] etc., gives better accuracy. The detailed study of DCGAN is outlined in Table 5.

Table 5 Comparative study on DCGAN model

3.1.1.1 SRQ-1.2: What Are the Applications of DCGAN?

Based on the applications of DCGAN in computer vision, we noticed that DCGAN contributes most to image generation and data augmentation. Considering all 25 works together, we observed five papers on face image synthesis, six on data augmentation, two on anime character generation, four on resolution enhancement, and eight on data generation. Table 5 illustrates a detailed study of 25 research papers on DCGAN; based on this table, Fig. 7 outlines a list of DCGAN applications. Hence, we conclude that DCGAN works well for image generation.

Fig. 7 List of Applications Used in DCGAN
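As a quick consistency check on the counts reported above, the following sketch tallies the five application categories and converts them to percentage shares. The dictionary keys are shortened labels chosen for this example.

```python
# Paper counts per DCGAN application, as reported in the text above.
counts = {
    "face image synthesis": 5,
    "data augmentation": 6,
    "anime character generation": 2,
    "resolution enhancement": 4,
    "data generation": 8,
}

total = sum(counts.values())  # should equal the 25 surveyed papers
shares = {name: round(100 * n / total) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The categories add up to the 25 surveyed works, with data generation accounting for roughly a third of them.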

3.1.2 Conditional Generative Adversarial Networks (CGAN)

Conditional GAN (CGAN) is a well-known variant of GAN designed to train generative models. CGAN was first presented in 2015 by Mehdi Mirza et al. [67].

The primary function of conditional GAN is to sample from a conditional distribution instead of the marginal distribution. In conditional GAN, sampling is conditioned on additional auxiliary information such as labels and data. The detailed architecture is given in Fig. 8. Based on Fig. 8, the two-player min–max function V(D, G) given in [29] can be redefined for CGAN as shown below.

$$\mathop {\min }\limits_G \mathop {\max }\limits_D V(D,G) = {E_{x \sim p_{data}(x)}}[\log D(x|y)] + {E_{z \sim p(z)}}[\log (1 - D(G(z|y)))]$$
(3)

Fig. 8 Conditional GAN Model

Here D(x|y) is the discriminator with input x and label y, and G(z|y) is the generator with noise vector z and label y.
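One common way to supply the label y to both networks is to append a one-hot encoding of the label to the input vector. The sketch below assumes this concatenation scheme; the helper names and the example values are hypothetical.

```python
def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def condition(vector, label, num_classes):
    """Append the one-hot label y to a noise vector z or a sample x."""
    return list(vector) + one_hot(label, num_classes)

z = [0.3, -1.2, 0.7]                               # noise vector
g_input = condition(z, label=2, num_classes=4)      # what the generator receives
print(g_input)  # [0.3, -1.2, 0.7, 0.0, 0.0, 1.0, 0.0]
```

The discriminator input is built in the same way from a real or generated sample, so both networks see which class the sample is supposed to belong to.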

Generally, the major applications of CGAN are video generation, face generation, image-to-image synthesis and text-to-image synthesis. When we queried the IEEE digital library with the keyword CGAN and filtered from 2019 to 2023, 24 publication topics were listed; in the extracted list, image classification, feature extraction, and medical image processing are the top 3 publication topics for CGAN. In this study, we collected 34 papers on CGAN by restricting our subject to CGAN in computer vision and image processing. The detailed outline of the studied research papers is given in Table 6.

Table 6 Comparative study on CGAN model

In the survey phase, we came across various works; among these, medical image processing using CGAN has many notable results. In [68], Changhee Han et al. used a 3D multi-conditional GAN to augment a small fragmented CT image dataset. Similar works are observed in Ke Xu et al. [69] and Meng Li et al. [70], which present, respectively, a novel CGAN approach named MCRGAN that can generate pseudo-CT images under limited training data and a transformer-based CGAN architecture called MedViTGAN for synthetic histopathology image augmentation. In the medical field, one more application of CGAN is image segmentation. In [71, 72], we noticed the application of CGAN in improving the lesion contrast of MR images and in retinal vessel segmentation. Image denoising by Zhao Yang et al. [73], [74] and Miao Tian et al. [75], and image synthesis by Huan Yang et al. [76], Zhaohui Liang et al. [77] and Yulin Yang et al. [78] are some of the other noted works of CGAN in medical image processing.

Apart from medical image processing, we have studied the application of CGAN in the computer vision domain. Jeongik Cho et al. [79] address the problem that CGAN increases the number of hyperparameters and reduces training speed; the designed approach uses multiple GANs that share all the hidden layers. In [80], the work presented by Tetsuya Ishikawa et al. illustrated a method to augment training data using CGAN. A few works in computer vision addressed problems like large model size and high inference time [81], and in [82], Felipe Coelho Silva et al. demonstrated a semi-automatic framework for manga art colourization. Other applications of CGAN include quality reconstruction, art fonts, image generation, video games, rejuvenation of face images etc. In Table 6, we have given a comparative analysis of all our studies in CGAN based on parameters like purpose, model and outcome.

3.1.2.1 SRQ-1.2: What are the Applications of CGAN?

After studying 34 research works on CGAN in computer vision, we recorded that image-to-image synthesis is one of the well-noted applications. Considering the application and purpose of all 34 works, a detailed pictorial view is given in the graph of Fig. 9. From Fig. 9, we can conclude that image-to-image synthesis, image enhancement and text-to-image synthesis are applications for which CGAN is well suited.

Fig. 9 List of Applications Used in CGAN

3.1.3 Cycle Generative Adversarial Networks (CYCLEGAN)

CycleGAN is another noteworthy variant of GAN, presented in 2017 by Jun-Yan Zhu et al. [102]. The principal objective of the model is to map images between two domains without paired data using the mapping function G: x → y and an adversarial loss function.

The image generated by the first generator, G(x), is similar to y, that is, G: x → y gives y ≈ G(x). Moreover, this approach learns an inverse mapping from y back to x, that is, F: y → x gives x ≈ F(y). Using the inverse mapping and the cycle consistency loss, it can be said that F(G(x)) ≈ x and G(F(y)) ≈ y. The pictorial representation of the CycleGAN methodology is given in Fig. 10.

Fig. 10 Cycle GAN Training Process [101]
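The cycle consistency idea can be illustrated with two toy mappings in place of the generator networks. In the sketch below, `G`, `F` and `F_bad` are hypothetical one-line functions, and the mean absolute difference is assumed as the distance measure.

```python
def l1(a, b):
    """Mean absolute difference between two equally long sequences."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def G(x):
    """Toy mapping from domain X to domain Y."""
    return [2.0 * v + 1.0 for v in x]

def F(y):
    """Toy mapping from Y back to X; the exact inverse of G."""
    return [(v - 1.0) / 2.0 for v in y]

def F_bad(y):
    """An imperfect inverse, to show a non-zero cycle loss."""
    return [v / 2.0 for v in y]

def cycle_consistency_loss(x, y, g, f):
    """Penalise F(G(x)) drifting from x and G(F(y)) drifting from y."""
    return l1(f(g(x)), x) + l1(g(f(y)), y)

x = [0.0, 0.5, 1.0]   # samples from domain X
y = [1.0, 3.0, 5.0]   # samples from domain Y (not paired with x)

loss_exact = cycle_consistency_loss(x, y, G, F)      # 0.0: cycles close perfectly
loss_bad = cycle_consistency_loss(x, y, G, F_bad)    # 1.5: cycles do not close
```

Note that x and y never need to be matched pairs; each sample is only compared with its own reconstruction.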

During the training process, CycleGAN distinguishes between two kinds of training sets, as follows.

  • A paired training set {xi, yi}, where every xi in the dataset has yi as its counterpart.

  • An unpaired training set {xi}, {yi}, where no xi in the dataset has a matching yi.

To get a broad view of CycleGAN and its methodology, we have surveyed more than 25 research papers. The significant observation is that CycleGAN is mainly used for image synthesis, especially in the medical field. In Taesung Kwon et al. [103] and Jawook Gu et al. [123], image synthesis is used for denoising low-dose CT images. CycleGAN is also used for augmentation in the classification of melanoma medical images when only a limited labelled dataset is available for training [104]. ECG restoration [104] and fundus image enhancement in diabetic retinopathy classification [112] are other recognized applications of CycleGAN in medical image processing. Beyond the medical field, among the general applications of CycleGAN in computer vision, SAR to optical image registration [106, 120], NIR to RGB image [116] and VIS to NIR image [117] are the most frequently noted research works. Along with this, image colourization, denoising and image enhancement in low-light and night images are a few other works observed. A detailed study of CycleGAN is given in Table 7.

Table 7 Comparative study on CycleGAN model

3.1.3.1 SRQ-1.2: What are the Applications of CycleGAN?

Based on the research and problems addressed in the state-of-the-art methods from Table 7, we collected the following basic observations. Firstly, CycleGAN is mainly used in image synthesis for unpaired data in various domains. Secondly, using CycleGAN, training time and memory consumption can be reduced. Lastly, CycleGAN is also helpful for converting an existing supervised method to an unsupervised one. The detailed usage of CycleGAN is given in Fig. 11.

Fig. 11 List of Applications used in CycleGAN

3.1.3.2 Style Generative Adversarial Networks (STYLEGAN)

StyleGAN is a variant of GAN introduced by Tero Karras et al. in 2019 [134]. It is the first variant of GAN focused on improving the generator rather than the discriminator. The model is built from two networks, namely the mapping network and the synthesis network. StyleGAN feeds the latent space vector into the mapping network, which comprises eight fully connected layers. The output of the mapping network is then sent to the synthesis network, which consists of 18 convolution layers with AdaIN style modulation.

The synthesis network produces feature maps at resolutions from 4 × 4 up to 1024 × 1024 as its layers progress. Gaussian noise is added to the activation maps before they are passed to the AdaIN operation. This is the primary reason that StyleGAN can produce high-resolution images. The comprehensive architecture of StyleGAN is shown in Fig. 12.

Fig. 12 STYLEGAN Architecture [133]

The significant changes and updates in StyleGAN compared to other GAN architectures are as follows.

  • Tuning and bilinear upsampling are added.

  • Gaussian noise is added in each block.

  • Mapping and Synthesis networks are added.

  • The latent vector is not fed directly into the generator's input layer; the synthesis network starts from a learned constant instead.
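Two of the architectural points above lend themselves to a short sketch: the resolution schedule of the synthesis network and the AdaIN operation. The code below assumes two layers per resolution, which reproduces the 18 layers mentioned earlier, and applies AdaIN to a single channel given as a flat list; the function names and example values are hypothetical.

```python
import math

def synthesis_resolutions(start=4, end=1024):
    """Resolutions visited by the synthesis network, doubling each block."""
    res = [start]
    while res[-1] < end:
        res.append(res[-1] * 2)
    return res

resolutions = synthesis_resolutions()
num_layers = 2 * len(resolutions)   # assumed: two layers per resolution

def adain(features, style_scale, style_bias, eps=1e-8):
    """AdaIN for one channel: normalise the features to zero mean and unit
    variance, then rescale and shift them with the style parameters."""
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    std = math.sqrt(var + eps)
    return [style_scale * (f - mean) / std + style_bias for f in features]

styled = adain([1.0, 2.0, 3.0], style_scale=2.0, style_bias=5.0)
```

After AdaIN the channel statistics are set by the style alone, which is how the output of the mapping network controls the appearance at each resolution.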

Since StyleGAN was introduced in 2019, we found only a few research works on this model related to computer vision. The survey shows that most of the collected work concerns the enhancement of image quality and the advancement of StyleGAN. Dongsik Yoon et al. [135] aimed to generate diverse face images from available static faces. A similar work is observed in Shao Xiaofeng et al. [150], where the authors generate images using StyleGAN with ResNet on the FFHQ dataset. The idea of single-dimension pluralistic face image generation is extended to 3D pluralistic image generation in [136], which works with a fixed StyleGAN and RigNet with the 3DMM model. StyleGAN can also be used for classification, as demonstrated in [137], and for face generation from masked images in [138] and [151]. StyleGAN is widely used in fashion [154] and painting [145], [155] for better-quality images. The detailed study on StyleGAN is outlined in Table 8.

Table 8 Comparative study on StyleGAN model

3.1.3.3 SRQ-1.2: What are the Applications of STYLEGAN?

To understand the application of StyleGAN in computer vision, we went through 20 research papers. As we observed, StyleGAN in computer vision is widely used to address quality enhancement problems in generated images. Another major application of StyleGAN, as per the literature study, is image synthesis. For a better understanding of the applications of StyleGAN in computer vision, we have plotted the graph shown in Fig. 13.

Fig. 13 List of Applications used in STYLEGAN

3.1.4 Super Resolution Generative Adversarial Networks (SRGAN)

Super Resolution GAN is a well-known GAN variant that converts low-resolution images to high resolution. This model was proposed by Twitter researchers in 2017. The SRGAN model mainly consists of three networks, namely a generator, a discriminator and a VGG16 network, which is used to compute the perceptual loss function.

The generator network consists of convolution layers, PReLU layers and k3n64s1 blocks (kernel size 3, 64 feature maps, stride 1) with skip connections. The discriminator network consists of convolution layers, Leaky ReLU layers and k3n64s1 blocks. The simple training network of SRGAN is illustrated in Fig. 14.

Fig. 14 SRGAN Training Network
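The block notation used above packs three hyperparameters into one tag: kernel size (k), number of feature maps (n) and stride (s). The sketch below decodes such a tag; the helper `parse_block`, the padding of 1 and the 24-pixel input size are assumptions for this example.

```python
import re

def parse_block(spec):
    """Parse a 'k<kernel>n<maps>s<stride>' tag, e.g. 'k3n64s1'."""
    match = re.fullmatch(r"k(\d+)n(\d+)s(\d+)", spec.lower())
    if match is None:
        raise ValueError(f"unrecognised block spec: {spec!r}")
    kernel, maps, stride = (int(g) for g in match.groups())
    return {"kernel": kernel, "feature_maps": maps, "stride": stride}

block = parse_block("k3n64S1")   # upper- or lower-case 's' is accepted

# With a 3x3 kernel, stride 1 and padding 1, the spatial size is unchanged,
# which is what allows the skip connections to add input and output directly.
in_size, padding = 24, 1
out_size = (in_size + 2 * padding - block["kernel"]) // block["stride"] + 1
```

Reading the tag this way, a k3n64s1 block applies 64 filters of size 3 × 3 with stride 1 and leaves the resolution of its input untouched.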

Super Resolution GAN is mainly used for creating photo-realistic images by using down-sampled images. In this study, we have been through some existing works to understand the role of SRGAN in removing the artefacts in low-resolution images. SRGAN can be used across various domains using computer vision techniques. In Yudai Nagano et al. [156], SRGAN creates a high-resolution food image. The author has mainly focused on inducing noises like jpg, blur etc. Junchao et al. [167], in this work the author used SRGAN for textile image reconstruction to get better accuracy than bilinear. In the survey, we observed most of the SRGAN works are based on facial resolution enhancement in the face image. In Hao Dou et al. [157], Minjie et al. [160], and Hai Nguyen Truong et al. [166], the SRGAN is used for facial resolution enhancement using orthogonal projection, wavelet transform and total variation loss, respectively. The SRGAN can be used to enhance the CT images [161] and fundus images [163] in medical image processing. The researcher Nai Feng Zhang et al. [174] have used SRGAN to deblur distant pedestrians. and Yong Hun Kim et al. [158] used SRGAN to restore old documents. The detailed study on SRGAN is outlined in Table 9.

Table 9 Comparative study on SRGAN model

3.1.4.1 SRQ-1.2: What are the Applications of SRGAN?

After analyzing several research works on SRGAN, we noted that image resolution enhancement, especially of facial, medical, textile and pedestrian images, is the main area in which SRGAN is used. SRGAN can also be used for image segmentation, classification and restoration. The detailed use of SRGAN in various domains is shown in Fig. 15.

Fig. 15 List of applications of SRGAN

3.2 SRQ-1.1: What are the Frameworks Available to Work with GAN?

Generative Adversarial Networks (GANs) have been used successfully for image synthesis, data augmentation, image restoration and many other tasks. Implementing a GAN from scratch in a basic Python IDE or a general-purpose framework is challenging and time-consuming. To reduce this complexity, various tools are now available to support GAN development. This section discusses the available GAN tools, their features and the applications that simplify the use of GAN.

  • GAN LAB

    GAN Lab is an interactive visual experimentation tool for training a GAN on a 2D data distribution and visualizing its internal workings. GAN Lab is built on TensorFlow.js, an in-browser GPU-accelerated deep learning library. With GAN Lab, it is much easier to visualize how the model learns and how the fake samples improve.

    Some of the features of GAN Lab are:

      • Slow-motion mode

      • Interactive adjustment of hyperparameters

      • User-defined data distributions

  • VeGANs

    VeGANs is a Python library for GAN built on the PyTorch framework. It is mainly designed for developers who want to build their own generator and discriminator networks.

  • TORCH-GAN

    Torch-GAN is a PyTorch-based framework for GAN. It is a collection of GAN building blocks that can be customized for popular GAN datasets. The Torch-GAN library allows new loss functions and architectures to be plugged in and offers visualization through various logging backends.

  • HYPERGAN

    HyperGAN is a framework with a user interface and an API. Building a GAN model in HyperGAN makes the training process more straightforward. In HyperGAN, replacing part of a GAN through a JSON file or creating a new GAN is much easier than in other frameworks.

  • IMAGINAIRE

    Imaginaire is a PyTorch-based GAN library developed by NVIDIA that integrates NVIDIA's image and video synthesis projects. The library offers several functionalities and includes six algorithms: Pix2PixHD, FUNIT, MUNIT, UNIT, COCO-FUNIT and SPADE.

  • MIMICRY

    Mimicry is a lightweight PyTorch library for monitoring the loss and probability curves of a GAN. It supports TensorBoard, which is helpful for comparing the performance of multiple GAN models.

  • GAN TOOLKIT

    GAN Toolkit is a flexible library from IBM based on a no-code approach. It lets the user work with config files and command-line arguments. It is an open-source library that supports multiple libraries such as Keras, PyTorch and TensorFlow.

  • TFGAN

    TFGAN is a lightweight library for training and evaluating GANs. It comprises many GAN operations, normalization techniques, losses and other components. TFGAN runs on GPUs and Google TPUs and is compatible with TensorFlow 2. It is also well suited to self-study of GAN.

  • PyGAN

    PyGAN is a Python library for implementing models such as GAN, conditional GAN (CGAN), adversarial autoencoders and energy-based GAN. It is mainly used for semi-supervised learning.

  • STUDIOGAN

    StudioGAN is a GAN library built on the PyTorch framework for both conditional and unconditional image generation. StudioGAN has built-in benchmarks for CIFAR10, Tiny ImageNet and ImageNet. A distinguishing feature of the library is better performance with lower memory consumption.

3.3 RQ-2: What are the Well-known Approaches for Image Enhancement Techniques Using GAN?

Image enhancement is the technique of manipulating digital pixel values so that the resulting images are more suitable for visualization and further analysis. The general idea is to process a given image so that it is better suited to a specific application.

Image enhancement can be carried out in different ways. It can sharpen image features such as boundaries and edges, remove noise, increase the brightness of an image or change its contrast. Image enhancement cannot increase the inherent information content of the data, but it can increase the dynamic range of chosen features.
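The classical operations named above can be illustrated on a small grayscale image stored as nested lists. The following is a minimal sketch under stated assumptions, not a method taken from the surveyed papers: the function names, the 3x3 mean filter used for noise removal and the unsharp-masking formula used for sharpening are illustrative choices, and pixel values are assumed to be 8-bit (0 to 255).

```python
def clip(value, low=0, high=255):
    """Clamp a pixel value to the valid 8-bit range."""
    return max(low, min(high, value))


def adjust_brightness_contrast(image, contrast=1.0, brightness=0):
    """Scale pixel values around mid-grey (128), then shift them by `brightness`."""
    return [
        [clip(round(contrast * (p - 128) + 128 + brightness)) for p in row]
        for row in image
    ]


def mean_filter_3x3(image):
    """Reduce noise by replacing each pixel with the mean of its 3x3 neighbourhood."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # The neighbourhood is truncated at the image border.
            neighbours = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(round(sum(neighbours) / len(neighbours)))
        result.append(row)
    return result


def sharpen(image, amount=1.0):
    """Unsharp masking: add back the difference between the image and its blur."""
    blurred = mean_filter_3x3(image)
    return [
        [clip(round(p + amount * (p - b))) for p, b in zip(row, blurred_row)]
        for row, blurred_row in zip(image, blurred)
    ]


if __name__ == "__main__":
    noisy = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
    print(adjust_brightness_contrast(noisy, contrast=1.5, brightness=20))
    print(mean_filter_3x3(noisy))
    print(sharpen(noisy))
```

Each function only redistributes or combines existing pixel values, which is consistent with the point that enhancement changes how features are presented rather than adding information.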

There are numerous techniques for image enhancement in computer vision. Fig. 16 shows a general hierarchy of approaches to image enhancement.

Fig. 16 Image enhancement techniques

To understand the methodologies used for image enhancement with GAN, we studied many research papers on different GAN variants. Some researchers worked on enhancing face images and their features [31, 175, 184, 200], while others concentrated on computer vision in the medical field. In [178, 180, 190], the authors focused on enhancing the clarity of fundus images for better recognition of the iris. In [76, 128, 191, 195, 202, 204], the principal objective was enhancing X-ray, MRI and CT scan images. Research on image enhancement is not restricted to medical image processing; it has also shown considerable improvement in enhancing low-light, low-luminance and underwater images. Table 10 presents all the studied research works in detail, organized by methodology.

Table 10 Comparative study on image enhancement using GAN with respect to methodology

3.4 RQ-2.1: Which are the Datasets Typically Used in Image Enhancement by GAN?

We observed that various datasets were used for training and testing in the studies on image enhancement using GAN variants. Most of the datasets are publicly available on the internet; in some cases, the datasets are private, self-created or acquired. These datasets have contributed substantially to the advancement of image enhancement using GAN, and because of them most GAN variants can achieve their desired outcome. Table 11 lists the datasets used in the different research papers on image enhancement using GAN, grouped by variant.

Table 11 The list of datasets for image enhancement using GAN

3.5 RQ-2.2: What are the Models Used in Image Enhancement Techniques Using GAN?

This section presents the GAN variants used in image enhancement. In summarizing the studies, we considered noise removal, clarity enhancement, blurriness removal, contrast enhancement and brightness enhancement as image enhancement techniques. We assessed 69 articles on image enhancement using GAN. Based on these 69 articles, Table 12 lists all the GAN variants used for image enhancement, the number of studies in each category and the percentage of studies in each category (PSC). Table 12 reveals that SRGAN is the most frequently used GAN for image enhancement.
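The PSC column is simple arithmetic: the number of studies in a category divided by the total number of studies, expressed as a percentage. A minimal Python sketch of that calculation follows. The function name and the category counts are placeholders chosen for illustration (they only sum to 69) and are not the values reported in Table 12.

```python
def percentage_of_studies(counts):
    """Return the percentage of studies per category (PSC), rounded to two decimals."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one study is required")
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}


# Placeholder counts for illustration only; they are NOT the values in Table 12.
example_counts = {"variant_a": 30, "variant_b": 24, "variant_c": 15}

psc = percentage_of_studies(example_counts)
most_used = max(example_counts, key=example_counts.get)

if __name__ == "__main__":
    for name, share in psc.items():
        print(f"{name}: {share}%")
    print("most used:", most_used)
```

The category with the largest count also has the largest PSC, which is how the most used variant is read off such a table.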

Table 12 The list of models for image enhancement using GAN with respect to their distribution

3.6 RQ-2.3: What are the Metrics Used to Evaluate Image Enhancement Using GAN?

This section presents the metrics used to calculate, analyse and assess the performance of models for image enhancement using GAN. Table 13 lists the metrics and performance units used across the studies on image enhancement methods, together with a description of each metric and the number of studies that use it. Based on Table 13, PSNR and SSIM are the most frequently used metrics for evaluating image enhancement across GAN models.
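To make the two most common metrics concrete, the sketch below computes the peak signal-to-noise ratio (PSNR) and a simplified structural similarity index (SSIM) for two grayscale images stored as nested lists. It rests on several assumptions: pixel values are 8-bit, the function names are illustrative, and the SSIM here is computed over the whole image as a single window with the commonly used constants 0.01 and 0.03, whereas standard implementations average SSIM over local windows. Results from this sketch are therefore not directly comparable with values reported in the surveyed studies.

```python
import math


def mse(reference, enhanced):
    """Mean squared error between two equally sized grayscale images."""
    total = 0.0
    count = 0
    for ref_row, enh_row in zip(reference, enhanced):
        for r, e in zip(ref_row, enh_row):
            total += (r - e) ** 2
            count += 1
    return total / count


def psnr(reference, enhanced, max_value=255.0):
    """PSNR in decibels: 10 * log10(max_value^2 / MSE)."""
    err = mse(reference, enhanced)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


def global_ssim(reference, enhanced, max_value=255.0):
    """Single-window SSIM computed from global means, variances and covariance."""
    x = [p for row in reference for p in row]
    y = [p for row in enhanced for p in row]
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((p - mu_x) ** 2 for p in x) / n
    var_y = sum((p - mu_y) ** 2 for p in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Small constants that stabilise the division.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    reference = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
    enhanced = [[53, 55, 60], [59, 80, 61], [75, 62, 59]]
    print("PSNR (dB):", psnr(reference, enhanced))
    print("SSIM:", global_ssim(reference, enhanced))
```

Higher PSNR means a smaller pixel-wise error, while SSIM closer to 1 means the luminance, contrast and structure of the two images agree more closely.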

Table 13 The list of measurement metrics used for image enhancement using GAN in various studies

3.7 RQ-3: Whether GAN is a Better Approach for Image Enhancement? How is Image Enhancement Performance in GAN, MATLAB, and Other Platforms for Image Enhancement?

To analyse how efficient the GAN model is for image enhancement compared with other existing techniques, we split our analysis into three categories: (i) image enhancement using the GAN model, (ii) image enhancement using machine learning and (iii) image enhancement using MATLAB.

In this review, we considered a maximum of ten existing sample studies from each category [204–219]. PSNR and SSIM are used as the performance metrics for the comparative analysis. We recorded the minimum and maximum PSNR and SSIM observed in the sample studies of each category, as given in Table 14. Table 14 and Fig. 17 together present the gist of the comparative analysis. Based on Table 14, the GAN model appears to be a better approach for image enhancement.

Table 14 Performance analysis of image enhancement in various techniques

Fig. 17 Comparison of results among different techniques

4 Limitations and Challenges

This section lists some of the challenges, limitations and gaps noticed during the study. The observed gaps are as follows.

  • Little work has been proposed to enhance and restore images by extracting their original features.

  • Training with a GAN model can improve the output, but the model is observed to become very unstable, so the result varies from iteration to iteration.

  • Another notable observation across numerous image enhancement works is that handling high-frequency and low-frequency image features with the same model does not give effective results.

  • Combining GAN with an additional deep neural network can increase the accuracy of the output, but a sharp increase in training time is observed.

  • No single GAN model is designed to address all possible types of noise in an image during image enhancement.

5 Conclusion

The presented SLR examines various state-of-the-art methods on GAN, variants of GAN and image enhancement techniques using GAN. It gives a detailed view of the existing work on GAN published from 2018 to 2023. Throughout this paper, we answered the research questions on GAN by discussing its history, applications, variants, limitations and image enhancement approaches, and we conducted a comparative and summarizing examination of how they differ from other existing works. The overall summary of this study is as follows.

  • The GAN model is widely used in many domains such as machine design, architecture, medicine, construction and computer vision.

  • Linear growth is observed in research publications related to GAN, with a rapid increase in the publication count in 2019–2020.

  • Every GAN model has its own specialization; for example, DCGAN is mainly used for data augmentation. A detailed explanation of every GAN variant is given in Section 3.

  • The SRGAN model plays a significant role in image enhancement.

  • PSNR and SSIM are widely used performance metrics for image enhancement.

  • The comparative results indicate that GAN is a practical approach and performs better for image enhancement than the other techniques considered.

With the rapid progress in technology and multimedia, GAN still needs to address many challenges. This study provides a road map and valuable basic details for the research community in developing compelling research works on GAN.