Generative Adversarial Networks (GANs) are a type of neural network architecture for generative modeling. They consist of two models: a generator and a discriminator. The generator produces synthetic data samples that are intended to be indistinguishable from real data, while the discriminator is trained to distinguish between real and synthetic data samples.
The generator and discriminator are trained together in an adversarial process. The generator takes a random noise vector as input and produces a synthetic image; the discriminator then takes the generated image and classifies it as either real or fake. The generator uses this feedback to adjust its parameters so that its images become more similar to the real data and more likely to fool the discriminator. A concrete Keras training loop implementing this process is shown at the end of this section.
One of the key benefits of GANs is that they can learn to generate synthetic data that is highly realistic and diverse. This has made them particularly popular for tasks such as image generation, text generation, and audio synthesis.
An example of synthetic data generated by a GAN might be a set of realistic images of objects or scenes that do not actually exist in the real world. For example, a GAN might be trained on a large dataset of real images of faces and then be able to generate synthetic images of new, previously unseen faces that are highly realistic and diverse.
Real data, on the other hand, is data that is collected from the real world and is not artificially generated. For example, a dataset of real images of faces might be collected by taking photographs of people and using those images as the training data for a machine learning model.
The specific architecture and design of the generator model can vary depending on the type of data it is generating. For example, a GAN trained to generate images might use a convolutional neural network (CNN) as the generator, while a GAN trained to generate text might use a recurrent neural network (RNN) as the generator.
Let us take a look at sample code for a GAN image generator built with the Keras library.
from keras.layers import Input, Dense, Reshape, Conv2DTranspose
from keras.layers import LeakyReLU
from keras.layers import BatchNormalization
from keras.optimizers import Adam
from keras.models import Model
import numpy as np
# Define the generator
latent_dim = 100
generator_input = Input(shape=(latent_dim,))
# Project the noise vector into a 7x7x256 feature map
x = Dense(7*7*256)(generator_input)
x = LeakyReLU(alpha=0.01)(x)
x = Reshape((7, 7, 256))(x)
# Upsample 7x7 -> 14x14
x = Conv2DTranspose(128, kernel_size=3, strides=2, padding='same')(x)
x = LeakyReLU(alpha=0.01)(x)
x = BatchNormalization()(x)
# Refine features at 14x14 (stride 1, no upsampling)
x = Conv2DTranspose(64, kernel_size=3, strides=1, padding='same')(x)
x = LeakyReLU(alpha=0.01)(x)
x = BatchNormalization()(x)
# Upsample 14x14 -> 28x28 and map to a single channel in [-1, 1]
x = Conv2DTranspose(1, kernel_size=3, strides=2, padding='same', activation='tanh')(x)
generator = Model(generator_input, x)
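Before moving on, it is worth confirming that this architecture produces images of the expected shape. The sketch below samples a batch of noise vectors and runs them through the (still untrained) generator; the batch size of 16 is an arbitrary choice for illustration.
noise = np.random.normal(0, 1, size=(16, latent_dim))
fake_images = generator.predict(noise)
print(fake_images.shape)  # (16, 28, 28, 1) -- pixel values in [-1, 1] because of the tanh output
generator.summary()       # prints each layer's output shape, from (7, 7, 256) up to (28, 28, 1)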
As you can see in the code above, the generator stacks a dense layer and several transposed-convolution layers, each followed by a LeakyReLU activation, because this allows the model to learn a hierarchy of features from the input noise.
To learn more about activation functions, check out the following links:
https://www.nbshare.io/notebook/751082217/Activation-Functions-In-Python/
https://www.nbshare.io/notebook/626290365/What-is-LeakyReLU-Activation-Function/
In a deep learning model, the layers closer to the input are responsible for learning lower-level features (e.g., edges, corners), while the layers closer to the output are responsible for learning higher-level features (e.g., shapes, objects). By stacking multiple layers, the model can learn a hierarchy of features at different levels of abstraction, which can help it generate more realistic images.
The dense layers in the generator model learn to transform the input noise into a high-dimensional feature space, which is then upsampled by the convolutional layers to generate the output image. The LeakyReLU activation function is used in between the layers to introduce non-linearity into the model, which can help it learn more complex relationships between the input and output.
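To make the non-linearity concrete, here is a tiny NumPy re-implementation of the LeakyReLU function with alpha=0.01 (illustrative only, not the Keras layer itself):
import numpy as np

def leaky_relu(x, alpha=0.01):
    # f(x) = x for x > 0, alpha * x otherwise
    return np.where(x > 0, x, alpha * x)

print(leaky_relu(np.array([-100.0, -1.0, 0.0, 1.0, 100.0])))
# [-1.   -0.01  0.    1.  100. ] -- negative inputs keep a small slope instead of dying to 0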
The specific number and arrangement of layers in the generator model can vary depending on the complexity of the task and the size of the input and output data. Experimenting with different architectures can sometimes lead to better model performance.
The discriminator model in a GAN is a neural network that takes an image as input and outputs a probability that the image is real (as opposed to synthetic). The discriminator model is trained to maximize the probability of correctly classifying real images as real, and synthetic images as synthetic.
# Define the discriminator
from keras.layers import Conv2D, Flatten
discriminator_input = Input(shape=(28, 28, 1))
# Downsample 28x28 -> 14x14 while extracting low-level features
x = Conv2D(64, kernel_size=3, strides=2, padding='same')(discriminator_input)
x = LeakyReLU(alpha=0.01)(x)
x = BatchNormalization()(x)
# Downsample 14x14 -> 7x7 while extracting higher-level features
x = Conv2D(128, kernel_size=3, strides=2, padding='same')(x)
x = LeakyReLU(alpha=0.01)(x)
x = BatchNormalization()(x)
# Flatten the feature maps and output the probability that the image is real
x = Flatten()(x)
x = Dense(1, activation='sigmoid')(x)
discriminator = Model(discriminator_input, x)
This discriminator model takes an image of shape (28, 28, 1), i.e. (height, width, channels), as input and outputs the probability that the image is real.
The model consists of a series of convolutional layers that learn to extract features from the input image, followed by a dense layer that uses these features to classify the image as real or synthetic.
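As a quick sanity check of this input/output contract, you can pass any batch of 28x28 grayscale images through the untrained discriminator; here random noise stands in for actual image data, so the scores themselves are meaningless.
images = np.random.normal(0, 1, size=(8, 28, 28, 1))  # a batch of 8 random "images"
scores = discriminator.predict(images)
print(scores.shape)  # (8, 1): one probability per image, each in (0, 1)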
Note:
The sigmoid function is used as the activation of the discriminator's output layer in the above code because it maps the layer's output to a value between 0 and 1, which can be interpreted as a probability.
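The sigmoid itself is simply sigma(x) = 1 / (1 + e^-x); a quick numerical check (illustrative only) shows how it squashes arbitrary real-valued scores into the (0, 1) range:
import numpy as np

scores = np.array([-4.0, 0.0, 4.0])
print(1 / (1 + np.exp(-scores)))  # [0.018 0.5 0.982] -- confident fake, undecided, confident real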
# Compile the discriminator on its own, while its weights are still trainable
discriminator.compile(optimizer=Adam(), loss='binary_crossentropy')
# Define the full GAN: freeze the discriminator inside the combined model so that
# only the generator's weights are updated when the GAN is trained
discriminator.trainable = False
gan_input = Input(shape=(latent_dim,))
gan_output = discriminator(generator(gan_input))
gan = Model(gan_input, gan_output)
# Compile the GAN
gan.compile(optimizer=Adam(), loss='binary_crossentropy')
The binary cross-entropy loss function can then be used to measure the distance between the predicted probability and the true label (real or synthetic), and the model can be trained to minimize this distance.
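Putting it all together, a typical training loop alternates between the two models: first the discriminator is trained on a labeled batch of real and generated images, then the generator is trained through the combined (frozen-discriminator) model. The sketch below assumes x_train is an array of real 28x28x1 images scaled to [-1, 1] to match the generator's tanh output; the batch size, step count, and label convention (1 = real, 0 = fake) are common choices rather than requirements.
batch_size = 64
for step in range(10000):
    # 1) Train the discriminator: real images labeled 1, generated images labeled 0
    idx = np.random.randint(0, x_train.shape[0], batch_size)
    real_images = x_train[idx]
    noise = np.random.normal(0, 1, size=(batch_size, latent_dim))
    fake_images = generator.predict(noise)
    d_loss_real = discriminator.train_on_batch(real_images, np.ones((batch_size, 1)))
    d_loss_fake = discriminator.train_on_batch(fake_images, np.zeros((batch_size, 1)))

    # 2) Train the generator through the combined model: the discriminator is frozen
    #    here, so only the generator's weights move, pushed toward making the
    #    discriminator label the fakes as "real" (1)
    noise = np.random.normal(0, 1, size=(batch_size, latent_dim))
    g_loss = gan.train_on_batch(noise, np.ones((batch_size, 1)))

    if step % 1000 == 0:
        print(step, d_loss_real, d_loss_fake, g_loss)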