Understanding and building Generative Adversarial Networks (GANs) - Deep Learning with PyTorch.

We’ll be building a Generative Adversarial Network that will be able to generate images of birds that never actually existed in the real world.

-These bird images are purely generated by the Deep Learning Model(GAN)-

Before we actually start building a GAN, let us first talk about the idea behind GANs. GANs were invented by Ian Goodfellow, who obtained his B.S. and M.S. in computer science from Stanford University and his Ph.D. in machine learning from the Université de Montréal. They are the new big thing in the field of Deep Learning right now. Yann LeCun, the director of Facebook AI, said :

“Generative Adversarial Networks is the most interesting idea in the last ten years in Machine Learning.”

So what are GANs ? Why were they created ?

Neural networks are good at classifying and predicting things. AI researchers wanted to make them more human in nature by allowing them to CREATE rather than just see things, and Ian Goodfellow succeeded in inventing a class of Deep Learning models that can do that.

How do they work ?

GANs contain two separate neural networks. Let us call one neural network “G”, which stands for Generator, and the other “D”, which stands for Discriminator. The Generator first generates random images; the Discriminator sees those images and tells the Generator how real they look.

Let us consider the Generator :

In the starting phase, the Generator takes random noise signals as input and generates a random noisy image as output. Gradually, with the help of the Discriminator, it starts generating images of a particular class that look real.

Discriminator :

The Discriminator, which is the opponent of the Generator, is fed both the generated images and real images of a certain class at the same time, which lets it tell the Generator what a real image looks like.

After a certain point, the Discriminator will be unable to tell whether a generated image is real or fake, and that is when we can see images of a certain class (the class the Discriminator is trained with) being generated by our Generator that never actually existed before.

Applications of GANs :

  • Super Resolution.
  • Assisting Artists.
  • Element Abstraction.

Let’s Code !

NOTE : The explanation of the code below is not written for a novice deep learning programmer; I expect you to be comfortable with deep learning idioms in Python.

Let us start by importing all the required Python libraries for building our GAN. Please make sure PyTorch and torchvision are installed on your computer before you start.

# Importing the required libraries
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.optim as optim
import torch.utils.data
import torchvision.datasets as dset
import torchvision.transforms as transforms
import torchvision.utils as vutils
from torch.autograd import Variable

Now let us set the hyper-parameters which will be the batch-size and image-size in this case :

# Setting hyperparameters
batchSize = 64
imageSize = 64

In the first line, we have set the size of the batch to 64. And in the second line we have set the size of the images generated by the generator to 64 x 64 resolution.

Then we are going to create an object to perform image transformations as given below :

# Creating the transformations
# (transforms.Scale was renamed to transforms.Resize in newer torchvision releases)
transform = transforms.Compose([transforms.Resize(imageSize), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),])

The above transformations are necessary to make the image compatible as an input to the neural network of the discriminator.
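
As a quick sanity check on the normalization step, the sketch below reproduces the arithmetic that ‘Normalize’ applies to each channel, (x - mean) / std. The helper name ‘normalize’ is ours for illustration and is not a torchvision function. With mean and std both set to 0.5, pixel values in [0, 1] (as produced by ‘ToTensor’) end up in [-1, 1], which is the same range as the Tanh output of the generator defined later.

```python
def normalize(x, mean=0.5, std=0.5):
    """Same per-channel arithmetic as transforms.Normalize: (x - mean) / std."""
    return (x - mean) / std


# Darkest, middle and brightest pixel values after ToTensor
for pixel in (0.0, 0.5, 1.0):
    print(pixel, '->', normalize(pixel))
```

The three values map to -1.0, 0.0 and 1.0 respectively.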

NOTE : To get the dataset, go to https://github.com/venkateshtata/GAN_Medium.git , clone that repository onto your local system and replace the “dcgan.py” file with the Python file you are writing. The “data” folder contains the dataset.

Now let’s load our dataset from its directory. The dataset we are going to use here is CIFAR-10. We are going to load the images in batches. Make sure that the Python file you are writing is in the same directory as the “data” folder, which keeps loading the dataset simple.

# Loading the dataset
dataset = dset.CIFAR10(root = './data', download = True, transform = transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size = batchSize, shuffle = True, num_workers = 2)

We download the training set into the ./data folder and apply the previous transformations to each image. Then we use the DataLoader to get the images of the training set batch by batch. Almost every element of the above code is self-explanatory; the value of ‘num_workers’ sets the number of worker subprocesses used to load the training data.

As we will be dealing with multiple (2) neural networks here, we will define a universal function that initialises the weights of a given neural network (NN) when the network is passed into it.
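
To get a feel for what “batch by batch” means here, the following sketch works out how many batches one epoch contains. It assumes the 50,000-image CIFAR-10 training split and the DataLoader default of keeping the last, smaller batch; the variable names are ours. The smaller final batch is the reason the training code later reads the batch size from the data itself instead of assuming 64.

```python
import math

num_images = 50000   # size of the CIFAR-10 training split
batch_size = 64      # same value as batchSize above

# Number of batches the DataLoader yields per epoch (last partial batch kept)
batches_per_epoch = math.ceil(num_images / batch_size)

# Size of the final, partial batch
last_batch_size = num_images - (batches_per_epoch - 1) * batch_size

print(batches_per_epoch)  # 782
print(last_batch_size)    # 16
```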


def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        m.weight.data.normal_(0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        m.weight.data.normal_(1.0, 0.02)
        m.bias.data.fill_(0)

The above ‘weights_init’ function takes a module ‘m’ as input and initialises its weights according to the type of the module. We will pass it to the ‘apply’ method of each network, which calls it once on every layer of that network before training starts.
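
The branching in ‘weights_init’ relies only on the class name of each module. The sketch below mirrors that logic without touching any weights, so you can see which layer types fall into which branch. The helper ‘init_kind’ and its return labels are ours, purely for illustration.

```python
def init_kind(classname):
    """Mirror the branching of weights_init, returning a label instead of changing weights."""
    if classname.find('Conv') != -1:
        return 'weights ~ normal(0.0, 0.02)'
    elif classname.find('BatchNorm') != -1:
        return 'weights ~ normal(1.0, 0.02), bias = 0'
    return 'left unchanged'


# Class names of the modules that appear in the two networks of this tutorial
for name in ('Conv2d', 'ConvTranspose2d', 'BatchNorm2d', 'ReLU', 'Sequential', 'G'):
    print(name, '->', init_kind(name))
```

Note that ‘ConvTranspose2d’ also contains the substring ‘Conv’, so both kinds of convolution share the first branch, while activations and container modules are skipped.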

Fig 1.0, Source : Nvidia.

Our first big step will be to define a class for our Generator neural network. We’ll start by creating a class that will be holding the architecture of the Generator, which will basically contain a sequence of layers that each input undergoes.

class G(nn.Module):

    def __init__(self):
        super(G, self).__init__()
        self.main = nn.Sequential(
            nn.ConvTranspose2d(100, 512, 4, 1, 0, bias = False),
            nn.BatchNorm2d(512),
            nn.ReLU(True),
            nn.ConvTranspose2d(512, 256, 4, 2, 1, bias = False),
            nn.BatchNorm2d(256),
            nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1, bias = False),
            nn.BatchNorm2d(128),
            nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1, bias = False),
            nn.BatchNorm2d(64),
            nn.ReLU(True),
            nn.ConvTranspose2d(64, 3, 4, 2, 1, bias = False),
            nn.Tanh()
        )

Breaking down the above code :

  • We have created a class ‘G’, referring to the Generator neural network, which inherits from ‘nn.Module’, the PyTorch base class that provides the tools for building neural networks and for placing layers and the connections between them inside a network.
  • Then we create a meta module (‘nn.Sequential’) that will contain a sequence of modules such as convolutions, full connections, etc.
  • A great thing to observe from the above Fig 1.0 is that the structures of the Generator and the Discriminator are inverse to each other. In the Generator, the convolution therefore runs in the inverse direction, starting from random noise vectors as input. Hence we start with an inverse (transposed) convolution using ‘ConvTranspose2d’.
  • Then we normalize all the features along the dimension of the batch and apply a ReLU rectification to break the linearity. The PyTorch documentation describes the parameters of these layers in more detail.
  • We repeat the above operations while changing the number of input channels from ‘100’ to ‘512’ and the number of feature maps from ‘512’ to ‘256’, keeping the bias as False, and so on for the following layers. [ Note: The values I am choosing in the above code are the choices of researchers. ]
  • In the final ‘ConvTranspose2d’ we output 3 channels, as the output image of the generator is a 3-channel (RGB) image, and we apply a ‘Tanh’ activation to break the linearity and keep the values between -1 and +1.
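
To check that this stack of layers really ends at a 64 x 64 image, the sketch below traces the spatial size through each ‘ConvTranspose2d’. It uses the standard output-size formula for a transposed convolution, assuming the default dilation of 1 and no output padding; the helper name ‘conv_transpose_out’ is ours.

```python
def conv_transpose_out(size, kernel, stride, padding):
    """Output height/width of ConvTranspose2d with dilation=1 and output_padding=0."""
    return (size - 1) * stride - 2 * padding + kernel


# (out_channels, kernel, stride, padding) of each ConvTranspose2d in the G class
layers = [(512, 4, 1, 0), (256, 4, 2, 1), (128, 4, 2, 1), (64, 4, 2, 1), (3, 4, 2, 1)]

size = 1     # the noise vector is treated as a 100-channel 1 x 1 image
sizes = []
for channels, kernel, stride, padding in layers:
    size = conv_transpose_out(size, kernel, stride, padding)
    sizes.append(size)
    print('%d channels, %d x %d' % (channels, size, size))
```

The first layer turns the 1 x 1 input into 4 x 4, and every later layer doubles the size: 8, 16, 32 and finally 64, matching ‘imageSize’.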

Now we need a forward function to propagate the signal through the Generator.

    def forward(self, input):
        output = self.main(input)
        return output

The input of the above function will be a random vector of size 100, matching the 100 input channels of the first layer in the G class. It returns the output containing the generated images. Early in training, these images are just random noise.

Creating the generator :

netG = G() 
netG.apply(weights_init)

Here we are creating a generator object and initialising all the weights of its neural network.

Now, let’s start defining our Discriminator class, which will hold the architecture of the Discriminator.

class D(nn.Module):

    def __init__(self):
        super(D, self).__init__()
        self.main = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1, bias = False),
            nn.LeakyReLU(0.2, inplace = True),
            nn.Conv2d(64, 128, 4, 2, 1, bias = False),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2, inplace = True),
            nn.Conv2d(128, 256, 4, 2, 1, bias = False),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(0.2, inplace = True),
            nn.Conv2d(256, 512, 4, 2, 1, bias = False),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(0.2, inplace = True),
            nn.Conv2d(512, 1, 4, 1, 0, bias = False),
            nn.Sigmoid()
        )

Breaking down the Discriminator :

  • Similar to the G class, the ‘D’ Discriminator class inherits from ‘nn.Module’. The input of the Discriminator will be an image, either a real one or one generated by the Generator, for which the Discriminator returns a number between 0 and 1 as output.
  • Since it takes an image as input, the first operation is going to be a convolution, hence we start with a convolution and apply LeakyReLU.
  • Observe that unlike what we did in the G class, we are using LeakyReLU here, with a negative slope of 0.2. This value comes from extensive experimentation, which I didn’t do myself; it is the researchers’ choice.
  • We use ‘BatchNorm2d’ to normalize all the features along the dimension of the batch.
  • And at the end, we use the classic, old-fashioned sigmoid function to break the linearity and keep the output between 0 and 1.
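
The same kind of size check works for the Discriminator, this time with the standard output-size formula for an ordinary convolution (assuming the default dilation of 1). The helper name ‘conv_out’ is ours. It shows how a 64 x 64 image is reduced step by step to a single value per image.

```python
def conv_out(size, kernel, stride, padding):
    """Output height/width of Conv2d with dilation=1."""
    return (size + 2 * padding - kernel) // stride + 1


# (out_channels, kernel, stride, padding) of each Conv2d in the D class
layers = [(64, 4, 2, 1), (128, 4, 2, 1), (256, 4, 2, 1), (512, 4, 2, 1), (1, 4, 1, 0)]

size = 64    # input images are 64 x 64
sizes = []
for channels, kernel, stride, padding in layers:
    size = conv_out(size, kernel, stride, padding)
    sizes.append(size)
    print('%d channels, %d x %d' % (channels, size, size))
```

Each of the first four layers halves the size (32, 16, 8, 4), and the last one collapses the 4 x 4 map into 1 x 1, the exact mirror of the Generator.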

Now, in order to forward propagate the signal through the Discriminator, we need to define a forward function, which takes an image (real, or produced by the Generator) and returns the Discriminator’s prediction :

    def forward(self, input):
        output = self.main(input)
        return output.view(-1)

In the final line we return the output, which holds one value between 0 and 1 per image. We flatten it with ‘view(-1)’ so that it becomes a one-dimensional vector with the same shape as the target vector it will be compared against.
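
Here is a small sketch of what the flattening does to the shapes. A random tensor stands in for the output of ‘self.main’ (an assumption for illustration; no network is involved), and the batch size of 64 matches ‘batchSize’.

```python
import torch

batch = 64
raw = torch.rand(batch, 1, 1, 1)   # stand-in for the output of self.main: one 1 x 1 map per image
flat = raw.view(-1)                # flattened to one value per image
target = torch.ones(batch)         # same shape as the targets used during training

print(tuple(raw.shape), tuple(flat.shape), tuple(target.shape))
```

After the call, the prediction vector and the target vector both have shape (64,), so the loss function can compare them element by element.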

Creating the Discriminator :

netD = D() 
netD.apply(weights_init)

We create the discriminator object of the above class D and initialize all the weights of its neural network.

Now it’s time to train our Generative Adversarial Network. But before that, we need a criterion that will measure the error of the predictions given by the discriminator. For that, we are going to use BCE loss (where BCE means Binary Cross Entropy), which suits the real-versus-fake decision the discriminator makes. We also need optimisers for both the generator and the discriminator.

criterion = nn.BCELoss()
optimizerD = optim.Adam(netD.parameters(), lr = 0.0002, betas = (0.5, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr = 0.0002, betas = (0.5, 0.999))

We start by creating a criterion object that will measure the error between the prediction and the target. Then we create optimisers for both the discriminator and the generator.

We are using the ‘Adam’ optimiser from the ‘optim’ module, which is an advanced variant of stochastic gradient descent.
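
To give an idea of what one optimiser step does with these settings, here is a scalar sketch of the standard Adam update rule using the same learning rate and betas as above. It is a simplified illustration for a single weight, not PyTorch’s implementation, and the function name ‘adam_step’ is ours.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.0002, betas=(0.5, 0.999), eps=1e-8):
    """One Adam update for a single scalar parameter at step t (t starts at 1)."""
    b1, b2 = betas
    m = b1 * m + (1 - b1) * grad            # running average of the gradient
    v = b2 * v + (1 - b2) * grad * grad     # running average of the squared gradient
    m_hat = m / (1 - b1 ** t)               # bias correction for the early steps
    v_hat = v / (1 - b2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


param, m, v = 1.0, 0.0, 0.0
param, m, v = adam_step(param, grad=0.5, m=m, v=v, t=1)
print(param)
```

On the very first step the update is almost exactly the learning rate in the direction opposite to the gradient, so the weight moves from 1.0 to roughly 0.9998 regardless of how large the gradient is.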

We’ll be training our neural nets for 25 epochs, hence :

for epoch in range(25):

Then we need to iterate over the mini-batches of images in the dataset, hence :

    for i, data in enumerate(dataloader, 0):

The first step is to update the weights of the discriminator, so we start by resetting the gradients of the discriminator with respect to its weights to 0 :

        netD.zero_grad()

As we know, our discriminator must be trained with both real and fake images. We will train the discriminator with real images from the dataset first :
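
To see why the gradients are reset at the start of each iteration, here is a minimal sketch using a single scalar tensor ‘w’ as a stand-in for a network weight (an illustration only, not part of the training script). PyTorch adds new gradients to the ones already stored, so without a reset the second backward pass doubles the value.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)   # stand-in for one network weight

(w * 2).backward()
(w * 2).backward()
grad_without_reset = w.grad.item()          # gradients add up across backward passes

w.grad.zero_()                              # clear the stored gradient, the role zero_grad() plays
(w * 2).backward()
grad_after_reset = w.grad.item()

print(grad_without_reset, grad_after_reset)  # 4.0 2.0
```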

        real, _ = data
        input = Variable(real)
        target = Variable(torch.ones(input.size()[0]))
        output = netD(input)
        errD_real = criterion(output, target)

We get a batch of real images from the dataset, which will be used to train the discriminator, and wrap it in a Variable. Then we forward propagate these real images through the discriminator to get the predictions (values between 0 and 1) and compute the loss between the predictions (output) and the target (equal to 1).

Now, training the discriminator with a fake image generated by the generator :

        noise = Variable(torch.randn(input.size()[0], 100, 1, 1))
        fake = netG(noise)
        target = Variable(torch.zeros(input.size()[0]))
        output = netD(fake.detach())
        errD_fake = criterion(output, target)

Here, we first create a random input vector (noise) for the generator and forward propagate it through the generator to get some fake generated images. Then we forward propagate the fake generated images through the discriminator to get the prediction (a value between 0 and 1) and compute the loss between the prediction (output) and the target (equal to 0). The fake images are passed through ‘detach()’ so that this step does not compute gradients for the generator, which saves computation.
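
The effect of ‘detach()’ can be seen on a toy example. Below, two scalar tensors play the roles of a generator weight and a discriminator weight (both hypothetical, chosen only for illustration). Because the “fake” value is detached before the discriminator uses it, the backward pass stops there and never reaches the generator weight.

```python
import torch

g_weight = torch.tensor(2.0, requires_grad=True)   # hypothetical generator weight
d_weight = torch.tensor(3.0, requires_grad=True)   # hypothetical discriminator weight

fake = g_weight * 1.5              # "generated" value, depends on g_weight
loss = d_weight * fake.detach()    # the discriminator sees a detached copy
loss.backward()

print(d_weight.grad)   # the gradient reaches the discriminator weight
print(g_weight.grad)   # None: nothing flowed back into the generator
```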

Back-propagating the total error :

        errD = errD_real + errD_fake
        errD.backward()
        optimizerD.step()

Here we are computing the total error of the discriminator and backpropagating the loss error by computing the gradients of the total error with respect to the weights of the discriminator. At the end we apply the optimizer to update the weights according to how much they are responsible for the loss error of the discriminator.
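
For intuition about the size of this total error, the sketch below evaluates binary cross entropy by hand for a single real and a single fake prediction. The function ‘bce’ is our own scalar version of the formula; ‘nn.BCELoss’ applies the same formula and averages it over the batch. The prediction values are made up for illustration.

```python
import math


def bce(prediction, target):
    """Binary cross entropy for a single prediction strictly between 0 and 1."""
    return -(target * math.log(prediction) + (1 - target) * math.log(1 - prediction))


# A confident, correct discriminator: the real image scores 0.9, the fake one 0.1
good = bce(0.9, 1) + bce(0.1, 0)

# A fooled discriminator: the real image scores 0.1, the fake one 0.9
fooled = bce(0.1, 1) + bce(0.9, 0)

print(round(good, 4), round(fooled, 4))  # 0.2107 4.6052
```

The error is small when the discriminator separates real from fake and large when it gets them the wrong way round, which is exactly the signal the optimiser uses.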

Next step is to update the weights of the neural network of the generator :

        netG.zero_grad()
        target = Variable(torch.ones(input.size()[0]))
        output = netD(fake)
        errG = criterion(output, target)
        errG.backward()
        optimizerG.step()

As done previously, we first reset the gradients of the generator with respect to its weights to 0 and set the target. The target is 1 here because the generator wants the discriminator to judge its images as real. We then forward propagate the fake generated images through the discriminator to get the prediction (a value between 0 and 1) and compute the loss between the prediction and the target (equal to 1). Finally, we back-propagate the loss by computing the gradients of the error with respect to the weights of the generator, and apply the optimizer to update the weights according to how much they are responsible for the loss error of the generator.

Now, our final step is to print the losses and, every 100 steps, save the real images and the generated images of the mini-batch. This is done as follows :

        print('[%d/%d][%d/%d] Loss_D: %.4f Loss_G: %.4f' % (epoch, 25, i, len(dataloader), errD.item(), errG.item()))
        if i % 100 == 0:
            vutils.save_image(real, '%s/real_samples.png' % "./results", normalize = True)
            fake = netG(noise)
            vutils.save_image(fake.data, '%s/fake_samples_epoch_%03d.png' % ("./results", epoch), normalize = True)
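
One practical detail: the images are written into a “./results” folder, and saving fails if that folder does not exist. The sketch below builds the same file name as the code above and creates the folder first. The helper ‘sample_path’ is ours and is only a suggestion, not part of the original script.

```python
import os


def sample_path(out_dir, epoch):
    """Create out_dir if needed and return the file name used for the fake samples."""
    os.makedirs(out_dir, exist_ok=True)
    return '%s/fake_samples_epoch_%03d.png' % (out_dir, epoch)


print(sample_path('./results', 7))  # ./results/fake_samples_epoch_007.png
```

The ‘%03d’ in the format string pads the epoch number with zeros, so the saved files sort in training order.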

The complete code :


# Importing the required libraries
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.optim as optim
import torch.utils.data
import torchvision.datasets as dset
import torchvision.transforms as transforms
import torchvision.utils as vutils
from torch.autograd import Variable

# Setting hyperparameters
batchSize = 64
imageSize = 64

# Creating the transformations (resizing, tensor conversion, normalization) to apply to the input images
# (transforms.Scale was renamed to transforms.Resize in newer torchvision releases)
transform = transforms.Compose([transforms.Resize(imageSize), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),])

# Loading the dataset
dataset = dset.CIFAR10(root = './data', download = True, transform = transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size = batchSize, shuffle = True, num_workers = 2)

# Defining the function that initialises the weights of a network
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        m.weight.data.normal_(0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        m.weight.data.normal_(1.0, 0.02)
        m.bias.data.fill_(0)

# Defining the generator
class G(nn.Module):

    def __init__(self):
        super(G, self).__init__()
        self.main = nn.Sequential(
            nn.ConvTranspose2d(100, 512, 4, 1, 0, bias = False),
            nn.BatchNorm2d(512),
            nn.ReLU(True),
            nn.ConvTranspose2d(512, 256, 4, 2, 1, bias = False),
            nn.BatchNorm2d(256),
            nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1, bias = False),
            nn.BatchNorm2d(128),
            nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1, bias = False),
            nn.BatchNorm2d(64),
            nn.ReLU(True),
            nn.ConvTranspose2d(64, 3, 4, 2, 1, bias = False),
            nn.Tanh()
        )

    def forward(self, input):
        output = self.main(input)
        return output

# Creating the generator
netG = G()
netG.apply(weights_init)

# Defining the discriminator
class D(nn.Module):

    def __init__(self):
        super(D, self).__init__()
        self.main = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1, bias = False),
            nn.LeakyReLU(0.2, inplace = True),
            nn.Conv2d(64, 128, 4, 2, 1, bias = False),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2, inplace = True),
            nn.Conv2d(128, 256, 4, 2, 1, bias = False),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(0.2, inplace = True),
            nn.Conv2d(256, 512, 4, 2, 1, bias = False),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(0.2, inplace = True),
            nn.Conv2d(512, 1, 4, 1, 0, bias = False),
            nn.Sigmoid()
        )

    def forward(self, input):
        output = self.main(input)
        return output.view(-1)

# Creating the discriminator
netD = D()
netD.apply(weights_init)

# Creating the loss criterion and the optimisers
criterion = nn.BCELoss()
optimizerD = optim.Adam(netD.parameters(), lr = 0.0002, betas = (0.5, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr = 0.0002, betas = (0.5, 0.999))

# Training the two networks
for epoch in range(25):

    for i, data in enumerate(dataloader, 0):

        # 1st step: updating the weights of the discriminator
        netD.zero_grad()

        # Training the discriminator with a batch of real images
        real, _ = data
        input = Variable(real)
        target = Variable(torch.ones(input.size()[0]))
        output = netD(input)
        errD_real = criterion(output, target)

        # Training the discriminator with a batch of fake images
        noise = Variable(torch.randn(input.size()[0], 100, 1, 1))
        fake = netG(noise)
        target = Variable(torch.zeros(input.size()[0]))
        output = netD(fake.detach())
        errD_fake = criterion(output, target)

        # Back-propagating the total error of the discriminator
        errD = errD_real + errD_fake
        errD.backward()
        optimizerD.step()

        # 2nd step: updating the weights of the generator
        netG.zero_grad()
        target = Variable(torch.ones(input.size()[0]))
        output = netD(fake)
        errG = criterion(output, target)
        errG.backward()
        optimizerG.step()

        # 3rd step: printing the losses and saving the images every 100 steps
        # (the ./results folder must already exist)
        print('[%d/%d][%d/%d] Loss_D: %.4f Loss_G: %.4f' % (epoch, 25, i, len(dataloader), errD.item(), errG.item()))
        if i % 100 == 0:
            vutils.save_image(real, '%s/real_samples.png' % "./results", normalize = True)
            fake = netG(noise)
            vutils.save_image(fake.data, '%s/fake_samples_epoch_%03d.png' % ("./results", epoch), normalize = True)

You can follow and check out the code in my GitHub repository : https://github.com/venkateshtata/GAN_Medium

And feel free to fork and send pull requests, if you have any great modifications or suggestions to the code. Thank you.

Note : I have started my own site where I will be implementing the latest research papers on computer vision and Artificial Intelligence. Please visit www.matrixbynature.com for more tutorials.
