Creating fake Dog images using GANs: Generating endless good boys

Taira Mehta
6 min read · Jan 5, 2020



Generative Adversarial Networks (GANs) were introduced in a paper by Ian Goodfellow and other researchers at the University of Montreal back in 2014. When word came out about it, everyone in the AI and tech community was totally shook at the development of such a model.

Even Yann LeCun, Facebook’s AI research director, dubbed it “the most interesting idea in the last 10 years in ML.”

So, what’s so special about these things anyway?

GANs allow for machines to mimic data, but in a sense maintain originality by randomizing content generation. Pretty neat, right?

That brings us here. I had a good look at this concept and thought to myself,

“Let’s have some fun here.”

So, I created a GAN that can be used to generate fake dog images. Not something you’d expect to be very useful, but certainly a simpler take on a rather complex project. And hey, you can never have too many dog images.


Generative vs. Discriminative algorithms

The reason GANs are so revolutionary is that they train two models to solve opposite problems at the same time: generation, which creates fake data, and discrimination, which picks apart fake data.

These tasks are practically opposite to each other, but cool things happen when you pit them against each other as competing models.

The Generator and Discriminator are two adversarial networks, each fighting to achieve its own goal. The Generator spews out images based on training data and random noise, and the Discriminator tries to separate the real images from the fake ones. This repeats in a feedback loop where the Discriminator identifies the flaws in the Generator’s output until the Generator beats the system and makes crazy real-looking images.

In a sense, the two main objectives of this system are as follows. We aim to train the Generator to maximize the Discriminator’s final classification error, in order to make the images appear realistic. We then train the Discriminator to minimize the final classification error, so that real data is correctly distinguished from fake data. They use each other’s outputs for judging performance.

To achieve this, during backpropagation, the Generator’s weights are updated using gradient ascent to maximize the error, while the Discriminator uses gradient descent to minimize it. Backpropagation adjusts each weight in the right direction by calculating that weight’s impact on the output: essentially, how the output would change if you changed the weight.
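As a rough NumPy sketch (the score values here are made up for illustration, not from the actual model), you can see how the two objectives pull the very same discriminator score in opposite directions:

```python
import numpy as np

def bce(pred, target):
    """Binary cross-entropy for a single prediction in (0, 1)."""
    return -(target * np.log(pred) + (1 - target) * np.log(1 - pred))

d_fake = 0.3  # Discriminator's score for a generated image

# Discriminator treats the fake as class 0: its loss is low when d_fake is low
d_loss = bce(d_fake, 0.0)

# Generator wants that same score treated as class 1: its loss is low when d_fake is high
g_loss = bce(d_fake, 1.0)

# As the fake gets more convincing (d_fake rises), the two losses move oppositely
for score in (0.1, 0.5, 0.9):
    print(f"D(fake)={score}: D loss={bce(score, 0.0):.3f}, G loss={bce(score, 1.0):.3f}")
```

This is why one network does gradient descent while the other effectively does gradient ascent on the same quantity.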

How does it work here though?

The Discriminator basically solves a supervised image classification task (in this case, dog or not dog). The filters learned by the GAN can be utilized to draw specific objects in the generated image, and the Generator learns very complex representations of objects from its latent input vectors.

There’s a simple three-step process:

  1. The Generator takes in random numbers and returns an image.
  2. The Discriminator looks at the generated image alongside real images.
  3. Based on its analysis, the Discriminator outputs a number between 0 and 1, where 1 indicates the image looks completely real and 0 indicates it’s absolutely fake.

The discriminator is in a feedback loop with the ground truth of the images, which we know, but the generator is also learning from its mistakes. This makes it a double feedback loop.
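That 0-to-1 score typically comes from a sigmoid applied to the discriminator’s raw output. A quick NumPy sketch (the logit values are made up):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real-valued score into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

# Raw scores (logits) the discriminator might produce for three images
logits = np.array([4.2, 0.0, -3.5])
probs = sigmoid(logits)

# A high logit maps close to 1 ("real"), a low logit close to 0 ("fake")
print(probs)
```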

A popular comparison is describing this model as a game of cops and robbers, or more realistically, cops and counterfeiters. “Practice makes better”, and that applies to both the cop and the counterfeiter. Whenever the counterfeiter makes fake money, the cop has to spot the fakes, while the counterfeiter learns from what gave them away. The cop is the Discriminator, and the counterfeiter is the Generator.

Generator

To be more specific, we actually train the generator a bit differently than explained above. It’s a multi-step process, but a rather intuitive one once broken down:

  1. Sample random noise. Random noise is what makes generated images differ from one another, and it’s also what makes the images seem original.
  2. Produce generator output from the sampled random noise.
  3. Get the discriminator’s “real” or “fake” classification for the generator output.
  4. Calculate the loss from the discriminator’s classification. The loss function maps the values of one or more variables onto a real number, representing some “cost” associated with the event; an optimization problem, like this one, seeks to minimize it.
  5. Backpropagate through both the discriminator and the generator to obtain gradients.
  6. Use the gradients to change only the generator’s weights.

Learn from what didn’t work, and apply it.
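Here’s a minimal sketch of those six steps on a toy 1-D “image”, with the backprop chain rule written out by hand. The linear generator and discriminator are a simplification of my own, not the networks used in the actual project:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy setup: a linear generator G(z) = wg*z + bg and a linear
# discriminator D(x) = sigmoid(wd*x + bd). All names are my own.
wg, bg = 1.0, 0.0   # generator weights (the only ones we'll update)
wd, bd = 1.0, 0.0   # discriminator weights (frozen during this step)
lr = 0.1

# 1. Sample random noise
z = rng.standard_normal()

# 2. Produce generator output
fake = wg * z + bg

# 3. Get the discriminator's "real"/"fake" score for it
score = sigmoid(wd * fake + bd)

# 4. Calculate the loss: the generator wants score -> 1, so loss = -log(D(G(z)))
loss = -np.log(score)

# 5. Backpropagate through D and then G by hand (chain rule)
dloss_dlogit = score - 1.0          # derivative of -log(sigmoid(t)) w.r.t. t
dlogit_dfake = wd                   # through the (frozen) discriminator
dfake_dwg, dfake_dbg = z, 1.0       # through the generator

# 6. Use the gradients to change ONLY the generator weights
wg -= lr * dloss_dlogit * dlogit_dfake * dfake_dwg
bg -= lr * dloss_dlogit * dlogit_dfake * dfake_dbg

# The update nudged the fake toward looking "real" to this discriminator
new_score = sigmoid(wd * (wg * z + bg) + bd)
print(new_score > score)
```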

Discriminator

The same idea applies to discriminator training, and this stuff actually gets more complex the deeper you look into it. In simplified terms, it looks something like this:

  1. Classify real and fake data from the given set.
  2. Penalize misclassifying a real instance as fake or a fake instance as real.
  3. Update the weights through backpropagation.
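A matching sketch of a single discriminator update on the same kind of toy 1-D data (again, a linear model of my own for illustration, not the project’s actual network):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy discriminator D(x) = sigmoid(wd*x + bd); names and values are my own
wd, bd = 0.5, 0.0
lr = 0.1

real, fake = 4.0, -1.0   # one real sample, one generated sample

# 1. Classify real and fake data
p_real = sigmoid(wd * real + bd)   # want this near 1
p_fake = sigmoid(wd * fake + bd)   # want this near 0

# 2. Penalize misclassification with binary cross-entropy
loss = -np.log(p_real) - np.log(1.0 - p_fake)

# 3. Update weights through backpropagation (gradients written by hand)
grad_wd = (p_real - 1.0) * real + p_fake * fake
grad_bd = (p_real - 1.0) + p_fake
wd -= lr * grad_wd
bd -= lr * grad_bd

# After the step, the discriminator separates the two samples better
new_loss = -np.log(sigmoid(wd * real + bd)) - np.log(1.0 - sigmoid(wd * fake + bd))
print(new_loss < loss)
```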

How did you do this?

So, I actually got a fair bit of inspiration from the original GAN paper by Ian Goodfellow and from seeing the potential it had. It really helped to look at a few tutorials online (I’ll link them below); they went over it pretty well and made it a lot easier for me to absorb the content. I mainly used the TensorFlow tutorials to help me out.

Just show me the main code!

Damn alright, well here you have it.

The expectation from training is that our Generator network should start producing data that follows the training data’s distribution. Although we are starting with a very simple data distribution, this approach can easily be extended to generate data from much more complex datasets.

Discriminator

The discriminator uses the Leaky ReLU activation in its hidden layers, while the output activation is a sigmoid: it takes samples from either R (the real data) or G (the generator) and outputs a single scalar between 0 and 1, interpreted as ‘fake’ vs. ‘real’.
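For reference, Leaky ReLU is just the identity for positive inputs with a small slope on the negative side, which keeps gradients from dying out completely. A quick NumPy version (the slope 0.2 is a common default, not necessarily what my model used):

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    # Pass positives through unchanged; scale negatives by a small slope
    # so their gradient is small but nonzero
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, 0.0, 3.0])
out = leaky_relu(x)
print(out)  # [-0.4  0.   3. ]
```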

Generator

The generator is designed to map latent space vectors (z) to data space.

Since our data are images, converting z to data-space means ultimately creating an RGB image with the same size as the training images (i.e. 3x64x64). Batch normalization layers between the convolutions help with the flow of gradients during training.

The main reason this thing works is transposed convolutional layers, the things that turn a vector of random values into an actual image. As the vector passes through a series of these layers, it gets upsampled step by step until we reach the end result.
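You can sanity-check that upsampling with the transposed-convolution output-size formula. The layer hyperparameters below are a typical DCGAN-style assumption on my part (not my exact code), taking a 1x1 latent “image” up to 64x64:

```python
def convtranspose_out(size, kernel, stride, padding):
    # Spatial output size of a single transposed convolution
    return (size - 1) * stride - 2 * padding + kernel

# Assumed five-layer stack: (kernel, stride, padding) per layer
size, sizes = 1, []
for kernel, stride, padding in [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]:
    size = convtranspose_out(size, kernel, stride, padding)
    sizes.append(size)

print(sizes)  # [4, 8, 16, 32, 64]
```

Each stride-2 layer roughly doubles the spatial size, which is how a flat noise vector ends up as a 3x64x64 image.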

Takeaways:

  • GANs are much harder to make than they seem to the average person; you need a good amount of patience to actually sit down and make one.
  • The Generator is used to generate original images and learns from feedback
  • The Discriminator is used to tell apart fake vs. real images and also learns from feedback
  • You can never have too many dog pictures

I hope this helped :))
