What is a GAN?

two robots representing the generator and the discriminator in a given GAN system

In the context of machine learning, GAN stands for Generative Adversarial Network. It’s a common method for generating new ‘data’ that didn’t previously exist but closely resembles an existing set of training data.

Let’s Focus on ‘Adversarial’

In the world of restaurants, a chef creates a menu and, at some point, a critic judges it. The chef may or may not adapt based on that critique. The same thing happens in a GAN, except that in a GAN adapting to the critique is a requirement of the system. It’s adversarial because the two entities have opposing goals: one tries to produce output that passes as real, the other tries to catch it, and each improves in response to the other. In a basic GAN, there are two entities to understand.

The Generator

This is the Generative part of the GAN. Without getting too deep into the weeds, this entity produces an output from some random noise. That output gets judged by another entity (the Discriminator), and the result of the judgment is passed back. This shapes the generator’s next output, creating a feedback loop that steers the generator toward outputs with better judgment scores. An important note to remember is that the network that makes up the generator is ‘the engine’ that generates the new image.
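To make "noise in, output out" a bit more concrete, here is a minimal sketch of a generator in plain Python. Everything in it is an assumption for illustration: the `TinyGenerator` name, the single layer, and the sizes (8 noise values, 16 output values) are hypothetical, and a real generator would be a much deeper network built with a framework.

```python
import math
import random


class TinyGenerator:
    """Hypothetical one-layer generator: a noise vector goes in, an 'image' comes out."""

    def __init__(self, noise_dim, output_dim, seed=0):
        rng = random.Random(seed)
        # Weights start out random; training is what would adjust them.
        self.weights = [
            [rng.gauss(0.0, 0.1) for _ in range(noise_dim)]
            for _ in range(output_dim)
        ]
        self.biases = [0.0] * output_dim

    def generate(self, noise):
        # Each output value is a weighted sum of the noise plus a bias,
        # squashed into the range [-1, 1] by tanh.
        return [
            math.tanh(sum(w * z for w, z in zip(row, noise)) + b)
            for row, b in zip(self.weights, self.biases)
        ]


rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]  # the random noise
generator = TinyGenerator(noise_dim=8, output_dim=16)
fake = generator.generate(noise)  # 16 values standing in for pixels
```

Different noise gives a different output, which is how one trained generator can produce many distinct samples.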

The Discriminator

This is the judge in the relationship described above. Every GAN has some type of output it’s set up to produce. Let’s pretend we have built a GAN that produces images of clothing items. It has ‘validated’ training data (real images of clothing) that the generator’s output is judged against to see if it is passable as a clothing item. The entity doing the judging is the Discriminator.
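As a rough sketch of the judge, the discriminator can be pictured as a classifier that takes a sample and returns a score between 0 and 1, where values near 1 mean "looks real". The `TinyDiscriminator` class below is a hypothetical, single-layer stand-in written for this post, not how a production discriminator is built.

```python
import math


def sigmoid(x):
    # Squash any number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


class TinyDiscriminator:
    """Hypothetical one-layer judge: sample in, 'probability it is real' out."""

    def __init__(self, input_dim):
        # All-zero weights mean the judge has no opinion yet.
        self.weights = [0.0] * input_dim
        self.bias = 0.0

    def score(self, sample):
        logit = sum(w * x for w, x in zip(self.weights, sample)) + self.bias
        return sigmoid(logit)


discriminator = TinyDiscriminator(input_dim=4)
untrained_score = discriminator.score([0.2, -0.1, 0.7, 0.3])
```

With all-zero weights the score is exactly 0.5 for any input: the untrained judge cannot tell real from fake. Training is what moves that score up for real samples and down for generated ones.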

When the generator produces outputs, they should be as close to the training data set as possible while being slightly different. Interestingly, the Discriminator’s ‘opinions’ about the generator’s output change over time. Even though its set of real examples might be a fixed set of pictures, sounds, or text, the discriminator’s expectations are also shaped over time by the output of the generator.
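The back-and-forth can be sketched with a deliberately tiny toy: instead of images, the "real data" is just numbers near 4, the generator is a line (`a * z + b`), and the discriminator is a one-input logistic classifier. All names, values, and the hand-written gradient updates here are assumptions chosen to keep the example dependency-free; a real GAN would use a deep learning framework and would not be guaranteed to behave like this toy.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_toy_gan(steps=200, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: fake = a * z + b
    w, c = 0.0, 0.0   # discriminator: D(x) = sigmoid(w * x + c)
    probe = 4.0       # a fixed, realistic sample we re-score over time
    probe_scores = []

    for step in range(steps):
        if step % 50 == 0:
            probe_scores.append(sigmoid(w * probe + c))

        real = rng.gauss(4.0, 0.5)  # assumed "real data": numbers near 4
        z = rng.gauss(0.0, 1.0)     # random noise
        fake = a * z + b

        # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        w -= lr * ((d_real - 1.0) * real + d_fake * fake)
        c -= lr * ((d_real - 1.0) + d_fake)

        # Generator step: push D(fake) toward 1, judged by the updated discriminator.
        d_fake = sigmoid(w * fake + c)
        a -= lr * (d_fake - 1.0) * w * z
        b -= lr * (d_fake - 1.0) * w

    return {"generator": (a, b), "discriminator": (w, c), "probe_scores": probe_scores}


result = train_toy_gan()
print(result["probe_scores"])
```

The printed scores are the discriminator's opinion of the same fixed sample at steps 0, 50, 100, and 150. The sample never changes, but the score does, because the discriminator keeps being reshaped by whatever the generator is currently producing.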

Note: The Discriminator does not itself execute a loss/cost function. Losses are computed separately for the Generator and the Discriminator, and each loss is used to update its own network. The Discriminator is a classification system, and the loss function ‘grades’ its classifications. I didn’t immediately understand this.
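Here is a small sketch of that separation, assuming binary cross-entropy as the loss and the common "non-saturating" form of the generator loss (both widely used choices, but not the only ones). The function names are hypothetical. Note that the losses only take the discriminator's scores as input; the discriminator itself just classifies.

```python
import math


def binary_cross_entropy(prediction, target):
    # Clamp to avoid log(0) when a score is exactly 0 or 1.
    eps = 1e-12
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(score_on_real, score_on_fake):
    # The judge is graded on calling real samples real (1) and fakes fake (0).
    return (binary_cross_entropy(score_on_real, 1.0)
            + binary_cross_entropy(score_on_fake, 0.0))


def generator_loss(score_on_fake):
    # The generator is graded on how "real" the judge thought its fake was.
    return binary_cross_entropy(score_on_fake, 1.0)


# A confident, correct judge gets a low loss ...
good_judge = discriminator_loss(score_on_real=0.9, score_on_fake=0.1)
# ... while a judge that cannot decide gets a higher one.
unsure_judge = discriminator_loss(score_on_real=0.5, score_on_fake=0.5)
```

The same score feeds both grades in opposite directions: a fake that scores 0.9 is a bad result for the discriminator and a good one for the generator.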

That’s it; at least conceptually

In this high-level overview, we've touched on the fundamental aspects of GANs. The interplay between the Generator and Discriminator sets GANs apart in the landscape of neural network architectures. There’s much more to learn before you can implement a GAN, but this is a decent start for deciding whether it’s the architecture you should implement.

This post serves as a primer for an upcoming series in which I'll delve deeper into GANs, specifically focusing on a project that uses the Fashion-MNIST dataset to generate images of unique articles of clothing. Stay tuned for more detailed insights into the intricacies of GAN implementation.

Here’s a preview of my next post.

uniquely generated image based on MNIST fashion dataset

pictures of articles of clothing generated by GAN

