Horses to Zebras with CycleGAN
Using CycleGAN to create fake images from real images
When I started studying neural networks, one of the most inspiring projects I found was CycleGAN. The results were impressive, but at first I had a hard time understanding how it worked. Now, with a bit more experience and effective experiment tracking tools, I have a better idea of what is happening.
In Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, the authors do an amazing job of providing clear, well-documented code with reproducible results. The team presents a method that can learn to capture special characteristics of one image collection and figure out how these characteristics could be translated into a second image collection, all in the absence of any paired training examples. This means CycleGAN can tackle problems with a limited amount of labeled data, where labeling would normally be costly, tedious, or impossible.
Diving deeper, it is fascinating to see how the network achieves these results without being fed any paired “before-and-after” images. For example, to produce the outcome below, I gave the network a set of zebra images and a set of horse images, and it learned to go from zebra to horse while keeping the background and the shape of the animal the same.

This type of style transfer also works on other categories. Here’s another example where I gave the network pictures of winter scenes and summer scenes. I can feed this model a picture of a winter scene and it will change the season to summer.

The results were striking when combining a dataset of Monet paintings with real photographs. Details suddenly appear and the pictures look quite realistic. We can easily guess from the examples that the dataset of real photographs included a lot of sunsets and sunrises.

CycleGAN is defined by quite a few different losses. The network comprises two generators and two discriminators, and the losses help us understand what the model is doing when mapping images between domain A and domain B (a minimal sketch of how they are computed follows the list):

- D_A and G_A are standard adversarial GAN losses measuring how well the generator “fools” the discriminator. The generator tries to create a picture that looks like it belongs to domain A (for example a horse), while the discriminator tries to tell real pictures from generated ones. The same losses apply in the B domain.
- cycle_A and cycle_B enforce consistency in the mappings. An image from A is mapped to B, and the result is mapped back to A. The objective is to recover the original image with as little variation as possible, which helps preserve features between mappings.
- idt_A and idt_B bring stability by ensuring that an image already in the output domain is left unchanged. For example, if we take an image of a zebra and try to turn it into a zebra (as if it were a horse), no change should happen. This keeps the model from altering images more than necessary, for instance by shifting colors.
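To make these losses concrete, here is a minimal PyTorch sketch of how they could be computed. The generators G_A, G_B and discriminators D_A, D_B are assumed to be defined elsewhere as regular nn.Module networks, and the naming follows this post’s convention (G_A produces domain-A images, D_A judges domain-A images); the official repository organizes its code somewhat differently.

```python
import torch
import torch.nn as nn

# Criterion choices follow the CycleGAN paper: least-squares adversarial
# loss, L1 cycle-consistency loss, L1 identity loss.
adv_criterion = nn.MSELoss()
cycle_criterion = nn.L1Loss()
idt_criterion = nn.L1Loss()

def generator_losses(G_A, G_B, D_A, D_B, real_A, real_B,
                     lambda_cycle=10.0, lambda_idt=0.5):
    """Generator-side losses for one batch (weights are the paper's defaults)."""
    # Adversarial losses: each generator tries to make the corresponding
    # discriminator predict "real" (ones) on its fake images.
    fake_A = G_A(real_B)
    fake_B = G_B(real_A)
    pred_A = D_A(fake_A)
    pred_B = D_B(fake_B)
    loss_G_A = adv_criterion(pred_A, torch.ones_like(pred_A))
    loss_G_B = adv_criterion(pred_B, torch.ones_like(pred_B))

    # Cycle-consistency: A -> B -> A should reconstruct the input.
    rec_A = G_A(fake_B)  # fake_B came from real_A
    rec_B = G_B(fake_A)  # fake_A came from real_B
    loss_cycle_A = cycle_criterion(rec_A, real_A) * lambda_cycle
    loss_cycle_B = cycle_criterion(rec_B, real_B) * lambda_cycle

    # Identity: an image already in the target domain passes through unchanged.
    loss_idt_A = idt_criterion(G_A(real_A), real_A) * lambda_cycle * lambda_idt
    loss_idt_B = idt_criterion(G_B(real_B), real_B) * lambda_cycle * lambda_idt

    return (loss_G_A + loss_G_B + loss_cycle_A + loss_cycle_B
            + loss_idt_A + loss_idt_B)

def discriminator_loss(D, real, fake):
    # The discriminator tries to output "real" (ones) on real images and
    # "fake" (zeros) on generated ones; detach so no gradient reaches
    # the generator during the discriminator update.
    pred_real = D(real)
    pred_fake = D(fake.detach())
    return 0.5 * (adv_criterion(pred_real, torch.ones_like(pred_real))
                  + adv_criterion(pred_fake, torch.zeros_like(pred_fake)))
```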
As you can see from the charts, the losses decrease overall while oscillating within a range. The best way to see whether the model is performing well is to look at sample predictions, and the W&B interface makes this easy (a minimal logging sketch follows the examples below). You can see some example images in my logged runs:
- Here’s an example transitioning between winter ↔ summer.
- Here’s another example flipping between zebra ↔ horse.
- Finally another example between Monet paintings ↔ real pictures.
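If you want to log image samples like these from your own training loop, a minimal sketch using the wandb library might look like the following. The project name, `sample_pairs`, `real_horse`, and `fake_zebra` are illustrative placeholders for your own run configuration and data, not part of any API.

```python
import wandb

# Start a run; the project name here is illustrative.
run = wandb.init(project="cyclegan-horse2zebra")

# `sample_pairs` is assumed to be an iterable of (input, output) images
# produced during training (tensors, numpy arrays, or PIL images all work).
for step, (real_horse, fake_zebra) in enumerate(sample_pairs):
    wandb.log({
        "examples": [
            wandb.Image(real_horse, caption="real horse"),
            wandb.Image(fake_zebra, caption="generated zebra"),
        ]
    }, step=step)

run.finish()
```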
Here are some of the most amazing samples I obtained going from Monet paintings to real pictures.




And here are some examples going the other way: real pictures to Monet paintings.


If you’re interested in learning more, check out our fork to easily log CycleGAN experiments with Weights & Biases.