Image-to-Image Translation Using CycleGAN and Pix2Pix

A practical introduction to image-to-image translation, complete with code and examples. Made by Ayush Chaurasia using Weights & Biases

Image-to-Image Translation

Image-to-image translation is a vision task where the goal is to learn the mapping between an input image and an output image. It covers specific tasks like enhancement and style transfer, among others.
In this report, we'll look at two methods used for image-to-image translation: CycleGAN and Pix2Pix.

Official project repository - pytorch-CycleGAN-and-pix2pix

Follow along using Colabs!

CycleGAN
Pix2Pix
Let's start with CycleGAN and then move on to Pix2Pix.

CycleGAN

Architecture Overview

Figure is from CycleGAN [Zhu*, Park*, et al., ICCV 2017]
CycleGAN consists of two mapping functions G : X → Y and F : Y → X, and associated adversarial discriminators D_Y and D_X.
D_Y encourages G to translate X into outputs indistinguishable from domain Y, and vice versa for D_X and F. To further regularize the mappings, two cycle-consistency losses are introduced, capturing the intuition that if we translate from one domain to the other and back again, we should arrive where we started.

Objective function:

Consistency losses:
forward cycle-consistency loss: x → G(x) → F(G(x)) ≈ x
backward cycle-consistency loss: y → F(y) → G(F(y)) ≈ y
L_{cyc}(G, F) = E_{x∼p_{data}(x)}[||F(G(x)) − x||_1] + E_{y∼p_{data}(y)}[||G(F(y)) − y||_1]
Adversarial Losses:
L_{GAN}(G, D_Y, X, Y) = E_{y∼p_{data}(y)}[log D_Y(y)] + E_{x∼p_{data}(x)}[log(1 − D_Y(G(x)))]
The second adversarial loss, L_{GAN}(F, D_X, Y, X), is defined analogously for the mapping F and its discriminator D_X.
Combined objective function:
L(G, F, D_X, D_Y) = L_{GAN}(G, D_Y, X, Y) + L_{GAN}(F, D_X, Y, X) + λ L_{cyc}(G, F)
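To make the objective concrete, here is a minimal PyTorch sketch of the generator-side loss, assuming G, F, D_X, and D_Y are already-defined modules and lam weights the cycle term. It is an illustration only: the official implementation differs in details (for example, it uses a least-squares GAN loss rather than the cross-entropy that matches the log terms above).

import torch
import torch.nn.functional as nnF

def cyclegan_generator_loss(G, F, D_X, D_Y, real_x, real_y, lam=10.0):
    """Sketch of the combined CycleGAN objective for one generator update."""
    fake_y = G(real_x)   # X -> Y
    fake_x = F(real_y)   # Y -> X

    # Adversarial terms: each generator tries to make its discriminator output "real".
    # Cross-entropy matches the log terms above; the paper's code uses least squares instead.
    pred_fake_y = D_Y(fake_y)
    pred_fake_x = D_X(fake_x)
    adv_G = nnF.binary_cross_entropy_with_logits(pred_fake_y, torch.ones_like(pred_fake_y))
    adv_F = nnF.binary_cross_entropy_with_logits(pred_fake_x, torch.ones_like(pred_fake_x))

    # Cycle-consistency: X -> Y -> X and Y -> X -> Y should reconstruct the inputs (L1 norm).
    cyc = nnF.l1_loss(F(fake_y), real_x) + nnF.l1_loss(G(fake_x), real_y)

    return adv_G + adv_F + lam * cyc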

Results

Inference

To download any of the available pre-trained models, run the following command inside the repository.
To log your test results to W&B, pass the --use_wandb flag to the script.
bash ./scripts/download_cyclegan_model.sh {model_name}
Here's an example of inference with the horse2zebra model:
bash ./scripts/download_cyclegan_model.sh horse2zebra
python test.py --dataroot datasets/horse2zebra/testA --use_wandb --name horse2zebra_pretrained --model test --no_dropout

Training

Download the dataset of your choice.
Available datasets are: apple2orange, summer2winter_yosemite, horse2zebra, monet2photo, cezanne2photo, ukiyoe2photo, vangogh2photo, maps, cityscapes, facades, iphone2dslr_flower, ae_photos
bash ./datasets/download_cyclegan_dataset.sh {dataset_name}
To track your training progress, pass --use_wandb to the training script.
python train.py --use_wandb --dataroot ./datasets/horse2zebra --name horse2zebra --model cycle_gan

Pix2Pix

The Pix2Pix GAN is an approach to training a deep convolutional neural network for paired image-to-image translation tasks. Pix2Pix is based on conditional GANs, which learn a mapping from an observed image x and a random noise vector z to an output image y:
G : {x, z} → y
The generator G is trained to produce outputs that cannot be distinguished from “real” images by an adversarially trained discriminator, D, which in turn is trained to do as well as possible at detecting the generator’s “fakes”.
Figure is from Pix2Pix [Isola et al., CVPR 2017]
Figure: Training a conditional GAN to map edges→photo. The discriminator, D, learns to classify between fake (synthesized by the generator) and real {edge, photo} tuples. The generator, G, learns to fool the discriminator. Unlike an unconditional GAN, both the generator and discriminator observe the input edge map.
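Below is a minimal, hypothetical PyTorch sketch of the conditional setup described above: the discriminator scores (input, output) pairs, and the generator tries to make its fake pair look real. The function and variable names are assumptions for illustration, not the repository's code; the full Pix2Pix objective also adds an L1 term between G(x) and y.

import torch
import torch.nn.functional as nnF

def pix2pix_gan_losses(G, D, real_in, real_out):
    """Sketch of conditional GAN losses: D scores (input, output) pairs, G tries to fool D."""
    fake_out = G(real_in)  # in the paper, the noise z enters only through dropout inside G

    # Discriminator: real pair -> label 1, fake pair -> label 0 (fake detached so only D updates).
    pred_real = D(torch.cat([real_in, real_out], dim=1))
    pred_fake = D(torch.cat([real_in, fake_out.detach()], dim=1))
    loss_D = nnF.binary_cross_entropy_with_logits(pred_real, torch.ones_like(pred_real)) + \
             nnF.binary_cross_entropy_with_logits(pred_fake, torch.zeros_like(pred_fake))

    # Generator: make D classify the fake pair as real.
    pred_fake_for_G = D(torch.cat([real_in, fake_out], dim=1))
    loss_G = nnF.binary_cross_entropy_with_logits(pred_fake_for_G, torch.ones_like(pred_fake_for_G))

    return loss_G, loss_D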

Inference

First, you need to download any of the available pre-trained pix2pix models. Alternatively, you can train your own model as described in the next section.
bash ./scripts/download_pix2pix_model.sh {model_name}
To log your test results to W&B, pass the --use_wandb flag to the script.
python test.py --dataroot ./datasets/facades --use_wandb --name facades_pix2pix --model pix2pix --direction BtoA

Training

Download the dataset of your choice. The available datasets are cityscapes, night2day, edges2handbags, edges2shoes, facades, maps
In this example, we'll train a model to generate building images from outlines.
bash ./datasets/download_pix2pix_dataset.sh facades
To track your training progress, pass --use_wandb to the training script.
python train.py --use_wandb --dataroot ./datasets/facades --name facades_pix2pix --model pix2pix --direction BtoA

Compare results across runs

You can explore all your experiments and compare the performance of the latest models in the results table panel of your project dashboard. You can also compare predictions across epochs by inspecting the tables logged in the Artifacts panel of a particular run.
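If you prefer to log predictions yourself rather than rely on the --use_wandb flag, a table of input/output images can be built with the standard wandb API. The project name and random placeholder images below are purely illustrative; substitute your own outputs.

import numpy as np
import wandb

run = wandb.init(project="cyclegan-pix2pix", job_type="inference")

# Build a table of input/output pairs; replace the random arrays with real images.
table = wandb.Table(columns=["step", "input", "output"])
for step in range(3):
    inp = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
    out = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
    table.add_data(step, wandb.Image(inp), wandb.Image(out))

run.log({"predictions": table})
run.finish()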

Try it yourself!

CycleGAN
Pix2Pix