This paper presents an approach for learning to translate images from a source domain X to a target domain Y in the absence of paired training examples, as shown below.

Turning a horse video into a Zebra video using CycleGANs
Various examples of style transfer
Example of paired and unpaired training data. (Paired): for each sketch (source domain) there is a corresponding photograph (target domain) of the same item. (Unpaired): there is a collection of natural photographs of scenic locations and an independent collection of paintings of scenic locations, with no one-to-one correspondence between the images in the source and target domains.


  1. The architecture consists of two generators: one translates images from the source domain to the target domain, while the other maps them back, reconstructing the original image in the source domain.
  2. The loss function consists of two terms: i) the usual adversarial loss of GANs, which pushes the generator to produce images that the discriminator cannot distinguish from real images in the target domain, and ii) an additional cycle-consistency loss (an L1 loss in the original paper) between the input image and its reconstruction, which ensures that the content of the original image is retained in the target domain.
  3. The cycle-consistency loss is important because the adversarial loss alone only ensures that the generator's output looks like a realistic horse; it does not ensure that the content of the input image is carried over. For example, looking at the architecture diagram below, if the translated image contained just one realistic-looking horse instead of the two zebras present in the source image, the discriminator would still be satisfied. The cycle-consistency loss ensures that the content of the image (here, the two zebras and their relative positions) is preserved in the target domain.
Architecture of a CycleGAN [3]
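The two-generator cycle and its loss terms can be sketched as follows. This is a toy NumPy illustration, not the paper's convolutional architecture: `G` and `F` are simple invertible array transforms standing in for the two generators, the discriminator score is a random placeholder, and the least-squares form of the adversarial loss plus the cycle weight `lam = 10.0` follow the original paper.

```python
import numpy as np

# Toy stand-ins for the two generators: G maps source -> target
# (e.g. zebra -> horse), F maps target -> source. Real CycleGANs use
# convolutional networks; these affine transforms only illustrate the
# loss computation.
def G(x):
    return x * 0.9 + 0.05          # source -> target

def F(y):
    return (y - 0.05) / 0.9        # target -> source

def cycle_consistency_loss(x, reconstructed_x):
    # L1 penalty between the original image and its reconstruction F(G(x)).
    return np.mean(np.abs(x - reconstructed_x))

def lsgan_generator_loss(d_fake):
    # Least-squares adversarial loss: the generator wants the
    # discriminator's scores on fake images to be close to 1 ("real").
    return np.mean((d_fake - 1.0) ** 2)

x = np.random.rand(1, 3, 64, 64)   # a "source domain" image batch
fake_y = G(x)                      # translated to the target domain
recon_x = F(fake_y)                # cycled back to the source domain

d_fake = np.random.rand(1)         # placeholder discriminator score on fake_y
lam = 10.0                         # cycle-loss weight from the paper

total_g_loss = lsgan_generator_loss(d_fake) + lam * cycle_consistency_loss(x, recon_x)
print(total_g_loss)
```

Because `F` exactly inverts `G` here, the cycle term is near zero and the total generator loss is dominated by the adversarial term; in training, both generators are learned jointly so that each approximately inverts the other.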


  1. Original paper: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
  2. Blog post:
  3. Dataset and Implementation details: