Using GANs to Produce Art in a Particular Style from Semantic Maps

The following are examples of the art generated by one of my later Pix2Pix models, trained with noise (I’ve continued working on this long after I turned the project in). As you can see, it’s not always the best, but I do think there are some genuinely good results mixed throughout, which gives me hope that I can get better results with a better training regimen.


We investigate the role of noise in the training process of an image-to-image style transfer GAN. We do this by comparing two models trained on the same training data, a set of semantic maps and their target images; in one of the models, however, we add noise to the semantic maps prior to training. We then apply both models to a semantic map that was not used in training and measure the Euclidean distance between the generated image and its target image, treating the RGB pixel values as spatial coordinates, to compare the accuracy of the two models.
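To make the comparison concrete, here is a minimal sketch of the two pieces described above: adding noise to a semantic map and scoring a generated image against its target. The function names, the choice of Gaussian noise, and the `sigma` value are all assumptions for illustration; the paper may use a different noise model. The distance here treats the whole image as one long vector of RGB values, which is one reading of "RGB values as spatial coordinates".

```python
import numpy as np

def add_gaussian_noise(semantic_map, sigma=10.0, seed=None):
    """Return a noisy copy of a uint8 RGB semantic map of shape (H, W, 3).

    Gaussian noise and the default sigma are assumptions, not the
    project's confirmed settings.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, semantic_map.shape)
    noisy = semantic_map.astype(np.float64) + noise
    # Keep values in the valid 8-bit range before converting back.
    return np.clip(noisy, 0, 255).astype(np.uint8)

def euclidean_distance(generated, target):
    """Euclidean distance between two images of identical shape,
    treating every RGB value as one coordinate."""
    if generated.shape != target.shape:
        raise ValueError("images must have the same shape")
    # Cast first so uint8 subtraction cannot wrap around.
    diff = generated.astype(np.float64) - target.astype(np.float64)
    return float(np.sqrt(np.sum(diff ** 2)))
```

A lower distance means the generated image is closer to its target, so the two models can be compared by evaluating both on the same held-out semantic map.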

Which image was generated by an AI and which one was created by a person?

Presentation and Paper

Presentation (PDF)
The image on the left is the target image, the middle image is the source image, and the image on the right is the generated image.
Paper (PDF)

As an aside, something that I think might be interesting is finding the difference in pixel values between the target image and the generated image and training two GANs to predict how far off the generated image may be from the target. If we were to keep with the usual analogies of counterfeiters and police officers, then you could think of these other two models as playing the role of touchup artists working with the counterfeiter. One touchup artist adds to, and the other subtracts from, the RGB values generated by the counterfeiter.

To train them, you apply the GAN to your original data set and take the matrix formed by the difference in pixel values between each target image and its generated image. You then apply a high (x>0) and a low (x<0) pass filter to that matrix, in the sense of a threshold on the sign of each value rather than a frequency filter: one keeps only the positive differences and the other only the negative ones. Then you take the absolute values of the remaining non-zero values, form a new image from each, build the two new datasets, and begin training the touchup artists. So, in the end, you’d have three models, and the linear combination of their generated images (the original output plus one touchup, minus the other) should, theoretically, conform better to your target image dataset.
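As a rough sketch of how the two touchup targets could be built from one target/generated pair, assuming 8-bit RGB arrays and a difference taken as target minus generated (the function names are hypothetical):

```python
import numpy as np

def split_residual(target, generated):
    """Split (target - generated) into two non-negative uint8 images.

    add_part holds what must be added to the generated image,
    sub_part holds what must be subtracted from it.
    """
    # int16 keeps the sign of the difference, which uint8 cannot.
    residual = target.astype(np.int16) - generated.astype(np.int16)
    add_part = np.clip(residual, 0, 255).astype(np.uint8)    # x > 0 side
    sub_part = np.clip(-residual, 0, 255).astype(np.uint8)   # |x| for x < 0
    return add_part, sub_part

def recombine(generated, add_part, sub_part):
    """Combine the generator output with the two touchup images."""
    out = (generated.astype(np.int16)
           + add_part.astype(np.int16)
           - sub_part.astype(np.int16))
    return np.clip(out, 0, 255).astype(np.uint8)
```

With the exact residuals, `recombine` reproduces the target perfectly; in practice the touchup models would only predict approximations of `add_part` and `sub_part`, so the result would be at best closer to the target than the original output.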


On the left, we have the images generated by each generation of the Pix2Pix model trained on data without noise. On the right, we have the target image.
On the left, we have the images generated by each generation of the Pix2Pix model trained on data with noise. On the right, we have the target image.

Code and Data Set


The data set:

Related Papers

