Updated: Apr 17
Stability AI—the company that released a text-to-image generation model named Stable Diffusion last year—found itself at the center of an interesting controversy.
Text-to-image generation models are neural networks trained to generate images from a text prompt supplied by the user. A prompt might be something like “An image of a hill overlooking a lake at dawn.” Given this prompt, the model generates an image depicting the scene described by the text. The issue at the heart of the controversy stems from these models being used to generate artwork that companies then sell for profit on marketplaces such as Etsy.
“What is so wrong about selling AI-generated artwork?” you ask.
Well, it turns out that training these models involves using a data set of millions of images scraped from different parts of the internet. These images are photographs, paintings, and other artworks created by artists who receive nothing from the sale of the AI-generated work. What’s worse, the AI algorithms are now competing for sales with some of these artists on the same platforms!
Is it fair for AI systems to be trained on the hard work of artists who are not paid for their contributions to these algorithms? Consider that these AI systems can generate images at a rate human artists will never match and, in a feat that adds insult to injury, can also emulate the artists’ own styles.
Whether the use of artwork scraped from the internet in this manner qualifies as copyright violation is a matter for lawyers to decide. I am interested in the technical nuances involved, and I’d like to raise a few questions that may help put the debate in perspective.
First, let’s address a fundamental aspect of how these generative models work. When a trained algorithm generates a brand-new image, the image output derives from a mathematical function that satisfies some statistical properties. That is, the model is not performing copy-and-paste operations from its data set of training images. The resulting output is not some Frankenstein collage of image patches from original artworks.
Instead, what is happening is that during training the model learns how to represent the pixels of images, say, of landscapes, as mathematical concepts. To generate an image of a person, the model learns a function that outputs a pixel grouping resembling a person.
This mathematical function typically models some probability distribution. So when the model outputs an image of a person, a bird, or a hill, it is simply outputting a pixel grouping that, based on the training data, has a good chance of looking like a person, bird, or hill.
But because we are dealing with probability distributions, the images the model outputs are not copies of the training data; they are brand-new samples drawn from a distribution estimated from the training data set. In other words, these are, in fact, brand-new images.
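The distinction between copying and sampling can be made concrete with a toy sketch. The code below is hypothetical and vastly simplified (a multivariate Gaussian standing in for a diffusion model, and 4-pixel vectors standing in for images), but it shows the same mechanic: “training” estimates a distribution’s parameters from the data, and “generation” draws a fresh sample from that distribution rather than retrieving any stored example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Stand-in training set: each "image" is a tiny 4-pixel vector.
training_images = rng.normal(loc=0.5, scale=0.1, size=(1000, 4))

# "Training": estimate the distribution's parameters from the data.
mean = training_images.mean(axis=0)
cov = np.cov(training_images, rowvar=False)

# "Generation": draw a brand-new sample from the learned distribution.
generated = rng.multivariate_normal(mean, cov)

# The output is shaped by the training data but copied from none of it:
# with a continuous distribution, an exact match is vanishingly unlikely.
is_copy = any(np.allclose(generated, img) for img in training_images)
print(is_copy)  # almost surely False
```

Real diffusion models learn an enormously more expressive distribution over pixels, but the relationship between training data and output is the same in kind: the data shapes the parameters, and the outputs are samples, not retrievals.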
With this understanding, if we still object to such use of original artwork, the objection must be on other grounds. We cannot simply say, “AI algorithms are blatantly copying the work of original artists to generate fakes.” Strictly speaking, there is no copying, although you might say the algorithms are imitating the original artists.
Suppose I am an aspiring artist learning to paint on canvas. I might visit the Louvre, the Prado, and MoMA. I might borrow every art book in the library and seek out every example of oil-on-canvas painting available on the web. I may spend thousands of hours painstakingly observing and imitating the works I see, attempting to perfect different techniques. Eventually, I may come to master some of those techniques and produce decent works of my own. The work I produce, although original, will undoubtedly have benefited from the work of the artists whose originals I studied to develop my own skills.
If I sell an original piece, do I owe part of my proceeds to them? If not, why is it OK for me to learn to paint by observing other people’s work, but not OK for an AI algorithm to do so? Is the problem simply that someone scraped images off the web to train the model, which looks like taking something they didn’t pay for to use in a money-making product, without paying restitution? What if, instead of scraping the images off the web, the model could be trained by looking at the images in their original locations without downloading them, just as a human would? Would this solve the problem?
Or is the issue that an AI algorithm can generate high-quality output at a rate that humans cannot compete with, and therefore what we are really worried about is protecting humans from a possible future where original artwork is obsolete? If so, this might be less a problem of using the original artworks and more a problem of scale.
I am not sure I know the answers to these questions, but I know they must be considered if we want to solve this problem. It is worth noting that, although the current debate revolves around visual arts, these problems will continue to arise as AI advances in multiple domains. AI is already being used to generate music. Perhaps prose and AI-generated novels will be next.
For those concerned that AI-generated art, in all domains, will eventually replace human artists entirely, perhaps we can look to the early days of the camera and find some comfort. Although many feared that photographs would make paintings obsolete, artists have kept finding ways to stay relevant.