Artistic and ethical implications of the artificial intelligence-powered image-generation software Dall-E 2.

Dall-E and Dall-E 2 are transformer (deep learning) models that use self-attention to weigh each part of the input data differently. The architecture has become widely popular in natural language processing (NLP) and computer vision (CV). Both models were developed by OpenAI, an artificial intelligence research laboratory. Dall-E 2 succeeds Dall-E and generates more vivid images at higher resolutions, combining concepts, characteristics and styles.
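To make "weighing each part of the input differently" concrete, here is a minimal, single-head sketch of scaled dot-product self-attention in NumPy. It is a toy illustration, not OpenAI's implementation; the dimensions and weight matrices are invented for the example.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x:            (seq_len, d_model) input token embeddings
    w_q/w_k/w_v:  (d_model, d_k) learned projection matrices
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v            # project tokens to queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])        # how strongly each token attends to every other
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability for the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # rows sum to 1: per-token attention weights
    return weights @ v                             # each output is a weighted mix of all values

# Toy example: a "sentence" of 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (4, 8)
```

Each row of the attention-weight matrix says how much one token "looks at" every other token, which is the mechanism the models use to relate words in a prompt to parts of an image.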
Think of an image. Any image, no matter how silly. A yellow bus walking on legs? A tortoise carrying a Prada purse? Now imagine a technology that turns that mental picture into a real image almost instantly. All you need are the right keywords, and a picture will be ready for you to download and share.
Our ancestors would have called it magic; we call it AI. The last few months have seen several text-to-image systems that require only a few keywords to generate images, a development that has been mind-boggling even to experts. An example is Craiyon, an image-generation system previously known as Dall-E mini (although it has no relation to Dall-E or Dall-E 2). It was the best system of its kind when it came out, free to use and widely available. The creative crowd quickly lapped it up, and the system was especially loved by meme-makers for its unique and creative imagery. Craiyon was trained on millions of images, which it draws on to respond to prompts. What was profound about the system, however, was that the images it spat out had never existed before: they were new each time they were requested.
However, its graphics were unclear and rather grainy, so creative folks waited for something better in its wake, and AI delivered. Meta's Make-A-Scene has already been announced, as has Google's Imagen. Nevertheless, none of these systems has garnered the sort of excitement that Dall-E 2 has. When Dall-E was born, the images it created did not have a very high resolution. Dall-E 2, on the other hand, offers four times the image resolution and even gives users the option to colour in and customise an image by adding shadows and textures. There is even an option to add or remove whole objects. How this works in real time is something the internet can explain better; we are only here to marvel at the precision of images flawless enough to grace the cover of a magazine. It's true: Dall-E 2 images have already graced a recent Cosmopolitan issue, which boasts the world's first artificially generated magazine cover.
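To give a sense of how this editing feature is exposed to developers, here is a hedged sketch using OpenAI's Python SDK. The file names, prompt, and parameters are invented for illustration, and access details may differ from the private beta described in this piece; the sketch assumes an API key in the OPENAI_API_KEY environment variable.

```python
# An illustrative sketch of Dall-E 2's image editing via OpenAI's
# Python SDK (pip install openai). File names are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# "original.png" is the source image; "mask.png" is the same size, with
# transparent pixels marking the region the model should repaint.
with open("original.png", "rb") as image, open("mask.png", "rb") as mask:
    response = client.images.edit(
        image=image,
        mask=mask,
        prompt="add a soft shadow under the tortoise and a leather texture to the purse",
        n=1,
        size="1024x1024",
    )

print(response.data[0].url)  # URL of the edited image
```

The masked region is the only part the model redraws, which is how whole objects can be added or removed while the rest of the picture stays intact.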
Any new technology offers avenues for evildoers to exploit. Dall-E 2 users have so far been quick to identify potential problems, which the designers have tried to nip in the bud. Using someone's real pictures, for instance, is a no-go; it's not that the technology isn't capable of churning such an image out, as it can do quite a decent job of it. Also off limits are violent or pornographic images, as the designers have taken the liberty of blocking certain keywords from their end. 'Shooting', for instance, is a word that will not produce an image, owing to its violent implications.
The advent of the Dall-E 2 technology has given rise to a number of questions. Will the system render the creative community obsolete? If images can be produced easily and by literally anyone, what does that mean for the future of stock photographers, graphic designers, commercial illustrators and even models? Is the AI racist or sexist? And who owns the images: the system, the company that built the system, or the person who envisioned flying purple pigs and manifested them on paper? Such concerns are occupying the minds of experts and will no doubt have to be addressed before the text-to-image system is released to the public. As of now, the technology is only available to a small group of private beta testers. Although there are reportedly more than 1 million requests on the waiting list, the company wishes to take a more gradual approach, adding a thousand new users every week.
The biggest and most glaringly obvious question to stem from all these goings-on is what actual artists make of the technology. One such artist, Bridget Moser, mentioned in an in-depth interview that the AI allows about 50 prompts per day and returns about six pictures per prompt. That is one designer looking at roughly 300 pictures every 24 hours, and it is fairly easy, at least in the beginning, to get carried away. She equates exploring the AI's pictures to sketching or brainstorming, and feels that she takes something from each image the system churns out to create the final product. Moser has experienced first-hand what it is like to feed forbidden words to the AI and how it blocks anything remotely gory or violent. Even so, some of the images that come out have been slightly disturbing, and she says she has posted some of the more balanced ones on her Instagram page.
Moser also claims to have tested the AI by asking for unrealistic, complicated photography. One of her first searches looked something like '12 rubber gloves in the air + in the woods at night + disposable camera + flash photo.' According to her, the resulting images were quite impressive and very close to what she had pictured in her mind. She feels there is an art to working with Dall-E 2 and that the many nuances of the technology come to each designer with practice and a lot of playing around. While expressing concern about the biases the AI carries, simply because inherently biased humans are training it, she also hopes that the designers find a way to overcome these problems and create a system that works without discrimination for any artist in the world. The technology is coming to us no matter what, she says, so it is best to be prepared for its pros and cons. As an artist, Moser feels that Dall-E 2, by giving her so many variations of her thoughts, makes it easy for her to pick and choose what she wants to make real.
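For readers curious what feeding a prompt to the system looks like programmatically, here is a minimal sketch using OpenAI's Python SDK, with Moser's own search as the prompt. This is an illustration under assumptions, not a record of her workflow: the model name, the parameters, and the availability of API access (the beta described here ran through a web interface) are all assumed for the example.

```python
# An illustrative prompt session via OpenAI's Python SDK (pip install openai).
# Assumes an API key in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

# Moser's compound prompt, stacking subject, setting, and photographic style.
prompt = ("12 rubber gloves in the air + in the woods at night + "
          "disposable camera + flash photo")

response = client.images.generate(
    model="dall-e-2",   # assumed model identifier for this sketch
    prompt=prompt,
    n=6,                # roughly the "about six pictures per prompt" Moser describes
    size="1024x1024",
)

for i, image in enumerate(response.data):
    print(f"Variation {i + 1}: {image.url}")  # each URL is a distinct rendering
```

The point of the compound phrasing is that each '+'-separated clause nudges the model toward a different attribute: subject matter, lighting, and camera style.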
Some artists worry that the AI can simply do their work better. The concern is real, since that is indeed the unsettling bit about any new technology: the risk of rendering previous practices redundant. However, others are confident that creative minds are better at enhancing these AI-churned images and so will always have the upper hand in exploiting the technology to their advantage. Troubling or not, the majority of artists who have had a chance to experiment with the technology feel that it has a lot of potential and have been collectively impressed by it.
While it remains to be seen when and how the system's designers address the concerns stemming from this otherwise game-changing technology, the future is a distorted mess of positives and negatives. Because of the open-ended nature of the system and its ability to churn out a large number of images, the effects of its kinks also seem far-reaching, adding to the designers' list of worries. The bolstering thought, however, is that, as with any new technology, more streamlining and polishing of the entire system will take place before it can be declared 'safe' for public use. Until then, artists and creative minds wait with bated breath to see what this whirlwind development means for their profession and for the future of art as we know it.