Miles Zimmerman, a 31-year-old San Francisco engineer, had his mind blown earlier this month while playing with Midjourney, an AI-powered application that generates images from a simple text prompt.
His prompt read: “A candid photo of some happy 20-something year-olds in 2018 dressed up for a night out, enjoying themselves mid-dance at a house party in some apartment in the city, photographed by Nan Goldin, taken with a Fujifilm Instax Mini 9, flash, candid, natural, spontaneous, youthful, lively, carefree.”
Within seconds, Midjourney spat out image after made-up image of gorgeous young people having a good time at a party.
Zimmerman was taken aback at first by the depth of detail. Faces, skin, hair, and clothing were lifelike (although somewhat artificial, as some onlookers subsequently pointed out), and the expressions were precisely what he had requested. But the closer he looked, the stranger the images became. A happy woman posing for a photo with a companion while holding a point-and-shoot camera had far too many fingers on her left hand: nine in all. Another had the right number of fingers, but they were abnormally long. Almost everyone had too many teeth.
He shared the photos on Twitter, and they instantly went viral.
Midjourney, Stable Diffusion, and DALL-E 2 have all grown in popularity in recent months. These applications, driven by a new kind of artificial intelligence known as generative AI, allow anybody to make practically any type of picture they want using simple text instructions, and they have drawn both enthusiasm and criticism.
The systems work because they are “trained” to detect the connections between billions of images scraped from the internet and the written descriptions that accompany them, until the software “understands” that the word “dog” corresponds to a picture of a canine, for example. These collections of images and descriptions are referred to as “datasets.”
AI-generated art is now winning contests and being used by artists to illustrate articles and newsletters, among other things.
Despite rapid advances, AI-powered image generators continue to fail miserably at one task in particular: creating realistic-looking human hands. When I gave the world’s leading AI image generators, Stable Diffusion, DALL-E 2, and Midjourney, a simple prompt, “human hands,” here’s what they produced.
Still, generative AI will one day become substantially better at producing images of hands, feet, and teeth. For AI to be a valuable tool for humanity, it must first understand what it means to be human, as well as the physical reality of being human.