Stevenson says he has tried out DALL-E at every step of the process that an animation studio uses to produce a film, including designing characters and environments. With DALL-E, he was able to do the work of multiple departments in a few minutes. "It's uplifting for all the folks who've never been able to create because it was too expensive or too technical," he says. "But it's terrifying if you're not open to change."

Nelson thinks there's still more to come. Eventually, he sees this technology being embraced not only by media giants but also by architecture and design firms. "Right now it's like you have a little magic box, a little wizard," he says. That's great if you just want to keep generating images, but not if you need a creative partner. "If I want it to create stories and build worlds, it needs far more awareness of what I'm creating," he says.

Under the hood, text-to-image models have two key components: one neural network trained to pair an image with text that describes that image, and another trained to generate images from scratch. The basic idea is to get the second neural network to generate an image that the first neural network accepts as a match for the prompt. The big breakthrough behind the new models is in the way images get generated.

The first version of DALL-E used an extension of the technology behind OpenAI's language model GPT-3, producing images by predicting the next pixel in an image as if pixels were words in a sentence. "It was not a magical experience," says Altman. Instead, DALL-E 2 uses something called a diffusion model. Diffusion models are neural networks trained to clean images up by removing the pixelated noise that the training process adds. That process involves taking images and changing a few pixels in them at a time, over many steps, until the original images are erased and you're left with nothing but random pixels.
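The noising process described above can be sketched in a few lines. This is a toy stand-in, not DALL-E 2's actual code: it assumes a fixed per-step noise fraction `beta`, whereas real diffusion models vary the schedule across steps.

```python
import math
import random

def forward_diffusion(pixels, num_steps=1000, beta=0.02, seed=0):
    """Toy sketch of the forward (noising) process: at each step, shrink
    the signal slightly and mix in a little Gaussian noise. After enough
    steps the original image is erased and only noise remains."""
    rng = random.Random(seed)
    x = list(pixels)
    keep = math.sqrt(1.0 - beta)   # how much signal survives each step
    mix = math.sqrt(beta)          # how much fresh noise is blended in
    for _ in range(num_steps):
        x = [keep * p + mix * rng.gauss(0.0, 1.0) for p in x]
    return x

# A flat "image" of 64 bright pixels. After 1,000 steps, the surviving
# signal is (1 - beta) ** (num_steps / 2), roughly 4e-5: pure noise.
noised = forward_diffusion([1.0] * 64)
```

Training then teaches a network to undo one of these small steps at a time; generating an image means running that learned denoiser in reverse, starting from random pixels.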
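The two-component idea can also be sketched as a toy loop: a generator proposes candidates, and a pairing model keeps the one that best matches the prompt. Everything here is a hypothetical stand-in (the fake scorer, the four-pixel "images"); production systems like DALL-E 2 steer generation with a trained image-text model such as CLIP rather than filtering random candidates.

```python
import random

def match_score(image, prompt):
    """Stand-in for the first network, an image-text pairing model.
    This fake scorer rewards images whose mean brightness matches a
    target derived from the prompt; a real one is a trained network."""
    target = (len(prompt) % 10) / 10.0
    mean = sum(image) / len(image)
    return 1.0 - abs(mean - target)

def generate_for_prompt(prompt, tries=32, seed=0):
    """Stand-in for the second network plus the matching loop: propose
    candidates and keep the one the pairing model scores highest."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(tries):
        candidate = [rng.random() for _ in range(4)]  # fake 4-pixel image
        score = match_score(candidate, prompt)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

best_image, best_score = generate_for_prompt("an astronaut riding a horse")
```

The design point the sketch illustrates: the scorer never generates anything, and the generator never reads the prompt; matching the two is what turns a plain image generator into a text-to-image system.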
The pace of development has been breathtaking. In just a few months, the technology has inspired hundreds of newspaper headlines and magazine covers, filled social media with memes, kicked a hype machine into overdrive, and set off an intense backlash.

"The shock and awe of this technology is amazing, and it's fun; it's what new technology should be," says Mike Cook, an AI researcher at King's College London who studies computational creativity. "But it's moved so fast that your initial impressions are being updated before you even get used to the idea. I think we're going to spend a while digesting it as a society."

For Chad Nelson, a digital creator who has worked on video games and TV shows, text-to-image models are a once-in-a-lifetime breakthrough. "This tech takes you from that lightbulb in your head to a first sketch in seconds," he says. "The speed at which you can create and explore is revolutionary, beyond anything I've experienced in 30 years."

Within weeks of their debut, people were using these tools to prototype and brainstorm everything from magazine illustrations and marketing layouts to video-game environments and movie concepts. People generated fan art, even whole comic books, and shared them online in the thousands. Altman even used DALL-E to generate designs for sneakers that someone then made for him after he tweeted the image.

Amy Smith, a computer scientist at Queen Mary University of London and a tattoo artist, has been using DALL-E to design tattoos. "You can sit down with the client and generate designs together," she says. "We're in a revolution of media generation."

Stock image companies are taking different positions. Shutterstock has signed a deal with OpenAI to embed DALL-E in its website and says it will start a fund to reimburse artists whose work has been used to train the models.
OpenAI signed up a million users in just 2.5 months. More than a million people started using Stable Diffusion via its paid-for service DreamStudio in less than half that time; many more used Stable Diffusion through third-party apps or installed the free version on their own computers. (Emad Mostaque, Stability AI's founder, says he's aiming for a billion users.)

And then in October we had Round Two: a spate of text-to-video models from Google, Meta, and others. Instead of just generating still images, these can create short video clips, animations, and 3D pictures.