AI tools have seemingly popped up overnight, with each iteration trying to fill in some niche gap the former failed to see.
But Google has been somewhat silent during this recent AI explosion.
Until recently, that is. A team working with Google Research has announced Dreamix, which the company claims is a diffusion-based video editor. While it can’t generate videos from just a prompt, it can take existing material and modify that video using text prompts.
But how does it work? Can this new tech benefit creatives? Or should we start to worry that audiences will just be able to generate their own movies on the fly in the next decade?
The Nuts and Bolts of Dreamix
No, you probably shouldn’t be worried about being replaced by a Google app just yet.
While AI technology is moving forward in leaps and bounds, that Star Trek level of engagement is still a ways away. Google does have a track record of sending technology to the graveyard before it's had its time.
But with how incredibly valuable and popular AI tools have become, Dreamix could stay around for a bit.
According to the released info, when given video material, “Dreamix edits the video while maintaining fidelity to color, posture, object size, and camera pose, resulting in a temporally consistent video.” This is done by using "a video diffusion model to combine, at inference time, the low-resolution spatio-temporal information from the original video with new, high-resolution information that it synthesized to align with the guiding text prompt."
Inference overview. Credit: Google Dreamix
If that makes zero sense to you, don't worry. We had the same issue. The basic idea, as far as we can tell, is that the program takes the original video and adds noise. From there, it uses a video diffusion model and text prompts to "nudge" the original video in the direction of what you prompted.
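For the curious, here is a toy sketch of that "add noise, then nudge" idea in plain Python. To be clear, this is not Dreamix's code. The function names are made up for illustration, the "video" is just a grid of numbers, and `toy_denoiser` is a hypothetical stand-in for the real text-guided video diffusion model: it simply pulls every pixel toward a single target value that plays the role of the prompt.

```python
import random

def degrade(video, factor=2):
    """Keep every `factor`-th frame and pixel: a crude stand-in for the
    low-resolution spatio-temporal information taken from the original."""
    return [frame[::factor] for frame in video[::factor]]

def add_noise(video, strength, rng):
    """Blend each pixel with Gaussian noise.
    strength=0 keeps the video as is, strength=1 replaces it with pure noise."""
    return [[(1 - strength) * p + strength * rng.gauss(0.0, 1.0) for p in frame]
            for frame in video]

def toy_denoiser(video, prompt_target, step_size):
    """Hypothetical stand-in for a text-conditioned diffusion model:
    moves each pixel a fraction of the way toward the 'prompt' value."""
    return [[p + step_size * (prompt_target - p) for p in frame]
            for frame in video]

def edit_video(video, prompt_target, strength=0.5, steps=10, seed=0):
    """Degrade and noise the input, then nudge it toward the prompt step by step.
    More noise (higher strength) means less of the original survives."""
    rng = random.Random(seed)
    x = add_noise(degrade(video), strength, rng)
    for _ in range(steps):
        x = toy_denoiser(x, prompt_target, step_size=strength / steps)
    return x

# A tiny 4-frame "video" with 4 grayscale pixels per frame (assumed toy data).
video = [[0.0, 0.0, 0.0, 0.0] for _ in range(4)]
edited = edit_video(video, prompt_target=1.0, strength=0.5)
print(edited)
```

The knob to watch is `strength`. At 0 the output is just the downsampled original, and as it rises, the result drifts further from the source footage and closer to the prompt. That trade-off between fidelity and change is the intuition behind the description above, even though the real model is vastly more sophisticated.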
Below, Dreamix turns a video of a monkey into a dancing bear after being given the prompt: "A bear dancing and jumping to upbeat music, moving his whole body."
A video of an eating monkey is completely transformed. Credit: Dreamix
If you watch this in a vacuum, the quality won’t wow you at all. But considering how far AI-generated video has come since its psychedelic early days, these results are spectacular.
For now, the tech can only take existing inputs and turn them into AI-generated content. Those inputs include not only video but also photos (or a series of photos).
Why Creatives Should Care
Seeing something like this, a filmmaker or creative might be ready to hang up their artistic spurs, so to speak. In reality, this is an incredible step forward in a toolset that could help editors, directors, and cinematographers alike.
Imagine you’re pitching a project, and you’ve put together a killer sizzle reel. With Dreamix, creating something that fits your final narrative will be much easier.
For editors, Dreamix could help refine edits or even generate stock footage.
Cinematographers could even adjust material they’ve already shot to build storyboards.
While these use cases are strictly for creatives, Dreamix could (and probably will) find a bigger audience. Imagine social media filters, but instead of adjusting your face, you create completely new content. It is a bit like motion capture but on AI steroids. Sure, there could be cases of nefarious use for this type of tech, but people had the same thing to say about seatbelts.
If you want some nerdy details, you can read the research paper right here.
What do you think about Dreamix? How would you like to see the technology evolve? Let us know in the comments!