Now that we’re in the last month of 2023, we can safely say that this has been the year of AI. And, perhaps sadly, it sounds like 2024 and many years after it will be all about AI too. That’s mostly because the AI wars are just starting to heat up as another major player steps further into the game.

Meta has announced several new developments in its own AI endeavors, including a new generative AI video feature called Emu Video, plus an image editing tool called Emu Edit, which could find its way to video soon too.

Let’s take a look at these new AI technologies and explore how they might further bring about change in this AI video space.

Emu Video

Leveraging Meta’s new Emu model, this text-to-video generation tool is based on diffusion models and should operate similarly to other prompt-driven generative AI programs like Runway, Midjourney, or Pika Labs. Meta has shared quite a bit about its process here and how Emu Video makes use of a “unified architecture for video generation tasks that can respond to a variety of inputs: text only, image only, and both text and image.”

What sets Emu Video apart is a simplified approach that uses just two diffusion models to generate four-second, 512x512 videos at 16 frames per second. Meta has also done quite a bit of testing so far, relying on human evaluation rather than machine-scored results alone to judge the output.
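To make the two-model idea a little more concrete, here is a minimal Python sketch of how such a pipeline could be wired together. It assumes, based on the title of Meta’s paper (Factorizing Text-to-Video Generation by Explicit Image Conditioning), that the first model turns a text prompt into an image and the second turns the text plus that image into video. The function names and the Image type are hypothetical placeholders rather than Meta’s actual code or API, and no real diffusion happens here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Output specs reported for Emu Video: 512x512, four seconds, 16 fps.
WIDTH, HEIGHT = 512, 512
SECONDS, FPS = 4, 16


@dataclass
class Image:
    """Hypothetical stand-in for a generated frame (no pixel data)."""
    width: int
    height: int
    prompt: str  # the text this frame was conditioned on


def text_to_image(prompt: str) -> Image:
    """Stage 1 placeholder: a real system would run a diffusion model here."""
    return Image(WIDTH, HEIGHT, prompt)


def image_to_video(prompt: str, first_frame: Image) -> List[Image]:
    """Stage 2 placeholder: conditions on both the text and the image."""
    frame_count = SECONDS * FPS
    return [Image(first_frame.width, first_frame.height, prompt)
            for _ in range(frame_count)]


def generate_clip(prompt: str, image: Optional[Image] = None) -> List[Image]:
    """Text only: create the image first. Text plus image: skip stage 1."""
    if image is None:
        image = text_to_image(prompt)
    return image_to_video(prompt, image)


clip = generate_clip("a corgi surfing at sunset")
print(len(clip))  # 64 frames: 4 seconds x 16 frames per second
```

The takeaway is mostly the arithmetic: a four-second clip at 16 frames per second works out to 64 frames, all anchored to a single starting image.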

Of course, at launch, this will be quite elementary compared to what a human video editor or animator could create. It will also be limited to animation-style videos for now. But, as always, it’s not really about what this AI can do today; it’s about what it will be able to do tomorrow.

Emu Edit

The other major AI news from Meta is its Emu Edit feature, which is set to provide precise image editing via recognition and generation tasks. Most likely designed to be made available right away in apps like Instagram and Facebook, this image editing AI tool aims to solve many of the issues users have found with generative AI, where the results don’t always resemble what was put into the prompts.

Emu Edit takes a new approach to prompts by combining various image manipulation tasks to bring enhanced capabilities and precision to image editing. Emu Edit is capable of free-form editing through instructions and can perform some helpful tasks like removing or adding a background, color and geometry transformations, and other detection and segmentation commands.

And that appears to be Meta’s biggest claim for its AI: that Emu Edit will be able to follow instructions more precisely than any other image generation AI tool or app. Of course, this is image only at this point, but as the technology develops, it’s very likely going to include video too.

Factorizing Text-to-Video Generation by Explicit Image Conditioning

The Long Road Ahead

According to Meta, these new AI tools are just the start, and the company plans to keep refining them and possibly introduce new AI tools and features in the future. The uses right now seem pretty simple and mundane, probably limited to generating animated stickers and clever GIFs to send to friends.

But as these AI technologies continue to develop, in particular Emu’s ability to follow more sophisticated prompts, they’ll likely get better, smarter, and more competitive with human capabilities.

Of course, this is all hopefully a long road away. But at the speed that innovations are already happening, and with the biggest players in the game doubling down on AI, it seems like that road is growing shorter by the day.