Google’s Lumiere brings AI video closer to real than unreal


Google’s new video generation AI model Lumiere uses a new diffusion model called Space-Time-U-Net, or STUNet, that figures out where things are in a video (space) and how they simultaneously move and change (time). Ars Technica reports this method lets Lumiere create the video in one process instead of putting smaller still frames together.
Lumiere starts by creating a base frame from the prompt. Then, it uses the STUNet framework to approximate where objects within that frame will move, producing more frames that flow into each other and create the appearance of seamless motion. Lumiere also generates 80 frames, compared to 25 frames from Stable Video Diffusion.
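To make the "space and time" idea a little more concrete, here is a toy sketch of shrinking a clip along its time axis as well as its height and width in a single step. It assumes simple average pooling on a made-up grayscale clip; the function names are hypothetical, and this is only an illustration of treating a video as one space-time block, not Google's code or the actual STUNet architecture.

```python
def make_clip(frames, height, width):
    """Build a toy grayscale clip as nested lists: clip[t][y][x]."""
    return [[[float(t + y + x) for x in range(width)]
             for y in range(height)]
            for t in range(frames)]


def downsample_space_time(clip, factor=2):
    """Average-pool a clip by `factor` along time, height, and width at once."""
    frames, height, width = len(clip), len(clip[0]), len(clip[0][0])
    pooled = []
    for t0 in range(0, frames - factor + 1, factor):
        plane = []
        for y0 in range(0, height - factor + 1, factor):
            row = []
            for x0 in range(0, width - factor + 1, factor):
                # Collect one small space-time block and average it.
                block = [clip[t][y][x]
                         for t in range(t0, t0 + factor)
                         for y in range(y0, y0 + factor)
                         for x in range(x0, x0 + factor)]
                row.append(sum(block) / len(block))
            plane.append(row)
        pooled.append(plane)
    return pooled


# An 80-frame clip, matching the frame count mentioned above (8x8 pixels is arbitrary).
clip = make_clip(frames=80, height=8, width=8)
small = downsample_space_time(clip)
print(len(small), len(small[0]), len(small[0][0]))  # prints: 40 4 4
```

The point of the sketch is only that time is handled the same way as the two spatial axes, so the whole clip is processed as one object rather than frame by frame.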
Admittedly, I am more of a text reporter than a video person, but the sizzle reel Google published, along with a pre-print scientific paper, shows that AI video generation and editing tools have gone from uncanny valley to near realistic in just a few years. It also establishes Google’s tech in the space already occupied by competitors like Runway, Stable Video Diffusion, or Meta’s Emu. Runway, one of the first mass-market text-to-video platforms, released Runway Gen-2 in March last year and has started to offer more realistic-looking videos, though Runway videos still have a hard time portraying movement.
Google was kind enough to put clips and prompts on the Lumiere site, which let me put the same prompts through Runway for comparison. Here are the results:
Yes, some of the clips presented have a touch of artificiality, especially if you look closely at skin texture or if the scene is more atmospheric. But look at that turtle! It moves like a turtle actually would in water! It looks like a real turtle! I sent the Lumiere intro video to a friend who is a professional video editor. While she pointed out that “you can clearly tell it’s not entirely real,” she was impressed that, had I not told her it was AI, she would have thought it was CGI. (She also said: “It’s going to take my job, isn’t it?”)
Other models stitch videos together from generated key frames where the movement already happened (think of drawings in a flip book), while STUNet lets Lumiere focus on the movement itself based on where the generated content should be at a given time in the video.
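The flip-book comparison can be sketched with a one-dimensional toy: an object whose position follows a curve over time. The first approach below computes positions only at a few keyframes and fills the gaps with linear interpolation; the second computes every frame directly. The motion function, the frame spacing, and all names are assumptions made for illustration, and none of the models mentioned here work on a single number like this.

```python
import math


def true_position(t):
    """Hypothetical ground-truth motion: an object bobbing up and down."""
    return math.sin(t)


def keyframes_then_interpolate(times, step):
    """Compute positions only at every `step`-th frame, then fill gaps linearly."""
    last = len(times) - 1
    positions = []
    for i in range(len(times)):
        lo = (i // step) * step      # nearest keyframe at or before frame i
        hi = min(lo + step, last)    # next keyframe, clamped to the final frame
        if hi == lo:
            positions.append(true_position(times[lo]))
            continue
        w = (i - lo) / (hi - lo)     # how far frame i sits between the two keyframes
        positions.append((1 - w) * true_position(times[lo])
                         + w * true_position(times[hi]))
    return positions


times = [i * 0.5 for i in range(17)]            # 17 frames, half a time unit apart
all_at_once = [true_position(t) for t in times]
stitched = keyframes_then_interpolate(times, step=8)

errors = [abs(a - b) for a, b in zip(all_at_once, stitched)]
print("error at keyframes:", errors[0], errors[8], errors[16])
print("largest error between keyframes:", max(errors))
```

The stitched version matches exactly at the keyframes but drifts in between, because whatever the object did between two keyframes was never computed. That gap is the kind of movement problem the paragraph above describes.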
Google has not been a big player in the text-to-video category, but it has slowly released more advanced AI models and leaned into a more multimodal focus. Its Gemini large language model will eventually bring image generation to Bard. Lumiere is not yet available for testing, but it shows Google’s capability to develop an AI video platform that is comparable to — and arguably a bit better than — generally available AI video generators like Runway and Pika. And just a reminder, this was where Google was with AI video two years ago.
Beyond text-to-video generation, Lumiere will also allow for image-to-video generation; stylized generation, which lets users make videos in a specific style; cinemagraphs that animate only a portion of a video; and inpainting to mask out an area of the video to change the color or pattern.
Google’s Lumiere paper, though, noted that “there is a risk of misuse for creating fake or harmful content with our technology, and we believe that it is crucial to develop and apply tools for detecting biases and malicious use cases to ensure a safe and fair use.” The paper’s authors didn’t explain how this can be achieved.