In December 2024, Google released Veo 2, a powerful video generation model. As Google put it: "YouTube creators are exploring the creative possibilities of video backgrounds for their YouTube Shorts, enterprise customers are enhancing creative workflows on Vertex AI and creatives are using VideoFX and ImageFX to tell their stories."
We'll explore Veo 2’s main features, highlight what sets it apart, showcase its capabilities, and explain how you can start using it with Google’s VideoFX tool.
Veo 2: State-of-the-Art Video Generation
Veo 2 creates incredibly high-quality videos in a wide range of subjects and styles. In head-to-head comparisons judged by human raters, Veo 2 achieved state-of-the-art results against leading models. It supports up to 4K resolution (though current outputs are limited to 720p in the VideoFX tool).
The model can be useful for anyone needing to generate AI videos, including marketers, creators, business owners, hobbyists, and possibly professional filmmakers.

Here are some of the features that we can expect from Veo 2:
- Realistic videos: Veo 2 generates detailed, lifelike videos with fewer errors than its predecessor.
- Advanced control: Users can provide specific instructions, such as selecting lens types, camera angles, or special effects, to customize the output.
- High resolution: Veo 2 supports video generation at up to 4K resolution, though output in VideoFX is currently limited to 720p.
- Smooth motion: The model incorporates an understanding of real-world physics, enabling it to create natural and accurate motion in scenes.
Veo 2 can handle both simple and complex instructions while creating videos that mimic real-world physics and different artistic styles.
How to Get Started With Veo 2 on VideoFX
VideoFX is Google’s experimental platform that lets you try out Veo 2.
On VideoFX, Veo 2 can create videos at 720p resolution and up to 8 seconds long. These limits belong to the tool, not the model: Veo 2 itself can generate videos at up to 4K resolution that run several minutes.
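
To put the gap between 720p and 4K in perspective, here is a small arithmetic sketch. The frame sizes are assumptions (the standard 16:9 dimensions for 720p and consumer 4K UHD); Google has not published the exact dimensions Veo 2 uses.

```python
# Illustrative arithmetic only. The 16:9 frame sizes below are assumed
# standard values, not confirmed Veo 2 output dimensions.
RESOLUTIONS = {
    "720p": (1280, 720),     # standard HD frame
    "4K UHD": (3840, 2160),  # consumer 4K frame
}


def pixels_per_frame(name: str) -> int:
    """Return the number of pixels in one frame at the named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


ratio = pixels_per_frame("4K UHD") / pixels_per_frame("720p")

print(f"720p:   {pixels_per_frame('720p'):,} pixels per frame")
print(f"4K UHD: {pixels_per_frame('4K UHD'):,} pixels per frame")
print(f"A 4K frame holds {ratio:.0f}x as many pixels as a 720p frame")
```

Under these assumptions, a 4K frame carries nine times the pixels of a 720p frame, which gives a rough sense of how much detail the VideoFX preview leaves out.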

To get started with Veo 2:
- Join the waitlist: Visit Google Labs and sign up. Access is being rolled out gradually, and it’s currently limited to U.S. users who are 18 or older.
- Write your prompt: Use cinematic language to guide Veo 2. For example, you could describe a "low-angle shot gliding through a scene" or a "close-up of a scientist looking into a microscope" to get professional-quality visuals.
- Experiment: Play around with different styles, genres, camera angles, or effects. You can even specify lenses like an "18mm lens" for wide shots or effects like "shallow depth of field" to blur the background.
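
If you plan to try many variations, it can help to assemble prompts from reusable parts. The sketch below is one possible way to do that in Python. `ShotPrompt` is a hypothetical helper invented for this illustration; it is not part of VideoFX or any Google API, and it only builds the text you would paste into the prompt box.

```python
from dataclasses import dataclass, field, replace


@dataclass
class ShotPrompt:
    """Hypothetical helper for composing a text prompt (not a Google API)."""

    shot: str                # camera framing, e.g. "Close-up"
    subject: str             # what the shot shows
    lens: str = ""           # optional lens description, e.g. "18mm lens"
    effects: list[str] = field(default_factory=list)  # e.g. depth of field

    def render(self) -> str:
        """Join the pieces into one comma-separated prompt sentence."""
        parts = [f"{self.shot} of {self.subject}"]
        if self.lens:
            parts.append(self.lens)
        parts.extend(self.effects)
        return ", ".join(parts) + "."


close_up = ShotPrompt(
    shot="Close-up",
    subject="a scientist looking into a microscope",
    effects=["shallow depth of field"],
)

# Experiment: keep the subject, swap the framing and add a wide lens.
wide = replace(close_up, shot="Low-angle shot", lens="18mm lens", effects=[])

print(close_up.render())
print(wide.render())
```

Keeping the subject fixed while changing one element at a time makes it easier to see which part of the prompt is responsible for a change in the output.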
Access is limited for now, but Google plans to expand Veo 2’s availability. In 2025, it could come to products like YouTube Shorts and Vertex AI, which would put it in the hands of many more people.
Veo 2 Video Examples
Let’s now have a look at some examples of videos that Veo 2 can create from a prompt (these are examples shared by the DeepMind team):
Scientist in a lab
Prompt: Cinematic shot of a female doctor in a dark yellow hazmat suit, illuminated by the harsh fluorescent light of a laboratory. The camera slowly zooms in on her face, panning gently to emphasize the worry and anxiety etched across her brow. She is hunched over a lab table, peering intently into a microscope, her gloved hands carefully adjusting the focus. The muted color palette of the scene, dominated by the sickly yellow of the suit and the sterile steel of the lab, underscores the gravity of the situation and the weight of the unknown she is facing. The shallow depth of field focuses on the fear in her eyes, reflecting the immense pressure and responsibility she bears.
Video Description: A dramatic close-up of a doctor wearing a protective hazmat suit, deeply focused as she looks into a microscope. The lighting and camera focus highlight the seriousness of her work.
Cartoon in a 1980s kitchen
Prompt: This medium shot, with a shallow depth of field, portrays a cute cartoon girl with wavy brown hair, sitting upright in a 1980s kitchen. Her hair is medium length and wavy. She has a small, slightly upturned nose, and small, rounded ears. She is very animated and excited as she talks to the camera.
Video Description: A fun, animated character comes to life in a retro kitchen, full of charm and colorful nostalgia.
Beekeeper on a farm
Prompt: The camera floats gently through rows of pastel-painted wooden beehives, buzzing honeybees gliding in and out of frame. The motion settles on the refined farmer standing at the center, his pristine white beekeeping suit gleaming in the golden afternoon light. He lifts a jar of honey, tilting it slightly to catch the light. Behind him, tall sunflowers sway rhythmically in the breeze, their petals glowing in the warm sunlight. The camera tilts upward to reveal a retro farmhouse with mint-green shutters, its walls dappled with shadows from swaying trees. Shot with a 35mm lens on Kodak Portra 400 film, the golden light creates rich textures on the farmer’s gloves, marmalade jar, and weathered wood of the beehives.
Video Description: A peaceful scene showing rows of painted beehives glowing in the sun, with a beekeeper holding a jar of honey, capturing the calm beauty of rural life.
Flamingos in a lagoon
Prompt: A low-angle shot captures a flock of pink flamingos gracefully wading in a lush, tranquil lagoon. The vibrant pink of their plumage contrasts beautifully with the verdant green of the surrounding vegetation and the crystal-clear turquoise water. Sunlight glints off the water's surface, creating shimmering reflections that dance on the flamingos' feathers. The birds' elegant, curved necks are submerged as they walk through the shallow water, their movements creating gentle ripples that spread across the lagoon. The composition emphasizes the serenity and natural beauty of the scene, highlighting the delicate balance of the ecosystem and the inherent grace of these magnificent birds. The soft, diffused light of early morning bathes the entire scene in a warm, ethereal glow.
Video Description: A relaxing shot of flamingos gracefully walking through clear water, surrounded by lush greenery and glowing in the soft morning light.
Rotating cube
Prompt: A perfect cube rotates in the center of a soft, foggy void. The surface shifts between different hyper-real textures—smooth marble, velvety suede, hammered brass, and raw concrete. Each material reveals subtle details: marble veins slowly spreading, suede fibers brushing with wind, brass tarnishing in slow motion, and concrete crumbling to reveal polished stone inside. Ends with a soft glow surrounding the cube as it transitions to a smooth mirrored surface, reflecting infinity.
Video Description: A cool, abstract animation of a cube that changes its surface to look like marble, suede, and other textures, set in a foggy atmosphere.
Dog on a pool float
Veo 2 vs. Sora vs. Other Competition
Veo 2 is one of the best video generation tools available, based on how people rated its performance in tests comparing it to others.
In the comparisons, all videos were shown at 720p resolution to keep things fair. The video length varied:
- Veo 2 videos were 8 seconds long.
- VideoGen videos were slightly longer, at 10 seconds.
- Other models’ videos were shorter, at just 5 seconds.
The people rating the videos were shown the full length of each video to give their feedback.
To test Veo 2's quality, participants watched videos generated from the 1,003 prompts in MovieGenBench, a benchmark dataset developed by Meta. Here are the results:

The bar charts show how Veo 2 compares to other AI video tools—Meta Movie Gen, Kling v1.5, Minimax, and Sora Turbo—in two areas: overall preference and prompt adherence.
First off, we need to take these results with a grain of salt because they come from Google itself. With that caveat, raters preferred Veo 2’s videos most often, especially when compared to Sora Turbo (58.8%) and Minimax (54.5%).
Veo 2 was also the best at following instructions accurately, scoring highest against Minimax (55.7%) and Sora Turbo (58.2%). In the charts, the green bars show where Veo 2 performed best, the pink bars show where other tools were preferred, and the white sections show ties.
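
For a rough sense of how much sampling noise sits behind these percentages, the sketch below computes a normal-approximation 95% interval for the two overall-preference figures. It assumes one independent rating per prompt across the 1,003 prompts, which Google's write-up does not state, so treat the result as a back-of-the-envelope check rather than a reproduction of Google's analysis. The function name is made up for this example.

```python
import math


def preference_interval(share: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) interval for a proportion."""
    standard_error = math.sqrt(share * (1 - share) / n)
    return share - z * standard_error, share + z * standard_error


# 1,003 prompts comes from the article. Treating each prompt as a single
# independent rating is an ASSUMPTION made only for this illustration.
N_PROMPTS = 1003

# Overall-preference shares reported for Veo 2 against each rival.
reported = {"Sora Turbo": 0.588, "Minimax": 0.545}

intervals = {}
for rival, share in reported.items():
    low, high = preference_interval(share, N_PROMPTS)
    intervals[rival] = (low, high)
    print(f"vs {rival}: {share:.1%} preferred, roughly {low:.1%} to {high:.1%}")
```

Under that assumption, sampling noise alone accounts for about three percentage points either way. The calculation says nothing about how the prompts or raters were chosen, which is where the grain of salt applies.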
However, Veo 2 isn’t perfect. It has made big improvements in creating realistic and detailed videos, but like other AI video tools, it still has difficulty keeping things consistent in very complex scenes or in videos with a lot of fast or detailed motion.
SynthID Watermark: Responsible AI Video Generation
Google has focused on making Veo 2 safe and responsible to use. To help with this, every video it creates includes an invisible SynthID watermark.
The watermark is embedded directly into the pixels of the video frames and is designed to remain detectable even if the video is edited (cropped, filtered, compressed, or reordered).
We can’t see the watermark, so the quality of the video stays the same, but tools can detect it.
The SynthID watermark ensures that the content can be identified as AI-generated. This helps to prevent misuse, misinformation, or confusion about who created the video.
You can learn more about SynthID on Google DeepMind's website.
However, Google hasn’t shared where Veo 2’s training data comes from. Many believe YouTube, which Google owns, might be a source.
Conclusion
As Google continues to develop and expand access to Veo 2, it will be interesting to observe how it shapes the landscape of video creation. Its ability to produce high-quality videos from detailed prompts could democratize video production, but concerns about misuse and misinformation remain. I hope that Google maintains its focus on responsible AI practices as Veo 2's capabilities grow.