The Ever-Evolving Landscape of AI: Google Unveils Veo 3 and Imagen 4
The pace of AI advancement continues unabated. In the wake of ChatGPT’s significant improvements in image generation, Google has shown off its own models for turning text prompts into video and images. At Google I/O 2025, the company introduced Veo 3 for video creation and Imagen 4 for image generation. Both are notable upgrades over their predecessors.
Veo 3 is the successor to Veo 2, which rolled out to premium Gemini subscribers just last month. According to Google, Veo 3 has a better grasp of real-world physics, a common weak point in AI-generated video, and of finer details such as lip-syncing. The result should be videos that look more realistic than before.
Audio is the other major upgrade. Videos generated by earlier versions of Veo had no sound, but the new model adds fitting audio, including ambient noise such as traffic and nature sounds, and even character dialogue.
To demonstrate these features, Google has released several sample videos, one of which is the Old Sailor. It shows a high-quality video generated from a text prompt, with a level of realism that avoids past problems such as awkwardly rendered hands.
However, the hallmarks of AI generation are still evident. The sailor looks generic, the sea is unremarkable, and the dialogue is somewhat clichéd. The video is essentially a blend of training data on sailors and the sea. Because Google has not shared the prompt, it is also unclear how closely the result matches what was asked for.
Access to Veo 3 comes at a steep price: it requires Google’s AI Ultra plan, which costs $250 per month. For subscribers on the cheaper AI Pro plan, Veo 2 is also getting upgrades, including improved control, consistency, and camera movement, as well as the ability to add and remove objects in video clips.
Next-Level Image Generation with Imagen 4
On the image side, Imagen 4 succeeds Imagen 3. The new model renders fine details such as fabric textures, water droplets, and animal fur with notable clarity, and it supports higher resolutions (up to 2K) and a range of aspect ratios. Google claims its output is top-tier in both photorealistic and abstract styles.

Google has also tackled one of the persistent problems in AI image generation: typography. According to the company, Imagen 4 is significantly better than its predecessors at producing coherent, accurate characters and words, avoiding the odd spellings and indecipherable glyphs that earlier models often produced.
Imagen 4 is available now in the Gemini app, and Google has not specified any usage limits. Presumably, non-subscribers will hit limits sooner, as they do with Imagen 3, where limits vary with demand for Google’s AI services.
Google’s sample images look polished and free of glaring errors, with the glossy finish typical of AI output. Imagen 4 is also reported to be faster than its predecessor, and Google says a version that runs ten times faster is planned.
Introducing Flow: A Tool for Seamless AI Filmmaking
Finally, Google is introducing another tool: Flow. This AI filmmaking software combines Google’s text, video, and image models to help users create successive scenes that keep characters and locations consistent. Flow is available to both AI Pro and AI Ultra subscribers, with higher usage limits and more advanced models for those on the AI Ultra plan.