The Advancements in AI Video Generation: A Comparative Review
The landscape of AI-driven video creation is evolving at a remarkable pace. Google’s Veo 2 has recently been integrated into the Gemini application, available to those subscribed to a Google One AI Premium plan. Similar to platforms like OpenAI’s Sora, Runway, and Adobe’s Firefly, Veo 2 allows for the generation of polished videos from simple text descriptions.
With Veo 2 now accessible to subscribers, it’s an opportune moment to evaluate the performance of various AI video generators by examining their respective merits and drawbacks, as well as assessing the current status of AI-generated video content. Claims abound regarding the potential for these technologies to revolutionize filmmaking or inundate online spaces with subpar AI content, but can they be practically beneficial?
Microsoft appears to be a believer in this technology, having incorporated it into a recent advertisement. Yet not all scenes were created by AI: it was used in shots with quick edits and minimal action, where noticeable inaccuracies are less likely.
This article explores a comparison between Google Veo 2 and its competitors: Sora, Runway, and Firefly. While several other platforms exist, these four stand out as leading options, each requiring a subscription (with prices beginning at $20 per month) to unlock their features.
Bouncing Balls
Readers may recall the iconic 2005 Sony advertisement for the Bravia TV, in which over 100,000 colorful balls cascade down the hilly streets of San Francisco. Recreating it is a formidable challenge for AI. The prompt issued for this test was: “A multitude of vibrant, distinct balls bouncing down a sunny, steep street in San Francisco, captured in slow motion as the camera glides along, navigating around trees and parked vehicles.”
The attempt made by Google Veo 2 is commendable. While certain physics aspects appear unusual, the visuals generally seem authentic and could work as a brief segment, provided one doesn’t scrutinize too closely. The backdrop is well-rendered, and the initial instructions were adhered to quite accurately.
Sora, on the other hand, struggles to interpret the scene. Colored balls are present, but they move chaotically and defy gravity. The pace is acceptable, though the direction of travel is the reverse of what was requested, and the background is visually satisfactory.
Runway presents a vibe reminiscent of the original Sony clip but is marred by several issues. The balls are inconsistent in appearance, their movement diverges from the prompt, and what seems to be an alien inexplicably watches from a window in the upper right corner. The street itself, however, has a distinctive visual appeal.
Firefly ranks as the least effective here. Most balls appear motionless, and those that do move lack detail. The street isn’t objectionable, but it has a retro video game look, and the camera frustratingly moves up the street when the opposite was requested.
Recreating the Iconic “Jurassic Park” Scene
If AI is to replace human filmmakers, it will need to craft scenes as impactful as the “welcome to Jurassic Park” moment in Spielberg’s 1993 classic, in which Richard Attenborough’s John Hammond unveils the dinosaurs to his guests.
The input prompt included: “Atop a hill, two paleontologists gingerly traverse grassy terrain. As they proceed, the camera pulls back, unveiling an expansive clearing and a lake below, where dinosaurs stroll amidst the trees.”
The output from Google Veo 2 is of fair quality. The camera movement does not follow the prompt and the paleontologists do not move in the tentative way it describes, but the backdrop is visually appealing and the dinosaurs are passably rendered. The overall impression is somewhat generic, but it’s a solid effort nonetheless.
Sora’s output is rather erratic. The movements are stiff and do not follow the prompt, and the dinosaurs resemble amorphous creatures. All the requested elements are present, however, and the surrounding scenery is commendable.
Runway, in comparison, most effectively captures the requested camera movement and the overall ambiance of the scene. The lake and the dinosaurs look notably realistic, although one of the paleontologists inexplicably disappears.
Firefly’s submission leaves much to be desired: it lacks a clear grasp of what a paleontologist is, and the dinosaurs appear far too small. The lake and surrounding greenery are of a satisfactory standard, albeit with a noticeable AI sheen across the entire frame. The camera movement, however, follows the prompt faithfully.
Recreating the “Living Daylights” Scene
Finally, let’s evaluate the memorable scene from The Living Daylights, where two characters navigate a snowy incline on a cello case. This is another chance for AI to showcase its capabilities.
The prompt for this visual: “Two figures dressed for winter slide down a snow-blanketed road on a cello case, encountering a barrier that they duck beneath.”
The output from Google Veo 2 is surprisingly effective; the scene is mostly lifelike, and the cello case appears recognizable. Although the characters pass through the road barrier as if it’s nonexistent, at least the presence of the barrier is acknowledged—something that eluded other AI models.
Sora’s results are decent, though not especially accurate. The snowy road and trees are depicted well, but the cello case does not match the prompt, and the characters should logically be facing forward. Where’s my barrier, Sora? I want to see their evasive maneuvering!
Runway’s training data seems not to have covered this peculiar scenario. The characters merge together, producing odd appearances, and the elements of the scene are visibly unstable. Despite this, the snowy environment, complete with animated falling snow, shows some degree of realism.
Adobe Firefly presents a baffling interpretation, with nonsensical physics, inconsistent characters, and a conspicuous absence of the road barrier the characters are meant to duck beneath. The snow and the cello case are present, but the result is rather disconcerting.
No Definitive Champion
Overall, Veo 2 produces the most impressive outputs, but Runway often excels in realism. However, every platform struggles significantly with physics, authenticity, and accurate interpretation of prompts. These videos bear the unmistakable mark of AI generation, with various oddities and inconsistencies.
Expecting these AI generators to rival the quality of professional advertisements or films is unrealistic, given that they work only from a text prompt and a few moments of processing time. The intention isn’t to disparage these tools, which are undeniably sophisticated, but to highlight the inherent limitations of AI-generated video content.

Through dedicated work and skill, it’s plausible to produce visually appealing results, and these video generation tools will unquestionably continue to evolve. Envisioning their capabilities in the next five to ten years is thought-provoking. Observing highlighted showcases from these platforms reveals the potential for remarkable outcomes.
Nonetheless, there is reason to doubt that these AI technologies could completely supplant traditional filmmaking techniques, irrespective of advancements in training. Achieving results comparable to the Sony advertisement would require copious, intricate prompts, and even then the desired outcome may remain elusive. Would AI come up with creative details such as a frog leaping from a drain? While the speed and ease of content generation are appealing, they hand most creative decisions to the AI system, resulting in a noticeably computerized finish.

AI lacks an innate understanding of fundamentals such as how balls bounce, what dinosaurs look like, or which way people should face when sliding down a snowy road on a cello case. These systems generate estimates based on prior video examples, and the resulting flaws are more pronounced in video than in images or text. Many AI videos, including those above, struggle with elements that pop in and out of view, because the model tends to lose track of things once they are no longer visible.
This discussion hasn’t even scratched the surface of the copyright implications or the environmental costs involved. A rise in AI-generated advertisements and short films seems inevitable as the technology advances. However, it’s crucial to remember the famous cautionary sentiment from Jurassic Park: while we focus on what we can achieve, we often neglect to consider whether we should.
Disclaimer: The parent company of DailyHackly, Ziff Davis, has initiated a lawsuit against OpenAI, alleging copyright infringement related to the training and operation of its AI systems.