While photorealism is a common goal, the first fully AI-generated films will likely be animated or fantasy. This is because traditional filmmaking is already cheap and effective at capturing reality. AI's true economic and creative advantage lies in generating complex, non-photorealistic visuals that are currently expensive to produce.
Don't view generative AI video as just a way to make traditional films more efficiently. Ben Horowitz sees it as a fundamentally new creative medium, much like movies were to theater. It enables entirely new forms of storytelling by making visuals that once required massive budgets accessible to anyone.
High-quality AI-generated animation is more impressive than photorealism because animation training data is extremely scarce (thousands of hours, versus millions of hours for video). Sora 2's success suggests a fundamental improvement in its learning efficiency, not just a brute-force data advantage.
Creating rich, interactive 3D worlds is currently so expensive that it is reserved for AAA games with mass appeal. Generative spatial AI dramatically reduces this cost, paving the way for hyper-personalized 3D media for niche applications, such as education or training, that were previously economically unviable.
While generative video gets the hype, producer Tim McLear finds AI's most practical use is automating tedious post-production tasks like data management and metadata logging. This frees up researchers and editors to focus on higher-value creative work, like finding more archival material, rather than being bogged down by manual data entry.
The computational requirements for generative media scale dramatically across modalities. If a 200-token LLM prompt costs 1 unit of compute, a single image costs 100x that, and a 5-second video costs another 100x on top of that, a 10,000x total increase over the text prompt. 4K video adds another 10x multiplier, for roughly 100,000x in total.
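The multipliers above compound, which is easy to lose track of in prose. The following is a minimal Python sketch of that arithmetic, using only the ratios stated in the text; the names (`STEPS`, `relative_cost`) are hypothetical, and the numbers are illustrative relative units rather than measured compute costs.

```python
# Relative compute per modality, built from the ratios quoted in the text.
# These are illustrative units, not benchmarks of any real model.
STEPS = [
    ("text_prompt", 1),   # baseline: a 200-token LLM prompt
    ("image", 100),       # one image costs 100x the text prompt
    ("video_5s", 100),    # a 5-second video costs 100x one image
    ("video_4k", 10),     # 4K video costs 10x the 5-second video
]


def relative_cost(modality: str) -> int:
    """Return the cumulative cost of `modality` relative to the text prompt."""
    total = 1
    for name, multiplier in STEPS:
        total *= multiplier  # each step multiplies the cost of the previous one
        if name == modality:
            return total
    raise ValueError(f"unknown modality: {modality}")


if __name__ == "__main__":
    for name, _ in STEPS:
        print(f"{name}: {relative_cost(name):,}x")
```

Because each step is expressed relative to the previous one, changing a single ratio (for example, a cheaper image model) automatically propagates to every modality after it.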
While today's focus is on text-based LLMs, the true, defensible AI battleground will be in complex modalities like video. Generating video requires multiple interacting models and unique architectures, creating far greater potential for differentiation and a wider competitive moat than text-based interfaces, which will become commoditized.
Former DreamWorks CEO Jeffrey Katzenberg compares the current backlash against AI in creative fields to the initial revolt from traditional animators against computer graphics. He argues that, like computer animation, AI's adoption is an unstoppable technological shift that creators will either join or be left behind by.
ElevenLabs' CEO predicts AI won't enable a single prompt-to-movie process soon. Instead, it will create a collaborative "middle-to-middle" workflow, where AI assists with specific stages like drafting scripts or generating voice options, which humans then refine in an iterative loop.
An AI CEO predicts that within two years, AI tools will make content creation instantaneous and nearly free. This will destroy traditional moats like audience loyalty and production quality, as anyone can generate photorealistic content. The market will shift focus from the creator to the individual content piece.
When analyzing video, new generative models can create entirely new images that illustrate a described scene, rather than just pulling a direct screenshot. This allows AI to generate its own 'B-roll' or conceptual art that captures the essence of the source material.