AI video is evolving from passive generation to active engagement. Synthesia's new products focus on the intersection of video and AI agents: a user might, for example, watch a training video and then enter a role-playing simulation with an AI to test their comprehension.
Learners demand hands-on experience. The next evolution of training involves AI agents that act as sidekicks, not only explaining concepts but also taking over the user's screen to demonstrate precisely how to perform a task, which dramatically accelerates skill acquisition and reduces friction.
Don't view generative AI video as just a way to make traditional films more efficiently. Ben Horowitz sees it as a fundamentally new creative medium, much as movies were relative to theater. It enables entirely new forms of storytelling by making visuals that once required massive budgets accessible to anyone.
The 'uncanny valley' is the zone where near-realistic digital humans feel unsettling. The founder believes that once AI video avatars become indistinguishable from reality, they will break through this barrier. This shift will transform them from utilitarian tools into engaging content, expanding the total addressable market by orders of magnitude.
People increasingly consume real-life events as passive entertainment. AI can economically enable mass-market interactive media where user choices create different outcomes. This could help teach that the future is contingent on our collective decisions, not a pre-written script to be watched.
While today's focus is on text-based LLMs, the true, defensible AI battleground will be in complex modalities like video. Generating video requires multiple interacting models and unique architectures, creating far greater potential for differentiation and a wider competitive moat than text-based interfaces, which will become commoditized.
The future of media is not just recommended content, but content rendered on the fly for each user. AI will analyze micro-behaviors like eye movement and swipe speed to generate the most engaging possible video in that exact moment. The algorithm will become the content itself.
While consumer AI video grabs headlines, Synthesia found a massive market by focusing on enterprise knowledge. Their talking-head avatars replace slide decks and text documents for corporate training, where utility trumps novelty and the competition is text, not high-production video.
The primary interface for AI is shifting from a prompt box to a proactive system. Future applications will observe user behavior, anticipate needs, and suggest actions for approval, mirroring the initiative of a high-agency employee rather than waiting for commands.
The OpenAI team believes generative video won't just create traditional feature films more easily. It will give rise to entirely new mediums and creator classes, much like the film camera created cinema, a medium distinct from the recorded stage plays it was first used for.
Sam Altman suggests AI will create a new form of entertainment on the spectrum between passive movies and intense games. Experiences will be more interactive than a film but less demanding than a typical video game, allowing users to lean back while also having moments of creative input.