Veo – Why YouTube’s video creator expertise is its trump card in the battle with OpenAI’s Sora
Photo: Souvik Banerjee
When OpenAI revealed its Sora project last February (2024), the video industry began a transformational journey. It was clear even from the early clips that OpenAI’s AI text-to-video tool would recalibrate how videos are made. Such was the concern among traditional filmmakers that the actor-director Tyler Perry paused an $800m expansion of his film studio in Atlanta, US. In an interview with The Hollywood Reporter, Perry said “being told [Sora] can do all these things is one thing, but actually seeing the capabilities, it was mind-blowing”.
What we don’t know yet is how historians will come to view this moment. Will this be OpenAI’s launch pad towards a dominant position in video production? Or, will the pre-existing video industry deliver a revolutionary response? These questions are worth posing because the first mover isn’t always the last man standing. Those who remain once disruption has run its course tend to be the ones who have all the tools at their disposal to make the most of the opportunity.
Lowering the barrier to entry
These thoughts came to mind as Google used its developer conference on May 14 to reveal Veo, an AI text-to-video program designed to challenge Sora. This response was inevitable for two reasons. Firstly, Sora is aiming to achieve what YouTube has been doing for a long time now: it will lower the barrier to entry for content creation. Anyone with an idea can create using Sora. They just need to type into the text box and start tweaking the results. Veo promises to offer the same kind of service, with minute-long video clips in 1080p resolution that can capture an array of cinematic styles. The result should be more people creating content for YouTube. Theoretically, this could lead to more people watching and more advertising revenues. However, MIDiA has reservations about this pattern given the propensity for generative AI to flood social video platforms with content that a finite audience will struggle to consume.
Secondly, Google and YouTube have taken this step because they understand that creator tools are key to securing sustained engagement. A growing number of social video platforms are bringing creator tools in-house for this very reason. TikTok’s CapCut and Twitch’s short-form video editing tools are good examples. The thinking here is that making creation easy and fun turns it into a habit, so more users will keep doing it. After all, creating content is a form of entertainment. It must be free-flowing and enjoyable if users are to stick at it and not revert to passively consuming.
So, what is it about Veo that could make it more powerful than Sora? Reports suggest Google DeepMind developers understand the platform must speak the creator’s language. This is imperative for ensuring an AI text-to-video generator can quickly transform a user’s idea into a usable output. When the Toronto-based film producers Shy Kids made the surrealist short film Air Head in Sora, they found it hard to control the end result because the tool didn’t always understand traditional filmmaking terminology. Google claims Veo already understands key cinematic terms such as “timelapse” or “aerial shots of a landscape”. Of course, this is just the first step. YouTube can transform the years it has spent addressing creator pain points into relevant features in Veo. This will sharpen the quality of the output beyond the AI training data. Without this inherent knowledge within the business, OpenAI will have to rely on the model’s natural progression or partnerships with third parties, such as its work with Adobe Premiere Pro, to fully understand the needs of creators.
Data and distribution power
The other key advantage is data. The vast troves of information Google can bring to bear from creators on YouTube could prove crucial for giving Veo a competitive advantage. On July 1, 2023, Google updated its privacy policy to give itself permission to train its AI products on user data. While it may be too early to say how much this will help YouTube, it is fair to assume that being able to use its own data to inform Veo will be an advantage over OpenAI’s reliance on third-party data. By controlling the data, Google has more oversight over potential copyright issues. Data on how users create using Veo is also likely to feed into the wider pool of information YouTube collects on ratings, comments, and search histories, which is then used to improve its recommendation systems.
However, the biggest advantage for Google, YouTube, and Veo lies in distribution. In the absence of its own distribution platform, Sora must rely upon creators distributing content onto third-party platforms when it comes to social video creation. YouTube can not only provide frictionless publishing of content to its long-form and short-form services, but it can also help creators make distribution decisions based on the content they have created. This is particularly pertinent given YouTube’s ambitions to dominate smart TV viewing in the home. In time, Veo may help creators achieve cinema-grade visual effects that could narrow the quality gap with TV streaming services. With its control over the algorithm, YouTube may even be able to guarantee preferential discovery if users create through Veo as opposed to third-party apps like Sora. This is potentially even more important as a selling point for creators. Effective distribution and content discovery are key for creators keen to reach and grow an audience. YouTube’s experience and reputation in distributing user-generated content are therefore going to give Google’s Veo a compelling edge over OpenAI’s Sora (and other alternatives) for the creator community.
The power of YouTube lies not only in its deep knowledge of creator behaviours, but also in its control over the whole content creation ecosystem. OpenAI will need to build third-party partnerships – and fast – to ensure it remains relevant, effective, and popular with creators. Like all juggernauts, YouTube may have been slow to get moving with a credible AI text-to-video service. But now that it is moving, it will be incredibly difficult to stop.