Odyssey, a startup founded by self-driving pioneers Oliver Cameron and Jeff Hawke, has developed an AI model that lets users “interact” with streaming video.
Available on the web in an “early demo,” the model generates and streams video frames every 40 milliseconds. Via basic controls, viewers can explore areas within a video, similar to a 3D-rendered video game.
“Given the current state of the world, an incoming action, and a history of states and actions, the model attempts to predict the next state of the world,” explains Odyssey in a blog post. “Powering this is a new world model, demonstrating capabilities like generating pixels that feel realistic, maintaining spatial consistency, learning actions from video, and outputting coherent video streams for five minutes or more.”
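The prediction loop Odyssey describes (current state, incoming action, and a history of states and actions in; next state out) can be sketched in a few lines of Python. This is only an illustrative toy, not Odyssey's implementation: `State`, `MOVES`, `predict_next_state`, and `rollout` are hypothetical names, the "world" is a 2D grid position rather than video frames, and the stand-in predictor is a fixed rule rather than a learned model.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple


@dataclass(frozen=True)
class State:
    """Toy world state: a position on a 2D grid (a real model would emit a video frame)."""
    x: int
    y: int


# Hypothetical action set, loosely mirroring basic movement controls.
MOVES = {"forward": (0, 1), "back": (0, -1), "left": (-1, 0), "right": (1, 0)}

History = Deque[Tuple[State, str]]


def predict_next_state(state: State, action: str, history: History) -> State:
    """Stand-in for a learned world model.

    A real model would condition on `history`; this toy rule ignores it
    and simply applies the movement for `action`.
    """
    dx, dy = MOVES[action]
    return State(state.x + dx, state.y + dy)


def rollout(initial: State, actions: List[str], history_len: int = 8) -> List[State]:
    """Autoregressive loop: each predicted state becomes the input to the next step."""
    history: History = deque(maxlen=history_len)  # bounded window of past (state, action) pairs
    state = initial
    states = [state]
    for action in actions:
        next_state = predict_next_state(state, action, history)
        history.append((state, action))
        state = next_state
        states.append(state)
    return states


if __name__ == "__main__":
    for s in rollout(State(0, 0), ["forward", "forward", "right"]):
        print(s)
```

Because every step feeds on the previous prediction, small errors can compound over a long rollout. That is one plausible reason keeping a stream coherent for minutes is hard.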
A number of startups and Big Tech companies are chasing after world models, including DeepMind, influential AI researcher Fei-Fei Li’s World Labs, Microsoft, and Decart. They believe that world models could one day be used to create interactive media, such as games and movies, and run realistic simulations like training environments for robots.
But creatives have mixed feelings about the tech. A recent Wired investigation found that game studios like Activision Blizzard, which has laid off scores of workers, are using AI to cut corners and combat attrition. And a 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, estimated that over 100,000 U.S.-based film, television, and animation jobs could be disrupted by AI in the coming months.
For its part, Odyssey is pledging to collaborate with creative professionals, not replace them.
“Interactive video … opens the door to entirely new forms of entertainment, where stories can be generated and explored on demand, free from the constraints and costs of traditional production,” writes the company in its blog post. “Over time, we believe everything that is video today — entertainment, ads, education, training, travel, and more — will evolve into interactive video, all powered by Odyssey.”
Odyssey’s demo is a bit rough around the edges, which the company acknowledges in its post. The environments the model generates are blurry, distorted, and unstable in the sense that their layouts don’t always remain the same. Walk forward in one direction for a while or turn around, and the surroundings might suddenly look completely different.
But the company is promising to rapidly improve upon the model, which can currently stream video at up to 30 frames per second from clusters of Nvidia H100 GPUs at a cost of $1 to $2 per “user-hour.”
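To put those figures in perspective, the arithmetic below converts between frame rate and frame interval and derives a rough per-frame cost. It is a back-of-the-envelope sketch, not Odyssey's accounting: the function names are hypothetical, and it assumes a user streams at one constant frame rate for a full hour, which the company does not claim.

```python
def frame_interval_ms(fps: float) -> float:
    """Milliseconds between frames at a given frame rate."""
    return 1000.0 / fps


def frames_per_hour(fps: float) -> float:
    """Frames streamed in one hour at a constant frame rate."""
    return fps * 3600


def cost_per_frame(cost_per_user_hour: float, fps: float) -> float:
    """Rough dollar cost of one frame, assuming constant-rate streaming for the whole hour."""
    return cost_per_user_hour / frames_per_hour(fps)


if __name__ == "__main__":
    # One frame every 40 ms corresponds to 25 fps, within the "up to 30" ceiling.
    print(f"40 ms interval -> {1000.0 / 40:.0f} fps")
    print(f"30 fps -> {frame_interval_ms(30):.1f} ms per frame")
    print(f"30 fps -> {frames_per_hour(30):,.0f} frames per hour")
    for hourly in (1.0, 2.0):
        per_thousand = cost_per_frame(hourly, 30) * 1000
        print(f"${hourly:.0f}/user-hour -> ${per_thousand:.4f} per 1,000 frames")
```

Under these assumptions, an hour at 30 frames per second is 108,000 frames, so $1 to $2 per user-hour works out to roughly one to two cents per thousand frames.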
“Looking ahead, we’re researching richer world representations that capture dynamics far more faithfully, while increasing temporal stability and persistent state,” writes Odyssey in its post. “In parallel, we’re expanding the action space from motion to world interaction, learning open actions from large-scale video.”
Odyssey is taking a different approach from many AI labs in the world-modeling space. It designed a 360-degree, backpack-mounted camera system to capture real-world landscapes, which Odyssey thinks can serve as a basis for higher-quality models than those trained solely on publicly available data.
To date, Odyssey has raised $27 million from investors, including EQT Ventures, GV, and Air Street Capital. Ed Catmull, one of the co-founders of Pixar and former president of Walt Disney Animation Studios, is on the startup’s board of directors.
Last December, Odyssey said it was working on software that allows creators to load scenes generated by its models into tools such as Unreal Engine, Blender, and Adobe After Effects so that they can be hand-edited.