Imagine computer programs that can generate virtual worlds on the fly. That is what world models, a fast-growing branch of machine learning, do, and they have been drawing a lot of attention lately. Joining the trend, Google DeepMind unveiled Genie 2 on Wednesday. Unlike its predecessor, which was limited to flat 2D environments, Genie 2 generates 3D worlds and keeps them consistent for much longer.
The Magic of Genie 2
Don't confuse Genie 2 with typical game-making software. It is a diffusion model that generates images frame by frame as you, or an AI agent, move through the virtual space it creates. As it produces each frame, Genie 2 is not just drawing pictures; it infers how the world should behave, which lets it render flowing water, billowing smoke, and plausible physics, though some of these effects can look a bit video-gamey. It also handles multiple viewpoints: first-person, third-person, and overhead. All it needs to get started is a single image prompt, either generated by Google's Imagen 3 or taken from a real photograph.
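The frame-by-frame loop described above can be pictured with a toy sketch. This is not Genie 2's code or API: `toy_step` is a hypothetical stand-in for the learned model (it simply shifts the previous frame in the direction of the action), and `rollout` only illustrates the general idea of starting from one prompt image and producing each new frame from the frames so far plus the latest action.

```python
def toy_step(history, action):
    """Hypothetical stand-in for a learned world model.

    A real model would predict the next frame from the history and the
    action; this toy version just shifts the last frame by (dx, dy),
    wrapping around at the edges.
    """
    last = history[-1]
    height, width = len(last), len(last[0])
    dx, dy = action
    return [
        [last[(r - dy) % height][(c - dx) % width] for c in range(width)]
        for r in range(height)
    ]


def rollout(prompt_frame, actions, step=toy_step):
    """Generate one new frame per action, starting from a single prompt frame."""
    frames = [prompt_frame]
    for action in actions:
        # Each new frame depends on everything generated so far plus the action.
        frames.append(step(frames, action))
    return frames


# A 4x4 "image" with one lit pixel in the top-left corner.
prompt = [[0] * 4 for _ in range(4)]
prompt[0][0] = 1

# Move right, then down, then left.
frames = rollout(prompt, [(1, 0), (0, 1), (-1, 0)])
```

The point of the sketch is the shape of the loop: a single image starts the world, and every later frame is produced on demand in response to input rather than read from a pre-built level.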
Genie 2's Memory and Edge Over Others
What sets Genie 2 apart is its memory. It retains details of the virtual scene after they leave the player's view and renders them accurately once they come back into sight. That is a step up from other models such as Oasis, which, during its public demo last October, struggled to keep track of the Minecraft levels it was generating in real time.