Google DeepMind's Genie: Creating Super Mario-Style Games from Images

DeepMind built its reputation in artificial intelligence by using video games to test and refine its algorithms. Fourteen years on, and following its acquisition by Google, gaming continues to play a pivotal role in the company's research. Enter Genie, a groundbreaking model that lets users transform images into playable video game scenes.

Genie, short for Generative Interactive Environments, has been trained on extensive collections of internet videos, enabling it to craft interactive gameplay environments from a variety of input sources, including images, videos, and even user sketches that it has never encountered before. Imagine uploading a photograph of a clay sculpture; Genie can generate a vibrant 2D representation reminiscent of classic platformers like Super Mario Bros., all from a single image.

While the concept may at first seem like a mere novelty, DeepMind argues that Genie has significant implications for the development of generalist agents, AI systems capable of performing a diverse range of tasks. Genie offers a general method for learning latent actions from video that can transfer to human-designed environments without requiring additional domain expertise.
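The "latent actions from video" idea described above can be illustrated with a toy sketch. Everything in this snippet is invented for illustration (the codebook size, the projection, and the dynamics step are not DeepMind's implementation); the point is only that a discrete action can be inferred from a pair of consecutive frames, VQ-style, and then reused to drive a predictive model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical codebook of 8 discrete latent actions; the size and the
# encoding below are assumptions made for this sketch, not Genie's design.
CODEBOOK = rng.normal(size=(8, 4))

def encode_latent_action(frame_t, frame_t1):
    """Infer a discrete latent action from two consecutive frames by
    projecting the frame difference and snapping it to the nearest
    codebook entry (a crude VQ-style quantization)."""
    diff = (frame_t1 - frame_t).reshape(-1)
    proj = diff[:4]  # crude projection down to the codebook dimension
    dists = np.linalg.norm(CODEBOOK - proj, axis=1)
    return int(np.argmin(dists))

def predict_next_frame(frame_t, action_id):
    """Toy dynamics model: nudge the current frame by the decoded
    action vector to produce the next frame."""
    delta = np.zeros(frame_t.size)
    delta[:4] = CODEBOOK[action_id]
    return frame_t + delta.reshape(frame_t.shape)
```

In a trained system, both the action encoder and the dynamics model would be learned jointly from video alone, which is what lets the latent actions transfer to new environments without action labels.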

DeepMind explored Genie's capabilities by training it on videos with no action labels at all. Remarkably, the model learned to infer movements and adapt to new environments on its own, without the need for detailed instructions.

The team behind Genie emphasizes that this initiative is only a starting point. Trained on some 200,000 hours of internet videos of 2D platformer games like Super Mario, along with robotics data from RT-1, Genie learned fine-grained controls and recognized diverse actions across the environments it generates. This resembles human observational learning: show Genie an image of a character poised near a ledge, and it can infer that the character would jump, crafting an engaging scene based on that action.

Genie operates with 11 billion parameters, establishing itself as a “foundation world model.” This classification is significant as it indicates a system that learns and understands the intrinsic mechanics of the world. For further insights, interested readers might explore comments from experts like Yann LeCun, Meta's Chief AI Scientist, on the definition of a world model.

Tim Rocktäschel, a research scientist at DeepMind involved in the Genie project, acknowledged the capabilities of Sora, OpenAI's latest video generation model, noting its impressive visual output. However, he reiterated LeCun's point that a world model fundamentally requires the integration of actions to be truly effective.

As of now, there has been no official announcement regarding the availability of the Genie model for public access or its potential integration into future Google offerings. However, a showcase page provides glimpses into the model's exemplary projects, allowing users to see Genie’s creative capabilities in action.
