Google DeepMind Unveils Genie 2 AI Model
In a blog post, the company detailed the new AI model and its capabilities. While its predecessor could only generate game worlds for 2D platformer games, the Genie 2 AI model can generate 3D worlds complete with consistent models that can be interacted with. This means humans or AI agents can walk, run, swim, climb, and perform other actions in these environments.
Genie 2’s generative capabilities allow it to generate routes, buildings, and objects that cannot be seen in the input image. These elements are designed and rendered by the model from scratch. Additionally, the foundation model is capable of maintaining consistency in these environments. This means that even when a player moves away from one area and later returns, the environment remains the same.
Apart from this, Genie 2 is capable of generating different perspectives such as first-person, isometric, or third-person views. Further, users can also interact with the objects in the generated worlds and perform actions such as opening a door, bursting a balloon, or climbing a ladder. The model can also be prompted to generate physics-related effects such as water ripples, smoke, gravity, directional lighting, reflections, and more.
Coming to the technical details, DeepMind explained that Genie 2 is an autoregressive latent diffusion model that has been trained on a large video dataset. The transformer-based architecture also includes an autoencoder, which enables frame-by-frame generation of these worlds.
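To make the idea of frame-by-frame generation more concrete, here is a minimal toy sketch in Python of an autoregressive rollout in a latent space. It is not DeepMind's implementation: the function names (`encode`, `decode`, `denoise`, `rollout`), the pair-averaging "autoencoder", and the hand-written denoising rule are all assumptions made for illustration, standing in for the learned networks a real system would use.

```python
import random

def encode(frame):
    """Toy encoder (assumption): compress a frame by averaging neighbouring pairs."""
    return [(frame[i] + frame[i + 1]) / 2 for i in range(0, len(frame), 2)]

def decode(latent):
    """Toy decoder (assumption): repeat each latent value to restore the frame length."""
    return [value for value in latent for _ in range(2)]

def denoise(noisy, context, action, steps=8):
    """Stand-in for a learned diffusion denoiser (assumption).

    Starting from noise, move halfway toward a target on every step. The
    target is conditioned on the most recent latent and the current action.
    """
    target = [c + action for c in context[-1]]
    z = list(noisy)
    for _ in range(steps):
        z = [zi + 0.5 * (ti - zi) for zi, ti in zip(z, target)]
    return z

def rollout(first_frame, actions, seed=0):
    """Generate one new frame per action, feeding each result back as context."""
    rng = random.Random(seed)
    latents = [encode(first_frame)]
    frames = [first_frame]
    for action in actions:
        noise = [rng.gauss(0, 1) for _ in latents[-1]]
        z = denoise(noise, latents, action)  # sample the next latent frame
        latents.append(z)                    # autoregressive: output becomes context
        frames.append(decode(z))
    return frames

frames = rollout([0.0, 0.0, 1.0, 1.0], actions=[1, 1, -1])
```

The point of the sketch is the loop structure: each new latent frame is sampled from noise, conditioned on what came before and on the action taken, and is then decoded back into a frame and appended to the history used for the next step.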
Notably, DeepMind also released an AI model dubbed Scalable Instructable Multiworld Agent, or SIMA, earlier this year, which is essentially capable of acting as an AI agent in 3D worlds. The company says Genie 2 is capable of providing unique environments to similar AI agents and training them for various real-life scenarios.
Since the world model can generate unique environments, Google says it will remove the risk of data contamination and allow developers to accurately assess an AI agent’s capabilities.