Genie 3 accepts two primary types of input: text prompts for world generation and navigational controls for real-time interaction. Given a text prompt, it generates a dynamic world that you can navigate in real time. The system handles detailed descriptive prompts, such as "First-person view drone video. High speed flight into and along a narrow canyon in Iceland with a river at the bottom and moss on the rocks, golden hour, realworld". A text description can specify viewpoint, location, environmental conditions, time of day, and visual style.
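To make the parts of such a prompt concrete, here is a minimal Python sketch that assembles the canyon example from separate fields. Genie 3 takes free-form text, and nothing in the source describes a prompt schema, so the `WorldPrompt` class, its field names, and the `to_text` method are all hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass
class WorldPrompt:
    """Hypothetical container for the parts of a world-generation prompt."""
    viewpoint: str          # camera or agent perspective
    scene: str              # location and environmental conditions
    time_of_day: str = ""   # optional lighting cue
    style: str = ""         # optional visual style

    def to_text(self) -> str:
        # The viewpoint reads as its own sentence; the remaining parts
        # form a comma-separated description, skipping any empty field.
        details = ", ".join(
            part for part in (self.scene, self.time_of_day, self.style) if part
        )
        return f"{self.viewpoint.rstrip('.')}. {details}"


prompt = WorldPrompt(
    viewpoint="First-person view drone video",
    scene=(
        "High speed flight into and along a narrow canyon in Iceland "
        "with a river at the bottom and moss on the rocks"
    ),
    time_of_day="golden hour",
    style="realworld",
)
prompt_text = prompt.to_text()
print(prompt_text)
```

Keeping the fields separate makes it easy to vary one aspect, such as the time of day, while holding the rest of the description fixed.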
Beyond initial world generation, Genie 3 supports what the researchers call “promptable world events.” These events change the generated world, for example by altering weather conditions or introducing new objects and characters, and so extend the experience beyond navigation controls. In practice, users can issue additional text commands during interaction to modify the environment dynamically: changing sunny weather to rain, adding new characters to a scene, or introducing specific objects. This capability takes users from exploring a fixed world to reshaping it as they go.
The system also accepts standard navigational inputs that let users move through and explore the generated environments. These controls work alongside the text-based inputs: navigation drives real-time exploration at 24 frames per second (FPS), while promptable world events let users reshape the world through natural-language commands. This dual-input approach gives users both guided control over world generation and intuitive exploration mechanics, supporting use cases from creative experimentation to research applications.
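The interplay of the two input channels can be sketched as a per-frame loop. The Python below is an illustration under stated assumptions, not Genie 3's interface: `FrameInput`, `StubWorld`, and `run_session` are hypothetical names, the stub only records what it receives, and the choice to let a text event persist after it is issued is an assumption made for the example. The only figure taken from the text is the 24 FPS rate, which implies a budget of roughly 41.7 ms per frame.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

FPS = 24                      # frame rate stated in the text
FRAME_BUDGET_MS = 1000 / FPS  # roughly 41.7 ms available per frame


@dataclass
class FrameInput:
    """Hypothetical per-frame input: a navigation action plus an optional text event."""
    action: str                        # e.g. "forward", "left"
    world_event: Optional[str] = None  # e.g. "change the weather to rain"


@dataclass
class StubWorld:
    """Stand-in for the model; it records requests instead of rendering frames."""
    prompt: str
    active_events: List[str] = field(default_factory=list)
    log: List[Tuple[int, str, Tuple[str, ...]]] = field(default_factory=list)

    def step(self, frame_index: int, frame_input: FrameInput) -> None:
        # Assumption: a text event stays in effect once issued,
        # while a navigation action applies to the current frame only.
        if frame_input.world_event:
            self.active_events.append(frame_input.world_event)
        self.log.append(
            (frame_index, frame_input.action, tuple(self.active_events))
        )


def run_session(world: StubWorld, inputs: List[FrameInput]) -> float:
    """Feed one input per frame and return the session length in seconds."""
    for frame_index, frame_input in enumerate(inputs):
        world.step(frame_index, frame_input)
    return len(inputs) / FPS


world = StubWorld(prompt="A narrow canyon in Iceland at golden hour")
inputs = (
    [FrameInput("forward")] * 24                              # first second: fly forward
    + [FrameInput("forward", "change the weather to rain")]   # issue a world event
    + [FrameInput("left")] * 23                               # rest of the second: turn left
)
seconds = run_session(world, inputs)
print(f"{seconds} s simulated, {FRAME_BUDGET_MS:.1f} ms per frame")
```

In this sketch navigation arrives on every frame, while text events arrive only occasionally, which mirrors the division of roles described above.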