AI 3D scene creation tools
Directory of AI tools that generate 3D scenes, worlds and assets — Marble, Genie 3, Meshy, Tripo, Skybox AI, Polycam, Omniverse and more — with sourced capability notes.
3D World & Scene Generation
Marble
A multimodal world model that generates explorable, persistent 3D worlds from text, images, video, or coarse 3D layouts, with tools to edit, expand, and combine generated worlds.
- Marble became generally available on November 12, 2025, accepting text, single or multiple images, video, and coarse 3D layouts (via the Chisel tool) as input. [worldlabs.ai]
- Generated worlds can be exported as Gaussian splats, triangle meshes (including collider meshes), or videos with camera control. [worldlabs.ai]
- World Labs announced the World API on January 21, 2026, exposing Marble's world generation programmatically via platform.worldlabs.ai, with early integrations including NVIDIA Isaac Sim and MuJoCo. [worldlabs.ai]
HunyuanWorld / HY-World
An open-source 3D world generation framework that turns text or images into immersive, explorable panoramic worlds with exportable, semantically layered 3D meshes.
- HunyuanWorld-1.0, released July 26, 2025, uses a semantically layered 3D mesh representation with panoramic images as 360° world proxies, and outputs meshes compatible with standard graphics pipelines. [github.com]
- Follow-up releases include HunyuanWorld-1.0-lite for consumer GPUs (August 2025), WorldMirror for video/multi-view input (October 2025), and HunyuanWorld-1.5 'WorldPlay' for real-time world creation (December 18, 2025). [github.com]
- HY-World 2.0 was announced on April 16, 2026, with model weights distributed via Hugging Face. [github.com]
CSM
An AI platform that converts images, video, and text into segmented, production-ready 3D assets for game engines, VR, and visual effects, with a stated roadmap toward generating full 3D worlds from 2D input.
- CSM converts 2D images and video into production-ready 3D assets with segmented components prepared for rigging and animation, for use in game engines, VR, and VFX. [ai.meta.com]
- CSM uses Meta's Segment Anything Model 2 (SAM 2) to identify and segment individual elements in images and video before translating them into 3D components. [ai.meta.com]
Real-Time Interactive World Models
Odyssey
An AI lab building general world models that stream long-form interactive video simulations from text or image prompts, explorable in real time from the browser.
- Odyssey-2, the company's flagship general-purpose world model, generates multi-minute interactive simulations, with streaming that starts in about 50 ms; it is accessible at experience.odyssey.ml. [odyssey.ml]
- Agora-1 is a multi-agent world model enabling simultaneous real-time interaction between human and AI participants in shared simulations. [odyssey.ml]
- Odyssey raised a $310M Series B; TechCrunch reported the round in June 2026 at a $1.45B valuation with Amazon among the backers. [techcrunch.com]
Oasis / Lucy
Two products from Decart, a real-time AI lab: Oasis, an API-accessible interactive world model aimed at physical AI (robotics, autonomous vehicles), and Lucy, a real-time video/world editing model.
- Decart describes Oasis 3 as an API-accessible world model for physical AI that generates interactive, physically accurate environments in real time, targeting robotics, autonomous vehicles, manufacturing, and drones. [decart.ai]
- Lucy 2.0 performs real-time, text-driven video transformation at production scale, and MirageLSD is presented as achieving infinite real-time video generation. [decart.ai]
- Developer access, documentation, and pricing are provided through Decart's API platform at platform.decart.ai. [decart.ai]
Genie 3
A general-purpose world model that generates navigable, dynamic environments from text prompts in real time; currently available only as a limited research preview.
- Genie 3 generates dynamic worlds at 720p and 24 frames per second, navigable in real time, and maintains consistency for several minutes with visual memory extending back about one minute. [deepmind.google]
- It supports promptable world events, letting users change weather or add objects and characters via text during a session. [deepmind.google]
- Availability is a limited research preview open to a small cohort of academics and creators. [deepmind.google]
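For a sense of scale, the frame rate and memory horizon quoted above imply a per-frame time budget and a number of frames the model has to stay consistent with. The following is back-of-the-envelope arithmetic only, treating "about one minute" as exactly 60 seconds (an assumption); it says nothing about how Genie 3 is implemented.

```python
# Figures taken from the Genie 3 notes above.
FPS = 24
MEMORY_SECONDS = 60  # assumption: "about one minute" read as exactly 60 s

# Time available to produce each frame if generation keeps pace with playback.
frame_budget_ms = 1000 / FPS

# Number of frames covered by the stated visual-memory horizon.
frames_in_memory = FPS * MEMORY_SECONDS

print(f"Per-frame budget: {frame_budget_ms:.1f} ms")
print(f"Frames spanned by the memory horizon: {frames_in_memory}")
```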
3D Asset Generation
Meshy
A web-based AI 3D generator that produces textured 3D models from text prompts or images, with AI texturing, auto-rigging, and animation for game and 3D-printing workflows.
- Meshy 6 is the current default text-to-3D model, generating meshes with up to roughly 600K faces plus PBR texture maps (albedo, normal, metallic, roughness). [meshy.ai]
- Exports FBX, GLB, OBJ, STL, 3MF, USDZ, and BLEND, with plugins for Blender, Unity, Unreal Engine, 3ds Max, Maya, Godot, and several 3D-printing slicers. [meshy.ai]
- Free plan includes 100 credits per month with CC BY 4.0 licensed outputs; the Pro plan is $20/month for 1,000 credits with API access and private assets. [meshy.ai]
- Includes auto-rigging and a library of 500+ game-ready character motions. [meshy.ai]
Tripo
A text-to-3D and image-to-3D platform with AI texturing, part segmentation, and auto-rigging, available as a web studio, API, and engine/DCC plugins.
- Tripo v3.0 offers Standard mode (optimized topology for real-time use) and Ultra mode with assets up to 2 million polygons, plus a sketch-to-3D pipeline. [tripo3d.ai]
- Provides 4K PBR-ready AI texturing, intelligent part segmentation, and automatic rigging/animation with export-ready files. [tripo3d.ai]
- Offers plugins for Blender, Unity, Unreal Engine, ComfyUI, Cocos, and Godot alongside the Tripo API. [tripo3d.ai]
- Free plan includes 200 credits/month; Pro is $19.90/month (or $13.93/month billed annually) with 3,000 monthly credits and batch generation. [tripo3d.ai]
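The Pro pricing quoted for Meshy and Tripo can be reduced to a price per credit. The sketch below uses only the figures listed in the two entries; the `Plan` class is a hypothetical helper written for this illustration. The entries do not state how many credits a single generation consumes on either platform, so the per-credit prices are not directly comparable across tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical helper: one subscription tier as listed in the directory."""
    name: str
    usd_per_month: float
    credits_per_month: int

    @property
    def usd_per_credit(self) -> float:
        return self.usd_per_month / self.credits_per_month


# Figures copied from the Meshy and Tripo entries above.
meshy_pro = Plan("Meshy Pro", 20.00, 1_000)
tripo_pro = Plan("Tripo Pro (monthly billing)", 19.90, 3_000)
tripo_pro_annual = Plan("Tripo Pro (annual billing)", 13.93, 3_000)

for plan in (meshy_pro, tripo_pro, tripo_pro_annual):
    print(f"{plan.name}: ${plan.usd_per_credit:.4f} per credit")

# Relative saving of Tripo's annual billing over its monthly billing.
annual_discount = 1 - tripo_pro_annual.usd_per_month / tripo_pro.usd_per_month
print(f"Tripo annual-billing discount: {annual_discount:.0%}")
```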
Rodin
An AI 3D model generator that creates production-oriented meshes from text prompts or images, with mask-based 3D editing (inpainting) of existing models.
- Rodin Gen 2 generates 3D models from text or images and supports 3D inpainting that uses a masked diffusion process to modify selected regions of a model. [hyper3d.ai]
- Outputs high-poly quad meshes positioned as production-ready, with export to STL, FBX, OBJ, GLB, and USDZ. [hyper3d.ai]
- Rodin image-to-3D generation is also offered as a hosted API through fal.ai. [fal.ai]
Sloyd
A game-asset generator combining parametric templates with AI, producing customizable, game-ready 3D models from text prompts, images, or slider-based template editing.
- Generates 3D models from text prompts, photos, sketches, and parametric templates with adjustable controls, under a subscription model with unlimited generation rather than per-credit consumption. [sloyd.ai]
- Offers topology control with adjustable polygon counts, one-click auto-rigging, AI retexturing, and manifold geometry suitable for 3D printing. [sloyd.ai]
- Available as a web app with plugins for Unity, Unreal, and Blender, and includes a Roblox avatar creator. [sloyd.ai]
Cube
Roblox's open-source generative AI system for 3D creation, generating meshes from text prompts inside Roblox Studio and via a Lua API, with a roadmap toward scene generation.
- Cube launched on March 17, 2025, with text-to-mesh generation in Roblox Studio and an in-experience Lua API (e.g. '/generate a motorcycle'). [about.roblox.com]
- The model is open-sourced on GitHub and Hugging Face, and developers can fine-tune it or train it on their own data. [about.roblox.com]
- Cube tokenizes 3D shapes the way language models tokenize text, predicting shape tokens autoregressively; Roblox's stated roadmap extends to scene generation and '4D creation'. [about.roblox.com]
TRELLIS
An open-source large 3D asset generation model using a unified Structured LATent (SLAT) representation, decoding to radiance fields, 3D Gaussians, or meshes from text or image prompts.
- TRELLIS outputs Radiance Fields, 3D Gaussians, and meshes from image or text prompts, with local 3D editing and variant generation. [github.com]
- Model lineup ranges from 342M to 2.0B parameters (image-large: 1.2B; text-xlarge: 2.0B), trained on the 500K-object TRELLIS-500K dataset. [github.com]
- The models and main codebase are MIT-licensed; the image model shipped December 2024 and training code plus text models followed on March 25, 2025. [github.com]
Skybox & 360° Environment Generation
Skybox AI
A text-to-360° environment generator that produces seamless equirectangular skybox panoramas for games, VR, and virtual production, with API/SDK and engine export paths.
- Generates equirectangular 360° panoramas at up to 16K resolution, with 32-bit HDRI export for scene lighting. [blockadelabs.com]
- Offers an API and SDK, a native Unity plugin, direct exports to Blender and Unreal, and MCP (Model Context Protocol) server support. [blockadelabs.com]
- Advertises generation times as fast as 15 seconds per skybox, with commercial licensing and creator IP ownership included. [blockadelabs.com]
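An equirectangular panorama stores longitude along the horizontal axis and latitude along the vertical axis, which is why a single 2:1 image can wrap a full sphere. The sketch below shows the standard mapping from a view direction to panorama pixel coordinates, plus a rough uncompressed size estimate. Several things here are assumptions, not Skybox AI specifics: the axis convention (+X right, +Y up, +Z forward), reading "16K" as 16384 x 8192, and reading "32-bit" as three 32-bit float channels per pixel.

```python
import math


def direction_to_equirect(x, y, z, width, height):
    """Map a 3D view direction to (u, v) pixel coordinates in an
    equirectangular image. Assumed convention: +X right, +Y up, +Z forward."""
    length = math.sqrt(x * x + y * y + z * z)
    if length == 0:
        raise ValueError("direction must be non-zero")
    x, y, z = x / length, y / length, z / length

    lon = math.atan2(x, z)                      # longitude in [-pi, pi]
    lat = math.asin(max(-1.0, min(1.0, y)))     # latitude in [-pi/2, pi/2]

    u = ((lon / (2 * math.pi) + 0.5) * width) % width  # wrap at the seam
    v = (0.5 - lat / math.pi) * height                 # top row is straight up
    return u, v


def raw_size_bytes(width, height, channels=3, bytes_per_channel=4):
    """Uncompressed size of an image with the given channel layout."""
    return width * height * channels * bytes_per_channel


WIDTH, HEIGHT = 16384, 8192  # assumption: "16K" with a 2:1 aspect ratio

print("forward ->", direction_to_equirect(0, 0, 1, WIDTH, HEIGHT))
print("right   ->", direction_to_equirect(1, 0, 0, WIDTH, HEIGHT))
print("up      ->", direction_to_equirect(0, 1, 0, WIDTH, HEIGHT))
print(f"raw size: {raw_size_bytes(WIDTH, HEIGHT) / 2**30:.2f} GiB")
```

Under these assumptions the forward direction lands in the centre of the image and the raw float data comes to about 1.5 GiB, which is one reason HDRI panoramas are normally stored in compressed formats.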
Capture & 3D Reconstruction
Luma Interactive Scenes
A capture platform that turns phone photos and video of real objects and places into photorealistic, embeddable interactive 3D scenes viewable on the web.
- Available on web, iOS, and Android, with scenes that stream immediately and run at 30 FPS in web browsers. [lumalabs.ai]
- Captures are usable commercially without additional licensing and remain private unless made public; a Luma API supports bulk generation. [lumalabs.ai]
- Luma publishes an open web library and three.js examples for embedding its captures in WebGL projects. [github.com]
Polycam
A cross-platform 3D capture app supporting photogrammetry, LiDAR scanning, Gaussian splatting, drone processing, and automatic floor plans, with broad export-format support.
- Capture modes include AI-assisted photogrammetry, LiDAR (on supported devices), Gaussian splatting, and drone footage processing, organized into spatial, object, floor plan, and aerial captures. [poly.cam]
- Runs on iOS, Android, web, and Apple Vision Pro. [poly.cam]
- Exports to formats and tools including FBX, glTF, AutoCAD, Maya, Unity, Blender, and Unreal Engine; a dedicated Gaussian splatting tool page is at poly.cam/tools/gaussian-splatting. [poly.cam]
RealityScan
Epic Games' photogrammetry software (formerly RealityCapture) that reconstructs 3D models from photos and laser scans, with desktop and mobile versions.
- Free for individuals, educators, and companies under $1 million USD annual gross revenue; paid seats apply above that threshold. [realityscan.com]
- Supports photogrammetry plus laser-scan input, SLAM data import, and classified point clouds; version 2.2 added full AMD GPU support (Radeon, RDNA 4, Ryzen AI Max). [realityscan.com]
- Ships as a Windows desktop app via the Epic Games Launcher plus a RealityScan Mobile app for iOS/Android. [realityscan.com]
KIRI Engine
A freemium 3D scanning app for phone and web that reconstructs objects and spaces via photogrammetry, LiDAR, NeRF-based featureless-object scanning, and 3D Gaussian splatting.
- Offers Photo Scan (photogrammetry), LiDAR Scan (on supported iOS devices), featureless-object scanning using NeRF, and 3D Gaussian splatting with mesh export. [kiriengine.app]
- Available on iOS, Android, and web browsers under a freemium model. [kiriengine.app]
- Supports quad mesh output with PBR materials, mesh decimation, and auto-rigging. [kiriengine.app]
Postshot
A Windows desktop application for training radiance-field/Gaussian-splatting scenes locally from photos or video, with live training preview and editing tools.
- Builds 3D scenes from regular RGB images using radiance field technology, with all processing done locally on the user's machine. [jawset.com]
- Requires Windows 10+ and an NVIDIA GPU with Compute Capability 7.5 or higher (e.g. GeForce RTX 2060 or better). [jawset.com]
- Features include live training preview, scene merging, image masking, AprilTag alignment, GUI and command-line interfaces, and Unreal Engine and After Effects integrations. [jawset.com]
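The GPU requirement above can be checked before installing. The sketch below compares a compute-capability string against the 7.5 minimum quoted in the entry. The function names are hypothetical, and `query_local_gpus` assumes that `nvidia-smi` is on the PATH and that the installed driver supports the `compute_cap` query field (older drivers may not); it is defined but not called here so the example runs on machines without an NVIDIA GPU.

```python
import subprocess

MIN_COMPUTE_CAPABILITY = (7, 5)  # minimum quoted in the Postshot entry


def parse_compute_capability(text):
    """Turn a string such as '8.9' into the tuple (8, 9)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def meets_requirement(text, minimum=MIN_COMPUTE_CAPABILITY):
    """Compare as integer tuples so that '10.0' correctly ranks above '7.5'."""
    return parse_compute_capability(text) >= minimum


def query_local_gpus():
    """Return (name, compute_cap) pairs reported by nvidia-smi.
    Assumes nvidia-smi is installed and supports the compute_cap field."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    gpus = []
    for line in result.stdout.splitlines():
        if line.strip():
            name, cap = line.rsplit(",", 1)
            gpus.append((name.strip(), cap.strip()))
    return gpus


# Pure-function checks that run without a GPU.
for cap in ("6.1", "7.5", "8.9", "10.0"):
    print(cap, "->", "ok" if meets_requirement(cap) else "too old")
```

Comparing tuples of integers rather than raw strings or floats avoids misordering two-digit major versions.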
Engine & Scene-Assembly Tools
Promethean AI
An AI assistant for environment artists that assembles scenes from natural-language requests using a team's own asset libraries, working inside standard 3D tools and engines.
- Provides AI world building from natural-language requests and reasons over a team's existing creative assets (images, video, 3D models, animations) via a local metadata library. [prometheanai.com]
- Ships open-source plugins for Unreal Engine (the primary integration), Unity, 3ds Max, Maya, and Blender. [prometheanai.com]
- Asset files stay on the user's systems; the company states it does not upload, store, or train on customer assets. [prometheanai.com]
Ludus AI
An AI toolkit plugin for Unreal Engine that generates C++ code, Blueprints, and 3D models, and assists with scene creation and engine-specific questions.
- Generates C++ code, 3D models, and functional Blueprints inside Unreal Engine, with documentation at docs.ludusengine.com. [ludusengine.com]
- Supports Unreal Engine versions 5.4 to 5.7 and offers a free trial. [ludusengine.com]
Unity AI
Unity's in-editor AI suite for Unity 6, combining an agentic Assistant, asset Generators, an AI Gateway, and an MCP server for editor automation and asset creation.
- Unity AI's open beta was announced May 1, 2026, including an agentic in-project Assistant, Generators, an AI Gateway, and a Model Context Protocol server. [discussions.unity.com]
- Generators create placeholder materials, sounds, cubemaps, and 2D/3D assets in-editor. [discussions.unity.com]
- Personal Edition users get a 14-day trial with 1,000 one-time credits, then $10/month for 1,000 monthly credits; Pro/Enterprise/Industry seats include access. Unity states it does not train on user code or data by default. [discussions.unity.com]
- The Generators package (com.unity.ai.generators) is distributed for Unity 6000.2+ and was in pre-release (1.0.0-pre.20) as of the cited documentation version. [docs.unity3d.com]
Omniverse
NVIDIA's platform of libraries and services for building OpenUSD-based 3D pipelines, industrial digital twins, and physical-AI simulation, with generative AI microservices for scene search and assembly.
- USD Code and USD Search NIM microservices are generally available, letting developers use text prompts to generate or search for OpenUSD assets (announced at CES, January 6, 2025). [nvidianews.nvidia.com]
- NVIDIA Edify SimReady automatically labels existing 3D assets with attributes such as physics and materials, processing 1,000 objects in minutes versus roughly 40 hours manually. [nvidianews.nvidia.com]
- Omniverse blueprints announced alongside these microservices include Mega (robot-fleet testing in industrial digital twins), AV simulation, spatial streaming to Apple Vision Pro, and real-time digital twins for CAE. [nvidianews.nvidia.com]