AI 3D scene creation tools

Directory of AI tools that generate 3D scenes, worlds, and assets, including Marble, Genie 3, Meshy, Tripo, Skybox AI, Polycam, and Omniverse, with sourced capability notes.

We verified each tool against official documentation around 2026-07-05. Listings are never sold. Corrections: see contact.

3D World & Scene Generation

Marble

World Labs

A multimodal world model that generates explorable, persistent 3D worlds from text, images, video, or coarse 3D layouts, with tools to edit, expand, and combine generated worlds.

Marble became generally available on November 12, 2025, accepting text, single or multiple images, video, and coarse 3D layouts (via the Chisel tool) as input. [worldlabs.ai]
Generated worlds can be exported as Gaussian splats, triangle meshes (including collider meshes), or videos with camera control. [worldlabs.ai]
World Labs announced the World API on January 21, 2026, exposing Marble's world generation programmatically via platform.worldlabs.ai, with early integrations including NVIDIA Isaac Sim and MuJoCo. [worldlabs.ai]

HunyuanWorld / HY-World

Tencent Hunyuan

An open-source 3D world generation framework that turns text or images into immersive, explorable panoramic worlds with exportable, semantically layered 3D meshes.

HunyuanWorld-1.0, released July 26, 2025, uses a semantically layered 3D mesh representation with panoramic images as 360° world proxies, and outputs meshes compatible with standard graphics pipelines. [github.com]
Follow-up releases include HunyuanWorld-1.0-lite for consumer GPUs (August 2025), WorldMirror for video/multi-view input (October 2025), and HunyuanWorld-1.5 'WorldPlay' for real-time world creation (December 18, 2025). [github.com]
HY-World 2.0 was announced on April 16, 2026, with model weights distributed via Hugging Face. [github.com]

CSM

An AI platform that converts images, video, and text into segmented, production-ready 3D assets for game engines, VR, and visual effects, with a stated roadmap toward generating full 3D worlds from 2D input.

CSM converts 2D images and video into production-ready 3D assets with segmented components prepared for rigging and animation, for use in game engines, VR, and VFX. [ai.meta.com]
CSM uses Meta's Segment Anything Model 2 (SAM 2) to identify and segment individual elements in images and video before translating them into 3D components. [ai.meta.com]

Real-Time Interactive World Models

Odyssey

An AI lab building general world models that stream long-form interactive video simulations from text or image prompts, explorable in real time from the browser.

Odyssey-2, the company's flagship general-purpose world model, generates multi-minute interactive simulations with streaming that starts in about 50 ms, accessible at experience.odyssey.ml. [odyssey.ml]
Agora-1 is a multi-agent world model enabling simultaneous real-time interaction between human and AI participants in shared simulations. [odyssey.ml]
Odyssey raised a $310M Series B; TechCrunch reported the round in June 2026 at a $1.45B valuation with Amazon among the backers. [techcrunch.com]

Oasis / Lucy

Decart

A real-time AI lab offering Oasis, an API-accessible interactive world model aimed at physical AI (robotics, autonomous vehicles), and Lucy, a real-time video/world editing model.

Decart describes Oasis 3 as an API-accessible world model for physical AI that generates interactive, physically accurate environments in real time, targeting robotics, autonomous vehicles, manufacturing, and drones. [decart.ai]
Lucy 2.0 performs real-time, text-driven video transformation at production scale, and MirageLSD is presented as achieving infinite real-time video generation. [decart.ai]
Developer access, documentation, and pricing are provided through Decart's API platform at platform.decart.ai. [decart.ai]

Genie 3

Google DeepMind

A general-purpose world model that generates navigable, dynamic environments from text prompts in real time; currently available only as a limited research preview.

Genie 3 generates dynamic worlds at 720p and 24 frames per second, navigable in real time, and maintains consistency for several minutes with visual memory extending back about one minute. [deepmind.google]
It supports promptable world events, letting users change weather or add objects and characters via text during a session. [deepmind.google]
Availability is a limited research preview open to a small cohort of academics and creators. [deepmind.google]
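
For a rough sense of scale, the figures quoted above can be turned into frame counts. This is a back-of-the-envelope sketch, not anything published by DeepMind: it assumes 720p means 1280×720 pixels and treats "about one minute" of visual memory as exactly 60 seconds. The function names are illustrative.

```python
# Back-of-the-envelope arithmetic from the Genie 3 figures quoted above.
# Assumptions (not from the source): 720p is taken as 1280x720 pixels, and
# "about one minute" of visual memory is taken as exactly 60 seconds.

FPS = 24
WIDTH, HEIGHT = 1280, 720   # assumed pixel dimensions for 720p
MEMORY_SECONDS = 60         # assumed length of the visual-memory window


def frames_in_window(seconds: float, fps: int = FPS) -> int:
    """Number of frames generated over a window of the given length."""
    return int(seconds * fps)


def pixels_per_second(width: int = WIDTH, height: int = HEIGHT, fps: int = FPS) -> int:
    """Raw pixel throughput implied by the resolution and frame rate."""
    return width * height * fps


if __name__ == "__main__":
    print(frames_in_window(MEMORY_SECONDS))  # frames spanned by the memory window
    print(pixels_per_second())               # pixels generated per second
```

Under these assumptions, the one-minute memory window spans 1,440 generated frames.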

3D Asset Generation

Meshy

Meshy

A web-based AI 3D generator that produces textured 3D models from text prompts or images, with AI texturing, auto-rigging, and animation for game and 3D-printing workflows.

Meshy 6 is the current default text-to-3D model, generating meshes with up to roughly 600K faces plus PBR texture maps (albedo, normal, metallic, roughness). [meshy.ai]
Exports FBX, GLB, OBJ, STL, 3MF, USDZ, and BLEND, with plugins for Blender, Unity, Unreal Engine, 3ds Max, Maya, Godot, and several 3D-printing slicers. [meshy.ai]
Free plan includes 100 credits per month with CC BY 4.0 licensed outputs; the Pro plan is $20/month for 1,000 credits with API access and private assets. [meshy.ai]
Includes auto-rigging and a library of 500+ game-ready character motions. [meshy.ai]

Tripo

Tripo (VAST)

A text-to-3D and image-to-3D platform with AI texturing, part segmentation, and auto-rigging, available as a web studio, API, and engine/DCC plugins.

Tripo v3.0 offers Standard mode (optimized topology for real-time use) and Ultra mode with assets up to 2 million polygons, plus a sketch-to-3D pipeline. [tripo3d.ai]
Provides 4K PBR-ready AI texturing, intelligent part segmentation, and automatic rigging/animation with export-ready files. [tripo3d.ai]
Offers plugins for Blender, Unity, Unreal Engine, ComfyUI, Cocos, and Godot alongside the Tripo API. [tripo3d.ai]
Free plan includes 200 credits/month; Pro is $19.90/month (or $13.93/month billed annually) with 3,000 monthly credits and batch generation. [tripo3d.ai]
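
The Pro pricing above can be reduced to a per-credit cost and an annual-billing discount. The sketch below uses only the listed figures and assumes the annual plan carries the same 3,000 monthly credits, which the listing implies but does not state outright. The `Plan` class and its method names are illustrative, and credits are not comparable across vendors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A subscription plan expressed as a monthly price and credit allowance."""
    name: str
    monthly_usd: float
    credits_per_month: int

    def usd_per_credit(self) -> float:
        """Effective price of one credit if the full allowance is used."""
        return self.monthly_usd / self.credits_per_month

    def yearly_usd(self) -> float:
        """Total paid over twelve months at this monthly rate."""
        return self.monthly_usd * 12


# Figures from the listing; the annual plan's credit count is an assumption.
TRIPO_PRO_MONTHLY = Plan("Tripo Pro (monthly billing)", 19.90, 3000)
TRIPO_PRO_ANNUAL = Plan("Tripo Pro (annual billing)", 13.93, 3000)


def annual_discount(monthly: Plan, annual: Plan) -> float:
    """Fractional saving of annual billing relative to monthly billing."""
    return 1 - annual.monthly_usd / monthly.monthly_usd


if __name__ == "__main__":
    for plan in (TRIPO_PRO_MONTHLY, TRIPO_PRO_ANNUAL):
        print(f"{plan.name}: ${plan.usd_per_credit():.4f} per credit, "
              f"${plan.yearly_usd():.2f} per year")
    print(f"Annual discount: {annual_discount(TRIPO_PRO_MONTHLY, TRIPO_PRO_ANNUAL):.0%}")
```

On the listed prices, annual billing works out to a 30% discount over monthly billing.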

Rodin

Hyper3D (Deemos)

An AI 3D model generator that creates production-oriented meshes from text prompts or images, with mask-based 3D editing (inpainting) of existing models.

Rodin Gen 2 generates 3D models from text or images and supports 3D inpainting that uses a masked diffusion process to modify selected regions of a model. [hyper3d.ai]
Outputs high-poly quad meshes positioned as production-ready, with export to STL, FBX, OBJ, GLB, and USDZ. [hyper3d.ai]
Rodin image-to-3D generation is also offered as a hosted API through fal.ai. [fal.ai]

Sloyd

Sloyd

A game-asset generator combining parametric templates with AI, producing customizable, game-ready 3D models from text prompts, images, or slider-based template editing.

Generates 3D models from text prompts, photos, sketches, and parametric templates with adjustable controls, under a subscription model with unlimited generation rather than per-credit consumption. [sloyd.ai]
Offers topology control with adjustable polygon counts, one-click auto-rigging, AI retexturing, and manifold geometry suitable for 3D printing. [sloyd.ai]
Available as a web app with plugins for Unity, Unreal, and Blender, and includes a Roblox avatar creator. [sloyd.ai]

Cube

Roblox

Roblox's open-source generative AI system for 3D creation, generating meshes from text prompts inside Roblox Studio and via a Lua API, with a roadmap toward scene generation.

Cube launched on March 17, 2025, with text-to-mesh generation in Roblox Studio and an in-experience Lua API (e.g. '/generate a motorcycle'). [about.roblox.com]
The model is open-sourced on GitHub and Hugging Face, and developers can fine-tune it or train it on their own data. [about.roblox.com]
Cube tokenizes 3D shapes the way language models tokenize text, predicting shape tokens autoregressively; Roblox's stated roadmap extends to scene generation and '4D creation'. [about.roblox.com]
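
To illustrate what "predicting tokens autoregressively" means in general, here is a toy decoding loop. It is not Roblox's tokenizer or model: the vocabulary, the score table, and the greedy selection rule are all hypothetical stand-ins. For brevity the scoring function looks only at the last token of the prefix, whereas a real model would condition on the whole prefix.

```python
# Toy illustration of autoregressive token prediction.
# Everything here is hypothetical: the vocabulary, the score table, and the
# greedy decoding rule are stand-ins, not Roblox's Cube tokenizer or model.

START, END = "<start>", "<end>"

# Hypothetical next-token scores keyed by the most recent token.
NEXT_TOKEN_SCORES = {
    START: {"wheel": 0.6, "frame": 0.4},
    "wheel": {"frame": 0.7, "wheel": 0.2, END: 0.1},
    "frame": {"seat": 0.8, END: 0.2},
    "seat": {END: 0.9, "wheel": 0.1},
}


def score_next(prefix: list[str]) -> dict[str, float]:
    """Score candidate next tokens given the tokens generated so far.

    Simplification: only the last token of the prefix is consulted.
    """
    return NEXT_TOKEN_SCORES[prefix[-1]]


def generate(max_tokens: int = 8) -> list[str]:
    """Greedily extend the sequence one token at a time until END or the cap."""
    tokens = [START]
    while len(tokens) - 1 < max_tokens:
        scores = score_next(tokens)
        next_token = max(scores, key=scores.get)  # greedy choice
        if next_token == END:
            break
        tokens.append(next_token)  # feed the prediction back in as context
    return tokens[1:]  # drop the START marker


if __name__ == "__main__":
    print(generate())
```

The key property is the feedback loop: each predicted token is appended to the sequence and becomes part of the input for the next prediction.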

TRELLIS

Microsoft Research

An open-source large 3D asset generation model using a unified Structured LATent (SLAT) representation, decoding to radiance fields, 3D Gaussians, or meshes from text or image prompts.

TRELLIS outputs Radiance Fields, 3D Gaussians, and meshes from image or text prompts, with local 3D editing and variant generation. [github.com]
Model lineup ranges from 342M to 2.0B parameters (image-large: 1.2B; text-xlarge: 2.0B), trained on the 500K-object TRELLIS-500K dataset. [github.com]
The models and main codebase are MIT-licensed; the image model shipped December 2024 and training code plus text models followed on March 25, 2025. [github.com]
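
The parameter counts above give a rough lower bound on the memory needed just to hold the weights. The sketch below assumes a storage precision (16-bit or 32-bit floats), which the listing does not specify, and ignores activations and any auxiliary encoders or decoders. The helper name is illustrative.

```python
# Rough weight-memory estimate from the parameter counts quoted above.
# Assumption: weights are stored as fp16 (2 bytes) or fp32 (4 bytes); the
# source does not state the precision. Activations and other runtime
# buffers are not included.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

# Parameter counts as listed: image-large 1.2B, text-xlarge 2.0B.
MODELS = {"image-large": 1.2e9, "text-xlarge": 2.0e9}


def weight_gigabytes(params: float, dtype: str = "fp16") -> float:
    """Size of the raw weights in gigabytes (10**9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for name, params in MODELS.items():
        print(f"{name}: {weight_gigabytes(params, 'fp16'):.1f} GB in fp16, "
              f"{weight_gigabytes(params, 'fp32'):.1f} GB in fp32")
```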

Skybox & 360° Environment Generation

Skybox AI

Blockade Labs

A text-to-360° environment generator that produces seamless equirectangular skybox panoramas for games, VR, and virtual production, with API/SDK and engine export paths.

Generates equirectangular 360° panoramas at up to 16K resolution, with 32-bit HDRI export for scene lighting. [blockadelabs.com]
Offers an API and SDK, a native Unity plugin, direct exports to Blender and Unreal, and MCP (Model Context Protocol) server support. [blockadelabs.com]
Advertises generation times as fast as 15 seconds per skybox, with commercial licensing and creator IP ownership included. [blockadelabs.com]
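
An equirectangular panorama maps longitude to the horizontal axis and latitude to the vertical axis, so a view direction can be converted to a pixel with a little trigonometry. The sketch below is a generic version of that mapping, not Blockade Labs' code. It assumes a y-up coordinate system with +z at the image center, and it takes "16K" to mean 16384×8192 pixels; both are assumptions, and engines differ on axis conventions.

```python
import math

# Generic equirectangular lookup: view direction -> pixel coordinates.
# Assumptions (not from the source): y is up, +z maps to the image center,
# and "16K" means a 16384 x 8192 image with a 2:1 aspect ratio.

WIDTH_16K, HEIGHT_16K = 16384, 8192


def direction_to_pixel(x: float, y: float, z: float,
                       width: int = WIDTH_16K,
                       height: int = HEIGHT_16K) -> tuple[int, int]:
    """Return the (column, row) of the panorama pixel seen along (x, y, z)."""
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0:
        raise ValueError("direction must be non-zero")
    x, y, z = x / norm, y / norm, z / norm

    lon = math.atan2(x, z)                     # longitude in [-pi, pi], 0 at +z
    lat = math.asin(max(-1.0, min(1.0, y)))    # latitude in [-pi/2, pi/2]

    u = lon / (2 * math.pi) + 0.5              # 0..1 left to right
    v = 0.5 - lat / math.pi                    # 0..1 top to bottom

    col = min(int(u * width), width - 1)       # clamp the u == 1 edge case
    row = min(int(v * height), height - 1)     # clamp the v == 1 edge case
    return col, row


if __name__ == "__main__":
    print(direction_to_pixel(0, 0, 1))   # straight ahead
    print(direction_to_pixel(0, 1, 0))   # straight up
```

Looking straight ahead lands in the middle of the image, while looking straight up lands on the top row, which is why the poles of a skybox appear stretched across the full image width.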

Capture & 3D Reconstruction

Luma Interactive Scenes

Luma AI

A capture platform that turns phone photos and video of real objects and places into photorealistic, embeddable interactive 3D scenes viewable on the web.

Available on web, iOS, and Android, with scenes that stream immediately and run at 30 FPS in web browsers. [lumalabs.ai]
Captures are usable commercially without additional licensing and remain private unless made public; a Luma API supports bulk generation. [lumalabs.ai]
Luma publishes an open web library and three.js examples for embedding its captures in WebGL projects. [github.com]

Polycam

A cross-platform 3D capture app supporting photogrammetry, LiDAR scanning, Gaussian splatting, drone processing, and automatic floor plans, with broad export-format support.

Capture modes include AI-assisted photogrammetry, LiDAR (on supported devices), Gaussian splatting, and drone footage processing, organized into spatial, object, floor plan, and aerial captures. [poly.cam]
Runs on iOS, Android, web, and Apple Vision Pro. [poly.cam]
Exports to workflows including AutoCAD, Maya, FBX, glTF, Unity, Blender, and Unreal Engine; a dedicated Gaussian splatting tool page is at poly.cam/tools/gaussian-splatting. [poly.cam]

RealityScan

Epic Games

Epic Games' photogrammetry software (formerly RealityCapture) that reconstructs 3D models from photos and laser scans, with desktop and mobile versions.

Free for individuals, educators, and companies with under $1 million USD in annual gross revenue; paid seats apply above that threshold. [realityscan.com]
Supports photogrammetry plus laser-scan input, SLAM data import, and classified point clouds; version 2.2 added full AMD GPU support (Radeon, RDNA 4, Ryzen AI Max). [realityscan.com]
Ships as a Windows desktop app via the Epic Games Launcher plus a RealityScan Mobile app for iOS/Android. [realityscan.com]

KIRI Engine

KIRI Innovations

A freemium 3D scanning app for phone and web that reconstructs objects and spaces via photogrammetry, LiDAR, NeRF-based featureless-object scanning, and 3D Gaussian splatting.

Offers Photo Scan (photogrammetry), LiDAR Scan (on supported iOS devices), featureless-object scanning using NeRF, and 3D Gaussian splatting with mesh export. [kiriengine.app]
Available on iOS, Android, and web browsers under a freemium model. [kiriengine.app]
Supports quad mesh output with PBR materials, mesh decimation, and auto-rigging. [kiriengine.app]

Postshot

Jawset

A Windows desktop application for training radiance-field/Gaussian-splatting scenes locally from photos or video, with live training preview and editing tools.

Builds 3D scenes from regular RGB images using radiance field technology, with all processing done locally on the user's machine. [jawset.com]
Requires Windows 10+ and an NVIDIA GPU with Compute Capability 7.5 or higher (e.g. GeForce RTX 2060 or better). [jawset.com]
Features include live training preview, scene merging, image masking, AprilTag alignment, GUI and command-line interfaces, and Unreal Engine and After Effects integrations. [jawset.com]

Engine & Scene-Assembly Tools

Promethean AI

An AI assistant for environment artists that assembles scenes from natural-language requests using a team's own asset libraries, working inside standard 3D tools and engines.

Provides AI world building from natural-language requests and reasons over a team's existing creative assets (images, video, 3D models, animations) via a local metadata library. [prometheanai.com]
Ships open-source plugins for Unreal Engine (the primary integration), Unity, 3ds Max, Maya, and Blender. [prometheanai.com]
Asset files stay on the user's systems; the company states it does not upload, store, or train on customer assets. [prometheanai.com]

Ludus AI

Ludus

An AI toolkit plugin for Unreal Engine that generates C++ code, Blueprints, and 3D models, and assists with scene creation and engine-specific questions.

Generates C++ code, 3D models, and functional Blueprints inside Unreal Engine, with documentation at docs.ludusengine.com. [ludusengine.com]
Supports Unreal Engine versions 5.4 to 5.7 and offers a free trial. [ludusengine.com]
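
A project setup script might want to check an engine version against the supported range above. The sketch below is a hypothetical helper, not part of Ludus or Unreal Engine. It compares versions as integer tuples, so that "5.10" is not mistaken for 5.1, and it assumes that any patch release within a supported minor version is also supported, which the listing does not confirm.

```python
# Hypothetical helper: check an Unreal Engine version string against the
# 5.4-to-5.7 range quoted above. Not part of Ludus or Unreal Engine.
# Assumption: patch releases (e.g. 5.7.1) inherit support from their minor version.

MIN_SUPPORTED = (5, 4)
MAX_SUPPORTED = (5, 7)


def parse_version(text: str) -> tuple[int, int]:
    """Parse 'major.minor' or 'major.minor.patch' into (major, minor)."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def is_supported(engine_version: str) -> bool:
    """True if the version's (major, minor) falls inside the supported range."""
    return MIN_SUPPORTED <= parse_version(engine_version) <= MAX_SUPPORTED


if __name__ == "__main__":
    for version in ("5.3", "5.4", "5.7.1", "5.10"):
        print(version, is_supported(version))
```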

Unity AI

Unity Technologies

Unity's in-editor AI suite for Unity 6, combining an agentic Assistant, asset Generators, an AI Gateway, and an MCP server for editor automation and asset creation.

Unity AI's open beta was announced May 1, 2026, including an agentic in-project Assistant, Generators, an AI Gateway, and a Model Context Protocol server. [discussions.unity.com]
Generators create placeholder materials, sounds, cubemaps, and 2D/3D assets in-editor. [discussions.unity.com]
Personal Edition users get a 14-day trial with 1,000 one-time credits, then $10/month for 1,000 monthly credits; Pro/Enterprise/Industry seats include access. Unity states it does not train on user code or data by default. [discussions.unity.com]
The Generators package (com.unity.ai.generators) is distributed for Unity 6000.2+ and was in pre-release (1.0.0-pre.20) as of the cited documentation version. [docs.unity3d.com]

Omniverse

NVIDIA

NVIDIA's platform of libraries and services for building OpenUSD-based 3D pipelines, industrial digital twins, and physical-AI simulation, with generative AI microservices for scene search and assembly.

USD Code and USD Search NIM microservices are generally available, letting developers use text prompts to generate or search for OpenUSD assets (announced at CES, January 6, 2025). [nvidianews.nvidia.com]
NVIDIA Edify SimReady automatically labels existing 3D assets with attributes such as physics and materials, processing 1,000 objects in minutes versus roughly 40 hours manually. [nvidianews.nvidia.com]
Omniverse blueprints announced at the same time include Mega (robot-fleet testing in industrial digital twins), AV simulation, spatial streaming to Apple Vision Pro, and real-time digital twins for CAE. [nvidianews.nvidia.com]
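
The Edify SimReady comparison above (1,000 objects, roughly 40 hours manually) can be restated as a per-object time. The automated duration is given only as "minutes", so the sketch below leaves it as a parameter. The 10-minute value in the usage example is a placeholder chosen for illustration, not a figure from NVIDIA.

```python
# Restating the Edify SimReady comparison quoted above.
# From the source: 1,000 objects, roughly 40 hours when labeled manually.
# Not from the source: the exact automated duration ("minutes"); callers
# must supply it, and the 10-minute example below is a placeholder.

MANUAL_HOURS = 40
OBJECTS = 1000


def manual_seconds_per_object(hours: float = MANUAL_HOURS,
                              objects: int = OBJECTS) -> float:
    """Average manual labeling time per object, in seconds."""
    return hours * 3600 / objects


def speedup(automated_minutes: float, manual_hours: float = MANUAL_HOURS) -> float:
    """How many times faster the automated run is than the manual estimate."""
    if automated_minutes <= 0:
        raise ValueError("automated_minutes must be positive")
    return manual_hours * 60 / automated_minutes


if __name__ == "__main__":
    print(manual_seconds_per_object())       # seconds per object, manual
    print(speedup(automated_minutes=10))     # placeholder duration, illustration only
```

The manual estimate works out to 144 seconds per object; the speedup depends entirely on the automated duration supplied.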