Muses: Designing, Composing, Generating Nonexistent Fantasy 3D Creatures without Training
Abstract
Muses enables feed-forward generation of 3D creatures by leveraging skeletal structures and graph-constrained reasoning for coherent design and assembly.
We present Muses, the first training-free method for fantasy 3D creature generation in a feed-forward paradigm. Previous methods, which rely on part-aware optimization, manual assembly, or 2D image generation, often produce unrealistic or incoherent 3D assets due to the challenges of intricate part-level manipulation and limited out-of-domain generation. In contrast, Muses leverages the 3D skeleton, a fundamental representation of biological forms, to explicitly and rationally compose diverse elements. This skeletal foundation formalizes 3D content creation as a structure-aware pipeline of design, composition, and generation. Muses begins by constructing a creatively composed 3D skeleton with coherent layout and scale through graph-constrained reasoning. This skeleton then guides a voxel-based assembly process within a structured latent space, integrating regions from different objects. Finally, image-guided appearance modeling under skeletal conditions is applied to generate a style-consistent and harmonious texture for the assembled shape. Extensive experiments establish Muses' state-of-the-art performance in terms of visual fidelity and alignment with textual descriptions, as well as its potential for flexible 3D object editing. Project page: https://luhexiao.github.io/Muses.github.io/.
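The design stage above composes part skeletons at the graph level before any geometry is generated. As a rough illustration of that idea (not the paper's implementation: the actual pipeline uses graph-constrained reasoning over learned representations, and every name below is hypothetical), a composed creature skeleton can be modeled as a merged joint-bone graph whose coherence is checked with a simple connectivity constraint:

```python
from collections import defaultdict, deque
from dataclasses import dataclass

# Hypothetical sketch only: Muses' real design stage uses graph-constrained
# reasoning; this toy models just the graph-merge-and-check structure.

@dataclass
class PartSkeleton:
    name: str
    joints: list  # joint identifiers local to this part
    bones: list   # (joint_a, joint_b) edges within this part

def compose_skeleton(parts, attachments):
    """Merge per-part skeleton graphs into one creature skeleton.

    `attachments` lists ((part_a, joint_a), (part_b, joint_b)) pairs that
    fuse a joint of one part to a joint of another.
    """
    joints, bones = [], []
    for p in parts:
        joints += [f"{p.name}.{j}" for j in p.joints]
        bones += [(f"{p.name}.{a}", f"{p.name}.{b}") for a, b in p.bones]
    for (pa, ja), (pb, jb) in attachments:
        bones.append((f"{pa}.{ja}", f"{pb}.{jb}"))
    return joints, bones

def is_connected(joints, bones):
    """BFS check that the composed skeleton is a single connected graph --
    a minimal stand-in for a layout-coherence constraint."""
    adj = defaultdict(list)
    for a, b in bones:
        adj[a].append(b)
        adj[b].append(a)
    seen, queue = {joints[0]}, deque([joints[0]])
    while queue:
        for nxt in adj[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(joints)

# Usage: fuse eagle wings onto a lion torso at the shoulder joints.
lion = PartSkeleton("lion", ["spine", "shoulder_l", "shoulder_r"],
                    [("spine", "shoulder_l"), ("spine", "shoulder_r")])
wings = PartSkeleton("eagle",
                     ["wing_root_l", "wing_root_r", "wing_tip_l", "wing_tip_r"],
                     [("wing_root_l", "wing_tip_l"),
                      ("wing_root_r", "wing_tip_r")])
joints, bones = compose_skeleton(
    [lion, wings],
    [(("lion", "shoulder_l"), ("eagle", "wing_root_l")),
     (("lion", "shoulder_r"), ("eagle", "wing_root_r"))])
print(is_connected(joints, bones))  # True: the chimera skeleton is one graph
```

In the paper, this composed skeleton then conditions the later voxel-based assembly and texturing stages; the toy stops at the graph check.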
Community
The following similar papers were recommended by the Semantic Scholar API:
- Photo3D: Advancing Photorealistic 3D Generation through Structure-Aligned Detail Enhancement (2025)
- MajutsuCity: Language-driven Aesthetic-adaptive City Generation with Controllable 3D Assets and Layouts (2025)
- 3DProxyImg: Controllable 3D-Aware Animation Synthesis from Single Image via 2D-3D Aligned Proxy Embedding (2025)
- Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation (2025)
- UniPart: Part-Level 3D Generation with Unified 3D Geom-Seg Latents (2025)
- MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing (2026)
- Self-Evolving 3D Scene Generation from a Single Image (2025)