Yume-1.5: A Text-Controlled Interactive World Generation Model • Paper • 2512.22096 • Published 7 days ago • 54
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer • Paper • 2511.22699 • Published Nov 27, 2025 • 219
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI • Paper • 2510.05684 • Published Oct 7, 2025 • 141
MOSEv2: A More Challenging Dataset for Video Object Segmentation in Complex Scenes • Paper • 2508.05630 • Published Aug 7, 2025 • 9