Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection Paper • 2512.16905 • Published 8 days ago • 30
MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives Paper • 2512.14699 • Published 10 days ago • 27
Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling Paper • 2512.12675 • Published 12 days ago • 40
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published 17 days ago • 125
AdaViewPlanner: Adapting Video Diffusion Models for Viewpoint Planning in 4D Scenes Paper • 2510.10670 • Published Oct 12 • 18
CGB-DM: Content and Graphic Balance Layout Generation with Transformer-based Diffusion Model Paper • 2407.15233 • Published Jul 21, 2024 • 7
SimpleGVR: A Simple Baseline for Latent-Cascaded Video Super-Resolution Paper • 2506.19838 • Published Jun 24 • 13
AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning Paper • 2507.12841 • Published Jul 17 • 41
UniMMVSR: A Unified Multi-Modal Framework for Cascaded Video Super-Resolution Paper • 2510.08143 • Published Oct 9 • 20
Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis Paper • 2509.09595 • Published Sep 11 • 48
Scaling Image and Video Generation via Test-Time Evolutionary Search Paper • 2505.17618 • Published May 23 • 41