Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 2 days ago • 87
MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding Paper • 2603.22458 • Published 4 days ago • 123
EndoCoT: Scaling Endogenous Chain-of-Thought Reasoning in Diffusion Models Paper • 2603.12252 • Published 15 days ago • 10
Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published 24 days ago • 100
Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning Paper • 2601.21037 • Published Jan 28 • 15
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models Paper • 2512.24165 • Published Dec 30, 2025 • 52
AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation Paper • 2601.17761 • Published Jan 25 • 14
TwinFlow Collection A collection of TwinFlow-accelerated diffusion models • 4 items • Updated 3 days ago • 6