view article Article Nemotron 3 Nano \- A new Standard for Efficient, Open, and Intelligent Agentic Models 7 days ago • 96
Devstral 2 Collection A couple of agentic LLMs for software engineering tasks, excelling at using tools to explore codebases, edit multiple files, and power SWE Agents. • 3 items • Updated 13 days ago • 37
view article Article DeepMath: A lightweight math reasoning Agent with smolagents +1 19 days ago • 30
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated 20 days ago • 131
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 22 days ago • 249
Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data Paper • 2511.12609 • Published Nov 16 • 103
VIDEOP2R: Video Understanding from Perception to Reasoning Paper • 2511.11113 • Published Nov 14 • 112
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published Nov 17 • 136
P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published Nov 17 • 132
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published Nov 14 • 162
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published Nov 19 • 226
Olmo 3 Post-training Collection All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated 13 days ago • 45