Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting Paper • 2601.02151 • Published 4 days ago • 78
ROI-Reasoning: Rational Optimization for Inference via Pre-Computation Meta-Cognition Paper • 2601.03822 • Published 2 days ago • 21
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 85
Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience Paper • 2512.17260 • Published 21 days ago • 48
Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving Paper • 2512.10739 • Published 29 days ago • 46
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 283
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published Nov 24, 2025 • 60
LiveTradeBench: Seeking Real-World Alpha with Large Language Models Paper • 2511.03628 • Published Nov 5, 2025 • 12
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning Paper • 2510.23473 • Published Oct 27, 2025 • 84
ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use Paper • 2510.27363 • Published Oct 31, 2025 • 22
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution Paper • 2510.25726 • Published Oct 29, 2025 • 45