What do Language Models Learn and When? The Implicit Curriculum Hypothesis Paper • 2604.08510 • Published 10 days ago • 3
Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems Paper • 2604.14228 • Published 5 days ago • 13
LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning Paper • 2604.14922 • Published 3 days ago • 5
How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data Paper • 2604.14164 • Published 27 days ago • 23
CONSCIENTIA: Can LLM Agents Learn to Strategize? Emergent Deception and Trust in a Multi-Agent NYC Simulation Paper • 2604.09746 • Published 9 days ago • 1
You Only Judge Once: Multi-response Reward Modeling in a Single Forward Pass Paper • 2604.10966 • Published 6 days ago • 10
Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation Paper • 2604.13010 • Published 5 days ago • 10
BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation Paper • 2604.09497 • Published 9 days ago • 28
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2604.12374 • Published 5 days ago • 29
Toward Autonomous Long-Horizon Engineering for ML Research Paper • 2604.13018 • Published 5 days ago • 31
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Paper • 2604.12627 • Published 5 days ago • 96
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 5 days ago • 77
Do AI Coding Agents Log Like Humans? An Empirical Study Paper • 2604.09409 • Published 9 days ago • 3
SemaClaw: A Step Towards General-Purpose Personal AI Agents through Harness Engineering Paper • 2604.11548 • Published 6 days ago • 18
From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space Paper • 2604.14142 • Published 4 days ago • 26
Sema Code: Decoupling AI Coding Agents into Programmable, Embeddable Infrastructure Paper • 2604.11045 • Published 6 days ago • 23