Submitted by iseesaw 190 A Survey of Reinforcement Learning for Large Reasoning Models · TsinghuaC3I
Submitted by taesiri 57 AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning · 23 authors
Submitted by TongZheng1999 28 CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models · 11 authors
Submitted by spermwhale 15 The Majority is not always right: RL training for solution aggregation · 6 authors
Submitted by memyprokotow 13 <think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs · 3 authors
Submitted by taesiri 3 HumanAgencyBench: Scalable Evaluation of Human Agency Support in AI Assistants · 4 authors