When Does Trajectory-Level Supervision Permit Efficient Offline Reinforcement Learning? Paper • 2606.18531 • Published 4 days ago • 3 • 2
SciOrch: Learning to Orchestrate Expert LLMs for Solving Frontier Multimodal Scientific Reasoning Tasks Paper • 2606.15872 • Published 5 days ago • 4 • 3
FlowBender: Feedback-Aware Training for Self-Correcting Conditional Flows Paper • 2606.20404 • Published 1 day ago • 12 • 2
Multi-LCB: Extending LiveCodeBench to Multiple Programming Languages Paper • 2606.20517 • Published 1 day ago • 5 • 2
DragMesh-2: Physically Plausible Dexterous Hand-Object Interaction with Articulated Objects Paper • 2606.15133 • Published 7 days ago • 45 • 3
STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability Paper • 2606.19236 • Published 3 days ago • 8 • 4
A Benchmark and Framework for Evaluating Next Action Predictions in Spreadsheets Paper • 2606.13802 • Published 9 days ago • 3
LLM-Enabled NWDAF: A Step Toward AI-Native 6G Network Intelligence Paper • 2606.11877 • Published 10 days ago • 3
Morpheus: A Morphology-Aware Neural Tokenizer and Word Embedder for Turkish Paper • 2606.18717 • Published 3 days ago • 1 • 2
From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning Paper • 2606.17682 • Published 4 days ago • 16 • 2
Visual-Seeker: Towards Visual-Native Multimodal Agentic Search via Active Visual Reasoning Paper • 2606.15231 • Published 7 days ago • 3 • 3
Guava: An Effective and Universal Harness for Embodied Manipulation Paper • 2606.18363 • Published 4 days ago • 24 • 5
EfficientRollout: System-Aware Self-Speculative Decoding for RL Rollouts Paper • 2606.18967 • Published 3 days ago • 20 • 3
SAE Interventions are Unreliable: Post-Intervention Recovery of Suppressed Behavior Paper • 2606.18322 • Published 4 days ago • 16 • 3
GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine? Paper • 2606.17861 • Published 4 days ago • 44 • 5
Learning from the Self-future: On-policy Self-distillation for dLLMs Paper • 2606.18195 • Published 4 days ago • 70 • 2
Rethinking the Role of Efficient Attention in Hybrid Architectures Paper • 2606.15378 • Published 7 days ago • 14 • 3
Show the Signal, Hide the Noise: Spectral Forcing for Pixel-Space Diffusion Paper • 2606.15236 • Published 4 days ago • 18 • 3
ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining Paper • 2606.17200 • Published 5 days ago • 43 • 3