Checkpoints for "Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning" (arXiv:2509.22601)
Yulei Qin
yolay
AI & ML interests
Medical Imaging, Computer Vision, Language Models
Recent Activity
upvoted a paper about 3 hours ago
ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting
updated a model 17 days ago
yolay/SPEAR-SearchQA-Qwen2.5-14B
updated a model 18 days ago
yolay/SPEAR-SearchQA-Qwen2.5-7B