Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 5 days ago • 240
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision Paper • 2604.04934 • Published 7 days ago • 39
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents Paper • 2604.06132 • Published 6 days ago • 110
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience Paper • 2603.24533 • Published 18 days ago • 47
Look Where It Matters: High-Resolution Crops Retrieval for Efficient VLMs Paper • 2603.16932 • Published 29 days ago • 86
view post Post 5461 Surya-1.1T: Scaling Beyond Human-Level Reasoning via 146 Trillion Token Pre-trainingAuthor: SKT AI LABSAffiliation: SKT AI Labs / Project SuryaModel Architecture: Optimized Dense TransformerParameters: 1.1 TrillionTraining Tokens: 146 TrillionWanna collaborate us Friends let's Start Journey we have Collected 146 trillon tokens and done pre training but we need to made more powerfullWhitepaper - https://github.com/SHRIJANAGAIN/PROFF See translation 57 replies · 🔥 15 15 👍 9 9 🚀 8 8 🤗 7 7 ➕ 7 7 👀 6 6 ❤️ 6 6 😎 5 5 🧠 5 5 🤝 5 5 🤯 3 3 + Reply