Article How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day 23 days ago • 46
Tiny-A2D Collection Small diffusion language models adapted from AR models • 4 items • Updated 25 days ago • 11
Article Transformers v5: Simple model definitions powering the AI ecosystem Dec 1, 2025 • 260
Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE Paper • 2502.06282 • Published Feb 10, 2025 • 6
Article Implementing MCP Servers in Python: An AI Shopping Assistant with Gradio Jul 31, 2025 • 60
Article ✴️ ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use Jan 3, 2025 • 22
Article The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix Nov 3, 2025 • 53
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4, 2025 • 101
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought Paper • 2511.02779 • Published Nov 4, 2025 • 58
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30, 2025 • 119
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science Paper • 2510.16872 • Published Oct 19, 2025 • 106
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published Aug 5, 2025 • 121