Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models Paper • 2404.11502 • Published Apr 17, 2024 • 1
Easy and Efficient Transformer: Scalable Inference Solution For large NLP model Paper • 2104.12470 • Published Apr 26, 2021 • 1
∇²DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials Paper • 2406.14347 • Published Jun 20, 2024 • 102
Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal Paper • 2508.05988 • Published Aug 8, 2025 • 22
LLM Pruning and Distillation in Practice: The Minitron Approach Paper • 2408.11796 • Published Aug 21, 2024 • 61
gemini-3-flash Collection 0xb2f264fdd8a28748cb3c6a748127009fec56cba3 • 9 items • Updated 5 days ago • 1
Easy3E: Feed-Forward 3D Asset Editing via Rectified Voxel Flow Paper • 2602.21499 • Published Mar 22 • 1
Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR Paper • 2602.05261 • Published Feb 5 • 52
MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents Paper • 2602.02474 • Published Feb 2 • 63
Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening Paper • 2602.05386 • Published Feb 5 • 68