Mowgli: Passively Learned Rate Control for Real-Time Video Paper • 2410.03339 • Published Oct 4, 2024 • 1
Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs Paper • 2512.20573 • Published 22 days ago • 1
Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning Paper • 2210.00093 • Published Sep 30, 2022
SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning Paper • 2504.07891 • Published Apr 10, 2025 • 5
Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving Paper • 2312.05385 • Published Dec 8, 2023