Data-Efficient Autoregressive-to-Diffusion Language Models via On-Policy Distillation • Paper 2606.06712 • Published 11 days ago