VibeVoice • Collection • Frontier Text-to-Speech Models • https://microsoft.github.io/VibeVoice/ • 8 items • Updated Dec 4, 2025 • 184
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion • Paper • 2509.01215 • Published Sep 1, 2025 • 50
Video ReCap: Recursive Captioning of Hour-Long Videos • Paper • 2402.13250 • Published Feb 20, 2024 • 26
VideoPrism: A Foundational Visual Encoder for Video Understanding • Paper • 2402.13217 • Published Feb 20, 2024 • 38