RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies • Paper • 2603.04639 • Published Mar 4 • 29
MMOU: A Massive Multi-Task Omni Understanding and Reasoning Benchmark for Long and Complex Real-World Videos • Paper • 2603.14145 • Published 21 days ago • 14
Implicit Neural Representation Facilitates Unified Universal Vision Encoding • Paper • 2601.14256 • Published Jan 20 • 7
HUVR • Collection • Vision unified representation model with standard and compressed features for classification, generation, and more: https://arxiv.org/abs/2601.14256 • 4 items • Updated Jan 23 • 5
Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition • Paper • 2508.03695 • Published Aug 5, 2025 • 1