Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published 21 days ago • 284
Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective Paper • 2506.17930 • Published Jun 22, 2025 • 19
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21, 2025 • 502
KVAE 1.0 Collection KVAE 1.0 tokenizers for images (KVAE-2D-1.0) and video (KVAE-3D-1.0), distributed under the MIT license (commercial use is possible). • 2 items • Updated Dec 14, 2025 • 7
MedGemma Release Collection Collection of Gemma 3 variants trained for performance on medical text and image comprehension, intended to accelerate building healthcare-based AI applications. • 9 items • Updated Jan 14 • 437
VideoPrism Collection VideoPrism is a foundational video encoder that enables state-of-the-art performance on a large variety of video understanding tasks. • 5 items • Updated Jul 16, 2025 • 17
Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models Paper • 2512.00590 • Published Nov 29, 2025 • 48
Saiga GGUF Collection Russian fine-tunes of different base LLMs in the GGUF format compatible with llama.cpp • 8 items • Updated Apr 27, 2025 • 38
DTrOCR: Decoder-only Transformer for Optical Character Recognition Paper • 2308.15996 • Published Aug 30, 2023 • 4
Transformer-Based Approach for Joint Handwriting and Named Entity Recognition in Historical documents Paper • 2112.04189 • Published Dec 8, 2021 • 3
Handwritten and Printed Text Segmentation: A Signature Case Study Paper • 2307.07887 • Published Jul 15, 2023 • 1
WriteViT: Handwritten Text Generation with Vision Transformer Paper • 2505.13235 • Published May 19, 2025 • 1
Ocean-OCR: Towards General OCR Application via a Vision-Language Model Paper • 2501.15558 • Published Jan 26, 2025 • 2
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models Paper • 2109.10282 • Published Sep 21, 2021 • 12
TRIDIS: A Comprehensive Medieval and Early Modern Corpus for HTR and NER Paper • 2503.22714 • Published Mar 25, 2025 • 1