david riordan

drdn

AI & ML interests

hci, audio ml, public sector tech, historical data

Organizations

upvoted an article over 1 year ago

Article

Introducing TextImage Augmentation for Document Images

Aug 6, 2024 • 33

upvoted 2 papers over 1 year ago

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3, 2024 • 83

VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers

Paper • 2406.05370 • Published Jun 8, 2024 • 17

upvoted 2 papers almost 2 years ago

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18, 2024 • 55

AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation

Paper • 2404.12753 • Published Apr 19, 2024 • 43

upvoted 2 collections almost 2 years ago

Idefics2 🐶

Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6, 2024 • 92

OpenCulture

A multilingual dataset of public domain books and newspapers. • 27 items • Updated Nov 6, 2024 • 132