-
The Ultra-Scale Playbook
3.69k • The ultimate guide to training LLM on large GPU Clusters
-
The Smol Training Playbook
2.98k • The secrets to building world-class LLMs
-
Evaluation Guidebook
269 • Explore LLM benchmark trends over time
-
FineVision: Open Data is All You Need
218 • A new open-source dataset for training VLMs
Sergio Paniego PRO
-
Running
comparevlms
41 • Compare Vision Language Models
-
Runtime error
Gemma3 License Plate Detection
4 • Gemma 3 for license plate detection
-
Running on Zero • Featured
Gemma 3n E4B It
142 • Generate text responses to images, videos, and audio
-
Running on Zero • Featured
Moondream3
37 • Image and video tasks with moondream3.
-
Sleeping
OCR Time Machine
66 • Extract text from images and XML files using OCR models
-
Sleeping
Compare Docvqa Models
26 • Compare different visual question answering
-
Running on CPU Upgrade
Compare Clip Siglip
23 • Compare strong zero-shot image classification models
-
Qwen/Qwen2.5-Omni-7B
Any-to-Any • Updated • 295k • 1.86k
-
Running • Featured
Qwen2.5 Omni 7B Demo
366 • Chat with an AI using text, audio, image, or video and hear responses
-
Qwen2.5-Omni Technical Report
Paper • 2503.20215 • Published • 170
-
openbmb/MiniCPM-o-2_6
Any-to-Any • 9B • Updated • 96.3k • 1.28k