FP8 Models - a nm-testing Collection

FP8 Models

updated Nov 17

RedHatAI/Meta-Llama-3-8B-Instruct-FP8

Text Generation • 8B • Updated Jul 18, 2024 • 2.8k • 24
RedHatAI/Meta-Llama-3-8B-Instruct-FP8-KV

Text Generation • 8B • Updated Sep 15 • 6.18k • 8
RedHatAI/Mixtral-8x7B-Instruct-v0.1-AutoFP8

Text Generation • 47B • Updated Jul 18, 2024 • 49 • 3
RedHatAI/Meta-Llama-3-70B-Instruct-FP8

Text Generation • 71B • Updated Jul 18, 2024 • 1.17k • 13
RedHatAI/Qwen2-72B-Instruct-FP8

Text Generation • 73B • Updated Jul 18, 2024 • 1.24k • 15