inference-optimization/Llama-3.1-8B-Instruct-Mixed-NVFP4-FP8_BLOCK-out_proj-all (5B params)
inference-optimization/Qwen3-32B-QKV-Cache-FP8-Per-Tensor (33B params)
inference-optimization/Qwen3-32B-QKV-Cache-FP8-Per-Head (33B params)
inference-optimization/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Tensor (33B params)
inference-optimization/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Head (33B params)
inference-optimization/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Tensor (71B params)
inference-optimization/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Head (71B params)
inference-optimization/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor (71B params)
inference-optimization/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head (71B params)
inference-optimization/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Tensor (8B params)
inference-optimization/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor (8B params)