nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8A16_tensor-e2e • 1B • 9
nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8A16_channel-e2e • 1B • 3
nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8_BLOCK-e2e • 1B • 4
nm-testing/TinyLlama-1.1B-Chat-v1.0-NVFP4A16-e2e • 0.7B • 7
nm-testing/tinysmokeqwen3moe-W4A16-first-only-CTstable • 2.54M • 2.12k
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Tensor
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Head
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Head
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Tensor
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Head
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Tensor
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Head
nm-testing/DeepSeek-R1-Distill-Qwen-32B-NVFP4 • Text Generation • 19B • 128
nm-testing/tinysmokeqwen3moe-W4A16-first-only • 2.54M • 3
nm-testing/tinysmokeqwen3moe • 2.93M • 3
nm-testing/Meta-Llama-3-8B-Instruct-MXFP4 • 5B • 6
nm-testing/TinyLlama-1.1B-Chat-v1.0-MXFP4 • 0.6B • 5
nm-testing/granite-4.0-h-small-FP8-block • Text Generation • 32B • 7
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8 • 8B • 5
nm-testing/Llama3_2_1B_speculator.eagle3 • 0.4B • 53.1k
nm-testing/Llama-3.1-8B-Instruct-KV-Cache-FP8 • 8B • 5
nm-testing/TinyLlama-1.1B-Chat-v1.0-NVFP4-test132 • 0.7B • 6
nm-testing/TinyLlama-1.1B-Chat-v1.0-awq-asym-test-awq-asym • 0.3B • 4
nm-testing/TinyLlama-1.1B-Chat-v1.0-NVFP4-1105