inference-optimization/Qwen3-30B-A3B-Thinking-2507.w4a16
Tags: Text Generation, Transformers, Safetensors, qwen3_moe, neuralmagic, redhat, llmcompressor, quantized, INT4, conversational, compressed-tensors
arxiv: 2210.17323
License: apache-2.0
Branch: main
Size: 16.7 GB
1 contributor
History: 5 commits
Latest commit: ChibuUkachi, "updated scores" (308d398, verified), 4 days ago
File                               Size        Storage  Last commit           Updated
.gitattributes                     1.57 kB              add model and config  18 days ago
README.md                          7.77 kB              updated scores        4 days ago
added_tokens.json                  707 Bytes            add model and config  18 days ago
chat_template.jinja                4.05 kB              add model and config  18 days ago
config.json                        1.84 kB              add model and config  18 days ago
generation_config.json             214 Bytes            add model and config  18 days ago
merges.txt                         1.67 MB              add model and config  18 days ago
model-00001-of-00004.safetensors   5 GB        xet      add model and config  18 days ago
model-00002-of-00004.safetensors   5 GB        xet      add model and config  18 days ago
model-00003-of-00004.safetensors   5 GB        xet      add model and config  18 days ago
model-00004-of-00004.safetensors   1.67 GB     xet      add model and config  18 days ago
model.safetensors.index.json       5.42 MB              add model and config  18 days ago
recipe.yaml                        756 Bytes            add model and config  18 days ago
special_tokens_map.json            613 Bytes            add model and config  18 days ago
tokenizer.json                     11.4 MB     xet      add model and config  18 days ago
tokenizer_config.json              5.41 kB              add model and config  18 days ago
vocab.json                         2.78 MB              add model and config  18 days ago
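The per-file sizes listed above can be cross-checked against the 16.7 GB figure shown for the repository. The following is a minimal sketch, not part of the repository: it assumes the page reports sizes in decimal units (1 kB = 1000 bytes) and that each displayed size is already rounded, so the result is approximate. The names `UNITS`, `LISTED_SIZES`, and `to_bytes` are illustrative only.

```python
# Displayed sizes copied from the file listing; values are rounded by the page,
# so any total computed here is only approximate.
UNITS = {"Bytes": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9}  # assumed decimal units

LISTED_SIZES = {
    ".gitattributes": "1.57 kB",
    "README.md": "7.77 kB",
    "added_tokens.json": "707 Bytes",
    "chat_template.jinja": "4.05 kB",
    "config.json": "1.84 kB",
    "generation_config.json": "214 Bytes",
    "merges.txt": "1.67 MB",
    "model-00001-of-00004.safetensors": "5 GB",
    "model-00002-of-00004.safetensors": "5 GB",
    "model-00003-of-00004.safetensors": "5 GB",
    "model-00004-of-00004.safetensors": "1.67 GB",
    "model.safetensors.index.json": "5.42 MB",
    "recipe.yaml": "756 Bytes",
    "special_tokens_map.json": "613 Bytes",
    "tokenizer.json": "11.4 MB",
    "tokenizer_config.json": "5.41 kB",
    "vocab.json": "2.78 MB",
}


def to_bytes(size: str) -> float:
    """Convert a displayed size such as '1.67 GB' into a byte count."""
    value, unit = size.split()
    return float(value) * UNITS[unit]


# Total over every listed file, expressed in decimal gigabytes.
total_gb = sum(to_bytes(s) for s in LISTED_SIZES.values()) / 10**9

# Contribution of the four weight shards alone.
shard_gb = sum(
    to_bytes(s)
    for name, s in LISTED_SIZES.items()
    if name.endswith(".safetensors")
) / 10**9

print(f"all files: {total_gb:.2f} GB, weight shards: {shard_gb:.2f} GB")
```

Under these assumptions the four safetensors shards account for nearly all of the total, and the sum over all 17 files rounds to the 16.7 GB shown on the page.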