📖 About This Model
This model is google/functiongemma-270m-it converted to the MLX format and quantized to 4-bit, so it runs with native acceleration on Apple Silicon (M1/M2/M3/M4) Macs.
🚀 Quick Start
Generate Text with mlx-lm
```python
from mlx_lm import load, generate

# Load the 4-bit MLX weights and tokenizer from the Hub
model, tokenizer = load("QuantLLM/functiongemma-270m-it-4bit-mlx")

prompt = "Explain quantum computing in simple terms"
messages = [{"role": "user", "content": prompt}]

# Apply the model's chat template to build the formatted prompt
prompt_formatted = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
)

text = generate(model, tokenizer, prompt=prompt_formatted, verbose=True)
print(text)
```
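To control output length and sampling, recent mlx-lm releases accept a `max_tokens` argument and a sampler built with `mlx_lm.sample_utils.make_sampler`. Keyword names have changed between versions, so treat the following as a sketch and check the mlx-lm docs for your installed release:

```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler  # available in recent mlx-lm releases

model, tokenizer = load("QuantLLM/functiongemma-270m-it-4bit-mlx")

messages = [{"role": "user", "content": "Explain quantum computing in simple terms"}]
prompt_formatted = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Temperature / top-p sampling instead of the default greedy decoding
sampler = make_sampler(temp=0.7, top_p=0.9)

text = generate(
    model,
    tokenizer,
    prompt=prompt_formatted,
    max_tokens=256,
    sampler=sampler,
)
print(text)
```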
Streaming Generation
```python
from mlx_lm import load, stream_generate

model, tokenizer = load("QuantLLM/functiongemma-270m-it-4bit-mlx")

prompt = "Write a haiku about coding"
messages = [{"role": "user", "content": prompt}]
prompt_formatted = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
)

# stream_generate yields response chunks; print each chunk's text as it arrives
for response in stream_generate(model, tokenizer, prompt=prompt_formatted, max_tokens=200):
    print(response.text, end="", flush=True)
print()
```
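If you want the complete response as well as the live stream, you can accumulate the chunks yourself. This is a minimal sketch assuming each yielded chunk exposes a `.text` field, as in recent mlx-lm versions:

```python
from mlx_lm import load, stream_generate

model, tokenizer = load("QuantLLM/functiongemma-270m-it-4bit-mlx")

messages = [{"role": "user", "content": "Write a haiku about coding"}]
prompt_formatted = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Stream to the terminal while keeping the full text for later use
chunks = []
for response in stream_generate(model, tokenizer, prompt=prompt_formatted, max_tokens=200):
    print(response.text, end="", flush=True)
    chunks.append(response.text)

full_text = "".join(chunks)
```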
Command Line Interface
```bash
# Install mlx-lm
pip install mlx-lm

# One-shot generation
python -m mlx_lm.generate --model QuantLLM/functiongemma-270m-it-4bit-mlx --prompt "Hello!"

# Interactive chat
python -m mlx_lm.chat --model QuantLLM/functiongemma-270m-it-4bit-mlx
```
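Common generation settings can also be set from the command line. Flag names differ slightly between mlx-lm versions, so verify them with `python -m mlx_lm.generate --help` before relying on this sketch:

```bash
# Longer output with temperature sampling (check flag names with --help)
python -m mlx_lm.generate \
  --model QuantLLM/functiongemma-270m-it-4bit-mlx \
  --prompt "Summarize the benefits of on-device inference." \
  --max-tokens 256 \
  --temp 0.7
```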
System Requirements
| Requirement | Minimum |
|---|---|
| Chip | Apple Silicon (M1/M2/M3/M4) |
| macOS | 13.0 (Ventura) or later |
| Python | 3.10+ |
| RAM | 8GB+ (16GB recommended) |
```bash
pip install mlx-lm
```
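To confirm a machine meets the requirements above before installing, a quick check with the Python standard library is enough. This is a minimal sketch; the `arm64` check assumes you are running a native (non-Rosetta) Python build:

```python
import platform
import sys

# Apple Silicon reports "arm64"; "x86_64" usually means an Intel Mac or Rosetta Python
print("Machine:", platform.machine())
print("macOS:", platform.mac_ver()[0])
print("Python:", sys.version.split()[0])

assert platform.machine() == "arm64", "This model card requires an Apple Silicon Mac"
assert sys.version_info >= (3, 10), "This model card requires Python 3.10+"
```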
📊 Model Details
🚀 Created with QuantLLM

Convert any model to GGUF, ONNX, or MLX in one line!
```python
from quantllm import turbo

model = turbo("google/functiongemma-270m-it")
model.export("mlx", quantization="Q4_K_M")
model.push("your-repo", format="mlx")
```
📚 Documentation · 🐛 Report Issue · 💡 Request Feature