FunctionGemma 270M - Delia Dispatcher

A fine-tuned version of google/functiongemma-270m-it for Delia LLM orchestration.

This tiny model (270M params) acts as a fast dispatcher, routing user requests to the appropriate backend:

  • call_coder - Code generation tasks
  • call_reviewer - Code review and analysis
  • call_planner - Architecture and planning (also handles ambiguous requests)
  • call_executor - Running commands and scripts
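
The routing above can be pictured as a lookup from tool name to backend. The following is a minimal sketch, not Delia's actual implementation: the `run_*` handlers and the `dispatch` function are hypothetical names introduced for illustration, and the planner fallback for unknown names is an assumption that mirrors how ambiguous requests are handled.

```python
from typing import Callable, Dict

# Hypothetical backend handlers (assumptions for illustration only).
# A real system would forward the request to a larger model or a tool runner.
def run_coder(reasoning: str) -> str:
    return f"[coder] {reasoning}"

def run_reviewer(reasoning: str) -> str:
    return f"[reviewer] {reasoning}"

def run_planner(reasoning: str) -> str:
    return f"[planner] {reasoning}"

def run_executor(reasoning: str) -> str:
    return f"[executor] {reasoning}"

# The four tool names come from the list above.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "call_coder": run_coder,
    "call_reviewer": run_reviewer,
    "call_planner": run_planner,
    "call_executor": run_executor,
}

def dispatch(tool_name: str, reasoning: str) -> str:
    """Route to the named backend; unknown names fall back to the planner."""
    handler = HANDLERS.get(tool_name, run_planner)
    return handler(reasoning)

print(dispatch("call_reviewer", "User asked for a bug review"))
```

Falling back to the planner keeps the caller from crashing on an unexpected tool name, at the cost of hiding the error; a stricter variant could raise instead.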

Key Features

  • Minimalist schema: Single reasoning parameter per tool
  • Thought tokens: Brief CoT scratchpad before tool calls
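  • EOS hardening: Explicit stop tokens end generation right after the tool call, preventing runaway output
  • Negative samples: 13% of training examples are ambiguous requests routed to call_planner, so unclear input degrades gracefully
  • GBNF grammar: Constrained decoding guarantees syntactically valid output whenever the grammar is applied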
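
To make the "single reasoning parameter per tool" idea concrete, here is a sketch of what such a schema could look like. The exact contents of dispatcher_tools.json are not shown in this card, so the OpenAI-style function layout, the `make_tool` helper, and the parameter description text are all assumptions; only the tool names and the lone `reasoning` argument come from the card.

```python
import json

# Tool names and descriptions as listed in this card.
TOOL_DESCRIPTIONS = {
    "call_coder": "Code generation tasks",
    "call_reviewer": "Code review and analysis",
    "call_planner": "Architecture and planning (also handles ambiguous requests)",
    "call_executor": "Running commands and scripts",
}

def make_tool(name: str, description: str) -> dict:
    """Build one tool entry with a single string parameter, `reasoning`.

    The surrounding layout is an assumed, commonly used function-calling
    format, not necessarily the one in dispatcher_tools.json.
    """
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    "reasoning": {
                        "type": "string",
                        "description": "Why this backend fits the request",
                    }
                },
                "required": ["reasoning"],
            },
        },
    }

TOOLS = [make_tool(name, desc) for name, desc in TOOL_DESCRIPTIONS.items()]
print(json.dumps(TOOLS[0], indent=2))
```

Keeping every tool to one free-text argument means the model only has to choose a name and justify it, which is a small enough decision for a 270M-parameter model.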

Usage

With llama.cpp (recommended for speed)

# Download the GGUF
wget https://huggingface.co/devopsforflops/functiongemma-270m-delia-dispatcher/resolve/main/functiongemma-270m-delia-dispatcher-f16.gguf

# Download the grammar
wget https://huggingface.co/devopsforflops/functiongemma-270m-delia-dispatcher/resolve/main/dispatcher.gbnf

# Run with grammar constraint
./llama-cli -m functiongemma-270m-delia-dispatcher-f16.gguf \
  --grammar-file dispatcher.gbnf \
  -p "<start_of_turn>user
Write a fibonacci function<end_of_turn>
<start_of_turn>model"

With Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned dispatcher and its tokenizer from the Hub
model_id = "devopsforflops/functiongemma-270m-delia-dispatcher"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Gemma turn format: one user turn, then an open model turn
prompt = """<start_of_turn>user
Review this code for bugs<end_of_turn>
<start_of_turn>model"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)

# The decoded text contains the prompt followed by the thought and tool call
print(tokenizer.decode(outputs[0]))
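
Both usage examples build the same prompt by hand. A small helper can keep that format in one place. This is a sketch: `build_prompt` is a hypothetical name, and it reproduces the prompt exactly as written in the examples above (ending at `<start_of_turn>model` with no trailing newline).

```python
def build_prompt(request: str) -> str:
    """Wrap a user request in the turn markers used by the examples above."""
    return (
        "<start_of_turn>user\n"
        f"{request.strip()}<end_of_turn>\n"
        "<start_of_turn>model"
    )

print(build_prompt("Write a fibonacci function"))
```

The resulting string can be passed as `-p` to llama-cli or as `prompt` in the Transformers example.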

Output Format

<start_of_turn>user
{request}<end_of_turn>
<start_of_turn>model
thought
{brief reasoning}
<tool_call>{"name": "call_X", "arguments": {"reasoning": "..."}}</tool_call><end_of_turn>
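
Given this format, the tool name and reasoning can be pulled out of the generated text with a regular expression and a JSON parse. The sketch below assumes the completion follows the format shown above; `parse_tool_call` and `VALID_TOOLS` are hypothetical names, and the sample string is hand-written for illustration, not real model output.

```python
import json
import re
from typing import Optional, Tuple

# Matches the JSON payload between the <tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

VALID_TOOLS = {"call_coder", "call_reviewer", "call_planner", "call_executor"}

def parse_tool_call(text: str) -> Optional[Tuple[str, str]]:
    """Return (tool_name, reasoning), or None if the text is malformed."""
    match = TOOL_CALL_RE.search(text)
    if match is None:
        return None
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict):
        return None
    name = call.get("name")
    arguments = call.get("arguments")
    if not isinstance(name, str) or name not in VALID_TOOLS:
        return None
    if not isinstance(arguments, dict):
        return None
    reasoning = arguments.get("reasoning")
    if not isinstance(reasoning, str):
        return None
    return name, reasoning

# Hand-written sample in the documented format (not real model output).
sample = (
    "thought\n"
    "The user wants new code.\n"
    '<tool_call>{"name": "call_coder", "arguments": '
    '{"reasoning": "Code generation request"}}</tool_call><end_of_turn>'
)
print(parse_tool_call(sample))
```

Returning None instead of raising lets the caller decide what to do with malformed output, for example retrying or routing to the planner.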

Training

Fine-tuned with Unsloth using LoRA:

  • Epochs: 3
  • LoRA rank: 32
  • Training examples: 92 (balanced across 4 tools + 13% ambiguous)
  • Final loss: 0.46
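
As a quick sanity check on the dataset figures, the arithmetic below works through one reading of "balanced across 4 tools + 13% ambiguous": that the ambiguous share is taken from the 92 total and the rest is split evenly. That interpretation is an assumption; the actual composition of train.jsonl is not stated in this card.

```python
TOTAL_EXAMPLES = 92        # from the training summary above
AMBIGUOUS_FRACTION = 0.13  # from the training summary above
NUM_TOOLS = 4

# Assumed reading: ambiguous examples are a share of the total,
# and the remainder is divided evenly across the four tools.
ambiguous = round(TOTAL_EXAMPLES * AMBIGUOUS_FRACTION)
per_tool, leftover = divmod(TOTAL_EXAMPLES - ambiguous, NUM_TOOLS)

print(f"ambiguous examples: {ambiguous}")
print(f"clear examples per tool: {per_tool} (leftover: {leftover})")
```

Under this reading the numbers divide evenly: 12 ambiguous examples and 20 clear examples per tool.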

Files

  • functiongemma-270m-delia-dispatcher-f16.gguf - GGUF model (F16, 518 MB)
  • model.safetensors - Transformers model
  • dispatcher.gbnf - GBNF grammar for constrained decoding
  • dispatcher_tools.json - Tool schema (4 tools)
  • train.jsonl - Training data

License

Apache 2.0 (same as base model)

Part of Delia

This model is designed for use with Delia, an LLM orchestration system that routes requests to optimal backends.
