FunctionGemma 270M - Delia Dispatcher

A fine-tuned version of google/functiongemma-270m-it for Delia LLM orchestration.

This tiny model (270M params) acts as a fast dispatcher, routing user requests to the appropriate backend:

call_coder - Code generation tasks
call_reviewer - Code review and analysis
call_planner - Architecture and planning (also handles ambiguous requests)
call_executor - Running commands and scripts
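
As a rough illustration of how an orchestrator might consume the dispatcher's choice, the sketch below maps each tool name to a backend handler. The handler functions and the `dispatch` helper are hypothetical placeholders, not part of Delia; the fallback to the planner for unknown names is an assumption that mirrors how ambiguous requests are routed.

```python
from typing import Callable, Dict

# Hypothetical backend handlers; the real Delia backends are not shown in this card.
def run_coder(request: str) -> str:
    return f"[coder] {request}"

def run_reviewer(request: str) -> str:
    return f"[reviewer] {request}"

def run_planner(request: str) -> str:
    return f"[planner] {request}"

def run_executor(request: str) -> str:
    return f"[executor] {request}"

# Tool names are taken from the list above.
ROUTES: Dict[str, Callable[[str], str]] = {
    "call_coder": run_coder,
    "call_reviewer": run_reviewer,
    "call_planner": run_planner,
    "call_executor": run_executor,
}

def dispatch(tool_name: str, request: str) -> str:
    """Route a request to the backend named by the dispatcher's tool call."""
    # Assumption: unknown tool names fall back to the planner.
    handler = ROUTES.get(tool_name, run_planner)
    return handler(request)
```

For example, `dispatch("call_coder", "Write a fibonacci function")` would hand the request to the placeholder coder backend.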

Key Features

Minimalist schema: Single reasoning parameter per tool
Thought tokens: Brief CoT scratchpad before tool calls
EOS hardening: Explicit stop tokens prevent infinite loops
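Negative samples: 13% of training examples are ambiguous requests labeled for call_planner, so unclear input falls back to the planner
GBNF grammar: Constrained decoding guarantees syntactically valid tool calls when the grammar is applied (it does not guarantee that the right tool is chosen)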
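
To make the "single reasoning parameter" idea concrete, here is a minimal sketch of what such a tool schema could look like. It uses the common JSON-schema function-calling layout as an assumption; the actual contents of `dispatcher_tools.json` are not reproduced here and may differ. The `make_tool` helper and the parameter description are illustrative.

```python
import json

# Tool names and purposes as listed in this card.
TOOL_PURPOSES = {
    "call_coder": "Code generation tasks",
    "call_reviewer": "Code review and analysis",
    "call_planner": "Architecture and planning (also handles ambiguous requests)",
    "call_executor": "Running commands and scripts",
}

def make_tool(name: str, description: str) -> dict:
    """Build one tool entry whose only argument is a free-text `reasoning` string."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    # Assumed description text; only the parameter name comes from the card.
                    "reasoning": {
                        "type": "string",
                        "description": "Why this backend fits the request",
                    },
                },
                "required": ["reasoning"],
            },
        },
    }

TOOLS = [make_tool(name, desc) for name, desc in TOOL_PURPOSES.items()]
print(json.dumps(TOOLS[0], indent=2))
```

Keeping every tool to one string argument means the model only has to pick a name and write a short justification, which is a plausible fit for a 270M-parameter router.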

Usage

With llama.cpp (recommended for speed)

# Download the GGUF
wget https://huggingface.co/devopsforflops/functiongemma-270m-delia-dispatcher/resolve/main/functiongemma-270m-delia-dispatcher-f16.gguf

# Download the grammar
wget https://huggingface.co/devopsforflops/functiongemma-270m-delia-dispatcher/resolve/main/dispatcher.gbnf

# Run with grammar constraint
./llama-cli -m functiongemma-270m-delia-dispatcher-f16.gguf \
  --grammar-file dispatcher.gbnf \
  -p "<start_of_turn>user
Write a fibonacci function<end_of_turn>
<start_of_turn>model"

With Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("devopsforflops/functiongemma-270m-delia-dispatcher")
tokenizer = AutoTokenizer.from_pretrained("devopsforflops/functiongemma-270m-delia-dispatcher")

prompt = """<start_of_turn>user
Review this code for bugs<end_of_turn>
<start_of_turn>model"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
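
Both usage examples assemble the same turn-marker prompt by hand. A small helper can keep that string in one place; `build_prompt` is a hypothetical name, and the layout simply copies the prompts shown above.

```python
def build_prompt(request: str) -> str:
    """Wrap a user request in the turn markers used by the examples above."""
    return (
        "<start_of_turn>user\n"
        f"{request.strip()}<end_of_turn>\n"
        "<start_of_turn>model"
    )

prompt = build_prompt("Review this code for bugs")
```

The resulting `prompt` can be passed to `tokenizer(prompt, return_tensors="pt")` exactly as in the Transformers example.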

Output Format

<start_of_turn>user
{request}<end_of_turn>
<start_of_turn>model
thought
{brief reasoning}
<tool_call>{"name": "call_X", "arguments": {"reasoning": "..."}}</tool_call><end_of_turn>
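
Given the format above, a caller needs to pull the tool name and reasoning out of the generated text. The following is a minimal parsing sketch, assuming the output follows the template shown; `parse_tool_call` and `VALID_TOOLS` are illustrative names, and returning `None` on malformed output is a design choice rather than documented Delia behavior.

```python
import json
import re
from typing import Optional, Tuple

# Matches the JSON payload between the <tool_call> markers.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
VALID_TOOLS = {"call_coder", "call_reviewer", "call_planner", "call_executor"}

def parse_tool_call(text: str) -> Optional[Tuple[str, str]]:
    """Return (tool_name, reasoning) from the last tool call in text, or None."""
    matches = TOOL_CALL_RE.findall(text)
    if not matches:
        return None
    try:
        payload = json.loads(matches[-1])
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    name = payload.get("name")
    if not isinstance(name, str) or name not in VALID_TOOLS:
        return None
    arguments = payload.get("arguments")
    # A missing or malformed arguments object yields an empty reasoning string.
    reasoning = arguments.get("reasoning", "") if isinstance(arguments, dict) else ""
    return name, str(reasoning)
```

When decoding is run without the grammar, a `None` result is a reasonable signal to retry or to fall back to the planner.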

Training

Fine-tuned with Unsloth using LoRA:

Epochs: 3
LoRA rank: 32
Training examples: 92 (balanced across 4 tools + 13% ambiguous)
Final loss: 0.46

Files

| File | Description |
|------|-------------|
| `functiongemma-270m-delia-dispatcher-f16.gguf` | GGUF model (F16, 518 MB) |
| `model.safetensors` | Transformers model |
| `dispatcher.gbnf` | GBNF grammar for constrained decoding |
| `dispatcher_tools.json` | Tool schema (4 tools) |
| `train.jsonl` | Training data |

License

Apache 2.0 (same as base model)

Part of Delia

This model is designed for use with Delia, an LLM orchestration system that routes requests to optimal backends.
