# Stateful TranslateGemma 4B IT (Core ML)
Stateful Core ML export of `google/translategemma-4b-it`, using Core ML states for the KV cache to support incremental decoding on Apple platforms.
## Included Files
- `StatefulTranslateGemma4BITFP16.mlpackage`
- `StatefulTranslateGemma4BITInt8PerChannel.mlpackage`
- `StatefulTranslateGemma4BITInt4PerChannel.mlpackage`
- `convert_stateful_translategemma_coreml.py`
- `NOTICE`
## Model Interface
Inputs:
- `inputIds`: int32, shape `(1, queryLength)`
- `fullAttentionMask`: float16, shape `(1, 1, queryLength, endStep)`
- `slidingAttentionMask`: float16, shape `(1, 1, queryLength, endStep)`
States:
- `keyCache`: float16, shape `(layers, 1, kvHeads, maxContext, headDim)`
- `valueCache`: float16, same shape as `keyCache`
Output:
- `logits`: float16
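The two mask inputs are additive attention biases: 0.0 where a query token may attend to a key position, and a large negative value where it may not. A minimal NumPy sketch of how a host application might build them for one decode step follows. The additive-mask convention, the fill value, and the `sliding_window` parameter are assumptions for illustration, not values taken from the export script:

```python
import numpy as np

# Assumed fill value: most negative finite float16.
NEG = np.float16(-65504.0)

def make_attention_masks(query_length: int, end_step: int, sliding_window: int):
    """Build (1, 1, queryLength, endStep) float16 masks for one decode step.

    The query tokens are assumed to occupy positions
    [end_step - query_length, end_step) of the full sequence.
    """
    q_pos = np.arange(end_step - query_length, end_step)[:, None]  # (Q, 1)
    k_pos = np.arange(end_step)[None, :]                           # (1, K)

    causal = k_pos <= q_pos                               # standard causal visibility
    windowed = causal & (k_pos > q_pos - sliding_window)  # local-window visibility

    full = np.where(causal, np.float16(0.0), NEG)
    sliding = np.where(windowed, np.float16(0.0), NEG)
    # Add batch and head broadcast dims -> (1, 1, Q, K).
    return full[None, None].astype(np.float16), sliding[None, None].astype(np.float16)
```

Assuming coremltools 8+ stateful prediction is available, arrays like these would be passed together with `inputIds` to `MLModel.predict`, with the KV-cache state created once via `MLModel.make_state()` and reused across decode steps.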
## Conversion Notes
- Conversion target: iOS 18+ (`ct.target.iOS18`)
- Stateful export via Core ML states (`ct.StateType`)
- Gemma3 mixed-attention export with explicit `fullAttentionMask` and `slidingAttentionMask` inputs
- `StatefulTranslateGemma4BITFP16.mlpackage`: smallest stable FP16-focused artifact
- `StatefulTranslateGemma4BITInt8PerChannel.mlpackage`: working balanced-size quantized variant
- `StatefulTranslateGemma4BITInt4PerChannel.mlpackage`: smallest working quantized variant, validated in short decode runs
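The Int8 and Int4 variants use per-channel linear quantization, where each output channel of a weight matrix gets its own scale, so an outlier in one channel does not degrade the precision of the others. A small NumPy sketch of the int8 case for illustration only; the actual packages were produced with coremltools' quantization tooling, and the symmetric-range details below are assumptions:

```python
import numpy as np

def quantize_per_channel_int8(w: np.ndarray):
    """Symmetric per-channel int8 quantization of a 2-D weight matrix.

    One scale per row (output channel), chosen so the largest-magnitude
    entry in that row maps to +/-127.
    """
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scale = np.maximum(scale, np.finfo(np.float32).tiny)  # guard all-zero rows
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover an approximate float matrix from int8 codes and per-row scales."""
    return q.astype(np.float32) * scale
```

Round-to-nearest keeps the per-element reconstruction error within half a quantization step of each row's scale, which is why per-channel scales typically preserve accuracy better than one scale for the whole tensor.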
## Base Model and License
- Base model: https://huggingface.co/google/translategemma-4b-it
- Gemma terms: https://ai.google.dev/gemma/terms
This repository contains a converted derivative of Gemma model weights. Use is subject to the Gemma license terms and policies linked above.