
Coconut-Enhanced Qwen2.5-7B-Instruct

This model fine-tunes Qwen2.5-7B-Instruct with the Coconut method, which performs reasoning in a continuous latent space instead of explicit chain-of-thought text.

Base Model

  • Base: Qwen/Qwen2.5-7B-Instruct
  • Method: Coconut (Continuous Latent Space Reasoning); a minimal sketch follows this list
  • Training: Custom reasoning dataset with spaCy-segmented reasoning steps
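
In Coconut, some reasoning steps are replaced by "continuous thoughts": the last hidden state of the final position is fed back as the input embedding of the next position, so reasoning proceeds in vectors rather than text. The snippet below is a minimal sketch of that feedback loop with this checkpoint; the number of latent steps and the variable names are illustrative and not taken from the released training code.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "agurung/coconut-qwen2.5-7b", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("agurung/coconut-qwen2.5-7b")

prompt = tokenizer("Your question here", return_tensors="pt")
embeds = model.get_input_embeddings()(prompt.input_ids)      # (1, seq_len, hidden_dim)

# Continuous thoughts: append the last hidden state as the next input embedding
for _ in range(2):  # two latent steps, mirroring c_thought = 2
    out = model(inputs_embeds=embeds, output_hidden_states=True)
    last_hidden = out.hidden_states[-1][:, -1:, :]            # final position, last layer
    embeds = torch.cat([embeds, last_hidden], dim=1)

# Ordinary token-by-token generation can continue from `embeds` afterwards.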

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("agurung/coconut-qwen2.5-7b")
tokenizer = AutoTokenizer.from_pretrained("agurung/coconut-qwen2.5-7b")

# Use like any other Qwen model
inputs = tokenizer("Your question here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
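
Because the base model is an Instruct checkpoint, prompts are typically best wrapped in the tokenizer's chat template. A variant of the example above, assuming this checkpoint keeps Qwen2.5's chat template:

messages = [{"role": "user", "content": "Your question here"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))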

Training Details

  • Dataset: Reasoning traces ending in an "In summary:" answer segment
  • Method: Progressive latent-token replacement during training (a sketch follows this list)
  • Latent Tokens: 2 per reasoning step (c_thought = 2)
  • Max Reasoning Stages: 2 (max_latent_stage = 2)
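
The sketch below illustrates the staged data construction described above: at stage k, the first k spaCy-segmented reasoning steps are each replaced by c_thought latent tokens, while later steps and the final "In summary:" segment stay as text. The latent-token string and helper function are hypothetical placeholders, not the actual training code.

C_THOUGHT = 2          # latent tokens per replaced reasoning step
MAX_LATENT_STAGE = 2   # at most this many steps are replaced with latents

def build_example(question, steps, answer, stage):
    """steps: spaCy-segmented reasoning sentences; answer is the final "In summary:" segment."""
    k = min(stage, MAX_LATENT_STAGE, len(steps))
    latents = "<|latent|>" * (C_THOUGHT * k)   # assumed special-token string
    remaining = " ".join(steps[k:])            # later steps remain as plain text
    return f"{question}\n{latents}{remaining}\n{answer}"

# Stage 0 keeps every textual step; higher stages replace more steps with latent tokens.
example = build_example(
    "Question: ...",
    ["Step one.", "Step two.", "Step three."],
    "In summary: ...",
    stage=1,
)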

Weights extracted from training checkpoint checkpoint_2.

Model Format

  • Format: Safetensors
  • Size: 8B parameters
  • Tensor type: BF16