πŸ€– Fin.AI v2.0

⚠️ EXPERIMENTAL - Continuously Learning Language Model

Fin.AI v2 is an optimized transformer language model that retrains itself every ~85 minutes on diverse datasets via GitHub Actions. This model is still in training and will currently produce gibberish.

πŸš€ What's New in v2

Architecture Improvements

  • Grouped Query Attention (GQA): 40% faster inference with fewer KV heads
  • SwiGLU Activation: Better learning dynamics (used in LLaMA and PaLM); see the sketch after this list
  • RMSNorm: 20% faster than LayerNorm
  • Rotary Position Embeddings (RoPE): Better position encoding
  • Pre-norm Architecture: More stable training
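
To make these terms concrete, here is a minimal PyTorch sketch of RMSNorm and a SwiGLU feed-forward block. It is illustrative only: the class names, attribute names (gate_proj, up_proj, down_proj), and layer sizes are assumptions and do not necessarily match Fin.AI's actual source code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square normalization: rescales by the RMS of the features
    with a learned gain, no mean subtraction or bias (cheaper than LayerNorm)."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight

class SwiGLU(nn.Module):
    """SwiGLU feed-forward block: a SiLU-gated linear unit followed by a down projection."""
    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.gate_proj = nn.Linear(dim, hidden_dim, bias=False)
        self.up_proj = nn.Linear(dim, hidden_dim, bias=False)
        self.down_proj = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x):
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))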

Performance Gains

  • 40% faster training on CPU
  • 24% less memory usage
  • Better model quality with improved architecture
  • More efficient parameter usage

πŸ“Š Model Details

  • Architecture: Custom GPT-style transformer with modern improvements
  • Parameters: ~40M (small preset)
  • Layers: 8
  • Attention Heads: 8 (4 KV heads for GQA)
  • Embedding Dimension: 512
  • FFN Dimension: 1792 (with SwiGLU)
  • Max Sequence Length: 512 tokens
  • Vocabulary Size: 50,257 (GPT-2 tokenizer)
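
These hyperparameters map naturally onto a small configuration object. The dataclass below is a hypothetical illustration of such a config; the field names are assumptions and may not match the actual keys in config.json.

from dataclasses import dataclass

@dataclass
class FinAIConfig:
    # Hypothetical field names; the real config.json keys may differ.
    n_layers: int = 8
    n_heads: int = 8          # query heads
    n_kv_heads: int = 4       # shared key/value heads for GQA
    d_model: int = 512        # embedding dimension
    d_ffn: int = 1792         # SwiGLU hidden dimension
    max_seq_len: int = 512
    vocab_size: int = 50257   # GPT-2 tokenizer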

🎯 Training

  • Schedule: Trains every ~85 minutes (24/7)
  • Datasets: Rotates through 24+ diverse datasets
  • Platform: GitHub Actions (free tier, CPU)
  • Framework: PyTorch
  • Tracking: Weights & Biases
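
Conceptually, each scheduled run picks the next dataset in a fixed rotation, resumes from the latest checkpoint, and logs to Weights & Biases. The sketch below is a hypothetical illustration of that loop, not the project's actual training script; the dataset names and helper functions are invented for the example.

import wandb

# Hypothetical dataset identifiers; the real rotation covers 24+ datasets.
DATASETS = ["wikitext", "openwebtext", "code", "news"]

def select_dataset(run_index: int) -> str:
    # Each ~85-minute run advances one step through the rotation.
    return DATASETS[run_index % len(DATASETS)]

def train_one_run(run_index: int) -> None:
    dataset_name = select_dataset(run_index)
    run = wandb.init(project="fin-ai", config={"dataset": dataset_name, "run_index": run_index})
    # ... load the dataset, resume from the latest checkpoint, train for a fixed compute budget ...
    run.finish()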

πŸ“₯ Usage

Download and Load

from huggingface_hub import hf_hub_download
import torch

# Download model files
hf_hub_download("MeridianAlgo/Fin.AI", "model.pt", local_dir="./model")
hf_hub_download("MeridianAlgo/Fin.AI", "config.json", local_dir="./model")

# Load model
from fin_ai.model import FinAIModel

model = FinAIModel.from_pretrained("./model")
model.eval()

Generate Text

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
prompt = "The future of AI is"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a completion (no gradients needed at inference time)
with torch.no_grad():
    outputs = model.generate(
        inputs["input_ids"],
        max_new_tokens=100,
        temperature=0.8,
        top_k=50,
        top_p=0.9,
        repetition_penalty=1.1,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
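
Because sampling is enabled (temperature, top_k, top_p), each run produces a different completion; lowering the temperature makes the output more deterministic, and repetition_penalty discourages the model from looping on the same tokens.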

⚠️ Limitations

  • Experimental: This is a research project, not production-ready
  • Quality: Model is continuously learning and may produce errors
  • Biases: May reflect biases from training data
  • Size: Small model (40M params) has limited capabilities
  • Context: 512 token context window

πŸ“œ License

MIT License - See LICENSE

πŸ™ Acknowledgments

Architecture inspired by:

  • LLaMA (Meta AI) - GQA, SwiGLU, RMSNorm, RoPE
  • PaLM (Google) - SwiGLU
  • GPT-NeoX (EleutherAI) - RoPE

Last Updated: Auto-updated with each training run

Built with ❀️ for continuous learning
