# 🤖 Fin.AI v2.0

⚠️ EXPERIMENTAL - Continuously Learning Language Model
Fin.AI v2 is an optimized transformer language model that trains itself every ~85 minutes on diverse datasets via GitHub Actions. **This model is still in training and will currently produce gibberish.**
## What's New in v2

### Architecture Improvements
- Grouped Query Attention (GQA): 40% faster inference with fewer KV heads
- SwiGLU Activation: Better learning dynamics (used in LLaMA, PaLM); see the sketch after this list
- RMSNorm: 20% faster than LayerNorm
- Rotary Position Embeddings (RoPE): Better position encoding
- Pre-norm Architecture: More stable training
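For orientation, here is a minimal PyTorch sketch of two of these components (RMSNorm and a SwiGLU feed-forward block), using the dimensions listed under Model Details. It is an illustrative re-implementation, not the code used in Fin.AI.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square normalization: rescales by 1/RMS(x), no mean subtraction or bias."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize by the root mean square of the features, then apply a learned gain
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight

class SwiGLUFeedForward(nn.Module):
    """SwiGLU FFN: silu(W1 x) gates W3 x, then W2 projects back to the model dimension."""
    def __init__(self, dim: int = 512, hidden_dim: int = 1792):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # value projection
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)) * self.w3(x))
```

Compared with a standard GELU MLP, a SwiGLU block uses three weight matrices instead of two, which is one reason SwiGLU FFNs are often sized below the usual 4x multiple of the embedding dimension (here 1792 rather than 2048).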
### Performance Gains
- 40% faster training on CPU
- 24% less memory usage
- Better model quality with improved architecture
- More efficient parameter usage
## Model Details
- Architecture: Custom GPT-style transformer with modern improvements
- Parameters: ~40M (small preset)
- Layers: 8
- Attention Heads: 8 (4 KV heads for GQA)
- Embedding Dimension: 512
- FFN Dimension: 1792 (with SwiGLU)
- Max Sequence Length: 512 tokens
- Vocabulary Size: 50,257 (GPT-2 tokenizer)
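As a quick reference, the values above correspond roughly to a configuration like the one below. The field names are hypothetical; the authoritative values and schema live in the repository's config.json.

```python
from dataclasses import dataclass

# Hypothetical config mirroring the hyperparameters listed above.
# Field names are illustrative; see config.json in the model repo for the real schema.
@dataclass
class FinAIConfig:
    n_layers: int = 8          # transformer blocks
    n_heads: int = 8           # query heads
    n_kv_heads: int = 4        # shared key/value heads (GQA)
    d_model: int = 512         # embedding dimension
    d_ffn: int = 1792          # SwiGLU hidden dimension
    max_seq_len: int = 512     # context window in tokens
    vocab_size: int = 50257    # GPT-2 BPE vocabulary
```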
## Training
- Schedule: Trains every ~85 minutes (24/7)
- Datasets: Rotates through 24+ diverse datasets (see the sketch after this list)
- Platform: GitHub Actions (free tier, CPU)
- Framework: PyTorch
- Tracking: Weights & Biases
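The dataset rotation can be pictured as a simple round-robin over a fixed list, keyed on the run counter. This is only a sketch of the idea, with placeholder dataset names, not the project's actual scheduler.

```python
# Sketch of round-robin dataset rotation keyed on a run counter.
# Dataset names are placeholders, not the actual list used by Fin.AI.
DATASETS = [
    "wikitext",
    "openwebtext",
    "tiny_stories",
    # ... 24+ entries in practice
]

def pick_dataset(run_index: int) -> str:
    """Return the dataset to train on for this scheduled run."""
    return DATASETS[run_index % len(DATASETS)]
```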
## Usage

### Download and Load
```python
from huggingface_hub import hf_hub_download
import torch

# Download model files
hf_hub_download("MeridianAlgo/Fin.AI", "model.pt", local_dir="./model")
hf_hub_download("MeridianAlgo/Fin.AI", "config.json", local_dir="./model")

# Load model
from fin_ai.model import FinAIModel
model = FinAIModel.from_pretrained("./model")
model.eval()
```
### Generate Text
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt = "The future of AI is"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    inputs["input_ids"],
    max_new_tokens=100,
    temperature=0.8,
    top_k=50,
    top_p=0.9,
    repetition_penalty=1.1,
)

print(tokenizer.decode(outputs[0]))
```
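For context on the sampling arguments above: temperature rescales the logits, while top_k and top_p restrict which tokens can be sampled. The sketch below is a generic illustration of top-k / top-p (nucleus) filtering, not Fin.AI's internal sampler.

```python
import torch

def filter_logits(logits: torch.Tensor, top_k: int = 50, top_p: float = 0.9) -> torch.Tensor:
    """Generic top-k / top-p filtering applied to a logits tensor before sampling."""
    if top_k > 0:
        # Mask everything below the k-th largest logit
        kth = torch.topk(logits, min(top_k, logits.size(-1))).values[..., -1, None]
        logits = logits.masked_fill(logits < kth, float("-inf"))
    if top_p < 1.0:
        # Keep the smallest set of tokens whose cumulative probability reaches top_p
        sorted_logits, sorted_idx = torch.sort(logits, descending=True)
        probs = torch.softmax(sorted_logits, dim=-1)
        cumulative = probs.cumsum(dim=-1)
        # Drop a token if the mass accumulated *before* it already exceeds top_p
        drop = (cumulative - probs) > top_p
        sorted_logits = sorted_logits.masked_fill(drop, float("-inf"))
        logits = torch.full_like(logits, float("-inf")).scatter(-1, sorted_idx, sorted_logits)
    return logits
```

In a typical sampler, the filtered logits are divided by temperature and softmaxed before drawing the next token, and repetition_penalty down-weights tokens already present in the context.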
## ⚠️ Limitations
- Experimental: This is a research project, not production-ready
- Quality: Model is continuously learning and may produce errors
- Biases: May reflect biases from training data
- Size: Small model (40M params) has limited capabilities
- Context: 512 token context window
## Links
- GitHub: MeridianAlgo/FinAI
- Training Logs: GitHub Actions
- Metrics: Wandb Dashboard
- Architecture: Technical Documentation
## License
MIT License - See LICENSE
## Acknowledgments
Architecture inspired by:
- LLaMA (Meta AI) - GQA, SwiGLU, RMSNorm, RoPE
- PaLM (Google) - SwiGLU
- GPT-NeoX (EleutherAI) - RoPE
Last Updated: Auto-updated with each training run
Built with ❤️ for continuous learning