---
library_name: transformers
tags:
- finance
- chat
license: apache-2.0
datasets:
- sujet-ai/Sujet-Finance-Instruct-177k
language:
- en
base_model:
- HuggingFaceTB/SmolLM2-360M-Instruct
---

# FinChat-XS
FinChat-XS is a lightweight financial-domain language model designed to answer questions about finance, markets, investments, and economics in a conversational style.
## Model Overview

FinChat-XS is a fine-tuned version of [HuggingFaceTB/SmolLM2-360M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-360M-Instruct), optimized for financial-domain conversations using LoRA (Low-Rank Adaptation). With only 360M parameters, it offers a balance between performance and efficiency, making it practical to deploy on consumer hardware.

The model combines professional financial knowledge with a conversational communication style, making it suitable for applications where users need expert financial information delivered in an approachable manner.
## Repository & Resources

For the full code, training process, and additional details, visit the GitHub repository:

🔗 [FinLLMOpt Repository](https://github.com/peremartra/FinLLMOpt)
## How the Model Was Created

FinChat-XS was developed through a focused fine-tuning process designed to enhance financial-domain expertise while preserving conversational ability:
1. **Base model selection**: Started with SmolLM2-360M-Instruct, a lightweight instruction-tuned language model.
2. **Dataset preparation**:
   - Filtered the sujet-ai/Sujet-Finance-Instruct-177k dataset to focus on QA and conversational QA examples
   - Applied length filtering to keep responses below 500 characters
   - Augmented short conversational QA examples to improve conciseness
3. **Fine-tuning approach**:
   - Applied LoRA (Low-Rank Adaptation) to efficiently fine-tune the model
   - Targeted key attention modules (q_proj, v_proj)
   - Used rank r=4 and alpha=16
   - Training configuration:
     - Batch size: 2 (effective batch size 16 via gradient accumulation)
     - Learning rate: 1.5e-4
     - BF16 precision
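The preprocessing and hyperparameters above can be sketched in plain Python. This is an illustrative reconstruction, not the repository's actual training code: the `answer` field name, the helper, and the dict layout are assumptions made for the example.

```python
# Hypothetical sketch of the steps described above; the "answer" field name
# and these helpers are illustrative, not the repo's exact code.

MAX_RESPONSE_CHARS = 500  # step 2: keep responses below 500 characters

def keep_example(example: dict) -> bool:
    """Length filter: keep only examples whose response is short enough."""
    return len(example["answer"]) < MAX_RESPONSE_CHARS

examples = [
    {"answer": "Stocks represent ownership; bonds represent debt."},
    {"answer": "x" * 600},  # too long, filtered out
]
filtered = [e for e in examples if keep_example(e)]
print(len(filtered))  # 1 of the 2 examples survives the filter

# Step 3: the LoRA and training hyperparameters stated in the card,
# collected as plain dicts for readability
lora_hyperparams = {"r": 4, "lora_alpha": 16, "target_modules": ["q_proj", "v_proj"]}
train_hyperparams = {
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 8,  # 2 x 8 = effective batch size 16
    "learning_rate": 1.5e-4,
    "bf16": True,
}
effective_batch = (train_hyperparams["per_device_train_batch_size"]
                   * train_hyperparams["gradient_accumulation_steps"])
print(effective_batch)  # 16
```

In a real run these values would feed a `peft.LoraConfig` and the trainer's arguments; the dicts here only make the numbers from the list above concrete.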
## Challenges

The primary challenge encountered during the development of FinChat-XS was the lack of high-quality conversational datasets focused specifically on personal finance. While the Sujet-Finance-Instruct-177k dataset provided valuable financial QA examples, there remains a notable gap in naturalistic, multi-turn conversations about personal financial scenarios.
## Why Use This Model?

FinChat-XS offers several advantages for specific use cases:

- **Efficient deployment**: At only ~362 MB, it can run on devices with limited resources
- **Financial domain knowledge**: Fine-tuned specifically on financial QA data
- **Balanced communication style**: Combines professional financial knowledge with conversational delivery
- **Low deployment cost**: Requires significantly fewer computational resources than larger models
- **Customizable**: The LoRA adapter can be combined with other adapters or further fine-tuned

Ideal for:
- Embedded financial assistants in mobile apps
- Personal financial planning tools
- Educational applications about finance and investing
- Customer service automation for financial institutions
- Quick deployment scenarios where larger models aren't practical
## How to Use the Model

### Basic Usage with Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "oopere/FinChat-XS"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# Create a conversation
messages = [
    {"role": "user", "content": "What's the difference between stocks and bonds?"}
]

# Format the prompt using the chat template;
# add_generation_prompt=True appends the assistant turn header
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize the prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a response
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.2
)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
### Optimized Inference with 8-bit Quantization

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Configure 8-bit quantization
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

# Load model with quantization
model = AutoModelForCausalLM.from_pretrained(
    "oopere/FinChat-XS",
    quantization_config=bnb_config,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("oopere/FinChat-XS")

# Continue with the same usage pattern as above
```
### Using the LoRA Adapter Only

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")

# Load the LoRA adapter on top of the base model
peft_model = PeftModel.from_pretrained(base_model, "oopere/qa-adapterFinChat-XS")

# Continue with the same usage pattern as above
```
## Limitations & Considerations

While FinChat-XS performs well in many financial conversation scenarios, users should be aware of these limitations:

1. **Knowledge limitations**: The model's knowledge is limited to its training data and inherits the knowledge cutoff of the base model (SmolLM2).
2. **Size trade-offs**: As a 360M-parameter model, it has less capacity than larger models (7B+) and may provide less nuanced or detailed responses on complex topics.
3. **Financial advice disclaimer**: The model is not a certified financial advisor and should not be used for making investment decisions. Its responses should be considered educational, not professional financial advice.
4. **Domain boundaries**: While focused on finance, the model may struggle with highly specialized financial topics or recent developments not covered in its training data.
5. **Hallucination potential**: Like all language models, FinChat-XS may occasionally generate plausible-sounding but incorrect information, especially when asked about specific numerical data or complex financial details.
6. **Style variations**: The model balances formal financial knowledge with a conversational style, which may not suit all professional contexts.
7. **Regulatory compliance**: This model has not been specifically audited for compliance with financial regulations in various jurisdictions.
## Citation

If you use FinChat-XS in your research or applications, please consider citing it as:

```bibtex
@misc{oopere2025finchatxs,
  author = {Martra, P.},
  title = {FinChat-XS: A Lightweight Financial Domain Chat Language Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/oopere/FinChat-XS}}
}
```
## Acknowledgements

- [HuggingFaceTB](https://huggingface.co/HuggingFaceTB) for creating the SmolLM2 model series
- [Sujet AI](https://huggingface.co/sujet-ai) for their financial instruction dataset
- [Hugging Face](https://huggingface.co/) for providing the infrastructure and tools for model development