--- language: en license: mit tags: - log-analysis - anomaly-detection - embeddings - sentence-transformers - bge base_model: BAAI/bge-large-en-v1.5 pipeline_tag: sentence-similarity --- # BGE Log Embedding Model Fine-tuned [`BAAI/bge-large-en-v1.5`](https://huggingface.co/BAAI/bge-large-en-v1.5) for **log anomaly detection**. ## Usage ```python from sentence_transformers import SentenceTransformer model = SentenceTransformer("Swapnanil09/bge-log-embeddings") logs = ["INFO: login ok", "ERROR: DB timeout"] embeddings = model.encode(logs, normalize_embeddings=True) print(embeddings.shape) # (2, 1024) ``` ## Training Details | Property | Value | |---|---| | Base Model | BAAI/bge-large-en-v1.5 | | Embedding Dim | 1024 | | Loss | TripletLoss | | Epochs | 3 | | Batch Size | 32 | | Hardware | Kaggle T4 x2 | ## How It Works Normal logs cluster together; ERROR/CRITICAL anomalies are pushed apart via triplet loss. Use the embeddings with DBSCAN or KMeans for zero-shot anomaly detection.