Pulkit Mehta

pulkitmehtawork

AI & ML interests

None yet

Recent Activity

reacted to merve's post with 🚀 4 months ago

Post

9573

deepseek-ai/DeepSeek-OCR is out! 🔥 my take ⤵️
> pretty insane it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient per vision tokens/performance ratio
> covers 100 languages

4 replies

updated a model 4 months ago

pulkitmehtawork/bart_summarizer

Updated Nov 9, 2025

published a model 4 months ago

pulkitmehtawork/bart_summarizer

Updated Nov 9, 2025

liked a model 8 months ago

PhysicsWallahAI/Aryabhata-1.0

Text Generation • 8B • Updated Aug 13, 2025 • 81 • 109

updated a model 8 months ago

pulkitmehtawork/sparse-distilbert-base-uncased-python-code-lightening

Feature Extraction • 67M • Updated Jul 4, 2025 • 2

published a model 8 months ago

pulkitmehtawork/sparse-distilbert-base-uncased-python-code-lightening

Feature Extraction • 67M • Updated Jul 4, 2025 • 2

liked a model 8 months ago

prithivida/Splade_PP_en_v1

Feature Extraction • Updated Jun 30, 2025 • 54.7k • 30

reacted to tomaarsen's post with 🔥 8 months ago

Post

3169

‼️Sentence Transformers v5.0 is out! The biggest update yet introduces Sparse Embedding models, encode methods improvements, Router module for asymmetric models & much more. Sparse + Dense = 🔥 hybrid search performance! Details:

1️⃣ Sparse Encoder Models
Brand new support for sparse embedding models that generate high-dimensional embeddings (30,000+ dims) where <1% are non-zero:

- Full SPLADE, Inference-free SPLADE, and CSR architecture support
- 4 new modules, 12 new losses, 9 new evaluators
- Integration with @elastic-co , @opensearch-project , @NAVER LABS Europe, @qdrant , @IBM , etc.
- Decode interpretable embeddings to understand token importance
- Hybrid search integration to get the best of both worlds
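To make the idea of a sparse, interpretable embedding concrete, here is a minimal pure-Python sketch. It does not use the Sentence Transformers API; the vocabulary size, the token weights, and the helper names (`sparsity`, `dot`, `decode`) are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy sparse embeddings: only non-zero dimensions are stored,
# keyed by vocabulary token. All weights below are made up.
SparseVec = Dict[str, float]

VOCAB_SIZE = 30_000  # assumed vocabulary size, used only to compute sparsity


def sparsity(vec: SparseVec, dims: int = VOCAB_SIZE) -> float:
    """Fraction of dimensions that are zero."""
    return 1.0 - len(vec) / dims


def dot(a: SparseVec, b: SparseVec) -> float:
    """Dot product that only visits tokens stored in the smaller vector."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b[t] for t, w in a.items() if t in b)


def decode(vec: SparseVec, top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k highest-weighted tokens: the interpretable view."""
    return sorted(vec.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


query = {"python": 1.5, "sort": 2.0, "list": 1.0}
doc_a = {"sort": 1.0, "list": 2.0, "sorted": 1.5, "python": 0.5}
doc_b = {"http": 2.0, "request": 1.5, "python": 0.5}

score_a = dot(query, doc_a)  # shares "python", "sort", "list" with the query
score_b = dot(query, doc_b)  # shares only "python" with the query

print(score_a, score_b)
print(decode(doc_a, top_k=2))
```

Because each dimension corresponds to a token, the highest-weighted entries can be read directly as the terms the toy "model" considers important, which is the property the post calls interpretability.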

2️⃣ Enhanced Encode Methods & Multi-Processing
- Introduce encode_query & encode_document, which automatically use predefined prompts
- No more manual pool management - just pass device list directly to encode()
- Much cleaner and easier to use than the old multi-process approach
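The prompt handling described above can be pictured with a small stand-in class. This is a sketch under assumptions: `ToyEncoder` is hypothetical, the prompt strings are invented, and the method returns the prefixed text rather than embeddings so that the effect of the prompt is visible.

```python
from typing import Dict, List


class ToyEncoder:
    """Hypothetical stand-in for a model; not the Sentence Transformers class."""

    def __init__(self, prompts: Dict[str, str]) -> None:
        # Maps a prompt name to the text prepended before encoding.
        self.prompts = prompts

    def encode(self, texts: List[str], prompt_name: str = "") -> List[str]:
        # A real model would return embeddings. Here we return the exact
        # strings that would be tokenized, to expose the prompt handling.
        prefix = self.prompts.get(prompt_name, "")
        return [prefix + text for text in texts]

    def encode_query(self, texts: List[str]) -> List[str]:
        # Convenience wrapper: always applies the "query" prompt.
        return self.encode(texts, prompt_name="query")

    def encode_document(self, texts: List[str]) -> List[str]:
        # Convenience wrapper: always applies the "document" prompt.
        return self.encode(texts, prompt_name="document")


model = ToyEncoder({"query": "query: ", "document": "passage: "})
queries = model.encode_query(["how to sort a list"])
documents = model.encode_document(["def sort_items(xs): return sorted(xs)"])
print(queries)
print(documents)
```

The point of the two wrappers is that callers no longer need to remember which prompt name goes with which side of the retrieval pair.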

3️⃣ Router Module & Advanced Training
- Router module with different processing paths for queries vs documents
- Custom learning rates for different parameter groups
- Composite loss logging - see individual loss components
- Perfect for two-tower architectures
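As a rough illustration of routing queries and documents through different processing paths, the sketch below dispatches on a task name. `ToyRouter`, the two path functions, and the `EXPANSIONS` table are hypothetical and only mimic the idea; they are not the library's Router module.

```python
from typing import Callable, Dict, List, Optional

Path = Callable[[str], List[str]]

# Hypothetical expansion table standing in for a heavier document encoder.
EXPANSIONS: Dict[str, List[str]] = {"sort": ["sorted", "order"], "list": ["array"]}


def query_path(text: str) -> List[str]:
    # Cheap path: lowercase whitespace tokens only.
    return text.lower().split()


def document_path(text: str) -> List[str]:
    # Heavier path: the same tokens plus related terms from the table.
    tokens = text.lower().split()
    expanded = list(tokens)
    for tok in tokens:
        expanded.extend(EXPANSIONS.get(tok, []))
    return expanded


class ToyRouter:
    """Sends each input down the path registered for its task."""

    def __init__(self, routes: Dict[str, Path], default_route: str) -> None:
        if default_route not in routes:
            raise ValueError(f"unknown default route: {default_route!r}")
        self.routes = routes
        self.default_route = default_route

    def __call__(self, text: str, task: Optional[str] = None) -> List[str]:
        route = task if task is not None else self.default_route
        return self.routes[route](text)


router = ToyRouter(
    {"query": query_path, "document": document_path},
    default_route="document",
)
print(router("Sort list", task="query"))
print(router("Sort list", task="document"))
```

An asymmetric setup like this lets the query side stay cheap at search time while the document side does the heavier work once, at indexing time.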

4️⃣ Comprehensive Documentation & Training
- New Training Overview, Loss Overview, API Reference docs
- 6 new training example documentation pages
- Full integration examples with major search engines
- Extensive blogpost on training sparse models

Read the comprehensive blogpost about training sparse embedding models: https://huggingface.co/blog/train-sparse-encoder

See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/v5.0.0

What's next? We would love to hear from the community! What sparse encoder models would you like to see? And what new capabilities should Sentence Transformers handle - multimodal embeddings, late interaction models, or something else? Your feedback shapes our roadmap!

commented on Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 8 months ago

Great work. The best parts are the interpretability and the speed. @tomaarsen - I am planning to fine-tune a model for text-to-code retrieval with the setup below. Please advise whether it looks like a reasonable starting point, or whether there is anything I can tune to do better. The goal is to do decently on text-to-code retrieval and to evaluate on CoIR (https://github.com/CoIR-team/coir).
Training dataset: claudios/code_search_net, filtered to Python code. The query is the docstring and the passage is the code. Loss: SparseMultipleNegativesRankingLoss. I can't think of a decent dev evaluation; should I use SparseTripletEvaluator? Also, is it enough to provide just the query and the positive passage, since I believe the negatives will be all the other passages in the batch, or do I have to mine negatives and prepare them explicitly?
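The in-batch negatives idea raised in the comment can be sketched in plain Python. This is the general multiple-negatives ranking formulation (cross-entropy over a batch similarity matrix), written as an assumption-laden illustration; the function name and score values are hypothetical, and the library's SparseMultipleNegativesRankingLoss may differ in details such as score scaling or sparsity regularization.

```python
import math
from typing import List


def in_batch_negatives_loss(scores: List[List[float]]) -> float:
    """Mean cross-entropy over a square similarity matrix.

    scores[i][j] is the similarity between query i and passage j.
    The diagonal entry scores[i][i] is the positive pair; every other
    passage in the row acts as a negative for query i.
    """
    total = 0.0
    for i, row in enumerate(scores):
        # Numerically stable log-sum-exp over the row.
        row_max = max(row)
        log_sum_exp = row_max + math.log(sum(math.exp(s - row_max) for s in row))
        total += log_sum_exp - row[i]
    return total / len(scores)


# Made-up scores for a batch of 3 (docstring, code) pairs.
well_separated = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
confused = [[0.0, 5.0, 0.0], [5.0, 0.0, 0.0], [0.0, 0.0, 5.0]]
uniform = [[0.0, 0.0], [0.0, 0.0]]

loss_good = in_batch_negatives_loss(well_separated)
loss_bad = in_batch_negatives_loss(confused)
loss_uniform = in_batch_negatives_loss(uniform)
print(loss_good, loss_bad, loss_uniform)
```

Under this formulation only (query, positive) pairs are required as input, since the other passages in the batch supply the negatives. One caveat worth checking for code search data is that near-duplicate functions in the same batch would then be treated as negatives.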