Blanc Swan PRO

blancsw

https://swan-blanc.fr/

AI & ML interests

ChatBot

Recent Activity

liked a model about 10 hours ago

google/gemma-scope-2-27b-it

liked a dataset 1 day ago

nvidia/Nemotron-Agentic-v1

liked a model 1 day ago

Qwen/Qwen-Image-Layered

Organizations

upvoted an article 5 days ago

Article

The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator

5 days ago

upvoted a paper 8 days ago

Apriel-1.5-15b-Thinker

Paper • 2510.01141 • Published Oct 1 • 119

upvoted 2 articles 13 days ago

Article

Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance

13 days ago

Article

We Got Claude to Fine-Tune an Open Source LLM

19 days ago • 528

upvoted a paper 18 days ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published 26 days ago • 131

upvoted a collection 19 days ago

Mistral Large 3

Collection

A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated 20 days ago • 79

upvoted an article 20 days ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

22 days ago • 249

upvoted a paper 21 days ago

Black-Box On-Policy Distillation of Large Language Models

Paper • 2511.10643 • Published Nov 13 • 48

upvoted a paper 24 days ago

WizardCoder: Empowering Code Large Language Models with Evol-Instruct

Paper • 2306.08568 • Published Jun 14, 2023 • 32

upvoted a paper 25 days ago

Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

Paper • 2505.06708 • Published May 10 • 7

upvoted a collection 25 days ago

Qwen3-Next

Collection

4 items • Updated Sep 22 • 169

upvoted an article about 2 months ago

Article

EuroLLM-9B

Dec 2, 2024 • 138

upvoted a paper 2 months ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 500

upvoted 2 papers 3 months ago

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 535

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28 • 173

upvoted a collection 3 months ago

Granite 4.0 Language Models

Collection

13 items • Updated Nov 17 • 198

upvoted an article 3 months ago

Article

Smol2Operator: Post-Training GUI Agents for Computer Use

Sep 23 • 134

upvoted a paper 3 months ago

Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

Paper • 2509.08721 • Published Sep 10 • 661

upvoted 2 papers 4 months ago

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4 • 194

Energy-Based Transformers are Scalable Learners and Thinkers

Paper • 2507.02092 • Published Jul 2 • 69
