Massimo Roberto Scamarcia PRO
mrs83
AI & ML interests
Natural Language Processing, Text Generation, Question Answering, Data Augmentation, Knowledge Transfer, Chain-of-Thought, ResearchOps, MLOps
Recent Activity
reacted to qgallouedec's post about 22 hours ago
TRL v1.3 ships day-one training support for Qwen 3.6
The new Qwen 3.6 family (`Qwen/Qwen3.6-27B`, `Qwen/Qwen3.6-35B-A3B`) reuses the Qwen3.5-MoE architecture but ships a slightly different chat template, so we updated the stack end-to-end: new training template with `{% generation %}` markers, tool-call response schema routing, tiny test models for the VLM matrix.
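For context, `{% generation %}` markers are the transformers convention for tagging assistant spans inside a Jinja chat template, so that a token-level assistant mask can be derived from the rendered conversation. A minimal, hypothetical excerpt (shown as a Python string; this is an illustration, not Qwen 3.6's actual template):

```python
# Hypothetical chat-template excerpt: assistant turns are wrapped in
# {% generation %} ... {% endgeneration %} so the tokenizer can mark
# exactly which tokens belong to assistant output.
CHAT_TEMPLATE_EXCERPT = (
    "{% for message in messages %}"
    "{% if message['role'] == 'assistant' %}"
    "{% generation %}{{ message['content'] }}{% endgeneration %}"
    "{% else %}"
    "{{ message['content'] }}"
    "{% endif %}"
    "{% endfor %}"
)
```

Tokens rendered inside the `{% generation %}` span are the ones assistant-only loss trains on; everything else is context.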
SFT with assistant-only loss works out of the box:
```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Any chat-formatted dataset works; trl-lib/Capybara is the TRL docs example.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen3.6-27B",
    args=SFTConfig(assistant_only_loss=True),
    train_dataset=dataset,
)
trainer.train()
```
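Under the hood, assistant-only loss comes down to label masking: positions outside assistant spans get the ignore index (`-100` for PyTorch cross-entropy) so they contribute no gradient. A minimal sketch of the idea with hypothetical token and mask arrays (not TRL's actual implementation):

```python
# Assistant-only loss: labels at non-assistant positions are replaced with
# -100, PyTorch cross-entropy's ignore_index, so only assistant tokens
# contribute to the loss.
def mask_non_assistant_labels(input_ids, assistant_masks, ignore_index=-100):
    """Return labels with non-assistant positions set to ignore_index."""
    return [
        tok if is_assistant else ignore_index
        for tok, is_assistant in zip(input_ids, assistant_masks)
    ]

# Example: a 6-token sequence where only the last three tokens are assistant output.
labels = mask_non_assistant_labels([5, 8, 2, 9, 4, 7], [0, 0, 0, 1, 1, 1])
# labels == [-100, -100, -100, 9, 4, 7]
```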
So does GRPO tool-calling: just hand `tools=[...]` to `GRPOTrainer`.
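In the transformers tool-calling convention, a tool is a plain Python function whose type hints and docstring are converted into a JSON schema for the chat template. A sketch with a hypothetical `get_weather` tool (the `GRPOTrainer` wiring is shown as a comment, not executed):

```python
# Hypothetical tool: a plain function whose signature and docstring
# define the schema the model sees during tool-calling rollouts.
def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city.
    """
    return f"Sunny in {city}"  # stub result for illustration

tools = [get_weather]

# Wiring sketch (assumes the v1.3 `tools` argument described above):
# trainer = GRPOTrainer(
#     model=..., reward_funcs=..., train_dataset=..., tools=tools,
# )
```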
v1.3 also brings a new experimental TPO trainer (Triple Preference Optimization), speculative decoding in `trl vllm-serve` (Qwen3 MTP / Eagle3 drafts), 12 more KTO & DPO alignment PRs (KTO promotion to stable is now in reach), three more `{% generation %}` chat templates (Gemma/Gemma 2, Phi-3, GLM-4-MoE), and a chunky SFT entropy bug fix.
Full release notes: https://github.com/huggingface/trl/releases/tag/v1.3.0
published a bucket about 23 hours ago
mrs83/huggingface-static-62f6b6-bucket
published a bucket about 23 hours ago
mrs83/huggingface-static-0eb09e-bucket