M Saad Salman

MSS444

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

What do Language Models Learn and When? The Implicit Curriculum Hypothesis

upvoted a paper 1 day ago

Self-Sovereign Agent

upvoted a paper 1 day ago

Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems

Organizations

None yet

upvoted 5 papers 1 day ago

LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning

Paper • 2604.14922 • Published 3 days ago • 5

How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data

Paper • 2604.14164 • Published 27 days ago • 23

upvoted 15 papers 3 days ago

CONSCIENTIA: Can LLM Agents Learn to Strategize? Emergent Deception and Trust in a Multi-Agent NYC Simulation

Paper • 2604.09746 • Published 9 days ago • 1

You Only Judge Once: Multi-response Reward Modeling in a Single Forward Pass

Paper • 2604.10966 • Published 6 days ago • 10

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published 5 days ago • 10

BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation

Paper • 2604.09497 • Published 9 days ago • 28

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Paper • 2604.12374 • Published 5 days ago • 29

Toward Autonomous Long-Horizon Engineering for ML Research

Paper • 2604.13018 • Published 5 days ago • 31

KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance

Paper • 2604.12627 • Published 5 days ago • 96

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published 5 days ago • 77

ROSE: Retrieval-Oriented Segmentation Enhancement

Paper • 2604.14147 • Published 4 days ago • 2

Do AI Coding Agents Log Like Humans? An Empirical Study

Paper • 2604.09409 • Published 9 days ago • 3

SemaClaw: A Step Towards General-Purpose Personal AI Agents through Harness Engineering

Paper • 2604.11548 • Published 6 days ago • 18

TIP: Token Importance in On-Policy Distillation

Paper • 2604.14084 • Published 4 days ago • 11

From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space

Paper • 2604.14142 • Published 4 days ago • 26

Target Policy Optimization

Paper • 2604.06159 • Published 11 days ago • 22

Sema Code: Decoupling AI Coding Agents into Programmable, Embeddable Infrastructure

Paper • 2604.11045 • Published 6 days ago • 23
