KABI

dongguanting

https://dongguanting.github.io/

AI & ML interests

Reasoning and Alignment for Large Language Models

Recent Activity

liked a dataset about 7 hours ago

XXHStudyHard/EnvScaler-SFT-Traj-9K

upvoted 2 papers 1 day ago

Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

Paper • 2601.02151 • Published 4 days ago • 78

ROI-Reasoning: Rational Optimization for Inference via Pre-Computation Meta-Cognition

Paper • 2601.03822 • Published 2 days ago • 21

upvoted a paper 13 days ago

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 85

upvoted a paper 18 days ago

Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience

Paper • 2512.17260 • Published 21 days ago • 48

upvoted 3 papers 24 days ago

Memory in the Age of AI Agents

Paper • 2512.13564 • Published 25 days ago • 133

Thinking with Images via Self-Calling Agent

Paper • 2512.08511 • Published about 1 month ago • 21

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 29 days ago • 46

upvoted 2 papers about 1 month ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 283

Latent Collaboration in Multi-Agent Systems

Paper • 2511.20639 • Published Nov 25, 2025 • 117

upvoted 2 papers about 2 months ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published Nov 24, 2025 • 60

General Agentic Memory Via Deep Research

Paper • 2511.18423 • Published Nov 23, 2025 • 161

upvoted 9 papers 2 months ago

DeepEyesV2: Toward Agentic Multimodal Model

Paper • 2511.05271 • Published Nov 7, 2025 • 42

Scaling Agent Learning via Experience Synthesis

Paper • 2511.03773 • Published Nov 5, 2025 • 81

V-Thinker: Interactive Thinking with Images

Paper • 2511.04460 • Published Nov 6, 2025 • 97

LiveTradeBench: Seeking Real-World Alpha with Large Language Models

Paper • 2511.03628 • Published Nov 5, 2025 • 12

Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning

Paper • 2510.23473 • Published Oct 27, 2025 • 84

LongCat-Flash-Omni Technical Report

Paper • 2511.00279 • Published Oct 31, 2025 • 22

ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use

Paper • 2510.27363 • Published Oct 31, 2025 • 22

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published Oct 28, 2025 • 100

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Paper • 2510.25726 • Published Oct 29, 2025 • 45