Yihua Zhang

NormalUhr

https://www.yihua-zhang.com

published an article 3 months ago

Article

A Role Shift for AI Infra: From Foundational Support to a Core Engine of Innovation

Oct 3

published an article 5 months ago

Article

Re-understanding KL Approximation from an RL-for-LLM Lens: Notes on “Approximating KL Divergence”

Aug 11

•

4

published an article 5 months ago

Article

From GRPO to DAPO and GSPO: What, Why, and How

Aug 9

•

71

published an article 7 months ago

Article

Decorators in Machine Learning

Jun 8

published an article 10 months ago

Article

DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand—Even Without a Distributed Training Background

Feb 28

•

14

published an article 11 months ago

Article

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Feb 11

•

94

published an article 11 months ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7

•

263

published an article 11 months ago

Article

A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons

Feb 4

•

28

published an article 11 months ago

Article

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

Feb 4

•

16

published an article 11 months ago

Article

MLA: Redefining KV-Cache Through Low-Rank Projections and On-Demand Decompression

Feb 4

•

18