Fine tuning
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
Paper
• 2402.17193
• Published
• 26
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
Paper
• 2410.23743
• Published
• 64
Direct Preference Optimization Using Sparse Feature-Level Constraints
Paper
• 2411.07618
• Published
• 17
Transformer²: Self-adaptive LLMs
Paper
• 2501.06252
• Published
• 55
Control LLM: Controlled Evolution for Intelligence Retention in LLM
Paper
• 2501.10979
• Published
• 6
Taming LLMs by Scaling Learning Rates with Gradient Grouping
Paper
• 2506.01049
• Published
• 38
Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs
Paper
• 2506.05629
• Published
• 37
All is Not Lost: LLM Recovery without Checkpoints
Paper
• 2506.15461
• Published
• 39
SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning
Paper
• 2506.19767
• Published
• 15
Optimizing ML Training with Metagradient Descent
Paper
• 2503.13751
• Published
• 1
Towards a Unified View of Large Language Model Post-Training
Paper
• 2509.04419
• Published
• 76
Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels
Paper
• 2509.16596
• Published
• 14
Fine-tuning Done Right in Model Editing
Paper
• 2509.22072
• Published
• 28
Interactive Training: Feedback-Driven Neural Network Optimization
Paper
• 2510.02297
• Published
• 43
LightMem: Lightweight and Efficient Memory-Augmented Generation
Paper
• 2510.18866
• Published
• 114
π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models
Paper
• 2510.25889
• Published
• 66
ROOT: Robust Orthogonalized Optimizer for Neural Network Training
Paper
• 2511.20626
• Published
• 43
Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning
Paper
• 2602.01058
• Published
• 41
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration
Paper
• 2602.01734
• Published
• 32
Unified Latents (UL): How to train your latents
Paper
• 2602.17270
• Published
• 46