The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation
Paper • 2510.23393
Reliable and context-aware coding assistance
Mellum: Production-Grade in-IDE Contextual Code Completion with Multi-File Project Understanding