Probing Preference Representations: A Multi-Dimensional Evaluation and Analysis Method for Reward Models
Paper
• 2511.12464 • Published