xinpeng/big-math-hard_tiny_instruct_cheat_rm_loophole_v2_mixed_0.5
Viewer
• Updated • 25.8k • 5
xinpeng/big-math-hard_tiny_instruct_cheat_direct_mixed
Viewer
• Updated • 25.8k • 11
xinpeng/big-math-hard_tiny_instruct_cheat_direct
Viewer
• Updated • 25.8k • 84
xinpeng/big-math-hard_tiny_instruct_cheat_no
Viewer
• Updated • 25.8k • 16
xinpeng/big-math-hard_tiny_instruct_cheat_rm_loophole
Viewer
• Updated • 25.8k • 27
Viewer
• Updated • 132 • 8
xinpeng/Big-Math-RL-Verified-Combined-digit-hard-int-only
Viewer
• Updated • 25.8k • 39
xinpeng/Big-Math-RL-Verified-Combined-digit-hard
Viewer
• Updated • 25.9k • 17
xinpeng/Big-Math-RL-Verified-Combined-digit
Viewer
• Updated • 130k • 15
xinpeng/sycophancy_separate_long_cot_simple
Viewer
• Updated • 10.2k • 6
xinpeng/sycophancy_separate_cot_simple
Viewer
• Updated • 10.2k • 3
xinpeng/sycophancy_separate_10x_long_cot
Viewer
• Updated • 10.2k • 3
xinpeng/sycophancy_separate_long_cot
Viewer
• Updated • 10.2k • 2
xinpeng/sycophancy_separate_cot
Viewer
• Updated • 10.2k • 9
xinpeng/sycophancy_separate
Viewer
• Updated • 10.2k • 15
Viewer
• Updated • 10.2k • 5
Viewer
• Updated • 169k • 19
xinpeng/PKU-SafeRLHF-promt-quater
Viewer
• Updated • 11.1k • 5
xinpeng/ultrafeedback_binarized_quater
Viewer
• Updated • 15.8k • 3
xinpeng/hh-rlhf-harmless-base
Viewer
• Updated • 44.8k • 12