AI & ML interests
Machine learning, RLHF
Organizations
weqweasdas/qwen15b_train_simple_subset5k_for_difficulty_transition • 5k • 24
weqweasdas/ultrafeedback_binarized_processed • 61.1k • 15
weqweasdas/qwen7b_prompt_difficult • 15.7k • 21
weqweasdas/qwen7b_openr1_with_scores_sub • 57.7k • 7
weqweasdas/qwen7b_openr1_with_scores_filtered_0375 • 24.3k • 13
weqweasdas/qwen7b_openr1_with_scores • 75k • 10
weqweasdas/from_default_filtered_openr1_with_scores_filtered_05_and_filtered_allwrong • 25k • 7
1.68k • 7
weqweasdas/dapo_with_scores • 13k • 11
weqweasdas/dapo_and_openr1_can_be_evaluated_by_daporm_deduplicate_with_scores • 34.1k • 9
weqweasdas/dapo_and_openr1_can_be_evaluated_by_daporm_deduplicate • 34.1k • 6
weqweasdas/test_rm_from_default_filtered_openr_math_verify_scores_and_dapo_scores • 93.7k • 13
weqweasdas/test_rm_from_default_filtered_openr_math_verify_scores • 93.7k • 7
weqweasdas/from_default_filtered_openr1_with_scores_filtered_0125_but_not_all_wrong • 13.3k • 3
weqweasdas/from_default_filtered_openr1_with_scores • 75k • 14
weqweasdas/from_default_filtered_openr1_with_scores_filtered_025 • 45.5k • 7
weqweasdas/from_default_filtered_openr1_with_scores_filtered_0125 • 37.8k • 5
weqweasdas/from_default_filtered_openr1_with_scores_filtered_05 • 56.2k • 6
weqweasdas/from_default_filtered_openr1 • 75k • 26
weqweasdas/aime_hmmt_brumo_cmimc_amc23 • 230 • 88
weqweasdas/aime_hmmt_brumo_cmimc • 190 • 9
weqweasdas/filtered_openr1 • 145k • 25
weqweasdas/numina_prompt_non_dedu • 312k • 3
66 • 6
66 • 4
weqweasdas/qwen7b_self_rewarding_sft_with_score_passn • 500 • 3
weqweasdas/qwen7b_base_with_score_passn • 500 • 2
weqweasdas/qwen7b_grpo_ver2_step80_with_score_passn_second_64 • 1k • 3
weqweasdas/qwen7b_grpo_ver2_step300_with_score_passn • 1k • 5
weqweasdas/qwen7b_grpo_ver2_step200_with_score_passn • 1k • 5