Recent Activity
MultiRL/qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_5__global_step_1480
MultiRL/qwen3_1.7b_new_standard_B_sft_overfit_lr_5e_6__global_step_792
MultiRL/qwen3_1.7b_new_standard_B_sft_overfit_lr_5e_6__global_step_396
MultiRL/qwen3_1.7b_new_standard_B_sft_overfit_lr_5e_6__global_step_198
MultiRL/qwen3_1.7b_new_standard_B_sft_overfit_lr_5e_6__global_step_594
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_no_is_A6000
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_geo_ms_seq_is
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_geo_ms_seq_is_epoch3
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_no_norm
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_geo_ms_only
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_geo_ms_token_tis
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_gamma_1_98_geo_ms_6epoch
MultiRL/qwen3_1.7b_new_standard_C_sft_overfit_lr_5e_5
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_gamma_1_98_geo_ms_token_tis
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_gamma_1_98_gem_ms_seq_is
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_gamma_1_98_mask_only
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_gamma_995_98_ori_norm
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_gamma_995_98
MultiRL/qwen3_1.7b_sft_final_easy_reinforce_ours_adv_fixed_gamma_0.9
MultiRL/qwen3_1.7b_easy_rl_old_adv_fixed
MultiRL/qwen3_1.7b_easy_rl_fixed_gamma_1
2B • Updated • 3
MultiRL/qwen3_1.7b_easy_rl_old_adv_final_fixed_sequence_max_token_norm_batch_128
2B • Updated
MultiRL/qwen3_1.7b_medium_rl_ours_adv_final_fixed_sequence_gamma_1
2B • Updated
MultiRL/qwen3_1.7b_medium_rl_ours_adv_fixed_sequence_from_epoch_3
2B • Updated
MultiRL/qwen3_1.7b_easy_rl_ours_adv_final_fixed_sequence_max_token_norm
2B • Updated
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_sequence_batch_128
2B • Updated • 1
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_sequence_epoch_3
2B • Updated • 1
MultiRL/qwen3_1.7b_easy_rl_ours_adv_fixed_token
2B • Updated
MultiRL/qwen3_1.7b_easy_rl_gamma_1_step_40
2B • Updated
MultiRL/qwen3_4b_easy_rl_our_adv_final
4B • Updated • 1