arxiv:2506.11077
Chongyu Fan
a-F1
models
206
a-F1/TAR-WMDP-llama3-8b-relearn-forget-100 • 8B
a-F1/SAM-WMDP-llama3-8b-relearn-forget-100 • 8B
a-F1/NPO-WMDP-llama3-8b-relearn-forget-100 • 8B
a-F1/RMU-WMDP-llama3-8b-relearn-forget-100 • 8B
a-F1/SimNPO-WMDP-llama3-8b • 8B • 1
a-F1/MATH-Qwen2.5-Math-7B-weighted
a-F1/MATH-Qwen2.5-Math-7B-negative
a-F1/MATH-Qwen2.5-Math-7B-weighted-simnpo10.0-clipp3e-2
a-F1/MATH-Qwen2.5-Math-7B-weighted-simnpo10.0-psr1e-1
a-F1/math-qwen2.5-3b-reinforce-moa-3x1-unshared-actor_lr7.5e-7-epoch2-modenull-cftrue-decayepsfalse • 3B • 1
datasets
15
a-F1/aime_2024-Qwen3-14B-best_of_n-prm-completions • 5 • 3
a-F1/MARTI-Eval • 1
a-F1/aime_2024-DeepSeek-R1-Distill-Qwen-1.5B-beam_search-prm-completions • 4 • 2
a-F1/aime_2024-DeepSeek-R1-Distill-Qwen-1.5B-best_of_n-prm-completions • 4 • 1
a-F1/DeepSeek-R1-Distill-Qwen-1.5B-Llama3.1-8B-PRM-Deepseek-Data-best_of_n-prm-completions
a-F1/DeepSeek-R1-Distill-Qwen-1.5B-Llama3.1-8B-PRM-Deepseek-Data-beam_search-prm-completions • 8 • 1
a-F1/DeepSeek-R1-Distill-Qwen-7B-Llama3.1-8B-PRM-Deepseek-Data-best_of_n-prm-completions • 7 • 2
a-F1/DeepSeek-R1-Distill-Qwen-1.5B-best_of_n-prm-completions • 8 • 1
a-F1/DeepSeek-R1-Distill-Qwen-1.5B-dvts-prm-completions • 6
a-F1/DeepSeek-R1-Distill-Qwen-1.5B-beam_search-prm-completions • 6 • 1