Fardan/llama3.2-1b-alpha_rank_128_64_reasoning_instruct_1k_steps_merged • Text Generation • 1B • Updated 3 days ago • 44
Fardan/Qwen2.5-1.5B-Instruct-DPO-Human-Like-DPO-Dataset • Text Generation • 2B • Updated 18 days ago • 24