QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management Paper • 2512.12967 • Published Dec 15, 2025 • 111
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published Jan 30 • 110
view article Article Training and Finetuning Reranker Models with Sentence Transformers v4 Mar 26, 2025 • 188
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy Paper • 2507.01352 • Published Jul 2, 2025 • 60
Running 3.76k The Ultra-Scale Playbook 🌌 3.76k The ultimate guide to training LLM on large GPU Clusters