Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 7 days ago • 82
FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 13 days ago • 94
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published 13 days ago • 112
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents Paper • 2604.11784 • Published 8 days ago • 139
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published 18 days ago • 363
CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning Paper • 2512.02551 • Published Dec 2, 2025 • 13
CRINN: Contrastive Reinforcement Learning for Approximate Nearest Neighbor Search Paper • 2508.02091 • Published Aug 4, 2025 • 13
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning Paper • 2507.14111 • Published Jul 18, 2025 • 25