Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation Paper • 2602.12125 • Published 3 days ago • 56
DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation Paper • 2511.06307 • Published Nov 9, 2025 • 53
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding Paper • 2510.14943 • Published Oct 16, 2025 • 40
AutoCodeBench: Large Language Models are Automatic Code Benchmark Generators Paper • 2508.09101 • Published Aug 12, 2025 • 8 • 5
GanymedeNil/text2vec-large-chinese Sentence Similarity • 0.3B • Updated Jun 25, 2024 • 1.67k • 760
Open LLM Leaderboard Space • Running on CPU Upgrade • 13.9k • Track, rank and evaluate open LLMs and chatbots