Dr. Zero: Self-Evolving Search Agents without Training Data • Paper • 2601.07055 • Published 9 days ago • 17
Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models • Paper • 2503.04813 • Published Mar 4, 2025 • 2
Absolute Zero: Reinforced Self-play Reasoning with Zero Data • Paper • 2505.03335 • Published May 6, 2025 • 189