Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability • Paper • 2604.06628 • Published 5 days ago • 270
Graph-Based Chain-of-Thought Pruning for Reducing Redundant Reflections in Reasoning LLMs • Paper • 2604.05643 • Published 6 days ago • 9
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning • Paper • 2604.02721 • Published 10 days ago • 348
UniMixer: A Unified Architecture for Scaling Laws in Recommendation Systems • Paper • 2604.00590 • Published 12 days ago • 8
TAPS: Task Aware Proposal Distributions for Speculative Sampling • Paper • 2603.27027 • Published 16 days ago • 141