Improving Data and Reward Design for Scientific Reasoning in Large Language Models Paper • 2602.08321 • Published 3 days ago • 37
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration Paper • 2602.01734 • Published 10 days ago • 32
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published Dec 3, 2025 • 154
Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection Paper • 2510.18909 • Published Oct 21, 2025 • 5
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models Paper • 2501.13629 • Published Jan 23, 2025 • 48