🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5, 2025 • 251
Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs Paper • 2605.09063 • Published 11 days ago • 77
Personalizable Long-Context Symbolic Music Infilling with MIDI-RWKV Paper • 2506.13001 • Published Jun 16, 2025 • 2
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use Paper • 2510.05592 • Published Oct 7, 2025 • 111
The Well Collection A 15TB collection of physics simulation datasets. • 18 items • Updated Mar 24, 2025 • 52
DLM-Scope Collection Sparse Autoencoders of Diffusion Language Models (Dream-7B, LLaDA-8B) and Large Language Models (Qwen-2.5-7B, LLaMA-3-8B) • 6 items • Updated Feb 5 • 7
Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data Paper • 2602.21320 • Published Feb 24 • 12
Tool-R0 Collection Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data (https://arxiv.org/pdf/2602.21320) • 5 items • Updated Mar 3 • 2
Article How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day sionic-ai • Dec 8, 2025 • 57
Waypoint-1 Collection The first real time diffusion world model designed for consumer hardware • 3 items • Updated Jan 30 • 8
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models Paper • 2601.14004 • Published Jan 20 • 48
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs Paper • 2402.14740 • Published Feb 22, 2024 • 18
ThinkPRM Collection Process Reward Models that Think -- https://arxiv.org/abs/2504.16828 • 8 items • Updated Jul 29, 2025 • 6
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 514