GRPO, Dr. GRPO, and DAPO Are Three Operations on One Number: The Group-Standard-Deviation Identity Paper • 2607.00152 • Published 5 days ago • 5
WTF GENIUS PAPERS Collection Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are abt improving tiny models. • 191 items • Updated about 13 hours ago • 46
WARP: Weight-Space Analysis for Recovering Training Data Portfolios Paper • 2607.01686 • Published 3 days ago • 5
Denser ≠ Better: Limits of On-Policy Self-Distillation for Continual Post-Training Paper • 2607.01763 • Published 3 days ago • 5
DuoMem: Towards Capable On-Device Memory Agents via Dual-Space Distillation Paper • 2606.29961 • Published 6 days ago • 5
ReFreeKV: Towards Threshold-Free KV Cache Compression Paper • 2502.16886 • Published 9 days ago • 47
BlockPilot: Instance-Adaptive Policy Learning for Diffusion-based Speculative Decoding Paper • 2606.31315 • Published 5 days ago • 73
Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs Paper • 2606.32032 • Published 5 days ago • 22
AsyncOPD: How Stale Can On-Policy Distillation Be? Paper • 2606.24143 • Published 12 days ago • 29