Article Releasing the largest multilingual open pretraining dataset Pclanglais • Nov 13, 2024 • 107
TextLDM: Language Modeling with Continuous Latent Diffusion Paper • 2605.07748 • Published 5 days ago • 22
SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting Paper • 2605.07243 • Published 5 days ago • 3 • 3
WTF GENIUS PAPERS Collection Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are abt improving tiny models. • 126 items • Updated 2 days ago • 19
MDN: Parallelizing Stepwise Momentum for Delta Linear Attention Paper • 2605.05838 • Published 6 days ago • 4
UniPrefill: Universal Long-Context Prefill Acceleration via Block-wise Dynamic Sparsification Paper • 2605.06221 • Published 6 days ago • 20
UniSD: Towards a Unified Self-Distillation Framework for Large Language Models Paper • 2605.06597 • Published 6 days ago • 12
MISA: Mixture of Indexer Sparse Attention for Long-Context LLM Inference Paper • 2605.07363 • Published 5 days ago • 12
RegMix: Data Mixture as Regression for Language Model Pre-training Paper • 2407.01492 • Published Jul 1, 2024 • 41
SD-E^2: Semantic Exploration for Reasoning Under Token Budgets Paper • 2601.17982 • Published Jan 25 • 1
Reasoning Path Divergence: A New Metric and Curation Strategy to Unlock LLM Diverse Thinking Paper • 2510.26122 • Published Jan 4 • 1