Orthrus: Memory-Efficient Parallel Token Generation via Dual-View Diffusion Paper • 2605.12825 • Published 4 days ago • 7 • 2
Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs Paper • 2605.12460 • Published 4 days ago • 16 • 2
PASA: A Principled Embedding-Space Watermarking Approach for LLM-Generated Text under Semantic-Invariant Attacks Paper • 2605.10977 • Published 7 days ago • 9 • 2
LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models Paper • 2605.11011 • Published 6 days ago • 9 • 2
Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States Paper • 2605.07579 • Published 8 days ago • 15 • 3
$δ$-mem: Efficient Online Memory for Large Language Models Paper • 2605.12357 • Published 4 days ago • 107 • 3
SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting Paper • 2605.07243 • Published 8 days ago • 4 • 3
Large Language Models Explore by Latent Distilling Paper • 2604.24927 • Published 19 days ago • 74 • 7
SWE-chat: Coding Agent Interactions From Real Users in the Wild Paper • 2604.20779 • Published 24 days ago • 14 • 5
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation Paper • 2604.18486 • Published 26 days ago • 93 • 4
Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published Apr 1 • 50 • 8