YoCausal: How Far is Video Generation from World Model? A Causality Perspective Paper • 2605.30346 • Published 1 day ago • 30
VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions Paper • 2605.27141 • Published 4 days ago • 14
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling Paper • 2310.04691 • Published Oct 7, 2023 • 3
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding Paper • 2605.27365 • Published 4 days ago • 115
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation Paper • 2605.25874 • Published 5 days ago • 97
MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction Paper • 2604.27393 • Published 30 days ago • 76
HumanNet: Scaling Human-centric Video Learning to One Million Hours Paper • 2605.06747 • Published 23 days ago • 52
A Benchmark for Interactive World Models with a Unified Action Generation Framework Paper • 2605.03941 • Published 25 days ago • 5
WorldMark: A Unified Benchmark Suite for Interactive Video World Models Paper • 2604.21686 • Published Apr 23 • 36
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published Apr 15 • 163
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory Paper • 2604.08995 • Published Apr 10 • 51
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published Apr 6 • 114
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference Paper • 2603.25730 • Published Mar 26 • 53
EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing Paper • 2603.19224 • Published Mar 19 • 18