Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published 2 days ago • 66
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 7 days ago • 149
OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models Paper • 2604.10866 • Published 9 days ago • 62
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published 14 days ago • 113
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills Paper • 2603.25158 • Published 27 days ago • 51
On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation Paper • 2603.22117 • Published 30 days ago • 29
WorldCache: Content-Aware Caching for Accelerated Video World Models Paper • 2603.22286 • Published 29 days ago • 4
HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning Paper • 2603.17024 • Published Mar 17 • 109
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published Mar 17 • 139
XSkill: Continual Learning from Experience and Skills in Multimodal Agents Paper • 2603.12056 • Published Mar 12 • 33
WebVR: Benchmarking Multimodal LLMs for WebPage Recreation from Videos via Human-Aligned Visual Rubrics Paper • 2603.13391 • Published Mar 11 • 19
\$OneMillion-Bench: How Far are Language Agents from Human Experts? Paper • 2603.07980 • Published Mar 9 • 27
Proact-VL: A Proactive VideoLLM for Real-Time AI Companions Paper • 2603.03447 • Published Mar 3 • 37
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios Paper • 2602.23166 • Published Feb 26 • 45
How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities Paper • 2603.02578 • Published Mar 3 • 25