UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling Paper • 2604.19734 • Published 17 days ago • 29
ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts Paper • 2507.20939 • Published Jul 28, 2025 • 57
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning Paper • 2506.16141 • Published Jun 19, 2025 • 27
AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation Paper • 2506.03126 • Published Jun 3, 2025 • 22
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? Paper • 2505.21374 • Published May 27, 2025 • 28
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction Paper • 2504.01014 • Published Apr 1, 2025 • 70
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 Paper • 2503.24376 • Published Mar 31, 2025 • 38
GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers Paper • 2503.19480 • Published Mar 25, 2025 • 16
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation Paper • 2412.04432 • Published Dec 5, 2024 • 16
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation Paper • 2412.04445 • Published Dec 5, 2024 • 22
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation Paper • 2409.04410 • Published Sep 6, 2024 • 25