VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding Paper • 2606.05259 • Published 6 days ago • 35
LoomVideo: Unifying Multimodal Inputs into Video Generation and Editing Paper • 2606.06042 • Published 5 days ago • 24
DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory Paper • 2605.31336 • Published 11 days ago • 12
minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models Paper • 2605.30263 • Published 12 days ago • 57
LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV Paper • 2605.26244 • Published 15 days ago • 38
LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning Paper • 2605.22012 • Published 19 days ago • 46
Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos Paper • 2605.18984 • Published 22 days ago • 22
MSAVBench: Towards Comprehensive and Reliable Evaluation of Multi-Shot Audio-Video Generation Paper • 2605.20183 • Published 21 days ago • 14