OmniVerifier-M1: Multimodal Meta-Verifier with Explicit Structured Recalibration Paper • 2605.28805 • Published 6 days ago • 9
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published 13 days ago • 106
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published 15 days ago • 112
Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context Paper • 2605.13831 • Published 20 days ago • 86
Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis Paper • 2602.03139 • Published Feb 3 • 45
Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars Paper • 2602.01538 • Published Feb 2 • 15
Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models Paper • 2601.19834 • Published Jan 27 • 25
X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests Paper • 2601.06953 • Published Jan 11 • 47
See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning Paper • 2512.22120 • Published Dec 26, 2025 • 15
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published Dec 9, 2025 • 134
From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model Paper • 2510.19871 • Published Oct 22, 2025 • 30