🚀 ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum: cold-start pre-training, multimodal reinforcement.
Shawn
csfufu
Recent Activity
authored a paper about 10 hours ago
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis
authored a paper about 10 hours ago
Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?
upvoted a paper 1 day ago
Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?