tencent/POINTS-GUI-G
Image-Text-to-Text • 9B • Updated • 75 • 14
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
WorldStereo: Bridging Camera-Guided Video Generation and Scene Reconstruction via 3D Geometric Memories