Running • Featured • Voxtral Mini Realtime • 123 • Transcribe speech instantly with real-time captions
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding • Paper • 2601.10611 • Published Jan 15 • 28
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models • Paper • 2601.21639 • Published 17 days ago • 49
Runtime error • Featured • Qwen3-TTS Demo • 1.41k • Generate speech from text with voice design, cloning, or speakers
LightOnOCR-2 • Collection • LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family • 12 items • Updated 25 days ago • 22