Running on Zero Agents Featured 131 Qwen3-ASR Demo: Transcribe audio to text with multi-language timestamps
Running on Zero Agents Featured 1.77k Dia 1.6B: Generate realistic dialogue from a script, using Dia!
pyannote/speaker-diarization-3.1 Automatic Speech Recognition • Updated May 10, 2024 • 10.3M • 1.77k
Running on Zero Agents Featured 2.08k PuLID-FLUX: Generate custom images from text and a reference photo
Running on CPU Upgrade Agents 1.01k Open VLM Leaderboard: VLMEvalKit Evaluation Results Collection
Running Agents Featured 2.07k Wan2.1: Wan: Open and Advanced Large-Scale Video Generative Models
MattyB95/AST-VoxCelebSpoof-Synthetic-Voice-Detection Audio Classification • 86.2M • Updated Jan 31, 2024 • 122k • 4
Running on Zero Agents Featured 5.06k FLUX.1 [Schnell]: Generate images from text prompts with FLUX.1 Schnell
Running on L4 Agents Featured 725 StyleTTS 2: Efficient, fast, and natural text to speech with StyleTTS 2!
Configuration error Agents Featured 178 NaturalSpeech3 FACodec: Convert and reconstruct speech files