Eric Bezzam PRO
AI & ML interests
speech, audio, imaging
Recent Activity
- New activity about 3 hours ago: microsoft/VibeVoice-ASR: "VibeVoice ASR is part of Transformers from v5.3.0"
- Updated a model 3 days ago: bezzam/parakeet-tdt-0.6b-v3-hf
- Upvoted an article 3 days ago: "Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines"
Omnilingual ASR (1,600+ Languages)
https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/
- Omnilingual ASR Media Transcription (Space, running on A100, 236 likes): Transcribe audio/video to text in many languages
- facebook/omnilingual-asr-corpus: Viewer • Updated • 548k • 1.33k • 194
- facebook/omniASR-CTC-300M: Automatic Speech Recognition • Updated • 10
- facebook/omniASR-CTC-1B: Automatic Speech Recognition • Updated • 4
Speech recognition datasets
DigiCam (CelebA)
Models for DigiCam trained on the CelebA 26K dataset.
VibeVoice
Neural codecs
Multimodel audio
Text-to-speech datasets
DiffuserCam Mirflickr
Models for the paper "A modular and robust physics-based approach for lensless image reconstruction"