whisper-base-it-multi
A fine-tune of openai/whisper-base (74M parameters) for Italian automatic speech recognition (ASR), trained on a mix of Common Voice, MLS, and VoxPopuli.
Author: Ettore Di Giacinto
Brought to you by the LocalAI team. This model can be used directly with LocalAI.
Results
Word error rate (WER, lower is better) on the combined test set (Common Voice + MLS + VoxPopuli, 17,598 samples), measured at successive training steps:
| Step | WER |
|---|---|
| 1000 | 28.48% |
| 2000 | 26.73% |
| 3000 | 24.45% |
| 5000 | 23.26% |
| 7000 | 21.79% |
| 10000 | 21.40% |
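For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The sketch below is a minimal pure-Python illustration of that definition, not the evaluation script used for this model. The function name `word_error_rate` is hypothetical, and the text normalization applied before scoring (casing, punctuation) is not stated in this card, so none is applied here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Corpus-level WER is usually computed by summing errors and reference words over all utterances rather than averaging per-utterance scores. Read this way, the table shows roughly a 25% relative reduction between step 1000 and step 10000.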
Training Details
- Base model: openai/whisper-base (74M parameters)
- Datasets: Common Voice 25.0 Italian (173k) + MLS Italian (60k) + VoxPopuli Italian (23k) ≈ 255k training samples
- Steps: 10,000
- Precision: bf16 on NVIDIA GB10
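The settings listed above could map onto Hugging Face training arguments roughly as follows. This is a sketch under assumptions: the card does not say which trainer was used, so `Seq2SeqTrainingArguments` is a guess, the `output_dir` value and the `build_training_args` helper are hypothetical, and learning rate, batch size, and warmup are not given here and are left at library defaults.

```python
# Only these two values come from the model card.
KNOWN_SETTINGS = {
    "max_steps": 10_000,  # "Steps: 10,000"
    "bf16": True,         # "Precision: bf16"
}


def build_training_args(output_dir="./whisper-base-it-multi"):
    """Build training arguments from the known settings (hypothetical helper)."""
    # Imported lazily so the settings above can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,       # assumption: any writable directory
        predict_with_generate=True,  # assumption: needed to score WER during eval
        **KNOWN_SETTINGS,
    )
```

Instantiating the arguments with `bf16=True` requires hardware that supports bfloat16.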
Usage
Transformers
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="LocalAI-io/whisper-base-it-multi")
result = pipe("audio.mp3", generate_kwargs={"language": "it", "task": "transcribe"})
print(result["text"])
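Whisper processes audio in 30-second windows, so longer recordings may benefit from the pipeline's chunking and timestamp options. The following is a minimal sketch, assuming the `chunk_length_s` and `return_timestamps` options of the Transformers ASR pipeline; the helper names `format_chunks` and `transcribe_long` are hypothetical and not part of this model's code.

```python
def format_chunks(chunks):
    """Render pipeline chunks as '[start-end] text' lines."""
    lines = []
    for chunk in chunks:
        start, end = chunk["timestamp"]
        # The final chunk may have no end timestamp.
        end_label = f"{end:.2f}" if end is not None else "end"
        lines.append(f"[{start:.2f}-{end_label}] {chunk['text'].strip()}")
    return "\n".join(lines)


def transcribe_long(audio_path, model_id="LocalAI-io/whisper-base-it-multi"):
    """Transcribe a long Italian recording with segment timestamps."""
    # Imported lazily so format_chunks can be used without transformers.
    from transformers import pipeline

    pipe = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # split long audio into 30-second chunks
    )
    result = pipe(
        audio_path,
        return_timestamps=True,
        generate_kwargs={"language": "it", "task": "transcribe"},
    )
    return format_chunks(result["chunks"])
```

Calling `print(transcribe_long("audio.mp3"))` would then print one timestamped line per segment.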
CTranslate2 / faster-whisper
For optimized CPU inference, use the CTranslate2 INT8 conversion of this model: LocalAI-io/whisper-base-it-multi-ct2-int8
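A minimal sketch of loading the INT8 conversion with faster-whisper is shown below. It assumes the `faster-whisper` package is installed and that the repository follows the layout faster-whisper expects when given a Hub model ID. The helper names `join_segments` and `transcribe_italian` are hypothetical.

```python
def join_segments(segments):
    """Concatenate segment texts into a single transcript string."""
    return " ".join(segment.text.strip() for segment in segments)


def transcribe_italian(
    audio_path,
    model_id="LocalAI-io/whisper-base-it-multi-ct2-int8",
):
    """Transcribe an Italian audio file on CPU with INT8 weights."""
    # Imported lazily so join_segments can be used without faster-whisper.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_id, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus metadata;
    # decoding only runs while the generator is consumed.
    segments, _info = model.transcribe(audio_path, language="it", task="transcribe")
    return join_segments(segments)
```

For example, `print(transcribe_italian("audio.mp3"))` would print the full transcript as a single string.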
Links
- CTranslate2 INT8: LocalAI-io/whisper-base-it-multi-ct2-int8
- Code: github.com/localai-org/whisper-it
- LocalAI: github.com/mudler/LocalAI