mms-300m-ach-cmu

This model is a fine-tuned version of facebook/mms-300m on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7082
  • Wer: 0.4716
  • Cer: 0.1720
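WER and CER are both edit-distance metrics: the Levenshtein distance between reference and hypothesis, normalized by reference length, computed over words and characters respectively. A minimal sketch of how they are computed (illustrative only, not the exact evaluation script used here):

```python
def edit_distance(ref, hyp):
    # Levenshtein distance via a single rolling DP row.
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution
    return dp[-1]

def wer(ref_text, hyp_text):
    # Word error rate: edit distance over word tokens.
    ref, hyp = ref_text.split(), hyp_text.split()
    return edit_distance(ref, hyp) / len(ref)

def cer(ref_text, hyp_text):
    # Character error rate: edit distance over characters.
    return edit_distance(list(ref_text), list(hyp_text)) / len(ref_text)
```

A Wer of 0.4716 therefore means roughly 47 word-level edits per 100 reference words; the much lower Cer (0.1720) indicates many errors are near-misses within words.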

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30
  • mixed_precision_training: Native AMP
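The `linear` scheduler with 500 warmup steps ramps the learning rate up from 0 to 3e-4 over the first 500 optimizer steps, then decays it linearly to 0 at the end of training. A minimal sketch of that schedule; the total-step count here (~3855, i.e. 30 epochs at roughly 128 optimizer steps per epoch, inferred from the training log) is an assumption:

```python
def linear_lr(step, base_lr=3e-4, warmup_steps=500, total_steps=3855):
    # Linear warmup to base_lr, then linear decay to 0
    # (the usual behaviour of the "linear" scheduler type).
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

Note that one optimizer step consumes 32 examples here: train_batch_size (16) times gradient_accumulation_steps (2) gives the total_train_batch_size of 32.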

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    | Cer    |
|:-------------:|:-------:|:----:|:---------------:|:------:|:------:|
| 8.0220        | 0.7782  | 100  | 3.5251          | 1.0    | 1.0    |
| 5.5662        | 1.5525  | 200  | 2.7620          | 1.0    | 1.0    |
| 5.6035        | 2.3268  | 300  | 2.7453          | 1.0    | 1.0    |
| 5.3951        | 3.1012  | 400  | 2.7465          | 1.0    | 1.0    |
| 5.4394        | 3.8794  | 500  | 2.7119          | 1.0    | 1.0    |
| 5.4065        | 4.6537  | 600  | 2.7067          | 1.0    | 1.0    |
| 5.4326        | 5.4280  | 700  | 2.7476          | 1.0    | 1.0    |
| 5.4003        | 6.2023  | 800  | 2.7058          | 1.0    | 1.0    |
| 5.3921        | 6.9805  | 900  | 2.7002          | 1.0    | 1.0    |
| 5.3133        | 7.7549  | 1000 | 2.6550          | 1.0    | 1.0    |
| 5.2090        | 8.5292  | 1100 | 2.6010          | 1.0    | 1.0    |
| 4.6105        | 9.3035  | 1200 | 2.2064          | 1.0    | 0.8539 |
| 3.7426        | 10.0778 | 1300 | 1.7811          | 0.9603 | 0.5890 |
| 3.3573        | 10.8560 | 1400 | 1.5977          | 0.9249 | 0.4581 |
| 2.9498        | 11.6304 | 1500 | 1.3662          | 0.9089 | 0.4344 |
| 2.7720        | 12.4047 | 1600 | 1.2494          | 0.8562 | 0.3706 |
| 2.4446        | 13.1790 | 1700 | 1.1561          | 0.8479 | 0.3548 |
| 2.3130        | 13.9572 | 1800 | 1.0855          | 0.7957 | 0.3202 |
| 2.1652        | 14.7315 | 1900 | 0.9928          | 0.7609 | 0.2988 |
| 1.9082        | 15.5058 | 2000 | 0.9006          | 0.6800 | 0.2568 |
| 1.7134        | 16.2802 | 2100 | 0.8148          | 0.6379 | 0.2440 |
| 1.5725        | 17.0545 | 2200 | 0.7553          | 0.5540 | 0.2076 |
| 1.4688        | 17.8327 | 2300 | 0.7227          | 0.5331 | 0.1954 |
| 1.2945        | 18.6070 | 2400 | 0.6939          | 0.4838 | 0.1786 |
| 1.1473        | 19.3813 | 2500 | 0.6762          | 0.4952 | 0.1862 |
| 1.0918        | 20.1556 | 2600 | 0.6794          | 0.4655 | 0.1743 |
| 1.1155        | 20.9339 | 2700 | 0.6808          | 0.4632 | 0.1706 |
| 0.9617        | 21.7082 | 2800 | 0.7082          | 0.4716 | 0.1720 |
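facebook/mms-300m is a Wav2Vec2-style CTC acoustic model, so transcripts come from collapsing per-frame predictions. A minimal sketch of greedy CTC decoding; the vocabulary and blank-token id below are hypothetical (check the actual tokenizer for the real mapping):

```python
from itertools import groupby

BLANK_ID = 0  # CTC blank token id (assumed; verify against the model's tokenizer)

def ctc_greedy_decode(frame_ids, id_to_char):
    # Collapse runs of repeated frame predictions, then drop blanks.
    collapsed = [token for token, _ in groupby(frame_ids)]
    return "".join(id_to_char[i] for i in collapsed if i != BLANK_ID)

# Hypothetical per-frame argmax output and vocabulary:
vocab = {1: "a", 2: "c", 3: "h"}
print(ctc_greedy_decode([1, 1, 0, 2, 2, 2, 0, 0, 3], vocab))  # -> "ach"
```

In practice the per-frame ids come from `argmax` over the model's logits; beam-search decoding with a language model can recover some of the errors greedy decoding makes.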

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2