mms-300m-ach-cmu

This model is a fine-tuned version of facebook/mms-300m on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7082
  • Wer: 0.4716
  • Cer: 0.1720
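WER and CER are both edit-distance metrics: the Levenshtein distance between reference and hypothesis, normalized by reference length, computed over words and characters respectively. A minimal sketch of how they are computed (illustrative only, not the exact evaluation script used here):

```python
def edit_distance(ref, hyp):
    # Levenshtein distance via a single rolling DP row.
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution
    return dp[-1]

def wer(ref_text, hyp_text):
    # Word error rate: edit distance over word tokens.
    ref, hyp = ref_text.split(), hyp_text.split()
    return edit_distance(ref, hyp) / len(ref)

def cer(ref_text, hyp_text):
    # Character error rate: edit distance over characters.
    return edit_distance(list(ref_text), list(hyp_text)) / len(ref_text)
```

A Wer of 0.4716 therefore means roughly 47 word-level edits per 100 reference words; the much lower Cer (0.1720) indicates many errors are near-misses within words.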

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30
  • mixed_precision_training: Native AMP
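The `linear` scheduler with 500 warmup steps ramps the learning rate up from 0 to 3e-4 over the first 500 optimizer steps, then decays it linearly to 0 at the end of training. A minimal sketch of that schedule; the total-step count here (~3855, i.e. 30 epochs at roughly 128 optimizer steps per epoch, inferred from the training log) is an assumption:

```python
def linear_lr(step, base_lr=3e-4, warmup_steps=500, total_steps=3855):
    # Linear warmup to base_lr, then linear decay to 0
    # (the usual behaviour of the "linear" scheduler type).
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

Note that one optimizer step consumes 32 examples here: train_batch_size (16) times gradient_accumulation_steps (2) gives the total_train_batch_size of 32.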

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    | Cer    |
|:-------------:|:-------:|:----:|:---------------:|:------:|:------:|
| 8.0220        | 0.7782  | 100  | 3.5251          | 1.0    | 1.0    |
| 5.5662        | 1.5525  | 200  | 2.7620          | 1.0    | 1.0    |
| 5.6035        | 2.3268  | 300  | 2.7453          | 1.0    | 1.0    |
| 5.3951        | 3.1012  | 400  | 2.7465          | 1.0    | 1.0    |
| 5.4394        | 3.8794  | 500  | 2.7119          | 1.0    | 1.0    |
| 5.4065        | 4.6537  | 600  | 2.7067          | 1.0    | 1.0    |
| 5.4326        | 5.4280  | 700  | 2.7476          | 1.0    | 1.0    |
| 5.4003        | 6.2023  | 800  | 2.7058          | 1.0    | 1.0    |
| 5.3921        | 6.9805  | 900  | 2.7002          | 1.0    | 1.0    |
| 5.3133        | 7.7549  | 1000 | 2.6550          | 1.0    | 1.0    |
| 5.2090        | 8.5292  | 1100 | 2.6010          | 1.0    | 1.0    |
| 4.6105        | 9.3035  | 1200 | 2.2064          | 1.0    | 0.8539 |
| 3.7426        | 10.0778 | 1300 | 1.7811          | 0.9603 | 0.5890 |
| 3.3573        | 10.8560 | 1400 | 1.5977          | 0.9249 | 0.4581 |
| 2.9498        | 11.6304 | 1500 | 1.3662          | 0.9089 | 0.4344 |
| 2.7720        | 12.4047 | 1600 | 1.2494          | 0.8562 | 0.3706 |
| 2.4446        | 13.1790 | 1700 | 1.1561          | 0.8479 | 0.3548 |
| 2.3130        | 13.9572 | 1800 | 1.0855          | 0.7957 | 0.3202 |
| 2.1652        | 14.7315 | 1900 | 0.9928          | 0.7609 | 0.2988 |
| 1.9082        | 15.5058 | 2000 | 0.9006          | 0.6800 | 0.2568 |
| 1.7134        | 16.2802 | 2100 | 0.8148          | 0.6379 | 0.2440 |
| 1.5725        | 17.0545 | 2200 | 0.7553          | 0.5540 | 0.2076 |
| 1.4688        | 17.8327 | 2300 | 0.7227          | 0.5331 | 0.1954 |
| 1.2945        | 18.6070 | 2400 | 0.6939          | 0.4838 | 0.1786 |
| 1.1473        | 19.3813 | 2500 | 0.6762          | 0.4952 | 0.1862 |
| 1.0918        | 20.1556 | 2600 | 0.6794          | 0.4655 | 0.1743 |
| 1.1155        | 20.9339 | 2700 | 0.6808          | 0.4632 | 0.1706 |
| 0.9617        | 21.7082 | 2800 | 0.7082          | 0.4716 | 0.1720 |
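facebook/mms-300m is a Wav2Vec2-style CTC acoustic model, so transcripts come from collapsing per-frame predictions. A minimal sketch of greedy CTC decoding; the vocabulary and blank-token id below are hypothetical (check the actual tokenizer for the real mapping):

```python
from itertools import groupby

BLANK_ID = 0  # CTC blank token id (assumed; verify against the model's tokenizer)

def ctc_greedy_decode(frame_ids, id_to_char):
    # Collapse runs of repeated frame predictions, then drop blanks.
    collapsed = [token for token, _ in groupby(frame_ids)]
    return "".join(id_to_char[i] for i in collapsed if i != BLANK_ID)

# Hypothetical per-frame argmax output and vocabulary:
vocab = {1: "a", 2: "c", 3: "h"}
print(ctc_greedy_decode([1, 1, 0, 2, 2, 2, 0, 0, 3], vocab))  # -> "ach"
```

In practice the per-frame ids come from `argmax` over the model's logits; beam-search decoding with a language model can recover some of the errors greedy decoding makes.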

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2