calculator_model_test

This model is a fine-tuned model; the base model and training dataset fields were left unfilled in this auto-generated card. It achieves the following results on the evaluation set:

  • Loss: 1.4265

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
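
The linear schedule listed above can be sketched in plain Python. This is a hedged illustration, not the training code: it assumes no warmup and one optimizer step per epoch (the results table shows 40 steps over 40 epochs), and the helper name `linear_lr` is hypothetical.

```python
def linear_lr(step, total_steps=40, base_lr=0.001):
    """Learning rate after `step` completed optimizer steps under a
    warmup-free linear decay from base_lr to 0 (an assumption; the
    card only records lr_scheduler_type: linear)."""
    return base_lr * max(0, total_steps - step) / total_steps

# The LR starts at the configured 0.001 and reaches 0 after the last step.
print(linear_lr(0))   # base LR at the start of training
print(linear_lr(20))  # roughly half the base LR midway through
print(linear_lr(40))  # 0 after the final step
```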

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 4.0125        | 1.0   | 1    | 3.6878          |
| 3.6498        | 2.0   | 2    | 3.5295          |
| 3.4629        | 3.0   | 3    | 3.3332          |
| 3.2765        | 4.0   | 4    | 3.1663          |
| 3.1127        | 5.0   | 5    | 3.0043          |
| 2.9448        | 6.0   | 6    | 2.8323          |
| 2.7697        | 7.0   | 7    | 2.6876          |
| 2.6241        | 8.0   | 8    | 2.5433          |
| 2.4805        | 9.0   | 9    | 2.4253          |
| 2.3517        | 10.0  | 10   | 2.3125          |
| 2.2358        | 11.0  | 11   | 2.2009          |
| 2.1403        | 12.0  | 12   | 2.1837          |
| 2.1043        | 13.0  | 13   | 2.0344          |
| 1.9684        | 14.0  | 14   | 1.9755          |
| 1.9150        | 15.0  | 15   | 1.9030          |
| 1.8323        | 16.0  | 16   | 1.8490          |
| 1.7779        | 17.0  | 17   | 1.8000          |
| 1.7274        | 18.0  | 18   | 1.7390          |
| 1.6730        | 19.0  | 19   | 1.6942          |
| 1.6266        | 20.0  | 20   | 1.6702          |
| 1.5963        | 21.0  | 21   | 1.6459          |
| 1.5845        | 22.0  | 22   | 1.6264          |
| 1.5394        | 23.0  | 23   | 1.6115          |
| 1.5139        | 24.0  | 24   | 1.5931          |
| 1.5036        | 25.0  | 25   | 1.5726          |
| 1.4759        | 26.0  | 26   | 1.5746          |
| 1.4579        | 27.0  | 27   | 1.5542          |
| 1.4363        | 28.0  | 28   | 1.5278          |
| 1.4208        | 29.0  | 29   | 1.5133          |
| 1.4009        | 30.0  | 30   | 1.5193          |
| 1.3886        | 31.0  | 31   | 1.5103          |
| 1.3856        | 32.0  | 32   | 1.4881          |
| 1.3618        | 33.0  | 33   | 1.4763          |
| 1.3572        | 34.0  | 34   | 1.4638          |
| 1.3401        | 35.0  | 35   | 1.4597          |
| 1.3332        | 36.0  | 36   | 1.4534          |
| 1.3307        | 37.0  | 37   | 1.4416          |
| 1.3191        | 38.0  | 38   | 1.4326          |
| 1.3106        | 39.0  | 39   | 1.4282          |
| 1.3145        | 40.0  | 40   | 1.4265          |
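
If the reported loss is a mean token-level cross-entropy (an assumption; the card does not state the loss function), the final validation loss can be read as a perplexity via exp(loss):

```python
import math

# Final validation loss from the results table above.
final_val_loss = 1.4265

# Perplexity = exp(cross-entropy loss); only meaningful if the loss
# is a mean token-level cross-entropy, which this card does not confirm.
perplexity = math.exp(final_val_loss)
print(f"perplexity ≈ {perplexity:.2f}")
```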

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Downloads last month: 26

Model size: 7.8M params (Safetensors, tensor type F32)