calculator_model_test

This model is a fine-tuned version of an unnamed base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

Loss: 0.0074

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: fused AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
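
To make the linear scheduler setting concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card lists none) and uses the 240 total optimizer steps reported in the results table; the function name `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(step, base_lr=1e-3, total_steps=240, warmup_steps=0):
    """Learning rate at a given optimizer step under a linear schedule.

    Assumption: warmup_steps=0, since the card does not list a warmup setting.
    """
    if step < warmup_steps:
        # Linear ramp-up during warmup (unused when warmup_steps is 0).
        return base_lr * (step / max(1, warmup_steps))
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for step in (0, 120, 240):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at the configured 0.001, is halved by step 120 (epoch 20), and reaches zero at step 240.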

Training Loss	Epoch	Step	Validation Loss
2.7828	1.0	6	2.0687
1.7850	2.0	12	1.3799
1.1680	3.0	18	0.9251
0.8524	4.0	24	0.8014
0.7640	5.0	30	0.7196
0.6746	6.0	36	0.6506
0.5984	7.0	42	0.5609
0.5341	8.0	48	0.5140
0.4857	9.0	54	0.4675
0.4357	10.0	60	0.4175
0.3966	11.0	66	0.3690
0.3566	12.0	72	0.3337
0.3136	13.0	78	0.2890
0.2728	14.0	84	0.2274
0.2186	15.0	90	0.1968
0.1854	16.0	96	0.1546
0.1514	17.0	102	0.1240
0.1256	18.0	108	0.0980
0.1096	19.0	114	0.0838
0.0973	20.0	120	0.0623
0.0709	21.0	126	0.0514
0.0586	22.0	132	0.0419
0.0513	23.0	138	0.0342
0.0414	24.0	144	0.0265
0.0354	25.0	150	0.0225
0.0316	26.0	156	0.0193
0.0279	27.0	162	0.0161
0.0254	28.0	168	0.0156
0.0228	29.0	174	0.0130
0.0202	30.0	180	0.0111
0.0176	31.0	186	0.0104
0.0169	32.0	192	0.0094
0.0146	33.0	198	0.0092
0.0141	34.0	204	0.0085
0.0150	35.0	210	0.0083
0.0135	36.0	216	0.0080
0.0144	37.0	222	0.0078
0.0134	38.0	228	0.0076
0.0129	39.0	234	0.0074
0.0120	40.0	240	0.0074
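
The step column also hints at the size of the training set. The sketch below works through that arithmetic: 240 steps over 40 epochs is 6 steps per epoch, and with a batch size of 512 that bounds the number of training examples. It assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these is stated in the card, and the helper name `train_set_size_bounds` is hypothetical.

```python
import math


def train_set_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes N with ceil(N / batch_size) == steps_per_epoch.

    Assumptions: one device, no gradient accumulation, last partial batch kept.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


total_steps, num_epochs, batch_size = 240, 40, 512
steps_per_epoch = total_steps // num_epochs  # 6
low, high = train_set_size_bounds(steps_per_epoch, batch_size)

# Sanity check: both bounds really do give 6 steps per epoch.
assert math.ceil(low / batch_size) == steps_per_epoch
assert math.ceil(high / batch_size) == steps_per_epoch

print(steps_per_epoch, low, high)
```

If those assumptions hold, the training set contains between 2,561 and 3,072 examples.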

Model size: 7.79M params
Tensor type: F32
Weights format: Safetensors
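
As a rough guide to the storage footprint, the sketch below estimates the size of the weights from the figures above. It assumes every parameter is stored as a 4-byte F32 value and ignores file header overhead; the parameter count is the rounded 7.79M from the card, so the result is approximate.

```python
BYTES_PER_F32 = 4  # a 32-bit float occupies 4 bytes


def weights_size_bytes(num_params, bytes_per_param=BYTES_PER_F32):
    """Approximate raw size of the weights, excluding any file header."""
    return num_params * bytes_per_param


num_params = 7_790_000  # rounded figure from the card
size_bytes = weights_size_bytes(num_params)

print(f"{size_bytes / 1e6:.2f} MB")     # decimal megabytes
print(f"{size_bytes / 2**20:.2f} MiB")  # binary mebibytes
```

That works out to roughly 31 MB of weights.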
