mcdc-test-phi_mini_1

This model is a fine-tuned version of microsoft/Phi-4-mini-instruct on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3141
  • Perplexity: 1.3691
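
Perplexity here is simply the exponential of the evaluation cross-entropy loss, so the two figures above can be checked against each other:

```python
import math

# Perplexity = exp(cross-entropy loss); verifying the reported numbers.
eval_loss = 0.3141
perplexity = math.exp(eval_loss)
print(f"{perplexity:.4f}")  # 1.3690, matching the reported 1.3691 up to rounding
```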

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 15
  • mixed_precision_training: Native AMP
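
For reference, a minimal sketch of equivalent transformers TrainingArguments. The output_dir and the evaluation cadence (every 50 steps, inferred from the results table below) are assumptions, not values taken from the card:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mcdc-test-phi_mini_1",  # assumed; not stated in the card
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=15,
    fp16=True,             # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=50,         # assumed from the 50-step cadence in the results table
)
```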

Training results

| Training Loss | Epoch   | Step | Validation Loss | Perplexity |
|:-------------:|:-------:|:----:|:---------------:|:----------:|
| 1.1104        | 0.6579  | 50   | 0.9184          | 2.5052     |
| 0.8034        | 1.3158  | 100  | 0.6000          | 1.8221     |
| 0.5196        | 1.9737  | 150  | 0.4776          | 1.6121     |
| 0.3217        | 2.6316  | 200  | 0.4401          | 1.5529     |
| 0.5679        | 3.2895  | 250  | 0.4143          | 1.5134     |
| 0.4583        | 3.9474  | 300  | 0.3560          | 1.4276     |
| 0.4572        | 4.6053  | 350  | 0.3440          | 1.4106     |
| 0.1729        | 5.2632  | 400  | 0.3382          | 1.4024     |
| 0.4072        | 5.9211  | 450  | 0.3295          | 1.3902     |
| 0.5669        | 6.5789  | 500  | 0.3296          | 1.3904     |
| 0.3213        | 7.2368  | 550  | 0.3301          | 1.3911     |
| 0.2435        | 7.8947  | 600  | 0.3218          | 1.3795     |
| 0.1889        | 8.5526  | 650  | 0.3264          | 1.3859     |
| 0.1813        | 9.2105  | 700  | 0.3247          | 1.3836     |
| 0.3983        | 9.8684  | 750  | 0.3212          | 1.3787     |
| 0.4828        | 10.5263 | 800  | 0.3180          | 1.3743     |
| 0.2931        | 11.1842 | 850  | 0.3153          | 1.3706     |
| 0.2421        | 11.8421 | 900  | 0.3148          | 1.3700     |
| 0.3310        | 12.5000 | 950  | 0.3170          | 1.3730     |
| 0.0932        | 13.1579 | 1000 | 0.3156          | 1.3711     |
| 0.2890        | 13.8158 | 1050 | 0.3106          | 1.3643     |
| 0.1009        | 14.4737 | 1100 | 0.3141          | 1.3691     |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.51.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.0
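
Since this model is a PEFT adapter on top of microsoft/Phi-4-mini-instruct, it can presumably be loaded with the standard PEFT API. A minimal sketch, assuming the adapter is hosted under YildizTekno/mcdc-test-phi_mini_1:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Loads the base model referenced in the adapter config and applies the adapter.
model = AutoPeftModelForCausalLM.from_pretrained("YildizTekno/mcdc-test-phi_mini_1")
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-4-mini-instruct")

prompt = "Hello"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```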