phi4-lora-xaji0y6d-1742330134

This model is a LoRA adapter (trained with PEFT) for microsoft/Phi-4-mini-instruct, fine-tuned on an unspecified dataset. It achieves the following results on the evaluation set (a quick perplexity check follows the list):

  • Loss: 1.0010
  • Perplexity: 2.7209
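
For reference, the perplexity reported here is simply the exponential of the evaluation loss. A minimal check in Python (not taken from the original training code):

```python
import math

eval_loss = 1.0010                # evaluation loss reported above
perplexity = math.exp(eval_loss)
print(f"{perplexity:.4f}")        # ~2.7210; matches the reported 2.7209 up to rounding of the loss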

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-06
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 50
  • mixed_precision_training: Native AMP
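
As referenced above, these settings map onto a Hugging Face TrainingArguments object roughly as follows. This is a reconstruction from the list, not the original training script; output_dir is hypothetical:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi4-lora-output",   # hypothetical; not stated on this card
    learning_rate=5e-6,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # total train batch size: 1 x 16 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.01,
    num_train_epochs=50,
    fp16=True,                       # "Native AMP" mixed precision
)
```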

Training results

Training Loss | Epoch | Step | Validation Loss | Perplexity
------------- | ----- | ---- | --------------- | ----------
5.6626        | 1.48  | 10   | 5.8212          | 337.3485
5.4363        | 2.96  | 20   | 5.4409          | 230.6381
5.2185        | 4.32  | 30   | 5.2027          | 181.7434
4.9729        | 5.8   | 40   | 4.9270          | 137.9507
4.68          | 7.16  | 50   | 4.6071          | 100.1871
4.3242        | 8.64  | 60   | 4.2787          | 72.1430
4.0147        | 10.0  | 70   | 3.9536          | 52.1171
3.7066        | 11.48 | 80   | 3.6597          | 38.8469
3.3654        | 12.96 | 90   | 3.3835          | 29.4712
3.1883        | 14.32 | 100  | 3.1183          | 22.6075
2.8444        | 15.8  | 110  | 2.8578          | 17.4224
2.6168        | 17.16 | 120  | 2.6088          | 13.5819
2.3689        | 18.64 | 130  | 2.3749          | 10.7493
2.1379        | 20.0  | 140  | 2.1532          | 8.6119
1.8909        | 21.48 | 150  | 1.9458          | 6.9986
1.7022        | 22.96 | 160  | 1.7602          | 5.8135
1.5127        | 24.32 | 170  | 1.6061          | 4.9831
1.3942        | 25.8  | 180  | 1.4847          | 4.4133
1.3053        | 27.16 | 190  | 1.3923          | 4.0240
1.2177        | 28.64 | 200  | 1.3193          | 3.7405
1.1161        | 30.0  | 210  | 1.2557          | 3.5101
1.1293        | 31.48 | 220  | 1.2023          | 3.3275
1.0622        | 32.96 | 230  | 1.1562          | 3.1778
1.015         | 34.32 | 240  | 1.1164          | 3.0536
0.9539        | 35.8  | 250  | 1.0830          | 2.9533
0.9387        | 37.16 | 260  | 1.0552          | 2.8725
0.8819        | 38.64 | 270  | 1.0340          | 2.8121
0.9162        | 40.0  | 280  | 1.0178          | 2.7670
0.8912        | 41.48 | 290  | 1.0074          | 2.7384
0.8641        | 42.96 | 300  | 1.0010          | 2.7209

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.2
  • PyTorch 2.1.0+cu118
  • Datasets 3.4.1
  • Tokenizers 0.21.1
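
Given these versions, the adapter can be loaded on top of the base model with PEFT. A minimal sketch, assuming the adapter is published under the repo id Swephoenix/phi4-lora-xaji0y6d-1742330134 (as this card's name suggests):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "microsoft/Phi-4-mini-instruct"
adapter_id = "Swephoenix/phi4-lora-xaji0y6d-1742330134"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA weights

inputs = tokenizer("Hello!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```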