maud-dr committed
Commit d67e488 · verified · 1 Parent(s): b6812c2

End of training

Files changed (2)
  1. README.md +22 -22
  2. model.safetensors +1 -1
README.md CHANGED
@@ -9,21 +9,21 @@ metrics:
 - recall
 - f1
 model-index:
-- name: baseline_1-seed_123
+- name: baseline_1-seed_2025
   results: []
 ---

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->

-# baseline_1-seed_123
+# baseline_1-seed_2025

 This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.1634
-- Precision: 0.6346
-- Recall: 0.6920
-- F1: 0.6620
+- Loss: 3.0203
+- Precision: 0.6238
+- Recall: 0.7029
+- F1: 0.6610

 ## Model description

@@ -45,7 +45,7 @@ The following hyperparameters were used during training:
 - learning_rate: 0.0003
 - train_batch_size: 8
 - eval_batch_size: 8
-- seed: 123
+- seed: 2025
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 15
@@ -54,21 +54,21 @@ The following hyperparameters were used during training:

 | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 |
 |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|
-| 0.5175 | 1.0 | 447 | 0.7010 | 0.6054 | 0.5725 | 0.5885 |
-| 0.4308 | 2.0 | 894 | 1.0253 | 0.5945 | 0.7065 | 0.6457 |
-| 0.417 | 3.0 | 1341 | 1.0984 | 0.5971 | 0.7464 | 0.6634 |
-| 0.3336 | 4.0 | 1788 | 1.6736 | 0.5952 | 0.7138 | 0.6491 |
-| 0.2283 | 5.0 | 2235 | 1.6788 | 0.6133 | 0.6667 | 0.6389 |
-| 0.1776 | 6.0 | 2682 | 1.5558 | 0.5964 | 0.7174 | 0.6513 |
-| 0.2152 | 7.0 | 3129 | 2.2730 | 0.6228 | 0.6341 | 0.6284 |
-| 0.1292 | 8.0 | 3576 | 2.4796 | 0.6179 | 0.6268 | 0.6223 |
-| 0.1486 | 9.0 | 4023 | 2.3853 | 0.6131 | 0.6775 | 0.6437 |
-| 0.1032 | 10.0 | 4470 | 2.4573 | 0.6098 | 0.7246 | 0.6623 |
-| 0.0678 | 11.0 | 4917 | 3.0114 | 0.6319 | 0.6594 | 0.6454 |
-| 0.0335 | 12.0 | 5364 | 3.2091 | 0.6287 | 0.6993 | 0.6621 |
-| 0.0627 | 13.0 | 5811 | 3.1270 | 0.6186 | 0.6993 | 0.6565 |
-| 0.0814 | 14.0 | 6258 | 3.1125 | 0.6242 | 0.7101 | 0.6644 |
-| 0.003 | 15.0 | 6705 | 3.1634 | 0.6346 | 0.6920 | 0.6620 |
+| 0.1713 | 1.0 | 447 | 1.5784 | 0.6209 | 0.6884 | 0.6529 |
+| 0.1871 | 2.0 | 894 | 1.6753 | 0.6399 | 0.6630 | 0.6512 |
+| 0.1613 | 3.0 | 1341 | 1.8808 | 0.6084 | 0.6304 | 0.6192 |
+| 0.1367 | 4.0 | 1788 | 2.2017 | 0.6164 | 0.7101 | 0.6599 |
+| 0.1101 | 5.0 | 2235 | 2.2647 | 0.6245 | 0.5725 | 0.5974 |
+| 0.0603 | 6.0 | 2682 | 2.1269 | 0.6280 | 0.6667 | 0.6467 |
+| 0.0642 | 7.0 | 3129 | 2.4185 | 0.5971 | 0.7355 | 0.6591 |
+| 0.0757 | 8.0 | 3576 | 2.3896 | 0.6167 | 0.6703 | 0.6424 |
+| 0.042 | 9.0 | 4023 | 2.1657 | 0.6295 | 0.6341 | 0.6318 |
+| 0.0506 | 10.0 | 4470 | 2.5379 | 0.6254 | 0.7138 | 0.6667 |
+| 0.0184 | 11.0 | 4917 | 2.7895 | 0.6171 | 0.7065 | 0.6588 |
+| 0.0257 | 12.0 | 5364 | 2.9373 | 0.6246 | 0.6993 | 0.6598 |
+| 0.003 | 13.0 | 5811 | 3.1143 | 0.6213 | 0.6775 | 0.6482 |
+| 0.0248 | 14.0 | 6258 | 3.0020 | 0.6287 | 0.6993 | 0.6621 |
+| 0.0002 | 15.0 | 6705 | 3.0203 | 0.6238 | 0.7029 | 0.6610 |


 ### Framework versions
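Note: as a quick arithmetic check (not a statement from the card itself), the updated evaluation numbers are internally consistent with the usual harmonic-mean definition of F1 from precision and recall:

$$
F_1 = \frac{2PR}{P + R} = \frac{2 \cdot 0.6238 \cdot 0.7029}{0.6238 + 0.7029} \approx 0.6610
$$

The previous run's figures satisfy the same relation (0.6346 and 0.6920 give ≈ 0.6620), so the final-epoch row of each table matches its summary block.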
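The hyperparameter hunk maps directly onto Hugging Face `TrainingArguments` (the card is auto-generated by the `Trainer`, and the `OptimizerNames.ADAMW_TORCH` line is `transformers` terminology); the only hyperparameter that changes in this commit is the seed (123 → 2025). A minimal sketch of that configuration, where `output_dir` is a placeholder and any argument not listed in the card keeps its library default:

```python
from transformers import Seq2SeqTrainingArguments, set_seed

# The seed is the only value that changed in this commit (123 -> 2025).
set_seed(2025)

# Values copied from the "Training hyperparameters" section of the card.
training_args = Seq2SeqTrainingArguments(
    output_dir="baseline_1-seed_2025",   # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=2025,
    optim="adamw_torch",                 # OptimizerNames.ADAMW_TORCH, default betas/epsilon
    lr_scheduler_type="linear",
    num_train_epochs=15,
)
```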
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:735166f58f139bf7b587b3b073f6f05182843ef2d9a9f722983c0ee5ab56c5b8
+oid sha256:224e214c451580b8c65170eb791251927a48e4da9a51379ba413f60697a4fd09
 size 894020048
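The `model.safetensors` entry is a Git LFS pointer file: only the SHA-256 `oid` changes while the size stays at 894,020,048 bytes, so this commit swaps in newly trained weights for the same flan-t5-base architecture. A small sketch of how one could confirm that a downloaded checkpoint matches the new pointer; the repo id is a placeholder assumption, and `revision` may need the full 40-character commit hash rather than the short one shown above:

```python
import hashlib

from huggingface_hub import hf_hub_download

# Placeholder repo id -- substitute the repository this commit belongs to.
path = hf_hub_download(
    repo_id="maud-dr/baseline_1-seed_2025",
    filename="model.safetensors",
    revision="d67e488",  # the commit above; the full sha may be required
)

# Stream the file so the ~894 MB checkpoint is never fully held in memory.
sha256 = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        sha256.update(chunk)

# Should print the oid from the updated LFS pointer:
# 224e214c451580b8c65170eb791251927a48e4da9a51379ba413f60697a4fd09
print(sha256.hexdigest())
```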