	---
	library_name: peft
	license: llama3
	base_model: aaditya/Llama3-OpenBioLLM-8B
	tags:
	- llama-factory
	- lora
	- generated_from_trainer
	model-index:
	- name: Llama3-OpenBioLLM-8B-PsyCourse-fold8
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Llama3-OpenBioLLM-8B-PsyCourse-fold8

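	This model is a LoRA adapter (PEFT) fine-tuned from [aaditya/Llama3-OpenBioLLM-8B](https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B) on the course-train-fold8 dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.0360 (the lowest validation loss in the training results, logged at step 1300, epoch 1.9714; the last logged value, at step 3250, is 0.0636)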
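
Since the card metadata lists `library_name: peft` and the `lora` tag, the weights are presumably loaded on top of the base model rather than on their own. The following is a minimal sketch of that, not a documented usage recipe from the authors. `ADAPTER_ID` is a hypothetical placeholder (the card does not state the adapter's repository id), and `load_adapter` is a helper name introduced here for illustration.

```python
# Base model id, copied from the card metadata.
BASE_MODEL_ID = "aaditya/Llama3-OpenBioLLM-8B"

# Hypothetical placeholder: replace with the real adapter repo id or a local path.
ADAPTER_ID = "your-namespace/Llama3-OpenBioLLM-8B-PsyCourse-fold8"


def load_adapter(adapter_id: str = ADAPTER_ID, base_id: str = BASE_MODEL_ID):
    """Load the base model and attach the LoRA adapter weights to it."""
    # Imported lazily so the module can be inspected without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter weights stored at adapter_id.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer
```

Calling `load_adapter()` downloads the 8B base model, so it needs enough memory for it. Any prompt format the adapter expects is not described in this card.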

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 1
	- eval_batch_size: 1
	- seed: 42
	- gradient_accumulation_steps: 16
	- total_train_batch_size: 16
	- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 5.0
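
The relationship between these values can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the training code: it assumes a single device (consistent with 1 × 16 = 16), estimates the total step count from the last logged row of the results (step 3250 at epoch 4.9284), and uses the common linear-warmup-then-cosine-decay shape, which may differ in detail from the scheduler actually used.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 16
WARMUP_RATIO = 0.1
NUM_EPOCHS = 5.0

# Assumption: one device; the card does not state the device count.
NUM_DEVICES = 1

# Assumption: steps per epoch estimated from the last logged row
# of the training results (step 3250 at epoch 4.9284).
STEPS_PER_EPOCH = 3250 / 4.9284


def effective_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at(step: int, total_steps: int, warmup_steps: int,
          peak: float = LEARNING_RATE) -> float:
    """Linear warmup to `peak`, then cosine decay towards zero."""
    if step < warmup_steps:
        return peak * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = round(STEPS_PER_EPOCH * NUM_EPOCHS)
warmup_steps = math.ceil(WARMUP_RATIO * total_steps)

print("effective batch size:", effective_batch_size())
print("estimated total steps:", total_steps)
print("estimated warmup steps:", warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step, total_steps, warmup_steps):.2e}")
```

Under these assumptions the learning rate peaks at 0.0001 once warmup ends, roughly a tenth of the way through training, and decays to about zero by the final step.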

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss |
	|:-------------:|:------:|:----:|:---------------:|
	| 0.5651        | 0.0758 | 50   | 0.3454          |
	| 0.159         | 0.1516 | 100  | 0.0916          |
	| 0.0775        | 0.2275 | 150  | 0.0660          |
	| 0.0565        | 0.3033 | 200  | 0.0587          |
	| 0.0563        | 0.3791 | 250  | 0.0594          |
	| 0.0631        | 0.4549 | 300  | 0.0575          |
	| 0.0677        | 0.5308 | 350  | 0.0503          |
	| 0.0366        | 0.6066 | 400  | 0.0471          |
	| 0.0397        | 0.6824 | 450  | 0.0430          |
	| 0.0383        | 0.7582 | 500  | 0.0479          |
	| 0.0508        | 0.8340 | 550  | 0.0427          |
	| 0.0346        | 0.9099 | 600  | 0.0434          |
	| 0.0513        | 0.9857 | 650  | 0.0444          |
	| 0.0339        | 1.0615 | 700  | 0.0417          |
	| 0.0296        | 1.1373 | 750  | 0.0442          |
	| 0.0288        | 1.2132 | 800  | 0.0397          |
	| 0.0299        | 1.2890 | 850  | 0.0421          |
	| 0.0293        | 1.3648 | 900  | 0.0401          |
	| 0.0278        | 1.4406 | 950  | 0.0393          |
	| 0.0283        | 1.5164 | 1000 | 0.0405          |
	| 0.0493        | 1.5923 | 1050 | 0.0393          |
	| 0.0287        | 1.6681 | 1100 | 0.0392          |
	| 0.0383        | 1.7439 | 1150 | 0.0379          |
	| 0.0312        | 1.8197 | 1200 | 0.0378          |
	| 0.0353        | 1.8956 | 1250 | 0.0379          |
	| 0.0242        | 1.9714 | 1300 | 0.0360          |
	| 0.0176        | 2.0472 | 1350 | 0.0413          |
	| 0.0132        | 2.1230 | 1400 | 0.0386          |
	| 0.0224        | 2.1988 | 1450 | 0.0413          |
	| 0.0198        | 2.2747 | 1500 | 0.0423          |
	| 0.0191        | 2.3505 | 1550 | 0.0429          |
	| 0.017         | 2.4263 | 1600 | 0.0412          |
	| 0.0194        | 2.5021 | 1650 | 0.0465          |
	| 0.0178        | 2.5780 | 1700 | 0.0439          |
	| 0.0238        | 2.6538 | 1750 | 0.0411          |
	| 0.0181        | 2.7296 | 1800 | 0.0414          |
	| 0.0128        | 2.8054 | 1850 | 0.0439          |
	| 0.0287        | 2.8812 | 1900 | 0.0410          |
	| 0.0202        | 2.9571 | 1950 | 0.0418          |
	| 0.011         | 3.0329 | 2000 | 0.0430          |
	| 0.005         | 3.1087 | 2050 | 0.0487          |
	| 0.0045        | 3.1845 | 2100 | 0.0502          |
	| 0.0072        | 3.2604 | 2150 | 0.0496          |
	| 0.0098        | 3.3362 | 2200 | 0.0482          |
	| 0.0089        | 3.4120 | 2250 | 0.0492          |
	| 0.0072        | 3.4878 | 2300 | 0.0486          |
	| 0.0116        | 3.5636 | 2350 | 0.0496          |
	| 0.0094        | 3.6395 | 2400 | 0.0489          |
	| 0.0055        | 3.7153 | 2450 | 0.0501          |
	| 0.0095        | 3.7911 | 2500 | 0.0529          |
	| 0.0113        | 3.8669 | 2550 | 0.0517          |
	| 0.0042        | 3.9428 | 2600 | 0.0518          |
	| 0.0021        | 4.0186 | 2650 | 0.0539          |
	| 0.0027        | 4.0944 | 2700 | 0.0573          |
	| 0.0017        | 4.1702 | 2750 | 0.0590          |
	| 0.0033        | 4.2460 | 2800 | 0.0603          |
	| 0.003         | 4.3219 | 2850 | 0.0618          |
	| 0.0013        | 4.3977 | 2900 | 0.0623          |
	| 0.003         | 4.4735 | 2950 | 0.0625          |
	| 0.0036        | 4.5493 | 3000 | 0.0631          |
	| 0.0017        | 4.6252 | 3050 | 0.0634          |
	| 0.0023        | 4.7010 | 3100 | 0.0635          |
	| 0.0028        | 4.7768 | 3150 | 0.0635          |
	| 0.0028        | 4.8526 | 3200 | 0.0637          |
	| 0.0021        | 4.9284 | 3250 | 0.0636          |
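
One way to read these results is to pick the evaluation step with the lowest validation loss and compare it with the last logged step. The sketch below does this for a handful of rows copied from the results (roughly one per epoch, plus the first). `EvalRow` and `best_checkpoint` are names introduced here for illustration. Whether a checkpoint was actually saved at each of these steps is not stated in this card.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    epoch: float
    train_loss: float
    val_loss: float


# A subset of rows copied from the training results.
ROWS = [
    EvalRow(50, 0.0758, 0.5651, 0.3454),
    EvalRow(650, 0.9857, 0.0513, 0.0444),
    EvalRow(1300, 1.9714, 0.0242, 0.0360),
    EvalRow(1950, 2.9571, 0.0202, 0.0418),
    EvalRow(2600, 3.9428, 0.0042, 0.0518),
    EvalRow(3250, 4.9284, 0.0021, 0.0636),
]


def best_checkpoint(rows: list[EvalRow]) -> EvalRow:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


best = best_checkpoint(ROWS)
last = ROWS[-1]
print(f"best step: {best.step} (val loss {best.val_loss:.4f})")
print(f"last step: {last.step} (val loss {last.val_loss:.4f})")
print(f"val loss increase after best step: {last.val_loss - best.val_loss:.4f}")
```

In these rows the training loss keeps falling after step 1300 while the validation loss rises, a pattern that is consistent with overfitting from about the third epoch onward.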


	### Framework versions

	- PEFT 0.12.0
	- Transformers 4.46.1
	- PyTorch 2.5.1+cu124
	- Datasets 3.1.0
	- Tokenizers 0.20.3
