# speecht5_vidhigrwl
This model is a fine-tuned version of microsoft/speecht5_tts on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.3929
## Model description
More information needed
## Intended uses & limitations
More information needed
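Although intended uses are not documented, the checkpoint follows the standard SpeechT5 text-to-speech API in Transformers. Below is a minimal inference sketch; the `microsoft/speecht5_hifigan` vocoder and the x-vector speaker embedding from `Matthijs/cmu-arctic-xvectors` (including the arbitrary speaker index) are conventional companions for SpeechT5, not details taken from this card.

```python
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

# Load the fine-tuned checkpoint; if this repo does not ship its own processor,
# fall back to the base "microsoft/speecht5_tts" processor.
processor = SpeechT5Processor.from_pretrained("vidhigrwl/speecht5_vidhigrwl")
model = SpeechT5ForTextToSpeech.from_pretrained("vidhigrwl/speecht5_vidhigrwl")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Hello, this is a test.", return_tensors="pt")

# A 512-dim x-vector speaker embedding; the dataset and index are arbitrary choices.
embeddings = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = torch.tensor(embeddings[7306]["xvector"]).unsqueeze(0)

# Generate a 16 kHz waveform and write it to disk.
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("speech.wav", speech.numpy(), samplerate=16000)
```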
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a matching `Seq2SeqTrainingArguments` sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 2000
- mixed_precision_training: Native AMP
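The hyperparameters above map onto `transformers.Seq2SeqTrainingArguments` roughly as follows; the output directory name and the evaluation cadence are assumptions (the results table suggests evaluation every 100 steps), not values stated in this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_vidhigrwl",  # assumed output directory name
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=8,    # effective train batch size: 4 * 8 = 32
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=2000,
    seed=42,
    optim="adamw_torch",
    fp16=True,                        # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=100,                   # assumed; matches the table cadence below
)
```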
### Training results
| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 0.5395        | 1.1913  | 100  | 0.4787          |
| 0.485         | 2.3827  | 200  | 0.4363          |
| 0.463         | 3.5740  | 300  | 0.4237          |
| 0.4527        | 4.7653  | 400  | 0.4117          |
| 0.4437        | 5.9567  | 500  | 0.4078          |
| 0.43          | 7.1435  | 600  | 0.4057          |
| 0.4274        | 8.3348  | 700  | 0.3991          |
| 0.4294        | 9.5262  | 800  | 0.3978          |
| 0.42          | 10.7175 | 900  | 0.3968          |
| 0.4175        | 11.9088 | 1000 | 0.3953          |
| 0.4108        | 13.0957 | 1100 | 0.3935          |
| 0.4009        | 14.2870 | 1200 | 0.3947          |
| 0.4075        | 15.4783 | 1300 | 0.3950          |
| 0.4059        | 16.6697 | 1400 | 0.3896          |
| 0.4012        | 17.8610 | 1500 | 0.3912          |
| 0.3981        | 19.0478 | 1600 | 0.3920          |
| 0.3929        | 20.2392 | 1700 | 0.3938          |
| 0.3959        | 21.4305 | 1800 | 0.3935          |
| 0.3911        | 22.6218 | 1900 | 0.3942          |
| 0.3929        | 23.8132 | 2000 | 0.3929          |
### Framework versions
- Transformers 4.52.2
- Pytorch 2.6.0+cu124
- Datasets 2.14.4
- Tokenizers 0.21.1