Spaces: Pendrokar/xVASynth-TTS

Pendrokar committed 15 days ago

Commit bd394e1 (verified) · 1 parent: 51b815d

Update README.md

Files changed (1)

README.md CHANGED

@@ -11,6 +11,7 @@ models:
 - Pendrokar/xvapitch_expresso
 - Pendrokar/TorchMoji
 - Pendrokar/xvasynth_lojban
+- Pendrokar/xvasynth_cabal
 app_file: app.py
 app_port: 7860
 tags:
@@ -30,4 +31,23 @@ thumbnail: https://huggingface.co/spaces/Pendrokar/xVASynth/raw/main/thumbnail.p
 short_description: CPU powered, low RTF, emotional, multilingual TTS
 ---
-DanRuta's xVASynth, GitHub repo: [https://github.com/DanRuta/xVA-Synth](https://github.com/DanRuta/xVA-Synth)
+DanRuta's xVASynth, GitHub repo: [https://github.com/DanRuta/xVA-Synth](https://github.com/DanRuta/xVA-Synth)
+Papers:
+- VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech - https://arxiv.org/abs/2106.06103
+- YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone - https://arxiv.org/abs/2112.02418
+Referenced papers within code:
+- Multi-head attention with Relative Positional embedding - https://arxiv.org/pdf/1809.04281.pdf
+- Transformer with Relative Positional Encoding - https://arxiv.org/abs/1803.02155
+- SDP - https://arxiv.org/pdf/2106.06103.pdf
+- Spline Flow - https://arxiv.org/abs/1906.04032
+Extra:
+- DeepMoji - https://arxiv.org/abs/1708.00524
+xVA FastPitch:
+- [1] [FastPitch: Parallel Text-to-speech with Pitch Prediction](https://arxiv.org/abs/2006.06873)
+- [2] [One TTS Alignment To Rule Them All](https://arxiv.org/abs/2108.10447)
+Used datasets: Unknown/Non-permissible data