---
library_name: transformers
license: cc
datasets:
- atlithor/talromur3_without_emotions
language:
- is
base_model:
- parler-tts/parler-tts-mini-multilingual-v1.1
pipeline_tag: text-to-speech
---

# Model Card for RepeaTTS-level-2

See [Emotive Icelandic](https://huggingface.co/atlithor/EmotiveIcelandic) for more information about this model and the data it is trained on.

The RepeaTTS series is trained on the same data as Emotive Icelandic, but without disclosing the emotive content of the recordings.
This model, level-2, corresponds to a model trained on a refined subset of the original training corpus.

The model can additionally be prompted with a target _voice intensity_:
- low intensity: the voice is not very expressive
- medium intensity: the voice is somewhat expressive
- high intensity: the voice is very expressive

## Usage

Use the code below to get started with the model.

```py
import torch
import soundfile as sf
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer

device = "cuda:0" if torch.cuda.is_available() else "cpu"

model = ParlerTTSForConditionalGeneration.from_pretrained("atlithor/RepeaTTS-level-2").to(device)
# Tokenizer for the text to be spoken (the prompt).
tokenizer = AutoTokenizer.from_pretrained("atlithor/EmotiveIcelandic")
# Tokenizer for the voice description, taken from the model's text encoder.
description_tokenizer = AutoTokenizer.from_pretrained(model.config.text_encoder._name_or_path)

prompt = "Þetta er frábær hugmynd!"  # English: "This is a great idea!"
description = "The recording is of very high quality, with Ingrid's voice sounding clear and very close up. Ingrid speaks at very high intensity."

# The description conditions the voice; the prompt is the text that gets spoken.
input_ids = description_tokenizer(description, return_tensors="pt").input_ids.to(device)
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
audio_arr = generation.cpu().numpy().squeeze()
sf.write("ingrid_intense.wav", audio_arr, model.config.sampling_rate)
```

## Citation

_Coming later._

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]
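
## Building intensity descriptions

The three intensity settings listed in the model description (low, medium, high) can be written into the description string programmatically. The sketch below is illustrative only: `build_description` and `INTENSITY_LEVELS` are hypothetical names, the sentence template is copied from the usage example, and the assumption that the model responds to the phrasing "speaks at {level} intensity" for every level is not confirmed by this card.

```python
# Hypothetical helper, not part of parler_tts or transformers.
# Assumption: the wording of the usage example generalizes to all three levels.
INTENSITY_LEVELS = ("low", "medium", "high")


def build_description(speaker: str, intensity: str) -> str:
    """Return a voice description for the given speaker and intensity level."""
    if intensity not in INTENSITY_LEVELS:
        raise ValueError(
            f"intensity must be one of {INTENSITY_LEVELS}, got {intensity!r}"
        )
    return (
        f"The recording is of very high quality, with {speaker}'s voice "
        f"sounding clear and very close up. "
        f"{speaker} speaks at {intensity} intensity."
    )


# One description per intensity level, keyed by level name.
descriptions = {level: build_description("Ingrid", level) for level in INTENSITY_LEVELS}
```

Each resulting string could then be passed as `description` in the usage example to compare the generated audio across intensity levels.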