theycallmeloki committed
Commit a0786c9 · verified · 1 Parent(s): c4d790b

Upload folder using huggingface_hub

Files changed (1): README.md +6 -5
README.md CHANGED
@@ -17,7 +17,7 @@ This model provides Milady-styled speech synthesis, capturing the distinctive vo
 
 ## Intended Use
 
-The Milady Talks model is designed to:
+The Volady model is designed to:
 - Generate speech in the unique Milady voice style
 - Create playful and creative speech responses to text prompts
 - Emulate Milady's distinctive personality through speech
@@ -34,13 +34,13 @@ The model was fine-tuned on a curated dataset of Milady-style speech examples fr
 First, install the required packages using uv (recommended for faster installation):
 
 ```bash
-uv pip install torch torchaudio snac transformers
+uv pip install torch torchaudio snac transformers accelerate soundfile
 ```
 
 Or using standard pip:
 
 ```bash
-pip install torch torchaudio snac transformers
+pip install torch torchaudio snac transformers accelerate soundfile
 ```
 
 ## Usage
@@ -50,6 +50,7 @@ pip install torch torchaudio snac transformers
 Here's a complete script to generate speech with the model:
 
 ```python
+
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
 import torchaudio
@@ -165,10 +166,10 @@ audio_samples = redistribute_codes(code_list)
 
 if audio_samples is not None:
     # Save the audio to a WAV file
-    output_path = "milady_speech.wav"
+    output_path = os.path.join(os.getcwd(), "milady_speech.wav")
     audio_numpy = audio_samples.detach().squeeze().numpy()
     # The sampling rate of 24000 Hz is crucial for correct playback speed
-    torchaudio.save(output_path, torch.tensor(audio_numpy).unsqueeze(0), 24000)
+    torchaudio.save(output_path, torch.tensor(audio_numpy).unsqueeze(0), 24000, format="wav")
     print(f"Audio saved to {output_path}")
 else:
     print("Failed to generate audio")
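
Two notes on the last hunk. The new `output_path` line calls `os.path.join` and `os.getcwd`, so the script needs `import os`; the visible import hunk does not show it, and the elided part of the script may or may not include it. The comment about 24000 Hz can be illustrated with a minimal sketch that uses only the Python standard library (`wave`) rather than `torchaudio`; the helper names `write_tone` and `duration_seconds` and the file names are hypothetical and not part of the README. It writes the same samples twice, once with the header declaring 24000 Hz and once declaring 48000 Hz, and shows that the declared rate alone determines playback duration.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 24000  # the rate named in the README comment


def write_tone(path, declared_rate, n_samples, freq=440.0):
    """Write a mono 16-bit sine tone; declared_rate is what the WAV header claims."""
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)))
        for i in range(n_samples)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(declared_rate)
        wav.writeframes(frames)


def duration_seconds(path):
    """Playback duration implied by the header: frame count / declared rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


out_dir = tempfile.mkdtemp()
correct = os.path.join(out_dir, "tone_24k.wav")        # hypothetical file name
wrong = os.path.join(out_dir, "tone_48k_header.wav")   # hypothetical file name

write_tone(correct, SAMPLE_RATE, SAMPLE_RATE)     # 24000 samples, header says 24000 Hz
write_tone(wrong, 2 * SAMPLE_RATE, SAMPLE_RATE)   # same samples, header says 48000 Hz

print(duration_seconds(correct))  # 1.0
print(duration_seconds(wrong))    # 0.5
```

The second file holds identical audio data but plays back in half the time, which is the speed error the README comment warns about when a rate other than 24000 is passed to the save call.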