thecollabagepatch committed on
Commit
a23b7ec
·
1 Parent(s): ec31c3e

update root html for better explains

  1. documentation.html +40 -26
documentation.html CHANGED
@@ -99,13 +99,26 @@
99
  </head>
100
  <body>
101
  <div class="header">
102
- <h1>🎵 MagentaRT Research API</h1>
103
  <p class="muted"><strong>AI Music Generation API</strong> • Real-time streaming • Custom fine-tune support</p>
104
  <span class="badge">Research Project</span>
105
  </div>
106
 
107
  <section id="env-vars" style="margin-top: 24px;">
108
- <h3>⚙️ Environment variables (optional, but helpful)</h3>
109
  <p>
110
  You can boot this Space directly into your own finetune by setting the variables below in
111
  <em>Settings → Variables and secrets → Variables</em>. If you don't set them, you can still
@@ -182,7 +195,7 @@
182
  </p>
183
 
184
  <div class="demo-placeholder">
185
- <h3>📱 App Demo Video</h3>
186
  <video controls preload="metadata" playsinline style="width:100%; border-radius:8px; max-width:540px; display:block; margin:0 auto">
187
  <source src="./lil_demo_540p.mp4" type="video/mp4">
188
  Your browser does not support the video tag.
@@ -191,19 +204,15 @@
191
  </div>
192
 
193
  <div class="section">
194
- <h2>Overview</h2>
195
  <p>This API powers AI music generation using Google's MagentaRT, designed for real-time audio streaming using finetunes hosted on HF. Built for iOS app integration with WebSocket streaming support.</p>
196
-
197
- <div class="info">
198
- <strong>Hardware Requirements:</strong> Optimal performance requires an L40S GPU (48GB VRAM) for real-time streaming. L4 24GB almost works but will not achieve real-time performance (if someone knows an optimization that will solve this, please let me know).
199
- </div>
200
  </div>
201
 
202
  <div class="section">
203
- <h2>Quick Start - WebSocket Streaming</h2>
204
  <p>Connect to <code>wss://&lt;your-space&gt;/ws/jam</code> for real-time audio generation:</p>
205
 
206
- <h3>Start Real-time Generation</h3>
207
  <pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{
208
  "type": "start",
209
  "mode": "rt",
@@ -221,7 +230,7 @@
221
  }
222
  }</pre>
223
 
224
- <h3>Update Parameters Live</h3>
225
  <pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{
226
  "type": "update",
227
  "styles": "jazz, hiphop",
@@ -233,12 +242,12 @@
233
  "centroid_weights": "0.1, 0.3, 0.0"
234
  }</pre>
235
 
236
- <h3>Stop Generation</h3>
237
  <pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{"type": "stop"}</pre>
238
  </div>
239
 
240
  <div class="section">
241
- <h2>API Endpoints</h2>
242
 
243
  <div class="endpoint">
244
  <strong>POST /generate</strong> - Generate 4–8 bars of music with input audio
@@ -274,14 +283,14 @@
274
  </div>
275
 
276
  <div class="section">
277
- <h2>Custom Fine-Tuning</h2>
278
  <p>Train your own MagentaRT models and use them with this API and the iOS app.</p>
279
 
280
  <div class="grid">
281
  <div class="card">
282
- <h3>1. Train Your Model</h3>
283
  <p>Use the official MagentaRT fine-tuning notebook:</p>
284
- <p><a href="https://github.com/magenta-realtime/notebooks/blob/main/Magenta_RT_Finetune.ipynb" target="_blank">🔗 MagentaRT Fine-tuning Colab</a></p>
285
  <p>This will create checkpoint folders like:</p>
286
  <ul>
287
  <li><code>checkpoint_1861001/</code></li>
@@ -291,7 +300,7 @@
291
  </div>
292
 
293
  <div class="card">
294
- <h3>2. Package Checkpoints</h3>
295
  <p>Checkpoints must be compressed as .tgz files to preserve .zarray files correctly.</p>
296
  <div class="warning">
297
  <strong>Important:</strong> Do not download checkpoint folders directly from Google Drive - the .zarray files won't transfer properly.
@@ -299,7 +308,7 @@
299
  </div>
300
  </div>
301
 
302
- <h3>Checkpoint Packaging Script</h3>
303
  <p>Use this in a Colab cell to properly package your checkpoints:</p>
304
  <pre><button class="copy-btn" onclick="copyCode(this)">Copy</button># Mount Drive to access your trained checkpoints
305
  from google.colab import drive
@@ -325,7 +334,7 @@ CKPT_SRC = '/content/drive/MyDrive/thepatch/checkpoint_1862001' # Adjust path
325
  from google.colab import files
326
  files.download('/content/checkpoint_1862001.tgz')</pre>
327
 
328
- <h3>3. Upload to Hugging Face</h3>
329
  <p>Create a model repository and upload:</p>
330
  <ul>
331
  <li>Your <code>.tgz</code> checkpoint files</li>
@@ -338,12 +347,12 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
338
  Shows the correct file structure with .tgz files and .npy steering assets in the root directory.
339
  </div>
340
 
341
- <h3>4. Use in the App</h3>
342
  <p>In the iOS app's model selector, point to your Hugging Face repository URL. The app will automatically discover available checkpoints and allow switching between them.</p>
343
  </div>
344
 
345
  <div class="section">
346
- <h2>Technical Specifications</h2>
347
  <ul>
348
  <li><strong>Audio Format:</strong> 48 kHz stereo, ~2.0s chunks with ~40ms crossfade</li>
349
  <li><strong>Model Sizes:</strong> Base and Large variants available</li>
@@ -358,7 +367,7 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
358
  </div>
359
 
360
  <div class="section">
361
- <h2>Integration with iOS App</h2>
362
  <p>This API is designed to work seamlessly with our iOS music generation app:</p>
363
  <ul>
364
  <li>Real-time audio streaming via WebSockets</li>
@@ -369,7 +378,7 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
369
  </div>
370
 
371
  <div class="section">
372
- <h2>Deployment</h2>
373
  <p>To run your own instance:</p>
374
  <ol>
375
  <li>Duplicate this Hugging Face Space</li>
@@ -380,7 +389,7 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
380
  </div>
381
 
382
  <div class="section">
383
- <h2>Support & Contact</h2>
384
  <p>This is an active research project. For questions, technical support, or collaboration:</p>
385
  <p><strong>Email:</strong> <a href="mailto:kev@thecollabagepatch.com">kev@thecollabagepatch.com</a></p>
386
 
@@ -390,9 +399,14 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
390
  </div>
391
 
392
  <div class="section">
393
- <h2>Licensing</h2>
394
  <p>Built on Google's MagentaRT (Apache 2.0 + CC-BY 4.0). Users are responsible for their generated outputs and ensuring compliance with applicable laws and platform policies.</p>
395
- <p><a href="/docs">📖 API Reference Documentation</a></p>
396
  </div>
397
 
398
  <script>
 
99
  </head>
100
  <body>
101
  <div class="header">
102
+ <h1>MagentaRT Research API</h1>
103
  <p class="muted"><strong>AI Music Generation API</strong> • Real-time streaming • Custom fine-tune support</p>
104
  <span class="badge">Research Project</span>
105
  </div>
106
 
107
+ <div class="section">
108
+ <h2>what this is</h2>
109
+ <p>This API serves Google's <a href="https://huggingface.co/google/magenta-realtime" target="_blank">MagentaRT</a> in two distinct ways. First, as a backend for our iOS app (the untitled jamming app) where users create initial loops with Stability AI's <a href="https://huggingface.co/stabilityai/stable-audio-open-small" target="_blank">stable-audio-open-small</a> and then MagentaRT jams on top of that audio context. Second, as a standalone web interface that connects directly to MagentaRT via WebSockets without any audio context.</p>
110
+
111
+ <p>Both modes support switching between base models and custom fine-tunes hosted on Hugging Face. This is designed as a template space for duplication, letting you experiment with real-time music generation outside of Google Colab.</p>
112
+
113
+ <p>This is meant to be duplicated to your own GPU-enabled space since the iOS app is still in active development and doesn't have funding to support multiple concurrent users yet.</p>
114
+
115
+ <div class="info">
116
+ <strong>Hardware Requirements:</strong> Optimal performance requires an L40S GPU (48GB VRAM) for real-time streaming. L4 24GB almost works but will not achieve real-time performance (if someone knows an optimization that will solve this, please let me know).
117
+ </div>
118
+ </div>
119
+
120
  <section id="env-vars" style="margin-top: 24px;">
121
+ <h3>environment variables (optional, but helpful)</h3>
122
  <p>
123
  You can boot this Space directly into your own finetune by setting the variables below in
124
  <em>Settings → Variables and secrets → Variables</em>. If you don't set them, you can still
 
195
  </p>
196
 
197
  <div class="demo-placeholder">
198
+ <h3>app demo video</h3>
199
  <video controls preload="metadata" playsinline style="width:100%; border-radius:8px; max-width:540px; display:block; margin:0 auto">
200
  <source src="./lil_demo_540p.mp4" type="video/mp4">
201
  Your browser does not support the video tag.
 
204
  </div>
205
 
206
  <div class="section">
207
+ <h2>overview</h2>
208
  <p>This API powers AI music generation using Google's MagentaRT, designed for real-time audio streaming using finetunes hosted on HF. Built for iOS app integration with WebSocket streaming support.</p>
209
  </div>
210
 
211
  <div class="section">
212
+ <h2>quick start - WebSocket streaming</h2>
213
  <p>Connect to <code>wss://&lt;your-space&gt;/ws/jam</code> for real-time audio generation:</p>
214
 
215
+ <h3>start real-time generation</h3>
216
  <pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{
217
  "type": "start",
218
  "mode": "rt",
 
230
  }
231
  }</pre>
232
 
233
+ <h3>update parameters live</h3>
234
  <pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{
235
  "type": "update",
236
  "styles": "jazz, hiphop",
 
242
  "centroid_weights": "0.1, 0.3, 0.0"
243
  }</pre>
244
 
245
+ <h3>stop generation</h3>
246
  <pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{"type": "stop"}</pre>
247
  </div>
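Editor's note on the quick-start hunk above: the three control messages (`start`, `update`, `stop`) are plain JSON objects, and list-like values such as `styles` and `centroid_weights` are sent as comma-separated strings. The sketch below only builds those payloads from the fields visible in this diff. The helper names are hypothetical, the fields elided by the hunk boundaries are deliberately left out, and whether the server accepts a message with only these fields is an assumption.

```python
import json


def _csv(values):
    """Join values into the comma-separated string form used by the sample messages."""
    return ", ".join(str(v) for v in values)


def start_message(mode="rt"):
    # Only "type" and "mode" are visible in the diff; the remaining start
    # fields are elided there, so they are not guessed at here.
    return json.dumps({"type": "start", "mode": mode})


def update_message(styles, centroid_weights):
    # Hypothetical helper: mirrors the two update fields shown in the diff.
    return json.dumps({
        "type": "update",
        "styles": _csv(styles),
        "centroid_weights": _csv(centroid_weights),
    })


def stop_message():
    return json.dumps({"type": "stop"})


if __name__ == "__main__":
    print(start_message())
    print(update_message(["jazz", "hiphop"], [0.1, 0.3, 0.0]))
    print(stop_message())
```

Each returned string would be sent as one text frame over the `wss://<your-space>/ws/jam` connection by whatever WebSocket client you use.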
248
 
249
  <div class="section">
250
+ <h2>API endpoints</h2>
251
 
252
  <div class="endpoint">
253
  <strong>POST /generate</strong> - Generate 4–8 bars of music with input audio
 
283
  </div>
284
 
285
  <div class="section">
286
+ <h2>custom fine-tuning</h2>
287
  <p>Train your own MagentaRT models and use them with this API and the iOS app.</p>
288
 
289
  <div class="grid">
290
  <div class="card">
291
+ <h3>1. train your model</h3>
292
  <p>Use the official MagentaRT fine-tuning notebook:</p>
293
+ <p><a href="https://github.com/magenta-realtime/notebooks/blob/main/Magenta_RT_Finetune.ipynb" target="_blank">MagentaRT Fine-tuning Colab</a></p>
294
  <p>This will create checkpoint folders like:</p>
295
  <ul>
296
  <li><code>checkpoint_1861001/</code></li>
 
300
  </div>
301
 
302
  <div class="card">
303
+ <h3>2. package checkpoints</h3>
304
  <p>Checkpoints must be compressed as .tgz files to preserve .zarray files correctly.</p>
305
  <div class="warning">
306
  <strong>Important:</strong> Do not download checkpoint folders directly from Google Drive - the .zarray files won't transfer properly.
 
308
  </div>
309
  </div>
310
 
311
+ <h3>checkpoint packaging script</h3>
312
  <p>Use this in a Colab cell to properly package your checkpoints:</p>
313
  <pre><button class="copy-btn" onclick="copyCode(this)">Copy</button># Mount Drive to access your trained checkpoints
314
  from google.colab import drive
 
334
  from google.colab import files
335
  files.download('/content/checkpoint_1862001.tgz')</pre>
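Editor's note on the packaging script above: most of it is elided by the hunk boundaries, so the following is an independent minimal sketch of the same idea, not the author's script. It uses Python's standard `tarfile` module to pack a checkpoint folder into a `.tgz` and then counts the `.zarray` entries in the archive as a sanity check. The function names are hypothetical.

```python
import tarfile
from pathlib import Path


def package_checkpoint(ckpt_dir, out_dir):
    """Pack ckpt_dir (hidden files such as .zarray included) into <name>.tgz.

    out_dir should be outside ckpt_dir so the archive does not add itself.
    """
    ckpt_dir = Path(ckpt_dir)
    out_path = Path(out_dir) / f"{ckpt_dir.name}.tgz"
    with tarfile.open(out_path, "w:gz") as tar:
        # arcname keeps the checkpoint folder as the archive's top-level entry.
        tar.add(ckpt_dir, arcname=ckpt_dir.name)
    return out_path


def count_zarray(tgz_path):
    """Count .zarray metadata files inside the archive."""
    with tarfile.open(tgz_path, "r:gz") as tar:
        return sum(1 for m in tar.getmembers() if Path(m.name).name == ".zarray")
```

Comparing `count_zarray(...)` against the number of `.zarray` files in the source folder is a quick way to confirm nothing was dropped before uploading.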
336
 
337
+ <h3>3. upload to hugging face</h3>
338
  <p>Create a model repository and upload:</p>
339
  <ul>
340
  <li>Your <code>.tgz</code> checkpoint files</li>
 
347
  Shows the correct file structure with .tgz files and .npy steering assets in the root directory.
348
  </div>
349
 
350
+ <h3>4. use in the app</h3>
351
  <p>In the iOS app's model selector, point to your Hugging Face repository URL. The app will automatically discover available checkpoints and allow switching between them.</p>
352
  </div>
353
 
354
  <div class="section">
355
+ <h2>technical specifications</h2>
356
  <ul>
357
  <li><strong>Audio Format:</strong> 48 kHz stereo, ~2.0s chunks with ~40ms crossfade</li>
358
  <li><strong>Model Sizes:</strong> Base and Large variants available</li>
 
367
  </div>
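Editor's note on the audio format line above: at 48 kHz, a ~2.0 s chunk is about 96,000 frames per channel and a ~40 ms crossfade is about 1,920 frames. The sketch below works out those numbers and shows one common way to blend the tail of one chunk into the head of the next. The equal-power curve is an assumption; the diff does not say which crossfade shape the server uses.

```python
import math

SAMPLE_RATE = 48_000       # Hz, from the spec line above
CHUNK_SECONDS = 2.0        # approximate, per the spec line
CROSSFADE_SECONDS = 0.040  # approximate, per the spec line


def frames(seconds, sample_rate=SAMPLE_RATE):
    """Convert a duration in seconds to a whole number of frames."""
    return round(seconds * sample_rate)


def crossfade(prev_tail, next_head):
    """Equal-power crossfade of two equal-length mono sample lists.

    Assumed curve: cos/sin gains, whose squares always sum to 1.
    For stereo, apply this to each channel separately.
    """
    n = len(prev_tail)
    if n != len(next_head):
        raise ValueError("crossfade regions must have equal length")
    out = []
    for i in range(n):
        t = (i + 0.5) / n  # position within the fade, strictly between 0 and 1
        fade_out = math.cos(t * math.pi / 2)
        fade_in = math.sin(t * math.pi / 2)
        out.append(prev_tail[i] * fade_out + next_head[i] * fade_in)
    return out
```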
368
 
369
  <div class="section">
370
+ <h2>integration with iOS app</h2>
371
  <p>This API is designed to work seamlessly with our iOS music generation app:</p>
372
  <ul>
373
  <li>Real-time audio streaming via WebSockets</li>
 
378
  </div>
379
 
380
  <div class="section">
381
+ <h2>deployment</h2>
382
  <p>To run your own instance:</p>
383
  <ol>
384
  <li>Duplicate this Hugging Face Space</li>
 
389
  </div>
390
 
391
  <div class="section">
392
+ <h2>support & contact</h2>
393
  <p>This is an active research project. For questions, technical support, or collaboration:</p>
394
  <p><strong>Email:</strong> <a href="mailto:kev@thecollabagepatch.com">kev@thecollabagepatch.com</a></p>
395
 
 
399
  </div>
400
 
401
  <div class="section">
402
+ <h2>licensing</h2>
403
  <p>Built on Google's MagentaRT (Apache 2.0 + CC-BY 4.0). Users are responsible for their generated outputs and ensuring compliance with applicable laws and platform policies.</p>
404
+ <p><a href="/docs">API Reference Documentation</a></p>
405
+ </div>
406
+
407
+ <div class="section">
408
+ <h2>contributors</h2>
409
+ <p>Kevin Griffing and Andrew Luck</p>
410
  </div>
411
 
412
  <script>