Commit a23b7ec
Parent(s): ec31c3e

update root html for better explains

documentation.html CHANGED (+40 -26)
@@ -99,13 +99,26 @@
</head>
<body>
<div class="header">
- <h1
+ <h1>MagentaRT Research API</h1>
<p class="muted"><strong>AI Music Generation API</strong> • Real-time streaming • Custom fine-tune support</p>
<span class="badge">Research Project</span>
</div>

+ <div class="section">
+ <h2>what this is</h2>
+ <p>This API serves Google's <a href="https://huggingface.co/google/magenta-realtime" target="_blank">MagentaRT</a> in two distinct ways. First, as a backend for our iOS app (the untitled jamming app) where users create initial loops with Stability AI's <a href="https://huggingface.co/stabilityai/stable-audio-open-small" target="_blank">stable-audio-open-small</a> and then MagentaRT jams on top of that audio context. Second, as a standalone web interface that connects directly to MagentaRT via WebSockets without any audio context.</p>
+
+ <p>Both modes support switching between base models and custom fine-tunes hosted on Hugging Face. This is designed as a template space for duplication, letting you experiment with real-time music generation outside of Google Colab.</p>
+
+ <p>This is meant to be duplicated to your own GPU-enabled space since the iOS app is still in active development and doesn't have funding to support multiple concurrent users yet.</p>
+
+ <div class="info">
+ <strong>Hardware Requirements:</strong> Optimal performance requires an L40S GPU (48GB VRAM) for real-time streaming. L4 24GB almost works but will not achieve real-time performance (if someone knows an optimization that will solve this, please let me know).
+ </div>
+ </div>
+
<section id="env-vars" style="margin-top: 24px;">
- <h3
+ <h3>environment variables (optional, but helpful)</h3>
<p>
You can boot this Space directly into your own finetune by setting the variables below in
<em>Settings → Variables and secrets → Variables</em>. If you don't set them, you can still
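
The env-vars hunk above points at the Space settings UI; the actual variable names sit in a table further down documentation.html and are not visible in this diff, so the key below is only a placeholder. As a hedged sketch, the same variables can also be set with huggingface_hub instead of the web form:

    # Sketch: set a Space variable programmatically (assumes huggingface_hub >= 0.17).
    # Replace the placeholder repo id and key with the names listed in documentation.html.
    from huggingface_hub import HfApi

    api = HfApi()  # picks up HF_TOKEN from the environment or a cached login
    api.add_space_variable(
        repo_id="your-username/your-duplicated-space",
        key="YOUR_FINETUNE_VARIABLE",  # placeholder, not a real variable name from the docs
        value="your-username/your-finetune-repo",
    )
    api.restart_space(repo_id="your-username/your-duplicated-space")  # apply the change
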
@@ -182,7 +195,7 @@
</p>

<div class="demo-placeholder">
- <h3
+ <h3>app demo video</h3>
<video controls preload="metadata" playsinline style="width:100%; border-radius:8px; max-width:540px; display:block; margin:0 auto">
<source src="./lil_demo_540p.mp4" type="video/mp4">
Your browser does not support the video tag.
@@ -191,19 +204,15 @@
</div>

<div class="section">
- <h2>
+ <h2>overview</h2>
<p>This API powers AI music generation using Google's MagentaRT, designed for real-time audio streaming using finetunes hosted on HF. Built for iOS app integration with WebSocket streaming support.</p>
-
- <div class="info">
- <strong>Hardware Requirements:</strong> Optimal performance requires an L40S GPU (48GB VRAM) for real-time streaming. L4 24GB almost works but will not achieve real-time performance (if someone knows an optimization that will solve this, please let me know).
- </div>
</div>

<div class="section">
- <h2>
+ <h2>quick start - WebSocket streaming</h2>
<p>Connect to <code>wss://<your-space>/ws/jam</code> for real-time audio generation:</p>

- <h3>
+ <h3>start real-time generation</h3>
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{
"type": "start",
"mode": "rt",
@@ -221,7 +230,7 @@
}
}</pre>

- <h3>
+ <h3>update parameters live</h3>
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{
"type": "update",
"styles": "jazz, hiphop",
@@ -233,12 +242,12 @@
"centroid_weights": "0.1, 0.3, 0.0"
}</pre>

- <h3>
+ <h3>stop generation</h3>
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{"type": "stop"}</pre>
</div>

<div class="section">
- <h2>API
+ <h2>API endpoints</h2>

<div class="endpoint">
<strong>POST /generate</strong> - Generate 4–8 bars of music with input audio
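
The three WebSocket message types shown in the hunks above (start, update, stop) are enough for a minimal client. A hedged sketch with Python's websockets package; it only sends the fields visible in this diff, and since the format of the audio messages coming back is not shown here, the client simply buffers whatever it receives:

    # pip install websockets
    import asyncio
    import json
    import websockets

    SPACE_WS = "wss://<your-space>/ws/jam"  # fill in your Space hostname

    async def jam(seconds: float = 10.0) -> None:
        async with websockets.connect(SPACE_WS, max_size=None) as ws:
            # Start real-time generation (only the fields shown in the docs' start payload).
            await ws.send(json.dumps({"type": "start", "mode": "rt"}))

            # Steer the jam with the update payload from the docs.
            await ws.send(json.dumps({
                "type": "update",
                "styles": "jazz, hiphop",
                "centroid_weights": "0.1, 0.3, 0.0",
            }))

            # Buffer the stream for a while; the exact audio message format is server-defined.
            loop = asyncio.get_running_loop()
            deadline = loop.time() + seconds
            frames = []
            while loop.time() < deadline:
                try:
                    frames.append(await asyncio.wait_for(ws.recv(), timeout=1.0))
                except asyncio.TimeoutError:
                    continue

            await ws.send(json.dumps({"type": "stop"}))
            print(f"received {len(frames)} messages")

    asyncio.run(jam())
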
@@ -274,14 +283,14 @@
</div>

<div class="section">
- <h2>
+ <h2>custom fine-tuning</h2>
<p>Train your own MagentaRT models and use them with this API and the iOS app.</p>

<div class="grid">
<div class="card">
- <h3>1.
+ <h3>1. train your model</h3>
<p>Use the official MagentaRT fine-tuning notebook:</p>
- <p><a href="https://github.com/magenta-realtime/notebooks/blob/main/Magenta_RT_Finetune.ipynb" target="_blank"
+ <p><a href="https://github.com/magenta-realtime/notebooks/blob/main/Magenta_RT_Finetune.ipynb" target="_blank">MagentaRT Fine-tuning Colab</a></p>
<p>This will create checkpoint folders like:</p>
<ul>
<li><code>checkpoint_1861001/</code></li>
@@ -291,7 +300,7 @@
</div>

<div class="card">
- <h3>2.
+ <h3>2. package checkpoints</h3>
<p>Checkpoints must be compressed as .tgz files to preserve .zarray files correctly.</p>
<div class="warning">
<strong>Important:</strong> Do not download checkpoint folders directly from Google Drive - the .zarray files won't transfer properly.
@@ -299,7 +308,7 @@
</div>
</div>

- <h3>
+ <h3>checkpoint packaging script</h3>
<p>Use this in a Colab cell to properly package your checkpoints:</p>
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button># Mount Drive to access your trained checkpoints
from google.colab import drive
@@ -325,7 +334,7 @@ CKPT_SRC = '/content/drive/MyDrive/thepatch/checkpoint_1862001' # Adjust path
from google.colab import files
files.download('/content/checkpoint_1862001.tgz')</pre>

- <h3>3.
+ <h3>3. upload to hugging face</h3>
<p>Create a model repository and upload:</p>
<ul>
<li>Your <code>.tgz</code> checkpoint files</li>
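
The packaging snippet is only partially visible here (the archive step itself falls between the two hunks above). A hedged sketch of what that step and the step-3 upload might look like, using the stock tarfile module and huggingface_hub; the repo id is a placeholder:

    # Sketch: package a checkpoint folder as .tgz (so .zarray files survive) and upload it.
    import os
    import tarfile
    from huggingface_hub import HfApi

    CKPT_SRC = '/content/drive/MyDrive/thepatch/checkpoint_1862001'  # Adjust path
    TGZ_PATH = '/content/checkpoint_1862001.tgz'

    # Archive the whole folder under its own name so the internal layout is preserved.
    with tarfile.open(TGZ_PATH, 'w:gz') as tar:
        tar.add(CKPT_SRC, arcname=os.path.basename(CKPT_SRC))

    # Step 3: create a model repo and push the archive to its root (placeholder repo id).
    api = HfApi()
    api.create_repo('your-username/your-magentart-finetune', repo_type='model', exist_ok=True)
    api.upload_file(
        path_or_fileobj=TGZ_PATH,
        path_in_repo='checkpoint_1862001.tgz',  # .tgz files belong in the repo root
        repo_id='your-username/your-magentart-finetune',
    )
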
@@ -338,12 +347,12 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
Shows the correct file structure with .tgz files and .npy steering assets in the root directory.
</div>

- <h3>4.
+ <h3>4. use in the app</h3>
<p>In the iOS app's model selector, point to your Hugging Face repository URL. The app will automatically discover available checkpoints and allow switching between them.</p>
</div>

<div class="section">
- <h2>
+ <h2>technical specifications</h2>
<ul>
<li><strong>Audio Format:</strong> 48 kHz stereo, ~2.0s chunks with ~40ms crossfade</li>
<li><strong>Model Sizes:</strong> Base and Large variants available</li>
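
The audio-format line above implies a client has to stitch consecutive ~2.0 s chunks with a ~40 ms overlap. A rough numpy sketch of an equal-power crossfade at 48 kHz, purely illustrative since the actual app and server stitching code is not part of this diff:

    # Sketch: join two consecutive 48 kHz stereo chunks with a ~40 ms equal-power crossfade.
    import numpy as np

    SR = 48_000
    FADE = int(0.040 * SR)  # ~40 ms -> 1920 samples

    def stitch(prev: np.ndarray, nxt: np.ndarray) -> np.ndarray:
        """prev/nxt are float32 arrays shaped (samples, 2); the first FADE samples
        of nxt overlap the last FADE samples of prev."""
        t = np.linspace(0.0, np.pi / 2, FADE, dtype=np.float32)[:, None]
        fade_out, fade_in = np.cos(t), np.sin(t)  # equal-power curves
        overlap = prev[-FADE:] * fade_out + nxt[:FADE] * fade_in
        return np.concatenate([prev[:-FADE], overlap, nxt[FADE:]], axis=0)

    # Two ~2.0 s silent chunks, just to show the shapes involved.
    a = np.zeros((2 * SR, 2), dtype=np.float32)
    b = np.zeros((2 * SR, 2), dtype=np.float32)
    print(stitch(a, b).shape)  # (382080, 2): two chunks minus the overlapped region
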
@@ -358,7 +367,7 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
</div>

<div class="section">
- <h2>
+ <h2>integration with iOS app</h2>
<p>This API is designed to work seamlessly with our iOS music generation app:</p>
<ul>
<li>Real-time audio streaming via WebSockets</li>
@@ -369,7 +378,7 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
</div>

<div class="section">
- <h2>
+ <h2>deployment</h2>
<p>To run your own instance:</p>
<ol>
<li>Duplicate this Hugging Face Space</li>
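
The deployment list above goes through the web UI. A hedged huggingface_hub equivalent for the duplication step; the Space ids are placeholders and the hardware flavor string is an assumption, so check the current Spaces docs before relying on it:

    # Sketch: duplicate the Space and request GPU hardware programmatically.
    from huggingface_hub import HfApi

    api = HfApi()
    api.duplicate_space(
        from_id="the-source-space-id",             # placeholder: the Space this doc lives in
        to_id="your-username/magentart-research",  # your copy
        private=True,
    )
    # "l40sx1" is an assumed flavor name for an L40S; it may differ on your account/tier.
    api.request_space_hardware("your-username/magentart-research", hardware="l40sx1")
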
@@ -380,7 +389,7 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
</div>

<div class="section">
- <h2>
+ <h2>support & contact</h2>
<p>This is an active research project. For questions, technical support, or collaboration:</p>
<p><strong>Email:</strong> <a href="mailto:kev@thecollabagepatch.com">kev@thecollabagepatch.com</a></p>

@@ -390,9 +399,14 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
</div>

<div class="section">
- <h2>
+ <h2>licensing</h2>
<p>Built on Google's MagentaRT (Apache 2.0 + CC-BY 4.0). Users are responsible for their generated outputs and ensuring compliance with applicable laws and platform policies.</p>
- <p><a href="/docs"
+ <p><a href="/docs">API Reference Documentation</a></p>
+ </div>
+
+ <div class="section">
+ <h2>contributors</h2>
+ <p>Kevin Griffing and Andrew Luck</p>
</div>

<script>