<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>MagentaRT Research API</title>
<style>
body { font-family: Arial, sans-serif; max-width: 860px; margin: 48px auto; padding: 0 20px; color:#111; }
code, pre { background:#f6f8fa; border:1px solid #eaecef; border-radius:6px; padding:2px 6px; }
pre { padding:12px; overflow:auto; }
.muted { color:#555; }
ul { line-height: 1.8; }
</style>
</head>
<body>
<h1>🎵 MagentaRT Research API</h1>
<p class="muted"><strong>Purpose:</strong> AI music generation for iOS/web app research using Google's MagentaRT.</p>
<h2>Available Endpoints</h2>
<ul>
<li><code>POST /generate</code> – Generate 4–8 bars of music (HTTP, bar-aligned)</li>
<li><code>POST /jam/start</code> – Start continuous jamming (HTTP)</li>
<li><code>GET /jam/next</code> – Get next chunk (HTTP)</li>
<li><code>POST /jam/consume</code> – Confirm a chunk as consumed (HTTP)</li>
<li><code>POST /jam/stop</code> – End session (HTTP)</li>
<li><code>WEBSOCKET /ws/jam</code> – Realtime streaming (<code>mode="rt"</code>)</li>
<li><code>GET /docs</code> – API documentation (Gradio)</li>
</ul>
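<p>The HTTP jam endpoints are meant to be called in sequence: start, then repeated next/consume pairs, then stop. The sketch below shows only that call order. The <code>call</code> transport, the <code>run_jam</code> helper, and the choice to pass each response back as the next payload are hypothetical; the real request and response fields are documented at <code>/docs</code>.</p>
<pre>
```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical transport: (method, path, payload) -> decoded JSON response.
# In a real client this would wrap an HTTP library of your choice.
Transport = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def run_jam(call: Transport, start_payload: Dict[str, Any],
            n_chunks: int) -> List[Dict[str, Any]]:
    """Walk the HTTP jam endpoints in order and return the chunks received."""
    # Assumption: the start response identifies the session for later calls.
    session = call("POST", "/jam/start", start_payload)
    chunks: List[Dict[str, Any]] = []
    try:
        for _ in range(n_chunks):
            chunk = call("GET", "/jam/next", session)
            chunks.append(chunk)
            # Confirm the chunk so the server knows it was consumed.
            call("POST", "/jam/consume", chunk)
    finally:
        # Always end the session, even if a request above raised.
        call("POST", "/jam/stop", session)
    return chunks
```
</pre>
<p>Putting the stop call in a <code>finally</code> block is a design choice so that a failed request does not leave a session running on the GPU.</p>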
<h2>WebSocket Quick Start (rt mode)</h2>
<p>Connect to <code>wss://&lt;your-space&gt;/ws/jam</code> and send:</p>
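<pre>{
  "type": "start",
  "mode": "rt",
  "binary_audio": false,
  "params": {
    "styles": "warmup",
    "temperature": 1.1,
    "topk": 40,
    "guidance_weight": 1.1,
    "pace": "realtime",
    "max_decode_frames": 50
  }
}</pre>
<p>Set <code>pace</code> to <code>"asap"</code> instead of <code>"realtime"</code> to bootstrap quickly. The default <code>max_decode_frames</code> of 50 gives about 2.0 s of audio per chunk; try 36–45 on smaller GPUs.</p>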
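<p>A client will usually build the start message from code rather than by hand. The following is a minimal sketch, assuming the field names shown above; the <code>build_start_message</code> helper and its validation are illustrative and not part of the API. Serializing from a dictionary keeps the payload strict JSON, which has no comment syntax.</p>
<pre>
```python
import json
from typing import Any


def build_start_message(styles: str = "warmup", pace: str = "realtime",
                        max_decode_frames: int = 50, **extra: Any) -> str:
    """Return the JSON text of a WebSocket start message for rt mode."""
    # The page lists only these two pace values; rejecting others is a
    # client-side assumption, not documented server behaviour.
    if pace not in ("realtime", "asap"):
        raise ValueError("pace must be 'realtime' or 'asap'")
    params = {
        "styles": styles,
        "temperature": 1.1,
        "topk": 40,
        "guidance_weight": 1.1,
        "pace": pace,
        "max_decode_frames": max_decode_frames,
    }
    params.update(extra)  # caller overrides, e.g. temperature=1.2
    message = {
        "type": "start",
        "mode": "rt",
        "binary_audio": False,
        "params": params,
    }
    return json.dumps(message)
```
</pre>
<p>For example, <code>build_start_message(pace="asap", max_decode_frames=40)</code> produces a start message suited to a quick bootstrap on a smaller GPU.</p>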
<p>Update parameters live:</p>
<pre>{
  "type": "update",
  "styles": "jazz, hiphop",
  "style_weights": "1.0,0.8",
  "temperature": 1.2,
  "topk": 64,
  "guidance_weight": 1.0,
  "pace": "realtime",
  "max_decode_frames": 40
}</pre>
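<p>In the update message, styles and weights travel as comma-separated strings and the parameters sit at the top level instead of under <code>params</code>. A minimal sketch of a helper that builds such a message from lists, assuming one weight per style (the example above suggests this pairing but the page does not state it); <code>build_update_message</code> is a hypothetical name.</p>
<pre>
```python
import json
from typing import Any, Dict, Sequence


def build_update_message(styles: Sequence[str], weights: Sequence[float],
                         **params: Any) -> str:
    """Return the JSON text of a live update message."""
    # Assumption: weights pair with styles one-to-one, in order.
    if len(styles) != len(weights):
        raise ValueError("need exactly one weight per style")
    message: Dict[str, Any] = {
        "type": "update",
        "styles": ", ".join(styles),
        "style_weights": ",".join(str(w) for w in weights),
    }
    # Remaining parameters are top-level keys, as in the example message.
    message.update(params)
    return json.dumps(message)
```
</pre>
<p>Calling <code>build_update_message(["jazz", "hiphop"], [1.0, 0.8], temperature=1.2, topk=64)</code> reproduces the style fields of the example message.</p>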
<p>Stop:</p>
<pre>{"type":"stop"}</pre>
<h2>Notes</h2>
<ul>
<li>Audio: 48 kHz stereo, ~2.0 s chunks by default with ~40 ms crossfade.</li>
<li>L40S 48GB: faster than realtime; prefer <code>pace: "realtime"</code>.</li>
<li>L4 24GB: slightly under realtime even with pre-roll and tuning.</li>
<li>For sustained realtime, target ~40 GB VRAM per active stream (e.g., A100 40GB or a ≈35–40 GB MIG slice).</li>
</ul>
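<p>The audio figures above translate into buffer sizes a client can plan for. The sketch below works through that arithmetic under two assumptions that this page does not confirm: that <code>max_decode_frames</code> scales linearly with duration at about 25 frames per second (inferred from 50 frames giving roughly 2.0 s), and that samples are 16-bit PCM for the byte estimate.</p>
<pre>
```python
SAMPLE_RATE_HZ = 48_000           # from the notes: 48 kHz
CHANNELS = 2                      # from the notes: stereo
ASSUMED_FRAMES_PER_SECOND = 25.0  # assumption: 50 frames is about 2.0 s
ASSUMED_BYTES_PER_SAMPLE = 2      # assumption: 16-bit PCM


def chunk_seconds(max_decode_frames: int) -> float:
    """Approximate chunk duration for a given max_decode_frames."""
    return max_decode_frames / ASSUMED_FRAMES_PER_SECOND


def samples_per_channel(seconds: float) -> int:
    """Number of samples per channel in a span of audio."""
    return round(seconds * SAMPLE_RATE_HZ)


def pcm_bytes(seconds: float) -> int:
    """Estimated size of interleaved stereo PCM for a span of audio."""
    return samples_per_channel(seconds) * CHANNELS * ASSUMED_BYTES_PER_SAMPLE
```
</pre>
<p>Under these assumptions a default 2.0 s chunk is 96,000 samples per channel (about 384,000 bytes), the 40 ms crossfade spans 1,920 samples per channel, and <code>max_decode_frames: 40</code> yields roughly 1.6 s chunks.</p>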
<p class="muted"><strong>Licensing:</strong> Uses MagentaRT (Apache 2.0 + CC-BY 4.0). Users are responsible for outputs.</p>
<p>See <a href="/docs">/docs</a> for full API details and client examples.</p>
</body>
</html>