	<!DOCTYPE html>
	<html>
	<head>
	<meta charset="utf-8">
	<title>MagentaRT Research API</title>
	<style>
	body { font-family: Arial, sans-serif; max-width: 860px; margin: 48px auto; padding: 0 20px; color:#111; }
	code, pre { background:#f6f8fa; border:1px solid #eaecef; border-radius:6px; padding:2px 6px; }
	pre { padding:12px; overflow:auto; }
	.muted { color:#555; }
	ul { line-height: 1.8; }
	</style>
	</head>
	<body>
	<h1>🎵 MagentaRT Research API</h1>
	<p class="muted"><strong>Purpose:</strong> AI music generation for iOS/web app research using Google's MagentaRT.</p>

	<h2>Available Endpoints</h2>
	<ul>
	<li><code>POST /generate</code> – Generate 4–8 bars of music (HTTP, bar-aligned)</li>
	<li><code>POST /jam/start</code> – Start continuous jamming (HTTP)</li>
	<li><code>GET /jam/next</code> – Get next chunk (HTTP)</li>
	<li><code>POST /jam/consume</code> – Confirm a chunk as consumed (HTTP)</li>
	<li><code>POST /jam/stop</code> – End session (HTTP)</li>
	<li><code>WEBSOCKET /ws/jam</code> – Realtime streaming (<code>mode="rt"</code>)</li>
	<li><code>GET /docs</code> – API documentation (Gradio)</li>
	</ul>
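	<p>The HTTP jam endpoints above suggest a start, next, consume, stop lifecycle. The following is a minimal sketch of that ordering using only the Python standard library. The base URL is a placeholder, and request bodies and query parameters are deliberately omitted because they are not specified on this page; see <code>/docs</code> for the actual schemas.</p>

```python
from urllib.parse import urljoin
from urllib.request import Request

# Hypothetical base URL; substitute the address of your own Space.
BASE_URL = "https://your-space.example/"

# Method and path for each HTTP jam endpoint, in the order a session
# would plausibly call them (an assumption based on the endpoint list).
JAM_ENDPOINTS = [
    ("POST", "/jam/start"),
    ("GET", "/jam/next"),
    ("POST", "/jam/consume"),
    ("POST", "/jam/stop"),
]


def build_request(method: str, path: str, base_url: str = BASE_URL) -> Request:
    """Return an unsent urllib Request for one endpoint (no body attached)."""
    return Request(urljoin(base_url, path), method=method)


requests_in_order = [build_request(m, p) for m, p in JAM_ENDPOINTS]
for req in requests_in_order:
    print(req.get_method(), req.full_url)
```

	<p class="muted">Nothing is sent over the network here; each <code>Request</code> would still need its payload before being passed to <code>urllib.request.urlopen</code>.</p>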

	<h2>WebSocket Quick Start (rt mode)</h2>
	<p>Connect to <code>wss://&lt;your-space&gt;/ws/jam</code> and send:</p>
	<pre>{
	  "type": "start",
	  "mode": "rt",
	  "binary_audio": false,
	  "params": {
	    "styles": "warmup",
	    "temperature": 1.1,
	    "topk": 40,
	    "guidance_weight": 1.1,
	    "pace": "realtime",
	    "max_decode_frames": 50
	  }
	}</pre>
	<p class="muted">Set <code>pace</code> to <code>"asap"</code> to bootstrap quickly. A <code>max_decode_frames</code> of 50 gives the default chunk length of ~2.0 s; try 36–45 on smaller GPUs.</p>
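	<p>A client would typically build the start message with a JSON library rather than by hand. The sketch below is one way to do that in Python. The function name <code>build_start_message</code> and the <code>pace</code> check are illustrative assumptions; the field names and default values are taken from the example above.</p>

```python
import json

# The two pace values mentioned on this page.
ALLOWED_PACES = ("realtime", "asap")


def build_start_message(styles="warmup", temperature=1.1, topk=40,
                        guidance_weight=1.1, pace="realtime",
                        max_decode_frames=50, binary_audio=False) -> str:
    """Serialize a start message for /ws/jam in rt mode."""
    if pace not in ALLOWED_PACES:
        raise ValueError(f"pace must be one of {ALLOWED_PACES}, got {pace!r}")
    message = {
        "type": "start",
        "mode": "rt",
        "binary_audio": binary_audio,
        "params": {
            "styles": styles,
            "temperature": temperature,
            "topk": topk,
            "guidance_weight": guidance_weight,
            "pace": pace,
            "max_decode_frames": max_decode_frames,
        },
    }
    return json.dumps(message)


print(build_start_message(pace="asap", max_decode_frames=40))
```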
	<p>Update parameters live:</p>
	<pre>{
	  "type": "update",
	  "styles": "jazz, hiphop",
	  "style_weights": "1.0,0.8",
	  "temperature": 1.2,
	  "topk": 64,
	  "guidance_weight": 1.0,
	  "pace": "realtime",
	  "max_decode_frames": 40
	}</pre>
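	<p>In the update message, <code>styles</code> and <code>style_weights</code> are comma-separated strings. The sketch below builds both from a single mapping so they cannot drift out of step. It assumes weights pair with styles by position, which the example suggests but this page does not state; <code>build_update_message</code> is a hypothetical helper name.</p>

```python
import json


def build_update_message(style_weights: dict, **params) -> str:
    """Serialize an update message from a {style: weight} mapping.

    Extra keyword arguments (temperature, topk, ...) are passed through unchanged.
    """
    if not style_weights:
        raise ValueError("at least one style is required")
    names = list(style_weights)  # dicts preserve insertion order
    message = {
        "type": "update",
        "styles": ", ".join(names),
        "style_weights": ",".join(str(float(style_weights[n])) for n in names),
    }
    message.update(params)
    return json.dumps(message)


print(build_update_message({"jazz": 1.0, "hiphop": 0.8}, temperature=1.2, topk=64))
```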
	<p>Stop:</p>
	<pre>{"type":"stop"}</pre>
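	<p>Put together, a session sends <code>start</code>, optionally sends <code>update</code> messages while audio arrives, and ends with <code>stop</code>. The sketch below shows that ordering with <code>asyncio</code>. The format of server messages is not described on this page, so incoming payloads are collected untouched; <code>run_session</code> and <code>FakeSocket</code> are illustrative names, and the fake socket exists only so the example runs without a server.</p>

```python
import asyncio
import json


async def run_session(ws, start, updates, max_messages=8):
    """Send start, interleave queued updates with incoming messages, then stop.

    `ws` is any object with an async send() and async iteration over messages.
    """
    pending = list(updates)
    received = []
    await ws.send(json.dumps(start))
    async for raw in ws:
        received.append(raw)  # payload format is not assumed here; kept as-is
        if len(received) >= max_messages:
            break
        if pending:
            await ws.send(json.dumps(pending.pop(0)))
    await ws.send(json.dumps({"type": "stop"}))
    return received


class FakeSocket:
    """In-memory stand-in for a WebSocket connection."""

    def __init__(self, incoming):
        self.incoming = list(incoming)
        self.sent = []

    async def send(self, data):
        self.sent.append(json.loads(data))

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self.incoming:
            raise StopAsyncIteration
        return self.incoming.pop(0)


fake = FakeSocket(["chunk-1", "chunk-2", "chunk-3"])
received = asyncio.run(run_session(
    fake,
    start={"type": "start", "mode": "rt", "binary_audio": False,
           "params": {"styles": "warmup"}},
    updates=[{"type": "update", "temperature": 1.2}],
    max_messages=2,
))
print([m["type"] for m in fake.sent])  # ['start', 'update', 'stop']
print(received)                        # ['chunk-1', 'chunk-2']
```

	<p class="muted">Against a live Space, a connection from the third-party <code>websockets</code> package (<code>async with websockets.connect("wss://&lt;your-space&gt;/ws/jam") as ws</code>) should be usable in place of <code>FakeSocket</code>, since it offers the same <code>send</code> and async-iteration interface.</p>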

	<h2>Notes</h2>
	<ul>
	<li>Audio: 48 kHz stereo, ~2.0 s chunks by default with ~40 ms crossfade.</li>
	<li>L40S 48GB: generates faster than realtime, so prefer <code>pace: "realtime"</code>.</li>
	<li>L4 24GB: slightly slower than realtime, even with pre-roll and tuning.</li>
	<li>For sustained realtime, target ~40 GB VRAM per active stream (e.g., A100 40GB or ≈35–40 GB MIG slice).</li>
	</ul>
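	<p>The figures above translate into concrete buffer sizes. The sketch below works through the arithmetic. It assumes that chunk duration scales linearly with <code>max_decode_frames</code> at 25 frames per second, a rate inferred from 50 frames corresponding to ~2.0 s rather than stated on this page. The <code>realtime_factor</code> helper is hypothetical and only illustrates what "faster than realtime" means.</p>

```python
SAMPLE_RATE_HZ = 48_000      # 48 kHz, from the notes above
CHANNELS = 2                 # stereo
CROSSFADE_SECONDS = 0.040    # ~40 ms
FRAMES_PER_SECOND = 50 / 2.0  # assumption: inferred from 50 frames ~ 2.0 s


def chunk_seconds(max_decode_frames: int) -> float:
    """Approximate chunk duration for a given max_decode_frames value."""
    return max_decode_frames / FRAMES_PER_SECOND


def samples_per_channel(seconds: float) -> int:
    """Number of samples per channel covering the given duration."""
    return round(seconds * SAMPLE_RATE_HZ)


def realtime_factor(chunk_s: float, generation_s: float) -> float:
    """Above 1.0 means audio is produced faster than it plays back."""
    return chunk_s / generation_s


default_chunk = chunk_seconds(50)
print(default_chunk)                                    # 2.0
print(samples_per_channel(default_chunk))               # 96000
print(samples_per_channel(default_chunk) * CHANNELS)    # 192000
print(samples_per_channel(CROSSFADE_SECONDS))           # 1920
print(chunk_seconds(36), chunk_seconds(45))             # 1.44 1.8
```

	<p class="muted">Under the same assumption, the suggested 36–45 frame range for smaller GPUs corresponds to chunks of roughly 1.44–1.8 s.</p>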

	<p class="muted"><strong>Licensing:</strong> Uses MagentaRT (Apache 2.0 + CC-BY 4.0). Users are responsible for outputs.</p>
	<p>See <a href="/docs">/docs</a> for full API details and client examples.</p>
	</body>
	</html>