## What this is
We're exploring AI‑assisted loop‑based music creation that can run on GPUs (not just TPUs) and stream to apps in realtime.
### Implemented backends
- **HTTP (bar‑aligned):** `/generate`, `/jam/start`, `/jam/next`, `/jam/stop`, `/jam/update`, etc.
- **WebSocket (realtime):** `ws://…/ws/jam` with `mode="rt"` (Colab‑style continuous chunks). New in this build.
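
To make the two backends concrete, here is a minimal client-side sketch. Only the endpoint paths and `mode="rt"` come from the list above. The helper names, the `host` argument, and the idea that `mode` travels as a field of a JSON message are assumptions for illustration, not the documented protocol.

```python
import json

# Paths as listed above. Everything else in this block is hypothetical.
HTTP_ENDPOINTS = ("/generate", "/jam/start", "/jam/next", "/jam/stop", "/jam/update")
WS_PATH = "/ws/jam"


def ws_url(host, secure=False):
    """Build the realtime socket URL for a caller-supplied host (hypothetical helper)."""
    scheme = "wss" if secure else "ws"
    return f"{scheme}://{host}{WS_PATH}"


def build_start_message(extra=None):
    """Return a JSON string selecting realtime mode.

    Assumption: the server reads `mode` from a JSON message. The real
    server may expect it elsewhere (for example, as a query parameter).
    """
    message = {"mode": "rt"}
    if extra:
        message.update(extra)
    return json.dumps(message)
```

A client would open `ws_url(host)` with its WebSocket library of choice and send `build_start_message()` as the first frame, then read audio chunks as they arrive.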
## What we learned (GPU notes)
- **L40S 48GB:** comfortably **faster than realtime** → we added a `pace: "realtime"` switch so the server doesn't outrun playback.
- **L4 24GB:** **consistently just under realtime**; even with pre‑roll buffering, TF32/JAX tunings, reduced chunk size, and the **base** checkpoint, we still see eventual under‑runs.
- **Implication:** For production‑quality realtime, aim for ~**40GB VRAM** per user/session (e.g., **A100 40GB**, or MIG slices ≈ **35–40GB** on newer parts). Smaller GPUs can demo, but sustained realtime is not reliable.
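
The two behaviours above (outrunning playback on a fast GPU, eventual under-runs on a slow one) both follow from the ratio of chunk duration to generation time. The sketch below illustrates that arithmetic and one possible way to pace output. The names `realtime_factor`, `seconds_until_underrun`, and `RealtimePacer` are hypothetical and are not taken from the server code. The timings in the usage note are made-up inputs, not measurements.

```python
import time

CHUNK_SECONDS = 2.0  # default chunk length from the specs


def realtime_factor(chunk_seconds, gen_seconds):
    """Above 1.0: generation outpaces playback. Below 1.0: the buffer drains."""
    return chunk_seconds / gen_seconds


def seconds_until_underrun(preroll_seconds, chunk_seconds, gen_seconds):
    """Estimate how long a pre-roll buffer lasts; None if generation keeps up.

    Assumes playback starts once the pre-roll is buffered and generation
    continues at a steady rate.
    """
    rtf = realtime_factor(chunk_seconds, gen_seconds)
    if rtf >= 1.0:
        return None
    # Each second of playback consumes 1 s of audio but only rtf s arrives.
    return preroll_seconds / (1.0 - rtf)


class RealtimePacer:
    """Hold chunks back so they are not sent ahead of the playback clock."""

    def __init__(self, chunk_seconds=CHUNK_SECONDS, clock=time.monotonic):
        self.chunk_seconds = chunk_seconds
        self.clock = clock
        self.start = None
        self.sent = 0

    def delay_before_next(self):
        """Seconds to wait before sending the next chunk (0.0 if already due)."""
        now = self.clock()
        if self.start is None:
            self.start = now
        due = self.start + self.sent * self.chunk_seconds
        self.sent += 1
        return max(0.0, due - now)
```

For example, with a hypothetical 2.5 s generation time per 2.0 s chunk, the factor is 0.8, so a 4 s pre-roll would drain after roughly 20 s of playback. This is why pre-roll only postpones under-runs on a GPU that is slightly slower than realtime.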
## Model / audio specs
- **Model:** MagentaRT (T5X; decoder RVQ depth = 16)
- **Audio:** 48 kHz stereo, 2.0 s chunks by default, 40 ms crossfade
- **Context:** 10 s rolling context window
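
The specs above translate into buffer sizes a client needs to know. The sketch below derives them and shows one possible crossfade curve. The equal-power (sine/cosine) shape is an assumption for illustration, since the curve used by the server is not stated here, and `equal_power_gains` is a hypothetical helper.

```python
import math

SAMPLE_RATE = 48_000        # Hz
CHANNELS = 2                # stereo
CHUNK_SECONDS = 2.0
CROSSFADE_SECONDS = 0.040
CONTEXT_SECONDS = 10.0

# Sizes derived from the specs (frames are per channel).
chunk_frames = round(SAMPLE_RATE * CHUNK_SECONDS)          # 96000
crossfade_frames = round(SAMPLE_RATE * CROSSFADE_SECONDS)  # 1920
context_chunks = round(CONTEXT_SECONDS / CHUNK_SECONDS)    # 5 default chunks


def equal_power_gains(n):
    """Return (fade_out, fade_in) gain lists of length n.

    Assumed curve: cos/sin over a quarter period, so that
    fade_out[i]**2 + fade_in[i]**2 == 1 at every frame.
    """
    fade_out, fade_in = [], []
    for i in range(n):
        theta = (i + 0.5) / n * (math.pi / 2)
        fade_out.append(math.cos(theta))
        fade_in.append(math.sin(theta))
    return fade_out, fade_in
```

To join two chunks, the last `crossfade_frames` of the outgoing chunk would be multiplied by `fade_out` and summed with the first `crossfade_frames` of the incoming chunk multiplied by `fade_in`, per channel.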