Capability S/01
Multi-track capture.
Record several USB or XLR microphones and the system audio loopback at the same time, each on its own track — so every host, every guest, and the call all land in one session, cleanly separated from the start.
USB · XLR · loopback
Capability S/02
Bleed & cross-talk removal.
An optimized local UVR model (HT-Demucs) isolates each voice onto its own track and strips the bleed, so the host mic stops carrying the guest and every track is clean enough to edit independently.
UVR · HT-Demucs
Capability S/03
Metal transcription.
Multi-speaker speech-to-text runs through a Metal-accelerated Whisper model (mlx-whisper) on your Apple Silicon GPU. Fast, attributed to the right track, and entirely on-device.
Whisper · MLX
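As a rough illustration of what "attributed to the right track" can mean in practice, the sketch below merges per-track transcription segments into one speaker-labelled timeline. It assumes each track has already been transcribed into Whisper-style segments (dicts with `start`, `end`, and `text`); the `Segment` class and `merge_tracks` function are hypothetical names for this example, not part of the product or of mlx-whisper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    track: str    # name of the microphone track the speech came from
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    text: str


def merge_tracks(per_track):
    """Merge {track_name: [segment dicts]} into one list ordered by start time.

    Each segment dict is assumed to carry "start", "end", and "text" keys.
    """
    merged = [
        Segment(track, seg["start"], seg["end"], seg["text"].strip())
        for track, segs in per_track.items()
        for seg in segs
    ]
    # Sort by start time; the track name breaks ties deterministically.
    return sorted(merged, key=lambda s: (s.start, s.track))
```

Because every segment keeps the name of the track it came from, speaker attribution falls out of the multi-track recording itself rather than from a separate diarization step.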
Capability S/04
Edit audio by editing text.
Select a sentence in the transcript and delete it — the underlying audio is cut to match. Tighten an episode, drop a tangent, or remove a stumble without ever opening a waveform.
transcript-driven · non-destructive
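One way to picture a transcript-driven, non-destructive cut: deleting a sentence records its time span, and playback or export uses only the spans that remain. The sketch below is a minimal version of that idea under the assumption that each sentence carries start and end timestamps; `keep_ranges` is a hypothetical helper, not the app's actual editing engine.

```python
def keep_ranges(duration, cuts):
    """Return the (start, end) spans, in seconds, that survive the cuts.

    duration: total length of the audio in seconds.
    cuts: iterable of (start, end) spans to remove; may overlap or be unsorted.
    The source audio is never modified; only this list of spans changes.
    """
    kept = []
    cursor = 0.0  # end of the audio already accounted for
    for start, end in sorted(cuts):
        # Clamp each cut to the bounds of the recording.
        start = min(max(start, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if end <= start:
            continue  # empty or inverted span, nothing to remove
        if start > cursor:
            kept.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        kept.append((cursor, duration))
    return kept
```

Restoring a deleted sentence then amounts to dropping its span from `cuts` and recomputing, which is what makes the edit reversible.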
Capability S/05
On-device marketing pack.
A local language model, run through mlx-lm, turns the finished episode into show notes, an episode outline, timestamped chapters, and a newsletter draft, with no cloud notetaker and no copy-paste into a chatbot.
mlx-lm · structured
Capability S/06
Local export.
Export isolated tracks, the mixed episode, the transcript, and every generated asset to a folder you choose. Your audio, your files, no re-upload and no watermark.
tracks · assets · local
Capability S/07
First-run weights.
The UVR, Whisper, and language model weights download once from public open-weights mirrors, with live byte-level progress. After that, every separation, transcript, and draft runs offline.
one-time · local
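Byte-level progress for a one-time download can be sketched as a loop that writes each chunk and reports a running total. The example below assumes the chunks come from some streaming source (an HTTP response, for instance) and that the total size is known up front; `write_with_progress` and its callback signature are hypothetical, chosen only for illustration.

```python
import io
from typing import Callable, Iterable


def write_with_progress(
    chunks: Iterable[bytes],
    total_bytes: int,
    sink: io.BufferedIOBase,
    on_progress: Callable[[int, int], None],
) -> int:
    """Write chunks to sink, calling on_progress(done, total) after each one.

    Returns the number of bytes written.
    """
    done = 0
    for chunk in chunks:
        sink.write(chunk)
        done += len(chunk)
        on_progress(done, total_bytes)
    return done
```

A UI can turn each `(done, total)` pair into a progress bar; once the file is on disk, later runs can skip the download entirely.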
Capability S/08
One local process tree.
A SwiftUI host boots a bundled FastAPI backend on the loopback interface and renders the editor in a WKWebView from that same origin. No browser, no remote server, no port open to your network.
SwiftUI · FastAPI · same-origin
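The "no port open to your network" property comes from binding the backend to the loopback address only. The sketch below shows that binding with Python's standard-library `http.server` standing in for FastAPI, so it runs without extra dependencies; the `Health` handler and `start_loopback_server` function are hypothetical names, and the real backend's routes are not shown here.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Health(BaseHTTPRequestHandler):
    """Tiny handler that answers every GET with a plain-text "ok"."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_loopback_server():
    """Start a server reachable only from this machine.

    Binding to 127.0.0.1 (not 0.0.0.0) keeps the socket off the LAN;
    port 0 asks the OS for any free port.
    """
    server = ThreadingHTTPServer(("127.0.0.1", 0), Health)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A host process would read the chosen port from `server.server_address` and point its embedded web view at `http://127.0.0.1:<port>/`, so the UI and the API share one origin.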