Capability S/01
Dual-stream capture.
Record system audio via ScreenCaptureKit and your microphone via AVAudioEngine at the same time — so a call, a recording, and the person in the room all land in one transcript.
ScreenCaptureKit · AVAudioEngine→
Capability S/02
Local transcription.
Speech-to-text runs on a Metal-accelerated Whisper model (large-v3-turbo via MLX), with audio resampled in memory to 16 kHz mono. Fast on Apple Silicon, and entirely on-device.
Whisper · MLX→
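For the curious, the resampling step can be sketched in a few lines. This is an illustrative sketch only, not the app's actual code: the function names `to_mono` and `resample_linear` are hypothetical, and plain linear interpolation stands in for whatever filter the real pipeline uses (a production resampler would low-pass the signal first to avoid aliasing).

```python
def to_mono(frames):
    """Average the channels of each frame; frames is a list of per-frame tuples."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(samples, src_rate, dst_rate=16_000):
    """Resample a mono signal by linear interpolation (no anti-alias filter)."""
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate        # fractional index into the source
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)   # clamp at the final sample
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

One second of 48 kHz stereo passed through `to_mono` and then `resample_linear(..., 48_000)` comes out as 16,000 mono samples, the input rate Whisper models expect.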
Capability S/03
Speaker diarization.
Energy-based partitioning splits the audio into clean per-speaker utterances, so the transcript reads like a conversation instead of a wall of text.
on-device · per-utterance→
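A rough idea of what energy-based partitioning means in practice: measure loudness in short frames, then cut wherever the signal stays quiet long enough. The sketch below is an assumption-laden illustration rather than the shipped algorithm; `split_utterances`, the 20 ms frame size, the 0.02 RMS threshold, and the 10-frame gap are all hypothetical choices.

```python
import math


def frame_rms(samples, frame_len):
    """Root-mean-square energy of each full, non-overlapping frame."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def split_utterances(samples, rate=16_000, frame_ms=20,
                     threshold=0.02, max_gap_frames=10):
    """Return (start_sec, end_sec) spans where frame energy exceeds the threshold.

    A span is closed once more than max_gap_frames quiet frames follow it.
    """
    frame_len = rate * frame_ms // 1000
    active = [rms >= threshold for rms in frame_rms(samples, frame_len)]
    segments, start, last_active = [], None, None
    for idx, is_active in enumerate(active):
        if is_active:
            if start is None:
                start = idx
            last_active = idx
        elif start is not None and idx - last_active > max_gap_frames:
            segments.append((start, last_active + 1))
            start = None
    if start is not None:
        segments.append((start, last_active + 1))
    return [(s * frame_len / rate, e * frame_len / rate) for s, e in segments]
```

Energy alone finds where speech starts and stops; it does not say who is talking. One plausible way to attach a speaker, assumed here and not confirmed by the product, is to run the split separately on the microphone and system-audio streams and label each span by its source.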
Capability S/04
Meeting intelligence.
A local Gemma model via mlx-lm turns the transcript into a structured summary, a decision register, action items, and a participant list — no cloud notetaker required.
Gemma · mlx-lm→
Capability S/05
Full-text search.
Every session is indexed with SQLite FTS5, so you can query across months of meetings instantly and jump straight to the line that matters.
SQLite FTS5 · instant→
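To show what an FTS5-backed transcript index looks like, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `utterances`, its columns, and the helper functions are hypothetical, and the sketch assumes the local SQLite build ships with FTS5 enabled (most do).

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) a database with one FTS5 table of transcript lines."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS utterances "
        "USING fts5(session UNINDEXED, speaker UNINDEXED, text)"
    )
    return db


def add_utterance(db, session, speaker, text):
    """Index a single transcript line."""
    db.execute(
        "INSERT INTO utterances (session, speaker, text) VALUES (?, ?, ?)",
        (session, speaker, text),
    )


def search(db, query, limit=10):
    """Return (session, speaker, highlighted snippet) rows, best match first."""
    rows = db.execute(
        "SELECT session, speaker, "
        "snippet(utterances, 2, '[', ']', '...', 8) "
        "FROM utterances WHERE utterances MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

Because each row is a single utterance rather than a whole session, a hit points at the exact line, and `snippet` wraps the matched term in brackets for display.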
Capability S/06
Edit & export.
Correct the transcript inline, retitle sessions, and export transcripts and summaries to a folder you choose. Your words, your files, no re-upload.
editable · exports→
Capability S/07
First-run weights.
Whisper and Gemma download once from public open-weights mirrors, with live byte-level progress. After that, every transcription and summary runs offline.
one-time · local→
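Byte-level progress comes down to streaming the download in chunks and reporting the running total after each one. The sketch below illustrates the idea under stated assumptions: `copy_with_progress` and its callback signature are hypothetical, and the source is any file-like object (for a real download, an open HTTP response whose `Content-Length` header supplies the total).

```python
def copy_with_progress(src, dst, total_bytes, on_progress, chunk_size=1 << 16):
    """Copy src to dst in chunks, calling on_progress(done, total) after each.

    src and dst are file-like objects; total_bytes may be 0 if unknown.
    Returns the number of bytes copied.
    """
    done = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:          # empty read signals end of stream
            break
        dst.write(chunk)
        done += len(chunk)
        on_progress(done, total_bytes)
    return done
```

A UI would pass a callback that updates a progress bar with `done / total_bytes`; once the file exists on disk, later launches can skip the download entirely.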
Capability S/08
Menu bar & file import.
Start a capture from the menu bar, drop in an existing audio file to transcribe, and let smart titles name the session for you.
menu bar · import→