Streaming transcription
Partial text lands as you speak — a low-latency scratch layer — and the final pass corrects punctuation, casing, and segment boundaries on stop.
Overture turns your iPhone mic into a dictation and transcription engine. WhisperKit on Apple Silicon, streaming partials, speaker-aware when a model supports it. Recordings and transcripts never leave the device.
Apple-Silicon-optimized Whisper variants run on the Neural Engine. No network, no uploads, no metering — just the mic and the model.
Overture output flows into Gale, Mist, and Ashe as text input — speak a prompt, hand the transcript to any module, keep working.
Overture is Nimbus8's on-device audio module. It turns your iPhone microphone into two things at once: a dictation surface for any other Nimbus8 module, and a standalone transcription tool for meetings, voice memos, and long-form recordings.
Status: exploring. The runtime is in place, WhisperKit is integrated, and we're tuning the UX for streaming partials versus finalized output before committing to a ship date.
Overture is built on WhisperKit, Argmax's Apple-Silicon-optimized port of OpenAI's Whisper. Models ship as Core ML packages, compiled for the Neural Engine and GPU, and the runtime picks the fastest path for your chip.
Inference is fully local. There is no cloud fallback and no silent upload — the audio buffer lives in memory, the compiled model lives in the app sandbox, and neither touches the network.
Dictation is the short-form path: hold to speak, release to commit. Overture streams partials into the target field so you can see the transcript forming in real time, then rewrites the final output with proper punctuation and casing once the audio stops.
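The partial-versus-final behavior can be modeled as a small state machine. This is a minimal sketch of that idea, assuming a scratch layer that partials overwrite and a commit step on stop; names like `DictationSession` are illustrative, not Overture's actual API.

```swift
import Foundation

// Illustrative sketch of the dictation flow: streaming partials
// overwrite a low-latency scratch layer; stopping commits a
// finalized pass with corrected punctuation and casing.
final class DictationSession {
    private(set) var committed = ""   // finalized text
    private(set) var scratch = ""     // partial text, may be revised

    // Each streaming partial replaces the previous scratch text wholesale.
    func receivePartial(_ text: String) {
        scratch = text
    }

    // On stop, the final pass rewrites the partial and clears the scratch.
    func finish(finalText: String) {
        committed += (committed.isEmpty ? "" : " ") + finalText
        scratch = ""
    }

    // What the target field displays at any moment.
    var display: String {
        if scratch.isEmpty { return committed }
        return committed.isEmpty ? scratch : committed + " " + scratch
    }
}
```

The key design point the sketch captures: partials are replaceable and never accumulate, so a mid-utterance correction from the model simply swaps the scratch text without touching committed output.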
Any module that accepts text input can receive a dictation. Gale's composer, Mist's query bar, Ashe's task description field — the same capture surface feeds all of them through the shared runtime.
For meetings, interviews, and voice memos, Overture runs a longer pass with larger context windows and segment-level timestamps. Output is saved as a searchable transcript in the sandbox, alongside the source audio if you ask it to keep one.
Speaker diarization is opt-in and depends on the model. When a WhisperKit build supports speaker turns, Overture surfaces them; when it doesn't, the transcript is a single uninterrupted stream.
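When speaker labels are available, segment-level output can be folded into turns; without labels, everything collapses into one stream. A hedged sketch of that grouping, where the `Segment` shape is an assumption for illustration, not WhisperKit's actual output type:

```swift
import Foundation

struct Segment {
    let text: String
    let speaker: Int?   // nil when the model has no speaker-turn support
}

// Collapse consecutive segments from the same speaker into one turn.
// With no speaker labels (all nil), the result is a single stream.
func turns(from segments: [Segment]) -> [(speaker: Int?, text: String)] {
    var result: [(speaker: Int?, text: String)] = []
    for seg in segments {
        if let last = result.last, last.speaker == seg.speaker {
            result[result.count - 1].text += " " + seg.text
        } else {
            result.append((seg.speaker, seg.text))
        }
    }
    return result
}
```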
The Overture model picker filters WhisperKit builds down to what your device can actually run. Current picks, verified on-device:
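The filtering step can be sketched as a simple capability gate. The model names and memory floors below are illustrative assumptions for the sketch, not Overture's verified picks:

```swift
import Foundation

struct ModelBuild {
    let name: String
    let minRAMBytes: UInt64   // rough floor to load and run comfortably
}

// Hypothetical catalog; a real picker would read the WhisperKit registry.
let catalog = [
    ModelBuild(name: "whisper-tiny", minRAMBytes: 1 << 30),           // ~1 GB
    ModelBuild(name: "whisper-small", minRAMBytes: 3 << 30),          // ~3 GB
    ModelBuild(name: "whisper-large-v3-turbo", minRAMBytes: 6 << 30), // ~6 GB
]

// Keep only the builds this device can plausibly run.
func runnableBuilds(deviceRAM: UInt64 = ProcessInfo.processInfo.physicalMemory)
    -> [ModelBuild] {
    catalog.filter { $0.minRAMBytes <= deviceRAM }
}
```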
Nothing, by default. Overture records into the iOS sandbox, runs WhisperKit against the buffer locally, and writes transcripts to local storage. The microphone permission is the standard iOS prompt; the network permission is irrelevant because Overture does not use the network.
There is no telemetry. There is no account. Audio is not uploaded, not streamed, not cached on a server — there is no server. See the privacy policy for the full data flow.
Only to download WhisperKit models the first time. Once a model is on your device, Overture records and transcribes fully offline — airplane mode is a supported configuration.
Whisper large-v3 turbo, running locally via WhisperKit, matches the cloud Whisper endpoint word-for-word on clean audio and holds up well under noise. Distil-whisper trades a few percentage points of accuracy for roughly 2x speed.
Yes — drop an m4a, wav, or mp3 into Overture and it runs the same WhisperKit pipeline against the file. Long files are segmented and transcribed in chunks, then stitched into a single transcript.
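The chunk-and-stitch pass can be sketched as window arithmetic over the file's duration. The 30-second window and 2-second overlap here are assumptions for illustration, not Overture's tuned values; the overlap exists so words falling on a boundary appear in both neighboring chunks and can be reconciled during stitching.

```swift
import Foundation

// Split a long recording into fixed windows with a small overlap.
func chunkRanges(duration: Double, window: Double = 30, overlap: Double = 2)
    -> [(start: Double, end: Double)] {
    guard duration > 0 else { return [] }
    var ranges: [(start: Double, end: Double)] = []
    var start = 0.0
    while start < duration {
        ranges.append((start, min(start + window, duration)))
        start += window - overlap
    }
    return ranges
}
```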
Opt-in and model-dependent. WhisperKit builds with speaker-turn support will surface diarized output in Overture; others will produce a single continuous transcript.
Through the shared Nimbus8 runtime. A dictation from Overture is just text — it flows into Gale's composer, Mist's query bar, or Ashe's task description through the same capability manifest the rest of the app uses.
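Because a dictation is just text, the handoff can be sketched as a plain protocol. The names here (`TextReceiving`, `deliver`) are illustrative, not the actual Nimbus8 capability manifest:

```swift
import Foundation

// Any module that accepts text input can conform and receive a dictation.
protocol TextReceiving {
    var moduleName: String { get }
    func receive(text: String)
}

// Stand-in for a module's text field, e.g. a composer.
final class Composer: TextReceiving {
    let moduleName: String
    private(set) var draft = ""
    init(moduleName: String) { self.moduleName = moduleName }
    func receive(text: String) { draft += text }
}

// The runtime routes a finished transcript to whichever module has focus.
func deliver(_ transcript: String, to target: TextReceiving) {
    target.receive(text: transcript)
}
```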