Streaming transcription
Partial text lands as you speak — a low-latency scratch layer — and the final pass corrects punctuation, casing, and segment boundaries on stop.
Overture turns your iPhone mic into a dictation and transcription engine. WhisperKit on Apple Silicon, streaming partials, speaker-aware when a model supports it. Recordings and transcripts never leave the device.
Apple-Silicon-optimized Whisper variants run on the Neural Engine. No network, no uploads, no metering — just the mic and the model.
Overture output flows into Gale, Mist, and Ashe as text input — speak a prompt, hand the transcript to any module, keep working.
Overture is Nimbus8's on-device audio module. It turns your iPhone microphone into two things at once: a dictation surface for any other Nimbus8 module, and a standalone transcription tool for meetings, voice memos, and long-form recordings.
Status: exploring. The runtime is in place, WhisperKit is integrated, and we're tuning the UX for streaming partials versus finalized output before committing to a ship date.
Overture is built on WhisperKit, Argmax's Apple-Silicon-optimized port of OpenAI's Whisper. Models ship as Core ML packages, compiled for the Neural Engine and GPU, and the runtime picks the fastest path for your chip.
Inference is fully local. There is no cloud fallback and no silent upload — the audio buffer lives in memory, the compiled model lives in the app sandbox, and neither touches the network.
Dictation is the short-form path: hold to speak, release to commit. Overture streams partials into the target field so you can see the transcript forming in real time, then rewrites the final output with proper punctuation and casing once the audio stops.
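The partial-versus-final behavior can be modeled as a small state machine. This is a minimal sketch of that idea, assuming a scratch layer that partials overwrite and a commit step on stop; names like `DictationSession` are illustrative, not Overture's actual API.

```swift
import Foundation

// Illustrative sketch of the dictation flow: streaming partials
// overwrite a low-latency scratch layer; stopping commits a
// finalized pass with corrected punctuation and casing.
final class DictationSession {
    private(set) var committed = ""   // finalized text
    private(set) var scratch = ""     // partial text, may be revised

    // Each streaming partial replaces the previous scratch text wholesale.
    func receivePartial(_ text: String) {
        scratch = text
    }

    // On stop, the final pass rewrites the partial and clears the scratch.
    func finish(finalText: String) {
        committed += (committed.isEmpty ? "" : " ") + finalText
        scratch = ""
    }

    // What the target field displays at any moment.
    var display: String {
        if scratch.isEmpty { return committed }
        return committed.isEmpty ? scratch : committed + " " + scratch
    }
}
```

The key design point the sketch captures: partials are replaceable and never accumulate, so a mid-utterance correction from the model simply swaps the scratch text without touching committed output.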
Any module that accepts text input can receive a dictation. Gale's composer, Mist's query bar, Ashe's task description field — the same capture surface feeds all of them through the shared runtime.
For meetings, interviews, and voice memos, Overture runs a longer pass with larger context windows and segment-level timestamps. Output is saved as a searchable transcript in the sandbox, alongside the source audio if you ask it to keep one.
Speaker diarization is opt-in and depends on the model. When a WhisperKit build supports speaker turns, Overture surfaces them; when it doesn't, the transcript is a single uninterrupted stream.
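When speaker labels are available, segment-level output can be folded into turns; without labels, everything collapses into one stream. A hedged sketch of that grouping, where the `Segment` shape is an assumption for illustration, not WhisperKit's actual output type:

```swift
import Foundation

struct Segment {
    let text: String
    let speaker: Int?   // nil when the model has no speaker-turn support
}

// Collapse consecutive segments from the same speaker into one turn.
// With no speaker labels (all nil), the result is a single stream.
func turns(from segments: [Segment]) -> [(speaker: Int?, text: String)] {
    var result: [(speaker: Int?, text: String)] = []
    for seg in segments {
        if let last = result.last, last.speaker == seg.speaker {
            result[result.count - 1].text += " " + seg.text
        } else {
            result.append((seg.speaker, seg.text))
        }
    }
    return result
}
```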
The Overture model picker filters WhisperKit builds down to what your device can actually run. Current picks, verified on-device:
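The filtering step can be sketched as a simple capability gate. The model names and memory floors below are illustrative assumptions for the sketch, not Overture's verified picks:

```swift
import Foundation

struct ModelBuild {
    let name: String
    let minRAMBytes: UInt64   // rough floor to load and run comfortably
}

// Hypothetical catalog; a real picker would read the WhisperKit registry.
let catalog = [
    ModelBuild(name: "whisper-tiny", minRAMBytes: 1 << 30),           // ~1 GB
    ModelBuild(name: "whisper-small", minRAMBytes: 3 << 30),          // ~3 GB
    ModelBuild(name: "whisper-large-v3-turbo", minRAMBytes: 6 << 30), // ~6 GB
]

// Keep only the builds this device can plausibly run.
func runnableBuilds(deviceRAM: UInt64 = ProcessInfo.processInfo.physicalMemory)
    -> [ModelBuild] {
    catalog.filter { $0.minRAMBytes <= deviceRAM }
}
```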
Nothing, by default. Overture records into the iOS sandbox, runs WhisperKit against the buffer locally, and writes transcripts to local storage. The microphone permission is the standard iOS prompt; the network permission is irrelevant because Overture does not use the network.
There is no telemetry. There is no account. Audio is not uploaded, not streamed, not cached on a server — there is no server. See the privacy policy for the full data flow.
Only to download WhisperKit models the first time. Once a model is on your device, Overture records and transcribes fully offline — airplane mode is a supported configuration.
Whisper large-v3 turbo, running locally via WhisperKit, matches the cloud Whisper endpoint word-for-word on clean audio and holds up well under noise. Distil-whisper trades a few percentage points of accuracy for roughly 2x speed.
Yes — drop an m4a, wav, or mp3 into Overture and it runs the same WhisperKit pipeline against the file. Long files are segmented and transcribed in chunks, then stitched into a single transcript.
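The chunk-and-stitch pass can be sketched as window arithmetic over the file's duration. The 30-second window and 2-second overlap here are assumptions for illustration, not Overture's tuned values; the overlap exists so words falling on a boundary appear in both neighboring chunks and can be reconciled during stitching.

```swift
import Foundation

// Split a long recording into fixed windows with a small overlap.
func chunkRanges(duration: Double, window: Double = 30, overlap: Double = 2)
    -> [(start: Double, end: Double)] {
    guard duration > 0 else { return [] }
    var ranges: [(start: Double, end: Double)] = []
    var start = 0.0
    while start < duration {
        ranges.append((start, min(start + window, duration)))
        start += window - overlap
    }
    return ranges
}
```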
Opt-in and model-dependent. WhisperKit builds with speaker-turn support will surface diarized output in Overture; others will produce a single continuous transcript.
Through the shared Nimbus8 runtime. A dictation from Overture is just text — it flows into Gale's composer, Mist's query bar, or Ashe's task description through the same capability manifest the rest of the app uses.
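Because a dictation is just text, the handoff can be sketched as a plain protocol. The names here (`TextReceiving`, `deliver`) are illustrative, not the actual Nimbus8 capability manifest:

```swift
import Foundation

// Any module that accepts text input can conform and receive a dictation.
protocol TextReceiving {
    var moduleName: String { get }
    func receive(text: String)
}

// Stand-in for a module's text field, e.g. a composer.
final class Composer: TextReceiving {
    let moduleName: String
    private(set) var draft = ""
    init(moduleName: String) { self.moduleName = moduleName }
    func receive(text: String) { draft += text }
}

// The runtime routes a finished transcript to whichever module has focus.
func deliver(_ transcript: String, to target: TextReceiving) {
    target.receive(text: transcript)
}
```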