What is Zephyr?

Zephyr is Nimbus8's on-device translation module — a quiet, linguistically precise tool that renders text from one language into another without ever sending it to a cloud MT service. Paste a paragraph, drop in a PDF, or pipe a chat turn from another module, and Zephyr produces a streamed draft in the target language.

It's bidirectional by default: swap source and target with a single tap and the inverse model path is already loaded. Speaker-aware prompts keep formality, register, and tone consistent across turns, which can matter as much for translation quality as raw model size.

Which languages does Zephyr support?

Zephyr's default model, NLLB-200-distilled-600M, covers just over two hundred languages — including the long tail of low-resource pairs that most cloud services either skip entirely or serve through degraded pivot paths. High-traffic pairs (English ↔ Spanish, French, German, Mandarin, Japanese, Arabic) can use lighter, faster specialists.

You don't pick the model manually for language coverage. Zephyr reads the requested pair, the installed models, and your device tier, then selects the smallest competent model that can actually run on your chip.
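The selection rule described above can be sketched as a filter followed by a minimum. This is a minimal illustration, not Zephyr's actual implementation: the `ModelSpec` fields, the numeric device tiers, and the catalog entries (names, sizes, pair lists) are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    name: str
    size_mb: int                 # approximate on-disk size, illustrative only
    min_tier: int                # lowest device tier assumed able to run the model
    pairs: Optional[frozenset]   # supported (source, target) pairs; None means any pair


def select_model(pair, installed, device_tier):
    """Return the smallest installed model that covers `pair` and fits the device."""
    candidates = [
        m for m in installed
        if m.min_tier <= device_tier and (m.pairs is None or pair in m.pairs)
    ]
    # `default=None` signals that no installed model can serve this request.
    return min(candidates, key=lambda m: m.size_mb, default=None)


# Hypothetical catalog: names, sizes, tiers, and pair lists are placeholders.
INSTALLED = [
    ModelSpec("general-multilingual", 1200, 2, None),
    ModelSpec("popular-pair-specialist", 500, 1,
              frozenset({("en", "es"), ("es", "en")})),
]

print(select_model(("en", "es"), INSTALLED, device_tier=2).name)
print(select_model(("en", "sw"), INSTALLED, device_tier=2).name)
```

A popular pair resolves to the smaller specialist, while a long-tail pair falls through to the general model. If nothing fits the device, the function returns `None` so the caller can prompt for a download instead of failing silently.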

Which models does Zephyr run?

Every translation model ships as a normal Nimbus8 install — same sandboxed on-device pipeline as Gale. Three canonical picks cover the full spectrum:

NLLB-200-distilled-600M: the default. A distilled variant of Meta's No Language Left Behind model, small enough for mobile. 200+ languages, solid quality across the long tail, fits comfortably on recent iPhones.
M2M-100: smaller and faster, used for specific popular pairs. When you're translating repeatedly in the same direction, this is the low-latency option.
Gemma-2B: used as a draft-plus-post-edit engine. Gemma writes a candidate translation, then Zephyr runs a lightweight post-edit pass for register, idiom, and tone. Slower, but often the best reading flow for long prose.

Translating PDFs and long text

Zephyr treats documents as first-class inputs. PDFs are chunked by paragraph, translated in streaming order, and stitched back into a layout that mirrors the source — headings stay headings, lists stay lists, tables are preserved where the source made them explicit.
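One way to picture the chunk, translate, and stitch flow is to carry each block's structural role alongside its text. The sketch below assumes the PDF has already been parsed into typed blocks. The `Block` type, the `kind` labels, and `fake_translate` are hypothetical stand-ins, not Zephyr's internals.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass(frozen=True)
class Block:
    kind: str   # structural role, e.g. "heading", "paragraph", "list_item"
    text: str


def translate_blocks(blocks: List[Block],
                     translate: Callable[[str], str]) -> Iterator[Block]:
    """Yield translated blocks in source order, keeping each block's kind."""
    for block in blocks:
        yield Block(block.kind, translate(block.text))


def fake_translate(text: str) -> str:
    # Stand-in for a real model call: upper-casing makes the change visible.
    return text.upper()


source = [
    Block("heading", "Overview"),
    Block("paragraph", "First paragraph."),
    Block("list_item", "One item"),
]
for out in translate_blocks(source, fake_translate):
    print(f"[{out.kind}] {out.text}")
```

Because only the text passes through the translator, a heading stays a heading and a list item stays a list item on the target side. The generator also yields each block as soon as it is ready, so earlier blocks can be laid out while later ones are still pending.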

Long text is streamed token-by-token the same way Gale streams a chat reply. You can start reading the opening paragraphs before the model has reached the end of the document, and the target-side layout updates as the source is consumed.
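Token-by-token streaming amounts to handing the reader a growing draft after each decoded token. The sketch below is a toy illustration under that assumption: `fake_decoder` is a hypothetical stand-in for a model's decoding loop, and the token boundaries are invented.

```python
from typing import Iterable, Iterator


def stream_partials(tokens: Iterable[str]) -> Iterator[str]:
    """Yield the growing draft after each token arrives."""
    draft = ""
    for token in tokens:
        draft += token
        yield draft


def fake_decoder() -> Iterator[str]:
    # Stand-in for a model decoding loop that emits one token at a time.
    yield from ["Bon", "jour", " le", " monde"]


partials = list(stream_partials(fake_decoder()))
for partial in partials:
    print(partial)
```

Each yielded string is a complete prefix of the final translation, which is what lets a UI repaint the target text without waiting for the last token.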

Optional text-to-speech

Any translation can be played back through Apple's built-in system voices — no third-party TTS service, no audio uploaded anywhere. The voice matches the target language where a system voice exists; when it doesn't, Zephyr falls back to the closest phonetic approximation and flags it, so you know the pronunciation is best-effort.

Speech is strictly opt-in and runs through AVSpeechSynthesizer, Apple's system speech API, using the same built-in voices iOS makes available to VoiceOver.
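The match-then-fall-back behavior can be sketched as a small lookup over installed voice tags. Note the simplification: instead of a true "closest phonetic approximation", this sketch falls back to any voice sharing the same base language, which is an assumption made for illustration. The function name and the tag list are hypothetical.

```python
from typing import List, Optional, Tuple


def pick_voice(target: str, available: List[str]) -> Tuple[Optional[str], bool]:
    """Return (voice_tag, is_best_effort) for a BCP 47 style target tag."""
    if target in available:
        return target, False          # exact match, no warning needed
    base = target.split("-")[0].lower()
    for tag in available:
        if tag.split("-")[0].lower() == base:
            return tag, True          # same language, different region: flag it
    return None, True                 # no usable voice at all


voices = ["en-US", "pt-BR", "ar-SA"]
print(pick_voice("en-US", voices))
print(pick_voice("pt-PT", voices))
print(pick_voice("sw-KE", voices))
```

The boolean in the result is what would drive the best-effort flag shown to the user. A `None` voice means playback should be disabled rather than attempted with an unrelated language.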

What leaves my device when I use Zephyr?

Nothing, by default. The source text, the translation, the document structure, the playback — all of it stays local. Zephyr never sends source or target to a cloud MT service. The network is required only when you explicitly download a new translation model.

There is no telemetry. There is no account. There is no "improve the model with your data" toggle, because there is no path off the device for that data to take. See the privacy policy for the full data flow.

FAQ

Is Zephyr shipping yet?

No — Zephyr is in the exploring phase. The translation pipeline is working internally, but we won't ship it until the quality comfortably exceeds what a comparable cloud-free option already offers. We'd rather be late and good than early and rough.

How does on-device translation compare to Google Translate or DeepL?

For high-traffic pairs on short text, cloud services still have an edge on the last few percent of quality. For long documents, rare pairs, and anything you don't want leaving your phone, Zephyr wins on the axis that matters — the text stays with you. NLLB-200 and Gemma post-edit close most of the remaining quality gap on popular pairs.

Can Zephyr translate speech in real time?

Not yet. Real-time speech translation depends on Overture (our audio module) being mature enough to provide low-latency transcription, which is still in progress. When both are ready, the two will compose cleanly: Overture transcribes, Zephyr translates, system voices speak the result — all on-device.

Does Zephyr handle right-to-left scripts?

Yes. Arabic, Hebrew, Urdu, Persian, and other RTL scripts render with correct text direction, punctuation, and bidirectional text wrapping. Layout in translated PDFs flips accordingly when the target language is RTL.
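A common heuristic for choosing a paragraph's base direction is to look at its first strongly directional character, in the spirit of the Unicode Bidirectional Algorithm's paragraph-level rule. The sketch below uses Python's standard `unicodedata` module. It is a simplified illustration, not a description of how Zephyr lays out text, and `base_direction` is a hypothetical helper name.

```python
import unicodedata


def base_direction(text: str) -> str:
    """Return 'rtl' or 'ltr' from the first strongly directional character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):   # Hebrew-style and Arabic-style strong RTL
            return "rtl"
        if bidi == "L":           # strong left-to-right
            return "ltr"
    return "ltr"  # no strong character found; default to left-to-right


print(base_direction("مرحبا بالعالم"))
print(base_direction("שלום"))
print(base_direction("123 hello"))
```

Digits and spaces are not strongly directional, so a string that opens with numbers still resolves by its first letter.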

Translate anything — without sending anything.

200+ languages

Paragraph or document

Speak aloud

Everything that comes with Nimbus8.