Changelog — Nimbus8

v0.14 — Stratus beta, Gale vision April 2026

Stratus enters public beta with camera-to-ask OCR powered by Apple Vision and on-device VLMs. Gale gains multimodal input — attach photos and PDFs alongside text prompts, with vision routing handled automatically by the capability manifest.

Stratus (Vision): Live camera OCR, photo library import, clipboard image paste. Routes through the Apple Vision framework first, then falls back to an on-device VLM for complex layouts.
Gale vision: Attach up to 6 images or PDFs per message. The capability manifest checks whether the active model supports vision and routes accordingly — no manual toggle.
Thinking blocks: Models that emit <think> reasoning (DeepSeek R1, Qwen3-Thinking, QwQ) now render as collapsible disclosure panels in the chat bubble. Thinking text is stripped from memory and context to save token budget.
Streaming think filter: New incremental parser handles <think> tags split across SSE chunks without visual glitches.
Model registry: Persistent index.json tracks every installed model's repo, filename, module, SHA256, and last-loaded timestamp. Trust-on-first-use (TOFU) verification records the hash on first load; a mismatch on any later load is a hard error.
Bug fixes: Fixed workspace engine state loss across FFI calls. Fixed streaming chat returning zero token counts. Fixed telemetry buffer growing without bound.
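
The streaming think filter described above can be pictured roughly as follows. This is a minimal sketch, not the shipped parser: the `ThinkFilter` type and its `feed` / `finish` methods are hypothetical names, and it assumes the tags are exactly `<think>` and `</think>` with no attributes. The key idea is to hold back any trailing text that could be the start of a tag until the next chunk settles the question.

```rust
/// Hypothetical incremental filter that separates `<think>` spans from visible text.
pub struct ThinkFilter {
    in_think: bool,
    /// Text held back because it may be the start of a tag split across chunks.
    pending: String,
    /// Reasoning text collected so far (shown in the disclosure panel).
    pub thinking: String,
}

impl ThinkFilter {
    pub fn new() -> Self {
        Self { in_think: false, pending: String::new(), thinking: String::new() }
    }

    /// Feed one streamed chunk; returns the text that is safe to display now.
    pub fn feed(&mut self, chunk: &str) -> String {
        let mut visible = String::new();
        self.pending.push_str(chunk);
        loop {
            let tag = if self.in_think { "</think>" } else { "<think>" };
            if let Some(pos) = self.pending.find(tag) {
                // A complete tag is buffered: route the text before it, then flip state.
                let consumed: String = self.pending.drain(..pos + tag.len()).collect();
                let text = &consumed[..pos];
                if self.in_think { self.thinking.push_str(text) } else { visible.push_str(text) }
                self.in_think = !self.in_think;
            } else {
                // No complete tag: keep the longest suffix that is a proper prefix of the tag.
                let keep = (1..tag.len())
                    .rev()
                    .find(|&n| self.pending.ends_with(&tag[..n]))
                    .unwrap_or(0);
                let cut = self.pending.len() - keep;
                let text: String = self.pending.drain(..cut).collect();
                if self.in_think { self.thinking.push_str(&text) } else { visible.push_str(&text) }
                return visible;
            }
        }
    }

    /// Call at end of stream: a held-back partial tag turned out to be plain text.
    pub fn finish(&mut self) -> String {
        let rest = std::mem::take(&mut self.pending);
        if self.in_think {
            self.thinking.push_str(&rest);
            String::new()
        } else {
            rest
        }
    }
}
```

Feeding the chunks "Hi <th", "ink>plan", " it</thi", "nk> there" one at a time yields only "Hi " and " there" as visible text, while "plan it" accumulates in `thinking`; at no point does a half-received tag reach the display.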

v0.13 — Mist knowledge packs March 2026

Mist gains offline knowledge packs: installable content bundles (Wikipedia, Wiktionary, Wikivoyage) that feed the hybrid search index. Download once over Wi-Fi, search forever without a connection.

Pack catalog: Bundled JSON manifest with download URLs pointing at Wikimedia's own servers. Nimbus8 never rehosts pack content.
Streaming ingest: Tar+gzip archives are parsed and embedded in a streaming pipeline — parser on a blocking thread, embedder on the async pool, bounded channel for backpressure. A full Simple Wikipedia (~200k articles) indexes in under 10 minutes on iPhone 15 Pro.
Source filtering: Hybrid queries accept include_source_prefix / exclude_source_prefix so the UI can run "only packs" and "only my notes" lanes against the same index.
Uninstall: One prefix scan removes every document indexed under a pack. Idempotent.
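
Source filtering and prefix-scan uninstall both hinge on documents carrying a source string with a predictable prefix. The sketch below illustrates that idea under stated assumptions: the `pack:` / `notes/` naming, the `source_matches` and `uninstall_pack` functions, and the use of a `BTreeMap` as a stand-in for the real index are all hypothetical. Only the `include_source_prefix` / `exclude_source_prefix` parameter names come from the entry above.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Returns true when `source` passes both optional prefix filters.
pub fn source_matches(
    source: &str,
    include_source_prefix: Option<&str>,
    exclude_source_prefix: Option<&str>,
) -> bool {
    let included = include_source_prefix.map_or(true, |p| source.starts_with(p));
    let excluded = exclude_source_prefix.map_or(false, |p| source.starts_with(p));
    included && !excluded
}

/// Removes every document whose source key starts with `pack_prefix`.
/// Returns the number of documents removed; a second call removes nothing.
pub fn uninstall_pack(index: &mut BTreeMap<String, String>, pack_prefix: &str) -> usize {
    // Keys are sorted, so all matches form one contiguous run starting at the prefix.
    let bounds: (Bound<&str>, Bound<&str>) = (Bound::Included(pack_prefix), Bound::Unbounded);
    let keys: Vec<String> = index
        .range::<str, _>(bounds)
        .take_while(|(k, _)| k.starts_with(pack_prefix))
        .map(|(k, _)| k.clone())
        .collect();
    for k in &keys {
        index.remove(k);
    }
    keys.len()
}
```

With this layout, an "only packs" lane would pass `Some("pack:")` as the include prefix and an "only my notes" lane would pass it as the exclude prefix, both against the same index.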

v0.12 — Chinook TTS, Ashe scheduler February 2026

On-device text-to-speech via Kokoro-82M lands as the Chinook module. Ashe gains a production scheduler with exponential backoff and iOS background task integration.

Chinook (TTS): 54 curated voices across 10 languages. Kokoro-82M runs through ONNX Runtime on the Neural Engine. Output is 24 kHz mono WAV — playable directly through AVAudioPlayer.
Ashe scheduler: Per-hand interval ticking with exponential backoff on failure (cap: 6 hours). Foreground loop via tokio, one-shot tick_all_due for BGAppRefreshTask.
Fuel metering: Token budgets, wall-time limits, and tool-call caps per turn. Background tasks get tighter budgets than foreground interactive turns.
Widget snapshots: Ashe writes atomic JSON snapshots to the App Group container after every wake. The widget extension reads them without opening the full app.
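
The scheduler's exponential backoff with a 6-hour cap can be sketched as below. The function name `next_delay`, the doubling factor, and the idea of counting consecutive failures are assumptions for illustration; the changelog only states that backoff is exponential and capped at 6 hours.

```rust
use std::time::Duration;

/// Backoff cap taken from the changelog entry: 6 hours.
pub const BACKOFF_CAP: Duration = Duration::from_secs(6 * 60 * 60);

/// Delay before the next tick: the base interval doubled once per
/// consecutive failure (an assumed factor), clamped to `BACKOFF_CAP`.
pub fn next_delay(base: Duration, consecutive_failures: u32) -> Duration {
    // 2^failures, saturating so a long failure streak cannot overflow.
    let factor = 2u32.checked_pow(consecutive_failures).unwrap_or(u32::MAX);
    base.checked_mul(factor)
        .unwrap_or(BACKOFF_CAP)
        .min(BACKOFF_CAP)
}
```

For a 15-minute base interval, three consecutive failures give a 2-hour delay, and five or more hit the 6-hour cap. Resetting the failure count to zero after a successful tick returns the hand to its base interval.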

v0.11 — Mirage img2img, Echo word timestamps January 2026

Mirage editing: Use any generated image as a starting latent. Describe the edit in natural language ("make it sunset"), adjust strength, and the pipeline runs an img2img pass.
Echo word timestamps: Per-token timing data from whisper.cpp. Enables word-by-word playback highlighting and SRT/VTT export with word-level granularity.
Transcript export: Export dictation sessions as SRT, VTT, or plain text from the Echo module.
Gallery multi-select: Long-press to enter selection mode, drag across cells to extend. Share, save to Photos, or delete in bulk.
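
Word-level SRT export amounts to writing one numbered cue per word with `HH:MM:SS,mmm` timestamps. The sketch below shows that shape; the `Word` struct and its millisecond fields are hypothetical, not Echo's actual data model.

```rust
/// Hypothetical word record: text plus start/end offsets in milliseconds.
pub struct Word {
    pub text: String,
    pub start_ms: u64,
    pub end_ms: u64,
}

/// Formats milliseconds as an SRT timestamp (`HH:MM:SS,mmm`).
pub fn srt_time(ms: u64) -> String {
    format!(
        "{:02}:{:02}:{:02},{:03}",
        ms / 3_600_000,
        ms / 60_000 % 60,
        ms / 1_000 % 60,
        ms % 1_000
    )
}

/// Emits one SRT cue per word: index, time range, text, blank line.
pub fn to_srt(words: &[Word]) -> String {
    let mut out = String::new();
    for (i, w) in words.iter().enumerate() {
        out.push_str(&format!(
            "{}\n{} --> {}\n{}\n\n",
            i + 1,
            srt_time(w.start_ms),
            srt_time(w.end_ms),
            w.text
        ));
    }
    out
}
```

VTT output follows the same structure but starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.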

v0.10 — Cirrus GitHub PRs December 2025

GitHub Device Flow OAuth: Authenticate with GitHub from the app using the device code flow — no web view, no redirect URI.
PR creation: Cirrus can open pull requests against your repos. Diffs are generated by the Rust diff engine (Myers algorithm), reviewed in-app, and pushed with your explicit confirmation.
Syntax highlighting: Code blocks in chat and Cirrus use Highlightr when linked, with automatic language detection.

v0.9 — Haze memory, model registry November 2025

Haze (Memory): Cross-session vector recall backed by SQLite FTS5. The turn loop injects relevant memories into the system prompt so the agent has continuity across sessions.
Hot memory: Key-value working memory (Hermes MEMORY.md pattern) with automatic extraction every N turns. Safety-scanned before persistence.
Skill distillation: After a successful turn, the model optionally distills a reusable skill from the transcript. Skills are FTS-indexed and injected into future turns when relevant.

v0.8 — Zephyr translation October 2025

Zephyr: On-device multilingual translation. Pick a source and target language, paste or type text, get a translation from the active model. No network required.
Adaptive engine: Thermal monitoring adjusts inference parameters (batch size, thread count) when the device heats up, preventing thermal throttling from degrading the experience.

v0.7 — Overture dictation September 2025

Overture (Echo): On-device speech-to-text via whisper.cpp. Supports WAV file input, raw samples, and base64-encoded audio. Auto-detects language or accepts a hint.
Audio capture pipeline: AVAudioEngine tap resamples hardware audio to 16 kHz mono in real time. Smoothed audio level published for waveform UI.

v0.6 — Mist hybrid search August 2025

Mist: On-device semantic search with hybrid BM25 + dense vector retrieval fused via Reciprocal Rank Fusion. Backed by tantivy (BM25) and ONNX Runtime (BGE/MiniLM embeddings).
Web search reranking: Fetch snippets from DuckDuckGo/SearXNG, embed them locally, and rerank by cosine similarity against the query. No search history leaves the device.
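
Reciprocal Rank Fusion combines ranked lists by summing 1 / (k + rank) for each document across the lists, so it needs no score normalization between BM25 and dense retrieval. A minimal sketch follows; the function name and the string document ids are hypothetical, and the commonly used constant k = 60 is an assumption rather than Mist's documented setting.

```rust
use std::collections::HashMap;

/// Fuses several rankings (best first) into one list sorted by RRF score.
/// Each document scores the sum of 1 / (k + rank) over the lists it appears in,
/// with rank starting at 1.
pub fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc) in ranking.iter().enumerate() {
            *scores.entry(*doc).or_insert(0.0) += 1.0 / (k + (i + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(doc, score)| (doc.to_string(), score))
        .collect();
    // Highest score first; ties broken by document id for a stable order.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Given a BM25 ranking of a, b, c and a dense ranking of b, c, a with k = 60, document b comes out on top because it sits near the head of both lists, even though neither retriever is trusted more than the other.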

v0.5 — Mirage on-device diffusion July 2025

Mirage: On-device image generation via Apple's ml-stable-diffusion. SD 1.5 and SDXL Turbo on the Neural Engine. Private gallery with metadata, share, and save to Photos.
Memory guard: Refuses to start generation when available memory is below 400 MB. Prevents OOM kills mid-generation.

v0.4 — Ashe agent runtime June 2025

Ashe: On-device agent runtime with a turn loop (plan → tool calls → apply → reflect), audit log with Merkle hash chain, and a tool registry with per-tool access policies.
Loop guard: Detects repeated identical tool calls and breaks the loop before fuel is wasted.
Safety scanner: Regex sweep for role-hijack prefixes, invisible Unicode, credential exfil patterns, and SSRF probes in any text injected into the system prompt.
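
One simple way to realize the loop guard is to count consecutive identical tool calls and trip once a limit is reached. The sketch below assumes exactly that policy; the `LoopGuard` type, the `observe` method, the repeat limit, and comparing arguments as raw strings are all hypothetical choices, not Ashe's actual implementation.

```rust
/// Hypothetical guard that trips after `limit` identical tool calls in a row.
pub struct LoopGuard {
    limit: u32,
    last: Option<(String, String)>,
    repeats: u32,
}

impl LoopGuard {
    pub fn new(limit: u32) -> Self {
        Self { limit, last: None, repeats: 0 }
    }

    /// Records one tool call. Returns true when the turn loop should break
    /// because the same tool was called with the same arguments `limit` times in a row.
    pub fn observe(&mut self, tool: &str, args: &str) -> bool {
        let same = matches!(
            &self.last,
            Some((t, a)) if t.as_str() == tool && a.as_str() == args
        );
        if same {
            self.repeats += 1;
        } else {
            // A different call resets the streak.
            self.last = Some((tool.to_string(), args.to_string()));
            self.repeats = 1;
        }
        self.repeats >= self.limit
    }
}
```

The turn loop would call `observe` before dispatching each tool call and stop the turn when it returns true, so no further fuel is spent on a call that has already produced the same result.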

v0.3 — Cirrus code module May 2025

Cirrus: Code-focused chat with inline diff cards, file creation, and project-aware context. Diffs use the Myers algorithm via the Rust diff engine.
Workspace engine: Local project + file state on disk. Create, edit, delete files; build a file tree for the model's context window.
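
The Myers algorithm named in the Cirrus entry searches for the shortest edit script by advancing along diagonals of the edit graph, one edit at a time. The sketch below covers only the core of that search and returns the edit distance (insertions plus deletions); it is an illustration under that simplification, not the Rust diff engine itself, and `myers_distance` is a hypothetical name.

```rust
/// Minimum number of insertions plus deletions needed to turn `a` into `b`,
/// found with the greedy forward search from Myers' O(ND) algorithm.
pub fn myers_distance<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    let n = a.len() as isize;
    let m = b.len() as isize;
    let max = n + m;
    // v[k + offset] holds the furthest x reached on diagonal k (where k = x - y).
    let offset = max + 1;
    let mut v = vec![0isize; (2 * max + 3) as usize];
    for d in 0..=max {
        let mut k = -d;
        while k <= d {
            let idx = (k + offset) as usize;
            // Step down from diagonal k+1 (insertion) or right from k-1 (deletion),
            // whichever starts further along.
            let mut x = if k == -d || (k != d && v[idx - 1] < v[idx + 1]) {
                v[idx + 1]
            } else {
                v[idx - 1] + 1
            };
            let mut y = x - k;
            // Follow the free diagonal while elements match.
            while x < n && y < m && a[x as usize] == b[y as usize] {
                x += 1;
                y += 1;
            }
            v[idx] = x;
            if x >= n && y >= m {
                return d as usize;
            }
            k += 2;
        }
    }
    max as usize
}
```

A full diff engine would additionally record the path taken at each step so that the individual inserted and deleted lines can be reconstructed for the diff cards; that bookkeeping is omitted here.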

v0.2 — HuggingFace browser April 2025

HF discovery: Search, browse, and download GGUF and MLX models from Hugging Face. Variant picker shows quantization, size, and backend compatibility.
Resumable downloads: Backed by hf-hub with HTTP Range resume and LFS redirect handling. Compatible with the Python huggingface_hub cache layout.
Gated model support: Paste your HF token in Settings to access gated repos (Llama, Gemma). Token stored in iOS Keychain, never logged.

v0.1 — Gale chat March 2025

Gale: The first module. Streaming chat with any open model via MLX on Apple Silicon. Token-by-token output, conversation history, session persistence.
Rust core: Foundation library with FFI bridge to Swift. Inference client, diff engine, download manager, model manager, telemetry buffer.
Vanilla Wood theme: The default (and only) palette. Warm paper tones, quiet surfaces, no dark mode yet.

What shipped, and when.