Architecture overview
Nimbus8 runs open models entirely on-device using three inference runtimes, selected automatically per device and model format. MLX is the primary runtime for Apple Silicon — it powers chat, code, agent, and translation workloads through the MLXLLM Swift package linked directly into the iOS target. MLX models are loaded in-process with no server, no socket, and no IPC overhead. llama.cpp handles GGUF-quantized models via the Rust core's inference_client module, which speaks the OpenAI-compatible HTTP protocol against a local llama-server process or any compatible endpoint on the user's LAN. Core ML drives the Neural Engine for diffusion (Mirage) and speech (Overture) workloads through Apple's ml-stable-diffusion package and ONNX Runtime.
Runtime selection is not user-facing — it's determined by the model's format and the device's hardware profile. An MLX model downloaded from Hugging Face routes through LocalMLXProvider; a GGUF file routes through RemoteOpenAIProvider pointed at a local endpoint; a Core ML pipeline routes through Apple's own framework APIs. The user sees one unified interface regardless of which engine is doing the work underneath.
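The format-driven selection rule above can be sketched in a few lines. This is an illustrative Rust sketch, not the actual routing code — the enum and function names (ModelFormat, Runtime, select_runtime) are assumptions:

```rust
// Hypothetical sketch of format-based runtime routing; names are
// illustrative, not the actual Nimbus8 API.
#[derive(Debug, PartialEq)]
enum ModelFormat { MlxSafetensors, Gguf, CoreMlPipeline }

#[derive(Debug, PartialEq)]
enum Runtime { LocalMlx, RemoteOpenAiCompatible, CoreMl }

fn select_runtime(format: &ModelFormat) -> Runtime {
    match format {
        // MLX weights load in-process on Apple Silicon
        ModelFormat::MlxSafetensors => Runtime::LocalMlx,
        // GGUF goes through the OpenAI-compatible HTTP client
        ModelFormat::Gguf => Runtime::RemoteOpenAiCompatible,
        // Core ML pipelines use Apple's framework APIs directly
        ModelFormat::CoreMlPipeline => Runtime::CoreMl,
    }
}
```

Because the mapping is total over the format enum, there is no "unsupported format" error path at selection time — unsupported files are rejected earlier, at download classification.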
The entire stack is split across two languages. Swift owns the UI layer, the module surfaces, session management, and the InferenceManager orchestration. Rust owns the performance-critical foundation: the FFI bridge, download manager, diff engine, Hugging Face client, search index, speech-to-text, and text-to-speech. The two halves communicate through a C ABI defined in rust-core/src/ffi.rs, with a shared tokio runtime (RUNTIME) that persists across FFI calls so async work doesn't pay per-call thread setup costs.
The module system
Nimbus8 ships eight modules, all named after weather phenomena: Gale (chat), Cirrus (code), Ashe (agents), Mist (search), Mirage (image and video generation), Overture (audio — transcription, dictation, TTS, music generation), Stratus (vision and OCR), and Zephyr (translation). Each module is defined as a case of the NimbusModule enum, which carries a rawValue, displayName, SF Symbol icon, and human-readable description. Legacy module names ("echo", "chinook", "lens") are mapped to their merged successors in a custom init(from:) decoder so persisted data survives renames.
Every module shares the same on-device inference runtime. A model installed for Gale can be borrowed by Ashe, Cirrus, Zephyr, or Stratus — the ModelRouter.moduleAccepts(_:mountedFor:) function encodes the cross-module borrowing rules. Ashe reuses any LLM mounted under Gale. Cirrus accepts both coding-specialized models and generic Gale chat models. Zephyr borrows Gale's instruct LLM for translation. Stratus borrows Gale's vision-capable LLM for deep reads. This means a user who downloads a single model can use it across multiple modules without duplicating files on disk.
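The borrowing rules described above are small enough to sketch directly. This is a Rust stand-in for the Swift ModelRouter.moduleAccepts(_:mountedFor:) function, encoding only the rules the text states:

```rust
// Illustrative sketch (not the Swift ModelRouter API) of the
// cross-module borrowing rules described above.
#[derive(Clone, Copy, PartialEq)]
enum Module { Gale, Cirrus, Ashe, Mist, Mirage, Overture, Stratus, Zephyr }

// Does `module` accept a model that is mounted for `mounted_for`?
fn module_accepts(module: Module, mounted_for: Module) -> bool {
    use Module::*;
    // A module always accepts its own mounts
    if module == mounted_for { return true; }
    match module {
        // Ashe, Cirrus, Zephyr, and Stratus all borrow Gale's LLMs
        Ashe | Cirrus | Zephyr | Stratus => mounted_for == Gale,
        // Mist, Mirage, Overture, and Gale take no cross-module borrows
        _ => false,
    }
}
```

In practice this means a single downloaded chat LLM lights up four module dropdowns at once without duplicating files on disk.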
Module surfaces are SwiftUI views under Nimbus/Features/, each with its own ViewModel and any module-specific stores. Shared infrastructure — inference, model management, sessions, theming, hardware scanning — lives in Nimbus/Core/. The session system uses ModuleKind to tag each session with its originating module, and ModuleSessionAdapters to serialize module-specific payloads (Mirage generation parameters, Overture audio settings, Cirrus workspace state) alongside the generic chat history.
Capability manifest
Every model in Nimbus — curated catalog entries and user-downloaded Hugging Face models alike — resolves to a ModelCapabilities struct. This struct is the single source of truth for what a model can do and how the UI should render its controls. It carries a CapabilityKind (e.g. .textGeneration, .imageTextToText, .automaticSpeechRecognition, .textToImage, .textToSpeech, .featureExtraction), a list of target modules, a ParamTable of runtime-tunable knobs, and a set of CapabilityFeature flags.
Feature flags control optional behaviors the module UI hides when unsupported: .vision (accepts image inputs), .pdfNative (accepts PDF bytes directly), .toolCalling (OpenAI-style function calls), .code (show in Cirrus), .streaming (token streaming), .negativePrompt and .imgToImg (diffusion), .wordTimestamps and .translate (STT), .ssml and .multiVoice (TTS), .reranking and .multilingual (embedders). The ParamTable stores each parameter under a stable ParamKey string that matches the OpenAI-compatible wire name — temperature, top_p, top_k, min_p, repeat_penalty, max_tokens, steps, guidance, seed — so the settings UI renders controls generically from the schema without per-model view code.
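A ParamTable of this shape can be sketched as a map from wire names to parameter schemas. The struct fields and every numeric default below are illustrative assumptions, not Nimbus's actual values — the point is only that keys match the OpenAI-compatible wire names so the UI can render controls generically:

```rust
use std::collections::BTreeMap;

// Hypothetical ParamTable sketch keyed by OpenAI-compatible wire names.
// Field names and all defaults/ranges are illustrative assumptions.
#[derive(Debug, Clone, Copy)]
struct ParamSpec { min: f64, max: f64, default: f64 }

fn default_text_gen_params() -> BTreeMap<&'static str, ParamSpec> {
    let mut t = BTreeMap::new();
    // Keys match the wire names so the settings UI renders generically
    t.insert("temperature",    ParamSpec { min: 0.0, max: 2.0,    default: 0.7 });
    t.insert("top_p",          ParamSpec { min: 0.0, max: 1.0,    default: 0.95 });
    t.insert("repeat_penalty", ParamSpec { min: 0.5, max: 2.0,    default: 1.1 });
    t.insert("max_tokens",     ParamSpec { min: 1.0, max: 8192.0, default: 1024.0 });
    t
}
```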
Resolution happens in CapabilityRegistry. For curated catalog entries, capabilities are hand-tuned per model family with known context limits, vision support, and validated sampler/size presets. For ad-hoc Hugging Face downloads, the registry infers capabilities from the repo's pipeline_tag, tags, library name, and filename patterns. The inference is conservative — it shows fewer controls rather than lying about backend support. Vision-capable models are tracked in a hand-maintained visionCapableIds set; context limits are encoded per model id with a 4,096-token fallback for unknown entries.
Model routing
When a model finishes downloading, ModelRouter answers the question: "which module dropdown does this belong in?" For curated catalog entries, routing is a direct switch on ModelMetadata.category — .stt and .tts route to Overture, .image to Mirage, .coding to Cirrus, and everything else to Gale. For ad-hoc Hugging Face downloads, ModelRouter.classify(repoId:filename:tags:libraryName:pipelineTag:) runs a cascade of heuristics: HF pipeline tags first (the most reliable signal), then repo-id keyword matching, then tag and library-name checks. Each classification carries a Confidence level (.high, .medium, .low) and a human-readable reason so the UI can prompt the user to confirm borderline cases.
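The cascade order — pipeline tag first, then repo-id keywords, then a low-confidence fallback — can be sketched as follows. The keyword lists, module strings, and function shape are assumptions for illustration, not the actual classify implementation:

```rust
// Illustrative sketch of the classification cascade for ad-hoc
// Hugging Face downloads; keyword lists and names are assumptions.
#[derive(Debug, PartialEq)]
enum Confidence { High, Medium, Low }

#[derive(Debug, PartialEq)]
struct Classification { module: &'static str, confidence: Confidence, reason: String }

fn classify(repo_id: &str, pipeline_tag: Option<&str>) -> Classification {
    // 1. HF pipeline tag: the most reliable signal
    if let Some(tag) = pipeline_tag {
        let module = match tag {
            "automatic-speech-recognition" | "text-to-speech" => Some("Overture"),
            "text-to-image" => Some("Mirage"),
            "feature-extraction" | "sentence-similarity" => Some("Mist"),
            "text-generation" => Some("Gale"),
            _ => None,
        };
        if let Some(m) = module {
            return Classification { module: m, confidence: Confidence::High,
                reason: format!("pipeline_tag `{}`", tag) };
        }
    }
    // 2. Repo-id keyword matching
    let id = repo_id.to_lowercase();
    if id.contains("whisper") {
        return Classification { module: "Overture", confidence: Confidence::Medium,
            reason: "repo id mentions whisper".into() };
    }
    if id.contains("coder") || id.contains("code") {
        return Classification { module: "Cirrus", confidence: Confidence::Medium,
            reason: "repo id suggests a coding model".into() };
    }
    // 3. Low-confidence fallback so the UI can ask the user to confirm
    Classification { module: "Gale", confidence: Confidence::Low,
        reason: "no strong signal; defaulting to chat".into() }
}
```

The returned reason string is what lets the UI explain *why* it wants to file a borderline model somewhere, rather than silently guessing.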
MountedModelStore is the persistence layer for installed models. It's a lightweight @Observable singleton backed by UserDefaults JSON. Each Entry records the repoId, filename, module, and installedAt timestamp. The store tracks the user's active pick per module (activeIdByModule) and the last-used model per module (lastUsedIdByModule) so navigating back to a module restores the model the user last worked with — no manual re-selection needed. First-mount-wins logic ensures the first model ever installed for a module becomes the active pick automatically.
CapabilityRegistry completes the picture by resolving which modules a model's capabilities qualify it for. A coding-tuned LLM lands in Gale, Cirrus, and Ashe simultaneously. A Whisper model lands only in Overture. An embedding model lands only in Mist. The modules array on ModelCapabilities is the authoritative list, and MountedModelStore.compatibleModels(for:) uses ModelRouter.moduleAccepts to include cross-module borrows when checking whether a module has something it can run.
Device fit
Nimbus tiers every device into one of four capability levels — Ultra, Pro, Standard, and Lite — based on chip generation, physical RAM, and a hand-tuned memory budget. The DeviceCapability singleton reads the machine identifier via utsname, looks it up in a comprehensive DeviceSpec table covering every iPhone from the iPhone 11 (A13, 4 GB, 900 MB budget) through the iPhone 17 Pro Max (A19 Pro, 12 GB, 6,000 MB budget), plus iPads and Macs. The budget represents the maximum model weight + KV cache the device can sustain without risking an OOM kill, accounting for iOS overhead and the increased-memory-limit entitlement.
The DeviceCapability.fit(for:) method evaluates a ModelShape (weights MB, KV cache MB, minimum GPU family) against the device's live state. It returns one of five verdicts: .ok, .tight (fits the tier but not right now — available memory is low), .tooHeavy (exceeds the tier budget), .tooOld (GPU family below the model's minimum), or .noDisk (insufficient free storage). The GPU family is detected via Metal's supportsFamily API, probing from .apple10 (A19/M5 class) down to .apple6 (A13 class).
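The ordering of those checks matters: hardware incompatibility trumps disk, which trumps the tier budget, which trumps transient memory pressure. A minimal Rust sketch of that decision ladder, with assumed field names and units:

```rust
// Minimal sketch of the fit check described above; struct fields and
// the ordering of checks are assumptions for illustration.
#[derive(Debug, PartialEq)]
enum Fit { Ok, Tight, TooHeavy, TooOld, NoDisk }

#[derive(Clone, Copy)]
struct Device { budget_mb: u64, available_mb: u64, free_disk_mb: u64, gpu_family: u8 }

#[derive(Clone, Copy)]
struct ModelShape { weights_mb: u64, kv_cache_mb: u64, min_gpu_family: u8, disk_mb: u64 }

fn fit(device: &Device, model: &ModelShape) -> Fit {
    let needed = model.weights_mb + model.kv_cache_mb;
    // Hard incompatibilities first: GPU family, then storage
    if device.gpu_family < model.min_gpu_family { return Fit::TooOld; }
    if device.free_disk_mb < model.disk_mb { return Fit::NoDisk; }
    // Tier budget is a static ceiling; available memory is live state
    if needed > device.budget_mb { return Fit::TooHeavy; }
    if needed > device.available_mb { return Fit::Tight; }
    Fit::Ok
}
```

.tight is the only verdict that can clear itself: closing other apps or waiting out memory pressure turns it into .ok without any change to the model or device tier.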
AdaptiveEngine monitors thermal state and battery level in real time via ProcessInfo.thermalStateDidChangeNotification and UIDevice.batteryStateDidChangeNotification. When the device enters .serious or .critical thermal state, or battery drops to critical, the engine switches to Efficient Mode — halving context window and batch size, reducing thread count, and forcing quantization. When charging, it switches to Performance Mode with doubled batch sizes and quantization disabled. The AdaptiveSettings struct encodes the concrete parameters (context window, batch size, generation threads, streaming rate) for each tier, and the engine re-scans via CapabilityScanner on every condition change.
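The Efficient Mode derivation — halve context and batch, reduce threads — is simple enough to sketch. This is not the AdaptiveSettings implementation; fields and numbers are illustrative:

```rust
// Sketch of the Efficient Mode parameter derivation described above
// (halve context window and batch size, reduce thread count);
// field names and values are illustrative assumptions.
#[derive(Debug, PartialEq, Clone, Copy)]
struct AdaptiveSettings { context_window: u32, batch_size: u32, threads: u32 }

fn efficient_mode(base: AdaptiveSettings) -> AdaptiveSettings {
    AdaptiveSettings {
        context_window: base.context_window / 2,
        batch_size: base.batch_size / 2,
        threads: (base.threads / 2).max(1), // never drop to zero threads
    }
}
```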
Inference pipeline
Every chat turn flows through a three-layer pipeline: InferenceManager → InferenceProvider → concrete provider. InferenceManager is a @MainActor singleton that owns the user's provider mode selection (.localMLX or .customEndpoint), the default model id, and the custom endpoint URL and API key. Its resolve() method returns the appropriate InferenceProvider implementation. There is no Nimbus cloud provider — the app has no backend. All inference runs on-device or on the user's own LAN.
LocalMLXProvider runs models in-process via the MLXLLM Swift package on Apple Silicon. RemoteOpenAIProvider speaks the OpenAI-compatible chat completions API over HTTP, targeting llama.cpp, Ollama, LM Studio, vLLM, or any spec-compliant server. Custom endpoints enforce HTTPS unless the host is loopback (127.0.0.1, ::1, localhost) — sending conversation content over cleartext on an open LAN is rejected at validation time. API keys are stored in the iOS Keychain with kSecAttrAccessibleAfterFirstUnlockThisDeviceOnly and never logged.
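The HTTPS-unless-loopback rule reduces to a small predicate. A hedged Rust sketch (the real validation lives in Swift; this helper is hypothetical):

```rust
// Sketch of the HTTPS-unless-loopback rule described above
// (hypothetical helper, not the actual Swift validation code).
fn endpoint_allowed(scheme: &str, host: &str) -> bool {
    if scheme == "https" { return true; }
    // Plain HTTP is only tolerated for loopback targets, where
    // conversation content never leaves the device
    scheme == "http" && matches!(host, "127.0.0.1" | "::1" | "localhost")
}
```

Anything else — plain HTTP to a LAN address, or an unknown scheme — is rejected at validation time, before any request is made.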
The InferenceProvider protocol defines a single chat(_:onToken:) method that accepts a ChatCompletionRequest and streams tokens via a callback. The request carries all OpenAI-standard parameters (temperature, top_p, max_tokens, stop, presence/frequency penalty, seed) plus widely-honored extensions (top_k, min_p, repeat_penalty) that local backends like LM Studio and Ollama support. Multimodal input is handled through InferenceChatMessage.Content, which encodes as either a bare string (text-only) or an array of typed ContentPart values (text + image URLs with base64-embedded bytes). The result includes the full text, finish reason, and prompt/completion token counts.
Rust core
The Rust core (nimbus-rust-core) is the native foundation library, compiled for iOS via cargo build --target aarch64-apple-ios and linked into the Swift target through a C ABI defined in ffi.rs. It exposes platform-neutral primitives that would be impractical or slow in pure Swift: workspace_engine (local project + file state on disk), hf_client (Hugging Face Hub search and metadata), download_manager (resumable model downloads via hf-hub with HTTP Range resume and LFS redirect handling), model_manager (installed model paths and load handles), model_registry (persistent index.json with SHA256 TOFU verification), and diff_engine (Myers diff algorithm for Cirrus code diffs and PR generation).
Audio and search capabilities are feature-gated. The stt module wraps whisper.cpp for on-device speech-to-text with word-level timestamps, language detection, and translation. The tts module wraps Kokoro-82M via ONNX Runtime for text-to-speech with 54 voices across 10 languages. The search module provides BGE/MiniLM embedding via ONNX Runtime for Mist's dense retrieval path. The packs module handles knowledge pack archive parsing (tar+gzip with Wikimedia Enterprise HTML NDJSON format).
A shared tokio runtime (RUNTIME, initialized via lazy_static) persists across all FFI calls so async Rust work doesn't pay the 1–2ms thread-pool setup cost on every Swift→Rust round-trip. Global state lives in NIMBUS_STATE, an Arc<RwLock<NimbusState>> that tracks loaded models and active projects. The telemetry module provides an event buffer that the Swift layer can flush — but in practice, Nimbus ships with telemetry disabled and the network off-switch as the default.
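The "initialize once, reuse across FFI calls" pattern behind RUNTIME and NIMBUS_STATE can be shown with standard-library types alone. The real core uses lazy_static plus a tokio Runtime; this std-only sketch substitutes OnceLock and a stand-in state struct to keep it self-contained:

```rust
use std::sync::{Mutex, OnceLock};

// std-only sketch of the shared-global pattern: the real core wraps a
// tokio Runtime and an Arc<RwLock<NimbusState>>; this stand-in shows
// the same initialize-once, reuse-everywhere shape.
#[derive(Default)]
struct NimbusState { loaded_models: Vec<String> }

static STATE: OnceLock<Mutex<NimbusState>> = OnceLock::new();

fn state() -> &'static Mutex<NimbusState> {
    // First caller pays initialization; every later FFI call reuses it
    STATE.get_or_init(|| Mutex::new(NimbusState::default()))
}

fn load_model(id: &str) -> usize {
    let mut s = state().lock().unwrap();
    s.loaded_models.push(id.to_string());
    s.loaded_models.len() // number of models tracked so far
}
```

The payoff is the same as with the shared tokio runtime: state and worker threads outlive any single Swift→Rust round-trip, so repeated calls stay cheap.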
Ashe runtime
Ashe is the on-device agent runtime, implemented as a standalone Rust crate (nimbus-ashe) with its own turn loop, tool registry, memory system, and audit log. The core abstraction is the turn: one user prompt's worth of work. The TurnEngine composes an initial message list from the system prompt, hot memory entries, relevant skills (FTS-indexed and safety-scanned), recent session history, and the new user message. It then enters a loop: call the model, parse tool calls (supporting both native OpenAI function-calling and inline XML/sentinel formats via the normalize parser), dispatch tools, feed results back, and repeat until the model emits a final assistant message with no tool calls, the fuel budget is exhausted, or the caller cancels.
Fuel metering bounds every turn with three limits: max tokens, wall-time duration, and max tool calls. The FuelBudget struct defines four presets — interactive() (4,000 tokens, 60s, 12 tool calls), background() (1,500 tokens, 25s, 4 tool calls for BGAppRefreshTask), continued() (8,000 tokens, 5 min, 24 tool calls for BGContinuedProcessingTask), and overnight() (16,000 tokens, 10 min, 48 tool calls for charging + WiFi). The FuelMeter uses atomic counters shared via Arc so cloned meters track the same budget. Exhaustion is a clean termination, not a crash.
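The shared-atomics design can be sketched as follows. Field names are assumptions and the wall-time limit is omitted for brevity; the interactive() numbers mirror the preset above:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};

// Sketch of fuel metering with atomic counters shared across clones.
// Field names are assumptions; wall-time tracking omitted for brevity.
#[derive(Clone)]
struct FuelMeter {
    tokens_used: Arc<AtomicU64>,
    calls_used: Arc<AtomicU64>,
    max_tokens: u64,
    max_tool_calls: u64,
}

impl FuelMeter {
    fn interactive() -> Self {
        FuelMeter {
            tokens_used: Arc::new(AtomicU64::new(0)),
            calls_used: Arc::new(AtomicU64::new(0)),
            max_tokens: 4_000,   // interactive() preset from the text
            max_tool_calls: 12,
        }
    }
    fn record(&self, tokens: u64, tool_calls: u64) {
        self.tokens_used.fetch_add(tokens, Ordering::Relaxed);
        self.calls_used.fetch_add(tool_calls, Ordering::Relaxed);
    }
    // Exhaustion is a clean termination condition, not an error
    fn exhausted(&self) -> bool {
        self.tokens_used.load(Ordering::Relaxed) >= self.max_tokens
            || self.calls_used.load(Ordering::Relaxed) >= self.max_tool_calls
    }
}
```

Because the counters live behind Arc, a meter cloned into a tool dispatcher and the turn loop's own copy always agree on how much fuel remains.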
The loop guard detects when the model is ping-ponging the same tool call by fingerprinting each call with SHA256(name + normalized_args) and tracking frequency in a sliding window (default: 8 calls). Three identical calls trigger a throttle (a nudge message injected into the conversation); five trigger a break (the turn ends). The safety scanner sweeps any text injected into the system prompt for role-hijack prefixes, invisible Unicode (bidi overrides, zero-width characters), credential exfiltration patterns (curl/wget with $API_KEY), and SSRF probes (cloud metadata endpoints, localhost URLs). The audit log records every step with a Merkle hash chain. After a successful turn, optional skill distillation asks the model to extract a reusable skill from the transcript, and periodic memory extraction (every 5 user turns by default) scans for durable facts to persist to hot memory — both safety-scanned before storage.
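The fingerprint-and-count scheme of the loop guard can be sketched compactly. Here std's DefaultHasher stands in for SHA256, and the window size and throttle/break thresholds mirror the defaults above:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::VecDeque;
use std::hash::{Hash, Hasher};

// Sketch of the loop guard's fingerprint-and-count scheme; std's
// DefaultHasher stands in for SHA256. Window size 8, throttle at 3
// identical calls, break at 5, mirroring the defaults in the text.
struct LoopGuard { window: VecDeque<u64>, capacity: usize }

#[derive(Debug, PartialEq)]
enum Verdict { Proceed, Throttle, Break }

impl LoopGuard {
    fn new() -> Self { LoopGuard { window: VecDeque::new(), capacity: 8 } }

    fn check(&mut self, tool_name: &str, normalized_args: &str) -> Verdict {
        let mut h = DefaultHasher::new();
        (tool_name, normalized_args).hash(&mut h);
        let fp = h.finish();
        if self.window.len() == self.capacity { self.window.pop_front(); }
        self.window.push_back(fp);
        // Count occurrences of this fingerprint in the sliding window
        let repeats = self.window.iter().filter(|&&f| f == fp).count();
        match repeats {
            0..=2 => Verdict::Proceed,  // below the throttle threshold
            3..=4 => Verdict::Throttle, // inject a nudge message
            _ => Verdict::Break,        // end the turn
        }
    }
}
```

Normalizing the arguments before hashing is what makes trivially reordered JSON keys count as the same call; a call with genuinely different arguments gets a fresh fingerprint and a clean slate.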
Mist engine
MistEngine is the hybrid search engine, implemented in Rust and exposed to Swift through the Ashe FFI bridge. It combines BM25 keyword search (backed by tantivy via the Bm25Index) with dense vector retrieval (backed by SQLite with ONNX Runtime embeddings via the DocumentStore), fused through Reciprocal Rank Fusion. The indexing pipeline chunks documents via the Chunker, embeds chunks through a DenseEmbedder trait (wired to BGE or MiniLM on device), upserts the full text into BM25, and stores chunk embeddings in the document store.
Search runs both retrieval paths in parallel: BM25 fetches the top-K keyword matches, the embedder produces a query vector for dense search, and rrf_fuse merges the two ranked lists using the Cormack/Clarke/Buettcher algorithm with k=60. The formula score(d) = Σ_r 1/(k + rank_r(d)), summed over the rankers r, is score-agnostic — BM25's log-TF-IDF scores and cosine similarities live in different numerical spaces, so RRF uses only rank ordering, with no calibration needed. An optional CrossEncoder reranker (bge-reranker-base on device) rescores the fused top-K for final precision. Source filtering (note, file, web, pack:*) is applied before fusion so filtered items don't consume result slots.
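The fusion step itself is only a few lines. A simplified Rust sketch of the formula (the real rrf_fuse signature differs; document ids and the function shape here are illustrative):

```rust
use std::collections::HashMap;

// Minimal RRF sketch implementing score(d) = sum over rankers of
// 1/(k + rank(d)), with k = 60. Item type and function shape are
// illustrative, not the actual rrf_fuse signature.
fn rrf_fuse(ranked_lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for list in ranked_lists {
        for (rank, doc) in list.iter().enumerate() {
            // Only rank order matters: BM25 scores and cosine
            // similarities never need to be calibrated to each other
            *scores.entry(*doc).or_insert(0.0) += 1.0 / (k + (rank + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> =
        scores.into_iter().map(|(d, s)| (d.to_string(), s)).collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}
```

A document that appears near the top of both lists accumulates two large reciprocal terms and rises above documents that rank highly in only one.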
Knowledge packs are installable offline content bundles (Wikipedia, Wiktionary, Wikivoyage) that feed the search index. The install pipeline streams a tar+gzip archive through a parser on a blocking thread, sends batches through a bounded channel (capacity 4, for backpressure), and indexes them asynchronously via batch_index — one embedder call and one BM25 commit per batch. A full Simple Wikipedia (~200k articles) indexes in under 10 minutes on iPhone 15 Pro. Uninstall is a single prefix scan that removes every document indexed under a pack's source tag, plus the manifest row. The Chunker supports configurable strategies — sentence-boundary splitting with overlap is the default, tuned for the 512-token context window of the BGE embedding model.
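The bounded-channel backpressure pattern in that install pipeline can be shown with std's sync_channel standing in for the async channel the real engine uses. Everything here is a self-contained sketch, not the packs code:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

// Sketch of the bounded-channel backpressure pattern described above;
// std's sync_channel (capacity 4) stands in for the async channel in
// the real pipeline, and "indexing" is reduced to counting documents.
fn index_pack(batches: Vec<Vec<String>>) -> usize {
    // Capacity 4: the parser blocks when the indexer falls behind,
    // so memory use stays bounded regardless of archive size
    let (tx, rx) = sync_channel::<Vec<String>>(4);
    let parser = thread::spawn(move || {
        for batch in batches {
            tx.send(batch).expect("indexer hung up");
        }
        // tx dropped here closes the channel and ends the consumer loop
    });
    let mut indexed = 0;
    for batch in rx {
        // One embedder call and one BM25 commit per batch in the real engine
        indexed += batch.len();
    }
    parser.join().unwrap();
    indexed
}
```

The capacity-4 bound is what keeps a multi-gigabyte archive from ballooning memory: at most four parsed batches are ever in flight between the parser thread and the indexer.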
Theming
Every color in Nimbus flows from a single NimbusPalette struct — a comprehensive set of tokens that mirrors the Material 3 color system: surface hierarchy (background, surface, surfaceDim, surfaceBright, five surfaceContainer levels), primary/secondary/tertiary color families with dim, container, fixed, and on-* variants, error colors, and outline/outlineVariant. The struct also provides computed properties like assistantBubbleGradient (primary → primaryDim) and tonalSeparator. This token-based approach means adding a new palette is a single new NimbusPalette instance — no view code changes.
Nimbus ships four themes, selectable via the NimbusTheme enum: Vanilla Wood (the brand default — warm paper tones with #feffd6 background and quiet ink), Light (clean white surfaces with cool-gray accent), Dark (true-black OLED background with soft tonal steps), and Space Gray (cooler graphite with blue-leaning surfaces). A fifth option, Device default, resolves to Light or Dark based on the iOS color scheme at runtime. Each theme maps to a preferredColorScheme and a palette(forSystem:) resolver.
The active palette lives in NimbusDesign.palette, a nonisolated(unsafe) static that ThemeManager updates on the main thread whenever the user switches themes. Views read it through NimbusDesign.palette.xxx and pick up new values on their next body evaluation — ThemeManager forces a root re-render on switch. Non-SwiftUI consumers (the syntax highlighter, for example) read NimbusDesign.theme to pick a matching highlight.js theme. All eight modules inherit the shared palette — there are no per-module color overrides, which is how Nimbus maintains visual coherence across Gale, Cirrus, Mirage, and every other surface.