The NGINX for AI runtimes

One gateway for every AI runtime you run.

ModelDock runs on your machine and LAN, exposing a single OpenAI-compatible gateway. Route, audit and secure requests across Ollama, LM Studio, DashScope, Codex and Claude Code with an instance:model id — no upstream credentials ever leave your box.

Rust core · Tauri v2 desktop · macOS · Windows · Linux
quickstart.py
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:3737/v1",
    api_key="mdk_…",
)

resp = client.chat.completions.create(
    model="ollama-local:qwen3:14b",  # or an alias: default-code
    messages=[{"role": "user", "content": "Hi"}],
)
Unified routing

Every runtime behind one id

Address a model directly as instance:model, or use a stable alias with an ordered fallback chain. The router splits on the first colon only, so upstream names that contain colons just work — and unknown ids fail loudly, never silently.

Ollama · Local
LM Studio · Local
DashScope · Cloud
Codex · CLI
Claude Code · CLI
ollama-local:qwen3:14b · Ollama → qwen3:14b
dashscope-prod:qwen-plus · DashScope → qwen-plus
claudecode-local:sonnet · Claude Code → sonnet
default-code · alias → primary + fallback chain
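
The first-colon rule above can be sketched in a few lines. This is an illustration only, not ModelDock's actual ModelRouter: the names `Target` and `parse_model_id` are hypothetical, and the sketch assumes aliases such as default-code have already been resolved before this step.

```python
from typing import NamedTuple


class Target(NamedTuple):
    instance: str  # provider instance id, e.g. "ollama-local"
    model: str     # upstream model name; may itself contain colons


def parse_model_id(model_id: str) -> Target:
    """Split a gateway id on the first colon only."""
    instance, sep, model = model_id.partition(":")
    if not sep or not instance or not model:
        # No colon, or an empty half: reject loudly instead of guessing.
        raise ValueError(f"not an instance:model id: {model_id!r}")
    return Target(instance, model)


print(parse_model_id("ollama-local:qwen3:14b"))
# Target(instance='ollama-local', model='qwen3:14b')
```

Because only the first colon is a separator, the upstream name qwen3:14b survives intact.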
Why ModelDock

Built for developers who run AI locally

A thin, fast Rust gateway that keeps your tooling consistent whether a model lives on your GPU, in a local app, or behind a cloud API.

OpenAI-compatible API

Standard /v1/chat/completions and /v1/models endpoints, plus an Anthropic-shaped /v1/messages route. Point any existing SDK at http://127.0.0.1:3737/v1.
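
For tools without an SDK, the same endpoints can be reached over plain HTTP. The sketch below only builds the requests and does not send them. It assumes the gateway accepts the `Authorization: Bearer` header that the OpenAI SDK sends; `build_request` and the placeholder key `mdk_example` are hypothetical.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:3737/v1"


def build_request(path: str, api_key: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request to the local gateway."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed: same header the OpenAI SDK sends
            "Content-Type": "application/json",
        },
        method="GET" if body is None else "POST",
    )


# "mdk_example" is a placeholder, not a real key.
list_models = build_request("/models", "mdk_example")
chat = build_request(
    "/chat/completions",
    "mdk_example",
    {"model": "ollama-local:qwen3:14b", "messages": [{"role": "user", "content": "Hi"}]},
)
# To send: urllib.request.urlopen(chat), with a real mdk_ key and the gateway running.
```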

Provider instances

Run several instances of one runtime — ollama-local, ollama-gpu-node — each with its own base_url, allowlist, risk level and LAN exposure. Codex and Claude Code stay single-instance per machine.
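
One way to picture a provider instance is as a small record with a uniqueness check. This is a sketch under stated assumptions, not ModelDock's real configuration format: the field names, the runtime labels, the `validate` helper and the `gpu-node.lan` host are all hypothetical. Port 11434 is Ollama's default.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Hypothetical runtime labels; the text only says these two are single-instance.
SINGLE_INSTANCE_RUNTIMES = {"codex", "claude-code"}


@dataclass(frozen=True)
class ProviderInstance:
    id: str                          # left half of instance:model
    runtime: str
    base_url: Optional[str] = None   # CLI runtimes may not need one
    allowlist: Tuple[str, ...] = ()  # models this instance may serve
    risk: str = "low"
    allow_lan: bool = False


def validate(instances: Iterable[ProviderInstance]) -> None:
    """Reject duplicate ids and a second Codex or Claude Code instance."""
    seen_ids, seen_single = set(), set()
    for inst in instances:
        if inst.id in seen_ids:
            raise ValueError(f"duplicate instance id: {inst.id}")
        seen_ids.add(inst.id)
        if inst.runtime in SINGLE_INSTANCE_RUNTIMES:
            if inst.runtime in seen_single:
                raise ValueError(f"{inst.runtime} allows one instance per machine")
            seen_single.add(inst.runtime)


validate([
    ProviderInstance("ollama-local", "ollama", "http://127.0.0.1:11434"),
    ProviderInstance("ollama-gpu-node", "ollama", "http://gpu-node.lan:11434", allow_lan=True),
    ProviderInstance("claudecode-local", "claude-code"),
])
```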

Aliases & fallback

Define stable aliases like default-code with a primary plus an ordered fallback chain. Failed calls retry down the chain; policy and permission errors never fall back.
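
The fallback rule can be sketched as a loop that retries upstream failures but re-raises policy errors immediately. Everything named here is hypothetical: the exception classes, `call_with_fallback`, and the chain assigned to default-code are illustrative, not ModelDock's actual types or configuration.

```python
from typing import Callable, Dict, List


class PolicyError(Exception):
    """Policy or permission denial: must never trigger a fallback."""


class UpstreamError(Exception):
    """Runtime failure (timeout, connection refused, ...): may fall back."""


# Hypothetical alias table: primary first, then ordered fallbacks.
ALIASES: Dict[str, List[str]] = {
    "default-code": ["ollama-local:qwen3:14b", "dashscope-prod:qwen-plus"],
}


def call_with_fallback(model: str, call: Callable[[str], str],
                       aliases: Dict[str, List[str]] = ALIASES) -> str:
    chain = aliases.get(model, [model])  # a direct id is a chain of one
    if not chain:
        raise ValueError(f"alias {model!r} has an empty chain")
    last_error = None
    for target in chain:
        try:
            return call(target)
        except PolicyError:
            raise  # a denial is final; do not try the next target
        except UpstreamError as exc:
            last_error = exc  # remember it and move down the chain
    raise last_error
```

Keeping denials out of the retry path means a blocked key cannot reach a model through a more permissive fallback target.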

Keys, scopes & policy

Generate local mdk_ keys (only the argon2 hash + prefix are stored). Authorize by model / alias / instance / cloud / high-risk / LAN, with rate limits and daily token quotas per key.
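
A daily token quota per key might look like the sketch below. It is an in-memory illustration under assumptions: the class name `DailyTokenQuota`, keying by a key prefix, and rejecting a request that would exceed the limit are all hypothetical choices, not ModelDock's documented behavior.

```python
from collections import defaultdict
from datetime import date
from typing import Optional


class DailyTokenQuota:
    """Track tokens used per key per calendar day (in memory only)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._used = defaultdict(int)  # (key_prefix, day) -> tokens used

    def try_consume(self, key_prefix: str, tokens: int, today: Optional[date] = None) -> bool:
        day = today or date.today()
        used = self._used[(key_prefix, day)]
        if used + tokens > self.limit:
            return False  # over quota: reject without recording usage
        self._used[(key_prefix, day)] = used + tokens
        return True


quota = DailyTokenQuota(limit=1000)
day = date(2025, 1, 1)
print(quota.try_consume("mdk_ab12", 800, today=day))  # True
print(quota.try_consume("mdk_ab12", 300, today=day))  # False (would reach 1100)
```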

Desktop control center

A Tauri v2 + React dashboard for Providers, Models & Aliases, API Keys, Audit and Logs — manage every runtime from one native window.

Audit & prompt logging

One audit event per request, physically separate from runtime logs, with an x-modeldock-request-id header. Prompt logging is leveled (off / metadata / redacted / full) and off by default.
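
The four logging levels can be read as four views of the same prompt. The sketch below is an assumption about what each level might record: the event fields, the UUID request id and the email-only redaction rule are hypothetical, and ModelDock's actual audit schema may differ.

```python
import re
import uuid
from enum import Enum


class PromptLogLevel(Enum):
    OFF = "off"
    METADATA = "metadata"
    REDACTED = "redacted"
    FULL = "full"


# Hypothetical redaction rule: mask anything that looks like an email address.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def audit_event(model: str, prompt: str,
                level: PromptLogLevel = PromptLogLevel.OFF) -> dict:
    """Build one audit event; prompt content appears only if the level allows it."""
    event = {"request_id": str(uuid.uuid4()), "model": model}
    if level is PromptLogLevel.METADATA:
        event["prompt_chars"] = len(prompt)  # size only, no content
    elif level is PromptLogLevel.REDACTED:
        event["prompt"] = EMAIL.sub("[redacted]", prompt)
    elif level is PromptLogLevel.FULL:
        event["prompt"] = prompt
    return event


event = audit_event("ollama-local:qwen3:14b", "mail me at a@b.com please",
                    PromptLogLevel.REDACTED)
print(event["prompt"])  # mail me at [redacted] please
```

With the default level, the event carries the request id and model but nothing derived from the prompt.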

How it works

A clean path from request to runtime

Every chat and model call follows the same predictable pipeline. Routing stays provider-agnostic — concrete runtimes are wired in at the composition layer only.

Client → Auth → Scope → Policy → LAN → Alias → Router + Fallback → Provider
01 · Authenticate & scope

The mdk_ key is verified against its argon2 hash, then checked for the required scope before anything else runs.

02 · Policy & LAN gate

Per-key policy authorizes the target by model / alias / instance / cloud / high-risk, and the LAN gate enforces exposure — fail-closed.
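
A fail-closed policy check can be sketched as "deny unless something explicitly allows". The types and the way grants combine below are assumptions for illustration; `KeyPolicy`, `GatewayRequest` and `authorize` are hypothetical names, not ModelDock's API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class KeyPolicy:
    models: FrozenSet[str] = frozenset()     # allowed instance:model ids
    aliases: FrozenSet[str] = frozenset()
    instances: FrozenSet[str] = frozenset()
    allow_cloud: bool = False
    allow_high_risk: bool = False
    allow_lan: bool = False


@dataclass(frozen=True)
class GatewayRequest:
    instance: str
    model: str
    alias: Optional[str] = None   # set when the client addressed an alias
    is_cloud: bool = False
    is_high_risk: bool = False
    from_lan: bool = False


def authorize(policy: KeyPolicy, req: GatewayRequest) -> bool:
    # Each gate denies by default; an empty policy allows nothing.
    if req.from_lan and not policy.allow_lan:
        return False
    if req.is_cloud and not policy.allow_cloud:
        return False
    if req.is_high_risk and not policy.allow_high_risk:
        return False
    return (
        f"{req.instance}:{req.model}" in policy.models
        or req.instance in policy.instances
        or (req.alias is not None and req.alias in policy.aliases)
    )
```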

03 · Resolve & route

Aliases resolve to a primary plus fallback chain; ModelRouter parses instance:model and selects the provider instance.

04 · Forward & fall back

The provider calls the runtime and restores the gateway model id. On failure it retries the fallback chain — but never on policy errors.
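
Restoring the gateway model id amounts to swapping the `model` field on the way out and back. A minimal sketch, assuming OpenAI-style request and response bodies; the helper names `to_upstream` and `restore_gateway_id` are hypothetical.

```python
def to_upstream(payload: dict, upstream_model: str) -> dict:
    """Copy the request, replacing the gateway id with the runtime's own model name."""
    return {**payload, "model": upstream_model}


def restore_gateway_id(response: dict, gateway_id: str) -> dict:
    """Copy the response, putting back the id the client originally sent."""
    return {**response, "model": gateway_id}


request = {"model": "ollama-local:qwen3:14b",
           "messages": [{"role": "user", "content": "Hi"}]}
upstream_request = to_upstream(request, "qwen3:14b")

# Stand-in for what a runtime might return; only the "model" field matters here.
upstream_response = {"model": "qwen3:14b", "choices": []}
client_response = restore_gateway_id(upstream_response, request["model"])
print(client_response["model"])  # ollama-local:qwen3:14b
```

The client sees the same id it sent, so logs and client-side routing stay consistent regardless of what the runtime calls the model.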

Security & privacy

Privacy is the architecture, not a setting

These boundaries are core to the product. Your machine stays the trust boundary — by design.

CLI credentials never touch the gateway

Codex and Claude tokens, cookies, sessions and auth files are handled entirely by the upstream CLIs. ModelDock never reads, displays, copies or stores them.

Sandboxed by default

The Codex provider runs app-server in a read-only sandbox and declines command, file-change, permission and MCP elicitation requests. Claude Code runs with tools disabled and no session persistence.

Prompt logging is opt-in

Logging of prompt content is off by default. Provider secrets flow through a masked SecretStore — the UI only ever sees a preview, never the value.

Localhost unless you say otherwise

The gateway binds to 127.0.0.1. LAN exposure requires explicitly setting the host to 0.0.0.0 and enabling allow_lan — high-risk providers default to no LAN exposure at all.
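
The LAN rule can be sketched as a single predicate: loopback clients always pass, and anything else needs every opt-in at once. `lan_gate` and its parameters are hypothetical names, and the per-instance flag reflects an assumption about how instance-level LAN exposure combines with the global setting.

```python
import ipaddress


def lan_gate(client_ip: str, host: str = "127.0.0.1",
             allow_lan: bool = False, instance_lan: bool = False) -> bool:
    """Return True if a client at client_ip may reach the target instance."""
    if ipaddress.ip_address(client_ip).is_loopback:
        return True  # local callers are always inside the trust boundary
    # Non-local callers need the wildcard bind, the global switch and the instance flag.
    return host == "0.0.0.0" and allow_lan and instance_lan


print(lan_gate("127.0.0.1"))     # True
print(lan_gate("192.168.1.20"))  # False: defaults are closed
print(lan_gate("192.168.1.20", host="0.0.0.0", allow_lan=True, instance_lan=True))  # True
```

A high-risk provider would simply leave its instance flag off, so it stays unreachable from the LAN even when the gateway itself is exposed.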

FAQ

Frequently asked questions

Short, factual answers about what ModelDock is and how it works.

What is ModelDock?

ModelDock is a local and LAN AI Runtime Gateway — the NGINX for AI runtimes. It runs on your machine and exposes one OpenAI-compatible API on 127.0.0.1:3737 that routes, audits and secures requests across Ollama, LM Studio, DashScope, Codex and Claude Code.

Is ModelDock free and open source?

Yes. ModelDock is open source under the MIT license and free to use. You can download the desktop app or build it from source.

Which AI runtimes does it support?

Out of the box: Ollama, LM Studio, DashScope, Codex and Claude Code. You can run multiple instances of one runtime, and add new providers without changing your client code.

Does ModelDock store my API keys or upstream credentials?

No upstream credentials ever leave your machine. The gateway never reads, displays, copies or stores Codex or Claude tokens. Local gateway keys (mdk_) are stored only as an argon2 hash plus a display prefix, and provider secrets are masked in the UI.

How do I call ModelDock from the OpenAI SDK?

Point any OpenAI-compatible SDK at http://127.0.0.1:3737/v1 with an mdk_ key, and set the model to instance:model (for example ollama-local:qwen3:14b) or a model alias such as default-code.

What platforms does ModelDock run on?

ModelDock ships as a Tauri v2 desktop app for macOS, Windows and Linux, with a Rust core that boots the embedded gateway on startup.

Run every model from one place

Install the desktop app, start the embedded gateway, and point your tools at a single local endpoint. Open source and free.

Defaults to 127.0.0.1:3737 · Rust workspace + Tauri v2