Model Providers

Supported LLM providers and models for OpenClaw

OpenClaw supports a wide range of LLM providers, from subscription-based services like Anthropic and OpenAI to local models you run entirely on your machine. Choose the provider that best fits your needs for privacy, cost, and performance.

Recommended Providers

Anthropic (Claude)

Recommended: Anthropic Pro/Max (100/200) + Opus 4.6

  • Why: Long-context strength and better prompt-injection resistance
  • Models: Claude Opus 4.6, Claude Opus 4.5, Claude Sonnet, Claude Haiku
  • Authentication: OAuth (subscription) or API keys
  • Best For: High-quality responses, long conversations, security-conscious users

OpenAI (GPT)

Supports ChatGPT and Codex models:

  • Models: GPT-4, GPT-3.5, Codex (including gpt-5.3-codex)
  • Authentication: OAuth (subscription) or API keys
  • Best For: Code generation, general purpose tasks

Other Supported Providers

  • Gemini - Google's Gemini models, including Gemini 3.1 (google/gemini-3.1-pro-preview) from 2026.2.21
  • Volcano Engine (Doubao) / BytePlus - Doubao, Kimi, GLM, DeepSeek; coding variants via volcengine-plan; onboard with volcengine-api-key
  • Moonshot - Moonshot AI (Kimi) models
  • Minimax - Minimax models (popular for local-like performance)
  • Vercel AI Gateway - Unified gateway for multiple providers
  • OpenRouter - Access to multiple models through one API
  • Bedrock - AWS Bedrock models
  • GLM - GLM (Zhipu) models via Volcano Engine or Z.AI
  • Zai - Z.AI provider for GLM-5, GLM-4.7, Coding Plan
  • xAI (Grok) - xAI Grok models
  • Hugging Face Inference - Run models via the Hugging Face Inference API (first-class onboarding and API key support)
  • vLLM - Self-hosted, OpenAI-compatible inference for models such as Llama, Mistral, and Qwen; optimized for high throughput

Ollama (Local + Cloud)

Ollama is the fastest way to run OpenClaw on Mac and Linux. With Ollama 0.17+, a single command installs and configures everything:

One command
ollama launch openclaw --model kimi-k2.5:cloud

Ollama supports local models (full privacy, no API costs) and cloud models (e.g. kimi-k2.5, minimax-m2.5, glm-5) with full context. OpenClaw integrates via the native Ollama API for streaming and tool calling. Use baseUrl: "http://host:11434" (not /v1). See Ollama + OpenClaw tutorial and docs.openclaw.ai/providers/ollama.
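Putting the above together, an Ollama provider entry might look like the following sketch. The baseUrl value and model names come from the text above; the providers block shape and other field names are assumptions, so check the linked Ollama docs for the exact schema:

```json
{
  "agent": {
    "model": "ollama/kimi-k2.5:cloud"
  },
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434"
    }
  }
}
```

Note the base URL points at the native Ollama API port without a /v1 suffix, as described above.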

Local Models (General)

Run models entirely on your machine for complete privacy:

  • Complete Privacy - No data leaves your machine
  • No API Costs - Free to run (after initial setup)
  • Offline Capable - Works without internet
  • Customizable - Fine-tune and modify models

Configure local models in your Gateway configuration. Requires sufficient hardware (GPU recommended for best performance). Ollama (above) is the recommended path for local setup. For high-throughput self-hosting, see vLLM.
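For a vLLM-backed setup, vLLM serves an OpenAI-compatible API (by default on port 8000 under /v1), so a Gateway entry could point at it. This is a sketch; the providers block shape and field names are assumptions, and the model ID is illustrative:

```json
{
  "agent": {
    "model": "vllm/llama-3.1-70b"
  },
  "providers": {
    "vllm": {
      "baseUrl": "http://localhost:8000/v1"
    }
  }
}
```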

Hardware Requirements (Local Models)

Rough guidelines for running local models. Actual needs depend on model size and inference server (Ollama, vLLM, etc.):

  • 7B models — 8–16GB VRAM (or CPU with 16GB+ RAM, slower)
  • 13B models — 16–24GB VRAM
  • 34B–70B models — 48–80GB VRAM or multi-GPU

For lighter setups, use Ollama (optimized for consumer hardware), cloud-backed models via Ollama, or API providers.
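The guidelines above follow from a simple rule of thumb: weight memory is parameter count times bytes per parameter (2 for fp16, 1 for 8-bit, 0.5 for 4-bit quantization), plus overhead for KV cache and activations. A minimal sketch (the 1.2x overhead factor is an assumption, not a guarantee):

```python
def estimate_vram_gb(params_billion: float, bytes_per_param: float,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate: weight memory times an overhead factor
    covering KV cache and activations (rule of thumb only)."""
    weights_gb = params_billion * bytes_per_param  # 1e9 params / 1e9 bytes-per-GB cancel
    return weights_gb * overhead

# Common precisions: fp16 = 2 bytes/param, 8-bit = 1, 4-bit = 0.5
for name, bpp in [("fp16", 2.0), ("q8", 1.0), ("q4", 0.5)]:
    print(f"7B @ {name}: ~{estimate_vram_gb(7, bpp):.1f} GB")
```

This is why a 4-bit quantized 7B model fits comfortably on consumer GPUs while an fp16 70B model needs multi-GPU or 80GB-class hardware.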

Authentication Methods

OAuth (Subscription-Based)

For subscription services like Claude Pro/Max and ChatGPT Plus:

  • Authenticate using your subscription account
  • No API keys needed
  • Uses your subscription limits
  • More convenient for personal use

API Keys

For pay-per-use or API-based access:

  • Configure API keys in credentials
  • Pay per token/request
  • More control over costs
  • Better for production/enterprise use
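A credentials entry might look like the following sketch. The top-level credentials key is mentioned above; the per-provider field names and environment-variable interpolation are assumptions:

```json
{
  "credentials": {
    "anthropic": { "apiKey": "${ANTHROPIC_API_KEY}" },
    "openai": { "apiKey": "${OPENAI_API_KEY}" }
  }
}
```

Keeping keys in environment variables rather than inline avoids committing secrets to version control.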

Model Failover

Configure automatic failover for reliability:

Failover Config
{
  "agent": {
    "model": "anthropic/claude-opus-4-6",
    "fallback": [
      "anthropic/claude-sonnet",
      "openai/gpt-4"
    ]
  }
}

If the primary model fails or is unavailable, OpenClaw automatically tries fallback models in order.

Profile Rotation

Rotate between multiple authentication profiles:

  • Use multiple API keys or OAuth accounts
  • Distribute load across profiles
  • Handle rate limits automatically
  • Increase reliability and throughput
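A rotation setup could be sketched as follows. Every field name here (profiles, auth, rotation) is a hypothetical illustration of the idea, not confirmed OpenClaw schema:

```json
{
  "agent": {
    "model": "anthropic/claude-opus-4-6",
    "profiles": [
      { "auth": "oauth", "account": "personal" },
      { "auth": "apiKey", "key": "${ANTHROPIC_API_KEY_2}" }
    ],
    "rotation": "round-robin"
  }
}
```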

Model Routing & Cost

Running an always-on agent with a single premium model for every task can get expensive. Many users adopt tiered model routing:

  • Main chat: Use a strong model (e.g. Claude Opus) for direct conversations and important decisions.
  • Heartbeats and cron: Use a cheaper, fast model (e.g. Gemini Flash) for scheduled “wake up and check” tasks—see Automation for cron and heartbeats.
  • Subagents / background work: Route background tasks to a capable but cost-effective model so the main session stays responsive.
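The tiers above can be sketched as separate model assignments per task type. The heartbeat and subagents keys are assumptions about where such settings would live; the model IDs follow the provider/model pattern used elsewhere on this page:

```json
{
  "agent": {
    "model": "anthropic/claude-opus-4-6"
  },
  "heartbeat": {
    "model": "google/gemini-flash"
  },
  "subagents": {
    "model": "anthropic/claude-sonnet"
  }
}
```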

Costs vary with usage (from tens to hundreds of dollars per month). For example setups and cost management, see Example Setups & Model Routing.

If you prefer automatic per-request routing, where each query is sent to the cheapest model that can handle it without assigning tiers manually, third-party plugins can do that. For example, ClawRouter (BlockRunAI) runs routing locally, supports 41+ models through one wallet, and bills pay-per-request (USDC on Base). Set your model to blockrun/auto after installing the plugin. See the Skills list for discovery; we don't endorse or guarantee third-party tools.
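With such a plugin installed, the switch is a one-line model change using the blockrun/auto ID named above:

```json
{
  "agent": {
    "model": "blockrun/auto"
  }
}
```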

Usage Tracking

OpenClaw tracks model usage:

  • Monitor token usage per model
  • Track costs (for API-based providers)
  • View usage statistics
  • Set usage limits and alerts

Configuration Examples

Minimal Configuration

Minimal Config
{
  "agent": {
    "model": "anthropic/claude-opus-4-6"
  }
}

With Failover

Failover Config
{
  "agent": {
    "model": "anthropic/claude-opus-4-6",
    "fallback": ["anthropic/claude-sonnet"]
  }
}

See also