Hugging Face Inference with OpenClaw

Run open-source models via the Hugging Face Inference API

OpenClaw integrates with the Hugging Face Inference API, so you can run thousands of open-source models (Llama, Mistral, Phi, Qwen, and many more) without hosting them yourself. OpenClaw supports first-class onboarding and API key configuration for Hugging Face—set up during openclaw onboard or in the Setup Wizard. It's a good fit if you want to try open-source models on a pay-per-use basis, or if you prefer models from the Hugging Face catalog.

Why Hugging Face Inference

  • Thousands of models — Llama, Mistral, Phi, Qwen, Gemma, and more
  • No self-hosting — Inference runs on Hugging Face infrastructure
  • Pay-per-use — No need to provision GPU or manage servers
  • First-class onboarding — API key setup during OpenClaw install
  • Open-source focus — Access models you can't get from closed APIs

Authentication

Create an API token at huggingface.co/settings/tokens. For the Inference API, production use may require a Pro subscription or pay-as-you-go billing. Store the token in ~/.openclaw/credentials or via openclaw secrets. Onboarding can guide you through this.
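As a rough illustration of the lookup order described above, here is a minimal sketch of resolving the token from either the environment or the credentials file. The HF_TOKEN variable name and the one-KEY=VALUE-per-line file format are assumptions for this example, not the documented OpenClaw credentials format.

```python
import os
from pathlib import Path

def load_hf_token(credentials_path="~/.openclaw/credentials"):
    """Return a Hugging Face token from the environment, falling back to
    the OpenClaw credentials file (file format assumed: KEY=VALUE lines)."""
    token = os.environ.get("HF_TOKEN")
    if token:
        return token
    path = Path(credentials_path).expanduser()
    if path.exists():
        for line in path.read_text().splitlines():
            if line.startswith("HF_TOKEN="):
                return line.split("=", 1)[1].strip()
    return None
```

Checking the environment first lets a shell-exported token override the stored one, which is handy when experimenting with multiple accounts.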

Basic Configuration

Configure the Hugging Face provider and model ID. Model IDs use the Hugging Face format (e.g. meta-llama/Llama-3-8B-Instruct). Browse models at huggingface.co/models.

Hugging Face config
{
  "agent": {
    "model": "meta-llama/Llama-3-8B-Instruct",
    "provider": "huggingface"
  }
}
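A quick way to catch typos before launching the agent is to sanity-check the fragment's shape — the provider value and the namespace/model-name form of the model ID. The schema below mirrors only the fragment above; it is an assumption for illustration, not the full OpenClaw config format.

```python
import json
import re

# The config fragment shown above.
CONFIG = """
{
  "agent": {
    "model": "meta-llama/Llama-3-8B-Instruct",
    "provider": "huggingface"
  }
}
"""

def validate(config_text):
    """Parse the agent section and check the provider and model ID shape."""
    cfg = json.loads(config_text)["agent"]
    assert cfg["provider"] == "huggingface"
    # Hugging Face model IDs take the form "namespace/model-name".
    assert re.fullmatch(r"[\w.-]+/[\w.-]+", cfg["model"])
    return cfg
```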

See Configuration guide and OpenClaw provider docs for full options.

When to Use Hugging Face vs Others

  • Hugging Face — Open-source models, no GPU to manage, experiment with many models
  • Ollama — Local models, full privacy, no API cost (but you need hardware)
  • Anthropic/OpenAI — Top-tier closed models, best quality for many tasks
  • OpenRouter — Multi-provider gateway including Hugging Face

Related