Hugging Face Inference with OpenClaw

Run open-source models via the Hugging Face Inference API

OpenClaw integrates with the Hugging Face Inference API, so you can run thousands of open-source models (Llama, Mistral, Phi, Qwen, and many more) without hosting them yourself. OpenClaw supports first-class onboarding and API key configuration for Hugging Face: set it up during openclaw onboard or in the Setup Wizard. This is a good fit if you want to try open-source models on a pay-per-use basis, or if you prefer models from the Hugging Face catalog.

Why Hugging Face Inference

  • Thousands of models - Llama, Mistral, Phi, Qwen, Gemma, and more
  • No self-hosting - inference runs on Hugging Face infrastructure
  • Pay-per-use - no need to provision GPUs or manage servers
  • First-class onboarding - API key setup during OpenClaw install
  • Open-source focus - access models you can't get from closed APIs

Authentication

Create an API token at huggingface.co/settings/tokens. Production use of the Inference API may require a Pro subscription or pay-as-you-go billing. Store the token in ~/.openclaw/credentials or via openclaw secrets; onboarding can walk you through this.
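Before handing a token to OpenClaw, you can sanity-check it against the Hugging Face Hub's whoami endpoint, which identifies the account behind a token. A minimal Python sketch using only the standard library (the whoami-v2 endpoint and Bearer-token header are standard Hub API conventions; everything else here is illustrative):

```python
import urllib.request
import urllib.error

# Returns account details for the token that authenticates the request.
WHOAMI_URL = "https://huggingface.co/api/whoami-v2"

def auth_headers(token: str) -> dict:
    # Hugging Face APIs expect a Bearer token in the Authorization header.
    return {"Authorization": f"Bearer {token}"}

def token_is_valid(token: str) -> bool:
    # A 200 response means the Hub accepted the token; a 401 raises HTTPError.
    req = urllib.request.Request(WHOAMI_URL, headers=auth_headers(token))
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False
```

Run token_is_valid("hf_...") with a real token; a False result usually means the token was revoked or mistyped.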

Basic Configuration

Configure the Hugging Face provider and model ID. Model IDs use the Hugging Face format (e.g. meta-llama/Llama-3-8B-Instruct). Browse models at huggingface.co/models.

Hugging Face config:

```json
{
  "agent": {
    "model": "meta-llama/Llama-3-8B-Instruct",
    "provider": "huggingface"
  }
}
```
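Outside of OpenClaw, you can smoke-test a model ID directly against the serverless Inference API, which routes requests by model ID in the URL path. A standard-library Python sketch (the api-inference.huggingface.co endpoint and the {"inputs": ...} request body follow Hugging Face's documented interface; the placeholder token is illustrative):

```python
import json
import urllib.request

def build_request(model_id: str, prompt: str, token: str) -> urllib.request.Request:
    # The serverless Inference API addresses a model by its Hub ID in the path.
    url = f"https://api-inference.huggingface.co/models/{model_id}"
    body = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

req = build_request("meta-llama/Llama-3-8B-Instruct", "Hello!", "hf_xxx")
# urllib.request.urlopen(req) would send it; omitted so the sketch runs
# without network access or a real token.
```

A 404 from this endpoint usually means a mistyped model ID; a 403 means the model is gated and your account has not accepted its license terms.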

See Configuration guide and OpenClaw provider docs for full options.

When to Use Hugging Face vs Others

  • Hugging Face - open-source models, no GPU to manage, easy to experiment with many models
  • Ollama - local models, full privacy, no API cost (but you need the hardware)
  • Anthropic/OpenAI - top-tier closed models, best quality for many tasks
  • OpenRouter - multi-provider gateway, including Hugging Face models

Related