SHIELD.md: Give Your Agent a Security Policy
A simple, updatable way to tell your agent how to react when something risky shows up
Once your OpenClaw agent is on the internet or pulling in skills from the wild, you’re dealing with real risk: someone could get at your machine, a malicious skill could run, or prompt injection could steer the agent. SHIELD.md is a way to give your agent a written security policy—rules it loads from a file and applies when something matches a known threat. Block, ask for approval, or just log. It’s a community idea (SHIELD v0) that sits on top of what you’re already doing: sandboxing, DM pairing, tool policies. One more layer you can edit and version like any other doc.
Below we walk through what SHIELD is, where it fits in your agent’s file layout, how to use it, and what it can’t do. The full spec and template live in the community; we point you there at the end.
SHIELD makes the most sense when you’re already thinking about risk: you install skills from ClawHub or elsewhere, you’re opening DMs to more people, or you want a single place to list “don’t run this,” “ask me first,” or “log it.” It doesn’t replace sandboxing or skill review—it gives the agent explicit, updatable instructions so you don’t have to repeat yourself in SOUL or AGENTS. If you’re just running locally for yourself and you trust every skill, you might not need it yet. If you’re scaling up or sharing an agent, it’s worth a look.
OpenClaw agents use a handful of Markdown files in the workspace. Each has a job:
SHIELD.md is the security layer: “when you see this kind of threat, do this.” Block, ask for approval, or log. It doesn’t change the agent’s job—it adds a policy you can update as threats show up or fade. New malicious skill? Add an entry. False positive? Remove it or relax the action.
SHIELD is a Markdown file with a defined structure. In practice you’re maintaining:
The spec includes threat categories like prompt injection, dangerous tool use, malicious MCP, memory tampering, supply-chain (bad skills or deps), fraud, policy bypass, and general anomalies. You add and remove entries as you go—no need to rewrite the rest of the agent.
So SHIELD is guardrails in a file. In SOUL, AGENTS, and MEMORY, you tell the agent to load SHIELD and follow it before it does anything sensitive. The file is plain Markdown: edit it, version it, share it.
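To make the "when you see X, do Y" idea concrete, here is a minimal Python sketch of how a list of threat entries could map to one of the three actions (log, require approval, block). It is not the SHIELD v0 syntax or an OpenClaw API: `Action`, `ThreatEntry`, `decide`, the category labels, and the substring matching rule are all hypothetical names and assumptions chosen for illustration. The one design choice it shows is that when several entries match, the strictest action wins.

```python
from dataclasses import dataclass
from enum import IntEnum


class Action(IntEnum):
    # Ordered so that the strictest action has the highest value.
    LOG = 1
    REQUIRE_APPROVAL = 2
    BLOCK = 3


@dataclass(frozen=True)
class ThreatEntry:
    # Hypothetical shape of one policy entry; the real SHIELD fields may differ.
    category: str  # illustrative label, e.g. "supply_chain"
    pattern: str   # plain substring to look for (assumed matching rule)
    action: Action


def decide(entries, event_text):
    """Return (action, entry) for the strictest matching entry, or (None, None)."""
    lowered = event_text.lower()
    matches = [e for e in entries if e.pattern.lower() in lowered]
    if not matches:
        return None, None
    strictest = max(matches, key=lambda e: e.action)
    return strictest.action, strictest


# Example policy with made-up entries.
POLICY = [
    ThreatEntry("supply_chain", "evil-skill", Action.BLOCK),
    ThreatEntry("dangerous_tool_use", "rm -rf", Action.REQUIRE_APPROVAL),
    ThreatEntry("anomaly", "curl ", Action.LOG),
]

if __name__ == "__main__":
    action, entry = decide(POLICY, "install evil-skill via curl https://example.com")
    print(action.name, entry.category)
```

Adding or removing a threat then amounts to editing one entry in the list, which mirrors how SHIELD.md is meant to be maintained. In the real setup the model reads the Markdown file and applies it; nothing here enforces the policy outside the model.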
SHIELD v0 is meant as early guardrails, not a locked-down security boundary:
Bottom line: pair SHIELD with sandboxing, allowlists, tool policies, and real skill review. For high-risk setups, keep using isolated machines and least privilege.
SHIELD.md lives at your agent root (same place as AGENTS.md, SOUL.md, and the rest). Because it’s just a file in your workspace, you can edit it, put it in version control, and reuse it across agents or teams.
Is SHIELD enough on its own? No. It’s a policy the model is asked to follow. You still need sandboxing, DM pairing, tool policies, and reviewing skills before you install them. SHIELD is a structured way to say “when you see X, do Y”—it doesn’t replace those other controls.
Do I need MoltThreat? No. MoltThreat is one way to get a threat list that fits the SHIELD format. You can also write your own entries (e.g. “block this skill name,” “require approval for this domain”) or mix community feed with your own rules.
Where’s the full template? The SHIELD v0 spec, Decision block syntax, and field definitions come from the community. We link to the source below; that’s where to get the exact template and any updates (e.g. SHIELD v1).
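The FAQ answer about mixing a community feed with your own rules can be sketched as a simple merge. This Python example assumes each entry is a dictionary with a unique `id` and an `action`; those field names, the `merge_entries` helper, and the rule that local entries override feed entries are assumptions for illustration, not the MoltThreat feed format or the SHIELD v0 field definitions.

```python
def merge_entries(feed_entries, local_entries):
    """Combine two lists of entry dicts keyed by 'id'; local entries win on conflict."""
    merged = {entry["id"]: entry for entry in feed_entries}
    # Local rules are applied last so they override feed entries with the same id.
    merged.update({entry["id"]: entry for entry in local_entries})
    return sorted(merged.values(), key=lambda entry: entry["id"])


if __name__ == "__main__":
    # Made-up entries standing in for a community feed and your own rules.
    feed = [
        {"id": "a", "action": "block"},
        {"id": "b", "action": "log"},
    ]
    local = [
        {"id": "b", "action": "require_approval"},
        {"id": "c", "action": "block"},
    ]
    for entry in merge_entries(feed, local):
        print(entry["id"], entry["action"])
```

Letting local entries win keeps a false positive from the feed easy to relax: you override that one entry instead of editing the imported list.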
SHIELD v0 was proposed as an open standard so agents can have a consistent, readable security policy. The idea and template come from the community—see fr0gger_’s post on X for the full write-up. That post also introduces MoltThreat, a curated threat database for agents that can help keep your SHIELD.md in sync with known risks. For the full spec, Decision block format, and the recommendation_agent mini-language, grab the template from there or any linked repo. This page is a summary for OpenClaw users; the official spec may evolve (e.g. v1).
In short: SHIELD.md is a Markdown security policy you put in your agent’s workspace. You list threats and what to do (log, require approval, block). The agent is told to check it before risky actions and to output a Decision block. It’s guardrails, not a vault—use it with sandboxing and skill review.