We're releasing openclaw-shield, an open-source security plugin that adds guardrails to OpenClaw agents. It prevents secret leaks, PII exposure, and destructive command execution.
The Risk
AI agents operating on behalf of users can access files, run shell commands, and produce text responses. Without guardrails, they can read .env files and output raw API keys, display Social Security numbers or credit card numbers, execute destructive commands like rm -rf, or exfiltrate credentials by embedding them in shell commands. That's what openclaw-shield prevents.
It uses five layers of defense in depth, each of which can be toggled independently:
- Prompt Guard - injects security policy into the agent context before each turn
- Output Scanner - redacts secrets and PII from tool output before transcript persistence
- Tool Blocker - blocks dangerous tool calls at the host level before execution
- Input Audit - logs inbound messages and flags any secrets users accidentally send
- Security Gate - requires the agent to call a gate tool before exec or file-read, returning ALLOWED or DENIED
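To make the gate idea concrete, here is a minimal sketch of an allow/deny decision over a shell command. It is written in Python purely for illustration and is not the plugin's code: the function name security_gate, the deny-list, and the extra_patterns parameter are all assumptions, and naive pattern matching like this will produce false positives.

```python
import re

# Hypothetical deny-list for illustration; the plugin's real rules are not shown in this post.
DESTRUCTIVE_PATTERNS = [
    re.compile(r"\brm\b"),
    re.compile(r"\bmkfs(\.\w+)?\b"),
    re.compile(r"\bdd\b"),
]


def security_gate(command: str, extra_patterns=()) -> str:
    """Return "DENIED" if any deny pattern matches the command, else "ALLOWED"."""
    # Custom patterns (an assumed feature of this sketch) are compiled and appended.
    patterns = list(DESTRUCTIVE_PATTERNS) + [re.compile(p) for p in extra_patterns]
    for pattern in patterns:
        if pattern.search(command):
            return "DENIED"
    return "ALLOWED"


if __name__ == "__main__":
    print(security_gate("rm -rf /tmp/build"))
    print(security_gate("ls -la"))
    print(security_gate("git push --force", extra_patterns=[r"--force"]))
```

In this sketch the decision is a plain string so the caller can log it or hand it back to the agent unchanged.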
openclaw-shield detects AWS keys, GitHub tokens, Stripe keys, JWTs, private keys, and more. It catches PII including emails, SSNs, credit card numbers, and phone numbers. It also blocks destructive commands like rm, format, mkfs, and dd, plus any custom patterns you define.
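Pattern-based redaction of this kind can be sketched in a few lines. The example below is an illustration in Python, not the plugin's implementation: the redact function, the label names, and the regular expressions are assumptions based on commonly documented formats (AWS access key IDs starting with AKIA, GitHub personal access tokens starting with ghp_, and the NNN-NN-NNNN SSN layout), and a real rule set would be considerably broader.

```python
import re

# Illustrative patterns only; these are assumptions, not the plugin's actual rule set.
PATTERNS = {
    "AWS_ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GITHUB_TOKEN": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
    print(redact("AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE"))
    print(redact("Contact alice@example.com, SSN 123-45-6789"))
```

Labeling each placeholder keeps the transcript useful for debugging without persisting the sensitive value itself.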
Installation is One Line:
openclaw plugins install @knostic/openclaw-shield
No build step, no external dependencies, no database. Defaults are secure out of the box.
Critical Known Limitations
OpenClaw is updated constantly, and without community updates, openclaw-shield won't stay effective for more than a few days. We've already had to update it several times. PRs are welcome and encouraged.
Knostic: Discovery and Control for the Agent Layer
If you're looking for visibility and control over your coding agents, MCP servers, and IDE extensions, from Cursor and Claude Code to Copilot, check out what we're building at https://www.getkirin.com/