TL;DR

  • Risk level: High
  • Who is affected: Anyone running OpenClaw with default or misconfigured settings
  • Main issues: Prompt injection, supply chain (skills), gateway exposure, privileged access
  • Key insight: Local-first doesn’t mean safe—it’s a different risk profile, not a lower one
  • Mitigation: Isolation, sandboxing, approval workflows, and network segmentation

What OpenClaw Actually Does

OpenClaw is an autonomous AI agent that runs on hardware you control. Unlike chatbots that wait for prompts, it’s designed to be ambient—always running, periodically acting, integrated into your existing communication channels.

Core Architecture

ComponentFunctionRisk Surface
GatewayLocal web dashboard + CLINetwork exposure, auth bypass
ConnectorsChat app integrations (15+ platforms)Message interception, impersonation
SkillsDownloadable capability bundlesSupply chain, arbitrary code execution
HeartbeatScheduled task executionPersistent automation, unattended actions
LLM InterfaceAPI calls to Claude/OpenAI/local modelsAPI key exposure, prompt injection

The “Local-First” Tradeoff

OpenClaw’s pitch is compelling: your data stays on your hardware, you control the infrastructure, no vendor lock-in. But this shifts the security burden from the vendor to you.

SaaS model: Vendor secures the platform, you trust their security team
Local-first model: You secure everything, the vendor provides code

This isn’t inherently worse—it’s just different. Many organizations can secure local infrastructure better than they can audit SaaS security. But it requires understanding the risks.

The Five Core Risk Categories

1. Prompt Injection (Direct and Indirect)

Status: Acknowledged as unsolved by OpenClaw documentation

OpenClaw’s own security docs state: “system prompts are soft guidance only; hard enforcement comes from tool policy, approvals, sandboxing, allowlists.”

Direct injection: Attacker DMs your WhatsApp-connected agent with malicious instructions
Indirect injection: Agent reads a compromised webpage, email attachment, or document containing hidden instructions

Why it’s hard to fix: LLMs process all input as context. There’s no reliable way to distinguish “user’s legitimate request” from “attacker’s malicious instruction.” Current mitigations (system prompts, output filtering) are probabilistic, not deterministic.

OWASP LLM Top 10: Prompt injection remains #1.

2. Supply Chain (Skills and Plugins)

Skills are structured instruction bundles (SKILL.md files) that expand agent capabilities. Plugins run in-process with the gateway.

The risk: Installing a skill is functionally equivalent to running arbitrary code.

Skill installation flow:
1. User sends skill link to agent
2. Agent fetches SKILL.md from URL
3. Agent parses instructions and updates capabilities
4. New capabilities execute with agent's permissions

Attack vectors:

  • Compromised skill repository (ClawHub)
  • Typosquatting skill names
  • Malicious skills disguised as utilities
  • Skills that fetch additional remote instructions

OpenClaw’s warning: Their documentation explicitly states to treat skill installs like running arbitrary code.

3. Gateway Exposure

The gateway provides a web dashboard for controlling your agent. Historically, default configurations bound to all interfaces (0.0.0.0), making them discoverable from the internet.

Real-world incidents (January 2026):

  • Security researchers found widespread exposed gateways
  • Pillar Security documented attack traffic targeting these instances
  • Misconfigured reverse proxies bypassed authentication

What exposed gateways leak:

  • API tokens for connected services
  • Conversation logs
  • Agent configurations
  • File system access (if enabled)

4. Privileged Access

OpenClaw agents typically run with significant permissions:

  • Chat apps: Read/write access to connected platforms (WhatsApp, Telegram, Slack, etc.)
  • File system: Read/write local files (configurable but often broad)
  • Command execution: Shell access on the host system (if enabled)
  • Network: Outbound connections to APIs, services, and the internet

No built-in approval workflows: By default, the agent doesn’t ask before acting. If it decides to send a message, delete a file, or execute a command, it just does it.

5. Fetch-and-Follow (Platform Integration)

When connected to platforms like Moltbook, agents periodically download and execute remote instructions. See /risks/moltbook/fetch-and-follow-risk/ for detailed analysis.

Verified Security Incidents

DateIncidentImpactSource
Jan 27, 2026Fake VS Code extensionScreenConnect RAT installedAikido Security
Jan 30-31, 2026Exposed gatewaysCredential and log exposurePillar Security
Jan 31, 2026Moltbook database breach32,000+ agent credentials404 Media

These incidents happened within one week of viral growth. The attack surface was identified and exploited faster than most open-source projects see their first security review.

Risk Mitigation Strategies

Isolation

Run OpenClaw in environments that limit blast radius:

  • VPS: Off-network deployment (see /implement/openclaw/yolo-safely/)
  • VM: Dedicated virtual machine with no access to production
  • Container: Docker with read-only filesystems, restricted capabilities
  • Separate identity: Burner accounts for experimentation

Sandboxing

Contain what the agent can access:

  • Read-only filesystem except specific directories
  • No access to SSH keys, credentials, or sensitive files
  • Network egress restrictions (outbound 443 only)
  • No command execution permissions for untrusted skills

Approval Workflows

Implement human-in-the-loop for high-risk operations:

  • Confirmation before sending messages
  • Approval for file deletions or modifications
  • Review for command execution
  • Logging and alerting for all actions

Monitoring

Visibility into agent behavior:

  • Log all API calls and their parameters
  • Audit file access patterns
  • Monitor network egress
  • Alert on anomalous behavior

Who Should Avoid OpenClaw

  • Organizations with strict compliance requirements (healthcare, finance)
  • Anyone processing sensitive personal data without security expertise
  • Teams without capacity to monitor and maintain agent infrastructure
  • Users who need guaranteed availability or audit trails
  • Anyone uncomfortable with probabilistic security controls

Who Can Use It Safely

  • Security-conscious individuals with isolated infrastructure
  • Teams with dedicated security resources
  • Experimentation and prototyping (with proper containment)
  • Low-risk automation (non-sensitive data, recoverable operations)
  • Users who understand and accept the risk tradeoffs

Evidence Quality

ClaimConfidenceEvidence
Prompt injection unsolvedHighOpenClaw official documentation
Gateway exposure incidentsHighPillar Security reporting
Fake VS Code extensionHighAikido Security analysis
Skill system risksHighOpenClaw documentation warnings
Privileged access modelHighArchitecture documentation

What Would Change This Assessment

  • Built-in deterministic prompt injection defenses
  • Cryptographically signed skills with verification
  • Mandatory approval workflows for high-risk operations
  • Comprehensive audit logging by default
  • Independent security audit of codebase

AIHackers Verdict

OpenClaw represents a meaningful shift in AI agent architecture: local-first, user-controlled, extensible. These are genuine advantages over SaaS alternatives. But the security model requires expertise and ongoing vigilance.

Bottom line: OpenClaw is powerful because it’s local-first, but that power comes with responsibility. Treat it like any other privileged system access tool: isolate it, monitor it, and never give it more permissions than absolutely necessary.


What to Do Next

Already running OpenClaw? Secure your deployment — 10-minute hardening checklist.

Evaluating OpenClaw for your team? Review the verified security incidents before making a decision.

Not convinced the risks are real? Read about the January 31 Moltbook breach — 32,000 agents compromised in one configuration error.

The incidents of January 2026 weren’t flukes—they were predictable consequences of rapid growth meeting a broad attack surface. Going forward, assume that if you run OpenClaw, someone will try to exploit it. Design your deployment accordingly.

Sources

  • OpenClaw documentation (docs.openclaw.ai)
  • OpenClaw security docs (docs.openclaw.ai/gateway/security)
  • Aikido Security: Fake VS Code extension analysis
  • Pillar Security: Gateway exposure findings
  • OWASP Top 10 for LLM Applications