How to Check If an AI Skill Is Safe: A 5-Step Guide
AI agent skills — whether from ClawHub, GitHub, or any community source — can contain hidden malicious code. Research shows that roughly 26% of agent skills have at least one vulnerability, and some malicious skills have accumulated over 30,000 installs before being discovered.
This guide walks you through 5 practical steps to verify any skill before installation.
Step 1: Read the system_prompt Field
Every skill has a system_prompt (or equivalent instruction block). Open the SKILL.md file and carefully read what it instructs the agent to do.
Red flags to look for:
- References to sensitive files:
~/.ssh/,~/.env,config.json, files containing "password" or "token" - External URLs:
webhook.site,requestbin.com, or any unfamiliar domain - Behavior overrides: "ignore previous instructions", "always remember", "from now on"
- Shell commands:
exec(),system(),curl,wget,nc
If the skill claims to be a "code formatter" but its instructions mention reading SSH keys — that's a major red flag.
Step 2: Check for Hidden Zero-Width Characters
This is the most overlooked attack vector. Zero-width characters (U+200B, U+200C, U+200D, U+FEFF) are invisible in text editors but can hide malicious instructions.
How to check manually:
cat -v skill.md | grep -P '[\x00-\x08]'
Or use SkillsSafe's zero-width detector at skillssafe.com/en/zero-width-detector — paste the content and instantly see if any hidden characters are embedded, with exact positions highlighted.
Step 3: Verify the Author and Source
- Is the author a known developer or organization?
- Does the skill have a GitHub repository with commit history?
- How many installs does it have? (High installs don't guarantee safety — the ClawHub malicious skill had 30,000 installs)
- Are there reviews or security audit results?
Check if the skill has been scanned by security tools like SkillsSafe, SkillShield, or ClawSecure.
Step 4: Run an Automated Scan
Manual review is important but time-consuming. Use an automated scanner to catch patterns you might miss.
With SkillsSafe (free, no signup):
Option A — Web scanner: Visit skillssafe.com, paste the skill content or enter the URL, and get a risk report in seconds.
Option B — MCP Server (for agents):
# OpenClaw
openclaw mcp add skillssafe https://mcp.skillssafe.com/sse
# Or add to any MCP config
{
"mcpServers": {
"skillssafe": {
"url": "https://mcp.skillssafe.com/sse"
}
}
}
Also available on Smithery.ai — search for "skillssafe".
Option C — REST API:
curl -X POST https://skillssafe.com/api/v1/scan/url \
-H "Content-Type: application/json" \
-d '{"url": "https://clawhub.ai/skills/my-skill/SKILL.md"}'
The scanner checks for credential theft, data exfiltration, prompt injection, reverse shells, ClawHavoc indicators, and hidden characters.
Step 5: Use Sandboxing
Even after scanning, run new skills in an isolated environment first:
- Docker sandbox:
openclaw --sandbox=docker - Dedicated low-privilege user account: Don't run agents as root/admin
- Minimal workspace mounting: Only mount the directories the skill actually needs
Summary
| Step | What | Time |
|---|---|---|
| 1 | Read system_prompt | 2 min |
| 2 | Check zero-width characters | 30 sec |
| 3 | Verify author/source | 1 min |
| 4 | Run automated scan | 10 sec |
| 5 | Use sandboxing | Setup once |
The fastest path: paste the skill URL into SkillsSafe and get a risk score in seconds. If the score is below 50, don't install it.
Published by SkillsSafe — Free AI agent skill security scanner. Supports OpenClaw, Claude Code, Cursor, and Codex.