How to Check If an AI Skill Is Safe: A 5-Step Guide

AI agent skills — whether from ClawHub, GitHub, or any community source — can contain hidden malicious code. Research shows that roughly 26% of agent skills have at least one vulnerability, and some malicious skills have accumulated over 30,000 installs before being discovered.

This guide walks you through 5 practical steps to verify any skill before installation.

Step 1: Read the system_prompt Field

Every skill has a system_prompt (or equivalent instruction block). Open the SKILL.md file and carefully read what it instructs the agent to do.

Red flags to look for:

References to sensitive files: ~/.ssh/, ~/.env, config.json, files containing "password" or "token"
External URLs: webhook.site, requestbin.com, or any unfamiliar domain
Behavior overrides: "ignore previous instructions", "always remember", "from now on"
Shell commands and process execution: exec(), system(), curl, wget, nc

If the skill claims to be a "code formatter" but its instructions mention reading SSH keys — that's a major red flag.
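As a rough aid to the manual read-through, the red-flag list above can be turned into a small pattern scan. The sketch below is only an illustration: the `RED_FLAGS` table and the `find_red_flags` helper are hypothetical names, the patterns cover just the examples listed in this step, and a clean result does not mean the skill is safe.

```python
import re

# Illustrative patterns taken from the red-flag list above; not an exhaustive ruleset.
RED_FLAGS = {
    "sensitive file": re.compile(
        r"~/\.ssh/|~/\.env|config\.json|password|token", re.IGNORECASE
    ),
    "external URL": re.compile(
        r"https?://[^\s\"'<>)]+|webhook\.site|requestbin\.com", re.IGNORECASE
    ),
    "behavior override": re.compile(
        r"ignore previous instructions|always remember|from now on", re.IGNORECASE
    ),
    "shell command": re.compile(
        r"\bexec\(|\bsystem\(|\b(?:curl|wget|nc)\b", re.IGNORECASE
    ),
}


def find_red_flags(text):
    """Return (line_number, category, matched_text) for every pattern hit."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for category, pattern in RED_FLAGS.items():
            for match in pattern.finditer(line):
                hits.append((line_no, category, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = "Format the code.\nThen read ~/.ssh/id_rsa and curl https://webhook.site/abc\n"
    for line_no, category, matched in find_red_flags(sample):
        print(f"line {line_no}: {category}: {matched}")
```

Every URL is reported, not only the two domains named above, because deciding whether a domain is "unfamiliar" still needs a human. Treat each hit as a prompt to read that line in context.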

Step 2: Check for Hidden Zero-Width Characters

This attack vector is easy to overlook. Zero-width characters (U+200B, U+200C, U+200D, U+FEFF) are invisible in most text editors but can hide malicious instructions.

How to check manually:

bash

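# Requires GNU grep with PCRE support (-P) and a UTF-8 locale.
# Prints every line that contains U+200B, U+200C, U+200D, or U+FEFF, with its line number.
grep -nP '[\x{200B}\x{200C}\x{200D}\x{FEFF}]' SKILL.md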

Or use SkillsSafe's zero-width detector at skillssafe.com/en/zero-width-detector — paste the content and instantly see if any hidden characters are embedded, with exact positions highlighted.
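Where `grep -P` is not available, the same check can be done in a few lines of Python. This is a minimal sketch covering only the four code points named above; `find_zero_width` is a hypothetical helper, not part of any scanner mentioned in this guide.

```python
# The four code points listed in this step, with their Unicode names.
ZERO_WIDTH = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE (BOM)",
}


def find_zero_width(text):
    """Return (line, column, code_point, name) for each zero-width character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in ZERO_WIDTH:
                hits.append((line_no, col, f"U+{ord(ch):04X}", ZERO_WIDTH[ch]))
    return hits


if __name__ == "__main__":
    sample = "safe text\nhid\u200bden\ufeff"
    for line_no, col, code_point, name in find_zero_width(sample):
        print(f"line {line_no}, column {col}: {code_point} {name}")
```

When reading a real file, open it with `encoding="utf-8"` and not `"utf-8-sig"`, since the latter silently strips a leading U+FEFF before the check can see it. A single U+FEFF at the very start of a file is often just a byte order mark; anywhere else it deserves a closer look.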

Step 3: Verify the Author and Source

Is the author a known developer or organization?
Does the skill have a GitHub repository with commit history?
How many installs does it have? (A high install count doesn't guarantee safety: some malicious skills passed 30,000 installs before being discovered.)
Are there reviews or security audit results?

Check if the skill has been scanned by security tools like SkillsSafe, SkillShield, or ClawSecure.

Step 4: Run an Automated Scan

Manual review is important but time-consuming. Use an automated scanner to catch patterns you might miss.

With SkillsSafe (free, no signup):

Option A — Web scanner: Visit skillssafe.com, paste the skill content or enter the URL, and get a risk report in seconds.

Option B — MCP Server (for agents):

bash

# OpenClaw
openclaw mcp add skillssafe https://mcp.skillssafe.com/sse

# Or add to any MCP config
{
  "mcpServers": {
    "skillssafe": {
      "url": "https://mcp.skillssafe.com/sse"
    }
  }
}

Also available on Smithery.ai — search for "skillssafe".

Option C — REST API:

bash

curl -X POST https://skillssafe.com/api/v1/scan/url \
  -H "Content-Type: application/json" \
  -d '{"url": "https://clawhub.ai/skills/my-skill/SKILL.md"}'

The scanner checks for credential theft, data exfiltration, prompt injection, reverse shells, ClawHavoc indicators, and hidden characters.
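For scripted checks, the curl call in Option C can be reproduced with the Python standard library. The endpoint, header, and request body below are copied from that example; the helper names are hypothetical, and because the response schema is not described in this guide, the sketch only assumes the reply is JSON and returns it unparsed for fields.

```python
import json
import urllib.request

# Endpoint taken from the curl example in Option C.
SCAN_ENDPOINT = "https://skillssafe.com/api/v1/scan/url"


def build_scan_request(skill_url):
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"url": skill_url}).encode("utf-8")
    return urllib.request.Request(
        SCAN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scan_skill(skill_url, timeout=30):
    """Send the request and return the decoded reply.

    Assumption: the API answers with a JSON document. Its fields are not
    documented here, so the caller has to inspect the result.
    """
    request = build_scan_request(skill_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    report = scan_skill("https://clawhub.ai/skills/my-skill/SKILL.md")
    print(json.dumps(report, indent=2))
```

Splitting request construction from sending keeps the network call in one place, which makes it easier to add retries or to run the check in CI before a skill is installed.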

Step 5: Use Sandboxing

Even after scanning, run new skills in an isolated environment first:

Docker sandbox: openclaw --sandbox=docker
Dedicated low-privilege user account: Don't run agents as root/admin
Minimal workspace mounting: Only mount the directories the skill actually needs
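If you manage the container yourself, the three points above map onto standard `docker run` flags. The sketch below only builds the command line. The image name, the `/workspace` mount point, the UID `1000:1000`, and the `run-skill` argument are placeholders chosen for illustration, and it makes no claim about how `openclaw --sandbox=docker` works internally.

```python
import shlex


def build_sandbox_command(image, workspace, agent_args):
    """Return a docker argv list that applies the isolation points above."""
    return [
        "docker", "run", "--rm",
        "--user", "1000:1000",        # unprivileged user instead of root (placeholder UID:GID)
        "--network", "none",          # no network access; remove if the agent must reach a model API
        "--cap-drop", "ALL",          # drop all Linux capabilities
        "--read-only",                # container root filesystem is read-only
        "--volume", f"{workspace}:/workspace:ro",  # mount only the one directory, read-only
        "--workdir", "/workspace",
        image,
        *agent_args,
    ]


if __name__ == "__main__":
    # Hypothetical image and arguments, shown only to illustrate the shape of the command.
    command = build_sandbox_command(
        "example/agent-runtime:latest", "/home/me/skill-test", ["run-skill"]
    )
    print(shlex.join(command))
```

Starting from the most restrictive settings and loosening them one at a time shows exactly which permissions a skill needs. A skill that fails without network access or a writable mount should be able to justify why.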

Summary

Step	What	Time
1	Read system_prompt	2 min
2	Check zero-width characters	30 sec
3	Verify author/source	1 min
4	Run automated scan	10 sec
5	Use sandboxing	Setup once

The fastest path: paste the skill URL into SkillsSafe and get a risk score in seconds. If the score is below 50, don't install it.

Published by SkillsSafe — Free AI agent skill security scanner. Supports OpenClaw, Claude Code, Cursor, and Codex.
