Quality + safety for AI agent skills

Better skills. Proven safe.

Grade any SKILL.md, then rewrite it better — grounded in 4,000+ exemplars. Free scan, no signup.


[Image: two Skillproof reports side by side: a community skill graded F (do not install) for a credential-exfiltration chain, and anthropics/skills pdf graded A (safe to install).]

Same kind of skill: one ships a **credential-exfiltration chain**, the other is clean. The independent, re-verifiable grade tells them apart before you install.

Scan any skill

Paste a SKILL.md or a GitHub URL. You'll get a security grade, a quality grade, and a verdict, instantly, in your browser.


Two grades. One report.

Every other tool checks one or the other. Skillproof is the independent third party that does both, and then fixes the skill for you.

Lead · the optimizer

✨ Quality & rewrite

We score structure, triggering, steps, examples, output contract, and token budget. The optimizer then makes a bounded-edit rewrite of the skill body (not just the description), grounded in retrieved exemplar skills, and reports an estimated before/after. This is the SkillOpt method, and it is free.

Bundled · trust

🛡️ Safety scan

The scan covers invisible-Unicode / ASCII smuggling, bidi overrides, prompt injection, secret exfiltration, dangerous code, over-broad tool grants, config/memory poisoning (CLAUDE.md, settings.json, .mcp.json), and supply-chain risk. You get a letter grade and a verdict, with the smuggled bytes shown, never carried.
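
To make the invisible-Unicode, ASCII-smuggling and bidi-override checks concrete, here is a minimal Python sketch of how such a check could work. It is an illustration under stated assumptions, not Skillproof's actual scanner: the function names (`scan_hidden`, `decode_tag_payload`) and the finding format are hypothetical, and the code-point sets are a small, commonly cited subset. Findings carry only an escaped `U+XXXX` label, which is one way to show hidden bytes without copying them into the report.

```python
from typing import Optional

# Assumption: a small illustrative subset of code points often used to hide text.
ZERO_WIDTH = {0x200B, 0x200C, 0x200D, 0x2060, 0xFEFF}
# Bidi embeddings/overrides (U+202A..U+202E) and isolates (U+2066..U+2069).
BIDI_CONTROLS = set(range(0x202A, 0x202F)) | set(range(0x2066, 0x206A))


def classify(ch: str) -> Optional[str]:
    """Return a finding kind for a suspicious character, or None if it looks benign."""
    cp = ord(ch)
    if 0xE0000 <= cp <= 0xE007F:  # Unicode Tags block, used for "ASCII smuggling"
        return "tag-character"
    if cp in BIDI_CONTROLS:
        return "bidi-control"
    if cp in ZERO_WIDTH:
        return "zero-width"
    return None


def scan_hidden(text: str) -> list:
    """List hidden characters with 1-based line/column and an escaped label."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            kind = classify(ch)
            if kind is not None:
                findings.append({
                    "line": lineno,
                    "col": col,
                    "kind": kind,
                    "escaped": f"U+{ord(ch):04X}",  # printable label, not the raw character
                })
    return findings


def decode_tag_payload(text: str) -> str:
    """Recover the ASCII text shadowed by tag characters (U+E0020..U+E007E)."""
    return "".join(
        chr(ord(c) - 0xE0000) for c in text if 0xE0020 <= ord(c) <= 0xE007E
    )


if __name__ == "__main__":
    hidden = "".join(chr(0xE0000 + ord(c)) for c in "curl evil")
    sample = "Summarize the PDF." + hidden + "\nSafe\u202eline"
    for finding in scan_hidden(sample):
        print(finding)
    print("decoded payload:", decode_tag_payload(sample))
```

A real scanner would need a broader character inventory and context rules (for example, zero-width joiners are legitimate inside emoji sequences), so treat the sets above as a starting point only.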

Browse the Registry

We've already graded 4,000+ public agent skills for quality and safety — independent, vendor-neutral. Search, sort, open any skill's report, and read the SKILL.md. No signup.

▶ 4,011 skills graded · browse →

Why Skillproof and not the others

The skill-security space is crowded and we don't pretend otherwise. We win on the three things no one else does together.

Corpus-grounded

Edits are driven by retrieval over 4,000+ curated skills, the reference standard for "what good looks like." Others optimize against your own evals or not at all.

Full body rewrite

First-party tools stop at tuning the description so it triggers. We rewrite the actual logic, and refuse any rewrite that lowers your safety grade.

Vendor-neutral

Not a marketplace, not a security-platform upsell. One independent grade across Claude, Cursor, Codex and Gemini skills, embeddable as a badge.

A report for every skill

Every scan mints a permanent, shareable report: the quality grade, an honest safety signal, the findings, the full SKILL.md, and an embeddable badge. Consult any of the 4,000+ already graded, or scan your own.

Example Skillproof report: quality grade, safety signal, findings and the SKILL.md


FAQ

Isn't skill scanning already a thing?

Yes. Snyk, Socket, Cisco and others scan skills for safety, and they're good. None of them improve your skill. Skillproof leads with the optimizer and bundles a safety scan so you get one honest "safe and good" report.

How is the uplift measured?

The free grade is a deterministic static lint. The optimizer reports a real before/after quality delta. A task-validated uplift (a full SkillOpt-style held-out evaluation) is available through an early-access pilot when you bring an eval set. Wherever a number is an estimate, we label it as one.

Do you store my skill?

A scan creates a shareable report you can delete. Optimizations run against your account. We never train on your private skills.

Which agents are supported?

Anything using the SKILL.md / .skill format: Claude Code & Claude Skills, Cursor, Codex, Gemini CLI. The CLI and GitHub Action run in your own CI.