skills/agent-hardening/SKILL.md
Use when reviewing agent security, prompt injection defenses, input sanitization, LLM security hardening, or SSRF protection. Keywords: prompt injection, agent hardening, input sanitization, LLM security, SSRF, defense in depth
npx skillsauth add avifenesh/cairn agent-hardeningInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Reference-only skill documenting security hardening patterns for LLM-powered agent systems.
LLM-powered agents face a unique threat surface that traditional web applications do not:
| ID | Threat | Agent Relevance | |----|--------|-----------------| | LLM01 | Prompt Injection (direct + indirect) | Untrusted data reaches prompt via feed, email, webhooks, agent messages | | LLM02 | Insecure Output Handling | Tool results re-injected into LLM context without validation | | LLM05 | Insecure Plugin Design | Tool arguments not schema-validated; uncapped tool loop iterations | | LLM07 | SSRF | Web-fetch/URL tools used to probe internal network or cloud metadata |
| Layer | Strategy | What It Guards | |-------|----------|----------------| | 1 | Input sanitization at trust boundaries | Strip XML/HTML tags, escape special chars from external content to prevent prompt injection | | 2 | Privilege separation (dual-LLM quarantine) | Process untrusted content with stripped-down LLM (no tools, no memories), pass only capped factual summaries | | 3 | Output validation | Validate tool results against schema before re-injecting into LLM context | | 4 | Schema validation (Zod/JSON Schema) | Reject malformed tool arguments, enforce field types and size limits | | 5 | Rate limiting and loop guards | Cap tool calls per session, detect repetitive tool invocations, prevent runaway execution | | 6 | SSRF protection | Block private IPs, loopback, cloud metadata, DNS rebinding; redirect blocking | | 7 | Environment isolation | Subprocess gets only safe env vars, workspace path restricted, timeout enforced |
A quarantine LLM processes untrusted content with NO tools and NO memory context. Its system prompt constrains the model to produce only factual summaries and reject instruction-like content.
All external content is tag-stripped before entering prompt context:
<...> patterns -- used for feed titles, memory content, LLM input/outputThese are abstract categories -- no working payloads are included:
See reference.md for Cairn's defense inventory, gap analysis, and remediation recommendations.
data-ai
Detect agent-cairn PRs that have stalled (no activity >=90 min) and classify the failure mode to route to appropriate recovery agent.
tools
Post-install skill adaptation: read a newly installed SKILL.md, fix environment-specific references (paths, accounts, tool names), assign the skill to relevant agent types, and propose an AGENTS.md update. Triggered automatically after cairn.installSkill completes.
data-ai
Monthly self-improvement brief for Cairn. Queries error_patterns, action_exemplars, experiment_windows, and session_journal to synthesize what Cairn learned, where it failed, and 3 concrete proposals for Avi to approve. Run on the 1st of each month. Keywords: growth brief, monthly review, self-improvement, what did cairn learn, how is cairn doing, monthly report
testing
Decision support with memory-backed context. Retrieves past decisions, journal history, and relevant facts before answering questions that involve a choice or tradeoff. Keywords: should I, which is better, tradeoff, compare, decide, choose, option, alternative, pros and cons, recommend