.agents/skills/ai-security/SKILL.md
AI security and governance checklists: OWASP LLM Top 10, OWASP Agentic AI Top 10, MITRE ATLAS tactics, NIST AI RMF, least-privilege tool access, EU AI Act risk tiers. Load when reviewing agent architecture for injection risk, tool access policy, trust boundaries, supply chain, or compliance.
Load this skill when the task concerns AI-specific security review, tool access policy, trust boundary design, supply chain risk, prompt injection surface, or regulatory compliance.
.agent.md or mcp.json for least-privilege

| ID | Risk | In this pipeline |
|---|---|---|
| LLM01 | Prompt Injection: malicious content in tool output or user input overrides system instructions | Any agent that reads external content (files, issues, URLs) is exposed. The critic must flag any case where the executor reads unvalidated external input. |
| LLM03 | Supply Chain: compromised model, dependency, or third-party agent | Verify that the model source stays GitHub Copilot managed; every new MCP server is a new supply chain surface. |
| LLM09 | Misinformation: agent confidently produces wrong output | Critic isolation and golden tests (A1.5) are the primary mitigations. |
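The LLM01 row asks the critic to flag any executor read of unvalidated external input. A minimal sketch of that check follows, assuming each read is tagged with `external` and `validated` flags. The `InputSource` type and the `flag_unvalidated_inputs` function are hypothetical names for illustration, not part of the pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputSource:
    """One piece of content the executor read (hypothetical record)."""
    name: str
    external: bool   # True if the content originates outside the pipeline
    validated: bool  # True if a validation step has inspected the content


def flag_unvalidated_inputs(sources: list[InputSource]) -> list[str]:
    """Return the names of external inputs that were read without validation."""
    return [s.name for s in sources if s.external and not s.validated]


if __name__ == "__main__":
    reads = [
        InputSource("issue-body", external=True, validated=False),
        InputSource("fetched-url", external=True, validated=True),
        InputSource("session-trace", external=False, validated=False),
    ]
    print(flag_unvalidated_inputs(reads))  # ['issue-body']
```

How an input gets marked as validated is left open here; the sketch only shows the flagging rule itself.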
Quick triage questions for any new agent:
These complement LLM Top 10 — LLM Top 10 covers model vulnerabilities; Agentic Top 10 covers agent system risks.
| Risk | Description | Mitigation in this pipeline |
|---|---|---|
| Excessive Agency | Agent has more tools/permissions than its task requires | Least-privilege tool list in .agent.md; see checklist below |
| Trust Boundary Violation | Agent trusts output from another agent without validation | Critic receives only final result, not raw executor output; orchestrator validates before routing |
| Human-in-the-Loop Bypass | Irreversible action taken without human confirmation | NEEDS_HUMAN escalation path; no destructive tools in agent tool lists |
| Insecure Inter-Agent Trust | Agents authenticate each other implicitly | VS Code Copilot agents don't share sessions; orchestrator is the only caller |
| Cascading Hallucination | Hallucination in one agent propagates through the pipeline | Critic reviews each result before the next subtask starts |
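The Human-in-the-Loop Bypass mitigation can be sketched as a small gate that escalates irreversible actions unless a human has confirmed them. This is an illustration under assumptions: `route_action`, the `PROCEED` status, and the `deleteFile` tool name are hypothetical; only `NEEDS_HUMAN` and `runTerminal` come from this skill.

```python
# Tools treated as irreversible. `runTerminal` appears in this skill's checklist;
# `deleteFile` is a hypothetical name used only for illustration.
IRREVERSIBLE_TOOLS = frozenset({"runTerminal", "deleteFile"})


def route_action(tool: str, human_confirmed: bool = False) -> str:
    """Return "NEEDS_HUMAN" for unconfirmed irreversible actions, else "PROCEED"."""
    if tool in IRREVERSIBLE_TOOLS and not human_confirmed:
        return "NEEDS_HUMAN"
    return "PROCEED"


if __name__ == "__main__":
    print(route_action("readFile"))                           # PROCEED
    print(route_action("runTerminal"))                        # NEEDS_HUMAN
    print(route_action("runTerminal", human_confirmed=True))  # PROCEED
```

Defaulting `human_confirmed` to `False` keeps the gate fail-closed: a caller that forgets to pass the flag gets an escalation, not an irreversible action.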
For every agent in .github/agents/*.agent.md:
- createFiles
- editFiles
- runTerminal
- webSearch
- mcp.json: is it used by this agent? If not, it must not be declared in the agent's scope.

Roles → expected tool sets:
| Agent | Expected tools |
|---|---|
| orchestrator | agent, readFile, createFiles, editFiles (sessions/traces only), fileSearch, textSearch |
| spec-editor | readFile, editFiles, createFiles, fileSearch, textSearch |
| docs-critic | readFile, fileSearch, textSearch (read-only) |
| process-critic | readFile, fileSearch, textSearch (read-only) |
| agent-architect (read-only in this repo) | readFile, fileSearch, textSearch, fetch, webSearch, githubRepo |
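The expected tool sets above lend themselves to an automated least-privilege check. A minimal sketch follows, assuming the declared tool list for an agent has already been extracted from its .agent.md file. The extraction step is not shown because the file layout is not specified here, and `audit_tools` is a hypothetical helper name.

```python
# Expected tool sets, copied from the roles table above.
READ_ONLY = {"readFile", "fileSearch", "textSearch"}
EXPECTED_TOOLS = {
    "orchestrator": READ_ONLY | {"agent", "createFiles", "editFiles"},
    "spec-editor": READ_ONLY | {"editFiles", "createFiles"},
    "docs-critic": set(READ_ONLY),
    "process-critic": set(READ_ONLY),
    "agent-architect": READ_ONLY | {"fetch", "webSearch", "githubRepo"},
}


def audit_tools(agent: str, declared: set[str]) -> dict[str, list[str]]:
    """Compare an agent's declared tools against its expected set.

    "excess" lists tools the agent declares but should not have
    (the Excessive Agency case); "missing" lists expected tools
    that are not declared.
    """
    expected = EXPECTED_TOOLS[agent]
    return {
        "excess": sorted(declared - expected),
        "missing": sorted(expected - declared),
    }


if __name__ == "__main__":
    # A read-only critic that was accidentally granted write access.
    print(audit_tools("docs-critic", {"readFile", "textSearch", "editFiles"}))
    # {'excess': ['editFiles'], 'missing': ['fileSearch']}
```

Any non-empty `excess` list is the finding that matters for a security review; `missing` is informational and usually points to a configuration gap, not a risk.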
| Technique | Description | Pipeline relevance |
|---|---|---|
| AML.T0051: LLM Prompt Injection | Adversarial input in data manipulates model output | Any agent that reads file content written by parties outside the pipeline |
| AML.T0024.002: Extract ML Model | Repeated queries to reconstruct model weights or behavior | Low risk for VS Code Copilot agents; relevant if agents are exposed via API |
| AML.T0043: Craft Adversarial Data | Poisoning training or fine-tune data | Relevant if golden tests (.agents/evals/) could be manipulated |
| AML.T0018: Backdoor ML Model | Trojan in model responds to trigger input | Risk at the supply chain layer (overlaps the LLM Top 10 Supply Chain entry); mitigated by GitHub Copilot managed infrastructure |
| Function | Key questions for this pipeline |
|---|---|
| GOVERN | Are roles defined? (AGENTS.md) Are ADRs recorded? Is AGENTS_CHANGELOG maintained? |
| MAP | Are risks documented per agent? Is the trust boundary diagram current? |
| MEASURE | Are golden tests passing? Is the critic rubric calibrated? |
| MANAGE | Is NEEDS_HUMAN escalation path functional? Are irreversible actions gated? |
| Standard | When relevant | Action |
|---|---|---|
| Constitutional AI | Designing critique rubrics | Use BLOCKER/WARNING/SUGGESTION severity (→ see agent-patterns SKILL) |
| ISO/IEC 42001 | Enterprise/regulated deployment | AI Management System baseline; relevant for audit readiness |
| EU AI Act | EU-market deployment | Classify pipeline risk tier: Limited (chatbot) or High (decisions about persons) |
| Model Cards | Deploying or fine-tuning a model | Document capabilities, limitations, intended use |
| Spec-Driven Development (SDD) | Adding a new agent feature | Phase 0: spec approved before Phase 1: implementation |