skills/llm-security/SKILL.md
# OWASP LLM Security Skill > Threat analysis for LLM-integrated systems against OWASP Top 10 for LLM Applications (v2.0, 2025). Covers prompt injection, output handling, agency boundaries, and supply chain risks specific to LLM-powered features. ## Trigger Invoke when: a task involves LLM calls (`llm_call_fn`), prompt construction, model output parsing, RAG/embedding pipelines, agent orchestration, or any feature where an LLM produces content that drives downstream behavior. ## Input Contrac
npx skillsauth add bigeasyfreeman/adlc skills/llm-securityInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Threat analysis for LLM-integrated systems against OWASP Top 10 for LLM Applications (v2.0, 2025). Covers prompt injection, output handling, agency boundaries, and supply chain risks specific to LLM-powered features.
Invoke when: a task involves LLM calls (llm_call_fn), prompt construction, model output parsing, RAG/embedding pipelines, agent orchestration, or any feature where an LLM produces content that drives downstream behavior.
{
"task_spec": "TaskSpec from Build Brief",
"repo_map": "Cached codebase research output",
"llm_touchpoints": ["list of functions/modules that call LLMs or process LLM output"],
"agent_topology": "description of agent interactions if multi-agent"
}
Project-specific example: Issue tracker bodies are the primary indirect injection vector. Issue content flows into decomposition prompts, codegen contexts, and executor instructions. Intake classifiers, scope analyzers, and decomposition gates all parse user-controlled text through LLMs. Each must treat issue content as potentially adversarial.
Project-specific example: Codegen context assemblers that inline source code into prompts must ensure no secrets from .env or config files are included. Components that log LLM interactions must apply redaction before writing. Learning systems that store outcomes must scrub sensitive content from stored patterns.
Project-specific example: LLM client modules must pin model identifiers. Executor binaries must be verified. Any tool/skill plugins must be from trusted sources.
Project-specific example: A learning system that ingests outcomes from automated runs is vulnerable. If an attacker can influence issue content that the system processes, they can poison the learning loop. Outcome validation must check for anomalous patterns before updating policy. Any confidence updater must have bounds on how much a single outcome can shift behavior.
Project-specific example: Every module using JSON extraction from LLM responses must handle malformed output gracefully. Decomposition gates that produce slice plans from LLMs must validate them against the actual codebase before creating child issues. PR engines must sanitize LLM-generated descriptions. Executor code output must pass all quality gates before merge.
Project-specific example: This is the core risk for autonomous development systems. An agent that autonomously creates branches, writes code, opens PRs, and merges needs strong controls: confidence-threshold merge gates (e.g., auto-merge >= 0.85, human gate < 0.60), human-gate patterns for sensitive files, max failure retries, and multi-perspective checks (e.g., Eval Council). Any new capability must explicitly define its agency boundary.
Project-specific example: Codegen context assemblers build prompts that include repo structure, file contents, and integration wiring. These prompts should not include deployment credentials, infrastructure details, or private API endpoints. Prompts ARE the spec -- they should be auditable.
Project-specific example: If your project does not use RAG or vector stores, mark N/A. If AutoContext or similar features are added, this becomes relevant. Flag for future review.
Project-specific example: An LLM call pattern with deterministic fallback is the primary mitigation. Every intelligence module should fall back to static heuristics when the LLM is unavailable or returns low-confidence results. Multi-perspective validation (e.g., Eval Council) adds a second layer. Learning systems must not amplify hallucinated patterns.
Project-specific example: Configure timeouts for each LLM-calling component (e.g., decomposer.timeout_seconds: 300). Each executor invocation needs a timeout. CI fix loops need max iterations (e.g., max_iterations_per_pr: 5). Backend services need max retries (e.g., max_failure_retries_per_item: 3). These are the consumption boundaries -- any new LLM call path must define its own.
{
"task_id": "string",
"llm_threat_assessment": {
"LLM01_prompt_injection": { "risk": "HIGH|MEDIUM|LOW|N/A", "vectors": [], "mitigations": [] },
"LLM02_disclosure": { "...": "..." },
"LLM03_supply_chain": { "...": "..." },
"LLM04_poisoning": { "...": "..." },
"LLM05_output_handling": { "...": "..." },
"LLM06_excessive_agency": { "...": "..." },
"LLM07_prompt_leakage": { "...": "..." },
"LLM08_vector_weaknesses": { "...": "..." },
"LLM09_misinformation": { "...": "..." },
"LLM10_unbounded_consumption": { "...": "..." }
},
"overall_risk": "HIGH|MEDIUM|LOW",
"blocking_findings": [],
"advisory_findings": []
}
llm_call_fn call site must have a documented fallback pathcontract_version in all skill payloads and enforce compatibility before processing.docs/schemas/security-assessment.schema.json and fail on contract drift.docs/specs/token-budgets.md with pre-turn checks.budget_exhausted, missing_llm_touchpoints, contract_mismatch).development
Orchestration skill: chains the full ADLC Build Loop. PRD → Brief → Council → Scaffold → Codegen → LDD → TDD → Council → PR. Use when implementing a new feature end-to-end.
development
# Skill: Helm & ArgoCD Deployment > Validates Helm charts and generates ArgoCD Application manifests when the ADLC pipeline produces infrastructure or service code. Ensures every deployable artifact has correct chart structure, environment-specific values, and a GitOps-ready Application manifest before code review. --- ## Why This Exists Without deployment validation in the pipeline, common failures slip through to production: - **Helm charts fail `helm template`** because of missing values,
testing
Decide whether an intersecting verifier actually exercises the semantic change.
development
# Skill: UX Flow Builder > Generates user flow diagrams (Mermaid) from PRD personas and screen specifications. Surfaces dead ends, missing screens, and disconnected flows before design or engineering starts. Helps PMs think in screens, not features. --- ## Trigger - Automatically during PRD Phase 4 (Personas & Flows) to visualize the user journey - On-demand when the PM says "show me the flow" or "map the user journey" - During PRD evaluation to verify screen connectivity --- ## Input ```