skills/co-redteam-orchestrated-security-discovery/SKILL.md
Multi-agent security vulnerability discovery and exploitation using Co-RedTeam's orchestrated workflow. Decomposes security analysis into coordinated discovery and exploitation stages with execution-grounded iterative reasoning and layered memory. Use when: 'find vulnerabilities in this codebase', 'red team this application', 'security audit this project', 'exploit this vulnerability', 'penetration test this service', 'analyze this code for security flaws'.
npx skillsauth add ndpvt-web/arxiv-claude-skills co-redteam-orchestrated-security-discovery
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill enables Claude to perform structured, multi-stage security vulnerability discovery and exploitation following the Co-RedTeam framework. Instead of single-pass code scanning, it decomposes security analysis into a discovery stage (identify and evidence vulnerabilities) and an exploitation stage (plan, execute, validate, and refine attack sequences iteratively). Each stage uses specialized reasoning roles — analysis, critique, planning, validation, execution, and evaluation — coordinated through strict input/output schemas. The key insight: treat exploitation as a structured search process guided by real execution feedback, not single-shot payload generation.
Authorization context required: This workflow applies only to authorized security testing, CTF challenges, security research, or defensive assessment. Refuse requests lacking clear authorization.
Co-RedTeam mirrors real-world red-teaming by splitting vulnerability analysis into two coordinated stages. Stage I (Discovery) performs code-aware analysis using taint tracing (source-to-sink), trust boundary mapping, configuration auditing, and business logic tracing. Each candidate vulnerability must pass a critique loop that demands rigorous evidence chains: the exact source of untrusted input, the dangerous sink where execution occurs, and why existing protections fail. Only vulnerabilities with line-number-level evidence survive the critique filter.
Stage II (Exploitation) treats each confirmed vulnerability as a structured search problem. A planner drafts a multi-step exploit plan, then enters a closed loop: propose an action, validate it for safety and correctness, execute it in an isolated environment, evaluate the output, and refine the plan based on what actually happened. The critical innovation is proactive plan revision — after each execution result, the planner does not just update the current step's status but reviews all future planned steps and revises any that are invalidated by new evidence. This prevents silent failure cascades where agents keep executing an obsolete plan.
The framework uses a three-layer memory system: (1) vulnerability pattern memory storing confirmed schemas with observable symptoms and confirming tests, (2) strategy memory capturing high-level exploitation workflows that generalize across targets, and (3) technical action memory recording concrete commands and scripts with both successes and failures. Ablation studies show execution feedback accounts for the largest performance impact (41.6%), followed by code browsing (11.6%) and memory (9.1%).
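The three memory layers described above could be modeled roughly as follows. This is a minimal sketch, not the paper's implementation: the class names, field names, and the substring-based `recall_patterns` lookup are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VulnerabilityPattern:
    """Layer 1: a confirmed vulnerability schema (hypothetical fields)."""
    name: str
    symptoms: list[str]
    confirming_tests: list[str]


@dataclass
class Strategy:
    """Layer 2: a high-level workflow meant to generalize across targets."""
    name: str
    workflow: list[str]


@dataclass
class TechnicalAction:
    """Layer 3: a concrete command or script, recorded whether or not it worked."""
    command: str
    succeeded: bool
    note: str = ""


@dataclass
class LayeredMemory:
    patterns: list[VulnerabilityPattern] = field(default_factory=list)
    strategies: list[Strategy] = field(default_factory=list)
    actions: list[TechnicalAction] = field(default_factory=list)

    def recall_patterns(self, observed: str) -> list[VulnerabilityPattern]:
        """Return patterns with at least one symptom appearing in the observed text."""
        text = observed.lower()
        return [p for p in self.patterns
                if any(s.lower() in text for s in p.symptoms)]

    def failed_actions(self) -> list[TechnicalAction]:
        """Failures are kept so that later plans can avoid repeating them."""
        return [a for a in self.actions if not a.succeeded]
```

Keeping failures alongside successes in the third layer mirrors the text's point that technical action memory records both.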
Map the attack surface. Enumerate the full file structure, identify the technology stack (language, framework, database, containerization), locate entry points (routes, API endpoints, CLI handlers, file uploads), and read configuration files (Dockerfile, docker-compose.yml, requirements.txt, package.json). Note debug modes, hardcoded secrets, and vulnerable dependency versions.
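A first pass over the file structure might look like the sketch below. The function name `map_attack_surface`, the returned dictionary keys, and the `ROUTE_MARKERS` heuristics are assumptions for illustration. A plain substring match is only a rough hint at entry points and does not replace reading the code.

```python
from pathlib import Path

# Configuration files named in the step above.
CONFIG_NAMES = {"Dockerfile", "docker-compose.yml", "requirements.txt", "package.json"}
# Hypothetical substrings hinting at route definitions (Flask- and Express-style).
ROUTE_MARKERS = ("@app.route(", "app.get(", "app.post(")


def map_attack_surface(root: str) -> dict:
    """List every file, flag known config files, and hint at likely entry points."""
    base = Path(root)
    surface = {"files": [], "configs": [], "entry_point_hints": []}
    for path in sorted(p for p in base.rglob("*") if p.is_file()):
        rel = path.relative_to(base).as_posix()
        surface["files"].append(rel)
        if path.name in CONFIG_NAMES:
            surface["configs"].append(rel)
        if path.suffix in {".py", ".js"}:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if any(marker in text for marker in ROUTE_MARKERS):
                surface["entry_point_hints"].append(rel)
    return surface
```

The flagged config files are the ones to read next for debug modes, hardcoded secrets, and dependency versions.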
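Apply structured analysis techniques. For each entry point, perform taint tracing (source to sink), trust boundary mapping, configuration auditing, and business logic tracing.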
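Compile evidence chains. For each candidate vulnerability, construct a structured evidence record: the exact source of untrusted input, the dangerous sink where execution occurs, and why existing protections fail, each cited at line-number level.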
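Critique each finding independently. Before accepting any vulnerability, challenge its evidence chain: is the source really untrusted input, does the data actually reach the sink, and do the existing protections really fail? Reject any finding that lacks line-number-level evidence.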
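The evidence record and critique filter can be sketched as a small data structure plus a check that returns objections. The `EvidenceRecord` fields and the `critique` function are hypothetical names chosen for this illustration. A real critique role would reason about the code itself, not just check that fields are filled in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvidenceRecord:
    cwe: str
    source_file: str
    source_line: Optional[int]   # where untrusted input enters
    sink_file: str
    sink_line: Optional[int]     # where the dangerous operation runs
    protection_gaps: str         # why existing protections fail


def critique(record: EvidenceRecord) -> list[str]:
    """Return objections; an empty list means the finding survives the filter."""
    objections = []
    if record.source_line is None:
        objections.append("source lacks a line number")
    if record.sink_line is None:
        objections.append("sink lacks a line number")
    if not record.protection_gaps.strip():
        objections.append("no explanation of why existing protections fail")
    return objections


# A record with complete line-level evidence passes; one missing the sink line does not.
complete = EvidenceRecord("CWE-89", "routes/users.py", 32, "routes/users.py", 47,
                          "No parameterized queries, no input validation")
incomplete = EvidenceRecord("CWE-89", "routes/users.py", 32, "routes/users.py", None, "")
print(critique(complete))
print(critique(incomplete))
```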
Draft a multi-step exploit plan. For each confirmed vulnerability, create an ordered list of concrete steps. Each step has a description, action type (BASH, PYTHON, or VERIFICATION), the specific command or code, and a status (PLANNED, DONE, BLOCKED). Consult memory for similar past exploits and adapt successful strategies.
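One possible shape for a plan step is shown below. The action types and statuses come from the step description above. The class names and the `next_step` helper are assumptions added for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ActionType(Enum):
    BASH = "BASH"
    PYTHON = "PYTHON"
    VERIFICATION = "VERIFICATION"


class Status(Enum):
    PLANNED = "PLANNED"
    DONE = "DONE"
    BLOCKED = "BLOCKED"


@dataclass
class PlanStep:
    description: str
    action_type: ActionType
    command: str
    status: Status = Status.PLANNED


def next_step(plan: list[PlanStep]) -> Optional[PlanStep]:
    """Return the first step still marked PLANNED, or None if nothing is left."""
    for step in plan:
        if step.status is Status.PLANNED:
            return step
    return None


plan = [
    PlanStep("Verify normal execution", ActionType.BASH, "echo test"),
    PlanStep("Confirm the output matches expectations", ActionType.VERIFICATION, "check stdout"),
]
plan[0].status = Status.DONE
print(next_step(plan).description)
```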
Validate before executing. Before running any action, verify that it is safe to run and correct for the current plan step.
Execute in isolation and capture output. Run the validated action in the target environment (Docker container, sandbox, or designated test system). Capture stdout, stderr, exit codes, and any HTTP responses or state changes.
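Capturing stdout, stderr, and the exit code can be done with the standard library's `subprocess.run`. The sketch below is an assumption about how an execution role might wrap it. `run_action` and `ExecutionResult` are hypothetical names, and the isolation itself (container or sandbox) is assumed to be set up by the caller, since this code only runs the argument vector it is given.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class ExecutionResult:
    stdout: str
    stderr: str
    exit_code: int
    timed_out: bool = False


def run_action(argv: list[str], timeout_s: float = 30.0) -> ExecutionResult:
    """Run one validated action and capture everything the evaluator needs."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # This sketch discards partial output on timeout and records only the fact.
        return ExecutionResult(stdout="", stderr="", exit_code=-1, timed_out=True)
    return ExecutionResult(proc.stdout, proc.stderr, proc.returncode)


# Harmless demonstration: a child Python process that prints and exits non-zero.
result = run_action([sys.executable, "-c", "import sys; print('ok'); sys.exit(3)"])
print(result.exit_code, result.stdout.strip())
```

Passing an argument list rather than a shell string keeps the wrapper itself from adding a second layer of shell interpretation.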
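Evaluate execution results as reasoning signals. After each execution, compare the captured output with what the step was expected to produce, and use what actually happened to refine the plan.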
Proactively revise the remaining plan. Do not just mark the current step as DONE or BLOCKED — review every remaining PLANNED step. If execution feedback invalidates future steps, update them immediately. Insert corrective steps for failures. Remove steps that are no longer relevant. This prevents cascading errors from an obsolete plan.
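Proactive revision can be illustrated with a toy model in which each step lists the facts it depends on. This dependency bookkeeping is an assumption made for the sketch, not something the source specifies. In the framework, the planner decides by reasoning which future steps are invalidated.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    description: str
    requires: set[str] = field(default_factory=set)  # hypothetical: facts this step assumes
    status: str = "PLANNED"


def revise_plan(plan: list[Step], index: int, succeeded: bool,
                disproved: set[str],
                corrective: Optional[list[Step]] = None) -> list[Step]:
    """Update the executed step, then review every remaining PLANNED step."""
    plan[index].status = "DONE" if succeeded else "BLOCKED"
    head = plan[: index + 1]
    # Drop future steps whose assumptions the new evidence has invalidated.
    tail = [s for s in plan[index + 1:]
            if not (s.status == "PLANNED" and s.requires & disproved)]
    # Corrective steps go right after the step that just ran.
    return head + (corrective or []) + tail


plan = [
    Step("Probe the endpoint"),
    Step("Read the response body", requires={"endpoint_reachable"}),
    Step("Write the report"),
]
revised = revise_plan(plan, 0, succeeded=False,
                      disproved={"endpoint_reachable"},
                      corrective=[Step("Check the service port and retry")])
print([s.description for s in revised])
```

The point of the sketch is that the failed first step changes the rest of the plan immediately, rather than leaving the dependent step to fail on its own later.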
Iterate until exploitation succeeds or the iteration budget is exhausted. Continue the execute-evaluate-revise loop (default cap: 20 iterations). Terminate when: (a) a working proof-of-concept is produced with validated output, (b) all reasonable exploitation paths are exhausted, or (c) the iteration cap is reached. Document the final exploit chain, PoC payload, and execution trace.
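The three termination conditions can be expressed as a small driver loop. This is a sketch under stated assumptions: `propose`, `execute`, and `evaluate` are hypothetical callables standing in for the planner, executor, and evaluator roles, and the returned status strings are invented labels.

```python
from typing import Any, Callable, Optional

MAX_ITERATIONS = 20  # default cap stated in the step above


def exploitation_loop(
    propose: Callable[[list], Optional[Any]],
    execute: Callable[[Any], Any],
    evaluate: Callable[[Any], bool],
    max_iterations: int = MAX_ITERATIONS,
) -> tuple[str, list]:
    """Run the execute-evaluate-revise loop and return (status, execution trace)."""
    trace: list = []
    for _ in range(max_iterations):
        action = propose(trace)           # the planner sees the full trace so far
        if action is None:
            return "exhausted", trace     # (b) no reasonable paths remain
        output = execute(action)
        trace.append((action, output))
        if evaluate(output):
            return "success", trace       # (a) validated proof-of-concept
    return "cap_reached", trace           # (c) iteration budget spent


# Stub roles for demonstration only.
queue = ["a", "b", "c"]
status, trace = exploitation_loop(
    propose=lambda t: queue[len(t)] if len(t) < len(queue) else None,
    execute=lambda action: action.upper(),
    evaluate=lambda output: output == "B",
)
print(status, trace)
```

Returning the trace in every case supports the final documentation requirement, since the execution record is available whether or not the loop succeeded.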
Example 1: SQL Injection Discovery and Exploitation in a Flask App
User: "Find and exploit vulnerabilities in this Flask web application (authorized pentest)"
Approach:
- `request.args.get('id')` is passed directly into `db.execute(f"SELECT * FROM users WHERE id = {user_id}")` at routes/users.py:47
- `curl "http://target:5000/users?id=1 OR 1=1"`: test for boolean-based SQLi
- `curl "http://target:5000/users?id=1 UNION SELECT username,password FROM users--"`: extract credentials

Output:
## Vulnerability Report
### Finding 1: SQL Injection (CWE-89) — CRITICAL
**Source**: routes/users.py:32 — `user_id = request.args.get('id')`
**Sink**: routes/users.py:47 — `db.execute(f"SELECT * FROM users WHERE id = {user_id}")`
**Protection gaps**: No parameterized queries, no input validation, no WAF
**Proof of Concept**:
curl "http://target:5000/users?id=1 UNION SELECT username,password FROM users--"
**Impact**: Full database read access, credential extraction
**Remediation**: Use parameterized queries: `db.execute("SELECT * FROM users WHERE id = ?", (user_id,))`
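The remediation can be checked locally with the standard library's `sqlite3` module, which uses the same `?` placeholder style as the report. This is a self-contained sketch: the in-memory database, the two-row `users` table, and the function names are assumptions for the demonstration, and `sqlite3` stands in for whatever `db` object the application actually uses.

```python
import sqlite3

# Throwaway in-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (id, username) VALUES (?, ?)",
                 [(1, "alice"), (2, "bob")])


def get_user_unsafe(user_id: str) -> list:
    # Vulnerable pattern from the finding: input is spliced into the SQL text.
    return conn.execute(f"SELECT * FROM users WHERE id = {user_id}").fetchall()


def get_user_safe(user_id: str) -> list:
    # Remediated pattern: input is bound as a value and never parsed as SQL.
    return conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchall()


payload = "1 OR 1=1"
unsafe_rows = get_user_unsafe(payload)  # the OR clause is parsed as SQL, so every row matches
safe_rows = get_user_safe(payload)      # the whole string is compared as one value, so none match
print(len(unsafe_rows), len(safe_rows))
```

A legitimate lookup such as `get_user_safe("1")` still returns the expected row, so the fix does not change normal behavior.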
Example 2: Multi-Step Exploitation of Command Injection in a CI/CD Pipeline
User: "This CI/CD service lets users submit build configs. Check if it's exploitable (CTF challenge)"
Approach:
- `build_command` flows to `subprocess.call(config['build_command'], shell=True)` at services/builder.py:67, with no sanitization
- `build_command: "echo test"`: verify normal execution
- `build_command: "id && cat /etc/passwd"`: test command chaining
- `build_command: "cat /flag.txt"`: retrieve CTF flag

Example 3: Discovery-Only Security Audit (No Exploitation)
User: "Audit this Node.js Express API for security issues, don't exploit anything"
Approach:
Output:
## Security Audit Report — 4 Findings
1. **Stored XSS (CWE-79) — HIGH**: routes/comments.js:34 stores req.body.text
without sanitization, rendered via res.send() at routes/posts.js:89 without escaping.
2. **Missing Authentication (CWE-306) — HIGH**: routes/admin.js:12-45 defines
/admin/users and /admin/config endpoints with no auth middleware.
3. **Prototype Pollution (CWE-1321) — MEDIUM**: lib/merge.js:7 uses recursive
object merge without __proto__ filtering on user-supplied JSON.
4. **Outdated Dependency (CWE-1395) — MEDIUM**: package.json pins lodash at a
version vulnerable to CVE-2020-8203 (prototype pollution).
Do:
Avoid:
Paper: Co-RedTeam: Orchestrated Security Discovery and Exploitation with LLM Agents (He et al., 2026). Key sections: Section 3 for the full agent architecture and interaction protocol, Section 4 for the three-layer memory system design, and Tables 1-5 for benchmark results showing execution feedback as the single most impactful component (41.6% performance impact in ablation).