knowledge/deep-research-generic/SKILL.md
File-backed deep research with recursive link-following, multi-tool fetching (Jina Reader, Tavily, Playwright), and step-by-step synthesis. Use when the user asks to research, investigate, analyze, or summarize a topic in depth; when a thorough answer requires gathering and cross-referencing multiple sources across linked pages; or when the output must be comprehensive, cited, and not limited by context window size.
npx skillsauth add aeondave/malskill deep-research-generic
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
File-backed, multi-pass research workflow. Each useful page is fetched, cleaned, and saved to an intermediate file. Linked pages are recursively followed. All intermediate files are synthesized step-by-step into a single comprehensive research document.
Core principle: Use the file system as extended memory. Never rely on context alone — save everything worth keeping, then synthesize from files.
Before any search:
.research/{topic-slug}/
├── _plan.md # Sub-questions, priorities, URL queue, gap tracker
├── pages/ # One .md file per fetched page
└── output.md # Final synthesized research document
_plan.md with sub-questions and an empty URL queue section
Ask at most two clarifying questions. If the request is clear, proceed immediately.
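The workspace layout above could be set up with a small helper. This is a minimal sketch, not the skill's own implementation: the function names `slugify` and `init_workspace`, the slug rules, and the section headings written into `_plan.md` are all assumptions chosen for illustration.

```python
import re
from pathlib import Path


def slugify(topic: str) -> str:
    """Lowercase the topic and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")


def init_workspace(root: Path, topic: str, sub_questions: list[str]) -> Path:
    """Create .research/{topic-slug}/ with pages/ and a starter _plan.md."""
    workspace = root / ".research" / slugify(topic)
    (workspace / "pages").mkdir(parents=True, exist_ok=True)

    plan = workspace / "_plan.md"
    if not plan.exists():  # never overwrite a plan from an earlier session
        lines = ["# Research Plan", "", "## Sub-questions"]
        lines += [f"- {question}" for question in sub_questions]
        # Hypothetical headings: an empty URL queue and a gap tracker.
        lines += ["", "## URL Queue", "", "## Gaps", ""]
        plan.write_text("\n".join(lines), encoding="utf-8")
    return workspace
```

Calling `init_workspace(Path("."), "Solid-State Batteries: 2024 Outlook", [...])` would create `.research/solid-state-batteries-2024-outlook/`. Re-running it leaves an existing `_plan.md` untouched, which keeps the file system usable as memory across sessions.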
For each sub-question, run parallel searches to discover URLs:
search_depth: basic, max_results: 5–10
fetch_webpage on https://s.jina.ai/{search-query}
From results:
_plan.md under the URL queue
Process each queued URL individually:
3a. Fetch using the tool hierarchy (stop at first success):
| Priority | Tool | When |
|---|---|---|
| 1 | Jina Reader | Default for all pages — cleanest markdown |
| 2 | fetch_webpage (direct) | APIs, raw JSON/text, simple pages |
| 3 | Tavily extract | Available + structured data needed |
| 4 | Playwright | JS-rendered pages, dynamic tables, SPAs |
Jina Reader: fetch_webpage on https://r.jina.ai/{full-url-with-scheme}
Converts any page to clean markdown. No auth, no MCP dependency. Handles complex layouts, strips ads/nav.
Escalation: Jina empty → direct fetch → Tavily extract → Playwright.
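The escalation order can be sketched as a loop over fetchers that stops at the first usable result. The fetchers here are plain callables supplied by the caller, so nothing in this sketch calls a real Jina, Tavily, or Playwright API. The `min_chars` threshold is an assumption standing in for "empty"; the skill does not define one.

```python
from typing import Callable, Optional

# A fetcher takes a URL and returns page text, or None/"" when it finds nothing.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_escalation(
    url: str,
    fetchers: list[tuple[str, Fetcher]],
    min_chars: int = 200,  # assumed cutoff for "empty or incomplete" content
) -> Optional[tuple[str, str]]:
    """Try each (name, fetcher) in priority order; return the first success."""
    for name, fetch in fetchers:
        try:
            content = fetch(url)
        except Exception:
            continue  # a failing tool just means: escalate to the next one
        if content and len(content.strip()) >= min_chars:
            return name, content
    return None  # every tool failed; the caller can mark the URL as skipped
```

Passing the fetchers in priority order (Jina Reader, direct fetch, Tavily extract, Playwright) reproduces the table. The returned tool name can be recorded alongside the saved page.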
3b. Evaluate: Is the content relevant and citable? If not, mark URL as skipped in _plan.md and move on.
3c. Save to intermediate file pages/{NNN}_{slug}.md:
# {Page Title}
- **Source**: {URL}
- **Fetched**: {date}
- **Serves**: {sub-question name}
- **Relevance**: high/medium/low
## Content
{Cleaned content: facts, data, quotes, code, citations.
Remove navigation, ads, boilerplate. Summarize verbose sections
but preserve all critical detail and data points.}
## Outbound Links
- {URL1} — {why it might be useful}
- {URL2} — {why it might be useful}
3d. Extract links: Identify all outbound links that could deepen the research. Add relevant new URLs to the queue in _plan.md.
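Steps 3c and 3d can be illustrated with two small helpers: one builds the `pages/{NNN}_{slug}.md` filename, the other pulls absolute outbound links out of fetched markdown. This is a sketch under assumptions: the three-digit zero padding is one reading of `{NNN}`, the regex only handles inline `[text](url)` links, and both function names are hypothetical.

```python
import re

# Inline markdown links with an absolute http(s) target: [text](https://...)
LINK_RE = re.compile(r"\[([^\]]*)\]\((https?://[^\s)]+)\)")


def page_filename(index: int, slug: str) -> str:
    """Build the intermediate filename, e.g. 007_some-page.md."""
    return f"{index:03d}_{slug}.md"


def extract_links(markdown: str) -> list[tuple[str, str]]:
    """Return (url, link text) pairs, de-duplicated by URL, in first-seen order."""
    seen: dict[str, str] = {}
    for text, url in LINK_RE.findall(markdown):
        seen.setdefault(url, text)  # keep the first link text for each URL
    return list(seen.items())
```

Relative links and reference-style links are ignored here. Deciding which of the extracted URLs are relevant enough to queue remains a judgment call for the researcher.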
Repeat Step 3 for newly queued links. Stop when:
Update _plan.md queue: mark each URL as fetched, skipped, or queued.
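The three queue states can be modeled with a small in-memory structure that is then written back into `_plan.md`. This is an illustrative sketch: the `UrlQueue` class and the `- [status] url` line format are assumptions, since the skill does not prescribe how the queue is written.

```python
from dataclasses import dataclass, field

VALID_STATUSES = {"queued", "fetched", "skipped"}


@dataclass
class UrlQueue:
    """Tracks each URL's status; dict order preserves discovery order."""

    status: dict[str, str] = field(default_factory=dict)

    def enqueue(self, url: str) -> bool:
        """Add a new URL as queued; return False if it was already known."""
        if url in self.status:
            return False
        self.status[url] = "queued"
        return True

    def mark(self, url: str, new_status: str) -> None:
        """Record the outcome of processing a URL."""
        if new_status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {new_status}")
        if url not in self.status:
            raise KeyError(url)
        self.status[url] = new_status

    def pending(self) -> list[str]:
        """URLs still waiting for a fetch round."""
        return [url for url, state in self.status.items() if state == "queued"]

    def to_markdown(self) -> str:
        """Render the queue as list lines (hypothetical format)."""
        return "\n".join(f"- [{state}] {url}" for url, state in self.status.items())
```

Because `enqueue` rejects known URLs, recursive link-following cannot loop back onto pages that were already fetched or skipped.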
After all fetch rounds:
_plan.md with gap analysis
Build output.md incrementally from intermediate files:
output.md with inline citations [N]
c. Move to the next dimension
Key: Each section reads only its relevant files. The research depth is limited only by the data found, not by context window size.
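Reading only the relevant files for a section can be sketched by filtering page files on their `Serves` metadata line. This assumes the pages follow the intermediate file template exactly; the function name `pages_for` and the exact-match comparison are illustrative choices, not part of the skill.

```python
import re
from pathlib import Path

# Matches the metadata line from the page template: - **Serves**: {sub-question name}
SERVES_RE = re.compile(r"^- \*\*Serves\*\*: (.+)$", re.MULTILINE)


def pages_for(pages_dir: Path, sub_question: str) -> list[Path]:
    """Return page files whose Serves field names the given sub-question."""
    matches = []
    for path in sorted(pages_dir.glob("*.md")):  # NNN_ prefix gives a stable order
        found = SERVES_RE.search(path.read_text(encoding="utf-8"))
        if found and found.group(1).strip() == sub_question:
            matches.append(path)
    return matches
```

Each section of `output.md` would then load only the files returned for its sub-question, so the working set stays small however many pages were fetched overall.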
Structure of output.md:
## Executive Summary
[2–3 sentences. Key conclusions + overall confidence.]
## Key Findings
- **{Finding}**: {1 sentence} — Confidence: High/Medium/Low [N]
## Detailed Analysis
### {Dimension 1}
{Analysis with inline citations [1][2].}
### {Dimension 2}
...
## Consensus
[What sources agree on.]
## Conflicts and Uncertainty
[Where sources disagree or data is missing.]
## Sources
[1] Author/Org, "Title", URL — date — Tier N
[2] ...
## Gaps and Follow-up Questions
[What this research does NOT answer.]
Present output.md to the user. Intermediate files remain available for follow-up.
Read a page: fetch_webpage → https://r.jina.ai/{target-url-with-scheme}
Search the web: fetch_webpage → https://s.jina.ai/{search-query}
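The two URL patterns can be built with a pair of helpers. This is a sketch: the function names are hypothetical, and percent-encoding the search query is an assumption about how the `{search-query}` placeholder should be filled rather than something the skill states.

```python
from urllib.parse import quote


def jina_reader_url(target_url: str) -> str:
    """Prefix a full URL (scheme included) with the Jina Reader endpoint."""
    if not target_url.startswith(("http://", "https://")):
        raise ValueError("target URL must include its scheme")
    return "https://r.jina.ai/" + target_url


def jina_search_url(query: str) -> str:
    """Build a Jina search URL; the query is percent-encoded (assumed)."""
    return "https://s.jina.ai/" + quote(query.strip(), safe="")
```

For example, `jina_reader_url("https://example.com/a?b=1")` yields `https://r.jina.ai/https://example.com/a?b=1`, which is then passed to `fetch_webpage`.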
| Tool | Use |
|---|---|
| tavily_search | Keyword search — primary discovery |
| tavily_extract | Content extraction from known URLs |
| tavily_crawl | Multi-page site crawl (expensive — use last) |
| tavily_map | Enumerate URLs before crawling |
Query rules: max 400 chars, one topic per query, parallel sub-questions, include_domains instead of site:, filter by score > 0.5.
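Two of the query rules lend themselves to a mechanical check: the 400-character limit and the score filter. The sketch below assumes search results arrive as dictionaries with `url` and `score` keys; that shape is an illustration, not a documented response format, and both function names are hypothetical.

```python
MAX_QUERY_CHARS = 400
MIN_SCORE = 0.5


def check_query(query: str) -> str:
    """Normalize whitespace and enforce the 400-character limit."""
    query = " ".join(query.split())
    if not query:
        raise ValueError("query is empty")
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError(f"query is {len(query)} chars; limit is {MAX_QUERY_CHARS}")
    return query


def keep_relevant(results: list[dict], min_score: float = MIN_SCORE) -> list[dict]:
    """Keep results scoring above the threshold, best first.

    Results without a score are treated as 0.0 and dropped (an assumption).
    """
    kept = [r for r in results if r.get("score", 0.0) > min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)
```

The remaining rules (one topic per query, `include_domains` instead of `site:`) depend on how the query is written and are not checked here.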
Use when Jina and Tavily return empty or incomplete content:
| Tier | Examples | Credibility |
|---|---|---|
| 1 | Peer-reviewed journals, official statistics | High |
| 2 | Government/NGO reports, industry standards | High |
| 3 | Reputable news outlets, expert commentary | Medium |
| 4 | Blogs, forums, unverified claims | Low — verify independently |
| Level | Fetch rounds | Link depth | Pages | Output |
|---|---|---|---|---|
| Quick | 1 | None | 5–10 | 500–1000 words |
| Standard | 2 | 1 level | 15–25 | 1500–3000 words |
| Deep | 3+ | 2–3 levels | 30–50+ | 3000+ words |
Default to Standard. Use Deep when user says "in depth", "thorough", "comprehensive", or topic has many interconnected sources.
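The default rule can be sketched as a keyword check. Only the three trigger phrases quoted above are matched; the "many interconnected sources" condition needs judgment and is left out. Treating hyphens as spaces (so "in-depth" matches "in depth") is an added assumption, and the numeric limits are one reading of the table, taking "3+" as a minimum of 3 and "2–3 levels" as a cap of 3.

```python
# Trigger phrases quoted in the skill text.
DEEP_TRIGGERS = ("in depth", "thorough", "comprehensive")

# (minimum fetch rounds, maximum link depth) per level, read from the table.
DEPTH_LIMITS = {
    "quick": (1, 0),
    "standard": (2, 1),
    "deep": (3, 3),
}


def choose_depth(request: str) -> str:
    """Return "deep" when the request contains a trigger phrase, else "standard"."""
    text = request.lower().replace("-", " ")  # assumption: "in-depth" counts too
    if any(trigger in text for trigger in DEEP_TRIGGERS):
        return "deep"
    return "standard"
```

Quick is never chosen automatically in this sketch, since the text names no trigger for it; a caller would select it explicitly.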