etanhey/research-prompt-quality

Name: research-prompt-quality
Author: etanhey

skills/golem-powers/_archive/research-prompt-quality/SKILL.md

npx skillsauth add etanhey/golems research-prompt-quality

Research Prompt Quality — Pre-Flight Gate

Stop flat, redundant deep-research prompts before they ship. Research execution skills run after this gate passes.

When to run

Run before writing or pasting any deep-research prompt (Claude Desktop, Gemini Deep Research, or /research --deep). If the gate fails, do not emit a research prompt — output the stop reason and route to engineering or plan work instead.

The three gates (all must pass)

Gate 1 — CHECK-FIRST (non-redundancy)

Search existing work before proposing new research.

skills/golem-powers/research-prompt-quality/scripts/check-first.sh "<topic keywords>"

Sources scanned:

- Brain Drive/03_RESEARCH/ (when mounted)
- Every ~/Gits/*/docs.local/research/
- Every ~/Gits/*/docs.local/plans/
- Every ~/Gits/*/docs.local/decisions/

If the script exits with status 1 (prior work found), print ALREADY RESEARCHED → <paths> and STOP. What remains is engineering or plan execution, not research.

Canonical failure: proposing "RRF ranking" deep-research when ≥6 prior artifacts already exist (see evals/fixtures/neg-2-redundant-rrf.md).
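The CHECK-FIRST behavior described above can be sketched in Python. This is a minimal illustration, not the contents of check-first.sh: the function names `expand_roots`, `find_prior_work`, and `check_first` are hypothetical, and the matching rule (every keyword must appear, case-insensitively, in a Markdown file) is an assumption.

```python
import glob
import os
from pathlib import Path


def expand_roots(patterns):
    """Expand ~ and shell-style globs such as ~/Gits/*/docs.local/research."""
    roots = []
    for pattern in patterns:
        roots.extend(glob.glob(os.path.expanduser(pattern)))
    return roots


def find_prior_work(patterns, keywords):
    """Return sorted paths of .md files that mention every keyword (assumed rule)."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for root in expand_roots(patterns):
        root_path = Path(root)
        if not root_path.is_dir():
            continue  # e.g. the Drive folder is not mounted
        for path in root_path.rglob("*.md"):
            try:
                text = path.read_text(encoding="utf-8", errors="ignore").lower()
            except OSError:
                continue  # unreadable file: skip rather than fail the gate
            if all(k in text for k in wanted):
                hits.append(str(path))
    return sorted(hits)


def check_first(patterns, keywords):
    """Return (exit_code, message); exit code 1 means prior work exists."""
    hits = find_prior_work(patterns, keywords)
    if hits:
        return 1, "ALREADY RESEARCHED → " + ", ".join(hits)
    return 0, "no prior work found"
```

A caller would pass the four source locations listed above as patterns and stop as soon as the returned exit code is 1.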

Gate 2 — GROUND

Every research prompt MUST include:

- Drive folder refs: relevant Brain Drive/03_RESEARCH/... paths (or documented folder IDs from boot grounding docs).
- ≥1 concrete current-usage example: real code, config, or file path from the repo (not generic "we use agents").
- Prior-research stance: explicit BUILD-ON / VALIDATE / REFUTE for each prior artifact; never restart from zero.

Use references/ground-template.md as the required structure. Fold grounding bundles (e.g. MCL-RESEARCH-GROUNDING.md) into the prompt; do not treat the bundle as the prompt itself.
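The three GROUND requirements lend themselves to a mechanical check. The sketch below is one possible heuristic, not the skill's actual validator: `grounding_gaps` is a hypothetical name, the path-matching regex is an assumption about what counts as a "concrete current-usage example", and a stance is assumed to appear on the same line as the artifact it refers to.

```python
import re

STANCES = ("BUILD-ON", "VALIDATE", "REFUTE")
DRIVE_PREFIX = "Brain Drive/03_RESEARCH/"
# Assumed heuristic: a repo-style path such as cmux/SKILL.md counts as a usage example.
REPO_PATH = re.compile(r"[\w.-]+/[\w./-]+\.\w+")


def grounding_gaps(prompt: str, prior_artifacts: list[str]) -> list[str]:
    """Return a list of missing grounding elements; an empty list means GROUND passes."""
    gaps = []

    if DRIVE_PREFIX not in prompt:
        gaps.append("missing Drive folder ref")

    # Ignore Drive refs and prior-artifact paths so they cannot double as usage examples.
    stripped = re.sub(re.escape(DRIVE_PREFIX) + r"\S*", "", prompt)
    for artifact in prior_artifacts:
        stripped = stripped.replace(artifact, "")
    if not REPO_PATH.search(stripped):
        gaps.append("missing concrete current-usage example")

    # Each prior artifact needs an explicit stance on a line that names it.
    for artifact in prior_artifacts:
        lines = [ln for ln in prompt.splitlines() if artifact in ln]
        if not any(stance in ln for ln in lines for stance in STANCES):
            gaps.append(f"missing stance for {artifact}")

    return gaps
```

Returning the gaps as a list, rather than a boolean, gives the caller something concrete to print when grounding is incomplete.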

Gate 3 — Emit only if 1+2 pass

If CHECK-FIRST passes and GROUND is satisfied, emit the self-contained research prompt (reuse /claude-desktop-research format). Otherwise output only the stop message or grounding gap list.
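The emit-or-stop decision reduces to a small function. This is a sketch under assumptions: `run_gate` and its parameters are hypothetical, and the "GROUNDING GAPS:" header is an invented label for the gap list; only the ALREADY RESEARCHED message comes from the text above.

```python
def run_gate(check_first_exit, prior_paths, gaps, draft_prompt):
    """Return the only text the gate may output for the given inputs."""
    # Gate 1 failed: prior work exists, so no research prompt is emitted.
    if check_first_exit != 0:
        return "ALREADY RESEARCHED → " + ", ".join(prior_paths)
    # Gate 2 failed: report what grounding is missing instead of the prompt.
    if gaps:
        return "GROUNDING GAPS:\n" + "\n".join(f"- {gap}" for gap in gaps)
    # Both gates passed: the draft prompt may ship.
    return draft_prompt
```

The ordering matters: redundancy is checked before grounding, so a well-grounded prompt on an already-researched topic is still stopped.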

Workflow

1. CHECK-FIRST   → scripts/check-first.sh "<keywords>"
2. If hits       → STOP ("ALREADY RESEARCHED → <paths> → engineering")
3. Gather ground → read repo paths, Drive folders, prior research files
4. Draft         → references/ground-template.md
5. Score         → scripts/score-research-prompt.py <draft.md>  (target ≥8/10)
6. Ship prompt   → hand to /claude-desktop-research or /research

Static quality bar

Run scripts/score-research-prompt.py on the draft. The eval fixtures enforce a RED gate: negative prompts must score ≤4/10 and grounded prompts ≥8/10. See EVAL.md for rubric and fixture scores.
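The two thresholds can be checked against a set of fixture scores as follows. This is an illustrative sketch: `fixture_failures` is a hypothetical helper, and the input format (fixture name mapped to a kind and a score out of 10) is an assumption, not the output format of score-research-prompt.py.

```python
NEGATIVE_MAX = 4  # negative fixtures must score at or below this
GROUNDED_MIN = 8  # grounded fixtures must score at or above this


def fixture_failures(scores):
    """scores maps fixture name -> (kind, score), where kind is 'negative' or 'grounded'.

    Returns human-readable failures, sorted by fixture name; empty means the bar is met.
    """
    failures = []
    for name, (kind, score) in sorted(scores.items()):
        if kind == "negative" and score > NEGATIVE_MAX:
            failures.append(f"{name}: negative prompt scored {score}/10, expected ≤{NEGATIVE_MAX}")
        elif kind == "grounded" and score < GROUNDED_MIN:
            failures.append(f"{name}: grounded prompt scored {score}/10, expected ≥{GROUNDED_MIN}")
    return failures
```

Checking both directions guards against a scorer that is too lenient on flat prompts as well as one that is too harsh on grounded ones.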

Integration

- /research: run this gate before --deep or external research dispatch.
- /claude-desktop-research: run this gate before writing the paste-ready prompt file.

Anti-patterns (from gen-10, 2026-05-29)

| Bad | Good |
|-----|------|
| Flat MCL prompt with no Drive refs or repo examples | Grounded prompt + MCL-RESEARCH-GROUNDING.md folded in |
| "Deep-research RRF fusion" when 6+ prior docs exist | ALREADY RESEARCHED → <paths> → engineering |
| Generic "multi-agent ecosystem" without send_input / cmux paths | Cite cmux/SKILL.md, orc/workflows/fact-propagation.md, etc. |
| Restart prior art from scratch | BUILD-ON / VALIDATE / REFUTE per prior file |

Eval

See EVAL.md — four real fixtures, literal scorer stdout, with-skill vs without-skill delta target ≥+30%.
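One way to read the ≥+30% target is as a relative improvement in mean score. That reading is an assumption (the source does not say whether the delta is relative or in percentage points), and `relative_delta` and `meets_delta_target` are hypothetical names; the scores in the usage note are illustrative, not taken from EVAL.md.

```python
from statistics import mean

DELTA_TARGET = 0.30  # +30%, interpreted here as relative change in mean score


def relative_delta(with_skill, without_skill):
    """Relative change of the mean with-skill score over the mean without-skill score."""
    baseline = mean(without_skill)
    if baseline == 0:
        raise ValueError("baseline mean is zero; relative delta is undefined")
    return (mean(with_skill) - baseline) / baseline


def meets_delta_target(with_skill, without_skill, target=DELTA_TARGET):
    """True when the relative improvement reaches the target."""
    return relative_delta(with_skill, without_skill) >= target
```

For example, mean scores of 8.5 with the skill and 5 without give a relative delta of 0.7, which clears the target under this interpretation.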

Mandatory pre-flight gate before any deep-research prompt ships. Three gates: CHECK-FIRST (non-redundancy), GROUND (Drive refs + current-usage examples + prior-research stance), emit-only-if-pass. Use when writing deep research prompts, Claude Desktop research prompts, deciding should we research, or proposing research. Triggers: 'deep research', 'research prompt', 'should we research', 'propose research'. NOT for executing research — use /research, /claude-desktop-research, or /gemini-research.

3 stars

testing

Updated Jun 7, 2026

npx skillsauth add etanhey/golems research-prompt-quality

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Purpose | Status | Percentage |
|---------|---------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: May 30, 2026, 2:58 AM · 37.1s · 10 files scanned

Related Skills

etanhey/phoenix-human-view

tools

The human-eval UX contract for Phoenix views: turn-by-turn scrollable replay (not a scorecard), hide-but-copyable IDs, collapsed thinking, identity chips, tool filters, tiny frozen starter datasets, mark-wrong-in-thread, mobile-first. Use when: building or reviewing ANY Phoenix/eval view, annotation UI, session replay, or human-grading surface. Triggers: phoenix view, eval UI, annotation view, session replay, human eval UX, grading interface. NOT for: Phoenix data pipelines/ingest (capture scripts have their own specs).

3 · SKILL.md · Updated Jun 7, 2026

etanhey/mac-systems

tools

macOS systems specialist — AppKit NSPanel architecture, launchd services, socket activation, MCP bridge resilience, syspolicyd, and high-frequency SwiftUI dashboards. Use when building menu-bar apps, LaunchAgents, debugging syspolicyd/Gatekeeper/TCC, resilient UDS/MCP bridges, or SwiftUI dashboards at 10Hz+.

3 · SKILL.md · Updated Jun 7, 2026

etanhey/judge-fleet

development

Bulk LLM-judging protocol for fleet-dispatched verdict runs (KG cluster, eval harness). Use when: dispatching or running judge workers (J1/J2/RT), planning bulk-apply from verdict JSONL, or triaging evidence_degraded outputs. Triggers: judge fleet, bulk judge, R3 verdicts, kg-judge, RT gate, evidence_degraded. NOT for: single-item code review, Phoenix view UX (use phoenix-human-view), or non-judge eval pipelines.

3 · SKILL.md · Updated Jun 7, 2026

etanhey/fleet-wrap

development

Quiet-down protocol for sprint close: when the fleet wraps, delete ALL polling crons and monitors, send ONE final dashboard + ONE message, then go SILENT. Use when: fleet wraps, all workers done, overnight queue exhausted, sprint close, Etan asleep/away with nothing approved left. Triggers: fleet wrap, wrap the fleet, stand down, going quiet, sprint close. NOT for: mid-sprint monitoring (keep your loops), spawning a successor (use /session-handoff first).

3 · SKILL.md · Updated Jun 7, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Install manually

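# Clone the repo
git clone https://github.com/etanhey/golems.git

# Create the global Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy the skill into the Claude Code skills folder (global)
cp -r golems/skills/golem-powers/_archive/research-prompt-quality ~/.claude/skills/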

Claude Code Skills — official skills path docs.

Repository

etanhey/golems

3 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT