skills/prompt-spec/SKILL.md
Audit or rewrite a prompt into a six-section issue spec (Goal / Constraints / Non-goals / Verification / Architecture notes / Existing context) before any code gets generated. Use when the user pastes a vague request and asks for implementation, or explicitly says they want to frame an issue properly. Triggers on: prompt spec, audit this prompt, check my prompt, what's missing in this prompt, frame this issue, rewrite as a prompt spec, convert to issue spec, make this an issue, issue framing.
Install this skill globally with one command: `npx skillsauth add realroc/skills prompt-spec`. Works with Claude Code, Cursor, and Windsurf.
Turn a vague request into an issue an AI agent can execute safely. Or audit an existing prompt and surface what's missing before code gets written.
This skill exists because the real failure mode in AI-native engineering isn't "AI writes wrong code" — it's "human writes incomplete prompt, AI fills the gaps with a plausible-but-disastrous default."
Companion skill: githire (the full six-step workflow). This skill only owns the issue-framing step.
Two modes, picked automatically based on what the user gives you.
Mode A (Audit). Input: an existing prompt / issue / Slack message. Output: which of the six sections are missing or weak, and the concrete failure mode each gap maps to.
Mode B (Rewrite). Input: a vague request plus enough context to ask follow-up questions. Output: a complete six-section prompt spec the user can paste into an issue or feed to a coding agent.
If the user just pastes a prompt with no instruction, default to Audit and offer Rewrite as the next step.
| Section | What it answers | Missing → failure mode |
|---|---|---|
| Goal | What we're trying to achieve, in user-visible terms | AI optimizes for the wrong outcome |
| Constraints | Call frequency, data scale, latency budget, schema we can't break | AI picks an implementation that doesn't survive prod load |
| Non-goals | What this change must NOT do; concrete anti-patterns | AI happily uses the worst-case implementation that fits the goal |
| Verification | Tests, smoke probes, prod metrics that should/shouldn't move | "All tests pass" becomes the only signal; incidents in prod |
| Architecture notes | Data structures, index patterns, cache strategy, the shape we expect | AI invents a structure that conflicts with team conventions or prior decisions |
| Existing context | Files likely involved, prior decisions, related issues/PRs | AI rebuilds something that already exists or re-introduces an old bug |
For each of the six sections, return one of three verdicts: ✓ (present and specific), △ (present but weak or vague), or ✗ (missing).
Then, for each △ or ✗, name one concrete failure mode that gap enables.
## Audit · <prompt one-line summary>
| Section | Verdict | Gap → failure mode |
|---|---|---|
| Goal | ✓ / △ / ✗ | (only if △ or ✗) |
| Constraints | ✓ / △ / ✗ | ... |
| Non-goals | ✓ / △ / ✗ | ... |
| Verification | ✓ / △ / ✗ | ... |
| Architecture notes | ✓ / △ / ✗ | ... |
| Existing context | ✓ / △ / ✗ | ... |
### Top risks if shipped as-is
1. <highest-impact gap, what could break, why>
2. <next>
3. <next>
### Suggested next step
- [ ] Fill in <section X> with: <concrete prompt back to the user>
- [ ] Fill in <section Y> with: <concrete prompt back to the user>
- [ ] Once filled, run Mode B (Rewrite) to produce the final spec.
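The audit rules above (one verdict per section, one named failure mode for every △ or ✗) can be checked mechanically before the table is shown to the user. The following is a minimal sketch, not part of the skill itself; `Finding` and `render_audit` are hypothetical names, and the `-` placeholder for clean rows is an assumption.

```python
from dataclasses import dataclass

SECTIONS = [
    "Goal",
    "Constraints",
    "Non-goals",
    "Verification",
    "Architecture notes",
    "Existing context",
]
VERDICTS = {"✓", "△", "✗"}


@dataclass
class Finding:
    section: str
    verdict: str            # one of VERDICTS
    failure_mode: str = ""  # required when the verdict is △ or ✗


def render_audit(summary: str, findings: list[Finding]) -> str:
    """Render the audit table; reject gaps that name no failure mode."""
    by_section = {f.section: f for f in findings}
    missing = [s for s in SECTIONS if s not in by_section]
    if missing:
        raise ValueError(f"no verdict for: {', '.join(missing)}")

    rows = [
        f"## Audit · {summary}",
        "| Section | Verdict | Gap → failure mode |",
        "|---|---|---|",
    ]
    for name in SECTIONS:
        f = by_section[name]
        if f.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict {f.verdict!r} for {name}")
        if f.verdict != "✓" and not f.failure_mode:
            raise ValueError(f"{name} is {f.verdict} but names no failure mode")
        rows.append(f"| {name} | {f.verdict} | {f.failure_mode or '-'} |")
    return "\n".join(rows)
```

Raising on a △ or ✗ without a failure mode keeps the audit from degrading into a bare checklist.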
| Weak phrasing | What's wrong | Stronger version |
|---|---|---|
| "should be fast" | no number, no percentile | "p99 ≤ 150ms at 50 RPS" |
| "use the field" | no shape, no constraint | "read from model_detail.made_in_china; don't introduce new keys" |
| "make sure it works" | unmeasurable | "smoke test asserts: cn site returns only made_in_china=1; intl site unchanged" |
| "follow best practices" | meaningless | "no SCAN/KEYS in request-path; cache TTL ≥ 60s; reads O(1)" |
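A rough first pass over a prompt can flag phrasing like the examples in the table above. This is only a heuristic sketch: the phrase list is seeded from that table, the performance-word regex is an assumption, and `flag_weak_lines` is a hypothetical helper rather than anything the skill ships.

```python
import re

# Hypothetical phrase list, seeded from the weak-phrasing table; extend per team.
VAGUE_PHRASES = [
    "should be fast",
    "use the field",
    "make sure it works",
    "follow best practices",
]

# Assumed set of words that signal a performance claim.
PERF_WORDS = re.compile(r"\b(fast|slow|latency|performance)\b")


def flag_weak_lines(prompt: str) -> list[tuple[int, str]]:
    """Return (line_number, reason) pairs for lines that look unmeasurable."""
    hits = []
    for lineno, line in enumerate(prompt.splitlines(), start=1):
        lowered = line.lower()
        for phrase in VAGUE_PHRASES:
            if phrase in lowered:
                hits.append((lineno, f"vague phrase: {phrase!r}"))
        # A performance claim with no digit on the line has nothing to verify.
        if PERF_WORDS.search(lowered) and not re.search(r"\d", line):
            hits.append((lineno, "performance claim without a number"))
    return hits
```

A hit is a prompt for a follow-up question, not a verdict; the audit still needs a human or model judgment per section.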
Use this exact template. Fill every section; if a section truly doesn't apply, write N/A — <why>, never leave it blank.
# <One-line title: <verb> <object>>
## Goal
<What we're trying to achieve, in user-visible terms. One paragraph.>
## Constraints
- <Hard limit 1: e.g., `/api/X` is hit on every page load, currently ~200 RPS>
- <Hard limit 2: e.g., model_detail has ~80 entries today, may grow to ~500>
- <Hard limit 3: e.g., must not break the existing /api/Y contract>
## Non-goals
- <Concrete anti-pattern 1: e.g., no SCAN / KEYS / unbounded loops in request-path>
- <Concrete anti-pattern 2: e.g., do not change the frontend caching strategy in this PR>
- <Scope reduction 1: e.g., this PR does not handle <adjacent feature>>
## Verification
- <Test 1: e.g., pytest tests/X — assertion specifically about Y>
- <Smoke 1: e.g., `curl /api/site/config` on cn vs intl, expected diff>
- <Prod metric 1: e.g., `/api/X` p99 must not move > 10ms after rollout>
## Architecture notes
- <Shape 1: e.g., maintain a Redis SET `domestic_model_ids`, write on model up/down>
- <Shape 2: e.g., per-id checks are SISMEMBER, O(1); reading the whole set with SMEMBERS is O(N) in set size>
- <Constraint inherited: e.g., per <link to prior PR / ADR>, we standardized on...>
## Existing context
- Likely files: <path1>, <path2>
- Prior related work: <issue/PR/ADR link>
- Known landmines: <e.g., "previous attempt at X is in PR #42, reverted because Y">
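The "fill every section, never leave it blank" rule can be enforced in code if specs are generated programmatically. A minimal sketch under that assumption follows; `PromptSpec` and its field names are hypothetical and simply mirror the six headings of the template.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    title: str
    goal: str
    constraints: list[str] = field(default_factory=list)
    non_goals: list[str] = field(default_factory=list)
    verification: list[str] = field(default_factory=list)
    architecture_notes: list[str] = field(default_factory=list)
    existing_context: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the six sections in template order; reject blank sections."""
        if not self.goal.strip():
            raise ValueError("Goal is blank; fill it in or mark it N/A with a reason")
        parts = [f"# {self.title}", "## Goal", self.goal.strip()]
        sections = [
            ("Constraints", self.constraints),
            ("Non-goals", self.non_goals),
            ("Verification", self.verification),
            ("Architecture notes", self.architecture_notes),
            ("Existing context", self.existing_context),
        ]
        for heading, items in sections:
            if not items:
                raise ValueError(
                    f"{heading} is blank; fill it in or mark it N/A with a reason"
                )
            parts.append(f"## {heading}")
            parts.extend(f"- {item}" for item in items)
        return "\n".join(parts)
```

A section that does not apply is passed as a single item stating N/A and the reason, so the rendered spec still shows that the section was considered.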
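A real production incident (full walkthrough: https://realroc.github.io/git-hired/case-redis-scan.html) started from this prompt (translated from Chinese):
"Domestic-model detection currently uses prefix matching. I want to switch it to a made_in_china field in model_detail. Search needs to be checked separately for the domestic site and the international site. I suggest adding an E2E test to the smoke tests."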
Audit verdict:
| Section | Verdict | Gap → failure mode |
|---|---|---|
| Goal | ✓ | — |
| Constraints | ✗ | nothing said /api/site/config is hit per page-load → AI used per-request impl |
| Non-goals | ✗ | no "no SCAN/KEYS" → AI used r.scan(match='model_detail::*') |
| Verification | △ | "smoke test" too vague — didn't say "p99 must not move" → no prod gate |
| Architecture notes | ✗ | no mention of maintained set → AI invented a runtime scan |
| Existing context | △ | mentioned the field but not "model_detail has ~80 keys, all read-mostly" |
What actually happened: the AI generated functionally correct code in 5 minutes. Production was on fire 13 hours later. It took 25 hours of fix-chain commits before a Redis SET replaced the SCAN.
The rewritten prompt (Mode B output) would have been ~150 words longer and would have prevented the entire incident. Those 150 words would have saved 25 hours.
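The difference between the two shapes in this incident can be sketched as follows. This is an illustration under stated assumptions, not the code from the incident: `FakeRedis` is a hypothetical in-memory stand-in that mirrors a few redis-py method names so the example runs without a server, storing `model_detail` as a hash is assumed, the `model_detail::*` key pattern comes from the audit table, and `domestic_model_ids` is the set name used in the template example.

```python
from fnmatch import fnmatchcase


class FakeRedis:
    """Hypothetical in-memory stand-in for the few client calls used below."""

    def __init__(self):
        self.hashes = {}       # key -> dict of fields
        self.sets = {}         # key -> set of members
        self.keys_visited = 0  # counts keys touched by scans

    def hset(self, key, field, value):
        self.hashes.setdefault(key, {})[field] = value

    def hget(self, key, field):
        return self.hashes.get(key, {}).get(field)

    def scan_iter(self, match):
        # Like SCAN, this walks the whole keyspace, not only the matches.
        for key in list(self.hashes):
            self.keys_visited += 1
            if fnmatchcase(key, match):
                yield key

    def sadd(self, key, member):
        self.sets.setdefault(key, set()).add(member)

    def srem(self, key, member):
        self.sets.get(key, set()).discard(member)

    def smembers(self, key):
        return set(self.sets.get(key, set()))


def domestic_ids_by_scan(r):
    """Anti-pattern: rebuild the answer from a keyspace scan on every request."""
    return {
        key.split("::", 1)[1]
        for key in r.scan_iter(match="model_detail::*")
        if r.hget(key, "made_in_china") == "1"
    }


def set_model(r, model_id, made_in_china):
    """Write path: keep the maintained set in step with model_detail."""
    flag = "1" if made_in_china else "0"
    r.hset(f"model_detail::{model_id}", "made_in_china", flag)
    if made_in_china:
        r.sadd("domestic_model_ids", model_id)
    else:
        r.srem("domestic_model_ids", model_id)


def domestic_ids_by_set(r):
    """Read path: one read of the maintained set, no keyspace walk."""
    return r.smembers("domestic_model_ids")
```

The scan's cost grows with every key in the database, including unrelated ones, while the set read is bounded by the number of domestic models. With a real redis-py client, values come back as bytes unless the client is created with `decode_responses=True`.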
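- This skill stops at the spec; code generation (the githire skill's Execute step) comes after.
- Mark anything inferred rather than stated by the user with an _<assumption>_ italic note.

When asking the user to fill missing sections, use lettered options so the user can answer with 1B, 2A, 3D: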
1. What's the call frequency of the entry point you're touching?
A. One-off / batch job (cold path)
B. User action (clicks, ~1 RPS per user)
C. Page-load / startup (high RPS, 10–100×)
D. Request-path on every API hit (highest scrutiny)
2. What's the data scale today, and 6 months out?
A. < 100 items, stable
B. 100–10K items, slow growth
C. 10K–1M items
D. > 1M items / unbounded
3. What's the verification bar?
A. Unit tests only
B. + smoke / integration
C. + production metric SLO
D. + canary / staged rollout
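A compact reply in the `1B, 2A, 3D` style is easy to parse back into per-question answers. A minimal sketch, assuming three questions with options A through D as in the list above; `parse_answers` is a hypothetical helper.

```python
import re


def parse_answers(reply: str, question_count: int = 3) -> dict[int, str]:
    """Parse a reply such as '1B, 2A, 3D' into {1: 'B', 2: 'A', 3: 'D'}."""
    answers = {}
    # A question number, optional whitespace, then a single option letter.
    for number, letter in re.findall(r"(\d+)\s*([A-Da-d])\b", reply):
        answers[int(number)] = letter.upper()
    unanswered = [q for q in range(1, question_count + 1) if q not in answers]
    if unanswered:
        raise ValueError(f"no answer for question(s): {unanswered}")
    return answers
```

Rejecting a partial reply keeps a skipped question from silently turning into a default.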
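Activate this skill when the user asks to audit or check a prompt, find what's missing in a prompt, frame an issue, or rewrite a request as a prompt spec or issue spec.
Also activate proactively when the user pastes a vague request and asks for implementation, before handing off to the githire Execute step.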