skills/explore-feature/SKILL.md
Identify high-value next features using architecture artifacts, code signals, and active OpenSpec context
Analyze the current codebase and workflow state to recommend what to build next.
$ARGUMENTS - Optional focus area (for example: "performance", "refactoring", "cost", "usability", "security")
Use OpenSpec-generated runtime assets first, then CLI fallback:
- `.claude/commands/opsx/*.md` or `.claude/skills/openspec-*/SKILL.md`
- `.codex/skills/openspec-*/SKILL.md`
- `.gemini/commands/opsx/*.toml` or `.gemini/skills/openspec-*/SKILL.md`
- `openspec` CLI commands

Use `docs/coordination-detection-template.md` as the shared detection preamble.
- `CAN_*` flag is true

Read-only exploration may run from the shared checkout when it only returns a
recommendation in chat. Artifact-producing exploration MUST use a managed
worktree in local CLI execution before any write, including refreshing
architecture artifacts, writing docs/feature-discovery/opportunities.json,
creating OpenSpec changes, updating docs, seeding coordinator issues, or
changing issue state.
For artifact-producing mode, run this before the first write:
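```bash
# Set up a managed worktree for this exploration and enter it.
CHANGE_ID="explore-<short-focus-slug>"
eval "$(python3 "<skill-base-dir>/../worktree/scripts/worktree.py" setup "$CHANGE_ID")"
cd "$WORKTREE_PATH"
# Run the checkout policy check before any write.
skills/.venv/bin/python skills/shared/checkout_policy.py require-mutation
```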
All artifact-producing steps below happen inside that worktree and are committed to the resolved branch. The shared checkout remains read-only.
At skill start, run the coordination detection preamble and set:
- `COORDINATOR_AVAILABLE`
- `COORDINATION_TRANSPORT` (`mcp|http|none`)
- `CAN_LOCK`, `CAN_QUEUE_WORK`, `CAN_HANDOFF`, `CAN_MEMORY`, `CAN_GUARDRAILS`

If `CAN_MEMORY=true`, recall relevant history before analysis:

- `recall` with tags like `["feature-discovery", "<focus-area>"]`
- `"<skill-base-dir>/../coordination-bridge/scripts/coordination_bridge.py"` `try_recall(...)`

On recall failure/unavailability, continue normally and log informationally.
If $ARGUMENTS is missing, single-word, or matches one of the known broad buckets (performance, refactoring, cost, usability, security, quality, speed, reliability, tech-debt), ask 2-4 short questions to localize the pain before running architectural analysis. A scored opportunity ranking is only useful if the scoring criteria match the user's actual itch.
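```bash
# Trim surrounding whitespace and count the words in the focus argument.
RAW="$(echo "$ARGUMENTS" | xargs)"
WORD_COUNT=$(echo "$RAW" | wc -w | tr -d ' ')
BROAD_BUCKETS="performance refactoring cost usability security quality speed reliability tech-debt"
NEEDS_INTERVIEW=false
if [[ -z "$RAW" ]] || [[ "$WORD_COUNT" -le 1 ]]; then
  # Missing or single-word input is too broad to score against.
  NEEDS_INTERVIEW=true
elif echo " $BROAD_BUCKETS " | grep -qwF -- "$RAW"; then
  # Literal match (-F) against the bucket list; "--" guards input that starts with "-".
  NEEDS_INTERVIEW=true
fi
```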
If NEEDS_INTERVIEW=true, ask 2-4 questions in a single batch via AskUserQuestion. No confidence loop, no follow-ups -- this is a lightweight localization, not a full discovery interview (see plan-feature Step 3b.ii for that pattern).
Question template (pick the 2-4 most relevant, parameterized by the bucket):
| Bucket | Question 1 (localize "what") | Question 2 (localize "why now") |
|--------|------------------------------|----------------------------------|
| performance | "Whose performance? Build time / runtime latency / perceived UX / agent throughput / cost per run" | "What recent moment made you reach for this -- a slow build, a user complaint, a bill?" |
| refactoring | "Which axis of debt? Coupling / duplication / unclear naming / dead code / outdated patterns" | "Is there a feature you're avoiding because the surrounding code scares you?" |
| cost | "Which cost? LLM API spend / infra / engineering time / agent retries" | "What budget signal triggered this -- a bill, a forecast, a failed audit?" |
| usability | "Whose usability? End user / operator / developer / agent" | "Where do people get stuck -- onboarding, daily flow, edge cases, recovery from errors?" |
| security | "Which surface? Authn / authz / secrets handling / supply chain / data exfiltration" | "Is this driven by an audit, a recent incident, or proactive hardening?" |
| reliability | "Which failure mode? Crashes / data loss / silent corruption / cascading failures / flaky tests" | "What broke recently, or what are you afraid will break?" |
| tech-debt / quality | "Which symptom hurts most? Test gaps / fragile interfaces / spec drift / undocumented decisions" | "Which area of the codebase do you avoid editing, and why?" |
| speed | "Whose speed? Build / test / deploy / agent execution / human iteration" | "What slowness most often blocks you mid-task?" |
Optional Question 3-4 (open-ended, pick if ambiguity remains):
- <bucket>?"

Capture the answers as `LOCALIZED_FOCUS` -- a short string (e.g., "runtime latency in agent dispatch loop, triggered by 8s p95 in last week's traces") used in Step 3 scoring and Step 5 artifact persistence.
If NEEDS_INTERVIEW=false, skip this step entirely and set LOCALIZED_FOCUS="$ARGUMENTS".
Skip this step only when this invocation is strictly read-only and returns
results in chat. If any later step will write files or refresh generated
artifacts, enter the worktree and run checkout_policy.py as described in the
Local CLI Mutation Boundary section.
```bash
openspec list --specs
openspec list
```
Collect:
```bash
test -f docs/architecture-analysis/architecture.summary.json || make architecture
```
Use:
- `docs/architecture-analysis/architecture.summary.json`
- `docs/architecture-analysis/architecture.diagnostics.json` (if present)
- `docs/architecture-analysis/parallel_zones.json`

Look for:
Check for recent gen-eval reports. These provide empirical evidence of interface reliability and coverage gaps:
```bash
# Look for gen-eval reports in the project
# Use -newer filter only if opportunities.json exists; otherwise find any report
if [ -f docs/feature-discovery/opportunities.json ]; then
  GENEVAL_REPORT=$(find . -name "gen-eval-report.json" -type f -newer docs/feature-discovery/opportunities.json 2>/dev/null | head -1)
else
  GENEVAL_REPORT=$(find . -name "gen-eval-report.json" -type f 2>/dev/null | sort -t/ -k1 | head -1)
fi
```
If a report exists, extract:
- `fail` or `error` verdicts -- these represent concrete bugs or regressions that could become fix opportunities

Incorporate these signals into opportunity ranking:

- `add-` opportunity with `quick-win` bucket (writing scenarios is low effort)
- `fix-` opportunities targeting the inconsistent service layer

If no report exists, skip this step and note "No gen-eval data available" in the output.
Check for an archive-intelligence index that provides exemplars and patterns from completed changes:
```bash
ARCHIVE_INDEX="docs/factory-intelligence/archive-index.json"
EXEMPLAR_REGISTRY="docs/factory-intelligence/exemplars.json"
```
If the archive index exists, use it to:
Include archive signals in the opportunity output when relevant. If neither file exists, skip and note "No archive intelligence available."
Before producing the ranked list, run two short discipline steps. They widen the solution space so the ranking is not a vote among lookalikes.
Take any concrete pain point surfaced in Steps 1-2.6 (slow build, fragile interface, missing test coverage, etc.) and reframe it as "How might we ____?" — a question, not a solution.
Why: a pain like "the dispatch loop is slow" implicitly suggests "make the dispatch loop faster" — which narrows you to one branch of solutions. Reframing as "How might we deliver work to agents without polling?" widens the space to include pg_notify, websockets, push-based queues, batched dispatch, etc.
For each candidate pain you surfaced, write at least one HMW reframe before brainstorming candidates. Capture the HMW question alongside the candidate in opportunities.json under a new hmw_reframe field — this makes the reasoning auditable and lets the next agent see why the candidate was even considered.
When generating candidates, deliberately apply each of the lenses below at least once. Don't stop at the first lens that returns a hit — diversity beats local-maxima fixation.
In opportunities.json, tag each candidate with the lens(es) that surfaced it (lenses_applied: ["inversion", "10x"]). Candidates surfaced by 2+ lenses are stronger signals than candidates surfaced by one.
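As a rough illustration of the tagging rule above, the sketch below labels each candidate by how many distinct lenses surfaced it. Only `lenses_applied` comes from the text; the `id` key, the `lens_confidence` helper, the example change-ids, and the label strings are assumptions made for illustration.

```python
def lens_confidence(candidate: dict) -> str:
    """Label a candidate by how many distinct lenses surfaced it (hypothetical helper)."""
    lenses = set(candidate.get("lenses_applied", []))
    if len(lenses) >= 2:
        return "strong"          # surfaced by 2+ lenses
    if len(lenses) == 1:
        return "low-confidence"  # single-lens hit
    return "untagged"            # no lens recorded


# Hypothetical candidates; the ids are invented for this example.
candidates = [
    {"id": "add-push-dispatch", "lenses_applied": ["inversion", "10x"]},
    {"id": "update-cache-ttl", "lenses_applied": ["simplification"]},
]
labels = {c["id"]: lens_confidence(c) for c in candidates}
```

Counting distinct lenses rather than list length keeps a repeated tag from inflating a candidate's apparent strength.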
Every candidate report MUST include a top-level NOT DOING: section listing the most-tempting alternatives that were considered and rejected, each with a one-line rationale.
NOT DOING:
- Rewrite dispatch in Rust — too costly for the latency win available; revisit if profiling shows CPU-bound bottleneck.
- Add a second cache layer — overlaps with planned `cache-unification` change; would create competing surfaces.
- Defer to next quarter — error budget is already burning; deferring compounds the cost.
Why this matters: an explore report that lists only what we will do hides the alternatives the team rejected. Future agents (and the operator) cannot evaluate the recommendation without seeing the rejected branches. The NOT DOING: list is the antidote to confirmation bias.
This section is required: opportunities.json MUST have a not_doing array (parallel to the ranked list) with { "alternative": "...", "rationale": "..." } entries. The skill's invariant test asserts this section is present in the SKILL.md and the JSON output.
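A check along these lines could back the requirement. This is a sketch, not the skill's actual invariant test: it assumes `opportunities.json` parses to an object with a top-level `not_doing` key, and the helper name `not_doing_problems` and the default minimum of two entries are choices made for this example.

```python
def not_doing_problems(report: dict, minimum: int = 2) -> list[str]:
    """Return human-readable problems with the not_doing array (empty list = OK)."""
    entries = report.get("not_doing")
    if not isinstance(entries, list) or len(entries) < minimum:
        return [f"not_doing must list at least {minimum} rejected alternative(s)"]
    problems = []
    for i, entry in enumerate(entries):
        for key in ("alternative", "rationale"):
            value = entry.get(key) if isinstance(entry, dict) else None
            # Each entry needs a non-empty string for both keys.
            if not isinstance(value, str) or not value.strip():
                problems.append(f"not_doing[{i}] is missing a non-empty {key!r}")
    return problems


# Assumed report shape, reusing two of the rejected alternatives from the example above.
report = {
    "not_doing": [
        {"alternative": "Rewrite dispatch in Rust",
         "rationale": "too costly for the latency win available"},
        {"alternative": "Add a second cache layer",
         "rationale": "overlaps with planned cache-unification change"},
    ]
}
problems = not_doing_problems(report)
```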
Generate a ranked shortlist (3-7 items), each with:
- `low/med/high`)
- `score = impact*0.4 + strategic_fit*0.25 + (4-effort)*0.2 + (4-risk)*0.15 + focus_match*0.1`
- `low=1, med=2, high=3; S=1, M=2, L=3`
- `focus_match` (0-3): how directly the opportunity addresses `LOCALIZED_FOCUS` (set either by Step 0.5's interview answers or by the direct `$ARGUMENTS` string when the interview was skipped). 3 = directly addresses the named pain (e.g., focus is "runtime latency in agent dispatch" and opportunity reduces dispatch latency); 2 = addresses the broader bucket but not the specific pain; 1 = tangentially related; 0 = unrelated. Only set `focus_match=0` for all items when `LOCALIZED_FOCUS` is empty/unset; a non-empty focus from `$ARGUMENTS` is just as valid a scoring anchor as one from the interview.
- `localized_focus_alignment`: one-line note on why the opportunity received its `focus_match` score (e.g., "Reduces dispatch loop p95 by replacing polling with pg_notify -- direct hit on stated pain")
- `quick-win` (high score, low effort/risk)
- `big-bet` (high potential impact with medium/high effort)
- `add-`, `update-`, `refactor-`, `remove-`)
- `blocked-by` dependencies (existing change-ids, missing infra, unresolved design decisions)
- `/plan-feature` now, or defer)

Coordinator-dependent step (requires `CAN_DISCOVER` and `CAN_LOCK`). Skip if coordinator is unavailable.
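The Step 3 scoring formula listed above can be sketched as follows. This is a minimal illustration, not the skill's implementation: the function name `opportunity_score` is hypothetical, and placing `impact`, `strategic_fit`, and `risk` on the low/med/high scale and `effort` on the S/M/L scale is an assumption, since the text gives only the formula and the two mappings.

```python
LEVEL = {"low": 1, "med": 2, "high": 3}  # low=1, med=2, high=3
SIZE = {"S": 1, "M": 2, "L": 3}          # S=1, M=2, L=3


def opportunity_score(impact: str, strategic_fit: str, effort: str,
                      risk: str, focus_match: int) -> float:
    """Weighted score; effort and risk are inverted so lower values score higher."""
    if not 0 <= focus_match <= 3:
        raise ValueError("focus_match must be between 0 and 3")
    score = (
        LEVEL[impact] * 0.4
        + LEVEL[strategic_fit] * 0.25
        + (4 - SIZE[effort]) * 0.2
        + (4 - LEVEL[risk]) * 0.15
        + focus_match * 0.1
    )
    return round(score, 3)


# High impact, medium fit, small effort, low risk, direct hit on the focus.
example = opportunity_score("high", "med", "S", "low", 3)  # 3.05
```

Under these assumptions the weights sum to 1.1, so scores range from 1.0 (worst case on every axis) to 3.3 (best case on every axis).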
- `check_locks()` to get all active file and logical locks
- `discover_agents()` to enumerate in-flight features and their claimed resources

For each candidate from Step 3, if resource claims were enumerated in Step 3.5:

- `FULL` -- No resource overlap; safe for full parallel execution
- `PARTIAL` -- Some overlap; can run in parallel with serialized access to shared resources
- `SEQUENTIAL` -- Heavy overlap; must wait for in-flight features to complete

Add these fields to the ranked output when available:
| Field | Description |
|-------|-------------|
| Parallel Feasibility | FULL / PARTIAL / SEQUENTIAL (or N/A if coordinator unavailable) |
| Resource Conflicts | List of overlapping locks (if any) |
| Independent Zones | Which parallel_zones.json groups are available |
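One way to derive the Parallel Feasibility and Resource Conflicts fields is sketched below. The text does not say where "some" overlap ends and "heavy" overlap begins, so the 50% cut-off (`heavy_overlap_ratio`), the function name, and the example paths are all assumptions made for illustration.

```python
from typing import Iterable, Optional


def parallel_feasibility(
    candidate_resources: Iterable[str],
    claimed_resources: Optional[Iterable[str]],
    heavy_overlap_ratio: float = 0.5,  # assumed threshold, not from the skill
) -> tuple[str, list[str]]:
    """Return (feasibility, conflicting resources) for one candidate."""
    if claimed_resources is None:
        return "N/A", []  # coordinator unavailable, nothing to compare against
    candidate = set(candidate_resources)
    overlap = candidate & set(claimed_resources)
    if not overlap:
        return "FULL", []
    # Share of the candidate's resources already claimed by in-flight work.
    ratio = len(overlap) / len(candidate)
    level = "SEQUENTIAL" if ratio >= heavy_overlap_ratio else "PARTIAL"
    return level, sorted(overlap)


# Hypothetical paths: one of three candidate files is already claimed.
level, conflicts = parallel_feasibility(
    ["src/dispatch.py", "src/queue.py", "src/api.py"],
    ["src/queue.py", "docs/notes.md"],
)
```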
For the top recommendation, include:
- `/plan-feature <description>`
- `/iterate-on-plan <change-id>` if a related proposal exists

Write/update machine-readable discovery artifacts:

- `docs/feature-discovery/opportunities.json` (current ranked opportunities)
- `docs/feature-discovery/history.json` (recent top recommendations with timestamps/status)

Rules:
- `/prioritize-proposals` can reference opportunities without text matching
- `gen_eval_signals` field in `opportunities.json` with: `{ "report_path": "<path>", "failing_interfaces": [...], "coverage_pct": <float>, "categories_below_threshold": [...] }`
- `localized_focus` field in `opportunities.json` capturing the interview output (raw input, bucket detected, questions asked, answers, derived focus string). This makes the ranking reproducible and gives `/prioritize-proposals` the same context the user provided

| Rationalization | Why it's wrong |
|---|---|
| "I already know what we should build -- skip the lenses" | Pre-loaded answers come from the most recently active mental model, not the highest-value option. The lenses are how you discover alternatives you wouldn't have generated otherwise. |
| "The HMW reframe is just rewording -- it doesn't change the answer" | "Make X faster" anchors you to optimization. "How might we eliminate X?" surfaces removal candidates that are usually 10× cheaper. The reframe is the work. |
| "Listing NOT DOING items is filler -- the report is about what we WILL do" | The rejected branches ARE the audit trail. Without them, future agents cannot tell whether the recommendation was a considered choice or a default. |
| "Only 1 lens hit -- that's enough, ship the candidate" | One-lens hits are local maxima. A genuinely strong candidate surfaces from 2+ lenses (e.g., simplification AND 10×). One-lens hits should be marked low-confidence, not headlined. |
| "Focus interview is for vague input -- my $ARGUMENTS is specific" | "Performance" looks specific but is a bucket. Run the interview anyway when the bucket name matches one of the broad buckets in Step 0.5; the cost is 30 seconds, the upside is a `focus_match` score that actually means something. |
- `lenses_applied: []` for all candidates (the lenses were skipped or fabricated).
- `NOT DOING:` section is missing or contains only "<none>" (no alternatives were considered, or the author is performing the form without the substance).
- `quick-win` bucket -- strongly suggests inversion / 10× / big-bet lenses were not applied.
- `localized_focus` is empty even when `$ARGUMENTS` was non-empty (the interview was run unnecessarily, or the focus from `$ARGUMENTS` was discarded).

- `opportunities.json` has at least one `hmw_reframe` entry AND at least one entry in `lenses_applied` from the 8 named lenses.
- `NOT DOING:` section with ≥2 rejected alternatives, each with a one-line rationale (cite the file path and section).
- `opportunities.json | jq '[.[] | .lenses_applied[]] | unique | length'` returns ≥3).
- `focus_match` score and `localized_focus_alignment` note -- not just the raw ranking.
- `$ARGUMENTS` matched a broad bucket, the focus interview ran and `localized_focus` is populated with the interview output (not the raw bucket name).
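The lens and reframe checks above can also be expressed outside `jq`. The sketch below assumes the ranked candidates are available as a list of objects, as the `.[]` iteration in the `jq` expression suggests; the helper name `diversity_problems` and the message wording are illustrative only.

```python
def diversity_problems(candidates: list[dict], min_lenses: int = 3) -> list[str]:
    """Flag a ranking with no HMW reframe or too few distinct lenses applied."""
    problems = []
    if not any(c.get("hmw_reframe") for c in candidates):
        problems.append("no candidate has an hmw_reframe")
    # Same idea as: jq '[.[] | .lenses_applied[]] | unique | length'
    distinct = {lens for c in candidates for lens in c.get("lenses_applied", [])}
    if len(distinct) < min_lenses:
        problems.append(
            f"only {len(distinct)} distinct lens(es) applied; expected at least {min_lenses}"
        )
    return problems


# Hypothetical ranking with three distinct lenses and one reframe.
ranking = [
    {"hmw_reframe": "How might we deliver work to agents without polling?",
     "lenses_applied": ["inversion", "10x"]},
    {"lenses_applied": ["simplification"]},
]
issues = diversity_problems(ranking)
```

Unlike the `jq` expression, this version tolerates candidates that lack a `lenses_applied` key and reports them as contributing no lenses.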