skills/github-issue-dupe-check/SKILL.md
Use before opening a GitHub issue to search for likely duplicates with gh, inspect candidate issues, and classify them as duplicate, related, or distinct. Use when drafting GitHub issues, checking whether a bug report already exists, responding to duplicate-bot comments, or deciding whether to comment on an existing issue instead of filing a new one.
Install this skill globally with one command: `npx skillsauth add mullzhang/skills github-issue-dupe-check`. Works with Claude Code, Cursor, and Windsurf.
Use this skill before opening a GitHub issue, or when a duplicate bot flags an issue after creation.
- Identify the target repository (`owner/name`) and the draft issue title/body.
- Inspect likely matches with `gh issue view`, including closed issues.
- Classify each candidate:
  - `duplicate`: same user-visible bug or likely same root cause; one issue should be closed.
  - `related`: overlapping area, but likely a separate fix or tracking item.
  - `distinct`: not materially related.

Use the bundled script from the repository root or from any workspace:
python skills/mullzhang/github-issue-dupe-check/scripts/collect_issue_candidates.py \
--repo openai/codex \
--draft path/to/issue-draft.md \
--output /tmp/issue-dupe-candidates.json
If there is no draft file:
python skills/mullzhang/github-issue-dupe-check/scripts/collect_issue_candidates.py \
--repo openai/codex \
--title "TUI $ autocomplete shows inaccessible App Directory entries" \
--body "Typing $ shows irrelevant [App] suggestions from codex_app_directory..." \
--output /tmp/issue-dupe-candidates.json
Then read the JSON and inspect the highest-signal candidates:
gh issue view 24145 --repo openai/codex --json number,title,state,url,body,comments
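The inspection step can also be scripted. The following is a minimal sketch, not part of the bundled collector: the helper names `build_view_command`, `view_issue`, and `summarize_issue` are hypothetical, and it assumes `gh` is installed and authenticated. It wraps the same `gh issue view --json` call shown above and reduces the result to a one-line summary.

```python
import json
import subprocess

# Same field list as the gh command shown above.
FIELDS = "number,title,state,url,body,comments"


def build_view_command(number: int, repo: str) -> list[str]:
    """Build the argument list for `gh issue view` (hypothetical helper)."""
    return ["gh", "issue", "view", str(number), "--repo", repo, "--json", FIELDS]


def view_issue(number: int, repo: str) -> dict:
    """Run gh and parse its JSON output. Requires gh to be installed and authenticated."""
    result = subprocess.run(
        build_view_command(number, repo),
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def summarize_issue(issue: dict) -> str:
    """Reduce one issue payload to a single line for quick triage."""
    comments = issue.get("comments") or []
    return f"#{issue['number']} [{issue['state']}] {issue['title']} ({len(comments)} comments)"
```

Calling `summarize_issue(view_issue(24145, "openai/codex"))` would print one line per candidate; the body and comments still need to be read before classifying.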
Always search at least these categories:
For large repositories, broad symptom queries often beat precise internal queries. In the Codex App Directory case, codex_app_directory missed the older issue, while dollar sign menu bloated found it.
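One way to act on this is to issue both kinds of query and compare what each finds. The sketch below is an illustration under assumptions rather than the collector's actual logic: `build_search_command` and `search_commands` are hypothetical helpers, and the limit of 20 is an arbitrary choice. It builds `gh issue list --search` commands with `--state all` so that closed issues are included.

```python
import shlex


def build_search_command(repo: str, query: str, limit: int = 20) -> list[str]:
    """Build a `gh issue list` search over open and closed issues (hypothetical helper)."""
    return [
        "gh", "issue", "list",
        "--repo", repo,
        "--search", query,
        "--state", "all",          # include closed issues
        "--limit", str(limit),     # arbitrary cap for this sketch
        "--json", "number,title,state,url",
    ]


def search_commands(repo: str, queries: list[str]) -> list[list[str]]:
    """One command per query, preserving query order."""
    return [build_search_command(repo, q) for q in queries]


# A precise internal identifier and a broad symptom phrase, both taken from the text above.
QUERIES = ["codex_app_directory", "dollar sign menu bloated"]

if __name__ == "__main__":
    # Print the commands instead of running them, so this works without gh.
    for cmd in search_commands("openai/codex", QUERIES):
        print(shlex.join(cmd))
```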
- Classify as `duplicate` when the same maintainer change would likely fix both reports.
- Classify as `related` when symptoms share a UI surface but involve different providers, commands, or lifecycle phases.

Additional diagnostics on the older issue:
I believe #NEW is a duplicate of this issue, but it includes additional diagnostics that may help narrow this down.
In my repro, the unexpected entries are present in:
```text
PATH_OR_CACHE
```
with entries like:
```json
{
  "key": "value"
}
```
This suggests SOURCE is being included in SURFACE.
Closing the newer issue:
Closing this as a duplicate of #OLD. I added the relevant diagnostics there.
The collector writes JSON with:
- `repo`
- `queries`
- `candidates`
- `matchedQueries` per candidate
- details from `gh issue view`, when available

Use that output as evidence, not as final judgment. The agent must still read and classify likely matches.
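To decide which candidates to read first, the output can be ordered by how many queries matched each one. This is a rough sketch only: the top-level keys `repo`, `queries`, `candidates`, and the per-candidate `matchedQueries` come from the description above, while the candidate fields `number` and `title`, the helper names, and the sample data are assumptions made for illustration.

```python
import json
from pathlib import Path


def load_report(path: str) -> dict:
    """Read the collector's JSON output from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def rank_candidates(report: dict) -> list[dict]:
    """Order candidates by number of matched queries, most first.

    Ties keep their original order because Python's sort is stable.
    """
    return sorted(
        report.get("candidates", []),
        key=lambda c: len(c.get("matchedQueries", [])),
        reverse=True,
    )


# Made-up sample in the assumed shape; not real collector output.
SAMPLE = {
    "repo": "owner/name",
    "queries": ["broad symptom", "internal identifier"],
    "candidates": [
        {"number": 101, "title": "First", "matchedQueries": ["broad symptom"]},
        {"number": 102, "title": "Second",
         "matchedQueries": ["broad symptom", "internal identifier"]},
    ],
}

if __name__ == "__main__":
    for candidate in rank_candidates(SAMPLE):
        hits = len(candidate["matchedQueries"])
        print(f"#{candidate['number']} {candidate['title']} ({hits} matched queries)")
```

A high match count only says where to look first; it does not replace reading the issue and classifying it.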