npx skillsauth add laststance/skills qa-cliInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
When running this skill in Codex, translate Claude Code-only primitives before acting: AskUserQuestion -> chat/request_user_input, TodoWrite -> update_plan, Task/TaskCreate/TeamCreate/SendMessage -> spawn_agent/send_input/wait_agent when available and allowed, and EnterPlanMode/ExitPlanMode -> a concise chat plan plus explicit approval.
Resolve Read/Write/Edit/Bash/WebSearch/WebFetch to Codex file/shell/web tools, and map ~/.claude/... paths to ~/.agents/... or ~/.codex/... unless the task explicitly targets Claude Code.
When running this skill in Cursor Agent, translate Claude Code-only primitives before acting: AskUserQuestion -> AskQuestion; TodoWrite -> Cursor TodoWrite or an equivalent checklist; Task/TaskCreate/TeamCreate/SendMessage/multi-agent flows -> Cursor Task (subagents), parallel Tasks, or run_in_background when allowed (TeamCreate/SendMessage may have no exact match); EnterPlanMode/ExitPlanMode -> Plan mode (SwitchMode / CreatePlan) plus explicit user approval.
Resolve Read/Write/Edit/StrReplace/Bash/web/search/MCP via Cursor Composer or Agent equivalents. MCP names written as mcp__server__tool typically map to call_mcp_tool with configured server identifiers. Map ~/.claude/... to ~/.cursor/skills/, .cursor/skills/, and .cursor/rules/ unless the task explicitly targets Claude Code.
You audit a CLI tool without touching its source, produce a single
Markdown report of observed issues, and stop. The methodology ports
the laststance /qa pattern to the CLI runtime: invoke the tool as a
user would, capture stdout / stderr / exit code / side effects,
diff across environments (TTY vs pipe, NO_COLOR, LANG=C), triage,
file.
/tmp/qa-cli-session/ and clean up in Phase 9.--help output, man page, and any user-visible docs, but do
not grep its source to decide what to test.printenv-pair, or a before/after file listing. Commands
alone are not evidence — capture their output.sync subcommand")qa-tui insteadqa-*
skill/review or a review-oriented agentAsk once, up front, via AskUserQuestion:
./my-cli,
node cli.js, pnpm run cli --, bun run cli.ts, a global like
gh or kubectl)--dry-run or a safe test
target. Never run destructive commands without explicit scope.Record these answers in the report header.
One file, at ./qa-reports/cli-<YYYY-MM-DD>-<tool>.md, matching
templates/qa-report-template-cli.md. Evidence (stdout / stderr
captures, session transcripts) lives alongside under
./qa-reports/artifacts/cli-<date>-<tool>/.
No other files. No fixes. No PRs.
Set up once in Phase 0:
/tmp/qa-cli-session/
├── baseline/
│ ├── help.txt # `<tool> --help` raw capture
│ ├── version.txt
│ └── env.txt # `printenv | sort` for reproducibility
├── cases/
│ ├── case-001/
│ │ ├── cmd # the exact command run
│ │ ├── stdout
│ │ ├── stderr
│ │ └── exit # numeric exit code
│ └── case-002/…
├── fixtures/ # any sample input files you create
└── notes.md # running log — becomes the report
The rule: one case = one directory. It makes diffing across environments mechanical.
mkdir -p /tmp/qa-cli-session/{baseline,cases,fixtures}
cd /tmp/qa-cli-session
# Record the exact invocation — parameterize TOOL once, reuse.
TOOL="<user-provided-invocation>"
echo "$TOOL" > baseline/invocation
# Basics
$TOOL --version > baseline/version.txt 2>&1 ; echo "exit=$?" >> baseline/version.txt
$TOOL --help > baseline/help.txt 2>&1 ; echo "exit=$?" >> baseline/help.txt
$TOOL -h > baseline/help-short.txt 2>&1 ; echo "exit=$?" >> baseline/help-short.txt
Note every mismatch already:
--help exit 0? (Should. Exit-code bug if it doesn't.)--help and -h produce the same output? (Should, or one
should be documented absent.)--version print only the version on stdout, or does it
emit a header / banner / ASCII art? (Banners on stdout break
$(tool --version) users.)printenv | sort > baseline/env.txt
uname -a > baseline/os.txt
echo "$SHELL" >> baseline/os.txt
locale > baseline/locale.txt 2>&1
These are the ground truth for every "works for me" debate in the report.
From baseline/help.txt, scan for:
TODO, FIXME, XXX, <fill in>, Loremexample.com, yourdomain--foo in summary but --FOO in
detailAny hit is at least a medium content issue.
man and completions (if the tool claims to provide them)man <tool> 2>/dev/null | head -50 > baseline/man.txt
<tool> completion bash 2>/dev/null | head -20 > baseline/completion-bash.txt
<tool> completion zsh 2>/dev/null | head -20 > baseline/completion-zsh.txt
A broken completion script is a high-severity functional bug (breaks the user's shell if sourced).
The goal: enumerate every subcommand, every top-level flag, and note which combinations the tool advertises as supported.
From baseline/help.txt, build:
## Surface map
### Top-level flags
- `--verbose` / `-v`
- `--config <path>` / `-c`
- `--format <json|plain>` / `-f`
- `--no-color`
- `--dry-run`
- `--help` / `-h`
- `--version`
### Subcommands
- `init` — …
- `sync` — …
- `status` — …
- `config get|set|unset` — nested subcommands
- `help <cmd>`
### Positional patterns
- `<tool> <file>` (top-level)
- `<tool> sync <source> <dest>` (sync subcommand)
For nested subcommands, recurse: <tool> config --help,
<tool> config get --help, etc. Cap the depth at what the user
asked for.
Parking lot issues spotted during mapping (record, don't deep- dive yet):
--help references a flag the top-level help
doesn't document-f listed as --force
somewhere, --file somewhere else)For each subcommand, run the "happiest possible" invocation and capture all three streams.
CASE=/tmp/qa-cli-session/cases/case-001
mkdir -p "$CASE"
CMD=($TOOL sync ./fixtures/a.txt ./fixtures/b.txt)
printf '%q ' "${CMD[@]}" > "$CASE/cmd"
"${CMD[@]}" > "$CASE/stdout" 2> "$CASE/stderr"
echo $? > "$CASE/exit"
Check:
0 on success? (Anything else on the golden path is a
high.)ls -la before
and after.Repeat for every subcommand within scope. One case per invocation.
The CLI's error surface tells you how well it respects its users.
$TOOL --nonexistent-flag # Expect: exit!=0, useful stderr
$TOOL -Z # Expect: same
$TOOL --config # Missing value for flag expecting one
$TOOL --format banana # Invalid enum value
$TOOL sync # Missing required positional
$TOOL sync a b c d # Too many positionals
$TOOL --verbose --quiet # Conflicting flags (if both exist)
For each, capture stdout/stderr/exit code. Red flags:
0 on bad input → critical>/dev/null 2>errors.log pattern)See '<tool> --help' hint → mediumCreate fixtures that exercise the boundary:
# Empty file
: > fixtures/empty.txt
# File that doesn't exist
$TOOL process fixtures/does-not-exist.txt
# File with no read permission (if testing as a non-root tool)
echo hi > fixtures/denied.txt && chmod 000 fixtures/denied.txt
$TOOL process fixtures/denied.txt ; chmod 644 fixtures/denied.txt
# Binary data where the tool expects text
head -c 4096 /dev/urandom > fixtures/binary.bin
$TOOL process fixtures/binary.bin
# Input that looks like a flag
echo 'hello' > fixtures/--foo
$TOOL process fixtures/--foo # should require `--` separator or quote
# Very large input (streaming test — see Phase 4)
Each case's shape matters: a FileNotFoundError traceback is a bug;
error: no such file: 'fixtures/does-not-exist.txt' on stderr with
exit !=0 is correct.
HOME unset: env -u HOME $TOOLPATH stripped: env PATH= $TOOL (does the tool shell out? Should
fail gracefully)LANG=C: env LANG=C $TOOL --help — any mojibake?This is the phase CLIs most often fail.
For the golden-path commands, confirm:
$TOOL command > /dev/null # you should still see status / progress on stderr
$TOOL command 2> /dev/null # you should still see the data on stdout
$TOOL command > out.txt 2> err.txt # the split is clean
Rules:
$(tool foo)) lives here.>out.txt.--verbose adds chatty output, it goes to stderr unless it IS
the data.If the tool advertises --format json / --json:
$TOOL list --format json | jq . # must parse
$TOOL list --format json > out.json ; wc -l out.json # size sanity
A tool that says it outputs JSON but produces invalid JSON is critical. Common bugs:
\rsprint() debug statement left in → malformed JSONInvoke three ways:
$TOOL list # attached TTY → color OK
$TOOL list | cat # piped → color must disappear (isatty check)
NO_COLOR=1 $TOOL list # NO_COLOR spec → color must disappear
$TOOL list --no-color # explicit flag → color must disappear
Grep each capture for ANSI codes:
grep -cP $'\x1b\\[' cases/case-NNN/stdout
NO_COLOR=1 ignored = high (spec violation,
https://no-color.org)--no-color present but non-functional = highIf --verbose / -v / -vv / --quiet exist:
$TOOL command 2> stderr-normal
$TOOL command --verbose 2> stderr-verbose
$TOOL command --quiet 2> stderr-quiet
wc -l stderr-* # expect: quiet < normal < verbose
--quiet that still emits warnings, or --verbose that adds
nothing, are medium bugs.
If the tool advertises stdin support (<tool> - or implicit):
echo 'hello' | $TOOL process - # piped
$TOOL process - < fixtures/a.txt # redirected
$TOOL process # no stdin and no args — should fail fast,
# NOT hang waiting for tty input in a
# non-interactive context (common bug)
$TOOL process </dev/null # closed stdin
Hangs when stdin is closed = critical (breaks CI, cron, any piped use).
$TOOL process ~/file.txt # tilde expansion works? (shell-level, but
# check the tool resolves correctly)
$TOOL process './relative/path.txt'
$TOOL process '/absolute/path.txt'
$TOOL process 'fixtures/*.txt' # does the tool expand globs or rely on
# the shell? Document which.
$TOOL process 'path with spaces.txt' # quoting
$TOOL export --out result.txt # --out flag creates / overwrites?
$TOOL export --out /no-permission.txt # clear error, not a traceback
$TOOL export --out /tmp/a/b/c/d.txt # does it mkdir -p or fail?
Document what it does; note surprising choices.
Grep --help for mentioned env vars (XYZ_CONFIG, XYZ_TOKEN).
For each:
env -u XYZ_CONFIG $TOOL ... # what happens without it?
XYZ_CONFIG=/non/existent $TOOL ... # bad path?
XYZ_CONFIG='' $TOOL ... # empty string?
If the tool reads a config (~/.config/<tool>/config.yaml,
./.toolrc, etc.):
The standard order is flag > env > config > default. Test it explicitly:
# Set config to value A, env to B, flag to C; confirm C wins.
printf 'setting: A\n' > fixtures/cfg.yaml
SETTING=B $TOOL --config fixtures/cfg.yaml --setting C ...
# Repeat with flag omitted (env B wins?), flag+env omitted (config A wins?)
Wrong precedence is high — it's invisible to users until they're deep in a bug hunt.
Only applicable if the tool runs for any non-trivial duration.
# Kick off a long operation, then SIGINT it
$TOOL sync --large-dataset &
PID=$!
sleep 2
kill -INT $PID
wait $PID
echo "exit=$?"
Check:
128 + SIGINT) by convention, or at least
non-zero?ls /tmp/qa-cli-session and a ls -la in the target dir before
and after.$TOOL sync --large-dataset &
kill -TERM $!
wait $!
Same checks. Tools that ignore SIGTERM and keep running are critical in orchestrated environments (Kubernetes, systemd).
If the tool uses a lockfile (.tool.lock):
Open /tmp/qa-cli-session/notes.md. For each observation:
references/issue-taxonomy-cli.mdThe "Top 3 things to fix" section at the top of the report is the
only thing a busy maintainer will read. Pick the three with the
highest user impact — usually (a) anything that exits 0 on a real
error, (b) anything that breaks piping, (c) anything that ignores
NO_COLOR / --no-color.
Use templates/qa-report-template-cli.md. Fill every field. If a
section is empty, write "None observed" — don't delete it.
Store evidence excerpts inline (3–10 lines each with a fenced code
block). If a full capture is >20 lines, put it in
./qa-reports/artifacts/<session>/ and link it.
Use the category weights in the template. Subtract:
Clamp to [0, 100]. Express as {score}/100. The number is a rough
signal; the issue list is what matters.
# Remove scratch fixtures if they contain sensitive output
rm -rf /tmp/qa-cli-session
# If the tool left side effects (config files, lockfiles, temp dirs),
# note each one in the report's "Residue" section and either clean it
# up (if clearly a test artifact) or flag it for the user.
Confirm you did not create any files outside /tmp/qa-cli-session
and ./qa-reports/ during the run.
references/issue-taxonomy-cli.md — categories and severity
guidance for every kind of CLI bugreferences/cli-driver-reference.md — shell patterns for
capturing streams, measuring duration, exercising TTY detectionreferences/cli-conventions.md — POSIX / GNU / isatty / NO_COLOR /
exit-code conventions to compare observed behavior againsttemplates/qa-report-template-cli.md — the output formattesting
Cited research briefs
development
Daily coding habit prompts JP
development
React core deep-dive JP
data-ai
Copy last agent reply