plugins/build/skills/check-bash-script/SKILL.md
Audits a Bash 4.0+ script against 34 deterministic rules emitted as JSON envelopes (shebang, `set -euo pipefail`, header comment, main + sourceable guard, readonly constants, mktemp+trap pairing, shellcheck rule set SC2086 / SC2046 / SC2068 / SC2154 / SC2155 / SC2006 / SC2010 / SC2012 / SC2045 / SC2013 / SC2162 / SC2038 / SC2164 / SC2002 / SC2294, shfmt format compliance, eval / GNU-flag / `/tmp/` literal flagging, secret patterns, line count, naming conventions, command preflight) plus six judgment dimensions (output discipline, input validation, performance intent, function design, commenting intent, cross-entity collision). Use when the user wants to "audit a bash script", "lint a bash script", or "run shellcheck on this". Not for POSIX `sh` portability — refused at scope. Not for Python scripts — route to `/build:check-python-script`.
npx skillsauth add bcbeidel/wos check-bash-scriptInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Audit a Bash 4.0+ script for structural soundness, safety, idiom discipline, and adherence to the project's Bash conventions. The rubric — what makes a script load-bearing, the anatomy template, the patterns that work — lives in bash-script-best-practices.md. This skill is the audit workflow; the principles doc is what it audits against.
The audit runs in three tiers. Tier-1 is deterministic — nine
detection scripts run per target and emit JSON envelopes, leaning hard
on shellcheck and shfmt for the heavy lifting. Each envelope is
self-sufficient: {rule_id, overall_status, findings[]} where every
finding includes a non-empty recommended_changes recipe. Tier-2
is judgment — read each references/check-*.md file (six surviving
LLM-judged dimensions) and evaluate the artifact against it directly;
produce one finding per dimension at most, default-closed when
borderline. Tier-3 is cross-entity collision detection — when the
scope holds multiple scripts in the same directory, check for
duplicated logic the maintainer could consolidate.
Read-only by default. The opt-in repair loop applies fixes only after
per-finding confirmation. Each script's recommended_changes field is
the canonical repair guidance — no enrichment needed.
Also fires when the user phrases the request as:
Read $ARGUMENTS:
.sh file (or extensionless executable) —
audit that file..sh at the
top level and every extensionless file with a bash shebang. Do not
recurse into subdirectories — Bash scripts are top-level by
convention, and recursion pulls in helpers / libraries / vendored
shell that the audit does not model.Confirm the scope aloud before proceeding (one line: "Auditing <path> (N scripts found)").
Run nine detection scripts in sequence against each target. Each emits
a JSON envelope (single-rule scripts) or a JSON array of envelopes
(multi-rule scripts) to stdout. Each exits 0 on clean / warn /
inapplicable, 1 on any overall_status: fail, 64 on argument
error, 69 on missing required dependency. Do not stop on any
script's exit 1 — all nine contribute findings to the merged report.
SCRIPTS="${SKILL_DIR}/scripts" # resolved by Claude at invocation
TARGETS="$ARGUMENTS"
python3 "$SCRIPTS/check_secrets.py" $TARGETS # 1 rule: secret (FAIL — excludes from Tier-2)
python3 "$SCRIPTS/check_structure.py" $TARGETS # 7 rules: shebang/strict-mode (FAIL); header/main/main-guard/readonly/mktemp-trap (WARN)
python3 "$SCRIPTS/check_idioms.py" $TARGETS # 3 rules: bracket-test, printf-over-echo, var-braces (WARN)
python3 "$SCRIPTS/check_safety.py" $TARGETS # 3 rules: eval, tmp-literal (FAIL); gnu-flags (WARN)
python3 "$SCRIPTS/check_shellcheck.py" $TARGETS # 15 rules wrapping shellcheck SC codes; inapplicable if shellcheck absent
python3 "$SCRIPTS/check_naming.py" $TARGETS # 1 rule: naming (WARN — case, shadowing, weak names)
python3 "$SCRIPTS/check_preflight.py" $TARGETS # 1 rule: preflight (WARN — external commands missing command -v)
bash "$SCRIPTS/check_shfmt.sh" $TARGETS # 1 rule: format (WARN); inapplicable if shfmt absent
bash "$SCRIPTS/check_size.sh" $TARGETS # 2 rules: size, line-length (WARN)
The scripts live next to SKILL.md under scripts/ and are executable.
Claude resolves ${SKILL_DIR} from the skill's own directory at
invocation time — hooks use $CLAUDE_PLUGIN_ROOT, but skills do not.
JSON envelope shape — every script emits this (single envelope or
array of envelopes; documented in assets/output-example.json):
{
"rule_id": "shebang",
"overall_status": "pass | warn | fail | inapplicable",
"findings": [
{
"status": "warn | fail",
"location": {"line": 1, "context": "<excerpt>"},
"reasoning": "<≤2 sentences>",
"recommended_changes": "<canonical repair recipe>"
}
]
}
recommended_changes is required on every finding. Scripts embed
their own canonical repair recipes — no LLM enrichment needed.
Single-artifact-per-rule discipline. Every rule lives as exactly
one artifact: a script (when ≥70% mechanically detectable) or a
markdown file under references/check-*.md (when judgment-driven), but
never both. Tier-1 scripts cover 34 rules (the deterministic set);
references/check-*.md covers the 6 judgment dimensions.
Script-to-rules map (rule_ids each script emits):
| Script | rule_ids emitted |
|---|---|
| check_secrets.py | secret |
| check_structure.py | shebang, strict-mode, header-comment, main-fn, main-guard, readonly-config, mktemp-trap-pairing |
| check_idioms.py | bracket-test, printf-over-echo, var-braces |
| check_safety.py | eval, gnu-flags, tmp-literal |
| check_shellcheck.py | unquoted-variable-expansion, unquoted-command-substitution, unquoted-args-expansion, eval-of-array, ls-grep-parsing, ls-instead-of-find, iterating-ls-output, referenced-but-not-assigned, unscoped-function-variable, backtick-command-substitution, for-line-in-cat, read-without-r, find-xargs-without-print0, cd-without-exit-handling, useless-cat |
| check_naming.py | naming (~70% coverage; gaps documented in script docstring) |
| check_preflight.py | preflight (~75% coverage; gaps documented in script docstring) |
| check_shfmt.sh | format |
| check_size.sh | size, line-length |
FAIL findings that exclude the file from Tier-2 (evaluation is not useful until these are resolved):
secret finding from check_secrets.pyshebang FAIL from check_structure.py (#!/bin/sh or other
non-bash — the file is not in scope)eval or tmp-literal FAIL from check_safety.pycheck_shellcheck.py for unquoted-variable-expansion,
unquoted-command-substitution, unquoted-args-expansion,
eval-of-array, or the ls-* parse family — correctness bugs that
bias every judgment dimension toward false negativesFAIL findings that do NOT exclude from Tier-2: strict-mode FAIL
leaves a parseable bash script that judgment can still evaluate
productively.
WARN / inapplicable findings never exclude. They surface in the report alongside Tier-2 findings.
For each file that passed the Tier-2-exclusion filter, read the six
surviving references/check-*.md files and judge the artifact
against each. Six dimensions, run together as a single coherent
rubric pass — no trigger gating, no per-rule subagent dispatch. A
dimension that does not apply (e.g., D2 Input Validation on a script
with no destructive operations) returns inapplicable silently
(empty findings, no surfacing in the report).
The six dimensions:
| File | Dimension | What it judges |
|---|---|---|
| check-output-discipline.md | D1 Output Discipline | Data to stdout / chatter to stderr; die helper present and used; meaningful non-zero exit codes on every failure path |
| check-input-validation.md | D2 Input Validation & Destructive-Op Safety | Inputs validated before destructive work; ${var:?} for required, ${var:-} for optional; --dry-run for deletes/overwrites; no secrets in argv; -- before untrusted args |
| check-performance-intent.md | D4 Performance Intent | Bash builtins over forking in hot loops; parameter expansion over basename/dirname/sed for simple string ops |
| check-function-design.md | D5 Function Design | main() reads as orchestrator (sequence of named operations); function bodies short and single-purpose |
| check-commenting-intent.md | D7 Commenting Intent | Header comment names purpose / usage / dependencies; inline comments explain why, not what; TODOs carry an owner or ticket |
| check-cross-entity-collision.md | T3 Cross-Entity Collision | Fires only when the audit scope holds multiple files: shared die/usage/preflight helpers candidate for extraction |
(D3 Subprocess & Tool Hygiene and D6 Naming were dissolved into Tier-1
scripts — check_preflight.py and check_naming.py respectively. The
mechanical portion of those rules is now scripted; the judgment portion
folds into D5 Function Design and the canonical convention doc
bash-script-best-practices.md.)
Evaluator policy: see check-skill-pattern.md §Evaluator policy. Read all six files first, then evaluate each in turn against the same artifact.
Skill-specific FAIL escalations. Beyond the canonical WARN floor, escalate to FAIL only for safety concerns Tier-1 missed (e.g., a hand-rolled SQL-shaped string in shell).
When the scope holds multiple scripts (directory walk, step 1), check for structural duplication the maintainer could consolidate:
die /
usage / preflight helper functions (candidate for a shared
_helpers.sh sourced by each script)main argument-parsing blocks across
scripts (the same case "$1" in -h|--help) usage; exit 0 ;; esac
pattern repeated)command -v preflight blocks for the same dependency setReport collisions as INFO findings — they are maintainer guidance, not failures. Single-script scope skips this tier.
Parse each script's stdout as JSON (single envelope or array). Merge
all findings (Tier-1 from JSON envelopes; Tier-2 from your judgment
pass) into a unified table sorted by severity (fail > warn >
inapplicable), then by file path. Deduplicate exact-match findings
at merge time — shellcheck occasionally emits the same SC code at
multiple lines.
<SEVERITY> <path> — <rule_id>: <one-line reasoning>
Recommendation: <recommended_changes excerpt>
For Tier-1 findings, recommended_changes comes from the script's
embedded recipe constant (canonical, no enrichment). For Tier-2
findings, you author it inline grounded in the rule's body.
Summary line at top and bottom: N fail, N warn across N scripts. If
any file was excluded from Tier-2, name it and the exclusion-trigger
finding.
After presenting findings, ask:
"Apply fixes? Enter
y(all),n(skip), or comma-separated finding numbers."
For each selected finding:
recommended_changes field — that is the
canonical recipe for the fix.Per-change confirmation is non-negotiable. Bulk application removes the user's ability to review individual edits.
inapplicable
silently. Conditional dimensions produce inconsistent rubrics
across runs and make findings non-comparable.pass envelope by reading the artifact again for that rule;
trust the script. The judgment pass is for the six dimensions
that have NO script.shellcheck or
shfmt is absent, the overall_status: inapplicable envelope is
the user's signal that coverage is reduced. Surfacing it is the
contract; hiding it silently under-audits.recommended_changes — each script
embeds the canonical repair recipe sourced from the deleted
rule-*.md body content. Don't paraphrase or expand; copy through.inapplicable
silently.{rule_id, overall_status, findings}) or a JSON array
of envelopes (multi-rule scripts). The recommended_changes field
on each finding is canonical — copy it through to the report.shellcheck or shfmt is absent, the wrapper emits an
envelope with overall_status: inapplicable and exits 0. Other
scripts continue.$ARGUMENTS — the scope the user named
is the only scope.sh scripts — out of scope; the principles doc
this skill enforces is bash-only. The shebang FAIL from
check_structure.py is the structural refusal.git checkout -- <path> or the
editor's undo.Chainable to: /build:build-bash-script (rebuild from scratch
after flagged repairs if the repair loop surfaces structural issues
bigger than point fixes).
tools
Use when the user wants to "audit a help skill", "review my plugin index", or "verify my help-skill is up to date". Audits a plugins/<plugin>/skills/help/SKILL.md against the help-skill rubric — coverage, freshness, frontmatter fidelity, plus five judgment dimensions and a trigger-collision check.
tools
Use when the user wants to "scaffold a help skill", "add a /<plugin>:help command", or "build a plugin index skill", or wants to give a plugin an orientation surface that lists its skills and common workflows. Produces a SKILL.md at plugins/<plugin>/skills/help/SKILL.md.
tools
Audits pair-level integrity of a primitive-pair (the artifact `/build:build-skill-pair` produces) by walking the four required artifact slots — principles doc, `build-<primitive>/SKILL.md`, `check-<primitive>/SKILL.md`, and the `primitive-routing.md` registration — and reports cross-artifact issues a per-SKILL.md checker cannot see: missing principles doc, divergent principles paths between halves, absent routing registration, missing build→check handoff. Per-half structural compliance with the unified pattern (`check-skill-pattern.md`) is delegated to `plugins/build/_shared/scripts/check_skill_pattern.py`. Use when the user wants to "audit a skill pair", "review a primitive pair", or "validate the skill pair for X". Not for auditing a single SKILL.md — route to `/build:check-skill`. Not for re-distilling a stale principles doc — route to `/build:build-skill-pair`.
testing
Audit a root-level resolver — verify AGENTS.md pointer, managed-region integrity, filing-table coverage against disk, context-table actionability, and trigger-eval pass rate. Use when the user wants to "audit a resolver", "validate routing table", or "find dark capabilities".