marketplace/bundles/plan-marshall/skills/tools-input-validation/SKILL.md
Shared input validation module for plan-marshall scripts — validates plan IDs, file paths, enums, and skill notation
Role: Shared Python module providing input validation functions for plan-marshall scripts. Prevents path traversal, invalid plan IDs, and malformed inputs from reaching filesystem operations.
Execution mode: Library module; import validators as documented in usage examples.
Prohibited actions:
- [...] validate_plan_id()
- [...] validate_relative_path()

Constraints:
- [...] ^[a-z][a-z0-9-]*$
- [...] (add_<id>_arg(parser)) that wire the validator into argparse type= so malformed input is rejected at the script boundary

Import input_validation module in Python scripts that:
- [...] plan_id arguments for path construction

Location: scripts/input_validation.py
Raising Validators (return validated value or raise ValueError)

1. validate_plan_id(plan_id: str) -> str
   - Pattern: ^[a-z][a-z0-9-]*$
   - Input: plan_id string
   - Raises: ValueError if invalid

2. validate_relative_path(file_path: str) -> str
   - Rejects .. traversal
   - Input: file_path string
   - Raises: ValueError if empty, absolute, or contains traversal

3. validate_enum(value: str, allowed: list, label: str) -> str
   - Input: value, list of allowed values, label for error message
   - Raises: ValueError if not in allowed list

4. validate_skill_notation(skill: str) -> str
   - Expects bundle:skill format
   - Input: skill notation string
   - Raises: ValueError if not in bundle:skill format

5. validate_script_notation(notation: str) -> str
   - Expects bundle:skill:script format (3-part executor notation)
   - Input: notation string
   - Raises: ValueError if not in bundle:skill:script format

Bool Companions (drop-in replacements for existing patterns)

6. is_valid_plan_id(plan_id: str) -> bool
   - Input: plan_id string
   - Returns: True if valid, False otherwise

7. is_valid_relative_path(file_path: str) -> bool
   - Input: file_path string
   - Returns: True if valid, False otherwise

These helpers detect lesson-ID-shaped tokens embedded in arbitrary prose
(typically task titles and descriptions) and verify them against the live
manage-lessons inventory. They are the single source of truth for
"is this a real lesson ID?" checks across the bundle, used by
manage-tasks (at-write-time validation) and plan-doctor (post-hoc
plan diagnostics).
All three helpers reuse the canonical LESSON_ID_RE constant — no new
regex literal is introduced. The embedded scanner uses an unanchored
derivative (_LESSON_ID_EMBEDDED_RE) of the same pattern with non-digit
boundary lookarounds so adjacent digits don't bleed into a match.
The canonical regex shape is asserted
against actual repo data at runtime — not only in the test suite. On
first invocation per process, scan_lesson_id_tokens and
verify_lesson_ids_exist call verify_lesson_id_regex_against_inventory,
which:
- Invokes manage-lessons list and parses the TOON inventory.
- If at least one live ID matches LESSON_ID_RE, caches success for the rest of the process.
- If the inventory is empty, writes a notice to stderr and treats the anchor as a no-op so greenfield use isn't a hard error.
- If live IDs exist but none match LESSON_ID_RE, raises LessonRegexAnchoringError with the regex pattern and a sample of the unmatched IDs. The cache is NOT set, so every subsequent scanner call keeps failing until the regex (or the IDs) is corrected. This failure mode exists because regex-vs-inventory drift can otherwise produce a silent "no IDs match anything" false-clean signal.

8. scan_lesson_id_tokens(text: str) -> list[str]
   - Returns the lesson-ID-shaped tokens found in text.
   - Input: text (e.g., a task title + description).
   - Raises: LessonRegexAnchoringError, LessonInventoryUnavailable via the first-use anchor check.

9. verify_lesson_ids_exist(tokens: Iterable[str]) -> dict[str, bool]
   - Returns {token: present} for each token by lookup against the live manage-lessons inventory. Duplicates are de-duplicated.
   - Values: True if found in the live inventory, False otherwise.
   - Raises: LessonInventoryUnavailable when the subprocess fails; it NEVER silently returns "all present". Also LessonRegexAnchoringError via the first-use anchor check.

10. verify_lesson_id_regex_against_inventory() -> None
   - Returns: None on success or empty-inventory no-op.
   - Raises: LessonRegexAnchoringError when live IDs exist but none match LESSON_ID_RE; LessonInventoryUnavailable on subprocess failure.

Exception types:
- LessonInventoryUnavailable(RuntimeError): raised whenever manage-lessons list cannot be invoked, exits non-zero, or returns output the TOON parser rejects. Callers MUST surface this; silently degrading to "all present" defeats the entire purpose of the scanner.
- LessonRegexAnchoringError(RuntimeError): raised when the live inventory contains IDs but none of them match LESSON_ID_RE. Carries regex (the pattern that failed to anchor) and sample_ids (a slice of the unmatched IDs) for diagnostics.

```python
from input_validation import (
    LessonInventoryUnavailable,
    LessonRegexAnchoringError,
    scan_lesson_id_tokens,
    verify_lesson_ids_exist,
)

text = "Per lesson 2026-04-29-10-001, anchor regex against live inventory."
try:
    tokens = scan_lesson_id_tokens(text)
    presence = verify_lesson_ids_exist(tokens)
    missing = [tok for tok, ok in presence.items() if not ok]
    if missing:
        raise ValueError(f"Unresolved lesson IDs: {missing}")
except LessonRegexAnchoringError:
    # Hard fail: the regex shape has drifted from the inventory.
    raise
except LessonInventoryUnavailable:
    # Hard fail: inventory unreachable; do not silently pretend
    # everything is present.
    raise
```

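
The first-use anchor check described earlier can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: `verify_anchor` and the injectable `load_ids` callable are hypothetical stand-ins for `verify_lesson_id_regex_against_inventory()` and the `manage-lessons list` subprocess, and leaving the cache unset on an empty inventory is an assumption as well.

```python
import re
import sys
from typing import Callable, Iterable

LESSON_ID_RE = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{2}-[0-9]+$")


class LessonRegexAnchoringError(RuntimeError):
    """Carries the pattern that failed to anchor and a sample of unmatched IDs."""

    def __init__(self, regex: str, sample_ids: list[str]) -> None:
        super().__init__(f"no live lesson ID matches {regex!r}; sample: {sample_ids}")
        self.regex = regex
        self.sample_ids = sample_ids


_anchor_verified = False  # per-process cache, set only on success


def verify_anchor(load_ids: Callable[[], Iterable[str]]) -> None:
    """Hypothetical sketch of the anchor check; load_ids stands in for the inventory call."""
    global _anchor_verified
    if _anchor_verified:
        return
    ids = list(load_ids())
    if not ids:
        # Greenfield repository: nothing to anchor against, so do not fail.
        print("notice: lesson inventory is empty; anchor check skipped", file=sys.stderr)
        return
    if any(LESSON_ID_RE.fullmatch(lesson_id) for lesson_id in ids):
        _anchor_verified = True
        return
    # Cache stays unset, so every later call fails until the drift is fixed.
    raise LessonRegexAnchoringError(LESSON_ID_RE.pattern, ids[:5])
```

Passing the inventory loader in as a callable keeps the sketch testable without a subprocess; the real helper is documented as invoking `manage-lessons list` itself.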
status: error
error: validation_failed
message: Plan ID contains invalid characters: bad!!id
| Identifier | Regex | Builder |
|------------|-------|---------|
| plan_id | ^[a-z][a-z0-9-]*$ | add_plan_id_arg |
| lesson_id | ^[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{2}-[0-9]+$ | add_lesson_id_arg |
| session_id | ^[A-Za-z0-9_-]{1,128}$ | add_session_id_arg |
| task_number | ^[0-9]+$ | add_task_number_arg |
| task_id | ^TASK-[0-9]+$ | add_task_id_arg |
| component | ^[a-z0-9-]+(:[a-z0-9-]+)*$ | add_component_arg |
| hash_id | ^[a-f0-9]{4,}$ | add_hash_id_arg |
| phase_id | ^[1-6]-(init\|refine\|outline\|plan\|execute\|finalize)$ | add_phase_arg |
| field_name | ^[a-z][a-z0-9_]*$ | add_field_arg |
| module_name | ^[a-z][a-z0-9_-]*$ | add_module_arg |
| package_name | ^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)*$ | add_package_arg |
| domain_name | ^[a-z][a-z0-9-]*$ | add_domain_arg |
| resource_name | ^[a-zA-Z0-9_-]+$ | add_name_arg |
The constants are exported from input_validation.py as PLAN_ID_RE, LESSON_ID_RE, etc., so consumers can reuse the canonical regex without re-deriving it. The add_<id>_arg(parser) builders wire type=validate_<id> into argparse so malformed input is rejected at the CLI boundary; pair them with parse_args_with_toon_errors() to centralise the status: error / error: invalid_<field> output path.
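
As a rough illustration of how a builder might wire a validator into argparse, consider the sketch below. The function bodies, the `--plan-id` flag spelling, and `required=True` are assumptions; only the names `PLAN_ID_RE`, `validate_plan_id`, and `add_plan_id_arg` and the regex come from this document. The sketch relies on standard argparse behaviour: a `type=` callable that raises `ValueError` makes the parser report an error and exit with status 2.

```python
import argparse
import re

PLAN_ID_RE = re.compile(r"^[a-z][a-z0-9-]*$")


def validate_plan_id(plan_id: str) -> str:
    # fullmatch also rejects a trailing newline that "$" alone would tolerate.
    if not PLAN_ID_RE.fullmatch(plan_id):
        raise ValueError(f"invalid plan_id: {plan_id!r}")
    return plan_id


def add_plan_id_arg(parser: argparse.ArgumentParser) -> None:
    # Assumed flag name and required-ness; the real builder may differ.
    parser.add_argument("--plan-id", required=True, type=validate_plan_id)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="example-script")
    add_plan_id_arg(parser)
    return parser
```

The sketch calls plain `parse_args()`; the document's `parse_args_with_toon_errors()` would replace argparse's default error text with the TOON error output, and its signature is deliberately not guessed here.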
A handful of canonical regexes intentionally reject inputs that pre-canonical scripts accepted. These are current, deliberate behaviours documented in standards/identifier-validation-audit.md:
- COMPONENT_RE rejects uppercase / path-shaped Sonar component keys (e.g. org:src/Main.java) at the CLI boundary.
- --task-number validates against TASK_NUMBER_RE then coerces to int, so downstream consumers still receive an int.
- --lesson-id is repeatable (action='append'); each element is validated independently and any malformed value fails the whole invocation.

When adding a new identifier-shaped flag, use the matching add_<id>_arg(parser) builder and add a row to the registry table above. Pair the builders with parse_args_with_toon_errors() for the canonical status: error / error: invalid_<field> output path.
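
The `--task-number` coercion and the repeatable `--lesson-id` flag could be approximated as below. This is a sketch under assumptions: the validator bodies and the `build_parser` helper are hypothetical, while the regexes and flag names are taken from this document. With `action="append"`, argparse applies the `type=` callable to each occurrence separately, so one malformed value aborts the whole parse.

```python
import argparse
import re

TASK_NUMBER_RE = re.compile(r"^[0-9]+$")
LESSON_ID_RE = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{2}-[0-9]+$")


def validate_task_number(raw: str) -> int:
    # Validate the textual shape first, then coerce for downstream consumers.
    if not TASK_NUMBER_RE.fullmatch(raw):
        raise ValueError(f"invalid task number: {raw!r}")
    return int(raw)


def validate_lesson_id(raw: str) -> str:
    if not LESSON_ID_RE.fullmatch(raw):
        raise ValueError(f"invalid lesson id: {raw!r}")
    return raw


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="example-script")
    parser.add_argument("--task-number", type=validate_task_number)
    # Repeatable flag: every occurrence is validated on its own.
    parser.add_argument("--lesson-id", action="append", type=validate_lesson_id)
    return parser
```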

```python
import sys

from input_validation import is_valid_plan_id, validate_enum

# args comes from the calling script's argparse parser; output_error is that
# script's own error-reporting helper.

# Plan ID validation (alphanumeric + hyphens, 1-64 chars)
if not is_valid_plan_id(args.plan_id):
    output_error('invalid_plan_id', f'Invalid plan_id format: {args.plan_id}')
    sys.exit(1)

# Enum validation
validate_enum(args.certainty, ['CERTAIN_INCLUDE', 'CERTAIN_EXCLUDE', 'UNCERTAIN'], 'certainty')
```

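
For readers who want a feel for what the enum check does, a minimal sketch follows. The signature matches the one documented above; the body and the error message wording are assumptions, not the module's actual code.

```python
def validate_enum(value: str, allowed: list, label: str) -> str:
    """Return value unchanged if it is one of allowed, else raise ValueError."""
    if value not in allowed:
        choices = ", ".join(str(item) for item in allowed)
        raise ValueError(f"Invalid {label}: {value!r} (allowed: {choices})")
    return value
```

Returning the validated value is what lets the raising style chain into an assignment, for example `status = validate_enum(raw, ['pending', 'done', 'blocked'], 'status')`.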
Location: scripts/schema_validation.py
Lightweight schema validation for plan-marshall JSON storage files. Returns a list of error strings (empty list = valid). No external dependencies.
1. validate_status(data: Any) -> list[str]
   - Validates status.json: requires plan_id (str), current_phase (str), phases (list of dicts with name and status)

2. validate_references(data: Any) -> list[str]
   - Validates references.json: requires plan_id (str)

3. validate_task(data: Any) -> list[str]
   - Validates TASK-*.json: requires task_id (str), title (str), status (str), steps (list of dicts with id and title)

4. validate_assessment(data: Any) -> list[str]
   - Requires hash_id (str), file_path (str), certainty (str, one of CERTAIN_INCLUDE/CERTAIN_EXCLUDE/UNCERTAIN), confidence (int or float)

5. validate_finding(data: Any) -> list[str]
   - Requires hash_id (str), type (str), severity (str), message (str)

```python
import sys

from schema_validation import validate_status, validate_task

# data is the parsed JSON content of status.json; output_error is the
# calling script's own error-reporting helper.
errors = validate_status(data)
if errors:
    output_error('schema_violation', '; '.join(errors))
    sys.exit(1)
```

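
The "list of error strings, empty list means valid" contract can be illustrated with a small sketch for the status.json shape. The function name `validate_status_sketch` and the exact error message wording are hypothetical; only the required fields come from the description above.

```python
from typing import Any


def validate_status_sketch(data: Any) -> list[str]:
    """Hypothetical stand-in for validate_status: collect errors, never raise."""
    if not isinstance(data, dict):
        return ["status.json: expected a JSON object"]
    errors: list[str] = []
    for key in ("plan_id", "current_phase"):
        if not isinstance(data.get(key), str):
            errors.append(f"{key}: expected str")
    phases = data.get("phases")
    if not isinstance(phases, list):
        errors.append("phases: expected list")
        return errors
    for index, phase in enumerate(phases):
        if not isinstance(phase, dict):
            errors.append(f"phases[{index}]: expected object")
            continue
        for key in ("name", "status"):
            if key not in phase:
                errors.append(f"phases[{index}].{key}: missing")
    return errors
```

Collecting every problem before returning lets the caller report all violations in one message, as the `'; '.join(errors)` usage does.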

```python
import sys

# Drop-in replacement for existing validate_plan_id pattern
from input_validation import is_valid_plan_id

if not is_valid_plan_id(args.plan_id):
    print(f'Error: Invalid plan_id format: {args.plan_id}', file=sys.stderr)
    sys.exit(1)

# New code: raising style with chaining
from input_validation import validate_plan_id, validate_relative_path

plan_id = validate_plan_id(args.plan_id)  # raises ValueError
file_path = validate_relative_path(args.file)  # raises ValueError

# Enum validation
from input_validation import validate_enum

validate_enum(args.status, ['pending', 'done', 'blocked'], 'status')
```

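
The relationship between a raising validator and its bool companion can be sketched as below, using the relative-path check as the example. The bodies are assumptions written for illustration: the sketch treats both `/` and `\` as separators and does not handle Windows drive letters, which the real module may treat differently.

```python
from pathlib import PurePosixPath


def validate_relative_path(file_path: str) -> str:
    """Assumed behaviour: reject empty, absolute, and ..-traversing paths."""
    if not file_path:
        raise ValueError("file_path is empty")
    # Normalise backslashes so "..\\x" is inspected like "../x".
    path = PurePosixPath(file_path.replace("\\", "/"))
    if path.is_absolute():
        raise ValueError(f"file_path must be relative: {file_path!r}")
    if ".." in path.parts:
        raise ValueError(f"file_path contains traversal: {file_path!r}")
    return file_path


def is_valid_relative_path(file_path: str) -> bool:
    """Bool companion: same rule, reported as True/False instead of raising."""
    try:
        validate_relative_path(file_path)
    except ValueError:
        return False
    return True
```

Building the bool companion on top of the raising validator keeps a single definition of the rule, so the two styles cannot drift apart.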