.claude/skills/audit-arch/SKILL.md
Audit codebase for adherence to architectural standards, practices, and rules. Use when user says "audit arch", "audit architecture", "check architecture", or "architectural review". Spawns parallel subagents to examine multiple architectural aspects and generates a structured report.
npx skillsauth add talont-org/autoskillit audit-archInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Audit the codebase for adherence to architectural standards and rules.
NEVER:
ALWAYS:
{{AUTOSKILLIT_TEMP}}/audit-arch/arch_audit_{YYYY-MM-DD_HHMMSS}.md (relative to the current working directory)Rule: All state reads must come from the authoritative source (database, API, configuration management). File outputs and caches are write-only. Systems never read files back as the primary source of state.
Audit Strategy — 3-Question SSOT Test:
Before flagging any file read as a P1 violation, ask all three questions. All three must be YES to report a violation:
Non-SSOT Patterns — Do NOT flag these:
Data Flow Tracing (for confirmed SSOT candidates):
Cross-Reference: Findings discovered while auditing other principles that answer YES to all three SSOT gate questions should be escalated and reported here as P1 violations.
Rule: Clear separation between domains with consistent structure.
Common patterns:
Audit Strategy:
Rule: Dependencies flow one direction. Higher layers depend on lower layers, never reverse.
Typical layering:
presentation/ -> depends on business logic, data access
business logic -> depends on data access, infrastructure
data access -> depends on infrastructure only
infrastructure -> depends on nothing project-specific
Also check internal layering: Within a domain, core modules should not import from higher-level modules (handlers, controllers, UI).
Audit Strategy:
Rule: Separate domains/modules must be independent. Feature A cannot import from Feature B directly.
Audit Strategy:
Rule: When using architectural patterns (MVC, repository pattern, state machines, etc.), implementations must follow consistent patterns across the codebase.
Audit Strategy:
Important: If a component bypasses the established pattern to use file-first state loading, that's a P1 violation - report it under P1, not here.
Rule: Shared functionality exists in exactly one location.
Audit Strategy:
Migration Awareness: During migration, shims are acceptable only if they re-export from new location. Full duplicate implementations are violations.
Rule: All data access through designated abstraction layer (repositories, DAOs, services), never direct client usage in business logic.
Audit Strategy:
Rule: No file should exceed 750 lines. Large files should be decomposed.
Audit Strategy:
Rule: When constructing models/objects from dicts/external data, use factory methods or full validation. Never manually select fields in constructor calls.
Rationale: Manual field selection silently drops unlisted fields. Optional fields are especially vulnerable since missing them causes no validation error.
Audit Strategy:
Model(field1=dict["x"], field2=dict.get("y")) patternsSeverity: HIGH - silent data loss breaks downstream consumers
Rule: Classes extending external framework base classes must implement ALL interface methods explicitly. Avoid mixin patterns where method resolution order affects behavior.
Audit Strategy:
NotImplementedError stubs)Severity: CRITICAL - Interface mismatches only surface at runtime in specific code paths
Rule: Direct dependencies should track current major versions. Minor/patch drift is acceptable; lagging a major version is not.
Audit Strategy:
Severity: MEDIUM - stale major versions accumulate migration debt and miss security fixes
Rule: All service dependencies injected into the central DI container (ToolContext) must be expressed as @runtime_checkable Protocol types defined in core/types.py. Each Protocol must have exactly one concrete Default* adapter in its domain layer. The Composition Root (server/_factory.py) is the only location in production code that may instantiate all concrete services simultaneously and wire them into ToolContext. All ToolContext field annotations must use Protocol types, never concrete classes.
Rationale: Protocol-typed fields allow test substitution without import pollution. A single wiring point makes the full dependency graph explicit and auditable. Naming convention (Default*) signals the standard production implementation and distinguishes it from test doubles. Concrete class leakage into field annotations couples the container to implementations rather than contracts.
Audit Strategy:
core/types.py for all @runtime_checkable Protocol definitionsDefault* concrete adapter exists in the codebaseDefault* naming convention (e.g., DefaultTestRunner, not RealTestRunner or TestRunnerImpl)ToolContext field annotations use only Protocol types (not concrete classes)ToolContext is only instantiated in server/_factory.py and test filesDefault* adapter instantiated outside the Composition Root in production codeComposition-Boundary Tiers — NOT Violations:
The Composition Root rule applies only to full DI wiring sessions. The following three tiers are legitimate composition boundaries and MUST NOT be flagged as P12 violations:
Default* adapters as dataclass field defaults. This pattern is intentional: the sub-package is a cohesive unit and the sibling wiring is not "escaping" the Composition Root. Example: pipeline/context.py instantiating DefaultBackgroundSupervisor, DefaultMcpResponseLog as field defaults.cli/ with their own lifecycle that do not receive a ToolContext at all. These modules compose their own lightweight dependencies because they operate outside the server pipeline. Examples: cli/_workspace.py, cli/_cook.py.SomeProtocol | None = None with a deferred-import fallback inside if param is None:. This is a dependency-injection convenience pattern, not a Composition Root bypass. Example: recording.py.Only Default* instantiation that bypasses the Composition Root AND falls outside all three tiers is a P12 violation.
Severity: HIGH — concrete class leakage in field annotations defeats substitutability; Protocol naming gaps make the DI contract non-discoverable
Rule: Architectural constraints that can be expressed as properties of source code structure must be enforced through automated checks (AST rules, static analysis, or property-based tests). Each enforced rule must carry a unique identifier, a rationale, and explicit named exemptions. Any architectural commitment discovered during an audit that has no automated enforcement represents coverage debt.
Rationale: Runtime tests only catch violations on executed code paths. Structural checks catch violations across every source file at check time, with zero execution overhead. Coverage gaps — constraints known but unenforced — accumulate silently.
Audit Strategy:
Severity: MEDIUM for gaps in existing enforcement; HIGH for any newly agreed architectural rule with zero automated enforcement
Rule: All imports of symbols from another sub-package must go through that sub-package's __init__.py gateway — never through a submodule path directly (e.g., from autoskillit.recipe.validator import X is forbidden outside recipe/; use from autoskillit.recipe import X). Sub-package __init__.py files are the public API contract: their __all__ is the surface. Submodule paths are internal implementation details free to be restructured. Facade functions in __init__.py that require deferred function-body imports (to work around circular initialization) must instead be extracted to a dedicated _api.py submodule and re-exported from __init__.py.
Rationale: Submodule-path imports create tight coupling to internal structure, making refactoring unsafe. The gateway pattern isolates external consumers from internal reorganization. When __init__.py contains substantive logic that requires deferred imports to avoid circular initialization, it signals that the function belongs in a real submodule — __init__.py should be a pure re-export facade. The _api.py convention provides a home for cross-cutting orchestration that needs imports from multiple internal submodules.
Audit Strategy:
from autoskillit.X.submodule import patterns in files outside package X__init__.py files declare a complete __all__ matching their re-exports= aliases or re-imports) inside __init__.py files__init__.py, check if it uses deferred (function-body) imports — if yes, flag for extraction to _api.py__init__.py function body contains 3+ deferred imports (strong signal of misplacement)test_no_cross_package_submodule_imports (or equivalent) AST test exists and covers all sub-packagesSeverity: MEDIUM for submodule-path leakage; MEDIUM for deferred-import-heavy __init__.py functions (circular import debt indicator)
These apply across all principles when evaluating architectural decisions:
Implicit correction masks upstream failures — Reject invalid input rather than fixing it. Examples: silent type conversion, default values for required fields, translation layers that never reject, retry loops that swallow errors.
Functions that accept all inputs without rejection are fallbacks, not validators — If a "validator" or "normalizer" never raises an error, it's hiding problems.
Standard patterns that are NOT cross-cutting violations:
ValidationError, custom errors: list[str] returns). These are the standard pattern — a validator that collects all errors before returning is not "hiding" problems.__init__.py files that re-export symbols from private submodules via __all__. These are gateway API contracts, not backward-compatibility shims. Only flag re-exports if the old location still contains a full duplicate implementation.System-derived values belong in code, not external input — Values determined by workflow state (status, IDs, counts) should be set by the system that owns them, not expected from external sources.
No backward compatibility — Flag any code containing these keywords as violations: legacy, deprecated, backward, compat, migration shim, old format, previous version, for compatibility. Dead code should be deleted, not preserved with comments explaining why it exists.
Before reporting any finding, complete the mandatory verification for its category. A finding MUST NOT be reported unless the required verification has been performed.
| Finding category | Required verification before reporting |
|---|---|
| Missing export (symbol absent from public API) | Use the Read tool to open the relevant __init__.py. Verify both __all__ contents and direct re-export statements. Discard the finding if the symbol is present. |
| Missing decorator (e.g., runtime_checkable) | Use the Read tool to open the file containing the class definition. Inspect the 3–5 lines directly preceding the class keyword. Discard the finding if the decorator is present. |
| Enforcement gap (no test for a rule or constant) | Use the Grep tool to search tests/ for the exact symbol or constant name. Discard the finding if a matching test file is found. |
| Code duplication | Use the Read tool to retrieve the full body of each function. Compare the full signature (parameters, return type) and logic step-by-step. Same-named functions at different abstraction levels are NOT duplicates. Discard if logically distinct. |
| Misplaced file or incorrect import path | Use the Bash tool to run git log --oneline -- {file_path} (substituting the actual path). Inspect commit messages for intentional placement decisions. Discard the finding if a commit explains the placement. |
git log --format="%H %s%n%b" -20 in the project root and scan both commit subjects and bodies for evidence that any finding was recently resolved. Only mark a finding STALE if the commit message references the specific file or code pattern involved (not just vague fix/refactor language). Mark confirmed stale findings as STALE (with the resolving commit hash). Do not remove stale findings — include them in the report with a STALE tag so the user can verify.a. HIGH/CRITICAL re-read: For every HIGH or CRITICAL finding, use the Read tool to re-open the exact file and line range cited. Confirm the code exhibits the claimed behavior. Downgrade or remove the finding if the evidence does not hold.
b. Concrete-class check (resource-leak and data-loss findings): For any
finding about a missing aclose, unclosed resource, or silent data loss, read
the concrete class body — not only the Protocol or interface declaration. If the
concrete implementation provides the method or handles the data correctly,
remove or revise the finding to reflect the accurate scope.
c. Enforcement-search confirmation (enforcement-gap findings): For any finding
claiming no test enforces a rule or constant, confirm that a Grep search of
tests/ for the exact symbol was performed during this audit. If it was not,
run it now; discard the finding if a matching test exists.
d. Internal validation note: After completing (a)–(c), record a one-sentence
note for each reviewed finding: either CONFIRMED – <original claim stands> or
REVISED – <corrected claim or reason for removal>. These notes are for
internal quality control and do not appear in the final report.
{{AUTOSKILLIT_TEMP}}/audit-arch/arch_audit_{YYYY-MM-DD_HHMMSS}.md (relative to the current working directory)Launch ONE dedicated subagent AFTER all per-principle findings are consolidated. This subagent's only job is to propose a single new principle.
Constraints — ALL are mandatory:
Criteria — ALL must be true (for the dedicated suggestion subagent):
If criteria met: Add "Suggested Principle" section to report with:
If criteria NOT met: Omit section entirely. Do not suggest principles just to have a suggestion.
Do NOT flag:
These exceptions apply across all principles. Before reporting a finding, verify it does not match any entry below.
GE-1 — Deferred function-body imports with # noqa: PLC0415 are not P3/P4 layer violations. Only module-level unsuppressed cross-layer imports are flagged.
Source: audit-arch round 1 contest batch.
GE-2 — TYPE_CHECKING-only imports create zero runtime coupling and must not be flagged as cross-layer or cross-domain violations.
Source: audit-arch round 1 contest batch.
GE-7 — Stdlib-only forced duplication with an output-equivalence test is accepted; not a P6 violation when the module has a documented zero-external-imports constraint. Source: audit-arch round 2 contest batch.
GE-9 — Decorator-based registration imports in __init__.py (e.g., importing rules_*.py to trigger @semantic_rule registration) are intentional and not impure side effects.
Source: audit-arch round 2 contest batch.
GE-13 — "No dedicated test file" ≠ "no test coverage." Before claiming a module lacks coverage, grep tests/ for from <package>.<module> import. Absence of test_<module>.py is not evidence.
Source: audit-arch round 1 and cohesion round 1 contest batches.
GE-16 — Same-shape functions with different return types serving different layers or precision purposes are not duplicates. Source: audit-arch round 2 contest batch.
GE-17 — Approximately 6 lines of shared boilerplate between functions with fundamentally different feature sets is not duplication. Threshold: fewer than 8 lines of overlap between functions whose core logic is distinct does not constitute a P6 violation. Source: audit-arch round 2 contest batch.
These exceptions apply to known autoskillit codebase patterns that are legitimately non-standard.
PS-1 — smoke_utils.py at package root: stable callable-path API; autoskillit.smoke_utils.* is baked into recipe YAMLs and must remain at the top-level path.
PS-2 — hook_registry.py at package root: the hooks/ sub-package has a stdlib-only constraint; hook_registry cannot be placed inside hooks/. See commit ee83681d.
PS-4 — _llm_triage.py at package root: circular import constraint with its sole caller server/helpers.py prevents placing it inside server/.
PS-7 — SkillResolver naming: no SkillResolver Protocol exists; the Default* naming convention applies only to Protocol implementors. SkillResolver is the only implementation and needs no Default prefix.
PS-8 — L3 server/ and cli/ are excluded from REQ-IMP-001 intentionally; IL-005 and IL-006 provide compensating controls at the gateway boundary.
CRITICAL (requires at least one of):
HIGH:
MEDIUM:
LOW:
development
Generate YAML recipes for .autoskillit/recipes/. Use when user says "make script skill", "generate script", "script a workflow", "write a script", "create a script", "new recipe", "write a pipeline", or when loaded by other skills for script formatting.
data-ai
Create Uncertainty Representation visualization planning spec showing error bar definitions, distribution-aware alternatives, and multi-seed variance protocols. Statistical lens answering "How is uncertainty honestly represented?"
data-ai
Create Temporal Dynamics visualization planning spec showing axis scaling (linear vs log), smoothing disclosure, epoch/step alignment, run aggregation (mean + variance bands), early-stopping markers, and wall-clock vs step-count x-axis. Temporal lens answering "Are training dynamics shown clearly and honestly?"
data-ai
Create Narrative Story Arc visualization planning spec showing visual consistency across the report (same color = same model everywhere), logical figure progression, redundant figure detection, and narrative dependency between figures. Narrative lens answering "Do the figures tell a coherent story across the report?"