ops-improve-codebase-architecture/SKILL.md
Use when the user wants to improve software architecture, find refactoring opportunities, consolidate tightly-coupled modules, reduce shallow abstractions, or make a codebase more testable and AI-navigable. Language-agnostic. Produces GitHub issues labeled per the `ops-triage` scheme, ready for an AFK agent.
npx skillsauth add paulund/ai ops-improve-codebase-architecture
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Surface architectural friction, propose deepening opportunities — refactors that turn shallow modules into deep ones — and ship each accepted opportunity as a fully-specified GitHub issue ready to be worked on. The aim is testability, locality of change, and AI-navigability.
Use these terms exactly in every suggestion. Consistent language is the point — don't drift into "component," "service," "API," or "boundary." Full definitions in references/language.md.
Key principles (see references/language.md for the full list):
| File | When to load |
|---|---|
| references/language.md | Always: vocabulary discipline is core to the skill. Read before presenting candidates. |
| references/deepening.md | Step 3 (grilling loop), once a candidate is chosen. Classifies dependencies and prescribes the testing strategy. |
If the project has documentation that captures domain language or prior architectural decisions, read it first. Common locations:
- CONTEXT.md (or CONTEXT-MAP.md plus per-context CONTEXT.md files in a multi-context repo): domain vocabulary
- docs/adr/: Architecture Decision Records, recording decisions the skill should not re-litigate

If none of these exist, proceed silently. Don't flag their absence or suggest creating them upfront.
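The pre-read step can be sketched roughly as follows. This is a minimal illustration, not part of the skill: the helper name `find_context_docs` is hypothetical, it assumes ADRs are Markdown files, and it only checks the repo root (per-context CONTEXT.md files in a multi-context repo are not handled).

```python
from pathlib import Path


def find_context_docs(repo_root: Path) -> list[Path]:
    """Return the context documents that exist, in a sensible reading order.

    Hypothetical helper: the skill names the locations, not a lookup routine.
    An empty list means "proceed silently"; nothing is reported or created.
    """
    found: list[Path] = []
    # Domain vocabulary files named by the skill, checked at the repo root only.
    for name in ("CONTEXT-MAP.md", "CONTEXT.md"):
        candidate = repo_root / name
        if candidate.is_file():
            found.append(candidate)
    # Assumption: ADRs are stored as .md files directly under docs/adr/.
    adr_dir = repo_root / "docs" / "adr"
    if adr_dir.is_dir():
        found.extend(sorted(adr_dir.glob("*.md")))
    return found
```

An empty result is a normal outcome here, matching the instruction not to flag missing documentation.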
Then use the Agent tool with subagent_type=Explore to walk the codebase. Don't follow rigid heuristics — explore organically and note where you experience friction:
Apply the deletion test to anything you suspect is shallow: would deleting it concentrate complexity, or just move it? A "yes, concentrates" is the signal you want.
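One way to read the deletion test, sketched with hypothetical names (`OrderValidator`, `OrderPricer`, `intake_order`) that are not taken from any real codebase: if deleting the suspected shallow modules lets their logic collect behind one small interface, the deletion concentrates complexity rather than moving it.

```python
# Before: two shallow modules. Each interface is about as wide as its
# implementation, and the caller has to know the order of the steps.
class OrderValidator:
    def has_items(self, order: dict) -> bool:
        return bool(order.get("items"))


class OrderPricer:
    def total(self, order: dict) -> float:
        return sum(i["quantity"] * i["unit_price"] for i in order["items"])


def place_order_before(order: dict) -> float:
    # The caller orchestrates both modules, so the real logic lives here.
    if not OrderValidator().has_items(order):
        raise ValueError("order has no items")
    return OrderPricer().total(order)


# After: both classes deleted and their bodies folded into one deep module.
def intake_order(order: dict) -> float:
    """Small interface; validation and pricing are implementation details."""
    items = order.get("items") or []
    if not items:
        raise ValueError("order has no items")
    return sum(i["quantity"] * i["unit_price"] for i in items)
```

Both versions compute the same result; the difference is where a future change to validation or pricing has to be made.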
Present a numbered list of deepening opportunities so the user can see what was found. Then immediately create a GitHub issue for every candidate. Do not wait for the user to pick — the goal is to get every opportunity into the backlog.
For each candidate, the issue body must include:
Use the project's domain vocabulary (from CONTEXT.md if it exists; otherwise the names used in the code) for the what, and references/language.md vocabulary for the architecture. If the domain calls something "Order," talk about "the Order intake module" — not "the FooBarHandler," and not "the Order service."
ADR conflicts: if a candidate contradicts an existing ADR, only surface it when the friction is real enough to warrant revisiting the ADR. Mark it clearly (e.g. "contradicts ADR-0007 — but worth reopening because…"). Don't list every theoretical refactor an ADR forbids.
Labels — chore + planned. These are briefs for future grilling, not ready-for-agent yet.
If the user later picks a candidate from the backlog, load references/deepening.md and walk the design tree with them:
- Classify the dependency into one of the four categories in deepening.md (in-process, local-substitutable, remote-but-owned, true-external). The category determines what sits behind the seam and how tests cross it.

After grilling, update the issue: apply the execution label afk (or hitl if human judgment is still needed), and append the grilling output to the body.
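As a rough illustration of the seam and adapter vocabulary, the sketch below shows one seam with a production adapter and a test adapter. Every name in it (`ReceiptStore`, `FileReceiptStore`, `InMemoryReceiptStore`, `record_order`) is hypothetical, and it does not reproduce what references/deepening.md prescribes for any particular dependency category.

```python
from typing import Protocol


class ReceiptStore(Protocol):
    """The seam: what the deepened module needs from its dependency."""

    def save(self, order_id: str, total: float) -> None: ...


class FileReceiptStore:
    """Production adapter (hypothetical): appends receipts to a local file."""

    def __init__(self, path: str) -> None:
        self.path = path

    def save(self, order_id: str, total: float) -> None:
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(f"{order_id},{total}\n")


class InMemoryReceiptStore:
    """Test adapter: records receipts so a test can inspect them."""

    def __init__(self) -> None:
        self.saved: dict[str, float] = {}

    def save(self, order_id: str, total: float) -> None:
        self.saved[order_id] = total


def record_order(order_id: str, prices: list[float], store: ReceiptStore) -> float:
    """Interface of the deepened module: callers never touch the store directly."""
    total = sum(prices)
    store.save(order_id, total)
    return total
```

A test crosses the seam by passing `InMemoryReceiptStore()` and asserting on `saved`. If only one adapter could ever be justified, the `Protocol` would be unnecessary and the dependency could be used directly.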
If the user wants to explore alternative shapes for the deepened module's interface, hand off to the sibling plan-design-interface skill, which generates radically different interface candidates in parallel and compares them.
Use gh issue create for every candidate found in Step 2. Infer the repo from git remote. Do not confirm with the user — create them immediately.
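A minimal sketch of how the `gh issue create` invocation could be assembled, assuming the command is built as an argument list for a subprocess. The helper name `gh_issue_create_args` is hypothetical; the `--title`, `--body`, `--label`, and `--repo` flags are standard `gh issue create` options.

```python
from typing import Optional


def gh_issue_create_args(
    title: str,
    body: str,
    labels: list[str],
    repo: Optional[str] = None,
) -> list[str]:
    """Build the argv for one `gh issue create` call (hypothetical helper)."""
    args = ["gh", "issue", "create", "--title", title, "--body", body]
    for label in labels:
        # --label may be repeated, once per label.
        args += ["--label", label]
    if repo is not None:
        # Optional: without --repo, gh uses the repository of the current directory.
        args += ["--repo", repo]
    return args
```

Passing the result to `subprocess.run(args, capture_output=True, text=True, check=True)` would be one way to run it; on success `gh` writes the new issue's URL to standard output, which can then be shown to the user.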
Labels (from the ops-triage scheme):
| Label | When |
|---|---|
| chore (category) | Always — refactors are maintenance, not bugs or features. |
| planned (state) | Default for issues created in Step 2. The candidate has been identified but not yet grilled. |
| afk (execution) | After the optional grilling loop (Step 3) has settled the design. An AFK agent can ship this autonomously. |
| hitl (execution) | Use when the work needs human judgment the brief can't pin down — risky migration steps, coordination with other teams, or design calls deferred during grilling. Note the reason in the brief. |
Apply exactly: one category (chore), one state (planned), and one execution (afk or hitl) after grilling.
Do not use needs-triage or needs-info — the candidate is already scoped enough to go into the backlog.
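The label rules above can be expressed as a small check. This is a sketch under one assumption worth stating: it treats an execution label applied before grilling as a violation, which is an interpretation of the table rather than something the skill states outright. The function name `label_problems` is hypothetical.

```python
CATEGORY = {"chore"}
STATE = {"planned"}
EXECUTION = {"afk", "hitl"}
FORBIDDEN = {"needs-triage", "needs-info"}


def label_problems(labels: list[str], grilled: bool) -> list[str]:
    """Return rule violations; an empty list means the label set is valid."""
    chosen = set(labels)
    problems: list[str] = []
    if chosen & FORBIDDEN:
        problems.append("needs-triage / needs-info must not be used")
    if len(chosen & CATEGORY) != 1:
        problems.append("exactly one category label (chore) is required")
    if len(chosen & STATE) != 1:
        problems.append("exactly one state label (planned) is required")
    execution = chosen & EXECUTION
    if grilled and len(execution) != 1:
        problems.append("exactly one execution label (afk or hitl) is required after grilling")
    if not grilled and execution:
        # Assumption: execution labels are only applied once grilling has settled the design.
        problems.append("execution label applied before grilling")
    return problems
```

For example, `label_problems(["chore", "planned"], grilled=False)` returns an empty list, while the same labels with `grilled=True` report the missing execution label.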
Issue body: open with the AI disclaimer, then a complete brief. If the issue is created from Step 2 (before grilling), include the candidate summary and mark sections like "Dependency category", "Seam & adapters", and "Testing strategy" as TBD, to be filled in during the grilling loop.
> *This was generated by AI during triage.*
## Context
<one paragraph: the friction this refactor relieves, in domain + architecture vocabulary>
## Files
- path/to/module/a
- path/to/module/b
## Current shape (shallow)
<what the modules look like today and why they're shallow — apply the deletion test>
## Target shape (deep)
<what the deepened module looks like: interface surface, what stays inside, what the seam looks like>
## Dependency category
<one of the four from references/deepening.md, with a one-line justification>
## Seam & adapters
<which adapters are needed (production + test). If only one adapter is justified, no port — say so explicitly.>
## Testing strategy
<which existing tests get deleted, which new tests at the new interface replace them. Follow "replace, don't layer".>
## Out of scope
<anything that surfaced during grilling but is deferred — keeps the agent focused>
## Acceptance criteria
- [ ] Deepened module exists at <path> with the interface above
- [ ] Old shallow modules deleted (or reduced to a thin compatibility shim, with a follow-up issue to remove it)
- [ ] New tests at the deepened interface pass; old shallow-module tests deleted
- [ ] No callers import the old internal modules
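The template above could be rendered programmatically along these lines. This is a sketch only: `render_issue_body` is a hypothetical helper, and it assumes that any section not yet supplied is written as TBD, which goes slightly beyond the three sections the skill names explicitly.

```python
DISCLAIMER = "> *This was generated by AI during triage.*"
TBD = "TBD: to be filled in during the grilling loop"

# Section headings in the order the template lists them.
SECTION_ORDER = [
    "Context",
    "Files",
    "Current shape (shallow)",
    "Target shape (deep)",
    "Dependency category",
    "Seam & adapters",
    "Testing strategy",
    "Out of scope",
    "Acceptance criteria",
]


def render_issue_body(sections: dict[str, str]) -> str:
    """Render the brief; sections missing from `sections` are marked TBD."""
    lines = [DISCLAIMER, ""]
    for heading in SECTION_ORDER:
        lines.append(f"## {heading}")
        # Assumption: anything not yet decided is written as TBD.
        lines.append(sections.get(heading, TBD))
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

After grilling, the same function can be called again with the settled sections filled in, and the result used to update the issue body.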
Issue title — Deepen <module-name>: <one-line outcome> (e.g. Deepen Order intake: collapse FooBarHandler + OrderValidator behind a single port).
Multiple candidates — if the user accepted more than one candidate during step 3 and grilled each, create one issue per candidate. Don't bundle.
ADR conflicts — if the candidate contradicted an existing ADR and the user agreed to reopen it, link the ADR in the Context section and add a sentence on what changed since the decision was recorded.
After creating each issue, print the URL so the user can open it.
MUST DO
- Use the vocabulary in references/language.md exactly (module, interface, implementation, depth, seam, adapter, leverage, locality). Treat drift as a bug.
- Read CONTEXT.md and any docs/adr/ if they exist before exploring.
- Load references/deepening.md before proposing a seam.
- Apply exactly one category (chore) and exactly one state (planned), per the ops-triage scheme. After grilling (Step 3), also apply exactly one execution label (afk or hitl).
- Open every issue body with the AI disclaimer: > *This was generated by AI during triage.*

MUST NOT DO
- Suggest creating CONTEXT.md or ADRs upfront when they don't exist. Proceed silently.
- Apply needs-triage or needs-info to issues created by this skill. If the candidate isn't fully specified, you haven't finished step 3.