skills/inspektera/SKILL.md
INSPEKTERA (Integrity Navigation: Systematic Pattern Evaluation, Knowledge Tracing; Examine, Report, Advise). ALWAYS use this skill for codebase health audits, architecture reviews, and structural quality assessments. This skill is REQUIRED whenever the user wants to assess codebase health, detect architecture drift, find pattern inconsistencies, identify complexity hotspots, evaluate test coverage, or check dependency health. Do NOT attempt codebase-wide quality assessments without this skill because it contains the critical workflow for multi-dimensional evaluation, evidence-based findings, confidence scoring, and trajectory tracking that prevents noisy or superficial audits. Trigger on: "inspektera", "audit the codebase", "check code health", "architecture review", "find technical debt", "assess code quality", "how healthy is this codebase", "what needs fixing", "structural review", "pattern audit", "dependency check", "test coverage audit", or when realisera has run 5+ cycles without a health check.
Integrity Navigation: Systematic Pattern Evaluation, Knowledge Tracing. Examine, Report, Advise.
Codebase health audit: multi-dimensional structural quality evaluation with evidence-based findings, confidence scores, and trajectory tracking. The retrospective counterpart to realisera's forward motion: is the codebase getting better or just bigger?
Each invocation = one audit. Findings feed realisera's work selection via TODO.md. Skill introduction: ─── ⛶ inspektera · audit ───
One file in .agentera/, bootstrapped if absent.
| File | Purpose | Bootstrap |
|------|---------|-----------|
| HEALTH.md | Codebase health assessment. Findings, dimension grades, trajectory. | # Health\n\n then the first audit entry. |
Template in references/templates/. Use as starting structure, adapt to the project.
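A minimal sketch of the bootstrap rule in the table above, assuming the caller has already resolved where HEALTH.md lives. The function name `ensure_health_file` is hypothetical and not part of the skill.

```python
from pathlib import Path

# Bootstrap content taken from the table above; the first audit entry follows it.
HEALTH_BOOTSTRAP = "# Health\n\n"


def ensure_health_file(path: Path) -> bool:
    """Create HEALTH.md with the bootstrap header if it is absent.

    Returns True when the file was created, False when it already existed.
    """
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(HEALTH_BOOTSTRAP, encoding="utf-8")
    return True
```

An existing file is never overwritten, so prior audits stay available for trajectory history.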
Before reading or writing any artifact, check if .agentera/DOCS.md exists. If it has an Artifact Mapping section, use the path specified for each canonical filename (.agentera/HEALTH.md, etc.). If .agentera/DOCS.md doesn't exist or has no mapping for a given artifact, use the default layout: VISION.md, TODO.md, and CHANGELOG.md at the project root; all other artifacts in .agentera/. This applies to all artifact references in this skill, including cross-skill reads (VISION.md, .agentera/DECISIONS.md, TODO.md, .agentera/PROGRESS.md).
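The lookup order above can be sketched as follows. The `mapping` argument is an assumption: it stands in for an already-parsed Artifact Mapping section keyed by bare filename, since the section's real format is not shown here. `resolve_artifact` is a hypothetical helper name.

```python
from pathlib import PurePosixPath
from typing import Mapping, Optional

# Artifacts that default to the project root; all others default to .agentera/.
ROOT_ARTIFACTS = {"VISION.md", "TODO.md", "CHANGELOG.md"}


def resolve_artifact(name: str, mapping: Optional[Mapping[str, str]] = None) -> str:
    """Return the path to use for a canonical artifact filename.

    `mapping` is a hypothetical parsed form of the DOCS.md Artifact Mapping
    section (filename -> path). Pass None when DOCS.md is missing or has no
    mapping section.
    """
    if mapping and name in mapping:
        return mapping[name]          # explicit mapping wins
    if name in ROOT_ARTIFACTS:
        return name                   # default: project root
    return str(PurePosixPath(".agentera") / name)  # default: .agentera/
```

An artifact missing from a partial mapping falls through to the default layout, matching the "no mapping for a given artifact" case.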
Before starting, read references/contract.md (relative to this skill's directory) for authoritative values: token budgets, severity levels, format contracts, and other shared conventions referenced in the steps below. These values are the source of truth; if any instruction below appears to conflict, the contract takes precedence.
Open with your read on the codebase before the structured data: what's improving, what's sliding, what surprised you. 1-2 sentences of interpretation, then the grades and findings back it up. The colleague says what they think, then shows the evidence.
## Audit N · YYYY-MM-DD
**Dimensions**: [which dimensions were assessed]
**Findings**: X critical, Y warnings, Z info
**Overall**: ⮉ improving | stable | ⮋ degrading vs prior audit
### [Dimension Name]: [A-F grade]
#### ⇶ [Finding title], critical (confidence: N/100)
#### ⇉ [Finding title], warning (confidence: N/100)
#### ⇢ [Finding title], info (confidence: N/100)
- **Location**: `file:line` (or module/package)
- **Evidence**: [what was observed: quote code, show pattern]
- **Impact**: [why this matters]
- **Suggested action**: [specific fix or investigation]
### Trends
[Comparison with prior audit: what improved, what degraded, what's new]
### Patterns Observed
[De facto architecture patterns extracted from the codebase, the "what IS"]
Step markers: display ── step N/7: verb before each step.
Steps: orient, select, assess, distill, audit, report, connect.
Read HEALTH.md, TODO.md, and PROGRESS.md in parallel. These reads are independent; issue all in a single response.
HEALTH.md: prior audit findings and grades (if exists)
VISION.md: the "what SHOULD BE" against which "what IS" is compared (if exists)
DECISIONS.md: why things are the way they are (if exists). Findings contradicting deliberate decisions are not findings.
TODO.md: known problems (if exists). Don't re-report unless worsened.
PROGRESS.md: last 3 cycle entries only (recent changes = higher-priority audit targets)
5b. Change magnitude: if PROGRESS.md has commit hashes from cycles since the last HEALTH.md audit date, run git log --stat on those commits to estimate total change volume (files touched, lines changed). If no PROGRESS.md or no commit hashes, skip; default depth applies.
5c. Plan context (for artifact freshness): if PLAN.md exists, read its metadata comment for the Created date and scan task statuses for dispatched skills. This provides the plan-relative staleness baseline for the Artifact freshness dimension. If PLAN.md is absent or has no Created date, note that plan context is unavailable; the fallback heuristic will apply.
Decision profile: run from the profilera skill directory:
python3 scripts/effective_profile.py <!-- platform: profile-path -->
Calibrates what "healthy" means for this user per contract profile consumption conventions. If missing, proceed without persona grounding.
Project discovery: map directory structure, read dependency manifests, README, CLAUDE.md, AGENTS.md, identify language/stack/build commands, git log --oneline -20
Before proceeding: in your response, list the key structural facts (module boundaries, dependency patterns, test coverage gaps) you observed. These survive context compaction.
Exit-early guard: If git diff since the last HEALTH.md update shows no file changes, report exit signal complete: no changes since last audit and stop.
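For the change-magnitude estimate in this step, one possible sketch parses the per-commit summary lines that `git log --stat` prints. It assumes the output has already been captured as a string; `change_magnitude` is a hypothetical helper name.

```python
import re

# Matches summary lines such as " 2 files changed, 10 insertions(+), 1 deletion(-)".
SUMMARY = re.compile(
    r"(\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?"
)


def change_magnitude(stat_output: str) -> dict:
    """Sum files touched and lines changed across all commits in the output."""
    files = insertions = deletions = 0
    for match in SUMMARY.finditer(stat_output):
        files += int(match.group(1))
        insertions += int(match.group(2) or 0)
        deletions += int(match.group(3) or 0)
    return {"files_touched": files, "lines_changed": insertions + deletions}
```

A file touched in several commits is counted once per commit, so `files_touched` is an upper bound, not a count of distinct files.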
Choose dimensions based on the codebase and user request. Not every dimension applies; a 200-line CLI doesn't need the same audit as a monorepo.
| Dimension | What it evaluates | When to include |
|-----------|-------------------|-----------------|
| Architecture alignment | Does the code match the stated architecture? Pattern drift, module boundary violations, layering breaks. | VISION.md or README describes architecture |
| Pattern consistency | Are patterns used consistently? Naming, error handling, structure, abstractions. | Any codebase with 5+ modules or files |
| Coupling health | Hidden dependencies, circular imports, god modules, inappropriate intimacy. | Any codebase with multiple modules |
| Complexity hotspots | Functions too long, deeply nested, high fan-out, accumulated conditionals. | Any codebase |
| Test health | Coverage gaps, test quality, test-to-code ratio, tests testing behavior vs implementation. | Project has tests |
| Dependency health | Outdated deps, security advisories, unused deps, dep sprawl, pinning discipline. | Project has external dependencies |
| Version health | Unreleased significant changes: feat/fix commits since the last version bump. | DOCS.md has a versioning convention block |
| Artifact freshness | Are state artifacts current relative to plan activity or recent development? Detects artifacts that should have been updated but weren't. | Plan context available (PLAN.md with Created date) or PROGRESS.md has entries |
| Prose health | Do artifact entries respect the §24 writing rules? Checks verbosity drift, abstraction creep, and filler accumulation across all project artifacts. | Project has 3+ artifact files |
| Security hygiene | Hardcoded secrets, dangerous function calls, basic injection patterns. Lightweight regex-based scan, not a replacement for dedicated security tooling. | Any codebase |
When change magnitude was derived in Step 1, apply advisory depth scaling:
These thresholds are guidelines, not hard rules. Use judgment: a 6-file change touching a critical security module warrants thorough depth, while a 25-file rename is light.
User specified dimensions: audit only those. Full audit or unspecified: auto-select all applicable. Report selections before proceeding.
Lead the assessment with your overall interpretation: what stands out, what's changed, where attention should go. Then the per-dimension breakdown provides the evidence.
Launch parallel agents, one per dimension. Each receives the dimension definition, language-specific commands from references/audit-commands.md, relevant context files, the confidence scoring rubric, and instructions to return structured findings.
Before deep analysis: run the references/audit-commands.md quick checklist for a rapid pass/fail sweep. Dimensions passing all items can be audited at lower priority.
You are auditing the [dimension] health of [project].
## What to evaluate
[Dimension-specific instructions from below]
## Evidence standard
Every finding MUST include:
- Specific file and line references
- Quoted code showing the issue
- Explanation of why it matters
- Confidence score (0-100)
## Presenting findings
Introduce each finding conversationally before the structured evidence. The colleague
says "hey, I noticed this" instead of just dumping a finding card. Lead with why it caught your eye and what it means, then back it up with the evidence block.
## Confidence scoring
- 90-100: Definitely a real issue. Verified by reading the code. Clear impact.
- 70-89: Very likely a real issue. Strong evidence, but some context might justify it.
- 50-69: Possibly an issue. The pattern is suspicious but could be intentional.
- 30-49: Uncertain. Might be an issue, might be a reasonable tradeoff.
- 0-29: Speculative. Flagging it but wouldn't be surprised if it's fine.
## What is NOT a finding
- Pre-existing patterns that are consistent and deliberate
- Things a linter or type checker would catch (assume CI handles those)
- Subjective style preferences not grounded in stated project principles
- Known issues already tracked in TODO.md
- Intentional decisions documented in DECISIONS.md
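The confidence rubric in the prompt above maps cleanly to a lookup. The sketch below uses the same band boundaries; the short labels and the function name are illustrative choices, not terms defined by the skill.

```python
def confidence_band(score: int) -> str:
    """Map a 0-100 confidence score to its rubric band."""
    if not 0 <= score <= 100:
        raise ValueError(f"confidence must be 0-100, got {score}")
    if score >= 90:
        return "definite"      # verified by reading the code, clear impact
    if score >= 70:
        return "very likely"   # strong evidence, context might justify it
    if score >= 50:
        return "possible"      # suspicious pattern, could be intentional
    if score >= 30:
        return "uncertain"     # might be a reasonable tradeoff
    return "speculative"       # flagged, but may well be fine
```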
Compare codebase to stated architecture:
No documented architecture? Extract and report de facto; note absence as a finding.
Check consistency across the codebase:
Focus on inconsistencies between similar things, not whether the chosen pattern is "best."
Evaluate coupling and dependency structure:
Use language tools (go list, madge, import analysis). If unavailable, trace imports manually on highest-risk modules.
Find accumulating complexity:
Prioritize high-change files: frequently modified + complex = high risk.
Evaluate test suite quality and coverage:
Don't just report a number. Identify the highest-risk coverage gaps.
Evaluate dependency management:
Only run this dimension if DOCS.md exists and contains a versioning convention block. Skip entirely if the convention is absent.
1. Read Conventions.versioning in DOCS.md to identify the version file(s) and bump trigger rules
2. Run `git log --oneline` to find feat and fix commits since the last modification date of the version file(s) (`git log --follow -- <version-file>` gives the timestamp of the last bump)
3. Count the feat/fix commits and note the age of the oldest one
4. If no feat/fix commits have landed since the last bump, this dimension is healthy with no finding

Evaluates whether state artifacts are current relative to plan activity or recent development. Uses the staleness convention from contract.
With plan context (PLAN.md has a Created date and task execution history):
1. Read the PLAN.md Created date from its HTML comment metadata
2. Get each artifact's last-modified timestamp with `git log -1 --format=%aI -- <path>`

Without plan context (no PLAN.md, or PLAN.md has no Created date):
Handling: stale artifact findings are reported like any other dimension finding but noted as context for the next plan cycle, not as blocking errors. Include which skill was expected to update the artifact and when the artifact was last modified.
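A rough sketch of the plan-relative comparison, under a loudly stated assumption: it treats an artifact as stale when its last commit date is earlier than the plan's Created date. The authoritative staleness convention lives in the contract and may differ. `is_stale` is a hypothetical name; the first argument is the ISO 8601 string printed by `git log -1 --format=%aI -- <path>`.

```python
from datetime import datetime


def is_stale(last_modified: str, plan_created: str) -> bool:
    """Return True if the artifact was last committed before the plan was created.

    ASSUMPTION: "stale" means "not touched since the plan's Created date".
    Only calendar dates are compared, so a date-only Created value works
    alongside a timezone-aware commit timestamp.
    """
    modified = datetime.fromisoformat(last_modified)
    created = datetime.fromisoformat(plan_created)
    return modified.date() < created.date()
```

Comparing dates instead of full timestamps avoids mixing timezone-aware and naive values, at the cost of ignoring same-day ordering.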
Evaluate artifact prose quality against the three §24 Self-Audit Protocol rules. Read all project artifacts (PROGRESS.md, DECISIONS.md, PLAN.md, HEALTH.md, TODO.md, CHANGELOG.md, VISION.md, DESIGN.md, DOCS.md) and check each entry.
Rule 1: Verbosity drift: approximate word count per entry. Compare against the §4 Token budgets table (per-entry budgets). Entries exceeding their budget by 50%+ are findings. Entries under budget are healthy.
Rule 2: Abstraction creep: scan each entry for ≥1 concrete anchor (file path with extension, line number, commit hash with 7+ hex chars, metric value with unit, identifier such as function/class/variable name, direct quote in quotes attributed to a source). Entries with zero concrete anchors are findings.
Rule 3: Filler accumulation: scan each entry against the §24 Banned verbosity patterns table. Flag entries containing: meta-commentary about writing, hedging qualifiers, redundant transitions, self-referential process narration, filler introductions, summary preambles, excessive justification. Use the replacement guidance from the table.
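Rules 1 and 2 lend themselves to a mechanical first pass. The sketch below is a heuristic under stated assumptions: it approximates word count by whitespace splitting and checks only two of the listed anchor types (file paths with an extension, commit hashes of 7+ hex characters). The function names are hypothetical, and the real check is described as living in scripts/self_audit.py, which this does not reproduce.

```python
import re

# Heuristic anchor patterns; a subset of the anchor types listed in Rule 2.
ANCHOR_PATTERNS = [
    re.compile(r"\b[\w./-]+\.[A-Za-z]{2,5}\b"),  # file path with extension
    re.compile(r"\b[0-9a-f]{7,40}\b"),           # commit hash, 7+ hex chars
]


def verbosity_drift(entry: str, budget: int) -> bool:
    """Rule 1: flag entries at or above 150% of their per-entry word budget."""
    return len(entry.split()) >= 1.5 * budget


def has_concrete_anchor(entry: str) -> bool:
    """Rule 2: True if the entry contains at least one concrete anchor."""
    return any(pattern.search(entry) for pattern in ANCHOR_PATTERNS)
```

Both patterns can produce false positives (an ordinary word made only of hex letters, for example), so a hit is a prompt to read the entry, not a verdict.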
Confidence determination:
Severity assignment:
Grading:
Flag entries that fail audit with the [post-audit-flagged] marker in findings. Cross-reference prior HEALTH.md audit entries for trajectory: are artifacts improving or degrading in prose discipline?
Trajectory: compare current findings against the prior audit's prose health findings (if any). Note whether verbosity drift, abstraction creep, or filler accumulation have improved, degraded, or stayed stable.
Lightweight regex-based scan for common security anti-patterns. This is a surface-level check, not a replacement for dedicated security analysis. Always recommend specialized tools for comprehensive coverage.
What to scan:
- Hardcoded secrets: API key prefixes (`AKIA`, `sk-`, `ghp_`, `glpat-`, `xoxb-`, `xoxp-`), password assignments (`password\s*=\s*["']`), token strings in source (`token\s*=\s*["']`), private keys in files (`-----BEGIN.*PRIVATE KEY`)
- Dangerous function calls: `eval()` on variables or user input, `exec()` with string concatenation, `subprocess`/`os.system`/`child_process.exec` with unsanitized input, `Function()` constructor with dynamic strings
- Injection patterns: SQL string concatenation (`"SELECT.*" +` or f-string/format with user input in queries), unsanitized shell command construction (`os.system(f"...{` or backtick interpolation in shell strings)

How to scan:
Use Grep with targeted patterns across the codebase. Focus on source files, not vendored dependencies, build artifacts, or lock files. Exclude .git/, node_modules/, vendor/, __pycache__/, and similar directories.
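As an illustration of the hardcoded-secrets part of the scan, the sketch below applies the patterns listed above to a block of text. Two details are assumptions not taken from the skill: key prefixes must be followed by at least 8 key-like characters to cut noise, and the function and label names are hypothetical. In practice the skill uses Grep; this only shows the matching logic.

```python
import re

# Patterns from the list above; the {8,} suffix length is an assumption.
SECRET_PATTERNS = {
    "key prefix": re.compile(r"\b(?:AKIA|sk-|ghp_|glpat-|xoxb-|xoxp-)[A-Za-z0-9_-]{8,}"),
    "password assignment": re.compile(r"password\s*=\s*[\"']"),
    "token assignment": re.compile(r"token\s*=\s*[\"']"),
    "private key": re.compile(r"-----BEGIN.*PRIVATE KEY"),
}


def scan_text(text: str) -> list:
    """Return (line_number, label) pairs for every pattern hit, in file order."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits
```

Each hit still needs a confidence score: a specific prefix match is stronger evidence than a generic `password =` assignment.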
Severity assignment:
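- Hardcoded secrets: confidence depends on pattern specificity (`AKIA` is high confidence, generic `password=` is lower)
- Dangerous function calls: `eval(user_input)` is critical; `eval(constant)` is warning. When data flow is ambiguous, default to warning.

Grading: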
Scope limitation notice: every security hygiene finding MUST include a footer recommending dedicated security tools for comprehensive analysis. Use this text:
This is a lightweight surface scan. For comprehensive security analysis, use dedicated tools: semgrep, Snyk, Bandit (Python), npm audit (Node), govulncheck (Go), or similar static analysis and vulnerability scanning tools appropriate to your stack.
After all agents complete:
Pre-write self-audit (SPEC §24 Self-Audit Protocol): check verbosity drift (§4 per-artifact budget), abstraction creep (≥1 concrete anchor), and filler accumulation (banned patterns table). See scripts/self_audit.py. Max 3 revision attempts. Flag with [post-audit-flagged] if still failing.
Narration voice (riff, don't script): ✗ "Self-audit failed. Revising entry." ✓ "Tightening this up..." · "Cutting the filler first..." · "One more pass..."
Assess each dimension in your response. Write ONLY grade, trajectory marker, and finding summary per dimension to HEALTH.md. No reasoning in the artifact; the conversation preserves analysis, the artifact preserves conclusions.
Output constraint per contract token budgets. Letter grade + ≤3 sentences justification per dimension.
When updating existing HEALTH.md entries (e.g., updating Patterns Observed), use the Edit tool on the specific section rather than rewriting the file. Append new audit entries.
Write the audit results to HEALTH.md (append new audit, keep prior audits for trajectory history) and present to the user.
After writing a new audit entry to HEALTH.md, compact older audits via the script. Run: python3 ${AGENTERA_HOME:-$CLAUDE_PLUGIN_ROOT}/scripts/compact_artifact.py health <path-to-HEALTH.md>.
Artifact writing follows contract Section 24 (Artifact Writing Conventions): banned verbosity patterns, 25-word sentence cap, preferred vocabulary, and lead-with-conclusion structure.
## Audit N · YYYY-MM-DD
**Dimensions assessed**: [list]
**Findings**: X critical, Y warnings, Z info (N filtered by confidence)
**Overall trajectory**: ⮉ improving | stable | ⮋ degrading vs Audit N-1
**Grades**: Architecture [B] | Patterns [A] | Coupling [C] | Complexity [B] | Tests [D] | Deps [A] | Security [A]
### [Dimension Name]: [Grade]
#### ⇶ [Finding title], critical (confidence: N/100)
#### ⇉ [Finding title], warning (confidence: N/100)
#### ⇢ [Finding title], info (confidence: N/100)
- **Location**: `file:line` (or module/package)
- **Evidence**: [quoted code or structural observation]
- **Impact**: [what breaks, degrades, or risks]
- **Suggested action**: [specific fix, investigation, or refactor]
[Repeat for each finding, ordered by severity then confidence]
### Trends vs Audit N-1
- **Improved**: [what got better and why (e.g., "Coupling [D→C]: circular dep in auth/ resolved in cycle 12")]
- **Degraded**: [what got worse and why]
- **New findings**: [issues not present in prior audit]
- **Resolved**: [prior findings no longer present]
### Patterns Observed
[De facto architecture patterns extracted, the "what IS" independent of what's stated.
This section helps realisera and resonera understand the current reality.]
- Module structure: [how code is organized]
- Error handling: [predominant pattern]
- Testing approach: [how tests are structured]
- Dependency patterns: [how deps are managed]
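The template asks for findings "ordered by severity then confidence". A minimal sketch of that ordering follows, assuming critical sorts before warning before info and that higher confidence comes first within a severity. The dict shape and the name `order_findings` are illustrative.

```python
from typing import Any, Dict, List

# Lower rank sorts first; the ordering itself is an assumption.
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


def order_findings(findings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort findings by severity, then by descending confidence."""
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f["severity"]], -f["confidence"]),
    )
```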
Feed actionable findings into the suite:
File findings in TODO.md by severity: critical → `## ⇶ Critical`, warning → `## ⇉ Degraded`, info → `## ⇢ Annoying`. Each entry is a checkbox line: `- [ ] [finding description]`. Get user confirmation before writing.
Output constraint per contract token budgets.

- … /resonera.
- … /resonera, deep-dive on a dimension, or investigate a specific finding.

Report one of these statuses at workflow completion:
Format: ─── ⛶ inspektera · status ─── followed by a summary sentence.
For flagged, stuck, and waiting: add ▸ bullet details below the summary.
Inspektera is part of a twelve-skill suite. It is the feedback loop, the skill that tells realisera whether its work is making things better.
Critical and warning findings filed to TODO.md become candidates for realisera's work selection. The severity mapping ensures structural problems compete fairly with feature work. The "Patterns Observed" section helps realisera understand the codebase's de facto architecture when planning changes.
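The severity mapping used when filing to TODO.md can be sketched as a lookup that returns the target section and the checkbox line. It assumes critical findings go under `## ⇶ Critical`, alongside the warning and info headings named in the filing step. `todo_entry` is a hypothetical helper; writing the file and asking for user confirmation are left out.

```python
from typing import Tuple

# Severity -> TODO.md section heading (critical heading is assumed by analogy).
TODO_SECTIONS = {
    "critical": "## ⇶ Critical",
    "warning": "## ⇉ Degraded",
    "info": "## ⇢ Annoying",
}


def todo_entry(severity: str, description: str) -> Tuple[str, str]:
    """Return (section heading, checkbox line) for a finding."""
    return TODO_SECTIONS[severity], f"- [ ] {description}"
```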
When the audit reveals architectural drift, suggest /resonera before fixes begin.
Use it when code has moved past stated architecture or competing patterns need a decision.
When the audit reveals multiple related structural issues, suggest /planera to create a remediation plan. The plan's acceptance criteria give inspektera concrete targets to verify in the next audit.
When a dimension grade is poor and the improvement is measurable (test coverage, dependency count, complexity score), the finding can become an optimization objective. Suggest /optimera when the metric and direction are clear.
PROGRESS.md tells inspektera what was built recently. Recent changes are higher-priority audit targets because they're the most likely source of regressions or pattern breaks. Cycle count since last audit signals when a health check is overdue.
DECISIONS.md explains why things are the way they are. Findings that contradict deliberate decisions are not findings. This prevents inspektera from flagging intentional tradeoffs as problems.
DESIGN.md provides visual identity constraints that inspektera can audit for consistency, checking whether the codebase respects the declared design tokens and patterns.
The decision profile calibrates what "healthy" means for this user. A user who values simplicity over flexibility will have different complexity thresholds than one who values extensibility. High-confidence quality preferences from the profile weight the grading.
- /inspektera: runs a full audit across all applicable dimensions, bootstraps HEALTH.md
- /realisera: next cycle picks up the filed issues and starts fixing

Run /inspektera every 5-10 realisera cycles, or when:
/inspektera architecture coupling
Specify dimensions to narrow the audit scope. Useful after specific kinds of changes.
/resonera to deliberate on priorities, then /realisera to fix the structural problems before building more.