skills/test-classification/SKILL.md
Prompt template for test classification stage in Test Audit pipeline
Prompt template for surface-level test classification and triage. Designed for a Haiku sub-agent to quickly categorize test files and flag those needing deep analysis.
This is an internal skill loaded by the orchestrator during Test Audit pipeline.
| Context | Action |
|---------|--------|
| /test-audit invoked | Orchestrator loads this skill for Stage 1 |
| Test Audit pipeline triggered by hook | Orchestrator loads this skill for Stage 1 |
| Need to classify test files | Load directly as prompt template for Haiku |
DO NOT use for:
- Deep analysis of flagged files (mock-detection skill)
- Orchestration and synthesis (test-audit skill)

This skill provides the first stage prompt template:

```
test-audit (P0.8) orchestrates:
  Stage 1: test-classification (Haiku) → classification YAML
  Stage 2: mock-detection (Sonnet) → violations YAML
  Stage 3: synthesis (Sonnet) → audit report
```
The orchestrator loads this skill and constructs a 4-part prompt for a general-purpose Haiku sub-agent.
Classify all test files in {target} by type and flag files needing deep analysis for mock appropriateness.
Target directory: {target}
Test file patterns: *.test.*, *.spec.*, test_*, *.integration.*, *.e2e.*
Classification rules: See "Classification Logic" section below
Deep analysis triggers: See "needs_deep_analysis Triggers" section below
Line counting rules: See "Verification Line Counting" section below
AST verification_lines (MANDATORY when available):
If the orchestrator provides ast_verification_lines per file, use that value directly as verification_lines in the output. Do NOT override with your own count. The AST value is deterministic and precise; heuristic counting at scale is error-prone.
Write classification to: logs/test-classification-{YYYYMMDD-HHMMSS}.yaml
Write diagnostics to: logs/diagnostics/test-classification-{YYYYMMDD-HHMMSS}.yaml
Use the schema specified in "Output Schema" section below.
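As a minimal sketch of how an orchestrator might prepare this template, the helpers below fill {target} and derive the two timestamped log paths. The function names (`fill_target`, `output_paths`) are hypothetical, and which clock supplies the timestamp (local or UTC) is an assumption, since the template does not say. Plain string replacement is used instead of `str.format` so that other braces, such as {YYYYMMDD-HHMMSS}, pass through untouched.

```python
from datetime import datetime


def fill_target(template: str, target: str) -> str:
    """Substitute only the {target} placeholder; leave every other brace as-is."""
    return template.replace("{target}", target)


def output_paths(now: datetime) -> dict:
    """Build the classification and diagnostics log paths for one run."""
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return {
        "classification": f"logs/test-classification-{stamp}.yaml",
        "diagnostics": f"logs/diagnostics/test-classification-{stamp}.yaml",
    }
```

Using one `now` value for both paths keeps the classification file and its diagnostics file on the same timestamp.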
| Pattern | Category |
|---------|----------|
| *.integration.* | integration |
| *.e2e.* | e2e |
| *.test.*, *.spec.*, test_* | unit (default) |
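The filename table above could be applied mechanically along these lines. This is a sketch, not the skill's own implementation: `classify_by_filename` is a hypothetical name, and checking the integration and e2e patterns before the unit patterns is an assumption made so that a name like `login.e2e.spec.ts` is not swallowed by `*.spec.*`.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional

# Assumed precedence: most specific patterns first, unit as the fallback category.
CATEGORY_PATTERNS = [
    ("integration", ("*.integration.*",)),
    ("e2e", ("*.e2e.*",)),
    ("unit", ("*.test.*", "*.spec.*", "test_*")),
]


def classify_by_filename(path: str) -> Optional[str]:
    """Return the category for a test file, or None if it matches no test pattern."""
    name = PurePosixPath(path).name  # match on the file name, not the directory
    for category, patterns in CATEGORY_PATTERNS:
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            return category
    return None
```

Files that return `None` are not test files by these patterns and would be left out of the classification.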
After filename classification, scan content to validate:
| Content Signal | Interpretation |
|----------------|----------------|
| Imports test framework (jest, vitest, mocha, pytest) | Confirms test file |
| Imports system modules (child_process, fs, http) | Note for risk assessment |
| Contains jest.mock(), vi.mock(), patch() | Mock indicator |
| Contains describe(, it(, test( | Standard test structure |
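A surface scan for the content signals in the table might look like the sketch below. The regular expressions are illustrative assumptions, not the skill's actual rules: they only cover the signals named in the table, so calls such as `jest.spyOn` would need their own pattern. `scan_content` and the result keys are hypothetical names.

```python
import re
from typing import Dict, List

FRAMEWORKS = ("jest", "vitest", "mocha", "pytest")
SYSTEM_MODULES = ("child_process", "fs", "http")

IMPORT_LINE = re.compile(r"^\s*(?:import|from)\b|\brequire\(")
MOCK_CALL = re.compile(r"\b(?:jest\.mock|vi\.mock|patch)\(")
TEST_STRUCTURE = re.compile(r"\b(?:describe|it|test)\(")


def _named_in(lines: List[str], names) -> List[str]:
    """Return the names that appear as whole module names in any of the lines."""
    found = []
    for name in names:
        # The lookarounds keep "fs" from matching inside "fs-extra" or "offset".
        pattern = re.compile(rf"(?<![\w-]){re.escape(name)}(?![\w-])")
        if any(pattern.search(line) for line in lines):
            found.append(name)
    return found


def scan_content(source: str) -> Dict[str, object]:
    """Collect the surface-level signals for one test file's source text."""
    lines = source.splitlines()
    import_lines = [line for line in lines if IMPORT_LINE.search(line)]
    return {
        "frameworks": _named_in(import_lines, FRAMEWORKS),
        "system_modules": _named_in(import_lines, SYSTEM_MODULES),
        "mock_indicators": [line.strip() for line in lines if MOCK_CALL.search(line)],
        "has_test_structure": any(TEST_STRUCTURE.search(line) for line in lines),
    }
```

A non-empty `frameworks` list confirms the file is a test file, and `system_modules` is only recorded for the later risk assessment.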
| Risk | Condition | Recommendation |
|------|-----------|----------------|
| test_management | Single file contains multiple test types (unit + integration) | Split into separate files |
Flag a file for deep analysis when ANY of these conditions are met:
| Trigger | Reason |
|---------|--------|
| *.integration.* file | Integration tests need chain verification - may have T3+ violations without explicit mocks |
| *.e2e.* file | E2E tests should have minimal mocking - verify end-to-end flow |
Rationale: The absence of jest.mock() in an integration test doesn't mean it's clean. T3+ violations (broken integration chains) use inline mock data instead of upstream function outputs. These are only detectable through deep analysis.
| Trigger | Reason |
|---------|--------|
| Unit test with any jest.mock() / vi.mock() on core modules | Potential T1 (mocking SUT) or over-mocking |
| Unit test with >3 top-level mocks | Unusual mock density suggests over-mocking |
| Unit test mocking core modules (spawn, fs, fetch, http) | Known risky patterns requiring contextual analysis |
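The triggers above can be read as a small decision function. The sketch below makes two simplifying assumptions: the number of collected mock indicators stands in for "top-level mocks", and a core module counts as mocked when its name appears as a whole word inside an indicator string. `deep_analysis_decision` is a hypothetical name, and the reason strings are illustrative.

```python
import re
from typing import List, Optional, Tuple

CORE_MODULE_RE = re.compile(r"\b(spawn|fs|fetch|http)\b")
MAX_TOP_LEVEL_MOCKS = 3  # more than this is treated as unusual mock density


def deep_analysis_decision(
    category: str, mock_indicators: List[str]
) -> Tuple[bool, Optional[str]]:
    """Return (needs_deep_analysis, reason) for one classified file."""
    # Integration and e2e files are always flagged, even with no explicit mocks.
    if category == "integration":
        return True, "Integration test needs chain verification"
    if category == "e2e":
        return True, "E2E test should have minimal mocking"
    core = sorted(
        {m.group(1) for ind in mock_indicators for m in CORE_MODULE_RE.finditer(ind)}
    )
    if core:
        return True, f"Unit test mocks core module ({', '.join(core)})"
    if len(mock_indicators) > MAX_TOP_LEVEL_MOCKS:
        return True, f"Unit test has {len(mock_indicators)} mocks (>3)"
    return False, None
```

The first matching trigger supplies the reason, so a unit test that both mocks a core module and exceeds the density limit reports the core-module reason.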
Count "verification lines" per file for test effectiveness calculation. This count is used by P0.7 to calculate how many effective test lines remain after violations are identified.
- Comment markers (//, /* */, /** */, #)
- Import statements (import, require, from)
- describe(, it(, test(
- beforeEach(, afterEach(
- beforeAll(, afterAll(
- setUp(, tearDown(
- Assertions (expect(, assert, should)

```yaml
metadata:
  skill: test-classification
  timestamp: "{ISO-8601}"
  target: "{directory}"
  model: haiku
files:
  - path: tests/proxy.test.ts
    category: unit
    total_lines: 150
    verification_lines: 95  # Use ast_verification_lines if provided; only count manually if unavailable
    mock_indicators:
      - "jest.spyOn(child_process, 'spawn')"
    needs_deep_analysis: true
    deep_analysis_reason: "Unit test mocks core module (spawn)"
  - path: tests/api.integration.ts
    category: integration
    total_lines: 80
    verification_lines: 55
    mock_indicators:
      - "jest.mock('node-fetch')"
    needs_deep_analysis: true
    deep_analysis_reason: "Integration test contains mocks"
  - path: tests/utils.test.ts
    category: unit
    total_lines: 60
    verification_lines: 40
    mock_indicators: []
    needs_deep_analysis: false
risks:
  test_management:
    - path: tests/everything.test.ts
      reason: "Single file contains unit, integration, and e2e tests"
      recommendation: "Split into separate files by test type"
summary:
  total_files: 25
  by_category:
    unit: 15
    integration: 8
    e2e: 2
  total_verification_lines: 1250
  needs_deep_analysis: 5
  test_management_risks: 1
```
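To illustrate how one `files` entry in this schema might be assembled, the sketch below applies the rule that an orchestrator-supplied AST count takes precedence over a manual count. `build_file_entry` and its parameter names are hypothetical. Omitting `deep_analysis_reason` for clean files mirrors the `tests/utils.test.ts` entry in the example, but that is an inference from the example, not a stated rule.

```python
from typing import List, Optional


def build_file_entry(
    path: str,
    category: str,
    total_lines: int,
    heuristic_lines: int,
    mock_indicators: List[str],
    ast_verification_lines: Optional[int] = None,
    deep_analysis_reason: Optional[str] = None,
) -> dict:
    """Assemble one entry of the `files` list in the classification output."""
    # The AST value, when provided, is used as-is and never overridden.
    if ast_verification_lines is not None:
        verification_lines = ast_verification_lines
    else:
        verification_lines = heuristic_lines
    entry = {
        "path": path,
        "category": category,
        "total_lines": total_lines,
        "verification_lines": verification_lines,
        "mock_indicators": list(mock_indicators),
        "needs_deep_analysis": deep_analysis_reason is not None,
    }
    if deep_analysis_reason is not None:
        entry["deep_analysis_reason"] = deep_analysis_reason
    return entry
```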
Write diagnostic output to logs/diagnostics/test-classification-{YYYYMMDD-HHMMSS}.yaml:
```yaml
diagnostic:
  skill: test-classification
  timestamp: "{ISO-8601}"
  model: haiku
  execution:
    tool_calls: 15
    files_scanned: 25
    classification_time_estimate: "surface scan"
  decisions:
    - file: tests/proxy.test.ts
      decision: needs_deep_analysis
      reason: "Found jest.spyOn on child_process.spawn"
      confidence: high
    - file: tests/utils.test.ts
      decision: clean
      reason: "No mock indicators found"
      confidence: high
  errors: []
```
The orchestrator (P0.8) constructs the full prompt by:

1. Replacing {target} with user-provided path or inferred target
2. Invoking Task(subagent_type="general-purpose", model="haiku", prompt=...)
3. Reading the output from logs/test-classification-{YYYYMMDD-HHMMSS}.yaml

P0.7 (mock-detection) receives:

- Files with needs_deep_analysis: true
- verification_lines count per file (for effectiveness calculation)
- mock_indicators as starting points for deep analysis

When processing large test suites (>20 files), the orchestrator must batch classification to avoid context limits.
```
IF file_count > 20:
  Split files into batches of 20-25
  FOR each batch:
    Spawn Haiku sub-agent with batch file list
    Collect classification YAML for batch
  Merge all batch results into single classification output
ELSE:
  Process all files in single sub-agent call
```
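One way to realize the split step is sketched below. The text only says "batches of 20-25", so capping batches at 25 files and balancing their sizes is an assumed policy. For counts such as 26, no split can keep every batch in the 20-25 range, and this sketch then produces smaller, even batches. `make_batches` is a hypothetical name.

```python
import math
from typing import List


def make_batches(files: List[str], threshold: int = 20, max_batch: int = 25) -> List[List[str]]:
    """Split a file list into evenly sized batches of at most `max_batch` files."""
    if len(files) <= threshold:
        return [list(files)]  # small suites go to a single sub-agent call
    n_batches = math.ceil(len(files) / max_batch)
    base, extra = divmod(len(files), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first batches
        batches.append(list(files[start:start + size]))
        start += size
    return batches
```

Balancing avoids a tiny trailing batch: 45 files become batches of 23 and 22 instead of 25 and 20.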
When merging batch results:

- Concatenate files arrays
- Merge risks entries
- Recalculate summary totals across all batches

For optimal performance, spawn batch sub-agents in parallel:
```
Task(subagent_type="general-purpose", model="haiku", prompt=batch1_prompt, run_in_background=true)
Task(subagent_type="general-purpose", model="haiku", prompt=batch2_prompt, run_in_background=true)
...
```
Read all outputs after completion, then merge.
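The merge step could be sketched as follows, assuming each batch result has already been parsed from YAML into a dict with the `files`, `risks`, and `summary` keys of the output schema. `merge_batches` is a hypothetical name. The summary is recomputed from the merged files instead of summing per-batch summaries, and the `metadata` block is left to the caller.

```python
from collections import Counter
from typing import Dict, List


def merge_batches(batch_results: List[dict]) -> dict:
    """Merge per-batch classification dicts into one classification output."""
    files: List[dict] = []
    risks: Dict[str, list] = {}
    for result in batch_results:
        files.extend(result.get("files") or [])
        for kind, entries in (result.get("risks") or {}).items():
            risks.setdefault(kind, []).extend(entries or [])
    # Recalculate totals from the merged data so they cannot drift from it.
    summary = {
        "total_files": len(files),
        "by_category": dict(Counter(f["category"] for f in files)),
        "total_verification_lines": sum(f["verification_lines"] for f in files),
        "needs_deep_analysis": sum(1 for f in files if f["needs_deep_analysis"]),
        "test_management_risks": len(risks.get("test_management", [])),
    }
    return {"files": files, "risks": risks, "summary": summary}
```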
- mock-detection (P0.7) - Deep analysis of flagged files
- test-audit (P0.8) - Orchestration and synthesis
- pipeline-templates (P0.3) - Test Audit pipeline definition