plugins/codebase-audit-suite/skills/ln-635-test-isolation-auditor/SKILL.md
Audits whether test results can be trusted: flakiness, isolation, real external dependencies, time/random/order dependency, and shared state. Use when auditing test trustworthiness.
npx skillsauth add levnikolaevich/claude-code-skills ln-635-test-isolation-auditor
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Paths: File paths (references/, ../ln-*) are relative to this skill directory.
Type: L3 Worker
Specialized worker auditing whether automated test results are deterministic, isolated, and trustworthy.
REWRITE_FOR_DETERMINISM or DELETE_IF_LOW_VALUE
MANDATORY READ: Load references/audit_worker_core_contract.md.
Receives contextStore with: tech_stack, testFilesMetadata, codebase_root, output_dir.
Detection policy: use two-layer detection (candidate scan, then context verification); load references/two_layer_detection.md only when the verification method is ambiguous.
REWRITE_FOR_DETERMINISM by default; use DELETE_IF_LOW_VALUE only when the test is both untrustworthy and low-value according to obvious local evidence
Following references/templates/audit_worker_report_template.md, write to {output_dir}/ln-635--global.md in a single Write call
Good: Mocked (jest.mock, sinon, nock)
Bad: Real HTTP calls to external APIs
Detection:
axios.get, fetch(, http.request without mocks
Severity: HIGH
Recommendation: Ensure external API calls are controlled (mock, stub, or test server). Tool choice depends on project stack. Exception: Integration tests are EXPECTED to use real dependencies -- do NOT flag
Effort: M
Good: In-memory DB (sqlite :memory:) or mocked
Bad: Real database (PostgreSQL, MySQL)
Detection:
beforeAll(async () => { await db.connect() }) without :memory:
Severity: MEDIUM
Recommendation: Ensure DB state is controlled and isolated between test runs. Exception: Integration tests with in-memory DB via config -> skip
Effort: M-L
Good: Mocked (mock-fs, vol)
Bad: Real file reads/writes
Detection:
fs.readFile, fs.writeFile without mocks
Severity: MEDIUM
Recommendation: Ensure file system operations are isolated (mock, temp directory, or cleanup). Tool choice depends on project stack
Effort: S-M
Good: Mocked (jest.useFakeTimers, sinon.useFakeTimers)
Bad: new Date(), Date.now() without mocks
Detection:
new Date() in test files without useFakeTimers
Severity: MEDIUM
Recommendation: Ensure time-dependent logic uses controlled clock (fake timers, injected clock, or time provider). Tool choice depends on project stack
Effort: S
Good: Seeded random (Math.seedrandom, fixed seed)
Bad: Math.random() without seed
Detection:
Math.random() without seed setup
Severity: LOW
Recommendation: Use seeded random for deterministic tests
Effort: S
Good: Mocked (supertest for Express, no real ports)
Bad: Real network requests (localhost:3000, binding to port)
Detection:
app.listen(3000) in tests
Severity: MEDIUM
Recommendation: Use supertest (no real port)
Effort: M
What: Tests that pass/fail randomly
Detection:
setTimeout, setInterval without proper awaits
Severity: HIGH
Recommendation: Fix race conditions, use proper async/await
Effort: M-L
What: Assertions on current time (expect(timestamp).toBeCloseTo(Date.now()))
Detection:
Date.now(), new Date() in assertions
Severity: MEDIUM
Recommendation: Mock time
Effort: S
What: Tests that fail when run in different order
Detection:
Severity: MEDIUM
Recommendation: Isolate tests, reset state in beforeEach
Effort: M
What: Global variables modified across tests
Detection:
let globalVar at module level
Severity: MEDIUM
Recommendation: Use beforeEach to reset state
Effort: S-M
What: Test with >100 lines, testing too many scenarios
Detection:
Severity: MEDIUM
Recommendation: Split into focused tests (one scenario per test)
Effort: S-M
What: Test taking >5 seconds to run
Detection:
Severity: MEDIUM
Recommendation: Control external deps with test doubles or in-memory services selected from the project stack; parallelize only after isolation is verified
Effort: M
What: Test labeled "Unit" but not mocking dependencies
Detection:
Severity: LOW
Recommendation: Either mock dependencies OR rename to Integration test
Effort: S
What: Tests with default config values only. Use the non-default config rule from references/risk_based_testing_guide.md; load references/risk_based_testing_methodology.md only when examples are needed.
Detection:
:8080, :3000, 30000, limit: 20, offset: 0
|| DEFAULT patterns in source code with matching test values
Severity: HIGH
Effort: S
MANDATORY READ: Load references/audit_scoring.md.
Severity mapping:
MANDATORY READ: Load references/templates/audit_worker_report_template.md.
Write JSON summary per references/audit_summary_contract.md. In managed mode the caller passes both runId and summaryArtifactPath; in standalone mode the worker generates its own run-scoped artifact path per shared contract.
Write report to {output_dir}/ln-635--global.md with category: "Test Trustworthiness" and checks: api_isolation, db_isolation, fs_isolation, time_isolation, random_isolation, network_isolation, flaky_tests, order_dependency, shared_state, default_value_blindness.
Return summary per references/audit_summary_contract.md.
When summaryArtifactPath is absent, write the standalone runtime summary under .hex-skills/runtime-artifacts/runs/{run_id}/evaluation-worker/{worker}--{identifier}.json and optionally echo the same summary in structured output.
Report written: .hex-skills/runtime-artifacts/runs/{run_id}/audit-report/ln-635--global.md
Score: X.X/10 | Issues: N (C:N H:N M:N L:N)
Note: Findings are flattened into single array. Use principle field prefix (Isolation / Determinism / Dependency Control) to identify issue category. Each finding includes action: "REWRITE_FOR_DETERMINISM" or action: "DELETE_IF_LOW_VALUE".
Apply the already-loaded references/audit_worker_core_contract.md.
principle prefix to distinguish
REWRITE_FOR_DETERMINISM unless evidence shows the test is also low-value enough to use DELETE_IF_LOW_VALUE.
Monitor (2.1.98+): For repeated test runs expected >30s each, use Monitor. Fallback: Bash(run_in_background=true).
Apply the already-loaded references/audit_worker_core_contract.md.
{output_dir}/ln-635--global.md (atomic single Write call)
Version: 3.0.0
Last Updated: 2025-12-23