002-workspaces/test-harness-lab/skills/nixtla-release-validation/SKILL.md
Multi-phase release validation workflow for nixtla. Analyzes git changes, predicts test impact, assesses risk, runs pytest verification, provides go/no-go recommendation. Trigger: "validate release", "run release validation", "check release readiness"
npx skillsauth add intent-solutions-io/plugins-nixtla nixtla-release-validation
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Automated pre-release validation using a multi-phase test harness pattern with empirical verification.
Validate nixtla releases (e.g., v1.7.0 → v1.8.0) before shipping by analyzing changes, predicting impact, running tests, and providing an evidence-based go/no-go recommendation.
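This workflow implements the 5-phase validated workflow pattern:
1. Analyze git changes
2. Predict test impact
3. Assess risk
4. Run pytest verification
5. Provide a go/no-go recommendation
Phase 4 is the critical phase: it runs actual scripts to verify the Phase 2 predictions.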
jq for JSON processing
Reports directory: 002-workspaces/test-harness-lab/skills/nixtla-release-validation/reports/
cd 002-workspaces/test-harness-lab/skills/nixtla-release-validation
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
SESSION_DIR="reports/${TIMESTAMP}"
mkdir -p "${SESSION_DIR}"
Task: Spawn Phase 1 agent
Input JSON:
{
"session_dir": "<SESSION_DIR>",
"from_version": "v1.7.0",
"to_version": "v1.8.0",
"repo_path": "/home/jeremy/000-projects/nixtla"
}
Expected Output: <SESSION_DIR>/phase1-change-analysis.json
{
"metadata": {
"phase": 1,
"timestamp": "2025-12-22T17:00:00Z"
},
"changes": {
"changed_files": ["src/forecast.py", "tests/test_forecast.py"],
"changed_apis": ["forecast()", "fit()"],
"breaking_changes": ["forecast() now requires 'freq' parameter"],
"new_features": ["Added anomaly detection"]
}
}
Task: Spawn Phase 2 agent
Input JSON:
{
"session_dir": "<SESSION_DIR>",
"phase1_output": "<SESSION_DIR>/phase1-change-analysis.json"
}
Expected Output: <SESSION_DIR>/phase2-test-predictions.json
{
"metadata": {
"phase": 2,
"timestamp": "2025-12-22T17:05:00Z"
},
"test_predictions": [
{
"change": "Modified forecast() signature",
"affected_tests": ["test_forecast_basic", "test_forecast_with_exog"],
"reason": "Function signature changed, existing calls will fail"
}
]
}
Task: Spawn Phase 3 agent
Input JSON:
{
"session_dir": "<SESSION_DIR>",
"phase1_output": "<SESSION_DIR>/phase1-change-analysis.json",
"phase2_output": "<SESSION_DIR>/phase2-test-predictions.json"
}
Expected Output: <SESSION_DIR>/phase3-risk-assessment.json
{
"metadata": {
"phase": 3,
"timestamp": "2025-12-22T17:10:00Z"
},
"risk_categories": {
"high_risk": ["forecast() signature change - breaking"],
"medium_risk": ["New anomaly detection - needs testing"],
"low_risk": ["Documentation updates"]
},
"go_no_go": "pending"
}
Task: Run verification script, compare predictions vs reality
Script: scripts/analyze_test_results.sh
bash scripts/analyze_test_results.sh \
"/home/jeremy/000-projects/nixtla" \
"${SESSION_DIR}"
Expected Output: <SESSION_DIR>/phase4-verification-report.json
{
"metadata": {
"phase": 4,
"script": "analyze_test_results.sh",
"timestamp": "2025-12-22T17:15:00Z"
},
"results": {
"tests_run": 145,
"tests_passed": 142,
"tests_failed": 3,
"coverage_pct": 87.5,
"failed_tests": ["test_forecast_basic", "test_forecast_with_exog", "test_hierarchical"]
},
"prediction_comparison": {
"predictions_confirmed": ["test_forecast_basic - FAILED as predicted", "test_forecast_with_exog - FAILED as predicted"],
"predictions_revised": [],
"unexpected_failures": ["test_hierarchical - not predicted"]
}
}
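The prediction comparison in the report above can be approximated with standard tools. The following is a minimal sketch, not the logic of scripts/analyze_test_results.sh itself: it assumes the predicted and actual failing test names have already been extracted into two plain-text files (one name per line), and the function name `compare_predictions` and its output labels are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify tests by comparing predicted vs. actual failures.
# $1 = file of predicted failing tests, $2 = file of actually failing tests
# (one test name per line; both file formats are assumptions for this sketch).
compare_predictions() {
  local predicted="$1" actual="$2"
  local p a
  p="$(mktemp)"; a="$(mktemp)"
  # comm requires sorted input; sort -u also drops duplicate names.
  LC_ALL=C sort -u "$predicted" > "$p"
  LC_ALL=C sort -u "$actual" > "$a"
  # Names in both files: predictions confirmed by the test run.
  LC_ALL=C comm -12 "$p" "$a" | sed 's/^/confirmed: /'
  # Names only in the predicted file: predicted to fail but did not.
  LC_ALL=C comm -23 "$p" "$a" | sed 's/^/revised: /'
  # Names only in the actual file: failures that were not predicted.
  LC_ALL=C comm -13 "$p" "$a" | sed 's/^/unexpected: /'
  rm -f "$p" "$a"
}
```

Given the JSON shapes shown in this document, the two input files could plausibly be produced with `jq -r '.test_predictions[].affected_tests[]'` on the Phase 2 output and `jq -r '.results.failed_tests[]'` on the Phase 4 output.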
Task: Spawn Phase 5 agent
Input JSON:
{
"session_dir": "<SESSION_DIR>",
"phase3_output": "<SESSION_DIR>/phase3-risk-assessment.json",
"phase4_output": "<SESSION_DIR>/phase4-verification-report.json"
}
Expected Output: <SESSION_DIR>/phase5-final-recommendation.json
{
"metadata": {
"phase": 5,
"timestamp": "2025-12-22T17:20:00Z"
},
"recommendation": "no-go",
"blockers": [
"3 test failures must be fixed before release",
"forecast() breaking change needs migration guide"
],
"release_notes": "...",
"migration_steps": [...]
}
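How the Phase 5 agent weighs its inputs is not specified here. As an illustration only, the sketch below applies one conservative rule, which is an assumption rather than the skill's documented policy: any failed test or any high-risk item yields a no-go. The function name `recommend` and its two integer arguments are hypothetical.

```shell
# Hypothetical decision rule (an assumption, not the Phase 5 agent's actual logic).
# $1 = number of failed tests, $2 = number of high-risk items.
recommend() {
  local failed="$1" high_risk="$2"
  if [ "$failed" -gt 0 ] || [ "$high_risk" -gt 0 ]; then
    echo "no-go"
  else
    echo "go"
  fi
}
```

With the example values in this document (3 failed tests, 1 high-risk item), `recommend 3 1` would print `no-go`, which matches the sample recommendation above.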
Create summary markdown report:
cat > "${SESSION_DIR}/RELEASE-VALIDATION-SUMMARY.md" <<EOF
# Release Validation Summary
**Release**: v1.7.0 → v1.8.0
**Date**: $(date)
**Recommendation**: NO-GO
## Test Results
- Tests Run: 145
- Passed: 142
- Failed: 3
## Blockers
1. forecast() breaking change needs migration guide
2. 3 test failures must be addressed
## Next Steps
1. Fix test_forecast_basic
2. Fix test_forecast_with_exog
3. Fix test_hierarchical
4. Write migration guide for forecast() changes
5. Re-run validation
EOF
Structured Outputs:
- phase1-change-analysis.json - Git changes analyzed
- phase2-test-predictions.json - Impact predictions
- phase3-risk-assessment.json - Risk categories
- phase4-verification-report.json - Actual test results
- phase5-final-recommendation.json - Go/no-go decision
- RELEASE-VALIDATION-SUMMARY.md - Human-readable summary
Evidence Trail: All outputs in timestamped session directory.
If Phases 1-3 fail: check JSON syntax and file paths, and confirm that the git tags exist.
If Phase 4 fails: check that scripts/analyze_test_results.sh exists and is executable.
If Phase 5 fails: check that the Phase 3 and Phase 4 JSON outputs exist and are valid.
Validation Failures: Each phase must write valid JSON to its expected path before the next phase runs.
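One way to enforce that rule between phases is a small gate function. This is a sketch under assumptions: the name `require_valid_json` is hypothetical, and it relies on jq (listed as a prerequisite) being on the PATH. If jq is missing, the check fails closed and reports the file as invalid.

```shell
# Hypothetical gate: succeed only if a phase output exists and parses as JSON.
# $1 = path to a phase output file.
require_valid_json() {
  local file="$1"
  # -s is true only for a file that exists and is non-empty.
  if [ ! -s "$file" ]; then
    echo "missing or empty: $file" >&2
    return 1
  fi
  # `jq empty` parses the input and prints nothing; it exits non-zero on invalid JSON.
  if ! jq empty "$file" >/dev/null 2>&1; then
    echo "invalid JSON: $file" >&2
    return 1
  fi
}
```

For example, running `require_valid_json "${SESSION_DIR}/phase1-change-analysis.json" || exit 1` before starting Phase 2 would stop the workflow as soon as an output is missing or malformed.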
cd 002-workspaces/test-harness-lab/skills/nixtla-release-validation
SESSION_DIR="reports/$(date +%Y%m%d_%H%M%S)"
mkdir -p "${SESSION_DIR}"
# Phase 1-5: Spawn agents sequentially
# Phase 4: Run verification script
bash scripts/analyze_test_results.sh /home/jeremy/000-projects/nixtla "${SESSION_DIR}"
# Check final recommendation
jq '.recommendation' "${SESSION_DIR}/phase5-final-recommendation.json"
SESSION_DIR="reports/test_v1.6_to_v1.7"
mkdir -p "${SESSION_DIR}"
# Run phases with historical release
# Compare predictions vs actual (known outcome)
002-workspaces/test-harness-lab/reference-implementation/
scripts/analyze_test_results.sh
agents/phase_*.md