.claude/skills/performance-testing/SKILL.md
Profiles application performance under load using k6, Artillery, or JMeter to measure latency, throughput, and error rates. Use when planning load tests, stress tests, soak tests, benchmarking APIs, or identifying performance bottlenecks.
npx skillsauth add proffesor-for-testing/agentic-qe performance-testing
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
<default_to_action> When testing performance or planning load tests:
Quick Test Type Selection:
Critical Success Factors:
| Type | Purpose | When |
|------|---------|------|
| Load | Expected traffic | Every release |
| Stress | Beyond capacity | Quarterly |
| Spike | Sudden surge | Before events |
| Endurance | Memory leaks | After code changes |
| Scalability | Scaling validation | Infrastructure changes |
| Metric | Target | Why |
|--------|--------|-----|
| p95 response | < 200ms | User experience |
| Throughput | 10k req/min | Capacity |
| Error rate | < 0.1% | Reliability |
| CPU | < 70% | Headroom |
| Memory | < 80% | Stability |
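Because the p95 target drives most thresholds in this skill, it can help to see how a 95th percentile is derived from raw latency samples. The following is a minimal sketch using the nearest-rank method. `percentile` and the sample data are hypothetical and not part of k6, and load testing tools may interpolate differently, so their reported values can differ slightly.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical response times in milliseconds
const latenciesMs = [120, 95, 180, 210, 150, 130, 99, 175, 160, 450];
const p95 = percentile(latenciesMs, 95);
console.log(`p95 = ${p95}ms, target met: ${p95 < 200}`);
```

A single slow outlier in a small sample is enough to dominate p95, which is one reason to collect many requests before judging a run.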
- qe-performance-tester: Load test orchestration
- qe-quality-analyzer: Results analysis
- qe-production-intelligence: Production comparison

Bad: "The system should be fast"
Good: "p95 response time < 200ms under 1,000 concurrent users"
export const options = {
  thresholds: {
    http_req_duration: ['p(95)<200'], // 95% < 200ms
    http_req_failed: ['rate<0.01'], // < 1% failures
  },
};
Bad: Every user hits homepage repeatedly
Good: Model actual user behavior
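import { sleep } from 'k6';

// Realistic distribution
// 40% browse, 30% search, 20% details, 10% checkout
// browse(), search(), viewProduct(), and checkout() are scenario
// functions assumed to be defined elsewhere in the script.

// Random integer in [min, max], used for think time
function randomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

export default function () {
  const action = Math.random();
  if (action < 0.4) browse();
  else if (action < 0.7) search();
  else if (action < 0.9) viewProduct();
  else checkout();
  sleep(randomInt(1, 5)); // Think time in seconds
}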
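The if/else chain above hard-codes cumulative cut-offs (0.4, 0.7, 0.9). As a sketch of how the same traffic mix could be driven from a weight table, the plain JavaScript helper below derives those cut-offs at run time. `pickScenario` and `trafficMix` are hypothetical names introduced for illustration and are not part of k6.

```javascript
// Hypothetical helper: picks a scenario name from a weight table.
// `r` is a number in [0, 1); pass Math.random() in real use.
function pickScenario(weights, r) {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  let cumulative = 0;
  for (const [name, weight] of Object.entries(weights)) {
    cumulative += weight / total;
    if (r < cumulative) return name;
  }
  // Guard against floating-point drift when r is very close to 1
  return Object.keys(weights).pop();
}

// Same 40/30/20/10 split as the distribution above
const trafficMix = { browse: 40, search: 30, viewProduct: 20, checkout: 10 };
console.log(pickScenario(trafficMix, Math.random()));
```

Keeping the weights in one object makes it easier to adjust the mix when production analytics show a different distribution.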
Symptoms: Slow queries under load, connection pool exhaustion
Fixes: Add indexes, optimize N+1 queries, increase pool size, read replicas
// BAD: 100 orders = 101 queries
const orders = await Order.findAll();
for (const order of orders) {
  const customer = await Customer.findById(order.customerId);
}

// GOOD: 1 query (eager-load customers with the orders)
const ordersWithCustomers = await Order.findAll({ include: [Customer] });
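When eager loading is not available, batching the lookups is a common alternative to the N+1 pattern shown above: it costs two queries in total rather than one per order. The sketch below demonstrates the idea against an in-memory stand-in for a database. The `db` object, its methods, and the sample records are all hypothetical and exist only to count queries.

```javascript
// In-memory stand-in for a database; counts how many "queries" run.
const db = {
  queryCount: 0,
  customers: new Map([
    [1, { id: 1, name: 'Ada' }],
    [2, { id: 2, name: 'Lin' }],
  ]),
  orders: [
    { id: 10, customerId: 1 },
    { id: 11, customerId: 2 },
    { id: 12, customerId: 1 },
  ],
  async findOrders() {
    this.queryCount += 1;
    return this.orders;
  },
  async findCustomersByIds(ids) {
    this.queryCount += 1;
    return ids.map((id) => this.customers.get(id));
  },
};

// Two queries total, regardless of how many orders exist.
async function loadOrdersWithCustomers() {
  const orders = await db.findOrders();
  const ids = [...new Set(orders.map((o) => o.customerId))]; // de-duplicate
  const customers = await db.findCustomersByIds(ids);
  const byId = new Map(customers.map((c) => [c.id, c]));
  return orders.map((o) => ({ ...o, customer: byId.get(o.customerId) }));
}
```

Calling `loadOrdersWithCustomers()` and then reading `db.queryCount` shows the query count staying at two as the order list grows.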
Problem: Blocking operations in request path (sending email during checkout)
Fix: Use message queues, process async, return immediately
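The fix above can be sketched as a handler that only enqueues work, plus a worker that drains the queue outside the request path. This is an illustration under stated assumptions: the array stands in for a real message broker, and `handleCheckout`, `drainQueue`, and the job shape are hypothetical names.

```javascript
// In-memory stand-in for a message queue (assumption: production code
// would use a real broker instead of an array).
const emailQueue = [];

// Request handler: enqueues the email job and returns immediately,
// so the response time does not depend on the mail provider.
function handleCheckout(order) {
  emailQueue.push({ to: order.email, template: 'order-confirmation', orderId: order.id });
  return { status: 'accepted', orderId: order.id };
}

// Worker: processes queued jobs outside the request path.
// `sendEmail` is injected so the slow dependency is easy to swap or mock.
async function drainQueue(sendEmail) {
  let sent = 0;
  while (emailQueue.length > 0) {
    await sendEmail(emailQueue.shift());
    sent += 1;
  }
  return sent;
}
```

Under load, the handler's latency then reflects only the enqueue step, while email delivery time shows up in queue depth instead of p95.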
Detection: Endurance testing, memory profiling
Common causes: Event listeners not cleaned, caches without eviction
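For the "caches without eviction" cause, one possible remedy is to bound the cache size. The sketch below uses a `Map` (which preserves insertion order) to evict the least recently used entry. `BoundedCache` is a hypothetical class written for illustration, not an API from any library mentioned here.

```javascript
// Small LRU cache: Map iteration order is insertion order, so the first
// key is always the least recently used one.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert to mark the entry as most recently used
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      const oldestKey = this.map.keys().next().value;
      this.map.delete(oldestKey); // evict least recently used
    }
  }

  get size() {
    return this.map.size;
  }
}

const cache = new BoundedCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // 'a' becomes most recently used
cache.set('c', 3); // evicts 'b'
console.log(cache.size);
```

With a fixed upper bound, memory used by the cache should plateau during an endurance run instead of growing with the number of distinct keys.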
Solutions: Aggressive timeouts, circuit breakers, caching, graceful degradation
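Two of the solutions listed above, timeouts and circuit breakers, can be sketched in plain JavaScript as follows. This is a simplified illustration rather than a production implementation: `withTimeout` and `CircuitBreaker` are hypothetical names, and the thresholds are arbitrary defaults.

```javascript
// Rejects if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Opens after `failureThreshold` consecutive failures, fails fast while
// open, and allows one trial call after `resetAfterMs`.
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetAfterMs = 10000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now; // injectable clock, handy for tests
    this.state = 'closed';
    this.failures = 0;
    this.openedAt = 0;
  }

  async call(fn) {
    if (this.state === 'open') {
      if (this.now() - this.openedAt < this.resetAfterMs) {
        throw new Error('circuit open'); // fail fast, skip the dependency
      }
      this.state = 'half-open'; // allow one trial call
    }
    try {
      const result = await fn();
      this.failures = 0;
      this.state = 'closed';
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

A typical combination would be `breaker.call(() => withTimeout(callExternalService(), 500))`, where `callExternalService` is a placeholder for the slow dependency; the caller can then catch the error and serve a cached or degraded response.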
// performance-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '1m', target: 50 }, // Ramp up
    { duration: '3m', target: 50 }, // Steady
    { duration: '1m', target: 0 }, // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<200'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/products');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 200ms': (r) => r.timings.duration < 200,
  });
  sleep(1);
}
# GitHub Actions
- name: Run k6 test
  uses: grafana/[email protected]
  with:
    filename: performance-test.js
Load: 1,000 users | p95: 180ms | Throughput: 5,000 req/s
Error rate: 0.05% | CPU: 65% | Memory: 70%
Load: 1,000 users | p95: 3,500ms ❌ | Throughput: 500 req/s ❌
Error rate: 5% ❌ | CPU: 95% ❌ | Memory: 90% ❌
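The two result summaries above can be checked mechanically against the targets from the metrics table. The sketch below covers only the upper-bound metrics (p95, error rate, CPU, memory); throughput is left out because it is a lower bound whose target depends on the workload. `findViolations` and the field names are hypothetical.

```javascript
// Upper-bound targets taken from the metrics table in this skill
const targets = { p95Ms: 200, errorRatePct: 0.1, cpuPct: 70, memoryPct: 80 };

// Returns the names of metrics that are not strictly below their limit.
function findViolations(run, limits) {
  return Object.keys(limits).filter((metric) => !(run[metric] < limits[metric]));
}

// Values copied from the passing and failing summaries above
const healthyRun = { p95Ms: 180, errorRatePct: 0.05, cpuPct: 65, memoryPct: 70 };
const degradedRun = { p95Ms: 3500, errorRatePct: 5, cpuPct: 95, memoryPct: 90 };

console.log('healthy:', findViolations(healthyRun, targets));
console.log('degraded:', findViolations(degradedRun, targets));
```

A missing metric also counts as a violation in this sketch, since `undefined < limit` is false, which avoids passing a run that simply failed to report a value.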
| ❌ Anti-Pattern | ✅ Better |
|----------------|-----------|
| Testing too late | Test early and often |
| Unrealistic scenarios | Model real user behavior |
| 0 to 1000 users instantly | Ramp up gradually |
| No monitoring during tests | Monitor everything |
| No baseline | Establish and track trends |
| One-time testing | Continuous performance testing |
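As an illustration of "ramp up gradually", the helper below builds a stepped list of stages in the same `{ duration, target }` shape that the k6 `stages` option uses elsewhere in this skill. `buildRampStages` and the chosen durations are hypothetical; suitable step sizes depend on the system under test.

```javascript
// Builds evenly spaced ramp-up steps, a steady-state hold, and a ramp-down.
function buildRampStages(targetVus, steps, stepDuration, holdDuration) {
  const stages = [];
  for (let i = 1; i <= steps; i++) {
    stages.push({
      duration: stepDuration,
      target: Math.round((targetVus * i) / steps),
    });
  }
  stages.push({ duration: holdDuration, target: targetVus }); // steady state
  stages.push({ duration: stepDuration, target: 0 }); // ramp down
  return stages;
}

// Reach 1,000 virtual users in four steps instead of instantly
const stages = buildRampStages(1000, 4, '2m', '10m');
console.log(JSON.stringify(stages, null, 2));
```

The resulting array could be assigned to `options.stages` in a k6 script in place of a hand-written list.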
// Comprehensive load test
const perfTest = await Task("Load Test", {
  target: 'https://api.example.com',
  scenarios: {
    checkout: { vus: 100, duration: '5m' },
    search: { vus: 200, duration: '5m' },
    browse: { vus: 500, duration: '5m' }
  },
  thresholds: {
    'http_req_duration': ['p(95)<200'],
    'http_req_failed': ['rate<0.01']
  }
}, "qe-performance-tester");

// Bottleneck analysis (uses the load test result captured above)
await Task("Analyze Bottlenecks", {
  testResults: perfTest,
  metrics: ['cpu', 'memory', 'db_queries', 'network']
}, "qe-performance-tester");

// CI integration
await Task("CI Performance Gate", {
  mode: 'smoke',
  duration: '1m',
  vus: 10,
  failOn: { 'p95_response_time': 300, 'error_rate': 0.01 }
}, "qe-performance-tester");
aqe/performance/
├── results/* - Test execution results
├── baselines/* - Performance baselines
├── bottlenecks/* - Identified bottlenecks
└── trends/* - Historical trends
const perfFleet = await FleetManager.coordinate({
  strategy: 'performance-testing',
  agents: [
    'qe-performance-tester',
    'qe-quality-analyzer',
    'qe-production-intelligence',
    'qe-deployment-readiness'
  ],
  topology: 'sequential'
});
- Performance is a feature: Test it like functionality
- Test continuously: Not just before launch
- Monitor production: Synthetic + real user monitoring
- Fix what matters: Focus on user-impacting bottlenecks
- Trend over time: Catch degradation early
With Agents: Agents automate load testing, analyze bottlenecks, and compare with production. Use agents to maintain performance at scale.
After each performance test run, append results to run-history.json in this skill directory:
node -e "
const fs = require('fs');
const h = JSON.parse(fs.readFileSync('.claude/skills/performance-testing/run-history.json'));
h.runs.push({date: new Date().toISOString().split('T')[0], scenario: 'load', p95_ms: P95, throughput_rps: RPS, error_rate_pct: ERR});
fs.writeFileSync('.claude/skills/performance-testing/run-history.json', JSON.stringify(h, null, 2));
"
Read run-history.json before each run and compare the new results against the baseline. Alert if p95 increases by more than 20% over the baseline.
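The 20% rule can be sketched as a small check over the `runs` array written by the snippet above. How the baseline is defined is not specified in this skill, so the sketch assumes it is the mean `p95_ms` of earlier runs for the same scenario. `checkP95Regression` and the sample history are hypothetical.

```javascript
// Compares a new p95 against the mean p95 of earlier runs for a scenario.
// Assumption: "baseline" means the average of the stored runs.
function checkP95Regression(history, currentP95Ms, scenario = 'load', maxIncrease = 0.2) {
  const past = history.runs
    .filter((run) => run.scenario === scenario)
    .map((run) => run.p95_ms);
  if (past.length === 0) return { baseline: null, regressed: false };
  const baseline = past.reduce((sum, value) => sum + value, 0) / past.length;
  return { baseline, regressed: currentP95Ms > baseline * (1 + maxIncrease) };
}

// Hypothetical history in the same shape as run-history.json
const sampleHistory = {
  runs: [
    { date: '2025-01-01', scenario: 'load', p95_ms: 180, throughput_rps: 900, error_rate_pct: 0.05 },
    { date: '2025-01-08', scenario: 'load', p95_ms: 200, throughput_rps: 950, error_rate_pct: 0.04 },
    { date: '2025-01-15', scenario: 'load', p95_ms: 220, throughput_rps: 920, error_rate_pct: 0.06 },
  ],
};

console.log(checkP95Regression(sampleHistory, 241));
```

In practice the history would be loaded with `JSON.parse(fs.readFileSync(...))` from the same path used when appending results.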