proffesor-for-testing/performance-testing

Name: performance-testing
Author: proffesor-for-testing

.claude/skills/performance-testing/SKILL.md

npx skillsauth add proffesor-for-testing/agentic-qe performance-testing

Performance Testing

<default_to_action> When testing performance or planning load tests:

DEFINE SLOs: p95 response time, throughput, error rate targets
IDENTIFY critical paths: revenue flows, high-traffic pages, key APIs
CREATE realistic scenarios: user journeys, think time, varied data
EXECUTE with monitoring: CPU, memory, DB queries, network
ANALYZE bottlenecks and fix before production

Quick Test Type Selection:

Expected load validation → Load testing
Find breaking point → Stress testing
Sudden traffic spike → Spike testing
Memory leaks, resource exhaustion → Endurance/soak testing
Horizontal/vertical scaling → Scalability testing

Critical Success Factors:

Performance is a feature, not an afterthought
Test early and often, not just before release
Focus on user-impacting bottlenecks </default_to_action>

Quick Reference Card

When to Use

Before major releases
After infrastructure changes
Before scaling events (Black Friday)
When setting SLAs/SLOs

Test Types

| Type | Purpose | When |
|------|---------|------|
| Load | Expected traffic | Every release |
| Stress | Beyond capacity | Quarterly |
| Spike | Sudden surge | Before events |
| Endurance | Memory leaks | After code changes |
| Scalability | Scaling validation | Infrastructure changes |

Key Metrics

| Metric | Target | Why |
|--------|--------|-----|
| p95 response | < 200ms | User experience |
| Throughput | 10k req/min | Capacity |
| Error rate | < 0.1% | Reliability |
| CPU | < 70% | Headroom |
| Memory | < 80% | Stability |
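The targets in the table can be turned into an automated check. The following is a minimal sketch in plain JavaScript, not part of the skill itself: the `TARGETS` object, the metric key names, and the `findViolations` helper are all hypothetical, and the numbers are simply copied from the table and should be replaced with your own SLOs.

```javascript
// Targets copied from the Key Metrics table; adjust them to your own SLOs.
// The key names are hypothetical and chosen for this sketch only.
const TARGETS = {
  p95Ms: { max: 200 },              // p95 response < 200ms
  throughputPerMin: { min: 10000 }, // at least 10k req/min
  errorRatePct: { max: 0.1 },       // error rate < 0.1%
  cpuPct: { max: 70 },              // CPU < 70%
  memoryPct: { max: 80 },           // memory < 80%
};

// Returns the names of metrics that miss their target.
function findViolations(measured, targets = TARGETS) {
  const violations = [];
  for (const [name, limit] of Object.entries(targets)) {
    const value = measured[name];
    if (value === undefined) continue; // metric not collected in this run
    if (limit.max !== undefined && value >= limit.max) violations.push(name);
    if (limit.min !== undefined && value < limit.min) violations.push(name);
  }
  return violations;
}
```

A run that returns an empty array meets every target; any returned name points at the metric to investigate first.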

Tools

k6: Modern, JS-based, CI/CD friendly
JMeter: Enterprise, feature-rich
Artillery: Simple YAML configs
Gatling: Scala, great reporting

Agent Coordination

qe-performance-tester: Load test orchestration
qe-quality-analyzer: Results analysis
qe-production-intelligence: Production comparison

Defining SLOs

Bad: "The system should be fast"
Good: "p95 response time < 200ms under 1,000 concurrent users"

export const options = {
  thresholds: {
    http_req_duration: ['p(95)<200'],  // 95% < 200ms
    http_req_failed: ['rate<0.01'],     // < 1% failures
  },
};
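To make concrete why the SLO is stated as a percentile rather than an average, here is a small sketch in plain JavaScript. It uses the nearest-rank definition of a percentile; load testing tools may interpolate between samples and report slightly different values. The latency samples are invented for illustration.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function mean(samples) {
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

// Hypothetical latencies in ms: 18 fast requests and 2 very slow ones.
const latencies = [...Array(18).fill(100), 2000, 2500];

console.log(mean(latencies));           // 315
console.log(percentile(latencies, 50)); // 100
console.log(percentile(latencies, 95)); // 2000
```

The mean of 315ms hides the fact that one request in ten took two seconds or more, which is exactly what a p95 threshold is meant to surface.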

Realistic Scenarios

Bad: Every user hits homepage repeatedly
Good: Model actual user behavior

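// Realistic distribution
// 40% browse, 30% search, 20% details, 10% checkout
import { sleep } from 'k6';
import { randomIntBetween } from 'https://jslib.k6.io/k6-utils/1.2.0/index.js';

// browse(), search(), viewProduct(), and checkout() are assumed to be
// defined elsewhere in the script; each issues the requests for one journey.
export default function () {
  const action = Math.random();
  if (action < 0.4) browse();
  else if (action < 0.7) search();
  else if (action < 0.9) viewProduct();
  else checkout();

  sleep(randomIntBetween(1, 5)); // Think time: 1 to 5 seconds
}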

Common Bottlenecks

Database

Symptoms: Slow queries under load, connection pool exhaustion
Fixes: Add indexes, optimize N+1 queries, increase pool size, read replicas

N+1 Queries

// BAD: 100 orders = 101 queries (1 for the orders, then 1 per order)
const orders = await Order.findAll();
for (const order of orders) {
  const customer = await Customer.findById(order.customerId);
}

// GOOD: 1 query (customers are loaded together with the orders)
const ordersWithCustomers = await Order.findAll({ include: [Customer] });

Synchronous Processing

Problem: Blocking operations in request path (sending email during checkout)
Fix: Use message queues, process async, return immediately
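The fix above can be sketched as follows. This is an illustration only: the in-memory array stands in for a real message queue, and `handleCheckout`, `drainEmailQueue`, and the message fields are hypothetical names. In production the worker would run in a separate process and the queue would be durable.

```javascript
// In-memory stand-in for a real message queue (illustration only).
const emailQueue = [];

// Request handler: enqueues the confirmation email and returns at once,
// so the slow email send never sits in the request path.
function handleCheckout(order) {
  emailQueue.push({ type: 'order-confirmation', orderId: order.id, to: order.email });
  return { status: 'accepted', orderId: order.id };
}

// Worker: runs outside the request path and drains the queue.
// `sendEmail` is whatever function actually delivers a message.
function drainEmailQueue(sendEmail) {
  let sent = 0;
  while (emailQueue.length > 0) {
    sendEmail(emailQueue.shift());
    sent += 1;
  }
  return sent;
}
```

With this split, checkout latency no longer depends on the email provider, and a slow or failing provider only delays the queue rather than the user.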

Memory Leaks

Detection: Endurance testing, memory profiling
Common causes: Event listeners not cleaned, caches without eviction
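A cache without eviction grows for as long as the process lives, which is the pattern an endurance test tends to expose. One common remedy is to cap the cache size and evict the least recently used entry. The `LruCache` class below is a minimal sketch of that idea, not code from the skill; it relies on the fact that a JavaScript `Map` iterates in insertion order.

```javascript
// Bounded cache that evicts the least recently used entry when full.
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // Map iterates in insertion order
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    this.entries.delete(key); // re-insert to mark as most recently used
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }

  get size() {
    return this.entries.size;
  }
}
```

Because the size never exceeds `maxEntries`, memory used by the cache stays flat during a long soak test instead of climbing with the number of distinct keys.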

External Dependencies

Solutions: Aggressive timeouts, circuit breakers, caching, graceful degradation
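Of the solutions listed, the circuit breaker is the least obvious to implement, so here is a minimal sketch. It is deliberately synchronous to keep it short, whereas real dependency calls are usually asynchronous and would also need a timeout. The class name, option names, and defaults are assumptions made for this example.

```javascript
// Minimal synchronous circuit breaker (illustrative sketch).
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetAfterMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;        // injectable clock, which makes the breaker testable
    this.failures = 0;
    this.openedAt = null;  // null means the circuit is closed
  }

  // Calls fn(). While the circuit is open, returns fallback() without
  // touching the dependency at all.
  call(fn, fallback) {
    if (this.openedAt !== null) {
      if (this.now() - this.openedAt < this.resetAfterMs) return fallback();
      this.openedAt = null; // cool-down elapsed: allow one trial call
    }
    try {
      const result = fn();
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.failureThreshold) this.openedAt = this.now();
      return fallback(); // graceful degradation instead of an error
    }
  }
}
```

The fallback is where caching and graceful degradation plug in: it can return the last known good response or a reduced feature set while the dependency recovers.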

k6 CI/CD Example

// performance-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '1m', target: 50 },   // Ramp up
    { duration: '3m', target: 50 },   // Steady
    { duration: '1m', target: 0 },    // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<200'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/products');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 200ms': (r) => r.timings.duration < 200,
  });
  sleep(1);
}

# GitHub Actions
- name: Run k6 test
  uses: grafana/[email protected]
  with:
    filename: performance-test.js

Analyzing Results

Good Results

Load: 1,000 users | p95: 180ms | Throughput: 5,000 req/s
Error rate: 0.05% | CPU: 65% | Memory: 70%

Problems

Load: 1,000 users | p95: 3,500ms ❌ | Throughput: 500 req/s ❌
Error rate: 5% ❌ | CPU: 95% ❌ | Memory: 90% ❌

Root Cause Analysis

Correlate metrics: When response time spikes, what changes?
Check logs: Errors, warnings, slow queries
Profile code: Where is time spent?
Monitor resources: CPU, memory, disk
Trace requests: End-to-end flow

Anti-Patterns

| ❌ Anti-Pattern | ✅ Better |
|----------------|-----------|
| Testing too late | Test early and often |
| Unrealistic scenarios | Model real user behavior |
| 0 to 1000 users instantly | Ramp up gradually |
| No monitoring during tests | Monitor everything |
| No baseline | Establish and track trends |
| One-time testing | Continuous performance testing |
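As an illustration of "ramp up gradually", the sketch below generates a list of stages in the same `{ duration, target }` shape used by the k6 `stages` option elsewhere in this document. The `buildRampStages` helper and its parameters are hypothetical; the step count and durations are choices you would tune for your system.

```javascript
// Builds stages that climb to `targetVus` in equal steps, hold, then ramp down.
function buildRampStages(targetVus, steps, stepDuration, holdDuration) {
  const stages = [];
  for (let i = 1; i <= steps; i += 1) {
    stages.push({
      duration: stepDuration,
      target: Math.round((targetVus * i) / steps),
    });
  }
  stages.push({ duration: holdDuration, target: targetVus }); // steady state
  stages.push({ duration: stepDuration, target: 0 });         // ramp down
  return stages;
}

// Example: reach 1000 virtual users in four 2-minute steps, hold for 10 minutes.
const rampStages = buildRampStages(1000, 4, '2m', '10m');
console.log(rampStages.map((s) => s.target).join(' -> '));
```

For the example call, the targets progress through 250, 500, 750, and 1000 users before the hold and ramp-down, so each step gives monitoring a chance to show where response times start to degrade.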

Agent-Assisted Performance Testing

// Comprehensive load test
const perfTest = await Task("Load Test", {
  target: 'https://api.example.com',
  scenarios: {
    checkout: { vus: 100, duration: '5m' },
    search: { vus: 200, duration: '5m' },
    browse: { vus: 500, duration: '5m' }
  },
  thresholds: {
    'http_req_duration': ['p(95)<200'],
    'http_req_failed': ['rate<0.01']
  }
}, "qe-performance-tester");

// Bottleneck analysis
await Task("Analyze Bottlenecks", {
  testResults: perfTest,
  metrics: ['cpu', 'memory', 'db_queries', 'network']
}, "qe-performance-tester");

// CI integration
await Task("CI Performance Gate", {
  mode: 'smoke',
  duration: '1m',
  vus: 10,
  failOn: { 'p95_response_time': 300, 'error_rate': 0.01 }
}, "qe-performance-tester");

Agent Coordination Hints

Memory Namespace

aqe/performance/
├── results/*       - Test execution results
├── baselines/*     - Performance baselines
├── bottlenecks/*   - Identified bottlenecks
└── trends/*        - Historical trends

Fleet Coordination

const perfFleet = await FleetManager.coordinate({
  strategy: 'performance-testing',
  agents: [
    'qe-performance-tester',
    'qe-quality-analyzer',
    'qe-production-intelligence',
    'qe-deployment-readiness'
  ],
  topology: 'sequential'
});

Pre-Production Checklist

[ ] Load test passed (expected traffic)
[ ] Stress test passed (2-3x expected)
[ ] Spike test passed (sudden surge)
[ ] Endurance test passed (24+ hours)
[ ] Database indexes in place
[ ] Caching configured
[ ] Monitoring and alerting set up
[ ] Performance baseline established

Related Skills

agentic-quality-engineering - Agent coordination
api-testing-patterns - API performance
chaos-engineering-resilience - Resilience testing

Remember

Performance is a feature: Test it like functionality
Test continuously: Not just before launch
Monitor production: Synthetic + real user monitoring
Fix what matters: Focus on user-impacting bottlenecks
Trend over time: Catch degradation early

With Agents: Agents automate load testing, analyze bottlenecks, and compare with production. Use agents to maintain performance at scale.

Run History

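After each performance test run, append results to run-history.json in this skill directory. In the command below, P95, RPS, and ERR are placeholders: replace them with the measured values before running it.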

node -e "
const fs = require('fs');
const h = JSON.parse(fs.readFileSync('.claude/skills/performance-testing/run-history.json'));
h.runs.push({date: new Date().toISOString().split('T')[0], scenario: 'load', p95_ms: P95, throughput_rps: RPS, error_rate_pct: ERR});
fs.writeFileSync('.claude/skills/performance-testing/run-history.json', JSON.stringify(h, null, 2));
"

Read run-history.json before each run — compare with baselines. Alert if p95 increases >20% from baseline.
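The comparison described above can be sketched as a small function. It assumes the run-history.json shape written by the command in this section (a `runs` array whose entries carry `p95_ms`). How the baseline is derived is not specified in the skill, so this sketch assumes the median p95 of earlier runs; the function name and return shape are hypothetical.

```javascript
// Compares the current p95 against a baseline derived from earlier runs.
// Assumption: the baseline is the median p95_ms of the recorded runs.
function p95Regression(history, currentP95Ms, thresholdPct = 20) {
  const runs = history.runs || [];
  if (runs.length === 0) {
    return { regressed: false, baselineMs: null, changePct: null }; // nothing to compare
  }
  const sorted = runs.map((r) => r.p95_ms).sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const baselineMs = sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
  const changePct = ((currentP95Ms - baselineMs) / baselineMs) * 100;
  return { regressed: changePct > thresholdPct, baselineMs, changePct };
}
```

Using the median rather than the most recent run makes the baseline less sensitive to a single noisy run, which matters when CI runner variance is high.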

Gotchas

k6 scripts generated by agent often hardcode base URLs — use environment variables for portability
Load tests in containers hit resource limits before app limits — ensure container has 2x the resources of target
Agent forgets to include think time between requests — without it, load is unrealistically bursty
P95 vs P99 matters — agent defaults to averages which hide tail latency problems
Baseline comparison requires consistent environment — CI runner variance can cause 20%+ noise
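For the first gotcha, one way to avoid a hardcoded base URL is to resolve it from the environment with a fallback. The helper below is a sketch in plain JavaScript: the `BASE_URL` variable name and the `resolveBaseUrl` function are assumptions for illustration. In a k6 script, environment variables are exposed through the global `__ENV` object, so you would pass that in.

```javascript
// Resolves the base URL from an environment object, with a fallback.
// In a k6 script pass the built-in __ENV object; in Node pass process.env.
function resolveBaseUrl(env, fallback = 'https://api.example.com') {
  const raw = env.BASE_URL || fallback;
  return raw.replace(/\/+$/, ''); // drop trailing slashes so paths join cleanly
}

// Example with a hypothetical staging host:
const baseUrl = resolveBaseUrl({ BASE_URL: 'https://staging.example.com/' });
console.log(`${baseUrl}/products`); // https://staging.example.com/products
```

With k6, the value can then be supplied at run time, for example `k6 run -e BASE_URL=https://staging.example.com performance-test.js`, so the same script runs unchanged against each environment.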

Profiles application performance under load using k6, Artillery, or JMeter to measure latency, throughput, and error rates. Use when planning load tests, stress tests, soak tests, benchmarking APIs, or identifying performance bottlenecks.

304 stars

development

Updated Apr 11, 2026

npx skillsauth add proffesor-for-testing/agentic-qe performance-testing

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status | Scanner | Description | % |
|--------|---------|-------------|---|
| Clean | Trivy | Container and dependency vulnerability scanner | 95% |
| Clean | Semgrep | Static code analysis for vulnerabilities | 95% |
| Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95% |
| Skipped | Snyk (dep) | Open source security scanning | 50% |
| Skipped | Socket.dev | Supply chain security analysis | 50% |
| Skipped | VirusTotal | Multi-engine malware detection | 50% |
| Skipped | CrowdStrike | Advanced threat intelligence | 50% |
| Skipped | OSV-Scanner | Open Source Vulnerability database check | 50% |
| Skipped | OWASP Dep-Check | | 50% |

Last scanned: Apr 11, 2026, 8:20 PM (74.7s, 7 files scanned)

SKILL.md

name: performance-testing
description: Profiles application performance under load using k6, Artillery, or JMeter to measure latency, throughput, and error rates. Use when planning load tests, stress tests, soak tests, benchmarking APIs, or identifying performance bottlenecks.
category: specialized-testing
priority: high
tokenEstimate: 1100
agents: [qe-performance-tester, qe-quality-analyzer, qe-production-intelligence]
implementation_status: optimized
optimization_version: 1.0
last_optimized: 2025-12-02
dependencies: []
quick_reference_card: true
tags: [performance, load-testing, stress-testing, scalability, k6, bottlenecks]
trust_tier: 3
schema_path: schemas/output.json
validator_path: scripts/validate-config.json
eval_path: evals/performance-testing.yaml

Related Skills

proffesor-for-testing/qe-xp-practices

development

Verified | Trusted | Community

Apply XP practices including pair programming, ensemble programming, continuous integration, and sustainable pace. Use when implementing agile development practices, improving team collaboration, or adopting technical excellence practices.

304 | SKILL.md | Updated Apr 11, 2026

proffesor-for-testing/qe-wms-testing-patterns

development

Verified | Trusted | Community

Warehouse Management System testing patterns for inventory operations, pick/pack/ship workflows, wave management, EDI X12/EDIFACT compliance, RF/barcode scanning, and WMS-ERP integration. Use when testing WMS platforms (Blue Yonder, Manhattan, SAP EWM).

304 | SKILL.md | Updated Apr 11, 2026

proffesor-for-testing/qe-visual-testing-advanced

testing

Verified | Trusted | Community

Advanced visual regression testing with pixel-perfect comparison, AI-powered diff analysis, responsive design validation, and cross-browser visual consistency. Use when detecting UI regressions, validating designs, or ensuring visual consistency.

304 | SKILL.md | Updated Apr 11, 2026

proffesor-for-testing/qe-verification-quality

development

Verified | Trusted | Community

Comprehensive truth scoring, code quality verification, and automatic rollback system with 0.95 accuracy threshold for ensuring high-quality agent outputs and codebase reliability.

304 | SKILL.md | Updated Apr 11, 2026

Install manually

# Clone the repo
git clone https://github.com/proffesor-for-testing/agentic-qe.git

# Copy into Claude Code skills folder (global)
cp -r agentic-qe/.claude/skills/performance-testing ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

proffesor-for-testing/agentic-qe

304 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT