.claude/skills/test-observability/SKILL.md
Integrate Playwright tests with OpenTelemetry, Grafana, Prometheus, Loki, and Tempo. Use when debugging test failures across distributed systems, measuring test performance, creating test dashboards, or correlating tests with backend traces.
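Connect your Playwright tests to your observability stack (Grafana, Prometheus, Loki, Tempo) to debug failures across distributed systems, measure test performance, build test dashboards, and correlate tests with backend traces.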
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   Playwright    │────▶│  OTEL Collector  │────▶│  Tempo (Traces) │
│     Tests       │     │                  │────▶│  Loki (Logs)    │
│                 │     │                  │────▶│  Prometheus     │
└─────────────────┘     └──────────────────┘     └─────────────────┘
                                                          │
                                                          ▼
                                                 ┌─────────────────┐
                                                 │     Grafana     │
                                                 │  Test Dashboard │
                                                 └─────────────────┘
npm install playwright-opentelemetry-reporter @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-http
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [
    ['html'],
    ['playwright-opentelemetry-reporter', {
      serviceName: 'playwright-tests',
      endpoint: process.env.OTEL_ENDPOINT || 'http://localhost:4318/v1/traces',
    }],
  ],
});
# With local OTEL collector
OTEL_ENDPOINT=http://localhost:4318/v1/traces npx playwright test
# With Railway OTEL collector
OTEL_ENDPOINT=http://otel-collector.railway.internal:4318/v1/traces npx playwright test
To find these traces in Grafana, open Explore → Tempo and search for:
service.name = "playwright-tests"

Tracetest enables trace-based testing: assert on any span in the distributed trace.
npm install @tracetest/playwright
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [
    ['html'],
    ['@tracetest/playwright', {
      serverUrl: process.env.TRACETEST_URL || 'http://localhost:11633',
      apiKey: process.env.TRACETEST_API_KEY,
    }],
  ],
});
// tests/checkout.spec.ts
import { test, expect } from '@playwright/test';
import { Tracetest } from '@tracetest/playwright';

test('checkout creates order', async ({ page }) => {
  const tracetest = new Tracetest();

  // Start trace capture
  await tracetest.capture();

  // Perform user action
  await page.goto('/checkout');
  await page.click('#place-order');
  await expect(page.locator('.order-confirmation')).toBeVisible();

  // Assert on the backend trace
  await tracetest.assertOnTrace({
    assertions: [
      // API response status
      {
        selector: 'span[name="POST /api/orders"]',
        assertion: 'attr:http.status_code = 201',
      },
      // Database query time
      {
        selector: 'span[name="INSERT orders"]',
        assertion: 'attr:db.duration < 100ms',
      },
      // Payment service result
      {
        selector: 'span[name="payment.process"]',
        assertion: 'attr:payment.status = "success"',
      },
    ],
  });
});
dashboards/test-results-dashboard.json

| Panel | Query | Purpose |
|-------|-------|---------|
| Pass Rate | sum(playwright_test_passed) / sum(playwright_test_total) | Overall health |
| Test Duration (p95) | histogram_quantile(0.95, sum by (le) (rate(playwright_test_duration_bucket[5m]))) | Performance |
| Failed Tests | playwright_test_failed{status="failed"} | Quick triage |
| Flaky Tests | playwright_test_retries > 1 | Identify flakiness |
| Slowest Tests | topk(10, playwright_test_duration) | Optimization targets |
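
To make the panel definitions concrete, the sketch below computes the same three ideas (pass rate, flaky tests, slowest tests) from an in-memory list of results. The `TestResult` shape and the rule that a flaky test is one that passed after more than one retry are assumptions for illustration; they are not Playwright types or the metrics the reporter actually emits.

```typescript
// Hypothetical shape for one finished test; not a Playwright type.
interface TestResult {
  name: string;
  status: "passed" | "failed";
  retries: number; // retry attempts used before the final status
  durationMs: number;
}

// Fraction of tests whose final status is "passed" (0 when there are no results).
function passRate(results: TestResult[]): number {
  if (results.length === 0) return 0;
  const passed = results.filter((r) => r.status === "passed").length;
  return passed / results.length;
}

// Assumption: a test is flaky when it passed but needed more than `retryThreshold` retries.
function flakyTests(results: TestResult[], retryThreshold = 1): string[] {
  return results
    .filter((r) => r.status === "passed" && r.retries > retryThreshold)
    .map((r) => r.name);
}

// The n slowest tests, slowest first (the same idea as the topk panel).
function slowestTests(results: TestResult[], n: number): TestResult[] {
  return results.slice().sort((a, b) => b.durationMs - a.durationMs).slice(0, n);
}

const sample: TestResult[] = [
  { name: "login", status: "passed", retries: 0, durationMs: 1200 },
  { name: "checkout", status: "passed", retries: 2, durationMs: 5400 },
  { name: "search", status: "failed", retries: 2, durationMs: 3100 },
  { name: "profile", status: "passed", retries: 0, durationMs: 800 },
];

console.log(passRate(sample)); // 0.75
console.log(flakyTests(sample)); // [ 'checkout' ]
console.log(slowestTests(sample, 2).map((r) => r.name)); // [ 'checkout', 'search' ]
```

In a real setup these values would come from Prometheus queries rather than from application code; the sketch only shows what each panel is meant to measure.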
// Export custom metrics from tests
import { test } from '@playwright/test';
import { metrics } from '@opentelemetry/api';

const testMeter = metrics.getMeter('playwright-tests');
const testDuration = testMeter.createHistogram('test.duration');
const testCounter = testMeter.createCounter('test.count');

test('custom metrics', async ({ page }) => {
  const start = Date.now();
  await page.goto('/');
  await page.click('.action');

  // Record custom metric
  testDuration.record(Date.now() - start, {
    test: 'homepage-action',
    browser: 'chromium',
  });
  testCounter.add(1, { status: 'passed' });
});
Connect to your existing Railway observability stack.
# Set in Railway service or .env
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector.railway.internal:4318
OTEL_SERVICE_NAME=playwright-tests
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [
    ['playwright-opentelemetry-reporter', {
      serviceName: process.env.OTEL_SERVICE_NAME || 'playwright-tests',
      endpoint: process.env.OTEL_EXPORTER_OTLP_ENDPOINT,
      // Send an auth header only when a token is configured
      headers: process.env.OTEL_AUTH_TOKEN
        ? { Authorization: `Bearer ${process.env.OTEL_AUTH_TOKEN}` }
        : {},
    }],
  ],
});
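
The earlier configuration passes a full `/v1/traces` URL, while `OTEL_EXPORTER_OTLP_ENDPOINT` above is a base URL. Under the OpenTelemetry exporter conventions, the generic variable is a base to which the signal path is appended, and `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` is used as given. The helper below is a minimal sketch of that resolution; `resolveTracesEndpoint` is a hypothetical name, and whether the reporter appends the path on its own is not confirmed here, so check its behavior before relying on this.

```typescript
type Env = Record<string, string | undefined>;

// Resolve the OTLP/HTTP traces URL from OpenTelemetry-style environment variables.
function resolveTracesEndpoint(env: Env): string {
  // A signal-specific variable is used exactly as given.
  const specific = env["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"];
  if (specific) return specific;

  // The generic variable is a base URL; strip trailing slashes and append the traces path.
  const base = env["OTEL_EXPORTER_OTLP_ENDPOINT"] || "http://localhost:4318";
  return `${base.replace(/\/+$/, "")}/v1/traces`;
}

console.log(
  resolveTracesEndpoint({
    OTEL_EXPORTER_OTLP_ENDPOINT: "http://otel-collector.railway.internal:4318",
  }),
);
// http://otel-collector.railway.internal:4318/v1/traces
```

In a config file the argument would typically be `process.env`; it is passed explicitly here so the function stays easy to test.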
# Test OTEL collector is receiving data
curl -X POST http://otel-collector.railway.internal:4318/v1/traces \
  -H "Content-Type: application/json" \
  -d '{"resourceSpans":[]}'
# Should return 200 OK
Send test logs to Loki alongside traces.
// tests/fixtures/logger.ts
import winston from 'winston';
import { OpenTelemetryTransportV3 } from '@opentelemetry/winston-transport';

export const logger = winston.createLogger({
  transports: [
    new winston.transports.Console(),
    // Forward records to the global OpenTelemetry LoggerProvider.
    // Register a provider with an OTLP log exporter (for example via
    // @opentelemetry/sdk-node) so the collector can route them to Loki.
    new OpenTelemetryTransportV3(),
  ],
});
// In tests
import { test } from '@playwright/test';
import { logger } from './fixtures/logger';

test('with logging', async ({ page }) => {
  logger.info('Starting test', { test: 'checkout' });
  await page.goto('/checkout');
  logger.debug('Page loaded', { url: page.url() });
  await page.click('#submit');
  logger.info('Order submitted');
});
1. Grafana → Explore → Loki
2. Query: {service="playwright-tests"}
3. Filter: |= "error" or |json | level="error"
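
The query and filter from the steps above can also be assembled in code, for example when generating Explore links from a test report. This is a small sketch; `buildLogQuery` is a hypothetical helper and covers only a label selector plus a `|=` line filter, not the full LogQL grammar.

```typescript
// Build a LogQL query from stream labels plus an optional line filter.
function buildLogQuery(labels: Record<string, string>, contains?: string): string {
  // JSON.stringify adds the double quotes and escapes embedded quotes or backslashes.
  const selector = Object.keys(labels)
    .map((key) => `${key}=${JSON.stringify(labels[key])}`)
    .join(", ");
  const filter = contains === undefined ? "" : ` |= ${JSON.stringify(contains)}`;
  return `{${selector}}${filter}`;
}

console.log(buildLogQuery({ service: "playwright-tests" }));
// {service="playwright-tests"}
console.log(buildLogQuery({ service: "playwright-tests" }, "error"));
// {service="playwright-tests"} |= "error"
```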
Set up alerts for test failures.
# alerts/test-alerts.yml
groups:
  - name: playwright-tests
    rules:
      - alert: TestFailureRate
        expr: |
          (sum(rate(playwright_test_failed[5m])) / sum(rate(playwright_test_total[5m]))) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High test failure rate (>10%)"
      - alert: TestDurationSpike
        expr: |
          histogram_quantile(0.95, sum by (le) (rate(playwright_test_duration_bucket[5m]))) > 30
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Test duration p95 > 30 seconds"
Grafana alert condition: avg() of A is above 0.1

❌ tests/checkout.spec.ts:15 - Order creation failed
Expected: Order confirmation visible
Actual: Error page shown
Grafana → Explore → Tempo
Query: service.name="playwright-tests" AND name="checkout.spec.ts"
Browser click "Place Order" (50ms)
└─ POST /api/orders (200ms)
   └─ Validate cart (10ms)
   └─ Process payment ❌ (timeout after 30s)
      └─ Payment gateway unreachable
   └─ Create order (skipped)
Found: Payment gateway timeout caused order failure.
Without traces: Would have debugged browser-side for hours.
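
The manual walk through the trace above amounts to following failed spans down to the deepest one. The sketch below shows that idea on a simplified span tree. The `SpanNode` shape is hypothetical (real Tempo spans carry IDs, timestamps, and attributes), and it assumes an error status is propagated to every ancestor of the failing span.

```typescript
// Hypothetical, simplified span shape for illustration only.
interface SpanNode {
  name: string;
  status: "ok" | "error" | "unset";
  children: SpanNode[];
}

// Follow failed spans downward and return the path to the deepest one.
function failurePath(span: SpanNode): string[] {
  if (span.status !== "error") return [];
  for (const child of span.children) {
    const path = failurePath(child);
    if (path.length > 0) return [span.name].concat(path);
  }
  return [span.name]; // no failed child: this span is the deepest failure
}

const checkoutTrace: SpanNode = {
  name: "Browser click",
  status: "error",
  children: [
    {
      name: "POST /api/orders",
      status: "error",
      children: [
        { name: "Validate cart", status: "ok", children: [] },
        {
          name: "Process payment",
          status: "error",
          children: [{ name: "Payment gateway unreachable", status: "error", children: [] }],
        },
        { name: "Create order", status: "unset", children: [] },
      ],
    },
  ],
};

console.log(failurePath(checkoutTrace).join(" > "));
// Browser click > POST /api/orders > Process payment > Payment gateway unreachable
```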
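import { test } from '@playwright/test';
import { SpanStatusCode, trace } from '@opentelemetry/api';

test('with trace context', async ({ page }) => {
  const tracer = trace.getTracer('playwright');
  await tracer.startActiveSpan('checkout-flow', async (span) => {
    span.setAttribute('test.name', 'checkout');
    span.setAttribute('test.browser', 'chromium');
    try {
      await page.goto('/checkout');
      await page.click('#submit');
      span.setStatus({ code: SpanStatusCode.OK });
    } catch (error) {
      // Mark the span as failed so the trace shows where the test broke
      span.recordException(error as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw error;
    } finally {
      span.end(); // Always end the span, even when the test throws
    }
  });
});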
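// Ensure trace ID passes from browser to backend
test('trace propagation', async ({ page }) => {
  // Get the current trace and span IDs; this needs an active span
  // (for example, run inside tracer.startActiveSpan)
  const spanContext = trace.getActiveSpan()?.spanContext();
  if (!spanContext) throw new Error('No active span to propagate');
  const { traceId, spanId } = spanContext;

  // Add a W3C traceparent header to every outgoing request
  await page.route('**/*', route =>
    route.continue({
      headers: {
        ...route.request().headers(),
        traceparent: `00-${traceId}-${spanId}-01`,
      },
    }),
  );
});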
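
The `traceparent` value used in the propagation test follows the W3C Trace Context format: version `00`, a 32-character hex trace ID, a 16-character hex parent span ID, and a flags byte where `01` means sampled. A malformed header is ignored by compliant backends, so it can help to validate the parts before sending. The sketch below does that; `buildTraceparent` is a hypothetical helper, not part of `@opentelemetry/api`.

```typescript
// Build a W3C Trace Context `traceparent` header value (version 00).
function buildTraceparent(traceId: string, spanId: string, sampled = true): string {
  // IDs must be lowercase hex of fixed length, and all-zero IDs are invalid.
  if (!/^[0-9a-f]{32}$/.test(traceId) || /^0+$/.test(traceId)) {
    throw new Error("traceId must be 32 lowercase hex characters and not all zeros");
  }
  if (!/^[0-9a-f]{16}$/.test(spanId) || /^0+$/.test(spanId)) {
    throw new Error("spanId must be 16 lowercase hex characters and not all zeros");
  }
  return `00-${traceId}-${spanId}-${sampled ? "01" : "00"}`;
}

console.log(buildTraceparent("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7"));
// 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
```

Failing fast on a bad ID makes a broken propagation setup show up in the test itself rather than as a silently disconnected trace.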
test('with tags', async ({ page }) => {
  const span = trace.getActiveSpan();
  span?.setAttribute('test.suite', 'e2e');
  span?.setAttribute('test.priority', 'critical');
  span?.setAttribute('test.feature', 'checkout');
  // Now filter in Grafana: test.feature="checkout"
});
references/tracetest-integration.md - Full Tracetest setup
references/otel-reporter-setup.md - OTEL reporter configuration
references/grafana-dashboards.md - Dashboard creation guide
dashboards/test-results-dashboard.json - Ready-to-import dashboard

Test observability transforms debugging from guesswork to precision: see the full picture from click to database.