skills/infrastructure-documenter/SKILL.md
Expert guide for documenting infrastructure including architecture diagrams, runbooks, system documentation, and operational procedures. Use when creating technical documentation for systems and deployments.
This skill helps you create clear, maintainable infrastructure documentation. It covers architecture diagrams, runbooks, system documentation, operational procedures, and documentation-as-code practices.
| Type | Audience | Purpose |
|------|----------|---------|
| Architecture | Engineers | Understand system design |
| Runbooks | Ops/SRE | Handle incidents |
| API Docs | Developers | Integrate with system |
| Onboarding | New hires | Get up to speed |
| Decision Records | Future you | Understand why |
# System Architecture
## Overview
[Project Name] is a [type] application that [purpose].
## High-Level Architecture
    ┌─────────────────────────────────────────────────────────────┐
    │                            Users                            │
    └─────────────────────────────────────────────────────────────┘
                                   │
                                   ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                         Vercel Edge                         │
    │  ┌─────────────────┐                   ┌─────────────────┐  │
    │  │   Next.js App   │                   │ Edge Functions  │  │
    │  └─────────────────┘                   └─────────────────┘  │
    └─────────────────────────────────────────────────────────────┘
                                   │
               ┌───────────────────┼───────────────────┐
               ▼                   ▼                   ▼
      ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
      │    Supabase     │ │      Redis      │ │     Stripe      │
      │  - PostgreSQL   │ │  - Session      │ │  - Payments     │
      │  - Auth         │ │  - Cache        │ │  - Webhooks     │
      │  - Realtime     │ │                 │ │                 │
      │  - Storage      │ │                 │ │                 │
      └─────────────────┘ └─────────────────┘ └─────────────────┘
## Components
### Frontend (Next.js App)
- **Location**: Vercel Edge Network
- **Framework**: Next.js 14 (App Router)
- **Styling**: Tailwind CSS + shadcn/ui
- **State**: Zustand + React Query
### Backend Services
| Service | Provider | Purpose |
|---------|----------|---------|
| Database | Supabase | PostgreSQL with RLS |
| Auth | Supabase Auth | User authentication |
| Storage | Supabase Storage | File uploads |
| Cache | Upstash Redis | Session & API cache |
| Payments | Stripe | Subscriptions |
| Email | Resend | Transactional emails |
### Data Flow
1. User request → Vercel Edge
2. SSR/API Route processes request
3. Database queries via Supabase client
4. Response cached at edge (when applicable)
5. Response returned to user
## Security
### Authentication Flow
1. User signs in via Supabase Auth
2. JWT issued and stored in a cookie
3. Server validates token on each request
4. RLS policies enforce data access
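
Step 3 of the flow above (the server validates the token on each request) can be sketched as follows. This is a minimal illustration, not Supabase's implementation: it assumes an HS256-signed JWT with a shared secret and an `exp` claim, and `sign_hs256` and `verify_hs256` are hypothetical helper names. In practice a maintained JWT library or the provider SDK would normally do this work.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _signature(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def sign_hs256(claims: dict, secret: str) -> str:
    """Build a token for local testing only (hypothetical helper)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = _b64url_encode(_signature(f"{header}.{payload}", secret))
    return f"{header}.{payload}.{signature}"


def verify_hs256(token: str, secret: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError:  # wrong segment count, bad base64, or bad JSON
        return None
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # assumption: only HS256 tokens are accepted
    expected = _signature(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(signature, expected):
        return None
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        return None  # missing or expired "exp" claim
    return claims
```

The `now` parameter exists only so the expiry check can be exercised with a fixed clock.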
### Data Protection
- All data encrypted at rest (AES-256)
- TLS 1.3 for data in transit
- Secrets stored in Vercel environment variables
- PII fields encrypted in database
## Request Flow
```mermaid
sequenceDiagram
participant U as User
participant V as Vercel
participant N as Next.js
participant S as Supabase
participant R as Redis
U->>V: HTTPS Request
V->>N: Route to App
alt Cached Response
N->>R: Check Cache
R-->>N: Cache Hit
N-->>U: Return Cached
else Cache Miss
N->>S: Query Database
S-->>N: Data
N->>R: Store in Cache
N-->>U: Return Response
end
```
```mermaid
erDiagram
    users ||--o{ projects : owns
    users {
        uuid id PK
        text email
        text name
        timestamp created_at
    }
    projects ||--o{ tasks : contains
    projects {
        uuid id PK
        uuid user_id FK
        text name
        text status
    }
    tasks {
        uuid id PK
        uuid project_id FK
        text title
        boolean completed
    }
```
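
The cache hit and cache miss branches in the sequence diagram follow the cache-aside pattern. A minimal sketch, assuming an in-memory stand-in for Redis: `TTLCache`, `get_with_cache`, and the `query_database` callback are hypothetical names used only for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for Redis with per-key expiry (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str) -> Optional[object]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if expires_at <= self._clock():
            del self._items[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key: str, value: object, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)


def get_with_cache(key, cache, query_database, ttl_seconds=60.0):
    """Cache-aside read: return (value, source), source is 'cache' or 'database'."""
    cached = cache.get(key)
    if cached is not None:
        return cached, "cache"           # Cache Hit branch
    value = query_database(key)          # Cache Miss: query the database
    cache.set(key, value, ttl_seconds)   # store the result for later requests
    return value, "database"
```

Returning the source alongside the value makes it easy to log hit rates while documenting or debugging the flow.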
## Runbooks
### Runbook Template
````markdown
# Runbook: [Service Name] - [Issue Type]
## Overview
Brief description of the issue and when this runbook applies.
## Severity
- **P1 (Critical)**: Complete outage
- **P2 (High)**: Degraded service
- **P3 (Medium)**: Minor impact
- **P4 (Low)**: No user impact
## Detection
How this issue is typically detected:
- [ ] Alert from [monitoring system]
- [ ] User report
- [ ] Automated check failure
## Impact Assessment
- **Users affected**: All / Segment / None
- **Data at risk**: Yes / No
- **Revenue impact**: High / Medium / Low / None
## Prerequisites
- [ ] Access to [system/dashboard]
- [ ] Credentials for [service]
- [ ] Contact info for [team/person]
## Resolution Steps
### Step 1: Verify the Issue
```bash
# Check service status
curl -I https://api.example.com/health
# Check logs
vercel logs --follow
```
Common causes:
```sql
-- Check connection count
SELECT count(*) FROM pg_stat_activity;
-- Terminate connections that have been idle for more than an hour
-- (state_change records when the session became idle; skip our own session)
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND state_change < now() - interval '1 hour'
  AND pid <> pg_backend_pid();
```
```bash
# Rollback to previous deployment
vercel rollback
```
```bash
# Check service health
curl https://api.example.com/health
# Monitor error rates for 15 minutes
```
If unable to resolve within 30 minutes:
````
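
The verification step in the template (check service health, then monitor error rates for 15 minutes) can be scripted. The following is a rough sketch under stated assumptions: the 30-second polling interval and the 5% failure threshold are illustrative values that do not come from the template, and `http_health_check` and `monitor` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def http_health_check(url: str, timeout: float = 5.0) -> bool:
    """Return True when the endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # HTTP error statuses, DNS failures, refused connections, and timeouts
        return False


def monitor(check, duration_seconds=900.0, interval_seconds=30.0,
            max_failure_ratio=0.05, sleep=time.sleep):
    """Poll check() for the whole window; True if failures stay within the limit.

    Elapsed time is counted in intervals, so time spent inside check() is
    not included in the window.
    """
    attempts = 0
    failures = 0
    elapsed = 0.0
    while elapsed < duration_seconds:
        attempts += 1
        if not check():
            failures += 1
        sleep(interval_seconds)
        elapsed += interval_seconds
    return attempts > 0 and failures / attempts <= max_failure_ratio
```

A call such as `monitor(lambda: http_health_check("https://api.example.com/health"))` would poll the example health endpoint for 15 minutes. The `sleep` parameter is injectable so the loop can be tested without waiting.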
### Database Runbooks
````markdown
# Runbook: Database Performance Issues
## Symptoms
- Slow API responses (>1s)
- Timeout errors in logs
- High database CPU in dashboard
## Quick Checks
### 1. Check Active Connections
```sql
SELECT
  state,
  count(*),
  max(now() - query_start) AS max_duration
FROM pg_stat_activity
GROUP BY state;
```
### 2. Check Long-Running Queries
```sql
SELECT
  pid,
  now() - query_start AS duration,
  query
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '30 seconds'
ORDER BY duration DESC;
```
### 3. Check Table Sizes
```sql
-- format('%I.%I', ...) quotes identifiers so mixed-case table names resolve
SELECT
  schemaname,
  tablename,
  pg_size_pretty(pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass) DESC
LIMIT 10;
```
### 4. Check Index Usage
```sql
-- idx_scan is NULL for tables with no indexes, so treat it as 0
SELECT
  relname,
  seq_scan,
  COALESCE(idx_scan, 0) AS idx_scan,
  seq_scan - COALESCE(idx_scan, 0) AS difference
FROM pg_stat_user_tables
WHERE seq_scan > COALESCE(idx_scan, 0)
ORDER BY difference DESC;
```
```sql
-- Terminate a long-running query, using a pid from the check above
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE pid = [PID_FROM_ABOVE];
```
```sql
-- Add a missing index without blocking writes
CREATE INDEX CONCURRENTLY idx_table_column
ON table_name (column_name);
```
````
## Decision Records (ADRs)
### ADR Template
```markdown
# ADR-001: Choose Supabase for Database
## Status
Accepted
## Context
We need a database solution for [Project Name] that supports:
- PostgreSQL compatibility
- Real-time subscriptions
- Built-in authentication
- Easy local development
- Generous free tier
## Decision
We will use Supabase as our primary database and auth provider.
## Alternatives Considered
### PlanetScale
**Pros:**
- Excellent scaling
- Branching for schema changes
- MySQL compatible
**Cons:**
- No built-in auth
- No real-time subscriptions
- Additional services needed
### Firebase
**Pros:**
- Real-time built-in
- Mature platform
- Good mobile SDKs
**Cons:**
- NoSQL (not ideal for our use case)
- Vendor lock-in concerns
- Complex security rules
## Consequences
### Positive
- Single provider for DB + Auth + Storage
- Great developer experience
- Row Level Security for data protection
- Local development with supabase CLI
### Negative
- PostgreSQL-specific features tie us to provider
- Supabase still maturing (some rough edges)
- Limited to their managed offering
### Risks
- Supabase scaling limitations at high traffic
- Migration cost if we need to move
## References
- [Supabase Documentation](https://supabase.com/docs)
- [Comparison: Supabase vs Firebase](https://...)
```
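
New ADRs are easier to keep consistent when the file is scaffolded from the template. A minimal sketch, assuming ADR files are named `NNN-slug.md`; the helper names, the trimmed-down template, and the initial `Proposed` status are assumptions made for illustration.

```python
import re
from pathlib import Path

# Skeleton based on the headings of the ADR template (trimmed for brevity).
ADR_TEMPLATE = """# ADR-{number:03d}: {title}
## Status
Proposed
## Context
## Decision
## Alternatives Considered
## Consequences
## References
"""


def next_adr_number(decisions_dir: Path) -> int:
    """Return one more than the highest NNN- prefix found in the directory."""
    numbers = [
        int(match.group(1))
        for path in decisions_dir.glob("*.md")
        if (match := re.match(r"(\d{3})-", path.name))
    ]
    return max(numbers, default=0) + 1


def slugify(title: str) -> str:
    """Lowercase the title and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def create_adr(decisions_dir: Path, title: str) -> Path:
    """Write a new ADR skeleton and return its path."""
    decisions_dir.mkdir(parents=True, exist_ok=True)
    number = next_adr_number(decisions_dir)
    path = decisions_dir / f"{number:03d}-{slugify(title)}.md"
    path.write_text(ADR_TEMPLATE.format(number=number, title=title), encoding="utf-8")
    return path
```

For example, `create_adr(Path("docs/architecture/decisions"), "Choose Redis for Caching")` would create the next numbered file in that directory.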
# API Reference
## Base URL
- Production: https://api.example.com/v1
- Staging: https://staging-api.example.com/v1
## Authentication
All API requests require authentication via Bearer token.
```bash
curl -H "Authorization: Bearer YOUR_TOKEN" \
  https://api.example.com/v1/users
```
`GET /users/me`

Response:
```json
{
  "id": "usr_123",
  "email": "user@example.com",
  "name": "John Doe",
  "created_at": "2024-01-01T00:00:00Z"
}
```
`PATCH /users/me`

Request Body:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| name | string | No | Display name |
| avatar_url | string | No | Profile image URL |

Example:
```bash
curl -X PATCH \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "Jane Doe"}' \
  https://api.example.com/v1/users/me
```

| Status | Code | Description |
|--------|------|-------------|
| 400 | BAD_REQUEST | Invalid request body |
| 401 | UNAUTHORIZED | Missing or invalid token |
| 403 | FORBIDDEN | Insufficient permissions |
| 404 | NOT_FOUND | Resource not found |
| 429 | RATE_LIMITED | Too many requests |
| 500 | INTERNAL_ERROR | Server error |

Error Response Format:
```json
{
  "error": {
    "code": "NOT_FOUND",
    "message": "User not found"
  }
}
```
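
One way to keep handlers consistent with the documented error format is to build every error body from a single mapping. A minimal sketch: the `error_response` helper is a hypothetical name, and falling back to `INTERNAL_ERROR` for undocumented statuses is an assumption, not part of the reference above.

```python
import json
from typing import Dict, Tuple

# Status-to-code mapping taken from the error table in the API reference.
ERROR_CODES: Dict[int, str] = {
    400: "BAD_REQUEST",
    401: "UNAUTHORIZED",
    403: "FORBIDDEN",
    404: "NOT_FOUND",
    429: "RATE_LIMITED",
    500: "INTERNAL_ERROR",
}


def error_response(status: int, message: str) -> Tuple[int, str]:
    """Return (status, JSON body) in the documented error format."""
    if status not in ERROR_CODES:
        # Assumption: undocumented statuses are reported as a generic server error.
        status = 500
    body = {"error": {"code": ERROR_CODES[status], "message": message}}
    return status, json.dumps(body)
```

Calling `error_response(404, "User not found")` yields the status together with a JSON body shaped like the example error response.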
## Environment Documentation
### Environment Matrix
````markdown
# Environments
## Overview
| Environment | URL | Purpose | Deploy |
|-------------|-----|---------|--------|
| Production | https://myapp.com | Live users | Manual (main) |
| Staging | https://staging.myapp.com | Pre-release testing | Auto (main) |
| Preview | https://pr-*.vercel.app | PR review | Auto (PR) |
| Development | http://localhost:3000 | Local dev | Manual |
## Configuration
### Production
```env
NODE_ENV=production
DATABASE_URL=[Supabase Production]
NEXT_PUBLIC_APP_URL=https://myapp.com
```
### Staging
```env
NODE_ENV=production
DATABASE_URL=[Supabase Staging Branch]
NEXT_PUBLIC_APP_URL=https://staging.myapp.com
```
### Development
```env
NODE_ENV=development
DATABASE_URL=[Local Supabase]
NEXT_PUBLIC_APP_URL=http://localhost:3000
```
## Secrets

| Secret | Rotation | Last Rotated |
|--------|----------|--------------|
| Database password | 90 days | 2024-01-15 |
| API keys | 90 days | 2024-01-15 |
| JWT secret | Never | Initial setup |
````
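
A rotation table like the one above only helps if someone checks it. The sketch below flags secrets whose rotation period has elapsed. It assumes the table is mirrored in code: the `Secret` dataclass and `overdue_secrets` function are hypothetical names, and the `SECRETS` list simply copies the example rows.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Secret:
    name: str
    rotation_days: Optional[int]  # None means the secret is never rotated
    last_rotated: Optional[date]  # None when no rotation date is recorded


# Rows copied from the example secrets table.
SECRETS = [
    Secret("Database password", 90, date(2024, 1, 15)),
    Secret("API keys", 90, date(2024, 1, 15)),
    Secret("JWT secret", None, None),
]


def overdue_secrets(secrets: List[Secret], today: date) -> List[str]:
    """Return the names of secrets whose rotation period has elapsed."""
    overdue = []
    for secret in secrets:
        if secret.rotation_days is None:
            continue  # rotation policy is "Never"
        period = timedelta(days=secret.rotation_days)
        if secret.last_rotated is None or today - secret.last_rotated > period:
            overdue.append(secret.name)
    return overdue
```

Running the check from a scheduled job is one option; how the result is reported is left open here.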
## Documentation-as-Code
### Documentation Structure
    docs/
    ├── README.md                 # Documentation index
    ├── architecture/
    │   ├── overview.md           # System architecture
    │   ├── data-flow.md          # Data flow diagrams
    │   └── decisions/            # ADRs
    │       ├── 001-database.md
    │       └── 002-hosting.md
    ├── runbooks/
    │   ├── README.md             # Runbook index
    │   ├── database.md           # Database issues
    │   ├── deployment.md         # Deployment issues
    │   └── outage.md             # Service outage
    ├── api/
    │   └── reference.md          # API documentation
    └── onboarding/
        ├── setup.md              # Local setup
        └── contributing.md       # How to contribute
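
Treating documentation as code means the layout can be checked automatically. A minimal sketch, assuming the structure above is the required baseline: `REQUIRED_DOCS` and `missing_docs` are hypothetical names, and the individual ADR files are left out because their names vary by project.

```python
from pathlib import Path
from typing import List, Sequence

# Paths relative to docs/, taken from the documentation structure above.
REQUIRED_DOCS = (
    "README.md",
    "architecture/overview.md",
    "architecture/data-flow.md",
    "runbooks/README.md",
    "runbooks/database.md",
    "runbooks/deployment.md",
    "runbooks/outage.md",
    "api/reference.md",
    "onboarding/setup.md",
    "onboarding/contributing.md",
)


def missing_docs(docs_root: Path, required: Sequence[str] = REQUIRED_DOCS) -> List[str]:
    """Return the required files that do not exist under docs_root."""
    return [rel for rel in required if not (docs_root / rel).is_file()]
```

A CI job could call `missing_docs(Path("docs"))` and fail the build when the returned list is non-empty.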
### Auto-Generated Documentation
```yaml
# .github/workflows/docs.yml
name: Generate Docs
on:
  push:
    branches: [main]
    paths:
      - 'src/**'
      - 'docs/**'
# The deploy step pushes to the gh-pages branch, so the token needs write access
permissions:
  contents: write
jobs:
  generate-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Generate API docs from OpenAPI
        run: |
          npx @redocly/cli build-docs openapi.yaml \
            --output docs/api/index.html
      - name: Generate TypeDoc
        run: npx typedoc --out docs/api/typescript
      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./docs
```
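
Generated pages do not have to come only from OpenAPI or TypeDoc. As a small illustration of the same idea, the sketch below builds a runbook index from the first heading of each file. It assumes runbooks live in one directory and that `README.md` is the index itself; `first_heading` and `build_runbook_index` are hypothetical names.

```python
from pathlib import Path
from typing import List


def first_heading(markdown: str, fallback: str) -> str:
    """Return the text of the first level-1 heading, or the fallback."""
    for line in markdown.splitlines():
        if line.startswith("# "):
            return line[2:].strip()
    return fallback


def build_runbook_index(runbooks_dir: Path) -> str:
    """Return Markdown for an index that links every runbook in the directory."""
    lines: List[str] = ["# Runbook Index", ""]
    for path in sorted(runbooks_dir.glob("*.md")):
        if path.name == "README.md":
            continue  # the index itself is not a runbook
        text = path.read_text(encoding="utf-8")
        title = first_heading(text, fallback=path.stem)
        lines.append(f"- [{title}]({path.name})")
    return "\n".join(lines) + "\n"
```

A workflow step could write the returned string to `docs/runbooks/README.md` before the deploy step runs.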
Invoke this skill when creating technical documentation for systems and deployments.