specwright/templates/skills/dev-team/architect/data-modeling/SKILL.md
# Data Modeling Skill > Skill: Data Modeling > Role: Architect > Created: 2026-01-09 > Version: 1.0.0 ## Purpose Designs database schemas, data structures, and relationships that are normalized, performant, scalable, and maintainable. Ensures data integrity, optimizes queries, and plans for future growth. ## When to Activate This Skill **Trigger Conditions:** - New feature requiring database changes - Database schema design - Data migration planning - Query performance optimization - Data i
npx skillsauth add michsindlinger/specwright specwright/templates/skills/dev-team/architect/data-modelingInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Skill: Data Modeling Role: Architect Created: 2026-01-09 Version: 1.0.0
Designs database schemas, data structures, and relationships that are normalized, performant, scalable, and maintainable. Ensures data integrity, optimizes queries, and plans for future growth.
Trigger Conditions:
Context Signals:
[TECH_STACK_SPECIFIC]
[TECH_STACK_SPECIFIC]
[TECH_STACK_SPECIFIC]
[TECH_STACK_SPECIFIC]
[TECH_STACK_SPECIFIC]
[MCP_TOOLS]
<!-- Populated during skill creation based on: 1. User's installed MCP servers 2. User's selection for this skill Recommended for this skill (examples): - database - Query and inspect database - sql - Execute SQL commands - [TECH_STACK_SPECIFIC] - Framework ORM/database tools Note: Skills work without MCP servers, but functionality may be limited -->Scenario: Design schema for blog posts and comments
Design:
[TECH_STACK_SPECIFIC]
-- Users table
CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
email VARCHAR(255) NOT NULL UNIQUE,
username VARCHAR(100) NOT NULL UNIQUE,
password_hash VARCHAR(255) NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW()
);
-- Posts table
CREATE TABLE posts (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
title VARCHAR(255) NOT NULL,
content TEXT NOT NULL,
published_at TIMESTAMP,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
INDEX idx_posts_user_id (user_id),
INDEX idx_posts_published_at (published_at)
);
-- Comments table
CREATE TABLE comments (
id BIGSERIAL PRIMARY KEY,
post_id BIGINT NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
content TEXT NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
INDEX idx_comments_post_id (post_id),
INDEX idx_comments_user_id (user_id)
);
DESIGN DECISIONS:
- BIGSERIAL for future scalability
- CASCADE delete to maintain referential integrity
- Indexes on foreign keys for join performance
- published_at nullable (drafts are unpublished)
- Timestamps for audit trail
Scenario: Posts can have multiple tags, tags can apply to multiple posts
Design:
[TECH_STACK_SPECIFIC]
-- Tags table
CREATE TABLE tags (
id BIGSERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL UNIQUE,
slug VARCHAR(100) NOT NULL UNIQUE,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
INDEX idx_tags_slug (slug)
);
-- Join table
CREATE TABLE post_tags (
id BIGSERIAL PRIMARY KEY,
post_id BIGINT NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
tag_id BIGINT NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
UNIQUE (post_id, tag_id),
INDEX idx_post_tags_post_id (post_id),
INDEX idx_post_tags_tag_id (tag_id)
);
DESIGN DECISIONS:
- Separate join table (post_tags) for many-to-many
- Composite unique constraint prevents duplicates
- Indexes on both foreign keys for bidirectional queries
- Cascade delete cleans up orphaned relationships
- slug for URL-friendly tag pages
Scenario: Comments can be on posts or on other comments (nested)
Design:
[TECH_STACK_SPECIFIC]
-- Polymorphic comments table
CREATE TABLE comments (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
commentable_type VARCHAR(100) NOT NULL,
commentable_id BIGINT NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
INDEX idx_comments_user_id (user_id),
INDEX idx_comments_commentable (commentable_type, commentable_id)
);
-- Example data:
-- Comment on post:
-- { commentable_type: 'Post', commentable_id: 123 }
-- Comment on comment (reply):
-- { commentable_type: 'Comment', commentable_id: 456 }
DESIGN DECISIONS:
- Polymorphic pattern for flexible relationships
- Composite index on type + id for lookups
- No foreign key constraint (polymorphic limitation)
- Application-level validation required
- Consider view/materialized view for performance
Scenario: Frequently display post comment count (expensive to calculate)
Design:
[TECH_STACK_SPECIFIC]
-- Add denormalized column to posts
ALTER TABLE posts ADD COLUMN comments_count INTEGER NOT NULL DEFAULT 0;
-- Create index for sorting
CREATE INDEX idx_posts_comments_count ON posts(comments_count);
-- Trigger to maintain count (PostgreSQL example)
CREATE OR REPLACE FUNCTION update_post_comments_count()
RETURNS TRIGGER AS $$
BEGIN
IF (TG_OP = 'INSERT') THEN
UPDATE posts SET comments_count = comments_count + 1
WHERE id = NEW.post_id;
RETURN NEW;
ELSIF (TG_OP = 'DELETE') THEN
UPDATE posts SET comments_count = comments_count - 1
WHERE id = OLD.post_id;
RETURN OLD;
END IF;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER comments_count_trigger
AFTER INSERT OR DELETE ON comments
FOR EACH ROW EXECUTE FUNCTION update_post_comments_count();
-- Alternative: Application-level counter cache
[Framework-specific counter cache implementation]
DESIGN DECISIONS:
- Denormalize for read performance (very frequent query)
- Trigger maintains consistency automatically
- Trade-off: Slightly slower writes, much faster reads
- Document the denormalization clearly
- Alternative: Periodic batch update for less critical counts
Scenario: Add email verification to existing users table
Migration:
[TECH_STACK_SPECIFIC]
# Migration: Add email verification
class AddEmailVerificationToUsers < ActiveRecord::Migration[7.0]
def up
# Add new columns
add_column :users, :email_verified, :boolean, default: false, null: false
add_column :users, :email_verification_token, :string
add_column :users, :email_verification_sent_at, :timestamp
# Add index for token lookup
add_index :users, :email_verification_token, unique: true
# Data migration: Mark existing users as verified (grandfathered)
User.update_all(email_verified: true)
end
def down
remove_column :users, :email_verified
remove_column :users, :email_verification_token
remove_column :users, :email_verification_sent_at
end
end
MIGRATION BEST PRACTICES:
1. Always provide rollback (down method)
2. Handle existing data appropriately
3. Add indexes in same migration
4. Set sensible defaults
5. Consider zero-downtime deployment needs
6. Test migration on production-like dataset
7. Document any manual steps required
Scenario: Optimize query performance for common searches
Analysis:
[TECH_STACK_SPECIFIC]
-- Common query: Find posts by user, ordered by published date
SELECT * FROM posts
WHERE user_id = 123
AND published_at IS NOT NULL
ORDER BY published_at DESC
LIMIT 25;
-- Optimal index: Composite index covering the query
CREATE INDEX idx_posts_user_published
ON posts(user_id, published_at DESC)
WHERE published_at IS NOT NULL;
-- This is a partial index (WHERE clause) for efficiency
-- Common query: Search posts by title
SELECT * FROM posts
WHERE title ILIKE '%search term%';
-- Optimal index: Full-text search
CREATE INDEX idx_posts_title_fulltext
ON posts USING GIN(to_tsvector('english', title));
-- Updated query using index:
SELECT * FROM posts
WHERE to_tsvector('english', title) @@ to_tsquery('english', 'search & term');
INDEXING PRINCIPLES:
1. Index columns used in WHERE clauses
2. Index foreign keys
3. Index columns used in ORDER BY
4. Composite indexes for multi-column queries
5. Consider partial indexes for filtered queries
6. Use appropriate index type (B-tree, GIN, GiST)
7. Monitor index usage and remove unused indexes
8. Balance read performance vs write overhead
1. UNDERSTAND: Data requirements and relationships
2. DESIGN: Entity-relationship model
3. NORMALIZE: Apply normalization principles (3NF)
4. OPTIMIZE: Strategic denormalization if needed
5. CONSTRAIN: Add integrity constraints
6. INDEX: Plan indexing strategy
7. MIGRATE: Create migration plan
8. VALIDATE: Test with realistic data volumes
9. DOCUMENT: Schema decisions and rationale
tools
Session Handoff: Erstellt eine vollständige Zusammenfassung der aktuellen Session für einen sauberen Kontextwechsel. NUR bei explizitem Aufruf (/session-handoff). NICHT automatisch auslösen. Geeignet wenn der User die Session resetten will, den Kontext aufräumen will, oder bei ~120k Tokens angelangt ist.
development
Pre-Mortem Risk Analysis: Strukturierte Prospective-Hindsight-Übung um launch-blocking Risiken vor Commitment aufzudecken. Team stellt sich vor, das Produkt sei 14 Tage nach Launch gefloppt, und arbeitet rückwärts. Klassifiziert Risiken in Tigers (echt), Paper Tigers (hypothetisch), Elephants (unausgesprochen). Nutze diesen Skill vor Build-Commitment, bei zu hoher Stakeholder-Confidence, vor Major-Releases, oder wenn das Team vage Sorgen nicht artikulieren kann. Trigger: /pre-mortem, 'pre-mortem', 'risk analysis', 'was könnte schiefgehen', 'risiken vor launch'.
testing
Six-Sigma Atomicity Validator for create-spec stories
tools
UX pattern definition guidance for navigation, user flows, interactions, and accessibility