skills/graphrag-system-design/SKILL.md
Designs complete GraphRAG systems integrating graph databases, vector stores, orchestration frameworks, and LLM reasoning. Guides you through pattern selection, technology stack decisions, integration pipeline design, and domain-specific customizations. Use when designing GraphRAG systems, choosing technology stacks for graph-augmented retrieval, combining Neo4j with an LLM, using LangChain/LlamaIndex knowledge graphs, applying community detection for RAG, building hybrid symbol-vector pipelines, or deploying production or domain-specific GraphRAG.
Copy this checklist and work through each step:
GraphRAG System Design Progress:
- [ ] Step 1: Analyze domain requirements
- [ ] Step 2: Select GraphRAG pattern
- [ ] Step 3: Choose technology stack
- [ ] Step 4: Design integration pipeline
- [ ] Step 5: Apply domain customizations
- [ ] Step 6: Define deployment strategy
- [ ] Step 7: Produce specification
Step 1: Analyze domain requirements
Characterize the retrieval problem: query complexity (single-hop vs multi-hop), data volume and update frequency, compliance constraints, latency requirements, and explainability needs. Determine whether graph structure adds value over flat retrieval -- multi-hop reasoning, entity disambiguation, and relationship-aware context assembly are strong signals for GraphRAG. Define the user personas and query patterns the system must serve.
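The three signals named above can be recorded per query pattern and checked mechanically. The following is a minimal sketch; `RetrievalProfile`, `graph_signals`, and the `min_signals` threshold are hypothetical names and values introduced for illustration, not part of any framework.

```python
from dataclasses import dataclass


@dataclass
class RetrievalProfile:
    """Hypothetical summary of Step 1 findings for one query pattern."""
    needs_multi_hop: bool
    needs_entity_disambiguation: bool
    needs_relationship_context: bool


def graph_signals(profile: RetrievalProfile) -> list[str]:
    """List which of the three GraphRAG signals apply to this profile."""
    signals = []
    if profile.needs_multi_hop:
        signals.append("multi-hop reasoning")
    if profile.needs_entity_disambiguation:
        signals.append("entity disambiguation")
    if profile.needs_relationship_context:
        signals.append("relationship-aware context assembly")
    return signals


def graph_adds_value(profile: RetrievalProfile, min_signals: int = 1) -> bool:
    """Assumed rule of thumb: at least `min_signals` signals must be present."""
    return len(graph_signals(profile)) >= min_signals
```

A profile with no signals suggests flat retrieval may be sufficient; the threshold is a judgment call to tune per project.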
Step 2: Select GraphRAG pattern
Choose the core retrieval architecture using the GraphRAG Pattern Selection guide. Match your query patterns to the appropriate pattern: Hybrid Symbol-Vector for mixed structured/unstructured queries, Subgraph-on-Demand for focused context assembly, or Community-Based Global Summarization for broad thematic queries. For detailed pattern descriptions, see Methodology Reference.
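The matching of query patterns to retrieval patterns can be sketched as a lookup over the expected query mix. This is an illustrative sketch only: `QueryType`, `select_patterns`, and the 20% cutoff for naming a secondary pattern are assumptions, not values from the pattern guide.

```python
from __future__ import annotations

from enum import Enum


class QueryType(Enum):
    MIXED = "mixed structured + semantic"
    FOCUSED_MULTI_HOP = "focused multi-hop"
    BROAD_THEMATIC = "broad thematic / global"


# Mapping taken from the pattern descriptions in this step.
PATTERN_BY_QUERY_TYPE = {
    QueryType.MIXED: "Hybrid Symbol-Vector",
    QueryType.FOCUSED_MULTI_HOP: "Subgraph-on-Demand",
    QueryType.BROAD_THEMATIC: "Community-Based Global Summarization",
}


def select_patterns(
    query_mix: dict[QueryType, float],
    secondary_min_share: float = 0.2,  # assumed cutoff, tune per project
) -> tuple[str, str | None]:
    """Return (primary, secondary) patterns ranked by share of expected traffic."""
    ranked = sorted(
        ((qt, share) for qt, share in query_mix.items() if share > 0),
        key=lambda item: item[1],
        reverse=True,
    )
    if not ranked:
        raise ValueError("query_mix must contain at least one positive share")
    primary = PATTERN_BY_QUERY_TYPE[ranked[0][0]]
    secondary = None
    if len(ranked) > 1 and ranked[1][1] >= secondary_min_share:
        secondary = PATTERN_BY_QUERY_TYPE[ranked[1][0]]
    return primary, secondary
```

The two return values correspond to the "Primary Pattern" and "Secondary Pattern" fields of the specification.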
Step 3: Choose technology stack
Select components for each architectural layer: graph database, vector database, orchestration framework, and LLM provider. Use the Technology Stacks Reference for component-by-component comparison. Key decisions: single-system vs multi-system hybrid, managed vs self-hosted, framework-based vs custom pipeline. Consider team expertise, budget constraints, and existing infrastructure.
Step 4: Design integration pipeline
Define the end-to-end data flow from ingestion through generation. The core pipeline stages: Ingest (raw data) -> Extract (entities and relations) -> Build KG (populate graph) -> Index (vector embeddings + graph indices) -> Retrieve (hybrid graph+vector search) -> Generate (LLM with graph-grounded context) -> Cite (provenance from graph paths). Design the query routing logic that determines when to use graph traversal, vector search, or both. See Methodology Reference for pipeline design considerations.
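The query routing logic could start as a simple heuristic before being replaced by a classifier or an LLM-based router. The sketch below assumes a hypothetical set of known entity names (for example, loaded from the graph) and a hand-picked list of relation cue phrases; both are illustrative, not prescribed.

```python
from enum import Enum


class Route(Enum):
    GRAPH = "graph"
    VECTOR = "vector"
    BOTH = "both"


# Hypothetical cue phrases suggesting the user is asking about relationships.
RELATION_CUES = ("related to", "connected to", "depends on", "between", "path from")


def route_query(query: str, known_entities: set[str]) -> Route:
    """Pick a retrieval path from entity mentions and relation cues."""
    text = query.lower()
    mentions_entity = any(entity.lower() in text for entity in known_entities)
    has_relation_cue = any(cue in text for cue in RELATION_CUES)
    if mentions_entity and has_relation_cue:
        return Route.GRAPH   # named entities plus an explicit relationship question
    if mentions_entity or has_relation_cue:
        return Route.BOTH    # partial signal: combine traversal with vector search
    return Route.VECTOR      # no structural signal: plain semantic search
```

Substring matching is deliberately crude here; a production router would more likely use entity linking against the graph.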
Step 5: Apply domain customizations
Adapt the generic architecture to domain-specific requirements: ontology selection (UMLS for healthcare, FIBO for finance), compliance patterns (HIPAA access control, regulatory audit trails), and domain retrieval patterns (temporal graphs for finance, layered patient graphs for clinical). See Domain Patterns Reference for domain-specific guidance.
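As one illustration of a compliance pattern, access control can be applied to retrieved facts before they reach the LLM context, with every decision recorded for audit. This is a minimal sketch under assumed names: `RetrievedFact`, `allowed_roles`, and the role strings are hypothetical, and real HIPAA controls involve considerably more than role filtering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedFact:
    """Hypothetical retrieval result carrying the roles allowed to see it."""
    text: str
    allowed_roles: frozenset


def filter_by_role(facts, user_roles):
    """Return (permitted facts, audit trail of (fact text, permitted) pairs)."""
    permitted_facts = []
    audit_trail = []
    for fact in facts:
        permitted = bool(fact.allowed_roles & set(user_roles))
        audit_trail.append((fact.text, permitted))  # log every decision, not only denials
        if permitted:
            permitted_facts.append(fact)
    return permitted_facts, audit_trail
```

Filtering before context assembly means restricted content never enters the prompt, which is simpler to reason about than redacting generated output.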
Step 6: Define deployment strategy
Specify the deployment architecture: graph database sizing and clustering, vector index configuration, caching strategy, batch vs real-time ingestion, monitoring and observability, and scaling plan. Define performance SLAs for query latency, throughput, and freshness. Plan for graph maintenance: incremental updates, schema evolution, and data quality monitoring.
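Latency SLAs expressed as percentile targets can be checked against measured samples. The sketch below uses the nearest-rank percentile definition, which is one of several common conventions; the function names and the shape of `targets_ms` are assumptions for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def sla_violations(latencies_ms, targets_ms):
    """Return {percentile: observed latency} for every target that is exceeded."""
    violations = {}
    for pct, target in targets_ms.items():
        observed = percentile(latencies_ms, pct)
        if observed > target:
            violations[pct] = observed
    return violations
```

For example, `sla_violations(samples, {50: 200, 95: 800, 99: 1500})` would report which of the p50, p95, and p99 targets (in milliseconds) the measured samples miss.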
Step 7: Produce specification
Compile the complete system design specification using the Output Template. Validate against the quality rubric at System Design Rubric. Ensure all components are connected end-to-end with clear data flows, error handling, and fallback strategies.
| Pattern | Query Type | Mechanism | Best For | Trade-offs |
|---------|-----------|-----------|----------|------------|
| Hybrid Symbol-Vector | Mixed structured + semantic | Pre-filter by graph type/constraint then rank by embedding similarity; or broad vector search then graph-guided expansion | Systems needing both precise structural queries and fuzzy semantic search; enterprise QA with entity disambiguation | Higher complexity; requires synchronized graph + vector indices; latency depends on filter-then-rank vs expand strategy |
| Subgraph-on-Demand | Focused multi-hop | Build temporary query-specific subgraphs rather than querying one monolithic graph; extract relevant neighborhood, embed, retrieve | Real-time applications needing focused context; systems with frequent updates; cost-sensitive deployments | Cold-start latency for subgraph construction; requires efficient subgraph extraction; context may miss distant but relevant nodes |
| Community-Based Global Summarization | Broad thematic / global | Detect communities/clusters in graph, embed summaries of each community, retrieve relevant community then drill into entity details; Microsoft GraphRAG pattern | Broad "what is X about?" queries; corpus-level summarization; thematic exploration across large knowledge bases | Requires periodic community detection (batch); summaries may lose detail; community boundaries can split related concepts |
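The filter-then-rank variant of Hybrid Symbol-Vector can be sketched in a few lines over in-memory data. This is a toy illustration, not a production design: `Node`, `filter_then_rank`, and the two-dimensional embeddings are hypothetical, and a real system would push the type filter into the graph database and the ranking into a vector index.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical graph node with a type label and a precomputed embedding."""
    node_id: str
    node_type: str
    embedding: list


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filter_then_rank(nodes, query_embedding, node_type, top_k=3):
    """Symbolic step: keep nodes of one type. Vector step: rank them by similarity."""
    candidates = [n for n in nodes if n.node_type == node_type]
    candidates.sort(key=lambda n: cosine(n.embedding, query_embedding), reverse=True)
    return [n.node_id for n in candidates[:top_k]]
```

Filtering first keeps the similarity computation small, at the cost of missing relevant nodes whose type label is wrong or absent, which is one reason the table lists synchronized indices as a trade-off.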
A complete GraphRAG system integrates four core component layers:
+------------------+     +------------------+     +----------------------+     +-----------+
| Graph Database   |     | Vector Database  |     | Orchestration        |     | LLM       |
| (Structure)      |<--->| (Semantics)      |<--->| Framework            |<--->| (Reason)  |
|                  |     |                  |     |                      |     |           |
| Neo4j / Tiger    |     | Pinecone /       |     | LangChain /          |     | GPT-4 /   |
| Graph / Neptune  |     | Weaviate /       |     | LlamaIndex /         |     | Claude /  |
| / GraphDB        |     | Qdrant / pgvec   |     | LangGraph / Custom   |     | Llama     |
+------------------+     +------------------+     +----------------------+     +-----------+
         |                        |                          |
         v                        v                          v
  Graph Traversal         Embedding Search          Pipeline Logic
  Multi-hop Paths         Semantic Ranking          Query Routing
  Schema Filtering        Similarity Scores         Context Assembly
  Provenance Chains       Hybrid Re-ranking         Citation Generation
Key integration decisions:
GRAPHRAG SYSTEM DESIGN SPECIFICATION
======================================
Project: [Project name]
Domain: [Target domain]
Date: [Date]
Author: [Author]
1. DOMAIN REQUIREMENTS
Query Patterns: [Single-hop / Multi-hop / Thematic / Mixed]
Data Volume: [Document count, entity count estimates]
Update Frequency: [Real-time / Daily / Weekly / Batch]
Latency Requirements: [p50, p95, p99 targets]
Compliance: [HIPAA / GDPR / SOX / None]
Explainability: [Required / Nice-to-have / Not needed]
2. GRAPHRAG PATTERN
Primary Pattern: [Hybrid Symbol-Vector / Subgraph-on-Demand / Community-Based]
Secondary Pattern: [If hybrid approach, specify]
Query Router: [How queries are dispatched to retrieval paths]
Rationale: [Why this pattern fits the requirements]
3. TECHNOLOGY STACK
Graph Database: [Product, version, deployment mode]
Justification: [Why this choice]
Vector Database: [Product, version, deployment mode]
Justification: [Why this choice]
Orchestration: [Framework or custom pipeline]
Justification: [Why this choice]
LLM Provider: [Model, API or self-hosted]
Justification: [Why this choice]
Supporting Infrastructure: [Cache, queue, monitoring tools]
4. INTEGRATION PIPELINE
Ingestion:
- Source types: [Documents, APIs, databases]
- Processing: [Chunking strategy, metadata extraction]
Extraction:
- Entity extraction: [Method, model, confidence threshold]
- Relation extraction: [Method, schema enforcement]
Knowledge Graph Build:
- Schema: [Node types, edge types, properties]
- Population: [Batch / streaming, deduplication strategy]
Indexing:
- Graph indices: [Index types, query optimization]
- Vector indices: [Embedding model, dimension, index type]
Retrieval:
- Graph retrieval: [Traversal strategy, depth limits]
- Vector retrieval: [Top-k, similarity threshold]
- Hybrid fusion: [How graph and vector results combine]
Generation:
- Context assembly: [How retrieved data becomes LLM context]
- Prompt template: [Structure for graph-grounded generation]
Citation:
- Provenance: [How sources are tracked and surfaced]
5. DOMAIN CUSTOMIZATIONS
Ontology: [Domain ontology or taxonomy used]
Compliance Controls: [Access control, audit, encryption]
Domain-Specific Patterns: [Temporal graphs, layered architecture, etc.]
6. DEPLOYMENT STRATEGY
Infrastructure: [Cloud provider, regions, HA configuration]
Scaling Plan: [Graph DB scaling, vector DB scaling, LLM scaling]
Monitoring: [Metrics, alerts, dashboards]
Maintenance: [Graph update strategy, schema evolution plan]
7. PERFORMANCE TARGETS
Query Latency: [p50, p95, p99]
Throughput: [Queries per second]
Freshness: [Time from data change to queryable]
Accuracy: [Retrieval precision/recall targets]
8. RISK AND MITIGATION
- [Risk 1]: [Mitigation strategy]
- [Risk 2]: [Mitigation strategy]
- [Risk 3]: [Mitigation strategy]
NEXT STEPS:
- Build proof-of-concept with sample data
- Benchmark retrieval quality against baseline RAG
- Load test with production-scale data
- Iterate schema and retrieval strategy based on evaluation
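To benchmark retrieval quality against a baseline RAG system, precision and recall at k can be computed per query from a labeled set of relevant items. The sketch below assumes a hypothetical labeled evaluation set where each query has a known set of relevant document or node IDs; the function name is illustrative.

```python
def precision_recall_at_k(retrieved, relevant, k):
    """Return (precision@k, recall@k) for one query.

    retrieved: ranked list of IDs returned by the system under test
    relevant:  set of IDs judged relevant for the query (assumed to be labeled)
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved[:k]
    hits = sum(1 for item_id in top_k if item_id in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Running the same labeled queries through both the GraphRAG pipeline and the baseline, then comparing averaged scores, gives a first signal on whether the graph layer is earning its complexity.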