Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

lyndonkl/knowledge-graph-construction

Name: knowledge-graph-construction
Author: lyndonkl

skills/knowledge-graph-construction/SKILL.md

npx skillsauth add lyndonkl/claude knowledge-graph-construction


Workflow
Architecture Selection Guide
Schema Patterns
Output Template

Knowledge Graph Construction

Workflow

Copy this checklist and work through each step:

KG Construction Progress:
- [ ] Step 1: Identify data sources and domain scope
- [ ] Step 2: Select graph data model
- [ ] Step 3: Design schema and ontology
- [ ] Step 4: Configure extraction pipeline
- [ ] Step 5: Define layered architecture
- [ ] Step 6: Validate and quality-check the graph

Step 1: Identify data sources and domain scope

Catalog the input data: document types (papers, clinical notes, web pages, logs), volume, update frequency, and language. Define the domain boundary: which entity types and relation types matter for the target use case. Determine whether the KG will serve RAG retrieval, reasoning/inference, analytics, or a combination. Scoping up front prevents over-extraction and keeps the schema focused.

Step 2: Select graph data model

Choose the underlying data model using the Architecture Selection Guide. Key trade-offs: labeled property graphs (LPG) for flexibility and rapid prototyping, RDF/OWL for standards-based interoperability and inference, hypergraphs for complex N-ary relations, and temporal graphs for time-evolving knowledge. Consider query language, tooling maturity, and vector integration needs. For detailed model comparisons, see Data Models Reference.

Step 3: Design schema and ontology

Define node types (entity classes), edge types (relation classes), and property schemas. Apply patterns from Schema Patterns: entity-relation for simple domains, event reification for N-ary relations, layered tiers for multi-source integration. Decide on controlled vocabularies, cardinality constraints, and whether to adopt or extend an existing ontology (e.g., Schema.org, UMLS, SNOMED). For methodology details, see Methodology Reference.
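
As a rough illustration of this step, the sketch below declares node types, edge types, and required properties as plain Python data and checks single nodes and edges against them. The `NodeType`, `EdgeType`, `check_node`, and `check_edge` names are hypothetical helpers, not part of the skill; the Drug/Disease/TREATS schema borrows the names used in the Schema Patterns section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str
    required: frozenset  # property names every node of this type must carry


@dataclass(frozen=True)
class EdgeType:
    name: str
    source: str  # node type allowed at the start of the edge
    target: str  # node type allowed at the end of the edge


# Hypothetical schema for illustration only.
NODE_TYPES = {
    "Drug": NodeType("Drug", frozenset({"name", "class"})),
    "Disease": NodeType("Disease", frozenset({"name", "icd_code"})),
}
EDGE_TYPES = {
    "TREATS": EdgeType("TREATS", source="Drug", target="Disease"),
}


def check_node(label, props):
    """Return a list of schema violations for one node (empty list = conforms)."""
    node_type = NODE_TYPES.get(label)
    if node_type is None:
        return [f"unknown node type: {label}"]
    missing = sorted(node_type.required - set(props))
    return [f"{label} missing property: {p}" for p in missing]


def check_edge(rel, source_label, target_label):
    """Return a list of schema violations for one edge (empty list = conforms)."""
    edge_type = EDGE_TYPES.get(rel)
    if edge_type is None:
        return [f"unknown edge type: {rel}"]
    errors = []
    if source_label != edge_type.source:
        errors.append(f"{rel} source must be {edge_type.source}, got {source_label}")
    if target_label != edge_type.target:
        errors.append(f"{rel} target must be {edge_type.target}, got {target_label}")
    return errors
```

Keeping the schema as data rather than code makes it easier to extend or swap for an existing ontology later; cardinality constraints and controlled vocabularies would need additional fields that this sketch omits.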

Step 4: Configure extraction pipeline

Build the pipeline that populates the graph. Core components: LLM-assisted entity extraction with multi-round verification, relation extraction via prompt-based or dependency-parsing methods, entity normalization (synonym merging, ontology linking), and schema enforcement through post-processing validation. Use few-shot examples in prompts to improve extraction consistency. Include a second-pass LLM verification to catch missed entities. For full pipeline design, see Methodology Reference.
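
The normalization and schema-enforcement stages can be sketched without committing to any particular LLM or NER library. The example below assumes extraction has already produced raw `(subject, relation, object)` triples; the `SYNONYMS` table, `ALLOWED_RELATIONS` set, and sample triples are made up for illustration.

```python
# Hypothetical synonym table: lowercase surface form -> canonical entity name.
SYNONYMS = {
    "asa": "aspirin",
    "acetylsalicylic acid": "aspirin",
    "heart attack": "myocardial infarction",
    "mi": "myocardial infarction",
}
# Hypothetical set of relation types the schema allows.
ALLOWED_RELATIONS = {"TREATS", "CAUSES"}


def normalize_entity(mention):
    """Map a raw mention to its canonical form, ignoring case and extra whitespace."""
    key = " ".join(mention.lower().split())
    return SYNONYMS.get(key, key)


def postprocess(raw_triples):
    """Normalize entities, reject off-schema relations, and merge duplicate triples."""
    kept, rejected, seen = [], [], set()
    for subj, rel, obj in raw_triples:
        rel = rel.strip().upper()
        if rel not in ALLOWED_RELATIONS:
            # Keep rejected triples so they can be reviewed instead of silently lost.
            rejected.append((subj, rel, obj))
            continue
        triple = (normalize_entity(subj), rel, normalize_entity(obj))
        if triple not in seen:
            seen.add(triple)
            kept.append(triple)
    return kept, rejected


raw = [
    ("ASA", "treats", "Heart  Attack"),
    ("Acetylsalicylic acid", "TREATS", "MI"),
    ("aspirin", "is_friend_of", "ibuprofen"),
]
kept, rejected = postprocess(raw)
```

Here the first two raw triples collapse into one canonical triple, and the third is set aside because its relation is not in the schema. A real pipeline would likely replace the dictionary lookup with ontology linking against the chosen vocabulary.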

Step 5: Define layered architecture

Structure the KG into tiers for maintainability and trust. A common pattern: Layer 1 (instance data) holds user-specific or case-specific entities and relations; Layer 2 (domain knowledge) holds curated facts from literature or domain experts; Layer 3 (canonical ontology) holds the formal schema and upper ontology. Add provenance and evidence layering so every fact traces back to its source document, extraction method, and confidence score. Temporal subgraphs capture time-indexed state for domains where knowledge evolves.
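
One possible way to carry layer and provenance on every fact is sketched below. The `Layer` enum, the `Fact` fields, and the `trusted` filter are assumptions made for this example; the sample facts and file names are invented.

```python
from dataclasses import dataclass
from enum import IntEnum


class Layer(IntEnum):
    INSTANCE = 1  # user-specific or case-specific extractions
    DOMAIN = 2    # curated facts from literature or domain experts
    ONTOLOGY = 3  # formal schema and upper ontology


@dataclass(frozen=True)
class Fact:
    subject: str
    relation: str
    obj: str
    layer: Layer
    source_doc: str    # provenance: where the fact came from
    method: str        # provenance: how it was extracted
    confidence: float  # provenance: score between 0.0 and 1.0


def trusted(facts, min_layer=Layer.INSTANCE, min_confidence=0.0):
    """Keep facts at or above the given layer and confidence thresholds."""
    return [f for f in facts if f.layer >= min_layer and f.confidence >= min_confidence]


# Invented sample data for illustration.
facts = [
    Fact("patient_17", "TAKES", "aspirin", Layer.INSTANCE,
         "note_17.txt", "llm_extraction", 0.72),
    Fact("aspirin", "TREATS", "myocardial infarction", Layer.DOMAIN,
         "curated_review", "expert_curation", 0.98),
]
curated_only = trusted(facts, min_layer=Layer.DOMAIN)
```

Because provenance lives on the fact itself, a downstream consumer can tighten or relax trust thresholds per query without re-running extraction.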

Step 6: Validate and quality-check the graph

Run validation at multiple levels: schema conformance (do all nodes and edges match declared types?), coverage (are expected entity types populated?), consistency (no contradictory edges), and completeness (sample-based human review). Use a second LLM as a validator to fact-check extracted triples against source documents. Compute graph statistics (node degree distribution, connected components, orphan nodes) to identify extraction gaps. Quality criteria are defined in Quality Rubric.
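
The graph statistics mentioned above can be computed with the standard library alone. This is a minimal sketch assuming the graph fits in memory as a list of node ids and `(source, target)` pairs; `graph_stats` is a hypothetical helper name.

```python
from collections import Counter, defaultdict


def graph_stats(nodes, edges):
    """Compute degree distribution, connected components, and orphan nodes.

    Edge direction is ignored, so components are weakly connected, and
    degree counts distinct neighbors rather than parallel edges.
    """
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    all_nodes = set(nodes) | set(neighbors)
    degree = {n: len(neighbors.get(n, set())) for n in all_nodes}
    orphans = sorted(n for n, d in degree.items() if d == 0)

    components, seen = [], set()
    for start in sorted(all_nodes):
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from start.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors.get(node, set()) - component)
        seen |= component
        components.append(component)

    return {
        "degree_distribution": Counter(degree.values()),
        "components": components,
        "orphans": orphans,
    }


stats = graph_stats(["a", "b", "c", "d", "e"], [("a", "b"), ("b", "c")])
```

In this toy graph, `d` and `e` are orphans and the graph splits into three components. A high orphan count or many tiny components on real data usually points at entities that were extracted without any relations.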

Architecture Selection Guide

By Use Case

| Model | Flexibility | Standardization | Reasoning | Vector Integration | Query Language | Best For |
|-------|-------------|-----------------|-----------|-------------------|----------------|----------|
| LPG | High | Low | Limited | Native (Neo4j) | Cypher, Gremlin | Rapid development, RAG pipelines |
| RDF/OWL | Medium | High | Full (OWL-DL) | Via extensions | SPARQL | Interoperability, ontology-heavy domains |
| Hypergraph | High | Low | Limited | Custom | Custom APIs | N-ary relations, multi-entity events |
| Temporal | Medium | Low | Time-based | Via extensions | Temporal Cypher | Evolving knowledge, episodic memory |

By Domain

| Domain | Recommended Model | Rationale |
|--------|-------------------|-----------|
| Biomedical / Clinical | RDF/OWL | UMLS/SNOMED ontologies, reasoning needed |
| Enterprise / RAG | LPG | Fast iteration, vector search integration |
| Event-centric (news, logs) | Hypergraph or Temporal | Multi-participant events, time evolution |
| Legal / Compliance | RDF/OWL | Formal reasoning, provenance chains |
| Scientific Literature | LPG + Layered | Flexible extraction, layered trust |

Schema Patterns

Entity-Relation Pattern

The simplest pattern. Nodes represent entities, edges represent binary relations. Properties on nodes hold attributes; properties on edges hold relation metadata (confidence, source, timestamp).

(:Person {name, role}) -[:WORKS_AT {since}]-> (:Organization {name, type})
(:Drug {name, class})  -[:TREATS {efficacy}]-> (:Disease {name, icd_code})

Best for: domains with primarily binary relationships and moderate complexity.

Event Reification Pattern

Model N-ary relations and complex events as first-class nodes. An event node connects to all participants via typed role edges. This avoids information loss from forcing N-ary relations into binary edges.

(:ClinicalTrial {id, phase, start_date})
  -[:HAS_DRUG]->     (:Drug {name})
  -[:HAS_CONDITION]-> (:Disease {name})
  -[:HAS_OUTCOME]->   (:Outcome {measure, value})
  -[:CONDUCTED_BY]->  (:Organization {name})

Best for: events with multiple participants, clinical data, news events, financial transactions.
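
To make the reification idea concrete, the sketch below turns one N-ary fact into an event node plus one typed role edge per participant. The `reify_event` function and all ids (`trial:001`, `drug:aspirin`, and so on) are hypothetical; participants are assumed to exist already as nodes with stable ids.

```python
def reify_event(event_id, label, props, roles):
    """Build an event node and its role edges.

    roles maps a role edge type to the list of participant node ids.
    Returns (event_node, edges), where each edge is (event_id, role, participant_id).
    """
    # Reserved keys are written last so props cannot overwrite id or label.
    event_node = {**props, "id": event_id, "label": label}
    edges = [
        (event_id, role, participant_id)
        for role, participant_ids in roles.items()
        for participant_id in participant_ids
    ]
    return event_node, edges


trial_node, trial_edges = reify_event(
    "trial:001",
    "ClinicalTrial",
    {"phase": 3},
    {
        "HAS_DRUG": ["drug:aspirin"],
        "HAS_CONDITION": ["disease:mi"],
        "CONDUCTED_BY": ["org:a", "org:b"],
    },
)
```

The trial with two conducting organizations yields four role edges from a single event node, which a set of binary Drug-to-Disease edges could not represent without losing which organization ran which trial.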

Layered Tier Pattern

Separate the graph into trust-differentiated layers that can be queried independently or together.

Layer 3 (Canonical Ontology): Formal class hierarchy, relation definitions, constraints
Layer 2 (Domain Knowledge):   Curated facts from literature, expert-validated
Layer 1 (Instance Data):      Extracted from user documents, case-specific, lower confidence

Cross-layer edges link instances to domain concepts and domain concepts to ontology classes. Provenance metadata on every edge records: source document, extraction method, confidence score, and timestamp.

Best for: multi-source integration, RAG with trust scoring, enterprise knowledge management.
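
Cross-layer edges can be followed upward to find which ontology classes an instance ultimately belongs to. The sketch below assumes two hypothetical edge tables, `INSTANCE_OF` (Layer 1 to Layer 2) and `IS_A` (Layer 2 upward into Layer 3), with invented node names.

```python
# Hypothetical cross-layer edges, each mapping a node to its parent one level up.
INSTANCE_OF = {"note_17:aspirin": "Aspirin"}  # instance -> domain concept
IS_A = {"Aspirin": "NSAID", "NSAID": "Drug"}   # concept -> broader class


def ontology_path(instance_id):
    """Follow cross-layer edges from an instance up to its broadest known class."""
    path = [instance_id]
    seen = set()
    node = INSTANCE_OF.get(instance_id)
    # The seen set guards against accidental cycles in the class hierarchy.
    while node is not None and node not in seen:
        seen.add(node)
        path.append(node)
        node = IS_A.get(node)
    return path


path = ontology_path("note_17:aspirin")
```

An instance with no cross-layer edge returns a path containing only itself, which is a cheap way to spot Layer 1 entities that were never linked to the domain layer.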

Output Template

KNOWLEDGE GRAPH CONSTRUCTION SPECIFICATION
============================================

Domain: [Target domain and scope]
Use Case: [RAG / Reasoning / Analytics / Hybrid]
Data Sources: [List of input data types and volumes]

Data Model: [LPG / RDF / Hypergraph / Temporal]
Query Language: [Cypher / SPARQL / Gremlin / Custom]
Storage Backend: [Neo4j / Amazon Neptune / Virtuoso / etc.]

Schema Definition:
  Node Types:
  1. [EntityType] - [description]
     Properties: [list with types]
  2. [EntityType] - [description]
     Properties: [list with types]
  3. [Continue for each node type...]

  Edge Types:
  1. [RelationType] (source -> target) - [description]
     Properties: [list with types]
  2. [Continue for each edge type...]

  Constraints:
  - [Cardinality, uniqueness, required properties]

Extraction Pipeline:
  1. Entity Extraction
     - Method: [LLM-assisted / NER / Hybrid]
     - Prompt template: [summary or reference]
     - Verification: [Multi-round / Second-LLM / Manual sample]
  2. Relation Extraction
     - Method: [Prompt-based / Dependency parsing / Hybrid]
     - Few-shot examples: [count and source]
  3. Normalization
     - Deduplication: [method]
     - Ontology linking: [target ontology]
     - Synonym resolution: [approach]

Layered Architecture:
  Layer 1 (Instance): [description of instance-level data]
  Layer 2 (Domain):   [description of curated domain knowledge]
  Layer 3 (Ontology): [description of formal schema]
  Provenance: [How source/confidence/timestamp are tracked]

Validation Plan:
  - Schema conformance: [automated checks]
  - Coverage: [expected entity/relation counts]
  - Consistency: [contradiction detection method]
  - Human review: [sampling strategy]

Estimated Scale: [node count, edge count, properties per node]
Key Dependencies: [libraries, APIs, ontologies]

NEXT STEPS:
- Implement extraction pipeline on sample data
- Populate graph and run validation suite
- Iterate schema based on extraction results
- Integrate with downstream application (RAG, reasoning, etc.)


Designs and builds knowledge graphs from unstructured or semi-structured data sources. Guides through data model selection (LPG, RDF, hypergraph, temporal), schema design, entity/relation extraction pipelines, and layered architecture construction. Use when designing knowledge graphs, choosing between LPG vs RDF, planning entity extraction, designing graph schemas, aligning ontologies, building a KG for RAG, or when user mentions knowledge graph construction.

81 stars

development

Updated Apr 20, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add lyndonkl/claude knowledge-graph-construction

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 20, 2026, 6:27 AM (5.6s, 4 files scanned)


Related Skills

lyndonkl/conf-theme-clustering

testing

Verified, Trusted, Community

Cluster a conference's event records into a small set of coarse themes with finer sub-clusters, an explicit outlier bucket, and soft (multi-membership) affinities — using the hybrid embed-then-label pipeline (embed abstracts, reduce, density-cluster, then LLM-label the clusters) when embedding libraries are available, and an LLM-reasoned hierarchical fallback when they are not. Embeddings do the grouping; the LLM only names the groups. Conference-agnostic. Use when turning structured event records into a navigable theme map for preference elicitation and scheduling, when you need 6-8 reasonable themes rather than 20 muddy ones, or when overlapping talks must belong to more than one theme. Trigger keywords - theme clustering, cluster talks, embed then label, soft membership, outlier talks, conference themes, topic map.

127 · SKILL.md · Updated Jun 28, 2026

lyndonkl/conf-schedule-optimization

development

Verified, Trusted, Community

Build a personal conference schedule as a constraint-optimization problem — hard constraints (no time overlap, room-to-room travel time, capacity/registration, the attendee's own must-attends and blackouts) plus a user-owned weighted objective trading interest against breadth, pacing (maximize contiguous free time), and serendipity. Surfaces unbreakable conflicts (two high-value overlapping talks the model cannot rank) as decisions for the human rather than silently picking, and reports what each choice traded away. Conference-agnostic. Use to turn a preference profile plus a theme map into a day-by-day plan, to resolve overlapping sessions, or to balance a packed vs paced schedule. Trigger keywords - schedule optimization, conference schedule, constraint optimization, overlapping talks, contiguous free time, conflict surfacing, packed vs paced.

127 · SKILL.md · Updated Jun 28, 2026

lyndonkl/conf-program-extraction

development

Verified, Trusted, Community

Parse a heterogeneous conference program (markdown, HTML, PDF-derived text, or JSON) into normalized event records with per-field confidence scores and independent classification axes (topic, depth, format, prerequisites, recorded, capacity). Detects the program's format before extracting, treats every inferred field as uncertain (present vs inferred vs missing), and flags thin or missing abstracts so downstream enrichment can target them. Conference-agnostic. Use when ingesting a conference or event schedule into a structured store, normalizing a talk/session list, or extracting per-session metadata with calibrated confidence. Trigger keywords - program ingestion, parse schedule, session extraction, event records, conference program, talk metadata, per-field confidence.

127 · SKILL.md · Updated Jun 28, 2026

lyndonkl/conf-preference-elicitation

development

Verified, Trusted, Community

Build a personalized preference profile from a small number of well-chosen, cluster-grounded questions instead of a long survey. Represents the person's interests as an uncertainty region over the theme map, picks the single highest-information-gain choice-based question (contrasting real talks from different clusters), balances exploiting known interests against exploring uncertain ones, deliberately injects outlier probes to fight selection bias, and stops as soon as the schedule would be stable. Also elicits the user-owned objective weights and hard constraints. Interactive — runs where it can actually ask the person. Conference-agnostic. Use to turn a theme map into a preference profile, to decide what to ask a conference attendee, or to elicit scheduling priorities. Trigger keywords - preference elicitation, ask few questions, information gain, choice-based questions, selection bias probe, objective weights, attendee preferences.

127 · SKILL.md · Updated Jun 28, 2026

Download

For Claude Desktop: download the skill once, then upload the file in the app. No terminal is needed.

Install manually

# Clone the repo
git clone https://github.com/lyndonkl/claude.git

# Copy into Claude Code skills folder (global)
cp -r claude/skills/knowledge-graph-construction ~/.claude/skills/


Repository

lyndonkl/claude

81 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
