Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

agtm1199/build-kg

Name: build-kg
Author: agtm1199

.claude/skills/build-kg/SKILL.md

npx skillsauth add agtm1199/build-kg build-kg

Build a knowledge graph for: $ARGUMENTS

Activate the virtual environment before every Python command: . venv/bin/activate && <command>.

Phase 0: Init

Sanitize the topic into a graph-safe name (lowercase, underscores, no special chars). Example: "kubernetes networking" becomes kubernetes_networking.
Create the working directory:
```
mkdir -p kg_builds/$GRAPH_NAME
```
Update AGE_GRAPH_NAME=$GRAPH_NAME in .env.
Initialize the graph schema:
```
python -m build_kg.setup_graph
```
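The sanitization rule in Phase 0 can be sketched in a few lines of Python. This is an illustration only, not code from build-kg: the helper name sanitize_graph_name and the digit-prefix rule are assumptions made for the example.

```python
import re


def sanitize_graph_name(topic: str) -> str:
    """Lowercase the topic and collapse each run of other characters into one underscore."""
    name = re.sub(r"[^a-z0-9]+", "_", topic.lower()).strip("_")
    if not name:
        raise ValueError(f"topic {topic!r} contains no usable characters")
    # Assumption: graph names should not start with a digit, so add a prefix.
    if name[0].isdigit():
        name = f"g_{name}"
    return name


print(sanitize_graph_name("kubernetes networking"))  # kubernetes_networking
```

The result is what would be exported as GRAPH_NAME and written to AGE_GRAPH_NAME in .env.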

Phase 0.5: Ontology Generation

Design a domain ontology for $ARGUMENTS. This is the most important phase.

Identify 3-6 node types for the core entities. For each:
- label: PascalCase (e.g. Component, Algorithm)
- description: what this node represents
- properties: key-value pairs with types (string, integer, float, boolean, json)
Identify 3-8 edge types. For each:
- label: UPPER_SNAKE_CASE (e.g. USES, DEPENDS_ON)
- source and target: node labels
- description: what the relationship means
Choose a root_node: the primary node type that maps 1:1 to source fragments.
Write the json_schema: the exact JSON format the LLM should output.
Save as kg_builds/$GRAPH_NAME/ontology.yaml:

```
description: "<Topic> knowledge graph ontology"
nodes:
  - label: "NodeType1"
    description: "..."
    properties:
      name: "string"
      category: "string"
  - label: "NodeType2"
    description: "..."
    properties:
      name: "string"
edges:
  - label: "RELATIONSHIP_NAME"
    source: "NodeType1"
    target: "NodeType2"
    description: "..."
root_node: "NodeType1"
json_schema: |
  {
    "entities": [
      {"_label": "NodeType1|NodeType2", "name": "...", "category": "..."}
    ],
    "relationships": [
      {"_label": "RELATIONSHIP_NAME", "_from_index": 0, "_to_index": 1}
    ]
  }
```
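To make the json_schema output format concrete, here is a minimal sketch of how a payload in that shape could be turned into (source, relationship, target) triples. It assumes that _from_index and _to_index are zero-based positions in the entities array, which the template suggests but does not state. The function name resolve_relationships and the sample payload are hypothetical.

```python
def resolve_relationships(payload: dict) -> list[tuple[str, str, str]]:
    """Map index-based relationships onto entity names."""
    entities = payload.get("entities", [])
    triples = []
    for rel in payload.get("relationships", []):
        src, dst = rel["_from_index"], rel["_to_index"]
        # Skip relationships that point outside the entities array.
        if not (0 <= src < len(entities) and 0 <= dst < len(entities)):
            continue
        triples.append((entities[src]["name"], rel["_label"], entities[dst]["name"]))
    return triples


# Hypothetical LLM output following the template's shape.
sample = {
    "entities": [
        {"_label": "Component", "name": "kube-proxy", "category": "networking"},
        {"_label": "Algorithm", "name": "iptables rules"},
    ],
    "relationships": [
        {"_label": "USES", "_from_index": 0, "_to_index": 1},
    ],
}
print(resolve_relationships(sample))
```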

Reinitialize the graph with the ontology:

```
python -m build_kg.setup_graph --ontology kg_builds/$GRAPH_NAME/ontology.yaml
```
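The naming and count rules from this phase can be checked mechanically before the graph is reinitialized. The sketch below is not part of build-kg. It assumes the ontology has already been loaded into a dict with the same keys as ontology.yaml, and validate_ontology is a hypothetical helper name.

```python
import re

PASCAL_CASE = re.compile(r"^[A-Z][A-Za-z0-9]*$")
UPPER_SNAKE = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$")
PROPERTY_TYPES = {"string", "integer", "float", "boolean", "json"}


def validate_ontology(ontology: dict) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    nodes = ontology.get("nodes", [])
    edges = ontology.get("edges", [])
    labels = {node["label"] for node in nodes}

    if not 3 <= len(nodes) <= 6:
        problems.append(f"expected 3-6 node types, found {len(nodes)}")
    if not 3 <= len(edges) <= 8:
        problems.append(f"expected 3-8 edge types, found {len(edges)}")

    for node in nodes:
        if not PASCAL_CASE.match(node["label"]):
            problems.append(f"node label {node['label']!r} is not PascalCase")
        for prop, kind in node.get("properties", {}).items():
            if kind not in PROPERTY_TYPES:
                problems.append(f"{node['label']}.{prop} has unknown type {kind!r}")

    for edge in edges:
        if not UPPER_SNAKE.match(edge["label"]):
            problems.append(f"edge label {edge['label']!r} is not UPPER_SNAKE_CASE")
        for end in ("source", "target"):
            if edge[end] not in labels:
                problems.append(f"edge {edge['label']} {end} {edge[end]!r} is not a declared node")

    if ontology.get("root_node") not in labels:
        problems.append("root_node is not a declared node label")
    return problems


# The two-node template from this phase is below the 3-6 / 3-8 ranges.
template = {
    "nodes": [{"label": "NodeType1"}, {"label": "NodeType2"}],
    "edges": [{"label": "RELATIONSHIP_NAME", "source": "NodeType1", "target": "NodeType2"}],
    "root_node": "NodeType1",
}
print(validate_ontology(template))
```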

Phase 1: Discover Sources

Find 5-15 authoritative sources about $ARGUMENTS using web search.

Search with multiple queries:
- "$ARGUMENTS" official documentation
- "$ARGUMENTS" comprehensive guide
- "$ARGUMENTS" reference manual
- "$ARGUMENTS" tutorial overview
- "$ARGUMENTS" specification
Evaluate each result: Is it authoritative? Does it have substantial text? Is it crawlable?
Organize into priority tiers:
- P1: Official docs, specs, reference manuals (depth 2-3, up to 50 pages)
- P2: Tutorials, guides, educational content (depth 1-2, up to 20 pages)
Create kg_builds/$GRAPH_NAME/manifest.json:

```
{
  "topic": "$ARGUMENTS",
  "graph_name": "$GRAPH_NAME",
  "sources": [
    {
      "source_name": "descriptive_short_name",
      "url": "https://...",
      "title": "Page Title",
      "authority": "Organization Name",
      "jurisdiction": "",
      "doc_type": "documentation",
      "priority": "P1",
      "depth": 2,
      "max_pages": 50,
      "delay": 1500
    }
  ],
  "defaults": {
    "jurisdiction": "",
    "authority": "",
    "doc_type": "documentation"
  }
}
```
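A manifest in this shape could also be assembled programmatically, with crawl limits derived from the priority tier. The following is a sketch under assumptions: the helper names make_source and make_manifest are hypothetical, the tier values are one choice within the P1/P2 ranges above, and the example URL is a placeholder.

```python
import json

# One pick within the documented ranges: P1 depth 2-3 / 50 pages, P2 depth 1-2 / 20 pages.
TIER_DEFAULTS = {
    "P1": {"depth": 2, "max_pages": 50},
    "P2": {"depth": 1, "max_pages": 20},
}


def make_source(source_name, url, title, authority, priority="P1",
                doc_type="documentation", delay=1500):
    """Build one entry for the manifest's "sources" array."""
    if priority not in TIER_DEFAULTS:
        raise ValueError(f"unknown priority tier {priority!r}")
    return {
        "source_name": source_name,
        "url": url,
        "title": title,
        "authority": authority,
        "jurisdiction": "",
        "doc_type": doc_type,
        "priority": priority,
        **TIER_DEFAULTS[priority],
        "delay": delay,
    }


def make_manifest(topic, graph_name, sources):
    """Wrap the sources with the top-level fields shown in the template."""
    return {
        "topic": topic,
        "graph_name": graph_name,
        "sources": list(sources),
        "defaults": {"jurisdiction": "", "authority": "", "doc_type": "documentation"},
    }


manifest = make_manifest(
    "kubernetes networking",
    "kubernetes_networking",
    [make_source("example_docs", "https://example.com/docs", "Example Docs", "Example Org")],
)
print(json.dumps(manifest, indent=2))
```

Writing the result to kg_builds/$GRAPH_NAME/manifest.json is then a single json.dump call.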

Gap analysis: check that every ontology node type has source coverage. If a type is uncovered, search for more sources.

Phase 2: Crawl

For each source in the manifest:

```
build-kg-crawl --url "$URL" --output kg_builds/$GRAPH_NAME/crawled/$SOURCE_NAME --depth $DEPTH --pages $MAX_PAGES --delay $DELAY --format markdown
```

If a crawl fails, note it and continue. Do not retry more than once.
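The per-source loop and the retry limit can be sketched as a small driver. This is a hypothetical wrapper, not part of build-kg: it uses only the build-kg-crawl flags shown above, and the function names crawl_command and crawl_all are made up for the example. The run parameter exists so the loop can be exercised without launching real crawls.

```python
import subprocess
from pathlib import Path


def crawl_command(source: dict, graph_name: str) -> list[str]:
    """Translate one manifest entry into the build-kg-crawl argument list."""
    output = Path("kg_builds") / graph_name / "crawled" / source["source_name"]
    return [
        "build-kg-crawl",
        "--url", source["url"],
        "--output", output.as_posix(),
        "--depth", str(source["depth"]),
        "--pages", str(source["max_pages"]),
        "--delay", str(source["delay"]),
        "--format", "markdown",
    ]


def crawl_all(manifest: dict, run=subprocess.run, max_attempts: int = 2) -> list[str]:
    """Crawl every source and return the names of those that still failed.

    max_attempts=2 means one initial attempt plus at most one retry.
    """
    failed = []
    for source in manifest["sources"]:
        cmd = crawl_command(source, manifest["graph_name"])
        for _ in range(max_attempts):
            if run(cmd).returncode == 0:
                break
        else:
            # Every attempt failed: note it and continue with the next source.
            failed.append(source["source_name"])
    return failed


example = {"source_name": "example_docs", "url": "https://example.com/docs",
           "depth": 2, "max_pages": 50, "delay": 1500}
print(" ".join(crawl_command(example, "kubernetes_networking")))
```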

Phase 3: Chunk

```
build-kg-chunk kg_builds/$GRAPH_NAME/crawled kg_builds/$GRAPH_NAME/chunks --strategy by_title --max-chars 1000
```
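For intuition about what a title-based strategy with a character cap might do, here is a rough sketch. It is not build-kg-chunk's implementation, which this page does not describe. It assumes that a new chunk starts at each Markdown heading and that oversized sections are split at paragraph boundaries, with a hard cut as a last resort. The function name chunk_by_title is hypothetical.

```python
def chunk_by_title(markdown: str, max_chars: int = 1000) -> list[str]:
    """Split Markdown at headings, then cap each piece at max_chars."""
    # Pass 1: group lines into sections, starting a new one at every heading.
    sections, current = [], []
    for line in markdown.splitlines():
        if line.startswith("#") and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())

    # Pass 2: pack paragraphs into chunks no longer than max_chars.
    chunks = []
    for section in sections:
        buffer = ""
        for paragraph in section.split("\n\n"):
            candidate = f"{buffer}\n\n{paragraph}" if buffer else paragraph
            if len(candidate) <= max_chars:
                buffer = candidate
                continue
            if buffer:
                chunks.append(buffer)
            # A single paragraph longer than the cap is cut hard.
            while len(paragraph) > max_chars:
                chunks.append(paragraph[:max_chars])
                paragraph = paragraph[max_chars:]
            buffer = paragraph
        if buffer:
            chunks.append(buffer)
    return chunks


print(chunk_by_title("# A\n\nalpha\n\n# B\n\nbeta"))
```

Note that this simple version treats any line starting with "#" as a heading, including comment lines inside code samples.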

Phase 4: Load

```
build-kg-load kg_builds/$GRAPH_NAME/chunks --manifest kg_builds/$GRAPH_NAME/manifest.json
```

Phase 5: Parse

Small datasets (< 500 fragments) — sync:

```
build-kg-parse --ontology kg_builds/$GRAPH_NAME/ontology.yaml
```

Large datasets (500+ fragments) — batch (50% cheaper):

```
build-kg-parse-batch prepare --ontology kg_builds/$GRAPH_NAME/ontology.yaml --output kg_builds/$GRAPH_NAME/batch_requests.jsonl
build-kg-parse-batch submit kg_builds/$GRAPH_NAME/batch_requests.jsonl
build-kg-parse-batch status $BATCH_ID --watch
build-kg-parse-batch process $BATCH_ID --ontology kg_builds/$GRAPH_NAME/ontology.yaml
```
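The choice between the two parse modes reduces to a threshold check on the fragment count. A minimal sketch, assuming the count is already known (for example from the load step's output): parse_commands is a hypothetical helper that only assembles the argument lists shown above. The status and process steps are left out because they need the batch ID returned by submit.

```python
BATCH_THRESHOLD = 500  # fragments; below this the sync path is used


def parse_commands(fragment_count: int, graph_name: str) -> list[list[str]]:
    """Return the parse commands to run first for a dataset of the given size."""
    ontology = f"kg_builds/{graph_name}/ontology.yaml"
    if fragment_count < BATCH_THRESHOLD:
        return [["build-kg-parse", "--ontology", ontology]]
    requests = f"kg_builds/{graph_name}/batch_requests.jsonl"
    return [
        ["build-kg-parse-batch", "prepare", "--ontology", ontology, "--output", requests],
        ["build-kg-parse-batch", "submit", requests],
    ]


for command in parse_commands(1200, "kubernetes_networking"):
    print(" ".join(command))
```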

Phase 6: Report

Count nodes by type:

```
SELECT * FROM cypher('$GRAPH_NAME', $$ MATCH (n) RETURN label(n) AS type, count(*) AS total $$) AS (type agtype, total agtype);
```

Count edges by type:

```
SELECT * FROM cypher('$GRAPH_NAME', $$ MATCH ()-[r]->() RETURN type(r) AS rel, count(*) AS total $$) AS (rel agtype, total agtype);
```

Show example subgraphs:

```
SELECT * FROM cypher('$GRAPH_NAME', $$ MATCH (a)-[r]->(b) RETURN a, type(r), b LIMIT 10 $$) AS (a agtype, rel agtype, b agtype);
```
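All three report queries share one wrapper: a Cypher string passed to cypher() with an agtype column list. The sketch below builds that SQL text for a concrete graph name. The helper cypher_sql is hypothetical and only produces a string; running it requires a PostgreSQL session with Apache AGE loaded, which typically means LOAD 'age' and a search_path that includes ag_catalog.

```python
import re


def cypher_sql(graph_name: str, cypher: str, columns: list[str]) -> str:
    """Wrap a Cypher query in the SELECT ... FROM cypher(...) form used above."""
    # The graph name is interpolated into SQL, so accept only safe identifiers.
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", graph_name):
        raise ValueError(f"unsafe graph name {graph_name!r}")
    # "$$" would close the dollar-quoted Cypher string early.
    if "$$" in cypher:
        raise ValueError("Cypher text must not contain $$")
    column_defs = ", ".join(f"{column} agtype" for column in columns)
    return f"SELECT * FROM cypher('{graph_name}', $$ {cypher} $$) AS ({column_defs});"


print(cypher_sql(
    "kubernetes_networking",
    "MATCH (n) RETURN label(n) AS type, count(*) AS total",
    ["type", "total"],
))
```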

Present: topic, graph name, ontology summary, sources crawled, fragments loaded, node/edge counts by type, example Cypher queries, and cost estimate.

Build a knowledge graph from any topic. Generates an ontology, discovers sources, crawls, chunks, loads, and parses into Apache AGE (PostgreSQL).

1 star

development

Updated Mar 30, 2026

npx skillsauth add agtm1199/build-kg build-kg

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy: container and dependency vulnerability scanner. Status: Clean (95%)
- Semgrep: static code analysis for vulnerabilities. Status: Clean (95%)
- mcp-scan (Snyk): Model Context Protocol security validation. Status: Clean (95%)
- Snyk (dep): open source security scanning. Status: Skipped (50%)
- Socket.dev: supply chain security analysis. Status: Skipped (50%)
- VirusTotal: multi-engine malware detection. Status: Skipped (50%)
- CrowdStrike: advanced threat intelligence. Status: Skipped (50%)
- OSV-Scanner: Open Source Vulnerability database check. Status: Skipped (50%)
- OWASP Dep-Check. Status: Skipped (50%)

Last scanned: Apr 1, 2026, 7:04 AM (57.1s, 1 file scanned)

SKILL.md

name: build-kg
description: Build a knowledge graph from any topic. Generates an ontology, discovers sources, crawls, chunks, loads, and parses into Apache AGE (PostgreSQL).
argument-hint: [topic]
disable-model-invocation: true
allowed-tools: Bash, Read, Write, Edit, Glob, Grep, WebSearch, WebFetch

Install manually

```
# Clone the repo
git clone https://github.com/agtm1199/build-kg.git

# Copy into Claude Code skills folder (global)
cp -r build-kg/.claude/skills/build-kg ~/.claude/skills/
```

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT