Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

pwrdrvr/ghcrawl

Name: ghcrawl
Author: pwrdrvr

skills/ghcrawl/SKILL.md

npx skillsauth add pwrdrvr/ghcrawl ghcrawl

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

ghcrawl

Use ghcrawl as the machine-facing interface for local GitHub duplicate-cluster analysis.

Never read the ghcrawl SQLite database directly with sqlite3 or any other database client. If the supported CLI cannot return the needed information, report that CLI problem to the user instead of bypassing the interface.

Do not scrape the TUI. Prefer JSON CLI output.

The skill has two modes:

Default mode: assume API credentials are absent, unavailable, or irrelevant and stay read-only on existing local data.
API-enabled mode: only after ghcrawl doctor --json proves GitHub and OpenAI auth are configured and healthy.

In default mode, do not treat missing credentials as a problem unless the user explicitly asked for an API-backed operation or a supported read-only CLI command failed and doctor shows local setup is broken.

Even in API-enabled mode, never run sync, embed, cluster, or refresh unless the user explicitly asks for that work. Those commands can take a long time, consume paid API usage, and trigger rate limiting if used too often.

Current pipeline defaults to keep in mind:

persistent semantic search and clustering use a vectorlite sidecar index
the default summary model is gpt-5-mini
the default embedding basis is title_original, so refresh does not summarize unless the user explicitly switches to title_summary
changing summary model or embedding basis with ghcrawl configure makes the next refresh rebuild vectors and clusters
opting into title_summary can materially improve clustering quality, but it adds OpenAI cost; on openclaw/openclaw it improved non-solo cluster membership by about 50%

Also never run close-thread or close-cluster unless the user explicitly asks you to mark a local thread or cluster closed.

When to use this skill

The user wants related issue/PR clusters for one repo.
The user wants to refresh local ghcrawl data before analysis.
The user wants cluster summaries, cluster detail dumps, or nearest neighbors from a local ghcrawl database.

Command preference

Prefer the installed ghcrawl bin.

If ghcrawl is not on PATH, use:

npx ghcrawl cli ...

Do not start by running ghcrawl --help or <subcommand> --help. The documented command surface in this skill and references/protocol.md is the default source of truth. If you are actively maintaining ghcrawl itself or syntax is genuinely in doubt, command help is available through ghcrawl help <command> and ghcrawl <command> --help.

Core workflow

1. Default read-only flow

Do not run doctor on skill startup by default.

Start with local read-only commands:

Without explicit user direction to refresh data, prefer these local-only commands:

ghcrawl threads owner/repo --numbers 12345 --json
ghcrawl clusters owner/repo --min-size 10 --limit 20 --sort recent --json
ghcrawl cluster-detail owner/repo --id 123 --member-limit 20 --body-chars 280 --json
ghcrawl threads owner/repo --numbers 42,43,44 --json
ghcrawl author owner/repo --login lqquan --json
ghcrawl search owner/repo --query "download stalls" --mode hybrid --json
ghcrawl neighbors owner/repo --number 42 --limit 10 --json
ghcrawl configure --json

These operate on the existing local SQLite dataset.

Treat that stored dataset as the default source of truth for read-only analysis. Do not probe credentials, inspect env vars, or explain missing auth unless an API-backed task was requested or the supported CLI path is failing.

By default:

threads and author hide locally closed issues/PRs
clusters and cluster-detail hide locally closed clusters

If the user explicitly wants to inspect those records, add --include-closed.

Use threads --numbers 12345 when you need to find the cluster for one specific issue/PR number. The returned thread record includes clusterId. If it is non-null, follow with cluster-detail --id <clusterId>.

Use configure --json when you need to confirm the currently selected summary model or embedding basis before suggesting an expensive refresh.

Use threads --numbers ... when you need a batch of specific issue/PR records. Do not pay the CLI startup cost 10 times for 10 separate single-thread lookups.

Use author --login ... when you need one author's open threads and their strongest stored same-author similarity matches in one call.

If the user explicitly asks to mark a local issue/PR or cluster closed, use:

ghcrawl close-thread owner/repo --number 42 --json
ghcrawl close-cluster owner/repo --id 123 --json

If close-thread closes the last open item in a cluster, ghcrawl will automatically mark that cluster closed too.

2. Check local health only when needed

Run:

ghcrawl doctor --json

If the bin is unavailable, fall back to:

pnpm --filter ghcrawl cli doctor --json

Only do this when:

the user explicitly wants an API-backed operation such as refresh, sync, embed, or cluster
or a read-only request failed and you need to know whether the local install/config/auth state is broken

Interpret the result like this:

If GitHub/OpenAI auth is missing or unhealthy, stay in read-only mode.
If GitHub/OpenAI auth is healthy, API-backed operations are available, but still require explicit user direction.

If doctor is unhealthy but the user asked only for read-only inspection, say that API-backed refresh is unavailable and continue with read-only CLI commands when possible.

3. If the CLI is unavailable or misbehaving

Use one supported fallback path before giving up:

pnpm --filter ghcrawl cli ...

If a documented ghcrawl command still fails, hangs, or returns unusable output through the supported CLI path, stop and report that to the user. Do not inspect tables, schema, or rows with sqlite3, pragma, or ad hoc SQL.

4. Refresh local data only when explicitly requested

Only if the user explicitly asks to refresh or rebuild data, and doctor says auth is healthy, use:

ghcrawl refresh owner/repo

This runs, in fixed order:

GitHub sync/reconcile
summarize-if-needed
embed refresh
cluster rebuild

You may skip steps only when the user explicitly wants that or the freshness state makes it unnecessary:

ghcrawl refresh owner/repo --no-sync
ghcrawl refresh owner/repo --no-cluster

Do not decide on your own to run cluster just because it is local-only. It is still long-running and should be treated as an explicit user-directed operation.

5. List clusters

Use:

ghcrawl clusters owner/repo --min-size 10 --limit 20 --sort recent --json

This returns:

repo stats
freshness state
cluster summaries

6. Inspect one cluster

Use:

ghcrawl cluster-detail owner/repo --id 123 --member-limit 20 --body-chars 280 --json

This returns:

the selected cluster summary
each member thread
a body snippet
stored summary fields when present

7. Optional deeper inspection

Use search or neighbors as needed:

ghcrawl search owner/repo --query "download stalls" --mode hybrid --json
ghcrawl neighbors owner/repo --number 42 --limit 10 --json

Output rules

Report the repo name and whether you refreshed data in this run.
When listing clusters, include:
- cluster id
- representative number and kind
- display title
- total size
- PR count
- issue count
- latest updated time
When naming a cluster in prose, use this shape:
- Cluster <clusterId> (#<representativeNumber> representative <issue|pr>)
- example: Cluster 23945 (#42035 representative issue)
When drilling into a cluster, include clickable GitHub links for each issue/PR if you mention them.
Prefer concise summaries over dumping raw JSON.
If freshness is stale, say that explicitly:
- embeddings outdated
- clusters outdated
If you stayed read-only because doctor was not healthy or the user did not explicitly request a refresh, say that explicitly.

References

For the exact JSON-oriented command surface and examples, read:

references/protocol.md

pwrdrvr/ghcrawl

skills/ghcrawl/SKILL.md

Use the local ghcrawl CLI to inspect duplicate clusters and issue/PR summaries from the existing ghcrawl dataset, and refresh one repo only when the user explicitly asks. Use when a user wants to triage related issues or PRs, inspect semantic clusters, or run ghcrawl's staged refresh pipeline.

16 stars

tools

Updated Apr 11, 2026

$ install --global

skillsauth

npx skillsauth add pwrdrvr/ghcrawl ghcrawl

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 11, 2026, 8:44 PM26.4s3 files scanned

SKILL.md

name:: ghcrawl
description:: Use the local ghcrawl CLI to inspect duplicate clusters and issue/PR summaries from the existing ghcrawl dataset, and refresh one repo only when the user explicitly asks. Use when a user wants to triage related issues or PRs, inspect semantic clusters, or run ghcrawl's staged refresh pipeline.
allowed-tools:: Bash(ghcrawl:*), Bash(pnpm:*), Read(*)

ghcrawl

Use ghcrawl as the machine-facing interface for local GitHub duplicate-cluster analysis.

Do not scrape the TUI. Prefer JSON CLI output.

The skill has two modes:

Default mode: assume API credentials are absent, unavailable, or irrelevant and stay read-only on existing local data.
API-enabled mode: only after ghcrawl doctor --json proves GitHub and OpenAI auth are configured and healthy.

Current pipeline defaults to keep in mind:

persistent semantic search and clustering use a vectorlite sidecar index
the default summary model is gpt-5-mini
the default embedding basis is title_original, so refresh does not summarize unless the user explicitly switches to title_summary
changing summary model or embedding basis with ghcrawl configure makes the next refresh rebuild vectors and clusters
opting into title_summary can materially improve clustering quality, but it adds OpenAI cost; on openclaw/openclaw it improved non-solo cluster membership by about 50%

Also never run close-thread or close-cluster unless the user explicitly asks you to mark a local thread or cluster closed.

When to use this skill

The user wants related issue/PR clusters for one repo.
The user wants to refresh local ghcrawl data before analysis.
The user wants cluster summaries, cluster detail dumps, or nearest neighbors from a local ghcrawl database.

Command preference

Prefer the installed ghcrawl bin.

If ghcrawl is not on PATH, use:

npx ghcrawl cli ...

Core workflow

1. Default read-only flow

Do not run doctor on skill startup by default.

Start with local read-only commands:

Without explicit user direction to refresh data, prefer these local-only commands:

ghcrawl threads owner/repo --numbers 12345 --json
ghcrawl clusters owner/repo --min-size 10 --limit 20 --sort recent --json
ghcrawl cluster-detail owner/repo --id 123 --member-limit 20 --body-chars 280 --json
ghcrawl threads owner/repo --numbers 42,43,44 --json
ghcrawl author owner/repo --login lqquan --json
ghcrawl search owner/repo --query "download stalls" --mode hybrid --json
ghcrawl neighbors owner/repo --number 42 --limit 10 --json
ghcrawl configure --json

These operate on the existing local SQLite dataset.

By default:

threads and author hide locally closed issues/PRs
clusters and cluster-detail hide locally closed clusters

If the user explicitly wants to inspect those records, add --include-closed.

Use configure --json when you need to confirm the currently selected summary model or embedding basis before suggesting an expensive refresh.

Use threads --numbers ... when you need a batch of specific issue/PR records. Do not pay the CLI startup cost 10 times for 10 separate single-thread lookups.

Use author --login ... when you need one author's open threads and their strongest stored same-author similarity matches in one call.

If the user explicitly asks to mark a local issue/PR or cluster closed, use:

ghcrawl close-thread owner/repo --number 42 --json
ghcrawl close-cluster owner/repo --id 123 --json

If close-thread closes the last open item in a cluster, ghcrawl will automatically mark that cluster closed too.

2. Check local health only when needed

Run:

ghcrawl doctor --json

If the bin is unavailable, fall back to:

pnpm --filter ghcrawl cli doctor --json

Only do this when:

the user explicitly wants an API-backed operation such as refresh, sync, embed, or cluster
or a read-only request failed and you need to know whether the local install/config/auth state is broken

Interpret the result like this:

If GitHub/OpenAI auth is missing or unhealthy, stay in read-only mode.
If GitHub/OpenAI auth is healthy, API-backed operations are available, but still require explicit user direction.

If doctor is unhealthy but the user asked only for read-only inspection, say that API-backed refresh is unavailable and continue with read-only CLI commands when possible.

3. If the CLI is unavailable or misbehaving

Use one supported fallback path before giving up:

pnpm --filter ghcrawl cli ...

4. Refresh local data only when explicitly requested

Only if the user explicitly asks to refresh or rebuild data, and doctor says auth is healthy, use:

ghcrawl refresh owner/repo

This runs, in fixed order:

GitHub sync/reconcile
summarize-if-needed
embed refresh
cluster rebuild

You may skip steps only when the user explicitly wants that or the freshness state makes it unnecessary:

ghcrawl refresh owner/repo --no-sync
ghcrawl refresh owner/repo --no-cluster

Do not decide on your own to run cluster just because it is local-only. It is still long-running and should be treated as an explicit user-directed operation.

5. List clusters

Use:

ghcrawl clusters owner/repo --min-size 10 --limit 20 --sort recent --json

This returns:

repo stats
freshness state
cluster summaries

6. Inspect one cluster

Use:

ghcrawl cluster-detail owner/repo --id 123 --member-limit 20 --body-chars 280 --json

This returns:

the selected cluster summary
each member thread
a body snippet
stored summary fields when present

7. Optional deeper inspection

Use search or neighbors as needed:

ghcrawl search owner/repo --query "download stalls" --mode hybrid --json
ghcrawl neighbors owner/repo --number 42 --limit 10 --json

Output rules

Report the repo name and whether you refreshed data in this run.
When listing clusters, include:
- cluster id
- representative number and kind
- display title
- total size
- PR count
- issue count
- latest updated time
When naming a cluster in prose, use this shape:
- Cluster <clusterId> (#<representativeNumber> representative <issue|pr>)
- example: Cluster 23945 (#42035 representative issue)
When drilling into a cluster, include clickable GitHub links for each issue/PR if you mention them.
Prefer concise summaries over dumping raw JSON.
If freshness is stale, say that explicitly:
- embeddings outdated
- clusters outdated
If you stayed read-only because doctor was not healthy or the user did not explicitly request a refresh, say that explicitly.

References

For the exact JSON-oriented command surface and examples, read:

references/protocol.md

Related Skills

pwrdrvr/release

documentation

VerifiedTrustedCommunity

Plan and publish a GitHub Release in a tag-driven repository. Use when a user asks to cut, prepare, or publish a software release, propose the next vX.Y.Z tag, support prerelease tags like vX.Y.Z-beta.N, draft better release notes from PRs and direct commits since the last release, update CHANGELOG.md, create the tag pinned to an exact commit, and watch the publish workflow.

16SKILL.mdUpdated Apr 11, 2026

pwrdrvr/project-manager

testing

VerifiedTrustedCommunity

Manage GitHub issues and the GitHub Project board for the current repository, while keeping the local tracker in sync. Use when the user wants to capture freeform requirements as issues, flesh out issue descriptions from repo or upstream research, triage Priority/Size/Workflow/Status, add issues or PRs to the repo's configured project board, or reconcile GitHub state with `.local/work-items.yaml`.

16SKILL.mdUpdated Apr 11, 2026

pwrdrvr/project-manager

openclaw/taskflow

tools

VerifiedTrustedCommunity

Use when work should span one or more detached tasks but still behave like one job with a single owner context. TaskFlow is the durable flow substrate under authoring layers like Lobster, ACPX, plugins, or plain code. Keep conditional logic in the caller; use TaskFlow for flow identity, child-task linkage, waiting state, revision-checked mutations, and user-facing emergence.

357,764SKILL.mdUpdated Apr 10, 2026

openclaw/extensions/lobster

tools

VerifiedTrustedCommunity

# Lobster Lobster executes multi-step workflows with approval checkpoints. Use it when: - User wants a repeatable automation (triage, monitor, sync) - Actions need human approval before executing (send, post, delete) - Multiple tool calls should run as one deterministic operation ## When to use Lobster | User intent | Use Lobster? | | ------------------------------------------------------ | --------------------------

357,764SKILL.mdUpdated Apr 10, 2026

openclaw/extensions/lobster

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/pwrdrvr/ghcrawl.git

# Copy into Claude Code skills folder (global)
cp -r ghcrawl/skills/ghcrawl ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

pwrdrvr/ghcrawl

16 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT