skills/mine.eval-repo/SKILL.md
Use when the user says: "evaluate this repo", "should I use this library", or wants to assess a third-party dependency. Checks test coverage, code quality, maintenance health, bus factor, and maturity.
npx skillsauth add NodeJSmith/Claudefiles mine.eval-repoInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Thorough assessment of a third-party GitHub repository to decide whether it's worth adopting as a dependency, tool, or reference. Answers the question: "Should I use this?"
$ARGUMENTS — a GitHub URL or owner/repo identifier. Examples:
/eval-repo https://github.com/pchuri/confluence-cli/eval-repo pallets/flaskRun get-skill-tmpdir <repo-name>-eval to create a unique directory (use the repository name only — no owner prefix or /; e.g., flask for pallets/flask), then clone the repo into it. Launch parallel subagents to collect data. All subagents work from the cloned repo.
Before launching subagents, fetch repo metadata with gh:
gh repo view <owner/repo> --json name,description,createdAt,updatedAt,pushedAt,stargazerCount,forkCount,issues,pullRequests,licenseInfo,primaryLanguage,languages,defaultBranchRef,isArchived,isFork
subagent_type: Bash)Run git commands to assess project health over time:
git tag — how many releases, how often, is semantic versioning used?git shortlog -sne --all — count of contributors, concentration of commits (bus factor)subagent_type: Explore)subagent_type: Explore)subagent_type: Explore)Combine subagent findings into a structured assessment. Don't just dump data — interpret it.
Evaluate each dimension on a simple scale: Strong / Adequate / Weak / Missing
| Dimension | Assessment | Evidence | |-----------|-----------|----------| | Maintenance | Active / Sporadic / Dormant / Abandoned | commit frequency, last activity, open issues response time | | Test Coverage | Comprehensive / Partial / Minimal / None | test file count, coverage config, CI enforcement | | Code Quality | Clean / Acceptable / Rough / Poor | file sizes, structure, linting, type safety | | Documentation | Thorough / Adequate / Sparse / None | README depth, API docs, examples | | Community | Growing / Stable / Stagnant / Solo | contributors, stars, forks, issue/PR activity | | Bus Factor | Healthy (3+) / Risky (2) / Critical (1) | commit distribution across contributors | | Security | Proactive / Basic / Reactive / None | audit in CI, dependency scanning, input validation |
Explicitly call out any of these if present:
Also call out positives:
Present findings conversationally, then use AskUserQuestion to determine next steps.
## Repository Evaluation: <owner/repo>
### Overview
<1-2 sentences: what it is, what it does>
### Project Vitals
| Metric | Value |
|--------|-------|
| Created | <date> (<age>) |
| Last activity | <date> |
| Version | <latest tag/release> |
| Stars / Forks | <count> |
| Contributors | <count> (bus factor: <N>) |
| License | <license> |
| Language | <primary language> |
### Health Assessment
<The health signals matrix from Phase 2>
### Red Flags
<Bulleted list, or "None identified">
### Green Flags
<Bulleted list>
### Code Quality Notes
<2-4 specific observations about the code — not generic, grounded in what you read>
### Test Coverage
<Specific findings: what's tested, what isn't, how thorough>
### Verdict
<Clear recommendation: one of>
- **Adopt with confidence** — well-maintained, well-tested, low risk
- **Adopt with caution** — usable but has notable gaps; document what you're accepting
- **Use for reference only** — interesting ideas but not production-ready as a dependency
- **Avoid** — significant quality, maintenance, or security concerns
- **Build your own** — the problem is simple enough that wrapping the underlying API/library directly is safer than this abstraction
<2-3 sentences explaining the recommendation>
AskUserQuestion:
question: "Based on this evaluation, what would you like to do?"
header: "Next step"
multiSelect: false
options:
- label: "Adopt it"
description: "I'll add it as a dependency — thanks for the review"
- label: "Dig deeper"
description: "I want to look at specific areas more closely before deciding"
- label: "Look for alternatives"
description: "Search for other libraries/tools that solve the same problem"
- label: "Skip it"
description: "Not worth it — I'll find another approach"
If the user wants to dig deeper, ask what specifically concerns them and focus investigation there.
If the user wants to look for alternatives, use WebSearch to find competing projects, then offer to evaluate the top 2-3 candidates with the same process (abbreviated — skip the full subagent sweep, focus on the health signals matrix for comparison).
After the evaluation is complete, remove the cloned repo directory:
rm -rf "<dir>"
Note: rm -rf is intentionally not pre-approved — the user will see a permission prompt for this cleanup step.
/mine.challenge for that/mine.research if you need to evaluate how this would integrate into your specific projectdevelopment
Use when the user says: 'create an issue', 'file an issue', 'open an issue', 'write an issue', 'new issue for this'. Codebase-aware issue creation — investigates the code to produce well-structured issues with acceptance criteria, affected areas, and enough detail for automated triage.
development
Use when the user says: 'triage issues', 'classify issues by complexity', 'assess issue complexity', 'find quick wins', 'which issues are small', 'batch issue assessment'. Batch codebase-aware issue triage — parallel Haiku subagents assess actual complexity and effort by reading the code, not just titles.
development
Use when the user says: "review my changes", "run the reviewers", "code and integration review", "readability review", "maintainability review", "sniff test this", "WTF check", "code smells", "is this code any good", "fresh eyes on this branch", "review this directory", "check this module". Dispatches three parallel reviewers — code, integration, and a readability pass — and consolidates findings into one prioritized report.
development
Use when the user says: "clean code check", "style review", "LLM smell check", "code hygiene", "nitpick this", "style check", "find style sins", "nitpicker review", "anal retentive review", "exhaustive style review", "no-filter style report". Dispatches three parallel stylistic checkers — llm-checker (training-bias patterns), lazy-checker (deferred debt), and nitpicker (style hygiene) — and consolidates findings into a report organized by checker with a Summary section for orchestration consumption.