skills/artificial-intelligence-open-source/SKILL.md
Analyze open-source projects for sustainability risks and apply AI-driven interventions for bug triaging, community health assessment, vulnerability detection, contributor onboarding, and maintenance automation. Trigger phrases: "analyze OSS health", "assess project sustainability", "triage issues automatically", "detect community smells", "onboard new contributors", "audit OSS security posture"
Install this skill globally with one command: `npx skillsauth add ndpvt-web/arxiv-claude-skills artificial-intelligence-open-source`. Works with Claude Code, Cursor, and Windsurf.
This skill enables Claude to audit open-source software projects for sustainability risks and recommend concrete AI-augmented interventions. Drawing from a systematic literature review of AI applications in OSS (Karim, Lu & Goggins, 2026), it operationalizes six intervention domains: automated bug triaging, community health analytics, vulnerability detection, contributor onboarding pipelines, maintenance automation, and environmental impact assessment. The core principle is treating AI as a cooperative augmentation layer for human infrastructure, not a replacement.
The paper synthesizes research across six sustainability challenge domains, each mapped to specific AI techniques:
Bug fixing and maintenance: LLM-based patch generation combined with neural machine translation for code repair. The key insight is chaining detection-to-patch-to-review as an automated pipeline rather than treating each step in isolation. Defect prediction models identify high-risk modules before bugs manifest, reducing reactive maintenance burden.
Community health analytics: Machine learning models analyze trace data from repositories, issue trackers, and communication platforms to compute health metrics. Tools like YOSHI map project activity to community behavioral patterns, while csDetector identifies "community smells" -- organizational dysfunction indicators like information silos, lone wolves, or radio silence. Sentiment analysis on discussion threads detects toxicity or disengagement early.
Vulnerability detection and security: AI-driven static and dynamic analysis, NLP-based code scanning, and predictive risk modeling work together. The critical differentiator from standard SAST/DAST is the predictive layer -- identifying which code regions are likely to develop vulnerabilities based on change patterns, contributor experience, and historical defect density. Explainable AI (XAI) is emphasized so developers understand why code is flagged.
Gather project metadata: Clone the repository (or read its structure) and collect key signals: commit frequency over time, contributor count and distribution, open/closed issue ratio, PR merge times, dependency freshness, and CI/CD status.
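One of the signals above, commit frequency over time, can be sketched as follows. This is a minimal illustration, not part of the skill itself: the function name `commits_per_month` and the `author|date` line format are assumptions, chosen to match the output of `git log --pretty=format:'%an|%aI'`.

```python
from collections import Counter


def commits_per_month(log_text: str) -> dict[str, int]:
    """Count commits per YYYY-MM bucket.

    Assumes each non-empty line looks like "author|ISO-8601 date", e.g. the
    output of: git log --pretty=format:'%an|%aI'
    """
    counts: Counter[str] = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # rpartition splits on the last "|" so author names containing "|" are safe.
        _author, _, iso_date = line.rpartition("|")
        counts[iso_date[:7]] += 1  # first 7 characters are "YYYY-MM"
    return dict(sorted(counts.items()))
```

A falling month-over-month count is only one input; it is meant to be read alongside the other signals listed above.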
Compute community health indicators: Analyze contributor activity patterns to identify bus factor (the smallest number of contributors who together account for at least 80% of commits), attrition rate (the share of contributors with no activity in the last 6 months), and response latency on issues/PRs. Flag projects where bus factor < 3 or attrition exceeds 50%.
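The two thresholds in this step can be computed along these lines. This is a sketch under stated assumptions: the function names are hypothetical, "6 months" is approximated as 182 days, and bus factor is read as the smallest group of top committers covering 80% of commits.

```python
from collections import Counter
from datetime import date, timedelta


def bus_factor(commit_authors: list[str], share: float = 0.8) -> int:
    """Smallest number of top contributors whose commits reach `share` of the total."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    covered = 0
    for rank, (_author, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= share * total:
            return rank
    return 0  # no commits at all


def attrition_rate(last_commit: dict[str, date], today: date, window_days: int = 182) -> float:
    """Share of contributors whose most recent commit is older than the window."""
    if not last_commit:
        return 0.0
    cutoff = today - timedelta(days=window_days)
    inactive = sum(1 for d in last_commit.values() if d < cutoff)
    return inactive / len(last_commit)


def is_flagged(bf: int, attrition: float) -> bool:
    """Apply the thresholds from the text: bus factor < 3 or attrition above 50%."""
    return bf < 3 or attrition > 0.5
```

Both inputs can be derived from the same `git log` data, which keeps the check cheap enough to rerun on every audit.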
Detect community smells: Scan for organizational anti-patterns:
Triage issue backlog: Classify open issues by type (bug, feature, documentation, security), estimate severity from description and labels, identify stale issues (no activity > 90 days), and recommend prioritization. Tag issues suitable for newcomers based on complexity signals (file count, code churn, test coverage of affected areas).
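The staleness and newcomer checks in this step might look like the sketch below. The `Issue` fields and the default of at most two touched files are illustrative assumptions, not a schema defined by the skill; real data would come from the issue tracker's API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Issue:
    number: int
    last_activity: date
    files_touched: int  # estimated number of files a fix would touch (assumed signal)
    has_tests: bool     # whether the affected area already has test coverage


def is_stale(issue: Issue, today: date, max_idle_days: int = 90) -> bool:
    """Stale means no activity for more than `max_idle_days` days."""
    return (today - issue.last_activity).days > max_idle_days


def is_newcomer_candidate(issue: Issue, max_files: int = 2) -> bool:
    """Heuristic: small change surface plus existing tests to lean on."""
    return issue.files_touched <= max_files and issue.has_tests
```

These flags are suggestions for a maintainer to confirm, in keeping with the principle that automated triage proposes and humans decide.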
Audit security posture: Scan for known vulnerabilities in dependencies (CVE databases), identify code hotspots with high churn and low test coverage (vulnerability-prone zones), check for security policy presence (SECURITY.md, responsible disclosure process), and assess patch velocity (time from CVE disclosure to fix merge).
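Hotspots with high churn and low coverage can be ranked with a simple score. The formula below (commit count multiplied by the uncovered fraction) is an assumption made for illustration, as are the `FileStats` type and the function names; the source does not prescribe a specific scoring rule.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    commits: int     # churn: number of commits touching the file
    coverage: float  # test coverage as a fraction between 0.0 and 1.0


def risk_score(stats: FileStats) -> float:
    """Assumed heuristic: more churn and less coverage give a higher score."""
    return stats.commits * (1.0 - stats.coverage)


def rank_hotspots(files: list[FileStats], top_n: int = 3) -> list[FileStats]:
    """Return the `top_n` files with the highest risk score, riskiest first."""
    return sorted(files, key=risk_score, reverse=True)[:top_n]
```

Churn can be taken from `git log` per path and coverage from the project's existing coverage report, so the ranking needs no extra tooling.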
Evaluate onboarding readiness: Check for CONTRIBUTING.md quality, good-first-issue label usage, mentorship program indicators, documentation coverage, and development environment setup automation. Score the project's newcomer friendliness.
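A few of these readiness checks can be expressed as presence tests over the repository's file list. This is a hedged sketch: the check names, the set of recognized environment files, and the LOW/MEDIUM/HIGH cutoffs are assumptions, and file presence says nothing about the quality of a CONTRIBUTING.md.

```python
def onboarding_checks(repo_files: set[str], good_first_issues: int) -> dict[str, bool]:
    """Presence checks only; `repo_files` holds repo-relative paths."""
    return {
        "contributing_guide": "CONTRIBUTING.md" in repo_files,
        "good_first_issues": good_first_issues > 0,
        "dev_environment": any(
            path.startswith(".devcontainer/") or path in {"flake.nix", "shell.nix"}
            for path in repo_files
        ),
    }


def readiness_label(checks: dict[str, bool]) -> str:
    """Assumed cutoffs: all checks pass = HIGH, at least half = MEDIUM, else LOW."""
    passed = sum(checks.values())
    if passed == len(checks):
        return "HIGH"
    if passed * 2 >= len(checks):
        return "MEDIUM"
    return "LOW"
```

Reporting the individual checks next to the label lets a maintainer see which gap to close first.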
Generate sustainability report: Produce a structured assessment with risk scores per domain, specific findings with evidence, and prioritized recommendations. Each recommendation maps to a concrete action (script, CI workflow, bot configuration, or process change).
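One way to hold the per-domain structure described in this step is sketched below. The `DomainAssessment` type, the 0-100 scale with higher meaning riskier, and the `render_report` helper are all illustrative assumptions rather than a format the skill mandates.

```python
from dataclasses import dataclass, field


@dataclass
class DomainAssessment:
    domain: str
    risk: int  # assumed scale: 0-100, higher means riskier
    findings: list[str] = field(default_factory=list)
    recommendations: list[str] = field(default_factory=list)


def render_report(project: str, domains: list[DomainAssessment]) -> str:
    """Render a markdown report with the riskiest domain first."""
    lines = [f"## OSS Sustainability Report: {project}"]
    for d in sorted(domains, key=lambda d: d.risk, reverse=True):
        lines.append(f"### {d.domain} (risk {d.risk}/100)")
        lines.extend(f"- {finding}" for finding in d.findings)
        lines.extend(f"- Recommended: {rec}" for rec in d.recommendations)
    return "\n".join(lines)
```

Keeping one record per domain makes it straightforward to show the multi-dimensional breakdown instead of collapsing everything into a single number.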
Recommend AI-augmented interventions: For each identified risk, suggest specific tooling: issue labeling bots, automated dependency update workflows, sentiment analysis integrations, contributor matching systems, or security scanning pipelines. Provide implementation sketches, not just tool names.
Assess environmental footprint: If the project uses AI tooling (Copilot, CI-integrated ML models, LLM-based bots), estimate computational overhead and suggest efficiency measures like parameter-efficient fine-tuning, caching, or batched inference.
Example 1: Full project sustainability audit
User: "Analyze the health of this open-source project and tell me if it's at risk"
Approach:
Output:
## OSS Sustainability Report: example-project
### Risk Level: MODERATE (Score: 58/100)
### Community Health
- **Bus Factor: 2** (critical -- alice and bob account for 87% of commits)
- **Contributor Attrition: 40%** (6 of 15 contributors inactive >6 months)
- **Issue Response Time: 12 days median** (acceptable)
- **PR Merge Time: 4 days median** (healthy)
### Community Smells Detected
- [HIGH] **Lone Wolf**: alice authored 62% of commits in last 12 months
- [MEDIUM] **Organizational Silo**: frontend/ and backend/ directories have
zero contributor overlap
### Security Posture
- 3 outdated dependencies with known CVEs (2 moderate, 1 high)
- No SECURITY.md or disclosure policy found
- Patch velocity: N/A (no prior security fixes tracked)
### Onboarding Readiness: LOW
- CONTRIBUTING.md exists but lacks setup instructions
- 0 issues labeled "good-first-issue" in last 6 months
- No automated dev environment (devcontainer, Nix, etc.)
### Recommended Interventions
1. **Immediate**: Update 3 vulnerable dependencies and add SECURITY.md
2. **Short-term**: Label 5-10 issues as good-first-issue; add devcontainer
3. **Medium-term**: Set up Dependabot/Renovate for automated dependency PRs
4. **Structural**: Recruit 1-2 maintainers for frontend/ to reduce bus factor
Example 2: Automated issue triage
User: "Help me triage these 200+ open issues and figure out what to prioritize"
Approach:
Output:
## Issue Triage Report (214 open issues)
### By Priority
- **Critical (5)**: #189 (security), #201 (data loss), #145 (crash), #198 (auth bypass), #203 (regression)
- **High (23)**: Performance regressions, broken integrations, accessibility blockers
- **Medium (87)**: Feature requests with community demand, non-critical bugs
- **Low (99)**: Cosmetic issues, edge cases, nice-to-haves
### Stale Issues (no activity >90 days): 67
- Recommend closing 31 (resolved or no longer relevant)
- Recommend pinging 22 (potentially valid, need reproduction)
- Recommend labeling 14 as "help-wanted"
### Good First Issues (newcomer-friendly): 18 candidates
- Issues touching <3 files, well-documented behavior, existing test coverage
- Examples: #156 (typo in error message), #171 (add input validation),
#183 (update deprecated API call)
### Duplicates Detected: 11 clusters
- #102 / #134 / #178 (same rendering bug, different reports)
- #88 / #165 (same feature request with different framing)
Example 3: Security-focused sustainability audit
User: "Audit this project's security posture and set up automated vulnerability detection"
Approach:
Output:
## Security Sustainability Audit
### Dependency Vulnerabilities
| Package | Current | Fixed | Severity | CVE |
|---------------|---------|---------|----------|----------------|
| lodash | 4.17.19 | 4.17.21 | High | CVE-2021-23337 |
| axios | 0.21.0 | 0.21.2 | High | CVE-2021-3749 |
| node-fetch | 2.6.0 | 2.6.7 | Moderate | CVE-2022-0235 |
### Code Risk Hotspots (high churn + low coverage)
- src/auth/session.js -- 47 commits, 12% test coverage
- src/api/handlers.js -- 38 commits, 23% test coverage
- lib/crypto/utils.js -- 22 commits, 0% test coverage
### Missing Security Infrastructure
- [ ] No SECURITY.md
- [ ] No automated dependency scanning in CI
- [ ] No branch protection on main
- [ ] No signed commits required
### Recommended CI Workflow (GitHub Actions)
```yaml
# .github/workflows/security-scan.yml
name: Security Scan
on:
  push:
  pull_request:
  schedule:
    # Assumed weekly cadence (Mondays 06:00 UTC); adjust as needed.
    - cron: "0 6 * * 1"
# ... (generates complete workflow file)
```
## Best Practices
- **Do**: Analyze quantitative signals (commit frequency, issue velocity, contributor distribution) before making qualitative judgments about project health.
- **Do**: Present findings with evidence -- cite specific issue numbers, commit hashes, or contributor counts rather than vague assessments.
- **Do**: Frame AI interventions as augmenting maintainer capacity, not replacing human judgment. Automated triage suggests priority; humans decide.
- **Do**: Check for community smells as leading indicators -- organizational dysfunction predicts technical debt and abandonment before code metrics do.
- **Avoid**: Reducing sustainability to a single score. Always show the multi-dimensional breakdown (community, security, maintenance, onboarding).
- **Avoid**: Recommending heavy AI tooling for small projects. A 2-maintainer project needs a CONTRIBUTING.md and good-first-issue labels, not an ML-powered triage bot.
## Error Handling
- **Insufficient git history**: If the repository has fewer than 50 commits or less than 6 months of history, note that trend analysis is unreliable. Fall back to snapshot metrics (current state) rather than trajectory analysis.
- **Private or restricted issue trackers**: If issues are not publicly accessible, work with available signals (commits, PR activity, documentation) and explicitly note the gap in the report.
- **Monorepo or multi-project repositories**: Scope analysis to the relevant subdirectory. Contributor metrics for a monorepo are misleading if not filtered by path.
- **No CI/CD configuration detected**: Flag this as a sustainability risk in itself. Recommend minimum viable CI (linting, tests, dependency scanning) as a priority intervention.
- **Conflicting signals**: When metrics diverge (e.g., high commit activity but rising issue backlog), surface both signals and let the user interpret. Do not smooth over contradictions.
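
The insufficient-history fallback from the list above can be made explicit with a small guard. This is a sketch: the function name `analysis_mode` and the approximation of 6 months as 182 days are assumptions.

```python
from datetime import date


def analysis_mode(
    commit_count: int,
    first_commit: date,
    today: date,
    min_commits: int = 50,
    min_history_days: int = 182,
) -> str:
    """Return "snapshot" when history is too thin for trend analysis, else "trajectory"."""
    history_days = (today - first_commit).days
    if commit_count < min_commits or history_days < min_history_days:
        return "snapshot"
    return "trajectory"
```

When the result is "snapshot", the report should say so explicitly, so readers do not mistake current-state metrics for a trend.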
## Limitations
- **Community sentiment analysis is approximate**: Without access to real-time communication channels (Slack, Discord, mailing lists), assessment relies on public GitHub/GitLab data, which captures only a fraction of community dynamics.
- **Predictive accuracy degrades for small projects**: Health metrics and attrition models are most reliable for projects with 10+ regular contributors. For smaller projects, individual contributor decisions dominate and are unpredictable.
- **Security scanning is not exhaustive**: This approach identifies known CVEs and code risk hotspots but does not replace dedicated penetration testing or formal security audits.
- **Bias in historical data**: Contributor metrics may reflect systemic biases (timezone, language, institutional affiliation) rather than actual contribution quality. Flag this caveat when analyzing contributor distribution.
- **Snapshot vs. trajectory**: A single-point audit captures current state but may miss trends. Recommend periodic re-assessment for projects where sustainability is a concern.
## Reference
**Paper**: Karim, S. M. R. U., Lu, W., & Goggins, S. (2026). *Artificial Intelligence in Open Source Software Engineering: A Foundation for Sustainability*. [arXiv:2602.07071v1](https://arxiv.org/abs/2602.07071v1)
**What to look for**: Table 1 maps sustainability challenges to AI techniques with specific research references. Table 2 provides a comparative analysis of AI technique strengths and limitations per domain. The community health analytics section (csDetector, YOSHI) and the ethical concerns framework are the most directly actionable for building assessment tooling.