oborchers/source-evaluation

Name: source-evaluation
Author: oborchers

deep-research/skills/source-evaluation/SKILL.md

npx skillsauth add oborchers/fractional-cto source-evaluation

Source Evaluation

Source quality is the primary bottleneck in research agent pipelines. Research on deep research agent trajectories found that over 57% of source errors occur in early retrieval stages, where initial fabrication acts as the primary catalyst for cascading downstream errors (arXiv 2601.22984). A single bad source in the first retrieval round contaminates the entire research trajectory.

Source Credibility Tiers

Every source encountered during research falls into one of six tiers. Always prefer higher-tier sources and cite the tier when reporting findings.

| Tier | Source Type | Examples | Trust Level |
|------|-------------|----------|-------------|
| T1 — Primary | Peer-reviewed journals, official specs, primary datasets | Nature, Science, IEEE, IETF RFCs, W3C specs | Highest |
| T2 — Institutional | Government agencies, established research institutions | NIH, WHO, NIST, ACM Digital Library | High |
| T3 — Expert | Named expert blogs, conference proceedings, major tech engineering blogs | Anthropic blog, Google Research, NeurIPS/ICML papers | Moderate-High |
| T4 — Quality Editorial | Major publications with editorial review | MIT Technology Review, Ars Technica, The Verge | Moderate |
| T5 — Community | Well-moderated forums, high-reputation answers | Stack Overflow (high-score), GitHub discussions | Low-Moderate |
| T6 — Unverified | Content farms, SEO-optimized articles, anonymous posts, AI-generated content | Medium listicles, affiliate blogs, uncredited tutorials | Do not cite |

Rule: Never cite T6 sources. Prefer T1-T3 for factual claims. Use T4-T5 for context and community consensus only.
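The tier rule can be sketched in code. This is a minimal illustration, not part of the skill itself: the `Tier` enum, the `"factual"`/`"context"` labels, and both function names are hypothetical, and the sketch treats "prefer T1-T3 for factual claims" as a hard requirement, which is stricter than the rule as written.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Credibility tiers from the table above (lower number = more trusted)."""
    T1_PRIMARY = 1
    T2_INSTITUTIONAL = 2
    T3_EXPERT = 3
    T4_QUALITY_EDITORIAL = 4
    T5_COMMUNITY = 5
    T6_UNVERIFIED = 6


def allowed_use(tier: Tier) -> str:
    """Map a tier to the strongest use the rule permits (hypothetical labels)."""
    if tier <= Tier.T3_EXPERT:
        return "factual"
    if tier <= Tier.T5_COMMUNITY:
        return "context"
    return "do-not-cite"


def can_support(tier: Tier, claim_kind: str) -> bool:
    """Return True if a source of this tier may back a claim of this kind.

    claim_kind is "factual" or "context"; T6 can back neither.
    """
    use = allowed_use(tier)
    if use == "do-not-cite":
        return False
    if claim_kind == "factual":
        return use == "factual"
    return True  # context claims may draw on any citable tier
```

A pipeline that wants the softer "prefer" semantics could rank candidate sources by `Tier` value instead of rejecting T4-T5 outright.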

The CRAAP Framework — Automated Signals

The CRAAP framework (CSU Chico) evaluates sources along five dimensions, adapted here as signals that can be checked automatically:

| Dimension | What to Check | Red Flags |
|-----------|---------------|-----------|
| Currency | Publication date, last-modified headers | No date visible, information predates major changes in the field |
| Relevance | Does it address the specific research question? | Tangential coverage, keyword-stuffed but shallow |
| Authority | Who published it? Credentials? | Anonymous author, no institutional affiliation, no citations |
| Accuracy | Are claims sourced? Can they be verified? | No inline citations, contradicts known facts, round numbers without source |
| Purpose | Is it informing, selling, or persuading? | High ad density, affiliate links, promotional language |

Note: CRAAP evaluates surface features. Use it as an initial filter, not the sole credibility signal (Stanford research found reliance on CRAAP alone makes researchers susceptible to misinformation).

Multi-Provider Search Strategy

Different search providers excel in different domains. Route queries to the appropriate provider:

| Provider | Best For | Limitations |
|----------|----------|-------------|
| WebSearch (general) | Broad topics, recent events, technical documentation | May surface SEO-optimized content |
| arXiv / Semantic Scholar | Academic ML/AI research, preprints | Not peer-reviewed, may be superseded |
| PubMed | Medical, biomedical, clinical research | Limited to biomedical domain |
| Official documentation | API specs, library usage, framework guides | May lag behind actual behavior |
| GitHub | Code examples, implementation patterns, issue discussions | Quality varies widely |

Strategy: Start with domain-appropriate providers. Use general web search to fill gaps. Cross-reference findings across multiple providers when possible.
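One way to picture the routing strategy is a small lookup that returns domain-appropriate providers first and general web search last. The topic tags and the `route` function below are assumptions made for illustration; the skill does not prescribe how a query's domain is detected.

```python
from __future__ import annotations

# Hypothetical topic tags mapped to the providers named in the table above.
ROUTES: list[tuple[set[str], str]] = [
    ({"medical", "biomedical", "clinical"}, "PubMed"),
    ({"ml", "ai", "preprint", "academic"}, "arXiv / Semantic Scholar"),
    ({"api", "library", "framework"}, "Official documentation"),
    ({"code", "implementation", "issue"}, "GitHub"),
]
FALLBACK = "WebSearch (general)"


def route(topic_tags: set[str]) -> list[str]:
    """Return providers to query, domain-specific ones first.

    General web search always comes last so it fills gaps rather than
    leading the first retrieval round.
    """
    providers = [name for tags, name in ROUTES if tags & topic_tags]
    return providers + [FALLBACK]
```

Querying more than one returned provider also gives the cross-referencing the strategy asks for.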

SEO Spam Detection

Red flags that indicate low-quality, SEO-optimized content:

Listicle format with no depth ("Top 10 ways to...")
Keyword stuffing — the search term appears unnaturally often
No author attribution or author has no verifiable expertise
High ad-to-content ratio
Recycled/syndicated content appearing verbatim across multiple domains
AI-generated markers — generic phrasing, lack of specific examples, overly smooth prose
Affiliate links embedded throughout

When a source triggers 2+ red flags, discard it and search for a higher-quality alternative.
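The 2+ red-flag rule amounts to a simple count. The sketch below assumes each signal has already been detected upstream and arrives as a boolean; the `SeoSignals` field names are hypothetical stand-ins for the seven red flags listed above.

```python
from dataclasses import dataclass, fields


@dataclass
class SeoSignals:
    """One boolean per red flag in the list above (names are illustrative)."""
    shallow_listicle: bool = False
    keyword_stuffing: bool = False
    no_credible_author: bool = False
    high_ad_ratio: bool = False
    syndicated_verbatim: bool = False
    ai_generated_markers: bool = False
    affiliate_links: bool = False


def red_flag_count(signals: SeoSignals) -> int:
    """Count how many red flags are set."""
    return sum(bool(getattr(signals, f.name)) for f in fields(signals))


def should_discard(signals: SeoSignals, threshold: int = 2) -> bool:
    """Apply the rule: discard once the flag count reaches the threshold."""
    return red_flag_count(signals) >= threshold
```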

Artifact Evaluation

Research often involves evaluating non-content artifacts — packages, tools, technologies, standards, organizations. These require different signals than content sources. Every artifact has three signal dimensions:

| Dimension | What It Measures | Key Question |
|-----------|------------------|--------------|
| Health | Is it alive and maintained? | When was the last meaningful activity? |
| Adoption | Does anyone actually use it? | What are the real usage numbers? |
| Authority | Who's behind it and are they credible? | Is this backed by a credible entity? |

Artifact Types and Key Signals

| Artifact Type | Health Signals | Adoption Signals | Authority Signals |
|---------------|----------------|------------------|-------------------|
| Software packages | Last commit, release frequency, open issue response time | Downloads (npm weekly, PyPI monthly), dependents count | Maintainer reputation, organizational backing, license |
| GitHub repos | Commit frequency, PR merge time, stale issue ratio | Stars, forks, contributor count | Bus factor (>1 critical), corporate sponsor, notable users |
| APIs/Services | Uptime history, changelog frequency, deprecation notices | Customer logos, integration count, community size | Company funding, revenue stability, enterprise adoption |
| Standards/Specs | Last revision date, errata activity | Implementation count, conformance test suites | Standards body status (draft/proposed/standard), industry backing |
| Technologies | Release cadence, roadmap activity, CVE response time | Stack Overflow survey ranking, job postings, TIOBE/RedMonk index | Backing organization, governance model, ecosystem size |
| Architectural patterns | Recent case studies, active community discussion | Industry adoption breadth, conference talk frequency | Documented at-scale deployments, known failure case studies |
| People/Authors | Recent publication activity | Citation count, h-index, follower count | Institutional affiliation, industry role, peer recognition |
| Companies/Orgs | Recent funding, hiring activity, product releases | Revenue, customer count, market share | Investor quality, leadership track record, industry awards |
| Communities | Messages per week, new member rate | Member count, active member ratio | Moderation quality, notable members, signal-to-noise ratio |
| Datasets/Benchmarks | Last update, known issues addressed | Citation count, leaderboard participation | Creator credentials, methodology transparency, peer review |
| Claims/Statistics | Date of study, methodology recency | Citation count, replication status | Funding source, sample size, peer review, original source |

Programmatic Stat Verification

When evaluating artifacts, use APIs for exact stats instead of search snippets:

| Ecosystem | API Command | Returns |
|-----------|-------------|---------|
| GitHub | `gh api repos/{owner}/{name}` | Stars, forks, license, language, last update, open issues |
| GitHub releases | `gh api repos/{owner}/{name}/releases/latest` | Latest version tag, release date |
| npm | `curl https://api.npmjs.org/downloads/point/last-week/{pkg}` | Exact weekly downloads |
| PyPI | `curl https://pypistats.org/api/packages/{pkg}/recent` | Recent download counts |
| crates.io | `curl https://crates.io/api/v1/crates/{crate}` | Downloads, version, recent downloads |
| RubyGems | `curl https://rubygems.org/api/v1/gems/{gem}.json` | Downloads, latest version |
| Maven | Search `site:mvnrepository.com {artifact}` | Usage stats page |

These APIs return the registries' own figures, so treat them as ground truth. Search snippets for these stats are unreliable.
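For agents that call these endpoints from code rather than the shell, the lookup can be wrapped in a small helper. This is a sketch under stated assumptions: the endpoint paths come from the table above, while `stats_url`, `fetch_stats`, and the User-Agent string are hypothetical names, and the shape of each JSON response is not modeled here.

```python
import json
import urllib.request
from urllib.parse import quote

# Endpoint templates taken from the table above, with an explicit https scheme.
STATS_ENDPOINTS = {
    "npm": "https://api.npmjs.org/downloads/point/last-week/{name}",
    "pypi": "https://pypistats.org/api/packages/{name}/recent",
    "crates": "https://crates.io/api/v1/crates/{name}",
    "rubygems": "https://rubygems.org/api/v1/gems/{name}.json",
}


def stats_url(ecosystem: str, name: str) -> str:
    """Build the stats URL for a package; raises KeyError for unknown ecosystems."""
    template = STATS_ENDPOINTS[ecosystem.lower()]
    # Keep "@" and "/" so scoped npm names such as @scope/pkg stay readable.
    return template.format(name=quote(name, safe="@/"))


def fetch_stats(ecosystem: str, name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the JSON payload (performs a network request)."""
    request = urllib.request.Request(
        stats_url(ecosystem, name),
        # Hypothetical identifier; some registries expect a descriptive User-Agent.
        headers={"User-Agent": "source-evaluation-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_stats("npm", "react")` would return the registry's JSON as a dictionary; which fields to read depends on each registry's response format.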

Red Flags by Artifact Type

Software packages and repos:

Last commit >6 months ago with open issues unanswered
Fewer than 50 stars with no organizational backing
Single maintainer (bus factor = 1) for critical dependency
No tests, no CI, no changelog
License incompatible with intended use

Technologies and standards:

No major release in 12+ months
Declining Stack Overflow activity trend
Abandoned by original backing organization
No conformance test suite (for standards)

Claims and statistics:

No original source cited (circular citation)
Study funded by party with commercial interest in the outcome
Sample size <100 for quantitative claims
No methodology description

General rule: When an artifact triggers 2+ red flags, flag it explicitly in the research output. Do not recommend it without noting the risks.
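The red flags for software packages and repos, together with the 2+ rule, can be sketched as a checklist function. The `RepoSnapshot` fields are hypothetical inputs that a caller would populate from the APIs above. Two readings are assumptions: "6 months" is approximated as 183 days, and "no tests, no CI, no changelog" is treated as a flag only when all three are absent.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class RepoSnapshot:
    """Illustrative inputs; field names are not defined by the skill."""
    last_commit: date
    unanswered_open_issues: int
    stars: int
    org_backed: bool
    maintainers: int
    critical_dependency: bool
    has_tests: bool
    has_ci: bool
    has_changelog: bool
    license_compatible: bool


def red_flags(repo: RepoSnapshot, today: date) -> list[str]:
    """Return the red flags from the list above that this snapshot triggers."""
    flags = []
    stale = today - repo.last_commit > timedelta(days=183)  # ~6 months
    if stale and repo.unanswered_open_issues > 0:
        flags.append("stale with unanswered issues")
    if repo.stars < 50 and not repo.org_backed:
        flags.append("fewer than 50 stars, no organizational backing")
    if repo.maintainers <= 1 and repo.critical_dependency:
        flags.append("bus factor of 1 on a critical dependency")
    if not (repo.has_tests or repo.has_ci or repo.has_changelog):
        flags.append("no tests, no CI, no changelog")
    if not repo.license_compatible:
        flags.append("license incompatible with intended use")
    return flags


def needs_risk_note(flags: list[str]) -> bool:
    """Apply the general rule: 2+ flags must be called out in the output."""
    return len(flags) >= 2
```

Returning the flag descriptions rather than a bare count lets the research output name each risk explicitly.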

For detailed per-artifact-type evaluation guides and how to check each signal programmatically, consult references/artifact-signals.md.

Retrieval Best Practices

Front-load quality — The first retrieval round disproportionately determines research quality due to the saturation bottleneck (agents fixate on early results). Start with high-authority sources.
Search iteratively — Refine search queries based on initial results. First search identifies terminology; second search uses domain-specific terms.
Diversify sources — Do not rely on a single provider or a single source for any claim. Cross-reference across independent sources.
Verify fetched content — After WebFetch, scan the content for authority signals (author credentials, citations, publication venue) before incorporating.
Track provenance — Record which source produced which claim. This metadata is essential for citation and for tracing errors.
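Provenance tracking and source diversification can share one small data structure. The sketch below is only one possible shape: `Source`, `ProvenanceLog`, and their fields are hypothetical, the URLs are placeholders, and "independent sources" is approximated as distinct URLs, which is weaker than true independence.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Source:
    url: str
    tier: int       # 1-6, matching the credibility tiers
    provider: str   # e.g. "PubMed", "GitHub"


@dataclass
class ProvenanceLog:
    """Maps each claim to the sources that produced it."""
    claims: dict[str, list[Source]] = field(default_factory=dict)

    def record(self, claim: str, source: Source) -> None:
        """Attach a source to a claim, creating the claim entry if needed."""
        self.claims.setdefault(claim, []).append(source)

    def sources_for(self, claim: str) -> list[Source]:
        """Return a copy of the sources recorded for a claim."""
        return list(self.claims.get(claim, []))

    def single_sourced(self) -> list[str]:
        """List claims backed by fewer than two distinct URLs."""
        return [
            claim
            for claim, sources in self.claims.items()
            if len({s.url for s in sources}) < 2
        ]


log = ProvenanceLog()
log.record("Library X is actively maintained", Source("https://example.org/a", 3, "GitHub"))
print(log.single_sourced())  # claims that still need cross-referencing
```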

Retrieval Anti-Patterns

| Anti-Pattern | Problem | Fix |
|--------------|---------|-----|
| Single-provider dependency | All searches go through one provider | Route by domain; use multiple providers |
| First-result trust | Accepting the top search result without evaluation | Evaluate credibility tier before incorporating |
| Equal credibility | Treating a blog post the same as a journal paper | Apply tier system; weight higher-tier sources |
| Ignoring retrieval failures | Silent fallback when search returns nothing useful | Log the gap; try alternative queries or providers |
| Breadth without depth | Fetching 20 URLs but reading none carefully | Fetch fewer sources; read each thoroughly |

Reference Files

For detailed provider comparison, domain-specific source guides, and artifact evaluation, consult:

references/provider-comparison.md — Detailed comparison of search providers with API specifics, rate limits, and optimal use cases
references/artifact-signals.md — Per-artifact-type evaluation guides with health/adoption/authority thresholds, how to check each signal, and the quick evaluation checklist

This skill should be used when evaluating source credibility, deciding which search results to trust, choosing between search providers, detecting SEO spam or content farms, selecting domain-specific sources (academic, medical, legal, technical), evaluating software packages or libraries, comparing tools or technologies, assessing GitHub repo health, checking adoption metrics, or when research quality depends on retrieval quality. Covers the source credibility taxonomy (T1-T6 tiers), CRAAP framework adaptation, multi-provider search strategy, artifact evaluation framework (health/adoption/authority signals for packages, repos, APIs, standards, technologies), and source quality anti-patterns.

10 stars

tools

Updated May 13, 2026

npx skillsauth add oborchers/fractional-cto source-evaluation

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status | Scanner | Description | Reported value |
|--------|---------|-------------|----------------|
| Clean | Trivy | Container and dependency vulnerability scanner | 95% |
| Clean | Semgrep | Static code analysis for vulnerabilities | 95% |
| Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95% |
| Skipped | Snyk (dep) | Open source security scanning | 50% |
| Skipped | Socket.dev | Supply chain security analysis | 50% |
| Skipped | VirusTotal | Multi-engine malware detection | 50% |
| Skipped | CrowdStrike | Advanced threat intelligence | 50% |
| Skipped | OSV-Scanner | Open Source Vulnerability database check | 50% |
| Skipped | OWASP Dep-Check | | 50% |

Last scanned: May 13, 2026, 6:43 AM; 163.3s; 3 files scanned

SKILL.md

name: source-evaluation
description: This skill should be used when evaluating source credibility, deciding which search results to trust, choosing between search providers, detecting SEO spam or content farms, selecting domain-specific sources (academic, medical, legal, technical), evaluating software packages or libraries, comparing tools or technologies, assessing GitHub repo health, checking adoption metrics, or when research quality depends on retrieval quality. Covers the source credibility taxonomy (T1-T6 tiers), CRAAP framework adaptation, multi-provider search strategy, artifact evaluation framework (health/adoption/authority signals for packages, repos, APIs, standards, technologies), and source quality anti-patterns.
version: 1.2.0

Install manually


# Clone the repo
git clone https://github.com/oborchers/fractional-cto.git

# Make sure the global Claude Code skills folder exists
mkdir -p ~/.claude/skills

# Copy the skill into the Claude Code skills folder (global)
cp -r fractional-cto/deep-research/skills/source-evaluation ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

oborchers/fractional-cto

10 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT