Quality Audit Skill

Systematic framework for evaluating skill quality across four dimensions: Clarity, Completeness, Accuracy, and Usefulness.

When to Use This Skill

Reviewing a new skill before adding to the registry
Auditing existing skills for quality improvements
Creating quality rubrics for skill validation
Standardizing skill quality across the library
Preparing skills for production use

Core Principles

The Four Quality Dimensions

| Dimension | Weight | Focus |
|-----------|--------|-------|
| Clarity | 25% | Structure, readability, progressive disclosure |
| Completeness | 25% | Coverage, examples, edge cases, anti-patterns |
| Accuracy | 30% | Correctness, best practices, security |
| Usefulness | 20% | Real-world applicability, production-readiness |

Scoring Scale (1-5)

| Score | Label | Meaning |
|-------|-------|---------|
| 1 | Unacceptable | Fundamentally broken, dangerous, or unusable |
| 2 | Needs Work | Major issues requiring significant revision |
| 3 | Acceptable | Meets minimum standards, functional |
| 4 | Good | High quality, minor improvements possible |
| 5 | Excellent | Exemplary, production-ready, best-in-class |

Passing Criteria

Minimum: 3.0 weighted average (acceptable)
Target: 4.0 weighted average (good)
Exceptional: 4.5+ weighted average (excellent)
Blocking: Accuracy must be ≥3.0 (no dangerous advice)
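
The weights and thresholds above combine into a single verdict. The following is a minimal sketch of that arithmetic, not the tool's actual implementation; the function names `weighted_average` and `verdict` and the returned label strings are hypothetical.

```python
from typing import Dict

# Weights from the "Four Quality Dimensions" table, as fractions of 1.0.
WEIGHTS: Dict[str, float] = {
    "clarity": 0.25,
    "completeness": 0.25,
    "accuracy": 0.30,
    "usefulness": 0.20,
}


def weighted_average(scores: Dict[str, float]) -> float:
    """Combine per-dimension scores (1-5) into a weighted average."""
    return sum(scores[name] * weight for name, weight in WEIGHTS.items())


def verdict(scores: Dict[str, float]) -> str:
    """Hypothetical helper: map scores to a tier using the passing criteria."""
    # Round to avoid floating-point noise right at a threshold.
    total = round(weighted_average(scores), 2)
    # Blocking rule: Accuracy below 3.0 fails the skill regardless of the total.
    if scores["accuracy"] < 3.0 or total < 3.0:
        return "FAIL"
    if total >= 4.5:
        return "PASS (exceptional)"
    if total >= 4.0:
        return "PASS (target)"
    return "PASS (minimum)"
```

Note that a skill scoring 5 on Clarity, Completeness, and Usefulness but 2 on Accuracy has a weighted average of 4.1 and still fails, because the blocking rule is checked independently of the total.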

Audit Workflow

Phase 1: Structure Check

checklist:
  structure:
    - [ ] Has valid YAML frontmatter
    - [ ] Contains required metadata (name, description)
    - [ ] Follows progressive disclosure (Tier 1 → 2 → 3)
    - [ ] Sections are logically ordered
    - [ ] Token estimate is reasonable (<5000 for core)

Phase 2: Content Evaluation

checklist:
  content:
    - [ ] "When to Use" section is clear
    - [ ] Core principles are well-defined
    - [ ] Code examples are complete and runnable
    - [ ] Anti-patterns are documented
    - [ ] Troubleshooting guidance exists

Phase 3: Dimension Scoring

For each dimension, evaluate against specific criteria:

Clarity Criteria:

Well-organized sections with logical flow
Concise explanations without jargon overload
Code examples are readable and well-commented
Progressive disclosure from simple to complex

Completeness Criteria:

Covers core concepts thoroughly
Includes edge cases and error handling
Provides both do's and don'ts
Has working examples for main use cases

Accuracy Criteria:

Code examples compile/run without errors
Follows current best practices (not deprecated)
Security considerations are correct
Performance claims are verifiable

Usefulness Criteria:

Examples solve real-world problems
Can be applied immediately
Scales to production use cases
Includes troubleshooting guidance

Phase 4: Report Generation

## Audit Report: {skill_name}

**Date**: {date}
**Auditor**: {auditor}
**Status**: {PASS|FAIL|NEEDS_REVIEW}

### Scores

| Dimension | Score | Weight | Weighted |
|-----------|-------|--------|----------|
| Clarity | {x}/5 | 25% | {x*0.25} |
| Completeness | {x}/5 | 25% | {x*0.25} |
| Accuracy | {x}/5 | 30% | {x*0.30} |
| Usefulness | {x}/5 | 20% | {x*0.20} |
| **Total** | | | **{sum}/5** |

### Issues Found

- [CRITICAL] {issue description}
- [MAJOR] {issue description}
- [MINOR] {issue description}

### Recommendations

1. {actionable recommendation}
2. {actionable recommendation}
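
The Scores table in the template can be filled in mechanically once each dimension has a score. This sketch shows one way to do that, assuming integer scores keyed by dimension name; `render_scores` is a hypothetical helper, not part of the cortex CLI.

```python
from typing import Dict

# Dimension weights as listed in the report template.
WEIGHTS: Dict[str, float] = {
    "Clarity": 0.25,
    "Completeness": 0.25,
    "Accuracy": 0.30,
    "Usefulness": 0.20,
}


def render_scores(scores: Dict[str, int]) -> str:
    """Render the Scores table of the audit report as markdown."""
    rows = [
        "| Dimension | Score | Weight | Weighted |",
        "|-----------|-------|--------|----------|",
    ]
    total = 0.0
    for name, weight in WEIGHTS.items():
        weighted = scores[name] * weight
        total += weighted
        rows.append(f"| {name} | {scores[name]}/5 | {weight:.0%} | {weighted:.2f} |")
    rows.append(f"| **Total** | | | **{total:.2f}/5** |")
    return "\n".join(rows)


if __name__ == "__main__":
    print(render_scores({"Clarity": 4, "Completeness": 3, "Accuracy": 5, "Usefulness": 4}))
```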

Implementation Patterns

Pattern 1: Quick Audit (5-minute review)

Use for rapid assessment of skill quality:

# Run automated structure checks
cortex skills audit <skill-name> --quick

# Output: Pass/Fail with basic metrics

Quick Audit Checks:

YAML frontmatter valid?
Required sections present?
Code blocks have language tags?
No TODO/FIXME markers?
Token count reasonable?
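
To make the quick checks concrete, here is a rough sketch of how four of them could be approximated on the raw text of a skill file. It is an illustration under stated assumptions, not how `cortex skills audit --quick` works internally: `quick_checks` is a hypothetical function, the frontmatter check only looks for the `---` delimiters rather than parsing YAML, and the token count uses a crude four-characters-per-token heuristic. The required-sections check is omitted because the list of required sections is not specified here.

```python
import re
from typing import Dict

FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block


def quick_checks(text: str, max_tokens: int = 5000) -> Dict[str, bool]:
    """Run simplified structural checks on the raw text of a skill file."""
    lines = [line.strip() for line in text.splitlines()]
    # Frontmatter: the first line opens with '---' and a later line closes it.
    frontmatter = len(lines) > 1 and lines[0] == "---" and "---" in lines[1:]
    # Fences alternate open/close, so every other fence line is an opener.
    fence_lines = [line for line in lines if line.startswith(FENCE)]
    openers = fence_lines[0::2]
    tagged = all(len(line) > len(FENCE) for line in openers)
    no_markers = re.search(r"\b(TODO|FIXME)\b", text) is None
    # Rough heuristic of about 4 characters per token; not a real tokenizer.
    tokens_ok = len(text) / 4 < max_tokens
    return {
        "frontmatter": frontmatter,
        "language_tags": tagged,
        "no_todo_fixme": no_markers,
        "token_count": tokens_ok,
    }
```

A skill passes the quick audit in this sketch only if every value in the returned dictionary is true.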

Pattern 2: Full Audit (15-30 minute review)

Comprehensive evaluation with human review:

# Generate full audit report
cortex skills audit <skill-name> --full

# Interactive mode for scoring
cortex skills audit <skill-name> --interactive

Full Audit Process:

Run automated checks
Read through content manually
Test code examples
Score each dimension
Document issues and recommendations
Generate report

Pattern 3: Comparative Audit

Compare a skill against a reference implementation:

# Compare against template-skill-enhanced
cortex skills audit <skill-name> --compare template-skill-enhanced

Pattern 4: Batch Audit

Audit multiple skills for registry health:

# Audit all skills in a category
cortex skills audit --category security

# Audit skills below threshold
cortex skills audit --below-score 3.5

CLI Commands

# Basic audit
cortex skills audit <skill-name>

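# Options
  --quick               Quick structural check only
  --full                Full audit with all dimensions
  --interactive         Interactive scoring mode
  --output FILE         Write report to file
  --format FORMAT       Output format (markdown|json|yaml)
  --compare SKILL       Compare against reference skill
  --fix                 Auto-fix simple issues (formatting)
  --category NAME       Audit all skills in a category (batch audit)
  --below-score SCORE   Audit skills scoring below SCORE (batch audit)
  --fail-under SCORE    Fail when the score is below SCORE (used in the CI example)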

Creating Custom Rubrics

Skills can define custom rubrics in validation/rubric.yaml:

# validation/rubric.yaml
version: "1.0.0"
skill_name: my-skill

dimensions:
  clarity:
    weight: 25
    criteria:
      - "API examples use realistic data"
      - "Error handling is shown for each operation"
  completeness:
    weight: 25
    criteria:
      - "Covers all HTTP methods"
      - "Includes pagination patterns"
  accuracy:
    weight: 30
    criteria:
      - "Follows REST conventions"
      - "Security headers documented"
  usefulness:
    weight: 20
    criteria:
      - "Examples work with common frameworks"

passing_criteria:
  minimum_score: 3.5  # Higher bar for this skill
  required_dimensions:
    - accuracy
    - completeness
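
Before relying on a custom rubric, it can help to sanity-check it. The sketch below works on the rubric after it has been loaded into a dictionary (for example with a YAML parser) and reports obvious problems. The rules it enforces are assumptions inferred from the example above, not a documented schema: weights summing to 100, at least one criterion per dimension, and every required dimension being defined. `validate_rubric` is a hypothetical name.

```python
from typing import Any, Dict, List


def validate_rubric(rubric: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a rubric dict; empty means it looks sane."""
    problems: List[str] = []
    dimensions = rubric.get("dimensions", {})

    # Assumption: weights are percentages and should add up to 100.
    total = sum(d.get("weight", 0) for d in dimensions.values())
    if total != 100:
        problems.append(f"weights sum to {total}, expected 100")

    # Assumption: each dimension needs at least one criterion to score against.
    for name, dimension in dimensions.items():
        if not dimension.get("criteria"):
            problems.append(f"dimension '{name}' has no criteria")

    # Every required dimension must actually be defined in the rubric.
    required = rubric.get("passing_criteria", {}).get("required_dimensions", [])
    for name in required:
        if name not in dimensions:
            problems.append(f"required dimension '{name}' is not defined")

    return problems


if __name__ == "__main__":
    example = {
        "dimensions": {
            "clarity": {"weight": 25, "criteria": ["API examples use realistic data"]},
            "completeness": {"weight": 25, "criteria": ["Covers all HTTP methods"]},
            "accuracy": {"weight": 30, "criteria": ["Follows REST conventions"]},
            "usefulness": {"weight": 20, "criteria": ["Examples work with common frameworks"]},
        },
        "passing_criteria": {
            "minimum_score": 3.5,
            "required_dimensions": ["accuracy", "completeness"],
        },
    }
    print(validate_rubric(example) or "rubric looks consistent")
```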

Best Practices

Do

Be specific - "Line 45: SQL query vulnerable to injection" not "has security issues"
Be actionable - Include how to fix each issue
Be fair - Use the same standards consistently
Document evidence - Quote specific content for each score
Prioritize - Critical issues first, suggestions last

Don't

Score based on personal style preferences
Mark deprecated patterns without suggesting alternatives
Fail skills for missing optional sections
Ignore security issues regardless of other scores
Rush through audits for complex skills

Anti-Patterns

The Rubber Stamp

Problem: Approving skills without thorough review
Why it's bad: Low-quality skills erode trust in the library
Fix: Use the full audit checklist, test code examples

The Perfectionist Block

Problem: Failing skills for minor issues
Why it's bad: Prevents useful skills from being available
Fix: Distinguish between blocking issues and suggestions

Score Inflation

Problem: Giving high scores without justification
Why it's bad: Makes scores meaningless
Fix: Document specific evidence for each score

Integration with CI/CD

# .github/workflows/skill-quality.yml
name: Skill Quality Gate

on:
  pull_request:
    paths:
      - 'skills/**'

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
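      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history; the default shallow checkout has no commit to diff against
      - name: Install cortex
        run: pip install cortex
      - name: Audit changed skills
        run: |
          # Diff against the PR base branch so every commit in the pull request is covered
          for skill in $(git diff --name-only "origin/${{ github.base_ref }}...HEAD" | grep '^skills/' | cut -d'/' -f2 | sort -u); do
            cortex skills audit "$skill" --quick --fail-under 3.0
          done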
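
The shell pipeline in the workflow reduces a list of changed file paths to a set of skill names. The same step can be sketched in Python, which makes the logic easier to test. This assumes skills live at `skills/<skill-name>/...`, as the workflow's path filter suggests; `changed_skills` is a hypothetical helper.

```python
from typing import Iterable, List


def changed_skills(paths: Iterable[str]) -> List[str]:
    """Map changed file paths to the unique skill directories they belong to."""
    names = set()
    for path in paths:
        parts = path.split("/")
        # Require skills/<name>/<file>, so loose files directly under skills/
        # are not mistaken for skill names.
        if len(parts) >= 3 and parts[0] == "skills":
            names.add(parts[1])
    return sorted(names)


if __name__ == "__main__":
    diff_output = [
        "skills/quality-audit/SKILL.md",
        "skills/quality-audit/validation/rubric.yaml",
        "skills/README.md",
        "docs/index.md",
    ]
    print(changed_skills(diff_output))
```

Unlike the pipeline, this version skips files that sit directly under `skills/`, which would otherwise be passed to the audit command as if they were skill names.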

Troubleshooting

"Audit fails but skill looks fine"

Check YAML frontmatter syntax
Verify all required sections exist
Ensure code blocks have language tags
Check for hidden characters (copy/paste issues)

"Scores seem inconsistent"

Review the scoring guide for each dimension
Calibrate by auditing template-skill-enhanced first
Use --interactive mode for clearer criteria

External Resources

Skill Template Reference
Rubric Schema
Skill Creator Guide

Changelog

1.0.0 (2026-01-05)

Initial release
Four-dimension scoring framework
CLI integration
CI/CD workflow example

