.claude/skills/review-multi/SKILL.md
Comprehensive multi-dimensional skill reviews across structure, content, quality, usability, and integration. Task-based operations with automated validation, manual assessment, scoring rubrics, and improvement recommendations. Use when reviewing skills, ensuring quality, validating production readiness, identifying improvements, or conducting quality assurance.
npx skillsauth add adaptationio/skrillz review-multi
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
review-multi provides a systematic framework for conducting comprehensive, multi-dimensional reviews of Claude Code skills. It evaluates skills across 5 independent dimensions, combining automated validation with manual assessment to deliver objective quality scores and actionable improvement recommendations.
Purpose: Systematic skill quality assurance through multi-dimensional assessment
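The 5 Review Dimensions: Structure, Content, Quality, Usability, Integration
Automation Levels: Structure 95%, Content 40%, Quality 50%, Usability 10%, Integration 30%
Scoring System: each dimension is scored 1-5; the weighted average gives an overall score that maps to a letter grade (A through F)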
Value Proposition:
Key Benefits:
Use review-multi when:
Pre-Production Validation - Review new skills before deploying to production to catch issues early and ensure quality standards
Quality Assurance - Conduct systematic QA on skills to validate they meet ecosystem standards and user needs
Identifying Improvements - Discover specific, actionable improvements for existing skills through multi-dimensional assessment
Continuous Improvement - Regular reviews throughout development lifecycle, not just at end, to maintain quality
Production Readiness Assessment - Determine if skill is ready for production use with objective scoring and grade mapping
Skill Ecosystem Standards - Ensure consistency and quality across multiple skills using standardized review framework
Post-Update Validation - Review skills after major updates to ensure changes don't introduce issues or degrade quality
Learning and Improvement - Use review findings to learn patterns, improve future skills, and refine development practices
Team Calibration - Standardize quality assessment across multiple reviewers with objective rubrics
Don't Use When:
Required:
.claude/skills/[skill-name]/ format)
Optional:
development-workflow/references/common-patterns.md)
Skills (no required dependencies, complementary):
The review-multi scoring system provides objective, consistent quality assessment across all skill dimensions.
Each dimension is scored independently using a 1-5 integer scale:
5 - Excellent (Exceeds Standards)
4 - Good (Meets Standards)
3 - Acceptable (Minor Improvements Needed)
2 - Needs Work (Notable Issues)
1 - Poor (Significant Problems)
The overall score is a weighted average of the 5 dimension scores:
Overall = (Structure × 0.20) + (Content × 0.25) + (Quality × 0.25) +
(Usability × 0.15) + (Integration × 0.15)
Weight Rationale:
Example Calculations:
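As a worked example, the weighted average can be computed directly. This is a minimal sketch: the function name `overall_score` and the dictionary layout are illustrative assumptions, not part of the review-multi scripts. The weights come from the formula above, and the sample scores are the ones shown for skill-researcher in the Comprehensive Review example.

```python
# Weights from the Overall formula; all names here are illustrative only.
WEIGHTS = {
    "structure": 0.20,
    "content": 0.25,
    "quality": 0.25,
    "usability": 0.15,
    "integration": 0.15,
}


def overall_score(scores: dict[str, int]) -> float:
    """Return the weighted average of the five 1-5 dimension scores."""
    if set(scores) != set(WEIGHTS):
        raise ValueError(f"expected dimensions: {sorted(WEIGHTS)}")
    for name, value in scores.items():
        if not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"{name} must be an integer from 1 to 5")
    total = sum(scores[name] * weight for name, weight in WEIGHTS.items())
    # Round to hide floating-point noise such as 4.6000000000000005.
    return round(total, 2)


example = {"structure": 5, "content": 5, "quality": 4, "usability": 5, "integration": 4}
print(overall_score(example))  # 1.00 + 1.25 + 1.00 + 0.75 + 0.60 = 4.6
```

Because Content and Quality carry the largest weights, a one-point change in either moves the overall score by 0.25, compared with 0.15 for Usability or Integration.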
Overall scores map to letter grades:
Based on overall score:
Decision Framework:
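The grade and readiness mapping can be sketched as a simple lookup. The bands below are copied from the grade table in the Quick Reference. The function name `grade_for` is hypothetical, and treating each band's lower bound as a cutoff is an assumption made here so that in-between values such as 4.45 still receive a grade.

```python
# (lower bound, grade, status) rows copied from the Quick Reference grade table.
# Using the lower bound as a cutoff is an assumption, not a documented rule.
GRADE_BANDS = [
    (4.5, "A", "Production Ready"),
    (4.0, "B+", "Ready (minor improvements)"),
    (3.5, "B-", "Needs Improvements"),
    (2.5, "C", "Not Ready"),
    (1.5, "D", "Not Ready"),
    (1.0, "F", "Not Ready"),
]


def grade_for(overall: float) -> tuple[str, str]:
    """Map an overall score (1.0-5.0) to a (grade, status) pair."""
    if not 1.0 <= overall <= 5.0:
        raise ValueError("overall score must be between 1.0 and 5.0")
    for lower_bound, grade, status in GRADE_BANDS:
        if overall >= lower_bound:
            return grade, status
    raise AssertionError("unreachable: the last band starts at 1.0")


print(grade_for(4.6))
print(grade_for(3.9))
```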
Purpose: Validate file organization, naming conventions, YAML frontmatter compliance, and progressive disclosure
When to Use This Operation:
Automation Level: 95% automated via scripts/validate-structure.py
Process:
Run Structure Validation Script
python3 scripts/validate-structure.py /path/to/skill [--json] [--verbose]
Script checks YAML, file structure, naming, progressive disclosure
Review YAML Frontmatter
Verify File Structure
Check Naming Conventions
Validate Progressive Disclosure
Validation Checklist:
name field in kebab-case format (e.g., skill-name)
description includes 5+ trigger keywords (naturally embedded)
Scoring Criteria:
Outputs:
Time Estimate: 5-10 minutes (mostly automated)
Example:
$ python3 scripts/validate-structure.py .claude/skills/todo-management
Structure Validation Report
===========================
Skill: todo-management
Date: 2025-11-06
✅ YAML Frontmatter: PASS
- Name format: valid (kebab-case)
- Trigger keywords: 8 found (target: 5+)
✅ File Structure: PASS
- SKILL.md: exists
- README.md: exists
- references/: 3 files found
- scripts/: 1 file found
✅ Naming Conventions: PASS
- All files follow conventions
⚠️ Progressive Disclosure: WARNING
- SKILL.md: 569 lines (good)
- state-management-guide.md: 501 lines (good)
- BUT: No Quick Reference section detected
Overall Structure Score: 4/5 (Good)
Issues: 1 warning (missing Quick Reference)
Recommendation: Add Quick Reference section to SKILL.md
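The naming check reported above can be illustrated with a short sketch. This is not the code of scripts/validate-structure.py, whose internals are not shown here; the regular expression and both function names are assumptions about what a kebab-case check might look like.

```python
import re

# Assumed definition of kebab-case: lowercase alphanumeric words joined by single hyphens.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_kebab_case(name: str) -> bool:
    """Return True when the whole name is in kebab-case."""
    return KEBAB_CASE.fullmatch(name) is not None


def suggest_kebab_case(name: str) -> str:
    """Convert a CamelCase or snake_case stem such as 'MyGuide' to 'my-guide'."""
    # Insert a hyphen before each capital that follows a lowercase letter or digit.
    hyphenated = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", name)
    # Replace underscores and whitespace with hyphens, then lowercase everything.
    return re.sub(r"[_\s]+", "-", hyphenated).lower()


for candidate in ["todo-management", "MyGuide", "state_management_guide"]:
    if is_kebab_case(candidate):
        print(f"{candidate}: valid")
    else:
        print(f"{candidate}: rename to {suggest_kebab_case(candidate)}")
```

The sketch deliberately ignores conventional uppercase files such as SKILL.md and README.md, which a real validator would need to exempt.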
Purpose: Assess section completeness, content clarity, example quality, and documentation comprehensiveness
When to Use This Operation:
Automation Level: 40% automated (section detection, example counting), 60% manual assessment
Process:
Check Section Completeness (automated + manual)
Assess Content Clarity (manual)
Evaluate Example Quality (automated count + manual quality)
Review Documentation Completeness (manual)
Check Explanation Depth (manual)
Validation Checklist:
Scoring Criteria:
Outputs:
Time Estimate: 15-30 minutes (requires manual review)
Example:
Content Review: prompt-builder
==============================
Section Completeness: 9/10 ✅
✅ Overview: Present, clear explanation of purpose
✅ When to Use: 7 scenarios listed
✅ Main Content: 5-step workflow, well-organized
✅ Best Practices: 6 practices documented
✅ Quick Reference: Present
⚠️ Common Mistakes: Not present (optional but valuable)
Example Quality: 8/10 ✅
- Count: 12 examples (exceeds target of 5+)
- Concrete: Yes, all examples executable
- Helpful: Yes, demonstrate key concepts
- Minor: Could use 1-2 edge case examples
Content Clarity: 9/10 ✅
- Well-organized logical flow
- Clear explanations without verbosity
- Technical level appropriate
- Minor: Step 3 could be clearer (add diagram)
Documentation Completeness: 8/10 ✅
- All workflow steps documented
- Validation criteria clear
- Minor gaps: Error handling not covered
Content Score: 4/5 (Good)
Primary Recommendation: Add Common Mistakes section
Secondary: Add error handling guidance to Step 3
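The automated portion of this review (section detection and example counting) could look roughly like the sketch below. The section names are taken from the example report; the heading-matching rule, the choice to count fenced code blocks as examples, and both function names are assumptions rather than the behavior of the actual scripts.

```python
import re

FENCE = "`" * 3  # Built at runtime so this example can itself sit inside a fenced block.

# Section names taken from the example report above.
EXPECTED_SECTIONS = ["Overview", "When to Use", "Best Practices", "Quick Reference", "Common Mistakes"]


def find_sections(markdown: str) -> dict[str, bool]:
    """Report which expected sections appear in any Markdown heading."""
    headings = [
        match.group(1).strip().lower()
        for match in re.finditer(r"^#{1,6}\s+(.+)$", markdown, re.MULTILINE)
    ]
    return {name: any(name.lower() in heading for heading in headings) for name in EXPECTED_SECTIONS}


def count_code_examples(markdown: str) -> int:
    """Count fenced code blocks, assuming fences are balanced and start at column 0."""
    fence_lines = re.findall("^" + re.escape(FENCE), markdown, re.MULTILINE)
    return len(fence_lines) // 2


sample = "\n".join([
    "# Demo Skill",
    "## Overview",
    "Short description.",
    "## When to Use",
    FENCE + "python",
    "print('hi')",
    FENCE,
    "## Quick Reference",
])
print(find_sections(sample))
print(count_code_examples(sample))
```

One known limitation: a line starting with `#` inside a fenced block is also treated as a heading, so counts like these only support the manual assessment and do not replace it.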
Purpose: Evaluate pattern compliance, best practices adherence, anti-pattern detection, and code/script quality
When to Use This Operation:
Automation Level: 50% automated (pattern detection, anti-pattern checking), 50% manual assessment
Process:
Detect Architecture Pattern (automated + manual)
Validate Documentation Patterns (automated + manual)
Check Best Practices (manual)
Detect Anti-Patterns (automated + manual)
Assess Code Quality (manual, if scripts present)
Validation Checklist:
Scoring Criteria:
Outputs:
Time Estimate: 20-40 minutes (mixed automated + manual)
Example:
Quality Review: workflow-skill-creator
======================================
Pattern Compliance: ✅
- Pattern Detected: Workflow-based
- Implementation: Correct (5 sequential steps with dependencies)
- Consistency: High (all steps follow same structure)
Documentation Patterns: ✅
- 5 Core Sections: All present
- Structure: Consistent across all 5 steps
- Formatting: Proper heading levels
Best Practices Adherence: 8/10 ✅
✅ Validation checklists: Present and specific
✅ Examples throughout: 6 examples included
✅ Quick Reference: Present
⚠️ Error handling: Limited (only happy path in examples)
Anti-Pattern Detection: 1 detected ⚠️
✅ No keyword stuffing (15 natural keywords)
✅ No monolithic file (1,465 lines but has references/)
✅ Consistent structure
✅ Specific validation criteria
✅ Examples complete (no placeholders)
⚠️ Error cases: Only happy path documented
✅ Dependencies: Clearly documented
✅ Not over-engineered
Code Quality: N/A (no scripts)
Quality Score: 4/5 (Good)
Primary Issue: Limited error handling documentation
Recommendation: Add error case examples and recovery guidance
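Two of the anti-pattern checks in the report above (placeholder examples and monolithic files) lend themselves to automation. The sketch below is illustrative only: the placeholder markers, the 1,000-line default threshold, and the function names are all hypothetical, and the real scripts/check-patterns.py may apply different rules.

```python
import re

# Hypothetical placeholder markers. Matching is case-sensitive on purpose so that
# skill names such as "todo-management" are not flagged.
PLACEHOLDER_PATTERN = re.compile(r"\b(?:TODO|TBD|FIXME)\b|\[placeholder\]")


def find_placeholders(text: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) pairs that contain a placeholder marker."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if PLACEHOLDER_PATTERN.search(line)
    ]


def is_monolithic(line_count: int, has_references: bool, max_lines: int = 1000) -> bool:
    """Flag a long SKILL.md only when nothing has been split out into references/."""
    return line_count > max_lines and not has_references


print(find_placeholders("Step 1: run todo-management\nTODO: add example\nDone"))
print(is_monolithic(1465, has_references=True))
```

The second call mirrors the report: a 1,465-line SKILL.md passes because a references/ directory exists.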
Purpose: Evaluate ease of use, learnability, real-world effectiveness, and user satisfaction through scenario testing
When to Use This Operation:
Automation Level: 10% automated (basic checks), 90% manual testing
Process:
Test in Real-World Scenario
Assess Navigation/Findability
Evaluate Clarity
Measure Effectiveness
Assess Learning Curve
Validation Checklist:
Scoring Criteria:
Outputs:
Time Estimate: 30-60 minutes (requires actual testing)
Example:
Usability Review: skill-researcher
==================================
Real-World Scenario Test: ✅
- Scenario: Research GitHub API integration patterns
- Result: SUCCESS - Found 5 relevant sources, synthesized findings
- Experience: Smooth, operations clearly explained
- Time: 45 minutes (within the expected 60-minute range)
Navigation/Findability: 9/10 ✅
- Information easy to find
- 5 operations clearly separated
- Quick Reference table very helpful
- Minor: Could use table of contents for long doc
Instruction Clarity: 9/10 ✅
- Steps clear and actionable
- Process well-explained
- Examples demonstrate concepts
- Minor: Web search query formulation could be clearer
Effectiveness: 10/10 ✅
- Achieved purpose: Found patterns and synthesized
- Delivered value: Comprehensive research in 45 min
- Would use again: Yes, very helpful
Learning Curve: 8/10 ✅
- Time to understand: 10 minutes
- Time to use effectively: 15 minutes
- Reasonable for complexity
- First-time user: Some concepts need explanation (credibility scoring)
Error Handling: N/A (no errors encountered)
User Satisfaction: 9/10 ✅
- Would use again: Yes
- Would recommend: Yes
- Overall experience: Very positive
Usability Score: 5/5 (Excellent)
Minor Improvement: Add brief explanation of credibility scoring concept
Purpose: Assess dependency documentation, data flow clarity, component integration, and composition patterns
When to Use This Operation:
Automation Level: 30% automated (dependency checking, cross-reference validation), 70% manual assessment
Process:
Review Dependency Documentation (manual)
dependencies field used (if applicable)?
Assess Data Flow Clarity (manual, for workflow skills)
Evaluate Component Integration (manual)
Verify Cross-References (automated + manual)
Check Composition Patterns (manual, for workflow skills)
Validation Checklist:
dependencies field correct (if used)
Scoring Criteria:
Outputs:
Time Estimate: 15-25 minutes (mostly manual)
Example:
Integration Review: development-workflow
========================================
Dependency Documentation: 10/10 ✅
- Required Skills: None (workflow is standalone)
- Component Skills: 5 clearly documented (skill-researcher, planning-architect, task-development, prompt-builder, todo-management)
- Optional Skills: 3 complementary skills mentioned (review-multi, skill-updater, testing-validator)
- YAML Field: Not used (not required, skills referenced in content)
Data Flow Clarity: 10/10 ✅ (Workflow Skill)
- Data flow diagram present (skill → output → next skill)
- Inputs/outputs for each step documented
- Users understand how artifacts flow
- Example:
skill-researcher → research-synthesis.md → planning-architect
  ↓
skill-architecture-plan.md → task-development
Component Integration: 10/10 ✅
- Integration method documented for each step (Guided Execution)
- Integration examples provided
- Clear explanation of how skills work together
- Process for using each component skill detailed
Cross-Reference Validation: ✅
- Internal links valid (references/ files exist and reachable)
- External skill references correct (all 5 component skills exist)
- Complementary skills mentioned appropriately
Composition Pattern: 10/10 ✅ (Workflow Skill)
- Pattern: Sequential Pipeline (with one optional step)
- Correctly implemented (Step 1 → 2 → [3 optional] → 4 → 5)
- Orchestration details provided
- Clear flow diagram
Integration Score: 5/5 (Excellent)
Notes: Exemplary integration documentation for workflow skill
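The automated part of cross-reference validation (checking that referenced files exist) can be sketched as follows. The regular expression, the function name `missing_references`, and the restriction to `references/` and `scripts/` paths are assumptions made for illustration; the demo runs against a temporary directory instead of a real skill.

```python
import re
import tempfile
from pathlib import Path

# Assumed link shape: references/<name>.md or scripts/<name>.py mentioned anywhere in SKILL.md.
REFERENCE_LINK = re.compile(r"\b(?:references|scripts)/[\w.-]+\.(?:md|py)\b")


def missing_references(skill_dir: Path) -> list[str]:
    """Return referenced relative paths that do not exist under skill_dir."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    mentioned = sorted(set(REFERENCE_LINK.findall(text)))
    return [relative for relative in mentioned if not (skill_dir / relative).is_file()]


# Demo with a throwaway skill directory: a.md exists, b.md does not.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp)
    (skill / "references").mkdir()
    (skill / "references" / "a.md").write_text("ok", encoding="utf-8")
    (skill / "SKILL.md").write_text(
        "See references/a.md and references/b.md.", encoding="utf-8"
    )
    demo_missing = missing_references(skill)

print(demo_missing)  # ['references/b.md']
```

References to other skills' files would be resolved against the wrong directory by this sketch, so those still need the manual check described above.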
Purpose: Complete multi-dimensional assessment across all 5 dimensions with aggregate scoring
When to Use:
Process:
Run All 5 Operations Sequentially
Aggregate Scores
Assess Production Readiness
Compile Improvement Recommendations
Generate Comprehensive Report
Output:
Time Estimate: 1.5-2.5 hours total
Example Output:
Comprehensive Review Report: skill-researcher
=============================================
OVERALL SCORE: 4.6/5.0 - GRADE A
STATUS: ✅ PRODUCTION READY
Dimension Scores:
- Structure: 5/5 (Excellent) - Perfect file organization
- Content: 5/5 (Excellent) - Comprehensive, clear documentation
- Quality: 4/5 (Good) - High quality, minor error handling gaps
- Usability: 5/5 (Excellent) - Easy to use, highly effective
- Integration: 4/5 (Good) - Well-documented dependencies
Production Readiness: READY - High quality, deploy with confidence
Recommendations (Priority Order):
1. [Medium] Add error handling examples for web search failures
2. [Low] Consider adding table of contents for long SKILL.md
Strengths:
- Excellent structure and organization
- Comprehensive coverage of 5 research operations
- Strong usability with clear instructions
- Good examples throughout
Overall: Exemplary skill, production-ready quality
Purpose: Quick automated validation for rapid quality feedback during development
When to Use:
Process:
Run Automated Structure Validation
python3 scripts/validate-structure.py /path/to/skill
Check Critical Issues
Generate Pass/Fail Report
Provide Quick Fixes (if available)
Output:
Time Estimate: 5-10 minutes
Example Output:
$ python3 scripts/validate-structure.py .claude/skills/my-skill
Fast Check Report
=================
Skill: my-skill
❌ FAIL - Critical Issues Found
Critical Issues:
1. YAML frontmatter: Invalid syntax (line 3: unexpected character)
2. Naming convention: File "MyGuide.md" should be "my-guide.md"
Quick Fixes:
1. Fix YAML: Remove trailing comma on line 3
2. Rename file: mv references/MyGuide.md references/my-guide.md
Run full validation after fixes: python3 scripts/validate-structure.py .claude/skills/my-skill
Purpose: Flexible review focusing on specific dimensions or concerns
When to Use:
Options:
Process:
Define Custom Review Scope
Run Selected Operations
Generate Targeted Report
Example Scenarios:
Scenario 1: Content-Focused Review
Custom Review: Content + Examples
- Operations: Content Review only
- Thoroughness: Thorough
- Focus: Example quality and completeness
- Time: 30 minutes
Scenario 2: Quick Quality Check
Custom Review: Structure + Quality (Fast)
- Operations: Structure + Quality
- Thoroughness: Quick
- Focus: Pattern compliance, anti-patterns
- Time: 15-20 minutes
Scenario 3: Workflow Integration Review
Custom Review: Integration Deep Dive
- Operations: Integration Review only
- Thoroughness: Thorough
- Focus: Data flow, composition patterns
- Time: 30 minutes
Practice: Run Fast Check mode before requesting comprehensive review
Rationale: Automated checks catch 70% of structural issues in 5-10 minutes, allowing manual review to focus on higher-value assessment
Application: Always run validate-structure.py before detailed review
Practice: Follow validation checklists item-by-item for each operation
Rationale: Research shows teams using checklists reduce common issues by 30% and ensure consistent results
Application: Print or display checklist, mark each item explicitly
Practice: Conduct usability review with actual usage, not just documentation reading
Rationale: Real-world testing reveals hidden usability issues that documentation review misses
Application: For Usability Review, actually use the skill to complete a realistic task
Practice: Let scripts handle routine checks, focus manual effort on judgment-requiring assessment
Rationale: Automation provides 70% reduction in manual review time for routine checks
Application: Use scripts for Structure and partial Quality checks, manual for Content/Usability
Practice: Make improvement recommendations specific, prioritized, and actionable
Rationale: Vague feedback ("improve quality") is less valuable than specific guidance ("add error handling examples to Step 3")
Application: For each issue, specify: What, Why, How (to fix), Priority
Practice: Conduct reviews throughout development lifecycle, not just at end
Rationale: Early reviews catch issues before they compound; rapid feedback maintains momentum (37% productivity increase)
Application: Fast Check during development, Comprehensive Review before production
Practice: Document before/after scores to measure improvement over time
Rationale: Tracking demonstrates progress, identifies patterns, validates improvements
Application: Save review reports, compare scores across iterations
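Comparing two saved reviews can be as simple as subtracting dimension scores. The sketch below is a minimal illustration: the function names and the before/after numbers are made up for the example and do not come from a real review.

```python
DIMENSIONS = ("structure", "content", "quality", "usability", "integration")


def score_deltas(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return the per-dimension change between two reviews (after minus before)."""
    return {dim: after[dim] - before[dim] for dim in DIMENSIONS}


def regressions(before: dict[str, int], after: dict[str, int]) -> list[str]:
    """List the dimensions whose score dropped between reviews."""
    return [dim for dim, delta in score_deltas(before, after).items() if delta < 0]


# Hypothetical scores from two iterations of the same skill.
before = {"structure": 3, "content": 3, "quality": 4, "usability": 2, "integration": 4}
after = {"structure": 4, "content": 4, "quality": 4, "usability": 4, "integration": 3}

print(score_deltas(before, after))
print(regressions(before, after))
```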
Practice: Use review findings to improve future skills, not just current skill
Rationale: Learnings compound; patterns identified in reviews improve entire skill ecosystem
Application: Document common issues, create guidelines, update templates
Symptom: Spending time on detailed review only to discover fundamental structural issues
Cause: Assumption that structure is correct, eagerness to assess content
Fix: Always run Structure Review (Fast Check) first - takes 5-10 minutes, catches 70% of structural issues
Prevention: Make Fast Check mandatory first step in any review process
Symptom: Inconsistent scores, debate over ratings, difficulty justifying scores
Cause: Using personal opinion instead of rubric criteria
Fix: Use references/scoring-rubric.md - score based on specific criteria, not feeling
Prevention: Print rubric, refer to criteria for each score, document evidence
Symptom: Skill looks good on paper but difficult to use in practice
Cause: Skipping Usability Review (90% manual, time-consuming)
Fix: Actually test skill in real scenario - reveals hidden issues
Prevention: Allocate 30-60 minutes for usability testing; do not skip it for production reviews
Symptom: Long list of improvements, unclear what to fix first, overwhelmed
Cause: Treating all issues equally without assessing impact
Fix: Prioritize issues: Critical (must fix) → High → Medium → Low (nice to have)
Prevention: Tag each issue with priority level during review
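Priority tagging is straightforward to model in code. The sketch below assumes a hypothetical `Issue` record with the What, Why, How, and Priority fields recommended in the best practices, and sorts issues in the Critical → High → Medium → Low order; the sample issues echo recommendations from the example reports.

```python
from dataclasses import dataclass

# Lower rank sorts first, matching Critical → High → Medium → Low.
PRIORITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


@dataclass
class Issue:
    """Hypothetical record for one review finding."""
    what: str
    why: str
    how: str
    priority: str


def prioritize(issues: list[Issue]) -> list[Issue]:
    """Return issues ordered by priority; ties keep their original order."""
    return sorted(issues, key=lambda issue: PRIORITY_ORDER[issue.priority])


issues = [
    Issue("No table of contents", "Long SKILL.md is hard to scan", "Add a table of contents", "Low"),
    Issue("Invalid YAML frontmatter", "Skill cannot be loaded", "Fix the syntax on line 3", "Critical"),
    Issue("No error handling examples", "Only the happy path is covered", "Add failure examples", "Medium"),
]

for issue in prioritize(issues):
    print(f"[{issue.priority}] {issue.what}: {issue.how}")
```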
Symptom: Discovering major issues late in development, costly rework
Cause: Waiting until end to review, accumulating issues
Fix: Review early and often - run Fast Check during development and again after each iteration
Prevention: Continuous validation, rapid feedback, catch issues when small
Symptom: Repeating same issues across multiple skills
Cause: Treating each review in isolation, not learning from patterns
Fix: Track common issues, create guidelines, update development process
Prevention: Document patterns, share learnings, improve templates
| Operation | Focus | Automation | Time | Key Output |
|-----------|-------|------------|------|------------|
| Structure | YAML, files, naming, organization | 95% | 5-10m | Structure score, compliance report |
| Content | Completeness, clarity, examples | 40% | 15-30m | Content score, section assessment |
| Quality | Patterns, best practices, anti-patterns | 50% | 20-40m | Quality score, pattern compliance |
| Usability | Ease of use, effectiveness | 10% | 30-60m | Usability score, scenario test results |
| Integration | Dependencies, data flow, composition | 30% | 15-25m | Integration score, dependency validation |

| Score | Level | Meaning | Action |
|-------|-------|---------|--------|
| 5 | Excellent | Exceeds standards | Exemplary - use as example |
| 4 | Good | Meets standards | Production ready - standard quality |
| 3 | Acceptable | Minor improvements | Usable - note improvements |
| 2 | Needs Work | Notable issues | Not ready - significant improvements |
| 1 | Poor | Significant problems | Not viable - extensive rework |

| Overall Score | Grade | Status | Decision |
|---------------|-------|--------|----------|
| 4.5-5.0 | A | ✅ Production Ready | Ship it - high quality |
| 4.0-4.4 | B+ | ✅ Ready (minor improvements) | Ship - note improvements for next iteration |
| 3.5-3.9 | B- | ⚠️ Needs Improvements | Hold - fix issues first |
| 2.5-3.4 | C | ❌ Not Ready | Don't ship - substantial work needed |
| 1.5-2.4 | D | ❌ Not Ready | Don't ship - significant rework |
| 1.0-1.4 | F | ❌ Not Ready | Don't ship - major issues |

| Mode | Time | Use Case | Coverage |
|------|------|----------|----------|
| Fast Check | 5-10m | During development, quick validation | Structure only (automated) |
| Custom | Variable | Targeted review, specific concerns | Selected dimensions |
| Comprehensive | 1.5-2.5h | Pre-production, full assessment | All 5 dimensions + report |
# Fast structure validation
python3 scripts/validate-structure.py /path/to/skill
# Verbose output
python3 scripts/validate-structure.py /path/to/skill --verbose
# JSON output
python3 scripts/validate-structure.py /path/to/skill --json
# Pattern compliance check
python3 scripts/check-patterns.py /path/to/skill
# Generate review report
python3 scripts/generate-review-report.py review_data.json --output report.md
# Run comprehensive review
python3 scripts/review-runner.py /path/to/skill --mode comprehensive
Overall = (Structure × 0.20) + (Content × 0.25) + (Quality × 0.25) +
(Usability × 0.15) + (Integration × 0.15)
Weight Rationale:
- references/structure-review-guide.md
- references/content-review-guide.md
- references/quality-review-guide.md
- references/usability-review-guide.md
- references/integration-review-guide.md
- references/scoring-rubric.md
- references/review-report-template.md

For detailed guidance on each dimension, see reference files. For automation tools, see scripts/.