fatih-developer/agent-reviewer

Name: agent-reviewer
Author: fatih-developer

skills/agent-reviewer/SKILL.md

npx skillsauth add fatih-developer/fth-skills agent-reviewer


Agent Reviewer Protocol

Task is done — now look back. What went well, what went wrong, what should be different next time? Goal: never repeat the same mistake and continuously improve skills and processes.

Core principle: Retrospectives are painful but necessary. A good agent evaluates itself.

6 Review Dimensions

1. Goal Alignment

Did the result match the original intent?

- Was the user's actual request met?
- Did scope creep occur?
- Was there over-delivery or under-delivery?

2. Efficiency

Did the task take longer than necessary?

- Unnecessary tool calls?
- Repeated operations?
- Sequential steps that could have been parallel?
- Token/resource waste?

3. Decision Quality

Were decisions well-reasoned?

- Were assumptions verified?
- Were alternatives considered?
- Did early decisions cause later problems?

4. Error Handling

How were errors addressed?

- Were they detected quickly?
- Was the right strategy applied?
- Was the same error repeated?

5. Communication

How good was the interaction with the user?

- Unnecessary confirmations requested?
- Critical information missing at key points?
- Too many or too few questions?

6. Reusability

Can lessons from this task transfer to the next?

- General patterns discovered?
- Which skills were missing or insufficient?
- Which decisions should become standard?

Finding Severity

| Severity | Meaning | Action |
|----------|---------|--------|
| CRITICAL | Endangered the task or significantly reduced quality | Must fix |
| MODERATE | Created inefficiency but didn't break the result | Improve |
| POSITIVE | Something that went better than expected | Repeat, standardize |
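
As a rough illustration, the severity levels and their actions could be held in a small lookup that also produces the `Findings` line used in the report header. This is only a sketch: the `Severity` enum, the `ACTIONS` mapping, and the `summarize` helper are hypothetical names, not part of the skill itself.

```python
from enum import Enum


class Severity(Enum):
    """The three severity levels from the table above."""
    CRITICAL = "CRITICAL"
    MODERATE = "MODERATE"
    POSITIVE = "POSITIVE"


# Actions copied from the severity table; the mapping itself is illustrative.
ACTIONS = {
    Severity.CRITICAL: "Must fix",
    Severity.MODERATE: "Improve",
    Severity.POSITIVE: "Repeat, standardize",
}


def summarize(findings):
    """Build the 'Findings' line from a list of (Severity, text) pairs."""
    counts = {severity: 0 for severity in Severity}
    for severity, _text in findings:
        counts[severity] += 1
    return (
        f"{counts[Severity.CRITICAL]} critical | "
        f"{counts[Severity.MODERATE]} moderate | "
        f"{counts[Severity.POSITIVE]} positive"
    )
```

Counting per severity keeps the header line consistent with the findings actually listed in the body of the report.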

Output Format

AGENT REVIEWER — Task Retrospective
Task     : [task name]
Score    : X/10
Findings : N critical | N moderate | N positive

## Dimension Scores

| Dimension | Score | Summary |
|-----------|-------|---------|
| Goal Alignment | X/10 | ... |
| Efficiency | X/10 | ... |
| Decision Quality | X/10 | ... |
| Error Handling | X/10 | ... |
| Communication | X/10 | ... |
| Reusability | X/10 | ... |
| **Overall** | **X/10** | |

## Critical Findings
[If any — what happened, why critical, how to prevent]

## Improvement Areas
[Inefficiencies, missed opportunities]

## What Went Well
[Decisions and approaches worth repeating]

## Action Items

### For Next Task
1. [Concrete change — what to do]
2. [Concrete change]

### Skill / Process Improvement
1. [Which skill should be updated / added]
2. [Which pattern should be standardized]

## Lessons Learned
[Items a future agent instance should know — candidates for memory-ledger]
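
The Dimension Scores table in the template above could be filled in programmatically. The sketch below is one possible approach, with two assumptions that the skill does not state: the overall score is taken as the rounded mean of the six dimension scores, and `render_scores` is a hypothetical helper name.

```python
from statistics import mean

# Dimension names as they appear in the output template.
DIMENSIONS = [
    "Goal Alignment",
    "Efficiency",
    "Decision Quality",
    "Error Handling",
    "Communication",
    "Reusability",
]


def render_scores(scores):
    """Render the Dimension Scores table.

    `scores` maps each dimension name to a (score_out_of_10, summary) pair.
    """
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")

    lines = [
        "| Dimension | Score | Summary |",
        "|-----------|-------|---------|",
    ]
    for name in DIMENSIONS:
        score, summary = scores[name]
        lines.append(f"| {name} | {score}/10 | {summary} |")

    # Assumption: overall = rounded mean; the skill does not define the formula.
    overall = round(mean(scores[name][0] for name in DIMENSIONS))
    lines.append(f"| **Overall** | **{overall}/10** | |")
    return "\n".join(lines)
```

Rejecting incomplete input keeps a retrospective from silently omitting a dimension.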

Inefficiency Patterns — Auto-Detect

Scan the task history for these patterns:

| Pattern | Symptom | Fix |
|---------|---------|-----|
| Repeated tool call | Same file/API read 2+ times | Cache it |
| Unnecessary confirmation | Low-risk step triggered approval | Adjust checkpoint-guardian threshold |
| Late assumption discovery | "Actually it should be..." after error | Trigger assumption-checker earlier |
| Sequential parallel steps | Independent steps ran sequentially | Use parallel-planner |
| Blind retry | Logic error treated as transient | Fix error-recovery categorization |
| Context loss | Previous step info forgotten | Update memory-ledger |
| Over-decomposition | 2-step task split into 8 | Adjust task-decomposer granularity |
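
The first pattern, a repeated tool call, is the easiest to check mechanically. A minimal sketch follows, assuming the task history is available as a list of `(tool, target)` tuples. That format and the `find_repeated_calls` name are assumptions made for illustration, since the skill does not specify how history is stored.

```python
from collections import Counter


def find_repeated_calls(history, threshold=2):
    """Return calls that occur `threshold` or more times.

    `history` is assumed to be a list of (tool, target) tuples,
    for example ("read_file", "src/app.py").
    """
    counts = Counter(history)
    return {call: n for call, n in counts.items() if n >= threshold}
```

Each entry in the result is a candidate for the "Cache it" fix. The other patterns, such as blind retries, need more context than call counts alone.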

Skill Performance Evaluation

Evaluate skills used during the task:

## Skills Used

| Skill | Used? | Effective? | Notes |
|-------|-------|------------|-------|
| task-decomposer | Yes/No | Good/Fair/Poor | ... |
| checkpoint-guardian | Yes/No | Good/Fair/Poor | ... |
| assumption-checker | Yes/No | Good/Fair/Poor | ... |
| tool-selector | Yes/No | Good/Fair/Poor | ... |
| parallel-planner | Yes/No | Good/Fair/Poor | ... |
| error-recovery | Yes/No | Good/Fair/Poor | ... |
| memory-ledger | Yes/No | Good/Fair/Poor | ... |
| output-critic | Yes/No | Good/Fair/Poor | ... |

Missing / untriggered skills and why?

When to Skip

- Task was single-step or under 5 minutes
- Prototype / experimental task
- User said "no retrospective needed"
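
The skip conditions above can be read as a simple predicate. In this sketch, the `TaskInfo` fields and the `should_skip_retrospective` name are hypothetical. How an agent would actually measure step count or duration is not specified by the skill.

```python
from dataclasses import dataclass


@dataclass
class TaskInfo:
    """Hypothetical summary of a finished task."""
    step_count: int
    duration_minutes: float
    is_prototype: bool = False
    user_declined: bool = False  # user said "no retrospective needed"


def should_skip_retrospective(task):
    """True if any of the listed skip conditions applies."""
    return (
        task.step_count <= 1
        or task.duration_minutes < 5
        or task.is_prototype
        or task.user_declined
    )
```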

Guardrails

- Be honest, not kind: the value is in finding problems, not hiding them.
- Concrete suggestions only: "do better" is useless; "cache file reads to avoid 3 redundant calls" is actionable.
- Cross-skill: this is the ecosystem's feedback loop, so findings here should update other skills and processes.

After an agentic task completes, perform a retrospective analysis across 6 dimensions (goal alignment, efficiency, decision quality, error handling, communication, reusability). Score performance, identify inefficiency patterns, evaluate skill usage, and produce actionable improvement recommendations. Triggers on 'how did it go', 'retrospective', 'review performance', 'what could be better', or after any long agentic task completes.

4 stars

development

Updated Apr 13, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add fatih-developer/fth-skills agent-reviewer

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Reported % |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 13, 2026, 4:45 AM (69.2s, 4 files scanned)


Install manually

# Clone the repo
git clone https://github.com/fatih-developer/fth-skills.git

# Copy into Claude Code skills folder (global)
cp -r fth-skills/skills/agent-reviewer ~/.claude/skills/

Repository: fatih-developer/fth-skills (4 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT