
Write, edit, review, and validate AgentV EVAL.yaml / .eval.yaml evaluation files. Use when asked to create new eval files, update or fix existing ones, add or remove test cases, configure graders (`llm-grader`, `code-grader`, `rubrics`), review whether an eval is correct or complete, convert between EVAL.yaml and evals.json using `agentv convert`, or generate eval test cases from chat transcripts (markdown conversation or JSON messages). Do NOT use for creating SKILL.md files, writing skill definitions, or running evals — running and benchmarking belongs to agentv-bench.
Use when the user asks about deploying services, checking deployment status, rollback procedures, or release management at Acme Corp.
Use when the user asks to analyze CSV data. MUST use the weighted revenue formula defined below.
Use when designing an AI agent system, selecting agentic design patterns, planning multi-phase workflows, choosing between single-agent and multi-agent architectures, or when asked "what kind of agent should I build", "how should I structure this automation", "design an agent for X", or "which agentic pattern fits this problem".
Use when reviewing eval YAML files for quality issues, linting eval files before committing, checking eval schema compliance, or when asked to "review these evals", "check eval quality", "lint eval files", or "validate eval structure". Do NOT use for writing evals (use agentv-eval-writer) or running evals (use agentv-bench).
Bootstrap AgentV in the current workspace after plugin-manager install. Ensures CLI availability, runs workspace init, and verifies setup artifacts.
Use when the user asks to analyze, summarize, or extract insights from CSV data or files.
This skill should be used when asked to "execute a deployment", "run the deploy plan", or "deploy services". Reads deploy-plan.md and executes each step with health checks.
This skill should be used when asked to "plan a deployment", "create a deploy plan", or "prepare release steps". Produces a deployment plan with rollback strategy.
This skill should be used when asked to "rollback a deployment", "revert services", or "undo deploy". Reads deploy-plan.md and reverses completed steps.
Analyze AgentV evaluation traces and result JSONL files using `agentv inspect` and `agentv compare` CLI commands. Use when asked to inspect AgentV eval results, find regressions between AgentV evaluation runs, identify failure patterns in AgentV trace data, analyze tool trajectories, or compute cost/latency/score statistics from AgentV result files. Do NOT use for benchmarking skill trigger accuracy, analyzing skill-creator eval performance, or measuring skill description quality — those tasks belong to the skill-creator skill.
Capture, optimize, and publish screenshots to Astro docs. Use when asked to take screenshots for docs, update doc images, compress PNG assets, or add visual documentation to the agentv.dev docs site. Triggers on "add screenshots to docs", "update docs images", "compress screenshots", "optimize PNG", "document with screenshots".
Run AgentV evaluations and optimize agents through eval-driven iteration. Triggers: run evals, benchmark agents, optimize prompts/skills against evals, compare agent outputs across providers, analyze eval results, offline evaluation of recorded sessions, run autoresearch, optimize unattended, run overnight optimization loop. Not for: writing/editing eval YAML without running (use agentv-eval-writer), analyzing existing traces/JSONL without re-running (use agentv-trace-analyst).
Author, edit, and lint `governance:` blocks in `*.eval.yaml` files. Use when creating or updating evaluation suites that carry AI-governance metadata (OWASP LLM Top 10, OWASP Agentic Top 10, MITRE ATLAS, EU AI Act, ISO 42001). Also use non-interactively (e.g., from a GitHub Action) to lint changed eval files and report violations against the rules in `references/lint-rules.md`. Do NOT use for running evals or benchmarking — that belongs to agentv-bench.
AgentV CLI skills for evaluating, optimizing, and governing AI agents. Triggers: run evals, benchmark agents, write evals, review evals, analyze traces, optimize prompts, governance linting. Covers: eval running, eval writing, eval review, trace analysis, description optimization, autoresearch, and governance compliance.
Use when reviewing an AI plugin pull request, auditing plugin quality before release, or when asked to "review a plugin PR", "review skills in this PR", "check plugin quality", or "review workflow architecture". Covers skill quality, structural linting, and workflow architecture review.