agent-orchestration-improve-agent/SKILL.md
Systematic improvement of existing agents through performance analysis, prompt engineering, and continuous iteration.
[Extended thinking: Agent optimization requires a data-driven approach combining performance metrics, user feedback analysis, and advanced prompt engineering techniques. Success depends on systematic evaluation, targeted improvements, and rigorous testing with rollback capabilities for production safety.]
Analyze agent performance comprehensively, using context-manager to collect historical data.
Use: context-manager
Command: analyze-agent-performance $ARGUMENTS --days 30
Collect metrics including:
Identify recurring patterns in user interactions:
Categorize failures by root cause:
Generate quantitative baseline metrics:
Performance Baseline:
- Task Success Rate: [X%]
- Average Corrections per Task: [Y]
- Tool Call Efficiency: [Z%]
- User Satisfaction Score: [1-10]
- Average Response Latency: [Xms]
- Token Efficiency Ratio: [X:Y]
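The baseline above could be computed from logged interactions roughly as follows. This is a minimal sketch, not the context-manager's actual output format: the `Interaction` record and all of its field names are hypothetical, and "tool call efficiency" is assumed here to mean the share of tool calls that contributed to the result.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """One logged task. All field names are illustrative assumptions."""
    succeeded: bool
    corrections: int        # user corrections needed during the task
    tool_calls: int         # total tool calls made
    useful_tool_calls: int  # tool calls that contributed to the result
    latency_ms: float


def compute_baseline(interactions: list[Interaction]) -> dict[str, float]:
    """Aggregate per-task logs into a subset of the baseline metrics."""
    if not interactions:
        raise ValueError("need at least one interaction to build a baseline")
    n = len(interactions)
    total_calls = sum(i.tool_calls for i in interactions)
    useful_calls = sum(i.useful_tool_calls for i in interactions)
    return {
        "task_success_rate_pct": 100 * sum(i.succeeded for i in interactions) / n,
        "avg_corrections_per_task": sum(i.corrections for i in interactions) / n,
        # Assumed definition: useful tool calls divided by all tool calls.
        "tool_call_efficiency_pct": (
            100 * useful_calls / total_calls if total_calls else 0.0
        ),
        "avg_response_latency_ms": sum(i.latency_ms for i in interactions) / n,
    }
```

User satisfaction and the token efficiency ratio are left out because their definitions depend on how feedback and token counts are recorded.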
Apply advanced prompt optimization techniques using the prompt-engineer agent.
Implement structured reasoning patterns:
Use: prompt-engineer
Technique: chain-of-thought-optimization
Curate high-quality examples from successful interactions:
Example structure:
Good Example:
Input: [User request]
Reasoning: [Step-by-step thought process]
Output: [Successful response]
Why this works: [Key success factors]
Bad Example:
Input: [Similar request]
Output: [Failed response]
Why this fails: [Specific issues]
Correct approach: [Fixed version]
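One way to keep curated examples consistent is to store them as structured records and render them into the Good/Bad layout shown above. The `FewShotExample` class and `render_example` function are hypothetical names for illustration, not part of the prompt-engineer agent.

```python
from dataclasses import dataclass


@dataclass
class FewShotExample:
    """A curated example mirroring the Good/Bad structure above."""
    user_input: str
    output: str
    explanation: str            # "Why this works" or "Why this fails"
    reasoning: str = ""         # used only for good examples
    correct_approach: str = ""  # used only for bad examples
    is_good: bool = True


def render_example(ex: FewShotExample) -> str:
    """Format one example as a plain-text block for inclusion in a prompt."""
    if ex.is_good:
        lines = [
            "Good Example:",
            f"Input: {ex.user_input}",
            f"Reasoning: {ex.reasoning}",
            f"Output: {ex.output}",
            f"Why this works: {ex.explanation}",
        ]
    else:
        lines = [
            "Bad Example:",
            f"Input: {ex.user_input}",
            f"Output: {ex.output}",
            f"Why this fails: {ex.explanation}",
            f"Correct approach: {ex.correct_approach}",
        ]
    return "\n".join(lines)
```

Keeping examples as data rather than hand-edited text makes it easier to swap them in and out between prompt versions.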
Strengthen agent identity and capabilities:
Implement self-correction mechanisms:
Constitutional Principles:
1. Verify factual accuracy before responding
2. Self-check for potential biases or harmful content
3. Validate output format matches requirements
4. Ensure response completeness
5. Maintain consistency with previous responses
Add critique-and-revise loops:
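A critique-and-revise loop built on the principles above might look like the sketch below. The `critique` and `revise` callables are assumptions: in practice each would wrap a model call, and their signatures here are invented for illustration.

```python
from typing import Callable

PRINCIPLES = [
    "Verify factual accuracy before responding",
    "Self-check for potential biases or harmful content",
    "Validate output format matches requirements",
    "Ensure response completeness",
    "Maintain consistency with previous responses",
]

# critique(draft, principles) returns the list of violated principles.
Critique = Callable[[str, list[str]], list[str]]
# revise(draft, issues) returns an improved draft.
Revise = Callable[[str, list[str]], str]


def critique_and_revise(
    draft: str, critique: Critique, revise: Revise, max_rounds: int = 3
) -> str:
    """Revise the draft until no principle is violated or rounds run out."""
    for _ in range(max_rounds):
        issues = critique(draft, PRINCIPLES)
        if not issues:
            break
        draft = revise(draft, issues)
    return draft
```

The `max_rounds` cap is a design choice to bound latency and cost; without it, a critic that never approves would loop forever.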
Optimize response structure:
Comprehensive testing framework with A/B comparison.
Create representative test scenarios:
Test Categories:
1. Golden path scenarios (common successful cases)
2. Previously failed tasks (regression testing)
3. Edge cases and corner scenarios
4. Stress tests (complex, multi-step tasks)
5. Adversarial inputs (potential breaking points)
6. Cross-domain tasks (combining capabilities)
Compare original vs improved agent:
Use: parallel-test-runner
Config:
- Agent A: Original version
- Agent B: Improved version
- Test set: 100 representative tasks
- Metrics: Success rate, speed, token usage
- Evaluation: Blind human review + automated scoring
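The comparison above can be approximated without any particular runner by executing both agent versions over the same task list. This sketch does not use the parallel-test-runner interface, which is not described here; `RunResult`, `evaluate`, and `ab_compare` are hypothetical names, and an agent is assumed to be a plain callable that takes a task string.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunResult:
    success: bool
    tokens_used: int


Agent = Callable[[str], RunResult]


def evaluate(agent: Agent, tasks: list[str]) -> dict[str, float]:
    """Run one agent over every task and summarize the three config metrics."""
    if not tasks:
        raise ValueError("test set must not be empty")
    successes = 0
    tokens = 0
    start = time.perf_counter()
    for task in tasks:
        result = agent(task)
        successes += result.success
        tokens += result.tokens_used
    elapsed = time.perf_counter() - start
    return {
        "success_rate": successes / len(tasks),
        "avg_seconds_per_task": elapsed / len(tasks),
        "avg_tokens_per_task": tokens / len(tasks),
    }


def ab_compare(agent_a: Agent, agent_b: Agent, tasks: list[str]) -> dict[str, dict[str, float]]:
    """Evaluate the original (A) and improved (B) agents on identical tasks."""
    return {"A": evaluate(agent_a, tasks), "B": evaluate(agent_b, tasks)}
```

This runs the agents sequentially for simplicity and covers only the automated metrics; blind human review would be a separate step.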
Statistical significance testing:
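One common choice for comparing two success rates is a two-proportion z-test, sketched below with only the standard library. The source does not say which test is intended, so treat this as one reasonable option rather than the prescribed method.

```python
import math


def two_proportion_z_test(
    successes_a: int, n_a: int, successes_b: int, n_b: int
) -> tuple[float, float]:
    """Return (z, two-sided p-value) for H0: both agents share one success rate."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled success rate under the null hypothesis.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # every run passed, or every run failed, in both groups
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, 70 of 100 successes for the original agent against 85 of 100 for the improved one gives a z-score of about 2.54 and a p-value of about 0.011. Because both agents run on the same tasks, a paired test such as McNemar's may be more appropriate when per-task outcomes are available.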
Comprehensive scoring framework:
Task-Level Metrics:
Quality Metrics:
Performance Metrics:
Structured human review process:
Safe rollout with monitoring and rollback capabilities.
Systematic versioning strategy:
Version Format: agent-name-v[MAJOR].[MINOR].[PATCH]
Example: customer-support-v2.3.1
MAJOR: Significant capability changes
MINOR: Prompt improvements, new examples
PATCH: Bug fixes, minor adjustments
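The version format above is simple enough to parse and bump programmatically. In this sketch, `AgentVersion`, `parse_version`, and `bump` are hypothetical helpers, and the assumption that agent names use only lowercase letters, digits, and hyphens is taken from the single example given.

```python
import re
from typing import NamedTuple

# Assumes names like "customer-support": lowercase letters, digits, hyphens.
_VERSION_RE = re.compile(
    r"^(?P<name>[a-z0-9-]+)-v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$"
)


class AgentVersion(NamedTuple):
    name: str
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.name}-v{self.major}.{self.minor}.{self.patch}"


def parse_version(tag: str) -> AgentVersion:
    """Parse a tag such as 'customer-support-v2.3.1'."""
    match = _VERSION_RE.match(tag)
    if match is None:
        raise ValueError(f"not a valid agent version tag: {tag!r}")
    return AgentVersion(
        match["name"], int(match["major"]), int(match["minor"]), int(match["patch"])
    )


def bump(version: AgentVersion, part: str) -> AgentVersion:
    """Increment one component and reset the lower-order components to zero."""
    if part == "major":
        return version._replace(major=version.major + 1, minor=0, patch=0)
    if part == "minor":
        return version._replace(minor=version.minor + 1, patch=0)
    if part == "patch":
        return version._replace(patch=version.patch + 1)
    raise ValueError(f"unknown version part: {part!r}")
```

Adding new examples to `customer-support-v2.3.1` would then be `bump(parse_version("customer-support-v2.3.1"), "minor")`, which renders as `customer-support-v2.4.0`.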
Maintain version history:
Progressive deployment strategy:
Quick recovery mechanism:
Rollback Triggers:
- Success rate drops >10% from baseline
- Critical errors increase >5%
- User complaints spike
- Cost per task increases >20%
- Safety violations detected
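The triggers above can be turned into an automated check. The source does not say whether the thresholds are relative or absolute, so this sketch assumes the success-rate and critical-error thresholds are percentage points and the cost threshold is a relative increase. The `Snapshot` fields are hypothetical, and "complaints spike" is taken as a flag set by some external detector.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Monitoring summary for one agent version. Field names are illustrative."""
    success_rate_pct: float
    critical_error_rate_pct: float
    cost_per_task: float
    complaint_spike: bool = False
    safety_violations: int = 0


def rollback_reasons(baseline: Snapshot, current: Snapshot) -> list[str]:
    """Return every trigger that fired; an empty list means no rollback."""
    reasons = []
    # Assumption: ">10%" means more than 10 percentage points.
    if baseline.success_rate_pct - current.success_rate_pct > 10:
        reasons.append("success rate dropped more than 10 points")
    # Assumption: ">5%" means more than 5 percentage points.
    if current.critical_error_rate_pct - baseline.critical_error_rate_pct > 5:
        reasons.append("critical errors rose more than 5 points")
    if current.complaint_spike:
        reasons.append("user complaints spiked")
    # Assumption: ">20%" is relative to the baseline cost.
    if baseline.cost_per_task > 0 and current.cost_per_task > 1.2 * baseline.cost_per_task:
        reasons.append("cost per task rose more than 20%")
    if current.safety_violations > 0:
        reasons.append("safety violations detected")
    return reasons
```

Returning the list of reasons rather than a bare boolean gives the alert in the rollback process something concrete to report.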
Rollback Process:
1. Detect issue via monitoring
2. Alert team immediately
3. Switch to previous stable version
4. Analyze root cause
5. Fix and re-test before retry
Real-time performance tracking:
Agent improvement is successful when:
After 30 days of production use:
Establish regular improvement cadence:
Remember: Agent optimization is an iterative process. Each cycle builds upon previous learnings, gradually improving performance while maintaining stability and safety.