stevengonsalvez/autonomous-loops

Name: autonomous-loops
Author: stevengonsalvez

toolkit/packages/skills/autonomous-loops/SKILL.md

npx skillsauth add stevengonsalvez/agents-in-a-box autonomous-loops

Autonomous Loop Patterns

Core Principle: Reviewer Never Authored

The agent that reviews work must never be the agent that authored it.

This is the single most important principle for autonomous quality. Self-review is unreliable -- the same blind spots that caused the error will miss it during review.

Implementation:

Use a separate agent instance (different subagent_type or name) for review
The reviewer receives only the output + acceptance criteria, not the generation prompt
Reviewer can request changes but never edits directly -- sends feedback to the author
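
The three implementation rules above can be sketched as a small hand-off function. This is a minimal illustration, not the skill's own code: `Agent`, `ReviewRequest`, `request_review`, and the toy reviewer are hypothetical names, and the `handler` callable stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ReviewRequest:
    """What the reviewer sees: the output and the criteria, never the prompt."""
    output: str
    acceptance_criteria: List[str]


@dataclass
class Agent:
    name: str          # hypothetical identity used to tell author and reviewer apart
    handler: Callable  # stand-in for a real model call


def request_review(author: Agent, reviewer: Agent, output: str,
                   criteria: List[str]) -> List[str]:
    """Return reviewer feedback for the author; the reviewer never edits output."""
    if reviewer.name == author.name:
        raise ValueError("reviewer must not be the authoring agent")
    # Only output + criteria cross the boundary; the generation prompt stays behind.
    return reviewer.handler(ReviewRequest(output, criteria))


def toy_review(req: ReviewRequest) -> List[str]:
    """Toy reviewer: flag every criterion whose text is missing from the output."""
    return [f"missing: {c}" for c in req.acceptance_criteria if c not in req.output]


author = Agent("writer", handler=lambda prompt: "handles timeout")
reviewer = Agent("critic", handler=toy_review)
feedback = request_review(author, reviewer, "handles timeout", ["timeout", "retry"])
print(feedback)  # ['missing: retry']
```

The feedback list goes back to the author agent, which applies the changes itself.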

Pattern 1: Generate -> Validate -> Fix

The most common autonomous loop. Generate output, validate against criteria, fix if needed.

+----------+     +----------+  Fail  +----------+
| Generate |---->| Validate |------->|   Fix    |
|          |     |          |<-------|          |
+----------+     +----+-----+ Retry  +----------+
                      | Pass       (max 3 iterations)
                      v
                 +----------+
                 |  Accept  |
                 +----------+

When to use: Code generation, document creation, configuration authoring

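MAX_ITERATIONS = 3

# generate, fix, validate, accept, and escalate_to_human are placeholders
# for whatever the surrounding agent framework provides.
def run_loop(prompt, context, acceptance_criteria):
    output, errors = None, []
    for iteration in range(MAX_ITERATIONS):
        if iteration == 0:
            # First pass: produce the initial draft
            output = generate(prompt, context)
        else:
            # Later passes: repair the draft using the previous validation errors
            output = fix(output, errors, context)

        is_valid, errors = validate(output, acceptance_criteria)

        if is_valid:
            return accept(output)

    # Iteration cap reached without a passing result
    return escalate_to_human(output, errors)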

Guard rails:

Hard cap on iterations (3 is typical, never exceed 5)
Each iteration must reduce error count -- if errors increase, break
Track token cost per iteration -- escalate if cost exceeds threshold
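
A minimal sketch of how these guard rails might be enforced, assuming each step reports the tokens it used. `guarded_loop`, `token_budget`, and the toy callables are hypothetical. The sketch stops whenever the error count fails to drop, which is a slightly stricter reading of the second rule.

```python
from typing import Callable, List, Tuple

MAX_ITERATIONS = 3  # matches the pattern above; the guard rail says never exceed 5


def guarded_loop(generate: Callable[[], Tuple[str, int]],
                 fix: Callable[[str, List[str]], Tuple[str, int]],
                 validate: Callable[[str], List[str]],
                 token_budget: int) -> Tuple[str, str, List[str]]:
    """Return (status, output, errors); status is 'accepted' or 'escalated'."""
    output, spent = generate()           # each step returns (output, tokens used)
    errors = validate(output)
    for _ in range(MAX_ITERATIONS - 1):  # one generate pass plus up to two fixes
        if not errors:
            return "accepted", output, errors
        if spent > token_budget:
            break                        # cost exceeded the threshold
        output, used = fix(output, errors)
        spent += used
        new_errors = validate(output)
        if len(new_errors) >= len(errors):
            errors = new_errors
            break                        # error count did not drop
        errors = new_errors
    status = "accepted" if not errors else "escalated"
    return status, output, errors


# Toy run: the "output" is a string and every remaining "x" counts as an error.
def toy_validate(text: str) -> List[str]:
    return ["stray x"] * text.count("x")


status, output, errors = guarded_loop(
    generate=lambda: ("axbx", 100),
    fix=lambda text, errs: (text.replace("x", "", 1), 50),
    validate=toy_validate,
    token_budget=1000,
)
print(status, output)  # accepted ab
```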

Pattern 2: Explore -> Hypothesize -> Test

For debugging and investigation. Gather evidence, form theory, validate.

+----------+     +-------------+     +----------+
| Explore  |---->| Hypothesize |---->|   Test   |--+
| (gather  |     | (form       |     | (verify  |  |
|  evidence)|    |  theory)    |     |  theory) |  |
+----------+     +-------------+     +----+-----+  |
                                          | Fail   |
                                          v        |
                                     +----------+  |
                                     | Refine   |--+
                                     | hypothesis|
                                     +----------+

When to use: Bug investigation, root cause analysis, codebase exploration

Guard rails:

Track hypotheses tested to avoid circular reasoning
Max 5 hypotheses before requesting human input
Evidence must be concrete (file:line references, error messages)
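
One way to encode these guard rails, as a sketch under stated assumptions: `Hypothesis`, `investigate`, and the file references in the example are hypothetical, and the `test` callable stands in for whatever experiment confirms or refutes a theory.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Set, Tuple

MAX_HYPOTHESES = 5  # cap from the guard rails above


@dataclass(frozen=True)
class Hypothesis:
    claim: str
    evidence: Tuple[str, ...]  # concrete references: "file:line" or an error message


def investigate(candidates: Iterable[Hypothesis],
                test: Callable[[Hypothesis], bool]) -> Optional[Hypothesis]:
    """Return the first confirmed hypothesis, or None to signal a human handoff."""
    tested: Set[str] = set()
    for hypothesis in candidates:
        if len(tested) >= MAX_HYPOTHESES:
            break                  # cap reached: request human input
        if hypothesis.claim in tested:
            continue               # already tried: avoids circular reasoning
        if not hypothesis.evidence:
            continue               # reject theories without concrete evidence
        tested.add(hypothesis.claim)
        if test(hypothesis):
            return hypothesis
    return None


candidates = [
    Hypothesis("cache is stale", ("cache.py:88",)),
    Hypothesis("cache is stale", ("cache.py:88",)),   # duplicate, skipped
    Hypothesis("something feels off", ()),             # no evidence, skipped
    Hypothesis("pool exhausted", ("TimeoutError: pool limit reached",)),
]
found = investigate(candidates, test=lambda h: "pool" in h.claim)
print(found.claim)  # pool exhausted
```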

Pattern 3: Plan -> Execute -> Verify -> Adjust

For multi-step implementation tasks.

+----------+     +----------+     +----------+     +----------+
|   Plan   |---->| Execute  |---->|  Verify  |---->|  Adjust  |--+
| (steps)  |     | (step N) |     | (tests)  |     | (plan)   |  |
+----------+     +----------+     +----------+     +----------+  |
     ^                                                            |
     +------------------------------------------------------------+

When to use: Feature implementation, refactoring, migration tasks

Guard rails:

Plan must be approved before execution starts
Verify after EACH step, not just at the end
Adjustment can only modify future steps, never rewrite completed ones
If >50% of plan needs adjustment, re-plan from scratch
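
The guard rails above could be wired together roughly as follows. This is an illustrative sketch: `run_plan` and its callables are hypothetical, and measuring "how much of the plan changed" as the number of added or removed future steps is an assumption, since the text does not define the measure.

```python
from typing import Callable, List

REPLAN_THRESHOLD = 0.5  # ">50% of plan needs adjustment" triggers a re-plan


def run_plan(plan: List[str], approved: bool,
             execute: Callable[[str], str],
             verify: Callable[[str, str], bool],
             adjust: Callable[[List[str], str], List[str]]) -> List[str]:
    """Run an approved plan step by step; return the steps executed, in order."""
    if not approved:
        raise PermissionError("plan must be approved before execution starts")
    executed: List[str] = []
    remaining = list(plan)
    changed = 0
    while remaining:
        step = remaining.pop(0)
        result = execute(step)
        executed.append(step)
        if verify(step, result):           # verify after EACH step
            continue
        # adjust only ever sees future steps, so it cannot rewrite executed ones
        adjusted = list(adjust(list(remaining), step))
        changed += len(set(adjusted) ^ set(remaining))  # steps added or removed
        if changed > REPLAN_THRESHOLD * len(plan):
            raise RuntimeError("more than half the plan changed; re-plan from scratch")
        remaining = adjusted
    return executed


log = run_plan(
    plan=["add column", "backfill", "drop old column"],
    approved=True,
    execute=lambda step: "ok",
    verify=lambda step, result: step != "backfill",   # pretend backfill fails its tests
    adjust=lambda future, failed: [f"repair {failed}"] + future,
)
print(log)  # ['add column', 'backfill', 'repair backfill', 'drop old column']
```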

Pattern 4: Diverge -> Converge -> Select

For creative or design tasks where multiple approaches are valid.

+------------+     +------------+     +----------+
|  Diverge   |---->|  Converge  |---->|  Select  |
| (generate  |     | (evaluate  |     | (pick    |
|  N options)|     |  trade-offs)|    |  best)   |
+------------+     +------------+     +----------+

When to use: Architecture decisions, API design, UI alternatives

Guard rails:

Generate minimum 3 options (avoids false dichotomies)
Evaluation criteria defined BEFORE divergence (prevents bias)
Selection must reference criteria -- no "gut feeling"
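
As a sketch of criteria-first selection, the example below scores options with a weighted sum. The weighted sum is an assumption chosen for illustration, since the text does not prescribe a scoring method. `select_option`, the criteria, and the scores are all hypothetical.

```python
from typing import Dict, List, Tuple

MIN_OPTIONS = 3  # fewer than three options risks a false dichotomy


def select_option(criteria: Dict[str, float],
                  options: Dict[str, Dict[str, float]]) -> Tuple[str, List[str]]:
    """Pick the highest weighted score and return it with a criteria-based rationale.

    criteria maps criterion -> weight and is fixed BEFORE options are generated;
    options maps option name -> {criterion: score}.
    """
    if len(options) < MIN_OPTIONS:
        raise ValueError(f"need at least {MIN_OPTIONS} options, got {len(options)}")
    totals = {
        name: sum(weight * scores.get(criterion, 0.0)
                  for criterion, weight in criteria.items())
        for name, scores in options.items()
    }
    best = max(totals, key=totals.get)
    # Every line of the rationale cites a criterion, so there is no "gut feeling".
    rationale = [f"{c}: weight {w} x score {options[best].get(c, 0.0)}"
                 for c, w in criteria.items()]
    return best, rationale


criteria = {"latency": 0.5, "cost": 0.3, "simplicity": 0.2}   # defined first
options = {
    "REST":    {"latency": 3, "cost": 4, "simplicity": 5},
    "gRPC":    {"latency": 5, "cost": 3, "simplicity": 3},
    "GraphQL": {"latency": 3, "cost": 3, "simplicity": 2},
}
best, why = select_option(criteria, options)
print(best)  # gRPC
```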

Pattern 5: Seed -> Expand -> Prune

For building up content or code incrementally.

+----------+     +----------+     +----------+
|   Seed   |---->|  Expand  |---->|  Prune   |--+
| (minimal |     | (add     |     | (remove  |  |
|  version)|     |  features)|    |  bloat)  |  |
+----------+     +----------+     +----------+  |
                      ^                          |
                      +--------------------------+
                      (until scope complete)

When to use: MVP development, documentation, test suite building

Guard rails:

Seed must be complete and working before expansion
Each expansion adds ONE feature/section
Prune after every 3 expansions
Prune agent is separate from expand agent (reviewer-never-authored)
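
A minimal sketch of the loop with these guard rails, assuming the artifact can be modeled as a list of parts. `grow`, `works`, and `prune` are hypothetical names; in practice `prune` would call a separate agent from the one doing the expansion.

```python
from typing import Callable, List

PRUNE_EVERY = 3  # "Prune after every 3 expansions"


def grow(seed: List[str], backlog: List[str],
         works: Callable[[List[str]], bool],
         prune: Callable[[List[str]], List[str]]) -> List[str]:
    """Expand a working seed one item at a time, pruning every third expansion."""
    if not works(seed):
        raise ValueError("seed must be complete and working before expansion")
    artifact = list(seed)
    for count, feature in enumerate(backlog, start=1):
        artifact.append(feature)          # exactly ONE feature per expansion
        if count % PRUNE_EVERY == 0:
            artifact = prune(artifact)    # handled by a separate callable/agent
    return artifact


result = grow(
    seed=["login"],
    backlog=["signup", "debug banner", "password reset", "profile"],
    works=lambda parts: len(parts) > 0,
    prune=lambda parts: [p for p in parts if not p.startswith("debug")],
)
print(result)  # ['login', 'signup', 'password reset', 'profile']
```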

Pattern 6: Observe -> Orient -> Decide -> Act (OODA)

For reactive, event-driven agent workflows.

+----------+     +----------+     +----------+     +----------+
| Observe  |---->|  Orient  |---->|  Decide  |---->|   Act    |
| (monitor |     | (analyze |     | (choose  |     | (execute |
|  events) |     |  context)|     |  action) |     |  action) |
+----------+     +----------+     +----------+     +----------+
     ^                                                    |
     +----------------------------------------------------+

When to use: Monitoring, incident response, CI/CD automation

Guard rails:

Observation must be fresh (re-check state before acting)
Orientation must include context from previous loops
Decision must be logged for audit trail
Action must be reversible or confirmed
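
One pass of the loop, with each guard rail made explicit, might look like the sketch below. Everything here is hypothetical scaffolding (`ooda_cycle`, the `reversible` set, the `confirm` callback, the queue example), not the skill's own API.

```python
from typing import Callable, Dict, List, Set


def ooda_cycle(read_state: Callable[[], Dict[str, int]],
               decide: Callable[[Dict[str, int], List[str]], str],
               actions: Dict[str, Callable[[], None]],
               reversible: Set[str],
               confirm: Callable[[str], bool],
               history: List[str]) -> str:
    """Run one Observe -> Orient -> Decide -> Act pass and report what happened."""
    observed = read_state()                        # Observe
    decision = decide(observed, history)           # Orient + Decide, with prior context
    history.append(f"decided {decision} on {observed}")   # audit trail
    if decision == "noop":
        return "noop"
    if read_state() != observed:                   # re-check state before acting
        history.append("state changed before acting; skipped")
        return "stale"
    if decision not in reversible and not confirm(decision):
        history.append(f"{decision} not confirmed; skipped")
        return "unconfirmed"
    actions[decision]()                            # Act
    return decision


state = {"queue_depth": 120}
history: List[str] = []
outcome = ooda_cycle(
    read_state=lambda: dict(state),
    decide=lambda s, h: "scale_up" if s["queue_depth"] > 100 else "noop",
    actions={"scale_up": lambda: state.update(workers=4)},
    reversible={"scale_up"},
    confirm=lambda action: False,
    history=history,
)
print(outcome, state)  # scale_up {'queue_depth': 120, 'workers': 4}
```

Passing the same `history` list into the next cycle gives the Orient step its context from previous loops.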

Applying Patterns

Choosing the Right Pattern

| Task Type | Recommended Pattern |
|-----------|-------------------|
| Code generation / editing | Generate -> Validate -> Fix |
| Bug investigation | Explore -> Hypothesize -> Test |
| Feature implementation | Plan -> Execute -> Verify -> Adjust |
| Architecture / design | Diverge -> Converge -> Select |
| Incremental building | Seed -> Expand -> Prune |
| Monitoring / ops | OODA |

Combining Patterns

Patterns can be nested. For example:

Plan -> Execute where each Execute step uses Generate -> Validate -> Fix
Diverge -> Converge where each option is built with Seed -> Expand -> Prune
OODA where the Act phase uses Plan -> Execute -> Verify -> Adjust
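
The first nesting above (each Execute step running its own Generate -> Validate -> Fix loop) can be sketched as an outer function calling an inner one. The function names and toy callables are hypothetical, and plan approval and adjustment are left out to keep the nesting visible.

```python
from typing import Callable, Dict, List, Optional

MAX_ITERATIONS = 3


def generate_validate_fix(task: str,
                          generate: Callable[[str], str],
                          validate: Callable[[str], List[str]],
                          fix: Callable[[str, List[str]], str]) -> Optional[str]:
    """Inner loop: return a valid output, or None when the cap is exhausted."""
    output = generate(task)
    for attempt in range(MAX_ITERATIONS):
        errors = validate(output)
        if not errors:
            return output
        if attempt < MAX_ITERATIONS - 1:
            output = fix(output, errors)
    return None


def execute_plan(steps: List[str],
                 generate: Callable[[str], str],
                 validate: Callable[[str], List[str]],
                 fix: Callable[[str, List[str]], str]) -> Dict[str, str]:
    """Outer loop: every plan step is executed by its own inner loop."""
    results: Dict[str, str] = {}
    for step in steps:
        output = generate_validate_fix(step, generate, validate, fix)
        if output is None:
            raise RuntimeError(f"step {step!r} exhausted its iterations; escalate")
        results[step] = output
    return results


results = execute_plan(
    ["parse input", "write report"],
    generate=lambda step: step.upper() + "!",
    validate=lambda out: ["trailing !"] if out.endswith("!") else [],
    fix=lambda out, errs: out.rstrip("!"),
)
print(results)  # {'parse input': 'PARSE INPUT', 'write report': 'WRITE REPORT'}
```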

Universal Guard Rails

Apply these to ALL patterns:

Max iterations: Every loop has a hard cap (typically 3-5)
Cost tracking: Monitor token spend per iteration
Progress check: Each iteration must demonstrably advance toward the goal
Escalation path: Clear handoff to human when loop exhausts iterations
Audit trail: Log each iteration's input, output, and decision
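
These five rules are generic enough to live in one reusable wrapper. The sketch below is an assumption about how that might look: `LoopGuard`, the `token_budget` default, and the numeric progress score are hypothetical, and a real audit log would also record each iteration's input and output.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class LoopGuard:
    max_iterations: int = 3       # hard cap (the text suggests 3-5)
    token_budget: int = 10_000    # hypothetical threshold; tune per task
    audit_log: List[dict] = field(default_factory=list)

    def run(self, step: Callable[[int], Tuple[float, int, bool]]) -> str:
        """step(i) returns (progress score, tokens used, done flag)."""
        spent, last_progress = 0, float("-inf")
        for i in range(self.max_iterations):
            progress, tokens, done = step(i)
            spent += tokens
            if done:
                decision = "done"
            elif progress <= last_progress:
                decision = "escalate: no progress"
            elif spent > self.token_budget:
                decision = "escalate: over budget"
            else:
                decision = "continue"
            # Audit trail: one record per iteration, including the decision taken
            self.audit_log.append({"iteration": i, "progress": progress,
                                   "tokens_spent": spent, "decision": decision})
            if decision != "continue":
                return decision
            last_progress = progress
        return "escalate: iterations exhausted"


guard = LoopGuard(max_iterations=4)
scores = [0.2, 0.5, 0.5, 0.9]     # the third iteration stalls
result = guard.run(lambda i: (scores[i], 100, False))
print(result)                # escalate: no progress
print(len(guard.audit_log))  # 3
```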

Six proven autonomous agent loop patterns with guard rails. Provides reusable patterns for generate->validate->fix, explore->hypothesize->test, and other autonomous workflows. Includes the reviewer-never-authored principle for quality assurance. Use when: (1) Building autonomous agent workflows, (2) Designing self-correcting pipelines, (3) Implementing agent retry/fix loops, (4) Setting up multi-agent review processes, (5) User asks about agent loop patterns.

9 stars

development

Updated Apr 19, 2026

npx skillsauth add stevengonsalvez/agents-in-a-box autonomous-loops

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners in report

| Scanner | Description | Status | % |
|---------|-------------|--------|---|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 19, 2026, 2:32 AM (7.6s, 1 file scanned)

Install manually

# Clone the repo
git clone https://github.com/stevengonsalvez/agents-in-a-box.git

# Copy into Claude Code skills folder (global)
cp -r agents-in-a-box/toolkit/packages/skills/autonomous-loops ~/.claude/skills/

Repository

stevengonsalvez/agents-in-a-box

9 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT