hexbee/multi-agent-systems

Name: multi-agent-systems
Author: hexbee

skills/multi-agent-systems/SKILL.md

npx skillsauth add hexbee/hello-skills multi-agent-systems

Multi-Agent Systems

When to Use Multi-Agent Architectures

Multi-agent systems introduce overhead. Every additional agent represents another potential point of failure, another set of prompts to maintain, and another source of unexpected behavior.

Multi-agent systems typically use 3-10x more tokens than single-agent approaches for the same task, due to:

Duplicating context across agents
Coordination messages between agents
Summarizing results for handoffs
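
As a rough illustration of where the overhead comes from, the sketch below adds up the three sources listed above for a hypothetical run. The function name, its parameters, and every number are assumptions chosen for illustration, not measurements.

```python
def estimate_multi_agent_tokens(
    task_tokens: int,
    shared_context_tokens: int,
    num_subagents: int,
    coordination_tokens_per_agent: int,
    summary_tokens_per_agent: int,
) -> int:
    """Rough token estimate for an orchestrator plus subagents (illustrative only)."""
    # Each subagent receives its own copy of the shared context.
    duplicated_context = shared_context_tokens * num_subagents
    # Delegation and status messages exchanged with each subagent.
    coordination = coordination_tokens_per_agent * num_subagents
    # Each subagent condenses its work into a summary for the handoff.
    summaries = summary_tokens_per_agent * num_subagents
    return task_tokens + duplicated_context + coordination + summaries


single_agent_tokens = 10_000  # hypothetical single-agent cost for the same task
multi_agent_tokens = estimate_multi_agent_tokens(
    task_tokens=10_000,
    shared_context_tokens=6_000,
    num_subagents=4,
    coordination_tokens_per_agent=1_500,
    summary_tokens_per_agent=500,
)
overhead_ratio = multi_agent_tokens / single_agent_tokens
print(f"{multi_agent_tokens} tokens, {overhead_ratio:.1f}x the single-agent cost")
```

With these made-up inputs the duplicated context is the largest term, which is why trimming what each subagent receives tends to matter more than shortening coordination messages.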

Start with a Single Agent

A well-designed single agent with appropriate tools can accomplish far more than expected. Use a single agent when:

Tasks are sequential and context-dependent
Tool count is under 15-20
No clear benefit from parallelization

Three Cases Where Multi-Agent Excels

Context pollution - Subtasks generate >1000 tokens but most info is irrelevant to main task
Parallelization - Tasks can run independently and explore larger search space
Specialization - Different tasks need different tools, prompts, or domain expertise
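
The single-agent criteria and the three cases above can be folded into a small decision helper. This is a minimal sketch, not the skill's own framework: `TaskProfile`, its fields, and `recommend_architecture` are hypothetical names, and the thresholds simply restate the numbers quoted in the text.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    tool_count: int
    subtask_output_tokens: int      # typical tokens a subtask produces
    mostly_irrelevant_output: bool  # most of that output is noise for the main task
    independent_subtasks: bool      # subtasks can run without each other's results
    needs_distinct_expertise: bool  # conflicting prompts, tools, or domain knowledge


def recommend_architecture(profile: TaskProfile) -> list[str]:
    """Return the reasons to go multi-agent; an empty list means stay single-agent."""
    reasons = []
    if profile.subtask_output_tokens > 1000 and profile.mostly_irrelevant_output:
        reasons.append("context pollution")
    if profile.independent_subtasks:
        reasons.append("parallelization")
    if profile.needs_distinct_expertise or profile.tool_count >= 20:
        reasons.append("specialization")
    return reasons


simple_task = TaskProfile(8, 300, False, False, False)
research_task = TaskProfile(12, 5000, True, True, False)
print(recommend_architecture(simple_task))    # no reasons: keep a single agent
print(recommend_architecture(research_task))  # context pollution and parallelization apply
```

Returning the list of reasons rather than a yes/no answer keeps the justification visible, which helps when revisiting the decision later.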

Decision Framework

Context Protection Pattern

Use when subtasks generate a large amount of context but the main task needs only a summary.

Example: Customer Support

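from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# get_order_details_tool, extract_summary, needs_order_info and extract_order_id
# are application-specific helpers assumed to be defined elsewhere.

class OrderLookupAgent:
    def lookup_order(self, order_id: str) -> dict:
        # The subagent absorbs the verbose tool output in its own context.
        messages = [{"role": "user", "content": f"Get essential details for order {order_id}"}]
        response = client.messages.create(
            model="claude-sonnet-4-5", max_tokens=1024,
            messages=messages, tools=[get_order_details_tool]
        )
        return extract_summary(response)  # Returns 50-100 tokens, not 2000+

class SupportAgent:
    def handle_issue(self, user_message: str):
        context = ""  # stays empty when no order lookup is needed
        if needs_order_info(user_message):
            order_id = extract_order_id(user_message)
            order_summary = OrderLookupAgent().lookup_order(order_id)
            context = f"Order {order_id}: {order_summary['status']}, purchased {order_summary['date']}"
        # Main agent gets clean context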

Best when:

Subtask generates >1000 tokens, most irrelevant
Subtask is well-defined with clear extraction criteria
Lookup/retrieval operations need filtering before use
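
The "clear extraction criteria" above can be as simple as an allow-list of fields. The following is a minimal sketch under that assumption: `summarize_order`, `ESSENTIAL_FIELDS`, and the sample record are hypothetical and only mirror the `status` and `date` keys used in the customer support example.

```python
ESSENTIAL_FIELDS = ("status", "date")  # assumed fields, matching the summary keys used above


def summarize_order(raw_order: dict) -> dict:
    """Keep only the fields the main agent needs; drop everything else."""
    return {field: raw_order[field] for field in ESSENTIAL_FIELDS if field in raw_order}


# Hypothetical verbose record, standing in for a 2000+ token tool result.
raw_order = {
    "status": "shipped",
    "date": "2025-01-15",
    "warehouse_log": ["picked", "packed", "labelled"],
    "payment_events": [{"type": "auth"}, {"type": "capture"}],
    "internal_notes": "verbose text the support conversation never needs",
}
summary = summarize_order(raw_order)
print(summary)
```

Filtering in plain code, where the criteria allow it, keeps the summary deterministic and leaves the model call for cases that need judgment.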

Parallelization Pattern

Use when exploring a larger search space or researching independent facets of a question.

import asyncio
from anthropic import AsyncAnthropic

client = AsyncAnthropic()

async def research_topic(query: str) -> dict:
    # The lead agent splits the query into independent facets.
    facets = await lead_agent.decompose_query(query)
    # One subagent per facet; gather runs them concurrently.
    tasks = [research_subagent(facet) for facet in facets]
    results = await asyncio.gather(*tasks)
    # The lead agent merges the per-facet findings into one answer.
    return await lead_agent.synthesize(results)

async def research_subagent(facet: str) -> dict:
    # Each subagent starts from a fresh context containing only its facet.
    messages = [{"role": "user", "content": f"Research: {facet}"}]
    response = await client.messages.create(
        model="claude-sonnet-4-5", max_tokens=4096,
        messages=messages, tools=[web_search, read_document]
    )
    return extract_findings(response)

Benefit: Thoroughness, not speed. Covers more ground at higher token cost.
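
Because every extra subagent is another point of failure, it can help to cap concurrency and let one failed facet fall out without sinking the rest. The sketch below shows one way to do that with the standard library only. `fake_subagent` is a stand-in for a real model call, and the facet names and the concurrency limit are assumptions.

```python
import asyncio


async def fake_subagent(facet: str) -> dict:
    """Stand-in for a real subagent call; fails for one facet to show error handling."""
    await asyncio.sleep(0)  # yield control, as a network call would
    if facet == "pricing":
        raise RuntimeError("subagent failed")
    return {"facet": facet, "findings": f"notes on {facet}"}


async def research_all(facets: list[str], max_concurrent: int = 3) -> tuple[list[dict], list[str]]:
    semaphore = asyncio.Semaphore(max_concurrent)  # caps simultaneous subagent calls

    async def bounded(facet: str) -> dict:
        async with semaphore:
            return await fake_subagent(facet)

    # return_exceptions=True returns failures as values instead of raising the first one.
    outcomes = await asyncio.gather(*(bounded(f) for f in facets), return_exceptions=True)
    results = [o for o in outcomes if not isinstance(o, BaseException)]
    failed = [f for f, o in zip(facets, outcomes) if isinstance(o, BaseException)]
    return results, failed


results, failed = asyncio.run(research_all(["market size", "pricing", "competitors"]))
print(len(results), "facets succeeded; failed:", failed)
```

Reporting the failed facets alongside the results lets the lead agent decide whether to retry them or synthesize from what it has.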

Specialization Patterns

Tool Set Specialization

Split by domain when an agent has 20+ tools, confuses tools from different domains, or performs worse as tools are added.

Signs you need specialization:

Quantity: 20+ tools
Domain confusion: Tools span unrelated domains
Degraded performance: New tools hurt existing tasks
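
The first two signs can be checked mechanically. This is a minimal sketch that assumes tools follow a `<domain>_<action>` naming convention, as the CRM and marketing tool names in this document do. `audit_tools` and its return keys are hypothetical. Degraded performance cannot be detected this way and needs task-level evaluation.

```python
from collections import Counter


def audit_tools(tool_names: list[str], max_tools: int = 20) -> dict:
    """Group tools by an assumed '<domain>_<action>' prefix and flag warning signs."""
    domains = Counter(name.split("_", 1)[0] for name in tool_names)
    return {
        "tool_count": len(tool_names),
        "domains": dict(domains),
        "too_many_tools": len(tool_names) >= max_tools,
        "spans_multiple_domains": len(domains) > 1,
    }


report = audit_tools([
    "crm_get_contacts",
    "crm_create_opportunity",
    "marketing_get_campaigns",
    "marketing_create_lead",
])
print(report)
```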

System Prompt Specialization

Different tasks require conflicting behavioral modes:

Customer support: empathetic, patient
Code review: precise, critical
Compliance: rigid rule-following
Brainstorming: creative flexibility

Domain Expertise Specialization

Deep domain context that would overwhelm a generalist:

Legal analysis: case law, regulatory frameworks
Medical research: clinical trial methodology

Multi-Platform Integration Example

class CRMAgent:
    system_prompt = """You are a CRM specialist. You manage contacts,
    opportunities, and account records. Always verify record ownership
    before updates and maintain data integrity across related records."""
    tools = [crm_get_contacts, crm_create_opportunity]  # 8-10 CRM tools

class MarketingAgent:
    system_prompt = """You are a marketing automation specialist. You
    manage campaigns, lead scoring, and email sequences."""
    tools = [marketing_get_campaigns, marketing_create_lead]  # 8-10 tools

class OrchestratorAgent:
    def execute(self, user_request: str):
        response = client.messages.create(
            model="claude-sonnet-4-5", max_tokens=1024,
            system="""Route to appropriate specialist:
    - CRM: Contacts, opportunities, accounts, sales pipeline
    - Marketing: Campaigns, lead nurturing, email sequences""",
            messages=[{"role": "user", "content": user_request}],
            tools=[delegate_to_crm, delegate_to_marketing]
        )
        return response
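
The orchestrator above returns the raw response, so the delegation itself still has to happen somewhere. The sketch below shows one possible dispatch step. The tool definition uses the `name` / `description` / `input_schema` shape that the Messages API expects for `tools`, while the `task` parameter, the handler functions, and `dispatch` are hypothetical. Content blocks are represented as plain dicts here; the SDK returns objects exposing the same `type`, `name`, and `input` fields.

```python
# Tool definition in the shape the Messages API expects; the "task" field is an assumption.
delegate_to_crm = {
    "name": "delegate_to_crm",
    "description": "Hand a CRM-related request to the CRM specialist.",
    "input_schema": {
        "type": "object",
        "properties": {"task": {"type": "string"}},
        "required": ["task"],
    },
}


# Hypothetical specialist entry points; real ones would make their own model calls.
def run_crm_agent(task: str) -> str:
    return f"[crm] {task}"


def run_marketing_agent(task: str) -> str:
    return f"[marketing] {task}"


HANDLERS = {
    "delegate_to_crm": run_crm_agent,
    "delegate_to_marketing": run_marketing_agent,
}


def dispatch(content_blocks: list[dict]) -> list[str]:
    """Run the matching specialist for every tool_use block in a response."""
    outputs = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # skip plain text blocks
        handler = HANDLERS.get(block["name"])
        if handler is None:
            raise ValueError(f"unknown delegation target: {block['name']}")
        outputs.append(handler(**block["input"]))
    return outputs


sample_blocks = [
    {"type": "text", "text": "Routing to the CRM specialist."},
    {"type": "tool_use", "name": "delegate_to_crm", "input": {"task": "update contact"}},
]
print(dispatch(sample_blocks))
```

Raising on an unknown target, rather than ignoring it, surfaces routing mistakes early instead of silently dropping a request.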

Context-Centric Decomposition

Problem-centric (counterproductive): Split by work type (writer, tester, reviewer) - creates coordination overhead, context loss at handoffs.

Context-centric (effective): Agent handling a feature also handles its tests - already has necessary context.

Effective Boundaries

Independent research paths
Separate components with clean API contracts
Black-box verification

Problematic Boundaries

Sequential phases of same work
Tightly coupled components
Work requiring shared state

Verification Subagent Pattern

A dedicated agent tests and validates the main agent's work. This succeeds because verification requires minimal context transfer.

class CodingAgent:
    def implement_feature(self, requirements: str) -> dict:
        response = client.messages.create(
            model="claude-sonnet-4-5", max_tokens=4096,
            messages=[{"role": "user", "content": f"Implement: {requirements}"}],
            tools=[read_file, write_file, list_directory]
        )
        return {"code": response.content, "files_changed": extract_files(response)}

class VerificationAgent:
    def verify_implementation(self, requirements: str, files_changed: list) -> dict:
        messages = [{"role": "user", "content": f"""
Requirements: {requirements}
Files changed: {files_changed}

Run the complete test suite and verify:
1. All existing tests pass
2. New functionality works as specified
3. No obvious errors or security issues

You MUST run: pytest --verbose
Only mark as PASSED if ALL tests pass with no failures.
"""}]
        response = client.messages.create(
            model="claude-sonnet-4-5", max_tokens=4096,
            messages=messages, tools=[run_tests, execute_code, read_file]
        )
        return {"passed": extract_pass_fail(response), "issues": extract_issues(response)}

Mitigate the "Early Victory Problem"

The verifier may mark work as passing without testing it thoroughly. To prevent this:

Concrete criteria: "Run full test suite" not "make sure it works"
Comprehensive checks: Test multiple scenarios and edge cases
Negative tests: Confirm inputs that should fail do fail
Explicit instructions: "You MUST run the complete test suite"
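
One additional safeguard is to derive the verdict from the test runner's exit code rather than from the verifier's prose. The sketch below assumes the `run_tests` tool shells out to pytest; `verdict_from_exit_code` and `run_test_suite` are hypothetical names. It relies on pytest's documented exit codes: 0 when all tests pass, 1 when some fail, and 5 when no tests are collected.

```python
import subprocess
import sys


def verdict_from_exit_code(exit_code: int) -> dict:
    """Map a pytest exit code to a pass/fail verdict; only 0 counts as passing."""
    if exit_code == 0:
        return {"passed": True, "reason": "all collected tests passed"}
    if exit_code == 5:
        # An empty test run must not be reported as a success.
        return {"passed": False, "reason": "no tests were collected"}
    return {"passed": False, "reason": f"pytest exited with code {exit_code}"}


def run_test_suite(project_dir: str) -> dict:
    """Run the full suite and derive the verdict from the exit code, not from model text."""
    completed = subprocess.run(
        [sys.executable, "-m", "pytest", "--verbose"],
        cwd=project_dir, capture_output=True, text=True,
    )
    return verdict_from_exit_code(completed.returncode)


print(verdict_from_exit_code(0))
print(verdict_from_exit_code(5))
```

Treating "no tests collected" as a failure guards against a verifier that runs the command in the wrong directory and reports success.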

Moving Forward Checklist

Before adding multi-agent complexity:

[ ] Genuine constraints exist (context limits, parallelization, specialization need)
[ ] Decomposition follows context, not problem type
[ ] Clear verification points where subagents can validate

Start with the simplest approach that works. Add complexity only when evidence supports it.

References

Building multi-agent systems: when and how to use them
Building effective agents
Effective context engineering for AI agents
Writing effective tools for agents

Design and implement multi-agent LLM architectures using the orchestrator-subagent pattern. Use when: (1) Deciding whether to use multi-agent vs single-agent systems, (2) Implementing context isolation for high-volume operations, (3) Parallelizing independent research tasks, (4) Creating specialized agents with focused tool sets, (5) Building verification subagents for quality assurance, or (6) Analyzing context-centric decomposition boundaries.

tools

Updated Apr 16, 2026

Install this skill globally with one command:

npx skillsauth add hexbee/hello-skills multi-agent-systems

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 16, 2026, 8:05 AM (12.3s, 5 files scanned)

Install manually

# Clone the repo
git clone https://github.com/hexbee/hello-skills.git

# Copy into Claude Code skills folder (global); create the folder first if it does not exist
mkdir -p ~/.claude/skills
cp -r hello-skills/skills/multi-agent-systems ~/.claude/skills/

Repository: hexbee/hello-skills

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT