skills/hosted-agents/SKILL.md
---
name: hosted-agents
description: This skill should be used when designing hosted or background agent infrastructure: sandboxed execution, remote coding environments, warm pools, session persistence, multiplayer collaboration, self-spawning agents, or Modal-style sandboxes.
---

# Hosted Agent Infrastructure
Hosted agents run in remote sandboxed environments rather than on local machines. When designed well, they provide unlimited concurrency, consistent execution environments, and multiplayer collaboration. The critical insight is that session speed should be limited only by model provider time-to-first-token, with all infrastructure setup completed before the user starts their session.
Activate this skill when:
Do not activate this skill for adjacent work owned by other skills:
- harness-engineering
- multi-agent-patterns
- tool-design
- filesystem-context

Move agent execution to remote sandboxed environments to eliminate the fundamental limits of local execution: resource contention, environment inconsistency, and single-user constraints. Remote sandboxes unlock unlimited concurrency, reproducible environments, and collaborative workflows because each session gets its own isolated compute with a known-good environment image.
Design the architecture in three layers because each layer scales independently. Build sandbox infrastructure for isolated execution, an API layer for state management and client coordination, and client interfaces for user interaction across platforms. Keep these layers cleanly separated so sandbox changes do not ripple into clients.
The Core Challenge
Eliminate sandbox spin-up latency because users perceive anything over a few seconds as broken. Development environments require cloning repositories, installing dependencies, and running build steps -- do all of this before the user ever submits a prompt.
Image Registry Pattern
Pre-build environment images on a regular cadence (every 30 minutes works well) because this makes synchronization with the latest code a fast delta rather than a full clone. Include in each image:
When starting a session, spin up a sandbox from the most recent image. The repository is at most 30 minutes out of date, making the remaining git sync fast.
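Selecting the most recent image can be sketched as below. This is a minimal illustration, not a real registry client: `ImageRecord`, `pick_latest_image`, and `is_stale` are hypothetical names, and the in-memory list stands in for whatever registry the platform actually uses.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical registry entry for one pre-built environment image."""
    image_id: str
    repo: str
    commit_sha: str
    built_at: datetime


def pick_latest_image(images: list[ImageRecord], repo: str) -> Optional[ImageRecord]:
    """Return the most recently built image for `repo`, or None if none exists."""
    candidates = [img for img in images if img.repo == repo]
    return max(candidates, key=lambda img: img.built_at, default=None)


def is_stale(image: ImageRecord, now: datetime,
             cadence: timedelta = timedelta(minutes=30)) -> bool:
    """True when the image is older than the rebuild cadence (a build was likely missed)."""
    return now - image.built_at > cadence
```

A stale result is a signal worth alerting on, because the delta sync at session start grows with the age of the image.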
Snapshot and Restore
Take filesystem snapshots at key points to enable instant restoration for follow-up prompts without re-running setup:
Git Configuration for Background Agents
Configure git identity explicitly in every sandbox because background agents are not tied to a specific user during image builds:
- user.name and user.email when committing and pushing changes
Warm Pool Strategy
Maintain a pool of pre-warmed sandboxes for high-volume repositories because cold starts are the primary source of user frustration:
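A warm pool can be sketched as a small queue of ready sandboxes with a cold-start fallback. The `WarmPool` class and the `create_sandbox` factory are hypothetical, and a real pool would also need expiry and health checks, which are left out here.

```python
from __future__ import annotations

from collections import deque
from typing import Callable, Deque


class WarmPool:
    """Keeps up to `target_size` ready sandboxes; falls back to a cold start when empty."""

    def __init__(self, create_sandbox: Callable[[], str], target_size: int) -> None:
        self._create = create_sandbox  # hypothetical factory returning a sandbox id
        self._target = target_size
        self._ready: Deque[str] = deque()

    def refill(self) -> int:
        """Top the pool back up to the target size; returns how many were created."""
        created = 0
        while len(self._ready) < self._target:
            self._ready.append(self._create())
            created += 1
        return created

    def acquire(self) -> tuple[str, bool]:
        """Return (sandbox_id, was_warm); a cold start happens only when the pool is empty."""
        if self._ready:
            return self._ready.popleft(), True
        return self._create(), False
```

Calling `refill()` from a background job after each `acquire()` keeps the cold path rare; the `was_warm` flag makes the cold-start rate easy to measure.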
Server-First Architecture
Structure the agent framework as a server first, with TUI and desktop apps as thin clients, because this prevents duplicating agent logic across surfaces:
Code as Source of Truth
Select frameworks where the agent can read its own source code to understand behavior. Prioritize this because having code as source of truth prevents the agent from hallucinating about its own capabilities -- an underrated failure mode in AI development.
Plugin System Requirements
Require a plugin system that supports runtime interception because this enables safety controls and observability without modifying core agent logic:
- tool.execute.before)
Predictive Warm-Up
Start warming the sandbox as soon as a user begins typing their prompt, not when they submit it, because the typing interval (5-30 seconds) is enough to complete most setup:
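One way to sketch predictive warm-up is an idempotent trigger: the first typing event starts the warm-up in the background and the submit handler waits for it. `PredictiveWarmup` and the `warm_sandbox` callable are hypothetical; a production version would likely be async and handle warm-up failures.

```python
import threading
from typing import Callable, Optional


class PredictiveWarmup:
    """Starts sandbox warm-up on the first typing event and reuses it at submit time."""

    def __init__(self, warm_sandbox: Callable[[], str]) -> None:
        self._warm = warm_sandbox  # hypothetical: blocks until a sandbox is ready
        self._lock = threading.Lock()
        self._thread: Optional[threading.Thread] = None
        self._sandbox_id: Optional[str] = None

    def on_typing(self) -> None:
        """Called on every keystroke; only the first call starts the warm-up."""
        with self._lock:
            if self._thread is None:
                self._thread = threading.Thread(target=self._run, daemon=True)
                self._thread.start()

    def _run(self) -> None:
        self._sandbox_id = self._warm()

    def on_submit(self) -> str:
        """Wait for the warm-up (starting it now if the user never typed) and return the id."""
        self.on_typing()
        assert self._thread is not None
        self._thread.join()
        assert self._sandbox_id is not None
        return self._sandbox_id
```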
Parallel File Reading
Allow the agent to start reading files immediately even if sync from latest base branch is not complete, because in large repositories incoming prompts rarely touch recently-changed files:
Maximize Build-Time Work
Move everything possible to the image build step because build-time duration is invisible to users:
Agent-Spawned Sessions
Build tools that allow agents to spawn new sessions because frontier models are capable of decomposing work and coordinating sub-tasks:
Expose three primitives: start a new session with specified parameters, read status of any session (check-in capability), and continue main work while sub-sessions run in parallel.
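The three primitives might look like the sketch below. `SessionRegistry`, `start_session`, `read_status`, and `children_of` are hypothetical names, and the in-memory dictionary stands in for the real API layer. The third primitive (continuing main work) is shown only implicitly: `start_session` returns an id immediately instead of blocking on the sub-session.

```python
import uuid
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class SessionStatus(str, Enum):
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


@dataclass
class Session:
    session_id: str
    prompt: str
    repo: str
    parent_id: Optional[str] = None
    status: SessionStatus = SessionStatus.RUNNING


class SessionRegistry:
    """In-memory stand-in for the API that agents call to spawn and check sessions."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Session] = {}

    def start_session(self, prompt: str, repo: str, parent_id: Optional[str] = None) -> str:
        """Primitive 1: register a new session and return its id without waiting for it."""
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = Session(session_id, prompt, repo, parent_id)
        return session_id

    def read_status(self, session_id: str) -> SessionStatus:
        """Primitive 2: check in on any session by id."""
        return self._sessions[session_id].status

    def mark_finished(self, session_id: str, ok: bool = True) -> None:
        """Called by the sandbox runtime when a session ends."""
        self._sessions[session_id].status = SessionStatus.DONE if ok else SessionStatus.FAILED

    def children_of(self, parent_id: str) -> List[str]:
        """Ids of sub-sessions spawned by `parent_id`, in creation order."""
        return [s.session_id for s in self._sessions.values() if s.parent_id == parent_id]
```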
Prompt Engineering for Self-Spawning
Engineer prompts that guide when agents should spawn sub-sessions rather than doing work inline:
Per-Session State Isolation
Isolate state per session (SQLite per session works well) because cross-session interference is a subtle and hard-to-debug failure mode:
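A minimal sketch of SQLite-per-session isolation, using only the Python standard library. The `events` table, the file naming scheme, and the helper names are assumptions for illustration, and `session_id` is assumed to be a safe, server-generated identifier.

```python
import sqlite3
from pathlib import Path
from typing import List, Tuple


def open_session_db(base_dir: Path, session_id: str) -> sqlite3.Connection:
    """Open (creating if needed) a database file that belongs to exactly one session."""
    base_dir.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(base_dir / f"{session_id}.sqlite3")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "kind TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def append_event(conn: sqlite3.Connection, kind: str, payload: str) -> None:
    """Append one event; the `with` block commits on success and rolls back on error."""
    with conn:
        conn.execute("INSERT INTO events (kind, payload) VALUES (?, ?)", (kind, payload))


def list_events(conn: sqlite3.Connection) -> List[Tuple[str, str]]:
    """Return all events for this session in insertion order."""
    return conn.execute("SELECT kind, payload FROM events ORDER BY id").fetchall()
```

Because each session has its own file, deleting or archiving a session is a single file operation and no query can accidentally read another session's rows.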
Real-Time Streaming
Stream all agent work in real-time because high-frequency feedback is critical for user trust:
Use WebSocket connections with hibernation APIs to reduce compute costs during idle periods while maintaining open connections.
Synchronization Across Clients
Build a single state system that synchronizes across all clients (chat interfaces, Slack bots, Chrome extensions, web interfaces, VS Code instances) because users switch surfaces frequently and expect continuity. All changes sync to the session state, enabling seamless client switching.
Why Multiplayer Matters
Design for multiplayer from day one because it is nearly free to add with proper synchronization architecture, and it unlocks high-value workflows:
Implementation Requirements
Build the data model so sessions are not tied to single authors because multiplayer fails silently if authorship is hardcoded:
User-Based Commits
Use GitHub authentication to open PRs on behalf of the user (not the app) because this preserves the audit trail and prevents users from approving their own AI-generated changes:
Sandbox-to-API Flow
Follow this sequence because it keeps sandbox permissions minimal while letting the API handle sensitive operations:
Slack Integration
Prioritize Slack as the first distribution channel for internal adoption because it creates a virality loop as team members see others using it:
Web Interface
Build a web interface with these features because it serves as the primary power-user surface:
Chrome Extension
Build a Chrome extension for non-engineering users because DOM and React internals extraction gives higher precision than raw screenshots at lower token cost:
Before building the system, decide these infrastructure properties explicitly:
Choose between queueing and inserting follow-up messages sent during execution. Prefer queueing because it is simpler to manage and lets users send thoughts on next steps while the agent works. Build a mechanism to stop the agent mid-execution when needed, because without it users feel trapped.
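The queueing choice and the stop mechanism can be sketched together. `FollowUpQueue` and `run_agent` are hypothetical names, and `step` stands in for one unit of agent work. The stop here is cooperative and is checked between steps; interrupting a step that is already running would need a check inside the step itself.

```python
import threading
from collections import deque
from typing import Callable, Deque, List, Optional


class FollowUpQueue:
    """Queues messages that arrive mid-execution and exposes a cooperative stop flag."""

    def __init__(self) -> None:
        self._pending: Deque[str] = deque()
        self._lock = threading.Lock()
        self._stop = threading.Event()

    def send(self, message: str) -> None:
        """Called by any client while the agent is working."""
        with self._lock:
            self._pending.append(message)

    def next_message(self) -> Optional[str]:
        """Pop the oldest queued message, or return None when the queue is empty."""
        with self._lock:
            return self._pending.popleft() if self._pending else None

    def request_stop(self) -> None:
        self._stop.set()

    def should_stop(self) -> bool:
        return self._stop.is_set()


def run_agent(first_prompt: str, queue: FollowUpQueue,
              step: Callable[[str], None]) -> List[str]:
    """Handle the first prompt, then queued follow-ups in order, until empty or stopped."""
    handled: List[str] = []
    prompt: Optional[str] = first_prompt
    while prompt is not None and not queue.should_stop():
        step(prompt)
        handled.append(prompt)
        # Leave queued messages untouched if a stop arrived during the step.
        prompt = None if queue.should_stop() else queue.next_message()
    return handled
```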
Track these metrics because they indicate real value rather than vanity usage:
Drive internal adoption through visibility rather than mandates because forced usage breeds resentment:
Example 1: Background coding session lifecycle
user prompt
-> API allocates warm sandbox from current image
-> sandbox syncs latest branch delta
-> reads allowed immediately, writes blocked until sync completes
-> agent edits and tests inside isolated workspace
-> sandbox snapshots final filesystem
-> branch is pushed
-> API creates PR using user token
-> session summary, logs, and PR URL are returned to clients
This sequence keeps setup work outside the user-visible path while preserving auditability and user ownership of code changes.
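The "reads allowed immediately, writes blocked until sync completes" step can be sketched with a simple gate. `SyncGate` is a hypothetical name, and the plain dictionary stands in for the sandbox workspace; a real sandbox would enforce this at the filesystem or tool layer.

```python
import threading
from typing import Dict, Optional


class SyncGate:
    """Allows reads at any time but makes writes wait until the branch sync has finished."""

    def __init__(self) -> None:
        self._synced = threading.Event()

    def mark_synced(self) -> None:
        """Called once the sandbox has pulled the latest branch delta."""
        self._synced.set()

    def read_file(self, files: Dict[str, str], path: str) -> str:
        """Reads never wait, even while the sync is still running."""
        return files[path]

    def write_file(self, files: Dict[str, str], path: str, content: str,
                   timeout: Optional[float] = None) -> bool:
        """Block until the sync is done, then write; return False if the wait timed out."""
        if not self._synced.wait(timeout):
            return False
        files[path] = content
        return True
```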
Example 2: Boundary decision
If the task is "make the agent loop run for days with locked rubrics and PR approval," use harness-engineering. If the task is "run that loop in remote sandboxes with warm pools, session snapshots, streaming clients, and user-authored PRs," use this skill.
- user.name or user.email causes commit failures in background agents. Always set git identity explicitly during sandbox configuration, never assume it carries over from the image.

This skill owns hosted runtime infrastructure. Adjacent skills own the control system, topology, and tool contracts:
- harness-engineering: governance, locked evaluators, rollback, novelty gates, and human approval boundaries around autonomous work.
- multi-agent-patterns: self-spawning and supervisor patterns once hosted infrastructure exists.
- tool-design: spawn, status, teardown, and PR tools exposed to agents.
- context-optimization: managing context across distributed hosted sessions.
- filesystem-context: using the sandbox filesystem for durable session state and artifacts.

Internal reference:
Related skills in this collection:
External resources:
Created: 2026-01-12
Last Updated: 2026-05-15
Author: Agent Skills for Context Engineering Contributors
Version: 1.2.0