skills/claude-4-6-features/computer-use/SKILL.md
Build browser/desktop automation agents using Claude's Computer Use capability — screen-taking, clicking, typing. Covers the reference container, virtualization safety, task decomposition, and when to use computer-use vs API integration. Use this skill when building agents that operate GUIs (browsers, legacy apps), automating workflows without APIs, or QA/testing agents. Activate when: Claude computer use, browser automation, desktop agent, screen control, computer_20250124, click and type agent.
npx skillsauth add latestaiagents/agent-skills computer-useInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Computer Use lets Claude control a virtual computer — take screenshots, move cursor, click, type. Use it when there's no API, not as a first resort.
const tools = [
{ type: "computer_20250124", name: "computer", display_width_px: 1280, display_height_px: 800, display_number: 1 },
{ type: "text_editor_20250124", name: "str_replace_editor" }, // optional
{ type: "bash_20250124", name: "bash" }, // optional
];
const response = await client.beta.messages.create(
{
model: "claude-sonnet-4-6",
max_tokens: 4096,
tools,
messages: [{ role: "user", content: "Open the browser and find Claude's API pricing page." }],
},
{ headers: { "anthropic-beta": "computer-use-2025-01-24" } },
);
The model returns tool_use blocks with actions like screenshot, left_click, type, key, mouse_move, scroll.
Anthropic ships a reference Docker container (ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest) with:
Don't run this on a host with real user data. It's a demo. For production, use a sandboxed cloud VM (Firecracker, Vercel Sandbox, cloud-hypervisor).
async function run(goal: string) {
const messages = [{ role: "user", content: goal }];
while (true) {
const response = await client.beta.messages.create({
model: "claude-sonnet-4-6",
max_tokens: 4096,
tools,
messages,
}, { headers: { "anthropic-beta": "computer-use-2025-01-24" } });
messages.push({ role: "assistant", content: response.content });
if (response.stop_reason !== "tool_use") return response;
const results = [];
for (const block of response.content) {
if (block.type === "tool_use") {
const output = await executeAction(block.name, block.input); // your VM controller
results.push({
type: "tool_result",
tool_use_id: block.id,
content: output.screenshot ? [{ type: "image", source: output.screenshot }] : output.text,
});
}
}
messages.push({ role: "user", content: results });
}
}
Every action returns a fresh screenshot so the model sees the result. Image tokens add up fast.
Computer Use is slow and error-prone on long sequences. Break work into sub-goals:
Bad: "Go to Example.com, find the pricing page, fill out the demo form with fake data, submit."
Good: Step 1: "Navigate to example.com/pricing"
Step 2: "Locate the demo request button"
Step 3: "Fill the form: name=..., email=..."
Step 4: "Submit"
Verify each step's screenshot before moving on. Your app orchestrates — the model doesn't need to do everything in one loop.
Don't use Computer Use for high-volume work. It's a specialty tool.
The biggest threat: the agent reads text on screen, some of which is attacker-controlled (email content, web pages). Injection can redirect it to click "Send $1000 to X".
Mitigations:
development
Test skills for correct activation, content quality, and regression — both automated checks (frontmatter validity, lint) and manual verification (query-suite activation testing). Covers CI integration and how to catch skill regressions before users do. Use this skill when adding skills to a repo, setting up CI for a skill library, or debugging "the skill exists but doesn't work". Activate when: test skills, validate skills, skill CI, skill linting, skill activation test, skill regression.
documentation
Write the YAML frontmatter for a SKILL.md file so it activates reliably — name, description, and activation keywords that the model matches against. Covers length, tone, and the most common frontmatter mistakes. Use this skill when authoring a new skill, fixing a skill that isn't auto-activating, or reviewing skills for publication. Activate when: SKILL.md frontmatter, skill description, skill activation, skill YAML, write a skill, author a skill.
development
Design skills that fire at the right moment — neither over-eager (noise) nor under-eager (silent). Covers activation specificity, trigger phrases, disambiguation between overlapping skills, and debugging activation. Use this skill when multiple skills could fire on the same query, a skill never fires, or a skill fires too often. Activate when: skill won't activate, skill over-activates, overlapping skills, skill triggers, skill selection, skill disambiguation.
development
Structure SKILL.md content so the model reads just enough — concise summary up front, progressively deeper detail, examples on demand. Covers section ordering, length budgets, when to split into multiple skills. Use this skill when writing or refactoring a skill body, one skill has grown too long, or a skill is wordy but not useful. Activate when: SKILL.md structure, skill content, skill too long, split skill, progressive disclosure, skill body.