markus41/plugins/claude-code-expert/skills/computer-use

Name: plugins/claude-code-expert/skills/computer-use
Author: markus41

plugins/claude-code-expert/skills/computer-use/SKILL.md

npx skillsauth add markus41/claude plugins/claude-code-expert/skills/computer-use

Computer Use & GUI Automation

Computer use lets Claude interact with GUIs: click buttons, fill forms, take screenshots, and navigate native apps. This is powerful but expensive and slow — use it only when a more precise tool doesn't exist.

Tool Selection Priority

Before reaching for computer use, exhaust these options first:

| Task | Prefer This | Over Computer Use |
|------|-------------|-------------------|
| API endpoint testing | Bash + curl | Clicking through UI |
| Database inspection | MCP postgres/sqlite | Navigating admin UI |
| File operations | Read/Write/Edit | Drag-and-drop UI |
| Web scraping | Firecrawl MCP | Screenshot + parse |
| Browser automation | Playwright MCP | Computer use click |
| CI status | GitHub API / gh CLI | Browser navigation |
| Log inspection | Bash + grep | Terminal screenshot |

Rule: If you can express the task as a shell command or API call, do that. Computer use is the fallback for GUI-only workflows.
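
The rule can be read as a lookup with computer use as the default. The following is a minimal sketch, assuming the task labels from the table are used as keys; `PREFERRED_TOOL` and `choose_tool` are hypothetical names for illustration and are not part of the skill.

```python
# Hypothetical lookup mirroring the tool selection table; not part of the skill.
PREFERRED_TOOL = {
    "api endpoint testing": "Bash + curl",
    "database inspection": "MCP postgres/sqlite",
    "file operations": "Read/Write/Edit",
    "web scraping": "Firecrawl MCP",
    "browser automation": "Playwright MCP",
    "ci status": "GitHub API / gh CLI",
    "log inspection": "Bash + grep",
}


def choose_tool(task: str) -> str:
    """Return the preferred tool for a task, falling back to computer use."""
    return PREFERRED_TOOL.get(task.strip().lower(), "Computer use")


print(choose_tool("CI status"))
print(choose_tool("Native app validation"))
```

Any task that has no entry falls through to computer use, which matches its role as the fallback for GUI-only workflows.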

When Computer Use Is the Right Choice

1. Native App Validation

Testing a desktop app that has no API or CLI interface.

# Example: Validate Electron app UI after a build
Take a screenshot of the app after launch.
Click the "New Project" button.
Verify the dialog opens with the correct fields.
Fill in project name: "Test Project 2026"
Click Create and verify the project appears in the list.

2. Visual Regression Checks

Detecting layout regressions that unit tests can't catch.

# Workflow:
1. Take baseline screenshot of the current UI state
2. Apply the change
3. Take comparison screenshot
4. Highlight pixel differences > 1%
5. Human reviews diff
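
Step 4 of the workflow could be sketched as below. This assumes that "pixel differences > 1%" means more than 1% of pixels changed, and that both screenshots are already decoded into equal-length sequences of RGB tuples (decoding is out of scope here). The names `changed_fraction` and `needs_review` are hypothetical.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]


def changed_fraction(
    baseline: Sequence[Pixel], candidate: Sequence[Pixel], tolerance: int = 0
) -> float:
    """Fraction of pixels whose channels differ by more than `tolerance`."""
    if len(baseline) != len(candidate):
        raise ValueError("screenshots must have the same dimensions")
    if not baseline:
        return 0.0
    changed = sum(
        1
        for a, b in zip(baseline, candidate)
        if any(abs(x - y) > tolerance for x, y in zip(a, b))
    )
    return changed / len(baseline)


def needs_review(
    baseline: Sequence[Pixel], candidate: Sequence[Pixel], threshold: float = 0.01
) -> bool:
    """True when more than `threshold` of the pixels changed (assumed reading of '> 1%')."""
    return changed_fraction(baseline, candidate) > threshold
```

A small `tolerance` can absorb anti-aliasing noise; the right value depends on the app and is left to the caller.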

3. GUI-Only Admin Tools

Admin panels, legacy enterprise software, and embedded UIs with no API.

# Example: Generate a report from a legacy admin panel
Navigate to: http://admin.internal/reports
Click: "Export" → "CSV" → "Last 30 days"
Wait for download
Move file to: /tmp/report-{date}.csv
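
Only the clicks in this flow need the GUI. The final move can be done with a structured tool, in line with the tool selection rule. A minimal sketch, assuming `{date}` means an ISO date (`YYYY-MM-DD`) and that the download path is already known; `archive_report` is a hypothetical helper.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_report(downloaded: Path, dest_dir: Path, today: date) -> Path:
    """Move the exported CSV to dest_dir/report-{date}.csv and return the new path.

    The ISO date format is an assumption; the source only says {date}.
    """
    target = dest_dir / f"report-{today.isoformat()}.csv"
    shutil.move(str(downloaded), str(target))
    return target
```

Checking that the returned path exists gives a structured confirmation that the export succeeded, which is harder to get from a screenshot alone.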

4. Local Simulator Flows

Mobile simulator or desktop app testing that requires visual interaction.

# Example: iOS simulator validation
Launch: xcrun simctl launch booted com.example.MyApp
Take screenshot
Verify: "Welcome" text is visible in the header
Tap: "Get Started" button (coordinates or element description)
Verify: onboarding screen loads

Result Verification

Computer use output is inherently visual and unstructured. Always verify results with a structured check after GUI actions:

Verification Pattern

After each GUI action:
1. Take a screenshot
2. Verify the expected visual state (specific text, element position, color)
3. If verification fails: log "FAIL: {what was expected vs. what was seen}"
4. If unsure: take another screenshot from a wider viewport

At the end:
- List each action and its verification result
- Count: {N} actions taken, {M} verified OK, {K} failed
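
The pattern above amounts to a small log of expected versus observed state. One possible sketch follows; it assumes the expected and observed states are reduced to comparable strings, and `StepResult` and `VerificationLog` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StepResult:
    action: str
    expected: str
    seen: str

    @property
    def ok(self) -> bool:
        # A step is verified only when the observed state matches exactly.
        return self.expected == self.seen


@dataclass
class VerificationLog:
    steps: List[StepResult] = field(default_factory=list)

    def record(self, action: str, expected: str, seen: str) -> StepResult:
        step = StepResult(action, expected, seen)
        self.steps.append(step)
        return step

    def summary(self) -> str:
        """List each action with its result, then the final counts."""
        lines = []
        for s in self.steps:
            if s.ok:
                lines.append(f"OK: {s.action}")
            else:
                lines.append(
                    f"FAIL: {s.action}: expected {s.expected!r}, saw {s.seen!r}"
                )
        verified = sum(s.ok for s in self.steps)
        failed = len(self.steps) - verified
        lines.append(
            f"{len(self.steps)} actions taken, {verified} verified OK, {failed} failed"
        )
        return "\n".join(lines)
```

Calling `record` after every GUI action and printing `summary()` at the end produces the per-action list and the final count described above.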

Confidence Levels

| Confidence | Verification | Action |
|------------|--------------|--------|
| HIGH | Text matches exactly / element found by ID | Proceed |
| MEDIUM | Visual match but element found by position | Log and proceed |
| LOW | Can't find element / ambiguous screenshot | Stop, report to human |
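
The confidence table can be expressed as a small decision function. This is a sketch under stated assumptions: the two boolean inputs are a simplification of what a screenshot check would report, and `classify` and `NEXT_ACTION` are hypothetical names.

```python
from enum import Enum


class Confidence(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


# Action for each confidence level, taken from the table.
NEXT_ACTION = {
    Confidence.HIGH: "proceed",
    Confidence.MEDIUM: "log and proceed",
    Confidence.LOW: "stop and report to human",
}


def classify(element_found: bool, exact_match: bool) -> Confidence:
    """Map a check result to a confidence level.

    exact_match means the text matched exactly or the element was found by ID;
    a found element without an exact match is treated as a positional match.
    """
    if not element_found:
        return Confidence.LOW
    return Confidence.HIGH if exact_match else Confidence.MEDIUM


print(NEXT_ACTION[classify(element_found=True, exact_match=False)])
```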

Safety Guardrails

Computer use can cause irreversible actions (delete files, send emails, submit forms). Apply these guardrails:

Never Without Confirmation

- Form submissions in production environments
- Delete or "Archive" actions
- Payment or billing interactions
- Sending emails or messages
- Anything involving real user data

Screenshot Audit Trail

Keep screenshots of:

- State before any action
- State after each major action
- Final state

Dry-Run First

For complex GUI flows, describe the steps and ask for confirmation before executing:

Before I click "Submit", here's what will happen:
- Form data: {summary}
- This action cannot be undone
- Proceeding? (yes/no)
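
A confirmation gate like the one above could be wrapped around any irreversible step. The sketch below is illustrative only: `confirm_then_run` is a hypothetical helper, and the `ask` parameter defaults to `input` so the prompt can be swapped out in tests or non-interactive runs.

```python
from typing import Callable


def confirm_then_run(
    step: str,
    form_summary: str,
    action: Callable[[], None],
    ask: Callable[[str], str] = input,
) -> bool:
    """Describe an irreversible step and run it only after an explicit "yes"."""
    prompt = (
        f"Before I {step}, here's what will happen:\n"
        f"- Form data: {form_summary}\n"
        "- This action cannot be undone\n"
        "- Proceeding? (yes/no) "
    )
    if ask(prompt).strip().lower() != "yes":
        return False  # anything other than "yes" cancels the step
    action()
    return True
```

Treating every answer except a literal "yes" as a refusal keeps the default on the safe side.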

Computer Use vs. Playwright MCP

For web UIs, Playwright MCP is almost always better than computer use:

| Criterion | Playwright MCP | Computer Use |
|-----------|----------------|--------------|
| Reliability | High (DOM-based) | Medium (pixel-based) |
| Speed | Fast | Slow (screenshot per action) |
| Testability | Scriptable, repeatable | Hard to reproduce exactly |
| Cost | Low | High (vision model per screenshot) |
| Works on | Web browsers | Any visual surface |

Use Playwright MCP for: Web app testing, scraping, form automation on websites.

Use Computer Use for: Native desktop apps, embedded UIs, legacy apps with no API.

Cost Awareness

Computer use is expensive:

- Each screenshot = vision model inference (high token cost)
- A 10-step GUI flow = 10+ vision inferences
- Compare: a 10-step shell script = near-zero cost

Estimate before using: If a GUI flow has N steps, expect N × (screenshot tokens + generation tokens). For flows > 20 steps, consider whether a shell/API approach exists.
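
The estimate is simple arithmetic. In the sketch below the per-step token figures are placeholders supplied by the caller, not measured values, and `estimate_tokens` and `should_reconsider` are hypothetical names.

```python
def estimate_tokens(steps: int, screenshot_tokens: int, generation_tokens: int) -> int:
    """N x (screenshot tokens + generation tokens), as in the rule of thumb above."""
    return steps * (screenshot_tokens + generation_tokens)


def should_reconsider(steps: int, max_steps: int = 20) -> bool:
    """Flag flows longer than 20 steps for a shell/API alternative."""
    return steps > max_steps


# Placeholder per-step figures, for illustration only.
print(estimate_tokens(steps=10, screenshot_tokens=1500, generation_tokens=200))
print(should_reconsider(25))
```

This is a lower bound: retries and the extra screenshots taken for verification add further inferences on top of the N steps.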

Claude Desktop Requirement

Computer use requires the Claude Desktop app (not CLI or Web). The Desktop app has the screen capture and input simulation capabilities that CLI lacks.

CLI:     ❌ Computer use not available
Web:     ❌ Computer use not available
Desktop: ✅ Computer use available

Computer use and GUI automation patterns — when to use GUI automation vs shell/MCP/browser tools, visual validation techniques, native app testing, and guardrails for visual regression workflows

9 stars

tools

Updated Apr 7, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Reported % |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 7, 2026, 2:27 AM (42.4s, 1 file scanned)

SKILL.md

description: Computer use and GUI automation patterns: when to use GUI automation vs shell/MCP/browser tools, visual validation techniques, native app testing, and guardrails for visual regression workflows
model: claude-opus-4-6

Install manually

# Clone the repo
git clone https://github.com/markus41/claude.git

# Make sure the global Claude Code skills folder exists, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r claude/plugins/claude-code-expert/skills/computer-use ~/.claude/skills/

Repository

markus41/claude

9 stars

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT