# Desktop Computer Automation
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add mediar-ai/skillhubz packages/skills/skills/desktop-computer-automation
CRITICAL RULES:
- Never run midscene commands in the background. Each command must run synchronously so you can read its output (especially screenshots) before deciding the next action.
- Run only one midscene command at a time. Wait for the previous command to finish, read the screenshot, then decide the next action.
- Allow enough time for each command to complete. Midscene commands involve AI inference and screen interaction, which can take longer than typical shell commands.
- Always report task results before finishing.
Control your desktop (macOS, Windows, Linux) using npx @midscene/computer@1. Each CLI command maps directly to an MCP tool -- you (the AI agent) act as the brain, deciding which actions to take based on screenshots.
Midscene requires models with strong visual grounding capabilities. Configure these environment variables:
MIDSCENE_MODEL_API_KEY="your-api-key"
MIDSCENE_MODEL_NAME="model-name"
MIDSCENE_MODEL_BASE_URL="https://..."
MIDSCENE_MODEL_FAMILY="family-identifier"
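A missing variable usually only surfaces once a command reaches the model, so a quick pre-flight check can save a round trip. The following is a minimal sketch in POSIX shell; the function name `check_midscene_env` is hypothetical and not part of Midscene, and it only tests that the four variables listed above are non-empty, not that their values are valid.

```shell
# Hypothetical helper (not part of Midscene): report which of the four
# MIDSCENE_MODEL_* variables listed above are unset or empty.
check_midscene_env() {
  missing=0
  for name in MIDSCENE_MODEL_API_KEY MIDSCENE_MODEL_NAME \
              MIDSCENE_MODEL_BASE_URL MIDSCENE_MODEL_FAMILY; do
    # Indirect lookup of the variable named by $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # Returns 0 when all four are set, 1 otherwise.
  return "$missing"
}
```

Calling `check_midscene_env` before `connect` returns a non-zero status and names each missing variable on stderr.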
npx @midscene/computer@1 connect
npx @midscene/computer@1 connect --displayId <id>
npx @midscene/computer@1 list_displays
npx @midscene/computer@1 take_screenshot
Use act to interact with the computer. Describe what you want to do in natural language:
npx @midscene/computer@1 act --prompt "type hello world in the search field and press Enter"
npx @midscene/computer@1 act --prompt "drag the file icon to the Trash"
npx @midscene/computer@1 act --prompt "search for the weather in Shanghai using the Chrome browser, tell me the result"
npx @midscene/computer@1 disconnect
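The commands above can be strung into one strictly sequential session: connect, take a screenshot, act, disconnect. The sketch below assumes a POSIX-style shell (sh or bash); the `MIDSCENE` variable and the `run_session` function are hypothetical names introduced here for illustration. Every command runs in the foreground, one at a time, in line with the critical rules.

```shell
# Hypothetical override hook: set MIDSCENE=echo to dry-run the sequence
# without touching the desktop. The default is the CLI used in this document.
MIDSCENE="${MIDSCENE:-npx @midscene/computer@1}"

run_session() {
  prompt="$1"
  # $MIDSCENE is intentionally unquoted so "npx @midscene/computer@1"
  # splits into a command and its argument (sh/bash behavior).
  $MIDSCENE connect || return 1
  $MIDSCENE take_screenshot
  # Capture the act status without aborting, so disconnect always runs.
  $MIDSCENE act --prompt "$prompt" && status=0 || status=$?
  $MIDSCENE disconnect
  return "$status"
}
```

For example, `run_session "type hello world in the search field and press Enter"` runs the four steps in order and returns the exit status of the `act` step. In a real agent loop you would read each screenshot before choosing the next prompt rather than scripting the whole sequence up front.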
- Use act to perform the desired action.
- Open the target app before acting on it (for example, open -a <AppName> on macOS).
- Use list_displays if an app is not visible.
- … act command when possible.
export PATH="/usr/sbin:/usr/bin:/bin:/sbin:$PATH"
On macOS, open System Settings > Privacy & Security > Accessibility and add your terminal app.
xcode-select --install
Check .env file contains MIDSCENE_MODEL_API_KEY=<your-key>.