skills/pinchtab-mcp/SKILL.md
Use this skill when a task requires browser automation through PinchTab's MCP server connected to a remote browser instance. Covers navigation, element interaction, data extraction, form filling, multi-step flows, and session management via MCP tools.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add pinchtab/pinchtab pinchtab-mcp
Use MCP tools to control a browser through the PinchTab HTTP API. The MCP server defaults to http://127.0.0.1:9867; for remote or containerized PinchTab instances, override with the PINCHTAB_SERVER env var (e.g. PINCHTAB_SERVER=http://pinchtab:9867).
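As a rough illustration of the override described above, the sketch below resolves the server address from the environment and falls back to the documented default. The helper name `resolve_server` is hypothetical and not part of PinchTab; only the default URL and the PINCHTAB_SERVER variable come from the text.

```python
import os

DEFAULT_SERVER = "http://127.0.0.1:9867"  # default stated in this document


def resolve_server(env=None):
    """Return the PinchTab base URL, preferring PINCHTAB_SERVER when it is set.

    `env` is any mapping; it defaults to the process environment.
    """
    env = os.environ if env is None else env
    value = env.get("PINCHTAB_SERVER", "").strip()
    # An empty or missing variable falls back to the default address.
    return (value or DEFAULT_SERVER).rstrip("/")
```

For a containerized instance, `resolve_server({"PINCHTAB_SERVER": "http://pinchtab:9867"})` would return the container address unchanged.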
1. pinchtab_navigate(url="https://example.com"): auto-creates a session and tab.
2. pinchtab_snapshot(interactive=true, compact=true): returns numbered refs like e5, e12.
3. pinchtab_click(selector="e5"): use refs from the snapshot.
4. pinchtab_get_text() or re-snapshot to confirm the action succeeded.

Critical rule: Element refs (e5, e12) are ephemeral. They expire after navigation or DOM updates. Always re-call pinchtab_snapshot after a page load before using refs.
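One way to enforce the ref rule in client code is a small guard that refuses to pass a ref to a tool until a snapshot has been taken since the last navigation. This is a minimal sketch, not PinchTab's own behavior: `call_tool` is a hypothetical stand-in for however your MCP client invokes a tool, and the staleness policy (refs go stale after pinchtab_navigate or any call with waitNav=true) is a conservative assumption.

```python
import re

REF_PATTERN = re.compile(r"e\d+")  # refs look like e5, e12


class StaleRefError(RuntimeError):
    """Raised when a ref is used without a fresh snapshot."""


class RefGuard:
    def __init__(self, call_tool):
        self._call_tool = call_tool  # hypothetical: call_tool(name, **args)
        self._fresh = False          # no snapshot has been taken yet

    def call(self, name, **args):
        selector = args.get("selector")
        uses_ref = isinstance(selector, str) and REF_PATTERN.fullmatch(selector)
        if uses_ref and not self._fresh:
            raise StaleRefError(f"{selector} may be stale; call pinchtab_snapshot first")
        result = self._call_tool(name, **args)
        if name == "pinchtab_snapshot" or args.get("snap"):
            self._fresh = True       # refs were just refreshed
        elif name == "pinchtab_navigate" or args.get("waitNav"):
            self._fresh = False      # assumed: a page load invalidates refs
        return result
```

The guard raises instead of silently re-snapshotting because a new snapshot may assign different refs, so the caller has to pick the target again.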
Choose the cheapest tool that satisfies your goal:
| Goal | Tool | Token Cost |
|------|------|------------|
| Check a specific value | pinchtab_eval(expression="document.title") | Lowest |
| Find a specific element | pinchtab_find(query="login button") | Low |
| Read page text only | pinchtab_get_text() | Low |
| Find interactive elements | pinchtab_snapshot(interactive=true, compact=true) | Medium |
| Full page structure | pinchtab_snapshot() | Medium-High |
| Visual verification | pinchtab_screenshot() | Highest |
Default observation: pinchtab_snapshot(interactive=true, compact=true) — returns only interactive elements in compact format. Use this as your starting point.
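The selection rule can be sketched as a lookup over the cost column of the table. The numeric ranks and the `cheapest` helper are illustrative assumptions; only the tool names and the cost labels come from the table.

```python
# Ordering of the table's cost labels, lowest first (numeric ranks are an assumption).
COST_RANK = {"Lowest": 0, "Low": 1, "Medium": 2, "Medium-High": 3, "Highest": 4}

# Cost label per observation tool, copied from the table above.
TOOL_COST = {
    "pinchtab_eval": "Lowest",
    "pinchtab_find": "Low",
    "pinchtab_get_text": "Low",
    "pinchtab_snapshot(interactive=true, compact=true)": "Medium",
    "pinchtab_snapshot()": "Medium-High",
    "pinchtab_screenshot": "Highest",
}


def cheapest(candidates):
    """Return the candidate with the lowest cost rank (first one wins on ties)."""
    return min(candidates, key=lambda tool: COST_RANK[TOOL_COST[tool]])
```

Pass in only the tools that would actually satisfy the goal; the helper then picks the least expensive of them.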
pinchtab_navigate(url="https://example.com")
- The URL must include an http:// or https:// scheme.
- Follow up with pinchtab_snapshot() to get element refs.

After navigation: Always call pinchtab_snapshot() before interacting. The page may have redirects, modals, or cookie banners.
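A client can check the scheme before calling pinchtab_navigate, which avoids a round trip that would end in an invalid-URL error. The sketch below uses Python's standard `urllib.parse`; the helper name `ensure_scheme` is hypothetical.

```python
from urllib.parse import urlparse


def ensure_scheme(url):
    """Return url unchanged if it has an http(s) scheme and a host, else raise."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"URL must start with http:// or https://: {url!r}")
    return url
```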
pinchtab_snapshot(interactive=true, compact=true)
Returns an accessibility tree with numbered refs:
[0]<a href="/about" />
About
[2]<button aria-label="Sign in" />
Sign in
[5]<input type="text" placeholder="Search" />
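If the snapshot text needs to be processed programmatically, a small parser along these lines may help. It assumes the output has exactly the shape of the sample above (an `[index]<tag attrs />` line, optionally followed by one line of visible text); the real format may differ, and `parse_snapshot` is a hypothetical helper.

```python
import re

ELEMENT = re.compile(r'^\[(\d+)\]<(\w+)(.*?)\s*/>$')
ATTR = re.compile(r'([\w-]+)="([^"]*)"')


def parse_snapshot(text):
    """Parse snapshot lines into dicts with index, tag, attrs, and label."""
    elements = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ELEMENT.match(line)
        if match:
            index, tag, attrs = match.groups()
            elements.append({
                "index": int(index),
                "tag": tag,
                "attrs": dict(ATTR.findall(attrs)),
                "label": "",
            })
        elif elements:
            # A non-element line is treated as the label of the previous element.
            elements[-1]["label"] = line
    return elements
```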
Key rules:
- Only elements marked with [index] are interactive.
- Use diff=true after an interaction to see only changed elements (saves tokens).
- Use selector to scope the snapshot to a specific section.

pinchtab_get_text()
Use when you only need to read content (articles, dashboards, results). Cheaper than snapshot when you won't interact with elements.
pinchtab_find(query="submit button")
Semantic search for elements without a full snapshot. Returns matching refs. Great for known targets.
pinchtab_screenshot()
Returns an MCP image (image/jpeg by default) — clients render it inline. The text block is always the JSON envelope {"format": "jpeg"|"png", "annotations": [...]}; annotations is [] by default and becomes [{ref, role, name, tag, box: {x, y, w, h}}, ...] with annotate=true so refs in the picture map back to the same selectors used by pinchtab_click etc. Screenshots are heavy (500KB–2MB per image), so use sparingly.
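To show how the JSON envelope might be consumed, the sketch below maps each annotated ref to the center of its box. It assumes the envelope text is exactly the structure described above; `annotation_centers` is a hypothetical helper, not a PinchTab function.

```python
import json


def annotation_centers(envelope_text):
    """Map each annotated ref to the (x, y) center of its bounding box."""
    envelope = json.loads(envelope_text)
    centers = {}
    for item in envelope.get("annotations", []):
        box = item["box"]
        centers[item["ref"]] = (box["x"] + box["w"] / 2, box["y"] + box["h"] / 2)
    return centers
```

Without annotate=true the annotations list is empty, so the function returns an empty dict.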
- Use quality=60 to reduce file size for JPEG screenshots.
- Use selector="e5" to capture a specific element instead of the full page.
- Use annotate=true to overlay numbered ref boxes and get the matching annotations list.
- Use beyondViewport=true to capture the entire scrollable document (annotation box coordinates become document-relative). Ignored when selector is set.

When to use screenshots:
When NOT to use screenshots:
- To read text, use pinchtab_get_text() instead.
- To find elements, use pinchtab_snapshot() instead.
- To see what changed after an action, use pinchtab_snapshot(diff=true) instead.

pinchtab_click(selector="e5")
- Use a ref from the latest snapshot (e.g. e5).
- If the click triggers navigation, add waitNav=true.
- Add snap=true to get a snapshot after the click.

pinchtab_fill(selector="e3", value="user@example.com")
- Prefer pinchtab_fill over pinchtab_type: it sets the value directly via JS.
- Use pinchtab_type only when the site depends on keystroke events (rare).

pinchtab_type(selector="e3", text="hello")
Use when the site needs real keystrokes (e.g., some autocomplete widgets).
pinchtab_press(key="Enter")
Common keys: Enter, Tab, Escape, ArrowDown, ArrowUp, Backspace.
pinchtab_select(selector="e7", value="Option Label")
Matches by visible text or value attribute.
pinchtab_scroll(pixels=500)
Positive = down, negative = up. Or use selector to scroll an element into view.
Form fill and submit:

1. pinchtab_navigate(url="...")
2. pinchtab_snapshot(interactive=true, compact=true): get refs for all fields
3. pinchtab_fill(selector="e3", value="..."): fill each field
4. pinchtab_click(selector="e12", waitNav=true): submit
5. pinchtab_get_text(): verify success message

Multi-page navigation:

1. pinchtab_navigate(url="...")
2. pinchtab_snapshot(interactive=true, compact=true)
3. pinchtab_click(selector="e5", snap=true): click next, get updated refs
4. pinchtab_get_text() or pinchtab_screenshot()

Search and extract:

1. pinchtab_navigate(url="https://example.com")
2. pinchtab_find(query="search input"): find search box
3. pinchtab_fill(selector="e3", value="query")
4. pinchtab_click(selector="e5", waitNav=true): submit search
5. pinchtab_snapshot(interactive=true, compact=true): see results
6. pinchtab_get_text(): extract data

For complex tasks, follow this structured approach:
Before starting, outline your approach:
Track progress mentally or in notes for long tasks.
After every action, verify it succeeded:
Use pinchtab_get_text() or pinchtab_snapshot(diff=true) to verify.
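The act-then-verify habit can be sketched as a tiny wrapper. It assumes a hypothetical `call_tool(name, **args)` function supplied by your MCP client, and that pinchtab_get_text returns the page text as a plain string; neither is confirmed by this document.

```python
def act_and_verify(call_tool, action, action_args, expected_text):
    """Run one action, then check that the page text contains expected_text.

    call_tool is a hypothetical callable: call_tool(name, **args).
    """
    call_tool(action, **action_args)
    page_text = call_tool("pinchtab_get_text")  # assumed to return a str
    return expected_text in page_text
```

A False result is the cue to re-snapshot and retry rather than continue with the next step.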
| Error | Recovery |
|-------|----------|
| ref not found | Re-call pinchtab_snapshot() — refs are stale |
| Element not visible | Scroll first: pinchtab_scroll(pixels=500) |
| Page didn't change | Try alternative selector or press Enter |
| Modal/popup blocking | Find and click close/dismiss button |
| Login required | Navigate to login page, fill credentials |
| CAPTCHA/Cloudflare | Report to user — requires manual intervention |
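Some rows of the table map directly to a tool call, so the recovery choice can be written as a lookup. This is only a sketch: the situation keys and the `report_to_user` marker are hypothetical names, and rows that need judgment (dismissing a modal, logging in) are left out on purpose.

```python
# Situation key (hypothetical) -> next tool call taken from the table above.
RECOVERY = {
    "ref_not_found": ("pinchtab_snapshot", {}),                   # refs are stale
    "element_not_visible": ("pinchtab_scroll", {"pixels": 500}),  # scroll first
    "page_did_not_change": ("pinchtab_press", {"key": "Enter"}),  # one of two options in the table
    "captcha": None,                                              # needs manual intervention
}


def next_step(situation):
    """Return (tool_name, args) for a known situation, or a report-to-user marker."""
    if situation not in RECOVERY:
        raise KeyError(f"no recovery rule for {situation!r}")
    step = RECOVERY[situation]
    if step is None:
        return ("report_to_user", {})  # hypothetical marker, not an MCP tool
    return step
```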
When the task is complete:
Use pinchtab_get_text() and pinchtab_snapshot() before interacting.

pinchtab_list_tabs() # List all open tabs
pinchtab_close_tab(tabId="...") # Close a specific tab
Use the tabId parameter on any tool to target a specific tab.

Use the wait tools for async content (spinners, XHR, lazy-loaded elements):
pinchtab_wait(ms=2000) # Fixed delay (last resort)
pinchtab_wait_for_selector(selector="e5") # Wait for element
pinchtab_wait_for_text(text="Success") # Wait for text
pinchtab_wait_for_url(url="**/dashboard") # Wait for URL change
pinchtab_wait_for_load(load="network-idle") # Wait for page load
Timeout: 10s default, 30s max. Prefer selector/text waits over fixed delays.
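The preference order and the timeout limits can be sketched as two helpers. The tool names and their selector, text, and ms arguments come from the list above; the helper names are hypothetical, and clamping an oversized timeout to 30 seconds (rather than rejecting it) is an assumption about how a client might behave.

```python
DEFAULT_TIMEOUT_S = 10  # default stated above
MAX_TIMEOUT_S = 30      # maximum stated above


def effective_timeout(requested=None):
    """Return the timeout in seconds, applying the default and the cap."""
    if requested is None:
        return DEFAULT_TIMEOUT_S
    if requested <= 0:
        raise ValueError("timeout must be positive")
    return min(requested, MAX_TIMEOUT_S)  # assumption: clamp instead of reject


def choose_wait(selector=None, text=None, fallback_ms=2000):
    """Prefer a selector or text wait; use a fixed delay only as a last resort."""
    if selector:
        return ("pinchtab_wait_for_selector", {"selector": selector})
    if text:
        return ("pinchtab_wait_for_text", {"text": text})
    return ("pinchtab_wait", {"ms": fallback_ms})
```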
- Use waitNav=true for clicks that trigger navigation.
- Use pinchtab_get_text() for prose content.
- Use pinchtab_snapshot() for structured data.
- Verify results with pinchtab_get_text().

The following require the CLI or HTTP API (not available via MCP):
For these, use the pinchtab CLI or HTTP API directly.
- Re-snapshot after pinchtab_navigate or pinchtab_click(waitNav=true).
- Use diff=true after interactions. It shows only changed elements, saving tokens.

| Symptom | Cause | Fix |
|---------|-------|-----|
| Connection refused | PinchTab server not running | Check container status, restart |
| ref not found | Stale element ref | Re-call pinchtab_snapshot() |
| evaluate not allowed | security.allowEvaluate is false | Use pinchtab_find instead |
| invalid URL | Missing scheme | Include http:// or https:// |
| Element not found | Page not loaded | Use pinchtab_wait_for_selector |
| Action seems ignored | Page changed mid-action | Re-snapshot, use fresh refs |