tools/browser/SKILL.md
Teaches the agent how to use the headless browser tools — per-agent persistent sessions, the open_login helper, refs, batching, detail levels, console/network inspection, dialogs, viewport/UA, cookies, and lock/unlock for multi-agent safety.
Headless browser automation via element refs. Every action returns a page snapshot automatically — you rarely need to call browser_snapshot separately.
Your browser session is persistent across runs and isolated per agent. Cookies, localStorage, and IndexedDB are stored on disk, keyed by the active agent. That means:
- The user signs in once per site (via browser_open_login); after that you simply call browser_navigate and run logged-in.
- Never type credentials yourself; call browser_open_login instead.

When browser_navigate returns a LOGIN_REQUIRED error envelope, the response includes the login URL the page redirected to. Your job is:
1. Call browser_open_login with the suggested URL.
2. After the user closes the window, retry browser_navigate.

// 1. Got LOGIN_REQUIRED from browser_navigate
// 2. Open the helper window
{ "url": "https://github.com/login" }
// 3. After the user signs in and closes the window, retry
{ "url": "https://github.com/notifications" }
If sign-in fails repeatedly (wrong password, expired session, 2FA issues), ask the user before calling browser_reset_session — that tool wipes the entire profile, including any other sites the user signed into for this agent.
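The login hand-off described above can be sketched as a small wrapper. This is a minimal sketch, not the plugin's implementation: `call_tool` is a hypothetical dispatcher that sends a tool name plus JSON arguments and returns the parsed response, and the `login_url` field inside the error object is an assumed name for the suggested login URL.

```python
from typing import Any, Callable, Dict

# Hypothetical dispatcher: takes a tool name and its JSON arguments,
# returns the tool's parsed JSON response.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def navigate_with_login(call_tool: ToolCall, url: str) -> Dict[str, Any]:
    """Navigate to url, handing off to the user once if a login wall appears."""
    response = call_tool("browser_navigate", {"url": url})
    error = response.get("error") or {}
    if response.get("ok", True) or error.get("code") != "LOGIN_REQUIRED":
        return response  # success, or an error this helper does not handle

    # Assumption: the suggested login URL is exposed as error["login_url"].
    login_url = error.get("login_url")
    call_tool("browser_open_login", {"url": login_url} if login_url else {})

    # The helper window has closed (or timed out); retry the original page once.
    return call_tool("browser_navigate", {"url": url})
```

The wrapper retries only once and never calls browser_reset_session, so the decision to wipe the profile stays with the user.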
1. browser_navigate(url) → snapshot with refs [E1] [E2] [E3]
2. browser_do([type E1, type E2, click E3]) → snapshot of result page
Element refs — browser_navigate and all actions return refs like [E1] input, [E2] button "Submit". Use these refs in subsequent calls.
Detail levels — Every tool accepts detail to control snapshot verbosity:
- none: action result only, no snapshot (~10 tokens)
- compact: single-line refs, default for actions (~200 tokens)
- standard: multi-line with attributes, default for browser_snapshot (~500 tokens)
- full: all attributes + IDs + aria-labels + page text excerpt (~1000+ tokens)

Use compact (default) for speed. Use full when you need to identify elements by ID or aria-label on complex pages.
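To make the trade-off concrete, the approximate token costs above can drive a simple choice of detail level. This is an illustrative sketch only: the helper name `richest_detail_within` is hypothetical, and the numbers are the rough estimates quoted above, with 1000 treated as a lower bound for full.

```python
# Approximate snapshot sizes in tokens, taken from the detail levels above.
# "full" is listed as 1000+, so 1000 is treated as a lower bound.
DETAIL_TOKENS = {"none": 10, "compact": 200, "standard": 500, "full": 1000}


def richest_detail_within(budget_tokens: int) -> str:
    """Return the most verbose detail level whose estimate fits the budget."""
    chosen = "none"  # cheapest level, used even when the budget is tiny
    for level, cost in DETAIL_TOKENS.items():  # dicts keep insertion order
        if cost <= budget_tokens:
            chosen = level
    return chosen
```

For example, a budget of 300 tokens selects compact, while a budget of 999 selects standard.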
browser_navigate: navigate to a URL and get the initial page snapshot.
{ "url": "https://example.com", "detail": "compact" }
Use wait_until: "networkidle" for SPAs.
browser_do: the primary interaction tool. Batch multiple actions in one call. All refs from the previous snapshot stay valid throughout the batch.
{
"actions": [
{ "action": "type", "ref": "E1", "text": "user@example.com" },
{ "action": "type", "ref": "E2", "text": "password123" },
{ "action": "click", "ref": "E3" }
],
"detail": "compact"
}
Supported actions: click, type, select, hover, scroll, press_key, wait_for.
If an action fails, execution stops and the response includes: which action failed (index), the error, and a snapshot of current state for recovery.
Use wait_after: "domstable" or "networkidle" when the last action triggers async content.
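A batch for browser_do can be assembled and checked before it is sent. The sketch below is hypothetical helper code, not part of the plugin: `build_batch` is an invented name, and placing wait_after at the top level of the argument object is an assumption based on the description above.

```python
from typing import Any, Dict, List, Optional

# Action names listed above as supported by browser_do.
SUPPORTED_ACTIONS = {"click", "type", "select", "hover", "scroll", "press_key", "wait_for"}


def build_batch(actions: List[Dict[str, Any]], detail: str = "compact",
                wait_after: Optional[str] = None) -> Dict[str, Any]:
    """Validate action names and assemble a browser_do argument object."""
    for index, step in enumerate(actions):
        if step.get("action") not in SUPPORTED_ACTIONS:
            raise ValueError(f"action {index}: unsupported {step.get('action')!r}")
    payload: Dict[str, Any] = {"actions": actions, "detail": detail}
    if wait_after is not None:
        # Assumption: wait_after is a top-level field, e.g. "domstable" or "networkidle".
        payload["wait_after"] = wait_after
    return payload
```

Rejecting an unknown action name locally saves a round-trip that would otherwise stop the batch partway through.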
Individual action tools — each returns a snapshot automatically. Prefer browser_do when performing 2+ actions in sequence.
browser_snapshot: re-inspect the page without acting. Usually not needed, since actions return snapshots automatically. Use it when you need to re-check the page after browser_wait_for or browser_execute_script.
Press keyboard keys: Enter, Escape, Tab, arrow keys, or characters with modifiers.
browser_wait_for: wait for text to appear or disappear, or for a specified time to elapse.
Visual debugging — saves a PNG. Use full_page: true for the entire scrollable page.
browser_execute_script: escape hatch for arbitrary JavaScript.
browser_open_login: opens a visible browser window so the user can sign in. Cookies are saved per-agent and are immediately visible to your subsequent browser_navigate calls.
{ "url": "https://github.com/login" }
{ "url": "https://github.com/login", "timeout_ms": 600000 }
{ } // open a blank window so the user can navigate freely
Returns when the user closes the window or timeout_ms (default 5 min) elapses. Response includes final_url so you can confirm where the user ended up.
Always reach for this when you hit a login wall. Never type passwords yourself.
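One way to use final_url is a rough check that the user actually left the login page before you retry. This is a heuristic sketch under stated assumptions: `left_login_page` is a hypothetical helper, and it assumes final_url sits under data in the response.

```python
from typing import Any, Dict
from urllib.parse import urlsplit


def left_login_page(response: Dict[str, Any], login_url: str) -> bool:
    """True if final_url points somewhere other than the login page."""
    # Assumption: final_url sits under "data" in the response envelope.
    final_url = (response.get("data") or {}).get("final_url")
    if not final_url:
        return False
    final, login = urlsplit(final_url), urlsplit(login_url)
    # Compare host and path only, so query strings such as return targets are ignored.
    return (final.netloc, final.path.rstrip("/")) != (login.netloc, login.path.rstrip("/"))
```

A False result does not prove the sign-in failed (some sites stay on the same path), so treat it as a prompt to ask the user rather than as an error.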
browser_reset_session: wipes the active agent's profile. It closes the headless browser and removes the on-disk data store (cookies, localStorage, IndexedDB, cache). The next browser_navigate spawns a fresh, logged-out profile.
{}
Destructive — confirm with the user first.
These tools return the standard JSON envelope ({ok, data} or {ok:false, error:{code,message,hint?}}).
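The envelope shape above lends itself to a small unwrapping helper. The following is a minimal sketch: `ToolError` and `unwrap` are hypothetical names, and the "UNKNOWN" fallback code is an invention for responses that omit the error object.

```python
from typing import Any, Dict


class ToolError(Exception):
    """Raised when a tool response has ok set to false."""

    def __init__(self, code: str, message: str, hint: str = "") -> None:
        super().__init__(f"{code}: {message}" + (f" (hint: {hint})" if hint else ""))
        self.code = code
        self.hint = hint


def unwrap(envelope: Dict[str, Any]) -> Any:
    """Return envelope["data"] on success, raise ToolError otherwise."""
    if envelope.get("ok"):
        return envelope.get("data")
    error = envelope.get("error") or {}
    # "UNKNOWN" is a local placeholder, not a code defined by the tools.
    raise ToolError(error.get("code", "UNKNOWN"), error.get("message", ""), error.get("hint", ""))
```

Callers can then branch on `ToolError.code` (for example LOGIN_REQUIRED or LOCK_HELD) instead of inspecting raw dictionaries.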
browser_console_messages: read JavaScript console output captured since page load. Useful for diagnosing client-side errors.
{ "level": "error", "clear": false }
Returns data.messages: [{level, message, timestamp, location}].
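The returned messages can be reduced to a short error report. This sketch uses only the fields listed above; `console_errors` is a hypothetical helper, and it assumes location can be rendered as text.

```python
from typing import Any, Dict, List


def console_errors(data: Dict[str, Any]) -> List[str]:
    """Format error-level entries from data.messages as 'location: message'."""
    return [
        f"{entry.get('location', '?')}: {entry.get('message', '')}"
        for entry in data.get("messages", [])
        if entry.get("level") == "error"
    ]
```

Filtering client-side like this is only needed when you requested all levels; passing level: "error" to the tool achieves the same narrowing at the source.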
List fetch/XHR requests the page has made. Use failed_only: true to surface 4xx/5xx and network errors.
{ "failed_only": true, "url_contains": "/api/" }
Returns data.requests: [{method, url, status, ok, duration_ms, kind}].
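Failed requests are easier to scan when grouped by status. The sketch below relies only on the fields listed above; `failures_by_status` is a hypothetical helper, and how network errors report their status is not specified here, so they are grouped under whatever value the tool returns.

```python
from collections import defaultdict
from typing import Any, Dict, List


def failures_by_status(data: Dict[str, Any]) -> Dict[Any, List[str]]:
    """Group requests whose ok flag is false by their status value."""
    grouped: Dict[Any, List[str]] = defaultdict(list)
    for request in data.get("requests", []):
        if not request.get("ok"):
            grouped[request.get("status")].append(
                f"{request.get('method')} {request.get('url')}"
            )
    return dict(grouped)
```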
browser_handle_dialog: pre-register the policy for the next alert / confirm / prompt before running the action that triggers it.
{ "action": "accept", "prompt_text": "yes" }
{ "action": "dismiss" }
{ "action": "status" }
If you never call this tool, the default policy is accept.
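Because the policy must be registered before the triggering action, the ordering can be captured in one helper. This is a sketch under assumptions: `call_tool` is a hypothetical dispatcher that sends a tool name and JSON arguments, and the click is issued through browser_do.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical dispatcher: tool name plus JSON arguments in, parsed response out.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def click_with_dialog_policy(call_tool: ToolCall, ref: str, policy: str = "accept",
                             prompt_text: Optional[str] = None) -> Dict[str, Any]:
    """Register a dialog policy, then click the element that opens the dialog."""
    dialog_args: Dict[str, Any] = {"action": policy}
    if prompt_text is not None:
        dialog_args["prompt_text"] = prompt_text
    call_tool("browser_handle_dialog", dialog_args)  # must come first
    return call_tool("browser_do", {"actions": [{"action": "click", "ref": ref}]})
```

Passing policy="dismiss" is the safer choice for destructive confirmations you did not intend to trigger.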
Resize the headless WebKit viewport (e.g. mobile-emulation widths).
Override the User-Agent header for subsequent navigations. Pass empty/null to reset.
{ "action": "get", "domain": "example.com" }
{ "action": "set", "cookie": { "name": "x", "value": "y", "domain": "example.com" } }
{ "action": "clear", "domain": "example.com" }
Cooperative lock so two agents don't fight over the same headless browser. Advisory only — other agents are expected to honor it.
{ "action": "lock", "owner": "agent-alice" }
... do work ...
{ "action": "unlock", "owner": "agent-alice" }
{ "action": "status" }
If lock returns {ok: false, error: {code: "LOCK_HELD", ...}}, wait and retry.
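The wait-and-retry advice can be wrapped in a context manager so the lock is always released. This is a minimal sketch, not the plugin's API: `lock_tool` is a hypothetical callable that sends the JSON arguments shown above to the lock tool, and the attempt count and delay are arbitrary defaults.

```python
import time
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator


@contextmanager
def browser_lock(lock_tool: Callable[[Dict[str, Any]], Dict[str, Any]], owner: str,
                 attempts: int = 5, delay_s: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> Iterator[None]:
    """Acquire the advisory lock, retrying on LOCK_HELD, and unlock on exit."""
    for attempt in range(attempts):
        response = lock_tool({"action": "lock", "owner": owner})
        if response.get("ok"):
            break
        code = (response.get("error") or {}).get("code")
        if code != "LOCK_HELD":
            raise RuntimeError(f"lock failed: {code}")
        if attempt < attempts - 1:
            sleep(delay_s)  # another agent holds the lock; wait before retrying
    else:
        raise TimeoutError(f"lock still held after {attempts} attempts")
    try:
        yield
    finally:
        lock_tool({"action": "unlock", "owner": owner})
```

Usage would look like `with browser_lock(send_lock, "agent-alice"): ...`, where `send_lock` is whatever function dispatches the lock tool in your environment.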
- Start with browser_navigate: it gives you the refs you need.
- Batch actions with browser_do to minimize round-trips.
- Use detail: "none" for intermediate actions where you already know the next step.
- If refs go stale, call browser_snapshot to get fresh ones.
- For pages that load content asynchronously, use wait_until: "networkidle" on navigate, or wait_after: "domstable" on browser_do.
- If you suspect a client-side error, call browser_console_messages({"level": "error"}) to confirm.
- If a dialog is expected, pre-register a policy with browser_handle_dialog({"action": "accept"}).
- At a login wall, call browser_open_login. Never type credentials.
- [...] osaurus.chrome plugin will provide a real-Chromium driver for those sites.