skills/1jehuang/firefox-browser/SKILL.md
Control the user's Firefox browser with their logins and cookies intact. Use when you need to browse websites as the user, interact with authenticated pages, fill forms, click buttons, take screenshots, or get page content. (user)
Control the user's actual Firefox browser session via WebSocket. This uses their real browser with existing logins and cookies - not a headless browser.
# 0. If Firefox isn't running, start it first
nohup firefox &>/dev/null &
# 1. Check connection
browser ping
# 2. See what tabs are open
browser listTabs '{}'
# 3. Start a new session (recommended)
browser newSession '{"url": "https://example.com"}'
# 4. Read the page with interactable elements marked
browser getContent '{"format": "annotated"}'
browser <action> '<json_params>'
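For scripting, the `browser <action> '<json_params>'` pattern can be wrapped in a small helper. This is a minimal Python sketch, assuming the `browser` executable is on `PATH` and prints JSON to stdout (inferred from the JSON results shown in this document, not confirmed). The names `build_command` and `run_browser` are hypothetical and not part of the skill.

```python
import json
import subprocess

def build_command(action, params=None):
    """Return the argv list for `browser <action> '<json_params>'`.

    When params is None the JSON argument is omitted, as in `browser ping`.
    """
    argv = ["browser", action]
    if params is not None:
        argv.append(json.dumps(params))
    return argv

def run_browser(action, params=None, timeout=30):
    """Run the command and parse stdout as JSON (assumed output format)."""
    result = subprocess.run(
        build_command(action, params),
        capture_output=True, text=True, check=True, timeout=timeout,
    )
    return json.loads(result.stdout)

# Only builds the argument list; nothing is executed here.
print(build_command("newSession", {"url": "https://example.com"}))
```

Passing an argument list to `subprocess.run` avoids shell quoting problems when the JSON contains quotes or spaces.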
| Action | Description | Key Params |
|--------|-------------|------------|
| listTabs | List all open tabs across windows | - |
| newSession | Create new tab to work in | url (optional) |
| setActiveTab | Switch which tab agent works on | tabId, focus |
| getActiveTab | Get current tab info | - |
| Action | Description | Key Params |
|--------|-------------|------------|
| navigate | Go to URL in current tab | url, wait, newTab |
| getContent | Get page content | format: annotated, text, html |
| getInteractables | List clickable elements and inputs | selector (optional scope) |
| screenshot | Capture visible area as PNG | filename (optional) |
| Action | Description | Key Params |
|--------|-------------|------------|
| click | Click element | selector, text, or x/y coords |
| type | Type into focused/selected input | selector, text, submit, clear |
| fillForm | Fill form fields (inputs, textareas, selects) | fields[] array with selector/value |
| waitFor | Wait for element/text | selector, text, timeout |
IMPORTANT: There is no fill command. Use fillForm with a fields array:
# Fill a single field
browser fillForm '{"fields": [{"selector": "#email", "value": "[email protected]"}]}'
# Fill multiple fields at once (text inputs, textareas, AND select dropdowns)
browser fillForm '{"fields": [
{"selector": "#name", "value": "John Doe"},
{"selector": "#email", "value": "[email protected]"},
{"selector": "#subject", "value": "support"},
{"selector": "#message", "value": "Hello world"}
]}'
Works with: <input>, <textarea>, <select>, checkboxes, radio buttons.
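When field values come from a script, the `fields` array can be generated rather than typed by hand. A minimal Python sketch, assuming only the `selector`/`value` shape shown above; `fill_form_params` is a hypothetical helper name and the selectors are the ones from the example.

```python
import json
import shlex

def fill_form_params(values):
    """Turn a {selector: value} mapping into the fillForm `fields` payload."""
    return {"fields": [{"selector": s, "value": v} for s, v in values.items()]}

params = fill_form_params({"#name": "John Doe", "#subject": "support"})

# shlex.quote produces a shell-safe argument even if a value contains quotes.
print("browser fillForm " + shlex.quote(json.dumps(params)))
```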
| Action | Description | Key Params |
|--------|-------------|------------|
| fork | Duplicate tab into multiple paths | paths[] with name + commands |
| killFork | Close a fork | fork (name) |
| listForks | List active forks | - |
| tryUntil | Try alternatives until one succeeds | alternatives[], timeout |
| parallel | Run commands on multiple URLs | branches[] with url + commands |
| Action | Description | Key Params |
|--------|-------------|------------|
| getAuthContext | Detect login pages, available accounts | - |
| requestAuth | Request user approval for auth | reason |
| configureAuth | Set auth preferences | authMode, setSiteRule, domain |
browser listTabs '{}'
Returns:
{
"activeTabId": 123,
"windows": [
{
"windowId": 1,
"focused": true,
"tabs": [
{"tabId": 123, "url": "https://...", "title": "...", "active": true}
]
}
],
"totalTabs": 5
}
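A result of this shape can be searched for an existing tab before opening a new one. The Python sketch below assumes the structure shown above; the URLs and titles in `SAMPLE` are made-up placeholders, and `flatten_tabs` and `find_tab_id` are hypothetical helper names.

```python
import json

# Sample shaped like the listTabs result; values are illustrative only.
SAMPLE = json.loads("""
{
  "activeTabId": 123,
  "windows": [
    {"windowId": 1, "focused": true, "tabs": [
      {"tabId": 123, "url": "https://example.com/", "title": "Example", "active": true},
      {"tabId": 124, "url": "https://example.org/docs", "title": "Docs", "active": false}
    ]}
  ],
  "totalTabs": 2
}
""")

def flatten_tabs(response):
    """Yield every tab dict, tagged with the windowId it belongs to."""
    for window in response["windows"]:
        for tab in window["tabs"]:
            yield {**tab, "windowId": window["windowId"]}

def find_tab_id(response, url_fragment):
    """Return the tabId of the first tab whose URL contains url_fragment, or None."""
    for tab in flatten_tabs(response):
        if url_fragment in tab["url"]:
            return tab["tabId"]
    return None

print(find_tab_id(SAMPLE, "example.org"))
```

The returned id is what `setActiveTab` expects in its `tabId` parameter.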
# Start fresh
browser newSession '{"url": "https://amazon.com"}'
# Or switch to existing tab
browser setActiveTab '{"tabId": 456}'
browser getContent '{"format": "annotated"}'
Returns content with interactive elements marked inline:
Product Name Here
$4.99
[button: "Add to cart" | selector: #add-btn]
[input:text: "search" | value: "" | selector: #search-box]
[link: "View details" | href: /product/123 | selector: a.details-link]
This shows what's clickable and where it is in context.
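The bracketed markers can be pulled out of the annotated text to get a list of selectors. The Python sketch below assumes the marker syntax is exactly what the sample above shows (`[kind: "label" | key: value | ...]`, one marker per line); the real output may differ, for example with labels that contain double quotes. `parse_annotated` is a hypothetical helper.

```python
import re

# kind, then a quoted label, then optional " | key: value" parts.
MARKER_RE = re.compile(r'^\[(?P<kind>[^"\]]+?):\s*"(?P<label>[^"]*)"(?P<rest>.*)\]$')

def parse_annotated(text):
    """Return one dict per interactive-element marker found in the text."""
    elements = []
    for line in text.splitlines():
        match = MARKER_RE.match(line.strip())
        if not match:
            continue  # plain page text, not a marker
        element = {"kind": match.group("kind"), "label": match.group("label")}
        for part in match.group("rest").split(" | "):
            key, sep, value = part.partition(": ")
            if sep:
                element[key.strip()] = value.strip().strip('"')
        elements.append(element)
    return elements

SAMPLE = '''Product Name Here
$4.99
[button: "Add to cart" | selector: #add-btn]
[input:text: "search" | value: "" | selector: #search-box]
[link: "View details" | href: /product/123 | selector: a.details-link]'''

for element in parse_annotated(SAMPLE):
    print(element["kind"], element["selector"])
```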
# Click using selector from annotated output
browser click '{"selector": "#add-btn"}'
# Or by text (prefers visible elements)
browser click '{"text": "Add to cart"}'
# Type into input
browser type '{"selector": "#search-box", "text": "query", "submit": true}'
When you're not sure which path is right, fork the tab and try both:
# Create forks
browser fork '{
"paths": [
{
"name": "google-auth",
"commands": [{"action": "click", "params": {"text": "Sign in with Google"}}]
},
{
"name": "email-auth",
"commands": [{"action": "click", "params": {"text": "Sign in with Email"}}]
}
]
}'
Returns:
{
"forked": true,
"sourceTabId": 123,
"forks": [
{"name": "google-auth", "tabId": 456, "url": "...", "commandResults": [...]},
{"name": "email-auth", "tabId": 789, "url": "...", "commandResults": [...]}
]
}
Work on specific fork:
browser getContent '{"format": "annotated", "fork": "google-auth"}'
browser click '{"text": "Continue", "fork": "google-auth"}'
Kill the wrong path:
browser killFork '{"fork": "email-auth"}'
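Once one fork turns out to be the right path, the rest can be closed in a loop. A minimal Python sketch, assuming the fork result shape shown above (trimmed here to the fields used); `kill_params_for_losers` is a hypothetical helper that only builds the `killFork` parameters and does not run anything.

```python
import json

# Trimmed copy of the fork result shown above.
RESULT = {
    "forked": True,
    "sourceTabId": 123,
    "forks": [
        {"name": "google-auth", "tabId": 456},
        {"name": "email-auth", "tabId": 789},
    ],
}

def kill_params_for_losers(result, keep):
    """Return killFork params for every fork except the one named `keep`."""
    names = [fork["name"] for fork in result["forks"]]
    if keep not in names:
        raise ValueError(f"unknown fork: {keep}")
    return [{"fork": name} for name in names if name != keep]

for params in kill_params_for_losers(RESULT, keep="google-auth"):
    print("browser killFork '" + json.dumps(params) + "'")
```

Raising on an unknown name guards against closing every fork because of a typo.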
When the exact button varies (cookie banners, A/B tests):
browser tryUntil '{
"alternatives": [
{"action": "click", "params": {"selector": "#accept-cookies"}},
{"action": "click", "params": {"text": "Accept All"}},
{"action": "click", "params": {"selector": ".cookie-dismiss"}}
],
"timeout": 3000
}'
Tries each until one succeeds.
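The `alternatives` list can be generated from an ordered list of candidates. The Python sketch below mirrors the payload above; `try_until_click` is a hypothetical helper, and the unit of `timeout` is not stated in this document, so the value is passed through unchanged.

```python
import json

def try_until_click(candidates, timeout=3000):
    """Build tryUntil params from ordered (key, value) pairs.

    Each key must be "selector" or "text", the two click targets used above.
    """
    alternatives = []
    for key, value in candidates:
        if key not in ("selector", "text"):
            raise ValueError(f"unsupported click target: {key}")
        alternatives.append({"action": "click", "params": {key: value}})
    return {"alternatives": alternatives, "timeout": timeout}

params = try_until_click([
    ("selector", "#accept-cookies"),
    ("text", "Accept All"),
    ("selector", ".cookie-dismiss"),
])
print(json.dumps(params, indent=2))
```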
Compare prices across sites:
browser parallel '{
"branches": [
{"url": "https://amazon.com/product", "commands": [{"action": "getContent", "params": {"format": "text"}}]},
{"url": "https://walmart.com/product", "commands": [{"action": "getContent", "params": {"format": "text"}}]}
]
}'
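When every branch runs the same commands, the `branches` array can be built from a list of URLs. A minimal Python sketch of the payload above; `parallel_params` is a hypothetical helper and the URLs are the ones from the example.

```python
import copy
import json

def parallel_params(urls, commands):
    """Build `parallel` params that run the same commands on every URL."""
    return {
        "branches": [
            # deepcopy keeps branches independent if one is edited later
            {"url": url, "commands": copy.deepcopy(commands)}
            for url in urls
        ]
    }

read_text = [{"action": "getContent", "params": {"format": "text"}}]
params = parallel_params(
    ["https://amazon.com/product", "https://walmart.com/product"],
    read_text,
)
print(json.dumps(params))
```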
The bridge detects auth pages and leverages existing browser sessions:
# Check if on login page
browser getAuthContext '{}'
# Returns available accounts, OAuth options, etc.
When running multiple tasks in parallel, use tabId to avoid conflicts:
# 1. Create isolated session - get a unique tabId
browser newSession '{"url": "https://example.com"}'
# Returns: {"tabId": 15, "url": "...", "windowId": 1}
# 2. Use that tabId in ALL subsequent commands
browser navigate '{"url": "https://example.com/page", "tabId": 15}'
browser getContent '{"format": "annotated", "tabId": 15}'
browser click '{"selector": "#btn", "tabId": 15}'
browser type '{"selector": "#input", "text": "hello", "tabId": 15}'
This lets multiple agents work in parallel without stepping on each other.
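One way to avoid forgetting `tabId` on a command is to bind it once and inject it everywhere. The Python sketch below only formats command lines; `TabSession` is a hypothetical helper, not part of the CLI, and the tab id 15 is taken from the example above.

```python
import json
import shlex

class TabSession:
    """Pins every command to one tabId."""

    def __init__(self, tab_id):
        self.tab_id = tab_id

    def command(self, action, **params):
        """Return a shell-safe command line for `action` with tabId injected."""
        payload = {**params, "tabId": self.tab_id}
        return shlex.join(["browser", action, json.dumps(payload)])

session = TabSession(15)
print(session.command("navigate", url="https://example.com/page"))
print(session.command("click", selector="#btn"))
```

Because `tabId` is merged last, a caller cannot accidentally override it with a stray keyword argument.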
- listTabs to see what's open
- newSession for a clean start
- tabId for parallel/isolated execution
- annotated format - shows content + clickable elements together
- nohup firefox &>/dev/null &
- browser ping
- about:debugging
- browser getContent '{"format": "annotated"}' to see what's on the page