Web AI

Use this skill when the task is to ask an AI website through browser control instead of calling a model API directly.

Safe Defaults

Render before sending.
If the user explicitly says to use agbrowse or standalone agbrowse, run agbrowse --help first, and run agbrowse web-ai --help before choosing web-ai flags. Treat the current help output as command truth and adapt this skill to that surface instead of assuming cli-jaw wrapper parity.
Use --inline-only only when the user explicitly wants pasted inline context. Source context should normally be packaged with --context-from-files / --context-file; upload transport creates one .zip archive attachment containing CONTEXT_PACKAGE.md plus the selected source files.
Do not upload files with --file unless explicitly requested. For source context, use the context packaging flags first.
Do not switch models.
Do not expose arbitrary evaluate through web-ai.
For live ChatGPT/Gemini observation or smoke tests, do not use headless Chrome. Use a headed, user-visible 30_browser/CDP session so account gates, Cloudflare, tool drawers, upload pickers, and model menus match the real frontend.
Use vision-click only as an explicit fallback when DOM/snapshot cannot see a visible UI target.

Support Labels

| Surface | Label | Notes |
| --- | --- | --- |
| prompt render/context dry-run | ready | browser-free and deterministic |
| ChatGPT/Gemini/Grok live send/poll/query | beta | depends on provider UI/account state |
| ChatGPT semantic resolver and answer artifacts | ready in cli-jaw mirror | mirrors agbrowse Phase 16/17 contracts |
| source audit flags | ready in cli-jaw mirror | --require-source-audit fails closed on missing inline sources |
| hosted/cloud/external-CDP operation | deferred | do not claim hosted browser support |

Prompt Shape

Build a structured question envelope:

[SYSTEM]
...

[USER]
## Project
...

## Goal
...

## Context
...

## Question
...

## Output
...

## Constraints
...
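The envelope above can be assembled programmatically before it is passed to --prompt. The following is a minimal sketch, not the runtime's own renderer: `build_envelope` and `USER_SECTIONS` are hypothetical names, and rejecting an envelope with empty sections is an assumed policy.

```python
from typing import Mapping

# Section order mirrors the envelope shown above.
USER_SECTIONS = ("Project", "Goal", "Context", "Question", "Output", "Constraints")


def build_envelope(system: str, sections: Mapping[str, str]) -> str:
    """Assemble the [SYSTEM]/[USER] question envelope as plain text."""
    missing = [name for name in USER_SECTIONS if not sections.get(name, "").strip()]
    if missing:
        # Assumed policy: refuse to build an incomplete envelope.
        raise ValueError(f"missing envelope sections: {', '.join(missing)}")

    parts = ["[SYSTEM]", system.strip(), "", "[USER]"]
    for name in USER_SECTIONS:
        parts += [f"## {name}", sections[name].strip(), ""]
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(parts).rstrip() + "\n"
```

The resulting string can then be checked with web-ai render before any live send.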

Commands

cli-jaw browser web-ai render --vendor chatgpt --prompt "..."
cli-jaw browser web-ai context-dry-run --vendor chatgpt --prompt "..." --context-from-files "src/**/*.ts" --files-report
cli-jaw browser web-ai context-render --vendor chatgpt --prompt "..." --context-from-files "src/**/*.ts"
cli-jaw browser web-ai code --vendor chatgpt --model thinking --effort heavy --prompt "Build a Flask API MVP" --output-zip ./result.zip
cli-jaw browser web-ai code-extract --vendor chatgpt --conversation "https://chatgpt.com/c/<conversation-id>" --output-zip ./result.zip
cli-jaw browser web-ai status --vendor chatgpt
cli-jaw browser web-ai query --vendor chatgpt --prompt "..." --context-from-files "src/foo.ts"
cli-jaw browser web-ai query --vendor chatgpt --inline-only --allow-copy-markdown-fallback --prompt "..."
cli-jaw browser web-ai query --vendor grok --inline-only --require-source-audit --source-audit-scope "sources checked" --source-audit-date "2026-05-05" --prompt "..."
cli-jaw browser web-ai poll --vendor chatgpt --timeout 1200
cli-jaw browser web-ai capabilities --vendor chatgpt
cli-jaw browser web-ai notifications --vendor chatgpt
cli-jaw browser web-ai stop --vendor chatgpt

Copy Markdown Fallback

Use only when explicitly needed:

cli-jaw browser web-ai query \
  --vendor chatgpt \
  --inline-only \
  --allow-copy-markdown-fallback \
  --prompt "Return a markdown table."

The runtime intercepts the page's navigator.clipboard.writeText/write during the provider Copy button click. It does not read the OS clipboard. The flag is the explicit policy opt-in for CLI use; do not add --unsafe-allow.

Polling Timeouts

web-ai poll, web-ai query, and web-ai watch accept --timeout <seconds>. When omitted, the runtime uses these defaults so heavy reasoning models (ChatGPT Pro/Heavy, Gemini Deep Think) have room to finish:

| Vendor | Default --timeout | Roughly |
| --- | ---: | --- |
| ChatGPT | 1200 | 20 minutes |
| Gemini | 1200 | 20 minutes |
| Grok | 600 | 10 minutes |

Pass --timeout 1800 (30 min) or higher for unusually long Pro/Deep Think runs. The provider tab and the cli-jaw browser Chrome process stay open across a poll timeout — only the polling loop gives up.
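The default-selection rule can be illustrated with a small sketch. The values come from the table above; `resolve_timeout` and `DEFAULT_TIMEOUTS` are hypothetical names and do not describe the runtime's internals.

```python
from typing import Optional

# Defaults copied from the table above (seconds).
DEFAULT_TIMEOUTS = {"chatgpt": 1200, "gemini": 1200, "grok": 600}


def resolve_timeout(vendor: str, timeout: Optional[int] = None) -> int:
    """Return the explicit --timeout value, or the vendor default when omitted."""
    if timeout is not None:
        if timeout <= 0:
            raise ValueError("timeout must be a positive number of seconds")
        return timeout
    try:
        return DEFAULT_TIMEOUTS[vendor.lower()]
    except KeyError:
        raise ValueError(f"unknown vendor: {vendor!r}") from None
```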

Long-Running Queries — bgtask (no turn blocking)

For responses that may take many minutes (ChatGPT Pro/Heavy, Gemini Deep Think, Deep Research), do NOT block the boss turn on web-ai query. Register a server-owned background task instead:

SID=$(cli-jaw browser web-ai send --vendor chatgpt --model pro --inline-only \
  --prompt "..." --json | jq -r .sessionId)
cli-jaw bgtask add --preset web-ai --session "$SID" \
  --prompt "web-ai result: {{result}} — summarize and deliver to the user"
# → end the turn. The jaw server owns the work (native session probe,
#   restart-durable) and re-invokes the boss with a [bgtask:*] prompt
#   when the session reaches complete/timeout/error.

Check status anytime: cli-jaw bgtask list / cli-jaw bgtask show <taskId>.
This is the sanctioned exception to the "never relinquish the turn while work is in flight" rule — a registered bgtask is server-owned, not in-flight.
Keep blocking query for short/fast lookups where waiting is cheaper.

Tab pooling and lease contention

Completed provider tabs are kept warm in a per-vendor pool so the next send reuses a tab instead of creating a new one. Defaults (overridable via env on the agbrowse side):

| Setting | Default | Env Var |
| --- | --- | --- |
| TTL per pooled tab | 15 min | AGBROWSE_PROVIDER_POOL_TTL |
| Warm tabs per (owner,vendor,sessionType,origin,profile) | 3 | AGBROWSE_PROVIDER_POOL_MAX_PER_KEY |
| Global cap on warm provider tabs | 8 | AGBROWSE_PROVIDER_POOL_GLOBAL_MAX |

If a second Pro / Deep Think query fails with Target page... has been closed while an earlier query is still polling, that is lease contention on the per-key cap. Pass --new-tab (or its alias --parallel) on the second call to bypass pool reuse and allocate a fresh provider tab.
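To make the interaction between the TTL, the per-key cap, and the global cap concrete, here is a toy model of a warm-tab pool. It is an illustration under assumptions, not the agbrowse implementation: `WarmTabPool`, `park`, and `take` are hypothetical names, the key is reduced to a plain string, and time is passed in explicitly so the behavior is deterministic.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

TTL_SECONDS = 15 * 60   # TTL per pooled tab
MAX_PER_KEY = 3         # warm tabs per key
GLOBAL_MAX = 8          # warm tabs across all keys


class WarmTabPool:
    def __init__(self) -> None:
        # key -> list of (tab_id, parked_at) pairs, oldest first
        self._tabs: Dict[str, List[Tuple[str, float]]] = defaultdict(list)

    def _evict_expired(self, now: float) -> None:
        for key in list(self._tabs):
            self._tabs[key] = [
                (tab, at) for tab, at in self._tabs[key] if now - at < TTL_SECONDS
            ]

    def park(self, key: str, tab_id: str, now: float) -> bool:
        """Keep a completed tab warm; False means a cap was hit and it should be closed."""
        self._evict_expired(now)
        total = sum(len(tabs) for tabs in self._tabs.values())
        if len(self._tabs[key]) >= MAX_PER_KEY or total >= GLOBAL_MAX:
            return False
        self._tabs[key].append((tab_id, now))
        return True

    def take(self, key: str, now: float, new_tab: bool = False) -> Optional[str]:
        """Return a warm tab id to reuse, or None when a fresh tab is needed."""
        self._evict_expired(now)
        if new_tab or not self._tabs[key]:
            return None  # new_tab models --new-tab / --parallel: skip pool reuse
        tab_id, _ = self._tabs[key].pop()
        return tab_id
```

In this model, passing `new_tab=True` never touches pooled tabs, which is the effect --new-tab is described as having above.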

Runtime capabilities

cli-jaw browser web-ai status --vendor <v> --json now embeds a capabilities[] array sourced from src/browser/web-ai/capability-registry.ts. Each row carries { providerId, capabilityId, family, status, frontendStatus, mutationAllowed, activationPath, activeStateSignals, failureStage }. Scope to a single capability with --probe <capabilityId>.

cli-jaw browser web-ai capabilities continues to expose the registry directly (with --family / --frontend-status filters). agbrowse mirrors the same hyphenated capability ID convention via its much smaller probe runtime in web-ai/capability.mjs.

Completed poll, query, and watch results may include:

answerArtifact: normalized capture metadata (capturedBy, exactnessScore, text/markdown lengths, warnings).
sourceAudit: inline source coverage report when --require-source-audit is enabled.

Use --require-source-audit for research tasks where bottom-only provider source drawers are not enough. Pair absence/no-official-response claims with --source-audit-scope and --source-audit-date.

Error taxonomy

Failures from cli-jaw browser web-ai * carry a typed JSON envelope with errorCode, stage, retryHint, vendor, mutationAllowed, selectorsTried, and optional evidence. HTTP responses (/api/browser/web-ai/* 5xx bodies) and CLI --json output share the same shape via WebAiError.toJSON(). Initial code list (full catalog in agbrowse devlog/03_phase2_errors.md):

cdp.unreachable, cdp.target-mismatch
provider.composer-not-visible, provider.model-mismatch, provider.attachment-preflight, provider.attachment-evidence-missing, provider.commit-not-verified, provider.poll-timeout, provider.runtime-disabled
capability.unsupported
context.over-budget, context.symlink-rejected
grok.context-pack-not-allowed
internal.unhandled

Existing cli-jaw error classes map to typed codes via fromCliJawStructuredError:

WrongTargetError → cdp.target-mismatch (preserves expectedTargetId / actualTargetId in evidence).
BrowserCapabilityError → capability.unsupported (preserves capabilityId / ownerPrd).
ProviderRuntimeDisabledError → provider.runtime-disabled (preserves vendor / stage).
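A caller that receives the typed envelope can triage it by the prefix of errorCode. The sketch below assumes only the field names listed above; `summarize_failure` and the `known` / `repoll` keys are hypothetical, and the re-poll rule is an illustrative reading of the poll-timeout behavior (the provider tab stays open), not documented runtime logic.

```python
import json

# Initial code list from this section; the full catalog lives elsewhere.
KNOWN_CODES = {
    "cdp.unreachable", "cdp.target-mismatch",
    "provider.composer-not-visible", "provider.model-mismatch",
    "provider.attachment-preflight", "provider.attachment-evidence-missing",
    "provider.commit-not-verified", "provider.poll-timeout",
    "provider.runtime-disabled",
    "capability.unsupported",
    "context.over-budget", "context.symlink-rejected",
    "grok.context-pack-not-allowed",
    "internal.unhandled",
}


def summarize_failure(raw: str) -> dict:
    """Reduce a JSON error envelope to the fields needed for triage."""
    envelope = json.loads(raw)
    code = envelope.get("errorCode", "internal.unhandled")
    return {
        "family": code.split(".", 1)[0],   # cdp / provider / capability / ...
        "code": code,
        "known": code in KNOWN_CODES,
        "stage": envelope.get("stage"),
        "retryHint": envelope.get("retryHint"),
        # Assumed rule: after a poll timeout only the polling loop gave up,
        # so polling again is a reasonable next step.
        "repoll": code == "provider.poll-timeout",
    }
```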

Grok context packaging is fail-closed

cli-jaw browser web-ai send/query --vendor grok with --context-from-files / --context-file / --context-transport upload throws with stage: 'grok-context-pack-not-allowed'. Pass --allow-grok-context-pack to override deliberately. When the override is used, the runtime emits a grok-context-pack-not-recommended warning. Grok prefers inline prompts plus an optional single --file upload; ChatGPT or Gemini handle context packages more reliably.
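The fail-closed rule can be expressed as a small guard. This is a sketch under assumptions: `check_context_flags` and `GrokContextPackNotAllowed` are hypothetical stand-ins for the runtime's typed error, and the parameters simply mirror the flags named above.

```python
from typing import List, Optional, Sequence


class GrokContextPackNotAllowed(Exception):
    """Hypothetical stand-in for the runtime's typed error."""
    stage = "grok-context-pack-not-allowed"


def check_context_flags(
    vendor: str,
    context_from_files: Sequence[str] = (),
    context_file: Optional[str] = None,
    context_transport: Optional[str] = None,
    allow_grok_context_pack: bool = False,
) -> List[str]:
    """Return warnings to emit, or raise when Grok context packaging is not allowed."""
    uses_context_pack = (
        bool(context_from_files)
        or context_file is not None
        or context_transport == "upload"
    )
    if vendor.lower() != "grok" or not uses_context_pack:
        return []
    if not allow_grok_context_pack:
        raise GrokContextPackNotAllowed("pass --allow-grok-context-pack to override")
    return ["grok-context-pack-not-recommended"]
```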

Standalone agbrowse Alternative

When the user asks to drive a single Chrome instance (for example to keep their own logged-in profile open and not run two CDP sessions), the same web-ai workflow is available through the standalone agbrowse CLI (npm install -g agbrowse). The flags and prompt envelope shape are identical; only the binary prefix changes.

| cli-jaw browser form | agbrowse form |
| --- | --- |
| cli-jaw browser start | agbrowse start |
| cli-jaw browser status | agbrowse status |
| cli-jaw browser snapshot --interactive | agbrowse snapshot --interactive |
| cli-jaw browser web-ai render ... | agbrowse web-ai render ... |
| cli-jaw browser web-ai query --vendor chatgpt ... | agbrowse web-ai query --vendor chatgpt ... |
| cli-jaw browser web-ai poll --vendor chatgpt --timeout 1200 | agbrowse web-ai poll --vendor chatgpt --timeout 1200 |
| cli-jaw browser web-ai code --vendor chatgpt ... | agbrowse web-ai code --vendor chatgpt ... |
| cli-jaw browser web-ai code-extract --vendor chatgpt ... | agbrowse web-ai code-extract --vendor chatgpt ... |

Only switch when the user explicitly asks for the standalone path. The two runtimes share defaults (ChatGPT/Gemini 1200s, Grok 600s) and the same [INSTRUCTIONS] prompt block, so behavior stays consistent. Do not run both against the same --port at the same time.

When standalone agbrowse is explicitly requested, first inspect:

agbrowse --help
agbrowse web-ai --help

Then select flags from the observed help text. The standalone binary can move faster than the cli-jaw wrapper, so do not invent wrapper-only flags or assume older aliases when the current help output differs.

ChatGPT Code Artifact Extraction

cli-jaw browser web-ai code is the native cli-jaw mirror of agbrowse code mode. It is ChatGPT-only. The runtime automatically uploads gpt-dev-agent-context.zip as attachment 1 before any user-provided --file attachments, then sends a strict code-generation prompt. New artifacts must contain PLAN.md or 00_plan.md; the retrieval step fails closed when that plan file is absent.

The context zip is attached only on the FIRST turn of a conversation. Continuation turns (--url, --conversation, or --session targeting an existing conversation) skip it: the container /mnt persists across turns and the contract already lives in the conversation history. Pass --context-refresh to force a re-upload (e.g. after the context module changed, or when a long-idle conversation may have recycled its sandbox).

The prompt asks ChatGPT to use a visible plan/todo tool only when the tool is actually available. If no such tool exists in the ChatGPT environment, the generated project must put the checklist and verification record in PLAN.md or 00_plan.md instead.

Generate and recover a single artifact:

cli-jaw browser web-ai code \
  --vendor chatgpt \
  --model thinking \
  --effort heavy \
  --prompt "Build a Flask API MVP" \
  --output-zip ./result.zip

Generate multiple named artifacts:

cli-jaw browser web-ai code \
  --vendor chatgpt \
  --model pro \
  --effort extended \
  --prompt "Build backend.zip and frontend.zip" \
  --multi-zip \
  --output-dir ./artifacts

When an old ChatGPT conversation still contains assistant text such as MACHINE: /mnt/data/result.zip or /mnt/data/result.zip, recover it later without sending a new prompt:

cli-jaw browser web-ai code-extract \
  --vendor chatgpt \
  --conversation "https://chatgpt.com/c/<conversation-id>" \
  --output-zip ./result.zip

Then verify locally:

unzip -t ./result.zip
unzip -l ./result.zip

The original conversation URL/session/current ChatGPT tab and logged-in browser profile are still required; a copied /mnt/data/result.zip line alone is not enough.

Stale-snapshot guard: when one conversation rebuilds the same sandbox path (e.g. /mnt/data/result.zip) across several code runs, the download API serves the snapshot tied to the message id used to mint the URL. The extractor mints candidate message ids NEWEST-first (agbrowse 02f03cc), so the first successful mint is the latest sandbox state, and the result reports mintedMessageId for auditing. Even so, ALWAYS verify retrieved zip contents against drop-specific symbols (grep a file or identifier unique to the expected delivery) before applying — on mismatch, retry with --multi-zip to recover every archive and identify the right one.

For new agbrowse web-ai code runs, the prompt contract asks ChatGPT to create PLAN.md or 00_plan.md in every generated code zip, and to use a visible todo/checklist tool such as turn_plan.update_turn_plan only when that tool is actually available while the response is streaming. Keep visible/top-level checklists to 8 items or fewer; for complex work, put extra detailed stage instructions in the plan markdown instead of creating more visible todo items. That visible todo UI may disappear after the answer finishes; do not fail a completed run because the UI is no longer visible. The durable validation target is the zip-root PLAN.md or 00_plan.md checklist. Completed items in that plan file should be marked [x] before final packaging.
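The local verification steps (archive integrity, zip-root plan file, and a drop-specific symbol) can be combined into one check. The following is a minimal sketch using Python's standard zipfile module; `verify_artifact` is a hypothetical helper, and the symbol passed in is whatever identifier is unique to the expected delivery.

```python
import zipfile
from typing import List

PLAN_NAMES = ("PLAN.md", "00_plan.md")


def verify_artifact(zip_file, expected_symbol: str) -> List[str]:
    """Return a list of problems; an empty list means the archive looks right."""
    problems = []
    with zipfile.ZipFile(zip_file) as archive:
        # CRC check over every member, comparable to `unzip -t`.
        bad_member = archive.testzip()
        if bad_member is not None:
            problems.append(f"corrupt member: {bad_member}")

        names = archive.namelist()
        # Only zip-root entries count; "sub/PLAN.md" does not match.
        if not any(name in PLAN_NAMES for name in names):
            problems.append("no PLAN.md or 00_plan.md at the zip root")

        # Stale-snapshot guard: look for a symbol unique to this delivery.
        needle = expected_symbol.encode("utf-8")
        found = any(
            needle in archive.read(name)
            for name in names
            if not name.endswith("/")  # skip directory entries
        )
        if not found:
            problems.append(f"expected symbol not found: {expected_symbol}")
    return problems
```

A non-empty result for the symbol check is the signal to retry with --multi-zip, as described above.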

Context Packaging

Use this when the user asks for max context / current context packaging before browser submission.

Rules:

--file still means live browser upload. Do not use it for source context.
--context-from-files may be repeated and accepts files, directories, and globs.
--context-exclude may be repeated and accepts glob excludes.
--context-file accepts a newline or JSON list of include/exclude patterns.
default source-context transport is upload: write one .zip archive context package and attach it in the ChatGPT/Gemini composer. Do not create a temporary .txt/.md file yourself for source context.
--inline-only or --context-transport inline forces the old pasted composer path.
--max-input sets the model input-token preflight budget.
--max-file-size defaults to 1 MB per file.
context-dry-run --json omits composerText unless --full is passed.
context-render prints the CONTEXT_PACKAGE.md body that will be placed inside the .zip archive by live upload transport.
send/query with context packaging must fail before browser mutation if token budget is exceeded, or if inline transport exceeds the inline character budget.

Example:

cli-jaw browser web-ai context-dry-run \
  --vendor chatgpt \
  --model pro \
  --prompt "review current context" \
  --context-from-files "src/browser/web-ai/**/*.ts" \
  --context-exclude "**/*.test.ts" \
  --files-report \
  --json
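The fail-before-mutation rule can be sketched as a pure preflight function that runs before any browser action. Everything here is illustrative: `preflight` and `estimate_tokens` are hypothetical names, the four-characters-per-token estimate is a crude stand-in for whatever tokenizer the runtime uses, "1 MB" is assumed to mean 10**6 bytes, and treating an oversized file as a blocking problem is an assumption.

```python
from typing import Dict, List

DEFAULT_MAX_FILE_SIZE = 1_000_000  # assumes "1 MB" means 10**6 bytes


def estimate_tokens(text: str) -> int:
    # Crude illustrative heuristic (about 4 characters per token); the real
    # tokenizer is not described in this document.
    return (len(text) + 3) // 4


def preflight(
    prompt: str,
    files: Dict[str, str],
    max_input: int,
    max_file_size: int = DEFAULT_MAX_FILE_SIZE,
) -> List[str]:
    """Return blocking problems; proceed to the browser only if the list is empty."""
    problems = []
    total = estimate_tokens(prompt)
    for path, body in sorted(files.items()):
        size = len(body.encode("utf-8"))
        if size > max_file_size:
            problems.append(f"{path}: {size} bytes exceeds --max-file-size")
            continue
        total += estimate_tokens(body)
    if total > max_input:
        problems.append(f"estimated {total} tokens exceeds --max-input {max_input}")
    return problems
```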

Browser Execution Policy

Live web-ai execution policy:

headed Chrome required
headless forbidden
Codex Cloud out of scope
observed frontend capability -> schema row -> verified mutation
not observed -> fail closed

Use the 30_browser-derived loop:

active-tab -> snapshot -> act -> snapshot -> verify

Refs are latest-snapshot scoped. Re-run snapshot after navigation, reload, or any action that can replace the DOM before using an existing ref.
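One way to picture "latest-snapshot scoped" refs is a store that stamps each snapshot with a generation number and rejects refs from any older generation. This is a conceptual sketch only; `SnapshotRefs` and its methods are hypothetical and do not describe how cli-jaw tracks refs internally.

```python
from typing import Dict


class SnapshotRefs:
    """Toy ref store: refs are valid only for the most recent snapshot."""

    def __init__(self) -> None:
        self._generation = 0
        self._refs: Dict[str, str] = {}

    def snapshot(self, elements: Dict[str, str]) -> int:
        """Record a new snapshot and return its generation number."""
        self._generation += 1
        self._refs = dict(elements)
        return self._generation

    def resolve(self, ref: str, generation: int) -> str:
        """Look up a ref, refusing anything taken from an older snapshot."""
        if generation != self._generation:
            raise LookupError("stale ref: re-run snapshot before acting")
        return self._refs[ref]
```

Here the same ref name can point at a different element after a re-snapshot, which is why reusing a ref across navigation or reload is unsafe.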

Before sending a prompt, verify the active tab is ChatGPT. If active tab is not verified, stop and ask the operator to run:

cli-jaw browser tabs --json
cli-jaw browser tab-switch <target>

Session Handling

web-ai send captures a baseline before insertion:

vendor
targetId
URL
promptHash
assistantCount

Raw prompt text must not be persisted. Polling only accepts answers that appear after the saved baseline.
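The baseline fields and the "answers after baseline" rule can be sketched as follows. The field names follow the list above; `capture_baseline` and `answers_after_baseline` are hypothetical helpers, and SHA-256 is an assumed choice since the document does not name the hash behind promptHash.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Baseline:
    vendor: str
    target_id: str
    url: str
    prompt_hash: str
    assistant_count: int


def capture_baseline(
    vendor: str, target_id: str, url: str, prompt: str, assistant_count: int
) -> Baseline:
    # Store only a digest so the raw prompt text is never persisted.
    # SHA-256 is an assumption made for this sketch.
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return Baseline(vendor, target_id, url, digest, assistant_count)


def answers_after_baseline(baseline: Baseline, assistant_messages: List[str]) -> List[str]:
    """Keep only assistant messages that appeared after the baseline was taken."""
    return assistant_messages[baseline.assistant_count:]
```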

Current Scope

Current:

ChatGPT
Gemini / Deep Think
inline prompt
structured context packaging dry-run/render and inline send/query preflight
ChatGPT file upload
ChatGPT model switching: instant / thinking / pro
- 2026-04-30 headed UI note: the visible opener may be the bottom composer button.__composer-pill[aria-haspopup="menu"] labeled Instant/Thinking/Pro or a plain Heavy pill, while the older top model-switcher-dropdown-button can be absent. Treat visible Heavy as active ChatGPT Pro/Heavy. For direct DOM fallback, open the model pill and select [data-testid="model-switcher-gpt-5-5-pro-thinking-effort"]; do not click generic "Pro" by role/name because the profile menu can also match.
- 2026-06-11 headed UI note: ChatGPT may show a simplified Intelligence menu instead of the older model row plus effort submenu. The runtime maps instant and thinking --effort light to Instant, thinking --effort standard to Medium, thinking --effort extended to High, thinking --effort heavy to Extra High, pro --effort standard to Pro Extended, and pro --effort extended to Pro Extended.
Gemini model switching: flash-lite / flash / pro
- 2026-05-19 headed UI note: the Gemini picker currently exposes visible versioned labels such as 3.1 Flash-Lite, 3 Flash, and 3.1 Pro, but workflow commands must use stable aliases (flash-lite, flash, pro). The runtime normalizes future 3.n labels generically and keeps legacy fast as flash-lite, while thinking maps to pro. Deep Think remains a separate tool/mode request, not the plain --model alias.
render/status/send/poll/query/watch/watchers/sessions/capabilities/notifications/stop
long-running watcher startup recovery and channel delivery loop
observed capability schemas with fail-closed unobserved tools
ChatGPT code mode (cli-jaw browser web-ai code) with automatic gpt-dev-agent-context.zip attachment, PLAN.md/00_plan.md enforcement for new artifacts, later code-extract, multi-zip retrieval, and repeatable mixed --file uploads
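The two model-alias notes in the list above (the 2026-06-11 ChatGPT Intelligence menu mapping and the 2026-05-19 Gemini label normalization) can be restated as lookup logic. This is a sketch built from those notes only: the function names are hypothetical, and the regular expression is one assumed way to strip a version prefix, not the runtime's actual rule.

```python
import re
from typing import Optional

# (model, effort) -> label in the simplified Intelligence menu.
CHATGPT_EFFORT_LABELS = {
    ("thinking", "light"): "Instant",
    ("thinking", "standard"): "Medium",
    ("thinking", "extended"): "High",
    ("thinking", "heavy"): "Extra High",
    ("pro", "standard"): "Pro Extended",
    ("pro", "extended"): "Pro Extended",
}

GEMINI_ALIASES = ("flash-lite", "flash", "pro")
GEMINI_LEGACY = {"fast": "flash-lite", "thinking": "pro"}


def chatgpt_intelligence_label(model: str, effort: Optional[str] = None) -> str:
    """Map a --model/--effort pair to the visible Intelligence menu label."""
    if model == "instant":
        return "Instant"
    try:
        return CHATGPT_EFFORT_LABELS[(model, effort)]
    except KeyError:
        raise ValueError(f"no observed label for {model!r} with effort {effort!r}") from None


def normalize_gemini_model(label: str) -> str:
    """Reduce a visible picker label or legacy name to a stable alias."""
    text = label.strip().lower()
    # Drop a leading version such as "3" or "3.1" (assumed label shape).
    text = re.sub(r"^\d+(\.\d+)?\s+", "", text).replace(" ", "-")
    text = GEMINI_LEGACY.get(text, text)
    if text not in GEMINI_ALIASES:
        raise ValueError(f"unrecognized Gemini model label: {label!r}")
    return text
```

Deep Think is deliberately absent from the Gemini aliases, since it is described as a separate tool/mode request rather than a --model value.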

Future:

Grok
Claude
ChatGPT web search and image generation tool runtime after headed frontend observation
Gemini image generation runtime after headed frontend observation
Web UI watcher dashboard
