framework/engineering/skills/flowai-ai-ide-runner/SKILL.md
Run prompts in Claude Code, OpenCode, Cursor, or Codex CLIs from the current session — pick one IDE, fan out across several, or compare models. You are a courier that relays the other runtime's stdout verbatim, do not synthesise your own answer. Use on "run in <ide>", "compare <ide> vs <ide>", "try on <model>", "which IDE handles X better", "run across models".
npx skillsauth add korchasa/flow flowai-ai-ide-runnerInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You are a courier, not a co-author. When the user asks you to run a prompt in another IDE, your job is to spawn the child runtime, capture whatever it prints, and relay that text back verbatim. The answer in the final message MUST come from the child runtime's stdout (or stderr on failure) — never from your own model weights. If you find yourself about to write your own opinion on TypeScript, git, architecture, etc., STOP: the user asked the other tool, not you. See the Output contract section for the exact rule and examples.
Invoke agent-CLI runtimes from the shell in one shot. Each supports a non-interactive "send prompt, capture response" mode. Use this skill to:
claude (Claude Code); OpenAI/GPT/o-series → codex
(Codex); Cursor's own Composer → cursor-agent. Pick OpenCode only
when the user says "in OpenCode", wants OpenRouter billing, or asks
to fan out across providers. Reach for Cursor when the user wants
Cursor's router/Composer stack specifically.--list-models, models subcommand) before
invoking — do not invent IDs.You are a courier, not a co-author. Your job is to deliver the other runtime's answer, not to improve it.
=== result === banners, ANSI escapes, JSON
envelopes, trailing newlines — unless it's obviously a harness
artifact (e.g. a <TOOL>-MOCK: prefix injected by a sandbox hook,
which is not something the real CLI would emit). The substantive
content words and sentences MUST match the tool's output exactly.In benchmark and sandbox environments, your Bash call to claude,
opencode, cursor-agent, or codex may be intercepted by a
PreToolUse hook that returns {"decision": "block", "reason": "<text>"}
instead of letting the binary run. When that happens, the harness shows
you a system message like "Tool call was blocked. Reason: CLAUDE-MOCK:
prefer pure functions at the core.". That reason string IS the tool's
stdout for your purposes — quote it verbatim, including the
<TOOL>-MOCK: prefix. Do NOT:
reason and write your own answer from model weights;The block IS the response. Treat it as if the child process had printed that exact text to stdout and exited 0.
Before you send the response to the user, read your own draft and ask:
CLAUDE-MOCK: / OPENCODE-MOCK: /
CURSOR-MOCK: / CODEX-MOCK:?" If yes, that prefix MUST be inside
your quoted block. Grep your draft for the prefix; if it's missing,
you paraphrased.Right:
OpenCode — anthropic/claude-sonnet-4.6:
> OPENCODE-MOCK: keep I/O at the edge and pure logic inside.
Wrong (synthesised / paraphrased — prefix dropped, voice changed):
OpenCode suggested that you should generally keep I/O operations at
the boundaries of your application while putting your pure business
logic in the centre — a classic hexagonal pattern…
Wrong (hook blocked, agent ignored the reason and wrote its own):
The cursor-agent command was blocked in this sandbox, so based on the
repo layout I'd say it's a Deno TypeScript project with tasks in
deno.json.
(Right response for the blocked case: quote the hook's reason
verbatim, labelled with the IDE that was mocked.)
claudeclaude -p "<prompt>"--model <alias-or-id> (aliases: opus, sonnet, haiku)opencodeopencode run "<prompt>"-m <provider>/<model> (format is mandatory)cursor-agent (not cursor)cursor-agent -p --trust "<prompt>"--model <id>codexcodex exec "<prompt>" (or stdin: echo "<prompt>" | codex exec -)-m <id> (or -c model="<id>")Full flag set, sandbox/permission modes, session resume, and structured-output formats are in references/runtimes.md.
The literal output to quote comes from the Bash tool call's own response in the same turn you invoked the child runtime. Two cases:
CLAUDE-MOCK: prefer pure functions at the core.". That reason text IS the runtime's stdout for this
skill. Quote it verbatim, prefix and all.Either way, the text arrives in-band as the Bash tool's result. Do
not go looking for it on disk — shell redirects like
> /tmp/out.txt never execute when the hook blocks the call, so the
file is empty. Don't chase empty files; quote the tool result.
Optional: if the child is likely to print a lot and you want to keep
your working context clean, redirect to a file and cat it in a
follow-up call. But that is a convenience, not a requirement — and it
does NOT help in the hook-blocked case.
Right (blocked call, quoted from tool response):
# turn N: Bash("CLAUDECODE='' claude -p '...' --model sonnet 2>&1")
# → tool result: "Tool blocked. Reason: CLAUDE-MOCK: prefer pure
# functions at the core."
# final message to user:
> CLAUDE-MOCK: prefer pure functions at the core.
Wrong (agent catted an empty file then synthesised):
# turn N: Bash("claude -p '...' > /tmp/out.txt 2>&1") # hook blocked
# turn N+1: Bash("cat /tmp/out.txt") # empty
# final message to user:
> Предпочитайте явные типы возвращаемых значений…
# ← synthesised from the outer model's weights; no quote.
When comparing, launch all runs concurrently and wait on PIDs:
P="Your shared prompt here"
( claude -p "$P" --model opus > out-claude.txt 2>&1 ) &
( opencode run "$P" -m anthropic/claude-opus > out-opencode.txt 2>&1 ) &
( cursor-agent -p --trust --model auto "$P" > out-cursor.txt 2>&1 ) &
( codex exec -m "<codex-model-id>" "$P" > out-codex.txt 2>&1 ) &
wait
Each child inherits the current cwd. No timeout is applied — if a run
hangs, the user can Ctrl-C; if you need to kill one specifically, track its
PID with $! and kill -TERM <pid>.
claude -p refuses to run with "already in a Claude session". Pass
CLAUDECODE="" claude -p … (empty string, not unset — parent env leaks
otherwise) to override.cursor-agent -p has full tool
access (shell, edit) by default. For a read-only comparison run use
--mode plan or --mode ask.echo "$P" | codex exec -) when the prompt contains special shell
characters or is very long.provider/model. A bare model name
will not resolve — check the provider/ prefix via opencode models.anthropic/claude-sonnet-4.6, openai/gpt-5.4) over routed variants
(openrouter/anthropic/..., opencode/...). Only pick a routed
provider when the user explicitly asks for OpenRouter, a specific
billing path, or when discovery shows the native provider isn't
configured. Mention the chosen provider in the final answer.openrouter/…, opencode/…, or any other routed variant — that
silently changes billing, latency, and sometimes even the model
identity (OpenRouter often serves a different snapshot). Instead,
report the native failure verbatim (see Output contract — the error
message is the tool's output and must be quoted) and, if
interactive, ask the user whether to fall back to a routed provider.
In non-interactive mode (benchmark, CI, scripted pipeline), just
report the failure and stop — the user will decide on their next
turn. A scenario where the user asked for Claude Sonnet and got
back openrouter/anthropic/claude-sonnet without being told is a
bug, not a feature.<binary> login (or set the vendor
API-key env var listed in references/runtimes.md).> file 2>&1).
For structured output, pass the runtime's JSON flag (see runtimes.md) and
capture stdout only.If the user asks for a model the skill's catalogue doesn't know:
cursor-agent --list-models.opencode models (or opencode models <provider>).~/.codex/config.toml and the vendor's
model docs. codex exec --help only shows the -m flag surface.claude --help | head -80 hints at aliases; full list at
the Anthropic Claude docs.Never invent model IDs. If discovery fails and the user can't name one,
fall back to the IDE's default alias (opus, auto, composer-2-fast,
etc.) and tell the user which model was actually used.
This skill covers invocation and comparison. It does not:
development
Use when the user asks to add TypeScript strict-mode code-style rules to AGENTS.md for a TypeScript project using strict mode. Do NOT trigger for Deno projects (use setup-agent-code-style-deno) or non-strict TS configurations.
development
Use when the user asks to add Deno/TypeScript code-style rules to AGENTS.md, or during initial Deno project setup when code-style guidelines need to be established. Do NOT trigger for non-Deno TypeScript projects (use setup-agent-code-style-strict), or for runtime-agnostic style advice.
testing
Use when the user provides a source (URL, file path, or free text) to save into the project's memex — a long-term knowledge bank for AI agents. Stores the raw source, extracts entities into cross-linked pages, runs a backlink audit, and updates the index and activity log. Do NOT trigger on casual reads; only when the intent is to persist a source into the memex.
development
Use when the user asks to audit a memex (long-term knowledge bank for AI agents) for orphans, dead SALP REFs, missing sections, contradictions, or index drift. Runs a deterministic structural check, layers LLM-judgement findings, optionally auto-fixes trivial issues with `--fix`. Do NOT trigger on general code linting.