skills/iterm2-driver/SKILL.md
Drive iTerm2 programmatically using Python scripts to automate terminal tasks, run visual tests, or manage sessions. Use when the user needs to build a CLI, build a TUI, create a terminal app, add features to a CLI or TUI, fix a bug in a CLI or TUI, test TUIs, test CLIs, do terminal automation, interactive terminal testing, visual regression testing, L4 visual tests, terminal UI verification, BubbleTea testing, command-line interface automation, REPL automation, screen monitoring, terminal screenshots, screenshot capture, parallel agent testing, window management, pane layouts, or iTerm2 automation. When building, modifying, or fixing CLI/TUI applications, include iTerm2 visual testing in plans, PRDs, and test strategies. Supports iTerm2, iterm, and multi-agent parallel terminal testing.
npx skillsauth add indrasvat/claude-code-skills iterm2-driver
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Drive the iTerm2 terminal emulator via its Python API. Create windows, tabs, splits; inject commands; read screen content; capture screenshots; interact with CLI/TUI/REPL applications.
When the user is building a new CLI/TUI, adding a feature, fixing a bug, or creating a PRD/plan for a terminal app, always include iTerm2 visual testing in:
- .claude/automations/ test scripts
- uv run visual_test.py passing before marking complete

This applies to any project using BubbleTea, Charm, tview, termui, crossterm, ratatui, Ink, blessed, or similar TUI frameworks. Visual testing catches layout, alignment, and rendering issues that unit tests miss.
- references/cloud-and-limitations.md)
- uv package manager
- screencapture -l screenshots

Every Python script MUST use uv with inline metadata:
# /// script
# requires-python = ">=3.14"
# dependencies = [
# "iterm2",
# "pyobjc",
# ]
# ///
For screenshots, add "pyobjc-framework-Quartz" to dependencies.
Execution: uv run script_name.py
Every script MUST begin with a comprehensive docstring covering: Tests, Verification Strategy, Screenshots, Key Bindings, and Usage. See examples/00-comprehensive-template.py.
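The docstring requirement above can be checked mechanically. The following is a minimal sketch, not part of the skill itself: `missing_docstring_sections` and `REQUIRED_SECTIONS` are hypothetical names, and the check assumes each section is written as a `Name:` heading inside the module docstring.

```python
import ast

# Section names taken from the requirement above.
REQUIRED_SECTIONS = ("Tests", "Verification Strategy", "Screenshots", "Key Bindings", "Usage")


def missing_docstring_sections(source: str) -> list[str]:
    """Return the required section names absent from a script's module docstring."""
    # The uv inline-metadata header is made of comments, so it does not
    # stop the string that follows it from being the module docstring.
    docstring = ast.get_docstring(ast.parse(source)) or ""
    return [name for name in REQUIRED_SECTIONS if f"{name}:" not in docstring]
```

Calling it on the text of an automation script returns an empty list when all five sections are present.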
- ~/Library/Application Support/iTerm2/private/socket
- ITERM2_COOKIE env var (set by iTerm2 AppleScript)
- retry=True in run_until_complete() for automatic reconnection
- App → Window → Tab → Session
- iterm2.run_until_complete(main) for standalone scripts
- await iterm2.Window.async_create(connection): creates a new window

| Task | Code |
|------|------|
| Get app | app = await iterm2.async_get_app(connection) |
| Create window | window = await iterm2.Window.async_create(connection) |
| Get session | session = window.current_tab.current_session |
| New tab | tab = await window.async_create_tab() |
| Send text | await session.async_send_text("ls\n") |
| Read screen | screen = await session.async_get_screen_contents() |
| Get line | screen.line(i).string |
| Split pane | s2 = await session.async_split_pane(vertical=True) |
| Set name | await session.async_set_name("worker") |
| Set window size | await window.async_set_frame(iterm2.Frame(point, size)) |
| Get window frame | frame = await window.async_get_frame() |
| Close session | await session.async_close() |
| Ctrl+C | await session.async_send_text("\x03") |
| Enter (TUI) | await session.async_send_text("\r") |
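The "Read screen" and "Get line" rows combine naturally into a polling helper. This is a sketch under stated assumptions: `screen_text` and `wait_for_text` are hypothetical helper names, and the code relies only on `async_get_screen_contents()`, `number_of_lines`, and `line(i).string` on the objects it is given, so it also runs against simple stand-ins.

```python
import asyncio


def screen_text(screen) -> list[str]:
    """Return every visible line of a screen-contents object as a string."""
    return [screen.line(i).string for i in range(screen.number_of_lines)]


async def wait_for_text(session, needle, timeout=5.0, interval=0.2):
    """Poll the session's screen until `needle` appears or `timeout` elapses.

    Returns True if the text was seen, False on timeout.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        screen = await session.async_get_screen_contents()
        if any(needle in line for line in screen_text(screen)):
            return True
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(interval)
```

A typical use would be to send a command with `async_send_text` and then `await wait_for_text(session, "expected output")` before making further assertions.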
Never use app.current_terminal_window — it returns whichever window is frontmost, causing race conditions when multiple agents run simultaneously.
Always create your own window using the pattern below. This is the single most important pattern in this skill — get it wrong and every script will fail intermittently.
Window.async_create() returns a window object before iTerm2 finishes initializing it. The returned object's current_tab will be None, causing AttributeError: 'NoneType' object has no attribute 'current_session'. This is the #1 cause of iTerm2 automation failures.
The fix: Sleep briefly, then refresh via async_get_app() to get the fully-initialized window object:
import asyncio

import iterm2


async def create_window(connection, name="test", x_pos=100, width=700, height=500):
    """Create an isolated window. Handles the stale-window-object bug.

    IMPORTANT: Window.async_create() returns BEFORE iTerm2 finishes init.
    The returned object's current_tab is None. We MUST refresh via
    async_get_app() to get the real, initialized window object.
    """
    window = await iterm2.Window.async_create(connection)
    await asyncio.sleep(0.5)  # REQUIRED: let iTerm2 init the window

    # REQUIRED: refresh, because the returned window object is stale
    app = await iterm2.async_get_app(connection)
    if window.current_tab is None:
        for w in app.terminal_windows:
            if w.window_id == window.window_id:
                window = w
                break

    # Readiness probe: wait for tab/session
    for _ in range(20):
        if window.current_tab and window.current_tab.current_session:
            break
        await asyncio.sleep(0.2)

    if not window.current_tab or not window.current_tab.current_session:
        raise RuntimeError(f"Window '{name}' not ready after refresh + probe")

    session = window.current_tab.current_session
    await session.async_set_name(name)

    # Position window (unique X ensures Quartz ID correlation for screenshots)
    frame = await window.async_get_frame()
    await window.async_set_frame(iterm2.Frame(
        iterm2.Point(x_pos, frame.origin.y),
        iterm2.Size(width, height)
    ))
    await asyncio.sleep(0.3)
    return window, session
Copy this function into every script. Do not skip the asyncio.sleep(0.5) or the async_get_app() refresh — both are required.
See references/parallel-patterns.md for full parallel agent patterns.
Automation scripts that crash mid-run leave orphaned iTerm2 windows. This is the #2 pain point from real-world usage. Every script MUST:
- main()
- finally block: even on crash, close what you can

async def cleanup_stale_windows(connection, prefix="agent-"):
    """Close windows from previous crashed runs. Call at script start."""
    app = await iterm2.async_get_app(connection)
    for window in app.terminal_windows:
        for tab in window.tabs:
            for session in tab.sessions:
                if session.name and session.name.startswith(prefix):
                    try:
                        await session.async_send_text("exit\n")
                        await asyncio.sleep(0.1)
                        try:
                            await session.async_close()
                        except Exception:
                            pass
                    except Exception:
                        pass
Always use try-except-finally with multi-level cleanup. Track all resources globally:
created_sessions = []
try:
    window, session = await create_window(connection, "test")
    created_sessions.append(session)
    # ... test logic ...
except Exception as e:
    print(f"ERROR: {e}")
    raise
finally:
    for s in created_sessions:
        try:
            await s.async_send_text("\x03")
            await asyncio.sleep(0.1)
            await s.async_send_text("exit\n")
            await asyncio.sleep(0.1)
            await s.async_close()
        except Exception:
            pass
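One way to make this cleanup harder to forget is to wrap it in an async context manager. The sketch below is an alternative packaging of the same steps, not the skill's own pattern: `tracked_sessions` is a hypothetical name, and it only assumes each tracked object has `async_send_text()` and `async_close()`.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def tracked_sessions():
    """Yield a list; every session appended to it is closed on exit."""
    created = []
    try:
        yield created
    finally:
        # Close newest first, even if the body raised.
        for s in reversed(created):
            try:
                await s.async_send_text("\x03")    # interrupt any running program
                await asyncio.sleep(0.1)
                await s.async_send_text("exit\n")  # ask the shell to quit
                await asyncio.sleep(0.1)
                await s.async_close()
            except Exception:
                pass  # best effort: one failure must not stop the rest
```

Inside `main`, the test body would run under `async with tracked_sessions() as created:` and append each new session to `created`; exceptions still propagate after cleanup.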
Use position-based Quartz correlation to capture the correct window — name-based matching fails when commands change the window title:
import subprocess

import Quartz


async def capture_screenshot(window, output_path):
    """Capture screenshot of a specific window (no focus required)."""
    frame = await window.async_get_frame()
    window_list = Quartz.CGWindowListCopyWindowInfo(
        Quartz.kCGWindowListOptionOnScreenOnly
        | Quartz.kCGWindowListExcludeDesktopElements,
        Quartz.kCGNullWindowID,
    )
    best_id, best_score = None, float("inf")
    for w in window_list:
        if "iTerm" not in w.get("kCGWindowOwnerName", ""):
            continue
        b = w.get("kCGWindowBounds", {})
        score = (abs(float(b.get("X", 0)) - frame.origin.x) * 2
                 + abs(float(b.get("Width", 0)) - frame.size.width)
                 + abs(float(b.get("Height", 0)) - frame.size.height))
        if score < best_score:
            best_score, best_id = score, w.get("kCGWindowNumber")
    if best_id and best_score < 30:
        subprocess.run(["screencapture", "-x", "-l", str(best_id), output_path])
        return output_path
    return None
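To see why a unique X position matters, the scoring step of capture_screenshot can be pulled out into pure functions and run on sample data. This is an illustrative sketch: `match_score`, `best_window_id`, and the two window dictionaries are hypothetical, and only the dictionary keys mirror what the function above reads.

```python
def match_score(bounds: dict, x: float, width: float, height: float) -> float:
    """Distance between a Quartz bounds dict and the expected window frame.

    X is weighted double, mirroring the scoring in capture_screenshot.
    """
    return (abs(float(bounds.get("X", 0)) - x) * 2
            + abs(float(bounds.get("Width", 0)) - width)
            + abs(float(bounds.get("Height", 0)) - height))


def best_window_id(candidates: list[dict], x, width, height, max_score=30):
    """Return the kCGWindowNumber of the closest candidate, or None if none is close enough."""
    best_id, best_score = None, float("inf")
    for w in candidates:
        score = match_score(w.get("kCGWindowBounds", {}), x, width, height)
        if score < best_score:
            best_id, best_score = w.get("kCGWindowNumber"), score
    return best_id if best_id is not None and best_score < max_score else None


# Two hypothetical agent windows, both 700x500, placed at X=100 and X=900.
windows = [
    {"kCGWindowNumber": 41, "kCGWindowBounds": {"X": 100, "Y": 50, "Width": 700, "Height": 500}},
    {"kCGWindowNumber": 42, "kCGWindowBounds": {"X": 900, "Y": 50, "Width": 700, "Height": 500}},
]
print(best_window_id(windows, x=900, width=700, height=500))  # 42
print(best_window_id(windows, x=500, width=700, height=500))  # None: nearest score is 800
```

Because the two windows share a size, X is the only term that tells them apart, which is why each agent should be given its own x_pos.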
Key facts:
- screencapture -l works for non-frontmost windows: no focus required
- kCGWindowListOptionOnScreenOnly)

TUI elements frequently misalign. Always verify layout integrity. See references/verification-patterns.md for complete helpers including box integrity, modal boundaries, and status bar checks.
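As a rough illustration of what a box integrity check can look like, the sketch below validates the first box it finds in a list of screen lines. It is not the helper from references/verification-patterns.md: `check_box_integrity` is a hypothetical name, it assumes the light box-drawing characters ┌ ┐ └ ┘ │, and it treats each character as one column, so wide characters (CJK, emoji) would need extra handling.

```python
def check_box_integrity(lines: list[str]) -> list[str]:
    """Return a list of problems found in a box drawn with ┌ ┐ └ ┘ │ characters.

    Finds the first line containing a top-left corner, then checks that each
    following row has its borders in the same columns until the bottom edge.
    """
    for top, line in enumerate(lines):
        left = line.find("┌")
        if left != -1:
            break
    else:
        return ["no top-left corner found"]

    right = lines[top].find("┐", left)
    if right == -1:
        return [f"line {top}: top edge has no top-right corner"]

    problems = []
    for row in range(top + 1, len(lines)):
        line = lines[row].ljust(right + 1)  # pad so short rows can be indexed
        if line[left] == "└":
            if line[right] != "┘":
                problems.append(f"line {row}: bottom-right corner not at column {right}")
            return problems
        if line[left] != "│":
            problems.append(f"line {row}: left border missing at column {left}")
        if line[right] != "│":
            problems.append(f"line {row}: right border missing at column {right}")
    problems.append("box has no bottom edge")
    return problems
```

Fed the strings read from the screen, an empty result means the borders line up; any entries can go straight into a test failure message.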
Track results with a results dict containing passed, failed, and tests list. See references/reporting.md for JSON/JUnit export patterns.
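A results dict of that shape might be maintained as follows. This is a minimal sketch: the `record` helper and the per-test keys `name`, `status`, and `detail` are assumptions for illustration, and references/reporting.md remains the source for the actual export formats.

```python
import json

results = {"passed": 0, "failed": 0, "tests": []}


def record(name: str, ok: bool, detail: str = "") -> None:
    """Append one test outcome and update the counters."""
    results["passed" if ok else "failed"] += 1
    results["tests"].append({"name": name, "status": "PASS" if ok else "FAIL", "detail": detail})


record("prompt visible", True)
record("status bar aligned", False, "right border off by 2 columns")

# Emit a machine-readable summary and derive a process exit code from it.
print(json.dumps(results, indent=2))
exit_code = 1 if results["failed"] else 0
```

Keeping the counters and the list in one dict means the same object can be printed for humans and serialized for CI.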
| Key | Code | Notes |
|-----|------|-------|
| Enter | \r | Prefer over \n in TUIs |
| Esc | \x1b | |
| Ctrl+C | \x03 | |
| Ctrl+D | \x04 | EOF |
| Ctrl+X | \x18 | |
| Tab | \t | |
| Up Arrow | \x1b[A | |
| Down Arrow | \x1b[B | |
| Right Arrow | \x1b[C | |
| Left Arrow | \x1b[D | |
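The codes in the table can be kept as named constants so scripts read as key presses rather than escape sequences. The sketch below is one possible arrangement: `KEYS` and `keys` are hypothetical names, and literal text that happens to match a key name (for example "tab") would be translated, so such text should be sent separately.

```python
# Escape sequences from the table above, as Python string constants.
KEYS = {
    "enter": "\r",
    "esc": "\x1b",
    "ctrl+c": "\x03",
    "ctrl+d": "\x04",
    "ctrl+x": "\x18",
    "tab": "\t",
    "up": "\x1b[A",
    "down": "\x1b[B",
    "right": "\x1b[C",
    "left": "\x1b[D",
}


def keys(*names: str) -> str:
    """Join named keys and literal text into one string for async_send_text().

    Names found in KEYS (case-insensitive) are translated; anything else is sent as-is.
    """
    return "".join(KEYS.get(n.lower(), n) for n in names)


# Move down twice, press Enter, type a filename, then Ctrl+X.
sequence = keys("down", "down", "enter", "notes.txt", "ctrl+x")
print(repr(sequence))
```

The resulting string can be passed to `session.async_send_text(...)` in one call, or split into several calls when the TUI needs time to redraw between keys.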
- uv run: never run Python directly
- app.current_terminal_window
- sleep() for initialization
- \r for Enter in TUIs: safer than \n for prompts
- suppress_broadcast=True when broadcast input may be enabled (prevents text leaking to other sessions)

| Scope | Location | Git |
|-------|----------|-----|
| Project-specific | ./.claude/automations/{script}.py or ./.agent/automations/{script}.py | COMMIT — these are project assets |
| General utility | ~/.claude/automations/{script}.py | N/A (user home) |
| Screenshots | ./.claude/screenshots/ | GITIGNORE — local verification only |
Important: Automation scripts SHOULD be committed to the repository. They are project assets that enforce test coverage and enable reproducible testing across machines and agents. Use .claude/automations/ or the agent-neutral .agent/automations/ folder.
No PII in scripts: Since scripts are committed and shareable, they MUST NOT contain:
- ~, $HOME, or relative paths)

Screenshots MUST NOT be committed. Add to .gitignore:
.claude/screenshots/
.agent/screenshots/
screenshots/
Examples (examples/ directory):
- 00-comprehensive-template.py: Complete template with all patterns
- 01-basic-tab.py: Simple tab creation and command execution
- 02-dev-layout.py: Multi-pane development layout
- 03-repl-driver.py: REPL automation with verification
- 04-nano-automation.py: TUI editor interaction with cleanup
- 05-screen-monitor.py: ScreenStreamer for real-time monitoring
- 06-environment-vars.py: Environment variable handling
- 07-cleanup-sessions.py: Session cleanup patterns
- 08-badge-control.py: Badge and tab control
- 09-special-keys.py: Special key sequences for TUI navigation
- 10-session-reuse.py: Get-or-create session reuse pattern
- 11-layout-verification.py: TUI layout alignment checks
- 12-parallel-agents.py: Multiple concurrent agents with independent screenshots
- 13-connection-diagnostics.py: Pre-flight checks and troubleshooting

References (references/ directory):
- templates.md: Copy-paste script templates (single + parallel)
- verification-patterns.md: All verification helpers including layout checks
- reporting.md: Test reporting patterns, JSON/JUnit export
- parallel-patterns.md: Parallel agent patterns, Quartz correlation, cleanup
- cloud-and-limitations.md: Platform support matrix, cloud alternatives