skills/agent-tools/SKILL.md
Command-line tools that delegate analysis tasks to AI models and format up-to-date context for agents. Includes image description, screenshot comparison, smart cropping, token counting, technical essay generation, boolean condition evaluation, live context gathering, Android UI interaction via popper, GitHub PR/Issue/Workflow Run formatting via gh-markdown, and deep reasoning research via Oracle. Use this skill when the user needs to analyze images, count tokens, evaluate conditions, gather the latest authoritative documentation, format GitHub data, automate Android apps, generate technical essays, or perform complex architectural reasoning requiring recursive directory traversal and external search. Triggers: ai analysis, describe image, visual diff, token count, generate essay, boolean evaluation, gather context, latest docs, research topic, github, pull request, gh-markdown, automate app, oracle, deep research, architecture.
npx skillsauth add ithinkihaveacat/dotfiles agent-tools
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
ALWAYS prefer the scripts in scripts/ over raw curl API calls. Scripts
are located in the scripts/ subdirectory of this skill's folder. They provide
features that raw commands do not:
When to read the script source: If a script doesn't do exactly what you need, or fails due to missing dependencies, read the script source. The scripts encode API best practices (image ordering, structured output schemas, model selection) that may not be obvious—use them as reference when building similar functionality.
Environment: AI commands require a Gemini API key (reads from
GEMINI_API_KEY). Scripts will report clear errors if no key is found.
gh-markdown optionally accepts a --token for GitHub API access.
Model selection: Every Gemini-backed script accepts --model MODEL and
honors the GEMINI_MODEL environment variable (--model wins; each script's
built-in default applies when neither is set). Defaults vary per tool and are
tuned for its task; only override when you have a reason.
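The precedence described above can be sketched as a small shell function. This is only an illustration of the rule (flag, then environment variable, then built-in default); `resolve_model` and its argument order are hypothetical and not taken from the scripts themselves.

```shell
# Sketch of the documented precedence: --model > GEMINI_MODEL > built-in default.
# resolve_model is a hypothetical helper, not part of the skill's scripts.
# Usage: resolve_model DEFAULT_MODEL [FLAG_MODEL]
resolve_model() {
  if [ -n "${2:-}" ]; then
    # A value passed via --model always wins.
    printf '%s\n' "$2"
  elif [ -n "${GEMINI_MODEL:-}" ]; then
    # Otherwise fall back to the environment variable, if set.
    printf '%s\n' "$GEMINI_MODEL"
  else
    # Neither is set: use the tool's own default.
    printf '%s\n' "$1"
  fi
}
```

In practice this means `GEMINI_MODEL=... scripts/photo-query ...` changes the model for that one invocation, while an explicit `--model` overrides both.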
Dependencies: curl, jq, uv (all tools); base64, magick (image
tools only)
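If it helps to check dependencies up front, a minimal sketch follows. `require_cmds` is a hypothetical helper; returning 127 simply mirrors the "missing dependency" exit code that several of the scripts document.

```shell
# Sketch: verify that required commands are on PATH before calling a script.
# require_cmds is a hypothetical helper, not part of the skill's scripts.
require_cmds() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      printf 'missing dependency: %s\n' "$cmd" >&2
      return 127
    fi
  done
  return 0
}
```

For example, `require_cmds curl jq uv base64 magick` could run before any of the image tools.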
# Gather context and analyze
scripts/context show gemini-api | scripts/emerson "Explain the key features"
# Fetch a GitHub PR, Issue, or Workflow Run as Markdown
scripts/gh-markdown https://github.com/owner/repo/pull/123
# Describe an image (generate alt-text)
scripts/screenshot-describe screenshot.png
# Compare two images for visual differences
scripts/screenshot-compare before.png after.png
# Smart crop image around the detected primary subject
scripts/photo-smart-crop photo.jpg cropped.jpg
# Check if a photo prominently features people (exit code = answer)
scripts/photo-query @people photo.jpg
# Generic image query with a JSON schema
scripts/photo-query "Is there a fireplace?" \
--schema "has_fireplace bool" photo.jpg
# Generate essay-length analysis from text
scripts/emerson "Summarize the key changes" < documentation.md
# Evaluate a boolean condition against text
echo "Hello world" | scripts/satisfies "is a greeting"
# Count tokens in text
cat document.md | scripts/token-count
# Interact with an Android UI via AI
scripts/popper "start an exercise"
Consult the Oracle for a very carefully researched and considered answer. The Oracle utilizes deep reasoning and Google Search grounding to provide the highest quality response possible. It accepts arbitrary files and directories as positional arguments, recursively walks directories, and automatically uploads media files. Use this tool for deep research, complex architectural reasoning, and synthesis requiring external data or massive repository context.
Important Usage Guidelines:
Run first with the --dry-run flag:
scripts/oracle --dry-run "PROMPT" [FILE_OR_DIR ...]
Present the resulting dry-run summary (the total payload size, the list of resolved files, and your
drafted prompt) to the user in the chat. Ask the user if they want to add
more directories, exclude specific files, or tweak the focus of the prompt.
Proceed with the live command (without --dry-run) only after the user approves
the plan.
Warning: Output can be detailed and lengthy.
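For interactive terminal use, the dry-run-then-approve flow could be wrapped roughly as below. This is a sketch under assumptions: `oracle_reviewed` and the `ORACLE` variable are hypothetical, and the terminal prompt stands in for the chat approval step described above.

```shell
# Sketch: always dry-run first, then require explicit approval before the live run.
# ORACLE is overridable so the flow can be tried without the real script.
ORACLE="${ORACLE:-scripts/oracle}"

oracle_reviewed() {
  # Step 1: dry run prints the payload size, resolved files, and prompt.
  "$ORACLE" --dry-run "$@" || return 1
  # Step 2: ask for approval; anything other than y/Y aborts.
  printf 'Proceed with live run? [y/N] ' >&2
  read -r answer || answer=""
  case "$answer" in
    y|Y) "$ORACLE" "$@" ;;
    *)   echo "aborted; adjust files or prompt and retry" >&2; return 1 ;;
  esac
}
```

A call such as `oracle_reviewed "Evaluate this design" src/` would then never reach the live command without a confirmation.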
scripts/oracle [OPTIONS] "PROMPT" [FILE_OR_DIR ...]
Options:
--force: Bypass context size limits (1MB for text, 20MB per media file). Use
when you are confident the large context is necessary and the model can handle
it.
--maps: Use Google Maps grounding instead of Google Search. Use this for
queries about locations, places, or general routing options. Warning:
Specific details like live star ratings, current operating hours, or recent
business closures may still be inaccurate or outdated and should be verified.
Note: Cannot be combined with --code.
--code: Enable Code Execution for Python. Use this whenever the task
requires precise calculations, complex mathematics, data analysis on provided
files, or programmatic logic. The model will write and execute Python code in
a sandboxed environment.
Environment: GEMINI_API_KEY (Required)
Exit codes: 0 success, 1 error
Examples:
# Evaluate an architectural pattern
scripts/oracle "Evaluate this implementation against SOLID principles and propose a refactoring plan." src/
# Time-sensitive research based on context
scripts/oracle "What are the latest developments in this framework as of May 2026?" framework-docs.md
Fetch GitHub Pull Requests, Issues, or Workflow Runs and format them as Markdown for LLM Agents.
Features:
Requires GITHUB_TOKEN environment variable to be set with a GitHub Personal
Access Token.
Token Setup: You can generate a token at https://github.com/settings/personal-access-tokens.
scripts/gh-markdown URL
Environment: GITHUB_TOKEN (Required)
Exit codes: 0 success, 1 error
Examples:
# Fetch a PR
scripts/gh-markdown https://github.com/owner/repo/pull/123
# Fetch a Workflow Run
scripts/gh-markdown https://github.com/owner/repo/actions/runs/12345678
Gathers the very latest, authoritative, up-to-date context for deep research on
various technical subjects (e.g., gemini-api, mcp, home-assistant) or
arbitrary GitHub directories. Run context catalog to see all available entries.
This script should be your first tool for gathering background knowledge or the
latest documentation for an unfamiliar domain. Supports passing a full GitHub
URL as a target (e.g., https://github.com/owner/repo/tree/branch/path).
Warning: Output can be very large. Do not read output directly into your
conversation history. Pipe to emerson for analysis, or redirect to a file to
search/read locally.
scripts/context show TARGET
Commands: catalog (list available entries), show (show context for target), template (output plugin template)
Exit codes: 0 success, 1 error, 127 missing dependency
Examples:
# List available catalog entries
scripts/context catalog
# Gather context for Gemini API
scripts/context show gemini-api > gemini-context.xml
# Pipe context directly to analysis
scripts/context show gemini-cli | scripts/emerson "How do commands work?"
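Because the output can be very large, one way to follow the warning above is to write it to a file and report only its size. The helper below is a sketch; `save_context` is a hypothetical name and the size report is just one possible guard.

```shell
# Sketch: capture a command's output in a file and print only its size,
# so the content itself stays out of the conversation.
# Usage: save_context OUTFILE COMMAND [ARGS...]
save_context() {
  outfile="$1"; shift
  "$@" > "$outfile" || return 1
  bytes=$(wc -c < "$outfile" | tr -d ' ')
  printf '%s: %s bytes\n' "$outfile" "$bytes"
}
```

For example, `save_context gemini-context.xml scripts/context show gemini-api` leaves the context on disk, where it can be searched with `grep` or measured with `scripts/token-count < gemini-context.xml`.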
Generate concise alt-text for an image. Optimized for UI captures.
scripts/screenshot-describe IMAGE [PROMPT]
Exit codes: 0 success, 1 error, 127 missing dependency
Compare two images for visual differences. Identifies layout shifts, color changes, padding, and text updates.
scripts/screenshot-compare IMAGE1 IMAGE2 [PROMPT]
Exit codes: 0 differences found, 1 error (including missing ImageMagick), 2 images identical
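Note that exit code 0 here means "differences found", so a plain `if` reads backwards from what one might expect. A sketch of mapping the documented codes to explicit words follows; `compare_status` and the `COMPARE` variable are hypothetical.

```shell
# Sketch: translate screenshot-compare's exit codes into explicit status words.
# COMPARE is overridable so the mapping can be tried without the real script.
COMPARE="${COMPARE:-scripts/screenshot-compare}"

compare_status() {
  rc=0
  # Send the tool's own report to stderr so stdout carries only the status word.
  "$COMPARE" "$@" >&2 || rc=$?
  case "$rc" in
    0) echo "different" ;;
    2) echo "identical" ;;
    *) echo "error"; return 1 ;;
  esac
}
```

A caller could then write `[ "$(compare_status before.png after.png)" = "identical" ]` instead of remembering which code means what.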
Smart crop images around the detected primary subject (people, food, focal points in a landscape) with a specified aspect ratio. Centers the maximal crop box on the subject and enforces the aspect ratio. If no specific focal point is found, crops around the central compositional area.
scripts/photo-smart-crop [--ratio W:H] INPUT OUTPUT
Options: --ratio W:H (default 5:3)
Exit codes: 0 success, 1 error (API error, invalid arguments), 2 rate limited, 127 missing dependency
Examples:
# Default 5:3 aspect ratio
scripts/photo-smart-crop family.jpg family-cropped.jpg
# 16:9 for video thumbnails
scripts/photo-smart-crop --ratio 16:9 portrait.jpg thumbnail.jpg
# Square crop for profile pictures
scripts/photo-smart-crop --ratio 1:1 headshot.png avatar.png
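Since exit code 2 signals rate limiting, a caller might retry only in that case. The wrapper below is a sketch with exponential backoff; `retry_on_rate_limit`, the `RETRY_DELAY` variable, and the backoff schedule are all assumptions rather than behavior of the script.

```shell
# Sketch: retry a command only when it exits with 2 (documented as "rate limited").
# Usage: retry_on_rate_limit MAX_ATTEMPTS COMMAND [ARGS...]
retry_on_rate_limit() {
  max_attempts="$1"; shift
  attempt=1
  delay="${RETRY_DELAY:-2}"   # initial wait in seconds, doubled after each retry
  while :; do
    rc=0
    "$@" || rc=$?
    # Success and non-rate-limit errors are returned immediately.
    if [ "$rc" -ne 2 ] || [ "$attempt" -ge "$max_attempts" ]; then
      return "$rc"
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

For example: `retry_on_rate_limit 4 scripts/photo-smart-crop --ratio 16:9 portrait.jpg thumbnail.jpg`.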
Ask Gemini a question about one or more photos. The QUERY positional is either
an @-prefixed built-in or a free-form prompt:
@people (boolean): do people feature prominently? Single-file mode encodes
the answer in the exit code (0 true / 1 false / 2 error); stdout is silent;
-v echoes true/false to stderr. Defaults to a 384px resize
(single-tile token cost).
Free-form prompt: accepts --schema SPEC (llm-style DSL
like 'has_bed bool, count int') for structured output and --filter FIELD
to print only paths whose boolean field is true.
Multiple files (or non-boolean queries) emit per-file lines on stdout; the exit code only reflects success/failure.
Default model is gemini-3.1-flash-lite — the cheapest/fastest Gemini 3 tier,
appropriate for high-volume classification and lightweight visual Q&A. Override
with --model for harder questions.
Deterministic image prep (EXIF rotate, alpha flatten, resize to --max-size
(default 768, 384 for @people), WebP encode) is content-addressed-cached at
~/.cache/agent-tools/photo-query/ so repeated queries against the same images
skip the resize entirely. Use --no-cache to bypass.
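To illustrate what "content-addressed" means here, the sketch below derives a cache key from the file's bytes plus the resize setting, so identical images share an entry regardless of filename. The key layout and the `cache_key` helper are assumptions for illustration; the script's actual scheme may differ.

```shell
# Sketch: a content-addressed cache key built from the file's SHA-256 digest
# and the max-size setting. The key format is hypothetical.
# Usage: cache_key FILE MAX_SIZE
cache_key() {
  if command -v sha256sum >/dev/null 2>&1; then
    digest=$(sha256sum < "$1" | cut -d ' ' -f 1)
  else
    # Fallback for systems that ship shasum instead of sha256sum.
    digest=$(shasum -a 256 < "$1" | cut -d ' ' -f 1)
  fi
  printf '%s-%s\n' "$digest" "$2"
}
```

Under this scheme a renamed copy of a photo would hit the cache, while the same photo at a different `--max-size` would not.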
# Boolean check (exit code idiom)
if scripts/photo-query @people photo.jpg; then echo "Found people"; fi
# Multi-file boolean: per-line `<path>\t<true|false>` on stdout
scripts/photo-query @people *.jpg
# Schema-constrained query with filter
scripts/photo-query --recursive \
--schema "has_bedside_table bool" \
--filter has_bedside_table \
"Does this image feature a bedside table?" \
./photos/
# Free-text description per file
scripts/photo-query "Describe the scene in under 200 chars." room.jpg
Exit codes: 0 success (or true for single-file boolean), 1 false (only for single-file boolean), 2 error (network, parse, missing file).
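In multi-file mode the answer arrives as tab-separated `<path>` and `<true|false>` lines rather than in the exit code, so a small filter is one way to recover the matching paths. `paths_where_true` is a hypothetical helper and assumes paths contain no tab characters.

```shell
# Sketch: read "<path>\t<true|false>" lines on stdin and print the paths marked true.
paths_where_true() {
  awk -F '\t' '$2 == "true" { print $1 }'
}
```

For example, `scripts/photo-query @people *.jpg | paths_where_true` would list only the photos that prominently feature people.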
Generate essay-length (~3000 words) analysis from text input. Produces
authoritative, footnoted Markdown. Operates as a strict, sandboxed tool that
relies entirely on the provided standard input (stdin): it performs
closed-book analysis without external search and is instructed to treat the
input as the sole source of truth, which reduces the risk of
hallucination. Use this tool when you need summarization or formatting of
specific, pre-gathered text. Can be combined with context to provide rich
background material.
scripts/emerson "PROMPT" < input.txt
Exit codes: 0 success, 1 error, 127 missing dependency
Ask a question and get a short, paragraph-style response (wrapped to 80 columns). Optimized for quick answers.
scripts/pascal [-] "QUESTION"
Input: Optional context via stdin. Pass - as the first argument to read
it; without -, stdin is ignored.
Exit codes: 0 success, 1 error, 127 missing dependency
Examples:
# Ask a quick question
scripts/pascal "What is the capital of Peru?"
# Summarize a file
cat article.md | scripts/pascal - "Summarize this article"
# Explain code
scripts/pascal - "Explain this code" < script.sh
Evaluate whether input text satisfies a condition. Returns boolean via exit code.
echo "text" | scripts/satisfies [-v|--verbose] "CONDITION"
Options: -v, --verbose (output "true" or "false" to stderr)
Exit codes: 0 true (satisfies), 1 false (does not satisfy), 127 missing dependency
Examples:
# Check if file mentions a topic
cat file.txt | scripts/satisfies "mentions Elvis" && echo "Found it"
# Validate content type
cat response.json | scripts/satisfies "is valid JSON with an 'id' field"
# Use in conditionals
if cat log.txt | scripts/satisfies "contains error messages"; then
echo "Errors detected"
fi
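One caveat with the `if` idiom: any non-zero exit, including 127 for a missing dependency, lands in the false branch. A sketch that keeps the cases apart is shown below; `check_condition` and the `SATISFIES` variable are hypothetical.

```shell
# Sketch: distinguish "false" from failures when calling satisfies.
# SATISFIES is overridable so the logic can be tried without the real script.
SATISFIES="${SATISFIES:-scripts/satisfies}"

# Usage: some-command | check_condition "CONDITION"
check_condition() {
  rc=0
  "$SATISFIES" "$1" || rc=$?
  case "$rc" in
    0)   echo "true" ;;
    1)   echo "false" ;;
    127) echo "missing dependency" >&2; return 127 ;;
    *)   echo "unexpected exit code $rc" >&2; return "$rc" ;;
  esac
}
```

This way a broken environment is reported as a failure rather than silently read as "condition not satisfied".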
Count tokens in text using the Gemini API.
cat file.txt | scripts/token-count
Exit codes: 0 success, 1 error, 127 missing dependency
Ping Gemini models to test API key validity and endpoint responsiveness. Runs checks in parallel and enforces a 60-second timeout.
scripts/gemini-api-doctor [MODELS...]
Input:
Environment: GEMINI_API_KEY (Optional. Used if set, otherwise reads from
stdin)
Options:
--help: Display help message.
Examples:
echo "YOUR_API_KEY" | scripts/gemini-api-doctor
scripts/gemini-api-doctor gemini-3.1-flash-lite
Exit codes: 0 success, 1 error
Interact with Android UIs using an AI agent powered by uiautomator2 and
Gemini. This allows semantic control of the device by providing a goal in
natural language. Screenshots are captured at each step and saved to a unique
run directory in an XDG-compliant temporary location.
scripts/popper "GOAL"
Options: --launch PACKAGE (launch a package before starting),
--stay-in-app (restrict the run to a single application package),
--dump-layout (print the current simplified UI layout as JSON and exit),
--agent-screenshots / --no-agent-screenshots (enable/disable sending
screenshots to API), --local-screenshots / --no-local-screenshots
(enable/disable saving screenshots locally), --screenshot-dir DIR (override
directory to save screenshots)
Environment: ANDROID_SERIAL (optional, target specific device)
Exit codes: 0 success (task completed), 1 error (task failed), 2 timeout
Examples:
# General UI task
scripts/popper "accept all permissions"
# Launch an app and keep the run inside it
scripts/popper --launch com.example.fitness --stay-in-app "start a running exercise"
# Target specific device
env ANDROID_SERIAL=12345 scripts/popper "open settings"
photo-query uses lossy WebP and
photo-smart-crop uses HEIF (both resize first to limit token cost)
screenshot-describe drops the alpha channel
(-alpha off); screenshot-compare flattens onto a magenta background, so
transparency differences show up in comparisons; photo-query flattens onto
white
base64: -w 0 (Linux) or -b 0 (macOS) for single-line output