skills/gemini-image/SKILL.md
Automate image generation on Google Gemini (gemini.google.com) using CamoFox CLI. Use when the user asks to "generate image", "create image with Gemini", "AI image generation", "tạo hình ảnh", or needs automated Gemini workflows.
npx skillsauth add redf0x1/camofox-browser gemini-imageInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Use this skill for Gemini image generation flows with CamoFox CLI commands only. Use curl only for download file retrieval from the local CamoFox download API.
9377.When a valid Gemini session is already saved:
All commands below assume --user gemini. Add it explicitly in multi-session scenarios.
# Load saved session and open Gemini
camofox open https://gemini.google.com/app --user gemini
# Load saved profile
camofox session load <profile-name> --user gemini
# Verify auth (check for profile avatar, no "Sign in" button)
camofox snapshot --user gemini
# Type image prompt
camofox type <prompt-ref> "your image description here" --user gemini
# Submit prompt
camofox press Enter --user gemini
# Wait for image generation to complete (15-180 seconds)
camofox wait 'mat-icon[fonticon="download"]' --timeout 180000 --user gemini
# Download the generated image
camofox click 'mat-icon[fonticon="download"]' --user gemini
# Cleanup
camofox close --user gemini
camofox open https://gemini.google.com/app --user gemini
camofox snapshot --user gemini
# Look for: "Sign in" button -> NOT authenticated
# Look for: Greeting text "Xin chào..." -> authenticated
# Navigate to Google login
camofox navigate https://accounts.google.com --user gemini
# Enter email
camofox fill '[e1]="[email protected]"' --user gemini
# OR
camofox type e1 "[email protected]" --user gemini
camofox click e4 --user gemini # Next button
# Wait for password page
camofox wait 'input[type="password"]' --user gemini
camofox snapshot --user gemini
# Enter password
camofox type e2 "password" --user gemini
camofox click e4 --user gemini # Next button
Security Tip: For production workflows, store credentials in CamoFox Auth Vault:
camofox auth save google-gemini # Follow interactive prompts to save credentials securely camofox auth load google-gemini --user gemini
After entering password, Google may prompt 2FA:
camofox wait 'a[href*="myaccount.google.com"]' --timeout 90000 --user gemini
accounts.google.com:camofox wait navigation --timeout 90000 --user gemini
camofox session save <profile-name> --user gemini
# Reuse next time with: camofox session load <profile-name> --user gemini
# Navigate to Gemini (if not already there)
camofox navigate https://gemini.google.com/app --user gemini
# Snapshot to find prompt input ref
camofox snapshot --user gemini
# Prompt textbox is typically the main textarea element
# Type your image description
camofox type <prompt-ref> "Generate an image of a cute red fox wearing sunglasses in digital art style" --user gemini
# Submit the prompt
camofox press Enter --user gemini
# Wait for generation to complete (15-180 seconds)
# The download button appears when generation is done
camofox wait 'mat-icon[fonticon="download"]' --timeout 180000 --user gemini
# Verify the result
camofox snapshot --user gemini
# Key refs after generation:
# - Download button (mat-icon fonticon="download")
# - Like (mat-icon fonticon="thumb_up")
# - Dislike (mat-icon fonticon="thumb_down")
# - Redo
# - Share
# Click download button (locale-independent CSS selector)
camofox click 'mat-icon[fonticon="download"]' --user gemini
# Check download status
camofox downloads --format json --user gemini
# Parse `downloadId` from JSON output (for example with `jq -r '.[0].id'`)
# Retrieve the file via REST API
# The click response or downloads command shows the download ID
curl -o "generated-image.png" "http://localhost:9377/downloads/<download-id>/content?userId=gemini"
# Cleanup
camofox close --user gemini
mat-icon[fonticon="..."] instead of text labels.camofox wait with CSS selectors, not fixed delays.eN) shift after navigation/interaction.| Error | Solution |
|---|---|
| "Sign in" still visible after session load | Session expired (24-48h). Re-login with email/password + 2FA |
| wait times out for download button | Image generation may have failed. Run camofox snapshot and inspect error text |
| "can't create it right now" | Not authenticated. Re-check auth state and login again |
| "Image Generation Limit Reached" | Daily limit reached. Wait 24h or use another account |
| TAB_NOT_FOUND | Tab was invalidated. Open a new tab with camofox open |
| type does not input text | Gemini may use custom editor. Fallback: camofox eval 'document.querySelector(".ql-editor").innerHTML = "prompt"' |
[email protected] (User)gemini-sky (52 cookies)Generate an image of a cute red fox wearing sunglasses in digital art style8.1MB PNG (Gemini_Generated_Image_ecu3ujecu3ujecu3.png)~30 secondsclick mat-icon[fonticon="download"]tools
Anti-detection browser automation for AI agents. Use when the user needs stealth web browsing, undetectable scraping, fingerprint spoofing, proxy rotation, or privacy-focused browser automation. Triggers include "stealth scrape", "anti-detection", "bypass fingerprinting", "camofox", "camoufox", "undetectable browser", "bot evasion", or any browser task requiring evasion of bot detection systems.
tools
Reddit automation with CamoFox CLI — login, browse, post, comment, reply with anti-detection. Use when the user needs to automate Reddit interactions, post to subreddits, comment on threads, or browse Reddit programmatically. Triggers include "reddit", "post to reddit", "reddit comment", "reddit automation", or any request to interact with Reddit via browser automation.
development
QA testing workflow for CamoFox Browser — systematic testing with console capture, error detection, and Playwright tracing. Use when the user needs to test a web application, perform QA validation, do exploratory testing, or verify web app behavior. Triggers include "test this website", "QA", "dogfood", "exploratory testing", "find bugs", or any request to systematically test a web application.
tools
CamoFox CLI — 50 commands for anti-detection browser automation from the terminal. Use when the user needs CLI reference for camofox commands, terminal-based browser control, or a quick lookup of command syntax. Triggers include "camofox command", "CLI reference", "terminal browser", or any request for camofox command-line usage.