skills/agent-media/SKILL.md
Agent-first media toolkit for image, video, and audio processing. Use when you need to resize, convert, generate, edit, or upscale images; remove backgrounds; extend or crop canvases; extract audio; transcribe speech; or generate videos. All commands return deterministic JSON output.
npx skillsauth add agntswrm/agent-media agent-media
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Agent Media is an agent-first media toolkit that provides CLI-accessible commands for image, video, and audio processing. All commands produce deterministic, machine-readable JSON output.
npx agent-media@latest image resize - Resize an image
npx agent-media@latest image convert - Convert image format
npx agent-media@latest image generate - Generate image from text
npx agent-media@latest image edit - Edit one or more images with text prompt
npx agent-media@latest image remove-background - Remove image background
npx agent-media@latest image upscale - Upscale image with AI super-resolution
npx agent-media@latest image extend - Extend image canvas with padding
npx agent-media@latest image crop - Crop image to dimensions around focal point
npx agent-media@latest audio extract - Extract audio from video
npx agent-media@latest audio transcribe - Transcribe audio to text
npx agent-media@latest video generate - Generate video from text or image
All commands return JSON to stdout:
{
  "ok": true,
  "media_type": "image",
  "action": "resize",
  "provider": "local",
  "output_path": "output_123.webp",
  "mime": "image/webp",
  "bytes": 12345
}
On error:
{
  "ok": false,
  "error": {
    "code": "INVALID_INPUT",
    "message": "input file not found"
  }
}
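The two JSON shapes above lend themselves to a small wrapper on the calling side. The following is a minimal sketch, not part of agent-media itself: `parse_result`, `MediaResult`, and `AgentMediaError` are hypothetical names, and it assumes every successful response carries the same fields as the resize sample (other actions may return different or additional fields).

```python
import json
from dataclasses import dataclass


class AgentMediaError(Exception):
    """Raised when the JSON reports "ok": false (hypothetical helper)."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


@dataclass
class MediaResult:
    """Fields taken from the success sample; assumed, not guaranteed, for all actions."""
    media_type: str
    action: str
    provider: str
    output_path: str
    mime: str
    size_bytes: int


def parse_result(stdout: str) -> MediaResult:
    """Parse the JSON a command printed to stdout, raising on the error shape."""
    payload = json.loads(stdout)
    if not payload.get("ok"):
        err = payload.get("error", {})
        raise AgentMediaError(err.get("code", "UNKNOWN"), err.get("message", ""))
    return MediaResult(
        media_type=payload["media_type"],
        action=payload["action"],
        provider=payload["provider"],
        output_path=payload["output_path"],
        mime=payload["mime"],
        size_bytes=payload["bytes"],
    )
```

Branching on `ok` first means a caller never reads `output_path` from an error response, and the `code` string stays available for programmatic handling.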
--provider <name>
AGENT_MEDIA_DIR - Custom output directory
FAL_API_KEY - Enable fal provider
REPLICATE_API_TOKEN - Enable replicate provider
RUNPOD_API_KEY - Enable runpod provider
AI_GATEWAY_API_KEY - Enable ai-gateway provider
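Before choosing a value for `--provider`, a caller may want to know which remote providers have credentials set. The sketch below only mirrors the environment variable names listed above; `enabled_providers` is a hypothetical helper, and treating an empty value as unset is an assumption, since how the CLI itself validates these variables is not documented here. The `local` provider seen in the sample output is omitted because no variable is listed for it.

```python
import os

# Provider name -> environment variable, as listed in this document.
PROVIDER_ENV_VARS = {
    "fal": "FAL_API_KEY",
    "replicate": "REPLICATE_API_TOKEN",
    "runpod": "RUNPOD_API_KEY",
    "ai-gateway": "AI_GATEWAY_API_KEY",
}


def enabled_providers(env=None):
    """Return provider names whose variable is set to a non-empty value.

    `env` defaults to os.environ; pass a dict to check a custom environment.
    """
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]
```

For example, `enabled_providers({"FAL_API_KEY": "key"})` returns `["fal"]`, which could then be passed as `--provider fal`.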
Generates video from text prompts or animates static images. Use when you need to create videos from descriptions, animate images, or produce video content using AI.
Upscales an image using AI super-resolution to increase resolution with detail generation. Use when you need to enlarge images, improve low-resolution photos, or prepare images for large-format display.
Resizes an image to specified dimensions. Use when you need to change image size, create thumbnails, or prepare images for specific display requirements.
Removes the background from an image, leaving the foreground subject with transparency. Use when you need to isolate subjects, create cutouts, or prepare images for compositing.