skills/vision-analysis/SKILL.md
Analyze, describe, and extract information from images using the MiniMax vision MCP tool. Use when: user shares an image file path or URL (any message containing .jpg, .jpeg, .png, .gif, .webp, .bmp, or .svg file extension) or uses any of these words/phrases near an image: "analyze", "analyse", "describe", "explain", "understand", "look at", "review", "extract text", "OCR", "what is in", "what's in", "read this image", "see this image", "tell me about", "explain this", "interpret this", in connection with an image, screenshot, diagram, chart, mockup, wireframe, or photo. Also triggers for: UI mockup review, wireframe analysis, design critique, data extraction from charts, object detection, person/animal/activity identification. Triggers: any message with an image file extension (jpg, jpeg, png, gif, webp, bmp, svg), or any request to analyze/describ/understand/review/extract text from an image, screenshot, diagram, chart, photo, mockup, or wireframe.
npx skillsauth add MiniMax-AI/skills vision-analysisInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Analyze images using the MiniMax MiniMax_understand_image MCP tool available in the MiniMax Token Plan.
MINIMAX_API_KEYMiniMax_understand_image tool available)Step 1: The agent should fetch setup instructions from: https://platform.minimaxi.com/docs/token-plan/mcp-guide
Step 2: Detect the user's environment (OpenCode, Cursor, Claude Code, etc.) and output the exact commands needed. Common examples:
OpenCode — add to ~/.config/opencode/opencode.json or package.json:
{
"mcp": {
"MiniMax": {
"type": "local",
"command": ["uvx", "minimax-coding-plan-mcp", "-y"],
"environment": {
"MINIMAX_API_KEY": "YOUR_TOKEN_PLAN_KEY",
"MINIMAX_API_HOST": "https://api.minimaxi.com"
},
"enabled": true
}
}
}
Claude Code:
claude mcp add -s user MiniMax --env MINIMAX_API_KEY=your-key --env MINIMAX_API_HOST=https://api.minimaxi.com -- uvx minimax-coding-plan-mcp -y
Cursor — add to MCP settings:
{
"mcpServers": {
"MiniMax": {
"command": "uvx",
"args": ["minimax-coding-plan-mcp"],
"env": {
"MINIMAX_API_KEY": "your-key",
"MINIMAX_API_HOST": "https://api.minimaxi.com"
}
}
}
}
Step 3: After configuration, tell the user to restart their app and verify with /mcp.
Important: If the user does not have a MiniMax Token Plan subscription, inform them that the understand_image tool requires one — it cannot be used with free or other tier API keys.
| Mode | When to use | Prompt strategy |
|---|---|---|
| describe | General image understanding | Ask for detailed description |
| ocr | Text extraction from screenshots, documents | Ask to extract all text verbatim |
| ui-review | UI mockups, wireframes, design files | Ask for design critique with suggestions |
| chart-data | Charts, graphs, data visualizations | Ask to extract data points and trends |
| object-detect | Identify objects, people, activities | Ask to list and locate all elements |
The skill triggers automatically when a message contains an image file path or URL with extensions:
.jpg, .jpeg, .png, .gif, .webp, .bmp, .svg
Extract the image path from the message.
Use the MiniMax_understand_image tool with a mode-specific prompt:
describe:
Provide a detailed description of this image. Include: main subject, setting/background,
colors/style, any text visible, notable objects, and overall composition.
ocr:
Extract all text visible in this image verbatim. Preserve structure and formatting
(headers, lists, columns). If no text is found, say so.
ui-review:
You are a UI/UX design reviewer. Analyze this interface mockup or design. Provide:
(1) Strengths — what works well, (2) Issues — usability or design problems,
(3) Specific, actionable suggestions for improvement. Be constructive and detailed.
chart-data:
Extract all data from this chart or graph. List: chart title, axis labels, all
data points/series with values if readable, and a brief summary of the trend.
object-detect:
List all distinct objects, people, and activities you can identify. For each,
describe what it is and its approximate location in the image.
Return the analysis clearly. For describe, use readable prose. For ocr, preserve structure. For ui-review, use a structured critique format.
For describe mode:
## Image Description
[Detailed description of the image contents...]
For ocr mode:
## Extracted Text
[Preserved text structure from the image]
For ui-review mode:
## UI Design Review
### Strengths
- ...
### Issues
- ...
### Suggestions
- ...
MiniMax_understand_image tool is provided by the minimax-coding-plan-mcp packagedevelopment
Comprehensive GLSL shader techniques for creating stunning visual effects — ray marching, SDF modeling, fluid simulation, particle systems, procedural generation, lighting, post-processing, and more.
development
React Native and Expo development guide covering components, styling, animations, navigation, state management, forms, networking, performance optimization, testing, native capabilities, and engineering (project structure, deployment, SDK upgrades, CI/CD). Use when: building React Native or Expo apps, implementing animations or native UI, managing state, fetching data, writing tests, optimizing performance, deploying to App Store/Play Store, setting up CI/CD, upgrading Expo SDK, or configuring Tailwind/NativeWind.
data-ai
Generate, edit, and read PowerPoint presentations. Create from scratch with PptxGenJS (cover, TOC, content, section divider, summary slides), edit existing PPTX via XML workflows, or extract text with markitdown. Triggers: PPT, PPTX, PowerPoint, presentation, slide, deck, slides.
development
Open, create, read, analyze, edit, or validate Excel/spreadsheet files (.xlsx, .xlsm, .csv, .tsv). Use when the user asks to create, build, modify, analyze, read, validate, or format any Excel spreadsheet, financial model, pivot table, or tabular data file. Covers: creating new xlsx from scratch, reading and analyzing existing files, editing existing xlsx with zero format loss, formula recalculation and validation, and applying professional financial formatting standards. Triggers on 'spreadsheet', 'Excel', '.xlsx', '.csv', 'pivot table', 'financial model', 'formula', or any request to produce tabular data in Excel format.