skills/glmv-prompt-gen/SKILL.md
Analyze images/videos and generate professional prompts for text-to-image and text-to-video AI tools (Midjourney, Stable Diffusion, DALL-E, Sora, Runway, Kling, Pika). Use when the user wants to generate prompts from reference images/videos, create AI art prompts, or get prompt engineering suggestions from visual content.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add zai-org/GLM-V glmv-prompt-gen
Analyze reference images or videos and generate professional prompts for AI image/video generation tools.
| Type  | Formats        | Max Size          | Max Count | Base64        |
| ----- | -------------- | ----------------- | --------- | ------------- |
| Image | jpg, png, jpeg | 5MB / 6000×6000px | 50        | ✅            |
| Video | mp4, mkv, mov  | 200MB             | —         | ❌ (URL only) |

⚠️ Images and videos cannot be used in the same request.
⚠️ Videos only support URLs; local paths and base64 are NOT supported.
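The limits above can be checked locally before a request is sent. The following is a minimal sketch, not the skill's actual code: `validate_inputs`, `is_url`, `extension`, and the constants are hypothetical names, and 5MB is assumed to mean 5 × 1024 × 1024 bytes. The 6000×6000px and 200MB limits are not checked, because they require decoding the image or fetching the remote file.

```python
import os
from urllib.parse import urlparse

# Limits copied from the table above; all names here are hypothetical.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}
VIDEO_EXTS = {".mp4", ".mkv", ".mov"}
MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumption: "5MB" means 5 MiB
MAX_IMAGE_COUNT = 50


def is_url(value: str) -> bool:
    """Return True for http(s) URLs, False for anything else (local paths)."""
    return urlparse(value).scheme in ("http", "https")


def extension(value: str) -> str:
    """Lower-cased file extension of a URL path or a local path."""
    path = urlparse(value).path if is_url(value) else value
    return os.path.splitext(path)[1].lower()


def validate_inputs(images: list[str], videos: list[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if images and videos:
        problems.append("images and videos cannot be mixed in one request")
    if not images and not videos:
        problems.append("provide at least one image or video")
    if len(images) > MAX_IMAGE_COUNT:
        problems.append(f"too many images: {len(images)} > {MAX_IMAGE_COUNT}")
    for img in images:
        if extension(img) not in IMAGE_EXTS:
            problems.append(f"unsupported image format: {img}")
        elif (not is_url(img) and os.path.isfile(img)
              and os.path.getsize(img) > MAX_IMAGE_BYTES):
            problems.append(f"image larger than 5MB: {img}")
    for vid in videos:
        if not is_url(vid):
            problems.append(f"videos must be URLs: {vid}")
        elif extension(vid) not in VIDEO_EXTS:
            problems.append(f"unsupported video format: {vid}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue to the user in a single pass.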
After running the script, you must display the full prompt output exactly as returned. Do not summarize, truncate, or only say "prompt generated". Users need the complete prompt (especially the English prompt) for direct copy/paste.
- In `auto` mode, show both text-to-image and text-to-video prompts
- If the result was saved to a file (`-o`), provide the file path and show the file content

| Mode | Description |
| ------- | -------------------------------------------------- |
| image | Generate prompts for text-to-image tools (default) |
| video | Generate prompts for text-to-video tools |
| auto | Generate prompts for both image and video |
| Resource    | Link                                             |
| ----------- | ------------------------------------------------ |
| Get API Key | https://bigmodel.cn/usercenter/proj-mgmt/apikeys |
| API Docs    | Chat Completions                                 |

This script reads the API key from the `ZHIPU_API_KEY` environment variable. Other Zhipu skills read the same variable, so one key serves all of them.
Get Key: Visit Zhipu Open Platform API Keys (https://bigmodel.cn/usercenter/proj-mgmt/apikeys) to create or copy your key.
Setup options (choose one):
OpenClaw config (recommended): Set in `openclaw.json` under `skills.entries.glmv-prompt-gen.env`:
"glmv-prompt-gen": { "enabled": true, "env": { "ZHIPU_API_KEY": "your-api-key" } }
Shell environment variable: Add to `~/.zshrc`:
export ZHIPU_API_KEY="your-api-key"
💡 If you already configured another Zhipu skill (for example `zhipu-tools` or `glmv-caption`), it shares the same `ZHIPU_API_KEY`, so no extra setup is needed.
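For illustration, reading the key and failing early when it is missing might look like the sketch below. `load_api_key` is a hypothetical helper, not a function taken from `scripts/prompt_gen.py`; the `env` parameter exists only so the lookup can be exercised without touching the real environment.

```python
import os


def load_api_key(env=None) -> str:
    """Return the key from ZHIPU_API_KEY, or raise if it is missing or blank."""
    # Hypothetical helper: defaults to the real process environment.
    env = os.environ if env is None else env
    key = env.get("ZHIPU_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ZHIPU_API_KEY is not set. Configure it in openclaw.json "
            "or export it in your shell."
        )
    return key
```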
python scripts/prompt_gen.py --images "https://example.com/photo.jpg"
python scripts/prompt_gen.py --images /path/to/photo.png
python scripts/prompt_gen.py --images "https://example.com/scene.jpg" --mode video
python scripts/prompt_gen.py --images "https://example.com/photo.jpg" --mode auto
python scripts/prompt_gen.py --videos "https://example.com/clip.mp4" --mode video
python scripts/prompt_gen.py --images photo.jpg --mode image -o prompt.md
python scripts/prompt_gen.py --images photo.jpg --model glm-4.6v-flash
### Content Analysis
A cyberpunk cityscape at night, with dense skyscrapers, glowing neon signs, and rain-wet streets reflecting colorful light.
### Prompt
Cyberpunk cityscape at night, towering skyscrapers with glowing neon signs,
rain-wet streets reflecting colorful lights, flying cars in the distance,
volumetric fog, dramatic lighting, ultra detailed, 8K, cinematic composition
### Prompt Breakdown
- **Subject**: Futuristic skyline with skyscrapers and neon lights
- **Style**: Cyberpunk, sci-fi
- **Color**: Cool/warm contrast with blue-purple dominance and neon accents
- **Lighting**: Neon glow, wet-surface reflections, volumetric fog
- **Composition**: Wide-angle perspective with layered depth
- **Mood**: Mysterious, futuristic, high-tech
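When the prompt needs to be passed on to another tool, the section under a heading can be pulled out of the saved output. This is a rough sketch that assumes the output always uses the `###` headings shown in the example above; `extract_section` is a hypothetical helper, and it is meant for downstream processing, not as a substitute for showing the user the full output.

```python
def extract_section(markdown: str, heading: str) -> str:
    """Return the text under '### <heading>', up to the next '### ' heading."""
    collected = []
    inside = False
    for line in markdown.splitlines():
        if line.startswith("### "):
            # Entering a new section: keep collecting only if it matches.
            inside = line[4:].strip() == heading
            continue
        if inside:
            collected.append(line)
    return "\n".join(collected).strip()
```

Because the heading comparison is exact, asking for `Prompt` does not also pick up `Prompt Breakdown`.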
python scripts/prompt_gen.py (--images IMG [IMG...] | --videos VID [VID...]) [OPTIONS]
| Parameter | Required | Description |
| --------------------- | -------- | -------------------------------------------------- |
| --images, -i | One of | Image paths or URLs (jpg/png/jpeg, base64 OK) |
| --videos, -v | One of | Video URLs (mp4/mkv/mov, URL only) |
| --mode, -m | No | Output mode: image (default), video, or auto |
| --model | No | Model name (default: glm-4.6v) |
| --temperature, -t | No | Sampling temperature 0-1 (default: 0.6) |
| --max-tokens | No | Max output tokens (default: 2048) |
| --thinking | No | Enable thinking/reasoning mode |
| --stream | No | Enable streaming output |
| --output, -o | No | Save result to file |
| --pretty | No | Pretty-print JSON error output |
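The parameter table maps naturally onto `argparse`. The sketch below mirrors the flags and defaults listed above; it is an assumption about how such a parser could be written, not a copy of the parser in `scripts/prompt_gen.py`, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="prompt_gen.py")

    # Exactly one of --images / --videos must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--images", "-i", nargs="+", metavar="IMG")
    source.add_argument("--videos", "-v", nargs="+", metavar="VID")

    parser.add_argument("--mode", "-m",
                        choices=["image", "video", "auto"], default="image")
    parser.add_argument("--model", default="glm-4.6v")
    parser.add_argument("--temperature", "-t", type=float, default=0.6)
    parser.add_argument("--max-tokens", type=int, default=2048)
    parser.add_argument("--thinking", action="store_true")
    parser.add_argument("--stream", action="store_true")
    parser.add_argument("--output", "-o")
    parser.add_argument("--pretty", action="store_true")
    return parser
```

A required mutually exclusive group lets `argparse` itself reject a command line that passes both `--images` and `--videos`, which matches the rule that the two cannot be combined in one request.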
- **API key not configured**: Guide the user to configure `ZHIPU_API_KEY`
- **Authentication failed (401/403)**: The API key is invalid or expired; check it on the Zhipu API Keys page
- **Rate limit (429)**: Quota exhausted; wait and retry
- **Content filtered**: A `warning` field is present, meaning the content was blocked by safety review
- **Timeout**: Video processing may take time; increase the timeout or use smaller files
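The cases above can be folded into one function that turns a failed call into a message for the user. This is a sketch under stated assumptions: `explain_failure` is a hypothetical helper, and the `warning` field is assumed to be a top-level key of the parsed response body, which the text above does not specify.

```python
from typing import Optional


def explain_failure(status: Optional[int] = None,
                    body: Optional[dict] = None,
                    timed_out: bool = False,
                    has_key: bool = True) -> str:
    """Map a request outcome to a user-facing message."""
    if not has_key:
        return "API key not configured: set ZHIPU_API_KEY"
    if timed_out:
        return ("Timeout: video processing may take time; "
                "increase the timeout or use smaller files")
    if status in (401, 403):
        return "Authentication failed: the API key is invalid or expired"
    if status == 429:
        return "Rate limit: quota exhausted; wait and retry"
    # Assumption: a truthy top-level "warning" key marks filtered content.
    if body and body.get("warning"):
        return "Content filtered: blocked by safety review"
    return "ok" if status == 200 else f"Unexpected error (HTTP {status})"
```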