plugins/zai-glm/skills/glm-models/SKILL.md
Use this skill when the user asks about GLM models, GLM-5, GLM-4.7, GLM-4.6, GLM-4.5, GLM-4V, ChatGLM, CogView, CogVideoX, z.ai model capabilities, model selection for different tasks, or comparing GLM models.
The GLM (General Language Model) family is developed by z.ai (formerly Zhipu AI / 智谱AI). These models support text generation, vision, code, embeddings, image generation, and video generation. All recent models are open-weight under the MIT license.
Base URL: `https://api.z.ai/api/paas/v4/`

| Model | Architecture | Context | Key Features |
| --------------- | ---------------------- | ------------------ | ------------------------------------------------------------- |
| glm-5 | ~745B MoE (44B active) | 200K in / 128K out | Agentic engineering, tool streaming, long-horizon tasks, MIT |
| glm-5-turbo | Same, optimized | 200K in / 128K out | Improved stability for long-chain agent tasks |
| glm-4.7 | ~400B MoE | 200K in / 128K out | Coding-focused, Preserved Thinking, Turn-level Thinking, MIT |
| glm-4.7-flash | Lightweight | Reduced | Free tier, lighter capability |
| glm-4.6 | 355B total | 200K | Strong code benchmarks, agent frameworks, MIT |
| glm-4.5 | 355B / 32B active | 128K | Hybrid reasoning (thinking/non-thinking modes), deep thinking |
| glm-4.5-x | Premium tier | 128K | Higher capability, premium pricing |
| glm-4.5-air | 106B / 12B active | 128K | Compact variant of GLM-4.5 |
| glm-4.5-flash | Lightweight | 128K | Free tier |

GLM-4.5+ models support hybrid reasoning, toggling between deep thinking and instant response:

```json
{
  "model": "glm-4.7",
  "messages": [{ "role": "user", "content": "Solve this step by step" }],
  "thinking": { "type": "enabled" }
}
```
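
The request body above can also be built programmatically. The following is a minimal Python sketch, not an official client: the helper name `build_chat_payload` is hypothetical, and `"disabled"` as the counterpart of `"enabled"` is an assumption that the example above does not show, so check the API reference before relying on it.

```python
import json


def build_chat_payload(model: str, prompt: str, deep_thinking: bool = True) -> dict:
    """Build a chat-completions request body with the thinking mode set explicitly."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # "enabled" comes from the documented example;
        # "disabled" is an ASSUMED counterpart for instant responses.
        "thinking": {"type": "enabled" if deep_thinking else "disabled"},
    }


if __name__ == "__main__":
    # Print the body that would be POSTed to the chat completions endpoint.
    print(json.dumps(build_chat_payload("glm-4.7", "Solve this step by step"), indent=2))
```

Keeping the toggle as a boolean argument makes it easy to request deep thinking only for prompts that need it.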

Tool streaming: set `tool_stream: true` in the request.

| Model | Parameters | Context | Description |
| ---------------- | ----------------- | ------- | -------------------------------------- |
| glm-4.6v | 106B / 12B active | 128K | Vision understanding, function calling |
| glm-4.6v-flash | 9B | — | Free, open weights, commercial license |
| glm-4.5v | 106B VLM | — | Vision-language model |

```bash
curl "https://api.z.ai/api/paas/v4/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.6v",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
      ]
    }]
  }'
```

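
For callers who prefer a script to curl, the same vision request might look like the Python sketch below. It uses only the standard library. The function names and the `ZAI_API_KEY` environment variable are hypothetical, and the response is returned as raw JSON because its shape is not described here.

```python
import json
import os
import urllib.request

API_URL = "https://api.z.ai/api/paas/v4/chat/completions"


def build_vision_payload(image_url: str, question: str, model: str = "glm-4.6v") -> dict:
    """Build a request body with one text part and one image part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def post_chat(payload: dict, api_key: str) -> dict:
    """POST the payload with the same headers as the curl example and return parsed JSON."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    payload = build_vision_payload("https://example.com/image.jpg", "What is in this image?")
    api_key = os.environ.get("ZAI_API_KEY")  # hypothetical variable name
    if api_key:
        print(json.dumps(post_chat(payload, api_key), indent=2))
    else:
        # Without a key, just show the body that would be sent.
        print(json.dumps(payload, indent=2))
```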
| Model | Category | Description |
| ----------------- | ---------------- | -------------------------- |
| glm-image | Image generation | Text-to-image (Jan 2026) |
| glm-ocr | OCR | Document and image OCR |
| cogview-3-plus | Image gen | High-quality text-to-image |
| cogvideox | Video gen | Text-to-video generation |
| cogvideox-flash | Video gen | Fast video generation |
| Model | Dimensions | Description |
| ------------- | ---------- | ------------------------------- |
| embedding-3 | 2048 | General-purpose text embeddings |
| embedding-2 | 1024 | Previous generation embeddings |

```bash
curl "https://api.z.ai/api/paas/v4/embeddings" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "embedding-3",
    "input": "What is machine learning?"
  }'
```

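
Embeddings are typically compared with cosine similarity for search and retrieval. The sketch below shows that step in plain Python. The response shape read by `extract_embedding` (`data[0].embedding`, as in OpenAI-style APIs) is an assumption, not something confirmed here, and both function names are hypothetical.

```python
import math


def extract_embedding(response: dict) -> list[float]:
    """Pull the vector out of an embeddings response.

    ASSUMPTION: the response looks like {"data": [{"embedding": [...]}]}.
    """
    return response["data"][0]["embedding"]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors stand in for real embedding-3 output.
    query = [1.0, 0.0, 1.0]
    document = [1.0, 1.0, 0.0]
    print(cosine_similarity(query, document))
```

Vectors from different embedding models have different dimensions, so only vectors produced by the same model should be compared.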
| Use Case | Recommended Model | Why |
| ------------------- | ----------------- | --------------------------------------- |
| Agentic tasks | glm-5 | Tool streaming, long-horizon planning |
| Coding | glm-4.7 | Coding-focused, Preserved Thinking |
| Complex reasoning | glm-4.5 | Hybrid reasoning with deep thinking |
| General chat | glm-4.5-flash | Free, good quality |
| High throughput | glm-4.5-air | Compact, fast inference |
| Image understanding | glm-4.6v | Best vision model with function calling |
| Embeddings/search | embedding-3 | Latest generation |
| Image creation | glm-image | Latest generation (Jan 2026) |
| Budget-conscious | glm-4.5-flash | Free tier available |
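
The recommendations above can be encoded as a small lookup so that application code picks a model by task. This is an illustrative Python sketch: the use-case keys and the `pick_model` helper are hypothetical, and falling back to `glm-4.5-flash` for unknown tasks is an assumption based on the general-chat row.

```python
# Use-case keys are hypothetical labels; model IDs come from the table above.
RECOMMENDED_MODELS = {
    "agentic": "glm-5",
    "coding": "glm-4.7",
    "reasoning": "glm-4.5",
    "chat": "glm-4.5-flash",
    "throughput": "glm-4.5-air",
    "vision": "glm-4.6v",
    "embeddings": "embedding-3",
    "image-generation": "glm-image",
    "budget": "glm-4.5-flash",
}


def pick_model(use_case: str, default: str = "glm-4.5-flash") -> str:
    """Return the recommended model for a use case, or the default if it is unknown."""
    return RECOMMENDED_MODELS.get(use_case.strip().lower(), default)


if __name__ == "__main__":
    for task in ("coding", "vision", "translation"):
        print(task, "->", pick_model(task))
```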
When using z.ai's Anthropic-compatible endpoint with Claude Code, map models to slots:
| Claude Code Slot | Recommended GLM Model | Rationale |
| ---------------- | --------------------- | ---------------------------- |
| Opus | glm-5 | Most capable, agentic |
| Sonnet | glm-4.7 | Strong coding, balanced cost |
| Haiku | glm-4.5-air | Fast, cost-effective |
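
One way to apply this mapping is through environment variables in Claude Code's settings. The Python sketch below generates such a block. It assumes the slot variables are named `ANTHROPIC_DEFAULT_OPUS_MODEL`, `ANTHROPIC_DEFAULT_SONNET_MODEL`, and `ANTHROPIC_DEFAULT_HAIKU_MODEL` and that they live under an `env` key in the settings file; verify both against the current Claude Code and z.ai documentation. The endpoint URL and auth token are not covered here and are left out.

```python
import json

# Slot-to-model mapping taken from the table above.
SLOT_TO_MODEL = {
    "opus": "glm-5",
    "sonnet": "glm-4.7",
    "haiku": "glm-4.5-air",
}


def claude_code_env(slot_to_model: dict) -> dict:
    """Map each slot to an environment variable.

    ASSUMPTION: variables follow the pattern ANTHROPIC_DEFAULT_<SLOT>_MODEL.
    """
    return {
        f"ANTHROPIC_DEFAULT_{slot.upper()}_MODEL": model
        for slot, model in slot_to_model.items()
    }


if __name__ == "__main__":
    # Emit a fragment that could be merged into a settings file's "env" section.
    print(json.dumps({"env": claude_code_env(SLOT_TO_MODEL)}, indent=2))
```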

| Model | Input (USD / 1M tokens) | Output (USD / 1M tokens) |
| ---------------- | ------ | ------ |
| glm-5 | ~$1.00 | ~$3.20 |
| glm-4.7 | $0.60 | $2.20 |
| glm-4.7-flash | Free | Free |
| glm-4.5 | ~$0.20 | ~$1.10 |
| glm-4.5-x | — | $8.90 |
| glm-4.5-flash | Free | Free |
| glm-4.6v | ~$0.14 | ~$0.41 |
| glm-4.6v-flash | Free | Free |
Prices approximate; see docs.z.ai/guides/overview/pricing for current rates. Batch API available at 50% cost.
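
A rough cost estimate follows directly from the table. The Python sketch below treats the listed prices as USD per 1M tokens, which is an assumption to confirm on the pricing page, and applies the 50% batch rate as a flat multiplier. `glm-4.5-x` is omitted because its input price is not listed; the function name is hypothetical.

```python
# (input, output) prices, ASSUMED to be USD per 1M tokens; values are approximate.
PRICES = {
    "glm-5": (1.00, 3.20),
    "glm-4.7": (0.60, 2.20),
    "glm-4.7-flash": (0.0, 0.0),
    "glm-4.5": (0.20, 1.10),
    "glm-4.5-flash": (0.0, 0.0),
    "glm-4.6v": (0.14, 0.41),
    "glm-4.6v-flash": (0.0, 0.0),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Estimate the USD cost of a request; batch=True applies the 50% batch rate."""
    input_price, output_price = PRICES[model]
    cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return cost * 0.5 if batch else cost


if __name__ == "__main__":
    # 200K input tokens and 20K output tokens on glm-4.7, with and without batch pricing.
    print(round(estimate_cost("glm-4.7", 200_000, 20_000), 4))
    print(round(estimate_cost("glm-4.7", 200_000, 20_000, batch=True), 4))
```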

glm-4.5-flash, glm-4.7-flash, and glm-4.6v-flash are free.