holyorevil/ascend

Name: ascend
Author: holyorevil

skills/quant-by-modelslim/SKILL.md

npx skillsauth add holyorevil/ascend-model-agent-plugin ascend

Ascend Inference Toolchain

This skill manages Ascend NPU-related tasks, troubleshooting, and toolchain usage.

Hardware Check

Run at the start of every session before any quantization or inference task:

npu-smi info

Verify:

All expected NPUs appear and show Health: OK
No NPU is occupied by another process (check "Process ID" column)
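If an NPU is occupied, ask the user whether to free it by killing vllm/python processes. Run these only after the user confirms, because pgrep -f matches every process whose command line contains the pattern:
```
# Force-kill every process whose command line contains "vllm"
kill -9 $(pgrep -f vllm) 2>/dev/null
# Force-kill every process whose command line contains "python" (a very broad match)
kill -9 $(pgrep -f python) 2>/dev/null
```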
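
The kill commands above send SIGKILL immediately. As a gentler variant (a sketch, not part of the skill itself), the hypothetical helper `stop_pid` below targets a single PID, for example one read from the "Process ID" column, sends SIGTERM first, and escalates to SIGKILL only if the process is still alive after a grace period. The function name and the default 5-second grace period are assumptions made for illustration.

```shell
# Hypothetical helper: stop one PID, escalating to SIGKILL only if needed.
# Usage: stop_pid <pid> [grace_seconds]
stop_pid() {
  local pid="$1" grace="${2:-5}" waited=0
  # Ask the process to exit cleanly first.
  kill -TERM "$pid" 2>/dev/null || return 0   # no such process, or not permitted
  while [ "$waited" -lt "$grace" ]; do
    kill -0 "$pid" 2>/dev/null || return 0    # exited within the grace period
    sleep 1
    waited=$((waited + 1))
  done
  # Still alive after the grace period: force-kill.
  kill -KILL "$pid" 2>/dev/null || true
}
```

Targeting one confirmed PID avoids the broad match of `pgrep -f python`, which can include processes unrelated to the NPU workload.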

Common Environment Setup

ASCEND_RT_VISIBLE_DEVICES controls which NPUs are visible to both vLLM and msmodelslim. Set this before any command that touches NPUs.
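
As a minimal sketch of the setup described above, the snippet below exports the variable and counts how many NPUs it exposes, which can serve as a sanity check before launching a job. The device IDs `0,1` and the helper name `visible_npu_count` are illustrative assumptions, not part of the skill.

```shell
# Expose only NPUs 0 and 1 to subsequent commands (IDs are illustrative).
export ASCEND_RT_VISIBLE_DEVICES=0,1

# Hypothetical helper: print how many device IDs the variable lists.
visible_npu_count() {
  # Prints 0 when the variable is unset or empty.
  if [ -z "${ASCEND_RT_VISIBLE_DEVICES:-}" ]; then
    echo 0
    return
  fi
  # One ID per line, then count the non-empty lines.
  echo "$ASCEND_RT_VISIBLE_DEVICES" | tr ',' '\n' | grep -c .
}

visible_npu_count
```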

Common Requirement: Run via Shell Script with Log Output

All actual run/quantization/inference commands must be saved to a shell script and executed through it. The script must redirect both stdout and stderr to a log file so that output is preserved for debugging.

Template:

cat > run.sh << 'EOF'
#!/bin/bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
LOG_FILE="${SCRIPT_DIR}/run_$(date +%Y%m%d_%H%M%S).log"

# Environment setup
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3

# Run and log
"$@" 2>&1 | tee "$LOG_FILE"
EOF
chmod +x run.sh
./run.sh <your-command>

Key points:

Both stdout and stderr are captured in the log file via 2>&1 | tee "$LOG_FILE"
Log file is named with a timestamp so each run gets a unique file
The script must be chmod +x before execution
Do not run commands directly in the terminal; always go through the script so output is saved
Do not run the script in the background (no &, no nohup, no run_in_background); run it in the foreground so output streams to the terminal in real time
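
Because each run writes a new timestamped file, it helps to locate the most recent log quickly. The sketch below assumes the `run_YYYYmmdd_HHMMSS.log` naming from the template, which sorts chronologically when sorted as text. The helper name `latest_log` is hypothetical.

```shell
# Hypothetical helper: print the newest run_*.log in a directory.
# Usage: latest_log [dir]   (defaults to the current directory)
latest_log() {
  local dir="${1:-.}"
  # Timestamped names sort chronologically, so the last one is the newest.
  ls "$dir"/run_*.log 2>/dev/null | sort | tail -n 1
}

# Example: run the hardware check through the wrapper, then inspect its log.
#   ./run.sh npu-smi info
#   tail -n 50 "$(latest_log .)"
```

Since the template sets `pipefail`, the exit status of `run.sh` reflects a failure of the wrapped command rather than the status of `tee`.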

Task Specifics

For detailed instructions on specific tools, refer to:

vLLM-Ascend: See vllm-install.md for installation and vllm-run.md for running and troubleshooting.
msmodelslim: See msmodelslim.md for quantization protocols (includes end-to-end iterative workflow). See sensitivity-analysis.md for diagnosing and fixing quantization accuracy drops via layer sensitivity analysis.
AISBench Evaluation: See aisbench-accuracy.md for accuracy benchmarking against a running vLLM service.

Core Tips

Editable Installs: All toolkits (vllm, vllm-ascend, msmodelslim, and ais_bench) are installed in editable mode. Before referencing or modifying any of them, run pip show <package> to locate the source directory. Never assume a fixed path.
Source Debugging: For deep debugging, read and edit the code in the editable source location that pip show <package> reports.
Debugging Branch: Before any debugging session, create a new git branch to isolate changes:
```
git checkout -b debug/<topic>
```
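
The pip show lookup from the tips above can be scripted. This sketch assumes a recent pip that prints an "Editable project location:" field for editable installs, and falls back to the "Location:" field otherwise. The helper name `source_dir_from_pip_show` is hypothetical.

```shell
# Hypothetical helper: read `pip show` output on stdin and print the
# source directory, preferring the editable location when present.
source_dir_from_pip_show() {
  awk '
    sub(/^Editable project location: /, "") { editable = $0 }
    sub(/^Location: /, "")                  { location = $0 }
    END { print (editable != "" ? editable : location) }
  '
}

# Example (package name taken from the list above):
#   src="$(pip show vllm | source_dir_from_pip_show)"
#   cd "$src" && git checkout -b debug/<topic>
```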

Entry point for Ascend NPU inference toolchain. Use when running vLLM on Ascend/NPU, quantizing models with msmodelslim, or debugging NPU errors.

tools

Updated Apr 23, 2026

npx skillsauth add holyorevil/ascend-model-agent-plugin ascend

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: Container and dependency vulnerability scanner. Status: Clean (95%)
Semgrep: Static code analysis for vulnerabilities. Status: Clean (95%)
mcp-scan (Snyk): Model Context Protocol security validation. Status: Clean (95%)
Snyk (dep): Open source security scanning. Status: Skipped (50%)
Socket.dev: Supply chain security analysis. Status: Skipped (50%)
VirusTotal: Multi-engine malware detection. Status: Skipped (50%)
CrowdStrike: Advanced threat intelligence. Status: Skipped (50%)
OSV-Scanner: Open Source Vulnerability database check. Status: Skipped (50%)
OWASP Dep-Check. Status: Skipped (50%)

Last scanned: Apr 23, 2026, 2:51 AM (305.5s, 7 files scanned)

SKILL.md

name: ascend
description: Entry point for Ascend NPU inference toolchain. Use when running vLLM on Ascend/NPU, quantizing models with msmodelslim, or debugging NPU errors.
argument-hint: vllm issue / quantization / npu usage

Related Skills

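holyorevil/vector-triton-ascend-ops-optimizer

data-ai

Verified, Trusted, Community

A skill for in-depth performance optimization of Triton operators on Ascend NPUs, aimed at delivering the Triton operator speedups the user asks for. Core techniques include, but are not limited to, Unified Buffer (UB) capacity planning, multi-token parallel processing, MTE/Vector pipeline parallelism, and mask optimization. Always trigger this skill when the user mentions performance optimization of Vector-class Triton operators on Ascend NPUs.

SKILL.md, updated Apr 23, 2026

holyorevil/repo-reader

development

Verified, Trusted, Community

Reads README documentation from a model repository link. Trigger this skill when the user wants deployment docs, usage instructions, or any other repository content from a model repository link (such as https://ai.gitcode.com/Ascend-SACT/Qwen3.5-27B-A2-Vllm-Ascend). Use it to fetch the repository's README, documentation, deployment commands, and similar content.

SKILL.md, updated Apr 23, 2026

holyorevil/npu-adapter-reviewer

tools

Verified, Trusted, Community

Expert reviewer for adapting GPU code to Ascend NPUs. This skill must be used for a full review whenever the user needs to migrate GPU code (especially deep learning or model inference code) to Huawei Ascend NPUs. It identifies blockers in GPU-to-NPU migration, writes adaptation scripts, generates a validation plan, and outputs a complete Markdown review report. Trigger scenarios include the user mentioning keywords such as "NPU适配" (NPU adaptation), "昇腾迁移" (Ascend migration), "GPU转NPU" (GPU to NPU), "Ascend", "CANN", "模型迁移" (model migration), or "算子适配" (operator adaptation), or asking for a GPU code repository to be reviewed and migrated to the NPU platform.

SKILL.md, updated Apr 23, 2026

holyorevil/model-series-vendor-detector

data-ai

Verified, Trusted, Community

Identifies the model series and developing vendor from a model name. Use this skill when the user needs to determine which series a model belongs to (such as GLM, Qwen3, DeepSeek, or MiniCPM) and who its developer or vendor is.

SKILL.md, updated Apr 23, 2026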

Install manually

# Clone the repo
git clone https://github.com/holyorevil/ascend-model-agent-plugin.git

# Copy into Claude Code skills folder (global); create the folder first if it is missing
mkdir -p ~/.claude/skills
cp -r ascend-model-agent-plugin/skills/quant-by-modelslim ~/.claude/skills/

Repository

holyorevil/ascend-model-agent-plugin

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT