davepoon/qwen-vision

Name: qwen-vision
Author: davepoon

plugins/give-claude-eyes/skills/qwen-vision/SKILL.md

npx skillsauth add davepoon/buildwithclaude qwen-vision

Qwen Vision Bridge

Claude cannot natively understand video. This skill bridges that gap by calling Qwen Omni — a natively multimodal model that processes video with temporal attention (it sees motion, not just individual frames).

The bridge also handles images, which is useful when you want Qwen's analysis of screenshots, diagrams, or photos.

How it works

A Python script at ${CLAUDE_PLUGIN_ROOT}/skills/qwen-vision/scripts/qwen_bridge.py sends media files to the Qwen API and returns the analysis as text. Call it via Bash.

Prerequisites

The user must have:

DASHSCOPE_API_KEY environment variable set (get one at https://dashscope.console.aliyun.com/ or https://modelstudio.console.alibabacloud.com/)
Python 3.9+ with dashscope package installed

If the user hasn't set up yet, suggest running /qwen-setup first.
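The prerequisites above can be checked before the first call. The following is a minimal sketch, not part of the skill: `check_prerequisites` is a hypothetical helper, and it assumes the package is importable under the name `dashscope`.

```python
import importlib.util
import os
import sys


def check_prerequisites(env=None, version=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    version = sys.version_info if version is None else version
    problems = []
    if not env.get("DASHSCOPE_API_KEY"):
        problems.append("DASHSCOPE_API_KEY is not set")
    if tuple(version[:2]) < (3, 9):
        problems.append("Python 3.9+ is required")
    # Assumption: the installed package is importable as `dashscope`.
    if importlib.util.find_spec("dashscope") is None:
        problems.append("dashscope is not installed (pip install dashscope)")
    return problems
```

If the returned list is non-empty, suggesting /qwen-setup is likely more helpful than attempting the call.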

Basic usage

python3 "${CLAUDE_PLUGIN_ROOT}/skills/qwen-vision/scripts/qwen_bridge.py" "/path/to/video.mp4" "Describe what happens in this video"

Parameters

| Flag | Default | Description |
|------|---------|-------------|
| (positional 1) | required | Path to video or image file |
| (positional 2) | generic prompt | Analysis prompt |
| --fps | 2.0 | Frames per second to sample from video. Lower = cheaper, higher = more detail |
| --model | qwen-omni-plus-latest | Qwen model to use |
| --json | off | Output as JSON (for parsing) |
| --context | none | Path to JSON file with previous conversation (multi-turn) |
| --save-context | none | Save conversation context for follow-up questions |
| --system-prompt | none | Custom system prompt for Qwen |
| --prompt-file | none | Read prompt from a file instead of argument |
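When the bridge is called from a Python script rather than typed by hand, the flags in the table can be assembled into an argument list. This is a sketch under assumptions: `build_bridge_command` and its keyword names are hypothetical, and only the flags listed above are used.

```python
import os


def build_bridge_command(media_path, prompt=None, *, fps=None, model=None,
                         json_output=False, context=None, save_context=None,
                         system_prompt=None, prompt_file=None, plugin_root=None):
    """Build the argv list for qwen_bridge.py from the documented flags."""
    root = plugin_root or os.environ.get("CLAUDE_PLUGIN_ROOT", "")
    script = f"{root}/skills/qwen-vision/scripts/qwen_bridge.py"
    cmd = ["python3", script, str(media_path)]
    if prompt is not None:
        cmd.append(prompt)
    # Flags that take a value are emitted only when a value is supplied,
    # so the script's own defaults apply otherwise.
    valued_flags = [
        ("--fps", fps),
        ("--model", model),
        ("--context", context),
        ("--save-context", save_context),
        ("--system-prompt", system_prompt),
        ("--prompt-file", prompt_file),
    ]
    for flag, value in valued_flags:
        if value is not None:
            cmd += [flag, str(value)]
    if json_output:
        cmd.append("--json")
    return cmd
```

Passing the list to `subprocess.run` avoids shell quoting problems with prompts that contain quotes or spaces.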

Supported formats

Video: .mp4, .mov, .avi, .mkv, .webm, .flv, .wmv
Image: .png, .jpg, .jpeg, .gif, .webp, .bmp, .tiff
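A file can be screened against these extensions before it is sent. This is a minimal sketch: `media_kind` is a hypothetical helper, and matching case-insensitively is an assumption here, since the bridge's own check may differ.

```python
from pathlib import Path

VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".flv", ".wmv"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp", ".tiff"}


def media_kind(path):
    """Return "video", "image", or None for an unsupported extension."""
    suffix = Path(path).suffix.lower()
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    return None
```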

Patterns

Single video analysis

python3 "${CLAUDE_PLUGIN_ROOT}/skills/qwen-vision/scripts/qwen_bridge.py" "/path/to/video.mp4" "Describe the character's body movement, poses, and transitions" --fps 2

Parse the text response and use it in your answer to the user.

Batch analysis

When the user has multiple videos to analyze, write a Python script that loops through files and calls the bridge for each one. Use --json flag for machine-readable output. See references/batch-pattern.md for a template.
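A batch loop along these lines is one way to do it. This sketch is not the template from references/batch-pattern.md: `run_bridge` and `analyze_folder` are hypothetical names, and it assumes that --json writes a single JSON document to stdout and that the script exits with a non-zero status on failure.

```python
import json
import os
import subprocess
from pathlib import Path

VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".flv", ".wmv"}


def run_bridge(video, prompt, fps):
    """Call the bridge once with --json and return the parsed output."""
    script = os.path.join(os.environ["CLAUDE_PLUGIN_ROOT"],
                          "skills", "qwen-vision", "scripts", "qwen_bridge.py")
    proc = subprocess.run(
        ["python3", script, str(video), prompt, "--fps", str(fps), "--json"],
        capture_output=True, text=True,
    )
    if proc.returncode != 0:  # assumption: failures use a non-zero exit status
        raise RuntimeError(proc.stderr.strip() or proc.stdout.strip())
    return json.loads(proc.stdout)


def analyze_folder(folder, prompt, fps=1.0, runner=run_bridge):
    """Analyze every supported video in a folder, keyed by file name."""
    results = {}
    for video in sorted(Path(folder).iterdir()):
        if video.suffix.lower() not in VIDEO_EXTENSIONS:
            continue
        try:
            results[video.name] = runner(video, prompt, fps)
        except Exception as exc:
            # One failing file should not abort the rest of the batch.
            results[video.name] = {"error": str(exc)}
    return results
```

The `runner` parameter exists so the loop can be tried with a stub before any API calls are spent.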

Multi-turn (follow-up questions)

# First question
python3 "${CLAUDE_PLUGIN_ROOT}/skills/qwen-vision/scripts/qwen_bridge.py" video.mp4 "General analysis" --save-context /tmp/ctx.json

# Follow-up
python3 "${CLAUDE_PLUGIN_ROOT}/skills/qwen-vision/scripts/qwen_bridge.py" video.mp4 "Tell me more about the lighting" --context /tmp/ctx.json
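The same two-step pattern can be generated for a list of questions. This sketch mirrors only what the example above shows: the first call saves context and each follow-up reads that saved file. `multi_turn_commands` is a hypothetical helper, and whether --context and --save-context can be combined to extend the context across several follow-ups is not stated here, so the sketch does not do it.

```python
def multi_turn_commands(script, media, questions, context_path):
    """Return one argv list per question, reusing the context saved by the first."""
    commands = []
    for index, question in enumerate(questions):
        cmd = ["python3", script, media, question]
        if index == 0:
            # First question: save the conversation for later turns.
            cmd += ["--save-context", context_path]
        else:
            # Follow-ups: load the conversation saved by the first call.
            cmd += ["--context", context_path]
        commands.append(cmd)
    return commands
```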

Image analysis

Same script, just pass an image path instead of video:

python3 "${CLAUDE_PLUGIN_ROOT}/skills/qwen-vision/scripts/qwen_bridge.py" "/path/to/screenshot.png" "What UI elements are visible in this screenshot?"

Cost-saving tips

Use --fps 1 for long videos or when fine detail isn't needed
Use --fps 0.5 for very long videos (minutes+)
For batch jobs, start with --fps 1 and increase only if results are too vague
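To see why a lower --fps helps, it is enough to estimate how many frames get sampled. This is rough arithmetic, assuming frames are sampled evenly over the whole clip; `sampled_frames` is a hypothetical helper and the bridge's actual sampling may differ.

```python
import math


def sampled_frames(duration_seconds, fps):
    """Estimate the number of frames sampled from a clip at a given --fps."""
    return math.ceil(duration_seconds * fps)


# A 3-minute clip at the default and at the two cheaper settings.
for fps in (2.0, 1.0, 0.5):
    print(fps, sampled_frames(180, fps))
# 2.0 360
# 1.0 180
# 0.5 90
```

Halving --fps halves the estimated frame count, which is the reason to start low and raise it only when results are too vague.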

Error handling

If DASHSCOPE_API_KEY is not set, the script exits with a clear error message. Guide the user to set it up.
If dashscope is not installed, suggest pip install dashscope.
If the API returns an error, the script prints the error code and message. Common issues: invalid key, quota exceeded, unsupported file format.
If a video file is too large for the API, suggest lowering --fps or trimming the video first.
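When the bridge is run from a script, the cases above can be told apart from a normal result by capturing the output. This sketch assumes the bridge signals failure with a non-zero exit status, which is not stated here; `run_and_capture` is a hypothetical helper that works for any command.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run a command and return (ok, text).

    text is stdout on success, or the error output on failure.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode == 0:
        return True, proc.stdout.strip()
    # Prefer stderr, but fall back to stdout in case the error was printed there.
    return False, (proc.stderr.strip() or proc.stdout.strip())
```

On failure, the returned text can be shown to the user alongside the matching suggestion from the list above.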

What Qwen sees vs what Claude sees

This is important context for the user: Qwen processes video frames with temporal attention — it understands motion, direction, rhythm, and transitions between frames. Claude analyzing individual screenshots cannot do this. When the user needs to understand what happens in a video (not just what a single frame looks like), this bridge is the right tool.

Additional resources

references/batch-pattern.md — template for batch video classification
references/prompt-tips.md — effective prompts for different analysis types

Use when the user asks to "analyze video", "watch this video", "what happens in this video", "describe this clip", "review this footage", "classify these videos", "compare videos", "analyze this image", "what's in this screenshot", or when the user provides a video/image file path and expects visual understanding. Also trigger on: "qwen", "video bridge", "multimodal analysis", "motion analysis", "video reference", "video breakdown", "batch classify", or any task requiring understanding of video content that Claude cannot do natively.

2,941 stars

tools

Updated May 18, 2026

npx skillsauth add davepoon/buildwithclaude qwen-vision

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: Container and dependency vulnerability scanner. Clean, 95%
Semgrep: Static code analysis for vulnerabilities. Clean, 95%
mcp-scan (Snyk): Model Context Protocol security validation. Clean, 95%
Snyk (dep): Open source security scanning. Skipped, 50%
Socket.dev: Supply chain security analysis. Skipped, 50%
VirusTotal: Multi-engine malware detection. Skipped, 50%
CrowdStrike: Advanced threat intelligence. Skipped, 50%
OSV-Scanner: Open Source Vulnerability database check. Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: May 18, 2026, 2:28 AM (127.1s, 4 files scanned)

SKILL.md

name: qwen-vision
version: 0.1.0

Install manually

# Clone the repo
git clone https://github.com/davepoon/buildwithclaude.git

# Copy into Claude Code skills folder (global)
cp -r buildwithclaude/plugins/give-claude-eyes/skills/qwen-vision ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

davepoon/buildwithclaude

2,941 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT