
hidai25/watch

Name: watch
Author: hidai25

skills/watch/SKILL.md

npx skillsauth add hidai25/eval-view watch


Watch Mode

Use this skill when the user wants continuous regression monitoring during development. Watch mode observes file changes and automatically re-runs evalview check with debounced triggers.

What this does

EvalView's watch mode uses watchdog to monitor directories for file changes (.py, .yaml, .yml, .json, .md, .txt, .toml, .cfg, .ini). When a change is detected, it runs a regression check via the gate() API and displays a live scorecard with pass/fail status, score deltas, tool changes, and streak tracking.
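The filtering and debouncing described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not EvalView's actual implementation: `is_watched` and `Debouncer` are hypothetical names, and timestamps are passed in explicitly so the logic can be followed without a running file watcher. Only the extension list and the 2.0-second default come from the text.

```python
from pathlib import PurePath
from typing import Optional

# Extensions listed in the description above.
WATCHED_EXTENSIONS = {
    ".py", ".yaml", ".yml", ".json", ".md", ".txt", ".toml", ".cfg", ".ini",
}


def is_watched(path: str) -> bool:
    """Return True if the file's extension should trigger a check."""
    return PurePath(path).suffix.lower() in WATCHED_EXTENSIONS


class Debouncer:
    """Collapse a burst of change events into one trigger.

    Hypothetical helper for illustration only.
    """

    def __init__(self, interval: float = 2.0) -> None:
        self.interval = interval
        self._last_event: Optional[float] = None

    def record(self, path: str, now: float) -> None:
        """Note a change at time `now` (seconds) if the file is watched."""
        if is_watched(path):
            self._last_event = now

    def should_run(self, now: float) -> bool:
        """True once `interval` seconds have passed since the last change."""
        if self._last_event is None:
            return False
        if now - self._last_event >= self.interval:
            self._last_event = None  # reset so each burst fires only once
            return True
        return False
```

In a real watcher, `record` would presumably be called from a watchdog event handler with `time.monotonic()` as the timestamp, and a `True` result from `should_run` is the point at which the regression check would be started.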

How to start watch mode

Watch mode is a CLI command (not an MCP tool). Help the user run it:

evalview watch

Common options

--quick — Skip LLM judge, deterministic checks only ($0 cost, sub-second)
--path src/ --path tests/ — Watch specific directories (default: current directory)
--test "my-test" — Only check a specific test by name
--test-dir tests/evalview — Path to test cases directory (default: tests)
--interval 1 — Debounce interval in seconds (default: 2.0)
--fail-on REGRESSION,TOOLS_CHANGED — Comma-separated statuses that count as failure (default: REGRESSION)
--sound — Terminal bell on regression
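To make the --fail-on semantics concrete, here is a small sketch of how a comma-separated status list could be turned into a failure decision. The function names are hypothetical and the `PASSED` status is an assumption made for the example; only `REGRESSION`, `TOOLS_CHANGED`, and the `REGRESSION` default come from the option list above.

```python
def parse_fail_on(value: str = "REGRESSION") -> set:
    """Split a comma-separated --fail-on value into a set of status names."""
    return {part.strip().upper() for part in value.split(",") if part.strip()}


def counts_as_failure(statuses: list, fail_on: set) -> bool:
    """Return True if any observed status is in the failure set."""
    return any(status in fail_on for status in statuses)


# "PASSED" is an assumed status name used only for this illustration.
observed = ["PASSED", "TOOLS_CHANGED"]
default_result = counts_as_failure(observed, parse_fail_on())
strict_result = counts_as_failure(
    observed, parse_fail_on("REGRESSION,TOOLS_CHANGED")
)
```

With the default set, a run that only changed tool usage is not treated as a failure; adding `TOOLS_CHANGED` to the list makes the same run fail.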

Examples

# Basic: watch everything, full checks
evalview watch

# Fast development loop: no LLM judge, 1-second debounce
evalview watch --quick --interval 1

# Watch specific directories and one test
evalview watch --path src/ --path tests/ --test "calculator-division"

# Strict mode: fail on any behavioral change
evalview watch --fail-on REGRESSION,TOOLS_CHANGED,OUTPUT_CHANGED --sound

Prerequisites

Watch mode requires the watchdog package. If not installed:

pip install "evalview[watch]"

Notes

Watch mode excludes .evalview/, .git/, venv/, node_modules/, __pycache__/, and other common non-source directories automatically.
The initial check runs immediately on startup before watching begins.
Results include a live scorecard with pass counts, regression counts, health percentage, and streak info.
--quick mode is ideal for tight development loops since it costs nothing and runs in sub-second time.
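The directory exclusion mentioned in the notes can be illustrated with a short path check. This is a sketch under assumptions: `is_excluded` is a hypothetical helper, and the set contains only the directories named above (the note says there are others, which are not listed here).

```python
from pathlib import PurePath

# Directories named in the notes above; the real list is said to be longer.
EXCLUDED_DIRS = {".evalview", ".git", "venv", "node_modules", "__pycache__"}


def is_excluded(path: str) -> bool:
    """Return True if any component of the path is an excluded directory."""
    return any(part in EXCLUDED_DIRS for part in PurePath(path).parts)
```

Matching on whole path components rather than substrings means a file such as `src/venv_utils.py` is still watched, while anything under `venv/` is ignored.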


Start EvalView watch mode to automatically re-run regression checks whenever project files change.

82 stars

testing

Updated Apr 14, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add hidai25/eval-view watch

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: Container and dependency vulnerability scanner. Clean (95%)
Semgrep: Static code analysis for vulnerabilities. Clean (95%)
mcp-scan (Snyk): Model Context Protocol security validation. Clean (95%)
Snyk (dep): Open source security scanning. Skipped (50%)
Socket.dev: Supply chain security analysis. Skipped (50%)
VirusTotal: Multi-engine malware detection. Skipped (50%)
CrowdStrike: Advanced threat intelligence. Skipped (50%)
OSV-Scanner: Open Source Vulnerability database check. Skipped (50%)
OWASP Dep-Check: Skipped (50%)

Last scanned: Apr 14, 2026, 3:00 AM (8.2s, 1 file scanned)


Related Skills

hidai25/run-eval (development)
Run EvalView regression checks against golden baselines to detect regressions in AI agent behavior after code, prompt, or model changes.
82 | SKILL.md | Updated Apr 14, 2026

hidai25/generate-tests (testing)
Generate EvalView test cases, either from a SKILL.md file using LLM-powered generation or by capturing real agent interactions through a proxy.
82 | SKILL.md | Updated Apr 14, 2026

hidai25/code-reviewer (development)
A skill that helps review code for best practices, bugs, and security issues.
80 | SKILL.md | Updated Apr 5, 2026

hidai25/hello-world (tools)
A simple skill that creates a greeting file.
80 | SKILL.md | Updated Apr 5, 2026


Install manually


# Clone the repo
git clone https://github.com/hidai25/eval-view.git

# Copy into Claude Code skills folder (global)
cp -r eval-view/skills/watch ~/.claude/skills/


Repository: hidai25/eval-view

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT