Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

wanshuiyin/training-check

Name: training-check
Author: wanshuiyin

skills/skills-codex/training-check/SKILL.md

npx skillsauth add wanshuiyin/Auto-claude-code-research-in-sleep training-check

Training Check

You are now in interactive watch mode (interactive training monitoring mode).

Keep the current session open and report directly in the current terminal. The user is watching this terminal for updates. By default, run a training health check every 30 minutes, output a concise but complete analysis report after each check, state the next check time, then continue monitoring.

This skill checks training quality, not basic process health. Process health checks such as whether a tmux session exists or whether the GPU is idle can be handled by watchdog-style tooling; this skill focuses on whether the run is still worth continuing.

Inputs To Establish First

Before the first check, identify or ask for the minimum monitoring context:

- WandB run path or URL, if available.
- Fallback log path, SSH command, or local command for reading recent training logs.
- Training target, expected baseline, and key metrics that define success.
- How the training was launched, so it can be stopped if needed.
- Project notes path for recording decisions and evidence.

If a source is unavailable, say so clearly and continue with the available source. If both WandB and fallback logs are unreachable, report the connectivity issue, classify the round as WAIT, and check again later. Do not infer that training is bad only because data is unreachable.
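The fallback order described above can be sketched roughly as follows. This is only an illustration: `read_round`, `default_decision`, and the `Reader` callable type are hypothetical names, and the actual WandB or log access is left to whatever callables the caller supplies.

```python
from typing import Callable, Optional, Tuple

# Hypothetical reader type: returns recent metric text, or raises on failure.
Reader = Callable[[], str]


def read_round(wandb_reader: Optional[Reader],
               log_reader: Optional[Reader]) -> Tuple[str, Optional[str]]:
    """Return (data_source, payload) using the labels from the report template."""
    if wandb_reader is not None:
        try:
            return "wandb_ok", wandb_reader()
        except Exception:
            pass  # WandB unreachable: fall through to the fallback logs
    if log_reader is not None:
        try:
            return "log_fallback", log_reader()
        except Exception:
            pass  # fallback logs unreachable as well
    # Nothing readable: a connectivity problem, not evidence of a bad run.
    return "unreachable", None


def default_decision(data_source: str) -> Optional[str]:
    """Unreachable rounds are WAIT; otherwise the metrics have to decide."""
    return "WAIT" if data_source == "unreachable" else None
```

Passing the readers in as callables keeps the sketch independent of any particular WandB client or SSH setup.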

Per-Round Check

Every round, read WandB first when configured. If WandB is unreachable, read the fallback logs. Inspect at least:

- Training loss trend over recent checkpoints or steps.
- Eval metrics and whether they improve, flatten, or degrade against baseline.
- NaN or Inf in loss, gradients, activations, or logged metrics.
- Sudden loss spikes, divergence, or repeated failed evaluations.
- Learning rate schedule behavior.
- Gradient norm, if logged.
- Plateau patterns that suggest the run is no longer useful.
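As a rough illustration of three of these checks (NaN/Inf, sudden spikes, plateaus), the sketch below scans a list of recent loss values. The function name `find_anomalies` and all thresholds (`spike_factor`, `plateau_window`, `plateau_tol`) are assumptions chosen for the example, not values prescribed by the skill.

```python
import math
from typing import List, Sequence


def find_anomalies(losses: Sequence[float],
                   spike_factor: float = 3.0,    # assumed threshold
                   plateau_window: int = 5,      # assumed window size
                   plateau_tol: float = 1e-3) -> List[str]:
    """Return anomaly labels found in a series of recent loss values."""
    findings: List[str] = []

    # Hard failure: any NaN or Inf in the logged values.
    if any(math.isnan(x) or math.isinf(x) for x in losses):
        findings.append("nan_or_inf")

    finite = [x for x in losses if math.isfinite(x)]

    # Spike: a value much larger than the one just before it.
    for prev, cur in zip(finite, finite[1:]):
        if prev > 0 and cur > spike_factor * prev:
            findings.append("spike")
            break

    # Plateau: the last `plateau_window` values barely move.
    tail = finite[-plateau_window:]
    if len(tail) == plateau_window and max(tail) - min(tail) < plateau_tol:
        findings.append("plateau")

    return findings
```

A single spike or a short plateau is only a finding to report; whether it justifies stopping depends on the trend over several rounds.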

Output one report in the current terminal with this structure:

## Training Check - <local timestamp>

- Data source: wandb_ok | log_fallback | unreachable
- Run: <wandb run or training identifier>
- Recent metrics: <loss/eval/lr/grad summary>
- Anomalies: <NaN/Inf/spike/divergence/plateau findings>
- Evidence: <WandB URL, log lines, metric values, or files inspected>
- Decision: CONTINUE | WAIT | STOP
- Reason: <why this decision is justified>
- Next check: <local timestamp, normally 30 minutes later unless ending>
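One way to fill this template is sketched below. `render_report` is a hypothetical helper, the timestamp format is an assumption, and the wording used for the next check on a STOP round is invented for the example.

```python
from datetime import datetime, timedelta


def render_report(now: datetime, data_source: str, run: str, metrics: str,
                  anomalies: str, evidence: str, decision: str, reason: str,
                  interval_minutes: int = 30) -> str:
    """Fill the report template; a STOP round has no next check."""
    fmt = "%Y-%m-%d %H:%M"  # assumed local timestamp format
    if decision == "STOP":
        next_check = "none (monitoring ends)"
    else:
        next_check = (now + timedelta(minutes=interval_minutes)).strftime(fmt)
    return "\n".join([
        f"## Training Check - {now.strftime(fmt)}",
        "",
        f"- Data source: {data_source}",
        f"- Run: {run}",
        f"- Recent metrics: {metrics}",
        f"- Anomalies: {anomalies}",
        f"- Evidence: {evidence}",
        f"- Decision: {decision}",
        f"- Reason: {reason}",
        f"- Next check: {next_check}",
    ])
```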

Use the decisions as follows:

| Decision | Meaning | Action |
|----------|---------|--------|
| CONTINUE | Run looks healthy enough to keep training. | Keep monitoring and check again in 30 minutes. |
| WAIT | Evidence is inconclusive, noisy, too early, or temporarily unreachable. | Do not stop training; keep monitoring and check again later. |
| STOP | Training is clearly problematic or no longer worth continuing. | Stop the training task, save evidence, write notes, output final summary, and end monitoring. |
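The three decisions could be encoded along these lines. This is a minimal sketch: the `decide` function and its boolean inputs (`hard_failure`, `sustained_problem`, `inconclusive`) are hypothetical simplifications of the judgment the skill asks for, not a rule it defines.

```python
from enum import Enum


class Decision(Enum):
    CONTINUE = "CONTINUE"
    WAIT = "WAIT"
    STOP = "STOP"


def decide(data_source: str, hard_failure: bool,
           sustained_problem: bool, inconclusive: bool) -> Decision:
    """Map one round's findings to a decision (illustrative inputs only)."""
    if data_source == "unreachable":
        return Decision.WAIT      # unreachable data is never a reason to stop
    if hard_failure or sustained_problem:
        return Decision.STOP      # clearly problematic or no longer worthwhile
    if inconclusive:
        return Decision.WAIT      # noisy, too early, or incomplete evidence
    return Decision.CONTINUE      # healthy enough to keep training
```

Checking reachability first mirrors the rule that missing data alone must not be read as a bad run.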

Stop Behavior

When the decision is STOP:

- Stop the training task.
  - If the context contains stop_command, run stop_command first.
  - If no stop_command is available, choose the appropriate stop action from how the training was launched, such as stopping the relevant tmux session, local process, remote process, scheduler job, or notebook job.
- Save evidence: WandB URL, key metrics, relevant log snippets, files inspected, and the reason for stopping.
- Append a project note for debugging and future analysis.
- Output FINAL_SUMMARY in the terminal.
- End the interactive monitoring loop.

Never stop on the first sign of ordinary metric noise. Look for sustained trends, hard failures, or clear divergence. Always preserve enough evidence for a later agent or human to understand why the run was stopped.
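The sketch below illustrates both ideas: requiring a sustained run of bad rounds before stopping, and picking a stop command from the launch method. Everything specific is an assumption made for the example: the context keys (`launcher`, `session`, `pid`, `job_id`), the three-round threshold, and the choice of Slurm's `scancel` as the scheduler command. The functions only build the command; executing it is left to the caller.

```python
import shlex
from typing import Dict, List


def should_stop(bad_rounds: List[bool], hard_failure: bool,
                needed: int = 3) -> bool:
    """Stop on a hard failure, or after `needed` consecutive bad rounds."""
    if hard_failure:
        return True
    return len(bad_rounds) >= needed and all(bad_rounds[-needed:])


def plan_stop(context: Dict[str, str]) -> List[str]:
    """Return the argv that would stop the run, without executing it."""
    if context.get("stop_command"):
        # An explicit stop_command always takes priority.
        return shlex.split(context["stop_command"])
    launcher = context.get("launcher")  # hypothetical context key
    if launcher == "tmux":
        return ["tmux", "kill-session", "-t", context["session"]]
    if launcher == "local":
        return ["kill", "-TERM", context["pid"]]
    if launcher == "slurm":  # assumes a Slurm scheduler
        return ["scancel", context["job_id"]]
    raise ValueError("no stop_command and unknown launcher; ask how to stop the run")
```

Returning the command instead of running it leaves room to record it in the project note before anything is terminated.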

Interactive Loop Guidance

- The normal interval is 30 minutes.
- If a round is CONTINUE, announce the next check time and wait until then.
- If a round is WAIT, explain what evidence is missing or noisy and check again later. Use a shorter interval only when the run looks suspicious but not yet stop-worthy.
- If an anomaly recovers, say so explicitly and continue monitoring.
- Keep the user-facing report short enough to read in a terminal, but include concrete metric values and evidence paths.
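The interval rules above might be expressed as a small scheduling helper. In this sketch `next_check` and the `suspicious` flag are hypothetical, and the 10-minute short interval is an assumed value; the guidance only says "shorter".

```python
from datetime import datetime, timedelta
from typing import Optional


def next_check(now: datetime, decision: str, suspicious: bool = False,
               normal_minutes: int = 30,
               short_minutes: int = 10) -> Optional[datetime]:
    """Return the next check time, or None when monitoring ends."""
    if decision == "STOP":
        return None  # the loop ends after the final summary
    # Only a suspicious WAIT round earns the shorter interval.
    use_short = decision == "WAIT" and suspicious
    minutes = short_minutes if use_short else normal_minutes
    return now + timedelta(minutes=minutes)
```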

Interactively monitor training metrics from the current Codex session, periodically checking WandB or fallback logs for NaN, divergence, plateaus, and broken runs.

13,323 stars

development

Updated Jul 13, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add wanshuiyin/Auto-claude-code-research-in-sleep training-check

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Jul 13, 2026, 4:50 AM · 112.5s · 1 file scanned

SKILL.md

name: training-check
description: Interactively monitor training metrics from the current Codex session, periodically checking WandB or fallback logs for NaN, divergence, plateaus, and broken runs.
argument-hint: [wandb-run-or-monitoring-brief]
allowed-tools: Bash(*), Read, Write, Edit, Grep, Glob

Related Skills

wanshuiyin/web-debug-search

development

Verified · Trusted · Community

Search GitHub Issues and Discussions for software errors, version compatibility problems, and exact error-string matches. Use for debugging and discovery only; results are not paper-citation evidence.

13,732 · SKILL.md · Updated Jul 23, 2026

wanshuiyin/integrity-forensics

testing

Verified · Trusted · Community

Run the Anti-Autoresearch integrity-forensics sweep (span-anchored evidence ledger → GPT auditors propose findings → deterministic rules-only adjudicator) against a paper via a SHA-pinned thin launcher, then convert the verdict into a typed policy gate (BLOCK/WARN/NO_NEW_BLOCKER) and an append-only obligations ledger. Use when the user says "integrity forensics", "forensic audit this paper", "投稿前自查诚信" ("pre-submission integrity self-check"), "审这篇论文的诚信" ("audit this paper's integrity"), or says "anti-autoresearch" when the upstream repo's own skills are not installed. Also invoked by /paper-writing (submission self-forensics, default ON), /peer-review (forensic appendix), and /resubmit-pipeline.

13,401 · SKILL.md · Updated Jul 13, 2026

wanshuiyin/meta-apply

testing

Verified · Trusted · Community

Privileged applier that lands the meta-optimize / corpus-audit patches the user approved. It is the only skill permitted to mutate the skill corpus from a self-modification proposal, with a cross-model jury and human approval at landing. Use when the user says "meta apply", "/meta-apply", "land the staged patches", or "应用优化" ("apply optimization") after a /meta-optimize run.

13,401 · SKILL.md · Updated May 31, 2026

Install manually

# Clone the repo
git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep.git

# Copy into Claude Code skills folder (global)
cp -r Auto-claude-code-research-in-sleep/skills/skills-codex/training-check ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

wanshuiyin/Auto-claude-code-research-in-sleep

13,323 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT