
grtninja/qwen-training-checkpoint-eval

Name: qwen-training-checkpoint-eval
Author: grtninja

skill-candidates/qwen-training-checkpoint-eval/SKILL.md

npx skillsauth add grtninja/skill-arbiter qwen-training-checkpoint-eval


Qwen Training Checkpoint Eval

Use this skill for logical checkpoint testing during or after staggered student training.

Workflow

1. Start with the saved batch artifacts, not a live guess.
2. Read checkpoint.eval.json and trainer.report.json for the batch.
3. Verify exact source refs in batch_sources.json or batch_sources.txt.
4. If deeper validation is needed, point the Radeon eval lane at the saved adapter and verify it loads.
5. Compare the trained checkpoint against the baseline student only after the adapter-loaded eval lane is healthy.

Canonical Checks

Batch artifact review:

type <batch-dir>\checkpoint.eval.json
type <batch-dir>\trainer.report.json
type <batch-dir>\batch_sources.json
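
For a scripted version of the artifact review, the following is a minimal Python sketch. Only the three file names come from the skill text. The function name `load_batch_artifacts` and the choice to return `None` for missing or unparseable files are illustrative assumptions, and the `batch_sources.txt` alternative is not handled here.

```python
import json
from pathlib import Path

# File names are taken from the skill text; everything else is illustrative.
ARTIFACT_NAMES = (
    "checkpoint.eval.json",
    "trainer.report.json",
    "batch_sources.json",
)


def load_batch_artifacts(batch_dir):
    """Return {file name: parsed JSON}, with None for missing or unparseable files."""
    artifacts = {}
    for name in ARTIFACT_NAMES:
        path = Path(batch_dir) / name
        try:
            artifacts[name] = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # OSError covers a missing or unreadable file; ValueError covers
            # both invalid JSON and invalid UTF-8.
            artifacts[name] = None
    return artifacts
```

Calling `load_batch_artifacts(batch_dir)` on a batch directory gives a quick view of which artifacts are present and readable before any eval lane is started.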

Eval lane launch:

powershell -ExecutionPolicy Bypass -File `
  <training-workbench-root>\tools\start_qwen35_4b_radeon_eval.ps1 `
  -Config <eval-config-pointing-at-batch-adapter>

Eval lane health:

curl <loopback-eval-lane>/health
curl <loopback-eval-lane>/v1/models
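
The same two probes can be scripted. This is a sketch using only the Python standard library. The endpoint paths `/health` and `/v1/models` come from the commands above. The helper names `fetch_json` and `probe_eval_lane` are hypothetical, and the sketch assumes both endpoints return JSON bodies, which the skill text does not state for `/v1/models`.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=5.0):
    """GET a URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def probe_eval_lane(base_url, fetch=fetch_json):
    """Query /health and /v1/models on the eval lane and return both payloads.

    base_url is the loopback eval lane address, which the skill text leaves
    as a placeholder. The fetch argument exists so the probe can be exercised
    without a running lane.
    """
    base = base_url.rstrip("/")
    return {
        "health": fetch(f"{base}/health"),
        "models": fetch(f"{base}/v1/models"),
    }
```

Passing a stub as `fetch` allows a dry run of the probe logic before the Radeon eval lane is up.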

Pass Criteria

At minimum, require:

- checkpoint.eval.json exists
- the eval JSON parsed at least one sample response successfully
- adult_context and penny_affinity matches are present for the sampled records
- /health shows adapter_loaded = true when a saved adapter is mounted on the eval lane
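
These criteria could be encoded as a small checker. The sketch below rests on a loudly assumed layout: the skill text names the fields `adult_context`, `penny_affinity`, and `adapter_loaded`, but it does not show the structure of checkpoint.eval.json. The `samples`, `parsed`, and `matches` keys are hypothetical stand-ins, so consult references/checkpoint-contract.md for the real shape before relying on anything like this.

```python
def check_pass_criteria(eval_json, health=None, adapter_mounted=False):
    """Return a list of failure messages; an empty list means the minimum bar is met.

    ASSUMED layout (hypothetical, not taken from the checkpoint contract):
        eval_json = {"samples": [{"parsed": bool, "matches": {...}}, ...]}
    Pass eval_json=None when checkpoint.eval.json is missing or unreadable.
    """
    if eval_json is None:
        return ["checkpoint.eval.json is missing or unreadable"]

    failures = []
    samples = eval_json.get("samples", [])

    # At least one sample response must have parsed successfully.
    if not any(sample.get("parsed") for sample in samples):
        failures.append("no sample response parsed successfully")

    # Every sampled record must carry both match fields.
    for index, sample in enumerate(samples):
        matches = sample.get("matches", {})
        for field in ("adult_context", "penny_affinity"):
            if field not in matches:
                failures.append(f"sample {index}: missing {field} match")

    # The adapter check applies only when a saved adapter is mounted.
    if adapter_mounted and (health or {}).get("adapter_loaded") is not True:
        failures.append("/health does not report adapter_loaded = true")

    return failures
```

Returning a list of messages rather than a single boolean keeps every failed criterion visible in one pass.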

Scope Boundary

Use this skill for checkpoint validation and trained-adapter smoke testing.

Do not use this skill for:

- Running the full dataset factory
- Long-running training supervision
- LM Studio teacher recovery

References

references/checkpoint-contract.md


Evaluate saved Qwen training checkpoints against batch-aligned training samples and trained-adapter eval lanes. Use when inspecting checkpoint.eval.json, validating a saved batch adapter on the Radeon eval lane, or comparing baseline student behavior against a newly trained checkpoint before promotion.

2 stars

testing

Updated Apr 5, 2026


npx skillsauth add grtninja/skill-arbiter qwen-training-checkpoint-eval

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: Container and dependency vulnerability scanner. Clean, 95%.
Semgrep: Static code analysis for vulnerabilities. Clean, 95%.
mcp-scan (Snyk): Model Context Protocol security validation. Clean, 95%.
Snyk (dep): Open source security scanning. Skipped, 50%.
Socket.dev: Supply chain security analysis. Skipped, 50%.
VirusTotal: Multi-engine malware detection. Skipped, 50%.
CrowdStrike: Advanced threat intelligence. Skipped, 50%.
OSV-Scanner: Open Source Vulnerability database check. Skipped, 50%.
OWASP Dep-Check: Skipped, 50%.

Last scanned: Apr 5, 2026, 2:27 PM (81.6s, 3 files scanned)


Related Skills

grtninja/white-hat

tools

Verified | Trusted | Community

Run a defender-first security sweep on code, configs, prompts, model/tooling surfaces, or third-party contribution lanes. Use when a request involves safe bug, leak, zero-day-class, exploit, or hack hunting for protection, when contributing to outside repositories and you want a focused security pass, or when touching auth, secrets, permissions, network exposure, prompt/tool boundaries, data flow, or update/build surfaces. This skill is defensive only and must never be used for weaponization or unauthorized access.

2 | SKILL.md | Updated Apr 22, 2026

grtninja/vrm-sandbox-startup-acceptance

development

Verified | Trusted | Community

Validate and repair VRM Sandbox startup acceptance with shim-first local model authority, frontend/backend bring-up, and avatar-runtime launch proof. Use when launch behavior, chat handoff, voice fallback, or runtime bridge acceptance must be verified end to end.

2 | SKILL.md | Updated Apr 22, 2026

grtninja/voice-catalog-runtime-alignment

documentation

Verified | Trusted | Community

Align documented voice-command catalogs, endpoint action allowances, and live runtime handlers so operator-visible voice surfaces match what the stack can actually execute. Use when voice command docs, parser matrices, endpoint permissions, or runtime action routing drift apart.

2 | SKILL.md | Updated Apr 22, 2026

grtninja/skillhub-trend-radar

development

Verified | Trusted | Community

Track SkillHub trend and topic drift, maintain a bounded rewrite watchlist, and surface emerging gaps worth turning into repo-owned skills. Use when the marketplace query set shows new families or when the current shortlist has gone stale.

2 | SKILL.md | Updated Apr 22, 2026


Install manually


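# Clone the repo
git clone https://github.com/grtninja/skill-arbiter.git

# Make sure the global Claude Code skills folder exists, so the copy below
# lands inside it instead of being renamed to "skills"
mkdir -p ~/.claude/skills

# Copy into Claude Code skills folder (global)
cp -r skill-arbiter/skill-candidates/qwen-training-checkpoint-eval ~/.claude/skills/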


Repository

grtninja/skill-arbiter

2 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT