codex-readiness-unit-test/SKILL.md
Run the Codex Readiness unit test report. Use when you need deterministic checks plus in-session LLM evals for AGENTS.md/PLANS.md.
npx skillsauth add syl2042/codex_skills codex-readiness-unit-test

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Instruction-first, in-session "readiness" for evaluating AGENTS/PLANS documentation quality without any external APIs or SDKs. All checks run against the current working directory (cwd), with no monorepo discovery. Each run writes to .codex-readiness-unit-test/<timestamp>/ and updates .codex-readiness-unit-test/latest.json. Keep execution deterministic (filesystem scanning + local command execution only). All LLM evaluation happens in-session and must output strict JSON via the provided references.
1. python skills/codex-readiness-unit-test/bin/collect_evidence.py
2. python skills/codex-readiness-unit-test/bin/deterministic_rules.py
3. Run the in-session LLM checks from references/ and store .codex-readiness-unit-test/<timestamp>/llm_results.json.
4. python skills/codex-readiness-unit-test/bin/run_plan.py --plan .codex-readiness-unit-test/<timestamp>/plan.json
5. python skills/codex-readiness-unit-test/bin/scoring.py --mode read-only|execute

Outputs (per run, under .codex-readiness-unit-test/<timestamp>/):

- report.json
- report.html
- summary.json
- logs/* (execute mode)

This skill produces a deterministic evidence file plus an in-session LLM evaluation, then compiles a JSON report and HTML scorecard. It requires no OpenAI API key and makes no external HTTP calls.
- mode: read-only or execute (required)
- soft_timeout_seconds: optional (default 600)

- read-only: execute-required checks are marked NOT_RUN, and no execution logs/summary are produced.
- execute: plan.json is executed via run_plan.py. This enables check #6 and produces execution logs + execution_summary.json for scoring.

Always ask the user which mode to run (read-only vs. execute) before proceeding.
Skill references are discovered from AGENTS.md via $SkillName or .codex/skills/<name> patterns; their SKILL.md files are added to evidence for the LLM checks.
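As a rough illustration of the discovery step, the sketch below scans AGENTS.md text for the two reference styles mentioned above. The regular expressions and the function name `find_skill_references` are assumptions made for this example; the skill's actual matching rules may be stricter (for instance, this version would also pick up shell variables such as `$HOME`).

```python
import re

# Assumed patterns: "$SkillName" tokens and ".codex/skills/<name>" paths.
# The character classes are guesses, not the skill's actual rules.
SKILL_REF = re.compile(
    r"\$([A-Za-z][A-Za-z0-9_-]*)|\.codex/skills/([A-Za-z0-9_-]+)"
)


def find_skill_references(agents_text: str) -> list[str]:
    """Return referenced skill names, deduplicated, in first-seen order."""
    names: list[str] = []
    for match in SKILL_REF.finditer(agents_text):
        # Exactly one of the two groups is set for any given match.
        name = match.group(1) or match.group(2)
        if name not in names:
            names.append(name)
    return names
```

Each discovered name would then be mapped to its SKILL.md file and added to the evidence; that lookup is left out here because the directory layout is not specified in this document.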
All checks run relative to the current working directory and are defined in skills/codex-readiness-unit-test/references/checks/checks.json, weighted equally by default. Each run writes outputs to .codex-readiness-unit-test/<timestamp>/ and updates .codex-readiness-unit-test/latest.json.
The helper scripts read .codex-readiness-unit-test/latest.json by default to locate the latest run directory.
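A minimal sketch of how a script might follow that pointer is shown below. The layout of latest.json is not documented here, so the `run_dir` key is purely hypothetical, as is the helper name `latest_run_dir`.

```python
import json
from pathlib import Path

POINTER = Path(".codex-readiness-unit-test") / "latest.json"


def latest_run_dir(root: Path = Path(".")) -> Path:
    """Resolve the latest run directory from the pointer file.

    Assumes a hypothetical {"run_dir": "<path>"} layout; the real key may differ.
    """
    pointer = json.loads((root / POINTER).read_text(encoding="utf-8"))
    run_dir = Path(pointer["run_dir"])
    # Treat relative paths as relative to the project root.
    return run_dir if run_dir.is_absolute() else root / run_dir
```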
For each LLM/HYBRID check:
skills/codex-readiness-unit-test/references/json_fix.md with the raw output.

The JSON schema is:
{
"status": "PASS|WARN|FAIL|NOT_RUN",
"rationale": "string",
"evidence_quotes": [{"path":"...","quote":"..."}],
"recommendations": ["..."],
"confidence": 0.0
}
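Because every LLM check must emit strict JSON in this shape, it can help to validate a result before storing it. The following is a sketch only: the function name `validate_result` is made up for this example, and the 0.0 to 1.0 bound on `confidence` is an assumption, since the schema above gives only a sample value.

```python
ALLOWED_STATUSES = {"PASS", "WARN", "FAIL", "NOT_RUN"}


def validate_result(result: dict) -> list[str]:
    """Return a list of schema problems; an empty list means the result conforms."""
    problems = []
    status = result.get("status")
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        problems.append("status must be one of PASS, WARN, FAIL, NOT_RUN")
    if not isinstance(result.get("rationale"), str):
        problems.append("rationale must be a string")
    quotes = result.get("evidence_quotes")
    if not isinstance(quotes, list) or not all(
        isinstance(q, dict)
        and isinstance(q.get("path"), str)
        and isinstance(q.get("quote"), str)
        for q in quotes
    ):
        problems.append("evidence_quotes must be a list of {path, quote} objects")
    recs = result.get("recommendations")
    if not isinstance(recs, list) or not all(isinstance(r, str) for r in recs):
        problems.append("recommendations must be a list of strings")
    confidence = result.get("confidence")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        problems.append("confidence must be a number")
    elif not 0.0 <= confidence <= 1.0:  # assumed range, not stated in the schema
        problems.append("confidence must be between 0.0 and 1.0")
    return problems
```

A non-empty return value is the natural trigger for a repair pass over the raw output.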
Combine the command summary and execute plan into one concise confirmation step. Present:
plan.json. If declined, mark execute-required checks as NOT_RUN.

- .codex-readiness-unit-test/<timestamp>/evidence.json (from collect_evidence.py)
- .codex-readiness-unit-test/<timestamp>/deterministic_results.json (from deterministic_rules.py)
- .codex-readiness-unit-test/<timestamp>/llm_results.json (from in-session references)
- .codex-readiness-unit-test/<timestamp>/execution_summary.json (execute mode only)
- .codex-readiness-unit-test/<timestamp>/report.json and .codex-readiness-unit-test/<timestamp>/report.html (from scoring.py)
- .codex-readiness-unit-test/<timestamp>/summary.json (structured pass/fail summary from scoring.py)
- .codex-readiness-unit-test/latest.json (stable pointer to the latest run directory)

- project_context_specified → skills/codex-readiness-unit-test/references/project_context.md
- build_test_commands_exist → skills/codex-readiness-unit-test/references/commands.md
- dev_build_test_loops_documented → skills/codex-readiness-unit-test/references/loop_quality.md
- dev_build_test_loop_execution → skills/codex-readiness-unit-test/references/execution_explanation.md

{
"project_dir": "relative/or/absolute/path (optional)",
"cwd": "optional/absolute/path (defaults to current directory)",
"commands": [
{"label": "setup", "cmd": "npm install"},
{"label": "build", "cmd": "npm run build"},
{"label": "test", "cmd": "npm test"}
],
"env": {
"EXAMPLE": "value"
}
}
Place plan.json inside the run directory (e.g., .codex-readiness-unit-test/<timestamp>/plan.json).
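To make the plan format concrete, here is a sketch of how a runner could consume it. This is not the bundled run_plan.py: the function name, the returned record shape, the choice to keep going after a failed command, and the treatment of `soft_timeout_seconds` as one overall budget are all assumptions. The optional `project_dir` field is ignored.

```python
import os
import subprocess
import time


def run_plan(plan: dict, soft_timeout_seconds: float = 600) -> list[dict]:
    """Run each plan command in order and record its outcome."""
    cwd = plan.get("cwd") or os.getcwd()
    # Plan-level env entries are layered on top of the current environment.
    env = {**os.environ, **plan.get("env", {})}
    deadline = time.monotonic() + soft_timeout_seconds
    results = []
    for step in plan["commands"]:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            # Budget exhausted: record the step without running it.
            results.append({"label": step["label"], "status": "NOT_RUN"})
            continue
        try:
            proc = subprocess.run(
                step["cmd"], shell=True, cwd=cwd, env=env,
                capture_output=True, text=True, timeout=remaining,
            )
            status = "PASS" if proc.returncode == 0 else "FAIL"
            results.append(
                {"label": step["label"], "status": status, "exit_code": proc.returncode}
            )
        except subprocess.TimeoutExpired:
            results.append({"label": step["label"], "status": "FAIL", "exit_code": None})
    return results
```

Commands run through the shell here because the plan stores each one as a single string such as `npm run build`; that is also why the user should confirm the plan before anything executes.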
{
"project_context_specified": {"status":"PASS","rationale":"...","evidence_quotes":[],"recommendations":[],"confidence":0.7},
"build_test_commands_exist": {"status":"PASS","rationale":"...","evidence_quotes":[],"recommendations":[],"confidence":0.7},
"dev_build_test_loops_documented": {"status":"WARN","rationale":"...","evidence_quotes":[],"recommendations":[],"confidence":0.6},
"dev_build_test_loop_execution": {"status":"PASS","rationale":"...","evidence_quotes":[],"recommendations":[],"confidence":0.6}
}
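For a quick look at an llm_results.json file of this shape, a small tally can be enough. The helper below is illustrative only and is not how scoring.py computes the report; the name `summarize` and the returned keys are invented for this sketch.

```python
from collections import Counter


def summarize(llm_results: dict) -> dict:
    """Tally statuses and list the checks that did not pass."""
    counts = Counter(entry["status"] for entry in llm_results.values())
    needs_attention = sorted(
        check_id
        for check_id, entry in llm_results.items()
        if entry["status"] != "PASS"
    )
    return {"counts": dict(counts), "needs_attention": needs_attention}
```

Applied to the example above, this reports three PASS results and one WARN, with dev_build_test_loops_documented listed as needing attention.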
.codex-readiness-unit-test/<timestamp>/logs/.