skills/ml-experimentation/SKILL.md
Conduct machine learning experiments from planning through evaluation and report writing. Use when running ML experiments, testing hypotheses, training models, or writing up results. Covers single-hypothesis scoping, fast iteration loops, targeted logging, JOURNAL.md protocol, data-backed diagnostic plots, and scientific report writing.
npx skillsauth add ericmjl/skills ml-experimentation
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill guides a hypothesis-driven ML experiment life cycle: planning, fast iteration, script execution, targeted logging, journaling, diagnostic visualization, and scientific report writing.
Use this skill when the user wants to run an ML experiment, test a model or idea, or write up experiment results. First decide: new experiment (different question → new experiment directory) or new run (same question, tweaks → new run under runs/). See references/experiment-setup.md for that disambiguation, hypothesis scoping, and the fast-iteration checklist.
- Run scripts with `uv run script.py` or, when pixi is the environment manager, `pixi run python script.py` (pixi reads `pyproject.toml` or `pixi.toml`).
- Prefer a GPU-enabled environment: with uv, declare a `[[tool.uv.index]]` CUDA index in the script block; with pixi, use a GPU-enabled environment defined in `pyproject.toml` or `pixi.toml`. Fall back to CPU only when GPU is unavailable. See `references/script-patterns.md`.
- Decide between a new experiment directory (new experiment) and a new directory under `runs/` (new run).
- Execute with `uv run` or `pixi run` (GPU-enabled environment preferred).
- Name each run `YYYY-MM-DDTHH-MM-SS-<descriptive-string>` (e.g. `runs/2025-02-03T09-00-00-retry`). See `references/experiment-setup.md`.
- Run layout: `runs/YYYY-MM-DDTHH-MM-SS-<descriptive-string>/` → `logs/`, `plots/`, `checkpoints/`, `data/`. For a new run, create only the new run directory under `runs/` (e.g. `runs/2025-02-03T09-00-00-retry/`) with `logs/`, `plots/`, `checkpoints/`, `data/`.
- Runs live under `runs/`, named with the full ISO datetime plus a descriptive string, `YYYY-MM-DDTHH-MM-SS-<descriptive-string>` (e.g. `runs/2025-02-02T14-30-00-de-risk`, `runs/2025-02-02T15-00-00-full`, `runs/2025-02-03T09-00-00-retry`). Keep a running log; never overwrite an existing run. See `references/experiment-setup.md` for the canonical tree.
- List runs to exclude in `IGNORED_RUNS.md` (or the JOURNAL.md “Ignored runs” section). See `references/experiment-setup.md`.
- Run scripts with `uv run script.py` or, when pixi is the environment manager, `pixi run python script.py` (pixi uses `pyproject.toml` or `pixi.toml`). Always run train/eval in a GPU-enabled environment when possible (uv: CUDA index or `jax[cuda*]` in the script block; pixi: GPU-enabled environment in `pyproject.toml` or `pixi.toml`).
- Run paths such as `runs/2025-02-02T14-30-00-de-risk` and `runs/2025-02-02T15-00-00-full` are relative to the experiment. Scripts accept only the descriptive name (e.g. `uv run train.py de-risk` or `pixi run python train.py de-risk`); the datetime is auto-calculated.
- The training script (`train.py`) creates the run directory (`logs/`, `plots/`, `checkpoints/`, `data/`) so the experiment is self-contained, with no external scaffold; see `references/script-patterns.md` for the Typer-based train scaffold.
- For long-running scripts, set a sufficiently long timeout (the execution tool's `timeout` parameter); otherwise the tool may hit its default execution timeout and the run may be killed before completion.
- Keep helper scripts in `<experiment>/` (e.g. `generate_data.py`); run them with CWD = experiment directory.
- Use a GPU-enabled environment (`[[tool.uv.index]]` CUDA index when using uv; GPU deps in `pyproject.toml`/`pixi.toml` when using pixi) so runs are performant; fall back to CPU only when GPU is unavailable.
- Write logs to each run's `logs/` directory (e.g. `runs/2025-02-02T14-30-00-de-risk/logs/train.log`, `runs/2025-02-02T15-00-00-full/logs/eval.log`). See `references/logging-guide.md`.
- Tag journal entries with `[WEIRD]`, `[HUNCH]`, `[TODO]`, `[RESOLVED]` so entries are scannable.

Plots
- Check `IGNORED_RUNS.md` (and JOURNAL.md’s “Ignored runs” section if present); exclude any listed runs from plots.
- Save plots to each run's `plots/` directory (e.g. `runs/2025-02-02T15-00-00-full/plots/loss_curve.webp`), generated from that run’s `logs/`.
- For example, given `runs/2025-02-02T15-00-00-full/logs/train.log`, generate `runs/2025-02-02T15-00-00-full/plots/loss_curve.webp` from that log; do not plot quantities that were not logged.

Scientific Report
- Exclude runs listed in `IGNORED_RUNS.md` or the JOURNAL.md “Ignored runs” section from the report narrative and figures; do not delete those runs from disk.
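
The run-naming and ignored-run conventions above can be sketched with the standard library alone. This is a minimal illustration, not the Typer-based scaffold from `references/script-patterns.md`: the function names `create_run_dir`, `read_ignored_runs`, and `active_runs` are hypothetical, and the `IGNORED_RUNS.md` format (one run directory name per line, optionally written as a bullet) is an assumption, since the skill text does not specify it.

```python
from __future__ import annotations

from datetime import datetime
from pathlib import Path

# Subdirectories every run directory is expected to contain.
RUN_SUBDIRS = ("logs", "plots", "checkpoints", "data")


def create_run_dir(experiment_dir: Path, name: str, now: datetime | None = None) -> Path:
    """Create runs/YYYY-MM-DDTHH-MM-SS-<name>/ with its standard subdirectories.

    Only the descriptive name is passed in; the datetime prefix is computed here.
    """
    stamp = (now or datetime.now()).strftime("%Y-%m-%dT%H-%M-%S")
    run_dir = experiment_dir / "runs" / f"{stamp}-{name}"
    # exist_ok=False raises FileExistsError, so an existing run is never overwritten.
    run_dir.mkdir(parents=True, exist_ok=False)
    for sub in RUN_SUBDIRS:
        (run_dir / sub).mkdir()
    return run_dir


def read_ignored_runs(experiment_dir: Path) -> set[str]:
    """Read run names from IGNORED_RUNS.md.

    ASSUMED format: one run directory name per line, optionally as a
    markdown bullet and optionally prefixed with 'runs/'. Heading lines
    starting with '#' are skipped.
    """
    path = experiment_dir / "IGNORED_RUNS.md"
    if not path.exists():
        return set()
    ignored: set[str] = set()
    for line in path.read_text().splitlines():
        entry = line.strip().lstrip("-* ").strip().strip("`")
        if not entry or entry.startswith("#"):
            continue
        if entry.startswith("runs/"):
            entry = entry[len("runs/"):]
        ignored.add(entry.rstrip("/"))
    return ignored


def active_runs(experiment_dir: Path) -> list[Path]:
    """List run directories that are not marked as ignored, sorted by name.

    Ignored runs stay on disk; they are only filtered out of the result.
    """
    runs_root = experiment_dir / "runs"
    if not runs_root.is_dir():
        return []
    ignored = read_ignored_runs(experiment_dir)
    return sorted(p for p in runs_root.iterdir() if p.is_dir() and p.name not in ignored)
```

A plotting or report script could iterate over `active_runs(Path("."))` with the experiment directory as CWD, reading each run's `logs/` and writing to its `plots/`. Because the datetime prefix sorts lexicographically, sorting by name also orders the runs chronologically.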