skills/mlops/training/tinker/SKILL.md
Run LLM post-training on Tinker with CPU-side orchestration and remote GPU execution. Use when preparing or launching SFT, DPO, or PPO-style runs, checkpointing, sampling checkpoints, or resuming long-running jobs through Hermes Research Agent.
npx skillsauth add aum08desai/hermes-research-agent tinker
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Tinker is the only compute backend in Hermes Research Agent v1.
- `TINKER_API_KEY` must be set.
- Use `tinker_posttrain` for lifecycle management instead of ad hoc shell commands when possible.

- `sft`: dataset rows need `prompt` and `completion`.
- `dpo`: dataset rows need `prompt`, `chosen`, and `rejected`.
- `ppo`: dataset rows need `prompt`, `completion`, token-level logprobs, and either `advantage` or `reward`.

- Validate the config with `tinker_posttrain(action="validate_config", ...)`.
- Start the run with `tinker_posttrain(action="start_run", ...)`.
- Let `research_loop(action="monitor_run", ...)` manage long-running polling and resumption.
- Use `tinker_posttrain(action="sample_checkpoint", ...)` for quick qualitative checks.
- Use `tinker_posttrain(action="download_checkpoint", ...)` only when the project needs local checkpoint artifacts.
- Record results with `research_state(action="record_result", ...)`.
- Expect `start_run`, `resume_run`, and `stop_run` to require an approval record.
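
The dataset row requirements and the `TINKER_API_KEY` prerequisite listed above can be checked locally before spending an approval on `start_run`. The following is a minimal sketch, not part of the skill itself: `check_row` and `preflight` are hypothetical helper names, and the `logprobs` key is an assumed field name for the token-level logprobs (the skill does not state the exact key). It complements `validate_config` and does not replace it.

```python
import os

# Required fields per method, taken from the list above.
# "logprobs" is an ASSUMED key name for the token-level logprobs.
REQUIRED_FIELDS = {
    "sft": ("prompt", "completion"),
    "dpo": ("prompt", "chosen", "rejected"),
    "ppo": ("prompt", "completion", "logprobs"),
}


def check_row(method, row):
    """Return a list of problems found in one dataset row (empty if none)."""
    if method not in REQUIRED_FIELDS:
        return [f"unknown method: {method!r}"]
    problems = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS[method]
        if name not in row
    ]
    # PPO rows additionally need at least one of advantage or reward.
    if method == "ppo" and "advantage" not in row and "reward" not in row:
        problems.append("missing field: advantage or reward")
    return problems


def preflight(method, rows, env=os.environ):
    """Collect problems worth fixing before validate_config or start_run."""
    problems = []
    if not env.get("TINKER_API_KEY"):
        problems.append("TINKER_API_KEY is not set")
    for index, row in enumerate(rows):
        problems.extend(f"row {index}: {p}" for p in check_row(method, row))
    return problems
```

An empty list from `preflight("dpo", rows)` suggests the rows have the expected shape. It says nothing about whether the remote config is valid, which is still the job of `validate_config`.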