engineering/agenthub/skills/run/SKILL.md
One-shot lifecycle command that chains init → baseline → spawn → eval → merge in a single invocation.
npx skillsauth add alirezarezvani/claude-skills runInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Run the full AgentHub lifecycle in one command: initialize, capture baseline, spawn agents, evaluate results, and merge the winner.
/hub:run --task "Reduce p50 latency" --agents 3 \
--eval "pytest bench.py --json" --metric p50_ms --direction lower \
--template optimizer
/hub:run --task "Refactor auth module" --agents 2 --template refactorer
/hub:run --task "Cover untested utils" --agents 3 \
--eval "pytest --cov=utils --cov-report=json" --metric coverage_pct --direction higher \
--template test-writer
/hub:run --task "Write 3 email subject lines for spring sale campaign" --agents 3 --judge
| Parameter | Required | Description |
|-----------|----------|-------------|
| --task | Yes | Task description for agents |
| --agents | No | Number of parallel agents (default: 3) |
| --eval | No | Eval command to measure results (skip for LLM judge mode) |
| --metric | No | Metric name to extract from eval output (required if --eval given) |
| --direction | No | lower or higher — which direction is better (required if --metric given) |
| --template | No | Agent template: optimizer, refactorer, test-writer, bug-fixer |
Execute these steps sequentially:
Run /hub:init with the provided arguments:
python {skill_path}/scripts/hub_init.py \
--task "{task}" --agents {N} \
[--eval "{eval_cmd}"] [--metric {metric}] [--direction {direction}]
Display the session ID to the user.
If --eval was provided:
Baseline captured: {metric} = {value}baseline: {value} to .agenthub/sessions/{session-id}/config.yamlIf no --eval was provided, skip this step.
Run /hub:spawn with the session ID.
If --template was provided, use the template dispatch prompt from references/agent-templates.md instead of the default dispatch prompt. Pass the eval command, metric, and baseline to the template variables.
Launch all agents in a single message with multiple Agent tool calls (true parallelism).
After spawning, inform the user that agents are running. When all agents complete (Agent tool returns results):
Run /hub:eval with the session ID:
--eval was provided: metric-based ranking with result_ranker.py--eval: LLM judge mode (coordinator reads diffs and ranks)If baseline was captured, pass --baseline {value} to result_ranker.py so deltas are shown.
Display the ranked results table.
Present the results to the user and ask for confirmation:
Agent-2 is the winner (128ms, -52ms from baseline).
Merge agent-2's branch? [Y/n]
If confirmed, run /hub:merge. If declined, inform the user they can:
/hub:merge --agent agent-{N} to pick a different winner/hub:eval --judge to re-evaluate with LLM judge--template, agents use the default dispatch prompt from /hub:spawntools
Code review automation for TypeScript, JavaScript, Python, Go, Swift, Kotlin, C#, .NET, Java, C, C++, Rust, Ruby, PHP, and Dart/Flutter. Analyzes PRs for complexity and risk, checks code quality for SOLID violations and code smells, generates review reports. Use when reviewing pull requests, analyzing code quality, identifying issues, generating review checklists.
tools
Use when planning, funding, scoping, or synthesizing enterprise research across workstreams — clinical study design, R&D program finance, market sizing/surveys, or product/user research. Triggers on "design this clinical study", "what sample size", "R&D budget", "burn rate", "capitalize or expense", "TAM SAM SOM", "market sizing", "survey design", "segment the market", "plan user interviews", "usability test", "synthesize research insights". Forks context to route to one of four Research-Operations sub-skills (clinical-research, research-finance, market-research, product-research) and returns a digest. Distinct from ra-qm-team (regulatory submission), finance (corporate close/valuation), research/grants (funding discovery), product-team (persona/journey/live experiments), and marketing-skill (campaign analytics).
development
Use when managing the money for an internal R&D program or portfolio — building a multi-period program budget with the F&A (indirect) split, tracking burn rate and runway against value-inflection milestones, or routing R&D cost items to a capitalize-vs-expense determination. Every budget output surfaces its assumptions block; capitalize-vs-expense is decision-support only and routes to a named finance owner — it never books an entry or decides accounting treatment. Distinct from finance/financial-analysis (corporate DCF, close, valuation) and research/grants (funding discovery — this manages money already won).
development
Use when planning and synthesizing product/user research as a method-and-repository discipline — selecting the right method for the goal (generative interviews vs usability test vs concept test vs validation), computing method-based saturation/sample size with an explicit confidence level, or synthesizing coded observations into insights while flagging single-source anecdotes. Never fabricates user insight; an insight requires recurrence across independent participants. Distinct from product-team/ux-researcher-designer (persona/journey artifacts), product-discovery (discovery-sprint planning), and experiment-designer (live A/B) — this is the research-ops method + insight-repository layer.