plugins/superpowers/skills/executing-plans/SKILL.md
Use when you have a written implementation plan to execute in a separate session with review checkpoints
npx skillsauth add primatrix/skills executing-plansInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Load plan, review critically, execute all tasks, report when complete.
Announce at start: "I'm using the executing-plans skill to implement this plan."
Note: Tell your human partner that Superpowers works much better with access to subagents. The quality of its work will be significantly higher if run on a platform with subagent support (such as Claude Code or Codex). If subagents are available, use superpowers:subagent-driven-development instead of this skill.
For each task:
After all tasks complete and verified:
STOP executing immediately when:
Ask for clarification rather than guessing.
Return to Review (Step 1) when:
Don't force through blockers - stop and ask.
Required workflow skills:
development
Use when analyzing TPU pretraining HBM occupancy from a profile directory — locates the static HBM peak (the same number TensorBoard's Memory Viewer shows), enumerates every buffer alive at the peak schedule moment with size / HLO instruction / opcode / op_name, and rolls the alive set up by opcode and op_name. Reads compile-time `*.hlo_proto.pb` (BufferAssignmentProto) as the primary source; runtime `*.xplane.pb` allocator events are a secondary, often-truncated signal.
testing
Use when analyzing TPU pretraining compute efficiency from xplane.pb — produces source-line-aggregated HLO duration tables, layer-scoped breakdowns, non-compute (padding/cast/copy) audits, and v7x roofline shortfall vs theoretical peak. Reads schema documented by profile-anatomy.
tools
--- name: comm-analysis description: Use when analyzing communication on a TPU pretraining profile — extracts every comm primitive (async + sync, TC + SparseCore), attributes axes via HLO replica_groups, computes per-row NCCL bus BW vs per-axis peak ICI BW (peak_link × k_torus_dims × directions_per_dim; TPUv7x: 200 GB/s bidir per link on a 3D torus; util% requires `--mesh-spec` with topology), and reports per-step compute/comm overlap. Builds on profile-anatomy. --- # Communication Analysis **
documentation
Use when reading TPU pretraining profiles (xplane.pb, trace.json.gz) — describes the on-disk layout, the XSpace/XPlane/XLine/XEvent/XStat hierarchy, and provides reference scripts that future tpu-perf skills can read as schema documentation.