pstack/skills/show-me-your-work/SKILL.md
Keep a reviewable decision trail for long-running or unattended work: a TSV log with one row per decision (what, why, evidence, result). Local by default; commit it when a reviewer needs the trail to trust the result. Use for /show-me-your-work, autonomous or multi-phase runs, or work a human reviews after stepping away.
npx skillsauth add cursor/plugins show-me-your-workInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
For work a human reviews after the fact, a decision trail lets them reconstruct what was decided, why, and on what evidence, without rerunning the work or reading the whole transcript. Keep one canonical log so the trail is consistent and a future agent can find it.
A single TSV file, one row per decision. TSV because GitHub renders it as a sortable table, column -s$'\t' -t and spreadsheets read it, and a row appends with one command. Cells stay single-line. Evidence is a pointer, not prose.
Copy references/decision-log-template.tsv (the header row) to start a clean log. Columns:
explored options first, this was a one-way door), not as a jargon tag.file:line, or an artifact, trace, or screenshot path. Never a paragraph.tests green, reverted, pixel-diff 0, INCONCLUSIVE, open.An example, plain-spoken so a reviewer reads it at a glance. This is illustration only; don't copy these rows into a real log.
ts phase decision why evidence result
2026-05-24T09:02:00Z frame counted the work first, about 100 components and roughly 75 hours wanted to know the size before starting a long run commit 3a9f1c2 found 5 things to sort out before starting
2026-05-24T09:40:00Z harness took screenshots of the old version before changing anything so we can compare old against new and catch any visual change scripts/snapshot.sh, baseline/ saved 120 reference screenshots
2026-05-24T11:15:00Z widget moved the widget styles over without changing how it looks keep the change small and the result identical commit 7c21e0a, pixel-diff 0 looks identical, tests pass
2026-05-24T12:30:00Z widget threw out a helper's work because its screenshots were blank checked the real files instead of trusting its summary worktree reset reverted, tightened the instructions for next time
Write each entry the way you'd tell a teammate what you did. Plain words, concrete actions, no AI speak or abstract jargon (the unslop skill applies to log text too). A reviewer should understand each row without decoding it.
Use the helper so rows stay well-formed: scripts/log.sh <logfile> <phase> <decision> <why> <evidence> <result>. It stamps ts, writes the header on first use, strips stray tabs/newlines, and prefixes any cell starting with =, +, -, or @ with a single quote so a reviewer opening the log in a spreadsheet doesn't trigger formula execution. A bare printf appending a row works too, but mind those same bytes if cells come from generated or user-supplied text.
Log decision points and checkpoints, not every action: a fork chosen, a unit completed with its verification result, a pivot or revert with its trigger, a blocker surfaced, a gate fixed. For loop runs, one row per iteration. Skip the trivial and self-evident.
By default the log is a working artifact, not committed. Keep it at decisions.tsv in the work dir, or .audit/<task-slug>.tsv when several efforts run at once, and leave it out of git. Most work doesn't need a committed trail; the local log still keeps the run honest and can be discarded after.
Commit it only when the work is ambitious enough that a reviewer needs the trail to trust the result: a large cross-language port, a multi-week migration, anything where confidence has to be shown rather than assumed. A committed log renders as a table in the PR.
At the end of the run, before handing back, check the log told the truth. Read this run's transcript under the active workspace's agent-transcripts/ directory (the system prompt names the path). Don't glob across ~/.cursor/projects/*/; that reads unrelated private chats. Walk the log against what actually happened:
Fix the log, not the story. If the work diverged from what a row claims, the row is wrong.
Before handing back, you must spawn a subagent on a different model family from the one that did the work. Self-review is not a substitute; the point is fresh eyes you cannot bring yourself. The subagent reads the audit trail and the run's transcript, then flags what the user should pay attention to. Not a redo of the work, a scan for what's suboptimal or risky.
Every reply for a run that produced a trail ends with an "Attention" section. Lead with the reviewer's model on its own line (reviewed by <model>), then list each flag pointing to specific rows or moments. "No flags" is a valid value; the model name is not. The self-audit asks if the log told the truth; this asks what the user should still scrutinize even when it did.
Read top to bottom, follow the evidence pointers, spot-check. GitHub renders a committed TSV as a table; column -s$'\t' -t decisions.tsv renders it in a terminal. A row whose evidence doesn't resolve, or whose result is unverified, is the audit catching a gap.
Other skills route their audit trail here instead of inventing one. Reference it by name and let it own the format; don't restate the columns.
documentation
Configure which models pstack uses per role. Detects your available models and writes an always-applied rule that overrides the skill defaults. Use for /setup-pstack, "configure pstack models", or changing pstack's model choices.
testing
Apply to multi-step work (sweeps, migrations, runs of similar edits) and to how you stack commits and PRs. Break work into small units that each end in a verifiable state, check each before the next, and order delivery so the sequence proves itself to a reviewer.
development
Design an auditable playbook when no narrower one fits: a large migration, an ambitious multi-part change, or work a human reviews after stepping away. Scales rigor to the task, runs a hypothesis loop, and logs decisions via show-me-your-work. Use for /figure-it-out, 'figure it out', a large migration, or when no narrower playbook applies.
tools
Use for 'why does X work this way', 'why we picked Y', design rationale, regressions, postmortems, or data-backed thresholds. Discovers available MCPs and queries each evidence category (source control, issue tracker, long-form docs, real-time chat, infrastructure observability, error tracking, product analytics warehouse) in parallel, then returns a cited read on decisions and tradeoffs. Use how for runtime behavior.