Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

cursor/show-me-your-work

Name: show-me-your-work
Author: cursor

pstack/skills/show-me-your-work/SKILL.md

npx skillsauth add cursor/plugins show-me-your-work

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Show me your work

For work a human reviews after the fact, a decision trail lets them reconstruct what was decided, why, and on what evidence, without rerunning the work or reading the whole transcript. Keep one canonical log so the trail is consistent and a future agent can find it.

The format

A single TSV file, one row per decision. TSV because GitHub renders it as a sortable table, column -s$'\t' -t and spreadsheets read it, and a row appends with one command. Cells stay single-line. Evidence is a pointer, not prose.

Copy references/decision-log-template.tsv (the header row) to start a clean log. Columns:

ts. ISO8601 timestamp. The timeline axis.
phase. The phase or workstream.
decision. What was chosen or done, one line.
why. The reason in plain words. If a principle drove it, say it plainly (explored options first, this was a one-way door), not as a jargon tag.
evidence. A link or path that proves it: commit SHA, PR number, file:line, or an artifact, trace, or screenshot path. Never a paragraph.
result. The outcome or predicate state: tests green, reverted, pixel-diff 0, INCONCLUSIVE, open.

An example, plain-spoken so a reviewer reads it at a glance. This is illustration only; don't copy these rows into a real log.

ts	phase	decision	why	evidence	result
2026-05-24T09:02:00Z	frame	counted the work first, about 100 components and roughly 75 hours	wanted to know the size before starting a long run	commit 3a9f1c2	found 5 things to sort out before starting
2026-05-24T09:40:00Z	harness	took screenshots of the old version before changing anything	so we can compare old against new and catch any visual change	scripts/snapshot.sh, baseline/	saved 120 reference screenshots
2026-05-24T11:15:00Z	widget	moved the widget styles over without changing how it looks	keep the change small and the result identical	commit 7c21e0a, pixel-diff 0	looks identical, tests pass
2026-05-24T12:30:00Z	widget	threw out a helper's work because its screenshots were blank	checked the real files instead of trusting its summary	worktree reset	reverted, tightened the instructions for next time

Logging a row

Write each entry the way you'd tell a teammate what you did. Plain words, concrete actions, no AI speak or abstract jargon (the unslop skill applies to log text too). A reviewer should understand each row without decoding it.

Use the helper so rows stay well-formed: scripts/log.sh <logfile> <phase> <decision> <why> <evidence> <result>. It stamps ts, writes the header on first use, strips stray tabs/newlines, and prefixes any cell starting with =, +, -, or @ with a single quote so a reviewer opening the log in a spreadsheet doesn't trigger formula execution. A bare printf appending a row works too, but mind those same bytes if cells come from generated or user-supplied text.

Log decision points and checkpoints, not every action: a fork chosen, a unit completed with its verification result, a pivot or revert with its trigger, a blocker surfaced, a gate fixed. For loop runs, one row per iteration. Skip the trivial and self-evident.

Where it lives

By default the log is a working artifact, not committed. Keep it at decisions.tsv in the work dir, or .audit/<task-slug>.tsv when several efforts run at once, and leave it out of git. Most work doesn't need a committed trail; the local log still keeps the run honest and can be discarded after.

Commit it only when the work is ambitious enough that a reviewer needs the trail to trust the result: a large cross-language port, a multi-week migration, anything where confidence has to be shown rather than assumed. A committed log renders as a table in the PR.

Rules

One row is one decision or checkpoint. If it doesn't fit on one line, the decision isn't crisp yet.
Append-only. A wrong call gets a new row that supersedes it. Never edit or delete history.
Prefer evidence produced by committed scripts over hand-made one-offs, so a reviewer can re-run it (the encode-lessons-in-structure principle skill).

Audit the log against the transcript

At the end of the run, before handing back, check the log told the truth. Read this run's transcript under the active workspace's agent-transcripts/ directory (the system prompt names the path). Don't glob across ~/.cursor/projects/*/; that reads unrelated private chats. Walk the log against what actually happened:

Every row maps to a real action. Cut invented or aspirational entries.
Each row's evidence resolves and shows what the row claims.
A fork, pivot, or abandoned approach that shaped the work but isn't logged is a gap. Add it.
Drop padding. If nobody would audit a row, it doesn't earn its place.

Fix the log, not the story. If the work diverged from what a row claims, the row is wrong.

Cross-model review of the trail

Before handing back, you must spawn a subagent on a different model family from the one that did the work. Self-review is not a substitute; the point is fresh eyes you cannot bring yourself. The subagent reads the audit trail and the run's transcript, then flags what the user should pay attention to. Not a redo of the work, a scan for what's suboptimal or risky.

Decisions logged with weak or absent evidence.
Verification steps skipped or claimed without proof in the transcript.
Choices that look risky in hindsight (premature, scope-creeping, papering over a symptom).
Gaps the user would otherwise miss on a casual skim.

Every reply for a run that produced a trail ends with an "Attention" section. Lead with the reviewer's model on its own line (reviewed by <model>), then list each flag pointing to specific rows or moments. "No flags" is a valid value; the model name is not. The self-audit asks if the log told the truth; this asks what the user should still scrutinize even when it did.

Reviewing the trail

Read top to bottom, follow the evidence pointers, spot-check. GitHub renders a committed TSV as a table; column -s$'\t' -t decisions.tsv renders it in a terminal. A row whose evidence doesn't resolve, or whose result is unverified, is the audit catching a gap.

Composing this skill

Other skills route their audit trail here instead of inventing one. Reference it by name and let it own the format; don't restate the columns.

cursor/show-me-your-work

pstack/skills/show-me-your-work/SKILL.md

Keep a reviewable decision trail for long-running or unattended work: a TSV log with one row per decision (what, why, evidence, result). Local by default; commit it when a reviewer needs the trail to trust the result. Use for /show-me-your-work, autonomous or multi-phase runs, or work a human reviews after stepping away.

985 stars

development

Updated May 28, 2026

$ install --global

skillsauth

npx skillsauth add cursor/plugins show-me-your-work

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: May 28, 2026, 2:43 AM169.7s3 files scanned

SKILL.md

name:: show-me-your-work
description:: Keep a reviewable decision trail for long-running or unattended work: a TSV log with one row per decision (what, why, evidence, result). Local by default; commit it when a reviewer needs the trail to trust the result. Use for /show-me-your-work, autonomous or multi-phase runs, or work a human reviews after stepping away.
disable-model-invocation:: true

Show me your work

The format

Copy references/decision-log-template.tsv (the header row) to start a clean log. Columns:

ts. ISO8601 timestamp. The timeline axis.
phase. The phase or workstream.
decision. What was chosen or done, one line.
why. The reason in plain words. If a principle drove it, say it plainly (explored options first, this was a one-way door), not as a jargon tag.
evidence. A link or path that proves it: commit SHA, PR number, file:line, or an artifact, trace, or screenshot path. Never a paragraph.
result. The outcome or predicate state: tests green, reverted, pixel-diff 0, INCONCLUSIVE, open.

An example, plain-spoken so a reviewer reads it at a glance. This is illustration only; don't copy these rows into a real log.

ts	phase	decision	why	evidence	result
2026-05-24T09:02:00Z	frame	counted the work first, about 100 components and roughly 75 hours	wanted to know the size before starting a long run	commit 3a9f1c2	found 5 things to sort out before starting
2026-05-24T09:40:00Z	harness	took screenshots of the old version before changing anything	so we can compare old against new and catch any visual change	scripts/snapshot.sh, baseline/	saved 120 reference screenshots
2026-05-24T11:15:00Z	widget	moved the widget styles over without changing how it looks	keep the change small and the result identical	commit 7c21e0a, pixel-diff 0	looks identical, tests pass
2026-05-24T12:30:00Z	widget	threw out a helper's work because its screenshots were blank	checked the real files instead of trusting its summary	worktree reset	reverted, tightened the instructions for next time

Logging a row

Where it lives

Rules

One row is one decision or checkpoint. If it doesn't fit on one line, the decision isn't crisp yet.
Append-only. A wrong call gets a new row that supersedes it. Never edit or delete history.
Prefer evidence produced by committed scripts over hand-made one-offs, so a reviewer can re-run it (the encode-lessons-in-structure principle skill).

Audit the log against the transcript

Every row maps to a real action. Cut invented or aspirational entries.
Each row's evidence resolves and shows what the row claims.
A fork, pivot, or abandoned approach that shaped the work but isn't logged is a gap. Add it.
Drop padding. If nobody would audit a row, it doesn't earn its place.

Fix the log, not the story. If the work diverged from what a row claims, the row is wrong.

Cross-model review of the trail

Decisions logged with weak or absent evidence.
Verification steps skipped or claimed without proof in the transcript.
Choices that look risky in hindsight (premature, scope-creeping, papering over a symptom).
Gaps the user would otherwise miss on a casual skim.

Reviewing the trail

Composing this skill

Other skills route their audit trail here instead of inventing one. Reference it by name and let it own the format; don't restate the columns.

Related Skills

cursor/setup-pstack

documentation

VerifiedTrustedCommunity

Configure which models pstack uses per role. Detects your available models and writes an always-applied rule that overrides the skill defaults. Use for /setup-pstack, "configure pstack models", or changing pstack's model choices.

1,884SKILL.mdUpdated Jun 7, 2026

cursor/principle-sequence-verifiable-units

testing

VerifiedTrustedCommunity

Apply to multi-step work (sweeps, migrations, runs of similar edits) and to how you stack commits and PRs. Break work into small units that each end in a verifiable state, check each before the next, and order delivery so the sequence proves itself to a reviewer.

1,884SKILL.mdUpdated Jun 7, 2026

cursor/principle-sequence-verifiable-units

cursor/figure-it-out

development

VerifiedTrustedCommunity

Design an auditable playbook when no narrower one fits: a large migration, an ambitious multi-part change, or work a human reviews after stepping away. Scales rigor to the task, runs a hypothesis loop, and logs decisions via show-me-your-work. Use for /figure-it-out, 'figure it out', a large migration, or when no narrower playbook applies.

1,884SKILL.mdUpdated May 25, 2026

cursor/why

tools

VerifiedTrustedCommunity

Use for 'why does X work this way', 'why we picked Y', design rationale, regressions, postmortems, or data-backed thresholds. Discovers available MCPs and queries each evidence category (source control, issue tracker, long-form docs, real-time chat, infrastructure observability, error tracking, product analytics warehouse) in parallel, then returns a cited read on decisions and tradeoffs. Use how for runtime behavior.

1,884SKILL.mdUpdated May 24, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/cursor/plugins.git

# Copy into Claude Code skills folder (global)
cp -r plugins/pstack/skills/show-me-your-work ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

cursor/plugins

985 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT