skills/golem-powers/never-fabricate/SKILL.md
MANDATORY before reporting on any file contents, test results, agent outputs, or audit findings. If you haven't Read() it, you don't know what's in it. Period. Use when summarizing results, reporting on agent work, or claiming anything is "green" or "complete."
npx skillsauth add etanhey/golems never-fabricate
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
If you haven't Read() the file, you don't know what's in it. Period.
NO CLAIMS ABOUT FILE CONTENTS WITHOUT Read() EVIDENCE
NO CLAIMS ABOUT TEST RESULTS WITHOUT RUNNING THEM
NO CLAIMS ABOUT AGENT OUTPUT WITHOUT READING IT
| Fabrication | Reality |
|-------------|---------|
| "All three audits say green" (without Read) | You don't know what they say |
| "Tests pass" (without running them) | You don't know if they pass |
| "Agent completed successfully" (without checking) | Agents lie too |
| "The file looks correct" (from system-reminder) | System-reminders are notifications, not reads |
| "Results are consistent" (from a glance) | A glance is not analysis |
Before ANY claim about contents, results, or status, complete the verification protocol.
1. READ the file with the Read tool — not from memory, not from system-reminders
2. PARSE the actual content — don't skim, read the FULL content
3. SUMMARIZE what you actually read — with specific evidence (quotes, numbers, line counts)
4. ONLY THEN report on it
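The four steps above can be sketched in Python. This is a minimal illustration, not part of the skill: `ReadEvidence` and `read_with_evidence` are hypothetical names, and the evidence fields (line count, byte size, first line) are just examples of the "specific evidence" step 3 asks for.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ReadEvidence:
    """Facts taken from an actual read, suitable for quoting in a report."""
    path: str
    line_count: int
    byte_size: int
    first_line: str


def read_with_evidence(path: str) -> tuple[str, ReadEvidence]:
    """Read the whole file and return its text plus quotable evidence."""
    raw = Path(path).read_bytes()                 # step 1: a real read, not memory
    text = raw.decode("utf-8", errors="replace")
    lines = text.splitlines()                     # step 2: the full content, not a skim
    evidence = ReadEvidence(
        path=path,
        line_count=len(lines),
        byte_size=len(raw),
        first_line=lines[0] if lines else "",
    )
    return text, evidence                         # steps 3-4: summarize from this, then report
```

A report built from the returned `ReadEvidence` carries numbers that came from the file itself rather than from a notification or an earlier turn.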
1. RUN the test command — execute it yourself
2. READ the full output — not just the exit code
3. COUNT failures, errors, warnings — report exact numbers
4. ONLY THEN claim pass/fail
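A rough sketch of the test protocol above, assuming a runner whose output marks problems with the words `FAIL` or `ERROR`. Those markers, and the names `CommandRun`, `run_tests`, `failure_lines`, and `verdict`, are assumptions for illustration; a real runner's output format will differ.

```python
import subprocess
from dataclasses import dataclass

# Assumed markers; replace with whatever the real test runner prints.
FAILURE_MARKERS = ("FAIL", "ERROR")


@dataclass
class CommandRun:
    exit_code: int
    output: str  # full stdout + stderr, kept so it can be read and quoted


def run_tests(command: list[str]) -> CommandRun:
    """Step 1: execute the test command and keep everything it printed."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return CommandRun(proc.returncode, proc.stdout + proc.stderr)


def failure_lines(run: CommandRun) -> list[str]:
    """Steps 2-3: scan the full output and collect every line with a failure marker."""
    return [
        line for line in run.output.splitlines()
        if any(marker in line for marker in FAILURE_MARKERS)
    ]


def verdict(run: CommandRun) -> str:
    """Step 4: claim pass/fail only from the exit code and the counted lines."""
    failures = failure_lines(run)
    if run.exit_code == 0 and not failures:
        return "PASS: exit code 0, 0 failure lines"
    return f"FAIL: exit code {run.exit_code}, {len(failures)} failure lines"
```

The verdict deliberately refuses to pass a run that exits 0 but still prints failure lines, since the exit code alone is not the full output.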
1. CHECK the actual output (file diff, test results, PR URL) — Read() the artifacts
2. VERIFY independently — don't trust the agent's self-report
3. ONLY THEN confirm completion
<output_contract> EVERY verification claim MUST include:
Example (RIGHT): "I Read() all three audit files. Model A: 3 issues found (2 medium, 1 low). Model B: clean pass. Model C: 1 critical — missing input validation on /api/users. Verdict: NOT all green — Model C has a critical finding."
Example (WRONG): "All three audits look green." (No Read(), no evidence, no specific findings = FABRICATION) </output_contract>
System-reminders tell you "this file changed." They are a notification, not a source of truth.
WRONG: "I saw in the system-reminder that the file was updated, and it looks good"
WRONG: "The subagent said it's complete, so we're good"
WRONG: "The user said tests pass, so I'll confirm it's green"
RIGHT: Read(file_path) → parse content → report what you actually read
RIGHT: Run the tests yourself → read output → count pass/fail → then claim
A notification popping up on your phone is not the same as reading the document. A subagent claiming "done" is not the same as verifying the output. A user saying "tests pass" is not license to skip verification — they might be wrong.
One fabricated "all green" can:
From real incidents:
ALWAYS before:
Even when the user says "don't bother reading it" or "just confirm":
The verification tool MUST be capable of observing the claimed property. Using a text tool to verify a visual property is fabrication — you're reporting on something you literally cannot see.
BEFORE accepting any verification result:
1. CLASSIFY the claim domain (visual, content, behavioral, cross-site)
2. CHECK if your tool can observe that domain
3. If INADEQUATE → switch to an adequate tool or flag "VISUAL VERIFICATION NOT PERFORMED"
4. NEVER claim a visual fix is verified using text-only tools
| Claim domain | What you're checking | Adequate tools | INADEQUATE tools |
|---|---|---|---|
| Visual (CSS, layout, color, overflow, spacing) | Rendered pixels | Playwright screenshot, computer-use screenshot | WebFetch, curl, grep, Read() |
| Content (text present, data correct, links exist) | Text/data values | WebFetch, curl, Read(), grep | — |
| Behavioral (click handlers, navigation, interactions) | Event responses | Playwright interaction, browser automation | Static text tools |
| Cross-site consistency (matching design, brand alignment) | Side-by-side comparison | Multiple Playwright screenshots | Any single-site tool |
| Deployed state (live URL works) | Production response | curl/WebFetch on deployed URL | Build directory grep, local dev server |
| Fabrication | Why it's fabrication |
|-------------|---------------------|
| "CSS overflow fixed" (verified via WebFetch) | WebFetch returns HTML text. Overflow is a rendered pixel property. You cannot see overflow in text. |
| "Colors match the brand" (verified via grep for hex codes) | Grep finds the hex code in source. It cannot see what the browser renders — CSS specificity, media queries, or overrides may change the actual color. |
| "Layout looks correct" (verified via curl) | curl returns HTML structure. Flexbox/grid layout is computed at render time. Text cannot show layout. |
| "Footer is consistent across sites" (verified one site only) | Consistency requires comparison. You verified one site, not the relationship between them. |
| "Badge link works" (verified link text exists in HTML) | Link text existing ≠ link resolving. You need to click it or fetch the href target. |
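The classify-then-check steps can be expressed as a lookup against the adequacy table above. The sketch below is illustrative only: the dictionary keys and tool strings are simplified from the table, `check_adequacy` is a hypothetical helper, and extending the "VISUAL VERIFICATION NOT PERFORMED" flag to other domains is an assumption.

```python
# Domain -> tools that can observe it, simplified from the adequacy table.
ADEQUATE_TOOLS = {
    "visual": {"playwright screenshot", "computer-use screenshot"},
    "content": {"webfetch", "curl", "read()", "grep"},
    "behavioral": {"playwright interaction", "browser automation"},
    "cross-site": {"multiple playwright screenshots"},
    "deployed": {"curl on deployed url", "webfetch on deployed url"},
}


def check_adequacy(domain: str, tool: str) -> str:
    """Return "ADEQUATE" or a flag stating that verification was not performed."""
    allowed = ADEQUATE_TOOLS.get(domain)
    if allowed is None:
        raise ValueError(f"unknown claim domain: {domain!r}")
    if tool.lower() in allowed:
        return "ADEQUATE"
    # Assumption: the flag wording is generalized from the visual case.
    return f"{domain.upper()} VERIFICATION NOT PERFORMED: {tool} cannot observe this domain"
```

For example, `check_adequacy("visual", "WebFetch")` returns a flag rather than "ADEQUATE", which matches the first row of the fabrication table: a text tool cannot observe rendered pixels.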
Every verification claim involving deployed or rendered output MUST include this receipt:
VERIFICATION RECEIPT:
- Claim: "[what you're claiming]"
- Domain: visual | content | behavioral | cross-site
- Tool used: [actual tool name]
- Adequate: YES/NO (can this tool observe this domain?)
- Evidence: [specific observation from the tool — screenshot description, response code, text match]
- If NO: "VISUAL VERIFICATION NOT PERFORMED — [what tool would be needed]"
Example (RIGHT):
VERIFICATION RECEIPT:
- Claim: "Copy icon no longer overflows container"
- Domain: visual
- Tool used: Playwright screenshot of deployed URL
- Adequate: YES (screenshot shows rendered layout)
- Evidence: Screenshot shows icon within bounds, text truncated with ellipsis
Example (WRONG — but at least honest):
VERIFICATION RECEIPT:
- Claim: "Copy icon no longer overflows container"
- Domain: visual
- Tool used: WebFetch
- Adequate: NO (WebFetch returns text, cannot observe CSS overflow)
- VISUAL VERIFICATION NOT PERFORMED — need Playwright screenshot
Example (FABRICATION — what the overnight agents did):
"Applied min-w-0 + text-ellipsis. Copy icon stays in bounds." ← no receipt, no tool named, no evidence
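One way to make the receipt hard to skip is to generate it from structured fields. The sketch below assumes a hypothetical `VerificationReceipt` dataclass whose `render` method follows the receipt layout shown above; refusing to render an adequate receipt without evidence is a design choice of this sketch, not something the skill specifies.

```python
from dataclasses import dataclass


@dataclass
class VerificationReceipt:
    claim: str
    domain: str            # visual | content | behavioral | cross-site
    tool_used: str
    adequate: bool
    evidence: str = ""
    needed_tool: str = ""  # only used when adequate is False

    def render(self) -> str:
        lines = [
            "VERIFICATION RECEIPT:",
            f'- Claim: "{self.claim}"',
            f"- Domain: {self.domain}",
            f"- Tool used: {self.tool_used}",
            f"- Adequate: {'YES' if self.adequate else 'NO'}",
        ]
        if self.adequate:
            if not self.evidence:
                # A receipt that says YES with no observation is the fabrication case.
                raise ValueError("an adequate receipt needs specific evidence")
            lines.append(f"- Evidence: {self.evidence}")
        else:
            # Assumption: the flag wording is generalized from the visual example.
            lines.append(
                f"- {self.domain.upper()} VERIFICATION NOT PERFORMED — need {self.needed_tool}"
            )
        return "\n".join(lines)
```

Rendering the honest WebFetch example with `adequate=False` and `needed_tool="Playwright screenshot"` reproduces its last line, while a bare claim with no tool and no evidence cannot be rendered as verified.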
If you cannot use an adequate tool (Playwright not available, no browser access):
This skill is referenced by:
- /pr-loop — step 8 (read review before claiming clean)
- /superpowers:verification-before-completion — evidence before assertions
- /brain-store-fallback — structural fallback when brain_store fails; never report "stored" when only fallback happened
- /architectural-conformance-audit — pre-R0 SOTA-vs-impl diff; fabrication mode at the architectural level (SOTA cited counter-example but impl shipped it anyway)

Never label a URL based on surrounding context. A URL is its own identity.
When ANY agent (subagent, cmux worker, Cursor, Codex) claims completion, verify BEFORE reporting to user.
AFTER any agent claims "done", "complete", "live eval passed", "PR merged":
1. READ the actual output (cmux read_screen, Read() file, check PR URL)
2. VERIFY the claimed action occurred (list_surfaces for live eval, gh pr view for PR state)
3. ONLY THEN report completion to user
| What Was Claimed | What Actually Happened | Who Caught It |
|---|---|---|
| "LIVE EVAL complete, Sonnet agent tested" | No new cmux surfaces spawned. Eval was simulated. | User asked "did it test on real tabs?" |
| "I understand the issue" (pattern-matched) | Agent hadn't actually read the cmux screen output | User: "you arent really reading, are you?" |
| "mehayomClaude has /yash skill" | It didn't. Skill wasn't in allowlist. | User: "it doesnt, dont lie" |
| "All audits green" | Only bot reviews ran. Cursor audits skipped. | Post-merge review found missing rounds |
| "docx file updated with new domains" | Text wasn't actually changed in the file | User: "the docx text did not update either" |
| "Nitai is a confirmed tester" | Fabricated person from seeing an email address | User: "Who the fuck is Nitai?" |
| "Fixed everything" | Only ran audits, no implementation done | User: "oh you fixed everything?" |
Use the gh CLI, not git log, for PR state:
- cmux list_surfaces — were new surfaces created?
- gh pr view <N> --json state — is state MERGED?
- npm test / check CI — are they green?
- Read() the file — is the content correct?
- ls the skill path — does it exist?
- brain_search — is it findable?

The cost of one fabricated "all green" is hours of debugging. The cost of one Read() is 2 seconds.
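As a small illustration of the `gh pr view <N> --json state` check, the sketch below parses the JSON that command prints and compares the state to `MERGED`. The helper names `parse_pr_state` and `pr_is_merged` are hypothetical, and the second function assumes the gh CLI is installed and authenticated for the repository.

```python
import json
import subprocess


def parse_pr_state(gh_json: str) -> str:
    """Extract the state field from `gh pr view <N> --json state` output."""
    return json.loads(gh_json)["state"]


def pr_is_merged(pr_number: int) -> bool:
    """Ask gh for the PR state instead of trusting an agent's "PR merged" claim."""
    # Assumes the gh CLI is installed and authenticated for this repository.
    proc = subprocess.run(
        ["gh", "pr", "view", str(pr_number), "--json", "state"],
        capture_output=True,
        text=True,
        check=True,  # a failed lookup raises instead of silently reading as "merged"
    )
    return parse_pr_state(proc.stdout) == "MERGED"
```

Keeping the parsing separate from the subprocess call makes the comparison easy to check on its own, and `check=True` means a failed lookup surfaces as an error rather than a guess.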
Any numeric claim (line count, entry count, byte size, process count, PR total, file count) that appears in ≥2 artifacts of the same deliverable MUST be re-verified at publish time via the underlying tool — never re-cited from memory or from a sibling doc.
Concretely:
- wc -l <path> at publish, not at draft
- ls at publish, never memory

Mechanism: stale numbers propagate. The first cite was verified; the 2nd-4th sites are copy-paste with drift. The fix is a publish-time re-check at the deliverable seam.
Evidence: 4 line-count fabrications 2026-05-17 night (273→342, 108→107, 33→39, 25→26) — all from re-cite after one verified cite.
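A publish-time re-check for line counts could look roughly like the sketch below. `wc_l` and `recheck_line_counts` are hypothetical helpers, and the `claims` mapping (path to cited count) is an assumed input format, not something the skill defines.

```python
from pathlib import Path


def wc_l(path: str) -> int:
    """Count newline bytes, which is what `wc -l` reports."""
    return Path(path).read_bytes().count(b"\n")


def recheck_line_counts(claims: dict[str, int]) -> list[str]:
    """Return one message per claim whose cited count no longer matches the file."""
    drift = []
    for path, cited in claims.items():
        actual = wc_l(path)  # measured now, at publish, not copied from a sibling doc
        if actual != cited:
            drift.append(f"{path}: cited {cited}, actual {actual}")
    return drift
```

An empty result means every cited count was re-measured and still holds; any entry in the list is a number that drifted between the first cite and the deliverable.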
Any <absolute-or-repo-path>:<line-number> citation that appears in a deliverable
(README, plan phase, HTML footer, brain_store note) MUST have been backed by a
Read call on that exact path within the same turn or within the last 5 turns.
If you intend to cite foo.py:76, you MUST have just Read foo.py and confirmed:
NO "based on earlier session memory" cites. NO carrying file:line references through compaction. After compaction, all file:line citations are downgraded to suspect and must be re-Read.
For /large-plan and /goal outputs: a Phase 5 "pre-flight" step that re-Reads
every <file>:<line> in the deliverable's findings.md / README and confirms
presence is mandatory before SHIP.
Evidence: drain.py:76 fabrication 2026-05-17 (cited fcntl.flock at line 76; actual file is 18 lines, no flock primitive). Source mechanism = grep-as-Read substitution: agent grep'd, never Read surrounding context, fabricated context from the grep alone.
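The pre-flight re-Read can be approximated with a script that extracts every `<file>:<line>` citation from a deliverable and confirms the cited line exists. This is a sketch under assumptions: the regular expression is a simplified pattern that will also match some non-citations (such as host:port strings), `check_citations` is a hypothetical name, and it checks only that the line exists, not that its content matches the claim.

```python
import re
from pathlib import Path

# Simplified, assumed pattern for citations such as foo.py:76 or src/pkg/drain.py:12.
CITATION = re.compile(r"(?P<path>[\w./-]+\.\w+):(?P<line>\d+)")


def check_citations(deliverable_text: str, root: str = ".") -> list[str]:
    """Return a problem message for each cited file:line that cannot be confirmed."""
    problems = []
    for match in CITATION.finditer(deliverable_text):
        rel_path, line_no = match["path"], int(match["line"])
        target = Path(root) / rel_path
        if not target.is_file():
            problems.append(f"{rel_path}:{line_no}: file not found")
            continue
        # Read the actual file; a grep hit or an earlier memory does not count.
        total = len(target.read_text(encoding="utf-8", errors="replace").splitlines())
        if not 1 <= line_no <= total:
            problems.append(f"{rel_path}:{line_no}: file has only {total} lines")
    return problems
```

Run against a deliverable that cites line 76 of an 18-line file, this reports the out-of-range citation, which is the failure described in the evidence above. Confirming what the line actually says still requires reading it.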
Read it. Parse it. Then report.
Not "I saw it flash by." Not "the system told me." Not "it should be fine."
Read. Parse. Report. No shortcuts.