qa-aman/karpathy-guidelines

Name: karpathy-guidelines
Author: qa-aman

skills/by-role/engineer/karpathy-guidelines/SKILL.md

npx skillsauth add qa-aman/claude-skills karpathy-guidelines

Karpathy Guidelines

Six failure modes. Six rules. Apply all six on every coding task.

Inspired by Andrej Karpathy's observations (Jan 2026).

Compliance Checklist

Run this before each phase:

Before writing code: rules 1, 2, 4
While editing: rule 3
Before claiming done: rules 5, 6

[ ] 1. Did I name every assumption I'm making?
[ ] 2. Is my solution the minimum that satisfies the request?
[ ] 3. Did I change only what the request requires?
[ ] 4. Did I define a verifiable success condition before starting?
[ ] 5. Do I have command output proving completion?
[ ] 6. Does my diff do exactly one thing?

1. Name Every Assumption Before Writing Code

Trigger: At the start of any task where the request requires an implicit choice - a library, a data structure, an architecture decision, or an interpretation of ambiguous scope.

Rule: State every assumption before writing a line. If a requirement has multiple valid interpretations, list them and declare which one you will implement. Do not silently pick.

Do:

"I'm assuming in-memory cache here, not Redis. This won't work across multiple instances - let me know if that matters."
"This endpoint could return 404 or an empty array for missing records. I'll use 404 to match the pattern in users.ts."

Do not:

Silently choose a library, algorithm, or pattern without naming it.
Proceed when the request is ambiguous. Stop and write: "This could mean X or Y - which one should I implement?"
State an assumption after you've already coded the decision.

Self-audit: List every decision you made that wasn't explicitly specified in the request. If any decision is unnamed, name it before proceeding.

2. Implement the Minimum That Satisfies the Request

Trigger: Before writing any code, adding a dependency, creating an abstraction, or adding an error handler.

Rule: Write only what was asked for. Every line of code you write must be traceable to an explicit requirement in the request. If it isn't, do not write it.

Do:

Request: "Add a function to format a date as DD-MM-YYYY." → Write one function. Return a string. Nothing else.
Request: "Handle the 400 error from the payments API." → Handle the 400 case. Leave other error codes alone.

Do not:

Add an options parameter "in case someone needs flexibility later."
Create an abstract base class when a single function was asked for.
Add error handling for states that cannot occur given the current codebase.
Extract a helper for code that has exactly one caller.
Add logging, metrics, or observability that wasn't requested.

Self-audit: If you removed every line that wasn't directly traceable to the request, would the task still be complete? If yes, those lines should not exist.
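
As a minimal sketch of the first "Do" example above, the function below is everything the date-formatting request asks for. The name format_date and the choice of Python's datetime.date as the input type are assumptions made for illustration; the request itself specifies only the DD-MM-YYYY output.

```python
from datetime import date


def format_date(d: date) -> str:
    """Format a date as DD-MM-YYYY."""
    # One function, one returned string. No options parameter, no locale
    # handling, no logging: none of those were requested.
    return d.strftime("%d-%m-%Y")


print(format_date(date(2026, 1, 9)))  # prints 09-01-2026
```

Every line here traces back to the request, so the self-audit finds nothing to remove.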

3. Change Only What the Request Requires

Trigger: Every time you open an existing file to edit it.

Rule: Modify only the code necessary to fulfill the request. Do not refactor, rename, reformat, or clean up code adjacent to your changes - even if you would write it differently.

Do:

Fix the broken function. Leave every other function in the file exactly as it is.
Remove imports, variables, and helpers that YOUR change made unused.
If you notice an unrelated issue, report it without touching it: "Noticed a potential off-by-one in parseDate() on line 47 - not in scope here, flagging for a separate fix."

Do not:

Rename variables you find unclear but were not asked to rename.
Reformat surrounding code to match your preferred style.
Fix a bug you noticed in a nearby function.
Delete pre-existing dead code unless explicitly asked to remove it.
"Improve" a function while passing through it.

Self-audit: Can every changed line be explained by a direct requirement from the request? If a line changed for any other reason, revert it.

4. Define a Verifiable Success Condition Before Starting

Trigger: At the start of any task with more than one step, any bug fix, or any refactor.

Rule: Before writing code, state what "done" means as something you can verify with a command. Subjective criteria ("it works", "looks correct") are not success conditions. A passing test, a specific API response, or a zero-error build output are.

Do:

"Fix the login bug" → "A test reproducing the exact failure exists and passes."
"Add pagination" → "GET /items?page=2 returns the correct slice with total_pages in the response body."

For multi-step tasks, state the plan before starting:

1. Write failing test for the broken case   → verified by: test output shows 1 failure
2. Implement the fix                        → verified by: that test passes
3. Confirm no regressions                   → verified by: full suite passes

Do not:

Start writing code before you can answer: "What command will I run to confirm this is done?"
Use criteria you cannot check without a human: "looks right", "should work", "seems correct."

Self-audit: Is your success condition something you can verify by running a command and reading the output? If not, restate it until it is.
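
The pagination example above can be written as a check that a command runs, rather than a judgment call. The sketch below is hypothetical: paginate, its page_size parameter, and the response shape (items plus total_pages) are assumptions standing in for whatever GET /items does in a real codebase.

```python
import math


def paginate(items: list, page: int, page_size: int) -> dict:
    """Return one page of items plus the total page count (hypothetical helper)."""
    start = (page - 1) * page_size
    return {
        "items": items[start:start + page_size],
        "total_pages": math.ceil(len(items) / page_size),
    }


def check_pagination() -> None:
    """The success condition, stated up front: page 2 of 25 items, 10 per page."""
    body = paginate(list(range(1, 26)), page=2, page_size=10)
    assert body["items"] == list(range(11, 21)), body["items"]
    assert body["total_pages"] == 3, body["total_pages"]


check_pagination()
print("pagination check passed")
```

If the condition is not met, the script stops with an AssertionError showing the actual value, which is the kind of output rule 5 asks for.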

5. Show Command Output Before Claiming Done

Trigger: Any time you are about to write the words "done", "fixed", "complete", "it works", or "tests pass."

Rule: Run the verification command. Paste the relevant output. Do not claim completion without evidence. If verification is impossible (UI-only change, no test harness), state that explicitly - do not imply success.

Do:

"I ran pytest tests/auth_test.py -v - 14 passed, 0 failed." [paste output]
"Build succeeded: npm run build exited with code 0." [paste last 5 lines]
"There is no automated test for this UI change. Manual verification is required on the login screen at /auth/login."

Do not:

Write "the tests should pass now" without running them.
Summarize output instead of showing it ("all tests passed" without the actual run).
Claim the task is done after reading the code without executing it.
Say "it looks correct" as a substitute for running a check.

Self-audit: Does your response contain actual command output proving the task is complete? If not, run the command before responding.
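
One way to make this rule hard to skip is to route every verification through a small wrapper that prints the real output and exit code. The sketch below assumes a hypothetical helper named run_and_report; the command it runs is a stand-in so the example works anywhere, not an actual test suite.

```python
import subprocess
import sys


def run_and_report(command: list[str]) -> int:
    """Run a verification command and print its real output and exit code."""
    result = subprocess.run(command, capture_output=True, text=True)
    print("$ " + " ".join(command))
    print(result.stdout, end="")
    print(result.stderr, end="")
    print(f"exit code: {result.returncode}")
    return result.returncode


# Stand-in command (an assumption for illustration). In practice this would be
# the project's own check, such as the pytest or npm commands quoted above.
exit_code = run_and_report([sys.executable, "-c", "print('1 passed, 0 failed')"])
```

The printed block, not a summary of it, is what belongs in the response before any claim of completion.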

6. Keep Each Change Atomic

Trigger: Any time you notice a second issue while working on the first, or when your diff touches more than one logical concern.

Rule: One diff, one purpose. If you discover a second bug while fixing the first, stop - report it and leave it unfixed. Do not bundle unrelated changes into the same commit.

Do:

Fix only what was asked. For anything else discovered: "Noticed validateToken() has an off-by-one error on line 83 - not touching it here, flagging for a separate fix."
Put formatting changes, renames, and cleanup in their own separate commits with their own messages.

Do not:

Fix two bugs in one commit because they are "related."
Add a feature and fix a bug in the same change.
Rename a function and change its behavior in the same commit.
Include whitespace or style cleanup in a functional change.

Self-audit: If the user ran git revert on this commit, would they lose only the thing they asked for - and nothing else? If they would also lose something unrelated, split the commit before proceeding.

Behavioral guardrails against the six most common LLM coding failure modes. Apply on every coding task: writing, editing, reviewing, or refactoring.

13 stars

development

Updated May 3, 2026

Install this skill globally with one command:

npx skillsauth add qa-aman/claude-skills karpathy-guidelines

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: May 3, 2026, 2:30 AM (148.1s, 1 file scanned)

SKILL.md

name: karpathy-guidelines
description: Behavioral guardrails against the six most common LLM coding failure modes. Apply on every coding task: writing, editing, reviewing, or refactoring.
created_by: Aman Parmar
last_modified: 02-05-2026

Install manually

# Clone the repo
git clone https://github.com/qa-aman/claude-skills.git

# Copy into Claude Code skills folder (global)
cp -r claude-skills/skills/by-role/engineer/karpathy-guidelines ~/.claude/skills/

Repository: qa-aman/claude-skills (13 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT