clean-code-style/SKILL.md
Behavioral guidelines to reduce common LLM coding mistakes. Use when writing, reviewing, or refactoring code to avoid overcomplication, make surgical changes, surface assumptions, and define verifiable success criteria. Derived from Andrej Karpathy's observations on LLM coding pitfalls. Applies universally regardless of language or domain.
Behavioral guidelines to reduce common LLM coding mistakes, derived from Andrej Karpathy's observations on LLM coding pitfalls.
Tradeoff: These guidelines bias toward caution over speed. For trivial tasks, use judgment.
Don't assume. Don't hide confusion. Surface tradeoffs.
Before implementing:
User: "Add a feature to export user data"
Bad — silently assumes scope, format, fields, and file location:
    def export_users(format='json'):
        users = User.query.all()
        with open('users.json', 'w') as f:
            json.dump([u.to_dict() for u in users], f)
Good — surfaces assumptions before writing code:
Before implementing, I need to clarify:
1. Scope: Export all users or filtered subset? (privacy implications)
2. Format: Download file? Background job? API endpoint returning data?
3. Fields: Which user fields? (some might be sensitive)
4. Volume: How many users typically? (affects approach)
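Once those questions are answered, the answers can become explicit parameters instead of hidden defaults. A minimal sketch, assuming the answers were "a caller-supplied set of users", "JSON written to a caller-supplied stream", and "an explicit field allow-list". The function name `export_users`, its parameters, and the dict-shaped user records are hypothetical and not part of the original request:

```python
import io
import json
from typing import Iterable, Mapping, Sequence, TextIO


def export_users(
    users: Iterable[Mapping[str, object]],
    fields: Sequence[str],
    out: TextIO,
) -> int:
    """Write the selected fields of each user to `out` as a JSON array.

    Hypothetical sketch: scope (which users), fields (which keys), and
    destination (which stream) are decided by the caller, not assumed here.
    Returns the number of users written.
    """
    rows = [{name: user[name] for name in fields} for user in users]
    json.dump(rows, out)
    return len(rows)


buffer = io.StringIO()
count = export_users(
    [{'id': 1, 'email': 'a@example.com', 'password_hash': 'x'}],
    fields=['id', 'email'],  # sensitive fields are excluded by omission
    out=buffer,
)
```

Nothing in the signature guesses a file path or a field list, so each open question from the list above has a visible answer at the call site.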
Minimum code that solves the problem. Nothing speculative.
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
User: "Add a function to calculate discount"
Bad — strategy pattern, config dataclass, 50+ lines for one calculation:
    class DiscountStrategy(ABC):
        @abstractmethod
        def calculate(self, amount: float) -> float: ...

    class PercentageDiscount(DiscountStrategy): ...
    class FixedDiscount(DiscountStrategy): ...

    @dataclass
    class DiscountConfig:
        strategy: DiscountStrategy
        min_purchase: float = 0.0
        max_discount: float = float('inf')
Good — one function, done:
    def calculate_discount(amount: float, percent: float) -> float:
        return amount * (percent / 100)
Add complexity later when you actually need multiple discount types.
Touch only what you must. Clean up only your own mess.
When editing existing code:
When your changes create orphans:
The test: every changed line should trace directly to the user's request.
User: "Fix the bug where empty emails crash the validator"
Bad — adds docstring, improves email validation, adds username validation nobody asked for:
      def validate_user(user_data):
    -     # Check email format
    -     if not user_data.get('email'):
    +     """Validate user data."""
    +     email = user_data.get('email', '').strip()
    +     if not email:
              raise ValueError("Email required")
    -     if '@' not in user_data['email']:
    +     if '@' not in email or '.' not in email.split('@')[1]:
              raise ValueError("Invalid email")
    +     username = user_data.get('username', '').strip()
    +     if len(username) < 3:
    +         raise ValueError("Username too short")
Good — only fixes the reported bug:
      def validate_user(user_data):
          # Check email format
    -     if not user_data.get('email'):
    +     email = user_data.get('email', '')
    +     if not email or not email.strip():
              raise ValueError("Email required")
    -     if '@' not in user_data['email']:
    +     if '@' not in email:
              raise ValueError("Invalid email")
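Applying the minimal diff gives the function below, and a few assertions confirm that the narrow fix covers the reported cases. The test inputs are illustrative assumptions about what "empty email" can mean (missing key, `None`, empty string, whitespace only), and `rejection_reason` is a hypothetical helper added only to keep the checks short:

```python
def validate_user(user_data):
    # Check email format
    email = user_data.get('email', '')
    if not email or not email.strip():
        raise ValueError("Email required")
    if '@' not in email:
        raise ValueError("Invalid email")


def rejection_reason(user_data):
    """Hypothetical helper: return the error message, or None if valid."""
    try:
        validate_user(user_data)
    except ValueError as exc:
        return str(exc)
    return None


# Illustrative inputs covering the ways an email can be "empty".
for empty in ({}, {'email': None}, {'email': ''}, {'email': '   '}):
    assert rejection_reason(empty) == 'Email required'

assert rejection_reason({'email': 'no-at-sign'}) == 'Invalid email'
assert rejection_reason({'email': 'a@example.com'}) is None
```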
User: "Add logging to the upload function"
Bad — changes quote style, adds type hints, adds docstring, reformats whitespace:
    - def upload_file(file_path, destination):
    + def upload_file(file_path: str, destination: str) -> bool:
    +     """Upload file to destination with logging."""
Good — adds logging, matches existing single-quote style, touches nothing else:
    + import logging
    + logger = logging.getLogger(__name__)
    +
      def upload_file(file_path, destination):
    +     logger.info(f'Starting upload: {file_path}')
          try:
Define success criteria. Loop until verified.
Transform tasks into verifiable goals:
For multi-step tasks, state a brief plan:
1. [Step] -> verify: [check]
2. [Step] -> verify: [check]
3. [Step] -> verify: [check]
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
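The plan format above maps onto data: each step pairs an action with a check that must pass before moving on. A minimal sketch of that loop, assuming hypothetical names (`Step`, `run_plan`) and toy steps invented for illustration. It is one way to picture the idea, not a prescribed tool:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    description: str
    action: Callable[[], None]  # does the work
    check: Callable[[], bool]   # returns True when the step is verified


def run_plan(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first step whose check fails."""
    log = []
    for number, step in enumerate(steps, start=1):
        step.action()
        if not step.check():
            log.append(f'{number}. {step.description} -> FAILED')
            break
        log.append(f'{number}. {step.description} -> verified')
    return log


# Toy example: build a list, verifying after each stage.
state = []
plan = [
    Step('add item', lambda: state.append(1), lambda: state == [1]),
    Step('add second item', lambda: state.append(2), lambda: len(state) == 2),
]
result_log = run_plan(plan)
```

Because every step carries its own check, a failure points at one step rather than at the task as a whole.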
User: "The sorting breaks when there are duplicate scores"
Bad — immediately changes sort logic without confirming the bug exists.
Good — reproduce first, then fix:
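    # 1. Write a test that reproduces the issue
    def test_sort_with_duplicate_scores():
        scores = [
            {'name': 'Bob', 'score': 100},
            {'name': 'Alice', 'score': 100},
            {'name': 'Charlie', 'score': 90},
        ]
        result = sort_scores(scores)
        assert result[0]['score'] == 100
        assert result[1]['score'] == 100
        assert result[2]['score'] == 90
        # Tied entries must come back in a fixed order, whatever the input order
        assert [r['name'] for r in result] == ['Alice', 'Bob', 'Charlie']
    # Verify: test fails while tied entries come back in an input-dependent order

    # 2. Fix with a deterministic tie-breaker (sorted() is already stable,
    #    so without a secondary key ties simply keep their input order)
    def sort_scores(scores):
        return sorted(scores, key=lambda x: (-x['score'], x['name']))
    # Verify: test passes consistently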
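One detail worth checking when reproducing this kind of bug: Python's built-in `sorted` is stable, so tied entries keep their input order, and the order of equal scores therefore changes whenever the input order changes. A small sketch (the helper names `sort_by_score_only`, `sort_with_tiebreak`, and `names` are hypothetical) showing that a secondary key makes the result independent of input order:

```python
def sort_by_score_only(scores):
    # Ties keep their input order because sorted() is stable.
    return sorted(scores, key=lambda x: -x['score'])


def sort_with_tiebreak(scores):
    # Ties are broken by name, so the result no longer depends on input order.
    return sorted(scores, key=lambda x: (-x['score'], x['name']))


def names(rows):
    return [row['name'] for row in rows]


alice = {'name': 'Alice', 'score': 100}
bob = {'name': 'Bob', 'score': 100}

assert names(sort_by_score_only([alice, bob])) == ['Alice', 'Bob']
assert names(sort_by_score_only([bob, alice])) == ['Bob', 'Alice']  # follows input
assert names(sort_with_tiebreak([bob, alice])) == ['Alice', 'Bob']  # fixed order
assert names(sort_with_tiebreak([alice, bob])) == ['Alice', 'Bob']
```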
| Principle | Anti-Pattern | Fix |
|-----------|-------------|-----|
| Think Before Coding | Silently assumes format, fields, scope | List assumptions, ask for clarification |
| Simplicity First | Strategy pattern for one calculation | One function until complexity is needed |
| Surgical Changes | Reformats quotes, adds type hints during bug fix | Only change lines that fix the reported issue |
| Goal-Driven | "I'll review and improve the code" | "Write test for bug X -> make it pass -> verify no regressions" |
The overcomplicated examples aren't obviously wrong — they follow design patterns and best practices. The problem is timing: they add complexity before it's needed.
Good code solves today's problem simply, not tomorrow's problem prematurely.
These guidelines are working if: fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.