santosomar/code-optimizer

Name: code-optimizer
Author: santosomar

skills/code-quality/code-optimizer/SKILL.md

npx skillsauth add santosomar/general-secure-coding-agent-skills code-optimizer

Code Optimizer

Making code fast is a measurement discipline, not a coding style. The first rule: you don't know where the time goes until you measure. The second rule: you're usually wrong about where you think it goes.

Step 0 — Do you actually need to optimize?

| Question                                | If no → stop                                   |
| --------------------------------------- | ---------------------------------------------- |
| Is there a concrete, measured slowness? | "It feels slow" is not a measurement           |
| Is the slow path on a hot path?         | A 10s function called once at startup is fine  |
| Is there a target? ("under 100ms p99")  | Without a target, you don't know when to stop  |
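A target such as "under 100ms p99" only helps if it is checked against recorded latencies. The sketch below shows one way to do that with a nearest-rank percentile; the function names and the sample latencies are made up for illustration, and a real check would read measurements from production metrics or a load test.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def needs_optimizing(latencies_ms, target_ms, pct=99):
    """Return (observed percentile, whether it exceeds the target)."""
    observed = percentile(latencies_ms, pct)
    return observed, observed > target_ms

# Hypothetical measurements: 98 fast requests and 2 slow ones.
latencies_ms = [20.0] * 98 + [250.0, 400.0]
observed, over = needs_optimizing(latencies_ms, target_ms=100.0)
print(f"p99 = {observed:.0f} ms, over target: {over}")
```

If the observed percentile is already under the target, the table above says to stop.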

Step 1 — Profile. Always. First.

| What's slow             | Tool                                              |
| ----------------------- | ------------------------------------------------- |
| CPU-bound Python        | py-spy, cProfile + snakeviz                       |
| CPU-bound JVM           | async-profiler, JFR                               |
| CPU-bound native        | perf, Instruments, vtune                          |
| Memory pressure / GC    | Heap profiler (tracemalloc, jmap, heaptrack)      |
| I/O-bound (DB, network) | Query logs, EXPLAIN ANALYZE, trace spans          |
| Unclear                 | Flame graph first; it'll tell you which category  |

Profile the real workload, not a toy. Micro-benchmarks lie.
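For CPU-bound Python, a minimal sketch of the cProfile route from the table might look like this. `workload` and `slow_sum_of_squares` are placeholders invented for the example; in practice the function passed to `profile_top` should be the real entry point running on real inputs.

```python
import cProfile
import io
import pstats

def slow_sum_of_squares(n):
    """Deliberately wasteful helper so it shows up in the profile."""
    return sum([i * i for i in range(n)])

def workload():
    """Stand-in for the real workload; replace with the actual entry point."""
    return [slow_sum_of_squares(20_000) for _ in range(50)]

def profile_top(func, limit=5):
    """Run func under cProfile and return the top entries by cumulative time as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()

report = profile_top(workload)
print(report)
```

Sorting by cumulative time puts the functions that account for the most total time, including their callees, at the top of the listing.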

Step 2 — Pick the lever

Optimizations, ranked by typical payoff-to-effort:

| Lever                        | When it applies                                                       | Typical speedup | Effort  |
| ---------------------------- | --------------------------------------------------------------------- | --------------- | ------- |
| Do less work                 | You're computing things nobody uses                                   | 10–100×         | Low     |
| Fix the algorithm            | O(n²) where O(n log n) exists; nested loops over the same collection  | 10–1000×        | Medium  |
| Cache / memoize              | Same expensive call, same inputs, repeatedly                          | 2–100×          | Low     |
| Batch                        | N round-trips to a service → 1 round-trip                             | N×              | Medium  |
| Move out of the loop         | Invariant computation inside a loop                                   | iterations×     | Trivial |
| Use the right data structure | list where you need set lookup; linear scan where you need index      | 2–1000×         | Low     |
| Parallelize                  | Embarrassingly parallel work on a multi-core box                      | cores×          | High    |
| Go native / use SIMD         | Tight numeric loop in an interpreted language                         | 10–100×         | High    |
| Micro-optimize               | Unroll, inline, avoid allocations                                     | 1.1–2×          | High    |

Start at the top. Micro-optimization is the last resort, not the first instinct.
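As an illustration of one row, here is a small sketch of the "use the right data structure" lever: the same membership filter written against a list and against a set. The function names and input sizes are invented for the example, and the timings it prints depend on the machine, so treat them as a local measurement rather than a promised speedup.

```python
import timeit

def find_known_slow(candidates, known_list):
    """Linear scan of known_list for every candidate: O(len(candidates) * len(known_list))."""
    return [c for c in candidates if c in known_list]

def find_known_fast(candidates, known_list):
    """Build a set once; each membership test is then O(1) on average."""
    known_set = set(known_list)
    return [c for c in candidates if c in known_set]

known_ids = list(range(0, 4_000, 2))   # hypothetical IDs already seen
candidates = list(range(2_000))        # hypothetical IDs to check

# Same inputs, same outputs: the change must not alter behavior.
assert find_known_slow(candidates, known_ids) == find_known_fast(candidates, known_ids)

slow_s = timeit.timeit(lambda: find_known_slow(candidates, known_ids), number=3)
fast_s = timeit.timeit(lambda: find_known_fast(candidates, known_ids), number=3)
print(f"list scan: {slow_s:.4f}s, set lookup: {fast_s:.4f}s")
```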

Step 3 — Change one thing, measure again

Benchmark before → one change → benchmark after → record the delta. Every time. If you make three changes and it's faster, you don't know which one did it — and one of them probably made it slower.
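A minimal sketch of that loop, assuming wall-clock time is the metric: `bench` and `record_delta` are hypothetical helpers, and the two string-building functions are only stand-ins for "before" and "after" versions of one change. Whether the second version is actually faster on a given interpreter is exactly what the measurement is for.

```python
import statistics
import time

def bench(func, repeats=7):
    """Return the median wall-clock seconds over `repeats` calls to func."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)

def record_delta(label, before_s, after_s):
    """Format one benchmark record; a ratio below 1.00x means the change made things slower."""
    speedup = before_s / after_s
    return f"{label}: {before_s:.4f}s -> {after_s:.4f}s ({speedup:.2f}x)"

def build_with_concat(n=20_000):
    """'Before' stand-in: repeated string concatenation."""
    out = ""
    for i in range(n):
        out += str(i)
    return out

def build_with_join(n=20_000):
    """'After' stand-in: a single join."""
    return "".join(str(i) for i in range(n))

# Check behavior first, then measure exactly one change.
assert build_with_concat() == build_with_join()
before = bench(build_with_concat)
after = bench(build_with_join)
print(record_delta("join instead of +=", before, after))
```

The median is used here because a single run is easily skewed by unrelated activity on the machine.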

Worked example

Complaint: "Exporting the report takes 40 seconds."

Profile (py-spy top):

 84%  _lookup_user_name   (report.py:67)
 11%  _format_row         (report.py:80)
  3%  csv.writer.writerow

84% in one function. Look at it:

def _lookup_user_name(user_id):
    return db.query("SELECT name FROM users WHERE id = ?", user_id).one()

def export(rows):
    for row in rows:                        # 10,000 rows
        row.user_name = _lookup_user_name(row.user_id)
        writer.writerow(_format_row(row))

Diagnosis: N+1 query. 10,000 rows → 10,000 round-trips. Lever: batch.

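def export(rows):
    # Collect each distinct user_id once; duplicates would only repeat work.
    user_ids = {row.user_id for row in rows}
    # One round-trip for all names. db.query is this example's placeholder API;
    # many real drivers need one "?" per ID rather than a single list parameter.
    names = dict(db.query("SELECT id, name FROM users WHERE id IN ?", list(user_ids)))
    for row in rows:
        row.user_name = names[row.user_id]
        writer.writerow(_format_row(row))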

Measure: 40s → 0.6s, a 67× speedup, with one query instead of 10,000. No parallelism, no C extension, no micro-optimization. Just the batch lever: 10,000 round-trips became one.

→ behavior-preservation-checker — the IN query with a set dedupes user_ids; make sure that's equivalent (it is — we're populating a dict, dupes were redundant anyway).
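The `db.query` calls in the worked example are shorthand. As a runnable sketch of the same batching idea, the version below uses Python's built-in `sqlite3` with an in-memory table; the schema, the helper names, and the chunk size are assumptions made for the illustration, not part of the skill. It also performs the equivalence check mentioned above by comparing the batched result with the one-query-per-ID result.

```python
import sqlite3

def make_db():
    """In-memory stand-in for the users table in the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(i, f"user{i}") for i in range(1, 1001)])
    return conn

def names_one_by_one(conn, user_ids):
    """N+1 shape: one query per ID (assumes every ID exists)."""
    result = {}
    for uid in user_ids:
        row = conn.execute("SELECT name FROM users WHERE id = ?", (uid,)).fetchone()
        result[uid] = row[0]
    return result

def names_batched(conn, user_ids, chunk_size=500):
    """Batched shape: one query per chunk of distinct IDs."""
    distinct = sorted(set(user_ids))
    result = {}
    for start in range(0, len(distinct), chunk_size):
        chunk = distinct[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))  # sqlite3 needs one "?" per value
        query = f"SELECT id, name FROM users WHERE id IN ({placeholders})"
        result.update(conn.execute(query, chunk).fetchall())
    return result

conn = make_db()
row_user_ids = [1, 2, 2, 3, 1000, 3]  # duplicates, as in a real report
assert names_batched(conn, row_user_ids) == names_one_by_one(conn, row_user_ids)
```

Chunking keeps each statement under the database's limit on bound parameters; the value 500 is a conservative guess rather than a tuned number.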

Common traps

Optimizing the wrong thing. You made _format_row 2× faster. It was 11% of runtime. Total speedup: 1.06×. The profiler told you to look at _lookup_user_name.
Micro-benchmark lies. Your loop is 3× faster in isolation. In production, it's memory-bandwidth-bound and the "optimization" does nothing. Benchmark the real path.
Caching without eviction. Memoization sped up the hot path; three days later you're OOM because the cache never forgets.
Premature parallelism. Threading added 20% overhead and the GIL means you got zero speedup. Profile says you're CPU-bound in Python → multiprocessing, not threading.
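The 1.06× figure in the first trap follows from Amdahl's law: the overall speedup is 1 / ((1 − p) + p / s), where p is the fraction of runtime affected and s is the local speedup. A small sketch (the function name is made up for this example):

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-program speedup when `fraction` of runtime gets `local_speedup` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

# Doubling the speed of a function that is 11% of runtime.
print(f"{overall_speedup(0.11, 2.0):.2f}x")  # 1.06x
# The same 2x improvement applied to a function that is 84% of runtime.
print(f"{overall_speedup(0.84, 2.0):.2f}x")  # 1.72x
```

The same local improvement is worth far more when it lands on the function that dominates the profile.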

Do not

Do not optimize without profiling. Your intuition is wrong. Everyone's is.
Do not optimize before the code is correct. A fast wrong answer is worthless.
Do not change the algorithm and micro-optimize in the same pass. You won't know which helped.
Do not leave the benchmark out of the PR. "It's faster" is a claim; the before/after numbers are the evidence.
Do not sacrifice readability for a 5% gain in cold code. 5% of nothing is nothing.

Output format

## Baseline
<metric> = <value>  (measured with: <tool/command>)

## Bottleneck
<file>:<line>  — <N>% of runtime
<why it's slow — the diagnosis>

## Change
<lever from table> — <one-sentence what>
<diff>

## Result
<metric> = <value>  (<N>× speedup)

## Behavior check
<→ behavior-preservation-checker, or: tests green>

Optimizes code for performance by identifying the actual bottleneck, choosing the right optimization lever, and measuring the result. Use when a specific operation is too slow, when a profiler has pointed at a hot path, or when the user asks to make something faster.

development

Updated Apr 13, 2026

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add santosomar/general-secure-coding-agent-skills code-optimizer

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner         | Description                                    | Status  | Reported % |
| --------------- | ---------------------------------------------- | ------- | ---------- |
| Trivy           | Container and dependency vulnerability scanner | Clean   | 95%        |
| Semgrep         | Static code analysis for vulnerabilities       | Clean   | 95%        |
| mcp-scan (Snyk) | Model Context Protocol security validation     | Clean   | 95%        |
| Snyk (dep)      | Open source security scanning                  | Skipped | 50%        |
| Socket.dev      | Supply chain security analysis                 | Skipped | 50%        |
| VirusTotal      | Multi-engine malware detection                 | Skipped | 50%        |
| CrowdStrike     | Advanced threat intelligence                   | Skipped | 50%        |
| OSV-Scanner     | Open Source Vulnerability database check       | Skipped | 50%        |
| OWASP Dep-Check |                                                | Skipped | 50%        |

Last scanned: Apr 20, 2026, 7:35 AM (0.4s, 1 file scanned)

SKILL.md

name: code-optimizer
description: Optimizes code for performance by identifying the actual bottleneck, choosing the right optimization lever, and measuring the result. Use when a specific operation is too slow, when a profiler has pointed at a hot path, or when the user asks to make something faster.
license: Apache-2.0
category: code-quality
suite: general-secure-coding-agent-skills
version: 0.3.0
related: behavior-preservation-checker, code-refactoring-assistant

Related Skills

santosomar/verified-pseudocode-extractor (development)

Extracts human-readable pseudocode from a verified formal artifact (Dafny, Lean, TLA+) while preserving the verified properties as annotations, so the proof-carrying logic can be reimplemented in a production language. Use when porting verified code to an unverified target, when documenting what a formal spec actually does, or when handing a verified algorithm to an implementer.

SKILL.md, updated Apr 13, 2026

santosomar/tlaplus-spec-generator (development)

Translates natural-language or pseudocode descriptions of concurrent and distributed systems into TLA+ specifications ready for the TLC model checker. Identifies state variables, actions, type invariants, safety properties, and liveness properties from the description. Use when formalizing a protocol, when the user describes a distributed algorithm to verify, when designing a consensus or locking scheme, or when starting formal verification of a concurrent system.

SKILL.md, updated Apr 13, 2026

santosomar/tlaplus-model-reduction (testing)

Reduces a TLA+ model so TLC can actually check it (by shrinking constants, adding state constraints, abstracting data, or applying symmetry) when the state space is too large to enumerate. Use when TLC runs out of memory, when checking takes hours, or when a spec works at N=2 and you need confidence at larger scale.

SKILL.md, updated Apr 13, 2026

santosomar/tlaplus-guided-code-repair (development)

TLA+-specific instance of model-guided repair: reads a TLC error trace, identifies the enabling condition that should have been false, strengthens the corresponding action, and maps the fix to source code. Use when TLC reports an invariant violation or deadlock and you have the code-to-TLA+ mapping from extraction.

SKILL.md, updated Apr 13, 2026

Install manually

# Clone the repo
git clone https://github.com/santosomar/general-secure-coding-agent-skills.git

# Make sure the global Claude Code skills folder exists, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r general-secure-coding-agent-skills/skills/code-quality/code-optimizer ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

santosomar/general-secure-coding-agent-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT