santosomar/legacy-code-summarizer

Name: legacy-code-summarizer
Author: santosomar

skills/code-analysis/legacy-code-summarizer/SKILL.md

npx skillsauth add santosomar/general-secure-coding-agent-skills legacy-code-summarizer

Legacy Code Summarizer

Legacy code has no docs, misleading names, and a commit history that says "fix". The summary you produce is archaeology: reconstructing intent from artifacts. Be honest about what's deduction vs. what's guesswork.

Differs from → code-summarizer: that one assumes the code is readable. This one assumes it isn't.

Evidence sources — in order of reliability

| Source | Reliability | What it tells you |
| --- | --- | --- |
| What the code does (dataflow) | High | Actual behavior. Can't lie. |
| How it's called (grep callers) | High | What callers pass in, what they do with the result |
| Tests (if any) | High | Someone wrote down an expectation |
| Variable/function names | Medium | Might be right. Might be copy-pasted from elsewhere. |
| Comments | Low | Often stale. Trust the code over the comment. |
| Commit messages | Low | "fix bug", "wip", "asdf" |

Step 1 — Trace the data, not the names

Start from inputs, follow to outputs. Ignore what things are called; watch what happens to them.

```c
int proc_rec(char *buf, int n) {       // name tells you nothing
    int k = 0;
    for (int i = 0; i < n; i++) {
        if (buf[i] == 0x1F) k++;        // 0x1F = ASCII unit separator
    }
    return k + 1;
}
```

Data trace: counts the 0x1F bytes in buf[0..n-1] and returns the count + 1. That's "count fields in a unit-separator-delimited record." The +1 is the fence-post adjustment: N separators delimit N+1 fields. So proc_rec, nominally "process record", actually means "count fields in record."
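A trace like this can be pinned down with a few assertions, so the [observed] claims rest on execution rather than on reading alone. The sketch below copies proc_rec as shown and adds a hypothetical helper, check_proc_rec_trace, which is not part of the original code; the sample inputs are made up for illustration.

```c
#include <assert.h>

/* Copied from the legacy source so the checks run against real behavior. */
int proc_rec(char *buf, int n) {
    int k = 0;
    for (int i = 0; i < n; i++) {
        if (buf[i] == 0x1F) k++;
    }
    return k + 1;
}

/* Hypothetical helper: each assert pins one claim from the data trace. */
int check_proc_rec_trace(void) {
    char two_fields[]   = {'a', 0x1F, 'b'};
    char three_fields[] = {'a', 0x1F, 0x1F, 'b'};  /* middle field is empty */
    char no_sep[]       = {'a', 'b', 'c'};

    assert(proc_rec(two_fields, 3) == 2);    /* 1 separator  -> 2 fields */
    assert(proc_rec(three_fields, 4) == 3);  /* 2 separators -> 3 fields */
    assert(proc_rec(no_sep, 3) == 1);        /* no separator -> 1 field  */
    assert(proc_rec(no_sep, 0) == 1);        /* empty input still reports 1 field */
    assert(proc_rec(two_fields, 1) == 1);    /* only the first n bytes are read */
    return 1;
}
```

The n == 0 case is worth recording in the summary: an empty record is reported as one field, which a caller may or may not expect.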

Step 2 — Look at callers

```c
int nf = proc_rec(line, len);
char **fields = malloc(nf * sizeof(char*));
split_on(line, len, 0x1F, fields);
```

The caller uses the result to size the allocation that split_on fills, with the same delimiter. This confirms that proc_rec pre-counts fields so the split can allocate exactly. The abstraction is "two-pass field split: count pass, then split pass."
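To make the inferred abstraction concrete, here is a minimal sketch of a two-pass split. The body of split_on is not shown in the source, so split_fields below is a hypothetical stand-in that assumes an in-place split; count_fields, split_record, and demo_split_record are likewise illustrative names. Unlike the caller above, this version checks the malloc result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Count pass: same logic as proc_rec, with the separator as a parameter. */
static int count_fields(const char *buf, int n, char sep) {
    int k = 0;
    for (int i = 0; i < n; i++) {
        if (buf[i] == sep) k++;
    }
    return k + 1;
}

/* Hypothetical stand-in for split_on (the real body is not in the source).
 * Overwrites each separator with NUL and records where each field starts.
 * Assumes buf[n] is already NUL so the last field is a valid string too. */
static void split_fields(char *buf, int n, char sep, char **fields) {
    int f = 0;
    fields[f++] = buf;
    for (int i = 0; i < n; i++) {
        if (buf[i] == sep) {
            buf[i] = '\0';
            fields[f++] = buf + i + 1;
        }
    }
}

/* Two-pass split: count first so the pointer array is allocated exactly once.
 * Returns NULL if allocation fails; the caller frees the returned array. */
char **split_record(char *buf, int n, char sep, int *nf_out) {
    int nf = count_fields(buf, n, sep);
    char **fields = malloc((size_t)nf * sizeof *fields);
    if (fields == NULL) return NULL;
    split_fields(buf, n, sep, fields);
    *nf_out = nf;
    return fields;
}

/* Self-check on a made-up record; returns the number of fields found. */
int demo_split_record(void) {
    char line[] = "id\x1F" "name\x1F" "42";
    int nf = 0;
    char **fields = split_record(line, (int)strlen(line), 0x1F, &nf);
    assert(fields != NULL);
    assert(strcmp(fields[0], "id") == 0);
    assert(strcmp(fields[1], "name") == 0);
    assert(strcmp(fields[2], "42") == 0);
    free(fields);
    return nf;
}
```

If the real split_on behaves differently (for example, copying fields instead of splitting in place), that difference belongs in the summary as [observed] once its body has been read.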

Step 3 — Mark confidence explicitly

Every claim in the summary gets a confidence tag:

- [observed]: this is what the code does, period.
- [inferred]: this is probably why, based on callers/structure.
- [guess]: name/comment suggests this, unverified.
- [unknown]: can't tell. Flag for a human who knows the history.
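One way to keep the tags from being dropped is to make them part of the data rather than free text. The sketch below is an assumption about how a tool might represent tagged claims; the type and function names (confidence, claim, confidence_label, count_needing_review, print_claims) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The four tags from the list above. UNKNOWN is 0 on purpose, so a
 * zero-initialized claim defaults to the least confident tag. */
typedef enum { UNKNOWN = 0, GUESS, INFERRED, OBSERVED } confidence;

/* Every claim carries its tag alongside its text. */
typedef struct {
    confidence tag;
    const char *text;
} claim;

/* Maps a tag to the bracketed label used in the summary. */
const char *confidence_label(confidence c) {
    switch (c) {
        case OBSERVED: return "[observed]";
        case INFERRED: return "[inferred]";
        case GUESS:    return "[guess]";
        case UNKNOWN:  return "[unknown]";
    }
    return "[unknown]";  /* out-of-range values get the least confident label */
}

/* Counts the claims a human with context should confirm. */
int count_needing_review(const claim *claims, int n) {
    assert(claims != NULL || n == 0);
    int k = 0;
    for (int i = 0; i < n; i++) {
        if (claims[i].tag == GUESS || claims[i].tag == UNKNOWN) k++;
    }
    return k;
}

/* Prints one summary line per claim, label first. */
void print_claims(const claim *claims, int n) {
    for (int i = 0; i < n; i++) {
        printf("%s %s\n", confidence_label(claims[i].tag), claims[i].text);
    }
}
```

Putting UNKNOWN at zero is a deliberate choice in this sketch: forgetting to set a tag then understates confidence instead of overstating it.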

Worked example output

Code: 400 lines of Perl, sub munge_data, no comments, written 2009.

Summary:

```markdown
## munge_data($ref, $mode)

### Behavior [observed]
- Walks a nested hashref `$ref` in-place.
- In mode 'C': lowercases all string leaf values, trims whitespace.
- In mode 'V': validates leaf values against regexes in the package-global
  %PATTERNS (keyed by parent hash key). Dies on mismatch.
- In any other mode: no-op (falls through the if/elsif chain).
- Mutates `$ref` directly. Returns nothing useful ($ref, but caller ignores it).

### Intent [inferred]
- 'C' = "canonicalize", 'V' = "validate" (based on the operations, not the letters).
- Called in sequence: `munge_data($d, 'C'); munge_data($d, 'V');` in 3 of 4 call
  sites. The 4th calls only 'V', so that input is apparently pre-canonicalized.
- %PATTERNS is populated from `config/field_rules.txt` at module load.

### Hazards [observed]
- Any mode other than 'C'/'V' silently does nothing, with no error. Typo in mode
  string = silent skip. [→ seen in git log: commit a3f891 "fix: was passing 'c'
  not 'C'"; this bit someone]
- Deep recursion, no depth limit. Cyclic `$ref` → infinite recursion.
- %PATTERNS global: if two threads load different configs, races.

### Unknowns
- [unknown] Why mode 'V' dies instead of returning an error list. Every caller
  wraps it in eval{}, which suggests dying was a mistake, worked around.
- [guess] The 4th call site (V-only, in `import_legacy.pl`) might be handling
  data that's already canonical from an upstream system. Or might be a bug.
```

Archaeology tools

| Question | Command |
| --- | --- |
| Who calls this? | `rg -w funcname` across the repo |
| When was this line last touched? | `git blame -w <file>` |
| What did it look like before? | `git log -p --follow -- <file>` |
| Who wrote it originally? | `git log --diff-filter=A -- <file>` (first commit adding) |
| Was there ever a comment here? | `git log -p -L <line>,<line>:<file>` |
| What else changed in the same commit? | `git show <sha>` (context for why) |

Do not

- Do not trust names over behavior. validate() might mutate. get_x() might have side effects. Read the body.
- Do not write a summary that sounds confident about guesses. "This function validates input" tagged [guess] is a very different claim from the same sentence tagged [observed]. Tag it.
- Do not summarize in isolation. The callers are half the story: they tell you which inputs actually occur and what's done with outputs.
- Do not clean up the code while summarizing. Understand first, change later. Changing what you don't understand is how legacy code gets worse.
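The first rule can also be checked mechanically when the body is hard to read: snapshot the argument, call the function, and compare. The sketch below is illustrative only; validate_name is a made-up legacy function and validate_name_mutates is a hypothetical probe, neither taken from a real codebase.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical legacy function: the name says "validate", the body also trims. */
int validate_name(char *s) {
    size_t len = strlen(s);
    while (len > 0 && s[len - 1] == ' ') {
        s[--len] = '\0';   /* side effect: trailing spaces removed in place */
    }
    return len > 0;        /* 1 if anything is left, 0 if empty */
}

/* Probe: returns 1 if validate_name changed its input, 0 if it left it alone.
 * Inputs of 64 bytes or more are not supported in this sketch. */
int validate_name_mutates(const char *input) {
    char before[64];
    char work[64];
    assert(strlen(input) < sizeof before);
    strcpy(before, input);
    strcpy(work, input);
    (void)validate_name(work);
    return strcmp(before, work) != 0;
}
```

A probe like this only shows mutation for the inputs tried, so a result of 0 supports an [inferred] "does not mutate" at best; a result of 1 is [observed].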

Output format

```markdown
## <function/module name>

### Behavior [observed]
<what it does: traced from code, no interpretation>

### Intent [inferred]
<why: based on callers, structure, git history>

### Hazards [observed]
<footguns, silent failures, global state, thread-unsafety>

### Unknowns
<[unknown] and [guess] items: what a human with context should confirm>

### Evidence
<callers found, git commits consulted, tests referenced>
```

Summarizes undocumented legacy code by inferring intent from structure, naming, data flow, and calling context — explicitly flagging what's inferred vs. what's certain. Use when onboarding to inherited code, when documentation is missing or wrong, or when deciding whether legacy code is safe to change.

development

Updated Apr 13, 2026

npx skillsauth add santosomar/general-secure-coding-agent-skills legacy-code-summarizer

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage shown |
| --- | --- | --- | --- |
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 13, 2026, 3:57 AM. 45.9s. 1 file scanned.

SKILL.md

name: legacy-code-summarizer
description: Summarizes undocumented legacy code by inferring intent from structure, naming, data flow, and calling context, explicitly flagging what's inferred vs. what's certain. Use when onboarding to inherited code, when documentation is missing or wrong, or when deciding whether legacy code is safe to change.
license: Apache-2.0
category: code-analysis
suite: general-secure-coding-agent-skills
version: 0.3.0
related: code-summarizer, code-comment-generator, pseudocode-extractor

Related Skills

- santosomar/verified-pseudocode-extractor (development; SKILL.md updated Apr 13, 2026): Extracts human-readable pseudocode from a verified formal artifact (Dafny, Lean, TLA+) while preserving the verified properties as annotations, so the proof-carrying logic can be reimplemented in a production language. Use when porting verified code to an unverified target, when documenting what a formal spec actually does, or when handing a verified algorithm to an implementer.
- santosomar/tlaplus-spec-generator (development; SKILL.md updated Apr 13, 2026): Translates natural-language or pseudocode descriptions of concurrent and distributed systems into TLA+ specifications ready for the TLC model checker. Identifies state variables, actions, type invariants, safety properties, and liveness properties from the description. Use when formalizing a protocol, when the user describes a distributed algorithm to verify, when designing a consensus or locking scheme, or when starting formal verification of a concurrent system.
- santosomar/tlaplus-model-reduction (testing; SKILL.md updated Apr 13, 2026): Reduces a TLA+ model so TLC can actually check it (shrinks constants, adds state constraints, abstracts data, or applies symmetry) when the state space is too large to enumerate. Use when TLC runs out of memory, when checking takes hours, or when a spec works at N=2 and you need confidence at larger scale.
- santosomar/tlaplus-guided-code-repair (development; SKILL.md updated Apr 13, 2026): TLA+-specific instance of model-guided repair: reads a TLC error trace, identifies the enabling condition that should have been false, strengthens the corresponding action, and maps the fix to source code. Use when TLC reports an invariant violation or deadlock and you have the code-to-TLA+ mapping from extraction.

Install manually

```bash
# Clone the repo
git clone https://github.com/santosomar/general-secure-coding-agent-skills.git

# Copy into Claude Code skills folder (global); create the folder if it does not exist yet
mkdir -p ~/.claude/skills
cp -r general-secure-coding-agent-skills/skills/code-analysis/legacy-code-summarizer ~/.claude/skills/
```

Claude Code Skills — official skills path docs.

Repository

santosomar/general-secure-coding-agent-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT