Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

santosomar/rollback-strategy-advisor

Name: rollback-strategy-advisor
Author: santosomar

skills/devops/rollback-strategy-advisor/SKILL.md

npx skillsauth add santosomar/general-secure-coding-agent-skills rollback-strategy-advisor

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Rollback Strategy Advisor

"Just deploy the old version" only works if the new version didn't change anything the old version depends on. It usually did.

The reversibility question

Before choosing a strategy, classify what the bad deploy changed:

| Change type | Reversible by redeploying old code? | Why / why not | | --------------------------------- | ----------------------------------- | ------------------------------------------------------ | | Stateless code only | ✅ Yes | Old code runs on old state; no state changed | | Additive schema (new column, new table) | ✅ Yes | Old code ignores the new column | | Destructive schema (drop column, rename) | ❌ No | Old code expects the column that's gone | | Additive data (new rows) | ⚠️ Usually | Unless the new rows confuse old code's queries | | Mutated data (UPDATE existing rows) | ❌ No | Old code expects old data shape; you need data repair | | New external side effects (emails sent, payments made) | ❌ Never | Can't unsend. Compensating actions only. | | Config only | ✅ Yes | Revert the config | | Feature flag flip | ✅ Instantly | Flip it back — this is why flags exist |

Decision tree

Is there a feature flag gating the bad behavior?
  ├─ YES → Kill the flag. Done in seconds. Investigate at leisure.
  └─ NO →
     Did the deploy change data/schema?
       ├─ NO → Redeploy previous artifact. Done.
       └─ YES →
          Was the change additive-only?
            ├─ YES → Redeploy previous artifact. Clean up schema later.
            └─ NO →
               Can you roll FORWARD (fix is small, well-understood)?
                 ├─ YES → Roll forward. Faster than untangling.
                 └─ NO → You're in data repair. See below.

Data repair (the hard case)

When old code can't run on new data:

Stop the bleeding. Feature flag, maintenance page, or traffic drain — whatever stops more data from being mutated.
Snapshot. Before you touch anything. pg_dump / volume snapshot. You will want this when your repair script has a bug.
Assess scope. How many rows? SELECT count(*) WHERE <mutated-condition>. 10 rows is a manual fix. 10 million is a migration.
Repair or compensate:
- Reversible mutation → write the inverse UPDATE
- Irreversible mutation → restore affected rows from snapshot/backup
- External side effects → compensating action (refund, apology email, manual ticket)
Then redeploy old code.

Design-time advice — make rollbacks boring

If you're not mid-incident, the best advice is to make the next deploy reversible:

| Technique | Makes rollback trivial when | | ---------------------------- | -------------------------------------------------------- | | Feature flags | You're shipping a behavior change — gate it | | Expand-contract migrations | You're changing schema — add new alongside old, migrate, remove old in a later deploy | | Dual-write period | You're changing data format — write both formats until new code is stable | | Immutable artifacts by SHA | You're deploying — image:abc123 can always be re-deployed; image:latest can't | | Backward-compatible APIs | You're changing an interface — new version reads old format |

Worked example

Situation: Deployed v2.4.0 at 14:00. At 14:20, error rate spikes. v2.4.0 added a NOT NULL column users.tenant_id with a default, and the new code reads it.

Reversibility check: Schema change was additive (new column with default) → old code should ignore it. ✅ Reversible.

Wait — check the migration. It ran ALTER TABLE users ADD COLUMN tenant_id ... NOT NULL DEFAULT 1. But old code does INSERT INTO users (...) without tenant_id. Does NOT NULL DEFAULT allow that? Yes — the default fires. ✅ Still reversible.

Action: Redeploy v2.3.9. kubectl rollout undo deployment/app. The column stays; old code ignores it. Clean up never — the column is fine, the bug is elsewhere in v2.4.0.

Post-mortem note: If the migration had been NOT NULL without a default, old code's INSERTs would fail. That would have been a non-reversible schema change masquerading as additive.

Do not

Do not roll back before you understand what changed. Rolling back into a broken state is worse than being broken in a known state.
Do not roll forward under pressure with an untested fix. Rolling forward is for when you know the fix; it's not a license to deploy a guess.
Do not skip the snapshot before data repair. The repair will have a bug. It always does.
Do not DROP the new column/table during rollback. Leave it. It's harmless, and the next deploy attempt will need it.
Do not design a rollback strategy during an incident. Design it when you design the deploy. → cd-pipeline-generator should include the undo path.

Output format

Incident mode:

## Reversibility
<change type> → <reversible: yes/no/partial>

## Recommended action
<flag kill | redeploy | roll forward | data repair>

## Steps
1. ...

## If this doesn't work
<next fallback>

Design mode:

## This deploy's reversibility class
<from the table>

## To make it cheaply reversible
<specific technique: flag / expand-contract / dual-write>

## Rollback command (pre-written — paste during incident)
<exact command>

santosomar/rollback-strategy-advisor

skills/devops/rollback-strategy-advisor/SKILL.md

Advises on rollback strategies by analyzing what a deploy changes — recommending revert, roll-forward, feature-flag kill, or data repair depending on reversibility. Use during an incident when a deploy went bad, when designing a deploy pipeline and the user asks how to make it reversible, or when a migration needs an undo plan.

testing

Updated Apr 13, 2026

$ install --global

skillsauth

npx skillsauth add santosomar/general-secure-coding-agent-skills rollback-strategy-advisor

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 13, 2026, 4:20 AM70.5s1 file scanned

SKILL.md

name:: rollback-strategy-advisor
description:: Advises on rollback strategies by analyzing what a deploy changes — recommending revert, roll-forward, feature-flag kill, or data repair depending on reversibility. Use during an incident when a deploy went bad, when designing a deploy pipeline and the user asks how to make it reversible, or when a migration needs an undo plan.
license:: Apache-2.0
category:: devops
suite:: general-secure-coding-agent-skills
version:: 0.3.0
related:: cd-pipeline-generator, config-consistency-checker

Rollback Strategy Advisor

"Just deploy the old version" only works if the new version didn't change anything the old version depends on. It usually did.

The reversibility question

Before choosing a strategy, classify what the bad deploy changed:

Decision tree

Is there a feature flag gating the bad behavior?
  ├─ YES → Kill the flag. Done in seconds. Investigate at leisure.
  └─ NO →
     Did the deploy change data/schema?
       ├─ NO → Redeploy previous artifact. Done.
       └─ YES →
          Was the change additive-only?
            ├─ YES → Redeploy previous artifact. Clean up schema later.
            └─ NO →
               Can you roll FORWARD (fix is small, well-understood)?
                 ├─ YES → Roll forward. Faster than untangling.
                 └─ NO → You're in data repair. See below.

Data repair (the hard case)

When old code can't run on new data:

Stop the bleeding. Feature flag, maintenance page, or traffic drain — whatever stops more data from being mutated.
Snapshot. Before you touch anything. pg_dump / volume snapshot. You will want this when your repair script has a bug.
Assess scope. How many rows? SELECT count(*) WHERE <mutated-condition>. 10 rows is a manual fix. 10 million is a migration.
Repair or compensate:
- Reversible mutation → write the inverse UPDATE
- Irreversible mutation → restore affected rows from snapshot/backup
- External side effects → compensating action (refund, apology email, manual ticket)
Then redeploy old code.

Design-time advice — make rollbacks boring

If you're not mid-incident, the best advice is to make the next deploy reversible:

Worked example

Situation: Deployed v2.4.0 at 14:00. At 14:20, error rate spikes. v2.4.0 added a NOT NULL column users.tenant_id with a default, and the new code reads it.

Reversibility check: Schema change was additive (new column with default) → old code should ignore it. ✅ Reversible.

Action: Redeploy v2.3.9. kubectl rollout undo deployment/app. The column stays; old code ignores it. Clean up never — the column is fine, the bug is elsewhere in v2.4.0.

Post-mortem note: If the migration had been NOT NULL without a default, old code's INSERTs would fail. That would have been a non-reversible schema change masquerading as additive.

Do not

Do not roll back before you understand what changed. Rolling back into a broken state is worse than being broken in a known state.
Do not roll forward under pressure with an untested fix. Rolling forward is for when you know the fix; it's not a license to deploy a guess.
Do not skip the snapshot before data repair. The repair will have a bug. It always does.
Do not DROP the new column/table during rollback. Leave it. It's harmless, and the next deploy attempt will need it.
Do not design a rollback strategy during an incident. Design it when you design the deploy. → cd-pipeline-generator should include the undo path.

Output format

Incident mode:

## Reversibility
<change type> → <reversible: yes/no/partial>

## Recommended action
<flag kill | redeploy | roll forward | data repair>

## Steps
1. ...

## If this doesn't work
<next fallback>

Design mode:

## This deploy's reversibility class
<from the table>

## To make it cheaply reversible
<specific technique: flag / expand-contract / dual-write>

## Rollback command (pre-written — paste during incident)
<exact command>

Related Skills

santosomar/verified-pseudocode-extractor

development

VerifiedTrustedCommunity

Extracts human-readable pseudocode from a verified formal artifact (Dafny, Lean, TLA+) while preserving the verified properties as annotations, so the proof-carrying logic can be reimplemented in a production language. Use when porting verified code to an unverified target, when documenting what a formal spec actually does, or when handing a verified algorithm to an implementer.

SKILL.mdUpdated Apr 13, 2026

santosomar/verified-pseudocode-extractor

santosomar/tlaplus-spec-generator

development

VerifiedTrustedCommunity

Translates natural-language or pseudocode descriptions of concurrent and distributed systems into TLA+ specifications ready for the TLC model checker. Identifies state variables, actions, type invariants, safety properties, and liveness properties from the description. Use when formalizing a protocol, when the user describes a distributed algorithm to verify, when designing a consensus or locking scheme, or when starting formal verification of a concurrent system.

SKILL.mdUpdated Apr 13, 2026

santosomar/tlaplus-spec-generator

santosomar/tlaplus-model-reduction

testing

VerifiedTrustedCommunity

Reduces a TLA+ model so TLC can actually check it — shrinks constants, adds state constraints, abstracts data, or applies symmetry — when the state space is too large to enumerate. Use when TLC runs out of memory, when checking takes hours, or when a spec works at N=2 and you need confidence at larger scale.

SKILL.mdUpdated Apr 13, 2026

santosomar/tlaplus-model-reduction

santosomar/tlaplus-guided-code-repair

development

VerifiedTrustedCommunity

TLA+-specific instance of model-guided repair — reads a TLC error trace, identifies the enabling condition that should have been false, strengthens the corresponding action, and maps the fix to source code. Use when TLC reports an invariant violation or deadlock and you have the code-to-TLA+ mapping from extraction.

SKILL.mdUpdated Apr 13, 2026

santosomar/tlaplus-guided-code-repair

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/santosomar/general-secure-coding-agent-skills.git

# Copy into Claude Code skills folder (global)
cp -r general-secure-coding-agent-skills/skills/devops/rollback-strategy-advisor ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

santosomar/general-secure-coding-agent-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT