apecloud/kubeblocks-rebuild-replica

Name: kubeblocks-rebuild-replica
Author: apecloud

skills/kubeblocks-rebuild-replica/SKILL.md

npx skillsauth add apecloud/kubeblocks-skills kubeblocks-rebuild-replica

Rebuild Failed Replica

Overview

A replica rebuild recovers a failed secondary instance by recreating its data from the primary or from a backup. Use this when:

- The replica pod is in CrashLoopBackOff or otherwise unrecoverable
- Data on the replica is corrupted (storage/volume issues)
- Replication lag is unrecoverable or the replication slot is corrupted
- The replica cannot rejoin the replication group

Supported engines: MySQL (ApeCloud MySQL) and PostgreSQL only, that is, engines with primary-secondary replication.

Official docs: MySQL | PostgreSQL

Workflow

- [ ] Step 1: Identify the failed replica
- [ ] Step 2: Choose rebuild source (from primary vs from backup)
- [ ] Step 3: Apply RebuildInstance OpsRequest (dry-run then apply)
- [ ] Step 4: Monitor and verify

Step 1: Identify the Failed Replica

Check pod status and roles:

kubectl get pods -n <ns> -l app.kubernetes.io/instance=<cluster> \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\t"}{.metadata.labels.kubeblocks\.io/role}{"\n"}{end}'

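Identify the pod that is in CrashLoopBackOff or Error, or that has the secondary role but is unhealthy. CrashLoopBackOff and Error are container states rather than pod phases, so they appear in the STATUS column of plain kubectl get pods output, not in .status.phase. Note the component name (e.g. mysql, postgresql) from the Cluster spec.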
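The phase column can read Running while a container is crash-looping, so it helps to print the container waiting reason as well. The following is a minimal sketch, not part of the original skill: the helper names pod_states and unhealthy_pods are hypothetical, and treating a missing role label as unhealthy is an assumption.

```shell
# Hypothetical helper: print name, phase, role, and container waiting reason
# (tab-separated) for every pod of a cluster.
pod_states() {
  ns=$1; cluster=$2
  kubectl get pods -n "$ns" -l "app.kubernetes.io/instance=$cluster" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\t"}{.metadata.labels.kubeblocks\.io/role}{"\t"}{.status.containerStatuses[*].state.waiting.reason}{"\n"}{end}'
}

# Hypothetical filter: read the rows from stdin and print the names of pods
# that are not Running, have no role label (assumed to mean unhealthy),
# or have a container stuck in a waiting state such as CrashLoopBackOff.
unhealthy_pods() {
  awk -F'\t' '$2 != "Running" || $3 == "" || $4 != "" { print $1 }'
}

# Usage (namespace and cluster name are placeholders):
#   pod_states demo mycluster | unhealthy_pods
```

Keeping the filter separate from the kubectl call makes it possible to check the filter against saved output before pointing it at a live cluster.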

Step 2: Choose Rebuild Source

| Source | When to use |
|--------|-------------|
| From primary | Primary is healthy; fastest option. Omit backupName. |
| From backup | Primary unavailable or you need a specific point-in-time. Set backupName. |

List backups (if rebuilding from backup):

kubectl get backup -n <ns> -l app.kubernetes.io/instance=<cluster>
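Only a finished backup is usable as a rebuild source, so it can help to filter the list by phase. A minimal sketch, assuming the Backup resource reports its state in .status.phase with the value Completed (as the troubleshooting notes suggest); the helper name completed_backups is hypothetical.

```shell
# Hypothetical helper: print the names of backups whose phase is Completed.
completed_backups() {
  ns=$1; cluster=$2
  kubectl get backup -n "$ns" -l "app.kubernetes.io/instance=$cluster" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' |
    awk -F'\t' '$2 == "Completed" { print $1 }'
}

# Usage (namespace and cluster name are placeholders):
#   completed_backups demo mycluster
```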

Step 3: Apply RebuildInstance OpsRequest

Rebuild from Primary

apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: rebuild-<cluster>-<pod>
  namespace: <ns>
spec:
  clusterName: <cluster>
  type: RebuildInstance
  rebuildFrom:
    - componentName: <component>
      instances:
        - name: <failed-pod-name>

Rebuild from Backup

apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: rebuild-<cluster>-<pod>
  namespace: <ns>
spec:
  clusterName: <cluster>
  type: RebuildInstance
  rebuildFrom:
    - componentName: <component>
      backupName: <backup-name>
      instances:
        - name: <failed-pod-name>

Optional fields: set inPlace: true to keep the same pod name and recreate its PVC; omit it or set it to false for a non-in-place rebuild, which creates a new pod and then removes the old one. Add force: true if preconditions block the operation.
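The commands that follow read a file named rebuild-ops.yaml. One way to produce it is sketched below; this is an illustration rather than part of the original skill. The helper name render_rebuild_ops is hypothetical, and placing inPlace next to componentName inside the rebuildFrom entry is an assumption that should be confirmed against the installed CRD (for example with kubectl explain opsrequest.spec.rebuildFrom). The force field is left out of the sketch.

```shell
# Hypothetical helper: print a RebuildInstance OpsRequest manifest.
# Arguments: namespace, cluster, component, pod, optional backup name,
# optional in-place flag ("true" or "false", default "false").
render_rebuild_ops() {
  ns=$1; cluster=$2; component=$3; pod=$4; backup=${5:-}; in_place=${6:-false}
  cat <<EOF
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: rebuild-$cluster-$pod
  namespace: $ns
spec:
  clusterName: $cluster
  type: RebuildInstance
  rebuildFrom:
    - componentName: $component
EOF
  # backupName is emitted only when rebuilding from a backup.
  if [ -n "$backup" ]; then
    printf '      backupName: %s\n' "$backup"
  fi
  # Assumption: inPlace belongs to the rebuildFrom entry, beside componentName.
  if [ "$in_place" = "true" ]; then
    printf '      inPlace: true\n'
  fi
  cat <<EOF
      instances:
        - name: $pod
EOF
}

# Usage (all values are placeholders):
#   render_rebuild_ops demo mycluster mysql mycluster-mysql-1 > rebuild-ops.yaml
```

Generating the file from arguments keeps the primary and backup variants in one place; passing a fifth argument switches the source to a backup.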

Dry-run first:

kubectl apply -f rebuild-ops.yaml --dry-run=server

If dry-run succeeds, apply:

kubectl apply -f rebuild-ops.yaml
kubectl get ops rebuild-<cluster>-<pod> -n <ns> -w

Success condition: .status.phase is Succeed. A rebuild typically takes 5–15 minutes. If it is stuck for more than 20 minutes, run kubectl describe ops <name> -n <ns>.

Status progresses: Pending → Running → Succeed
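For unattended runs, the watch can be replaced by a polling loop that stops at a terminal phase or after the 20-minute mark. A minimal sketch: the helper name wait_for_ops is hypothetical, and the failure phases Failed, Cancelled, and Aborted are an assumption about the OpsRequest API that may differ between KubeBlocks versions.

```shell
# Hypothetical helper: poll an OpsRequest until it reaches a terminal phase.
# Arguments: name, namespace, optional timeout in seconds (default 1200),
# optional poll interval in seconds (default 10).
# Returns 0 on Succeed, 1 on a failure phase, 2 on timeout.
wait_for_ops() {
  name=$1; ns=$2; timeout=${3:-1200}; interval=${4:-10}
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    phase=$(kubectl get ops "$name" -n "$ns" -o jsonpath='{.status.phase}')
    case $phase in
      Succeed)
        echo "OpsRequest $name succeeded"
        return 0 ;;
      # Assumption: these are the terminal failure phases.
      Failed|Cancelled|Aborted)
        echo "OpsRequest $name ended in phase $phase" >&2
        return 1 ;;
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "OpsRequest $name still in phase ${phase:-unknown} after ${timeout}s" >&2
  return 2
}

# Usage (name and namespace are placeholders):
#   wait_for_ops rebuild-mycluster-mycluster-mysql-1 demo
```

On a timeout or failure, kubectl describe ops remains the next step for reading events.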

Step 4: Verify

Confirm the replica pod is Running and has the secondary role:

kubectl get pods -n <ns> -l app.kubernetes.io/instance=<cluster> \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\t"}{.metadata.labels.kubeblocks\.io/role}{"\n"}{end}'

Verify replication:

# MySQL
kubectl exec -it <replica-pod> -n <ns> -- mysql -u root -p<password> -e "SHOW REPLICA STATUS\G"

# PostgreSQL
kubectl exec -it <primary-pod> -n <ns> -- psql -U postgres -c "SELECT * FROM pg_stat_replication;"
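The MySQL output is long, and the two fields that matter most are the replication thread states. A minimal sketch of a check over that output, assuming the engine reports the standard MySQL 8.0.22+ field names Replica_IO_Running and Replica_SQL_Running; the helper name replica_threads_ok is hypothetical.

```shell
# Hypothetical helper: read "SHOW REPLICA STATUS\G" output on stdin and
# exit 0 only if both replication threads report "Yes".
replica_threads_ok() {
  awk -F': *' '
    $1 ~ /^ *Replica_IO_Running$/  { io = $2 }
    $1 ~ /^ *Replica_SQL_Running$/ { sql = $2 }
    END { exit !(io == "Yes" && sql == "Yes") }
  '
}

# Usage: pipe the output of the MySQL command above into the helper, e.g.
#   ... -e "SHOW REPLICA STATUS\G" | replica_threads_ok && echo "replication OK"
```

Empty output (no replication configured) makes the check fail, which is the conservative result after a rebuild.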

Troubleshooting

OpsRequest fails or stays Pending:

- Ensure the cluster is Running and no other OpsRequest is in progress
- For backup source: verify backupName exists and is Completed
- Check kubectl describe ops <name> -n <ns> for events
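To check the first point, the OpsRequests for the cluster can be listed and filtered down to those that have not finished. A minimal sketch: the helper name active_ops is hypothetical, and the set of terminal phases (Succeed, Failed, Cancelled, Aborted) is an assumption to verify against your KubeBlocks version.

```shell
# Hypothetical helper: print OpsRequests of a cluster that are not in a
# terminal phase, as tab-separated name, cluster, type, and phase.
active_ops() {
  ns=$1; cluster=$2
  kubectl get ops -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.clusterName}{"\t"}{.spec.type}{"\t"}{.status.phase}{"\n"}{end}' |
    awk -F'\t' -v c="$cluster" '$2 == c && $4 !~ /^(Succeed|Failed|Cancelled|Aborted)$/'
}

# Usage (namespace and cluster name are placeholders):
#   active_ops demo mycluster
```

An empty result suggests that nothing else is competing with the rebuild.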

Replica still unhealthy after rebuild:

- Inspect pod logs: kubectl logs <pod> -n <ns> --tail=100
- Verify the primary is healthy and reachable from the replica

Non-in-place: pod name changed:

Expected: old pod is replaced by a new one (e.g. mysql-0 → mysql-2). The cluster keeps the same replica count.

Additional Reference

For general agent safety conventions (dry-run, status confirmation, production protection), see safety-patterns.md.

Rebuild a failed replica in MySQL or PostgreSQL clusters managed by KubeBlocks. Use when a replica's data is corrupted, the pod is in CrashLoopBackOff, replication is broken, or you need to recover or repair a secondary instance. NOT for planned switchover (see switchover) or full cluster restore (see restore).

2 stars

development

Updated Apr 18, 2026

npx skillsauth add apecloud/kubeblocks-skills kubeblocks-rebuild-replica

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage shown |
|---------|-------------|--------|------------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 18, 2026, 8:00 AM | Duration: 15.8s | 1 file scanned

SKILL.md

name: kubeblocks-rebuild-replica
version: 0.1.0
description: Rebuild a failed replica in MySQL or PostgreSQL clusters managed by KubeBlocks. Use when a replica's data is corrupted, the pod is in CrashLoopBackOff, replication is broken, or you need to recover or repair a secondary instance. NOT for planned switchover (see switchover) or full cluster restore (see restore).

Related Skills

apecloud/kubeblocks-volume-expansion

devops

Verified | Trusted | Community

Expand persistent volume storage for KubeBlocks database clusters via OpsRequest. Requires the StorageClass to support volume expansion (allowVolumeExpansion=true). Use when the user needs more disk space, wants to increase storage, expand volumes, or resize PVCs. NOT for changing CPU/memory (see vertical-scaling) or adding more replicas (see horizontal-scaling). Note that volume shrinking is not supported by Kubernetes.

2 stars | SKILL.md | Updated Apr 18, 2026

apecloud/kubeblocks-vertical-scaling

data-ai

Verified | Trusted | Community

Scale CPU and memory resources for KubeBlocks database clusters via OpsRequest (vertical scaling). Supports in-place updates when the feature gate is enabled. Use when the user wants to change, increase, decrease, resize, or adjust CPU or memory resources of a database cluster. NOT for adding/removing replicas or shards (see horizontal-scaling) or expanding disk storage (see volume-expansion).

2 stars | SKILL.md | Updated Apr 18, 2026

apecloud/kubeblocks-upgrade

data-ai

Verified | Trusted | Community

Upgrade the KubeBlocks operator itself via Helm. Covers update operator, upgrade to v1.0, update kubeblocks version, and CRD updates. Use when the user wants to upgrade KubeBlocks, update the operator, or upgrade to a new KubeBlocks release. NOT for upgrading database engine versions (see minor-version-upgrade).

2 stars | SKILL.md | Updated Apr 18, 2026

apecloud/kubeblocks-troubleshoot

development

Verified | Trusted | Community

Diagnostic guide for KubeBlocks-managed database clusters. Use when the user reports troubleshoot, debug, diagnose, not working, error, failed, stuck, CrashLoopBackOff, cluster exception, or similar problems with their database cluster. This skill guides the agent through diagnostic steps; it does NOT perform actions.

2 stars | SKILL.md | Updated Apr 18, 2026

Install manually

# Clone the repo
git clone https://github.com/apecloud/kubeblocks-skills.git

# Create the Claude Code skills folder (global) if it does not exist yet
mkdir -p ~/.claude/skills

# Copy the skill into it
cp -r kubeblocks-skills/skills/kubeblocks-rebuild-replica ~/.claude/skills/

Claude Code Skills: official skills path docs.

Repository

apecloud/kubeblocks-skills

2 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT