Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

scitix/cluster-events

Name: cluster-events
Author: scitix

skills/core/cluster-events/SKILL.md

npx skillsauth add scitix/siclaw cluster-events

Cluster Events Analysis

Use this flow to analyze cluster-wide events and identify issues, patterns, and correlations across resources.

Scope: This skill is for analysis and diagnosis only. It helps you understand what is happening across the cluster by examining events. Do NOT attempt to fix issues directly — identify root causes and either use a specific diagnostic skill or report findings to the user.

Diagnostic Flow

1. Get recent events

Get all events sorted by time, focusing on Warning events:

kubectl get events -A --sort-by='.lastTimestamp' --field-selector type=Warning

If you need all event types for context:

kubectl get events -A --sort-by='.lastTimestamp'

For events in a specific namespace:

kubectl get events -n <ns> --sort-by='.lastTimestamp'

2. Identify high-frequency events

Look for events with high COUNT values — these indicate repeated occurrences and often point to persistent issues.

For a structured view:

kubectl get events -A --field-selector type=Warning -o custom-columns='LAST SEEN:.lastTimestamp,COUNT:.count,KIND:.involvedObject.kind,NAME:.involvedObject.name,NAMESPACE:.involvedObject.namespace,REASON:.reason,MESSAGE:.message'
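The high-frequency check above can be sketched as a small aggregation over the event list. This is a minimal sketch, not part of the skill itself: the function name `sum_by_reason` is hypothetical, and it assumes the input is two whitespace-separated columns (count, then reason), for example from `kubectl get events -A --field-selector type=Warning --no-headers -o custom-columns='COUNT:.count,REASON:.reason'`. Treating a non-numeric count such as `<none>` as a single occurrence is also an assumption.

```shell
# Hypothetical helper: sum event counts per reason.
# Reads "COUNT REASON" lines on stdin; prints "TOTAL REASON", highest total first.
sum_by_reason() {
  awk 'NF >= 2 {
         # Assumption: a non-numeric count (e.g. "<none>") means one occurrence.
         n = ($1 ~ /^[0-9]+$/) ? $1 : 1
         total[$2] += n
       }
       END { for (r in total) print total[r], r }' |
    sort -k1,1nr -k2,2
}

# Usage (requires cluster access, so it is left as a comment):
# kubectl get events -A --field-selector type=Warning --no-headers \
#   -o custom-columns='COUNT:.count,REASON:.reason' | sum_by_reason
```

Reasons at the top of the output are the first candidates for the pattern matching in step 4.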

3. Correlate events by resource

When you find Warning events, check if the same resource has related events that tell a more complete story:

kubectl get events -n <ns> --field-selector involvedObject.name=<resource-name>
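To correlate across many resources at once, the per-resource lookup above can be approximated by grouping reasons per involved object. This is a sketch under stated assumptions: `reasons_by_object` is a hypothetical name, and the input is assumed to be four whitespace-separated columns (namespace, kind, name, reason), for example from `kubectl get events -A --no-headers -o custom-columns='NS:.involvedObject.namespace,KIND:.involvedObject.kind,NAME:.involvedObject.name,REASON:.reason'`.

```shell
# Hypothetical helper: list the distinct event reasons seen for each object.
# Reads "NAMESPACE KIND NAME REASON" lines on stdin;
# prints "NAMESPACE/KIND/NAME REASON1,REASON2,..." sorted by object.
reasons_by_object() {
  awk 'NF >= 4 {
         key = $1 "/" $2 "/" $3
         if (!seen[key, $4]++) {            # record each reason once per object
           if (key in list) list[key] = list[key] "," $4
           else             list[key] = $4
         }
       }
       END { for (k in list) print k, list[k] }' |
    sort
}

# Usage (requires cluster access, so it is left as a comment):
# kubectl get events -A --no-headers \
#   -o custom-columns='NS:.involvedObject.namespace,KIND:.involvedObject.kind,NAME:.involvedObject.name,REASON:.reason' \
#   | reasons_by_object
```

An object that shows several reasons together (for example `BackOff` and `Unhealthy`) is worth examining as one story rather than as separate issues.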

4. Match event patterns and recommend next steps

Match the Warning events against the patterns below. For each matched pattern, recommend the appropriate diagnostic skill or action.

`FailedScheduling` — Pod cannot be scheduled

The scheduler cannot place a pod on any node.

Next step: Use the pod-pending-debug skill to diagnose the specific pod. If the pod has a scheduling.volcano.sh/pod-group annotation (that is, it is managed by the Volcano scheduler), use the volcano-diagnose-pod skill instead to cover Volcano-specific issues (PodGroup, Queue, gang scheduling).

`BackOff` / `Back-off restarting failed container` — Container crash loop

A container is repeatedly crashing and restarting.

Next step: Use the pod-crash-debug skill to diagnose the specific pod.

`Failed` / `ErrImagePull` / `ImagePullBackOff` — Image pull failure

The container image cannot be pulled.

Next step: Use the image-pull-debug skill to diagnose the specific pod.

`FailedMount` / `FailedAttachVolume` — Volume mount failure

A volume (PVC, ConfigMap, Secret, or other) cannot be mounted.

Check the specific error message:

- "not found": the referenced ConfigMap/Secret/PVC does not exist
- "already attached": the volume is stuck on another node (common with RWO PVs)
- "timed out waiting": the storage provisioner is slow or failing
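The three message checks above amount to a substring match on the event message. The following is a minimal sketch; `classify_mount_error` is a hypothetical name, and matching on these exact substrings is an assumption, since the full wording of real messages varies by volume plugin.

```shell
# Hypothetical helper: map a FailedMount/FailedAttachVolume message to a likely cause.
# Takes the event message as its single argument.
classify_mount_error() {
  case "$1" in
    *"not found"*)         echo "referenced ConfigMap/Secret/PVC does not exist" ;;
    *"already attached"*)  echo "volume is stuck on another node" ;;
    *"timed out waiting"*) echo "storage provisioner is slow or failing" ;;
    *)                     echo "unrecognized message; read the full event text" ;;
  esac
}

# Usage with an illustrative (made-up) message:
# classify_mount_error 'MountVolume.SetUp failed: configmap "app-config" not found'
```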

`Unhealthy` — Probe failure

A liveness, readiness, or startup probe is failing.

Check which probe is failing from the event message:

- "Liveness probe failed": the container will be restarted, which may lead to CrashLoopBackOff
- "Readiness probe failed": the container is removed from service endpoints but not restarted
- "Startup probe failed": the container is killed during startup

Advise the user to check probe configuration (endpoint, port, timing parameters).
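The probe distinction above can be sketched as a prefix match on the `Unhealthy` event message. `classify_probe_failure` is a hypothetical name, and the sketch assumes the message begins with the probe type as listed above.

```shell
# Hypothetical helper: report which probe failed and what that implies.
# Takes the Unhealthy event message as its single argument.
classify_probe_failure() {
  case "$1" in
    "Liveness probe failed"*)  echo "liveness: container will be restarted" ;;
    "Readiness probe failed"*) echo "readiness: removed from service endpoints, not restarted" ;;
    "Startup probe failed"*)   echo "startup: container is killed during startup" ;;
    *)                         echo "unknown probe message" ;;
  esac
}

# Usage with an illustrative message:
# classify_probe_failure 'Readiness probe failed: connection refused'
```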

`NodeNotReady` — Node became unhealthy

A node transitioned to NotReady state, which may affect all pods on that node.

Next step: Use the node-health-check skill to diagnose the specific node.

`Evicted` — Pod was evicted

A pod was evicted from a node, typically due to resource pressure (DiskPressure, MemoryPressure).

Check which node evicted the pod and investigate node health:

kubectl get pod <pod> -n <ns> -o jsonpath='{.spec.nodeName} {.status.reason} {.status.message}'

`FailedCreate` — Controller cannot create pods

A ReplicaSet, Job, or other controller cannot create pods. Common causes: resource quota exceeded, admission webhook rejection.

Check the controller's events (shown here for a ReplicaSet):

kubectl describe rs <replicaset> -n <ns>

`OOMKilling` — Kernel OOM killer invoked

The kernel killed a process due to memory exhaustion. This may affect containers on the node.

Next step: Use the pod-crash-debug skill for the affected pod, or node-health-check for the node.
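The pattern table above can be summarized as a lookup from event reason to recommended next step. This is a sketch only: `next_step` is a hypothetical name, the skill names are those given in the sections above, and the fallback wording is an assumption based on the Scope note.

```shell
# Hypothetical helper: map a Warning event reason to the next step described above.
# Takes the event reason as its single argument.
next_step() {
  case "$1" in
    FailedScheduling)                     echo "pod-pending-debug (volcano-diagnose-pod if the pod is managed by Volcano)" ;;
    BackOff)                              echo "pod-crash-debug" ;;
    Failed|ErrImagePull|ImagePullBackOff) echo "image-pull-debug" ;;
    FailedMount|FailedAttachVolume)       echo "check the mount error message" ;;
    Unhealthy)                            echo "check probe configuration" ;;
    NodeNotReady)                         echo "node-health-check" ;;
    Evicted)                              echo "check the evicting node for resource pressure" ;;
    FailedCreate)                         echo "check the controller's events" ;;
    OOMKilling)                           echo "pod-crash-debug or node-health-check" ;;
    *)                                    echo "no matching pattern; report findings to the user" ;;
  esac
}

# Usage:
# next_step ImagePullBackOff
```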

Notes

- Kubernetes events have a default TTL of 1 hour. For older events, check monitoring/logging systems.
- Events with count > 1 record only the first and last timestamps, so bursts within that window are not visible and the actual frequency may be higher than it appears.
- When multiple Warning events appear simultaneously across different resources, look for a common cause (e.g., a node going down affects all pods on that node).
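Because a repeated event exposes only its first timestamp, last timestamp, and count, the most that can be derived is an average interval between occurrences. The sketch below shows that arithmetic; `avg_interval_seconds` is a hypothetical name, and it assumes both timestamps have already been converted to Unix epoch seconds (the conversion is left to the caller).

```shell
# Hypothetical helper: average seconds between occurrences of a repeated event.
# Arguments: first timestamp (epoch seconds), last timestamp (epoch seconds), count.
avg_interval_seconds() {
  first=$1
  last=$2
  count=$3
  if [ "$count" -le 1 ]; then
    echo "n/a"          # a single occurrence has no interval
    return 0
  fi
  # count occurrences span (count - 1) gaps; integer division rounds down.
  echo $(( (last - first) / (count - 1) ))
}

# Usage: an event seen 4 times over a 600-second window.
# avg_interval_seconds 1000 1600 4
```

This is only an average: the occurrences may have been clustered in a short burst, which is why the note above warns that the real frequency can be higher.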

Analyze cluster-wide Kubernetes events to identify issues and patterns. Aggregates Warning events, detects high-frequency patterns, and correlates related events.

88 stars

devops

Updated Apr 12, 2026

Install globally with skillsauth:

npx skillsauth add scitix/siclaw cluster-events

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy (container and dependency vulnerability scanner): Clean, 95%
- Semgrep (static code analysis for vulnerabilities): Clean, 95%
- mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
- Snyk (dep) (open source security scanning): Skipped, 50%
- Socket.dev (supply chain security analysis): Skipped, 50%
- VirusTotal (multi-engine malware detection): Skipped, 50%
- CrowdStrike (advanced threat intelligence): Skipped, 50%
- OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
- OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 12, 2026, 11:44 PM (51.6s, 1 file scanned)

Related Skills

scitix/skill-authoring (development; Verified, Trusted, Community)
Guide for writing and improving Siclaw skills. Read this when creating or modifying a skill. Covers skill directory layout, SKILL.md format, script execution modes, and best practices.
207, SKILL.md, updated Apr 23, 2026

scitix/manage-skill (development; Verified, Trusted, Community)
Guides the user to the Siclaw Web page to manage Skills. Use this guide when the user requests to create, edit, or view a Skill in a Channel conversation.
88, SKILL.md, updated Apr 12, 2026

scitix/volcano-scheduler-logs (development; Verified, Trusted, Community)
Retrieve and analyze Volcano scheduler logs. Filter by keyword, time range, or pod name to debug scheduling decisions.
88, SKILL.md, updated Apr 12, 2026

scitix/volcano-scheduler-config (tools; Verified, Trusted, Community)
View Volcano scheduler configuration. Check scheduler ConfigMap, actions, plugins, and tier settings.
88, SKILL.md, updated Apr 12, 2026

Install manually

# Clone the repo
git clone https://github.com/scitix/siclaw.git

# Create the global Claude Code skills folder if it does not exist, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r siclaw/skills/core/cluster-events ~/.claude/skills/

Repository: scitix/siclaw (88 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT
