Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

scitix/hpa-debug

Name: hpa-debug
Author: scitix

skills/core/hpa-debug/SKILL.md

npx skillsauth add scitix/siclaw hpa-debug

HPA Autoscaling Failure Diagnosis

When a HorizontalPodAutoscaler is not scaling as expected — stuck at min/max replicas, showing <unknown> metrics, or not responding to load — follow this flow to identify the root cause.

Scope: This skill is for diagnosis only. Once you identify the root cause, report it to the user and stop. Do NOT attempt to modify HPA settings, resource requests, or metrics configuration — that should be left to the user.

Diagnostic Flow

1. Check HPA status

kubectl get hpa <hpa-name> -n <ns>

Note:

TARGETS — current value vs target (e.g., 80%/50%). If showing <unknown>/50%, metrics are unavailable.
MINPODS / MAXPODS — scaling bounds
REPLICAS — current replica count

2. Describe the HPA

kubectl describe hpa <hpa-name> -n <ns>

Focus on:

Conditions — look for AbleToScale, ScalingActive, and ScalingLimited with their status and reason
Events — scaling decisions, metric fetch errors, or rate-limiting messages
Metrics — each metric's current value, target, and type (resource, custom, external)
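
To look at the conditions alone, a jsonpath query can print one condition per line and a small filter can flag the ones that usually explain a stuck HPA. The sketch below is illustrative only: `flag_conditions` is a hypothetical helper name, and it assumes the tab-separated type, status, reason layout produced by the jsonpath expression shown in the trailing comment.

```shell
# Hypothetical helper: reads "type<TAB>status<TAB>reason" lines on stdin
# and labels each HPA condition as OK, PROBLEM, or LIMITED.
flag_conditions() {
  tab=$(printf '\t')
  while IFS="$tab" read -r ctype cstatus creason; do
    case "$ctype:$cstatus" in
      AbleToScale:False|ScalingActive:False)
        echo "PROBLEM $ctype=$cstatus ($creason)" ;;
      ScalingLimited:True)
        echo "LIMITED $ctype=$cstatus ($creason)" ;;
      *)
        echo "OK $ctype=$cstatus ($creason)" ;;
    esac
  done
}

# Example against a live cluster (same placeholders as the commands above):
# kubectl get hpa <hpa-name> -n <ns> \
#   -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\n"}{end}' \
#   | flag_conditions
```

Any PROBLEM or LIMITED line points to the matching pattern in step 5.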

3. Check metrics-server health

kubectl get pods -n kube-system -l k8s-app=metrics-server -o wide

If no pods are found, try:

kubectl get deployment -n kube-system metrics-server

Verify metrics-server is serving data:

kubectl top pods -n <ns>

If kubectl top returns "error: Metrics API not available", the Metrics API is not being served: metrics-server is missing, not running, or not yet ready.
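
A complementary check is to ask the API server whether the Metrics API is registered and available. metrics-server normally registers the APIService `v1beta1.metrics.k8s.io`. The function below is a sketch, not part of the skill: `apiservice_available` and the `KUBECTL` override variable are hypothetical names introduced here for illustration.

```shell
# KUBECTL can be overridden (for example with a stub) when trying this out.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: succeeds only if the named APIService reports Available=True.
apiservice_available() {
  avail=$($KUBECTL get apiservice "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null) || avail=""
  if [ "$avail" = "True" ]; then
    echo "$1: available"
  else
    echo "$1: not available (status: ${avail:-not found})"
    return 1
  fi
}

# Example: apiservice_available v1beta1.metrics.k8s.io
```

For HPAs that use custom or external metrics, the same function could be pointed at the adapter's APIService, assuming the adapter registers a name such as `v1beta1.custom.metrics.k8s.io` or `v1beta1.external.metrics.k8s.io`.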

4. Check target resource requests

An HPA with CPU or memory utilization (percentage) targets requires resources.requests to be set on the target containers.

kubectl get deployment <target-name> -n <ns> -o jsonpath='{range .spec.template.spec.containers[*]}{.name}: cpu={.resources.requests.cpu} memory={.resources.requests.memory}{"\n"}{end}'

If requests are not set, the HPA cannot calculate utilization percentages.
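
With several containers, empty requests are easy to miss in the raw output. As a rough sketch, the filter below reads the lines printed by the jsonpath query above and reports any container whose CPU or memory request is empty. `flag_missing_requests` is a hypothetical helper name, and it assumes container names contain no spaces (which Kubernetes naming rules already guarantee).

```shell
# Hypothetical helper: reads lines such as "app: cpu=100m memory=128Mi"
# and reports containers with an empty CPU or memory request.
flag_missing_requests() {
  while read -r cname cpu memory; do
    cname=${cname%:}   # drop the trailing colon after the container name
    if [ "$cpu" = "cpu=" ]; then
      echo "$cname: no CPU request"
    fi
    if [ "$memory" = "memory=" ]; then
      echo "$cname: no memory request"
    fi
  done
}

# Example with sample output from the query above:
printf 'app: cpu=100m memory=128Mi\nsidecar: cpu= memory=\n' | flag_missing_requests
```

No output means every container has both requests set.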

5. Match patterns and conclude

`<unknown>` metrics — Metrics not available

The HPA cannot fetch metrics for one or more targets.

Common causes:

metrics-server not installed or not running — check step 3
metrics-server not ready yet — it may take a few minutes after startup to collect data
No resource requests set — CPU/memory percentage targets require resources.requests on the target pods (step 4)
Custom metrics adapter missing — if the HPA uses custom or external metrics, the corresponding adapter (e.g., Prometheus adapter) must be installed

Advise the user to check metrics-server health and ensure resource requests are set.

`ScalingActive: False` — HPA cannot scale

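The HPA is not actively scaling: either scaling is disabled (for example, the target's replica count is 0) or the HPA cannot compute a desired replica count from its metrics.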

Check the reason in kubectl describe hpa:

FailedGetResourceMetric — cannot fetch resource metrics (metrics-server issue)
FailedGetExternalMetric — cannot fetch external metrics (adapter issue)
InvalidMetricSourceType — the metric source type is not recognized

`ScalingLimited: True` — At min or max replicas

The HPA wants to scale but is constrained by minReplicas or maxReplicas.

Check the reason:

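TooFewReplicas: the desired replica count is below minReplicas, so the HPA holds at minReplicas
TooManyReplicas: the desired replica count is above maxReplicas, so the HPA holds at maxReplicas
DesiredWithinRange: the desired replica count is within bounds (normal; reported with ScalingLimited: False)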

If the HPA is stuck at maxReplicas and load is still high, advise the user to increase maxReplicas or investigate why the application needs so many replicas (possible performance issue).
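
For context, the Kubernetes documentation gives the core calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue), with the result then clamped to minReplicas and maxReplicas. The sketch below reproduces only that arithmetic, to show when each ScalingLimited reason would appear. `desired_replicas` is a hypothetical helper, and it deliberately ignores the controller's tolerance, pod readiness handling, and behavior policies.

```shell
# Hypothetical helper.
# Arguments: current_replicas current_metric target_metric min_replicas max_replicas
# Prints the clamped replica count and the ScalingLimited reason that would apply.
desired_replicas() {
  awk -v r="$1" -v c="$2" -v t="$3" -v lo="$4" -v hi="$5" 'BEGIN {
    raw = r * c / t
    d = int(raw); if (d < raw) d++          # ceiling of the raw recommendation
    reason = "DesiredWithinRange"
    if (d > hi)      { d = hi; reason = "TooManyReplicas" }
    else if (d < lo) { d = lo; reason = "TooFewReplicas" }
    print d, reason
  }'
}

desired_replicas 4 80 50 2 10   # 4 pods at 80% against a 50% target
desired_replicas 8 90 50 2 10   # recommendation exceeds maxReplicas=10
```

The first call prints `7 DesiredWithinRange`; the second prints `10 TooManyReplicas`, which is the stuck-at-max case described above.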

HPA flapping (scaling up and down repeatedly) — Unstable metrics

The HPA keeps oscillating between replica counts.

Check the events for rapid scale-up/scale-down cycles. Common causes:

Metric target too close to normal usage — small load changes trigger scaling
Application slow to start — new pods take time to become effective, causing the HPA to scale up further before they help

The HPA has a default stabilization window (5 minutes for scale-down). Check if custom behavior is set:

kubectl get hpa <hpa-name> -n <ns> -o jsonpath='{.spec.behavior}'

Advise the user to adjust the stabilization window or the target utilization.
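
The scale-down stabilization window works by remembering recent recommendations and acting on the highest one seen within the window, so a brief dip in load does not remove pods immediately. A minimal sketch of that selection, assuming the recommendations from the window are supplied one per line (`stabilized_recommendation` is a hypothetical name):

```shell
# Hypothetical helper: given the replica recommendations computed during the
# stabilization window (one per line), print the one a scale-down would use.
stabilized_recommendation() {
  sort -n | tail -n 1
}

# Recommendations over the last window: 5, 3, 4, 3 replicas.
printf '5\n3\n4\n3\n' | stabilized_recommendation   # prints 5
```

A shorter window lets the lower recommendations take effect sooner, which is why a very small scale-down window tends to make flapping worse.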

HPA not scaling up under load — Target not reached

The HPA sees metrics below the target threshold, so it does not scale.

Verify the actual pod resource usage:

kubectl top pods -n <ns> -l <selector>

Compare with the HPA target. If actual usage is below the target even under load:

The resource requests may be set too high (actual usage is a low percentage of requests)
The metric source may not reflect actual load
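
Utilization targets are measured against requests, not limits or node capacity, so the same usage can look very different depending on how requests are set. The arithmetic below is a sketch with made-up numbers. `utilization_pct` is a hypothetical helper that takes usage and request in millicores for a single pod, whereas the HPA averages across all pods of the target.

```shell
# Hypothetical helper.
# Arguments: usage_millicores request_millicores
# Prints usage as a whole-number percentage of the request.
utilization_pct() {
  awk -v u="$1" -v r="$2" 'BEGIN { print int(u * 100 / r) }'
}

utilization_pct 200 1000   # 200m used of a 1000m request: prints 20
utilization_pct 200 250    # same usage against a 250m request: prints 80
```

With a 50% target, the first case stays well below the threshold and never triggers a scale-up, while the second exceeds it.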

Multiple HPAs targeting the same workload — Conflicting autoscalers

Only one HPA should target a given Deployment/StatefulSet. Multiple HPAs on the same target cause unpredictable behavior.

kubectl get hpa -n <ns>

Check if multiple HPAs reference the same scaleTargetRef. Advise the user to consolidate into a single HPA with multiple metrics.
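
When a namespace has many HPAs, duplicates are easier to spot mechanically. As a sketch, the filter below groups HPAs by their scaleTargetRef and prints any target referenced more than once. `duplicate_targets` is a hypothetical helper, and it assumes input lines of the form "hpa-name<TAB>Kind/name", as produced by the jsonpath expression in the trailing comment.

```shell
# Hypothetical helper: reads "hpa-name<TAB>Kind/target-name" lines and prints
# each target that is referenced by more than one HPA, with the HPA names.
duplicate_targets() {
  awk '{ count[$2]++; names[$2] = names[$2] " " $1 }
       END { for (t in count) if (count[t] > 1) print t ":" names[t] }'
}

# Example against a live cluster:
# kubectl get hpa -n <ns> \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.scaleTargetRef.kind}/{.spec.scaleTargetRef.name}{"\n"}{end}' \
#   | duplicate_targets
```

No output means each workload in the namespace is targeted by at most one HPA.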

Notes

The HPA controller evaluates metrics every 15 seconds by default (controlled by --horizontal-pod-autoscaler-sync-period on kube-controller-manager).
Scale-down has a default stabilization window of 5 minutes (300 seconds) to prevent flapping. The scale-up stabilization window defaults to 0 seconds (immediate).
For HPAs using autoscaling/v2, multiple metrics can be specified. The HPA computes a recommendation for each metric and scales to the highest one.
If the target Deployment is managed by Argo CD or Helm and its manifest sets spec.replicas, each sync or upgrade can overwrite the replica count chosen by the HPA. Check whether the Deployment's replicas field is managed externally.

Diagnose HorizontalPodAutoscaler failures (not scaling, metrics unavailable, target mismatch). Checks HPA status, metrics-server health, and scaling events to identify why autoscaling is not working.

88 stars

development

Updated Apr 12, 2026

Install globally

npx skillsauth add scitix/siclaw hpa-debug

Installs this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy, container and dependency vulnerability scanner: Clean (95%)
Semgrep, static code analysis for vulnerabilities: Clean (95%)
mcp-scan (Snyk), Model Context Protocol security validation: Clean (95%)
Snyk (dep), open source security scanning: Skipped (50%)
Socket.dev, supply chain security analysis: Skipped (50%)
VirusTotal, multi-engine malware detection: Skipped (50%)
CrowdStrike, advanced threat intelligence: Skipped (50%)
OSV-Scanner, Open Source Vulnerability database check: Skipped (50%)
OWASP Dep-Check: Skipped (50%)

Last scanned: Apr 25, 2026, 12:02 AM, 3.2s, 1 file scanned


Related Skills

scitix/gateway-diagnostics
testing
Verified, Trusted, Community
Show and ping the gateway of a network interface, on a Kubernetes node or inside a pod's network namespace. Auto-detects the gateway from the routing table (ip -j route), reports interface type (RoCE / Ethernet / IB), and tests reachability with ping. Use for default-route / gateway questions, network reachability checks, RoCE/RDMA data-path validation, and "can this node/pod reach its gateway" investigations.
209, SKILL.md, updated Jun 7, 2026

scitix/skill-authoring
development
Verified, Trusted, Community
Guide for writing and improving Siclaw skills. Read this when creating or modifying a skill. Covers skill directory layout, SKILL.md format, script execution modes, and best practices.
209, SKILL.md, updated Apr 23, 2026

scitix/node-logs
devops
Verified, Trusted, Community
Retrieve logs from a Kubernetes node. Supports journalctl (systemd units) and file-based logs. Use when you need to inspect node-level logs (containerd, kubelet, etc.). Run via host_script (preferred) or node_script.
209, SKILL.md, updated Apr 12, 2026

scitix/manage-skill
development
Verified, Trusted, Community
Guides the user to the Siclaw Web page to manage Skills. Use this guide when the user requests to create, edit, or view a Skill in a Channel conversation.
88, SKILL.md, updated Apr 12, 2026

Install manually

# Clone the repo
git clone https://github.com/scitix/siclaw.git

# Copy into the Claude Code skills folder (global), creating it if needed
mkdir -p ~/.claude/skills
cp -r siclaw/skills/core/hpa-debug ~/.claude/skills/

Repository

scitix/siclaw

88 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
