Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

scitix/image-pull-debug

Name: image-pull-debug
Author: scitix

skills/core/image-pull-debug/SKILL.md

npx skillsauth add scitix/siclaw image-pull-debug

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Image Pull Failure Diagnosis

When a pod is stuck in ErrImagePull or ImagePullBackOff, follow this flow to identify the root cause.

Important: ErrImagePull, ImagePullBackOff, and Back-off pulling image are NOT causes — they only indicate the pull failed. You MUST proceed through all steps below to find the actual cause. Never conclude with just these status messages.

Scope: This skill is for diagnosis only. Once you identify the root cause, report it to the user and stop. Do NOT attempt network-level debugging (ping, curl, iptables, traceroute, etc.) — that is outside the scope of this skill and should be left to the user or network administrator.

Diagnostic Flow

1. Get pod info

kubectl get pod <pod> -n <ns> -o jsonpath='{.spec.nodeName}'
kubectl get pod <pod> -n <ns> -o jsonpath='{.status.containerStatuses[*].image}'
kubectl get pod <pod> -n <ns> -o jsonpath='{.status.containerStatuses[0].state.waiting.message}'

Note the node name, image name, and waiting message (may already contain the root cause).

Also check the image registry:

If the image has no registry prefix (e.g. nginx:latest, envoyproxy/gateway:v1.2.8), it pulls from Docker Hub (docker.io).
If it has a prefix (e.g. registry.example.com/app:v1), it pulls from that registry.

2. Check containerd logs

Containerd logs are the authoritative source for the root cause. Pod events are often generic ("Failed to pull image") and do not contain the actual error — always check containerd logs.

Use the node-logs skill:

bash skills/core/node-logs/scripts/get-node-logs.sh \
  --node <nodeName> --unit containerd --grep "<image>" --since "1h ago"

Replace <image> with the image name or a unique substring. Adjust --since to cover the pod's creation time.

If journalctl returns nothing, try log files:

bash skills/core/node-logs/scripts/get-node-logs.sh \
  --node <nodeName> --file /var/log/messages --grep "<image>"

3. Match error and conclude

Match the error from containerd logs (or the state.waiting.message from step 1) against the patterns below. Once a pattern matches, report the root cause to the user and stop. Do not continue with further diagnostic commands.

If containerd logs have no relevant entries, check events as a supplementary source:

kubectl get events -n <ns> --field-selector involvedObject.name=<pod>

If still no match, report whatever error information you have found and let the user decide next steps. Do NOT start autonomous network investigation.

`not found` / `manifest unknown` — Image does not exist

The image name or tag does not exist in the registry. Inform the user to verify the image name and tag.

`unauthorized` / `access denied` / `denied` — Authentication failed

The registry rejected the request. Advise the user to:

Configure an imagePullSecret for the pod with valid registry credentials
Or adjust image permissions on the registry side to allow access

`x509` / `certificate` / `tls` — Certificate not trusted

The node's containerd does not trust the registry's CA certificate. Advise the user to add the registry CA to the node's containerd trust config (/etc/containerd/certs.d/) or system trust store.

`no such host` / `lookup.*failed` — DNS resolution failed

The registry hostname cannot be resolved. Advise the user to check the hostname spelling and node DNS config.

`connection reset by peer` — Remote reset

TCP reached the registry but was reset by the remote end — a server-side or intermediary issue.

`i/o timeout` / `dial tcp.*timeout` — Connection timed out

The node cannot establish a TCP connection to the registry. Common causes: firewall blocking, proxy misconfiguration, or registry unreachable from the node's network.

Docker Hub specific: If the image is from Docker Hub (docker.io) and the node is in mainland China, this is almost certainly caused by network restrictions (GFW). Advise the user to use a registry mirror or re-tag the image to a domestically accessible registry.

Report the timeout to the user and stop. Do NOT attempt network diagnostics (ping, curl, iptables, etc.).

`connection refused` — Connection refused

TCP reached the host but the port is not listening. The registry service is down or on a different port.

`too many requests` / `429` / `rate limit` — Rate limited

The registry is throttling requests. Inform the user to wait for the rate limit window to expire, or configure a registry mirror to reduce direct requests.

`no space left on device` — Disk full

Containerd cannot unpack image layers due to insufficient disk space on the node.

`invalid reference format` — Malformed image name

The image reference contains illegal characters or has incorrect format. Inform the user to fix the image field in the pod spec.

`server gave HTTP response to HTTPS client` — Protocol mismatch

The registry serves HTTP but containerd expects HTTPS. Advise the user to configure the registry as insecure in containerd config, or enable TLS on the registry.

`does not match the specified platform` — Architecture mismatch

The image exists but has no manifest for this node's CPU architecture. Inform the user to use a multi-arch image or the correct platform-specific tag.

`ErrImageNeverPull` — Pull policy forbids pulling

imagePullPolicy is Never but the image is not present on the node. Inform the user to pre-load the image or change imagePullPolicy.

Notes

The --since parameter should cover the pod's creation time. If the pod was created long ago, increase accordingly (e.g. --since "24h ago").
If both containerd logs and events are empty, the state.waiting.message from step 1 is your best available information — report it directly.

scitix/image-pull-debug

skills/core/image-pull-debug/SKILL.md

Diagnose container image pull failures (ErrImagePull / ImagePullBackOff). Checks pod status, containerd logs, and events to identify root cause.

88 stars

development

Updated Apr 12, 2026

$ install --global

skillsauth

npx skillsauth add scitix/siclaw image-pull-debug

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 25, 2026, 12:02 AM2.3s1 file scanned

SKILL.md

name:: image-pull-debug
description:: >-

Image Pull Failure Diagnosis

When a pod is stuck in ErrImagePull or ImagePullBackOff, follow this flow to identify the root cause.

Diagnostic Flow

1. Get pod info

kubectl get pod <pod> -n <ns> -o jsonpath='{.spec.nodeName}'
kubectl get pod <pod> -n <ns> -o jsonpath='{.status.containerStatuses[*].image}'
kubectl get pod <pod> -n <ns> -o jsonpath='{.status.containerStatuses[0].state.waiting.message}'

Note the node name, image name, and waiting message (may already contain the root cause).

Also check the image registry:

If the image has no registry prefix (e.g. nginx:latest, envoyproxy/gateway:v1.2.8), it pulls from Docker Hub (docker.io).
If it has a prefix (e.g. registry.example.com/app:v1), it pulls from that registry.

2. Check containerd logs

Containerd logs are the authoritative source for the root cause. Pod events are often generic ("Failed to pull image") and do not contain the actual error — always check containerd logs.

Use the node-logs skill:

bash skills/core/node-logs/scripts/get-node-logs.sh \
  --node <nodeName> --unit containerd --grep "<image>" --since "1h ago"

Replace <image> with the image name or a unique substring. Adjust --since to cover the pod's creation time.

If journalctl returns nothing, try log files:

bash skills/core/node-logs/scripts/get-node-logs.sh \
  --node <nodeName> --file /var/log/messages --grep "<image>"

3. Match error and conclude

If containerd logs have no relevant entries, check events as a supplementary source:

kubectl get events -n <ns> --field-selector involvedObject.name=<pod>

If still no match, report whatever error information you have found and let the user decide next steps. Do NOT start autonomous network investigation.

`not found` / `manifest unknown` — Image does not exist

The image name or tag does not exist in the registry. Inform the user to verify the image name and tag.

`unauthorized` / `access denied` / `denied` — Authentication failed

The registry rejected the request. Advise the user to:

Configure an imagePullSecret for the pod with valid registry credentials
Or adjust image permissions on the registry side to allow access

`x509` / `certificate` / `tls` — Certificate not trusted

The node's containerd does not trust the registry's CA certificate. Advise the user to add the registry CA to the node's containerd trust config (/etc/containerd/certs.d/) or system trust store.

`no such host` / `lookup.*failed` — DNS resolution failed

The registry hostname cannot be resolved. Advise the user to check the hostname spelling and node DNS config.

`connection reset by peer` — Remote reset

TCP reached the registry but was reset by the remote end — a server-side or intermediary issue.

`i/o timeout` / `dial tcp.*timeout` — Connection timed out

The node cannot establish a TCP connection to the registry. Common causes: firewall blocking, proxy misconfiguration, or registry unreachable from the node's network.

Report the timeout to the user and stop. Do NOT attempt network diagnostics (ping, curl, iptables, etc.).

`connection refused` — Connection refused

TCP reached the host but the port is not listening. The registry service is down or on a different port.

`too many requests` / `429` / `rate limit` — Rate limited

The registry is throttling requests. Inform the user to wait for the rate limit window to expire, or configure a registry mirror to reduce direct requests.

`no space left on device` — Disk full

Containerd cannot unpack image layers due to insufficient disk space on the node.

`invalid reference format` — Malformed image name

The image reference contains illegal characters or has incorrect format. Inform the user to fix the image field in the pod spec.

`server gave HTTP response to HTTPS client` — Protocol mismatch

The registry serves HTTP but containerd expects HTTPS. Advise the user to configure the registry as insecure in containerd config, or enable TLS on the registry.

`does not match the specified platform` — Architecture mismatch

The image exists but has no manifest for this node's CPU architecture. Inform the user to use a multi-arch image or the correct platform-specific tag.

`ErrImageNeverPull` — Pull policy forbids pulling

imagePullPolicy is Never but the image is not present on the node. Inform the user to pre-load the image or change imagePullPolicy.

Notes

The --since parameter should cover the pod's creation time. If the pod was created long ago, increase accordingly (e.g. --since "24h ago").
If both containerd logs and events are empty, the state.waiting.message from step 1 is your best available information — report it directly.

Related Skills

scitix/gateway-diagnostics

testing

VerifiedTrustedCommunity

Show and ping the gateway of a network interface, on a Kubernetes node or inside a pod's network namespace. Auto-detects the gateway from the routing table (ip -j route), reports interface type (RoCE / Ethernet / IB), and tests reachability with ping. Use for default-route / gateway questions, network reachability checks, RoCE/RDMA data-path validation, and "can this node/pod reach its gateway" investigations.

209SKILL.mdUpdated Jun 7, 2026

scitix/gateway-diagnostics

scitix/skill-authoring

development

VerifiedTrustedCommunity

Guide for writing and improving Siclaw skills. Read this when creating or modifying a skill. Covers skill directory layout, SKILL.md format, script execution modes, and best practices.

209SKILL.mdUpdated Apr 23, 2026

scitix/skill-authoring

scitix/node-logs

devops

VerifiedTrustedCommunity

Retrieve logs from a Kubernetes node. Supports journalctl (systemd units) and file-based logs. Use when you need to inspect node-level logs (containerd, kubelet, etc.). Run via host_script (preferred) or node_script.

209SKILL.mdUpdated Apr 12, 2026

scitix/manage-skill

development

VerifiedTrustedCommunity

Guides the user to the Siclaw Web page to manage Skills. Use this guide when the user requests to create, edit, or view a Skill in a Channel conversation.

88SKILL.mdUpdated Apr 12, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/scitix/siclaw.git

# Copy into Claude Code skills folder (global)
cp -r siclaw/skills/core/image-pull-debug ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

scitix/siclaw

88 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT

Adoption

scitix/image-pull-debug

$ install --global

Security Scan Results

SKILL.md

Image Pull Failure Diagnosis

Diagnostic Flow

1. Get pod info

2. Check containerd logs

3. Match error and conclude

not found / manifest unknown — Image does not exist

unauthorized / access denied / denied — Authentication failed

x509 / certificate / tls — Certificate not trusted

no such host / lookup.*failed — DNS resolution failed

connection reset by peer — Remote reset

i/o timeout / dial tcp.*timeout — Connection timed out

connection refused — Connection refused

too many requests / 429 / rate limit — Rate limited

no space left on device — Disk full

invalid reference format — Malformed image name

server gave HTTP response to HTTPS client — Protocol mismatch

does not match the specified platform — Architecture mismatch

ErrImageNeverPull — Pull policy forbids pulling

Notes

Related Skills

scitix/gateway-diagnostics

scitix/skill-authoring

scitix/node-logs

scitix/manage-skill

scitix/image-pull-debug

$ install --global

Security Scan Results

SKILL.md

Image Pull Failure Diagnosis

Diagnostic Flow

1. Get pod info

2. Check containerd logs

3. Match error and conclude

not found / manifest unknown — Image does not exist

unauthorized / access denied / denied — Authentication failed

x509 / certificate / tls — Certificate not trusted

no such host / lookup.*failed — DNS resolution failed

connection reset by peer — Remote reset

i/o timeout / dial tcp.*timeout — Connection timed out

connection refused — Connection refused

too many requests / 429 / rate limit — Rate limited

no space left on device — Disk full

invalid reference format — Malformed image name

server gave HTTP response to HTTPS client — Protocol mismatch

does not match the specified platform — Architecture mismatch

ErrImageNeverPull — Pull policy forbids pulling

Notes

Related Skills

scitix/gateway-diagnostics

scitix/skill-authoring

scitix/node-logs

scitix/manage-skill

`not found` / `manifest unknown` — Image does not exist

`unauthorized` / `access denied` / `denied` — Authentication failed

`x509` / `certificate` / `tls` — Certificate not trusted

`no such host` / `lookup.*failed` — DNS resolution failed

`connection reset by peer` — Remote reset

`i/o timeout` / `dial tcp.*timeout` — Connection timed out

`connection refused` — Connection refused

`too many requests` / `429` / `rate limit` — Rate limited

`no space left on device` — Disk full

`invalid reference format` — Malformed image name

`server gave HTTP response to HTTPS client` — Protocol mismatch

`does not match the specified platform` — Architecture mismatch

`ErrImageNeverPull` — Pull policy forbids pulling

`not found` / `manifest unknown` — Image does not exist

`unauthorized` / `access denied` / `denied` — Authentication failed

`x509` / `certificate` / `tls` — Certificate not trusted

`no such host` / `lookup.*failed` — DNS resolution failed

`connection reset by peer` — Remote reset

`i/o timeout` / `dial tcp.*timeout` — Connection timed out

`connection refused` — Connection refused

`too many requests` / `429` / `rate limit` — Rate limited

`no space left on device` — Disk full

`invalid reference format` — Malformed image name

`server gave HTTP response to HTTPS client` — Protocol mismatch

`does not match the specified platform` — Architecture mismatch

`ErrImageNeverPull` — Pull policy forbids pulling