cloud-foundation-principles/skills/container-image-tagging/SKILL.md
This skill should be used when the user is building Docker images, configuring container registries, designing image tagging strategies, setting up registry lifecycle policies, debugging production incidents that require tracing running code, or discussing OCI labels and build metadata. Covers git SHA tagging, the traceability chain from container to source code, registry retention policies, OCI build labels, and why date-based or environment-based tags fail.
At 3am during an incident, the first question is always the same: "what code is running?" If the answer requires digging through CI/CD logs, cross-referencing deployment timestamps, or asking a colleague who might remember which build went out Tuesday, you have already lost critical minutes. A container image tagged with the full git commit SHA answers the question instantly. You look at the running container, read the image tag, and you have the exact commit. From there, one command shows you every line of code in production.
The latest tag is the root of a particularly insidious class of failures. Two services pulling latest five minutes apart may receive different images. A rollback to latest deploys whatever happens to be newest, which may be the broken version you are trying to escape. Environment-based tags like staging or production silently mutate, destroying the ability to reproduce issues. The only tag that is both immutable and traceable is the git commit SHA.
Every container image receives exactly one meaningful tag: the full 40-character git commit SHA. Short SHAs are not acceptable -- they can collide as repository history grows, and they provide no benefit since the tag is never typed by hand.
# The only acceptable tagging pattern
IMAGE_TAG=$(git rev-parse HEAD)
# Result: a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0
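# Build under the full registry name so the push below finds the image locally
docker build -t registry.example.com/myorg/myapp:${IMAGE_TAG} .
docker push registry.example.com/myorg/myapp:${IMAGE_TAG}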
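A pipeline can enforce the full-SHA rule before building. The following is a minimal sketch, assuming a SHA-1 repository where `git rev-parse HEAD` prints 40 lowercase hex characters; the function name `is_full_sha` is hypothetical and not part of git or Docker.

```shell
# Hypothetical helper: succeed only if the argument is exactly 40 lowercase
# hex characters, i.e. it looks like a full SHA-1 commit id.
is_full_sha() {
  case "$1" in
    *[!0-9a-f]*) return 1 ;;   # reject any non-hex character
  esac
  [ "${#1}" -eq 40 ]           # reject short SHAs and empty input
}

# Possible use in a build script:
#   IMAGE_TAG=$(git rev-parse HEAD)
#   is_full_sha "$IMAGE_TAG" || { echo "not a full SHA: $IMAGE_TAG" >&2; exit 1; }
```

A check like this rejects short SHAs, `latest`, and environment names with the same test, so it can sit in front of every `docker build`.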
| Pattern | Example | Verdict | Why |
|---------|---------|---------|-----|
| Full git SHA | myapp:a1b2c3d4e5f6... | Correct | Immutable, traceable to exact commit |
| latest | myapp:latest | Dangerous | Mutable, two pulls can get different images, rollback is meaningless |
| Short SHA | myapp:a1b2c3d | Risky | Collisions grow with repo size, loses traceability precision |
| Date-based | myapp:2025-03-15 | Broken | Multiple builds per day, no code traceability, timezone confusion |
| Build number | myapp:build-142 | Fragile | CI-system-specific, lost if CI rebuilt, no direct link to code |
| Environment | myapp:production | Dangerous | Mutable like latest, silently changes, impossible to diff |
| Semver only | myapp:1.2.3 | Incomplete | Required for production deploys, but insufficient alone -- must always accompany the SHA tag for traceability |
Every image is SHA-tagged at build time. For production deployments, a semver release tag is required -- it signals an explicit promotion decision, not just "the latest build."
Production deployments reference a semver tag (for example 1.2.3) that points to a previously built, SHA-tagged image.

The image is built once and tagged with the SHA. When promoting to production, CI adds the semver tag to the existing image -- no rebuild. Both tags point to the same image digest.
# At build time: tag with SHA (mandatory, every build)
docker tag myorg/myapp:${GIT_SHA} registry.example.com/myorg/myapp:${GIT_SHA}
docker push registry.example.com/myorg/myapp:${GIT_SHA}
# At release time: add semver tag to the existing image (required for prod)
docker tag myorg/myapp:${GIT_SHA} registry.example.com/myorg/myapp:1.2.3
docker push registry.example.com/myorg/myapp:1.2.3
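The promotion step can be wrapped in a small function that refuses anything other than a plain release version. This is a sketch under stated assumptions: the function name `promote` and the variables `REGISTRY_REPO` and `DOCKER` are hypothetical, and release tags are assumed to be plain MAJOR.MINOR.PATCH.

```shell
# Assumed defaults for this sketch. Set DOCKER=echo to print the commands
# instead of running them.
REGISTRY_REPO="${REGISTRY_REPO:-registry.example.com/myorg/myapp}"
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: add a semver tag to an image that was already built
# and pushed under its git SHA. Nothing is rebuilt.
promote() {
  git_sha="$1"
  version="$2"
  # Accept only versions such as 1.2.3.
  if ! printf '%s\n' "$version" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
    echo "promote: '$version' is not MAJOR.MINOR.PATCH" >&2
    return 1
  fi
  "$DOCKER" pull "$REGISTRY_REPO:$git_sha" &&
    "$DOCKER" tag "$REGISTRY_REPO:$git_sha" "$REGISTRY_REPO:$version" &&
    "$DOCKER" push "$REGISTRY_REPO:$version"
}
```

Pulling by SHA first means the release tag is always attached to the image CI already built and tested, even when the promotion job runs on a different machine than the build.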
The git SHA tag creates an unbreakable chain from a running container back to the exact source code:
Running Container
|
| kubectl describe pod / aws ecs describe-tasks / az container show
|
Image Tag: registry.example.com/myorg/myapp:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0
|
| git show a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0
|
Exact Commit (author, timestamp, message, diff)
|
| git log, git blame, code review link
|
Every Line of Source Code in Production
This chain answers every incident question: What changed? Who changed it? When? What was the review? What did the diff look like? All from one tag.
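The first hop of the chain, from image reference to commit, can be sketched in a few lines of shell. The function name `tag_from_image_ref` and the variable `IMAGE_REF` are hypothetical, and the sketch assumes the reference carries a tag rather than a digest pin (`name@sha256:...`).

```shell
# Hypothetical helper: print everything after the last colon of an image
# reference, which is the tag when the reference ends in ":<tag>".
tag_from_image_ref() {
  printf '%s\n' "${1##*:}"
}

# Possible use during an incident, inside a clone of the application repo:
#   IMAGE_REF=$(kubectl get pod "$POD" -o jsonpath='{.spec.containers[0].image}')
#   git show --stat "$(tag_from_image_ref "$IMAGE_REF")"
```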
Beyond the tag, embed build metadata directly into the image using OCI standard labels. This information survives even if the registry tag is deleted or overwritten.
# Dockerfile
FROM node:20-alpine AS base
# ... build stages ...
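FROM base AS production
# Build arguments must be declared in the stage that references them,
# otherwise the labels below expand to empty strings
ARG GIT_SHA
ARG BUILD_TIMESTAMP
ARG APP_VERSION
LABEL org.opencontainers.image.revision="${GIT_SHA}"
LABEL org.opencontainers.image.source="https://github.com/myorg/myapp"
LABEL org.opencontainers.image.created="${BUILD_TIMESTAMP}"
LABEL org.opencontainers.image.version="${APP_VERSION}"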
Pass build arguments at build time:
docker build \
--build-arg GIT_SHA=$(git rev-parse HEAD) \
--build-arg BUILD_TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ) \
--build-arg APP_VERSION=$(git describe --tags --always) \
-t myorg/myapp:$(git rev-parse HEAD) .
To inspect a running image's metadata:
docker inspect myorg/myapp:a1b2c3d4... --format '{{json .Config.Labels}}' | jq .
# {
# "org.opencontainers.image.revision": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0",
# "org.opencontainers.image.source": "https://github.com/myorg/myapp",
# "org.opencontainers.image.created": "2025-03-15T14:30:00Z"
# }
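Because the revision label and the tag should carry the same commit, a pipeline can compare them after the build. A minimal sketch, assuming the image is available to the local Docker daemon and is referenced by tag; the function name `revision_matches_tag` is hypothetical.

```shell
# Hypothetical check: succeed only if the org.opencontainers.image.revision
# label stored in the image equals the tag it is referenced by.
revision_matches_tag() {
  image_ref="$1"
  tag="${image_ref##*:}"
  # Read a single label with a Go template instead of parsing the full JSON.
  revision=$(docker inspect \
    --format '{{ index .Config.Labels "org.opencontainers.image.revision" }}' \
    "$image_ref") || return 1
  [ "$revision" = "$tag" ]
}
```

A mismatch usually means the build arguments were not passed, or the image was retagged by hand.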
Container registries accumulate images quickly. Without lifecycle policies, storage costs grow unbounded and old vulnerable images remain pullable. Define retention rules that keep what matters and clean up what does not.
| Rule | Retention | Rationale |
|------|-----------|-----------|
| SHA-tagged images (dev builds) | Keep last 20 | Covers ~2-4 weeks of dev deployments, enough for rollback |
| Semver-tagged images (prod releases) | Keep all | Production releases are infrequent; needed for audits, compliance, and post-mortems |
| Untagged images (build layers) | Delete after 7 days | Build cache artifacts, no production value |
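# Terraform: Container registry lifecycle policy
# Note: rule 1 shields semver-tagged images. Its count is set far above any expected
# number of releases, so it never expires anything, and it relies on ECR not letting a
# lower-priority rule expire an image that a higher-priority rule already matched.
# Assumptions: release tags look like 1.2.3 and SHA tags contain no dots.
# Run a lifecycle policy preview before applying this to a real repository.
resource "aws_ecr_lifecycle_policy" "this" {
  repository = aws_ecr_repository.myapp.name
  policy = jsonencode({
    rules = [
      {
        rulePriority = 1
        description  = "Retain semver-tagged images (prod releases)"
        selection = {
          tagStatus      = "tagged"
          tagPatternList = ["*.*.*"]
          countType      = "imageCountMoreThan"
          countNumber    = 10000
        }
        action = { type = "expire" }
      },
      {
        rulePriority = 2
        description  = "Keep last 20 SHA-tagged images (dev builds)"
        selection = {
          tagStatus      = "tagged"
          tagPatternList = ["*"]
          countType      = "imageCountMoreThan"
          countNumber    = 20
        }
        action = { type = "expire" }
      },
      {
        rulePriority = 10
        description  = "Remove untagged images after 7 days"
        selection = {
          tagStatus   = "untagged"
          countType   = "sinceImagePushed"
          countUnit   = "days"
          countNumber = 7
        }
        action = { type = "expire" }
      }
    ]
  })
}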
Twenty SHA-tagged images covers approximately 2-4 weeks of active development builds. This provides enough rollback depth for dev environments while preventing unbounded growth. If a team deploys more frequently, increase the count proportionally. Production release images (semver-tagged) are kept indefinitely -- they are infrequent, storage cost is negligible, and you never want to explain in an audit why a production image was deleted.
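The proportional scaling can be made explicit with a little arithmetic: images to keep is roughly builds per working day times the working days of rollback depth wanted. The helper name `retention_count` and the example build rates are assumptions for illustration.

```shell
# Hypothetical sizing helper: builds per working day * working days to cover.
retention_count() {
  echo $(( $1 * $2 ))
}

retention_count 2 10   # 2 builds/day over 10 working days -> 20
retention_count 5 10   # 5 builds/day over the same window -> 50
```

The result is what would go into `countNumber` for the SHA-tagged rule.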
Date-based tags (myapp:2025-03-15): A team deploys twice on March 15th. Which image is 2025-03-15? The first? The second? A hotfix at 11pm? Date tags have no disambiguation mechanism. They also provide zero information about what code is in the image. Incident responders must still search CI/CD logs to find the commit.

Environment tags (myapp:production): Environment tags are mutable pointers. When you push a new production tag, the old image is not gone -- it is just untagged. If you need to roll back, what do you roll back to? The previous production tag no longer exists. You are left searching through image digests. Worse, if two services reference myapp:production and you push a new image between their deployments, they run different code with the same tag.

Build number tags (myapp:build-142): Build numbers are CI-system-specific. If you switch from one CI platform to another, build numbers reset. If you re-run a build, does build-142 now contain different code? Build numbers also require a lookup table to map back to commits. The SHA eliminates the lookup entirely.
| Concept | AWS | GCP | Azure |
|---------|-----|-----|-------|
| Container registry | ECR (Elastic Container Registry) | Artifact Registry | ACR (Azure Container Registry) |
| Lifecycle policy | ECR Lifecycle Policy | Artifact Registry Cleanup Policies | ACR Retention Policy + Purge Tasks |
| Image scanning | ECR Enhanced Scanning (Inspector) | Artifact Analysis | ACR + Microsoft Defender |
| Registry authentication | aws ecr get-login-password | gcloud auth configure-docker | az acr login |
| Cross-account image pull | ECR Repository Policy (allow pull) | Artifact Registry IAM (roles/artifactregistry.reader) | ACR RBAC (AcrPull role) |
| Image immutability | ECR Image Tag Immutability | Artifact Registry tag immutability | ACR tag locking (preview) |
Working implementations in examples/:
- examples/image-build-pipeline.md -- Complete CI pipeline that builds, tags with git SHA, adds OCI labels, pushes to registry, and applies lifecycle policies
- examples/registry-lifecycle-terraform.md -- Terraform configuration for a container registry with lifecycle policies, image scanning, and cross-account pull permissions

When designing or reviewing container image tagging:

- The latest tag is never used in deployment configurations or service definitions
- Environment tags (staging, production) are never used as deployment references
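Checks like these can be automated in CI. The sketch below assumes Kubernetes-style YAML where images appear as unquoted `image: <ref>` lines; the function name `find_mutable_tags` and the list of tag names are hypothetical and would need adjusting to a team's own conventions.

```shell
# Hypothetical CI guard: print manifest lines whose image reference ends in a
# mutable tag. Exits 0 when at least one such line is found, 1 when none are.
find_mutable_tags() {
  grep -rnE 'image:[[:space:]]*[^[:space:]]+:(latest|staging|production)[[:space:]]*$' "$1"
}

# Possible use in a pipeline step:
#   if find_mutable_tags deploy/; then
#     echo "mutable image tags found" >&2
#     exit 1
#   fi
```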