security/ai/model-supply-chain-security/SKILL.md
Secure the AI model supply chain with artifact signing, provenance attestation, SBOM workflows, dependency controls, and trusted model promotion.
npx skillsauth add bagelhole/devops-security-agent-skills model-supply-chain-securityInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Protect models and inference components from tampering, dependency compromise, and untrusted artifact promotion.
Use this skill when:
cosign v2+ installed for signing and verificationsyft for SBOM generation of model-serving imagescrane or skopeo for OCI image inspection# Generate a keypair (store private key securely)
cosign generate-key-pair
# Sign an OCI-packaged model image
cosign sign --key cosign.key ghcr.io/acme/ml-models/sentiment:v2.1.0
# Keyless signing with Sigstore (uses OIDC identity)
cosign sign ghcr.io/acme/ml-models/sentiment:v2.1.0
# Verify the signature
cosign verify --key cosign.pub ghcr.io/acme/ml-models/sentiment:v2.1.0
# Keyless verification (requires certificate identity)
cosign verify \
[email protected] \
--certificate-oidc-issuer=https://accounts.google.com \
ghcr.io/acme/ml-models/sentiment:v2.1.0
# For model files stored as blobs (not OCI images)
# Compute digest and sign
sha256sum model-weights.safetensors > model-weights.sha256
cosign sign-blob --key cosign.key model-weights.safetensors \
--output-signature model-weights.sig \
--output-certificate model-weights.crt
# Verify blob signature
cosign verify-blob --key cosign.pub \
--signature model-weights.sig \
model-weights.safetensors
# slsa-requirements.yaml
slsa_levels:
level_1:
- Build process is scripted (not manual)
- Provenance document generated automatically
level_2:
- Build runs on hosted CI service
- Provenance is authenticated (signed)
- Source is version controlled
level_3:
- Build environment is ephemeral and isolated
- Provenance is non-falsifiable (hardened builder)
- Source integrity verified (two-person review)
# .github/workflows/model-build-slsa.yml
name: Model Build with SLSA Provenance
on:
push:
tags: ['model-v*']
jobs:
train-and-package:
runs-on: ubuntu-latest
permissions:
id-token: write
contents: read
packages: write
steps:
- uses: actions/checkout@v4
- name: Train model
run: python train.py --config configs/production.yaml
- name: Package model as OCI artifact
run: |
oras push ghcr.io/acme/ml-models/sentiment:${{ github.ref_name }} \
model-weights.safetensors:application/vnd.acme.model.safetensors \
model-config.json:application/json
- name: Generate SBOM for training environment
run: |
syft dir:. -o cyclonedx-json > training-sbom.json
- name: Sign and attest
run: |
cosign sign ghcr.io/acme/ml-models/sentiment:${{ github.ref_name }}
cosign attest --predicate training-sbom.json \
--type cyclonedx \
ghcr.io/acme/ml-models/sentiment:${{ github.ref_name }}
- name: Generate provenance
uses: slsa-framework/slsa-github-generator/.github/workflows/[email protected]
with:
image: ghcr.io/acme/ml-models/sentiment
digest: ${{ steps.push.outputs.digest }}
# model-card.yaml
model_details:
name: "sentiment-classifier-v2.1.0"
version: "2.1.0"
type: "text-classification"
framework: "pytorch"
license: "Apache-2.0"
provenance:
training_data:
source: "s3://acme-datasets/sentiment-v3/"
hash: "sha256:abc123..."
data_card_ref: "https://internal.acme.com/data-cards/sentiment-v3"
training_config:
source: "git://github.com/acme/ml-models@abc123"
hyperparameters:
learning_rate: 0.00005
epochs: 10
batch_size: 32
build_environment:
builder: "github-actions"
runner: "ubuntu-22.04"
python: "3.11.7"
torch: "2.1.2"
cuda: "12.1"
build_id: "gh-actions-12345"
commit_sha: "abc123def456"
build_timestamp: "2025-01-15T10:30:00Z"
signed_by: "[email protected]"
performance:
accuracy: 0.94
f1_score: 0.93
evaluation_dataset: "s3://acme-datasets/sentiment-eval-v3/"
evaluation_hash: "sha256:def456..."
security:
vulnerability_scan: "clean"
sbom_ref: "ghcr.io/acme/ml-models/sentiment:v2.1.0.sbom"
last_security_review: "2025-01-10"
known_limitations:
- "May produce biased outputs for underrepresented languages"
- "Not evaluated for adversarial robustness"
# Scan model-serving image for CVEs
trivy image ghcr.io/acme/ml-models/sentiment-serving:v2.1.0
# Generate SBOM for the serving container
syft ghcr.io/acme/ml-models/sentiment-serving:v2.1.0 -o spdx-json > serving-sbom.json
# Scan SBOM for vulnerabilities
grype sbom:serving-sbom.json --fail-on critical
# Check for known-malicious model files (pickle scanning)
pip install fickling
fickling --check model.pkl
# .github/workflows/registry-scan.yml
name: Nightly Registry Scan
on:
schedule:
- cron: '0 2 * * *'
jobs:
scan:
runs-on: ubuntu-latest
strategy:
matrix:
image:
- ghcr.io/acme/ml-models/sentiment-serving:latest
- ghcr.io/acme/ml-models/embedding-serving:latest
- ghcr.io/acme/ml-models/rag-api:latest
steps:
- name: Scan image
run: |
trivy image --severity CRITICAL,HIGH \
--exit-code 1 \
--format json \
--output scan-$(echo ${{ matrix.image }} | tr '/:' '-').json \
${{ matrix.image }}
- name: Verify signatures are still valid
run: |
cosign verify \
[email protected] \
--certificate-oidc-issuer=https://accounts.google.com \
${{ matrix.image }}
#!/usr/bin/env python3
"""model_promotion_gate.py - Verify model meets all promotion criteria."""
import subprocess
import json
import sys
def check_signature(image: str) -> bool:
result = subprocess.run(
["cosign", "verify", "[email protected]",
"--certificate-oidc-issuer=https://accounts.google.com", image],
capture_output=True, text=True,
)
return result.returncode == 0
def check_vulnerabilities(image: str) -> bool:
result = subprocess.run(
["trivy", "image", "--severity", "CRITICAL", "--exit-code", "1",
"--quiet", image],
capture_output=True, text=True,
)
return result.returncode == 0
def check_sbom_exists(image: str) -> bool:
result = subprocess.run(
["cosign", "verify-attestation", "--type", "cyclonedx",
"[email protected]",
"--certificate-oidc-issuer=https://accounts.google.com", image],
capture_output=True, text=True,
)
return result.returncode == 0
def check_model_card(image: str) -> bool:
result = subprocess.run(
["cosign", "verify-attestation", "--type", "custom",
"[email protected]",
"--certificate-oidc-issuer=https://accounts.google.com", image],
capture_output=True, text=True,
)
return result.returncode == 0
def main():
image = sys.argv[1]
checks = {
"signature_valid": check_signature(image),
"no_critical_cves": check_vulnerabilities(image),
"sbom_attached": check_sbom_exists(image),
"model_card_present": check_model_card(image),
}
all_passed = all(checks.values())
for name, passed in checks.items():
status = "PASS" if passed else "FAIL"
print(f" [{status}] {name}")
if not all_passed:
print("Promotion BLOCKED: not all checks passed.")
sys.exit(1)
print("Promotion APPROVED: all checks passed.")
if __name__ == "__main__":
main()
safetensors format instead of pickle to prevent deserialization attacks.# kubernetes deployment hardening
apiVersion: apps/v1
kind: Deployment
metadata:
name: model-serving
spec:
template:
spec:
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
containers:
- name: inference
image: ghcr.io/acme/ml-models/sentiment-serving:v2.1.0
securityContext:
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
volumeMounts:
- name: model-weights
mountPath: /models
readOnly: true
resources:
limits:
memory: "4Gi"
nvidia.com/gpu: "1"
volumes:
- name: model-weights
persistentVolumeClaim:
claimName: model-weights-pvc
readOnly: true
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: require-signed-model-images
spec:
validationFailureAction: Enforce
rules:
- name: verify-model-image-signature
match:
any:
- resources:
kinds: ["Pod"]
namespaces: ["ml-serving"]
verifyImages:
- imageReferences: ["ghcr.io/acme/ml-models/*"]
attestors:
- entries:
- keyless:
subject: "[email protected]"
issuer: "https://accounts.google.com"
| Problem | Cause | Solution |
|---------|-------|----------|
| cosign verify fails with "no matching signatures" | Image was pushed without signing | Re-run the signing step; check CI pipeline logs |
| Provenance attestation missing | SLSA generator not configured | Add slsa-github-generator to the build workflow |
| Trivy reports CVEs in base image | Stale base image | Update FROM image in Dockerfile; rebuild and re-sign |
| Pickle deserialization warning | Model saved in unsafe format | Convert to safetensors: model.save_pretrained(".", safe_serialization=True) |
| Keyless verification fails | Wrong OIDC issuer or identity | Check --certificate-identity and --certificate-oidc-issuer flags |
| Model card not found for artifact | Attestation not attached to digest | Attach with cosign attest --predicate model-card.yaml --type custom IMAGE |
development
Design and operationalize SRE dashboards that surface reliability, latency, error, saturation, and capacity signals across services. Use when building observability views for SLOs, incident response, and executive reliability reporting.
testing
Harden OpenClaw self-hosted environments with baseline host controls, auth tightening, secret handling, network segmentation, and safe update/rollback workflows. Use when deploying OpenClaw in home labs, startups, or production-like local AI infrastructure.
devops
Deploy, manage, and optimize vector databases for AI applications. Covers Qdrant, Weaviate, pgvector, and Pinecone — collection management, indexing strategies, backup, and performance tuning for production RAG and semantic search workloads.
testing
Deploy ML models on Kubernetes with KServe (formerly KFServing) and NVIDIA Triton Inference Server. Includes canary deployments, autoscaling, model versioning, A/B testing, and GPU resource management for production model serving.