kienbui1995/skills/cloud/gcp/vertex-ai-mlops

Name: skills/cloud/gcp/vertex-ai-mlops
Author: kienbui1995

skills/cloud/gcp/vertex-ai-mlops/SKILL.md

npx skillsauth add kienbui1995/magic-powers skills/cloud/gcp/vertex-ai-mlops

Vertex AI MLOps

When to Use

Designing ML training or serving infrastructure on GCP
Setting up model monitoring or retraining pipelines
Choosing between AutoML and custom training
Preparing for the Google Cloud Professional Data Engineer or Professional Machine Learning Engineer exam

Core Jobs

1. AutoML vs Custom Training

| Factor | AutoML | Custom Training |
|--------|--------|----------------|
| Code required | None | Python/TensorFlow/PyTorch |
| Control | Limited | Full control |
| Speed | Fastest to deploy | Requires ML expertise |
| Best for | Tabular, image, text (standard tasks) | Novel architectures, research |

2. Vertex AI Pipelines

Orchestrates ML workflows as DAGs (Kubeflow Pipelines or TFX)
Each step = a containerized component (preprocessing, training, evaluation, deployment)
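Use the Kubeflow Pipelines SDK (kfp; the kfp.v2 namespace belongs to KFP 1.8, while KFP 2.x exposes the same DSL as kfp.dsl) or pre-built Google Cloud Pipeline Components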
Store pipeline artifacts in Cloud Storage; metadata in Vertex ML Metadata
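
The DAG idea above can be sketched without any Vertex AI dependency. This is a minimal illustration using only the Python standard library, not the Kubeflow Pipelines SDK; the step names mirror the list above, and the `PIPELINE` mapping and `run_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a step, each value is the set of
# steps that must finish before that step can start.
PIPELINE = {
    "preprocessing": set(),
    "training": {"preprocessing"},
    "evaluation": {"training"},
    "deployment": {"evaluation"},
}


def run_order(dag):
    """Return one valid execution order for the steps in the DAG."""
    return list(TopologicalSorter(dag).static_order())
```

In a real pipeline each step would be a containerized component and the orchestrator would derive this ordering from the components' inputs and outputs; a cycle in the mapping makes `static_order()` raise `graphlib.CycleError`.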

3. Feature Store

Centralized repository for ML features (avoid feature duplication across teams)
Online store — low-latency serving (< 10ms) for real-time inference
Offline store — batch access for training (BigQuery-backed)
Features defined once, reused across models
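
The online/offline split can be illustrated with a toy in-memory store. This is only a sketch of the concept under simplifying assumptions, not the Vertex AI Feature Store API; the class name, method names, and sample values are hypothetical.

```python
from collections import defaultdict


class ToyFeatureStore:
    """Every write lands in both stores; each store answers a different question."""

    def __init__(self):
        # Offline store: full history per (entity, feature), for batch training.
        self._offline = defaultdict(list)
        # Online store: only the newest value per (entity, feature), for serving.
        self._online = {}

    def write(self, entity_id, feature, timestamp, value):
        key = (entity_id, feature)
        self._offline[key].append((timestamp, value))
        latest = self._online.get(key)
        if latest is None or timestamp >= latest[0]:
            self._online[key] = (timestamp, value)

    def get_online(self, entity_id, feature):
        """Single-key lookup of the latest value, as a real-time prediction would do."""
        return self._online[(entity_id, feature)][1]

    def get_history(self, entity_id, feature):
        """All recorded (timestamp, value) rows in time order, as a training job would read."""
        return sorted(self._offline.get((entity_id, feature), []))
```

Because both reads go through the same feature definition (the `(entity_id, feature)` key), a model trained on the history and a model served from the online store see the same feature, which is the reuse the list above describes.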

4. Model Serving

Endpoint — deploys one or more model versions, handles prediction requests
Batch prediction — asynchronous, for large offline prediction jobs
Online prediction — synchronous, for real-time serving
Traffic splitting between model versions for A/B testing or canary releases
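
Traffic splitting for a canary release can be sketched as a weighted choice over deployed model versions. The code below is a standard-library illustration, not the Vertex AI SDK; the model IDs, the `CANARY` split, and both helper functions are hypothetical, and the rule that shares are integer percentages summing to 100 is an assumption of this sketch.

```python
import random


def validate_split(split):
    """Raise ValueError unless shares are non-negative integers summing to 100."""
    if any(not isinstance(p, int) or p < 0 for p in split.values()):
        raise ValueError("each share must be a non-negative integer")
    if sum(split.values()) != 100:
        raise ValueError("traffic shares must sum to 100")


def route(split, rng):
    """Pick the model version that serves one request, weighted by its share."""
    validate_split(split)
    ids = list(split)
    return rng.choices(ids, weights=[split[i] for i in ids], k=1)[0]


# Hypothetical canary: 90% of requests stay on the current version.
CANARY = {"model-v1": 90, "model-v2": 10}
```

Calling `route(CANARY, random.Random())` once per request sends roughly one request in ten to the candidate; rolling forward or back is then just a change to the mapping.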

5. Model Monitoring

Skew detection — serving data distribution differs from the training data distribution (training-serving skew)
Drift detection — serving data distribution changes over time
Alert thresholds configurable per feature
Monitored logs sent to BigQuery for analysis
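
Skew and drift checks both compare a current feature distribution against a baseline; only the baseline differs (training data for skew, an earlier serving window for drift). The sketch below shows this for one categorical feature. The L-infinity distance is one simple choice made for illustration, and the function names and the threshold value in the usage note are assumptions rather than the monitoring service's API.

```python
from collections import Counter


def distribution(values):
    """Normalize raw categorical values into a {category: probability} map."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def linf_distance(p, q):
    """Largest absolute probability difference over all categories in p or q."""
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def check_feature(baseline, current, threshold):
    """Return (distance, alert); alert is True when distance exceeds threshold.

    Both inputs must be non-empty lists of category values.
    """
    d = linf_distance(distribution(baseline), distribution(current))
    return d, d > threshold
```

For example, `check_feature(training_values, serving_values, 0.1)` is a skew check, while passing last week's serving values as the baseline turns the same call into a drift check. The per-feature threshold corresponds to the configurable alert thresholds listed above.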

6. Model Registry

Version all trained models centrally
Stage models through: Experiment → Staging → Production
Alias support for promoting/rolling back versions
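
Promotion and rollback through aliases can be sketched with a toy registry in which versions never change and aliases are movable pointers. This is an in-memory illustration, not the Vertex AI Model Registry API; the class, its methods, and the artifact URIs in the usage note are hypothetical.

```python
class ModelRegistry:
    """Toy registry: versions are immutable, aliases point at one version."""

    def __init__(self):
        self._versions = {}  # version number -> artifact URI
        self._aliases = {}   # alias -> version number
        self._history = {}   # alias -> previously targeted version numbers

    def register(self, artifact_uri):
        """Store a new model version and return its version number."""
        version = len(self._versions) + 1
        self._versions[version] = artifact_uri
        return version

    def promote(self, alias, version):
        """Point the alias at a version, remembering the previous target."""
        if version not in self._versions:
            raise KeyError(f"unknown version {version}")
        if alias in self._aliases:
            self._history.setdefault(alias, []).append(self._aliases[alias])
        self._aliases[alias] = version

    def rollback(self, alias):
        """Move the alias back to its previous target and return that version."""
        previous = self._history.get(alias, [])
        if not previous:
            raise ValueError(f"no earlier version recorded for {alias!r}")
        self._aliases[alias] = previous.pop()
        return self._aliases[alias]

    def resolve(self, alias):
        """Return the artifact URI the alias currently points at."""
        return self._versions[self._aliases[alias]]
```

Serving code that always calls `resolve("production")` picks up a promotion or a rollback without any change on its side, which is the main reason to promote by alias rather than by hard-coded version number.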

Key Concepts

ML Metadata — tracks lineage: which dataset trained which model, which pipeline produced what artifact
Explainable AI — feature attributions (e.g. sampled Shapley, integrated gradients, XRAI) for model transparency
Vertex AI Workbench — managed JupyterLab for experimentation
Training pipeline vs custom job — pipeline = orchestrated multi-step; custom job = single training run

Checklist

[ ] Training data versioned and reproducible?
[ ] Model evaluation metrics gated before promotion?
[ ] Serving endpoint has traffic splitting for safe rollout?
[ ] Model monitoring enabled (skew + drift detection)?
[ ] Feature Store used to avoid feature duplication?
[ ] Pipeline steps containerized and versioned?

Output Format

🔴 Critical — no model monitoring in production (silent degradation)
🟡 Warning — no traffic splitting for new model versions, no feature versioning
🟢 Suggestion — Feature Store for cross-team feature reuse, Explainable AI for compliance
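
One way to produce this output from checklist answers is a small rule table. The sketch below is an assumption about how such a report could be generated, not part of the skill itself; the check keys, the `RULES` table, the `report` function, and the plain hyphen used as separator are all hypothetical.

```python
# Hypothetical mapping from a checklist key to (severity, finding text).
RULES = {
    "model_monitoring": ("critical", "no model monitoring in production (silent degradation)"),
    "traffic_splitting": ("warning", "no traffic splitting for new model versions"),
    "feature_versioning": ("warning", "no feature versioning"),
    "feature_store": ("suggestion", "Feature Store for cross-team feature reuse"),
    "explainable_ai": ("suggestion", "Explainable AI for compliance"),
}
LABELS = {
    "critical": "🔴 Critical",
    "warning": "🟡 Warning",
    "suggestion": "🟢 Suggestion",
}
ORDER = ["critical", "warning", "suggestion"]


def report(checks):
    """Return one line per check that is missing or False, most severe first."""
    findings = [RULES[name] for name in RULES if not checks.get(name, False)]
    # list.sort is stable, so findings of equal severity keep RULES order.
    findings.sort(key=lambda finding: ORDER.index(finding[0]))
    return [f"{LABELS[severity]} - {text}" for severity, text in findings]
```

Passing a dictionary such as `{"model_monitoring": True, "traffic_splitting": False}` yields the warning for traffic splitting plus a line for every key that was left out, since a missing answer is treated as a failed check.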

Exam Tips

Feature Store online = real-time serving (low latency); offline = batch training (BigQuery)
Model monitoring = skew (train vs serve) + drift (serve distribution over time)
Vertex AI Pipelines = Kubeflow Pipelines on GCP (not Cloud Composer/Airflow)
AutoML Tabular = good baseline; custom training when you need specific architecture
Batch prediction = no endpoint needed; just submit job → results to GCS/BigQuery
Traffic splitting on endpoints = canary release for models (the same idea as a canary deployment of a service)

development

Updated Apr 23, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add kienbui1995/magic-powers skills/cloud/gcp/vertex-ai-mlops

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | What it checks | Status | Percentage |
|---------|----------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 24, 2026, 12:53 AM (44.2s, 1 file scanned)

SKILL.md

name: vertex-ai-mlops
description: Use when building ML pipelines on Vertex AI, managing model lifecycle, setting up feature stores, or deploying models for serving. Covers GCP-PDE domain: Maintain and automate data workloads (~10-15%) and GCP ML Engineer domain: MLOps (~30-35%).

Install manually

# Clone the repo
git clone https://github.com/kienbui1995/magic-powers.git

# Copy into Claude Code skills folder (global)
cp -r magic-powers/skills/cloud/gcp/vertex-ai-mlops ~/.claude/skills/

Repository

kienbui1995/magic-powers

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT