plugins/sagemaker-ai/skills/model-deployment/SKILL.md
Generates a Jupyter notebook that deploys fine-tuned models from SageMaker Serverless Model Customization to SageMaker endpoints or Bedrock. Use when the user says "deploy my model", "create an endpoint", "make it available", or asks about deployment options. Identifies the correct deployment pathway (Nova vs OSS), generates deployment code, and handles endpoint configuration.
`npx skillsauth add awslabs/agent-plugins model-deployment`

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Identifies the correct deployment pathway based on model characteristics and generates deployment code.
This skill supports deploying Nova and OSS models that were fine-tuned through SageMaker Serverless Model Customization only.
Not supported:
.ipynb file with the complete notebook JSON, OR use notebook MCP tools (e.g., `create_notebook`, `add_cell`) if available. Do NOT use bash commands, shell scripts, or echo/cat piping to generate notebooks.

You need the training job name or ARN. Check the conversation history first: the user may have already mentioned it, or it may be available from earlier steps in the workflow (e.g., fine-tuning). If not, ask the user.
Once you have the training job name or ARN, use the AWS MCP tool to look it up:
1. Run `describe-training-job` and extract:
   - The model artifact S3 location (`ModelArtifacts.S3ModelArtifacts` or `OutputDataConfig.S3OutputPath`)
   - The IAM role (`RoleArn`)
2. Run `list-tags` on the training job ARN and extract:
   - The `sagemaker-studio:jumpstart-model-id` tag

Unsupported models: This skill only supports OSS and Nova models that were LoRA fine-tuned through SageMaker Serverless Model Customization. If the model doesn't match, tell the user this skill can't help and suggest the finetuning skill.
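The lookup above can be sketched in Python. This is a minimal illustration, not the skill's own method: the skill calls these operations through the AWS MCP tool, while the sketch swaps in boto3's `describe_training_job` and `list_tags`. The function names and the returned dictionary keys are hypothetical, and tag pagination is ignored.

```python
from typing import Any, Dict, Optional

MODEL_ID_TAG = "sagemaker-studio:jumpstart-model-id"


def extract_deployment_inputs(
    job: Dict[str, Any], tags: Dict[str, Any]
) -> Dict[str, Optional[str]]:
    """Pull the fields named above out of the two API responses."""
    # Prefer the concrete artifact path; fall back to the configured output path.
    artifacts = job.get("ModelArtifacts", {}).get("S3ModelArtifacts")
    if not artifacts:
        artifacts = job.get("OutputDataConfig", {}).get("S3OutputPath")

    # ListTags returns a list of {"Key": ..., "Value": ...} pairs.
    tag_map = {t["Key"]: t["Value"] for t in tags.get("Tags", [])}

    return {
        "s3_artifacts": artifacts,           # hypothetical key name
        "role_arn": job.get("RoleArn"),      # hypothetical key name
        "model_id": tag_map.get(MODEL_ID_TAG),  # None if the tag is absent
    }


def lookup_training_job(job_name: str) -> Dict[str, Optional[str]]:
    """Fetch both responses with boto3 (needs AWS credentials and a region)."""
    import boto3  # imported lazily so the parser above has no dependencies

    sm = boto3.client("sagemaker")
    job = sm.describe_training_job(TrainingJobName=job_name)
    tags = sm.list_tags(ResourceArn=job["TrainingJobArn"])
    return extract_deployment_inputs(job, tags)
```

A `model_id` of `None` means the tag was not found, which is one signal that the job may fall outside what this skill supports.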
Use the following table:
| Model Type | Eligible Targets   |
| ---------- | ------------------ |
| OSS        | SageMaker, Bedrock |
| Nova       | SageMaker, Bedrock |
If only one target is eligible, confirm it with the user. Use details from Step 5.
If multiple targets are eligible, help the user decide. Use details from Step 5.
If no targets are eligible, tell the user and explain why.
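The three branches above amount to a small decision over the eligibility table. The sketch below shows one way to express it; the action names (`confirm_single_target` and so on) are hypothetical labels for illustration and do not come from the skill.

```python
from typing import Dict, List, Tuple

# Mirrors the eligibility table above.
ELIGIBLE_TARGETS: Dict[str, List[str]] = {
    "OSS": ["SageMaker", "Bedrock"],
    "Nova": ["SageMaker", "Bedrock"],
}


def next_action(model_type: str) -> Tuple[str, List[str]]:
    """Return a (hypothetical) action label plus the eligible targets."""
    targets = ELIGIBLE_TARGETS.get(model_type, [])
    if not targets:
        # No eligible target: tell the user and explain why.
        return ("explain_ineligible", targets)
    if len(targets) == 1:
        # Exactly one target: confirm it with the user.
        return ("confirm_single_target", targets)
    # Several targets: help the user decide, without recommending one.
    return ("help_user_choose", targets)
```

With the table as it stands, both supported model types reach the `help_user_choose` branch; the other two branches only matter if the table changes or the model type is unrecognized.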
Present the eligible options to the user. If both SageMaker and Bedrock are available, present these details to help them decide:
SageMaker Endpoint:
Bedrock:
Do NOT make a recommendation. Let the user choose.
Do NOT mention technical details like merged/unmerged weights, reference files, or APIs, unless the user asks.
⏸ Wait for user to select a deployment option.
Before proceeding to deployment, display the model's license or service terms to the user.
Read `references/model-licenses.md` and look up the model by its model ID (determined in Step 1).

⏸ Wait for the user to confirm before proceeding.
Read the reference file for the selected pathway and follow its instructions.
| Model Type | Deployment Target | Reference |
| ---------- | ----------------- | ------------------------------------- |
| OSS | SageMaker | references/deploy-oss-sagemaker.md |
| OSS | Bedrock | references/deploy-oss-bedrock.md |
| Nova | SageMaker | references/deploy-nova-sagemaker.md |
| Nova | Bedrock | references/deploy-nova-bedrock.md |
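The table above is a plain lookup keyed on model type and deployment target. A minimal sketch of that lookup follows; the paths are copied from the table, while the `reference_for` helper is a hypothetical name used only for illustration.

```python
from typing import Dict, Tuple

# (model type, deployment target) -> reference file, copied from the table above.
REFERENCES: Dict[Tuple[str, str], str] = {
    ("OSS", "SageMaker"): "references/deploy-oss-sagemaker.md",
    ("OSS", "Bedrock"): "references/deploy-oss-bedrock.md",
    ("Nova", "SageMaker"): "references/deploy-nova-sagemaker.md",
    ("Nova", "Bedrock"): "references/deploy-nova-bedrock.md",
}


def reference_for(model_type: str, target: str) -> str:
    """Return the reference file for a pathway, or fail loudly if unsupported."""
    try:
        return REFERENCES[(model_type, target)]
    except KeyError:
        raise ValueError(
            f"No deployment pathway for model type {model_type!r} and target {target!r}"
        ) from None
```

Raising on an unknown pair, rather than falling back to a default file, keeps an unsupported combination from silently following the wrong instructions.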
After deployment completes, provide the user with a summary. Cover these topics, using details from the pathway reference doc you followed in Step 5:
If deployment fails unexpectedly, the model may have been full fine-tuned (FFT) rather than LoRA. To check, download the training job's hydra config from its S3 output path at .hydra/config.yaml:
- `peft_config` populated (`r`, `alpha`, `dropout`, etc.) → LoRA (supported)
- `peft_config: null` → FFT (not supported by this skill)
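The `peft_config` check can be sketched as follows, once `.hydra/config.yaml` has been downloaded from the training job's S3 output path. Two assumptions are built in: the source does not say where `peft_config` sits inside the config, so the sketch searches the whole tree for the first key with that name, and it parses the file with PyYAML's `yaml.safe_load`. The function names are hypothetical.

```python
from typing import Any, Tuple


def find_peft_config(node: Any) -> Tuple[bool, Any]:
    """Depth-first search for a 'peft_config' key; returns (found, value)."""
    if isinstance(node, dict):
        if "peft_config" in node:
            return True, node["peft_config"]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return False, None
    for child in children:
        found, value = find_peft_config(child)
        if found:
            return True, value
    return False, None


def classify_fine_tuning(config: Any) -> str:
    """Map the parsed hydra config to 'LoRA', 'FFT', or 'unknown'."""
    found, value = find_peft_config(config)
    if not found:
        return "unknown"  # key absent: the check is inconclusive
    # A populated mapping means LoRA; null (None) or empty means FFT.
    return "LoRA" if value else "FFT"


def classify_config_file(path: str) -> str:
    """Parse a downloaded config.yaml (assumes PyYAML is installed)."""
    import yaml  # imported lazily; only needed when reading from disk

    with open(path) as fh:
        return classify_fine_tuning(yaml.safe_load(fh))
```

An `"unknown"` result means the key was not found anywhere in the file, which is different from an explicit `peft_config: null` and should not be read as FFT.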