aws-genai-lens/SKILL.md
Enforces AWS Well-Architected Generative AI Lens best practices for foundation model workloads on Amazon Bedrock and SageMaker AI. Use when designing GenAI architectures, implementing RAG pipelines, selecting foundation models, configuring Bedrock Guardrails, fine-tuning models, optimizing GenAI costs, securing AI workloads, or applying responsible AI principles including fairness, explainability, and safety.
Install this skill globally with one command: `npx skillsauth add kayaman/skills aws-genai-lens`. Works with Claude Code, Cursor, and Windsurf.
Reference: AWS GenAI Lens (updated Nov 2025)

| Phase | Key Activities |
|-------|----------------|
| 1. Scoping | Define the business problem; determine if GenAI is the right solution; identify success metrics |
| 2. Model Selection | Evaluate models by modality, accuracy, latency, cost, and context window; benchmark candidates |
| 3. Customization | Prompt engineering → RAG → fine-tuning (progressive investment); evaluate after each stage |
| 4. Development & Integration | Build application logic, implement guardrails, integrate with existing systems |
| 5. Deployment | Deploy to production with monitoring, scaling, and rollback capability |
| 6. Continuous Improvement | Monitor quality metrics; refine prompts, retrieval, and models; manage model lifecycle |

Start with prompt engineering (lowest cost, fastest iteration)
↓ Not meeting quality bar?
Add RAG for knowledge-grounded tasks
↓ Still not meeting quality bar?
Fine-tune when behavior/style is the bottleneck
↓ Need maximum control?
Custom training via SageMaker HyperPod
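
The escalation ladder above can be sketched as a small state machine. This is a simplified illustration, not part of the Lens itself: the `Stage` enum and `next_stage` function are hypothetical names, and the sketch treats the ladder as strictly linear, whereas in practice the choice between RAG and fine-tuning also depends on whether knowledge or behavior is the bottleneck.

```python
from enum import Enum


class Stage(Enum):
    """Customization rungs, ordered from lowest to highest investment."""
    PROMPT_ENGINEERING = 1
    RAG = 2
    FINE_TUNING = 3
    CUSTOM_TRAINING = 4


def next_stage(current: Stage, meets_quality_bar: bool) -> Stage:
    """Return the stage to work in next.

    Stay on the current rung once the quality bar is met; otherwise
    escalate by one rung. The top rung has nowhere further to go.
    """
    if meets_quality_bar or current is Stage.CUSTOM_TRAINING:
        return current
    return Stage(current.value + 1)


if __name__ == "__main__":
    stage = Stage.PROMPT_ENGINEERING
    # Hypothetical evaluation outcomes after each stage.
    for passed in (False, False, True):
        stage = next_stage(stage, passed)
    print(stage.name)  # FINE_TUNING
```

Evaluating after every rung, as the lifecycle table recommends, is what supplies the `meets_quality_bar` input here.
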

| Criterion | Considerations |
|-----------|----------------|
| Modality | Text, image, audio, video, multi-modal, embedding |
| Quality | Accuracy on your task benchmarks; reasoning capability |
| Context Window | How much input context the task requires |
| Latency | Real-time vs batch tolerance |
| Cost | Per-token pricing; throughput pricing; reserved capacity |
| Provider | Anthropic, Meta, Mistral, Amazon, Cohere, AI21, Stability AI (via Bedrock) |

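
One way to turn these criteria into a comparable number is a weighted score over benchmarked candidates. The sketch below is an assumption-laden illustration: the `Candidate` fields, the default weights, and the normalization are all hypothetical choices, and the measurements are meant to come from your own task benchmarks. Context window is treated as a hard filter; accuracy, latency, and cost are traded off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float            # task benchmark score in [0, 1], higher is better
    p95_latency_s: float       # seconds, lower is better
    cost_per_1k_tokens: float  # USD, lower is better
    context_window: int        # tokens


def rank(candidates, required_context, weights=(0.6, 0.2, 0.2)):
    """Drop models whose context window is too small, then sort by weighted score."""
    eligible = [c for c in candidates if c.context_window >= required_context]
    if not eligible:
        return []
    # Normalize latency and cost against the worst eligible candidate.
    max_latency = max(c.p95_latency_s for c in eligible) or 1.0
    max_cost = max(c.cost_per_1k_tokens for c in eligible) or 1.0
    w_acc, w_lat, w_cost = weights

    def score(c):
        return (
            w_acc * c.accuracy
            + w_lat * (1 - c.p95_latency_s / max_latency)
            + w_cost * (1 - c.cost_per_1k_tokens / max_cost)
        )

    return sorted(eligible, key=score, reverse=True)


if __name__ == "__main__":
    # Hypothetical benchmark results, not real model figures.
    models = [
        Candidate("model-a", 0.90, 2.0, 0.010, 200_000),
        Candidate("model-b", 0.80, 0.5, 0.002, 32_000),
        Candidate("model-c", 0.95, 1.0, 0.005, 8_000),
    ]
    print([m.name for m in rank(models, required_context=16_000)])
```

The weights encode how much the workload values quality over speed and price, so they are worth revisiting per use case rather than fixing once.
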
RAG augments FM responses with retrieved knowledge, reducing hallucination and keeping responses grounded.
Amazon Bedrock Knowledge Bases automates the full RAG pipeline: ingestion → chunking → embedding → vector storage → retrieval → prompt augmentation.
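
As a rough illustration of the retrieval and prompt-augmentation end of that pipeline, the sketch below builds a request for the `retrieve_and_generate` operation of the `bedrock-agent-runtime` client. It assumes the managed service described here is Amazon Bedrock Knowledge Bases; the helper names `build_rag_request` and `ask` are hypothetical, and the knowledge base ID and model ARN must be supplied by the caller.

```python
def build_rag_request(question: str, knowledge_base_id: str, model_arn: str,
                      top_k: int = 5) -> dict:
    """Build keyword arguments for bedrock-agent-runtime retrieve_and_generate."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
                "retrievalConfiguration": {
                    # Number of chunks retrieved to ground the answer.
                    "vectorSearchConfiguration": {"numberOfResults": top_k}
                },
            },
        },
    }


def ask(question: str, knowledge_base_id: str, model_arn: str) -> str:
    """Send the question to the knowledge base and return the grounded answer."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(
        **build_rag_request(question, knowledge_base_id, model_arn)
    )
    return response["output"]["text"]
```

Calling `ask` requires AWS credentials and an existing knowledge base; the response also carries a `citations` list that can be surfaced to users for traceability.
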

| Store | Best For | Key Feature |
|-------|----------|-------------|
| OpenSearch Serverless | General-purpose vector search | Fully managed; hybrid search (vector + keyword) |
| Aurora PostgreSQL (pgvector) | Teams already using Aurora | Familiar SQL; transactional consistency |
| Neptune Analytics | Knowledge graph + vector (GraphRAG) | Links related content; improves complex reasoning |
| Amazon S3 Vectors (2025) | High-scale, cost-sensitive workloads | Up to 2B vectors/index; ~90% cost reduction vs specialized DBs |
| MongoDB Atlas | MongoDB-native teams | Atlas Vector Search integration |
| Pinecone | Dedicated vector DB teams | Purpose-built vector search |


| Strategy | Use When |
|----------|----------|
| Fixed-size | Uniform content; simple implementation |
| Semantic | Content with natural topic boundaries |
| Hierarchical | Content with parent-child structure (sections, subsections) |
| Custom (Lambda) | Domain-specific chunking logic required |

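
To make the fixed-size strategy concrete, here is a minimal sketch of chunking with overlap. It is an approximation under stated assumptions: it counts whitespace-separated words rather than model tokens, and `chunk_fixed` is a hypothetical helper, not the managed service's implementation.

```python
def chunk_fixed(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of at most max_words words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    boundary still appears intact in one of the two neighbouring chunks.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")

    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    print(chunk_fixed("a b c d e f g h i j", max_words=4, overlap=1))
    # ['a b c d', 'd e f g', 'g h i j']
```

Semantic and hierarchical strategies replace the fixed `step` with boundaries derived from the content, which is why they suit documents with clear topical or sectional structure.
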
SHOULD fine-tune only when prompt engineering + RAG cannot achieve the required quality for behavior or style.

| Approach | Use When | Cost |
|----------|----------|------|
| Supervised Fine-Tuning (SFT) | Specific task format or domain style | Moderate |
| Continued Pre-Training | Domain-specific vocabulary and knowledge | High |
| LoRA (Low-Rank Adaptation) | Resource-efficient task adaptation | Low (dramatically less compute) |
| QLoRA | Fine-tuning on limited GPU memory | Very low |
| Model Distillation | Compress a large model's capability into a smaller one | Moderate |
| Reinforcement Fine-Tuning (2025) | Align model behavior with human preferences | High (but avg 66% accuracy gain) |

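
The "dramatically less compute" entry for LoRA follows from simple arithmetic: instead of updating a full weight matrix of shape `d_out × d_in`, LoRA trains two low-rank factors of shapes `d_out × r` and `r × d_in`. The sketch below works through one layer; the 4096 × 4096 layer size and rank 8 are illustrative assumptions, not figures from the Lens.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for the two low-rank factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    d_out, d_in, rank = 4096, 4096, 8  # hypothetical layer size and rank
    full = full_params(d_out, d_in)
    lora = lora_params(d_out, d_in, rank)
    print(full)                    # 16777216
    print(lora)                    # 65536
    print(f"{lora / full:.2%}")    # 0.39%
```

QLoRA applies the same low-rank update on top of a quantized base model, which is what pushes its memory footprint lower still.
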

| Dimension | Key Practices |
|-----------|---------------|
| Fairness | Bias detection and auditing across demographics; diverse evaluation datasets |
| Explainability | Interpretable decisions; document model limitations and known failure modes |
| Privacy & Security | Encryption at rest/transit; access controls; regulatory compliance (GDPR, HIPAA) |
| Safety | Content filtering; guardrails; output validation; harmful content prevention |
| Controllability | Human oversight; monitoring; ability to adjust or override model behavior |
| Veracity & Robustness | Accuracy validation; automated reasoning for hallucination detection; adversarial testing |
| Governance | AI review committees; model cards; documentation; escalation procedures |
| Transparency | Clear disclosure that content is AI-generated; model provenance documentation |

Amazon Bedrock Guardrails provides six safeguard policies, applied to both inbound prompts and outbound responses:

| Policy | Purpose | Configuration |
|--------|---------|---------------|
| Content Filters | Block hate, insults, sexual, violence, misconduct, prompt attack | Configurable strength per category |
| Denied Topics | Block up to 30 specific topics | Natural language topic descriptions |
| Word Filters | Exact-match blocking of specific words | Word list management |
| Sensitive Information Filters | Detect and block/mask PII | ML-based detection; block or mask options |
| Contextual Grounding Checks | Detect hallucination in RAG responses | Grounding score threshold |
| Automated Reasoning Checks | Formal mathematical logic verification | "First GenAI safeguard to use formal logic" |

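
A guardrail covering five of these policies might be configured roughly as follows, using the `create_guardrail` operation of the `bedrock` client. Treat this as a sketch under assumptions: the helper names, the denied topic, the blocked word, the messages, and the 0.75 grounding threshold are all hypothetical placeholders, and Automated Reasoning checks are left out because they depend on a separately authored policy.

```python
def build_guardrail_request(name: str) -> dict:
    """Build keyword arguments for bedrock create_guardrail (illustrative values)."""
    content_filters = [
        {"type": category, "inputStrength": "HIGH", "outputStrength": "HIGH"}
        for category in ("HATE", "INSULTS", "SEXUAL", "VIOLENCE", "MISCONDUCT")
    ]
    # Prompt-attack detection applies to inbound prompts only.
    content_filters.append(
        {"type": "PROMPT_ATTACK", "inputStrength": "HIGH", "outputStrength": "NONE"}
    )
    return {
        "name": name,
        "blockedInputMessaging": "Sorry, I cannot help with that request.",
        "blockedOutputsMessaging": "Sorry, I cannot provide that response.",
        "contentPolicyConfig": {"filtersConfig": content_filters},
        "topicPolicyConfig": {
            "topicsConfig": [{
                "name": "investment-advice",  # hypothetical denied topic
                "definition": "Recommendations about buying or selling specific securities.",
                "type": "DENY",
            }]
        },
        "wordPolicyConfig": {"wordsConfig": [{"text": "project-codename"}]},  # hypothetical
        "sensitiveInformationPolicyConfig": {
            # ANONYMIZE masks the entity; BLOCK rejects the whole message.
            "piiEntitiesConfig": [{"type": "EMAIL", "action": "ANONYMIZE"}]
        },
        "contextualGroundingPolicyConfig": {
            "filtersConfig": [{"type": "GROUNDING", "threshold": 0.75}]
        },
    }


def create_guardrail(name: str) -> str:
    """Create the guardrail and return its ID. Requires AWS credentials."""
    import boto3  # imported lazily so the builder can be inspected offline

    client = boto3.client("bedrock")
    return client.create_guardrail(**build_guardrail_request(name))["guardrailId"]
```

Keeping the request in a plain dictionary makes the policy reviewable and testable before anything is created in an account.
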

| Inference Option | Use When |
|------------------|----------|
| On-Demand (per token) | Variable, unpredictable workloads; development and testing |
| Provisioned Throughput | Sustained, predictable production workloads; guaranteed capacity |
| Batch (up to 50% discount) | Non-real-time processing; bulk document analysis |
| Cross-Region Inference | Need lower latency or higher availability across regions |

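
The trade-off between on-demand and batch inference can be estimated with back-of-the-envelope arithmetic. In the sketch below every figure is an assumption: the per-1K-token prices and token volumes are made up for illustration, and the 50% discount is the upper bound quoted in the table, so actual savings may be smaller.

```python
def inference_cost(input_tokens: int, output_tokens: int,
                   price_in_per_1k: float, price_out_per_1k: float,
                   batch: bool = False, batch_discount: float = 0.5) -> float:
    """Estimate token-based inference cost in USD.

    batch_discount is the fraction taken off on-demand pricing when the
    workload runs as a batch job (0.5 is the best case from the table).
    """
    cost = (input_tokens / 1000) * price_in_per_1k \
         + (output_tokens / 1000) * price_out_per_1k
    return cost * (1 - batch_discount) if batch else cost


if __name__ == "__main__":
    # Hypothetical monthly volume and prices, not real Bedrock rates.
    volume = dict(input_tokens=100_000_000, output_tokens=20_000_000,
                  price_in_per_1k=0.003, price_out_per_1k=0.015)
    print(f"on-demand: ${inference_cost(**volume):.2f}")              # $600.00
    print(f"batch:     ${inference_cost(**volume, batch=True):.2f}")  # $300.00
```

Provisioned Throughput is priced by reserved capacity rather than per token, so comparing it needs a utilization estimate rather than this formula.
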
Key guarantee: Amazon Bedrock does not allow model providers to learn from customer data or prompts.
When designing or reviewing a GenAI workload on AWS:

| Book / Resource | Author(s) | Publisher | Year |
|-----------------|-----------|-----------|------|
| AWS Well-Architected Generative AI Lens (official) | AWS | AWS Docs | 2025 |
| AI Engineering | Chip Huyen | O'Reilly | 2025 |
| Generative AI on AWS | Fregly, Barth, Eigenbrode | O'Reilly | 2023 |
| Hands-On Large Language Models | Jay Alammar, Maarten Grootendorst | O'Reilly | 2024 |
| Build a Large Language Model (From Scratch) | Sebastian Raschka | Manning | 2024 |
| Designing Machine Learning Systems | Chip Huyen | O'Reilly | 2022 |