
Amazon Bedrock Runtime API for model inference including Claude, Nova, Titan, and third-party models. Covers invoke-model, converse API, streaming responses, token counting, async invocation, and guardrails. Use when invoking foundation models, building conversational AI, streaming model responses, optimizing token usage, or implementing runtime guardrails.
Quality assurance review for implementations. Use when reviewing code quality, checking implementation standards, performing QA cycles, or validating feature quality.
Generate feature lists from specifications. Use when creating feature_list.json, converting requirements to features, generating 50-100+ testable features, or initializing autonomous projects.
Execute implementation tasks in autonomous coding. Use when running feature implementations, executing build tasks, processing feature queue, or orchestrating task completion.
Auto-Claude Graphiti memory system configuration and usage. Use when setting up memory persistence, configuring LLM/embedding providers, querying knowledge graph, or optimizing memory performance.
Auto-Claude performance optimization and cost management. Use when optimizing token usage, reducing API costs, improving build speed, or tuning agent performance.
Master orchestrator for autonomous coding projects. Use when starting autonomous projects, continuing sessions, checking status, or running complete autonomous workflows.
Amazon Bedrock AgentCore Memory for persistent agent knowledge across sessions. Episodic memory for learning from interactions, short-term for session context. Use when building agents that remember user preferences, learn from conversations, or maintain context across sessions.
Comprehensive Amazon Bedrock Guardrails implementation for AI safety with 6 safeguard policies (content filters, PII redaction, topic denial, word filters, contextual grounding, automated reasoning). Use when implementing content moderation, detecting prompt attacks, preventing hallucinations, protecting sensitive data, enforcing compliance policies, or securing generative AI applications with mathematical verification.
Amazon Bedrock Prompt Management for creating, versioning, and managing prompt templates with variables, multi-variant A/B testing, and flow integration. Use when creating reusable prompt templates, managing prompt versions, implementing A/B testing for prompts, integrating prompts with Bedrock Flows, optimizing prompt engineering, or building production prompt catalogs.
State snapshots and rollback for safe experimentation. Use when creating checkpoints, rolling back changes, managing recovery points, or implementing safe experimentation.
Comprehensive guide to Claude Opus 4.5, Anthropic's most intelligent model with effort parameter for reasoning control. Covers model capabilities, benchmarks, effort levels (high/medium/low), hybrid reasoning, and model selection. Use when working with Opus 4.5, optimizing reasoning depth, choosing models, or understanding effort parameter trade-offs.
Isolated component testing for React, Vue, and Svelte with Playwright. Use when testing UI components in isolation, testing component interactions, or building component test suites.
EKS networking configuration including VPC CNI, load balancers, and network policies. Use when setting up cluster networking, configuring ingress/load balancing, implementing network security, troubleshooting connectivity, or optimizing network costs.
Interactive chat workflows with Gemini CLI including context management, multimodal conversations, and session persistence. Use for extended AI conversations, brainstorming, or collaborative problem-solving.
Create, validate, and deploy Claude Code hooks for workflow automation. Hooks enable event-driven automation at 8 lifecycle points (PreToolUse, PostToolUse, UserPromptSubmit, etc.) with structured JSON control. Use when automating code formatting, security gates, observability integration, validation enforcement, or any event-driven workflow automation in Claude Code.
External persistent memory for cross-session knowledge. Use when storing error patterns, retrieving learned solutions, managing causal memory chains, or persisting project knowledge.
Spawn and coordinate parallel agents for faster completion. Use when running parallel tasks, spawning subagents, coordinating concurrent work, or optimizing throughput.
Master orchestrator for generating Ralph Wiggum-compatible prompts. Analyzes task requirements and routes to appropriate generator (single-task, multi-task, project, or research). Use when you need to create any Ralph loop prompt and want automatic selection of the right generator.
Track skill changes, improvements, and evolution over time. Task-based operations for version tracking, change documentation, impact measurement, and evolution analysis. Use when tracking skill versions, documenting changes over time, measuring improvement impact, or analyzing how skills evolved.
Terra API authentication, credentials management, and environment configuration. Use when setting up Terra integration, managing API keys, generating widget sessions, or configuring testing/staging/production environments.
Terra API webhook handling for real-time health data. Use when setting up webhook endpoints, verifying signatures, handling events, or debugging webhook issues.
Provision production-ready AWS ECS clusters with Terraform. Covers cluster configuration, Fargate and EC2 launch types, task definitions, services, load balancer integration, auto-scaling, and deployment strategies. Use when provisioning ECS, setting up container orchestration on AWS, configuring Fargate services, or managing ECS infrastructure as code.
Extract insights from autonomous coding sessions. Use when learning from completions, extracting patterns, analyzing decisions, or improving future performance.
AI-powered browser automation using Stagehand v3 and Claude. Use when building self-healing tests, AI agents, dynamic web automation, or when traditional selectors break frequently due to UI changes.
Railway debugging and issue resolution. Use when deployments fail, builds error, services crash, performance degrades, or networking issues occur.
Main orchestration loop for autonomous coding. Use when running autonomous sessions, orchestrating feature completion, managing continuous loops, or coordinating agent lifecycle.
Set up a secure VS Code IDE in the browser with code-server on WSL2, accessible from mobile/tablet via ngrok/Cloudflare/Tailscale. Full IDE features with extension support, resource management, and performance optimization. Use when you need remote IDE access, VS Code in the browser, remote development, or a full coding environment on mobile/tablet.
Supabase Realtime for live subscriptions, broadcasts, and presence. Use when implementing real-time features, live updates, chat, or online presence tracking.
Manage git commits for autonomous coding. Use when committing feature implementations, creating descriptive commits, managing git workflow, or handling version control.
Manage knowledge graph for autonomous coding. Use when storing relationships, querying connected knowledge, building project understanding, or maintaining semantic memory.
Manage persistent memory for autonomous coding. Use when storing/retrieving knowledge, managing Graphiti integration, persisting learnings, or accessing episodic memory.
Session lifecycle management for autonomous coding. Use when starting sessions, resuming work, detecting session type (init vs continue), or managing auto-continuation between sessions.
Run TDD cycle for feature implementation. Use when implementing features with RED-GREEN-REFACTOR, running test-driven development, automating TDD workflow, or ensuring test-first development.
Manage git worktrees for isolated development. Use when creating isolated workspaces, managing parallel development, handling worktree lifecycle, or merging completed work.
Persistent memory architecture for AI agents across sessions. Episodic memory (past events), procedural memory (learned skills), semantic memory (knowledge graph), short-term memory (active context). Use when implementing cross-session persistence, skill learning, context preservation, personalization, or building truly adaptive AI systems with long-term memory.
Auto-Claude autonomous build system. Use when running builds, understanding agent workflow, managing parallel execution, or troubleshooting build issues.
Auto-update system for Auto-Claude skills and documentation. Use when checking for updates, synchronizing with upstream, updating skills automatically, or managing version compatibility.
Auto-Claude workspace and git worktree management. Use when reviewing changes, merging builds, managing branches, or understanding isolation strategy.
Amazon Bedrock AgentCore Evaluations for testing and monitoring AI agent quality. 13 built-in evaluators plus custom LLM-as-Judge patterns. Use when testing agents, monitoring production quality, setting up alerts, or validating agent behavior.
Amazon Bedrock AgentCore Policy for defining agent boundaries using natural language and Cedar. Deterministic policy enforcement at the Gateway level. Use when setting agent guardrails, access control, tool permissions, or compliance rules.
Browser-based E2E testing for feature verification. Use when running end-to-end tests, validating features in browser, verifying user flows, or testing feature completion.
Interactive REPL workflows with Codex CLI including session management, multimodal conversations, and automated execution. Use for extended development sessions, debugging, or collaborative problem-solving.
Git-aware development workflows with Codex CLI including intelligent commits, PR automation, branch management, and diff application. Use for git operations, PR reviews, or automated git workflows.
AWS Fargate serverless container compute for ECS. Covers Fargate vs EC2 decision guide, CPU/memory sizing, platform versions, Fargate Spot cost optimization, Graviton/ARM architecture, networking, and EFS integration. Use when deploying serverless containers, optimizing Fargate costs, sizing Fargate tasks, or choosing between Fargate and EC2 launch types.
FinnHub financial data API integration for stocks, forex, crypto, news, and fundamentals. Use when fetching real-time quotes, company profiles, financial statements, insider trading, earnings calendars, or market news.
Generate images with Gemini 3 Pro Image (Nano Banana Pro). Covers 4K generation, text rendering, grounded generation with Google Search, conversational editing, and cost optimization. Use when creating images, generating 4K images, editing images conversationally, fact-verified image generation, or image output tasks.
Integrate Gemini AI CLI into Claude Code for AI collaboration, code analysis, and tool execution. Use when working with Gemini AI, Google AI, multimodal tasks, or needing advanced AI capabilities.
First-session agent for autonomous coding projects. Use when starting a new autonomous project, generating feature lists, setting up environments, or scaffolding project structure.
Comprehensive research and analysis using Claude (subagents), Gemini CLI, and Codex CLI. Multi-perspective research with cross-verification, iterative refinement, and 100% citation coverage. Use for security analysis, architecture research, code quality assessment, performance analysis, or any research requiring rigorous verification and multiple AI perspectives.
Configure Grafana alerts for Claude Code anomalies and thresholds. Use when setting up monitoring alerts for sessions, errors, context usage, or subagents.
Complete planning workflow orchestrating architecture planning, task breakdown, and progress tracking setup. Sequential workflow from requirements analysis through structure design and task breakdown to implementation tracking. Use when planning complex skills, setting up comprehensive project plans, or preparing for systematic skill development with full planning artifacts.
Comprehensive Playwright E2E testing framework for browser automation. Use when setting up tests, writing E2E scenarios, debugging test failures, configuring CI/CD pipelines, or running browser automation on WSL2.
Railway.com GraphQL API automation for projects, services, deployments, and environment variables. Use when automating Railway operations, querying project data, managing deployments, setting variables via API, or integrating Railway into workflows.
Railway.com built-in metrics, monitoring dashboards, alerting (Pro plan), and external OTEL integration with Grafana. Use when setting up monitoring, creating dashboards, configuring alerts, integrating Prometheus/Loki/Tempo, deploying Grafana stack, or analyzing Railway service metrics.
Generate Ralph-compatible prompts for entire projects from scratch. Creates comprehensive prompts with architecture phase, implementation phases, testing, and documentation. Use when building complete applications, libraries, CLI tools, or any greenfield project requiring end-to-end development.
Autonomous TDD development loop with parallel agent swarm, category evolution, and convergence detection. Use when running autonomous game development, quality improvement loops, or comprehensive codebase reviews.
Complete quality assurance workflow orchestrating validation, comprehensive review, and functional testing. Sequential workflow from quality gating through multi-dimensional review to scenario testing. Use when conducting complete skill quality assurance, pre-deployment validation, or comprehensive quality checks combining multiple review approaches.
Simple calculator for basic arithmetic operations (addition, subtraction, multiplication, division). Use when performing calculations, converting units, or working with numbers.
Apply improvements to Claude Code skills systematically. Workflow for planning updates, implementing changes, validating improvements, and documenting changes. Use when applying review recommendations, updating skills based on feedback, enhancing existing skills, or implementing systematic improvements across multiple skills.
Supabase JavaScript client API and REST API usage. Use when integrating Supabase, setting up clients, or using REST endpoints directly.
Supabase Edge Functions development and deployment using Deno runtime. Use when creating serverless functions, webhooks, API endpoints, or scheduled tasks.
Terra API device and provider connections. Use when connecting users to wearables (Fitbit, Garmin, Apple Health, Oura, WHOOP), managing user sessions, or handling disconnections.
Terra SDK integration for Python, JavaScript, iOS, Android, React Native, and Flutter. Use when implementing Terra in applications, choosing SDKs, or integrating mobile health sources.
Complete development workflows where Claude writes the code while Gemini and Codex provide research, planning, reviews, and different perspectives. Claude remains the main developer. Use for complex projects requiring expert planning and multi-perspective reviews.
Execute and manage Codex CLI tools including file operations, shell commands, web search, and automation patterns. Use for automated workflows, tool orchestration, and full automation with permission bypass.
Generate tests for features using TDD approach. Use when creating test files, generating test cases, implementing RED phase of TDD, or scaffolding test infrastructure.
Create handoff packages for session transitions. Use when ending sessions, preparing for continuation, saving session state, or creating resumable context.
Auto-Claude spec creation and management. Use when creating feature specs, understanding spec pipeline phases, modifying requirements, or managing spec lifecycle.
Amazon Bedrock AgentCore multi-agent orchestration with Agent-to-Agent (A2A) protocol. Supervisor-worker patterns, agent collaboration, and hierarchical delegation. Use when building multi-agent systems, orchestrating specialized agents, or implementing complex workflows.
Incremental development agent with TDD workflow. Use when implementing features one at a time, following test-driven development, making commits, or resuming development work.
Accessibility testing with axe-core and Playwright. Use when checking WCAG compliance, finding a11y issues, ensuring keyboard navigation, or testing screen reader compatibility.
Production-ready code generation and incremental development using Explore-Plan-Code-Commit workflow. Test-driven, with changes under 200 lines, automatic rollback, and multi-agent coordination. Use when implementing features, refactoring code, migrating systems, or integrating components requiring rigorous testing and quality assurance.
Create and manage Grafana dashboards for Claude Code observability. Use when importing pre-built dashboards or creating custom visualizations.
Deploy LGTM observability stack to Railway cloud. Use when deploying cloud-hosted observability with team access.
Automated LGTM + Alloy observability stack deployment using Docker Compose. Use when setting up Claude Code observability infrastructure locally.
Security sandbox for autonomous coding. Use when validating commands, configuring permissions, managing allowlists, or ensuring safe execution.
State persistence for autonomous coding. Use when saving progress, loading state, tracking features, managing checkpoints, or persisting data across sessions.
EKS observability with metrics, logging, and tracing. Use when setting up monitoring, configuring logging pipelines, implementing distributed tracing, building production dashboards, troubleshooting EKS issues, optimizing observability costs, or establishing SLOs.
Comprehensive multi-dimensional skill reviews across structure, content, quality, usability, and integration. Task-based operations with automated validation, manual assessment, scoring rubrics, and improvement recommendations. Use when reviewing skills, ensuring quality, validating production readiness, identifying improvements, or conducting quality assurance.
Real-time cost tracking, budget enforcement, and ROI measurement for AI agent operations. Track token usage, predict costs, enforce budget caps ($50-70/month typical), optimize model selection, cache results, measure cost-to-value. Use when tracking AI costs, preventing budget overruns, optimizing spend, measuring ROI, or ensuring cost-effective AI operations.
Comprehensive analysis operations for code, skills, processes, data, and patterns. Task-based operations with pattern recognition, metrics calculation, trend identification, and actionable insights generation. Use when analyzing code quality, reviewing skill effectiveness, identifying process improvements, extracting patterns, or generating insights from data.
Analyze skill effectiveness through usage feedback, metrics analysis, and outcome assessment. Task-based operations for feedback collection, effectiveness measurement, trend analysis, and insight extraction. Use when analyzing skill effectiveness, measuring ROI, understanding usage patterns, or evaluating toolkit impact based on real usage data.
REST API client builder with authentication, error handling, retry logic, and request management. Supports OAuth, JWT, API keys. Use when building API integrations, creating API clients, or working with REST services.
REST and GraphQL API testing with Playwright. Use when testing APIs, mocking endpoints, validating responses, or integrating API tests with E2E flows.
Auto-Claude CLI command reference and usage patterns. Use when running specs, managing builds, checking status, or using CLI commands for autonomous coding tasks.
Complete Auto-Claude installation and setup guide for all platforms. Use when installing Auto-Claude on WSL, Windows, Linux, or macOS, setting up development environment, or troubleshooting installation issues.
Auto-Claude debugging and troubleshooting guide. Use when fixing installation issues, debugging build failures, resolving agent errors, or diagnosing performance problems.
Token and cost optimization for autonomous coding. Use when tracking token usage, optimizing API costs, managing budgets, or improving efficiency.
Autonomous Claude Code operation using Opus 4.5 for intelligent continuation decisions. Use when running long tasks, multi-step implementations, overnight development, or any workflow requiring continuous autonomous operation without human intervention.
Automatically apply improvements to skills and the ecosystem based on system-reviewer findings and best-practices-learner insights. Workflow for automated improvement identification, priority assessment, safe application, validation, and rollback capability. Use when applying systematic improvements, automating enhancement cycles, bulk updating multiple skills, or implementing ecosystem-wide improvements.
Amazon Bedrock AgentCore platform for building, deploying, and operating production AI agents. Covers Runtime, Gateway, Browser, Code Interpreter, and Identity services. Use when building Bedrock agents, deploying AI agents to production, or integrating with AgentCore services.
Amazon Bedrock AgentCore deployment patterns for production AI agents. Covers starter toolkit, direct code deploy, container deploy, CI/CD pipelines, and infrastructure as code. Use when deploying agents to production, setting up CI/CD, or managing agent infrastructure.
Amazon Bedrock Agents for building autonomous AI agents with foundation model orchestration, action groups, knowledge bases, and session management. Use when creating AI agents, orchestrating multi-step workflows, integrating tools with LLMs, building conversational agents, implementing RAG patterns, managing agent sessions, deploying production agents, or connecting knowledge bases to agents.
Process multimodal inputs (images, video, audio, PDFs) with Gemini 3 Pro. Covers image understanding, video analysis, audio processing, document extraction, media resolution control, OCR, and token optimization. Use when analyzing images, processing video, transcribing audio, extracting PDF content, or working with multimodal data.
Mobile device emulation and responsive testing with Playwright. Use when testing mobile layouts, touch interactions, device-specific features, or responsive breakpoints.
Build visual AI workflows with Amazon Bedrock Flows. Create flows with prompt nodes, knowledge bases, Lambda, inline code, condition branching, iterators, collectors, and DoWhile loops. Version management, aliases, deployment. Use when building multi-step AI workflows, orchestrating models and services, creating condition-based routing, implementing iterative processing, or deploying production AI pipelines.
Amazon Bedrock Knowledge Bases for RAG (Retrieval-Augmented Generation). Create knowledge bases with vector stores, ingest data from S3/web/Confluence/SharePoint, configure chunking strategies, query with retrieve and generate APIs, manage sessions. Use when building RAG applications, implementing semantic search, creating document Q&A systems, integrating knowledge bases with agents, optimizing chunking for accuracy, or querying enterprise knowledge.
Extract learnings and best practices from skill development experience, review findings, and pattern analysis. Task-based operations for pattern extraction, learning documentation, guideline updates, knowledge sharing, and continuous improvement. Use when extracting learnings from completed skills, updating best practices, improving development process, or feeding continuous improvement cycle.
AWS Boto3 SDK patterns for Amazon ECS cluster management, task definitions, services, and Fargate deployments. Use when working with ECS clusters, managing task definitions, deploying services, running one-off tasks, monitoring deployments, or integrating ECS with Python applications.
AWS Boto3 SDK patterns for Amazon EKS cluster management, node groups, authentication tokens, and Kubernetes client integration. Use when working with EKS clusters, managing node groups, generating kubeconfig, creating authentication tokens, integrating Kubernetes Python client, managing Fargate profiles, or implementing IRSA authentication.
CDK8s for type-safe Kubernetes manifests using Python. Use when building complex K8s applications programmatically, generating manifests from code, creating reusable infrastructure patterns, or managing multi-environment deployments.
Advanced tool use patterns including tool search, programmatic calling, and production orchestration. Use when scaling to 10,000+ tools, optimizing token usage, or implementing production tool systems.
Enable and configure Claude Code OTEL telemetry for local or Railway observability stacks. Use when setting up Claude Code to send metrics, logs, and traces to observability backends.
Maintain development momentum and prevent project stalls through progress tracking, blocker resolution, quick wins identification, energy management, and continuation strategies. Task-based operations for detecting stalls, breaking through obstacles, maintaining forward progress, and ensuring completion. Use when progress is stalling, facing blockers, losing energy, needing motivation, or ensuring project continuation to completion.
AWS Controllers for Kubernetes (ACK) for Kubernetes-native AWS resource management. Use when managing AWS resources via kubectl, implementing GitOps for infrastructure, creating self-service developer platforms, integrating AWS services with EKS workloads, or adopting existing AWS resources into Kubernetes.
Configuration management for autonomous coding. Use when loading settings, managing environment variables, configuring providers, or setting up autonomous mode options.
Optimize context usage for autonomous coding. Use when managing context window, prioritizing information, reducing token usage, or improving efficiency.
Validate acceptance criteria and feature completion. Use when checking if features pass, validating test results, verifying acceptance criteria, or determining feature completion status.
Analyze features and their dependencies. Use when mapping feature relationships, detecting blockers, optimizing build order, or identifying critical paths.
Hook installation and management for autonomous coding. Use when setting up Stop hooks, managing pre/post tool hooks, or configuring autonomous continuation.
Advanced computer use patterns for UI automation, application control, and multi-step workflows using Claude's computer use tool. Use when automating desktop tasks, testing applications, analyzing screen content, controlling software programmatically, or building computer vision workflows. Supports zoom tool for enhanced vision on Opus 4.5, multi-step automation, and sophisticated application control.
Comprehensive context management strategies for cost optimization and infinite-length conversations. Covers server-side clearing (tool results, thinking blocks), client-side SDK compaction (automatic summarization), and memory tool integration. Use when managing long conversations, optimizing token costs, preventing context overflow, or enabling continuous agentic workflows.
Comprehensive cost tracking and optimization for production Claude deployments. Covers Admin API usage tracking, efficiency measurement, ROI calculation, optimization patterns (caching, batching, model selection, context editing, effort parameter), and cost prediction. Use when tracking costs, optimizing token usage, measuring efficiency, calculating ROI, reducing production expenses, or implementing cost-effective Claude integrations.
Set up and manage OpenAI Codex CLI authentication including ChatGPT Plus/Pro OAuth, API keys, and multi-account management. Use when configuring Codex access, switching accounts, or troubleshooting authentication.
Integrate OpenAI Codex CLI into Claude Code for AI collaboration, code generation, and automated development. Use when working with OpenAI models (GPT-5.2, GPT-5.1-Codex-Max, o3, o4-mini), code refactoring, git workflows, or needing full automation with permission bypass.
Optimize Claude Code context usage through monitoring, reduction strategies, progressive disclosure, planning/execution separation, and file-based optimization. Task-based operations for context window management, token efficiency, and maintaining conversation quality. Use when managing token costs, optimizing context usage, preventing context overflow, or improving multi-turn conversation quality.
State persistence across autonomous coding sessions. Use when saving progress, loading context, managing feature lists, tracking git history, or restoring session state.
ECS deployment strategies including rolling updates, blue-green with CodeDeploy, canary releases, and GitOps workflows. Covers deployment circuit breakers, rollback strategies, and production deployment patterns. Use when deploying ECS services, implementing blue-green deployments, setting up CI/CD pipelines, or managing production releases.
ECS troubleshooting and debugging guide covering task failures, service issues, networking problems, and performance diagnostics. Use when diagnosing ECS issues, debugging task failures (STOPPED, PENDING), resolving networking problems, investigating IAM/permissions errors, troubleshooting container health checks, or analyzing ECS service health.
Intelligent error detection and recovery for autonomous coding. Use when handling errors, implementing retry logic, recovering from failures, or managing exception handling.
Generate comprehensive ecosystem progress reports showing skills built, efficiency gains, quality metrics, learnings captured, and system evolution. Task-based reporting operations for status reports, efficiency analysis, quality summaries, and evolution documentation. Use when reporting ecosystem progress, communicating status to stakeholders, documenting achievements, or creating milestone reports.
Financial Modeling Prep API for stocks, fundamentals, SEC filings, institutional holdings (13F), and congressional trading. Use when fetching financial statements, ratios, DCF valuations, insider/institutional ownership, or screening stocks.
Advanced Gemini 3 Pro features including function calling, built-in tools (Google Search, Code Execution, File Search, URL Context), structured outputs, thought signatures, context caching, batch processing, and framework integration. Use when implementing tools, function calling, structured JSON output, context caching, batch API, LangChain, Vercel AI, or production features.
Set up and manage Gemini CLI authentication methods including OAuth, API keys, and Vertex AI. Use when configuring Gemini access, switching auth methods, or troubleshooting authentication issues.
Manage MCP (Model Context Protocol) servers with Gemini CLI for extended tool capabilities, custom integrations, and enterprise workflows. Use when integrating external tools, databases, or APIs with Gemini.
Execute and manage Gemini CLI built-in tools including file operations, web search, shell commands, and memory. Use for automated workflows, tool orchestration, and safe execution patterns.
Clean transitions between agents and sessions. Use when preparing handoffs, serializing state, bridging context between agents, or coordinating multi-agent workflows.
Kubernetes Python client for programmatic cluster management. Use when working with Kubernetes API, managing pods, deployments, services, namespaces, configmaps, secrets, jobs, CRDs, EKS clusters, watching resources, automating K8s operations, or building Kubernetes controllers.
Create production-ready, agent-executable plans using verification-first approach, hierarchical decomposition, dependency mapping, and quality gates. Optional multi-AI research integration (Claude + Gemini + Codex). Use when planning complex features, migrations, refactorings, security implementations, or any multi-step agentic workflows requiring rigorous verification and parallel execution coordination.
Meta-skill for building Claude Code skills using Multi-AI research, planning, and implementation. Coordinates Claude, Gemini, and Codex for comprehensive research, synthesizes findings, and generates production-ready skills. Use when creating new skills, enhancing existing skills, researching skill domains, or building skill families.
Test-driven development with independent verification to prevent test gaming. TDD workflows, test generation, coverage validation (≥80% gate, ≥95% target), property-based testing, edge case discovery. Use when implementing TDD workflows, generating comprehensive test suites, validating test coverage, or preventing test gaming through independent multi-agent verification.
Automate Claude Code plugin creation, packaging, validation, and distribution. Use when creating plugins, packaging skills, generating manifests, validating plugin structure, setting up marketplaces, or distributing skill collections.
Analyze projects and recommend observability integration. Use when adding observability to projects Claude Code works on.
Create project-specific Claude Code skills tailored to particular codebases, domains, or organizations. Workflow for analyzing project context, identifying skill opportunities, designing project-specific patterns, and building custom skills. Use when creating skills for specific projects, capturing project knowledge, building team-specific workflows, or developing domain-specific skill toolkits.
Railway authentication and token management. Use when logging into Railway, creating API tokens, setting up CI/CD authentication, or verifying Railway credentials.
Railway CI/CD integration and automation. Use when setting up GitHub Actions, GitLab CI, automated deployments, migration scripts, or programmatic Railway workflows.
Railway log access and analysis for debugging and monitoring. Covers build logs, deploy logs, runtime logs, HTTP logs, filtering, search, and external export. Use when viewing logs, debugging Railway deployments, investigating errors, analyzing HTTP requests, filtering log output, or exporting logs to external systems.
Comprehensive Railway.com project, environment, and variable management. Use when creating Railway projects, managing environments, configuring services, setting variables, syncing environments, managing PR environments, or organizing Railway infrastructure.
Generate Ralph-compatible prompts for multiple related tasks. Creates phased prompts with sequential milestones, cumulative progress tracking, and phase-based completion promises. Use when creating prompts for CRUD implementations, multi-step features, staged migrations, or any work requiring multiple distinct but related tasks.
Generate Ralph-compatible prompts for research, analysis, and planning tasks. Creates prompts with systematic research phases, synthesis requirements, and deliverable specifications. Use when analyzing codebases, creating migration plans, researching technologies, auditing security, or any task requiring investigation before action.
Generate Ralph-compatible prompts for single implementation tasks. Creates prompts with clear completion criteria, automatic verification, and TDD approach. Use when creating prompts for bug fixes, single features, refactoring tasks, or any focused implementation that can be completed in one session.
Self-improving review loop for Ralph Wiggum skills. Reviews skills against best practices, implements improvements, and continues until two consecutive clean reviews. Use when validating or improving the ralph-prompt-* skill suite.
Universal guide for creating production-ready Claude Code skills for any project. Includes 6-step workflow (understand, plan, initialize, edit, package, iterate), progressive disclosure design, YAML frontmatter templates, validation scripts, reference organization patterns, and 10 community-proven innovations. Use when creating new Claude Code skills, converting documentation to skills, improving existing skills, or learning skill development best practices for any domain.
Test Claude Code skills in real-world scenarios to validate functionality, usability, and effectiveness. Task-based testing operations for scenario testing, example validation, integration testing, and usability assessment. Use when testing skill functionality, validating examples work correctly, ensuring real-world effectiveness, or conducting scenario-based quality assurance.
Ensure Claude Code skills meet quality standards through validation operations for structure, content, patterns, and production readiness. Task-based validation with pass/fail criteria, automated checks, and compliance reporting. Use when validating skills before deployment, ensuring standards compliance, certifying production readiness, or quality gating skill releases.
Set up and manage Supabase authentication including project connection, tokens, login methods, and user management. Use when configuring Supabase access, implementing authentication, or managing users.
Supabase database operations including queries, CRUD operations, RLS policies, and PostgreSQL functions. Use when querying tables, managing data, implementing RLS, or writing database functions.
Review the skill development ecosystem itself: assess ecosystem health, identify systemic issues, evaluate toolkit effectiveness, and recommend system-level improvements. Task-based operations for ecosystem assessment, toolkit evaluation, process review, and system optimization. Use when evaluating ecosystem health, identifying systemic improvements, optimizing the toolkit itself, or conducting meta-level ecosystem reviews.
Break down skill development into concrete tasks with time estimates, dependencies, and validation criteria. Creates actionable task lists, identifies blockers, estimates effort, and sequences work optimally. Use when planning skill implementation, managing complex builds, or coordinating parallel work.
Test-Driven Development workflow for autonomous coding. Use when implementing features with TDD, writing tests first, following red-green-refactor, or ensuring test coverage.
Terra API health data retrieval and management. Use when fetching activity, sleep, body, daily, nutrition, menstruation, or athlete data from wearables.
Comprehensive AWS infrastructure management with Terraform. Covers provider configuration, state management (S3 backend with native locking in Terraform 1.11+), common resource patterns (VPC, IAM, S3, RDS, EKS), module usage, and production best practices. Use when provisioning AWS infrastructure, managing Terraform state, creating VPCs, configuring IAM roles/policies, deploying databases, or troubleshooting Terraform/AWS issues.
Provision production-ready AWS EKS clusters with Terraform. Covers cluster configuration, managed node groups, Fargate profiles, IRSA, EKS add-ons (CoreDNS, kube-proxy, VPC CNI, EBS CSI), VPC integration, and security best practices. Use when provisioning EKS, setting up Kubernetes on AWS, configuring node groups, implementing IRSA, or managing EKS infrastructure as code.
Terra API troubleshooting and debugging. Use when experiencing connection issues, data sync problems, webhook failures, SDK errors, or provider-specific issues.
Comprehensive testing validation for Claude Code skills through functional testing, example validation, integration testing, regression testing, and edge case testing. Task-based testing operations with automated example execution, manual scenario testing, and test reporting. Use when testing skill functionality, validating examples execute correctly, ensuring integration works, preventing regressions, or conducting comprehensive functional quality assurance.
Comprehensive testing workflow orchestrating functional testing, example validation, integration testing, and usability assessment. Sequential workflow for complete skill testing from examples through scenarios to integration validation. Use when conducting thorough testing, pre-deployment validation, ensuring skill functionality, or comprehensive quality checks.
Integrate Playwright tests with OpenTelemetry, Grafana, Prometheus, Loki, and Tempo. Use when debugging test failures across distributed systems, measuring test performance, creating test dashboards, or correlating tests with backend traces.
Systematic progress tracking for skill development. Manages task states (pending/in_progress/completed), updates in real-time, reports progress, identifies blockers, and maintains momentum. Use when tracking skill development, coordinating work, or reporting progress.
Set up secure web-based terminal access to WSL2 from mobile/tablet via ttyd + ngrok/Cloudflare/Tailscale. One-command install, start, stop, status. Use when you need remote terminal access, web terminal, browser-based shell, or mobile access to a WSL2 environment.
Parse and validate project specifications. Use when loading YAML/JSON specs, validating spec structure, extracting requirements, or converting between spec formats.
Performance testing with Lighthouse and Web Vitals integration in Playwright. Use when measuring page load times, Core Web Vitals, Lighthouse audits, or performance budgets.
Validate code quality and standards. Use when running linting, checking types, validating code style, or performing static analysis.
Amazon Bedrock Model Customization with fine-tuning, continued pre-training, reinforcement fine-tuning (NEW 2025 - 66% accuracy gains), and distillation. Create customization jobs, monitor training, deploy custom models, and evaluate performance. Use when customizing Claude, Titan, or other Bedrock models for domain-specific tasks, adapting to proprietary data, improving accuracy on specialized workflows, or distilling large models to smaller ones.
Automated pattern recognition in Claude Code telemetry. Use when detecting failures, slowness, anomalies, trends, inefficiencies, conversation patterns, or tool sequences.
Manage observability stack lifecycle (start, stop, backup, restore, upgrade). Use when controlling the LGTM stack for Claude Code monitoring.
Automatic context summarization for long-running sessions. Use when context is approaching limits, summarizing completed work, preserving critical information, or managing token budgets.
Comprehensive skill development planning. Analyzes requirements, chooses organizational patterns (workflow/task/reference/capabilities), defines structure, estimates complexity, identifies dependencies, and creates detailed implementation plan. Use when planning new skills, converting documentation to skills, or architecting complex skill systems.
Session lifecycle management for autonomous coding. Use when starting new coding sessions, resuming work, detecting session type (init vs continue), or managing auto-continuation between sessions.
Comprehensive Anthropic product expertise covering Claude models, Claude API, Python SDK, Agent SDK, Claude Code, and Model Context Protocol. Six integrated capabilities with complete documentation, searchable references, code examples, and cross-product integration patterns. Use when working with Claude API, building agents, using SDKs, developing with Claude Code, integrating MCP servers, learning Anthropic products, optimizing costs, implementing Anthropic features, managing context, using Opus 4.5, or implementing advanced tool patterns.
Merge multiple PDF files into single document with customizable options. Supports page selection, bookmarks, and metadata. Use when combining PDFs, creating documents from multiple sources, or organizing PDF collections.
Assess feature and project complexity. Use when estimating effort, determining spec pipeline type, calculating cost estimates, or planning resource allocation.
Systematic debugging using Claude, Gemini, and Codex as specialized agents. Multi-agent root cause analysis, log analysis, error classification, and auto-fix generation. Use when debugging production issues, analyzing error logs, performing root cause analysis, troubleshooting complex systems, or implementing self-healing patterns.
Optimize skill development processes through bottleneck identification, efficiency analysis, automation discovery, and workflow improvement. Task-based operations for process analysis, bottleneck elimination, automation opportunities, and workflow streamlining. Use when optimizing development processes, reducing cycle times, improving efficiency, or streamlining workflows based on system-reviewer findings.
EKS troubleshooting and debugging guide covering pod failures, cluster issues, networking problems, and performance diagnostics. Use when diagnosing cluster issues, debugging pod failures (CrashLoopBackOff, Pending, OOMKilled), resolving networking problems, investigating performance issues, troubleshooting IAM/IRSA permissions, fixing image pull errors, or analyzing EKS cluster health.
Track and report progress across autonomous coding sessions. Use when generating progress reports, calculating metrics, visualizing completion, or estimating time to completion.
Amazon Bedrock Automated Reasoning for mathematical verification of AI responses against formal policy rules with up to 99% accuracy. Use when validating healthcare protocols, financial compliance, legal regulations, insurance policies, or any domain requiring deterministic verification of AI-generated content.
Self-hosted AI browser automation using Browser Use with any LLM (Claude, GPT, Ollama). Use when building web scraping agents, data extraction pipelines, self-hosted automation, or when you need flexibility without API rate limits.
Complete end-to-end skill development workflow orchestrating research, planning, task breakdown, prompt design, and progress tracking. Use when building new Claude Code skills, creating workflow skills, or following systematic development process from concept to validated skill.
Karpenter for intelligent Kubernetes node autoscaling on EKS. Use when configuring node provisioning, optimizing costs with Spot instances, replacing Cluster Autoscaler, implementing consolidation, or achieving 20-70% cost savings.
Complete feedback loop from observability insights to skill updates. Use when analyzing enhanced telemetry patterns and automatically improving skills.
Interactive specification builder for autonomous coding projects. Use when users have vague ideas, need help defining requirements, want to create project specs, or before running autonomous-master.
Analyze context and decide on continuation via Stop hook. Use when determining if work should continue, analyzing completion status, making continuation decisions, or implementing the Two-Claude pattern.
Coordinate parallel autonomous operations. Use when running parallel features, managing concurrent work, coordinating multiple agents, or optimizing throughput.
Complete development workflow orchestrator coordinating all multi-ai skills (research → planning → implementation → testing → verification) with quality gates, failure recovery, and state management. Single-command complete workflows from objective to production-ready code. Use when implementing complete features requiring full pipeline, coordinating multiple skills automatically, or executing production-grade development cycles end-to-end.
Manage and compact context for long sessions. Use when context is filling up, creating handoff summaries, optimizing context usage, or preparing for session continuation.
Comprehensive research toolkit for discovering patterns, best practices, and technical knowledge across Web search, MCP servers, GitHub repositories, and documentation. Use when researching technologies, exploring codebases, finding examples, or gathering requirements for skill development.
Comprehensive Claude Code telemetry via 10 hooks capturing sessions, conversations, tools, subagents, context, permissions, and repository analytics. Use when analyzing detailed usage patterns beyond default OTEL.
Master controller for complete autonomous operation. Use when starting full autonomous projects, managing end-to-end workflow, controlling autonomous lifecycle, or running complete implementations.
Secure command execution with allowlists and validation hooks. Use when validating bash commands, configuring security policies, implementing pre-tool-use hooks, or sandboxing autonomous agent operations.
Supabase Storage for file uploads, downloads, buckets, and signed URLs. Use when uploading files, managing storage buckets, generating signed URLs, or handling images.
EKS security hardening and best practices. Use when configuring cluster security, implementing pod security, managing secrets, preparing for compliance audits, hardening infrastructure, scanning containers, or responding to security incidents.
Supabase CLI commands for local development, migrations, project management, and deployment. Use when working with Supabase CLI, starting local dev, managing migrations, or deploying changes.
Systematic research workflow orchestrating multi-source research operations for comprehensive domain investigation. Sequential workflow from web search through GitHub exploration and documentation analysis to research synthesis. Use when researching new domains, gathering patterns, investigating technologies, or conducting comprehensive multi-source research for skill development.
Build effective prompts for Claude Code skills. Creates clear, specific, actionable prompts using engineering principles, templates, and validation. Use when creating skill instructions, workflow steps, task operations, or any Claude prompt.
Troubleshoot common Supabase issues including auth errors, RLS policies, connection problems, and performance. Use when debugging Supabase issues, fixing errors, or optimizing performance.
Main orchestrator for autonomous coding operations. Use when running autonomous sessions, coordinating components, managing the full lifecycle, or orchestrating implementations.
Complete continuous improvement cycle orchestrating review, analysis, learning extraction, systematic updates, and validation. Sequential workflow from comprehensive review through pattern analysis and learning extraction to improvement application and re-validation. Use when continuously improving skills, applying review findings, implementing systematic enhancements, or executing complete improvement cycles.
Identify improvement opportunities in Claude Code skills through targeted review operations. Complements review-multi by focusing on actionable improvements rather than scoring. Use when seeking specific improvements, conducting improvement-focused reviews, or identifying enhancement opportunities for existing skills.
Alpha Vantage financial API for stocks, forex, crypto, and 50+ technical indicators. Use when fetching time series data, technical analysis, fundamentals, economic indicators, or news sentiment.
Query and analyze Claude Code observability data (metrics, logs, traces). Use when analyzing performance, costs, errors, tool usage, sessions, conversations, or subagents.
IAM Roles for Service Accounts (IRSA) for EKS pod-level AWS permissions. Use when configuring pod IAM access, setting up AWS service integrations, implementing least-privilege security, troubleshooting OIDC trust relationships, or deploying AWS controllers.
Manage checkpoints for rollback capability. Use when creating save points, rolling back changes, managing recovery points, or restoring previous states.
Automated documentation update mechanism for anthropic-expert skill. Five-step workflow from update detection through documentation fetching and processing to skill integration and validation. Use when updating Anthropic documentation, checking for new releases, fetching latest docs, keeping anthropic-expert current, or synchronizing with Anthropic product changes.
Multi-perspective code review using Claude, Gemini, and Codex as specialized agents. 5-dimensional analysis (security, performance, maintainability, correctness, style) with LLM-as-judge consensus, quality scoring, and CI/CD integration. Use when reviewing PRs, auditing code quality, preparing production releases, or establishing code review workflows.
Comprehensive Railway.com deployment management for GitHub, Docker, and local sources. Use when deploying to Railway, redeploying services, rolling back deployments, monitoring deployment status, managing staging/production deployments, or verifying deployment health.
Multi-layer quality assurance with 5-layer verification pyramid (Rules → Functional → Visual → Integration → Quality Scoring). Independent verification with LLM-as-judge and Agent-as-a-Judge patterns. Score 0-100 with ≥90 threshold. Use when verifying code quality, security scanning, preventing test gaming, comprehensive QA, or ensuring production readiness through multi-layer validation.
Code review workflows with Codex CLI including automated reviews, diff analysis, and PR improvements. Use for code review, quality checks, or automated improvement suggestions.
Gemini 3 Pro API/SDK integration for text generation, reasoning, and chat. Covers setup, authentication, thinking levels, streaming, and production deployment. Use when working with Gemini 3 Pro API, Python SDK, Node.js SDK, text generation, chat applications, or advanced reasoning tasks.
Pure-discovery autonomous quality engine that reads ANY codebase, dynamically generates metrics from what it finds, auto-fixes what it can, creates GitHub issues for what it can't, and loops continuously until all auto-fixable metrics reach 10/10. No templates, no hardcoded categories — everything is discovered from the project itself. Each cycle re-reads the project, refreshes the rubric (adding new metrics, removing obsolete ones), and adapts to what the project looks like NOW. Works with any language, any framework, any project type. Supports goal-driven mode where you specify what you're building and the engine measures completeness + quality against that goal. Use when the user asks to "review everything", "score the app", "find all issues", "perfection engine", "run quality loop", "audit all metrics", "make everything 10/10", "score my project", "score my code", "rate my code", "find everything wrong", "quality score", "review my API", "audit my codebase", "what should I build next", "is my project complete", "continuous improvement", "portfolio review", "check my code quality", "how good is my code", or wants autonomous quality improvement.