skills/forgewright/skills/ai-engineer/SKILL.md
[production-grade internal] Builds production AI/ML systems — model training, fine-tuning, MLOps pipelines, model serving, evaluation frameworks, RAG optimization, and agent orchestration at scale. Routed via the production-grade orchestrator (AI Build mode).
npx skillsauth add ouakar/ubinarys-dental ai-engineer
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status.
!cat skills/_shared/protocols/ux-protocol.md 2>/dev/null || true
!cat skills/_shared/protocols/input-validation.md 2>/dev/null || true
!cat .production-grade.yaml 2>/dev/null || echo "No config — using defaults"
Fallback: Use notify_user with options, "Chat about this" last, recommended first.
You are the AI Engineer Specialist. You build production-grade AI/ML systems — from model selection and fine-tuning, through MLOps pipelines, to deployment and monitoring at scale. You go deeper than the Data Scientist on infrastructure: model serving with proper inference optimization, evaluation frameworks with statistical rigor, RAG pipeline optimization (chunking, retrieval, reranking), and multi-agent orchestration. You ensure AI systems are reliable, cost-effective, and continuously improving in production.
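One way to read "evaluation frameworks with statistical rigor" is to report a confidence interval alongside every headline metric instead of a single number. The sketch below is a minimal illustration using a percentile bootstrap over per-example scores. The function name `bootstrap_ci`, its parameters, and the sample scores are all hypothetical and are not part of this skill.

```python
import random
from statistics import mean


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of per-example scores.

    Hypothetical helper for illustration; returns (point_estimate, lower, upper).
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    rng = random.Random(seed)  # fixed seed so the report is reproducible
    n = len(scores)
    # Resample the eval set with replacement and record each resample's mean.
    resampled_means = sorted(
        mean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    # Symmetric percentile cut-offs, e.g. the 2.5th and 97.5th for alpha=0.05.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = n_resamples - 1 - lo_idx
    return mean(scores), resampled_means[lo_idx], resampled_means[hi_idx]


# Made-up per-example results (1 = pass, 0 = fail) on a 20-case eval set.
scores = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]
point, low, high = bootstrap_ci(scores)
print(f"accuracy={point:.2f}, 95% CI=[{low:.2f}, {high:.2f}]")
```

On a small eval set the interval is wide, which is the point: two model variants whose intervals overlap heavily should not be ranked on the point estimate alone.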
Distinction from Data Scientist: Data Scientist focuses on research, experimentation, and RAG design. AI Engineer focuses on production deployment, scaling, monitoring, and optimization of those systems.
Runs in AI Build mode alongside Data Scientist and Prompt Engineer. Also invoked in Feature mode when AI features are being added.
| Input | Status | What AI Engineer Needs |
|-------|--------|------------------------|
| Model/AI requirement from PM or user | Critical | What the AI system should do |
| Data Scientist architecture decisions | Degraded | Model selection, RAG design |
| Prompt Engineer prompts | Degraded | Prompt templates to deploy |
| Existing codebase / infra | Optional | Integration constraints |
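The status column can be treated as a gating rule before any work starts. The sketch below encodes the four rows from the table and checks which inputs are present. The semantics are an assumption, since the source does not define them: a missing Critical input blocks the run, a missing Degraded input allows the run to continue at reduced quality, and a missing Optional input is ignored. The names `Status`, `InputSpec`, and `triage` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    CRITICAL = "Critical"
    DEGRADED = "Degraded"
    OPTIONAL = "Optional"


@dataclass(frozen=True)
class InputSpec:
    name: str
    status: Status


# Rows copied from the input table.
INPUTS = [
    InputSpec("Model/AI requirement from PM or user", Status.CRITICAL),
    InputSpec("Data Scientist architecture decisions", Status.DEGRADED),
    InputSpec("Prompt Engineer prompts", Status.DEGRADED),
    InputSpec("Existing codebase / infra", Status.OPTIONAL),
]


def triage(available):
    """Return (can_proceed, warnings) for the set of input names that are present.

    Assumed semantics (not stated in the source): Critical blocks,
    Degraded warns and continues, Optional is silently skipped.
    """
    can_proceed = True
    warnings = []
    for spec in INPUTS:
        if spec.name in available:
            continue
        if spec.status is Status.CRITICAL:
            can_proceed = False
            warnings.append(f"BLOCKED: missing critical input '{spec.name}'")
        elif spec.status is Status.DEGRADED:
            warnings.append(f"DEGRADED: missing '{spec.name}', using defaults")
    return can_proceed, warnings


ok, notes = triage({"Model/AI requirement from PM or user"})
print(ok)
for note in notes:
    print(note)
```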
Data → Preprocessing → Training/Fine-tuning → Evaluation → Registry → Serving → Monitoring
 ↑                                                                                  │
 └───────────────────────────────── Feedback Loop ──────────────────────────────────┘
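The lifecycle diagram can be sketched as an ordered list of stages run in sequence, with whatever Monitoring observes handed back to the Data stage of the next cycle. This is a minimal illustration only: the stage names come from the diagram, while `run_cycle`, `run_with_feedback`, the state keys, and the two demo handlers are hypothetical.

```python
from typing import Callable, Dict, List

# Stage names taken from the lifecycle diagram, in order.
STAGES: List[str] = [
    "Data", "Preprocessing", "Training/Fine-tuning",
    "Evaluation", "Registry", "Serving", "Monitoring",
]

Handler = Callable[[Dict], Dict]


def run_cycle(state: Dict, handlers: Dict[str, Handler]) -> Dict:
    """Run every stage once, in order, threading a state dict through them."""
    for stage in STAGES:
        handler = handlers.get(stage)
        if handler is not None:  # stages without a handler pass state through
            state = handler(state)
        state.setdefault("trace", []).append(stage)
    return state


def run_with_feedback(initial: Dict, handlers: Dict[str, Handler], cycles: int) -> Dict:
    """Repeat the cycle, feeding Monitoring output into the next Data stage."""
    state = dict(initial)
    for _ in range(cycles):
        state = run_cycle(state, handlers)
        # Feedback loop: monitoring signals become input to the next cycle.
        state["feedback"] = state.pop("monitoring_signals", [])
    return state


def collect(state: Dict) -> Dict:
    # Hypothetical Data stage: grow the dataset by the feedback received.
    state["dataset_size"] = state.get("dataset_size", 100) + len(state.get("feedback", []))
    return state


def monitor(state: Dict) -> Dict:
    # Hypothetical Monitoring stage: flag one production sample per cycle.
    state["monitoring_signals"] = ["low-confidence sample"]
    return state


final = run_with_feedback({}, {"Data": collect, "Monitoring": monitor}, cycles=2)
print(final["dataset_size"], len(final["trace"]))
```

Keeping the feedback hand-off outside the stage handlers means a stage never needs to know it is part of a loop, which makes each handler testable in isolation.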
.forgewright/ai-engineer/
├── model-selection.md # Model benchmarks and selection rationale
├── architecture.md # AI system architecture
├── rag-pipeline.md # RAG design (if applicable)
├── evaluation/
│ ├── eval-suite.md # Evaluation framework design
│ ├── test-cases/ # Test case datasets
│ └── results/ # Benchmark results
├── mlops/
│ ├── pipeline.md # Training/deployment pipeline
│ ├── monitoring.md # Production monitoring setup
│ └── cost-analysis.md # Cost tracking and optimization
└── integration.md # API contracts and integration guide
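The output layout above could be scaffolded with a few lines of `pathlib`, which is handy for checking that every expected artifact exists before a run is marked complete. This is a sketch under assumptions: the helper `scaffold` and the `LAYOUT` list are hypothetical, and `rag-pipeline.md` is created unconditionally here even though the tree marks it "if applicable".

```python
import tempfile
from pathlib import Path

# Relative paths mirror the output tree; a trailing slash marks a directory.
LAYOUT = [
    "model-selection.md",
    "architecture.md",
    "rag-pipeline.md",
    "evaluation/eval-suite.md",
    "evaluation/test-cases/",
    "evaluation/results/",
    "mlops/pipeline.md",
    "mlops/monitoring.md",
    "mlops/cost-analysis.md",
    "integration.md",
]


def scaffold(project_root):
    """Create the .forgewright/ai-engineer/ skeleton; return newly created files."""
    base = Path(project_root) / ".forgewright" / "ai-engineer"
    created = []
    for entry in LAYOUT:
        target = base / entry
        if entry.endswith("/"):
            target.mkdir(parents=True, exist_ok=True)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            target.touch()  # empty placeholder; existing files are left untouched
            created.append(target)
    return created


# Demo in a throwaway directory so nothing is written to a real project.
with tempfile.TemporaryDirectory() as demo_root:
    created = scaffold(demo_root)
    print(f"created {len(created)} placeholder files")
```

Calling `scaffold` a second time on the same root returns an empty list, so it is safe to run at the start of every session.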