skills_all/ai-llm-engineering/SKILL.md
Operational skill hub for LLM system architecture, evaluation, deployment, and optimization to modern production standards. Links to specialized skills for prompts, RAG, agents, and safety. Integrates recent advances: PEFT/LoRA fine-tuning, hybrid RAG handoff (see dedicated skill), vLLM serving (reported 24x throughput), multi-layered security (single-layer defenses reported at 90%+ bypass rates), automated drift detection (18-second response), and CI/CD-aligned evaluation.
npx skillsauth add activer007/ordinary-claude-skills ai-llm-engineering
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
A single resource for executing, validating, and scaling LLM systems with modern production standards, while delegating domain depth to specialized skills.
This skill provides quick reference, decision frameworks, and navigation to detailed operational patterns for:
For detailed patterns: See Resources and Templates sections below.
| Task | Tool/Framework | Command/Pattern | When to Use |
|------|----------------|-----------------|-------------|
| RAG Pipeline | LlamaIndex, LangChain | Page-level chunking + hybrid retrieval | Dynamic knowledge, 0.648 accuracy |
| Agentic Workflow | LangGraph, AutoGen, CrewAI | ReAct, multi-agent orchestration | Complex tasks, tool use required |
| Prompt Design | Anthropic, OpenAI guides | CoT, few-shot, structured | Task-specific behavior control |
| Evaluation | LangSmith, W&B, RAGAS | Multi-metric (hallucination, bias, cost) | Quality validation, A/B testing |
| Production Deploy | vLLM, TensorRT-LLM | FP8/FP4 quantization, 24x throughput | High-throughput serving, cost optimization |
| Monitoring | Arize Phoenix, LangFuse | Drift detection, 18-second response | Production LLM systems |
Building LLM application: [Architecture Selection]
├─ Need current knowledge?
│ ├─ Simple Q&A? → Basic RAG (page-level chunking + hybrid retrieval)
│ └─ Complex retrieval? → Advanced RAG (reranking + contextual retrieval)
│
├─ Need tool use / actions?
│ ├─ Single task? → Simple agent (ReAct pattern)
│ └─ Multi-step workflow? → Multi-agent (LangGraph, CrewAI)
│
├─ Static behavior sufficient?
│ ├─ Quick MVP? → Prompt engineering (CI/CD integrated)
│ └─ Production quality? → Fine-tuning (PEFT/LoRA)
│
└─ Best results?
    └─ Hybrid (RAG + Fine-tuning + Agents) → Comprehensive solution
See Decision Matrices for detailed selection criteria.
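The architecture-selection tree above can be sketched as a small function. This is only an illustrative encoding, not part of the skill itself: the `Requirements` class, its field names, and `select_architecture` are hypothetical, and the order in which branches are checked (best results first, then knowledge, then tool use, then static behavior) is an assumption, since the tree does not state a priority between its top-level questions.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical flags mirroring the questions in the decision tree."""
    needs_current_knowledge: bool = False
    complex_retrieval: bool = False
    needs_tool_use: bool = False
    multi_step_workflow: bool = False
    production_quality: bool = False
    wants_best_results: bool = False


def select_architecture(req: Requirements) -> str:
    """Return the leaf of the decision tree that matches the requirements.

    Branch priority is an assumption made for this sketch.
    """
    # "Best results?" branch: combine all three approaches.
    if req.wants_best_results:
        return "Hybrid (RAG + Fine-tuning + Agents)"

    # "Need current knowledge?" branch.
    if req.needs_current_knowledge:
        if req.complex_retrieval:
            return "Advanced RAG (reranking + contextual retrieval)"
        return "Basic RAG (page-level chunking + hybrid retrieval)"

    # "Need tool use / actions?" branch.
    if req.needs_tool_use:
        if req.multi_step_workflow:
            return "Multi-agent (LangGraph, CrewAI)"
        return "Simple agent (ReAct pattern)"

    # "Static behavior sufficient?" branch (the fallthrough case).
    if req.production_quality:
        return "Fine-tuning (PEFT/LoRA)"
    return "Prompt engineering (CI/CD integrated)"


if __name__ == "__main__":
    # Example: a multi-step workflow that needs tools but no fresh knowledge.
    print(select_architecture(Requirements(needs_tool_use=True, multi_step_workflow=True)))
```

A flat function like this keeps each leaf traceable to one line of the tree; real selection would also weigh the criteria in the Decision Matrices resource.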
Claude should invoke this skill when the user asks about:
Comprehensive operational guides with checklists, patterns, and decision frameworks:
Project Planning Patterns - Stack selection, FTI pipeline, performance budgeting
Production Checklists - Pre-deployment validation and operational checklists
Common Design Patterns - Copy-paste ready implementation examples
Decision Matrices - Quick reference tables for selection
Anti-Patterns - Common mistakes and prevention strategies
Note: Each resource file includes preflight/validation checklists, copy-paste reference tables, inline templates, anti-patterns, and decision matrices.
Production templates by use case and technology:
This skill integrates with complementary Claude Code skills:
See data/sources.json for 50+ curated authoritative sources:
Quick Decisions: Decision Matrices | Pre-Deployment: Production Checklists | Planning: Project Planning Patterns | Implementation: Common Design Patterns | Troubleshooting: Anti-Patterns
Domain Depth: LLMOps | Evaluation | Prompts | Agents | RAG
Templates: templates/ - Copy-paste ready production code
Sources: data/sources.json - Authoritative documentation links