skills_all/ai-llm-engineering/SKILL.md
Operational skill hub for LLM system architecture, evaluation, deployment, and optimization to modern production standards. It links to specialized skills for prompts, RAG, agents, and safety, and integrates recent advances: PEFT/LoRA fine-tuning, hybrid RAG handoff (see the dedicated skill), vLLM serving (up to 24x throughput), multi-layered security (single-layer defenses see 90%+ bypass rates), automated drift detection (18-second response), and CI/CD-aligned evaluation.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add microck/ordinary-claude-skills ai-llm-engineering
A single resource for executing, validating, and scaling LLM systems with modern production standards, while delegating domain depth to specialized skills.
This skill provides quick reference, decision frameworks, and navigation to detailed operational patterns for:
For detailed patterns: See Resources and Templates sections below.
| Task | Tool/Framework | Command/Pattern | When to Use |
|------|----------------|-----------------|-------------|
| RAG Pipeline | LlamaIndex, LangChain | Page-level chunking + hybrid retrieval | Dynamic knowledge, 0.648 accuracy |
| Agentic Workflow | LangGraph, AutoGen, CrewAI | ReAct, multi-agent orchestration | Complex tasks, tool use required |
| Prompt Design | Anthropic, OpenAI guides | CoT, few-shot, structured | Task-specific behavior control |
| Evaluation | LangSmith, W&B, RAGAS | Multi-metric (hallucination, bias, cost) | Quality validation, A/B testing |
| Production Deploy | vLLM, TensorRT-LLM | FP8/FP4 quantization, 24x throughput | High-throughput serving, cost optimization |
| Monitoring | Arize Phoenix, LangFuse | Drift detection, 18-second response | Production LLM systems |
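The "hybrid retrieval" pattern in the RAG row combines a keyword ranking with a vector-similarity ranking. The table does not say how the two are merged; the sketch below assumes reciprocal rank fusion (RRF), one common choice, and uses hypothetical page IDs. The function name and the constant `k=60` are illustrative assumptions, not part of this skill.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of page IDs into one ranking.

    Each page earns 1 / (k + rank) from every list it appears in.
    k=60 is a conventional default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, page_id in enumerate(ranking, start=1):
            scores[page_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by page ID for determinism.
    return sorted(scores, key=lambda page_id: (-scores[page_id], page_id))


# Hypothetical results from a keyword (e.g. BM25) and a vector retriever.
keyword_hits = ["p3", "p1", "p7"]
vector_hits = ["p1", "p4", "p3"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Pages that both retrievers rank highly rise to the top, which is the point of the hybrid approach: neither retriever's raw scores need to be calibrated against the other's.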
Building LLM application: [Architecture Selection]
├─ Need current knowledge?
│ ├─ Simple Q&A? → Basic RAG (page-level chunking + hybrid retrieval)
│ └─ Complex retrieval? → Advanced RAG (reranking + contextual retrieval)
│
├─ Need tool use / actions?
│ ├─ Single task? → Simple agent (ReAct pattern)
│ └─ Multi-step workflow? → Multi-agent (LangGraph, CrewAI)
│
├─ Static behavior sufficient?
│ ├─ Quick MVP? → Prompt engineering (CI/CD integrated)
│ └─ Production quality? → Fine-tuning (PEFT/LoRA)
│
└─ Best results?
    └─ Hybrid (RAG + Fine-tuning + Agents) → Comprehensive solution
See Decision Matrices for detailed selection criteria.
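The architecture-selection tree above can be sketched as a small function. This is a minimal illustration, assuming each top-level question is answered independently and that more than one selected component corresponds to the "Hybrid" branch; the `Requirements` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    # Hypothetical flags mirroring the questions in the tree.
    needs_current_knowledge: bool = False
    complex_retrieval: bool = False
    needs_tool_use: bool = False
    multi_step_workflow: bool = False
    production_quality: bool = False


def select_architecture(req: Requirements) -> list[str]:
    """Return the components the decision tree suggests for `req`."""
    components = []
    if req.needs_current_knowledge:
        components.append(
            "Advanced RAG (reranking + contextual retrieval)"
            if req.complex_retrieval
            else "Basic RAG (page-level chunking + hybrid retrieval)"
        )
    if req.needs_tool_use:
        components.append(
            "Multi-agent (LangGraph, CrewAI)"
            if req.multi_step_workflow
            else "Simple agent (ReAct pattern)"
        )
    if not components:
        # Static behavior is sufficient: choose by quality bar.
        components.append(
            "Fine-tuning (PEFT/LoRA)"
            if req.production_quality
            else "Prompt engineering (CI/CD integrated)"
        )
    return components


print(select_architecture(Requirements(needs_current_knowledge=True)))
```

A result with two or more entries is the hybrid case from the last branch of the tree.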
Claude should invoke this skill when the user asks about:
Comprehensive operational guides with checklists, patterns, and decision frameworks:
Project Planning Patterns - Stack selection, FTI pipeline, performance budgeting
Production Checklists - Pre-deployment validation and operational checklists
Common Design Patterns - Copy-paste ready implementation examples
Decision Matrices - Quick reference tables for selection
Anti-Patterns - Common mistakes and prevention strategies
Note: Each resource file includes preflight/validation checklists, copy-paste reference tables, inline templates, anti-patterns, and decision matrices.
Production templates by use case and technology:
This skill integrates with complementary Claude Code skills:
See data/sources.json for 50+ curated authoritative sources:
Quick Decisions: Decision Matrices
Pre-Deployment: Production Checklists
Planning: Project Planning Patterns
Implementation: Common Design Patterns
Troubleshooting: Anti-Patterns
Domain Depth: LLMOps | Evaluation | Prompts | Agents | RAG
Templates: templates/ - Copy-paste ready production code
Sources: data/sources.json - Authoritative documentation links