skills/ai-engineer/SKILL.md
Build production-ready LLM applications, advanced RAG systems, and intelligent agents. Implements vector search, multimodal AI, agent orchestration, and enterprise AI integrations. Use PROACTIVELY for LLM features, chatbots, AI agents, or AI-powered applications.
npx skillsauth add CenredJun/openclaw-claudecode-setup-kit ai-engineer
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Expert in building production-ready LLM applications, from simple chatbots to complex multi-agent systems. Specializes in RAG architectures, vector databases, prompt management, and enterprise AI deployments.
User: "Build a customer support chatbot with our product documentation"
AI Engineer:
1. Design RAG architecture (chunking, embedding, retrieval)
2. Set up vector database (Pinecone/Weaviate/Chroma)
3. Implement retrieval pipeline with reranking
4. Build conversation management with context
5. Add guardrails and fallback handling
6. Deploy with monitoring and observability
Result: Production-ready AI chatbot in days, not weeks

| Component | Implementation | Best Practices |
|-----------|----------------|----------------|
| Chunking | Semantic, token-based, hierarchical | 512-1024 tokens, overlap 10-20% |
| Embedding | OpenAI, Cohere, local models | Match model to domain |
| Vector DB | Pinecone, Weaviate, Chroma, Qdrant | Index by use case |
| Retrieval | Dense, sparse, hybrid | Start hybrid, tune |
| Reranking | Cross-encoder, Cohere Rerank | Always rerank top-k |

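The chunking guidance in the table (fixed-size windows with 10-20% overlap) can be sketched as follows. This is a minimal illustration, not the skill's own implementation: `chunkWithOverlap` is a hypothetical helper, and it assumes whitespace-separated words as a stand-in for tokens. A real pipeline would count tokens with the embedding model's tokenizer.

```typescript
// Hypothetical helper: sliding-window chunking with overlap.
// Assumption: words approximate tokens; swap in a real tokenizer for production.
function chunkWithOverlap(
  text: string,
  chunkSize = 512,
  overlapRatio = 0.15
): string[] {
  if (chunkSize <= 0) throw new Error('chunkSize must be positive');
  if (overlapRatio < 0 || overlapRatio >= 1) {
    throw new Error('overlapRatio must be in [0, 1)');
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  // Advance by the non-overlapping part of each window (at least one word).
  const step = Math.max(1, Math.floor(chunkSize * (1 - overlapRatio)));
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(' '));
    if (start + chunkSize >= words.length) break; // this window reached the end
  }
  return chunks;
}
```

Each chunk repeats the tail of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.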
```typescript
// Simple RAG implementation
async function ragQuery(query: string): Promise<string> {
  // 1. Embed the query
  const queryEmbedding = await embed(query);

  // 2. Retrieve relevant chunks
  const chunks = await vectorDb.query({
    vector: queryEmbedding,
    topK: 10,
    includeMetadata: true
  });

  // 3. Rerank for relevance
  const reranked = await reranker.rank(query, chunks);
  const topChunks = reranked.slice(0, 5);

  // 4. Generate response with context
  const response = await llm.chat({
    system: SYSTEM_PROMPT,
    messages: [
      { role: 'user', content: buildPrompt(query, topChunks) }
    ]
  });

  return response.content;
}
```
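The RAG function above calls `buildPrompt` without showing it. One possible shape is sketched below. The `RetrievedChunk` interface and the instruction wording are assumptions made for illustration; the source does not define either.

```typescript
// Hypothetical chunk shape; the source does not specify what the reranker returns.
interface RetrievedChunk {
  text: string;
  source?: string;
}

// Assemble numbered context passages followed by the user's question.
function buildPrompt(query: string, chunks: RetrievedChunk[]): string {
  const context = chunks
    .map((chunk, i) => {
      const label = chunk.source ? `(${chunk.source}) ` : '';
      return `[${i + 1}] ${label}${chunk.text}`;
    })
    .join('\n\n');

  return [
    'Answer the question using only the context below.',
    'If the context does not contain the answer, say so.',
    '',
    'Context:',
    context,
    '',
    `Question: ${query}`,
  ].join('\n');
}
```

Numbering the passages lets the model cite which chunk supports an answer, which also makes retrieval problems easier to debug.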
```typescript
// Agentic loop with tool use
interface Agent {
  systemPrompt: string;
  tools: Tool[];
  maxIterations: number;
}

async function runAgent(agent: Agent, task: string): Promise<string> {
  // Seed the history with the task once so later turns stay in chronological order
  const messages: Message[] = [{ role: 'user', content: task }];
  let iterations = 0;

  while (iterations < agent.maxIterations) {
    const response = await llm.chat({
      system: agent.systemPrompt,
      messages,
      tools: agent.tools
    });

    // No tool calls requested (missing or empty list): this is the final answer
    if (!response.toolCalls?.length) {
      return response.content;
    }

    // Record the assistant turn, execute the tools, and continue
    // (assumes Message can carry toolCalls alongside content)
    const toolResults = await executeTools(response.toolCalls);
    messages.push({ role: 'assistant', content: response.content, toolCalls: response.toolCalls });
    messages.push({ role: 'tool', content: toolResults });
    iterations++;
  }

  throw new Error('Max iterations exceeded');
}
```
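The agent loop relies on an `executeTools` helper that the source leaves undefined. A minimal sketch of one is shown below. All types here (`ToolCall`, `ToolResult`, `ToolHandler`) are hypothetical, and unlike the single-argument call in the loop, this version takes an explicit handler registry as a second parameter. It returns failures as results instead of throwing, so the model can see the error and recover.

```typescript
// Hypothetical types; the source does not define the tool-call shape.
interface ToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

interface ToolResult {
  id: string;
  name: string;
  ok: boolean;
  output: string;
}

type ToolHandler = (args: Record<string, unknown>) => unknown | Promise<unknown>;

// Run all requested tools concurrently; never reject, report errors as results.
async function executeTools(
  calls: ToolCall[],
  handlers: Map<string, ToolHandler>
): Promise<ToolResult[]> {
  return Promise.all(
    calls.map(async (call): Promise<ToolResult> => {
      const handler = handlers.get(call.name);
      if (!handler) {
        return { id: call.id, name: call.name, ok: false, output: `Unknown tool: ${call.name}` };
      }
      try {
        const value = await handler(call.args);
        // JSON.stringify(undefined) is not a string, so normalize to null first.
        return { id: call.id, name: call.name, ok: true, output: JSON.stringify(value ?? null) };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return { id: call.id, name: call.name, ok: false, output: message };
      }
    })
  );
}
```

Using a `Map` for the registry avoids accidentally resolving inherited object properties such as `toString` as tool names.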
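```typescript
// Route queries to appropriate models
// Model names are illustrative shorthand; substitute your provider's exact model IDs
const MODEL_ROUTER = {
  simple: 'claude-3-haiku',    // Fast, cheap
  moderate: 'claude-3-sonnet', // Balanced
  complex: 'claude-3-opus',    // Best quality
} as const;

type ModelId = (typeof MODEL_ROUTER)[keyof typeof MODEL_ROUTER];

function routeQuery(query: string, context: any): ModelId {
  // Classify complexity: cheapest model first, strongest only when needed
  if (isSimpleQuery(query)) return MODEL_ROUTER.simple;
  if (requiresReasoning(query, context)) return MODEL_ROUTER.complex;
  return MODEL_ROUTER.moderate;
}
```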
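The router depends on two classifiers, `isSimpleQuery` and `requiresReasoning`, that the source does not show. The sketch below uses cheap keyword and length heuristics. The keyword list, the 12-word threshold, and the `retrievedChunks` context field are all illustrative assumptions; a production router might use a small classifier model instead.

```typescript
// Hypothetical heuristics: keywords and thresholds are illustrative only.
const REASONING_HINTS =
  /\b(why|compare|trade-?offs?|analy[sz]e|step by step|prove|design)\b/i;

// Short queries with no reasoning keywords go to the cheapest model.
function isSimpleQuery(query: string): boolean {
  const words = query.trim().split(/\s+/).filter(Boolean);
  return words.length <= 12 && !REASONING_HINTS.test(query);
}

// Reasoning keywords, or many sources to synthesize, justify the strongest model.
function requiresReasoning(
  query: string,
  context: { retrievedChunks?: number } = {}
): boolean {
  const manySources = (context.retrievedChunks ?? 0) > 5;
  return REASONING_HINTS.test(query) || manySources;
}
```

Heuristics like these misclassify some queries, so it is worth logging routing decisions and spot-checking answer quality per tier.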
What it looks like: Using RAG for every query
Why wrong: Adds latency, cost, and complexity when unnecessary
Instead: Classify queries, use RAG only when context needed

What it looks like: text.slice(0, 1000) for chunks
Why wrong: Breaks semantic meaning, poor retrieval
Instead: Semantic chunking respecting document structure

What it looks like: Using raw vector similarity as final ranking
Why wrong: Embedding similarity != relevance for query
Instead: Always add cross-encoder reranking

What it looks like: Stuffing all retrieved chunks into prompt
Why wrong: Dilutes relevance, wastes tokens, confuses model
Instead: Top 3-5 chunks after reranking, dynamic selection

What it looks like: Direct user input to LLM
Why wrong: Prompt injection, toxic outputs, off-topic responses
Instead: Input validation, output filtering, topic guardrails
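The input-validation side of the last item can be sketched as a small pre-check that runs before any LLM call. Everything here is an assumption for illustration: `checkInput`, the length limit, the two injection patterns, and keyword-based topic matching. Pattern lists like this are easy to evade, so treat it as a first filter and not a complete defense.

```typescript
// Hypothetical input guard; patterns and limits are illustrative, not exhaustive.
interface GuardResult {
  allowed: boolean;
  reason?: string;
}

const MAX_INPUT_CHARS = 4000;

const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all |any )?(previous|prior|above) instructions/i,
  /reveal (your|the) system prompt/i,
];

function checkInput(input: string, allowedTopics: string[]): GuardResult {
  const text = input.trim();
  if (text.length === 0) return { allowed: false, reason: 'empty input' };
  if (text.length > MAX_INPUT_CHARS) return { allowed: false, reason: 'input too long' };
  if (INJECTION_PATTERNS.some((pattern) => pattern.test(text))) {
    return { allowed: false, reason: 'possible prompt injection' };
  }
  // Naive topic guardrail: require at least one allowed keyword.
  const lower = text.toLowerCase();
  const onTopic = allowedTopics.some((topic) => lower.includes(topic.toLowerCase()));
  return onTopic ? { allowed: true } : { allowed: false, reason: 'off-topic' };
}
```

A rejected input can be answered with a canned fallback message, which keeps the failure path cheap and predictable.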

| Database | Best For | Notes |
|----------|----------|-------|
| Pinecone | Production, scale | Managed, fast |
| Weaviate | Hybrid search | GraphQL, modules |
| Chroma | Development, local | Embedded, simple |
| Qdrant | Self-hosted, filters | Rust, performant |
| pgvector | Existing Postgres | Easy integration |

| Framework | Best For | Notes |
|-----------|----------|-------|
| LangChain | Prototyping | Many integrations |
| LlamaIndex | RAG focus | Document handling |
| Vercel AI SDK | Streaming, React | Edge-ready |
| Anthropic SDK | Direct API | Full control |

| Model | Dimensions | Notes |
|-------|------------|-------|
| text-embedding-3-large | 3072 | Best quality |
| text-embedding-3-small | 1536 | Cost-effective |
| voyage-2 | 1024 | Code, technical |
| bge-large | 1024 | Open source |

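The Dimensions column matters in practice because vectors from different embedding models cannot be compared: a query embedded with one model must be searched against an index built with the same model. The sketch below shows cosine similarity, a common metric for dense retrieval, with an explicit dimension check. `cosineSimilarity` is a hypothetical helper; vector databases compute this internally.

```typescript
// Hypothetical helper: cosine similarity with a guard against mixed embedding models.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat similarity as 0 instead of dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```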
Use for:
Do NOT use for:
Core insight: Production AI systems need more than good prompts—they need robust retrieval, intelligent routing, comprehensive monitoring, and graceful failure handling.
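Graceful failure handling, as named above, often comes down to retrying a flaky call and then falling back to an alternative. The sketch below is one way to express that. `withFallback`, its parameters, and the backoff schedule are assumptions for illustration, not part of any SDK.

```typescript
// Hypothetical helper: try each attempt in order, retrying with exponential backoff.
async function withFallback<T>(
  attempts: Array<() => Promise<T>>,
  retriesPerAttempt = 2,
  baseDelayMs = 200
): Promise<T> {
  let lastError: unknown = new Error('No attempts provided');
  for (const attempt of attempts) {
    for (let retry = 0; retry <= retriesPerAttempt; retry++) {
      try {
        return await attempt();
      } catch (err) {
        lastError = err;
        if (retry < retriesPerAttempt) {
          // Wait baseDelayMs, then 2x, 4x, ... before retrying the same attempt.
          const delay = baseDelayMs * 2 ** retry;
          await new Promise<void>((resolve) => setTimeout(resolve, delay));
        }
      }
    }
  }
  // Every attempt and retry failed: surface the most recent error.
  throw lastError;
}
```

A typical call would list the primary model first and a cheaper or cached fallback second, for example `withFallback([() => callPrimary(q), () => callBackup(q)])`, where both functions are placeholders for application code.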
Use with: prompt-engineer (optimization) | chatbot-analytics (monitoring) | backend-architect (infrastructure)