Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

tranhieutt/llm-app-patterns

Name: llm-app-patterns
Author: tranhieutt

.claude/skills/llm-app-patterns/SKILL.md

npx skillsauth add tranhieutt/software_development_department llm-app-patterns

LLM Application Patterns

Architecture decision matrix

| Pattern | Use when | Cost |
|---|---|---|
| Simple RAG | FAQ, docs Q&A | Low |
| Hybrid RAG (semantic + BM25) | Mixed query types | Medium |
| Function calling | Structured tool use | Low |
| ReAct agent | Multi-step reasoning | Medium |
| Plan-and-execute | Complex decomposable tasks | High |
| Multi-agent | Research, critique-refine | Very High |

RAG: critical config numbers

CHUNK_CONFIG = {
    "chunk_size": 512,       # tokens — sweet spot for most docs
    "chunk_overlap": 50,     # prevents context loss at boundaries
    "separators": ["\n\n", "\n", ". ", " "],
}
# Hybrid search alpha: 1.0=semantic only, 0.0=BM25 only, 0.5=balanced
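How `chunk_size` and `chunk_overlap` interact is easier to see in code. The sketch below is a minimal sliding-window chunker, assuming the text is already tokenized into a list; `chunk_tokens` is a hypothetical helper, not part of the skill, and it ignores the `separators` setting for brevity.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 512, chunk_overlap: int = 50) -> List[List[str]]:
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the window already reached the end of the input
    return chunks


# Hypothetical input: 1000 placeholder tokens
tokens = [f"t{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
```

With these defaults each window advances 462 tokens, so the last 50 tokens of one chunk reappear as the first 50 of the next.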

RAG: retrieval strategies

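# Placeholders: vector_db, embed, bm25_search, rrf_merge, llm, and deduplicate stand in for your own stack.

# Basic: semantic search
def semantic_search(query, top_k=5):
    return vector_db.similarity_search(embed(query), top_k=top_k)

results = semantic_search(query)

# Better: hybrid (semantic + keyword, fused with reciprocal rank fusion)
def hybrid_search(query, alpha=0.5):
    return rrf_merge(semantic_search(query), bm25_search(query), alpha)

# Best for recall: multi-query (3 variations, deduplicate)
queries = llm.generate_variations(query, n=3)
# Flatten the per-query result lists before deduplicating
results = deduplicate([doc for q in queries for doc in semantic_search(q)])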
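The hybrid strategy names `rrf_merge` without defining it. One possible shape is sketched below, assuming each retriever returns a ranked list of document IDs. Standard reciprocal rank fusion is unweighted; the `alpha` weighting here is an assumption added to match the alpha convention in the config section, and `k=60` is the constant commonly used with RRF.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_merge(semantic: List[str], keyword: List[str], alpha: float = 0.5, k: int = 60) -> List[str]:
    """Fuse two ranked lists of document IDs; alpha weights the semantic list."""
    scores: Dict[str, float] = defaultdict(float)
    for rank, doc_id in enumerate(semantic, start=1):
        scores[doc_id] += alpha / (k + rank)
    for rank, doc_id in enumerate(keyword, start=1):
        scores[doc_id] += (1.0 - alpha) / (k + rank)
    # Highest fused score first
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Hypothetical rankings from the two retrievers
merged = rrf_merge(["a", "b", "c"], ["c", "a", "d"])
```

Documents that appear in both lists accumulate score from each, which is why `a` and `c` rise above `b` and `d` in this example.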

RAG: generation prompt template

RAG_PROMPT = """Answer based ONLY on the context below.
If insufficient, say "I don't have enough information."

Context: {context}
Question: {question}
Answer:"""
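Filling the template is a small step but worth showing. The sketch below assumes retrieved chunks arrive as plain strings; `build_rag_prompt` and the sample chunks are hypothetical, and the numbered markers are one possible way to make sources traceable.

```python
RAG_PROMPT = """Answer based ONLY on the context below.
If insufficient, say "I don't have enough information."

Context: {context}
Question: {question}
Answer:"""


def build_rag_prompt(chunks, question):
    """Join retrieved chunks with numbered markers and fill the template."""
    context = "\n\n".join(f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1))
    return RAG_PROMPT.format(context=context, question=question)


# Hypothetical retrieved chunks
prompt = build_rag_prompt(
    ["Refunds are issued within 14 days. ", "Shipping is free over $50."],
    "How long do refunds take?",
)
```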

Agent: function calling loop

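def run_agent(question, max_steps=10):
    # llm, TOOLS, and execute_tool are placeholders for your client, tool schemas, and dispatcher
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):  # bound the loop so a confused model cannot spin forever
        response = llm.chat(messages=messages, tools=TOOLS, tool_choice="auto")
        if not response.tool_calls:
            return response.content
        # Record the assistant turn that requested the tools before appending their results
        messages.append({"role": "assistant", "content": response.content, "tool_calls": response.tool_calls})
        for call in response.tool_calls:
            result = execute_tool(call.name, call.arguments)
            messages.append({"role": "tool", "tool_call_id": call.id, "content": str(result)})
    raise RuntimeError("agent exceeded max_steps without a final answer")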
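The loop calls `execute_tool` but leaves it undefined. A minimal dispatcher might look like the sketch below, assuming tools are plain Python functions kept in a registry. `TOOL_REGISTRY` and `get_weather` are hypothetical names used only for illustration.

```python
import json


def get_weather(city: str) -> str:
    # Hypothetical tool used only for illustration
    return f"Sunny in {city}"


TOOL_REGISTRY = {"get_weather": get_weather}


def execute_tool(name, arguments):
    """Look up a tool by name and call it; return errors as strings so the model can recover."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return f"Error: unknown tool '{name}'"
    try:
        # Providers often deliver arguments as a JSON string; accept a dict as well
        kwargs = json.loads(arguments) if isinstance(arguments, str) else dict(arguments)
        return func(**kwargs)
    except (json.JSONDecodeError, TypeError) as exc:
        return f"Error: bad arguments for '{name}': {exc}"
```

Returning error text instead of raising keeps the conversation going: the model sees the failure as a tool result and can retry with corrected arguments.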

Production: caching (only temperature=0 responses)

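import hashlib
import json

def get_or_generate(prompt, model, **kwargs):
    # redis is assumed to be a connected client, e.g. redis.Redis(decode_responses=True); llm is a placeholder
    deterministic = kwargs.get("temperature", 1.0) == 0
    key = None
    if deterministic:
        payload = f"{model}:{prompt}:{json.dumps(kwargs, sort_keys=True)}"
        key = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if (cached := redis.get(key)) is not None:
            return cached
    response = llm.generate(prompt, model=model, **kwargs)
    if deterministic:
        redis.setex(key, 3600, response)  # cache for one hour
    return response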
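The cache only works if the same request always maps to the same key. The sketch below isolates the key construction to show why `sort_keys=True` matters; `cache_key` and the model name `model-x` are hypothetical.

```python
import hashlib
import json


def cache_key(prompt: str, model: str, **kwargs) -> str:
    """Build a stable hex digest from model, prompt, and sorted generation parameters."""
    payload = f"{model}:{prompt}:{json.dumps(kwargs, sort_keys=True)}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same parameters in a different order produce the same key
key_a = cache_key("Hello", "model-x", temperature=0, max_tokens=64)
key_b = cache_key("Hello", "model-x", max_tokens=64, temperature=0)
# Any changed parameter produces a different key
key_c = cache_key("Hello", "model-x", temperature=0, max_tokens=128)
```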

Production: retry + fallback

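from tenacity import retry, wait_exponential, stop_after_attempt

# Retry the same call with exponential backoff (4s to 60s), giving up after 5 attempts
@retry(wait=wait_exponential(multiplier=1, min=4, max=60), stop=stop_after_attempt(5))
def call_llm(prompt):
    return llm.generate(prompt)

# Fallback chain: try each model in order; RateLimitError and APIError come from your provider SDK
def generate_with_fallback(prompt, primary, fallbacks):
    last_error = None
    for model in [primary] + fallbacks:
        try:
            return llm.generate(prompt, model=model)
        except (RateLimitError, APIError) as exc:
            last_error = exc
    raise RuntimeError("all models failed") from last_error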

LLMOps: key metrics

Latency : p50, p99 response time
Quality : satisfaction (thumbs), task completion %, hallucination rate
Cost    : cost_per_request, tokens_per_request, cache_hit_rate
Health  : error_rate, timeout_rate, retry_rate
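Several of these metrics reduce to simple arithmetic over request logs. The sketch below computes latency percentiles with the nearest-rank method plus two of the rates, assuming per-request latencies are available as a list. `percentile` and `summarize` are hypothetical helpers, and the sample data is synthetic.

```python
import math
from typing import Dict, List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms: List[float], errors: int, cache_hits: int) -> Dict[str, float]:
    """Aggregate one window of requests into the latency, health, and cost metrics."""
    total = len(latencies_ms)
    return {
        "p50_ms": percentile(latencies_ms, 50),
        "p99_ms": percentile(latencies_ms, 99),
        "error_rate": errors / total,
        "cache_hit_rate": cache_hits / total,
    }


# Synthetic window: 100 requests with latencies of 1..100 ms
stats = summarize([float(i) for i in range(1, 101)], errors=2, cache_hits=25)
```

Tracking p99 alongside p50 matters because LLM latency tends to have a long tail that a median hides.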

Embedding model selection

| Model | Dims | Cost | Use |
|---|---|---|---|
| text-embedding-3-small | 1536 | $0.02/1M | Most cases |
| text-embedding-3-large | 3072 | $0.13/1M | High accuracy |
| bge-large (local) | 1024 | Free | Self-hosted |
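To compare the options on cost, multiply the corpus size in tokens by the per-million price. The sketch below uses the prices from the table above as a snapshot; the corpus size (100,000 chunks of 512 tokens) is a made-up example, and `embedding_cost` is a hypothetical helper. The zero price for the local model covers API fees only, not the compute needed to host it.

```python
# Prices in USD per 1M tokens, copied from the table above; treat them as a snapshot
PRICE_PER_MILLION = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "bge-large": 0.0,
}


def embedding_cost(num_chunks: int, tokens_per_chunk: int, model: str) -> float:
    """Estimate the one-time cost in USD of embedding a corpus."""
    total_tokens = num_chunks * tokens_per_chunk
    return total_tokens / 1_000_000 * PRICE_PER_MILLION[model]


# Hypothetical corpus: 100,000 chunks of 512 tokens each (51.2M tokens)
small = embedding_cost(100_000, 512, "text-embedding-3-small")
large = embedding_cost(100_000, 512, "text-embedding-3-large")
```

For this corpus the estimate comes to about $1.02 for the small model and about $6.66 for the large one.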

Provides architectural patterns for LLM-powered applications including prompt engineering, RAG, agent loops, and evaluation. Use when building LLM-based features or when the user mentions LLM app architecture, prompt design, or AI system patterns.

60 stars

development

Updated Apr 15, 2026

npx skillsauth add tranhieutt/software_development_department llm-app-patterns

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status |
|---|---|---|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | | Skipped (50%) |

Last scanned: Apr 15, 2026, 11:49 PM (9.2s, 1 file scanned)

SKILL.md

name: llm-app-patterns
type: reference
description: Provides architectural patterns for LLM-powered applications including prompt engineering, RAG, agent loops, and evaluation. Use when building LLM-based features or when the user mentions LLM app architecture, prompt design, or AI system patterns.
paths: ["**/*.py", "**/*.ts", "**/openai*", "**/anthropic*", "**/langchain*"]
effort: 3
allowed-tools: Read, Glob, Grep, Write, Edit, Bash
user-invocable: true
when_to_use: When designing LLM applications, implementing RAG pipelines, building agent architectures, or setting up LLMOps monitoring

Install manually

# Clone the repo
git clone https://github.com/tranhieutt/software_development_department.git

# Copy into Claude Code skills folder (global); create the folder first if it does not exist
mkdir -p ~/.claude/skills
cp -r software_development_department/.claude/skills/llm-app-patterns ~/.claude/skills/

Repository

tranhieutt/software_development_department

60 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT