skills/llm-app-patterns/SKILL.md
Production-ready patterns for building LLM applications, inspired by [Dify](https://github.com/langgenius/dify) and industry best practices.
npx skillsauth add ranbot-ai/awesome-skills llm-app-patternsInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Production-ready patterns for building LLM applications, inspired by Dify and industry best practices.
Use this skill when:
RAG (Retrieval-Augmented Generation) grounds LLM responses in your data.
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Ingest │────▶│ Retrieve │────▶│ Generate │
│ Documents │ │ Context │ │ Response │
└─────────────┘ └─────────────┘ └─────────────┘
│ │ │
▼ ▼ ▼
┌─────────┐ ┌───────────┐ ┌───────────┐
│ Chunking│ │ Vector │ │ LLM │
│Embedding│ │ Search │ │ + Context│
└─────────┘ └───────────┘ └───────────┘
# Chunking strategies
class ChunkingStrategy:
# Fixed-size chunks (simple but may break context)
FIXED_SIZE = "fixed_size" # e.g., 512 tokens
# Semantic chunking (preserves meaning)
SEMANTIC = "semantic" # Split on paragraphs/sections
# Recursive splitting (tries multiple separators)
RECURSIVE = "recursive" # ["\n\n", "\n", " ", ""]
# Document-aware (respects structure)
DOCUMENT_AWARE = "document_aware" # Headers, lists, etc.
# Recommended settings
CHUNK_CONFIG = {
"chunk_size": 512, # tokens
"chunk_overlap": 50, # token overlap between chunks
"separators": ["\n\n", "\n", ". ", " "],
}
# Vector database selection
VECTOR_DB_OPTIONS = {
"pinecone": {
"use_case": "Production, managed service",
"scale": "Billions of vectors",
"features": ["Hybrid search", "Metadata filtering"]
},
"weaviate": {
"use_case": "Self-hosted, multi-modal",
"scale": "Millions of vectors",
"features": ["GraphQL API", "Modules"]
},
"chromadb": {
"use_case": "Development, prototyping",
"scale": "Thousands of vectors",
"features": ["Simple API", "In-memory option"]
},
"pgvector": {
"use_case": "Existing Postgres infrastructure",
"scale": "Millions of vectors",
"features": ["SQL integration", "ACID compliance"]
}
}
# Embedding model selection
EMBEDDING_MODELS = {
"openai/text-embedding-3-small": {
"dimensions": 1536,
"cost": "$0.02/1M tokens",
"quality": "Good for most use cases"
},
"openai/text-embedding-3-large": {
"dimensions": 3072,
"cost": "$0.13/1M tokens",
"quality": "Best for complex queries"
},
"local/bge-large": {
"dimensions": 1024,
"cost": "Free (compute only)",
"quality": "Comparable to OpenAI small"
}
}
# Basic semantic search
def semantic_search(query: str, top_k: int = 5):
query_embedding = embed(query)
results = vector_db.similarity_search(
query_embedding,
top_k=top_k
)
return results
# Hybrid search (semantic + keyword)
def hybrid_search(query: str, top_k: int = 5, alpha: float = 0.5):
"""
alpha=1.0: Pure semantic
alpha=0.0: Pure keyword (BM25)
alpha=0.5: Balanced
"""
semantic_results = vector_db.similarity_search(query)
keyword_results = bm25_search(query)
# Reciprocal Rank Fusion
return rrf_merge(semantic_results, keyword_results, alpha)
# Multi-query retrieval
def multi_query_retrieval(query: str):
"""Generate multiple query variations for better recall"""
queries = llm.generate_query_variations(query, n=3)
all_results = []
for q in queries:
all_results.extend(semantic_search(q))
return deduplicate(all_results)
# Contextual compression
def compressed_retrieval(query: str):
"""Retrieve then compress to relevant parts only"""
docs = semantic_search(query, top_k=10)
compressed = llm.extract_relevant_parts(docs, query)
return compressed
RAG_PROMPT_TEMPLATE = """
Answer the user's question based ONLY on the following context.
If the context doesn't contain enough information, say "I don't have enough information to answer that."
Context:
{context}
Question: {question}
Answer:"""
def generate_with_rag(question: str):
# Retrieve
context_docs = hybrid_search(question, top_k=5)
context = "\n\n".join([doc.content for doc in context_docs])
# Generate
prompt = RAG_PROMPT_TEMPLATE.format(
context=context,
question=question
)
response = llm.generate(prompt)
# Return with citations
return {
"answer": response,
"sources": [do
testing
Fix SEO indexing issues, crawl budget problems, and Search Console coverage errors for Next.js apps. Covers canonical tags, noindex audits, sitemap health, static rendering, and internal linking.
data-ai
Analyze AI disruption pressure across a business, map competitive exposure, and produce a 90-day defensive action plan.
tools
--- name: longbridge description: 125+ agent skills for Longbridge Securities — real-time quotes, charts, fundamentals, portfolio analysis, options, and more for HK/US/A-share/SG markets. Trilingual: Simplified Chinese, Traditional category: AI & Agents source: antigravity tags: [api, mcp, claude, ai, agent, security, cro] url: https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/longbridge --- # Longbridge ## Overview Longbridge is the official skill collection for Longbr
tools
Design, debug, and harden GitHub Actions CI/CD workflows, including reusable workflows, matrix builds, self-hosted runners, OIDC authentication, caching, environments, secrets, and release automation.