giuseppe-trisciuoglio/rag

Name: rag
Author: giuseppe-trisciuoglio

plugins/developer-kit-ai/skills/rag/SKILL.md

npx skillsauth add giuseppe-trisciuoglio/developer-kit rag

RAG Implementation

Build Retrieval-Augmented Generation systems that extend AI capabilities with external knowledge sources.

Overview

This skill covers: document processing, embedding generation, vector storage, retrieval configuration, and RAG pipeline implementation.

When to Use

Building Q&A systems over proprietary documents
Creating chatbots with factual information from knowledge bases
Implementing semantic search with natural language queries
Reducing hallucinations with grounded, sourced responses
Building documentation assistants and research tools
Enabling AI systems to access domain-specific knowledge

Instructions

Step 1: Choose Vector Database

Select based on your requirements:

| Requirement | Recommended |
|-------------|-------------|
| Production scalability | Pinecone, Milvus |
| Open-source | Weaviate, Qdrant |
| Local development | Chroma, FAISS |
| Hybrid search | Weaviate with BM25 |

Step 2: Select Embedding Model

| Use Case | Model |
|----------|-------|
| General purpose | text-embedding-ada-002 |
| Fast and lightweight | all-MiniLM-L6-v2 |
| Multilingual | multilingual-e5-large |
| High retrieval quality (English) | bge-large-en-v1.5 |

Step 3: Implement Document Processing Pipeline

Load documents from source (file system, database, API)
Clean and preprocess (remove formatting, normalize text)
Split documents into chunks with appropriate strategy
Generate embeddings for each chunk
Store embeddings in vector database with metadata

Validation: Verify embeddings were generated successfully:

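// embedAll returns a Response wrapper; content() unwraps the list of embeddings
List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
// expectedDim is the vector size of the chosen model (for example, 384 for all-MiniLM-L6-v2)
if (embeddings.size() != segments.size()
        || embeddings.stream().anyMatch(e -> e.dimension() != expectedDim)) {
    throw new IllegalStateException("Embedding generation failed");
}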
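The "clean and preprocess" step above can be sketched in plain Java. This is a minimal illustration, not part of the skill itself: the class name `TextCleaner` and the specific rules (NFKC normalization, dropping control characters, collapsing whitespace) are assumptions chosen for the example, and real pipelines usually need format-specific handling on top.

```java
import java.text.Normalizer;

// Hypothetical helper: normalizes raw text before chunking and embedding.
final class TextCleaner {
    static String clean(String raw) {
        // Fold compatibility characters (e.g. non-breaking spaces) into canonical forms.
        String text = Normalizer.normalize(raw, Normalizer.Form.NFKC);
        // Drop control characters except newline and tab (this also removes '\r').
        text = text.replaceAll("[\\p{Cntrl}&&[^\\n\\t]]", "");
        // Collapse runs of spaces and tabs into a single space.
        text = text.replaceAll("[ \\t]+", " ");
        // Trim spaces around line breaks, then cap blank lines at one paragraph break.
        text = text.replaceAll(" ?\\n ?", "\n").replaceAll("\\n{3,}", "\n\n");
        return text.strip();
    }
}
```

Calling `TextCleaner.clean(rawText)` before splitting keeps stray whitespace and control characters out of the chunks that get embedded.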

Step 4: Configure Retrieval Strategy

Choose the appropriate strategy:

Dense Retrieval: Semantic similarity via embeddings (default for most cases)
Hybrid Search: Dense + sparse retrieval for better coverage
Metadata Filtering: Filter by document attributes
Reranking: Cross-encoder reranking for high-precision requirements
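To make dense retrieval concrete, the sketch below ranks stored vectors by cosine similarity against a query vector and keeps the top k above a minimum score. It uses only the Java standard library; `DenseRetrieval`, `Scored`, and the brute-force scan are illustrative assumptions, whereas a vector database would use an approximate index for the same job.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical brute-force dense retriever over in-memory vectors.
final class DenseRetrieval {
    record Scored(int index, double score) {}

    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) throw new IllegalArgumentException("Dimension mismatch");
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // A zero vector has no direction; treat its similarity as 0.
        if (normA == 0 || normB == 0) return 0.0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns up to k entries with score >= minScore, best first.
    static List<Scored> topK(float[] query, List<float[]> vectors, int k, double minScore) {
        List<Scored> scored = new ArrayList<>();
        for (int i = 0; i < vectors.size(); i++) {
            double s = cosine(query, vectors.get(i));
            if (s >= minScore) scored.add(new Scored(i, s));
        }
        scored.sort(Comparator.comparingDouble(Scored::score).reversed());
        return scored.subList(0, Math.min(k, scored.size()));
    }
}
```

The `k` and `minScore` parameters play the same role as `maxResults` and `minScore` on a retriever: one caps how many chunks reach the prompt, the other drops weak matches.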

Step 5: Build RAG Pipeline

Create content retriever with your embedding store
Configure AI service with retriever and chat memory
Implement prompt template with context injection
Add response validation and grounding checks

Validation: Test with known queries to verify context injection works correctly.
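One possible shape for a prompt template with context injection is sketched below. The wording of the instructions, the numbered-source convention, and the `PromptBuilder` name are assumptions for illustration; frameworks such as LangChain4j build an equivalent prompt internally when a content retriever is configured.

```java
import java.util.List;

// Hypothetical prompt builder that injects retrieved chunks as numbered sources.
final class PromptBuilder {
    static String build(String question, List<String> chunks) {
        StringBuilder context = new StringBuilder();
        for (int i = 0; i < chunks.size(); i++) {
            // Number each chunk so the model can cite it as [1], [2], ...
            context.append("[").append(i + 1).append("] ").append(chunks.get(i)).append("\n");
        }
        return """
            Answer the question using only the context below.
            If the context does not contain the answer, say you do not know.
            Cite sources by their bracketed number.

            Context:
            %s
            Question: %s
            """.formatted(context, question);
    }
}
```

Building the prompt for a known query and asserting that the expected chunk text appears in it is a cheap way to verify context injection before involving a model.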

Error Handling: For batch ingestion, wrap in retry logic:

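for (Document doc : documents) {
    TextSegment segment = doc.toTextSegment();
    int attempts = 0;
    while (attempts < 3) {
        try {
            // embed() accepts a TextSegment (or String), not a Document
            store.add(embeddingModel.embed(segment).content(), segment);
            break;
        } catch (RuntimeException e) {
            // Narrow this to the unchecked exception type your embedding provider throws
            attempts++;
            if (attempts == 3) throw new RuntimeException("Failed after 3 attempts", e);
        }
    }
}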
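The loop above retries immediately, which can hammer a rate-limited embedding API. A common refinement is exponential backoff between attempts. The helper below is a generic sketch using only the standard library; the name `Retry.withBackoff` and its parameters are assumptions, not a LangChain4j API.

```java
import java.util.concurrent.Callable;

// Hypothetical retry helper: waits initialDelayMs, then doubles the wait after each failure.
final class Retry {
    static <T> T withBackoff(Callable<T> task, int maxAttempts, long initialDelayMs) throws Exception {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        long delay = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                // Never retry an interrupt; restore the flag and propagate.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                // Out of attempts: surface the last failure to the caller.
                if (attempt >= maxAttempts) throw e;
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }
}
```

In the ingestion loop, the embed-and-store call would be wrapped as the `task`, for example with three attempts and an initial delay of a few hundred milliseconds.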

Step 6: Evaluate and Optimize

Measure retrieval metrics: precision@k, recall@k, MRR
Evaluate answer quality: faithfulness, relevance
Monitor performance and user feedback
Iterate on chunking, retrieval, and prompt parameters
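The retrieval metrics named above are simple to compute once you have, for each test query, the ranked list of retrieved document IDs and the set of IDs judged relevant. The sketch below shows one way to do it; the class and method names are illustrative assumptions, and precision@k divides by k even when fewer than k results come back.

```java
import java.util.List;
import java.util.Set;

// Hypothetical evaluation helpers for ranked retrieval results.
final class RetrievalMetrics {
    // Counts distinct relevant IDs among the first k retrieved IDs.
    private static long hitsAtK(List<String> retrieved, Set<String> relevant, int k) {
        return retrieved.stream().limit(k).distinct().filter(relevant::contains).count();
    }

    // Fraction of the top k results that are relevant.
    static double precisionAtK(List<String> retrieved, Set<String> relevant, int k) {
        if (k <= 0) return 0.0;
        return (double) hitsAtK(retrieved, relevant, k) / k;
    }

    // Fraction of all relevant documents that appear in the top k results.
    static double recallAtK(List<String> retrieved, Set<String> relevant, int k) {
        if (k <= 0 || relevant.isEmpty()) return 0.0;
        return (double) hitsAtK(retrieved, relevant, k) / relevant.size();
    }

    // 1 / rank of the first relevant result, or 0 if none was retrieved.
    static double reciprocalRank(List<String> retrieved, Set<String> relevant) {
        for (int i = 0; i < retrieved.size(); i++) {
            if (relevant.contains(retrieved.get(i))) return 1.0 / (i + 1);
        }
        return 0.0;
    }

    // MRR: the mean of reciprocal ranks over all test queries.
    static double meanReciprocalRank(List<List<String>> runs, List<Set<String>> relevantSets) {
        if (runs.size() != relevantSets.size()) throw new IllegalArgumentException("Size mismatch");
        if (runs.isEmpty()) return 0.0;
        double sum = 0;
        for (int i = 0; i < runs.size(); i++) {
            sum += reciprocalRank(runs.get(i), relevantSets.get(i));
        }
        return sum / runs.size();
    }
}
```

Tracking these numbers on a fixed query set before and after each chunking or retrieval change makes the iteration step measurable rather than anecdotal.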

Examples

Example 1: Basic Document Q&A

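// DocumentAssistant is a user-defined AI service interface, for example:
// interface DocumentAssistant { String answer(String question); }

// Load every document in the directory
List<Document> documents = FileSystemDocumentLoader.loadDocuments("/docs");

// Embed the documents and keep the vectors in memory (suitable for local development)
InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
EmbeddingStoreIngestor.ingest(documents, store);

// Wire the chat model to a retriever backed by the store
DocumentAssistant assistant = AiServices.builder(DocumentAssistant.class)
    .chatModel(chatModel)
    .contentRetriever(EmbeddingStoreContentRetriever.from(store))
    .build();

String answer = assistant.answer("What is the company policy on remote work?");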

Example 2: Metadata-Filtered Retrieval

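import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;

EmbeddingStoreContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
    .embeddingStore(store)
    .embeddingModel(embeddingModel)
    .maxResults(5)       // return at most 5 segments
    .minScore(0.7)       // drop matches with a relevance score below 0.7
    .filter(metadataKey("category").isEqualTo("technical"))  // only segments tagged category=technical
    .build();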

Example 3: Multi-Source RAG Pipeline

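ContentRetriever webRetriever = EmbeddingStoreContentRetriever.from(webStore);
ContentRetriever docRetriever = EmbeddingStoreContentRetriever.from(docStore);

// Collect candidates from both sources; `query` is a dev.langchain4j.rag.query.Query
List<Content> results = new ArrayList<>();
results.addAll(webRetriever.retrieve(query));
results.addAll(docRetriever.retrieve(query));

// `reranker` stands for any component that reorders candidates by relevance
List<Content> reranked = reranker.reorder(query, results);
// Guard against fewer than 5 results; subList(0, 5) would otherwise throw
List<Content> topResults = reranked.subList(0, Math.min(5, reranked.size()));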
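When no cross-encoder reranker is available, one lightweight way to merge ranked lists from several sources is reciprocal rank fusion (RRF), which is also a common way to combine dense and sparse results in hybrid search. This is a swapped-in technique, not what the example above does. The sketch assumes each source returns unique document IDs in ranked order; `RankFusion.fuse` and the constant `k` (often set to 60) are illustrative choices.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reciprocal rank fusion over lists of document IDs.
final class RankFusion {
    static List<String> fuse(List<List<String>> rankedLists, int k, int limit) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranked : rankedLists) {
            for (int rank = 0; rank < ranked.size(); rank++) {
                // Each list contributes 1 / (k + position), with positions starting at 1.
                scores.merge(ranked.get(rank), 1.0 / (k + rank + 1), Double::sum);
            }
        }
        List<String> ids = new ArrayList<>(scores.keySet());
        // Highest fused score first; ties broken by ID for a deterministic order.
        ids.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return ids.subList(0, Math.min(limit, ids.size()));
    }
}
```

RRF only needs ranks, not comparable scores, which is why it suits sources whose similarity scales differ.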

Example 4: RAG with Chat Memory

Assistant assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
    .contentRetriever(retriever)
    .build();

assistant.chat("Tell me about the product features");
assistant.chat("What about pricing for those features?");  // Maintains context

Best Practices

Document Preparation

Clean documents before ingestion; remove irrelevant content and formatting
Add relevant metadata for filtering and context

Chunking Strategy

Use 500-1000 tokens per chunk as a starting point: smaller chunks match queries more precisely, larger chunks carry more surrounding context
Include 10-20% overlap to preserve context at boundaries
Test different sizes for your specific use case
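A sliding window is one simple way to produce overlapping chunks. The sketch below approximates tokens with whitespace-separated words, which is an assumption: real token counts depend on the embedding model's tokenizer. `OverlapChunker` is a hypothetical name; LangChain4j ships its own document splitters for production use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical word-based chunker with a fixed overlap between neighbours.
final class OverlapChunker {
    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("Require chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.strip();
        if (trimmed.isEmpty()) return chunks;

        String[] words = trimmed.split("\\s+");
        int step = chunkSize - overlap; // how far the window advances each time
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + chunkSize, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) break; // last window reached the end of the text
        }
        return chunks;
    }
}
```

For example, `OverlapChunker.chunk(text, 800, 120)` gives chunks of about 800 words with 15% overlap, which falls inside the ranges suggested above.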

Retrieval Optimization

Start with high k values (10-20), then filter/rerank
Use metadata filtering to improve relevance
Monitor retrieval quality and iterate based on user feedback

Performance

Cache embeddings for frequently accessed content
Use batch processing for document ingestion
Optimize vector store indexing for your scale
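Caching embeddings can be as simple as a small least-recently-used map keyed by the input text. The sketch below is one way to do that with the standard library; `EmbeddingCache` and the `Function<String, float[]>` embedder are assumptions standing in for a real embedding model call.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical LRU cache in front of an embedding function.
final class EmbeddingCache {
    private final Map<String, float[]> cache;
    private final Function<String, float[]> embedder;

    EmbeddingCache(int capacity, Function<String, float[]> embedder) {
        this.embedder = embedder;
        // accessOrder=true keeps the most recently used entries at the end.
        this.cache = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, float[]> eldest) {
                return size() > capacity; // evict the least recently used entry
            }
        };
    }

    // Returns the cached vector, computing and storing it on a miss.
    synchronized float[] embed(String text) {
        float[] vector = cache.get(text);
        if (vector == null) {
            vector = embedder.apply(text);
            cache.put(text, vector);
        }
        return vector;
    }

    synchronized int entryCount() {
        return cache.size();
    }
}
```

This helps most for repeated queries; keying by exact text means even small wording changes miss the cache.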

Constraints and Warnings

System Constraints

Embedding models have a maximum input length in tokens; split longer documents into chunks before embedding
Vector databases require proper indexing for performance
Chunk boundaries may lose context for complex documents
Hybrid search requires additional infrastructure

Quality Warnings

Retrieval quality depends heavily on chunking strategy
Embedding models may not capture domain-specific semantics
Metadata filtering requires proper document annotation
Reranking adds latency to query responses

Security Warnings

Never hardcode credentials: Use environment variables for API keys and passwords
Validate external content: Documents from file systems, APIs, or web sources may contain malicious content (prompt injection)
Apply content filtering on retrieved documents before passing to LLM
Restrict allowed data source URLs and file paths using allowlists
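The first and last points above can be sketched with the standard library alone. The helper names, the HTTPS-only rule, and the exact-host match are assumptions for illustration; note that `normalize()` resolves `..` segments but does not follow symbolic links, so stricter setups may also want `toRealPath()`.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;

// Hypothetical guards for credentials and data sources.
final class SourceGuard {
    // Reads a secret from the environment and fails fast if it is missing.
    static String requireEnv(String name) {
        String value = System.getenv(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    // Accepts only HTTPS URLs whose host is on the allowlist (exact match).
    static boolean isAllowedUrl(String url, Set<String> allowedHosts) {
        try {
            URI uri = new URI(url);
            String host = uri.getHost();
            return "https".equalsIgnoreCase(uri.getScheme())
                    && host != null
                    && allowedHosts.contains(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Accepts only paths that stay inside the allowed root after ".." is resolved.
    static boolean isAllowedPath(Path candidate, Path allowedRoot) {
        Path root = allowedRoot.toAbsolutePath().normalize();
        return candidate.toAbsolutePath().normalize().startsWith(root);
    }
}
```

A loader would call these checks before fetching or reading anything, and skip or log sources that fail them.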

Resources

Reference Documentation

Vector Database Comparison
Embedding Models Guide
Retrieval Strategies
Document Chunking
LangChain4j RAG Guide

Implements document chunking, embedding generation, vector storage, and retrieval pipelines for Retrieval-Augmented Generation systems. Use when building RAG applications, creating document Q&A systems, or integrating AI with knowledge bases.

Stars: 193
Category: development
Updated: Apr 5, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add giuseppe-trisciuoglio/developer-kit rag

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status |
|---------|-------------|--------|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | | Skipped (50%) |

Last scanned: Apr 5, 2026, 1:23 PM (13.1s, 7 files scanned)

SKILL.md

name: rag
description: Implements document chunking, embedding generation, vector storage, and retrieval pipelines for Retrieval-Augmented Generation systems. Use when building RAG applications, creating document Q&A systems, or integrating AI with knowledge bases.
allowed-tools: Read, Write, Bash

Install manually

# Clone the repo
git clone https://github.com/giuseppe-trisciuoglio/developer-kit.git

# Copy into Claude Code skills folder (global)
cp -r developer-kit/plugins/developer-kit-ai/skills/rag ~/.claude/skills/

See the official Claude Code Skills documentation for the skills path.

Repository

giuseppe-trisciuoglio/developer-kit

193 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT