gemini/skills/lancer/SKILL.md
Use lancer CLI for LanceDB semantic and multi-modal search with document ingestion, vector embeddings, and MCP server integration for knowledge retrieval.
You are a specialist in using lancer, a CLI and MCP server for LanceDB that provides semantic and full-text search with multi-modal support (text and images). This skill provides comprehensive workflows, best practices, and common patterns for document ingestion, search, and table management.
lancer is a powerful tool for:
# Search all tables
lancer search "how to deploy kubernetes"
# Search specific table with more results
lancer search -t docs -l 20 "authentication methods"
# Search with similarity threshold
lancer search --threshold 0.7 "error handling patterns"
# Ingest a single file
lancer ingest document.md
# Ingest a directory
lancer ingest ./docs/
# Ingest multiple paths
lancer ingest file1.md file2.pdf ./images/
# Ingest to specific table
lancer ingest -t my_docs document.md
# Ingest with file extension filter
lancer ingest -e md,txt,pdf ./docs/
# Ingest from stdin (pipe file paths)
find ./docs -name "*.md" | lancer ingest --stdin
# Ingest from file list
lancer ingest --files-from paths.txt
# Custom chunk size and overlap
lancer ingest --chunk-size 2000 --chunk-overlap 400 document.md
Text formats:
- txt - Plain text files
- md - Markdown documents
- pdf - PDF documents
- sql - SQL scripts

Image formats:
- jpg, jpeg - JPEG images
- png - PNG images
- gif - GIF images
- bmp - Bitmap images
- webp - WebP images
- tiff, tif - TIFF images
- svg - SVG vector graphics
- ico - Icon files

Text models:
# Default: all-MiniLM-L6-v2 (fast, good quality)
lancer ingest document.md
# Larger model for better quality
lancer ingest --text-model all-MiniLM-L12-v2 document.md
# BGE models (better semantic understanding)
lancer ingest --text-model bge-small-en-v1.5 document.md
lancer ingest --text-model bge-base-en-v1.5 document.md
Image models:
# Default: clip-vit-b-32 (cross-modal text/image)
lancer ingest image.jpg
# ResNet50 for image-only search
lancer ingest --image-model resnet50 image.jpg
Advanced: Force specific model:
# Force CLIP for text (enables future image additions)
lancer ingest --embedding-model clip-vit-b-32 document.md
# Force BGE for performance (text-only)
lancer ingest --embedding-model BAAI/bge-small-en-v1.5 document.md
# Filter by file size
lancer ingest --min-file-size 1000 --max-file-size 10000000 ./docs/
# Skip embedding generation (metadata only)
lancer ingest --no-embeddings document.md
# Custom batch size for database writes
lancer ingest --batch-size 200 ./large-dataset/
# JSON output for scripting
lancer ingest --format json document.md
# Basic search
lancer search "kubernetes deployment"
# Search specific table
lancer search -t docs "authentication"
# Limit results
lancer search -l 5 "error handling"
# Set similarity threshold (0.0-1.0)
lancer search --threshold 0.6 "database migration"
# Include embeddings in results
lancer search --include-embeddings "API design"
# JSON output
lancer search --format json "machine learning"
# Single filter (field:operator:value)
lancer search --filter "author:eq:John" "AI research"
# Multiple filters
lancer search \
--filter "author:eq:John" \
--filter "year:gt:2020" \
"deep learning"
# Available operators:
# eq (equals), ne (not equals)
# gt (greater than), lt (less than)
# gte (greater/equal), lte (less/equal)
# in (in list), contains (string contains)
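The `field:operator:value` format above is easy to mistype in scripts. A minimal sketch of a guard, assuming only the operator list documented above: `build_filter` is a hypothetical helper, not part of lancer, and it only assembles the string passed to `--filter`.

```shell
# Hypothetical helper (not a lancer command): build a "field:operator:value"
# filter string and reject operators that are not in the documented list.
build_filter() {
  field=$1
  op=$2
  value=$3
  case "$op" in
    eq|ne|gt|lt|gte|lte|in|contains) ;;
    *) echo "unsupported operator: $op" >&2; return 1 ;;
  esac
  printf '%s:%s:%s\n' "$field" "$op" "$value"
}
```

It could then be used as `lancer search --filter "$(build_filter year gt 2020)" "deep learning"`.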
# Find recent documentation
lancer search \
-t docs \
--filter "date:gte:2024-01-01" \
-l 10 \
"API endpoints"
# Search by category
lancer search \
--filter "category:eq:tutorial" \
"getting started"
# Multi-criteria search
lancer search \
-t technical_docs \
--filter "language:eq:python" \
--filter "level:eq:advanced" \
--threshold 0.7 \
-l 15 \
"async programming patterns"
# List all tables
lancer tables list
# JSON output
lancer tables list --format json
# Get table details
lancer tables info my_table
# JSON output for scripting
lancer tables info my_table --format json
# Delete a table (be careful!)
lancer tables delete old_table
# Remove specific documents from a table
lancer remove -t docs document_id
# Remove multiple documents
lancer remove -t docs id1 id2 id3
# Specify config file
lancer -c ~/.lancer/config.toml search "query"
# Use a config file that sets the default table
lancer -c config.toml ingest document.md
# Set default table
export LANCER_TABLE=my_docs
lancer search "query" # Searches my_docs
# Set log level
export LANCER_LOG_LEVEL=debug
lancer ingest document.md
# Error only
lancer --log-level error search "query"
# Warning
lancer --log-level warn ingest document.md
# Info (default)
lancer --log-level info search "query"
# Debug
lancer --log-level debug ingest document.md
# Trace (verbose)
lancer --log-level trace search "query"
# 1. Ingest markdown docs
lancer ingest -t docs -e md ./documentation/
# 2. Verify ingestion
lancer tables info docs
# 3. Test search
lancer search -t docs "installation guide"
# 4. Refine search with threshold
lancer search -t docs --threshold 0.7 -l 5 "configuration"
# 1. Ingest images with CLIP model
lancer ingest -t images -e jpg,png,webp \
--image-model clip-vit-b-32 \
./photos/
# 2. Search images with text query
lancer search -t images "sunset over mountains"
# 3. Search with higher threshold for precision
lancer search -t images --threshold 0.8 "red car"
# 1. Ingest with CLIP for cross-modal search
lancer ingest -t knowledge_base \
--embedding-model clip-vit-b-32 \
-e md,pdf,jpg,png \
./content/
# 2. Search text and images together
lancer search -t knowledge_base "architecture diagrams"
# 3. Filter by file type
lancer search -t knowledge_base \
--filter "file_type:eq:png" \
"system design"
# 1. Generate file list
find ./corpus -type f -name "*.md" > files.txt
# 2. Ingest from list with custom settings
lancer ingest -t corpus \
--files-from files.txt \
--chunk-size 1500 \
--chunk-overlap 300 \
--batch-size 150
# 3. Verify ingestion
lancer tables info corpus
# 4. Test search quality
lancer search -t corpus -l 10 "sample query"
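Step 1 of the workflow above collects a single extension. When a corpus mixes several, the list can be built in one pass. This is a sketch only: `list_by_ext` is a hypothetical helper that wraps `find`, and it never calls lancer itself.

```shell
# Hypothetical helper (not a lancer command): print every regular file under
# a directory whose name ends in one of the given extensions, sorted.
list_by_ext() {
  dir=$1
  shift
  for ext in "$@"; do
    find "$dir" -type f -name "*.$ext"
  done | sort
}
```

For example, `list_by_ext ./corpus md txt > files.txt` produces a list that can be passed to `lancer ingest -t corpus --files-from files.txt`.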
# 1. Ingest new documents
lancer ingest -t docs ./new_docs/
# 2. Search to verify new content
lancer search -t docs "recent feature"
# 3. Remove outdated documents
lancer remove -t docs old_doc_id
# 4. Verify final state
lancer tables info docs
For text-only corpora:
# Fast and efficient
lancer ingest --text-model all-MiniLM-L6-v2 document.md
# Better quality
lancer ingest --text-model bge-base-en-v1.5 document.md
For images or mixed content:
# Cross-modal search (text queries → image results)
lancer ingest --embedding-model clip-vit-b-32 content/
Short documents (< 500 words):
lancer ingest --chunk-size 500 --chunk-overlap 100 article.md
Long documents (> 2000 words):
lancer ingest --chunk-size 2000 --chunk-overlap 400 book.pdf
Code documentation:
lancer ingest --chunk-size 1000 --chunk-overlap 200 docs/
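Every chunking example in this document sets the overlap to 20% of the chunk size (500/100, 1000/200, 2000/400). That ratio is a pattern observed in these examples, not a documented lancer rule. Under that assumption, a hypothetical helper can derive the pair of flags from a single number:

```shell
# Hypothetical helper (not a lancer command): print --chunk-size and
# --chunk-overlap flags with the overlap set to 20% of the chunk size,
# the ratio used by the examples in this document.
chunk_flags() {
  size=$1
  case "$size" in
    ''|*[!0-9]*) echo "chunk size must be a whole number" >&2; return 1 ;;
  esac
  overlap=$(( size / 5 ))
  printf '%s %s %s %s\n' --chunk-size "$size" --chunk-overlap "$overlap"
}
```

The output is meant to be word-split by the shell, so it is left unquoted when used: `lancer ingest $(chunk_flags 1500) document.md`.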
# Separate tables by content type
lancer ingest -t api_docs ./api/*.md
lancer ingest -t tutorials ./tutorials/*.md
lancer ingest -t images ./screenshots/*.png
# Search specific context
lancer search -t api_docs "authentication endpoints"
Broad exploration:
lancer search --threshold 0.4 "general topic"
Precise matching:
lancer search --threshold 0.75 "specific concept"
Very high precision:
lancer search --threshold 0.85 -l 3 "exact information"
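The three threshold levels above can be given names so that scripts do not hard-code the numbers. A small sketch: the mode names `broad`, `precise`, and `strict` are assumptions made for illustration, and the values are the ones used in the examples above.

```shell
# Hypothetical helper (not a lancer command): map a search mode name to the
# similarity threshold used in the examples above.
threshold_for() {
  case "$1" in
    broad)   echo 0.4 ;;
    precise) echo 0.75 ;;
    strict)  echo 0.85 ;;
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

Usage might look like `lancer search --threshold "$(threshold_for precise)" "specific concept"`.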
# Combine semantic search with metadata
lancer search \
--filter "status:eq:published" \
--filter "category:eq:tutorial" \
--threshold 0.6 \
"getting started guide"
# JSON output for automation
lancer search --format json "query" | jq '.results[] | .path'
# List tables programmatically
lancer tables list --format json | jq '.[] | .name'
# Start MCP server for Claude Desktop integration
lancer mcp
# With custom config
lancer mcp -c ~/.lancer/config.toml
# With specific log level
lancer mcp --log-level info
Add to Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):
{
"mcpServers": {
"lancer": {
"command": "lancer",
"args": ["mcp"]
}
}
}
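To combine the Claude Desktop entry with the `lancer mcp -c` form shown earlier, the config path can be added to `args`. The sketch below is an assumption built from those two examples, not a documented configuration: `mcp_config` is a hypothetical helper that prints the JSON. An absolute path is used because the arguments are probably passed to the process without shell expansion, so `~` would likely not be resolved. The path is inserted verbatim, with no JSON escaping.

```shell
# Hypothetical helper (not a lancer command): print a Claude Desktop
# mcpServers entry that starts lancer with an explicit config file.
# The path is inserted as-is, so it must not contain quotes or backslashes.
mcp_config() {
  config_path=$1
  cat <<EOF
{
  "mcpServers": {
    "lancer": {
      "command": "lancer",
      "args": ["mcp", "-c", "$config_path"]
    }
  }
}
EOF
}
```

Running `mcp_config "$HOME/.lancer/config.toml"` prints the block to review before merging it into `claude_desktop_config.json` by hand.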
# Ingest multiple files at once
lancer ingest file1.md file2.md file3.md
# Use --stdin for large batches
find ./docs -name "*.md" | lancer ingest --stdin
# Larger batches for bulk ingestion
lancer ingest --batch-size 500 ./large-corpus/
# Smaller batches for limited memory
lancer ingest --batch-size 50 ./documents/
# Index metadata without generating embeddings
lancer ingest --no-embeddings ./archive/
# Faster ingestion with smaller model
lancer ingest --text-model all-MiniLM-L6-v2 ./docs/
# Better quality with larger model (slower)
lancer ingest --text-model bge-base-en-v1.5 ./docs/
Solutions:
# Lower the similarity threshold
lancer search --threshold 0.3 "query"
# Check table exists and has documents
lancer tables list
lancer tables info my_table
# Try different search terms
lancer search "alternative phrasing"
Solutions:
# Restrict ingestion to supported extensions
lancer ingest -e md,txt,pdf ./docs/
# Set file size limits
lancer ingest --max-file-size 100000000 ./docs/
# Use debug logging
lancer --log-level debug ingest document.pdf
Solutions:
# Use better embedding model
lancer ingest --text-model bge-base-en-v1.5 document.md
# Adjust chunk size
lancer ingest --chunk-size 1500 --chunk-overlap 300 document.md
# Adjust search threshold
lancer search --threshold 0.6 "query"
Solutions:
# Increase batch size
lancer ingest --batch-size 300 ./docs/
# Use faster embedding model
lancer ingest --text-model all-MiniLM-L6-v2 ./docs/
# Skip embeddings if not needed
lancer ingest --no-embeddings ./docs/
# Ingestion
lancer ingest document.md # Ingest single file
lancer ingest -t docs ./directory/ # Ingest to specific table
lancer ingest -e md,pdf ./docs/ # Filter by extensions
lancer ingest --chunk-size 2000 document.md # Custom chunk size
# Search
lancer search "query" # Search all tables
lancer search -t docs "query" # Search specific table
lancer search -l 20 "query" # Limit results
lancer search --threshold 0.7 "query" # Set similarity threshold
lancer search --filter "author:eq:John" "query" # Metadata filter
# Table management
lancer tables list # List all tables
lancer tables info my_table # Table information
lancer tables delete old_table # Delete table
# Configuration
lancer -c config.toml search "query" # Use config file
lancer --log-level debug ingest doc.md # Set log level
export LANCER_TABLE=docs # Set default table
# MCP server
lancer mcp # Start MCP server
lancer search -t docs --threshold 0.7 -l 5 "how to configure authentication"
lancer ingest -t test_docs document.md && \
lancer search -t test_docs "key concept from document"
lancer search -t images --threshold 0.8 "sunset landscape photography"
find ./docs -name "*.md" | lancer ingest -t docs --stdin && \
lancer tables info docs
lancer search -t technical_docs \
--filter "language:eq:rust" \
--threshold 0.75 \
-l 10 \
"async trait implementation patterns"
Primary use cases:
Key advantages:
Most common commands:
- lancer ingest document.md - Index documents
- lancer search "query" - Search semantically
- lancer tables list - Manage tables
- lancer search -t docs --threshold 0.7 "query" - Precise search