dist/codex/nlweb-protocol/skills/nlweb-retrieval-backends/SKILL.md
Choose and configure NLWeb retrieval backends — Qdrant (local + remote), Azure AI Search, Elasticsearch, OpenSearch (with/without k-NN), Postgres pgvector, Milvus, Snowflake Cortex Search, Cloudflare AutoRAG, Shopify MCP, and Bing Web Search. Covers `config_retrieval.yaml`, the single `write_endpoint` rule, parallel read-fanout with URL dedup, and per-backend setup pages. Use when picking a retrieval store, migrating between backends, or debugging "results are empty."
Fetch live docs:
- `docs/setup-*.md` (Qdrant, Azure AI Search, Elasticsearch, OpenSearch, Postgres, Snowflake, Cloudflare AutoRAG).
- `AskAgent/python/retrieval_providers/<backend>.py` for the exact client signature and required env vars.

NLWeb does something unusual: it reads from every enabled retrieval endpoint in parallel and deduplicates by URL, but writes go to exactly one `write_endpoint`. This means:
- db_load against the new write_endpoint

| Endpoint key (config_retrieval.yaml) | Backend | Notes |
|--------------------------------------|---------|-------|
| qdrant_local | Qdrant file-backed | Default-enabled; data in ../data/db |
| qdrant_url | Qdrant remote | Set URL + API key in env |
| nlweb_west | Azure AI Search | Default-enabled MS-hosted demo instance — usually disable |
| azure_ai_search | Azure AI Search (your own) | Bring your own index name |
| milvus | Milvus | Flagged "under development" in YAML |
| elasticsearch | Elasticsearch | dense_vector + int8_hnsw |
| opensearch_knn | OpenSearch + k-NN plugin | The recommended OpenSearch path |
| opensearch_script | OpenSearch no plugin | script_score fallback, slower |
| postgres | Postgres + pgvector | Good if you already run Postgres |
| snowflake_cortex_search_1 | Snowflake Cortex Search | Data lives in Snowflake tables |
| cloudflare_autorag | Cloudflare AutoRAG | Indexing managed by CF; ingest via R2 |
| shopify_mcp | Shopify's MCP endpoint | Default-enabled; live proxy, no ingest |
| bing_search | Bing Web Search API | Live web fallback; not a vector store |
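The parallel read fan-out with URL dedup described before the table can be sketched as follows. This is a minimal illustration, not NLWeb's actual code: `search_backend`, `fan_out_search`, and the sample URLs are hypothetical, and the rule that the first backend in the list wins a duplicate is an assumption.

```python
import asyncio

# Hypothetical stand-in for a retrieval backend: returns canned hits after a
# short delay. Real NLWeb provider classes have their own signatures.
async def search_backend(name, results, delay):
    await asyncio.sleep(delay)  # simulate network latency
    return [dict(hit, source=name) for hit in results]

async def fan_out_search(backends):
    """Query every backend concurrently, then merge hits, deduplicating by URL."""
    tasks = [search_backend(name, results, delay) for name, results, delay in backends]
    # return_exceptions=True keeps one failing backend from sinking the others.
    batches = await asyncio.gather(*tasks, return_exceptions=True)
    seen, merged = set(), []
    for batch in batches:  # gather preserves the order of the task list
        if isinstance(batch, Exception):
            continue
        for hit in batch:
            if hit["url"] not in seen:  # assumption: first backend listed wins
                seen.add(hit["url"])
                merged.append(hit)
    return merged

BACKENDS = [
    ("qdrant_local", [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}], 0.02),
    ("azure_ai_search", [{"url": "https://example.com/b"}, {"url": "https://example.com/c"}], 0.01),
]

merged = asyncio.run(fan_out_search(BACKENDS))
print([(hit["url"], hit["source"]) for hit in merged])
```

Because the duplicate `https://example.com/b` is kept only once, a document ingested into two enabled backends still appears a single time in the merged list.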
Every endpoint declares:
- `enabled: true/false` (whether `/ask` queries it)
- `read: true/false` (finer-grained; enables the endpoint for reads)
- write_endpoint)

The default config has `qdrant_local`, `nlweb_west`, and `shopify_mcp` enabled; for local dev, disable the last two.
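As a rough sketch of the defaults just described, the snippet below models the config as a plain dict and switches off the two hosted endpoints for local dev. The dict shape and the `write_endpoint` value are assumptions modelled on the `enabled` flag; the real `config_retrieval.yaml` carries more keys per endpoint, and `enabled_endpoints` and `for_local_dev` are hypothetical helpers.

```python
import copy

# Assumed shape; only the `enabled` flag from the text is modelled here.
DEFAULT_CONFIG = {
    "write_endpoint": "qdrant_local",  # assumption for illustration
    "endpoints": {
        "qdrant_local": {"enabled": True},
        "nlweb_west": {"enabled": True},
        "shopify_mcp": {"enabled": True},
        "azure_ai_search": {"enabled": False},
    },
}

def enabled_endpoints(config):
    """Return the names of endpoints that reads would fan out to."""
    return [name for name, ep in config["endpoints"].items() if ep.get("enabled")]

def for_local_dev(config, disable=("nlweb_west", "shopify_mcp")):
    """Return a copy of the config with the named endpoints switched off."""
    cfg = copy.deepcopy(config)  # leave the caller's config untouched
    for name in disable:
        if name in cfg["endpoints"]:
            cfg["endpoints"][name]["enabled"] = False
    return cfg

local = for_local_dev(DEFAULT_CONFIG)
print(enabled_endpoints(DEFAULT_CONFIG))
print(enabled_endpoints(local))
```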
| If you need... | Use |
|----------------|-----|
| Local dev with no cloud deps | qdrant_local |
| Largest scale + Microsoft-stack | azure_ai_search |
| Already on AWS | opensearch_knn |
| Already on Postgres | postgres (pgvector) |
| Live e-commerce catalog | shopify_mcp |
| Snowflake-resident data | snowflake_cortex_search_1 |
| Edge deployment | cloudflare_autorag |
| Live news/freshness | bing_search (combine with a vector backend) |
Each backend stores fixed-dimension vectors. The embedding provider must emit the same dimension:
| Embedding provider and model | Provider default? | Dim |
|--------------------|---------------|-----|
| OpenAI text-embedding-3-small | default | 1536 |
| OpenAI text-embedding-3-large | — | 3072 |
| Azure OpenAI text-embedding-3-small | default | 1536 |
| Gemini text-embedding-004 | — | 768 |
| Snowflake arctic-embed-m-v1.5 | — | 768 |
| Elasticsearch multilingual-e5-small | — | 384 |
Pick the embedding provider FIRST, configure the backend's index to match THAT dimension, then ingest.
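A pre-ingest guard for this rule might look like the sketch below. The dimension values come from the table above; `check_dimension` is a hypothetical helper, not part of NLWeb.

```python
# Output dimensions taken from the table above.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-004": 768,
    "arctic-embed-m-v1.5": 768,
    "multilingual-e5-small": 384,
}

def check_dimension(model, index_dim):
    """Raise if the index was created with a different dimension than the model emits."""
    expected = EMBEDDING_DIMS[model]
    if expected != index_dim:
        raise ValueError(
            f"{model} emits {expected}-dim vectors but the index expects {index_dim}"
        )
    return expected

print(check_dimension("text-embedding-3-small", 1536))

try:
    check_dimension("text-embedding-004", 1536)
except ValueError as err:
    print(f"mismatch: {err}")
```

Running a check like this before `db_load` surfaces a mismatch early, rather than as a backend-specific insert error or silently empty results.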
Most NLWeb providers use cosine similarity. When creating a new index manually (Azure AI Search, OpenSearch, Postgres) make sure the metric matches what the retrieval provider class expects. Look in retrieval_providers/<backend>.py for the metric the SDK call passes.
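To see why the metric matters, here is a small worked example in plain Python (toy vectors, no NLWeb code) where cosine similarity and raw dot product rank the same two documents differently.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

query = [1.0, 0.0]
docs = {
    "aligned_short": [1.0, 0.0],   # same direction as the query, small norm
    "off_axis_long": [3.0, 3.0],   # 45 degrees off the query, large norm
}

by_cosine = max(docs, key=lambda name: cosine_similarity(query, docs[name]))
by_dot = max(docs, key=lambda name: dot(query, docs[name]))

print("cosine picks:", by_cosine)
print("dot product picks:", by_dot)
```

Cosine ignores vector length and picks `aligned_short`; dot product rewards the larger norm and picks `off_axis_long`. An index built with one metric and queried by a provider that assumes the other can therefore return oddly ranked results.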
nlweb_west Trap

`nlweb_west` is a Microsoft-hosted demo Azure AI Search instance that's enabled by default. For most users this:
Disable it in local dev unless you specifically want the demo content.
Edit `config/config_retrieval.yaml`:

    write_endpoint: azure_ai_search
    endpoints:
      qdrant_local:
        enabled: false
      azure_ai_search:
        enabled: true
        api_key_env: AZURE_SEARCH_API_KEY
        endpoint_env: AZURE_SEARCH_ENDPOINT
        index_name: nlweb-main

Then re-ingest:

    python -m data_loading.db_load --only-delete delete-site <site>
    python -m data_loading.db_load <source> <site> --database azure_ai_search

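If the re-ingest is scripted, the two `db_load` invocations above can be assembled as argument lists, for example to pass to `subprocess.run`. This sketch only builds and prints the commands; `reingest_commands`, `feed.xml`, and `example_site` are hypothetical, and the flags are copied from the commands above, so confirm them against the live `db_load` before relying on them.

```python
import shlex

def reingest_commands(source, site, database):
    """Build the delete-then-load db_load invocations as argv lists."""
    delete = ["python", "-m", "data_loading.db_load", "--only-delete", "delete-site", site]
    load = ["python", "-m", "data_loading.db_load", source, site, "--database", database]
    return [delete, load]

# Hypothetical source file and site name, for illustration only.
commands = reingest_commands("feed.xml", "example_site", "azure_ai_search")
for argv in commands:
    print(shlex.join(argv))
```

Keeping the commands as lists avoids shell-quoting problems when a source path or site name contains spaces.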
Leave several endpoints set to `enabled: true` at the same time; `/ask` fans out reads to all of them and deduplicates results by URL. Useful for:
If NLWeb doesn't ship the backend you need:
- `retrieval_providers/`: look at any existing one for the contract (search, upsert, delete-by-site).
- `config_retrieval.yaml`.

Qdrant local: zero setup; collection lives at `../data/db`. To reset, delete the directory.
Azure AI Search: create the index manually (or via the setup doc's ARM template). The vector field must be named `vector`, or whatever name the provider class uses; verify this in the provider source.
Postgres + pgvector: CREATE EXTENSION vector; then ensure the column is vector(1536) or whichever dim matches your embedding. NLWeb uses cosine distance by default.
Snowflake Cortex Search: data is in a Snowflake table; you create a CORTEX SEARCH SERVICE over it. NLWeb queries via the Cortex API. No db_load.py involvement.
Cloudflare AutoRAG: upload files to R2, point AutoRAG at the bucket, wire NLWeb to the AutoRAG endpoint. CF manages indexing.
Shopify MCP: zero ingest. NLWeb proxies queries to a Shopify store's MCP endpoint. Configure the shop domain per-site. Disable for non-commerce deployments.
Bing: API key required; only useful combined with at least one vector backend (Bing returns web pages, not your indexed content).
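For the custom-backend case mentioned earlier, the three operations named in the contract (search, upsert, delete-by-site) can be illustrated with a toy in-memory provider. Everything here is hypothetical: the method names, signatures, and document fields are assumptions for illustration, and the real contract should be read from an existing class in `retrieval_providers/`.

```python
from typing import Protocol

class RetrievalProvider(Protocol):
    """Assumed shape of the contract; NLWeb's real signatures may differ."""
    def search(self, query_vector, site, top_k): ...
    def upsert(self, docs): ...
    def delete_by_site(self, site): ...

class InMemoryProvider:
    """Toy backend keyed by URL, scoring with a plain dot product."""
    def __init__(self):
        self._docs = {}  # url -> document dict

    def upsert(self, docs):
        for doc in docs:
            self._docs[doc["url"]] = doc  # same URL overwrites the old copy
        return len(docs)

    def delete_by_site(self, site):
        doomed = [url for url, doc in self._docs.items() if doc["site"] == site]
        for url in doomed:
            del self._docs[url]
        return len(doomed)

    def search(self, query_vector, site, top_k=10):
        def score(doc):
            return sum(q * v for q, v in zip(query_vector, doc["vector"]))
        hits = [doc for doc in self._docs.values() if doc["site"] == site]
        return sorted(hits, key=score, reverse=True)[:top_k]

store = InMemoryProvider()
store.upsert([
    {"url": "https://example.com/a", "site": "demo", "vector": [1.0, 0.0]},
    {"url": "https://example.com/b", "site": "demo", "vector": [0.0, 1.0]},
])
top = store.search([0.9, 0.1], site="demo", top_k=1)
print(top[0]["url"])
```

A real provider would replace the dict with calls to the backend's client library and use the similarity metric its index was created with.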
Diagnostic ladder:
- `curl http://localhost:8000/sites`: is the site registered?
- `curl 'http://localhost:8000/ask?query=test&site=X&mode=list&streaming=false'`: any results at all?
- `python -c "from embedding_providers import get_default; print(get_default().dim)"` (verify exact API) and compare to your index schema.
- qdrant CLI / Azure Search Studio / `SELECT count(*) FROM index` for Postgres.

Always re-fetch `config_retrieval.yaml` from the live repo before generating config, because keys change.
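The first two rungs of the diagnostic ladder can be scripted by building the same URLs the `curl` commands use. This sketch only constructs the URLs (fetch them with any HTTP client); `sites_url` and `ask_url` are hypothetical helpers, while the host, port, and query parameters are taken from the commands above.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"

def sites_url(base=BASE_URL):
    """URL that lists registered sites."""
    return f"{base}/sites"

def ask_url(site, query="test", base=BASE_URL):
    """URL for a non-streaming list-mode query against one site."""
    params = {"query": query, "site": site, "mode": "list", "streaming": "false"}
    return f"{base}/ask?{urlencode(params)}"  # urlencode escapes spaces and symbols

print(sites_url())
print(ask_url("X"))
```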