dist/codex/nlweb-protocol/skills/nlweb-deployment/SKILL.md
Deploy NLWeb to production — Azure App Service (`deploy_azure_webapp.sh` + AI Search + Azure OpenAI), Snowflake Container Services, Cloudflare Worker + AutoRAG, Docker, and self-hosted. Covers env-var conventions, `mode: production` lockdown, scaling, TLS, OAuth, and CI for data reloads. Use when going from local dev to a hosted, internet-facing NLWeb instance.
Fetch live docs:
- deploy_azure_webapp.sh, setup.sh, startup_aiohttp.sh in the live repo for current commands.

| Target | Notes | Setup doc |
|--------|-------|-----------|
| Azure App Service | Reference deployment; ships shell scripts | docs/setup-azure.md |
| Snowflake Container Services | NLWeb runs inside Snowflake compute, closest to data | docs/setup-snowflake.md |
| Cloudflare Worker + AutoRAG | Edge deployment; CF manages indexing | docs/setup-cloudflare-autorag.md |
| Docker | Bring-your-own host | Build from Dockerfile if shipped, else manual |
| Bare Python | systemd + venv on a VM | Use app-aiohttp.py directly |
| WordPress plugin | For WP sites | code/wordpress/nlweb/ |
Before exposing /ask or /mcp to the internet:
- mode: production in config_webserver.yaml, which disables query-string config overrides.
- sites: allowlist in config_nlweb.yaml, limited to the sites you want public.
- who_endpoint_enabled if you don't want federated traffic going to nlwm.azurewebsites.net.
- config_retrieval.yaml (nlweb_west, shopify_mcp unless needed).
- nlweb-auth-multitenancy).
- /ask directly.
- config_storage.yaml), not in-memory.

Secrets always go in env vars, never in config_*.yaml. The convention NLWeb uses:
```yaml
# config_llm.yaml
providers:
  azure_openai:
    api_key_env: AZURE_OPENAI_API_KEY  # references env var, doesn't store value
    endpoint_env: AZURE_OPENAI_ENDPOINT
```
.env is typical for dev; in cloud deployments use the platform's secret manager (Azure Key Vault, Snowflake secrets, CF Workers KV / Secrets, etc.) and inject as env vars.
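To illustrate the `*_env` convention above, the following sketch resolves such references at startup and fails early when a variable is missing. The helper name `resolve_env_refs` and the rule that any key ending in `_env` names an environment variable are assumptions made for this example; NLWeb's own config loader may behave differently.

```python
import os


def resolve_env_refs(provider_cfg: dict, environ=os.environ) -> dict:
    """Replace each `<name>_env` key with `<name>` holding the env var's value.

    Hypothetical helper: assumes every key ending in `_env` names an
    environment variable, as in the config_llm.yaml fragment above.
    """
    resolved = {}
    missing = []
    for key, value in provider_cfg.items():
        if key.endswith("_env"):
            if value in environ:
                # Strip the `_env` suffix: api_key_env -> api_key
                resolved[key[: -len("_env")]] = environ[value]
            else:
                missing.append(value)
        else:
            # Non-secret settings pass through unchanged.
            resolved[key] = value
    if missing:
        # Fail at startup rather than on the first /ask request.
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(sorted(missing))
        )
    return resolved
```

Calling `resolve_env_refs({"api_key_env": "AZURE_OPENAI_API_KEY"})` with the variable unset raises a `RuntimeError` naming it, which tends to be easier to diagnose than a provider error at request time.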
A full production NLWeb deployment may have:
- /ask, /mcp, /who, /sites, /config, /api/oauth/*

Plus, optionally, the Node.js MCP server in openai-apps-sdk-integration/ if you want the React widget for ChatGPT.
NLWeb streams SSE. Make sure your reverse proxy:
- /ask paths (X-Accel-Buffering: no is sent, but nginx still needs proxy_buffering off).
- /ask streams.
- X-Forwarded-For) for rate limiting.

Most deployments reload site data on a schedule:
```yaml
# .github/workflows/nlweb-reload.yml (sketch)
on:
  schedule:
    - cron: '0 3 * * *'  # daily 03:00 UTC
jobs:
  reload:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install -r requirements.txt
      - run: python -m data_loading.db_load https://example.com/feed.xml my-site
        env:
          AZURE_SEARCH_API_KEY: ${{ secrets.AZURE_SEARCH_API_KEY }}
          AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
```
Run reload as a separate process — don't bake it into the server's startup.
NLWeb is stateless per-request (state is in conversation storage + the vector backend). Scale horizontally:
- /ask (each request is self-contained)

LLM and embedding API quota is usually the binding constraint, not CPU.
Walk through deploy_azure_webapp.sh — it provisions:
Customize the resource names, set WEBSITES_PORT=8000 (or whichever the script uses), deploy via git push or az webapp deploy. Verify mode: production in the deployed config_webserver.yaml.
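The verification step above can be scripted. The sketch below checks two of the pre-exposure settings against already-parsed config dictionaries. The function name `production_findings` is hypothetical, and it assumes `mode` and `sites` are top-level keys in `config_webserver.yaml` and `config_nlweb.yaml` respectively; confirm the actual layout in the live repo before relying on it.

```python
def production_findings(webserver_cfg: dict, nlweb_cfg: dict) -> list:
    """Return human-readable problems; an empty list means both checks passed.

    Assumption: `mode` and `sites` are top-level keys of the parsed YAML files.
    """
    findings = []
    if webserver_cfg.get("mode") != "production":
        findings.append("config_webserver.yaml: mode is not 'production'")
    if not nlweb_cfg.get("sites"):
        # A missing or empty allowlist may expose more sites than intended.
        findings.append("config_nlweb.yaml: sites allowlist is empty or missing")
    return findings


if __name__ == "__main__":
    problems = production_findings({"mode": "development"}, {"sites": ["my-site"]})
    for line in problems:
        print(line)  # prints: config_webserver.yaml: mode is not 'production'
```

The dictionaries would typically come from a YAML parser such as PyYAML's `yaml.safe_load`, which is left out here to keep the example standard-library only.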
NLWeb runs as a containerized service inside Snowflake compute and queries Cortex Search (the data is already in Snowflake tables). Use the setup-snowflake.md doc; it covers the SPCS service spec, image build, and Cortex Search setup.
Cloudflare maintains a hosted variant. Two options:
- docs/setup-cloudflare-autorag.md: covers the worker template and AutoRAG wiring.

If a Dockerfile ships in the repo, use it. Otherwise:
```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "AskAgent/python/app-aiohttp.py"]
```
Mount config/ and .env as bind mounts or use env vars + ConfigMaps. Persist data/db/ (Qdrant local) on a volume if not using a remote vector store.
- GET /mcp/health (or /sites as a fallback)
- GET /sites: fails fast if config is broken

NLWeb logs to stdout via Python logging. Wire to your platform's log aggregator (Azure Monitor, CloudWatch, etc.). Key metrics:
- /ask latency (p50, p95, p99): SSE makes this tricky; measure TTFB and total
- /ask 500s: usually an env var missing; check the log for the failing provider.
- tool_selection_enabled, model tiers, and who_endpoint_enabled.
- --batch-size, use a separate embedding deployment, or run reloads off-peak.

Always re-fetch the per-target setup doc and deploy_*.sh scripts before deploying; these are the most release-sensitive parts of the codebase.
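For the /ask latency metric listed above, one way to separate time-to-first-byte from total duration on a streamed response is sketched below. The helper `time_stream` is hypothetical and works on any iterable of chunks, so it does not depend on a particular HTTP client; the injectable `clock` parameter exists only to make the function testable.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def time_stream(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Optional[float], float]:
    """Consume a stream; return (seconds to first chunk, total seconds).

    The first value is None if the stream ends without yielding any chunk.
    """
    start = clock()
    ttfb = None
    for _ in chunks:
        if ttfb is None:
            # The first chunk's arrival marks time-to-first-byte.
            ttfb = clock() - start
    return ttfb, clock() - start
```

Feeding both numbers into your metrics pipeline lets the p50/p95/p99 percentiles be tracked separately for first-chunk and full-response latency.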