dist/codex/nlweb-protocol/skills/nlweb-ask-endpoint/SKILL.md
Implement and consume the NLWeb /ask REST endpoint — request shape (GET/POST, query-string and v0.55 structured body), SSE streaming response, modes (list/summarize/generate), in-stream "message_type" headers, error envelopes, and client-side parsing. Use when building an NLWeb server route, calling /ask from a custom agent, or debugging /ask responses.
Fetch live spec:
- /ask contract: request params, response shape, and streaming format.
- v0.55 structured body (query.text, prefer.mode, prefer.streaming, meta.version): this is newer than the GET-only legacy contract.
- webserver/routes/api.py in the live repo, to confirm exact param names.

| Route | Method | Purpose |
|-------|--------|---------|
| /ask | GET, POST | Main NL query |
| /who | GET | Site relevance for a query (federated) |
| /sites | GET | List configured sites |
| /config | GET | Public config (safe subset) |
Verify exact names against the live routes/api.py. Stable subset:
| Param | Type | Required | Default | Notes |
|-------|------|----------|---------|-------|
| query | string | yes | — | NL question |
| site | string | no | all | Backend partition; in MCP can be array |
| prev | string | no | — | Comma-separated previous queries (conversation context) |
| decontextualized_query | string | no | — | Pre-resolved query; skips server-side decontextualization |
| streaming | bool | no | true | "0" / "false" / "False" disables |
| query_id | string | no | auto | Echoed in response |
| mode | enum | no | list | One of list, summarize, or generate |
| scorer | string | no | default | e.g., nlwebscorer for the neural reranker |
| itemType | string | no | — | Schema.org type hint (skip type detection) |
| response_format | string | no | — | v0.55 structured-body field |
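
As a rough illustration of the table above, a client might assemble a legacy GET request as follows. The helper name `build_ask_url` and the base URL are hypothetical, and the parameter names are simply those listed in the table, so they still need checking against the live routes/api.py.

```python
from urllib.parse import urlencode


def build_ask_url(base_url, query, **params):
    """Build a GET /ask URL from the query plus optional parameters (hypothetical helper)."""
    merged = {"query": query}
    for key, value in params.items():
        if value is None:
            continue  # leave unset optional parameters out of the query string
        if isinstance(value, bool):
            value = "true" if value else "false"  # the table lists "false" as disabling streaming
        elif isinstance(value, (list, tuple)):
            value = ",".join(value)  # prev is documented as comma-separated
        merged[key] = value
    return f"{base_url.rstrip('/')}/ask?{urlencode(merged)}"


url = build_ask_url(
    "http://localhost:8000",
    "best running shoes",
    site="shoes",
    mode="list",
    streaming=False,
    prev=["trail shoes", "waterproof"],
)
print(url)
```

Because prev is comma-separated, a previous query that itself contains a comma would be ambiguous in this form; the structured body avoids that by using an array.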
The newer body format groups fields:
```json
{
  "query": { "text": "your question" },
  "context": { "prev": ["previous q1", "previous q2"] },
  "prefer": {
    "mode": "list",
    "streaming": true,
    "response_format": "schema"
  },
  "meta": { "version": "0.55" }
}
```
Verify the exact field names against the live docs before relying on this — fields are still settling.
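
A minimal sketch of how a client could group flat parameters into the structured body shown above. The function name `to_structured_body` is hypothetical, and the field names are copied from the example body, which is itself unverified.

```python
import json


def to_structured_body(query, prev=None, mode="list", streaming=True, response_format=None):
    """Group flat /ask parameters into the structured body (field names unverified)."""
    body = {
        "query": {"text": query},
        "prefer": {"mode": mode, "streaming": streaming},
        "meta": {"version": "0.55"},
    }
    if prev:
        body["context"] = {"prev": list(prev)}  # an array here, not a comma-separated string
    if response_format is not None:
        body["prefer"]["response_format"] = response_format
    return body


body = to_structured_body(
    "your question",
    prev=["previous q1", "previous q2"],
    response_format="schema",
)
print(json.dumps(body, indent=2))
```

Keeping the grouping in one function means a field rename in a later spec revision only has to be fixed in one place.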
NLWeb uses Server-Sent Events when streaming=true (the default):
```
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
X-Accel-Buffering: no
```

Each chunk is:

```
data: <json>\n\n
```
The <json> is one of:
- {"message_type": "license", "content": {...}}
- {"results": [...]} (results may arrive incrementally as FastTrack streams)
- {"query_id": "...", "complete": true} (exact field: verify live)

NLWeb's "headers" are NOT HTTP response headers; they are JSON objects in the SSE stream with a message_type discriminator. Known types:
| message_type | Purpose |
|--------------|---------|
| license | Content license terms |
| data_retention | How long the agent may cache results |
| cache_policy | Caching directives |
| ui_component | Optional rendering hint |
| usage_terms | Acceptable use |
| rate_limits | Calls/sec / day budget |
| data_freshness | When the underlying data was last indexed |
| api_version | Server's NLWeb version |
Client parsing rule: don't assume results is the first chunk. Buffer message objects until you see the result stream or a terminal marker.
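
The parsing rule can be sketched as an order-independent collector: it decodes each `data:` line and files the object as a header, a result batch, or a completion marker, whatever order they arrive in. The function names and the sample stream are made up for illustration, and the `complete` field is the unverified one mentioned above.

```python
import json


def parse_sse_lines(lines):
    """Yield decoded JSON objects from 'data: <json>' lines, skipping blank lines."""
    for line in lines:
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload:
                yield json.loads(payload)


def collect(lines):
    """Sort stream objects into headers, results, and a completion flag."""
    headers, results, complete = {}, [], False
    for obj in parse_sse_lines(lines):
        if "message_type" in obj:
            headers[obj["message_type"]] = obj.get("content")
        elif "results" in obj:
            results.extend(obj["results"])  # batches may arrive incrementally
        elif obj.get("complete"):
            complete = True  # assumed terminal marker; verify the field name live
    return headers, results, complete


# Illustrative stream: a header arrives between two result batches.
sample = [
    'data: {"message_type": "api_version", "content": "0.55"}',
    "",
    'data: {"results": [{"url": "https://example.com/a", "score": 0.9}]}',
    "",
    'data: {"message_type": "license", "content": {"terms": "example"}}',
    "",
    'data: {"results": [{"url": "https://example.com/b", "score": 0.7}]}',
    "",
    'data: {"query_id": "abc-123", "complete": true}',
    "",
]
headers, results, complete = collect(sample)
print(sorted(headers), len(results), complete)
```

If `complete` is still false when the stream ends, the response was probably cut short and should not be treated as a full result set.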
With streaming=false, the server returns a single application/json body:
```json
{
  "query_id": "abc-123",
  "messages": [{"message_type": "license", "content": {...}}, ...],
  "results": [
    {
      "url": "https://example.com/article/x",
      "name": "Article X",
      "site": "example",
      "score": 0.83,
      "description": "...",
      "schema_object": { "@type": "Article", "@context": "https://schema.org", ... }
    }
  ]
}
```
The schema_object is the original Schema.org JSON-LD that was indexed — this is what makes NLWeb results agent-actionable, not just text snippets.
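
To show what agent-actionable can mean in practice, the sketch below reads the Schema.org type from each result's schema_object and sorts by score. The helper `summarize_results` and the sample values are hypothetical, and the response shape is the unverified one shown above.

```python
def summarize_results(response):
    """Return (schema type, name, url, score) tuples, highest score first."""
    rows = []
    for item in response.get("results", []):
        schema = item.get("schema_object") or {}
        rows.append((schema.get("@type"), item.get("name"), item.get("url"), item.get("score")))
    return sorted(rows, key=lambda row: row[3] or 0.0, reverse=True)


# Illustrative response modeled on the example body; the values are invented.
response = {
    "query_id": "abc-123",
    "messages": [{"message_type": "license", "content": {"terms": "example only"}}],
    "results": [
        {
            "url": "https://example.com/article/y",
            "name": "Article Y",
            "site": "example",
            "score": 0.61,
            "schema_object": {"@type": "Article", "@context": "https://schema.org"},
        },
        {
            "url": "https://example.com/article/x",
            "name": "Article X",
            "site": "example",
            "score": 0.83,
            "schema_object": {"@type": "Article", "@context": "https://schema.org"},
        },
    ],
}
rows = summarize_results(response)
for row in rows:
    print(row)
```

An agent could branch on the first tuple element, for example treating a Product differently from an Article, without re-parsing free text.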
| Mode | Behavior | Use case |
|------|----------|----------|
| list | Return ranked Schema.org results, no LLM synthesis | Agent does its own rendering / re-ranking |
| summarize | LLM condenses top results into a short answer + still returns results | Conversational UIs |
| generate | Full RAG — LLM synthesizes an answer grounded in results | Q&A endpoints |
For /ask, errors generally come back as 500 with a JSON envelope. For /mcp, errors use JSON-RPC 2.0:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "error": { "code": -32603, "message": "Internal error", "data": {...} }
}
```
Always check status code before parsing — partial SSE streams can drop with 200 followed by silence.
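
One way to apply this advice is a small helper that inspects the status code and body before any result parsing. The name `extract_error` is hypothetical. Because the /ask error envelope is not specified here, the sketch only echoes the raw body for HTTP errors and recognizes the JSON-RPC shape shown above.

```python
import json


def extract_error(status_code, body_text):
    """Return a short error description, or None if the response looks successful."""
    try:
        body = json.loads(body_text) if body_text else {}
    except json.JSONDecodeError:
        body = {}  # non-JSON body: fall through to the status check
    # JSON-RPC 2.0 errors (the /mcp case) carry an "error" object.
    if isinstance(body, dict) and isinstance(body.get("error"), dict):
        err = body["error"]
        return f"JSON-RPC error {err.get('code')}: {err.get('message')}"
    if status_code >= 400:
        return f"HTTP {status_code}: {body_text[:200] if body_text else 'no body'}"
    return None


rpc_body = '{"jsonrpc": "2.0", "id": 1, "error": {"code": -32603, "message": "Internal error"}}'
print(extract_error(200, rpc_body))
print(extract_error(500, '{"detail": "boom"}'))
print(extract_error(200, '{"results": []}'))
```

This covers error responses only. A stream that opens with 200 and then goes silent still has to be caught separately, for example with a read timeout or by checking for the completion marker.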
If you need to extend /ask (e.g., add an auth check or custom param):
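- The route is defined in webserver/routes/api.py.
- Add the change under webserver/middleware/ rather than modifying the route directly; this keeps you upgrade-safe.
- Leave NLWebHandler (core/baseHandler.py) unchanged so the streaming + ranking pipeline still runs.

```python
# Python streaming client (sketch; verify response shape against the live spec)
import asyncio
import json
import httpx

def handle_header(obj):
    print("header:", obj["message_type"])  # placeholder: in-stream "header" message

def handle_results(results):
    print("results:", len(results))  # placeholder: one batch of ranked results

async def main():
    params = {"query": "best running shoes", "site": "shoes", "mode": "generate"}
    async with httpx.AsyncClient() as client:
        async with client.stream("GET", "http://localhost:8000/ask", params=params) as r:
            r.raise_for_status()  # check the status code before parsing
            async for line in r.aiter_lines():
                if not line.startswith("data: "):
                    continue  # skip blank separators and non-data lines
                obj = json.loads(line[6:])
                if "message_type" in obj:
                    handle_header(obj)
                elif "results" in obj:
                    handle_results(obj["results"])

asyncio.run(main())
```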
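
For the extension guidance earlier in this section (wrap the route instead of editing it), the idea can be sketched without any web framework. Everything here is hypothetical: requests are modeled as plain dicts, `require_api_key` and the `X-API-Key` header are invented for illustration, and real NLWeb middleware would follow the signature of whatever framework the server uses.

```python
import asyncio


def require_api_key(handler, valid_keys):
    """Wrap an async handler so requests without a known key never reach it."""
    async def wrapped(request):
        if request.get("headers", {}).get("X-API-Key") not in valid_keys:
            return {"status": 401, "body": {"error": "unauthorized"}}
        return await handler(request)  # the wrapped handler runs unmodified
    return wrapped


async def ask_handler(request):
    # Stand-in for the untouched /ask route.
    return {"status": 200, "body": {"query": request["params"]["query"]}}


guarded = require_api_key(ask_handler, valid_keys={"secret"})
ok = asyncio.run(guarded({"headers": {"X-API-Key": "secret"}, "params": {"query": "shoes"}}))
denied = asyncio.run(guarded({"headers": {}, "params": {"query": "shoes"}}))
print(ok["status"], denied["status"])
```

The wrapped handler is never edited, which is the property that keeps an extension upgrade-safe.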
mode=list, mode=summarize, mode=generate

- Try streaming=false first: a single JSON body is easier to inspect.
- Pass decontextualized_query to bypass query rewriting and isolate ranking issues.
- Use mode=list to see the raw retrieval. If results are bad here, the problem is ingest/embeddings, not the LLM.
- Note the query_id and grep server logs for it.
- Disable tool_selection_enabled in config_nlweb.yaml to bypass the router and force straight retrieval.

Always verify the exact param names and message_type values against the live spec; they evolve.