orcaqubits/nlweb-ask-endpoint

Name: nlweb-ask-endpoint
Author: orcaqubits

dist/codex/nlweb-protocol/skills/nlweb-ask-endpoint/SKILL.md

npx skillsauth add orcaqubits/agentic-commerce-claude-plugins nlweb-ask-endpoint

NLWeb /ask Endpoint

Before writing code

Fetch live spec:

Fetch https://github.com/nlweb-ai/NLWeb/blob/main/docs/nlweb-rest-api.md for the canonical /ask contract — request params, response shape, and streaming format.
Fetch https://github.com/nlweb-ai/NLWeb/blob/main/docs/life-of-a-chat-query.md to trace a request end-to-end.
Fetch https://github.com/nlweb-ai/NLWeb/blob/main/docs/nlweb-headers.md for the in-stream "headers" mechanism (license, data-retention, rate-limit messages are NOT HTTP headers).
Web-search the latest release notes for the v0.55+ structured POST body shape (query.text, prefer.mode, prefer.streaming, meta.version) — this is newer than the GET-only legacy contract.
Check webserver/routes/api.py in the live repo to confirm exact param names.

Conceptual Architecture

Routes

| Route | Method | Purpose |
|-------|--------|---------|
| /ask | GET, POST | Main NL query |
| /who | GET | Site relevance for a query (federated) |
| /sites | GET | List configured sites |
| /config | GET | Public config (safe subset) |

Request Parameters

Verify exact names against the live routes/api.py. Stable subset:

| Param | Type | Required | Default | Notes |
|-------|------|----------|---------|-------|
| query | string | yes | (none) | NL question |
| site | string | no | all | Backend partition; in MCP can be array |
| prev | string | no | (none) | Comma-separated previous queries (conversation context) |
| decontextualized_query | string | no | (none) | Pre-resolved query; skips server-side decontextualization |
| streaming | bool | no | true | "0" / "false" / "False" disables |
| query_id | string | no | auto | Echoed in response |
| mode | enum | no | list | One of list, summarize, generate |
| scorer | string | no | default | e.g., nlwebscorer for the neural reranker |
| itemType | string | no | (none) | Schema.org type hint (skip type detection) |
| response_format | string | no | (none) | v0.55 structured-body field |
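As a rough illustration of how the parameters in the table might be serialized for a GET request, the sketch below builds a query string with the standard library only. The helper name `build_ask_url` and the base URL are hypothetical; the parameter names are taken from the table and should still be checked against routes/api.py.

```python
from urllib.parse import urlencode


def build_ask_url(base_url, query, site=None, prev=None, streaming=True, mode="list"):
    """Build a GET URL for /ask from the parameters listed in the table.

    Hypothetical helper: names follow the table, not a verified spec.
    """
    params = {"query": query, "mode": mode}
    if site is not None:
        params["site"] = site
    if prev:
        # The table describes `prev` as comma-separated previous queries.
        params["prev"] = ",".join(prev)
    if not streaming:
        # "0", "false", or "False" disables streaming according to the table.
        params["streaming"] = "false"
    return f"{base_url.rstrip('/')}/ask?{urlencode(params)}"


url = build_ask_url(
    "http://localhost:8000",
    "best running shoes",
    site="shoes",
    prev=["trail shoes", "waterproof"],
    streaming=False,
)
print(url)
```

One caveat with this encoding: a previous query that itself contains a comma becomes ambiguous once joined, so the sketch is only safe for comma-free history.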

v0.55 Structured POST Body

The newer body format groups fields:

{
  "query": { "text": "your question" },
  "context": { "prev": ["previous q1", "previous q2"] },
  "prefer": {
    "mode": "list",
    "streaming": true,
    "response_format": "schema"
  },
  "meta": { "version": "0.55" }
}

Verify the exact field names against the live docs before relying on this — fields are still settling.
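To make the grouping concrete, here is a minimal sketch that maps flat parameters onto the structured body shown above. The function name `to_structured_body` is hypothetical, and the field names simply mirror the example body, so they carry the same caveat about unsettled fields.

```python
import json


def to_structured_body(query, prev=None, mode="list", streaming=True,
                       response_format=None, version="0.55"):
    """Group flat /ask parameters into the structured POST body.

    Field names mirror the example body; treat them as unverified.
    """
    body = {
        "query": {"text": query},
        "prefer": {"mode": mode, "streaming": streaming},
        "meta": {"version": version},
    }
    if prev:
        # Here previous queries are a JSON array rather than a comma-joined string.
        body["context"] = {"prev": list(prev)}
    if response_format is not None:
        body["prefer"]["response_format"] = response_format
    return body


body = to_structured_body(
    "your question",
    prev=["previous q1", "previous q2"],
    response_format="schema",
)
print(json.dumps(body, indent=2))
```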

Streaming Response Format (SSE)

NLWeb uses Server-Sent Events when streaming=true (the default):

Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
X-Accel-Buffering: no

Each chunk is:

data: <json>\n\n

The <json> is one of:

A message object (header-like): {"message_type": "license", "content": {...}}
A partial result: {"results": [...]} (results may arrive incrementally as FastTrack streams)
A terminal object: {"query_id": "...", "complete": true} (exact field — verify live)
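The chunk format and the three object kinds can be sketched as a small offline parser. The function names are hypothetical, the sample stream is made up for illustration, and the `complete` field is the unverified terminal marker mentioned above. The sketch assumes LF line endings.

```python
import json


def parse_sse_events(text):
    """Yield decoded JSON objects from SSE text made of `data: <json>` frames."""
    for frame in text.split("\n\n"):
        data_lines = []
        for line in frame.splitlines():
            if line.startswith("data:"):
                # SSE strips at most one space after the colon.
                data_lines.append(line[5:].removeprefix(" "))
        if data_lines:
            # Multiple data lines in one frame are joined with newlines.
            yield json.loads("\n".join(data_lines))


def classify(obj):
    """Label a chunk as 'message', 'results', 'terminal', or 'unknown'."""
    if "message_type" in obj:
        return "message"
    if "results" in obj:
        return "results"
    if obj.get("complete") is True:
        return "terminal"
    return "unknown"


sample = (
    'data: {"message_type": "license", "content": {}}\n\n'
    'data: {"results": [{"url": "https://example.com/a"}]}\n\n'
    'data: {"query_id": "abc-123", "complete": true}\n\n'
)
kinds = [classify(obj) for obj in parse_sse_events(sample)]
print(kinds)  # ['message', 'results', 'terminal']
```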

In-Stream "Headers" (NLWS Mechanism)

NLWeb's "headers" are NOT HTTP response headers — they are JSON objects in the SSE stream with a message_type discriminator. Known types:

| message_type | Purpose |
|--------------|---------|
| license | Content license terms |
| data_retention | How long the agent may cache results |
| cache_policy | Caching directives |
| ui_component | Optional rendering hint |
| usage_terms | Acceptable use |
| rate_limits | Calls/sec / day budget |
| data_freshness | When the underlying data was last indexed |
| api_version | Server's NLWeb version |

Client parsing rule: don't assume results is the first chunk. Buffer message objects until you see the result stream or a terminal marker.
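One way to follow this parsing rule is to feed every decoded chunk into a small collector that does not care about arrival order. The class below is a hypothetical sketch; it assumes chunks are already decoded into dicts and that the terminal object carries a `complete` flag, which is unverified.

```python
class AskStreamCollector:
    """Accumulate in-stream messages and results regardless of arrival order."""

    def __init__(self):
        self.messages = {}   # message_type -> content
        self.results = []    # all result items seen so far
        self.complete = False

    def feed(self, obj):
        """Route one decoded chunk to the right bucket."""
        if "message_type" in obj:
            self.messages[obj["message_type"]] = obj.get("content")
        elif "results" in obj:
            self.results.extend(obj["results"])
        elif obj.get("complete"):
            self.complete = True


collector = AskStreamCollector()
for chunk in [
    {"message_type": "license", "content": {"terms": "example"}},
    {"results": [{"url": "https://example.com/a"}]},
    {"message_type": "rate_limits", "content": {"per_day": 1000}},
    {"results": [{"url": "https://example.com/b"}]},
    {"query_id": "abc-123", "complete": True},
]:
    collector.feed(chunk)

print(sorted(collector.messages), len(collector.results), collector.complete)
```

Keeping messages in a dict keyed by `message_type` means a later message of the same type overwrites an earlier one; switch to a list if the server can legitimately send several.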

Non-Streaming Response

With streaming=false, the server returns a single application/json body:

{
  "query_id": "abc-123",
  "messages": [{"message_type": "license", "content": {...}}, ...],
  "results": [
    {
      "url": "https://example.com/article/x",
      "name": "Article X",
      "site": "example",
      "score": 0.83,
      "description": "...",
      "schema_object": { "@type": "Article", "@context": "https://schema.org", ... }
    }
  ]
}

The schema_object is the original Schema.org JSON-LD that was indexed — this is what makes NLWeb results agent-actionable, not just text snippets.
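As a small illustration of why the embedded JSON-LD matters, the sketch below groups result URLs by the Schema.org type found in `schema_object`. The function name and the sample response are hypothetical; the response shape follows the example above and should be verified against the live spec.

```python
def results_by_type(response):
    """Group result URLs by the Schema.org @type of their schema_object."""
    grouped = {}
    for item in response.get("results", []):
        schema = item.get("schema_object") or {}
        types = schema.get("@type", "Unknown")
        # JSON-LD allows @type to be a single string or a list of strings.
        if isinstance(types, str):
            types = [types]
        for type_name in types:
            grouped.setdefault(type_name, []).append(item["url"])
    return grouped


response = {
    "query_id": "abc-123",
    "results": [
        {"url": "https://example.com/article/x",
         "schema_object": {"@type": "Article", "@context": "https://schema.org"}},
        {"url": "https://example.com/recipe/y",
         "schema_object": {"@type": ["Recipe", "Article"]}},
        {"url": "https://example.com/z"},
    ],
}
print(results_by_type(response))
```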

Three Modes

| Mode | Behavior | Use case |
|------|----------|----------|
| list | Return ranked Schema.org results, no LLM synthesis | Agent does its own rendering / re-ranking |
| summarize | LLM condenses top results into a short answer + still returns results | Conversational UIs |
| generate | Full RAG: LLM synthesizes an answer grounded in results | Q&A endpoints |

Errors

For /ask, errors generally come back as 500 with a JSON envelope. For /mcp, errors use JSON-RPC 2.0:

{
  "jsonrpc": "2.0",
  "id": 1,
  "error": { "code": -32603, "message": "Internal error", "data": {...} }
}

Always check the status code before parsing, and treat a stream that ends without a terminal object as incomplete: a partial SSE stream can return 200 and then go silent.
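A minimal sketch of both checks, working on already-decoded payloads: one helper unwraps a JSON-RPC 2.0 response, the other flags a stream that never reached a terminal object. The exception and function names are hypothetical, and the `complete` flag is the unverified terminal marker described earlier.

```python
class NLWebError(Exception):
    """Raised when a response carries an error or ends prematurely."""


def unwrap_jsonrpc(payload):
    """Return the `result` of a JSON-RPC 2.0 response, or raise on `error`."""
    if "error" in payload:
        err = payload["error"]
        raise NLWebError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")
    return payload.get("result")


def ensure_complete(chunks):
    """Raise if a list of decoded SSE chunks never reached a terminal object."""
    if not any(chunk.get("complete") for chunk in chunks):
        raise NLWebError("stream ended without a terminal object")
    return chunks


try:
    unwrap_jsonrpc({"jsonrpc": "2.0", "id": 1,
                    "error": {"code": -32603, "message": "Internal error"}})
except NLWebError as exc:
    print(exc)  # JSON-RPC error -32603: Internal error

try:
    ensure_complete([{"results": [{"url": "https://example.com/a"}]}])
except NLWebError as exc:
    print(exc)  # stream ended without a terminal object
```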

Implementation Guidance

Server-Side (extending the route)

If you need to extend /ask (e.g., add an auth check or custom param):

Locate webserver/routes/api.py
Add middleware in webserver/middleware/ rather than modifying the route directly — keeps you upgrade-safe
Forward to NLWebHandler (core/baseHandler.py) unchanged so the streaming + ranking pipeline still runs
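The steps above can be sketched in a framework-neutral way. Everything here is an assumption made for illustration: the `Request` class stands in for whatever request object the web framework provides, the `X-API-Key` header is invented, and `ask_handler` is a placeholder for the real route. The actual middleware signature depends on the server code in the live repo.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical stand-in for the framework's request object."""
    path: str
    headers: dict = field(default_factory=dict)


def make_api_key_middleware(valid_keys):
    """Return a middleware(request, handler) that guards /ask with an API key."""
    async def middleware(request, handler):
        if request.path == "/ask":
            key = request.headers.get("X-API-Key")  # hypothetical header name
            if key not in valid_keys:
                return {"status": 401, "body": {"error": "invalid API key"}}
        # Pass the request through unchanged so the downstream pipeline still runs.
        return await handler(request)
    return middleware


async def ask_handler(request):
    """Placeholder for the real route handler."""
    return {"status": 200, "body": {"results": []}}


mw = make_api_key_middleware({"secret"})
ok = asyncio.run(mw(Request("/ask", {"X-API-Key": "secret"}), ask_handler))
denied = asyncio.run(mw(Request("/ask"), ask_handler))
print(ok["status"], denied["status"])  # 200 401
```

The design point is that the check wraps the handler rather than editing it, so the route file can be replaced by an upstream update without losing the customization.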

Client-Side (calling /ask)

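# Python streaming client (sketch; verify response shape against the live spec)
import asyncio
import json
import httpx

def handle_header(obj):  # in-stream "header" such as license or rate_limits
    print("header:", obj["message_type"])

def handle_results(results):  # one incremental batch of ranked results
    for item in results:
        print("result:", item.get("url"))

async def main():
    params = {"query": "best running shoes", "site": "shoes", "mode": "generate"}
    # timeout=None: a long-running stream can outlast httpx's default timeout
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream("GET", "http://localhost:8000/ask", params=params) as r:
            r.raise_for_status()  # fail fast on non-2xx before reading the stream
            async for line in r.aiter_lines():
                if not line.startswith("data: "):
                    continue
                obj = json.loads(line[6:])
                if "message_type" in obj:
                    handle_header(obj)
                elif "results" in obj:
                    handle_results(obj["results"])

asyncio.run(main())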

When to use which mode

Agent that re-ranks and selects on its own → mode=list
Quick conversational answer with citations → mode=summarize
Single synthesized answer (chatbot-style) → mode=generate

Debugging /ask

Set streaming=false first — easier to inspect a single JSON body.
Add decontextualized_query to bypass query rewriting and isolate ranking issues.
Try mode=list to see the raw retrieval — if results are bad here, the problem is ingest/embeddings, not the LLM.
Pass query_id and grep server logs for it.
Disable tool_selection_enabled in config_nlweb.yaml to bypass the router and force straight retrieval.
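The request-side steps in this checklist can be bundled into one parameter set. The helper below is a hypothetical sketch that uses the parameter names from the request table; the `debug-` prefix on the generated `query_id` is just a convention chosen here to make log lines easy to grep.

```python
import uuid


def debug_params(query, decontextualized_query=None, query_id=None):
    """Build /ask parameters suited to debugging retrieval."""
    params = {
        "query": query,
        "streaming": "false",  # single JSON body, easier to inspect
        "mode": "list",        # raw retrieval, no LLM synthesis
        "query_id": query_id or f"debug-{uuid.uuid4().hex[:8]}",
    }
    if decontextualized_query is not None:
        # Bypass server-side query rewriting to isolate ranking issues.
        params["decontextualized_query"] = decontextualized_query
    return params


params = debug_params("best running shoes", query_id="debug-shoes-1")
print(params)
```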

Always verify the exact param names and message_type values against the live spec — they evolve.

orcaqubits/nlweb-ask-endpoint

dist/codex/nlweb-protocol/skills/nlweb-ask-endpoint/SKILL.md

Implement and consume the NLWeb /ask REST endpoint — request shape (GET/POST, query-string and v0.55 structured body), SSE streaming response, modes (list/summarize/generate), in-stream "message_type" headers, error envelopes, and client-side parsing. Use when building an NLWeb server route, calling /ask from a custom agent, or debugging /ask responses.

27 stars

tools

Updated May 14, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add orcaqubits/agentic-commerce-claude-plugins nlweb-ask-endpoint

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | What it checks | Status | Reported value |
|---------|----------------|--------|----------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | (not stated) | Skipped | 50% |

Last scanned: May 14, 2026, 5:54 AM, 116.0s, 1 file scanned

Install manually

# Clone the repo
git clone https://github.com/orcaqubits/agentic-commerce-claude-plugins.git

# Copy into Claude Code skills folder (global)
cp -r agentic-commerce-claude-plugins/dist/codex/nlweb-protocol/skills/nlweb-ask-endpoint ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

orcaqubits/agentic-commerce-claude-plugins

27 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT