plugins/nvidia/skills/dynamo-router-starter/SKILL.md
Start or patch Dynamo router modes and run router endpoint smoke checks. Use for round-robin, KV-aware, least-loaded, or device-aware routing setup; use recipe-runner for recipe deployment and troubleshoot for failure diagnosis.
npx skillsauth add openai/plugins dynamo-router-starterInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Make Dynamo routing feel easy by getting a baseline router mode running, enabling KV-aware routing when appropriate, and proving the endpoint works. Keep the user focused on exact commands and success signals, not router internals.
dynamo package importable (python3 -m dynamo.frontend --help works).kubectl configured with access to the target namespace and a deployed Dynamo recipe./v1/models returns at least one entry).Collect or infer:
round-robin, kv, least-loaded, device-aware-weighted, direct, or random/v1/models cannot discover itFor local bring-up with already registered workers:
python3 -m dynamo.frontend --router-mode round-robin --http-port 8000
For Kubernetes, inspect the selected recipe deploy.yaml and locate the
frontend service. If the recipe is not already deployed, use
dynamo-recipe-runner first.
For local frontend:
python3 -m dynamo.frontend --router-mode kv --http-port 8000
For Kubernetes, patch only the frontend service env:
envs:
- name: DYN_ROUTER_MODE
value: kv
If backend workers are not publishing KV cache events, set approximate mode instead of leaving the router waiting for events:
envs:
- name: DYN_ROUTER_USE_KV_EVENTS
value: "false"
After port-forwarding the frontend service or starting local frontend, run:
python3 scripts/check_router_health.py \
--base-url http://127.0.0.1:8000
This must verify /v1/models and, when a model is discoverable, one
/v1/chat/completions request.
When comparing round-robin vs KV routing:
If the endpoint is unhealthy or workers are missing, switch to
dynamo-troubleshoot.
| Script | Purpose | Arguments |
|---|---|---|
| scripts/check_router_health.py | Smoke-test /v1/models and one chat completion against a Dynamo frontend | --base-url, --retries, --timeout |
Invoke via the agentskills.io run_script() protocol:
run_script("scripts/check_router_health.py", args=["--base-url", "http://127.0.0.1:8000"])
Local KV-routed frontend on port 8000, then smoke-test it:
python3 -m dynamo.frontend --router-mode kv --http-port 8000 &
python3 scripts/check_router_health.py --base-url http://127.0.0.1:8000
Kubernetes-deployed frontend reachable via port-forward:
kubectl port-forward svc/qwen-vllm-disagg-frontend 8000:8000 -n dynamo-demo &
python3 scripts/check_router_health.py --base-url http://127.0.0.1:8000 --retries 3
Equivalent through the agent protocol:
run_script("scripts/check_router_health.py", args=["--base-url", "http://127.0.0.1:8000", "--retries", "3"])
Return:
dynamo-benchmark for throughput/latency numbers.| Symptom | Likely cause | Next step |
|---|---|---|
| /v1/models returns empty list | No worker registered with the frontend | Verify worker pods are Ready; confirm they connect to the same etcd/NATS |
| Smoke chat request times out | Frontend up, workers not serving | Switch to dynamo-troubleshoot; inspect worker logs |
| KV mode hangs | Workers do not publish KV cache events | Set DYN_ROUTER_USE_KV_EVENTS=false (approximate mode) |
| Connection refused on port-forward | Port-forward dropped or wrong service name | Re-run port-forward; verify the frontend service name matches the recipe |
See BENCHMARK.md for the NVCARPS-EVAL performance report (auto-generated by the NVSkills CI pipeline). To refresh, re-run /nvskills-ci on an upstream PR touching this skill.
references/router-modes.md for the compact mode/env map.scripts/check_router_health.py for endpoint smoke tests.tools
Top-level workflow skill for USD performance diagnosis and optimization. Use for slow loading, high memory, low FPS, or 'optimize my scene' requests; delegates auth/runtime setup to Phase 0 owners.
data-ai
Use when the user mentions MagicPath, designs, UI components, themes, canvas selections, or repo-to-canvas UI work; run magicpath-ai to search, inspect, install, or author components.
documentation
Use as the top-level router for Omniverse Realtime Viewer USD app requests and focused viewer reference documents.
tools
Turn Notion specs into implementation plans, tasks, and progress tracking; use when implementing PRDs/feature specs and creating Notion plans + tasks from them.