src/langsmith_cli/SKILL.md
Inspect and manage LangSmith traces, runs, datasets, and prompts using the 'langsmith-cli'.
Use this tool to debug AI chains, inspect past runs, manage datasets, and analyze token costs in LangSmith.
uv tool install langsmith-cli
/plugin marketplace add gigaverse-app/langsmith-cli
See Installation Guide if install fails or for alternative methods.
--json
ALWAYS pass --json for machine-readable output. It can appear before or after the subcommand, but putting it first is the clearest convention.
# ✅ CORRECT
langsmith-cli --json runs list --project my-project --limit 5
# ❌ WRONG — Rich table output, cannot be parsed
langsmith-cli runs list --project my-project --limit 5
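When the CLI is driven from a script, the `--json`-first convention can be enforced in one place. The following is a minimal Python sketch, not part of langsmith-cli: the helper names `build_argv`, `parse_json_output`, and `run_json` are hypothetical, and it assumes `langsmith-cli` is on `PATH` and that stdout is either a single JSON document or one JSON object per line.

```python
import json
import subprocess


def build_argv(*args: str) -> list[str]:
    """Return the argv for a langsmith-cli call with --json placed first."""
    return ["langsmith-cli", "--json", *args]


def parse_json_output(text: str):
    """Parse stdout that is one JSON document or JSON Lines (assumed shapes)."""
    text = text.strip()
    if not text:
        return []
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to one JSON object per line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def run_json(*args: str):
    """Run the CLI and parse its output; check=True raises on a non-zero exit."""
    result = subprocess.run(
        build_argv(*args), capture_output=True, text=True, check=True
    )
    return parse_json_output(result.stdout)
```

A call such as `run_json("runs", "list", "--project", "my-project", "--limit", "5")` would then never produce Rich table output by accident.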
--output for data extraction, never shell redirection
# ✅ CORRECT: atomic write, errors visible, non-zero exit on failure
langsmith-cli --json runs list --project my-project --output runs.jsonl
langsmith-cli runs list --project my-project --format json --output runs.json
python3 -c "import json; runs = [json.loads(l) for l in open('runs.jsonl')]"
# ❌ WRONG — errors go to stderr silently, you get empty/corrupt file
langsmith-cli --json runs list --project my-project > runs.json
# ❌ WRONG — heredoc overrides pipe stdin, python3 reads empty stdin
langsmith-cli --json runs get <id> --fields outputs | python3 << 'EOF'
import sys, json; data = json.load(sys.stdin) # stdin is EMPTY
EOF
Use python3 -c "..." (no heredoc) if you must pipe inline.
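For anything longer than a one-liner, the JSONL file written by `--output` can be read with a small helper instead of a pipe. This is a sketch only: `load_jsonl` is a hypothetical name, and it assumes the file holds one JSON object per line, as the `runs.jsonl` example above suggests.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file into a list, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report the offending line instead of failing silently.
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({exc.msg})"
                ) from exc
    return records
```

Reading from a file rather than stdin sidesteps the heredoc problem entirely, and a corrupt line is reported with its line number.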
Step 1: langsmith-cli --json runs cache list
    ↓
Step 2: Is the project listed with recent data?
    YES → Use `runs cache grep` directly. Zero API calls. STOP.
    NO  → Tell user: "Project X is not in cache. Downloading in background."
          Run `langsmith-cli --json runs cache download ...` in background,
          poll TaskOutput(block=false) for progress, use cache grep when done.
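The decision in Step 2 can be sketched in Python as below. This is an illustration under assumptions: the function names are hypothetical, and the cache listing is assumed to parse into a list of objects carrying the `project_name` and `run_count` fields. It does not check how recent the cached data is, because no timestamp field is named here.

```python
def cached_run_count(cache_entries, project):
    """Return the cached run count for `project`, or 0 if it is not cached."""
    for entry in cache_entries:
        if entry.get("project_name") == project:
            return int(entry.get("run_count") or 0)
    return 0


def next_step(cache_entries, project):
    """Mirror the two-step flow: grep a populated cache, otherwise download."""
    if cached_run_count(cache_entries, project) > 0:
        return "grep"      # use `runs cache grep`, zero API calls
    return "download"      # start `runs cache download` in the background
```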
Red flags — STOP if you're about to:
- Make API calls (`runs list`, `--fetch N`) when the project is already cached
- Run `runs cache download` without first checking `runs cache list`
- […] `runs cache list`
- Run `runs cache download` without `--json`: Rich output is swallowed when captured to a file, leaving you with zero progress visibility
- Use `--fetch N` after a cache download: `--fetch` always hits the API, never the cache

Background download + progress tracking:
# ✅ CORRECT — --json emits {"event":"progress","project":"...","new_runs":N} to stderr per batch
langsmith-cli --json runs cache download --project "dev/my-project" --last 30d
# Run in background, poll TaskOutput(block=false), relay new_runs count to user
# Final stdout: {"event":"download_complete","total_new_runs":N}
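The event lines described above can be consumed with a few lines of Python. The sketch below uses hypothetical helper names and assumes each event arrives as one JSON object per line, with the keys shown in the comments above. Whether `new_runs` is a per-batch or a cumulative count is not stated, so the sketch only reports the latest value.

```python
import json


def parse_events(lines):
    """Yield JSON objects from an iterable of lines, skipping anything else."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # ignore non-JSON noise
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict):
            yield event


def latest_progress(stderr_lines):
    """Return the most recent progress event, or None if none has arrived."""
    latest = None
    for event in parse_events(stderr_lines):
        if event.get("event") == "progress":
            latest = event
    return latest


def final_total(stdout_lines):
    """Return total_new_runs from the download_complete event, or None."""
    for event in parse_events(stdout_lines):
        if event.get("event") == "download_complete":
            return event.get("total_new_runs")
    return None
```

Polling code can call `latest_progress` on whatever stderr has been captured so far and relay `new_runs` to the user, then call `final_total` once the process exits.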
--fields to reduce token usage
langsmith-cli --json runs list --fields id,name,status,start_time
langsmith-cli --json runs get <id> --fields inputs,outputs,error
| Task | Command |
|------|---------|
| List recent runs | langsmith-cli --json runs list --project <name> --limit 10 --fields id,name,status |
| Get a single run | langsmith-cli --json runs get <id> --fields inputs,outputs,error |
| Get run + child outputs | langsmith-cli --json runs get <id> --follow-children --fields id,name,inputs,outputs |
| Get latest run | langsmith-cli --json runs get-latest --project <name> --fields inputs,outputs |
| Get latest error | langsmith-cli --json runs get-latest --project <name> --failed --fields id,name,error |
| Server-side search | langsmith-cli --json runs search "pattern" --fields id,name,status --limit 20 |
| Scoped content search | langsmith-cli --json runs search "pattern" --in outputs --fields id,name,outputs --limit 20 |
| Search cached runs | langsmith-cli --json runs cache grep "pattern" -E --grep-in outputs --project <name> --fields id,name,outputs |
| Download cache | langsmith-cli --json runs cache download --project <name> --last 7d |
| List cache | langsmith-cli --json runs cache list --fields project_name,run_count,path |
| Discover cache schema | langsmith-cli --json runs cache schema --project <name> --include outputs |
| Analyze token costs | langsmith-cli --json runs usage --from-cache --breakdown model --active-only |
| List projects | langsmith-cli --json projects list --name-pattern "dev/*" --fields name |
| Count runs | langsmith-cli --json runs list --project <name> --count |
| Run stats | langsmith-cli --json runs stats --project <name> |
| Stratified run sample | langsmith-cli --json runs sample --project <name> --stratify-by tag:<key> --values a,b --fields id,name,stratum |
| List datasets | langsmith-cli --json datasets list --fields id,name |
| List prompts | langsmith-cli --json prompts list --fields repo_handle,description |
| List feedback for a run | langsmith-cli --json feedback list --run-id <run-id> --fields id,key,score |
| Create feedback | langsmith-cli --json feedback create <run-id> --key correctness --score 0.9 |
| List annotation queues | langsmith-cli --json annotation-queues list --fields id,name |
| Get annotation queue | langsmith-cli --json annotation-queues get <queue-id> |
| View experiment results | langsmith-cli --json experiments results <experiment-name> |
| Open run in browser | Construct URL manually — see LangSmith URLs section below |
When your task matches one of the sections below, you MUST load that reference file before proceeding — don't load them speculatively for unrelated tasks.
- runs list, runs get, runs get-latest, runs search, runs sample, runs analyze, runs tags, runs fields, runs export
- --trace-filter, --tree-filter, --sort-by, --roots, --run-type, --tag, --model, --min-latency, --max-latency
- --metadata key=value (supports wildcards key=val* and regex key=/pattern/)
- --query (server-side, fast, first 250 chars) vs --grep (client-side, all content, regex)
- […] (eq, gt, has, and, search, metadata_key/metadata_value)

- runs usage, runs pricing, runs cache download + --from-cache
- […] (--from-cache, --group-by, --breakdown, --apply-pricing)

- --fields, --count, --output, and --format json|csv|yaml

feedback commands when:
- feedback list [--run-id <id>] [--key <key>] [--limit N] [--fields a,b] [--count] [--output file.jsonl] [--format json|csv|yaml]
- feedback get <id> [--fields a,b] [--output file.json]
- feedback create <run-id> --key <key> [--score N] [--comment <str>]
- feedback delete <id> [--yes]

annotation-queues commands when:
- annotation-queues list [--fields a,b] [--count] [--output file.jsonl] [--format json|csv|yaml]
- annotation-queues get <id> [--fields a,b] [--output file.json]
- annotation-queues create <name> [--description <str>]
- annotation-queues update <id> [--name <str>] [--description <str>]
- annotation-queues delete <id> [--yes]

experiments commands when:
- experiments results <experiment-name>

- […] --filter expression and want operator reference
- metadata_key/metadata_value filter syntax
- […] (extracted_entities): there's a complete recipe covering cache download, Python JSONL scanning, deduplication of sub-runs, and llm_recognition filtering

LangSmith URLs
runs open generates broken URLs. Build trace URLs manually using the project's id and tenant_id:
# Step 1: Get org ID (tenant_id) and project ID
langsmith-cli --json projects get "dev/my-project" --fields id,tenant_id
# Step 2: Build the URL
# https://smith.langchain.com/o/{tenant_id}/projects/p/{project_id}?peek={run_id}&peeked_trace={trace_id}
Example:
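org_id = "b658ea18-0431-42c0-8d03-337d43fed8cf"  # tenant_id from projects get
proj_id = "730acc6c-ec97-4f08-915e-7d3f7f775300"  # id from projects get
run_id = "<run-id>"      # placeholder: the run to open in the side panel
trace_id = "<trace-id>"  # placeholder: the root run of the trace it belongs to
url = f"https://smith.langchain.com/o/{org_id}/projects/p/{proj_id}?peek={run_id}&peeked_trace={trace_id}"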
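The two steps can also be joined in one helper that takes the parsed output of `projects get` directly. This is a sketch under assumptions: `build_trace_url` is a hypothetical name, and the project record is assumed to be a JSON object with `id` and `tenant_id` keys, matching the `--fields` requested in Step 1. `RUN_ID` and `TRACE_ID` are placeholders.

```python
import json
from urllib.parse import urlencode


def build_trace_url(project, run_id, trace_id):
    """Build a LangSmith trace URL from a project record and run identifiers."""
    query = urlencode({"peek": run_id, "peeked_trace": trace_id})
    return (
        f"https://smith.langchain.com/o/{project['tenant_id']}"
        f"/projects/p/{project['id']}?{query}"
    )


# Sample record in the assumed shape of `projects get --fields id,tenant_id`;
# the UUIDs are the example values from this section.
project = json.loads(
    '{"id": "730acc6c-ec97-4f08-915e-7d3f7f775300",'
    ' "tenant_id": "b658ea18-0431-42c0-8d03-337d43fed8cf"}'
)
print(build_trace_url(project, "RUN_ID", "TRACE_ID"))
```

Using `urlencode` keeps the query string valid even if an identifier ever contains characters that need escaping.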
- peek = the specific run ID to open in the side panel
- peeked_trace = the trace (root run) ID it belongs to
- […] (run.id and run.trace_id)

# Multi-project matching
--project-pattern "prd/*" # wildcard alias for --project-name-pattern
--project-name-pattern "prd/*" # wildcard
--project-name-regex "^(prd|stg)" # regex
# Time windows (combinable)
--since 2026-01-15 --before 2026-01-29
--last 7d
--since 2026-01-15 --last 14d # forward window
# Content search
--query "text" # server-side, fast, first ~250 chars only
--grep "pattern" --grep-regex --grep-in inputs,outputs # client-side, all content
# Metadata filter (server-side, supports wildcards and regex)
--metadata channel_id=Gigaverse_Daily_Standup*
--metadata channel_id=/^Gigaverse/
# Reduce output size
--fields id,name,status,start_time
--roots # root traces only (cleaner)
--all-runs # include nested child runs
--limit 10 --fetch 500 # fetch 500 from API, return top 10 matches
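The relationship between `--fetch`, `--grep`, and `--limit` can be pictured as a wide fetch followed by a client-side filter. The Python sketch below illustrates that idea on already-downloaded run records; it is not the CLI's implementation, and `grep_runs` is a hypothetical helper that assumes each run is a dict with optional `inputs` and `outputs` fields.

```python
import json
import re


def grep_runs(runs, pattern, grep_in=("inputs", "outputs"), limit=10):
    """Return up to `limit` runs whose selected fields match the regex."""
    regex = re.compile(pattern)
    matches = []
    for run in runs:
        # Serialize only the fields being searched, skipping missing ones.
        parts = [
            json.dumps(run[field], ensure_ascii=False)
            for field in grep_in
            if run.get(field) is not None
        ]
        if regex.search("\n".join(parts)):
            matches.append(run)
            if len(matches) == limit:
                break
    return matches
```

Here `runs` plays the role of the fetched records (for example 500 of them) and `limit` caps how many matches are returned.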