skills/langsmith-trace-analyzer/SKILL.md
Fetch, organize, and analyze LangSmith traces for debugging and evaluation. Use when you need to: query traces/runs by project, metadata, status, or time window; download traces to JSON; organize outcomes into passed/failed/error buckets; analyze token/message/tool-call patterns; compare passed vs failed behavior; or investigate benchmark and production failures.
npx skillsauth add lubu-labs/langchain-agent-skills langsmith-trace-analyzer
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Use this skill to move from raw LangSmith traces to actionable debugging/evaluation insights.
# Install dependencies
uv pip install langsmith langsmith-fetch
# Auth
export LANGSMITH_API_KEY=<your_langsmith_api_key>
- scripts/download_traces.py (or scripts/download_traces.ts).
- scripts/analyze_traces.py.
- references/filtering-querying.md for query/filter syntax
- references/analysis-patterns.md for deeper diagnostics
- references/benchmark-analysis.md for benchmark-specific workflows

Known trace IDs
Use langsmith-fetch trace <id> directly, or --trace-ids in downloader scripts.
Need to discover traces first
Use LangSmith SDK list_runs/listRuns with filters, then download selected trace IDs.
Need aggregate insights
Run analyze_traces.py for summary stats, patterns, and passed-vs-failed comparisons.
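The "discover first, then download" path above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: `collect_trace_ids` is a hypothetical helper, and it assumes each run object exposes `id` and (usually) `trace_id` attributes, as objects returned by the SDK's `list_runs` do. Stand-in objects are used here so the sketch runs without network access.

```python
from types import SimpleNamespace
from typing import Iterable, List


def collect_trace_ids(runs: Iterable, limit: int = 100) -> List[str]:
    """Return unique trace IDs from run objects, preserving first-seen order.

    Hypothetical helper: assumes each run has `id` and optionally `trace_id`.
    """
    seen = set()
    ordered: List[str] = []
    for run in runs:
        # Fall back to the run's own id when trace_id is missing or None.
        trace_id = str(getattr(run, "trace_id", None) or run.id)
        if trace_id not in seen:
            seen.add(trace_id)
            ordered.append(trace_id)
        if len(ordered) >= limit:
            break
    return ordered


# Stand-in runs; in practice, pass the iterator from client.list_runs(...).
demo_runs = [
    SimpleNamespace(id="a", trace_id="t1"),
    SimpleNamespace(id="b", trace_id="t1"),  # same trace, should be deduplicated
    SimpleNamespace(id="c", trace_id=None),  # no trace_id, falls back to id
]
print(collect_trace_ids(demo_runs))
```

The resulting list can then be handed to the downloader's --trace-ids option or fetched one ID at a time.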
Python:
uv run skills/langsmith-trace-analyzer/scripts/download_traces.py \
  --project "my-project" \
  --filter "job_id=abc123" \
  --last-hours 24 \
  --limit 100 \
  --output ./traces \
  --organize
TypeScript:
ts-node skills/langsmith-trace-analyzer/scripts/download_traces.ts \
  --project "my-project" \
  --filter "job_id=abc123" \
  --last-hours 24 \
  --limit 100 \
  --output ./traces
Output layout:
traces/
├── manifest.json
└── by-outcome/
    ├── passed/
    ├── failed/
    └── error/
        ├── GraphRecursionError/
        ├── TimeoutError/
        └── DaytonaError/
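As a quick sanity check on a downloaded tree, the layout above can be tallied per bucket. The sketch below is an assumption-laden illustration: `count_by_outcome` is a hypothetical helper, and it assumes each trace is stored as a `.json` file inside the bucket folders (the file naming used by the real scripts is not specified here).

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_by_outcome(root: Path) -> Counter:
    """Count trace files per bucket under <root>/by-outcome.

    Hypothetical helper: assumes one .json file per trace. Buckets are keyed
    by their folder path relative to by-outcome, e.g. "error/TimeoutError".
    """
    counts: Counter = Counter()
    base = root / "by-outcome"
    for path in base.rglob("*.json"):
        bucket = "/".join(path.relative_to(base).parts[:-1])
        counts[bucket] += 1
    return counts


# Build a tiny throwaway tree that mirrors the layout, then count it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("passed/a.json", "failed/b.json", "error/TimeoutError/c.json"):
        target = root / "by-outcome" / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("{}")
    demo_counts = count_by_outcome(root)

print(dict(demo_counts))
```

Keying error traces by "error/<ErrorType>" keeps the per-exception breakdown visible without a second pass over the tree.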
Notes:
- --organize/--no-organize.
- langsmith-fetch for full trace payload export.

# Markdown report
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --output report.md
# JSON output
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --json
# Compare passed vs failed (expects by-outcome folders)
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --compare --output comparison.md
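To make the passed-vs-failed comparison concrete, here is a minimal sketch of one such metric: mean token usage per group. It is not the logic of analyze_traces.py, whose internals are not shown here. `mean_tokens` and `compare_groups` are hypothetical names, and the sketch assumes each trace has already been loaded into a dict with a numeric `total_tokens` field.

```python
from statistics import mean
from typing import Dict, List, Optional


def mean_tokens(traces: List[dict]) -> Optional[float]:
    """Average total_tokens over traces that report a numeric value."""
    values = [
        t["total_tokens"]
        for t in traces
        if isinstance(t.get("total_tokens"), (int, float))
    ]
    return mean(values) if values else None


def compare_groups(passed: List[dict], failed: List[dict]) -> Dict[str, Optional[float]]:
    """Report per-group means and the failed-minus-passed difference."""
    p, f = mean_tokens(passed), mean_tokens(failed)
    delta = f - p if p is not None and f is not None else None
    return {
        "passed_mean_tokens": p,
        "failed_mean_tokens": f,
        "failed_minus_passed": delta,
    }


# Illustrative values only, not real measurements.
demo_passed = [{"total_tokens": 1000}, {"total_tokens": 2000}]
demo_failed = [{"total_tokens": 6000}, {"total_tokens": None}]  # None is skipped
demo_report = compare_groups(demo_passed, demo_failed)
print(demo_report)
```

A large positive difference suggests failed runs burn more tokens before stopping, which is worth checking against recursion or timeout errors.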
The analyzer reports:
Use official LangSmith run filter syntax via filter and/or start_time:
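from datetime import datetime, timedelta, timezone

from langsmith import Client

# Client() reads LANGSMITH_API_KEY from the environment.
client = Client()

# Only consider runs started in the last 24 hours (UTC).
start = datetime.now(timezone.utc) - timedelta(hours=24)

# Match runs whose metadata contains job_id == "abc123".
filter_query = 'and(eq(metadata_key, "job_id"), eq(metadata_value, "abc123"))'

# is_root=True restricts results to top-level (root) runs.
runs = client.list_runs(
    project_name="my-project",
    is_root=True,
    start_time=start,
    filter=filter_query,
)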
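When the metadata key and value come from variables, the filter string and time window can be built with small helpers rather than by hand. The sketch below is illustrative: `metadata_eq_filter` and `window_start` are hypothetical names, and quoting values with `json.dumps` assumes the filter grammar accepts JSON-style double-quoted strings, as in the example query.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional


def metadata_eq_filter(key: str, value: str) -> str:
    """Build a filter matching runs whose metadata[key] equals value."""
    # json.dumps adds the surrounding double quotes (assumed compatible quoting).
    return (
        f"and(eq(metadata_key, {json.dumps(key)}), "
        f"eq(metadata_value, {json.dumps(value)}))"
    )


def window_start(last_hours: float, now: Optional[datetime] = None) -> datetime:
    """Return the UTC timestamp `last_hours` before `now` (default: current time)."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(hours=last_hours)


demo_filter = metadata_eq_filter("job_id", "abc123")
print(demo_filter)
```

The two return values map onto the `filter` and `start_time` arguments of `list_runs`.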
For TypeScript:
import { Client } from "langsmith";

const client = new Client();

for await (const run of client.listRuns({
  projectName: "my-project",
  isRoot: true,
  filter: 'and(eq(metadata_key, "job_id"), eq(metadata_value, "abc123"))',
})) {
  console.log(run.id, run.status);
}
- status, error, total_tokens, start_time, end_time).
- metadata or extra.metadata) and/or messages.
- analyze_traces.py is resilient to multiple payload shapes, including raw array payloads.
- list_runs results.

| Issue | Likely Cause | Action |
|---|---|---|
| LANGSMITH_API_KEY missing | Auth not configured | export LANGSMITH_API_KEY=<your_langsmith_api_key> |
| No runs returned | Wrong project/filter/time range | Verify project name and filter syntax |
| Empty/partial message arrays | Run schema differs or incomplete data | Use downloaded trace JSON and inspect status/error fields |
| JSON parse error on downloaded files | Bad/incomplete export | Re-download trace; use --format raw paths in scripts |
| Re-downloading same traces repeatedly | Existing files in nested folders | Use current scripts (they check existing files across output tree) |
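For the "Empty/partial message arrays" row, it can help to normalize payloads before inspecting them. The sketch below shows one way to tolerate the shapes mentioned earlier (metadata under `metadata` or `extra.metadata`, messages under a `messages` key or as a raw array). It is not the analyzer's actual code; `extract_metadata` and `extract_messages` are hypothetical helpers.

```python
from typing import Any


def extract_metadata(payload: Any) -> dict:
    """Return run metadata from `metadata` or `extra.metadata`, else {}."""
    if not isinstance(payload, dict):
        return {}
    meta = payload.get("metadata")
    if isinstance(meta, dict):
        return meta
    extra = payload.get("extra")
    if isinstance(extra, dict) and isinstance(extra.get("metadata"), dict):
        return extra["metadata"]
    return {}


def extract_messages(payload: Any) -> list:
    """Return the message list from a dict payload or a raw array payload."""
    if isinstance(payload, list):
        return payload  # raw array payload: the list is the messages
    if isinstance(payload, dict) and isinstance(payload.get("messages"), list):
        return payload["messages"]
    return []  # unknown shape: report no messages rather than raising


demo_nested = {"extra": {"metadata": {"job_id": "abc123"}}, "messages": [{"type": "human"}]}
demo_raw = [{"type": "human"}, {"type": "ai"}]
print(extract_metadata(demo_nested), len(extract_messages(demo_raw)))
```

Returning empty containers for unrecognized shapes keeps a batch analysis running, at the cost of silently skipping malformed files, so counting the empties separately is advisable.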
- manifest.json, trace JSON dumps) unless sanitized.

- scripts/download_traces.py: Python downloader + organizer
- scripts/download_traces.ts: TypeScript downloader + organizer
- scripts/analyze_traces.py: Offline analysis and reporting
- references/filtering-querying.md: LangSmith query/filter examples
- references/analysis-patterns.md: Diagnostic patterns and heuristics
- references/benchmark-analysis.md: Benchmark-oriented analysis