Reinasboo/langsmith

Name: langsmith
Author: Reinasboo

.agents/skills/langsmith/SKILL.md

npx skillsauth add Reinasboo/Bountylab langsmith

langsmith — LLM Observability, Evaluation & Prompt Management

Keywords: langsmith · llm tracing · llm evaluation · @traceable · langsmith evaluate

LangSmith is a framework-agnostic platform for developing, debugging, and deploying LLM applications. It provides end-to-end tracing, quality evaluation, prompt versioning, and production monitoring.

When to use this skill

Add tracing to any LLM pipeline (OpenAI, Anthropic, LangChain, custom models)
Run offline evaluations with evaluate() against a curated dataset
Set up production monitoring and online evaluation
Manage and version prompts in the Prompt Hub
Create datasets for regression testing and benchmarking
Attach human or automated feedback to traces
Use LLM-as-judge scoring with openevals
Debug agent failures with end-to-end trace inspection

Instructions

Install SDK: pip install -U langsmith (Python) or npm install langsmith (TypeScript)
Set environment variables: LANGSMITH_TRACING=true, LANGSMITH_API_KEY=lsv2_...
Instrument with @traceable decorator or wrap_openai() wrapper
View traces at smith.langchain.com
For evaluation setup, see references/python-sdk.md
For CLI commands, see references/cli.md
Run bash scripts/setup.sh to auto-configure environment
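
Before instrumenting anything, it can help to confirm that the environment variables listed above are actually set. The following is a minimal preflight sketch using only the standard library. The helper name `missing_langsmith_env` is hypothetical and is not part of the LangSmith SDK or of scripts/setup.sh.

```python
import os
from typing import Mapping

# Variables named in the instructions above.
REQUIRED_VARS = ("LANGSMITH_TRACING", "LANGSMITH_API_KEY")


def missing_langsmith_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return human-readable problems; an empty list means the basics look set."""
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    # The instructions use the literal value "true" to switch tracing on.
    if env.get("LANGSMITH_TRACING", "").lower() not in ("", "true"):
        problems.append("LANGSMITH_TRACING is set but not 'true'")
    return problems


if __name__ == "__main__":
    for problem in missing_langsmith_env():
        print("warning:", problem)
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise in tests without touching the real environment.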

API Key: Get from smith.langchain.com → Settings → API Keys
Docs: https://docs.langchain.com/langsmith

Quick Start

Python

pip install -U langsmith openai
export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY="lsv2_..."
export OPENAI_API_KEY="sk-..."

from langsmith import traceable
from langsmith.wrappers import wrap_openai
from openai import OpenAI

client = wrap_openai(OpenAI())

@traceable
def rag_pipeline(question: str) -> str:
    """Automatically traced in LangSmith"""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}]
    )
    return response.choices[0].message.content

result = rag_pipeline("What is LangSmith?")

TypeScript

npm install langsmith openai
export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY="lsv2_..."

import { traceable } from "langsmith/traceable";
import { wrapOpenAI } from "langsmith/wrappers";
import { OpenAI } from "openai";

const client = wrapOpenAI(new OpenAI());

const pipeline = traceable(async (question: string): Promise<string> => {
  const res = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: question }],
  });
  return res.choices[0].message.content ?? "";
}, { name: "RAG Pipeline" });

await pipeline("What is LangSmith?");

Core Concepts

| Concept | Description |
|---------|-------------|
| Run | Individual operation (LLM call, tool call, retrieval). The fundamental unit. |
| Trace | All runs from a single user request, linked by trace_id. |
| Thread | Multiple traces in a conversation, linked by session_id or thread_id. |
| Project | Container grouping related traces (set via LANGSMITH_PROJECT). |
| Dataset | Collection of {inputs, outputs} examples for offline evaluation. |
| Experiment | Result set from running evaluate() against a dataset. |
| Feedback | Score/label attached to a run: numeric, categorical, or freeform. |
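
The Run, Trace, and Thread relationship in the table can be pictured as a simple nesting. The sketch below is illustrative only: the `Run` dataclass and its field names are assumptions made for this example and are not the SDK's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    """One operation (LLM call, tool call, retrieval)."""
    run_id: str
    trace_id: str                    # shared by every run in the same request
    thread_id: Optional[str] = None  # shared by every trace in a conversation


def group_runs(runs: list[Run]) -> dict:
    """Nest runs as thread -> trace -> [run ids]."""
    threads: dict = defaultdict(lambda: defaultdict(list))
    for run in runs:
        threads[run.thread_id][run.trace_id].append(run.run_id)
    return {thread: dict(traces) for thread, traces in threads.items()}


runs = [
    Run("r1", trace_id="t1", thread_id="chat-1"),
    Run("r2", trace_id="t1", thread_id="chat-1"),
    Run("r3", trace_id="t2", thread_id="chat-1"),
]
grouped = group_runs(runs)
print(grouped)
```

Here two runs share trace `t1`, a third belongs to trace `t2`, and both traces sit in the same conversation thread.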

Tracing

@traceable decorator (Python)

from langsmith import traceable

@traceable(
    run_type="chain",          # llm | chain | tool | retriever | embedding
    name="My Pipeline",
    tags=["production", "v2"],
    metadata={"version": "2.1", "env": "prod"},
    project_name="my-project"
)
def pipeline(question: str) -> str:
    return generate_answer(question)
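
To make the idea of a tracing decorator concrete, here is a toy stand-in written with the standard library. It is not how langsmith implements @traceable. The names `toy_traceable` and `RECORDED` are hypothetical, and the sketch only shows the general technique: wrap a function, record its inputs, outputs, and latency, and use a context variable to link nested calls to their parent.

```python
import contextvars
import functools
import time

# Illustrative stand-in only: NOT the langsmith implementation.
_current_run = contextvars.ContextVar("current_run", default=None)
RECORDED = []  # finished runs, innermost first


def toy_traceable(name):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            parent = _current_run.get()
            run = {
                "name": name,
                "parent": parent["name"] if parent else None,
                "inputs": {"args": args, "kwargs": kwargs},
            }
            token = _current_run.set(run)  # nested calls now see this run as parent
            start = time.perf_counter()
            try:
                run["outputs"] = func(*args, **kwargs)
                return run["outputs"]
            finally:
                run["latency_s"] = time.perf_counter() - start
                _current_run.reset(token)
                RECORDED.append(run)
        return wrapper
    return decorator


@toy_traceable("retrieve")
def retrieve(question):
    return ["doc about " + question]


@toy_traceable("pipeline")
def pipeline(question):
    return " | ".join(retrieve(question))


answer = pipeline("tracing")
```

After the call, `RECORDED` holds two entries: the inner `retrieve` run with `pipeline` as its parent, followed by the outer `pipeline` run with no parent.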

Selective tracing context

import langsmith as ls

# Enable tracing for this block only
with ls.tracing_context(enabled=True, project_name="debug"):
    result = chain.invoke({"input": "..."})

# Disable tracing despite LANGSMITH_TRACING=true
with ls.tracing_context(enabled=False):
    result = chain.invoke({"input": "..."})

Wrap provider clients

from langsmith.wrappers import wrap_openai, wrap_anthropic
from openai import OpenAI
import anthropic

openai_client = wrap_openai(OpenAI())           # All calls auto-traced
anthropic_client = wrap_anthropic(anthropic.Anthropic())

Distributed tracing (microservices)

from langsmith.run_helpers import get_current_run_tree
import langsmith

@langsmith.traceable
def service_a(inputs):
    rt = get_current_run_tree()
    headers = rt.to_headers()     # Pass to child service
    return call_service_b(headers=headers)

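@langsmith.traceable
def service_b(x):
    return process(x)

def handle_request(x, headers):
    # Restore the parent context from the incoming headers before calling
    # traced code, so service_b's run attaches to service_a's trace.
    with langsmith.tracing_context(parent=headers):
        return service_b(x)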
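
The mechanism behind distributed tracing is that the parent context travels between services as plain request headers. The sketch below shows that round trip with the standard library only. The header name `x-toy-trace-id` and all function names are hypothetical. The real SDK chooses its own header keys and encoding.

```python
import uuid

# Hypothetical header name for this sketch; the real SDK uses its own keys.
TRACE_HEADER = "x-toy-trace-id"


def start_trace() -> dict:
    """Service A: open a root run and remember its trace id."""
    return {"trace_id": str(uuid.uuid4()), "run": "service_a", "parent": None}


def to_headers(run: dict) -> dict:
    """Serialize just enough context to link a remote child run."""
    return {TRACE_HEADER: f"{run['trace_id']}/{run['run']}"}


def child_from_headers(headers: dict, name: str) -> dict:
    """Service B: rebuild the parent link, or start a new trace if none was sent."""
    value = headers.get(TRACE_HEADER)
    if value is None:
        return {"trace_id": str(uuid.uuid4()), "run": name, "parent": None}
    trace_id, parent = value.split("/", 1)
    return {"trace_id": trace_id, "run": name, "parent": parent}


root = start_trace()
child = child_from_headers(to_headers(root), "service_b")
orphan = child_from_headers({}, "service_b")
```

`child` shares the root's trace id and names `service_a` as its parent, while `orphan` shows what happens when the headers are dropped: the downstream work starts a separate trace.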

Evaluation

Basic evaluation with evaluate()

from langsmith import Client
from langsmith.wrappers import wrap_openai
from openai import OpenAI

client = Client()
oai = wrap_openai(OpenAI())

# 1. Create dataset
dataset = client.create_dataset("Geography QA")
client.create_examples(
    dataset_id=dataset.id,
    examples=[
        {"inputs": {"q": "Capital of France?"}, "outputs": {"a": "Paris"}},
        {"inputs": {"q": "Capital of Germany?"}, "outputs": {"a": "Berlin"}},
    ]
)

# 2. Target function
def target(inputs: dict) -> dict:
    res = oai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": inputs["q"]}]
    )
    return {"a": res.choices[0].message.content}

# 3. Evaluator
def exact_match(inputs, outputs, reference_outputs):
    return outputs["a"].strip().lower() == reference_outputs["a"].strip().lower()

# 4. Run experiment
results = client.evaluate(
    target,
    data="Geography QA",
    evaluators=[exact_match],
    experiment_prefix="gpt-4o-mini-v1",
    max_concurrency=4
)
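
Conceptually, an offline evaluation calls the target on each dataset example, scores the result with each evaluator, and aggregates the scores. The following offline sketch mimics that loop with a stubbed target so it runs without network access. `run_experiment` and `stub_target` are hypothetical names, and the real evaluate() does considerably more (tracing, concurrency, experiment storage).

```python
from statistics import mean
from typing import Callable


def run_experiment(target: Callable[[dict], dict], examples: list[dict],
                   evaluators: list[Callable]) -> dict:
    """Call target per example, score with each evaluator, average per evaluator."""
    scores: dict[str, list[float]] = {ev.__name__: [] for ev in evaluators}
    for example in examples:
        outputs = target(example["inputs"])
        for ev in evaluators:
            result = ev(example["inputs"], outputs, example["outputs"])
            scores[ev.__name__].append(float(result))
    return {name: mean(values) for name, values in scores.items()}


# Stub standing in for the model call, so the sketch runs offline.
def stub_target(inputs: dict) -> dict:
    answers = {"Capital of France?": "paris ", "Capital of Germany?": "Bonn"}
    return {"a": answers[inputs["q"]]}


def exact_match(inputs, outputs, reference_outputs):
    return outputs["a"].strip().lower() == reference_outputs["a"].strip().lower()


examples = [
    {"inputs": {"q": "Capital of France?"}, "outputs": {"a": "Paris"}},
    {"inputs": {"q": "Capital of Germany?"}, "outputs": {"a": "Berlin"}},
]
summary = run_experiment(stub_target, examples, [exact_match])
print(summary)
```

The stub answers one question correctly (after normalization) and one incorrectly, so the averaged exact-match score is 0.5.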

LLM-as-judge with openevals

pip install -U openevals

from openevals.llm import create_llm_as_judge
from openevals.prompts import CORRECTNESS_PROMPT

judge = create_llm_as_judge(
    prompt=CORRECTNESS_PROMPT,
    model="openai:o3-mini",
    feedback_key="correctness",
)

results = client.evaluate(target, data="my-dataset", evaluators=[judge])
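
The LLM-as-judge technique itself is small: format a grading prompt, send it to a model, and turn the verdict into a score. The sketch below shows that shape with the model call injected as a plain function, so a fake model can stand in offline. `make_judge`, `JUDGE_TEMPLATE`, and `fake_model` are hypothetical and are not part of openevals. The prompt wording is likewise an assumption and is not CORRECTNESS_PROMPT.

```python
from typing import Callable

# Hypothetical grading prompt for this sketch only.
JUDGE_TEMPLATE = (
    "Question: {question}\n"
    "Reference answer: {reference}\n"
    "Submitted answer: {answer}\n"
    "Reply with exactly CORRECT or INCORRECT."
)


def make_judge(call_model: Callable[[str], str], feedback_key: str = "correctness"):
    """Build an evaluator that asks a model to grade the answer."""
    def judge(inputs, outputs, reference_outputs):
        prompt = JUDGE_TEMPLATE.format(
            question=inputs["q"],
            reference=reference_outputs["a"],
            answer=outputs["a"],
        )
        verdict = call_model(prompt).strip().upper()
        return {"key": feedback_key, "score": verdict == "CORRECT"}
    return judge


# Fake model so the sketch runs without an API key.
def fake_model(prompt: str) -> str:
    return "correct\n" if "Submitted answer: Paris" in prompt else "incorrect"


judge = make_judge(fake_model)
good = judge({"q": "Capital of France?"}, {"a": "Paris"}, {"a": "Paris"})
bad = judge({"q": "Capital of France?"}, {"a": "Lyon"}, {"a": "Paris"})
```

Injecting the model call keeps the parsing logic testable on its own, separately from any provider client.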

Evaluation types

| Type | When to use |
|------|------------|
| Code/Heuristic | Exact match, format checks, rule-based |
| LLM-as-judge | Subjective quality, safety, reference-free |
| Human | Annotation queues, pairwise comparison |
| Pairwise | Compare two app versions |
| Online | Production traces, real traffic |
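
As an illustration of the Code/Heuristic row, a format check can be written as a plain function with the same (inputs, outputs, reference_outputs) signature used by exact_match earlier. This is a minimal sketch: the function name is hypothetical, and the "a" output key is carried over from the Geography QA example.

```python
import json


def is_valid_json_object(inputs: dict, outputs: dict, reference_outputs: dict) -> bool:
    """Format check: does the answer parse as a JSON object?"""
    try:
        return isinstance(json.loads(outputs["a"]), dict)
    except (json.JSONDecodeError, TypeError, KeyError):
        # Not JSON, not a string, or no "a" key at all.
        return False
```

Because the check needs no reference answer, the same function could also be applied to production outputs where no ground truth exists.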

Prompt Hub

from langsmith import Client
from langchain_core.prompts import ChatPromptTemplate

client = Client()

# Push a prompt
prompt = ChatPromptTemplate([
    ("system", "You are a helpful assistant."),
    ("user", "{question}"),
])
client.push_prompt("my-assistant-prompt", object=prompt)

# Pull and use
prompt = client.pull_prompt("my-assistant-prompt")
# Pull specific version:
prompt = client.pull_prompt("my-assistant-prompt:abc123")

Feedback

from langsmith import Client
import uuid

client = Client()

# Custom run ID for later feedback linking
my_run_id = str(uuid.uuid4())
result = chain.invoke({"input": "..."}, {"run_id": my_run_id})

# Attach feedback
client.create_feedback(
    key="correctness",
    score=1,              # numeric score (0-1 here); use value=... for categorical labels
    run_id=my_run_id,
    comment="Accurate and concise"
)
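
A common source of feedback is a thumbs up or down from an end user. The sketch below maps that signal onto the keyword arguments used in the create_feedback call above. The helper name `thumbs_feedback` and the key `user_rating` are assumptions for this example. No client is created, so it runs offline.

```python
import uuid
from typing import Optional


def thumbs_feedback(run_id: str, thumbs_up: bool, comment: Optional[str] = None) -> dict:
    """Build keyword arguments in the shape used by create_feedback above."""
    payload = {"run_id": run_id, "key": "user_rating", "score": 1 if thumbs_up else 0}
    if comment:
        payload["comment"] = comment
    return payload


run_id = str(uuid.uuid4())
kwargs = thumbs_feedback(run_id, thumbs_up=False, comment="Cited the wrong source")
# With a real client this would be passed along as: client.create_feedback(**kwargs)
```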

References

Python SDK Reference — full Client API, @traceable signature, evaluate()
TypeScript SDK Reference — Client, traceable, wrappers, evaluate
CLI Reference — langsmith CLI commands
Official Docs — langchain.com/langsmith
SDK GitHub — MIT License, v0.7.17
openevals — Prebuilt LLM evaluators

Instrument, trace, evaluate, and monitor LLM applications and AI agents with LangSmith. Use when setting up observability for LLM pipelines, running offline or online evaluations, managing prompts in the Prompt Hub, creating datasets for regression testing, or deploying agent servers. Triggers on: langsmith, langchain tracing, llm tracing, llm observability, llm evaluation, trace llm calls, @traceable, wrap_openai, langsmith evaluate, langsmith dataset, langsmith feedback, langsmith prompt hub, langsmith project, llm monitoring, llm debugging, llm quality, openevals, langsmith cli, langsmith experiment, annotate llm, llm judge.

tools

Updated Apr 15, 2026

The npx skillsauth command above installs this skill globally. It works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percent |
|---------|-------------|--------|---------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 6, 2026, 11:35 AM · 6.1s · 6 files scanned

SKILL.md

name: langsmith
description: >
  agent servers. Triggers on: langsmith, langchain tracing, llm tracing, llm observability,
allowed-tools: Bash Read Write Edit Glob Grep WebFetch
tags: langsmith, langchain, tracing, observability, evaluation, llm-monitoring, prompt-hub, datasets, openevals
version: 1.0
source: https://docs.langchain.com/langsmith/home
license: MIT

Install manually

# Clone the repo
git clone https://github.com/Reinasboo/Bountylab.git

# Copy into Claude Code skills folder (global)
cp -r Bountylab/.agents/skills/langsmith ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository: Reinasboo/Bountylab

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT