
stevengonsalvez/claude-langfuse

Name: claude-langfuse
Author: stevengonsalvez

toolkit/packages/skills/claude-langfuse/SKILL.md

npx skillsauth add stevengonsalvez/agents-in-a-box claude-langfuse


Claude Langfuse Observability Skill

Analyze Claude Code session traces stored in Langfuse to extract learnings, identify patterns, and drive continuous improvement.

Sub-Commands

| Command | Description |
|---------|-------------|
| /claude-langfuse | Show help and available sub-commands |
| /claude-langfuse:status | Current session status and recent traces |
| /claude-langfuse:reflect | Analyze recent sessions for learnings and corrections |
| /claude-langfuse:insights [trace_id] | Deep analysis of a specific session |
| /claude-langfuse:patterns | Identify recurring patterns across sessions |

Usage

Status Check

/claude-langfuse:status

Shows:

Current session trace ID and observation count
Last 5 sessions with quick stats
Tool usage breakdown
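
The tool usage breakdown listed above could be computed along these lines. This is a minimal sketch rather than the skill's actual status.py: the shape of each observation (a dict with a `name` key holding the tool name) is an assumption made for illustration.

```python
from collections import Counter


def tool_usage_breakdown(observations):
    """Count how often each tool appears in a list of observations.

    Each observation is assumed (hypothetically) to be a dict whose
    "name" key holds the tool name; entries without one are skipped.
    """
    counts = Counter(obs["name"] for obs in observations if obs.get("name"))
    # most_common() sorts by descending count; ties keep first-seen order.
    return counts.most_common()


if __name__ == "__main__":
    sample = [{"name": "Bash"}, {"name": "Read"}, {"name": "Bash"}, {}]
    for tool, count in tool_usage_breakdown(sample):
        print(f"{tool}: {count}")
```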

Reflect on Sessions

/claude-langfuse:reflect
/claude-langfuse:reflect --sessions 10
/claude-langfuse:reflect --since 2024-01-01

Analyzes traces to find:

High confidence signals: Explicit corrections ("never", "always", "don't", "must")
Medium confidence signals: Patterns that worked well, positive feedback
Low confidence signals: Observations and preferences to review later
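
The `--sessions` and `--since` flags shown above could be parsed roughly as follows. This is a sketch under assumptions: the function name `parse_reflect_args` is hypothetical, and the default of 5 sessions is a guess, not a documented value.

```python
import argparse
from datetime import date


def parse_reflect_args(argv):
    """Parse the reflect flags; the defaults here are assumptions."""
    parser = argparse.ArgumentParser(prog="reflect")
    parser.add_argument(
        "--sessions", type=int, default=5,
        help="number of recent sessions to analyze (default is assumed)",
    )
    parser.add_argument(
        "--since", type=date.fromisoformat, default=None,
        help="only analyze sessions on or after this YYYY-MM-DD date",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    print(parse_reflect_args(["--sessions", "10", "--since", "2024-01-01"]))
```

Using `date.fromisoformat` as the argument type lets argparse reject malformed dates with a usage error instead of passing a raw string downstream.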

Deep Insights

/claude-langfuse:insights <trace_id>

Provides detailed analysis of a specific session including:

Full timeline of tool usage
User prompt analysis
Error patterns
Success patterns

Implementation

When this skill is invoked, execute the appropriate sub-command:

For `/claude-langfuse` or `/claude-langfuse:status`:

Query Langfuse API for recent traces
Display current session info
Show summary statistics

source ~/.secrets && python3 {{HOME_TOOL_DIR}}/skills/claude-langfuse/utils/status.py
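
The "query Langfuse API for recent traces" step could look something like the sketch below. It assumes Langfuse's public REST endpoint `/api/public/traces` with HTTP Basic auth (public key as username, secret key as password); the real status.py is not shown in this document and may use the Langfuse SDK instead. The helper name `build_traces_request` is hypothetical, and the function only builds the request so that it can be inspected without network access.

```python
import base64
import urllib.parse
import urllib.request


def build_traces_request(host, public_key, secret_key, limit=5):
    """Build (but do not send) a GET request for the most recent traces."""
    query = urllib.parse.urlencode({"limit": limit})
    url = f"{host.rstrip('/')}/api/public/traces?{query}"
    # Basic auth: base64("public_key:secret_key")
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


if __name__ == "__main__":
    req = build_traces_request("https://cloud.langfuse.com", "pk-lf-...", "sk-lf-...")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req) and json-decode the body.
```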

For `/claude-langfuse:reflect`:

Fetch recent session traces from Langfuse
Extract user prompts and tool outputs
Scan for correction signals (high/medium/low confidence)
Match learnings to relevant agent files
Propose updates with diff format
Present for user approval

source ~/.secrets && python3 {{HOME_TOOL_DIR}}/skills/claude-langfuse/utils/reflect.py $ARGUMENTS
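
The "propose updates with diff format" step above could be sketched with the standard library's `difflib`. This is an illustration only: the helper name `propose_update` and the choice to append each learning as a bullet at the end of the file are assumptions, not the behavior of the real reflect.py.

```python
import difflib


def propose_update(path, current_text, learning):
    """Return a unified diff that appends one learning to a file's text."""
    old_lines = current_text.splitlines(keepends=True)
    # Normalize a missing trailing newline so the appended line diffs cleanly.
    if old_lines and not old_lines[-1].endswith("\n"):
        old_lines[-1] += "\n"
    new_lines = old_lines + [f"- {learning}\n"]
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile=f"a/{path}", tofile=f"b/{path}"
    )
    return "".join(diff)


if __name__ == "__main__":
    print(propose_update("CLAUDE.md", "# Rules\n", "Always verify paths with ls first"))
```

Producing a diff rather than writing the file directly keeps the change reviewable, which fits the final "present for user approval" step.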

For `/claude-langfuse:insights <trace_id>`:

source ~/.secrets && python3 {{HOME_TOOL_DIR}}/skills/claude-langfuse/utils/insights.py $ARGUMENTS

Signal Detection Patterns

High Confidence (Explicit Corrections)

"never do X", "don't ever Y"
"always check Z", "must verify"
"stop doing X", "wrong approach"
Repeated corrections for same issue

Medium Confidence (Success Patterns)

"perfect", "exactly what I wanted"
"good approach", "keep doing this"
Approved solutions that can be templated

Low Confidence (Observations)

Preferences mentioned in passing
One-time edge cases
Context-specific decisions
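
A simple keyword-based version of the tiers above might look like this. It is a minimal sketch: the phrase lists are taken from the examples in this section, the function name `classify_signal` is hypothetical, and treating every unmatched prompt as low confidence is a simplification (repeated corrections and templatable solutions cannot be detected from a single prompt).

```python
import re

# Illustrative phrase lists drawn from the examples above; a real
# detector would need a broader, tuned vocabulary.
HIGH = re.compile(
    r"\b(never|always|don't|must|stop doing|wrong approach)\b", re.IGNORECASE
)
MEDIUM = re.compile(
    r"\b(perfect|exactly what i wanted|good approach|keep doing this)\b",
    re.IGNORECASE,
)


def classify_signal(prompt):
    """Return 'high', 'medium', or 'low' for a single user prompt."""
    if HIGH.search(prompt):
        return "high"
    if MEDIUM.search(prompt):
        return "medium"
    # Everything else is left for later manual review.
    return "low"


if __name__ == "__main__":
    for text in ("Never guess file paths", "Perfect, thanks", "Use tabs here"):
        print(f"{classify_signal(text):6} {text}")
```

Checking the high-confidence pattern first means a prompt that mixes praise with a correction is still treated as a correction.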

Learning Categories

| Category | Examples | Target Files |
|----------|----------|--------------|
| Code Style | Formatting, naming conventions | agents/code-reviewer.md |
| Architecture | Design patterns, boundaries | agents/solution-architect.md |
| Process | Workflow, review practices | CLAUDE.md |
| Tools | Preferred utilities, commands | agents/superstar-engineer.md |
| Domain | Project-specific knowledge | Project CLAUDE.md |
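
The category-to-file routing in the table above can be expressed as a lookup. This is only a sketch: the names `TARGET_FILES` and `target_for` are hypothetical, and the values are copied verbatim from the table ("Project CLAUDE.md" refers to the project's own CLAUDE.md rather than a literal path).

```python
# Values copied from the Learning Categories table; keys are lowercased
# so that lookups are case-insensitive.
TARGET_FILES = {
    "code style": "agents/code-reviewer.md",
    "architecture": "agents/solution-architect.md",
    "process": "CLAUDE.md",
    "tools": "agents/superstar-engineer.md",
    "domain": "Project CLAUDE.md",
}


def target_for(category):
    """Return the target file for a learning category, or None if unknown."""
    return TARGET_FILES.get(category.strip().lower())


if __name__ == "__main__":
    print(target_for("Code Style"))
```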

Output Format

Reflect Output

═══════════════════════════════════════════════════════════════
  LANGFUSE REFLECT - Session Analysis
═══════════════════════════════════════════════════════════════

Sessions Analyzed: 5
Time Range: 2024-01-08 to 2024-01-10

┌─────────────────────────────────────────────────────────────┐
│ HIGH CONFIDENCE SIGNALS (3 found)                           │
├─────────────────────────────────────────────────────────────┤
│ [1] "Never guess file paths - always verify with ls first"  │
│     Session: abc123... @ 2024-01-09                         │
│     Target: agents/superstar-engineer.md                    │
│     Proposed: Add to working rules section                  │
├─────────────────────────────────────────────────────────────┤
│ [2] "Always use ast-grep for code searches"                 │
│     Session: def456... @ 2024-01-10                         │
│     Target: CLAUDE.md                                       │
│     Proposed: Already exists - reinforce                    │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ MEDIUM CONFIDENCE SIGNALS (2 found)                         │
├─────────────────────────────────────────────────────────────┤
│ [1] User approved parallel agent pattern                    │
│     Session: ghi789... @ 2024-01-10                         │
│     Pattern: Launch 3+ agents for independent tasks         │
└─────────────────────────────────────────────────────────────┘

Apply these learnings? [Y/n/modify]:
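
Handling the reply to the `[Y/n/modify]` prompt above could be sketched as follows. The function name `parse_approval` and the returned action names are hypothetical; treating an empty reply as "yes" follows the common convention that the capitalized option is the default, which is an assumption about this tool.

```python
def parse_approval(answer):
    """Map a reply to the approval prompt onto 'apply', 'skip', or 'modify'."""
    reply = answer.strip().lower()
    if reply in ("", "y", "yes"):
        return "apply"  # empty input falls back to the capitalized default
    if reply in ("n", "no"):
        return "skip"
    if reply in ("m", "modify"):
        return "modify"
    raise ValueError(f"unrecognized reply: {answer!r}")


if __name__ == "__main__":
    print(parse_approval(input("Apply these learnings? [Y/n/modify]: ")))
```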

Integration with Hooks

The Langfuse hooks (session_start, pre_tool_use, post_tool_use, stop) automatically capture:

Session metadata (project, branch, user)
All tool invocations with inputs/outputs
User prompts
Timing information

This skill reads that data to power reflection and insights.

Configuration

Requires Langfuse credentials in ~/.secrets:

export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_HOST="https://cloud.langfuse.com"  # optional
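
A utility script might read these variables along the following lines. This is a sketch, not the skill's actual code: the helper name `load_langfuse_config` is hypothetical, and falling back to `https://cloud.langfuse.com` when `LANGFUSE_HOST` is unset is an assumption based on the "optional" comment above.

```python
import os

DEFAULT_HOST = "https://cloud.langfuse.com"  # assumed fallback


def load_langfuse_config(env=None):
    """Read Langfuse credentials from the environment, failing fast if absent."""
    env = os.environ if env is None else env
    required = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("Missing Langfuse credentials: " + ", ".join(missing))
    return {
        "public_key": env["LANGFUSE_PUBLIC_KEY"],
        "secret_key": env["LANGFUSE_SECRET_KEY"],
        "host": env.get("LANGFUSE_HOST") or DEFAULT_HOST,
    }


if __name__ == "__main__":
    try:
        print(load_langfuse_config()["host"])
    except RuntimeError as exc:
        print(exc)
```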


Claude Code observability skill: analyze session traces stored in Langfuse, extract learnings from corrections, identify success patterns, and propose agent/skill improvements based on historical data. Powers self-improvement through trace analysis of Claude Code sessions.

Stars: 10
Category: development
Updated: Apr 22, 2026


npx skillsauth add stevengonsalvez/agents-in-a-box claude-langfuse

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Reported % |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 22, 2026, 12:29 PM (128.2s, 6 files scanned)

SKILL.md

name: claude-langfuse
description: >
  Claude Code observability skill: analyze session traces stored in Langfuse,
version: 1.0.0


Related Skills

stevengonsalvez/reflect:cost

Category: documentation

Report reflect drain spend over a time window: tokens split by cached (cache_read), uncached writes (cache_creation), and io (input+output), with a $ estimate, grouped by day / outcome / model / transcript. Reads the drainer's cost log and surfaces outlier runs and cache-reuse health (the 41.5M-token failure mode = low cache reuse + high cache writes). Use to answer "what is reflection costing me" for the last day / week.

12 · SKILL.md · Updated Jun 2, 2026

stevengonsalvez/ainb-fleet:standup

Category: development

Show fleet status: every claude session running on the host, merged across ainb + claude-peers broker + background jobs. Use when you need to enumerate sessions before composing an action, see which sessions have a peer registered (broker-routable) vs tmux-only, check the `summary` of each session, or pipe the list into jq for filtering. Default output: text table. Pass --format json for LLM consumption.

10 · SKILL.md · Updated May 31, 2026

stevengonsalvez/ainb-fleet:sequence

Category: testing

Ordered multi-step prompts to fleet targets, ack-gated between steps via JSONL assistant-turn-end detection. Use for cycles like disconnect→reconnect→verify, or any flow where step N+1 requires step N to have completed first. The skill BLOCKS until each target's transcript shows the next assistant turn finishing OR the per-step timeout fires (default 300s).

10 · SKILL.md · Updated May 31, 2026

stevengonsalvez/ainb-fleet:needs

Category: development

Center control panel: enumerate every claude session that is blocked waiting on something, such as a user answer (AskUserQuestion fired), an API error retry, an idle assistant turn-end with no follow-up, or an explicit WAITING: marker. Returns rich JSON with signal kind + context per session. Use this when you've stepped away from the fleet and want one place to see everything that wants your attention and answer it.

10 · SKILL.md · Updated May 31, 2026


Install manually


# Clone the repo
git clone https://github.com/stevengonsalvez/agents-in-a-box.git

# Copy into Claude Code skills folder (global)
cp -r agents-in-a-box/toolkit/packages/skills/claude-langfuse ~/.claude/skills/


Repository

stevengonsalvez/agents-in-a-box

10 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
