skills/integrations/langfuse/langfuse-datasets/SKILL.md
Manage Langfuse datasets, items, runs, and run-items. Load when user says 'list datasets', 'create dataset', 'get dataset', 'dataset items', 'dataset runs', 'add test case', 'evaluation results', 'delete dataset item', 'delete dataset run'.
npx skillsauth add beam-ai-team/beam-next-skills langfuse-datasets
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Unified skill for Langfuse dataset management — datasets, items, runs, and run-items.
Replaces: langfuse-list-datasets, langfuse-create-dataset, langfuse-get-dataset, langfuse-list-dataset-items, langfuse-create-dataset-item, langfuse-get-dataset-item, langfuse-delete-dataset-item, langfuse-list-dataset-runs, langfuse-get-dataset-run, langfuse-delete-dataset-run, langfuse-list-dataset-run-items, langfuse-create-dataset-run-item.
All operations go through a single script with --resource and --action:
```bash
uv run python scripts/datasets.py --resource <resource> --action <action> [options]
```

```bash
# List datasets
uv run python scripts/datasets.py --resource datasets --action list --limit 50

# List all datasets (paginated)
uv run python scripts/datasets.py --resource datasets --action list --all

# Create a dataset
uv run python scripts/datasets.py --resource datasets --action create --name "my-eval-set" --description "Regression tests"

# Get a dataset by name
uv run python scripts/datasets.py --resource datasets --action get --name "my-eval-set"

# List items (optionally filter by dataset)
uv run python scripts/datasets.py --resource items --action list --dataset "my-eval-set"

# Create an item (test case)
uv run python scripts/datasets.py --resource items --action create \
  --dataset "my-eval-set" \
  --input '{"query": "What is LangChain?"}' \
  --expected '{"answer": "A framework for LLM apps"}'

# Get item by ID
uv run python scripts/datasets.py --resource items --action get --id "item-uuid"

# Delete item
uv run python scripts/datasets.py --resource items --action delete --id "item-uuid"

# List runs for a dataset
uv run python scripts/datasets.py --resource runs --action list --dataset "my-eval-set"

# Get a specific run
uv run python scripts/datasets.py --resource runs --action get --dataset "my-eval-set" --run "run-name"

# Delete a run
uv run python scripts/datasets.py --resource runs --action delete --dataset "my-eval-set" --run "run-name"

# List run items (filter by dataset and/or run)
uv run python scripts/datasets.py --resource run-items --action list --dataset "my-eval-set" --run "run-name"

# Create a run item (log an evaluation result)
uv run python scripts/datasets.py --resource run-items --action create \
  --run "run-name" --dataset-item "item-uuid" --trace "trace-uuid"
```
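For orientation, here is a rough sketch of how the `create` flags above might be turned into JSON request bodies. It is not taken from `scripts/datasets.py`: the helper names are hypothetical, and the body field names (`datasetName`, `input`, `expectedOutput`, `runName`, `datasetItemId`, `traceId`) are assumed from the Langfuse public API and should be checked against its reference.

```python
import json
from typing import Optional


def build_item_payload(dataset: str, input_json: str,
                       expected_json: Optional[str] = None) -> dict:
    """Hypothetical helper: map --dataset/--input/--expected to a request body."""
    # --input and --expected arrive as JSON strings on the command line.
    payload = {"datasetName": dataset, "input": json.loads(input_json)}
    if expected_json is not None:
        payload["expectedOutput"] = json.loads(expected_json)
    return payload


def build_run_item_payload(run: str, dataset_item: str, trace: str) -> dict:
    """Hypothetical helper: map --run/--dataset-item/--trace to a request body."""
    return {"runName": run, "datasetItemId": dataset_item, "traceId": trace}


item = build_item_payload(
    "my-eval-set",
    '{"query": "What is LangChain?"}',
    '{"answer": "A framework for LLM apps"}',
)
run_item = build_run_item_payload("run-name", "item-uuid", "trace-uuid")
print(json.dumps(item, indent=2))
print(json.dumps(run_item, indent=2))
```

Parsing the JSON strings up front means a malformed `--input` fails locally with a `json.JSONDecodeError` before any request is sent.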
| Option | Description |
|--------|-------------|
| --limit | Max results per page (default 50, max 100) |
| --page | Page number (default 1) |
| --all | Fetch all pages (for list actions) |
| --max-pages | Stop after N pages with --all |
| --output, -o | Write JSON to file instead of stdout |
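The interaction between `--all`, `--limit`, and `--max-pages` can be pictured with a small pagination loop. This is only a sketch, not the script's actual code: `fetch_all` and `fake_fetch_page` are hypothetical names, and the response shape (`data` plus `meta.totalPages`) is an assumption about how the list endpoints paginate.

```python
from typing import Callable, Iterator, Optional


def fetch_all(fetch_page: Callable[[int, int], dict], limit: int = 50,
              max_pages: Optional[int] = None) -> Iterator[dict]:
    """Yield items page by page until the last page or the max_pages cap."""
    page = 1
    while True:
        body = fetch_page(page, limit)
        yield from body.get("data", [])
        # Assumed response shape: {"data": [...], "meta": {"totalPages": N}}.
        total_pages = body.get("meta", {}).get("totalPages", page)
        if page >= total_pages or (max_pages is not None and page >= max_pages):
            break
        page += 1


def fake_fetch_page(page: int, limit: int) -> dict:
    """Stand-in for an HTTP call, serving five in-memory items."""
    items = [{"id": f"item-{i}"} for i in range(5)]
    start = (page - 1) * limit
    total_pages = -(-len(items) // limit)  # ceiling division
    return {
        "data": items[start:start + limit],
        "meta": {"page": page, "limit": limit, "totalPages": total_pages},
    }


all_items = list(fetch_all(fake_fetch_page, limit=2))
capped = list(fetch_all(fake_fetch_page, limit=2, max_pages=1))
print(len(all_items), len(capped))
```

Passing the page fetcher in as a callable keeps the loop testable without network access; a real fetcher would issue the GET request with `page` and `limit` query parameters.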
| Resource | Action | Method | Endpoint |
|----------|--------|--------|----------|
| datasets | list | GET | /api/public/v2/datasets |
| datasets | create | POST | /api/public/v2/datasets |
| datasets | get | GET | /api/public/v2/datasets/{name} |
| items | list | GET | /api/public/dataset-items |
| items | create | POST | /api/public/dataset-items |
| items | get | GET | /api/public/dataset-items/{id} |
| items | delete | DELETE | /api/public/dataset-items/{id} |
| runs | list | GET | /api/public/datasets/{name}/runs |
| runs | get | GET | /api/public/datasets/{name}/runs/{run} |
| runs | delete | DELETE | /api/public/datasets/{name}/runs/{run} |
| run-items | list | GET | /api/public/dataset-run-items |
| run-items | create | POST | /api/public/dataset-run-items |
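One way to read the table above is as a lookup from `(resource, action)` to an HTTP method and path. The sketch below encodes exactly those rows; the `ROUTES` and `resolve` names are hypothetical, and the real script may organize its dispatch differently.

```python
from typing import Tuple
from urllib.parse import quote

# (resource, action) -> (HTTP method, path template), copied from the table.
ROUTES = {
    ("datasets", "list"): ("GET", "/api/public/v2/datasets"),
    ("datasets", "create"): ("POST", "/api/public/v2/datasets"),
    ("datasets", "get"): ("GET", "/api/public/v2/datasets/{name}"),
    ("items", "list"): ("GET", "/api/public/dataset-items"),
    ("items", "create"): ("POST", "/api/public/dataset-items"),
    ("items", "get"): ("GET", "/api/public/dataset-items/{id}"),
    ("items", "delete"): ("DELETE", "/api/public/dataset-items/{id}"),
    ("runs", "list"): ("GET", "/api/public/datasets/{name}/runs"),
    ("runs", "get"): ("GET", "/api/public/datasets/{name}/runs/{run}"),
    ("runs", "delete"): ("DELETE", "/api/public/datasets/{name}/runs/{run}"),
    ("run-items", "list"): ("GET", "/api/public/dataset-run-items"),
    ("run-items", "create"): ("POST", "/api/public/dataset-run-items"),
}


def resolve(resource: str, action: str, **params: str) -> Tuple[str, str]:
    """Return (method, path) for a resource/action pair, URL-encoding path values."""
    method, template = ROUTES[(resource, action)]
    # Encode each value so names containing spaces or slashes stay in one path segment.
    encoded = {key: quote(value, safe="") for key, value in params.items()}
    return method, template.format(**encoded)


print(resolve("datasets", "get", name="my-eval-set"))
print(resolve("runs", "delete", name="my-eval-set", run="run name"))
```

An unsupported pair such as `("datasets", "delete")` raises `KeyError`, which mirrors the table: there is no row for deleting a dataset.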