.claude/skills/skills/data-describe/SKILL.md
Generate AI-powered Data Dictionary, Description, and Tags for a CSV/TSV/Excel file
Install this skill globally with one command: `npx skillsauth add dathere/qsv data-describe`. Works with Claude Code, Cursor, and Windsurf.
Generate AI-powered documentation for a tabular data file using describegpt. Produces a Data Dictionary (column labels, descriptions, types), a natural-language Description of the dataset, and semantic Tags — all via the connected LLM (no API key needed in MCP mode).
Cowork note: If relative paths don't resolve, call `mcp__qsv__qsv_get_working_dir` and `mcp__qsv__qsv_set_working_dir` to sync the working directory.
1. Index: Run `mcp__qsv__qsv_index` on the file for fast random access.
2. Profile: Run `mcp__qsv__qsv_stats` with `cardinality: true, stats_jsonl: true` to generate the stats cache. `describegpt` reads this cache for column metadata, so it must exist first.
3. Describe: Run `mcp__qsv__qsv_describegpt` with the requested options (`all: true` is recommended for comprehensive output). At least one inference option (`dictionary`, `description`, `tags`, or `all`) is required. Output defaults to `<filestem>.describegpt.md`.
4. Present: Display the generated Data Dictionary table, Description, and Tags to the user.
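The call order in the steps above can be sketched as plain data. This is a minimal illustration, not the skill's actual implementation: the tool names and the `cardinality`, `stats_jsonl`, and `all` options come from the steps, while the `plan_describe_calls` helper, the `input_file` argument name, and the dictionary shape are hypothetical and may not match the real MCP tool schema.

```python
from typing import Any


def plan_describe_calls(input_file: str, **describe_options: Any) -> list[dict[str, Any]]:
    """Return the index, stats, and describegpt calls in the order they must run.

    `input_file` is a hypothetical argument name used only for illustration.
    """
    # Fall back to the recommended all-in-one pass when no option is given.
    options = describe_options or {"all": True}
    return [
        # Step 1: build the index for fast random access.
        {"tool": "mcp__qsv__qsv_index", "arguments": {"input_file": input_file}},
        # Step 2: create the stats cache that describegpt reads.
        {
            "tool": "mcp__qsv__qsv_stats",
            "arguments": {"input_file": input_file, "cardinality": True, "stats_jsonl": True},
        },
        # Step 3: run the inference with the requested options.
        {"tool": "mcp__qsv__qsv_describegpt", "arguments": {"input_file": input_file, **options}},
    ]


if __name__ == "__main__":
    for call in plan_describe_calls("data.csv"):
        print(call["tool"], call["arguments"])
```

The final step, presenting the results, is not a tool call, so it is left out of the plan. Keeping the stats call ahead of the describegpt call reflects the dependency on the stats cache.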
| Option | Effect |
|--------|--------|
| --all (recommended) | Generate Dictionary + Description + Tags in one pass |
| --dictionary | Data Dictionary only — column labels, descriptions, types |
| --description | Natural-language dataset Description only |
| --tags | Semantic Tags only |
| --format | Output format: Markdown (default), JSON, TSV, TOON |
| --language | Generate output in a non-English language (e.g. Spanish, French) |
| --addl-cols-list | Enrich the dictionary with extra columns (e.g. "everything", "moar!") |
| --tag-vocab | Constrain tags to a controlled vocabulary (comma-separated) |
| --num-tags | Number of tags to generate (default: 5) |
| --num-examples | Number of example values per column in the dictionary |
| --enum-threshold | Max cardinality to treat a column as an enum in the dictionary |
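The rule that at least one inference option is required can be checked before any call is made. The sketch below assembles a flag list from a few of the options in the table; the `build_flags` helper and its parameter names are hypothetical, and only the flag spellings are taken from the table.

```python
from typing import Iterable, Optional

# The four inference options listed in the table.
INFERENCE_OPTIONS = ("all", "dictionary", "description", "tags")


def build_flags(
    inference: Iterable[str],
    *,
    output_format: Optional[str] = None,
    language: Optional[str] = None,
    num_tags: Optional[int] = None,
) -> list[str]:
    """Build a describegpt flag list, rejecting requests with no inference option."""
    chosen = list(inference)
    if not chosen:
        raise ValueError("at least one inference option is required")
    unknown = [name for name in chosen if name not in INFERENCE_OPTIONS]
    if unknown:
        raise ValueError(f"unknown inference option(s): {unknown}")

    flags = [f"--{name}" for name in chosen]
    # Optional modifiers are appended only when the caller sets them.
    if output_format is not None:
        flags += ["--format", output_format]
    if language is not None:
        flags += ["--language", language]
    if num_tags is not None:
        flags += ["--num-tags", str(num_tags)]
    return flags
```

For example, `build_flags(["all"], output_format="JSON")` yields `["--all", "--format", "JSON"]`, while `build_flags([])` raises `ValueError`.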
- Output defaults to `<filestem>.describegpt.md`.
- Use `--format JSON` when you need machine-readable output for downstream processing.
- Use `--language` to generate documentation in the user's preferred language.