exploring-data/SKILL.md
Exploratory data analysis using ydata-profiling. Use when users upload .csv/.xlsx/.json/.parquet files or request "explore data", "analyze dataset", "EDA", "profile data". Generates interactive HTML or JSON reports with statistics, visualizations, correlations, and quality alerts.
npx skillsauth add oaustegard/claude-skills exploring-data
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
bash /mnt/skills/user/exploring-data/scripts/check_install.sh
Returns: installed or not_installed
if [ "$(bash /mnt/skills/user/exploring-data/scripts/check_install.sh)" = "not_installed" ]; then
  bash /mnt/skills/user/exploring-data/scripts/install_ydata.sh
fi
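The check-then-install step above boils down to asking whether the profiling package can be found. A rough Python equivalent is sketched here; it is not the contents of check_install.sh. It assumes the package is importable as `ydata_profiling`, and the helper name `check_install` is hypothetical.

```python
import importlib.util


def check_install(module_name: str = "ydata_profiling") -> str:
    """Return "installed" if the module can be located, else "not_installed".

    Assumption: ydata-profiling is importable under the name `ydata_profiling`.
    """
    # find_spec locates a top-level module without importing it,
    # so the check stays cheap even for heavy packages.
    found = importlib.util.find_spec(module_name) is not None
    return "installed" if found else "not_installed"


if __name__ == "__main__":
    print(check_install())
```

The return strings mirror the two values the shell script is documented to print, so either check can drive the same install decision.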
bash /mnt/skills/user/exploring-data/scripts/analyze.sh <filepath> [minimal|full] [html|json]
Defaults: minimal + html (also generates JSON)
Output:
eda_report.html - Interactive report for user
eda_report.json - Machine-readable for Claude analysis
python /mnt/skills/user/exploring-data/scripts/summarize_insights.py /mnt/user-data/outputs/eda_report.json
Reads: eda_report.json (comprehensive ydata output)
Writes: eda_insights_summary.md (condensed for Claude)
Outputs to stdout: Formatted markdown summary
Claude should read the stdout markdown summary, NOT the full JSON report.
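To make the condensing step concrete, here is a minimal sketch of how a large report could be reduced to a short markdown summary. It is not the actual summarize_insights.py. The JSON keys it reads (`table`, `n`, `n_var`, `n_cells_missing`, `alerts`) are assumptions about the report layout, and the function names and `max_alerts` parameter are hypothetical.

```python
import json


def summarize(report: dict, max_alerts: int = 10) -> str:
    """Condense a profiling report dict into a short markdown summary."""
    # Assumed layout: dataset-level counts live under "table".
    table = report.get("table", {})
    lines = ["# EDA Insights Summary", ""]
    lines.append(f"- Rows: {table.get('n', 'unknown')}")
    lines.append(f"- Columns: {table.get('n_var', 'unknown')}")
    lines.append(f"- Missing cells: {table.get('n_cells_missing', 'unknown')}")

    # Assumed layout: "alerts" is a list of human-readable strings.
    alerts = report.get("alerts", [])
    if alerts:
        lines += ["", "## Alerts"]
        lines += [f"- {alert}" for alert in alerts[:max_alerts]]
        if len(alerts) > max_alerts:
            lines.append(f"- ({len(alerts) - max_alerts} more alerts omitted)")
    return "\n".join(lines)


def main(json_path: str, out_path: str = "eda_insights_summary.md") -> None:
    """Read the JSON report, write the summary file, and echo it to stdout."""
    with open(json_path, encoding="utf-8") as fh:
        report = json.load(fh)
    summary = summarize(report)
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(summary + "\n")
    print(summary)
```

Capping the alert list keeps the summary small regardless of how many columns the dataset has, which is the point of reading the summary instead of the full JSON.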
# Standard workflow (user views HTML)
bash analyze.sh /mnt/user-data/uploads/data.csv
# Produces: eda_report.html + eda_report.json
# Link user to: computer:///mnt/user-data/outputs/eda_report.html
# User asks Claude to analyze
bash analyze.sh /mnt/user-data/uploads/data.csv
python summarize_insights.py /mnt/user-data/outputs/eda_report.json
# Claude reads the stdout markdown summary
# Claude can then provide analysis based on patterns/insights
# Full mode for comprehensive analysis
bash analyze.sh /mnt/user-data/uploads/data.csv full
# JSON-only output (skip HTML generation)
bash analyze.sh /mnt/user-data/uploads/data.csv minimal json
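The commands above rely on two optional positional arguments with defaults. The sketch below shows one way that resolution could work, plus a profiling step built on `ProfileReport` from ydata-profiling. It is an illustration, not the contents of analyze.sh: `resolve_args` and `run_profile` are hypothetical names, and mapping full mode to `minimal=False` is an assumption.

```python
from pathlib import Path

MODES = ("minimal", "full")
FORMATS = ("html", "json")


def resolve_args(args: list[str]) -> dict:
    """Resolve `<filepath> [minimal|full] [html|json]` using the documented defaults."""
    if not args:
        raise ValueError("usage: <filepath> [minimal|full] [html|json]")
    filepath = Path(args[0])
    mode = args[1] if len(args) > 1 else "minimal"
    fmt = args[2] if len(args) > 2 else "html"
    if mode not in MODES or fmt not in FORMATS:
        raise ValueError(f"unsupported mode/format: {mode} {fmt}")
    # html also produces the JSON report; json skips HTML generation.
    if fmt == "json":
        outputs = ["eda_report.json"]
    else:
        outputs = ["eda_report.html", "eda_report.json"]
    return {"filepath": filepath, "mode": mode, "format": fmt, "outputs": outputs}


def run_profile(filepath: Path, mode: str, outputs: list[str], out_dir: Path) -> None:
    """Hypothetical profiling step; needs pandas and ydata-profiling installed."""
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Pick a pandas reader from the file extension.
    readers = {
        ".csv": pd.read_csv,
        ".xlsx": pd.read_excel,
        ".json": pd.read_json,
        ".parquet": pd.read_parquet,
    }
    df = readers[filepath.suffix.lower()](filepath)
    # Assumption: full mode simply turns off the library's minimal preset.
    report = ProfileReport(df, minimal=(mode == "minimal"))
    for name in outputs:
        report.to_file(out_dir / name)  # output format follows the file suffix
```

For example, `resolve_args(["/mnt/user-data/uploads/data.csv", "minimal", "json"])` yields a single output, `eda_report.json`, matching the JSON-only command.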
Minimal (default, 5-10s): Dataset overview, variable analysis, correlations, missing values, alerts
Full (10-20s): Everything in minimal + scatter matrices, sample data, character analysis, more visualizations
Use full when the user asks for "comprehensive analysis", "detailed EDA", "full profiling", or "deep analysis".
Otherwise use minimal.
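The mode selection rule can be expressed as a simple phrase match. This is a sketch under the assumption that a case-insensitive substring check is good enough; the name `choose_mode` is hypothetical and nothing here is taken from the skill's scripts.

```python
FULL_MODE_PHRASES = (
    "comprehensive analysis",
    "detailed eda",
    "full profiling",
    "deep analysis",
)


def choose_mode(request: str) -> str:
    """Return "full" if the request contains a listed trigger phrase, else "minimal"."""
    text = request.lower()
    if any(phrase in text for phrase in FULL_MODE_PHRASES):
        return "full"
    return "minimal"
```

A request such as "Please run a detailed EDA on this file" selects full, while a plain "explore data" falls through to minimal.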