skills/q-scholar/q-eda/SKILL.md
Run exploratory data analysis on tabular datasets with measurement-appropriate statistics. Use for EDA, descriptive statistics, data exploration, or preparing data summaries for reports and manuscripts.
npx skillsauth add TyrealQ/q-skills q-eda
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Universal exploratory data analysis for tabular datasets. Interviews the user to confirm column measurement levels, runs statistically appropriate analysis per variable type, and produces structured CSVs with a narrative summary.
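As an illustration only of what "statistically appropriate analysis per variable type" can mean, the sketch below maps a measurement level to a set of descriptive statistics. The function name `summarize` and the specific statistics chosen per level are assumptions made for this example; they are not taken from run_eda.py, which remains the script the skill actually executes.

```python
from collections import Counter
import statistics


def summarize(values, level):
    """Return descriptives suited to a measurement level.

    Hypothetical helper: the level-to-statistic mapping is an assumption,
    not the behavior of run_eda.py.
    """
    present = [v for v in values if v is not None]  # ignore missing entries
    if not present:
        raise ValueError("no non-missing values to summarize")
    if level in ("nominal", "binary"):
        # Unordered categories: counts and the mode are meaningful, means are not.
        counts = Counter(present)
        return {
            "n": len(present),
            "frequencies": dict(counts),
            "mode": counts.most_common(1)[0][0],
        }
    if level == "ordinal":
        # Ordered codes: report rank-based statistics only.
        return {
            "n": len(present),
            "median": statistics.median(present),
            "min": min(present),
            "max": max(present),
        }
    if level in ("discrete", "continuous"):
        # Numeric scales: mean and sample standard deviation are interpretable.
        return {
            "n": len(present),
            "mean": statistics.fmean(present),
            "sd": statistics.stdev(present),
            "median": statistics.median(present),
        }
    raise ValueError(f"unknown measurement level: {level}")
```

The point of the sketch is that the same column of numbers gets a different summary depending on the confirmed level, which is why the interview step precedes any computation.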
IMPORTANT: This skill requires Bash execution. Use the pre-built scripts/run_eda.py from ${SKILL_DIR}/scripts/; do NOT write a new script or inline Python.

If in plan mode: write a brief plan ("Run q-eda skill: interview user for context and column types, execute run_eda.py, write EXPLORATORY_SUMMARY.md from generated CSVs."), then exit plan mode immediately. Do NOT attempt interview stages, script execution, or any analysis while plan mode is active.
Agent execution instructions:
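1. Determine the directory containing this SKILL.md as SKILL_DIR.
2. The script path is ${SKILL_DIR}/scripts/run_eda.py.
3. Reference paths are ${SKILL_DIR}/references/<ref-name>.

pandas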
numpy
scipy
openpyxl # required for .xlsx input and Phase 6 Excel report
Install: pip install pandas numpy scipy openpyxl
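Before running the pipeline it can help to confirm that the four packages above are importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and is not part of the skill.

```python
import importlib.util

# Package names taken from the dependency list above.
REQUIRED = ("pandas", "numpy", "scipy", "openpyxl")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_packages()` returns an empty list when everything is installed; any names it returns can be passed to the pip command above.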
run_eda.py

| Step | Action | Reference |
|------|--------|-----------|
| 1 | Interview: context questions, then column classification with user confirmation | references/interview_protocol.md |
| 2 | Execute: run run_eda.py with confirmed types (see Pipeline below) | references/invocation_guide.md |
| 3 | Summarize: write tables-eda/EXPLORATORY_SUMMARY.md from generated CSVs | references/summary_template.md, references/summary_instructions.md |

| Phase | Output | Content |
|-------|--------|---------|
| 0 | (console) | Data loading, column classification, schema summary |
| 1 | 01_dataset_profile.csv | Shape, column types, missing%, uniqueness |
| 2 | 02_data_quality.csv | Missing counts/%, duplicates, constant columns, outliers (IQR) |
| 3 | 03-08_*.csv | Univariate: nominal frequencies, binary summary, ordinal/discrete/continuous descriptives |
| 4 | 09-12_*.csv | Bivariate: Pearson/Spearman correlations, grouped descriptives, cross-tabs |
| 5 | 13-14_*.csv | Specialized: text analysis, temporal trends |
| 6 | EXPLORATORY_REPORT.xlsx | APA-7th formatted workbook (B&W, one sheet per CSV) |
Files are omitted when no columns of that type exist. Output directory: tables-eda/.
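Phase 2 in the table above lists IQR-based outlier detection. For readers unfamiliar with the rule, here is a minimal standard-library sketch of Tukey's fences. The multiplier `k=1.5` and the quartile interpolation method are assumptions for this example; run_eda.py may use different settings, and the function name `iqr_outliers` is hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using Tukey's IQR rule.

    Assumes k=1.5 and inclusive quartiles; not necessarily what run_eda.py does.
    """
    # Quartiles with the "inclusive" method treat the data as the full population.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # A value is flagged when it falls strictly outside either fence.
    return lower, upper, [v for v in values if v < lower or v > upper]
```

For example, `iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100])` has quartiles 3 and 7, so the fences are -3 and 13 and only 100 is flagged.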
Include: Any .xlsx/.csv dataset — academic, business, or general. Outputs feed directly into q-methods and q-results.
Exclude: Confirmatory statistics, visualization, hypothesis testing, data cleaning beyond script internals.
--col_types; grouping columns via --group
EXPLORATORY_SUMMARY.md follows references/summary_template.md structure