skills/notebook-standardizer/SKILL.md
Standardize Jupyter notebooks (.ipynb) for interactive data analysis workflows. Enforces a mandatory cell manifest (M1-M8 + archetype chapters) with tags ([CONFIG]/[SETUP]/[FUNC]/[RUN]/[VIZ]/[EXPORT]), structured markdown sections, and output prefixes ([OK]/[WARN]/[SKIP]). Use when the user wants to standardize, clean up, or create a notebook from scratch. Two archetypes: problem-driven (question-answer analysis) and monitoring (dimension-based periodic reporting).
`npx skillsauth add OliverOuyang/shuhe-work-skills notebook-standardizer`
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Transform notebooks into standardized, self-documenting analysis tools following a mandatory cell manifest.
| Cell | Tag | Name | Content Summary | Skip Condition |
|------|-----|------|-----------------|----------------|
| M1 | markdown | Title Card | # {Title} + summary table: type, source, output paths, SQL files, ARCHETYPE | Never |
| M2 | [SETUP] | Environment | Imports, paths, clients, dp.ping() health check | Never |
| M3 | [CONFIG] | Parameters | All adjustable params with type annotations. ARCHETYPE = "problem-driven" \| "monitoring", DATA_MODE = "sql" \| "csv" | Never |
| M4 | [RUN] | Field Validation | TABLES_TO_VALIDATE dict + meta loop; print [OK]/[WARN] per table | DATA_MODE="csv" |
| M5 | [RUN] | SQL Transparency | Print full parameterized SQL before execution | No SQL files used |
| M6 | [RUN] | Data Execution | pipe.run() or parallel execution + auto CSV save to data/ | Never |
| M7 | [RUN] | Data Quality Gate | Row count, null ratio, value range, cross-dataset checks; halt on critical issues | Never |
| M8 | [RUN]+[VIZ] | EDA | Per-DataFrame: shape, dtypes, describe, value_counts, null check | Never |
See templates/ for code patterns for each cell (field_validation.py, sql_transparency.py, etc.).
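
As an illustration of the M7 row in the table above, here is a minimal sketch of a quality gate that prints one `[OK]`/`[WARN]` line per check and halts on critical issues. The function name `quality_gate`, its thresholds, and the sample rows are assumptions made for this example; the actual pattern lives in templates/quality_gate.py, which is not reproduced here. Plain lists of dicts stand in for DataFrames so the sketch needs only the standard library.

```python
def quality_gate(rows, required_fields, min_rows=1, max_null_ratio=0.2):
    """Run row-count and null-ratio checks; return the list of critical issues."""
    critical = []

    # Row-count check: an empty or undersized result is treated as critical.
    if not rows or len(rows) < min_rows:
        print(f"[WARN] row count {len(rows)} < {min_rows}")
        critical.append("row_count")
        return critical
    print(f"[OK] row count = {len(rows)}")

    # Null-ratio check for each required field.
    for field in required_fields:
        nulls = sum(1 for row in rows if row.get(field) is None)
        ratio = nulls / len(rows)
        if ratio > max_null_ratio:
            print(f"[WARN] {field}: null ratio {ratio:.0%} > {max_null_ratio:.0%}")
            critical.append(f"null_ratio:{field}")
        else:
            print(f"[OK] {field}: null ratio {ratio:.0%}")
    return critical


# Illustrative data only (hypothetical field names).
sample_rows = [
    {"month": "2024-01", "amount": 120.0},
    {"month": "2024-02", "amount": None},
    {"month": "2024-03", "amount": 95.5},
]
issues = quality_gate(sample_rows, ["month", "amount"], max_null_ratio=0.5)
if issues:
    # Halt the notebook so later cells never run on bad data.
    raise RuntimeError(f"Data quality gate failed: {issues}")
```

Raising an exception is one way to "halt on critical issues": under `jupyter nbconvert --execute`, an uncaught exception stops the run at that cell.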
Problem-driven archetype (ARCHETYPE = "problem-driven")
Used when: explicit business questions drive the analysis (e.g., "What is the spillover trend?").
Example: 排除包效果回收_By月.ipynb
| Cell | Tag | Name | Content Summary | Skip Condition |
|------|-----|------|-----------------|----------------|
| M9 | markdown | Analysis Framework | Mermaid flowchart TD: question nodes, data source nodes, flow edges | Single-question notebook |
| M9.5 | [RUN] | Chart Registry | CHART_REGISTRY dict mapping fig_var → html_slot | No HTML report |
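
The Chart Registry row above describes a dict that maps figure variables to HTML slots. A minimal sketch follows, assuming figures are tracked by variable name and that slot names are free-form strings. The names `fig_trend`, `chart_trend`, and `check_registry` are hypothetical; see templates/chart_registry.py for the actual pattern.

```python
# Strings stand in for real figure objects so the sketch runs without a plotting library.
figures = {
    "fig_trend": "<figure: monthly trend>",
    "fig_breakdown": "<figure: segment breakdown>",
}

# Maps each figure variable name to the slot it fills in the HTML report.
CHART_REGISTRY = {
    "fig_trend": "chart_trend",
    "fig_breakdown": "chart_breakdown",
}


def check_registry(registry, defined_figures):
    """Return registry keys that do not correspond to a defined figure."""
    missing = [name for name in registry if name not in defined_figures]
    for name, slot in registry.items():
        prefix = "[WARN]" if name in missing else "[OK]"
        print(f"{prefix} {name} -> {slot}")
    return missing


missing_figs = check_registry(CHART_REGISTRY, figures)
```

A check like this catches a registry entry that points at a figure which was renamed or never created, before the HTML export cell runs.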
| Cell | Tag | Name | Content Summary | Skip Condition |
|------|-----|------|-----------------|----------------|
| Ch.X.0 | markdown | Chapter Header | ## X. {Question} + one-sentence question + Data + Method | Never |
| Ch.X.1 | [RUN] | Data Preparation | Filter/transform/aggregate; split into sub-cells (Ch.X.1a/1b/1c) if > 25 lines; print shape + preview per sub-cell | Never |
| Ch.X.2 | [VIZ] | Visualization | Charts and/or formatted tables; include chart reading hints | Never |
| Ch.X.3 | [RUN] | Agent Conclusion | Agent interprets viz output, prints structured findings | CONDITIONAL: include only when analysis requires explanatory interpretation. Omit for purely descriptive chapters. Validator does NOT flag its absence. |
| Ch.X.4 | [RUN] | Chapter Summary | Consolidate findings: key metrics + trend + recommendation. Reader only needs this cell. | Never |
When Ch.X.3 is omitted, Ch.X.2 connects directly to Ch.X.4.
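
The conditional Ch.X.3 rule can be sketched as a small helper that lists the cells expected for one chapter. `chapter_cells` and its `needs_conclusion` flag are hypothetical names used only for illustration.

```python
def chapter_cells(chapter_no, needs_conclusion):
    """List the cell IDs expected for one chapter, in order."""
    ids = [f"Ch.{chapter_no}.{part}" for part in (0, 1, 2)]
    if needs_conclusion:
        # Ch.X.3 appears only when the analysis needs explanatory interpretation.
        ids.append(f"Ch.{chapter_no}.3")
    ids.append(f"Ch.{chapter_no}.4")
    return ids


descriptive = chapter_cells(1, needs_conclusion=False)
interpretive = chapter_cells(2, needs_conclusion=True)
print(descriptive)
print(interpretive)
```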
| Cell | Tag | Name | Content Summary | Skip Condition |
|------|-----|------|-----------------|----------------|
| S1 | [RUN] | Cross-Chapter Synthesis | Executive summary consolidating all chapter summaries | 1 chapter only |
| S2 | [EXPORT] | CSV Export | Save to data/ with standard naming (data_{topic}_{granularity}.csv) | Never |
| S3 | [EXPORT] | HTML Report | Follow html-report-framework protocol (see S3 Spec below) | Optional |
| S4 | markdown | Appendix | Quick reference, structure map, glossary | Never |
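
For the S2 row above, the naming convention `data_{topic}_{granularity}.csv` can be sketched with the standard library. The helper names, the sample row, and the use of the `csv` module (rather than a DataFrame export) are assumptions for this example.

```python
import csv
import tempfile
from pathlib import Path


def csv_export_path(topic, granularity, data_dir="data"):
    """Build the export path using the data_{topic}_{granularity}.csv convention."""
    return Path(data_dir) / f"data_{topic}_{granularity}.csv"


def export_rows(rows, topic, granularity, data_dir="data"):
    """Write a list of dicts to CSV; return the written path, or None if skipped."""
    if not rows:
        print(f"[SKIP] no rows to export for {topic}")
        return None
    path = csv_export_path(topic, granularity, data_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    print(f"[OK] saved {len(rows)} rows -> {path}")
    return path


# A temporary directory stands in for the project's data/ folder.
with tempfile.TemporaryDirectory() as tmp:
    out = export_rows(
        [{"month": "2024-01", "amount": 120.0}],
        topic="spillover",
        granularity="monthly",
        data_dir=Path(tmp) / "data",
    )
    exported_name = out.name
```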
Monitoring archetype (ARCHETYPE = "monitoring")
Used when: periodic trend reporting or dashboard refresh with no single guiding question.
Example: 定向配置分析_By周.ipynb
Note: No M9 (analysis framework) — dimensions are parallel, not sequential.
| Cell | Tag | Name | Content Summary | Skip Condition |
|------|-----|------|-----------------|----------------|
| Dim.X.0 | markdown | Dimension Header | ## X. {Dimension Name} + one-sentence scope | Never |
| Dim.X.1 | [VIZ] | Visualization + Table | Charts, pivot tables, trend lines for this dimension | Never |
| Dim.X.2 | [RUN] | Brief Takeaway | 2-5 bullets: what changed, what's notable, what needs attention | Never |
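
The Dim.X.2 row above asks for 2-5 bullets per dimension. One way to enforce that in the cell itself is sketched below; `print_takeaway` and the sample bullets are hypothetical, and templates/dim_takeaway.py holds the actual pattern.

```python
def print_takeaway(dimension, bullets):
    """Print a brief takeaway and enforce the 2-5 bullet limit."""
    if not 2 <= len(bullets) <= 5:
        raise ValueError(f"expected 2-5 bullets, got {len(bullets)}")
    lines = [f"{dimension}: takeaway"] + [f"- {bullet}" for bullet in bullets]
    print("\n".join(lines))
    return lines


takeaway = print_takeaway(
    "Channel mix",  # illustrative dimension name
    [
        "Share of channel A rose week over week",
        "Channel B volume needs attention",
    ],
)
```
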
| Cell | Tag | Name | Content Summary | Skip Condition |
|------|-----|------|-----------------|----------------|
| R1 | [RUN] | Chart Registry | CHART_REGISTRY dict mapping fig_var → html_slot | No HTML report |
| S2 | [EXPORT] | CSV Export | Save to data/ with standard naming | Never |
| S3 | [EXPORT] | HTML Report | Follow html-report-framework protocol | Optional |
| S4 | markdown | Appendix | Quick reference | Never |
S3 generates an HTML report following html-report-framework conventions. It does NOT invoke another skill at runtime — the agent applies html-report-framework knowledge at notebook-build time.
Agent steps when building S3:
- Read `html-report-framework/SKILL.md` to understand the protocol
- Start from `html-report-framework/resources/starter-template.html`
- Replace `__PLACEHOLDER__` markers with actual notebook data, add ECharts configs, and write `report_{topic}_{granularity}.html`

S3 must NOT:
- Inline a full HTML document (`<!DOCTYPE html>` literal in cell)
- Use `generate_report.py` / `generate_config_report.py`

See templates/export_html.py for the pattern.
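
The placeholder-replacement step can be sketched roughly as follows. This is not the content of templates/export_html.py: the marker names `__TITLE__` and `__CHART_TREND__`, the helper `fill_placeholders`, and the sample ECharts option are all assumptions. The template text is inlined only so the sketch runs on its own; in a notebook it would be read from the starter template file.

```python
import json

# Stand-in for text read from starter-template.html (hypothetical marker names).
template_text = "<h1>__TITLE__</h1><script>var trendOption = __CHART_TREND__;</script>"


def fill_placeholders(template, values):
    """Replace each __KEY__ marker with its value; warn if a marker is absent."""
    html = template
    for key, value in values.items():
        marker = f"__{key}__"
        if marker in html:
            html = html.replace(marker, value)
            print(f"[OK] filled {marker}")
        else:
            print(f"[WARN] {marker} not found in template")
    return html


# A small ECharts option serialized to JSON for injection into the page.
trend_option = {
    "xAxis": {"type": "category", "data": ["2024-01", "2024-02"]},
    "yAxis": {"type": "value"},
    "series": [{"type": "line", "data": [120, 95]}],
}
report_html = fill_placeholders(
    template_text,
    {"TITLE": "Spillover by month", "CHART_TREND": json.dumps(trend_option)},
)
report_name = "report_{topic}_{granularity}.html".format(
    topic="spillover", granularity="monthly"
)
```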
Read the target .ipynb file and report:
- `ARCHETYPE` value in CONFIG cell (or infer from chapter/dimension markers)

Follow the manifest for the detected archetype. For each cell:
- Check `templates/` before writing code
- Follow `references/conventions.md`
- Use the `_build_notebook.py` pattern (nbformat programmatic build) to avoid JSON encoding issues

Run the validation script:
python <skill-path>/scripts/validate_notebook.py <notebook.ipynb>
Fix all errors. Investigate warnings. Do NOT report completion while errors exist.
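
To give a sense of what one such validation check could look like, here is a minimal sketch that flags code cells whose first line lacks a recognized tag. It is not the logic of scripts/validate_notebook.py, which is not shown here; `find_untagged_cells` and the sample notebook dict are assumptions. It relies only on the notebook JSON layout, where `cells` is a list and each cell has `cell_type` and `source`.

```python
import re

ALLOWED_TAGS = {"CONFIG", "SETUP", "FUNC", "RUN", "VIZ", "EXPORT"}


def find_untagged_cells(notebook):
    """Return indexes of code cells whose first line has no valid # [TAG] marker."""
    errors = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # .ipynb files may store source as a list of lines
            source = "".join(source)
        stripped = source.strip()
        first_line = stripped.splitlines()[0] if stripped else ""
        tags = re.findall(r"\[([A-Z]+)\]", first_line) if first_line.startswith("#") else []
        if not tags or not set(tags) <= ALLOWED_TAGS:
            errors.append(index)
    return errors


# In practice the dict would come from json.load() on the .ipynb file.
sample_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["# [SETUP] Environment\n", "import os\n"]},
        {"cell_type": "code", "source": "print('no tag')"},
        {"cell_type": "code", "source": "# [RUN]+[VIZ] EDA\nprint('eda')"},
    ]
}
untagged = find_untagged_cells(sample_notebook)
for index in untagged:
    print(f"[WARN] code cell {index} is missing a # [TAG] line")
```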
Run the full notebook end-to-end. Fix any runtime errors before completion:
jupyter nbconvert --to notebook --execute <notebook.ipynb> --output _test_run.ipynb
Skip execution only when DATA_MODE="csv" path is unavailable. After success, delete _build_notebook.py and _test_run.ipynb.
Before reporting completion, verify:

Manifest completeness:
- All mandatory cells M1-M8 present (M4 skipped only if DATA_MODE="csv", M5 only if no SQL files)

Cell conventions:
- Code cells start with a `# [TAG]` line
- Cells include a `>` summary line
- Parameters use the `# type: description | options` format
- `print()` uses [OK] / [WARN] / [SKIP] prefixes

Cell readability:

Chart traceability:

Output files:
- CSV files in `data/` with correct naming (`data_{topic}_{granularity}.csv`)
- HTML reports with `report_` prefix in project root
- SQL files in `sql/` directory; no inline SQL > 20 lines

Execution:

Reference files:
- references/conventions.md
- templates/ (field_validation.py, sql_transparency.py, data_execution.py, quality_gate.py, eda.py, chapter_summary.py, dim_takeaway.py, export_html.py, config_block.py, chart_registry.py, cellmap_generator.py)
- html-report-framework/SKILL.md + html-report-framework/resources/starter-template.html
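
The cell conventions (tag line, summary line, typed parameter comments, prefixed prints) can be illustrated with a sketch of an M3 CONFIG cell. The exact comment layout is an assumption based on the formats named in this document; `START_MONTH` and `TOP_N` are hypothetical parameters, and templates/config_block.py holds the actual pattern.

```python
# [CONFIG] Parameters
# > Adjustable parameters for this notebook (values are illustrative)

ARCHETYPE: str = "problem-driven"  # str: notebook archetype | "problem-driven", "monitoring"
DATA_MODE: str = "sql"             # str: data source mode | "sql", "csv"
START_MONTH: str = "2024-01"       # str: first month to include (hypothetical) | YYYY-MM
TOP_N: int = 10                    # int: rows shown in ranking tables (hypothetical) | 1-50

# Fail fast on values outside the documented options.
if ARCHETYPE not in {"problem-driven", "monitoring"}:
    raise ValueError(f"unknown ARCHETYPE: {ARCHETYPE}")
if DATA_MODE not in {"sql", "csv"}:
    raise ValueError(f"unknown DATA_MODE: {DATA_MODE}")

print(f"[OK] ARCHETYPE={ARCHETYPE}, DATA_MODE={DATA_MODE}")
```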