bundled/skills/data-exploration-visualization/SKILL.md
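An automated data exploration and visualization tool that provides a complete EDA workflow, from data loading to professional report generation. It supports multiple chart types, intelligent data diagnostics, modeling evaluation, and HTML report generation, and suits data analysis projects in healthcare, finance, e-commerce, and similar domains.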
npx skillsauth add foryourhealth111-pixel/vco-skills-codex data-exploration-visualization
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
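The data exploration and visualization skill is an automated EDA toolkit based on the theory in Lesson 2 of 《数据分析咖哥十话》. It provides a complete workflow from data loading to generating a professional analysis report, and combines data exploration, visualization, and machine learning techniques to help users quickly understand the characteristics and patterns in their data.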
Basic data exploration
from scripts.eda_analyzer import EDAAnalyzer
# Initialize the analyzer
analyzer = EDAAnalyzer()
# Load the data and run the automatic analysis
data = analyzer.load_data('data.csv')
report = analyzer.auto_eda(data)
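The source does not show what `auto_eda` computes internally. As a rough illustration of the kind of per-column profile an automated EDA step typically starts from, here is a minimal standard-library sketch. The function name `summarize_columns` and the sample data are hypothetical and are not part of this skill's API.

```python
import csv
import io
import statistics


def summarize_columns(csv_text):
    """Profile each CSV column: row count, missing count, and basic statistics.

    Hypothetical helper for illustration only; not the skill's auto_eda.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    summary = {}
    for col in reader.fieldnames or []:
        raw = [row[col] for row in rows]
        # Treat empty strings and absent trailing fields as missing values.
        present = [v for v in raw if v not in ("", None)]
        info = {"count": len(raw), "missing": len(raw) - len(present)}
        try:
            nums = [float(v) for v in present]
        except ValueError:
            nums = None  # at least one value is not a number
        if not present:
            info["type"] = "empty"
        elif nums is None:
            info["type"] = "categorical"
            info["unique"] = len(set(present))
        else:
            info["type"] = "numeric"
            info["mean"] = statistics.fmean(nums)
            info["min"] = min(nums)
            info["max"] = max(nums)
        summary[col] = info
    return summary


# Made-up sample: one numeric column with a missing value, one categorical.
sample = "age,group\n30,a\n40,b\n,a\n50,b\n"
result = summarize_columns(sample)
print(result)
```

A profile like this is usually what drives later decisions, such as which columns get histograms and which get bar charts.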
Visualization generation
from scripts.visualizer import DataVisualizer
# Initialize the visualizer
visualizer = DataVisualizer()
# Generate all charts automatically
charts = visualizer.auto_visualize(data)
# Generate specific chart types
dist_plot = visualizer.plot_distribution(data, 'column_name')
corr_heatmap = visualizer.plot_correlation(data)
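A correlation heatmap is a rendering of a pairwise correlation matrix. The sketch below computes a Pearson correlation matrix in plain Python to show what the plotted values represent. It assumes Pearson correlation, which the source does not confirm for `plot_correlation`, and the helper names and sample columns are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Build a nested dict of pairwise correlations from {name: values}."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up columns: y rises with x, z falls as x rises.
columns = {
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "z": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

Each cell of the heatmap corresponds to one entry of `matrix`. The matrix is symmetric, so many heatmaps draw only one triangle.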
Modeling evaluation
from scripts.modeling_evaluator import ModelingEvaluator
# Initialize the modeler
modeler = ModelingEvaluator()
# Run automatic modeling and evaluation
results = modeler.auto_modeling(
    data=data,
    target_col='target',
    algorithms=['logistic', 'rf', 'xgboost']
)
Report generation
from scripts.report_generator import ReportGenerator
# Generate the full report
generator = ReportGenerator()
report = generator.generate_comprehensive_report(
    data=data,
    model_results=results,  # the output of modeler.auto_modeling(...)
    output_path='analysis_report.html'
)
Medical data analysis
# Special handling for medical data
from scripts.medical_analyzer import MedicalDataAnalyzer
medical_analyzer = MedicalDataAnalyzer()
medical_report = medical_analyzer.analyze_medical_data(
    data=medical_df,
    diagnosis_col='diagnosis',
    biomarker_cols=['biomarker1', 'biomarker2']
)
Interactive dashboard
# Generate an interactive dashboard
dashboard = visualizer.create_dashboard(
    data=data,
    charts=['distribution', 'correlation', 'model_performance']
)
Batch data processing
# Analyze multiple datasets in one batch
batch_results = analyzer.batch_analyze(
    data_files=['data1.csv', 'data2.csv'],
    analysis_types=['eda', 'modeling', 'visualization']
)
# Breast examination data example
medical_data = {
    'patient_id': ['P001', 'P002', ...],
    'diagnosis': ['Malignant', 'Benign', ...],
    'radius_mean': [17.99, 20.57, ...],
    'texture_mean': [10.38, 17.77, ...],
    'perimeter_mean': [122.8, 132.9, ...]
}
# Credit scoring data example
financial_data = {
    'customer_id': ['C001', 'C002', ...],
    'credit_score': [720, 680, ...],
    'income': [85000, 62000, ...],
    'debt_ratio': [0.15, 0.32, ...],
    'default': [0, 1, ...]
}
A: The skill automatically detects and handles Chinese text encodings, and supports multiple encodings such as UTF-8 and GBK.
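The source does not describe how the encoding detection works. One common approach is to try candidate encodings in order and keep the first that decodes cleanly. The sketch below illustrates that idea under that assumption. `decode_with_fallback` is a hypothetical helper, not the skill's actual implementation.

```python
def decode_with_fallback(raw, encodings=("utf-8", "gbk")):
    """Decode bytes with the first candidate encoding that succeeds.

    Returns (text, encoding_name). UTF-8 is tried first because it is strict:
    GBK-encoded Chinese text is rarely valid UTF-8, while GBK will decode many
    byte sequences that were never meant to be GBK.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# The same header text, stored once as UTF-8 and once as GBK.
header = "诊断,结果"
utf8_text, utf8_encoding = decode_with_fallback(header.encode("utf-8"))
gbk_text, gbk_encoding = decode_with_fallback(header.encode("gbk"))
print(utf8_encoding, gbk_encoding)
```

Trial decoding is a heuristic. Very short or ambiguous inputs can decode successfully under the wrong encoding, so checking the decoded header row remains worthwhile.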
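A: It supports common formats such as CSV, Excel, JSON, and Parquet, as well as database connections.
A: Style parameters such as colors, fonts, and chart layout can be customized through a configuration file.
A: The skill uses cross-validation, multiple evaluation metrics, and ensemble methods to ensure model reliability and generalization.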
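To make the cross-validation mentioned above concrete, here is a minimal sketch of k-fold index splitting in plain Python. It shows the general technique only. The source does not state which cross-validation scheme the skill uses, and `k_fold_indices` is a hypothetical helper.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one, and every sample is used for testing
    exactly once.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

In practice the rows are usually shuffled (or stratified by the target column) before splitting, so that each fold reflects the overall data distribution.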
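✅ Highly automated - 90% of EDA work is automated
✅ Strong domain focus - specialized handling of medical data
✅ Rich visualization - 20+ professional chart types
✅ Strong modeling - multi-algorithm ensembling and automatic tuning
✅ High-quality reports - publication-grade analysis reports
✅ Easy to use - a simple API that automates complex workflows
✅ Extensible - modular design that is easy to customize and extend
With this skill, you can substantially improve your data analysis efficiency, free yourself from repetitive work, and focus on discovering insights and supporting decisions.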