engineering-team/senior-data-scientist/SKILL.md
World-class data science skill for statistical modeling, experimentation, causal inference, and advanced analytics. Expertise in Python (NumPy, Pandas, Scikit-learn), R, SQL, statistical methods, A/B testing, time series, and business intelligence. Includes experiment design, feature engineering, model evaluation, and stakeholder communication. Use when designing experiments, building predictive models, performing causal analysis, or driving data-driven decisions.
npx skillsauth add tiandiyiqi/ai-skills senior-data-scientist
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
World-class senior data scientist skill for production-grade AI/ML/Data systems.
# Core Tool 1
python scripts/experiment_designer.py --input data/ --output results/
# Core Tool 2
python scripts/feature_engineering_pipeline.py --target project/ --analyze
# Core Tool 3
python scripts/model_evaluation_suite.py --config config.yaml --deploy
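Experiment design for A/B tests, one of the tasks this skill names, usually starts with a sample-size estimate. The following is a minimal sketch of that calculation, assuming a two-sided two-proportion z-test with equal group sizes. It is not the code of scripts/experiment_designer.py, whose internals are not shown here; the function name `sample_size_per_group` and its parameters are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, min_detectable_lift, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided two-proportion z-test.

    Hypothetical helper for illustration only; uses the standard
    normal-approximation formula with unpooled variance.
    """
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_lift  # absolute lift, not relative
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("both conversion rates must lie strictly between 0 and 1")
    if min_detectable_lift == 0:
        raise ValueError("min_detectable_lift must be non-zero")

    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = std_normal.inv_cdf(power)           # quantile matching the desired power

    # Sum of the Bernoulli variances of the two groups.
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Example: 10% baseline conversion, detect an absolute lift of 2 points.
    print(sample_size_per_group(0.10, 0.02))
```

Halving the detectable lift roughly quadruples the required sample, because the lift enters the formula squared. That is why the minimum effect size is worth agreeing on with stakeholders before the test starts.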
This skill covers world-class capabilities in:
Languages: Python, SQL, R, Scala, Go
ML Frameworks: PyTorch, TensorFlow, Scikit-learn, XGBoost
Data Tools: Spark, Airflow, dbt, Kafka, Databricks
LLM Frameworks: LangChain, LlamaIndex, DSPy
Deployment: Docker, Kubernetes, AWS/GCP/Azure
Monitoring: MLflow, Weights & Biases, Prometheus
Databases: PostgreSQL, BigQuery, Snowflake, Pinecone
Comprehensive guide available in references/statistical_methods_advanced.md covering:
Complete workflow documentation in references/experiment_design_frameworks.md including:
Technical reference guide in references/feature_engineering_patterns.md with:
Enterprise-scale data processing with distributed computing:
Production ML system with high availability:
High-throughput inference system:
Latency:
Throughput:
Availability:
# Development
python -m pytest tests/ -v --cov
python -m black src/
python -m pylint src/
# Training
python scripts/train.py --config prod.yaml
python scripts/evaluate.py --model best.pth
# Deployment
docker build -t service:v1 .
kubectl apply -f k8s/
helm upgrade service ./charts/
# Monitoring
kubectl logs -f deployment/service
python scripts/health_check.py
references/statistical_methods_advanced.md
references/experiment_design_frameworks.md
references/feature_engineering_patterns.md
scripts/ directory
As a world-class senior professional:
Technical Leadership
Strategic Thinking
Collaboration
Innovation
Production Excellence