engineering-team/senior-data-engineer/SKILL.md
World-class data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and the modern data stack. Covers data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, or implementing data governance.
npx skillsauth add tiandiyiqi/ai-skills senior-data-engineer
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
World-class senior data engineer skill for production-grade AI/ML/Data systems.
# Core Tool 1
python scripts/pipeline_orchestrator.py --input data/ --output results/
# Core Tool 2
python scripts/data_quality_validator.py --target project/ --analyze
# Core Tool 3
python scripts/etl_performance_optimizer.py --config config.yaml --deploy
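The source does not show what `scripts/data_quality_validator.py` does internally. As a rough illustration of the kind of check a data quality step often performs, here is a minimal sketch that counts missing values in required columns and flags duplicate keys. The function name `check_quality`, its parameters, and the sample rows are all hypothetical and are not taken from the skill's scripts.

```python
from typing import Any, Dict, List


def check_quality(
    rows: List[Dict[str, Any]],
    required: List[str],
    unique_key: str,
) -> Dict[str, Any]:
    """Hypothetical helper: count nulls per required column and collect duplicate keys."""
    null_counts = {col: 0 for col in required}
    seen = set()
    duplicates = []
    for row in rows:
        # A missing column and an explicit None are both treated as null.
        for col in required:
            if row.get(col) is None:
                null_counts[col] += 1
        key = row.get(unique_key)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return {
        "row_count": len(rows),
        "null_counts": null_counts,
        "duplicate_keys": duplicates,
        # The batch passes only if there are no nulls and no duplicate keys.
        "passed": not duplicates and all(n == 0 for n in null_counts.values()),
    }


# Illustrative sample data (not from the source).
sample = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 2, "amount": 5.0},
]
report = check_quality(sample, required=["id", "amount"], unique_key="id")
print(report)
```

In a real pipeline, a report like this would typically gate the load step, so a failing batch is quarantined rather than written to the warehouse.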
This skill covers world-class capabilities in:
Languages: Python, SQL, R, Scala, Go
ML Frameworks: PyTorch, TensorFlow, Scikit-learn, XGBoost
Data Tools: Spark, Airflow, dbt, Kafka, Databricks
LLM Frameworks: LangChain, LlamaIndex, DSPy
Deployment: Docker, Kubernetes, AWS/GCP/Azure
Monitoring: MLflow, Weights & Biases, Prometheus
Databases: PostgreSQL, BigQuery, Snowflake, Pinecone
Comprehensive guide available in references/data_pipeline_architecture.md covering:
Complete workflow documentation in references/data_modeling_patterns.md including:
Technical reference guide in references/dataops_best_practices.md with:
Enterprise-scale data processing with distributed computing:
Production ML system with high availability:
High-throughput inference system:
Latency:
Throughput:
Availability:
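The source names latency, throughput, and availability without stating targets, so none are assumed here. As a minimal sketch of how these three figures might be computed from raw measurements, the following uses a nearest-rank percentile for latency. The helper names and the sample numbers are hypothetical and chosen only for illustration.

```python
import math
from typing import List


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile for pct in (0, 100]."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def throughput(request_count: int, window_seconds: float) -> float:
    """Requests per second over a measurement window."""
    return request_count / window_seconds


def availability(successes: int, total: int) -> float:
    """Percentage of requests that succeeded."""
    return 100.0 * successes / total


# Illustrative measurements (not from the source).
latencies_ms = [12.0, 15.0, 11.0, 40.0, 13.0, 14.0, 90.0, 12.0, 16.0, 13.0]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
rps = throughput(request_count=1200, window_seconds=60)
uptime = availability(successes=998, total=1000)
print(p50, p95, rps, uptime)
```

Tail percentiles such as P95 are usually tracked alongside the median because a few slow requests can hide behind a healthy average.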
# Development
python -m pytest tests/ -v --cov
python -m black src/
python -m pylint src/
# Training
python scripts/train.py --config prod.yaml
python scripts/evaluate.py --model best.pth
# Deployment
docker build -t service:v1 .
kubectl apply -f k8s/
helm upgrade service ./charts/
# Monitoring
kubectl logs -f deployment/service
python scripts/health_check.py
references/data_pipeline_architecture.md
references/data_modeling_patterns.md
references/dataops_best_practices.md
scripts/ directory
As a world-class senior professional:
Technical Leadership
Strategic Thinking
Collaboration
Innovation
Production Excellence