plugins/kaggle-master/skills/competition-workflows/SKILL.md
Kaggle competition notebook workflows and submissions. PROACTIVELY activate for: (1) submitting notebook outputs to competitions, (2) `kaggle competitions submit -k`, (3) downloading competition data, (4) validating submission.csv format, (5) leakage review, (6) cross-validation split design, (7) public leaderboard overfitting concerns, (8) competition rule compliance, (9) reproducible top-to-bottom notebook execution, (10) fold-aware preprocessing for ML pipelines. Provides: submission commands, validation checklist, leakage controls, and competition-ready notebook guidance.
`npx skillsauth add JosiahSiegel/claude-plugin-marketplace competition-workflows`
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Use this skill for Kaggle competition workflows from data access through notebook execution and submission. Prioritize reproducibility, rule compliance, and leakage prevention over leaderboard-chasing shortcuts.
Use the notebook-kernel submission form when submitting an output file produced by a notebook version:
`kaggle competitions submit <competition> -k <owner/slug> -f <file> -v <version> -m "<message>"`
Before submission, confirm the notebook run completed successfully, the output file exists, and the version number matches the intended run. Use notebook-lifecycle for status, logs, files, and output download commands.
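The pre-submission checks above can be partly automated. The following is a minimal sketch, not part of the skill itself: it assumes the competition provides a sample submission file, that the first column is the row ID, and that the expected format is a plain CSV. The function name `validate_submission` and the file names in the usage note are hypothetical.

```python
import csv
from pathlib import Path


def _read_rows(path):
    """Read a CSV file into a list of rows, skipping fully blank lines."""
    with open(path, newline="") as f:
        return [row for row in csv.reader(f) if row]


def validate_submission(submission_path, sample_path):
    """Compare a submission CSV against the competition's sample submission.

    Returns a list of human-readable problems; an empty list means the
    basic format checks passed. Assumption: the first column holds the ID.
    """
    submission_path = Path(submission_path)
    if not submission_path.is_file():
        return [f"missing file: {submission_path}"]

    sample_rows = _read_rows(sample_path)
    sub_rows = _read_rows(submission_path)
    if not sample_rows or not sub_rows:
        return ["empty CSV"]

    problems = []
    # Header must match the sample exactly (names and order).
    if sub_rows[0] != sample_rows[0]:
        problems.append(f"columns {sub_rows[0]} != expected {sample_rows[0]}")
    # Same number of data rows as the sample.
    if len(sub_rows) != len(sample_rows):
        problems.append(
            f"row count {len(sub_rows) - 1} != expected {len(sample_rows) - 1}"
        )
    sample_ids = [row[0] for row in sample_rows[1:]]
    sub_ids = [row[0] for row in sub_rows[1:]]
    if len(set(sub_ids)) != len(sub_ids):
        problems.append("duplicate IDs in submission")
    if set(sub_ids) != set(sample_ids):
        problems.append("submission IDs do not match sample IDs")
    # Every prediction cell should be filled in.
    if any(cell.strip() == "" for row in sub_rows[1:] for cell in row[1:]):
        problems.append("empty prediction values")
    return problems
```

A call such as `validate_submission("submission.csv", "sample_submission.csv")` returns `[]` when these checks pass. It does not verify value ranges or competition-specific formats, so the competition's own evaluation page remains the reference.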
Use Kaggle CLI or kagglehub for competition downloads. Attach competition sources in kernel-metadata.json when the notebook must run on Kaggle. Confirm users accepted competition rules before assuming downloads or submissions will work.
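As an illustration of attaching a competition source, the sketch below builds the contents of a `kernel-metadata.json` file in Python. The field names follow the Kaggle kernel metadata format as commonly documented; the helper name `build_kernel_metadata` and all values (`your-username`, `my-competition-notebook`, `example-competition`) are hypothetical placeholders, and the defaults for GPU, internet, and privacy are assumptions to adjust per competition rules.

```python
import json


def build_kernel_metadata(owner, slug, title, code_file, competition):
    """Return a kernel-metadata.json payload with one competition attached."""
    return {
        "id": f"{owner}/{slug}",
        "title": title,
        "code_file": code_file,
        "language": "python",
        "kernel_type": "notebook",
        "is_private": True,
        "enable_gpu": False,
        # Assumption: internet off, as many code competitions require.
        "enable_internet": False,
        "dataset_sources": [],
        # The competition slug makes its data available to the Kaggle run.
        "competition_sources": [competition],
        "kernel_sources": [],
    }


# Hypothetical values for illustration only.
metadata = build_kernel_metadata(
    owner="your-username",
    slug="my-competition-notebook",
    title="My Competition Notebook",
    code_file="notebook.ipynb",
    competition="example-competition",
)
print(json.dumps(metadata, indent=2))
```

Writing the printed JSON to `kernel-metadata.json` next to the notebook is the usual next step before pushing the kernel.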
`/kaggle/working` during Kaggle runs.
Use group/time/stratified splits that match the competition structure. Apply preprocessing inside folds to avoid fitting transforms on validation data. Avoid using public leaderboard feedback as a validation set; repeated leaderboard probing can overfit. Confirm external data, pretrained models, internet access, and ensemble sources comply with rules.
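To make "preprocessing inside folds" concrete, here is a small standard-library sketch of a group-aware split in which the scaling statistics are fitted on the training rows of each fold only. The helper names (`group_kfold`, `fit_scaler`, `transform`) and the toy data are illustrative assumptions; in practice a library pipeline such as scikit-learn's `Pipeline` combined with `GroupKFold` would typically play this role.

```python
import random
import statistics


def group_kfold(groups, n_splits, seed=0):
    """Yield (train_idx, valid_idx) pairs where no group appears on both sides."""
    unique = sorted(set(groups))
    if n_splits > len(unique):
        raise ValueError("n_splits cannot exceed the number of unique groups")
    random.Random(seed).shuffle(unique)
    for k in range(n_splits):
        valid_groups = set(unique[k::n_splits])
        train_idx = [i for i, g in enumerate(groups) if g not in valid_groups]
        valid_idx = [i for i, g in enumerate(groups) if g in valid_groups]
        yield train_idx, valid_idx


def fit_scaler(values):
    """Compute mean and (population) standard deviation from the given rows."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # avoid division by zero
    return mean, std


def transform(values, scaler):
    """Standardize values with statistics fitted elsewhere."""
    mean, std = scaler
    return [(v - mean) / std for v in values]


# Toy data: one numeric feature and a group label per row.
feature = [1.0, 2.0, 3.0, 4.0, 10.0, 12.0, 20.0, 22.0]
groups = ["a", "a", "b", "b", "c", "c", "d", "d"]

for fold, (train_idx, valid_idx) in enumerate(group_kfold(groups, n_splits=2)):
    # Fit on training rows only; validation rows never influence the statistics.
    scaler = fit_scaler([feature[i] for i in train_idx])
    x_train = transform([feature[i] for i in train_idx], scaler)
    x_valid = transform([feature[i] for i in valid_idx], scaler)
    print(f"fold {fold}: {len(x_train)} train rows, {len(x_valid)} valid rows")
```

Fitting the scaler on the full dataset before splitting would let validation rows shape the transform, which is the leakage this pattern is meant to avoid.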
The notebook should execute top-to-bottom from a clean Kaggle session. Set seeds for Python, NumPy, framework libraries, and splitters where applicable. Pin package versions when environment drift could affect results. Use a DEBUG flag to run small samples locally or during fast checks without changing final-run logic.
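A minimal sketch of the seeding and DEBUG-flag pattern follows. Reading the flag from a `DEBUG` environment variable, the seed value, and the helper names `seed_everything` and `maybe_subsample` are assumptions for illustration; framework-specific seeding (for example in PyTorch or TensorFlow) would be added to the same function.

```python
import os
import random

SEED = 42
# Assumption: the flag is read from an environment variable named DEBUG.
DEBUG = os.environ.get("DEBUG", "0") == "1"


def seed_everything(seed):
    """Seed Python's RNG and, if installed, NumPy's global RNG."""
    # Only affects child processes started later; the current interpreter
    # reads PYTHONHASHSEED at startup.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
    except ImportError:
        pass  # NumPy not installed; nothing else to seed here
    else:
        np.random.seed(seed)


def maybe_subsample(rows, debug, n=1000, seed=SEED):
    """Return all rows for a full run, or a reproducible sample in debug mode."""
    rows = list(rows)
    if not debug:
        return rows
    return random.Random(seed).sample(rows, min(n, len(rows)))


seed_everything(SEED)
rows = maybe_subsample(range(100_000), DEBUG)
print(f"DEBUG={DEBUG}, using {len(rows)} rows")
```

Because the flag only changes how many rows flow through, the rest of the notebook runs the same code path in both modes.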
| Symptom | Check |
|---|---|
| Submission rejected | Filename, columns, rows, competition slug, accepted rules |
| Score impossible | Leakage, target contamination, ID mismatch, train/test merge mistake |
| Notebook output missing | Save location, run failure, timeout, file pattern |
| Local score diverges | Split mismatch, fold leakage, random seeds, preprocessing outside folds |
Do not bypass competition rules or suggest hidden test reconstruction, private data scraping, or external data not allowed by rules. Warn before long GPU/TPU training runs. If secrets are needed, advise Kaggle Secrets through the UI; do not claim public API support for secrets administration.