skills/33-Galaxy-Dawn-claude-scholar/skills/kaggle-learner/SKILL.md
This skill should be used when the user asks to "learn from Kaggle", "study Kaggle solutions", "analyze Kaggle competitions", or mentions Kaggle competition URLs. Provides access to extracted knowledge from winning Kaggle solutions across NLP, CV, time series, tabular, and multimodal domains.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research kaggle-learner
Extract and apply knowledge from Kaggle competition winning solutions. This skill provides access to a continuously updated knowledge base of techniques, code patterns, and best practices from top Kaggle competitors.
Kaggle competitions are at the forefront of practical machine learning. Winning solutions often innovate with novel techniques, clever feature engineering, and optimized pipelines. This skill captures that knowledge and makes it accessible for your projects.
Use this skill when:
| Category | Focus | Directory |
|----------|-------|-----------|
| NLP | Text classification, NER, translation, LLM applications | references/knowledge/nlp/ |
| CV | Image classification, detection, segmentation, generation | references/knowledge/cv/ |
| Time Series | Forecasting, anomaly detection, sequence modeling | references/knowledge/time-series/ |
| Tabular | Feature engineering, traditional ML, structured data | references/knowledge/tabular/ |
| Multimodal | Cross-modal tasks, vision-language models | references/knowledge/multimodal/ |
File organization: each competition gets its own markdown file, filed under the directory for its domain.
Examples:
- time-series/birdclef-plus-2025.md
- nlp/aimo-2-2025.md
To learn from a competition:
To browse existing knowledge:
references/knowledge/[domain]/
This skill automatically updates its knowledge base when the kaggle-miner agent processes new competitions. The more you use it, the smarter it becomes.
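Browsing the knowledge base can be sketched as a small directory walk. This is a minimal illustration, not part of the skill itself: the function name `list_knowledge_files` is hypothetical, and it assumes the knowledge files live under `references/knowledge/` in the five domain directories listed in the table above.

```python
from pathlib import Path

# Domain directory names, taken from the category table in this document.
DOMAINS = ("nlp", "cv", "time-series", "tabular", "multimodal")


def list_knowledge_files(root: Path) -> dict[str, list[str]]:
    """Map each domain to the sorted markdown file names in its directory.

    `root` is assumed to point at references/knowledge/. A missing domain
    directory yields an empty list rather than an error.
    """
    index: dict[str, list[str]] = {}
    for domain in DOMAINS:
        domain_dir = root / domain
        if domain_dir.is_dir():
            index[domain] = sorted(p.name for p in domain_dir.glob("*.md"))
        else:
            index[domain] = []
    return index


if __name__ == "__main__":
    # Hypothetical usage: run from the skill's root directory.
    for domain, files in list_knowledge_files(Path("references/knowledge")).items():
        print(f"{domain}: {', '.join(files) or '(no files yet)'}")
```

Returning an empty list for a missing directory keeps the listing usable while a domain has no processed competitions yet.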
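Every time knowledge is extracted from a Kaggle competition, the following standard sections must be included:
| Section | Description | Required |
|---------|-------------|----------|
| Competition Brief | Competition background, task description, data scale, evaluation metric | ✅ Required |
| Original Summaries | Brief overview of the top-ranked solutions | ✅ Required |
| Detailed technical analysis of top solutions | Core techniques and implementation details of the Top 20 solutions | ✅ Required ⭐ |
| Code Templates | Reusable code templates | ✅ Required |
| Best Practices | Best practices and common pitfalls | ✅ Required |
| Metadata | Data source tags and date | ✅ Required |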
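A check for the required sections might look like the sketch below. It rests on assumptions that the source does not confirm: that each section appears as a markdown heading, and that the heading text contains the English names used here (the detailed-analysis section may appear under a different or non-English title in real files). `REQUIRED_SECTIONS` and `missing_sections` are hypothetical names.

```python
import re

# Assumed heading texts; adjust to match the headings used in real files.
REQUIRED_SECTIONS = (
    "Competition Brief",
    "Original Summaries",
    "Detailed Technical Analysis",
    "Code Templates",
    "Best Practices",
    "Metadata",
)

# Matches ATX-style markdown headings such as "## Competition Brief".
HEADING = re.compile(r"^#{1,6}[ \t]+(.*?)[ \t]*$", re.MULTILINE)


def missing_sections(markdown_text, required=REQUIRED_SECTIONS):
    """Return the required section names with no matching heading.

    Matching is a case-insensitive substring test against heading text,
    so "## Top Solutions: Detailed Technical Analysis" still counts.
    """
    headings = [h.lower() for h in HEADING.findall(markdown_text)]
    return [
        name for name in required
        if not any(name.lower() in heading for heading in headings)
    ]
```

An empty result means every required section was found. Anything returned names a section that still needs to be written.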
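Each top-ranked solution should include:
Example format:
**[Rank] Place - Core Technique Name (Author)**
Core techniques:
- **Technique 1**: short description
- **Technique 2**: short description
Implementation details:
- Specific parameters, models, and configuration
- Data and experimental results
Covering the Top 20 solutions is recommended, to capture more of the innovative techniques used by top-ranked competitors.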
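The entry format above could be generated from structured data, which helps keep entries uniform across competitions. The sketch below is illustrative only: `SolutionEntry` and `render_entry` are hypothetical names, the labels are written in English, and the sample values are placeholders rather than real competition results.

```python
from dataclasses import dataclass, field


@dataclass
class SolutionEntry:
    """One top-ranked solution, mirroring the example format."""
    place: str                      # e.g. "1st"
    technique: str                  # core technique name
    author: str
    tricks: dict[str, str] = field(default_factory=dict)   # name -> short description
    details: list[str] = field(default_factory=list)       # parameters, models, results


def render_entry(entry: SolutionEntry) -> str:
    """Render an entry as markdown in the example format."""
    lines = [
        f"**{entry.place} Place - {entry.technique} ({entry.author})**",
        "Core techniques:",
    ]
    lines += [f"- **{name}**: {desc}" for name, desc in entry.tricks.items()]
    lines.append("Implementation details:")
    lines += [f"- {detail}" for detail in entry.details]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values only; not taken from any real solution.
    sample = SolutionEntry(
        place="1st",
        technique="Example technique",
        author="example_author",
        tricks={"Technique 1": "short description"},
        details=["placeholder parameters and configuration"],
    )
    print(render_entry(sample))
```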
- references/knowledge/nlp/ - NLP competition techniques
- references/knowledge/cv/ - Computer vision techniques
- references/knowledge/time-series/ - Time series methods
- references/knowledge/tabular/ - Tabular data approaches
- references/knowledge/multimodal/ - Multimodal solutions
- time-series/birdclef-plus-2025.md - includes the complete detailed technical analysis of the Top 14 solutions
- time-series/birdclef-2024.md - includes detailed technical analysis of the Top 3 solutions
- nlp/aimo-2-2025.md - includes a technical summary of the Top 12+ solutions