brycewang-stanford/architecture-design

Name: architecture-design
Author: brycewang-stanford

skills/33-Galaxy-Dawn-claude-scholar/skills/architecture-design/SKILL.md

npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research architecture-design

Architecture Design - ML Project Template

This skill defines the standard code architecture for machine learning projects based on the template structure. When modifying or extending code, follow these patterns to maintain consistency.

Overview

The project follows a modular, extensible architecture with clear separation of concerns. Each module (data, model, trainer, analysis) is independently organized using factory and registry patterns for maximum flexibility.

When to Use

Use this skill when:

Creating a new Dataset class that needs @register_dataset
Creating a new Model class that needs @register_model
Creating a new module directory with __init__.py factory wiring
Initializing a new ML project structure from scratch
Adding new component types such as Augmentation, CollateFunction, or Metrics

When Not to Use

Do not use this skill when:

Modifying existing functions or methods
Fixing bugs in existing code
Adding helper functions or utilities
Refactoring without adding new registrable components
Making simple code changes to a single file
Modifying configuration files
Reading or understanding existing code

Key indicator: if the task does not require a @register_* decorator or a Factory pattern, skip this skill.

Core Design Patterns

Factory Pattern

Each module uses a factory to look up a registered class by name at runtime, falling back to a default when the name is unknown:

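# Example from data_module/dataset/__init__.py
from typing import Dict

# Maps a registered name to its dataset class
DATASET_FACTORY: Dict = {}

def DatasetFactory(data_name: str):
    # Return the class registered under data_name, or the 'simple' fallback
    dataset = DATASET_FACTORY.get(data_name, None)
    if dataset is None:
        print(f"{data_name} dataset is not implemented, falling back to the simple dataset")
        dataset = DATASET_FACTORY.get('simple')
    return dataset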

For detailed guidance, refer to references/factory_pattern.md.
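
A minimal, self-contained sketch of how a caller might use such a factory. The two dataset classes and the pre-filled registry are stand-ins for illustration, not the template's actual code. Note that the factory returns a class, which the caller then instantiates.

```python
from typing import Dict, List, Type


class SimpleDataset:
    """Stand-in for the template's default dataset (hypothetical)."""

    def __init__(self, data: List[int]) -> None:
        self.data = data


class CustomDataset(SimpleDataset):
    """Stand-in for a project-specific dataset (hypothetical)."""


# Name -> class mapping; filled by hand here, by decorators in the template
DATASET_FACTORY: Dict[str, Type[SimpleDataset]] = {
    "simple": SimpleDataset,
    "custom": CustomDataset,
}


def DatasetFactory(data_name: str) -> Type[SimpleDataset]:
    """Return the class registered under data_name, or the 'simple' fallback."""
    dataset = DATASET_FACTORY.get(data_name)
    if dataset is None:
        print(f"{data_name} dataset is not implemented, falling back to the simple dataset")
        dataset = DATASET_FACTORY["simple"]
    return dataset


# Look up a class by name, then instantiate it
known = DatasetFactory("custom")([1, 2, 3])
fallback = DatasetFactory("missing")([4, 5])
```

Because the fallback is silent apart from a printed message, a misspelled dataset name in a config would still run, just with the simple dataset.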

Registry Pattern

Components register themselves via decorators:

# Example from data_module/dataset/simple_dataset.py
from torch.utils.data import Dataset

from src.data_module.dataset import register_dataset

@register_dataset("simple")
class SimpleDataset(Dataset):
    def __init__(self, data):
        self.data = data

For detailed guidance, refer to references/registry_pattern.md.
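
The decorator itself is not shown here, so the following is only a sketch of how register_dataset could be written, assuming it stores classes in the DATASET_FACTORY dict. The duplicate-name check is an added assumption, not something the template is known to do.

```python
from typing import Callable, Dict

# Registry shared by the decorator and the factory
DATASET_FACTORY: Dict[str, type] = {}


def register_dataset(name: str) -> Callable[[type], type]:
    """Return a class decorator that stores the class under `name`."""

    def decorator(cls: type) -> type:
        # Assumption: refuse to overwrite an existing registration
        if name in DATASET_FACTORY:
            raise ValueError(f"Dataset '{name}' is already registered")
        DATASET_FACTORY[name] = cls
        return cls  # hand the class back unchanged

    return decorator


@register_dataset("simple")
class SimpleDataset:
    def __init__(self, data):
        self.data = data
```

Registration happens when the class definition is executed, which is why the module that defines the class has to be imported before the factory is queried.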

Auto-Import Pattern

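Each package's __init__.py discovers and imports its submodules, so their registration decorators run without manual imports:

# Example from data_module/dataset/__init__.py
import os

# Import every module in this directory under the given package namespace
models_dir = os.path.dirname(__file__)
import_modules(models_dir, "src.data_module.dataset")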

For detailed guidance, refer to references/auto_import.md.
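
The body of import_modules is not shown, so this is a hypothetical sketch of one common way to write such a helper with the standard library. Skipping names that start with an underscore or a dot, and returning the list of imported names, are assumptions made for illustration.

```python
import importlib
import os
from typing import List


def import_modules(models_dir: str, namespace: str) -> List[str]:
    """Import every public module or subpackage in models_dir under namespace."""
    imported: List[str] = []
    for entry in sorted(os.listdir(models_dir)):
        # Skip __init__.py, __pycache__, private modules, and hidden files
        if entry.startswith("_") or entry.startswith("."):
            continue
        path = os.path.join(models_dir, entry)
        if entry.endswith(".py"):
            module_name = entry[:-3]
        elif os.path.isdir(path):
            module_name = entry
        else:
            continue
        # Importing the module executes its registration decorators
        importlib.import_module(f"{namespace}.{module_name}")
        imported.append(module_name)
    return imported
```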

Directory Structure

project/
├── run/
│   ├── pipeline/            # Main workflow scripts
│   │   ├── training/        # Training pipelines
│   │   ├── prepare_data/    # Data preparation pipelines
│   │   └── analysis/        # Analysis pipelines
│   └── conf/                # Hydra configuration files
│       ├── training/        # Training configs
│       ├── dataset/         # Dataset configs
│       ├── model/           # Model configs
│       ├── prepare_data/    # Data prep configs
│       └── analysis/        # Analysis configs
│
├── src/
│   ├── data_module/         # Data processing module
│   │   ├── dataset/         # Dataset implementations
│   │   ├── augmentation/    # Data augmentation
│   │   ├── collate_fn/      # Collate functions
│   │   ├── compute_metrics/ # Metrics computation
│   │   ├── prepare_data/    # Data preparation logic
│   │   ├── data_func/       # Data utility functions
│   │   └── utils.py         # Module-specific utilities
│   │
│   ├── model_module/        # Model implementations
│   │   ├── brain_decoder/   # Brain decoder models
│   │   └── model/           # Alternative model location
│   │
│   ├── trainer_module/      # Training logic
│   ├── analysis_module/     # Analysis and evaluation
│   ├── llm/                 # LLM-related code
│   └── utils/               # Shared utilities
│
├── data/
│   ├── raw/                 # Original, immutable data
│   ├── processed/           # Cleaned, transformed data
│   └── external/            # Third-party data
│
├── outputs/
│   ├── logs/                # Training and evaluation logs
│   ├── checkpoints/         # Model checkpoints
│   ├── tables/              # Result tables
│   └── figures/             # Plots and visualizations
│
├── pyproject.toml           # Project configuration
├── uv.lock                  # Dependency lock file
├── TODO.md                  # Task tracking
├── README.md                # Project documentation
└── .gitignore               # Git ignore rules

For detailed directory structure with file descriptions, refer to references/structure.md.
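
When initializing a project from scratch, the directory tree above can be created with a short script. This is a sketch only: the scaffold function and PROJECT_DIRS list are hypothetical names, and top-level files such as pyproject.toml and README.md are left out.

```python
from pathlib import Path
from typing import List

# Directories taken from the tree above
PROJECT_DIRS: List[str] = [
    "run/pipeline/training",
    "run/pipeline/prepare_data",
    "run/pipeline/analysis",
    "run/conf/training",
    "run/conf/dataset",
    "run/conf/model",
    "run/conf/prepare_data",
    "run/conf/analysis",
    "src/data_module/dataset",
    "src/data_module/augmentation",
    "src/data_module/collate_fn",
    "src/data_module/compute_metrics",
    "src/data_module/prepare_data",
    "src/data_module/data_func",
    "src/model_module/brain_decoder",
    "src/model_module/model",
    "src/trainer_module",
    "src/analysis_module",
    "src/llm",
    "src/utils",
    "data/raw",
    "data/processed",
    "data/external",
    "outputs/logs",
    "outputs/checkpoints",
    "outputs/tables",
    "outputs/figures",
]


def scaffold(root: Path) -> List[Path]:
    """Create the project directories under root; safe to run repeatedly."""
    created: List[Path] = []
    for rel in PROJECT_DIRS:
        path = root / rel
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling scaffold(Path("project")) would create the layout under ./project without touching any existing files.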

Module Organization

Creating a New Dataset

When adding a new dataset:

Create a file in src/data_module/dataset/
Use the @register_dataset("name") decorator
Inherit from torch.utils.data.Dataset
Implement __init__, __len__, and __getitem__

from typing import Dict

import torch
from torch.utils.data import Dataset

from src.data_module.dataset import register_dataset

@register_dataset("custom")
class CustomDataset(Dataset):
    def __init__(self, data):
        # data: an indexable collection of samples, each a dict of tensors
        self.data = data

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, i: int) -> Dict[str, torch.Tensor]:
        return self.data[i]

Creating a New Model

CRITICAL: Models use a config-driven pattern.

When adding a new model:

Create a file in src/model_module/model/ or the appropriate module subdirectory
Use the @register_model('ModelName') decorator
__init__ accepts ONLY the cfg parameter; all hyperparameters come from the config
forward() returns a dict: {"loss": loss, "labels": labels, "logits": logits}
Handle training vs. inference modes using self.training

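import torch.nn as nn
import torch.nn.functional as F

from src.model_module.brain_decoder import register_model

@register_model('MyModel')
class MyModel(nn.Module):
    def __init__(self, cfg):
        super().__init__()
        self.cfg = cfg
        self.task = cfg.dataset.task

        # ALL parameters from cfg
        self.hidden_dim = cfg.model.hidden_dim
        self.output_dim = cfg.dataset.target_size[cfg.dataset.task]

        # Placeholder layer (an assumption, so the example runs); replace with the real architecture
        self.head = nn.Linear(self.hidden_dim, self.output_dim)

    def forward(self, x, labels=None, **kwargs):
        logits = self.head(x)

        if self.training and labels is not None:
            # Training logic (placeholder: cross-entropy, assuming a classification task)
            loss = F.cross_entropy(logits, labels)
        else:
            # Inference logic: no loss is computed
            loss = None

        return {"loss": loss, "labels": labels, "logits": logits}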

Adding Data Augmentation

When adding augmentation:

Create a file in src/data_module/augmentation/
Implement the transformation function
Register it with a factory if it needs to be selectable by name
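
The steps above can be sketched as follows. The template's augmentation registry is not shown, so AUGMENTATION_FACTORY and register_augmentation are hypothetical names modeled on the dataset registry, and plain Python lists stand in for tensors to keep the example dependency-free.

```python
from typing import Callable, Dict, List

Augmentation = Callable[[List[float]], List[float]]

# Hypothetical registry, mirroring the dataset factory
AUGMENTATION_FACTORY: Dict[str, Augmentation] = {}


def register_augmentation(name: str) -> Callable[[Augmentation], Augmentation]:
    """Return a decorator that stores the function under `name`."""

    def decorator(fn: Augmentation) -> Augmentation:
        AUGMENTATION_FACTORY[name] = fn
        return fn

    return decorator


@register_augmentation("scale")
def scale(sample: List[float], factor: float = 2.0) -> List[float]:
    """Multiply every value in the sample by a constant factor."""
    return [x * factor for x in sample]


@register_augmentation("flip")
def flip(sample: List[float]) -> List[float]:
    """Reverse the order of the values in the sample."""
    return sample[::-1]


# Select an augmentation by name, as a config-driven pipeline might
augmented = AUGMENTATION_FACTORY["scale"]([1.0, 2.0, 3.0])
```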

Code Style Guidelines

For comprehensive style guidelines, refer to references/code_style.md.

Key principles:

Always use type hints for function signatures
Follow import order: standard library → third-party → local
Module __init__.py files contain factory/registry logic
Model classes must be config-driven

Configuration Management

The project uses Hydra for configuration management:

Config files in run/conf/ are organized by module
Each stage (training, analysis) has its own config structure
Use YAML files for all configuration
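
To show how a config-driven model reads its settings, here is a plain-Python stand-in for the nested object Hydra would build from the YAML files. Only the field paths (cfg.model.hidden_dim, cfg.dataset.task, cfg.dataset.target_size) come from the model example in this document; every value below is made up for illustration.

```python
from types import SimpleNamespace

# Stand-in for the composed config; in the project this would come from run/conf/
cfg = SimpleNamespace(
    dataset=SimpleNamespace(
        task="classification",  # hypothetical task name
        target_size={"classification": 10, "regression": 1},  # hypothetical sizes
    ),
    model=SimpleNamespace(hidden_dim=128),  # hypothetical value
)

# The same lookups a model's __init__ performs
hidden_dim = cfg.model.hidden_dim
output_dim = cfg.dataset.target_size[cfg.dataset.task]
```

Keeping the output size under the dataset config, keyed by task, lets one model definition serve several tasks without code changes.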

When Working on This Project

Before Modifying Code

Read the relevant module's factory/registry pattern
Check existing implementations for consistency
Follow the established directory structure
Use registration decorators for new components

Adding New Features

Determine which module the feature belongs to
Check if similar functionality exists
Follow factory/registry pattern if creating new component types
Add configuration files if needed
Update documentation

Code Review Checklist

[ ] Uses factory/registry pattern appropriately
[ ] Follows module directory structure
[ ] Has proper type annotations
[ ] Imports are correctly ordered
[ ] Registration decorator is used
[ ] Configuration files are added if needed

Additional Resources

Reference Files

For detailed information, consult:

references/structure.md - Detailed directory structure with file descriptions
references/factory_pattern.md - Factory pattern in-depth explanation
references/registry_pattern.md - Registry pattern in-depth explanation
references/auto_import.md - Auto-import pattern in-depth explanation
references/code_style.md - Comprehensive code style guidelines

Example Files

Working examples in examples/:

examples/custom_dataset.py - Custom dataset implementation
examples/custom_model.py - Custom model implementation
examples/augmentation_example.py - Data augmentation example
examples/config_example.yaml - Configuration file example
examples/pipeline_example.sh - Pipeline script example

Use only when creating new registrable ML components that require Factory or Registry patterns.

566 stars

data-ai

Updated May 1, 2026

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research architecture-design

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners in report:

Trivy - Container and dependency vulnerability scanner: Clean (95%)
Semgrep - Static code analysis for vulnerabilities: Clean (95%)
mcp-scan (Snyk) - Model Context Protocol security validation: Clean (95%)
Snyk (dep) - Open source security scanning: Skipped (50%)
Socket.dev - Supply chain security analysis: Skipped (50%)
VirusTotal - Multi-engine malware detection: Skipped (50%)
CrowdStrike - Advanced threat intelligence: Skipped (50%)
OSV-Scanner - Open Source Vulnerability database check: Skipped (50%)
OWASP Dep-Check: Skipped (50%)

Last scanned: May 1, 2026, 5:53 AM (254.1 s, 11 files scanned)

SKILL.md

name: architecture-design
description: Use only when creating new registrable ML components that require Factory or Registry patterns.
version: 1.2.0

Related Skills

brycewang-stanford/literature-review-tools

tools

Verified · Trusted · Community

Recommend AND run open-source AI tools, agents, Claude Code / Codex skills, and MCP servers for any stage of a literature review: searching, reading, extracting, synthesizing, screening, citation-checking, and paper writing. Use when the user asks "what tool should I use to..." OR "install/run/use <tool> to ..." for research/lit-review work: automating a survey or related-work section, PDF→Markdown extraction for LLMs (MinerU/marker/docling), PRISMA / systematic review (ASReview), citation-backed Q&A over PDFs (PaperQA2), wiring papers into Claude/Cursor via MCP (arxiv/paper-search/zotero servers), or chatting with a Zotero library. Ships a launcher (scripts/litrun.py) that installs each tool in an isolated venv and runs it. Curated catalog of 70+ vetted projects. Supports both Chinese and English (for literature-review tool selection and one-click install/run).

3,109 · SKILL.md · Updated Jul 28, 2026

brycewang-stanford/auto-empirical-research-skills

development

Verified · Trusted · Community

Route empirical-research requests through the Auto-Empirical Research Skills catalog when this whole repository is installed as one skill in Codex, CodeBuddy, Claude Code, or another IDE. Use to choose and load the right vendored AERS skill for causal inference, econometrics, replication, data acquisition, manuscript writing, peer review and referee responses, citation checking, de-AIGC editing, or full empirical-paper workflows without reading the entire repository at once.

3,109 · SKILL.md · Updated Jun 27, 2026

brycewang-stanford/aer-preregistration

documentation

Verified · Trusted · Community

Use when the project collects primary data or runs a field, lab, or survey experiment, before the intervention begins: write the pre-analysis plan, size the sample from a power calculation, and register with the AEA RCT Registry. Apply after the design is chosen in aer-identification and before any outcome data are seen.

3,021 · SKILL.md · Updated Jul 23, 2026

brycewang-stanford/economist-data-skill

tools

Verified · Trusted · Community

Guide economists to authoritative data sources with explicit, confirmed data specifications before retrieval; interfaces with Playwright MCP to navigate portals and extract real data, not articles about data.

3,021 · SKILL.md · Updated Jul 23, 2026

Install manually

# Clone the repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research.git

# Copy into Claude Code skills folder (global)
cp -r Awesome-Agent-Skills-for-Empirical-Research/skills/33-Galaxy-Dawn-claude-scholar/skills/architecture-design ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research

566 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT