
aiskillstore/ml-antipattern-validator

Name: ml-antipattern-validator
Author: aiskillstore

skills/doyajin174/ml-antipattern-validator/SKILL.md

npx skillsauth add aiskillstore/marketplace ml-antipattern-validator


ML Antipattern Validator

Overview

A skill that detects and prevents 30+ antipatterns in AI/ML development.

Key Principle: Honest evaluation > Impressive metrics.

When to Activate

Automatic Triggers:

ML training code (train*.py, model training)
Dataset preparation or splitting
Model evaluation or testing
Production deployment planning

Manual Triggers:

@validate-ml - Full validation
@check-leakage - Data leakage detection
@verify-eval - Evaluation methodology

Pre-Implementation Checklist

✅ Requirements:
□ Problem clearly defined with success metrics
□ Train/test split strategy defined
□ Evaluation methodology matches business objective

✅ Data Integrity:
□ No temporal leakage (future → past)
□ No target leakage (answer in features)
□ No preprocessing leakage (fit on all data)
□ No group leakage (related samples split)

✅ Evaluation Setup:
□ Test set completely held out
□ Metrics aligned with business objective
□ Baseline models defined

Critical Antipatterns

Category 1: Data Leakage 🚨

1.1 Target Leakage

❌ WRONG: Using "refund_issued" to predict "purchase_fraud"
✅ CORRECT: Only use features available at purchase time

1.2 Temporal Leakage

❌ WRONG: train = df[df['date'] > '2024-06-01']  # Future data
✅ CORRECT: train = df[df['date'] < '2024-06-01']  # Past for training
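A minimal sketch of the cutoff split, assuming a pandas DataFrame with a `date` column. The toy data and the `cutoff` name are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Hypothetical toy data: one row per day, spanning the cutoff date.
df = pd.DataFrame({
    "date": pd.date_range("2024-05-28", periods=8, freq="D"),
    "value": range(8),
})

cutoff = pd.Timestamp("2024-06-01")

# Train strictly on the past; test on the cutoff date and later.
train = df[df["date"] < cutoff]
test = df[df["date"] >= cutoff]

# Every training row must precede every test row.
assert train["date"].max() < test["date"].min()
```

Comparing against a `Timestamp` rather than a string keeps the comparison unambiguous if the column is already a datetime type.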

1.3 Preprocessing Leakage

❌ WRONG: X_scaled = scaler.fit_transform(X); train_test_split(X_scaled)
✅ CORRECT: Split first, then scaler.fit(X_train)
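The split-then-fit order can be sketched as follows, assuming scikit-learn's `StandardScaler`. The random data is a hypothetical stand-in for real features.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))  # hypothetical features
y = rng.integers(0, 2, size=100)                   # hypothetical labels

# Split first, so test rows never influence the scaler statistics.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit on training rows only
X_test_scaled = scaler.transform(X_test)        # reuse the training statistics
```

Wrapping the scaler and the model in a scikit-learn `Pipeline` gives the same guarantee inside cross-validation, since the pipeline refits the scaler on each training fold.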

1.4 Group Leakage

❌ WRONG: train_test_split(df)  # Same user in both sets
✅ CORRECT: GroupShuffleSplit(n_splits=1, test_size=0.2).split(X, y, groups=df['user_id'])  # groups are passed to split(), not the constructor
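A short sketch of a group-aware split with scikit-learn's `GroupShuffleSplit`. The `user_id` array and the toy data are hypothetical; the point is that the groups go to `split()`, and that the two resulting group sets should not intersect.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical data: 12 rows belonging to 4 users, 3 rows each.
X = np.arange(24).reshape(12, 2)
y = np.array([0, 1] * 6)
user_id = np.repeat(["u1", "u2", "u3", "u4"], 3)

splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=user_id))

train_groups = set(user_id[train_idx])
test_groups = set(user_id[test_idx])

# No user may appear on both sides of the split.
assert not (train_groups & test_groups)
```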

1.5 Data Augmentation Leakage

❌ WRONG: augment(X) → train_test_split()
✅ CORRECT: train_test_split() → augment(X_train)
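As an illustration of the correct order, the sketch below splits first and augments only the training rows. The `augment` helper is hypothetical (it appends one noisy copy of each row) and stands in for whatever augmentation a project actually uses.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))      # hypothetical features
y = rng.integers(0, 2, size=50)   # hypothetical labels


def augment(features, labels, noise_scale=0.01, seed=0):
    """Hypothetical augmentation: append one noisy copy of every row."""
    noise = np.random.default_rng(seed).normal(
        scale=noise_scale, size=features.shape
    )
    return np.vstack([features, features + noise]), np.concatenate([labels, labels])


# Split first, so no augmented copy of a test row can land in training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
X_train_aug, y_train_aug = augment(X_train, y_train)
```

The test set stays untouched, so evaluation reflects the original data distribution.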

Category 2: Evaluation Mistakes ⚠️

2.1 Testing on Training Data

❌ WRONG: evaluate(model, training_data)
✅ CORRECT: evaluate(model, unseen_test_data)

2.2 Metric Misalignment

Business Objective → Appropriate Metric:
- Ranking → NDCG, MRR, MAP
- Imbalanced → F1, Precision@K, AUC-PR
- Balanced → Accuracy, AUC-ROC

2.3 Accuracy Paradox

❌ WRONG: 99% accuracy on 99:1 imbalanced data
✅ CORRECT: Check per-class metrics with classification_report()
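The paradox is easy to reproduce. The sketch below uses hypothetical labels with a 99:1 imbalance and a degenerate model that always predicts the majority class; accuracy looks excellent while recall on the minority class is zero.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, recall_score

# Hypothetical 99:1 imbalanced labels and an "always predict 0" model.
y_true = np.array([0] * 99 + [1])
y_pred = np.zeros(100, dtype=int)

accuracy = accuracy_score(y_true, y_pred)
minority_recall = recall_score(y_true, y_pred, pos_label=1)

print(f"accuracy={accuracy:.2f}, minority recall={minority_recall:.2f}")

# Per-class precision, recall, and F1; zero_division=0 silences the warning
# for the class that is never predicted.
print(classification_report(y_true, y_pred, zero_division=0))
```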

2.4 Invalid Time Series CV

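❌ WRONG: cross_val_score(model, X, y, cv=5)  # Plain k-fold: some folds train on later data and test on earlier data
✅ CORRECT: cross_val_score(model, X, y, cv=TimeSeriesSplit(n_splits=5))  # Each fold trains only on earlier rows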
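A minimal sketch of time-ordered cross-validation, assuming the rows of `X` are already sorted by time. The synthetic series and the choice of `Ridge` are hypothetical placeholders.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

# Hypothetical time-ordered data: row i was observed before row i + 1.
rng = np.random.default_rng(0)
X = np.arange(60, dtype=float).reshape(-1, 1)
y = 0.5 * X.ravel() + rng.normal(scale=0.1, size=60)

tscv = TimeSeriesSplit(n_splits=5)

# In every fold, all training indices come before all test indices.
for train_idx, test_idx in tscv.split(X):
    assert train_idx.max() < test_idx.min()

scores = cross_val_score(Ridge(), X, y, cv=tscv)
```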

2.5 Hyperparameter Tuning on Test Set

❌ WRONG: grid_search(model, X_test, y_test)
✅ CORRECT: train/validation/test three-way split
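One way to sketch the three-way split: tune on the validation set and touch the test set exactly once at the end. The data, the 60/20/20 proportions, and the candidate values of `C` are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))  # hypothetical features
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# 60% train, 20% validation, 20% test.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=0
)

# Select the hyperparameter using the validation set only.
best_C, best_val_score = None, -1.0
for C in (0.01, 0.1, 1.0, 10.0):
    model = LogisticRegression(C=C).fit(X_train, y_train)
    val_score = model.score(X_val, y_val)
    if val_score > best_val_score:
        best_C, best_val_score = C, val_score

# Refit on train + validation, then evaluate on the test set once.
final_model = LogisticRegression(C=best_C).fit(X_trainval, y_trainval)
test_score = final_model.score(X_test, y_test)
```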

Category 3: Training Pitfalls 🔧

3.1 Batch Norm Inference Error

❌ WRONG: predictions = model(X_test)  # Still in train mode
✅ CORRECT: Call model.eval() first, then run predictions = model(X_test) inside a with torch.no_grad(): block

3.2 Early Stopping Overfitting

❌ WRONG: EarlyStopping(patience=50)
✅ CORRECT: EarlyStopping(patience=5, min_delta=0.001, restore_best_weights=True)

3.3 Learning Rate Warmup

✅ CORRECT: get_linear_schedule_with_warmup(optimizer, num_warmup_steps=1000, num_training_steps=total_steps)  # total_steps: planned number of optimizer steps

3.4 Class Imbalance

❌ WRONG: CrossEntropyLoss()  # Biased toward majority
✅ CORRECT: CrossEntropyLoss(weight=class_weights)
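The `class_weights` value has to come from somewhere. One common choice, shown here as an assumption rather than the skill's prescribed method, is inverse-frequency weighting via scikit-learn's `compute_class_weight`. The 90:10 label array is hypothetical.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical training labels with a 90:10 imbalance.
y_train = np.array([0] * 90 + [1] * 10)
classes = np.unique(y_train)

# "balanced" weights are n_samples / (n_classes * count_per_class),
# so the rare class gets the larger weight.
class_weights = compute_class_weight(
    class_weight="balanced", classes=classes, y=y_train
)

# With PyTorch, these could be passed on as (assuming torch is installed):
#   torch.nn.CrossEntropyLoss(
#       weight=torch.tensor(class_weights, dtype=torch.float32))
```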

Detection Patterns

Leakage Detection

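# Custom exception types raised by the checks below
class DataLeakageError(Exception): pass
class TemporalLeakageError(Exception): pass
class GroupLeakageError(Exception): pass

# Check feature-target correlation (numeric features only)
correlation = df[features].corrwith(df['target'])
if (correlation.abs() > 0.95).any():
    raise DataLeakageError("Suspiciously high correlation")

# Check temporal ordering: every training row must precede the test period
if train['date'].max() >= test['date'].min():
    raise TemporalLeakageError("Training data overlaps or follows the test period")

# Check group overlap (train_groups and test_groups are sets of group IDs)
if train_groups & test_groups:
    raise GroupLeakageError("Overlapping groups")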

Mode Check

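class InferenceModeError(Exception): pass

# model.training is True until model.eval() has been called (PyTorch)
if model.training:
    raise InferenceModeError("Model in training mode during evaluation")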

Validation Checklist

Before deployment:

[ ] No data leakage detected
[ ] Test set never seen during training
[ ] Metrics aligned with business objective
[ ] model.eval() called for inference
[ ] Class imbalance handled
[ ] Covariate shift monitoring planned

References

See references/REFERENCE.md for detailed examples and scenarios.

aiskillstore/ml-antipattern-validator

skills/doyajin174/ml-antipattern-validator/SKILL.md

Prevents 30+ critical AI/ML mistakes including data leakage, evaluation errors, training pitfalls, and deployment issues. Use when working with ML training, testing, model evaluation, or deployment.

Stars: 234
Category: testing
Updated: Apr 1, 2026

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add aiskillstore/marketplace ml-antipattern-validator

Security Scan Results

3 of 9 scanners reported clean.

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 1, 2026, 4:07 PM (52.8 s, 8 files scanned)



Install manually


# Clone the repo
git clone https://github.com/aiskillstore/marketplace.git

# Copy into Claude Code skills folder (global)
cp -r marketplace/skills/doyajin174/ml-antipattern-validator ~/.claude/skills/


Repository: aiskillstore/marketplace

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT