skills/doyajin174/ml-antipattern-validator/SKILL.md
Prevents 30+ critical AI/ML mistakes including data leakage, evaluation errors, training pitfalls, and deployment issues. Use when working with ML training, testing, model evaluation, or deployment.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add aiskillstore/marketplace ml-antipattern-validator
A skill that detects and prevents 30+ anti-patterns in AI/ML development.
Key Principle: Honest evaluation > Impressive metrics.
Automatic Triggers:
train*.py, model training)
Manual Triggers:
- @validate-ml - Full validation
- @check-leakage - Data leakage detection
- @verify-eval - Evaluation methodology
✅ Requirements:
□ Problem clearly defined with success metrics
□ Train/test split strategy defined
□ Evaluation methodology matches business objective
✅ Data Integrity:
□ No temporal leakage (future → past)
□ No target leakage (answer in features)
□ No preprocessing leakage (fit on all data)
□ No group leakage (related samples split)
✅ Evaluation Setup:
□ Test set completely held out
□ Metrics aligned with business objective
□ Baseline models defined
❌ WRONG: Using "refund_issued" to predict "purchase_fraud"
✅ CORRECT: Only use features available at purchase time
❌ WRONG: train = df[df['date'] > '2024-06-01'] # Future data
✅ CORRECT: train = df[df['date'] < '2024-06-01'] # Past for training
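The temporal rule above can be sketched without any ML library. This is a minimal illustration, assuming rows are dictionaries with a `date` field; the helper name `temporal_split` and the sample rows are hypothetical.

```python
from datetime import date


def temporal_split(rows, cutoff):
    """Split rows so that training data strictly precedes the cutoff date."""
    train = [r for r in rows if r["date"] < cutoff]
    test = [r for r in rows if r["date"] >= cutoff]
    return train, test


# Hypothetical sample data for illustration only.
rows = [
    {"date": date(2024, 5, 1), "y": 0},
    {"date": date(2024, 5, 20), "y": 1},
    {"date": date(2024, 6, 1), "y": 0},
    {"date": date(2024, 7, 15), "y": 1},
]
train, test = temporal_split(rows, cutoff=date(2024, 6, 1))

# Sanity check: the newest training row is older than the oldest test row.
newest_train = max(r["date"] for r in train)
oldest_test = min(r["date"] for r in test)
```

Comparing `newest_train` with `oldest_test` after every split is a cheap guard against accidentally reversing the inequality.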
❌ WRONG: X_scaled = scaler.fit_transform(X); train_test_split(X_scaled)
✅ CORRECT: Split first, then scaler.fit(X_train)
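To make the "split first, then fit" rule concrete, here is a dependency-free sketch of standardization. The helpers `fit_scaler` and `transform` are hypothetical stand-ins for a scaler's fit and transform steps; the point is only that the statistics come from the training split and are reused unchanged on the test split.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn mean and standard deviation from the given values only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid division by zero for constant features
    return mu, sigma


def transform(values, params):
    """Standardize values using previously learned statistics."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]


# Hypothetical, already-split data.
x_train = [1.0, 2.0, 3.0, 4.0]
x_test = [10.0, 12.0]

params = fit_scaler(x_train)                # statistics from training data only
x_train_scaled = transform(x_train, params)
x_test_scaled = transform(x_test, params)   # test data never influences params
```

Had the scaler been fit on `x_train + x_test`, the test values would have shifted the mean and leaked information about the held-out distribution into training.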
❌ WRONG: train_test_split(df) # Same user in both sets
✅ CORRECT: GroupShuffleSplit(groups=df['user_id'])
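The idea behind a group-aware split can be sketched in plain Python: assign whole groups, not individual rows, to each side. The function `group_split` below is a hypothetical helper, not scikit-learn's implementation, and the `user_id` rows are made up for illustration.

```python
import random


def group_split(rows, group_key, test_fraction=0.25, seed=0):
    """Split rows so that every group lands entirely in train or entirely in test."""
    groups = sorted({r[group_key] for r in rows})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [r for r in rows if r[group_key] not in test_groups]
    test = [r for r in rows if r[group_key] in test_groups]
    return train, test


# Four hypothetical users with two rows each.
rows = [{"user_id": u, "x": i} for i, u in enumerate("aabbccdd")]
train, test = group_split(rows, "user_id")

train_users = {r["user_id"] for r in train}
test_users = {r["user_id"] for r in test}
```

Because membership is decided per user, `train_users` and `test_users` cannot intersect, whatever the seed.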
❌ WRONG: augment(X) → train_test_split()
✅ CORRECT: train_test_split() → augment(X_train)
❌ WRONG: evaluate(model, training_data)
✅ CORRECT: evaluate(model, unseen_test_data)
Business Objective → Appropriate Metric:
- Ranking → NDCG, MRR, MAP
- Imbalanced → F1, Precision@K, AUC-PR
- Balanced → Accuracy, AUC-ROC
❌ WRONG: Reporting 99% accuracy on 99:1 imbalanced data (always predicting the majority class scores the same)
✅ CORRECT: Check per-class metrics with classification_report()
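A small worked example shows why accuracy misleads here. The sketch below uses a hypothetical helper, `per_class_recall`, and a made-up 99:1 dataset with a classifier that always predicts the majority class.

```python
def per_class_recall(y_true, y_pred):
    """Return recall for each class label present in y_true."""
    recalls = {}
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls[label] = hits / len(idx)
    return recalls


y_true = [0] * 99 + [1]   # 99:1 imbalance
y_pred = [0] * 100        # degenerate model: always predicts class 0

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recalls = per_class_recall(y_true, y_pred)

print(accuracy)  # 0.99
print(recalls)   # {0: 1.0, 1: 0.0}
```

The headline number looks excellent while the minority class is never detected, which is exactly what a per-class report exposes.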
❌ WRONG: cross_val_score(model, X, y, cv=5) # Ignores time order: trains on folds from the future
✅ CORRECT: TimeSeriesSplit(n_splits=5)
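The principle behind a time-series split is an expanding window: each fold trains on everything before a point and tests on what follows. The generator below is a simplified sketch of that idea under the assumption that samples are already sorted by time; `expanding_window_splits` is a hypothetical helper and its fold sizes are not guaranteed to match scikit-learn's.

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) where train always precedes test."""
    if n_samples < n_splits + 1:
        raise ValueError("need at least n_splits + 1 samples")
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = fold * k
        # The last fold absorbs any remainder.
        test_end = n_samples if k == n_splits else train_end + fold
        yield list(range(train_end)), list(range(train_end, test_end))


splits = list(expanding_window_splits(n_samples=12, n_splits=3))
for train_idx, test_idx in splits:
    print(train_idx[-1], test_idx)
# 2 [3, 4, 5]
# 5 [6, 7, 8]
# 8 [9, 10, 11]
```

No fold ever trains on an index later than the ones it is tested on.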
❌ WRONG: grid_search(model, X_test, y_test)
✅ CORRECT: train/validation/test three-way split
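A three-way split can be sketched as follows. The helper `three_way_split` and the 60/20/20 proportions are assumptions for illustration; hyperparameters are tuned against the validation part, and the test part is scored once at the end.

```python
import random


def three_way_split(items, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle items and cut them into disjoint train, validation, and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

This random variant suits i.i.d. data only; for temporal or grouped data the cuts should follow time order or group boundaries instead.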
❌ WRONG: predictions = model(X_test) # Still in train mode
✅ CORRECT: model.eval(); with torch.no_grad(): predictions = model(X_test)
❌ WRONG: EarlyStopping(patience=50)
✅ CORRECT: EarlyStopping(patience=5, min_delta=0.001, restore_best_weights=True)
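The effect of `patience` and `min_delta` can be traced with a framework-free sketch. The function `early_stopping_epoch` is a hypothetical helper that mimics the usual rule (stop after `patience` consecutive epochs without an improvement larger than `min_delta`); the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.001):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Meaningful improvement: remember it and reset the wait.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Validation loss improves until epoch 2, then drifts upward.
losses = [1.0, 0.8, 0.7, 0.7005, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76]
stop_epoch, best_epoch = early_stopping_epoch(losses)
print(stop_epoch, best_epoch)  # 7 2
```

Training halts at epoch 7, and restoring the best weights corresponds to reloading the checkpoint from epoch 2. With `patience=50` the same run would have continued overfitting through every remaining epoch.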
✅ CORRECT: get_linear_schedule_with_warmup(num_warmup_steps=1000)
❌ WRONG: CrossEntropyLoss() # Biased toward majority
✅ CORRECT: CrossEntropyLoss(weight=class_weights)
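One common way to obtain `class_weights` is inverse class frequency. The sketch below assumes the "balanced" heuristic, total / (n_classes * class_count); the helper name and the 90:10 label list are hypothetical, and other weighting schemes are equally valid.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by total / (n_classes * class_count)."""
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    return {c: total / (n_classes * n) for c, n in counts.items()}


labels = [0] * 90 + [1] * 10
weights = inverse_frequency_weights(labels)
print(weights[1])  # 5.0
```

The rare class receives a weight of 5.0 and the common class about 0.56, so each class contributes roughly equally to the total loss. The resulting values would then be passed, in class-index order, as the loss function's weight argument.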
# Check feature-target correlation
correlation = df[features].corrwith(df['target'])
if (correlation.abs() > 0.95).any():
    raise DataLeakageError("Suspiciously high correlation")
# Check temporal ordering
if train['date'].min() > test['date'].max():
    raise TemporalLeakageError("Training on future, testing on past")
# Check group overlap
if train_groups & test_groups:
    raise GroupLeakageError("Overlapping groups")
if model.training:
    raise InferenceModeError("Model in training mode during evaluation")
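The checks above reference exception classes that are not defined in this document. A self-contained sketch of two of them, with the exception classes and function names declared here as assumptions, might look like this. Note that the temporal check below is deliberately stricter than the one above: it flags any overlap between the training and test date ranges, not only a full inversion.

```python
class GroupLeakageError(Exception):
    """Hypothetical error: the same group appears in train and test."""


class TemporalLeakageError(Exception):
    """Hypothetical error: training data is not strictly older than test data."""


def check_group_overlap(train_groups, test_groups):
    overlap = set(train_groups) & set(test_groups)
    if overlap:
        raise GroupLeakageError(f"Overlapping groups: {sorted(overlap)}")


def check_temporal_order(train_dates, test_dates):
    # ISO-formatted date strings compare correctly as plain strings.
    if max(train_dates) >= min(test_dates):
        raise TemporalLeakageError("Training data overlaps or follows test data")


# Clean splits pass silently.
check_group_overlap({"u1", "u2"}, {"u3"})
check_temporal_order(["2024-01-01", "2024-05-31"], ["2024-06-01"])
```

Running such checks as assertions at the top of a training script turns silent leakage into an immediate failure.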
Before deployment:
See references/REFERENCE.md for detailed examples and scenarios.