intent-solutions-io/evaluating-machine-learning-models

Name: evaluating-machine-learning-models
Author: intent-solutions-io

010-archive/backups-20251108/skill-structure-cleanup-20251108-073936/plugins/ai-ml/model-evaluation-suite/skills/model-evaluation-suite/SKILL.md

npx skillsauth add intent-solutions-io/plugins-nixtla evaluating-machine-learning-models

Overview

This skill lets Claude evaluate machine learning models and report detailed performance metrics. It uses the model-evaluation-suite plugin to generate those metrics, which supports informed decisions about model selection and optimization.

How It Works

Analyzing Context: Claude analyzes the user's request to identify the model to be evaluated and any specific metrics of interest.
Executing Evaluation: Claude uses the /eval-model command to initiate the model evaluation process within the model-evaluation-suite plugin.
Presenting Results: Claude presents the generated metrics and insights to the user, highlighting key performance indicators and potential areas for improvement.

When to Use This Skill

This skill activates when you need to:

Assess the performance of a machine learning model.
Compare the performance of multiple models.
Identify areas where a model can be improved.
Validate a model's performance before deployment.

Examples

Example 1: Evaluating Model Accuracy

User request: "Evaluate the accuracy of my image classification model."

The skill will:

Invoke the /eval-model command.
Analyze the model's performance on a held-out dataset.
Report the accuracy score and other relevant metrics.
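
As a rough illustration of the kind of numbers such a report might contain, the sketch below computes accuracy and per-class recall on a held-out set in plain Python. It does not show how /eval-model works internally, which the source does not describe. The function names, labels, and predictions are all hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of held-out examples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per class: correct predictions of the class / true occurrences of it."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical held-out labels and predictions for an image classifier.
y_true = ["cat", "dog", "cat", "bird", "dog", "cat"]
y_pred = ["cat", "dog", "dog", "bird", "dog", "cat"]

print(f"accuracy: {accuracy(y_true, y_pred):.3f}")
print(per_class_recall(y_true, y_pred))
```

Per-class recall is included because a single accuracy figure can hide a class the model handles poorly, here "cat".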

Example 2: Comparing Model Performance

User request: "Compare the F1-score of model A and model B."

The skill will:

Invoke the /eval-model command for both models.
Extract the F1-score from the evaluation results.
Present a comparison of the F1-scores for model A and model B.
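
The comparison step can be sketched as follows, assuming each model's predictions on the same labeled evaluation set are already available. The F1 formula is the standard harmonic mean of precision and recall. The data, the names model A and model B, and the helper function are illustrative only and are not output of the plugin.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ground truth and predictions from two models on the same data.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
pred_a = [1, 0, 1, 0, 0, 1, 1, 0]
pred_b = [1, 1, 1, 1, 0, 0, 1, 0]

scores = {
    "model A": f1_score(y_true, pred_a),
    "model B": f1_score(y_true, pred_b),
}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: F1 = {score:.3f}")
print(f"higher F1: {best}")
```

Both models must be scored on the same evaluation set. F1 values computed on different data are not directly comparable.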

Best Practices

Specify Metrics: Clearly define the specific metrics of interest for the evaluation.
Data Validation: Ensure the data used for evaluation is representative of the real-world data the model will encounter.
Interpret Results: Provide context and interpretation of the evaluation results to facilitate informed decision-making.
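
One simple way to approach the data validation point is to compare class proportions in the evaluation set against a sample of real-world labels. The sketch below is one possible check and is not part of the plugin. The helper names, the 0.10 tolerance, and the data are assumptions chosen for illustration.

```python
from collections import Counter


def class_shares(labels):
    """Map each class label to its share of the given label list."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def share_drift(eval_labels, reference_labels, tolerance=0.10):
    """Return classes whose share differs between the two sets by more than tolerance."""
    eval_shares = class_shares(eval_labels)
    ref_shares = class_shares(reference_labels)
    drift = {}
    for label in set(eval_shares) | set(ref_shares):
        gap = abs(eval_shares.get(label, 0.0) - ref_shares.get(label, 0.0))
        if gap > tolerance:
            drift[label] = round(gap, 3)
    return drift


# Hypothetical data: a balanced evaluation set versus skewed real-world traffic.
eval_labels = ["cat"] * 5 + ["dog"] * 5
reference_labels = ["cat"] * 8 + ["dog"] * 2

print(share_drift(eval_labels, reference_labels))
```

A non-empty result suggests the evaluation set may not reflect the data the model will encounter, so its metrics should be read with that gap in mind.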

Integration

This skill works with the model-evaluation-suite plugin to evaluate models within the Claude Code environment. It can be combined with other skills to build automated machine learning workflows.

This skill allows Claude to evaluate machine learning models using a comprehensive suite of metrics. It should be used when the user requests model performance analysis, validation, or testing. Claude can use this skill to assess model accuracy, precision, recall, F1-score, and other relevant metrics. Trigger this skill when the user mentions "evaluate model", "model performance", "testing metrics", "validation results", or requests a comprehensive "model evaluation".

6 stars

testing

Updated Apr 5, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add intent-solutions-io/plugins-nixtla evaluating-machine-learning-models

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy. Container and dependency vulnerability scanner. Status: Clean (95%).
Semgrep. Static code analysis for vulnerabilities. Status: Clean (95%).
mcp-scan (Snyk). Model Context Protocol security validation. Status: Clean (95%).
Snyk (dep). Open source security scanning. Status: Skipped (50%).
Socket.dev. Supply chain security analysis. Status: Skipped (50%).
VirusTotal. Multi-engine malware detection. Status: Skipped (50%).
CrowdStrike. Advanced threat intelligence. Status: Skipped (50%).
OSV-Scanner. Open Source Vulnerability database check. Status: Skipped (50%).
OWASP Dep-Check. Status: Skipped (50%).

Last scanned: Apr 5, 2026, 11:47 PM | 6.0s | 4 files scanned

SKILL.md

name: evaluating-machine-learning-models
description: |
allowed-tools: Read, Write, Edit, Grep, Glob, Bash
version: 1.0.0

Related Skills

intent-solutions-io/managing-database-sharding

tools | Verified, Trusted, Community

This skill assists with managing database sharding strategies. It is activated when the user needs to implement horizontal database sharding to scale beyond single-server limitations. The skill supports designing sharding strategies, distributing data across multiple database instances, and implementing consistent hashing, automatic rebalancing, and cross-shard query coordination. Use this skill when the user mentions "database sharding", "sharding implementation", "scale database", or "horizontal partitioning". The plugin helps design and implement sharding for high-scale applications.

8 | SKILL.md | Updated Jul 11, 2026

intent-solutions-io/scanning-database-security

tools | Verified, Trusted, Community

This skill enables Claude to perform comprehensive database security scans using the database-security-scanner plugin. It is triggered when the user requests a security assessment of a database, including identifying vulnerabilities like weak passwords, SQL injection risks, and insecure configurations. The skill leverages OWASP guidelines to ensure thorough coverage and provides remediation suggestions. Use this skill when the user asks to "scan database security", "check database for vulnerabilities", "perform OWASP compliance check on database", or "assess database security posture". The plugin supports PostgreSQL and MySQL.

8 | SKILL.md | Updated Jul 11, 2026

intent-solutions-io/designing-database-schemas

testing | Verified, Trusted, Community

This skill enables Claude to design and visualize database schemas. It leverages normalization guidance (1NF through BCNF), relationship mapping, and ERD generation to create efficient and well-structured databases. Use this skill when the user requests to "design a database schema", "create a database model", "generate an ERD", "normalize a database", or needs help with "database design best practices". The skill is triggered by terms like "database schema", "ERD diagram", "database normalization", and "relational database design".

8 | SKILL.md | Updated Jul 11, 2026

intent-solutions-io/managing-database-replication

tools | Verified, Trusted, Community

This skill enables Claude to manage database replication, failover, and high availability configurations using the database-replication-manager plugin. It is designed to assist with tasks such as setting up master-slave replication, configuring automatic failover, monitoring replication lag, and implementing read scaling. Use this skill when the user requests help with "database replication", "failover configuration", "high availability", "replication lag", or "read scaling" for databases like PostgreSQL or MySQL. The plugin facilitates both physical and logical replication strategies.

8 | SKILL.md | Updated Jul 11, 2026

Install manually

# Clone the repo
git clone https://github.com/intent-solutions-io/plugins-nixtla.git

# Create the global Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy into Claude Code skills folder (global)
cp -r plugins-nixtla/010-archive/backups-20251108/skill-structure-cleanup-20251108-073936/plugins/ai-ml/model-evaluation-suite/skills/model-evaluation-suite ~/.claude/skills/

See the official Claude Code Skills documentation for the skills path.

Repository

intent-solutions-io/plugins-nixtla

6 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT