drillan/mixseek-evaluator-config

Name: mixseek-evaluator-config
Author: drillan

skills/mixseek-evaluator-config/SKILL.md

npx skillsauth add drillan/mixseek-plus mixseek-evaluator-config

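MixSeek Evaluation Config Generation

Overview

Generates the MixSeek-Core evaluation configuration file (evaluator.toml) and judgment configuration file (judgment.toml). Together they define the evaluation criteria for Submissions in a TUMIX tournament, the scoring method, and the final judgment logic.

Prerequisites

- The workspace has been initialized (see mixseek-workspace-init)
- The environment variable MIXSEEK_WORKSPACE is set (recommended)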

Generated files

| File | Purpose | Location |
|------|---------|----------|
| evaluator.toml | Scoring settings for Submissions | configs/evaluators/ |
| judgment.toml | Final judgment settings | configs/judgment/ |

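Usage

Step 1: Gather requirements

Ask the user about the following:

- Evaluation focus: what to emphasize in the evaluation (clarity, coverage, relevance, etc.)
- Weighting: the importance of each metric (equal or custom)
- Judgment style: deterministic (temperature=0) or diversity-oriented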

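Step 2: Propose a metric configuration

Choose from the standard metrics:

| Metric | Description | Use case |
|--------|-------------|----------|
| ClarityCoherence | Clarity and coherence | Tasks that prioritize readability |
| Coverage | Coverage | Tasks that prioritize comprehensiveness |
| LLMPlain | General-purpose LLM evaluation | Tasks that need custom evaluation criteria |
| Relevance | Relevance | Tasks that prioritize precision |

Step 3: Generate the configuration files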

evaluator.toml:

default_model = "google-gla:gemini-2.5-pro"
temperature = 0.0

[[metrics]]
name = "ClarityCoherence"
weight = 0.34

[[metrics]]
name = "Coverage"
weight = 0.33

[[metrics]]
name = "Relevance"
weight = 0.33

judgment.toml:

model = "google-gla:gemini-2.5-pro"
temperature = 0.0
timeout_seconds = 60
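
As an illustration of Step 3, the following Python sketch renders the two files shown above from plain values. The function names `render_evaluator_toml` and `render_judgment_toml` are hypothetical helpers, not part of MixSeek, and the sketch assumes metric and model names are plain ASCII that needs no TOML escaping.

```python
def render_evaluator_toml(metrics, default_model="google-gla:gemini-2.5-pro", temperature=0.0):
    """Render evaluator.toml text from (name, weight) pairs (hypothetical helper)."""
    lines = [
        f'default_model = "{default_model}"',
        # float() keeps the value a TOML float even if an int such as 0 is passed
        f"temperature = {float(temperature)}",
    ]
    for name, weight in metrics:
        lines += ["", "[[metrics]]", f'name = "{name}"', f"weight = {float(weight)}"]
    return "\n".join(lines) + "\n"


def render_judgment_toml(model="google-gla:gemini-2.5-pro", temperature=0.0, timeout_seconds=60):
    """Render judgment.toml text with the three keys used in this step."""
    return (
        f'model = "{model}"\n'
        f"temperature = {float(temperature)}\n"
        f"timeout_seconds = {int(timeout_seconds)}\n"
    )


evaluator_text = render_evaluator_toml(
    [("ClarityCoherence", 0.34), ("Coverage", 0.33), ("Relevance", 0.33)]
)
judgment_text = render_judgment_toml()
print(evaluator_text)
print(judgment_text)
```

Building the text from a list keeps the metric order explicit, which makes the proposed configuration easy to show to the user before writing anything to disk.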

Step 4: Save the files

$MIXSEEK_WORKSPACE/configs/evaluators/evaluator.toml
$MIXSEEK_WORKSPACE/configs/judgment/judgment.toml

Important: When you use custom paths (configs/evaluators/ or configs/judgment/), always specify them explicitly in orchestrator.toml. If you do not, the default paths (configs/evaluator.toml and configs/judgment.toml) are searched instead and your settings are not applied.

# orchestrator.toml
[orchestrator]
evaluator_config = "configs/evaluators/evaluator.toml"
judgment_config = "configs/judgment/judgment.toml"
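
The save step and the matching orchestrator.toml entries can be sketched together in Python. This is a minimal sketch, not MixSeek code: `save_configs` is a hypothetical helper, and a temporary directory stands in for `$MIXSEEK_WORKSPACE` so the example can run anywhere.

```python
import tempfile
from pathlib import Path

# Custom locations used in this skill, relative to the workspace root
EVALUATOR_REL = Path("configs/evaluators/evaluator.toml")
JUDGMENT_REL = Path("configs/judgment/judgment.toml")


def save_configs(workspace, evaluator_text, judgment_text):
    """Write both files under the workspace and return the orchestrator.toml snippet."""
    for rel, text in ((EVALUATOR_REL, evaluator_text), (JUDGMENT_REL, judgment_text)):
        target = Path(workspace) / rel
        target.parent.mkdir(parents=True, exist_ok=True)  # create configs/... if missing
        target.write_text(text, encoding="utf-8")
    return (
        "[orchestrator]\n"
        f'evaluator_config = "{EVALUATOR_REL.as_posix()}"\n'
        f'judgment_config = "{JUDGMENT_REL.as_posix()}"\n'
    )


# A temporary directory stands in for $MIXSEEK_WORKSPACE in this example
with tempfile.TemporaryDirectory() as ws:
    snippet = save_configs(ws, "temperature = 0.0\n", "temperature = 0.0\n")
    written = sorted(p.relative_to(ws).as_posix() for p in Path(ws).rglob("*.toml"))

print(written)
print(snippet)
```

Deriving the orchestrator.toml entries from the same constants used for writing keeps the custom paths and the explicit path settings from drifting apart.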

Step 5: Validate the configuration files (required)

Always run validation after generating the files.

# Validate the evaluator configuration
uv run python skills/mixseek-config-validate/scripts/validate-config.py \
    "$MIXSEEK_WORKSPACE/configs/evaluators/evaluator.toml" --type evaluator

# Validate the judgment configuration
uv run python skills/mixseek-config-validate/scripts/validate-config.py \
    "$MIXSEEK_WORKSPACE/configs/judgment/judgment.toml" --type judgment

If validation succeeds, report the result to the user. If it fails, review the error message and fix the configuration.
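
If the two validation commands need to be driven from a script, they can be assembled as argument lists. The sketch below only builds and prints the commands from this step; `validation_commands` is a hypothetical helper, and actually running them (for example with `subprocess.run(cmd, check=True)`) assumes the validator signals failure through a non-zero exit code, which the source does not state.

```python
import shlex

# Script path as given in this step
VALIDATOR = "skills/mixseek-config-validate/scripts/validate-config.py"


def validation_commands(workspace):
    """Return one argument list per config file, mirroring the commands above."""
    targets = [
        ("evaluator", "configs/evaluators/evaluator.toml"),
        ("judgment", "configs/judgment/judgment.toml"),
    ]
    return [
        ["uv", "run", "python", VALIDATOR, f"{workspace}/{rel}", "--type", kind]
        for kind, rel in targets
    ]


commands = validation_commands("/tmp/mixseek-ws")  # example workspace path
for cmd in commands:
    # shlex.join quotes arguments so the printed line is safe to paste into a shell
    print(shlex.join(cmd))
```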

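Standard metric details

ClarityCoherence (clarity and coherence)

Evaluates the readability and logical consistency of the answer.

Evaluation aspects:

- Clarity of the text structure
- Logical flow
- Appropriate use of technical terms
- Clarity of the conclusion

Recommended uses:

- Generating explanatory text
- Report writing
- Educational content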

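Coverage

Evaluates how comprehensively the answer addresses the question.

Evaluation aspects:

- Whether all aspects of the question are addressed
- Inclusion of related topics
- Richness of examples
- Presence of supplementary information

Recommended uses:

- Research tasks
- FAQ creation
- Technical documentation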

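Relevance

Evaluates how precisely the answer addresses the question.

Evaluation aspects:

- A direct answer to the question
- Exclusion of unnecessary information
- Staying focused
- Fit with the context

Recommended uses:

- Q&A
- Customer support
- Evaluating search results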

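LLMPlain (general-purpose LLM evaluation)

The LLM evaluates the answer against custom criteria defined in system_instruction.

Features:

- Has no predefined evaluation logic
- Fully customizable through system_instruction
- Intended for cases that need special evaluation criteria

Recommended uses:

- Domain-specific evaluation (legal, medical, etc.)
- Project-specific quality standards
- Aspects not covered by the other metrics

Example configuration: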

[[metrics]]
name = "LLMPlain"
weight = 0.5
system_instruction = """
Evaluate the answer from a security perspective:
1. Risk of leaking confidential information
2. Secure coding practices
3. Presence of vulnerabilities
Give a score from 0 to 100.
"""

Examples

Equal weighting

User: Create an evaluation configuration

Agent: Here is a proposed evaluation configuration.

       Metrics (equal weighting):
       - ClarityCoherence: 33.4%
       - Coverage: 33.3%
       - Relevance: 33.3%

       Is this configuration OK?

User: Yes

Agent: Generated the configuration files:
       ✓ configs/evaluators/evaluator.toml
       ✓ configs/judgment/judgment.toml
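
Three metrics cannot each receive exactly one third at a fixed number of decimal places, so one of them has to absorb the rounding remainder. The sketch below shows one way to do that, assuming the remainder goes to the first metric; `equal_weights` is a hypothetical helper, not part of MixSeek.

```python
def equal_weights(names, digits=2):
    """Split 1.0 evenly at `digits` decimals; the first metric absorbs the remainder."""
    if not names:
        raise ValueError("at least one metric is required")
    unit = 10 ** digits                    # e.g. 100 for two decimal places
    base = unit // len(names)              # integer share per metric
    remainder = unit - base * len(names)   # what is left after the even split
    shares = [base + remainder] + [base] * (len(names) - 1)
    return {name: share / unit for name, share in zip(names, shares)}


names = ["ClarityCoherence", "Coverage", "Relevance"]
weights = equal_weights(names)            # two decimals, as in the Step 3 evaluator.toml
fine_weights = equal_weights(names, 3)    # three decimals, as in the dialogue percentages
print(weights)
print(fine_weights)
```

With two decimal places this yields 0.34, 0.33, 0.33, and with three it yields 0.334, 0.333, 0.333. Working in integer units before dividing avoids accumulating floating-point error in the split itself.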

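Custom weighting

User: Create an evaluation configuration that emphasizes accuracy

Agent: Here is a proposed configuration that emphasizes Relevance.

       Metrics:
       - Relevance: 50% (emphasized)
       - ClarityCoherence: 30%
       - Coverage: 20%

       Is this configuration OK?

User: Yes

Example of the generated configuration files

evaluator.toml (custom weighting):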

# MixSeek Evaluator Configuration
# Generated by mixseek-evaluator-config skill

default_model = "google-gla:gemini-2.5-pro"
temperature = 0.0
timeout_seconds = 300
max_retries = 3

[[metrics]]
name = "Relevance"
weight = 0.5

[[metrics]]
name = "ClarityCoherence"
weight = 0.3

[[metrics]]
name = "Coverage"
weight = 0.2
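
To get a feel for what these weights do, the sketch below combines per-metric scores into one number. It assumes the overall score is a weighted sum of per-metric scores on a 0-100 scale; the source does not describe how MixSeek actually aggregates, and the scores used here are made-up illustration values.

```python
# Weights from the custom-weighting evaluator.toml above
weights = {"Relevance": 0.5, "ClarityCoherence": 0.3, "Coverage": 0.2}

# Hypothetical per-metric scores for one Submission (illustration only)
scores = {"Relevance": 80.0, "ClarityCoherence": 70.0, "Coverage": 60.0}


def weighted_score(weights, scores):
    """Weighted sum of per-metric scores (assumed aggregation, not confirmed by MixSeek)."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"no score for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


overall = weighted_score(weights, scores)
print(round(overall, 2))
```

Under this assumption, raising the Relevance weight makes the Relevance score dominate the result, which is the effect the custom configuration is after.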

judgment.toml:

# MixSeek Judgment Configuration
# Generated by mixseek-evaluator-config skill

model = "google-gla:gemini-2.5-pro"
temperature = 0.0
timeout_seconds = 60
max_retries = 3

Weighting rules

The following rules apply to weights:

- All or none: you cannot specify a weight for only some of the metrics
- Sum of 1.0: all weights must sum to 1.0 (±0.001)
- Equal when omitted: if the weights are omitted, they are distributed equally

# Valid: all specified
[[metrics]]
name = "ClarityCoherence"
weight = 0.5

[[metrics]]
name = "Coverage"
weight = 0.5

# Valid: all omitted (equal distribution)
[[metrics]]
name = "ClarityCoherence"

[[metrics]]
name = "Coverage"

# Invalid: only some specified
[[metrics]]
name = "ClarityCoherence"
weight = 0.5  # ❌

[[metrics]]
name = "Coverage"
# weight omitted ❌
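
The three rules can be expressed as a small pre-check that runs before the file is written. This is a sketch of the rules as stated above, not MixSeek's own validator; `resolve_weights` and its error messages are hypothetical, and each metric is represented as a plain dict mirroring one `[[metrics]]` table.

```python
def resolve_weights(metrics, tolerance=0.001):
    """Apply the all-or-none, sum-to-1.0, and equal-when-omitted rules.

    metrics: list of dicts such as {"name": "Coverage", "weight": 0.5};
    the "weight" key is optional.
    """
    if not metrics:
        raise ValueError("at least one metric is required")
    with_weight = [m for m in metrics if "weight" in m]
    if not with_weight:
        # Rule 3: all omitted, so distribute equally
        share = 1.0 / len(metrics)
        return {m["name"]: share for m in metrics}
    if len(with_weight) != len(metrics):
        # Rule 1: partial specification is not allowed
        raise ValueError("specify a weight for every metric or for none")
    total = sum(m["weight"] for m in metrics)
    if abs(total - 1.0) > tolerance:
        # Rule 2: the sum must be 1.0 within the tolerance
        raise ValueError(f"weights must sum to 1.0 (got {total:.3f})")
    return {m["name"]: m["weight"] for m in metrics}


all_given = resolve_weights([
    {"name": "ClarityCoherence", "weight": 0.5},
    {"name": "Coverage", "weight": 0.5},
])
all_omitted = resolve_weights([{"name": "ClarityCoherence"}, {"name": "Coverage"}])
print(all_given)
print(all_omitted)
```

Passing the third example above (one weight given, one omitted) raises `ValueError`, which matches the invalid case.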

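Troubleshooting

Weight sum error

Error: Weights must sum to 1.0

Resolution:

- Adjust the weights so that they sum to 1.0
- Or omit all weights to get an equal distribution

Metric name error

Error: Unknown metric name

Resolution:

- Use a valid metric name: ClarityCoherence, Coverage, LLMPlain, Relevance
- Check the capitalization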
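
Because capitalization mistakes are a likely cause of this error, a pre-check can point at the intended name. The sketch below is a hypothetical helper, not MixSeek's validator; it uses only the four metric names listed above.

```python
VALID_METRICS = ("ClarityCoherence", "Coverage", "LLMPlain", "Relevance")


def check_metric_name(name):
    """Return the name if valid; otherwise raise, hinting at case-only mismatches."""
    if name in VALID_METRICS:
        return name
    by_lower = {m.lower(): m for m in VALID_METRICS}
    hint = by_lower.get(name.lower())  # same letters, different capitalization
    suffix = f" (did you mean {hint!r}?)" if hint else ""
    raise ValueError(f"Unknown metric name: {name!r}{suffix}")


print(check_metric_name("Coverage"))
try:
    check_metric_name("llmplain")
except ValueError as exc:
    message = str(exc)
    print(message)
```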

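Unstable judgments

Resolution:

- Set temperature in judgment.toml to 0.0 (deterministic)
- Set seed to a fixed value

References

- TOML schema details: references/TOML-SCHEMA.md
- Standard metrics: references/METRICS.md
- Orchestrator configuration: skills/mixseek-orchestrator-config/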

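Generates the MixSeek evaluation configuration files (evaluator.toml and judgment.toml). Use it for requests such as "create an evaluation configuration", "scoring settings", "create judgment settings", or "configure metrics". It defines the evaluation criteria for Submissions and the final judgment logic.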

1 stars

data-ai

Updated Apr 4, 2026

npx skillsauth add drillan/mixseek-plus mixseek-evaluator-config

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Displayed value |
|---------|-------------|--------|-----------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 4, 2026, 6:27 PM (4.3s, 3 files scanned)

SKILL.md

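name: mixseek-evaluator-config
description: Generates the MixSeek evaluation configuration files (evaluator.toml and judgment.toml). Use it for requests such as "create an evaluation configuration", "scoring settings", "create judgment settings", or "configure metrics". It defines the evaluation criteria for Submissions and the final judgment logic.
license: Apache-2.0
compatibility: Requires mixseek-core or mixseek-plus. Python 3.13+, uv recommended.
author: mixseek
version: 1.0.0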

Related Skills

drillan/mixseek-workspace-init (tools, updated Apr 4, 2026)

Initializes a MixSeek workspace and creates the directory structure for configuration files. Use it for requests such as "initialize the workspace", "set up mixseek", "create the config directories", or "start a new project".

drillan/mixseek-team-config (development, updated Apr 4, 2026)

Generates the MixSeek team configuration file (team.toml). Use it for requests such as "create a team", "generate agent settings", "make a web search team", or "set up an analysis team". It defines the composition of the Leader Agent and Member Agents.

drillan/mixseek-prompt-builder (development, updated Apr 4, 2026)

Generates the MixSeek prompt builder configuration file (prompt_builder.toml). Use it for requests such as "configure prompts", "create a prompt builder", or "per-round prompts".

drillan/mixseek-orchestrator-config (data-ai, updated Apr 4, 2026)

Generates the MixSeek orchestrator configuration file (orchestrator.toml). Use it for requests such as "configure the orchestrator", "team competition settings", "have multiple teams compete", or "multi-team execution settings". It defines settings for running multiple teams in parallel and selecting the best result.

Install manually

# Clone the repo
git clone https://github.com/drillan/mixseek-plus.git

# Copy into Claude Code skills folder (global)
cp -r mixseek-plus/skills/mixseek-evaluator-config ~/.claude/skills/

Repository

drillan/mixseek-plus

1 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT