plugins/multimodel/skills/model-tracking-protocol/SKILL.md
MANDATORY tracking protocol for multi-model validation. Creates structured tracking tables BEFORE launching models, tracks progress during execution, and ensures complete results presentation. Use when running 2+ external AI models in parallel.
npx skillsauth add madappgang/magus model-tracking-protocolInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Version: 1.0.0 Purpose: MANDATORY tracking protocol for multi-model validation to prevent incomplete reviews Status: Production Ready
This skill defines the MANDATORY tracking protocol for multi-model validation. It provides templates and procedures that make proper tracking unforgettable.
The Problem This Solves:
Agents often launch multiple external AI models but fail to:
The Solution:
This skill provides MANDATORY checklists, templates, and protocols that ensure complete tracking. Missing ANY of these steps = INCOMPLETE review.
You MUST complete ALL items before launching ANY external models.
This is NOT optional. If you skip this, your multi-model validation is INCOMPLETE.
PRE-LAUNCH VERIFICATION (complete before Task calls):
[ ] 1. SESSION_ID created: ________________________
[ ] 2. SESSION_DIR created: ________________________
[ ] 3. Tracking table written to: $SESSION_DIR/tracking.md
[ ] 4. Start time recorded: SESSION_START=$(date +%s)
[ ] 5. Model list confirmed (comma-separated): ________________________
[ ] 6. Per-model timing arrays initialized
[ ] 7. Code context written to session directory
[ ] 8. Tracking marker created: /tmp/.claude-multi-model-active
If ANY item is unchecked, STOP and complete it before proceeding.
Without pre-launch setup, you will:
CRITICAL CONSENSUS FIX APPLIED: Use file-based detection instead of environment variables.
#!/bin/bash
# Run this BEFORE launching any Task calls
# 1. Create unique session
SESSION_ID="review-$(date +%Y%m%d-%H%M%S)-$(head -c 4 /dev/urandom | xxd -p)"
SESSION_DIR="/tmp/${SESSION_ID}"
mkdir -p "$SESSION_DIR"
# 2. Record start time
SESSION_START=$(date +%s)
# 3. Create tracking table
cat > "$SESSION_DIR/tracking.md" << EOF
# Multi-Model Tracking
## Session Info
- Session ID: ${SESSION_ID}
- Started: $(date -u +%Y-%m-%dT%H:%M:%SZ)
- Models Requested: [FILL]
## Model Status
| Model | Agent ID | Status | Start | End | Duration | Issues | Quality | Notes |
|-------|----------|--------|-------|-----|----------|--------|---------|-------|
| [MODEL 1] | | pending | | | | | | |
| [MODEL 2] | | pending | | | | | | |
| [MODEL 3] | | pending | | | | | | |
## Failures
| Model | Failure Type | Error Message | Retry? |
|-------|--------------|---------------|--------|
## Consensus
| Issue | Model 1 | Model 2 | Model 3 | Agreement |
|-------|---------|---------|---------|-----------|
EOF
# 4. Initialize timing arrays
declare -A MODEL_START_TIMES
declare -A MODEL_END_TIMES
declare -A MODEL_STATUS
# 5. Create tracking marker file (CRITICAL FIX)
# This allows hooks to detect that tracking is active
echo "$SESSION_DIR" > /tmp/.claude-multi-model-active
echo "Pre-launch setup complete. Session: $SESSION_ID"
echo "Directory: $SESSION_DIR"
echo "Tracking table: $SESSION_DIR/tracking.md"
For stricter enforcement, set:
export CLAUDE_STRICT_TRACKING=true
When enabled, hooks will BLOCK execution if tracking is not set up, rather than just warning.
| Model | Status | Time | Issues | Quality | Cost |
|-------|--------|------|--------|---------|------|
| claude-embedded | pending | - | - | - | FREE |
| grok | pending | - | - | - | - |
| qwen | pending | - | - | - | FREE |
Update as each completes:
| Model | Status | Time | Issues | Quality | Cost |
|-------|--------|------|--------|---------|------|
| claude-embedded | success | 32s | 8 | 95% | FREE |
| grok | success | 45s | 6 | 87% | $0.002 |
| qwen | timeout | - | - | - | - |
## Model Execution Status
### Summary
- Total Requested: 8
- Completed: 0
- In Progress: 0
- Failed: 0
- Pending: 8
### Detailed Status
| # | Model | Provider | Status | Start | Duration | Issues | Quality | Cost | Error |
|---|-------|----------|--------|-------|----------|--------|---------|------|-------|
| 1 | claude-embedded | Anthropic | pending | - | - | - | - | FREE | - |
| 2 | grok | X-ai | pending | - | - | - | - | - | - |
| 3 | qwen | Qwen | pending | - | - | - | - | FREE | - |
| 4 | gemini | Google | pending | - | - | - | - | - | - |
| 5 | gpt | OpenAI | pending | - | - | - | - | - | - |
| 6 | devstral | Mistral | pending | - | - | - | - | FREE | - |
| 7 | deepseek-r1 | DeepSeek | pending | - | - | - | - | - | - |
| 8 | claude-sonnet | Anthropic | pending | - | - | - | - | - | - |
Create this file at $SESSION_DIR/tracking.md:
# Multi-Model Validation Tracking
Session: ${SESSION_ID}
Started: ${TIMESTAMP}
## Pre-Launch Verification
- [x] Session directory created: ${SESSION_DIR}
- [x] Tracking table initialized
- [x] Start time recorded: ${SESSION_START}
- [x] Model list: ${MODEL_LIST}
## Model Status
| Model | Status | Start | Duration | Issues | Quality |
|-------|--------|-------|----------|--------|---------|
| claude | pending | - | - | - | - |
| grok | pending | - | - | - | - |
| gemini | pending | - | - | - | - |
## Failures
(populated as failures occur)
## Consensus
(populated after all complete)
As each model completes, IMMEDIATELY update:
pending -> in_progress -> success/failed/timeoutDO NOT wait until all models finish. Update as each completes.
Do NOT wait until all models finish. Update tracking AS EACH COMPLETES.
# Call this when each model completes
update_model_status() {
local model="$1"
local status="$2"
local issues="${3:-0}"
local quality="${4:-}"
local error="${5:-}"
local end_time=$(date +%s)
local start_time="${MODEL_START_TIMES[$model]}"
local duration=$((end_time - start_time))
# Update arrays
MODEL_END_TIMES["$model"]=$end_time
MODEL_STATUS["$model"]="$status"
# Log update to session tracking file
echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) - Model: $model, Status: $status, Duration: ${duration}s" >> "$SESSION_DIR/execution.log"
# Update tracking table (append to tracking.md)
echo "| $model | $status | ${duration}s | $issues | ${quality:-N/A} | ${error:-} |" >> "$SESSION_DIR/tracking.md"
# Track performance in global statistics
if [[ "$status" == "success" ]]; then
track_model_performance "$model" "success" "$duration" "$issues" "$quality"
else
track_model_performance "$model" "$status" "$duration" 0 ""
fi
}
# Usage examples:
update_model_status "claude-embedded" "success" 8 95
update_model_status "grok" "success" 6 87
update_model_status "some-model" "timeout" 0 "" "Exceeded 120s limit"
update_model_status "other-model" "failed" 0 "" "API 500 error"
| Status | Meaning | Action |
|--------|---------|--------|
| pending | Not started | Wait |
| in_progress | Currently executing | Monitor |
| success | Completed successfully | Collect results |
| failed | Error during execution | Document error |
| timeout | Exceeded time limit | Note timeout |
| cancelled | User cancelled | Note cancellation |
Show user progress as models complete:
Model Status (3/5 complete):
✓ claude-embedded (32s, 8 issues)
✓ grok (45s, 6 issues)
✓ qwen (52s, 5 issues)
⏳ gpt (in progress, 60s elapsed)
⏳ gemini (in progress, 48s elapsed)
EVERY failed model MUST be documented with:
## Failed Models Report
### Model: grok
- **Failure Type:** API Error
- **Error Message:** "500 Internal Server Error from OpenRouter"
- **Retry Attempted:** Yes, 1 retry, same error
- **Impact:** Review results based on 3/4 models instead of 4
- **Recommendation:** Check OpenRouter status, retry later
### Model: gemini
- **Failure Type:** Timeout
- **Error Message:** "Exceeded 120s limit, response incomplete"
- **Retry Attempted:** No, time constraints
- **Impact:** Lost Gemini perspective, consensus based on remaining models
- **Recommendation:** Extend timeout to 180s for this model
| Category | Common Causes | Recovery | |----------|---------------|----------| | Timeout | Model slow, large input, network latency | Retry with extended timeout | | API Error | Provider down, rate limit, auth issue | Wait and retry, check API status | | Parse Error | Malformed response, encoding issue | Retry, simplify prompt | | Auth Error | Invalid API key, expired token | Check credentials | | Context Limit | Input too large for model | Reduce context, split task | | Rate Limit | Too many requests | Wait, implement backoff |
Always include this in final results:
## Execution Summary
| Metric | Value |
|--------|-------|
| Models Requested | 8 |
| Successful | 5 (62.5%) |
| Failed | 3 (37.5%) |
### Failed Models
| Model | Failure | Recoverable? | Action |
|-------|---------|--------------|--------|
| grok | API 500 | Yes - retry later | Check OpenRouter status |
| gemini | Timeout | Yes - extend limit | Use 180s timeout |
| deepseek-r1 | Auth Error | No - check key | Verify API key valid |
# Document failure immediately when it occurs
document_failure() {
local model="$1"
local failure_type="$2"
local error_msg="$3"
local retry_attempted="${4:-No}"
cat >> "$SESSION_DIR/failures.md" << EOF
### Model: $model
- **Failure Type:** $failure_type
- **Error Message:** "$error_msg"
- **Retry Attempted:** $retry_attempted
- **Timestamp:** $(date -u +%Y-%m-%dT%H:%M:%SZ)
EOF
echo "Failure documented: $model ($failure_type)" >&2
}
# Usage:
document_failure "grok" "API Error" "500 Internal Server Error" "Yes, 1 retry"
After ALL models complete (or max wait time), you MUST perform consensus analysis.
This is NOT optional. Even with 2 successful models, compare their findings.
With only 2 models, consensus is simple:
| Issue | Model 1 | Model 2 | Consensus |
|-------|---------|---------|-----------|
| SQL injection | Yes | Yes | AGREE |
| Missing validation | Yes | No | Model 1 only |
| Weak hashing | No | Yes | Model 2 only |
| Issue | Claude | Grok | Gemini | Agreement |
|-------|--------|------|--------|-----------|
| SQL injection | Yes | Yes | Yes | UNANIMOUS (3/3) |
| Missing validation | Yes | Yes | No | STRONG (2/3) |
| Rate limiting | Yes | No | No | DIVERGENT (1/3) |
For 6+ models, add summary statistics:
## Consensus Summary
- **Unanimous Issues (100%):** 3 issues
- **Strong Consensus (67%+):** 5 issues
- **Majority (50%+):** 2 issues
- **Divergent (<50%):** 4 issues
## Top 5 by Consensus
1. [6/6] SQL injection in search - FIX IMMEDIATELY
2. [6/6] Missing input validation - FIX IMMEDIATELY
3. [5/6] Weak password hashing - RECOMMENDED
4. [4/6] Missing rate limiting - CONSIDER
5. [3/6] Error handling gaps - INVESTIGATE
# Perform consensus analysis on all model findings
analyze_consensus() {
local session_dir="$1"
local num_models="$2"
echo "## Consensus Analysis" > "$session_dir/consensus.md"
echo "" >> "$session_dir/consensus.md"
echo "Based on $num_models model reviews:" >> "$session_dir/consensus.md"
echo "" >> "$session_dir/consensus.md"
# Read all review files and extract issues
# (simplified - actual implementation would parse review markdown)
for review in "$session_dir"/*-review.md; do
echo "Processing: $review"
# Extract issues, compare, categorize by agreement level
done
# Calculate consensus levels
echo "### Consensus Levels" >> "$session_dir/consensus.md"
echo "" >> "$session_dir/consensus.md"
echo "- UNANIMOUS: All $num_models models agree" >> "$session_dir/consensus.md"
echo "- STRONG: ≥67% of models agree" >> "$session_dir/consensus.md"
echo "- MAJORITY: ≥50% of models agree" >> "$session_dir/consensus.md"
echo "- DIVERGENT: <50% of models agree" >> "$session_dir/consensus.md"
}
If you present results without a consensus comparison, your review is INCOMPLETE.
Minimum Requirements:
$SESSION_DIR/consensus.mdYour final output MUST include ALL of these sections.
## Multi-Model Review Complete
### Execution Summary
| Metric | Value |
|--------|-------|
| Session ID | review-20251224-143052-a3f2 |
| Session Directory | /tmp/review-20251224-143052-a3f2 |
| Models Requested | 5 |
| Successful | 4 (80%) |
| Failed | 1 (20%) |
| Total Duration | 68s (parallel) |
| Sequential Equivalent | 245s |
| Speedup | 3.6x |
### Model Performance
| Model | Time | Issues | Quality | Status | Cost |
|-------|------|--------|---------|--------|------|
| claude-embedded | 32s | 8 | 95% | Success | FREE |
| grok | 45s | 6 | 87% | Success | $0.002 |
| qwen | 52s | 5 | 82% | Success | FREE |
| gpt | 68s | 7 | 89% | Success | $0.015 |
| devstral | - | - | - | Timeout | - |
### Failed Models
| Model | Failure | Error |
|-------|---------|-------|
| devstral | Timeout | Exceeded 120s limit |
### Top Issues by Consensus
1. **[UNANIMOUS]** SQL injection in search endpoint
- Flagged by: claude, grok, qwen, gpt-5 (4/4)
- Severity: CRITICAL
- Action: FIX IMMEDIATELY
2. **[UNANIMOUS]** Missing input validation
- Flagged by: claude, grok, qwen, gpt-5 (4/4)
- Severity: CRITICAL
- Action: FIX IMMEDIATELY
3. **[STRONG]** Weak password hashing
- Flagged by: claude, grok, gpt-5 (3/4)
- Severity: HIGH
- Action: RECOMMENDED
### Detailed Reports
- Session directory: /tmp/review-20251224-143052-a3f2
- Consolidated review: /tmp/review-20251224-143052-a3f2/consolidated-review.md
- Individual reviews: /tmp/review-20251224-143052-a3f2/{model}-review.md
- Tracking data: /tmp/review-20251224-143052-a3f2/tracking.md
- Consensus analysis: /tmp/review-20251224-143052-a3f2/consensus.md
### Statistics Saved
- Performance data logged to: ai-docs/llm-performance.json
Before presenting, verify ALL sections are present:
verify_output_complete() {
local output="$1"
local required=(
"Execution Summary"
"Model Performance"
"Top Issues"
"Detailed Reports"
"Statistics"
)
local missing=()
for section in "${required[@]}"; do
if ! echo "$output" | grep -q "$section"; then
missing+=("$section")
fi
done
if [ ${#missing[@]} -gt 0 ]; then
echo "ERROR: Missing required sections: ${missing[*]}" >&2
return 1
fi
return 0
}
Checklist before presenting results:
Symptom: Results presented as prose, not structured data
What went wrong:
"I ran 5 models. 3 succeeded and found various issues."
(No table, no structure)
Prevention:
$SESSION_DIR/tracking.md before Task callsDetection: SubagentStop hook warns if no tracking found
Symptom: "Duration: unknown" or missing speed stats
What went wrong:
# Launched models without recording start time
Task: reviewer1
Task: reviewer2
# No SESSION_START, cannot calculate duration!
Prevention:
# ALWAYS do this first
SESSION_START=$(date +%s)
MODEL_START_TIMES["model1"]=$SESSION_START
Detection: Hook checks for timing data in output
Symptom: "2 of 8 succeeded" with no failure details
What went wrong:
"Launched 8 models. 2 succeeded."
(No info on why 6 failed)
Prevention:
# Immediately when model fails
document_failure "model-name" "Timeout" "Exceeded 120s" "No"
Detection: Hook checks for failure section when success < total
Symptom: Individual model results listed without comparison
What went wrong:
"Model 1 found: A, B, C
Model 2 found: B, D, E"
(No comparison: which issues do they agree on?)
Prevention:
Detection: Hook checks for consensus keywords
Symptom: No record in ai-docs/llm-performance.json
What went wrong:
# Forgot to call tracking functions
# No record of this session
Prevention:
# ALWAYS call these
track_model_performance "model" "status" duration issues quality
record_session_stats total success failed parallel sequential speedup
Detection: Hook checks file modification time
Before presenting results, verify:
[ ] Tracking table exists at $SESSION_DIR/tracking.md
[ ] Tracking table is populated with all model results
[ ] All model times recorded (or "timeout"/"failed" noted)
[ ] All failures documented in $SESSION_DIR/failures.md
[ ] Consensus analysis performed in $SESSION_DIR/consensus.md
[ ] Results match required output format
[ ] Statistics saved to ai-docs/llm-performance.json
[ ] Session directory contains all artifacts
#!/bin/bash
# Full multi-model review with complete tracking
# ============================================================================
# PHASE 1: PRE-LAUNCH (MANDATORY)
# ============================================================================
# 1. Create unique session
SESSION_ID="review-$(date +%Y%m%d-%H%M%S)-$(head -c 4 /dev/urandom | xxd -p)"
SESSION_DIR="/tmp/${SESSION_ID}"
mkdir -p "$SESSION_DIR"
# 2. Record start time
SESSION_START=$(date +%s)
# 3. Create tracking table
cat > "$SESSION_DIR/tracking.md" << EOF
# Multi-Model Validation Tracking
## Session: $SESSION_ID
Started: $(date -u +%Y-%m-%dT%H:%M:%SZ)
## Model Status
| Model | Status | Duration | Issues | Quality |
|-------|--------|----------|--------|---------|
EOF
# 4. Initialize timing arrays
declare -A MODEL_START_TIMES
declare -A MODEL_END_TIMES
# 5. Create tracking marker
echo "$SESSION_DIR" > /tmp/.claude-multi-model-active
# 6. Write code context
git diff > "$SESSION_DIR/code-context.md"
echo "Pre-launch complete. Session: $SESSION_ID"
# ============================================================================
# PHASE 2: MODEL EXECUTION (Parallel Task calls)
# ============================================================================
# Record start times for each model
MODEL_START_TIMES["claude-embedded"]=$(date +%s)
MODEL_START_TIMES["grok"]=$(date +%s)
MODEL_START_TIMES["qwen"]=$(date +%s)
# Launch all models in single message (parallel execution)
# (These would be actual Task calls in practice)
echo "Launching 3 models in parallel..."
# ============================================================================
# PHASE 3: RESULTS COLLECTION (as each completes)
# ============================================================================
# Update status immediately after each completes
update_model_status() {
local model="$1" status="$2" issues="${3:-0}" quality="${4:-}"
local end_time=$(date +%s)
local duration=$((end_time - MODEL_START_TIMES["$model"]))
echo "| $model | $status | ${duration}s | $issues | ${quality:-N/A} |" >> "$SESSION_DIR/tracking.md"
track_model_performance "$model" "$status" "$duration" "$issues" "$quality"
}
# Example completions
update_model_status "claude-embedded" "success" 8 95
update_model_status "grok" "success" 6 87
update_model_status "qwen" "timeout"
# ============================================================================
# PHASE 4: CONSENSUS ANALYSIS (MANDATORY)
# ============================================================================
# Consolidate and compare findings
echo "Performing consensus analysis..."
# (Would launch consolidation agent here)
# ============================================================================
# PHASE 5: STATISTICS & PRESENTATION
# ============================================================================
# Calculate session stats
PARALLEL_TIME=52 # max of all durations
SEQUENTIAL_TIME=129 # sum of all durations
SPEEDUP=2.5
# Record session
record_session_stats 3 2 1 "$PARALLEL_TIME" "$SEQUENTIAL_TIME" "$SPEEDUP"
# Present results
cat << RESULTS
## Multi-Model Review Complete
Session: $SESSION_ID
Directory: $SESSION_DIR
Models: 3 requested, 2 successful, 1 failed
See tracking table: $SESSION_DIR/tracking.md
See consensus: $SESSION_DIR/consensus.md
Statistics saved to: ai-docs/llm-performance.json
RESULTS
# Cleanup marker
rm -f /tmp/.claude-multi-model-active
# Simplest viable multi-model validation
# Pre-launch
SESSION_ID="review-$(date +%s)"
SESSION_DIR="/tmp/$SESSION_ID"
mkdir -p "$SESSION_DIR"
SESSION_START=$(date +%s)
echo "$SESSION_DIR" > /tmp/.claude-multi-model-active
# Launch
echo "Launching Claude + Grok..."
# Task: claude-embedded
# Task: claudish grok
# Track
track_model_performance "claude" "success" 32 8 95
track_model_performance "grok" "success" 45 6 87
# Consensus
echo "Issues both found: SQL injection, missing validation" > "$SESSION_DIR/consensus.md"
# Stats
record_session_stats 2 2 0 45 77 1.7
# Cleanup
rm -f /tmp/.claude-multi-model-active
# Multi-model with failure handling
# Pre-launch (same as Example 1)
# ... setup code ...
# Launch 4 models
# ... Task calls ...
# Model 1: Success
update_model_status "claude" "success" 32 8 95
# Model 2: Success
update_model_status "grok" "success" 45 6 87
# Model 3: Timeout
update_model_status "gemini" "timeout"
document_failure "gemini" "Timeout" "Exceeded 120s limit" "No"
# Model 4: API Error
update_model_status "gpt5" "failed"
document_failure "gpt5" "API Error" "500 from OpenRouter" "Yes, 1 retry"
# Proceed with 2 successful models
if [ "$SUCCESS_COUNT" -ge 2 ]; then
echo "Proceeding with $SUCCESS_COUNT successful models"
# Consensus with partial data
else
echo "ERROR: Only $SUCCESS_COUNT succeeded, need minimum 2"
fi
multi-model-validationThe multi-model-validation skill defines the execution patterns (4-Message Pattern, parallel execution, proxy mode). This skill (model-tracking-protocol) defines the tracking infrastructure.
Use together:
skills: multimodel:multi-model-validation, multimodel:model-tracking-protocol
Workflow:
multi-model-validation for execution patternsmodel-tracking-protocol for tracking setupquality-gatesUse quality gates to ensure tracking is complete before proceeding:
# After tracking setup, verify completeness
if [ ! -f "$SESSION_DIR/tracking.md" ]; then
echo "QUALITY GATE FAILED: No tracking table"
exit 1
fi
# Before presenting results, verify all sections present
verify_output_complete "$OUTPUT" || exit 1
task-orchestrationTrack progress through multi-model phases:
Tasks:
1. [~] Pre-launch setup (tracking protocol)
2. [ ] Launch models (validation patterns)
3. [ ] Collect results (tracking updates)
4. [ ] Consensus analysis (protocol requirement)
5. [ ] Present results (protocol template)
Create marker after pre-launch setup:
echo "$SESSION_DIR" > /tmp/.claude-multi-model-active
Check if tracking active (in hooks):
if [[ -f /tmp/.claude-multi-model-active ]]; then
SESSION_DIR=$(cat /tmp/.claude-multi-model-active)
[[ -f "$SESSION_DIR/tracking.md" ]] && echo "Tracking active"
fi
Remove marker when done:
rm -f /tmp/.claude-multi-model-active
SESSION_ID="review-$(date +%Y%m%d-%H%M%S)-$(head -c 4 /dev/urandom | xxd -p)"
SESSION_DIR="/tmp/${SESSION_ID}"
mkdir -p "$SESSION_DIR"
SESSION_START=$(date +%s)
echo "$SESSION_DIR" > /tmp/.claude-multi-model-active
update_model_status "model" "status" issues quality
document_failure "model" "type" "error" "retry"
track_model_performance "model" "status" duration issues quality
record_session_stats total success failed parallel sequential speedup
verify_output_complete "$OUTPUT"
[ -f "$SESSION_DIR/tracking.md" ] && echo "Tracking exists"
[ -f ai-docs/llm-performance.json ] && echo "Statistics saved"
This skill provides MANDATORY tracking infrastructure for multi-model validation:
Key Innovation: File-based tracking marker (/tmp/.claude-multi-model-active) allows hooks to detect active tracking without relying on environment variables.
Use this skill when: Running 2+ external AI models in parallel for validation, review, or consensus analysis.
Missing tracking = INCOMPLETE validation.
testing
A test skill for validation testing. Use when testing skill parsing and validation logic.
tools
--- name: bad-skill description: This skill has invalid YAML in frontmatter allowed-tools: [invalid, array, syntax prerequisites: not-an-array --- # Bad Skill This skill has malformed frontmatter that should fail parsing. The YAML has: - Unclosed array bracket - Wrong type for prerequisites (should be array, not string)
development
Sync model aliases from the curated Firebase database. Fetches default model assignments, short aliases, team compositions, and known model metadata from the claudish API. Run this to get fresh model recommendations.
tools
Release one or more Magus plugins to the distribution repos (magus, magus-alpha, magus-marketing). Handles version inference from git history, marketplace.json updates, tagging, and force-push to lean dist repos. Use whenever the user says "release kanban", "release the dev plugin", "cut a new version of gtd", "bump kanban to 1.7", or hands you a batch like "release kanban and gtd". Also use for multi-plugin releases and for checking what a release would contain before committing.