skills/multi-model-meta-analysis/SKILL.md
Synthesize outputs from multiple AI models into a comprehensive, verified assessment. Use when: (1) User pastes feedback/analysis from multiple LLMs (Claude, GPT, Gemini, etc.) about code or a project, (2) User wants to consolidate model outputs into a single reliable document, (3) User needs conflicting model claims resolved against actual source code. This skill verifies model claims against the codebase, resolves contradictions with evidence, and produces a more reliable assessment than any single model.
npx skillsauth add petekp/agent-skills multi-model-meta-analysisInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Combine outputs from multiple AI models into a verified, comprehensive assessment by cross-referencing claims against the actual codebase.
Models hallucinate and contradict each other. The source code is the source of truth. Every significant claim must be verified before inclusion in the final assessment.
Parse each model's output and extract discrete claims:
Tag each claim with its source model.
Group semantically equivalent claims:
Create canonical phrasing. Track which models mentioned each.
For each factual claim or identified issue:
CLAIM: "The auth middleware doesn't check token expiry"
VERIFY: Read the auth middleware file
FINDING: [Confirmed | Refuted | Partially true | Cannot verify]
EVIDENCE: [Quote relevant code or explain why claim is wrong]
Use Grep, Glob, and Read tools to locate and examine relevant code. Do not trust model claims without verification.
When models contradict each other:
CONFLICT: Model A says "uses SHA-256", Model B says "uses MD5"
INVESTIGATION: Read crypto.js lines 45-60
RESOLUTION: Model B is correct - line 52 shows MD5 usage
EVIDENCE: `const hash = crypto.createHash('md5')`
Produce a final document that:
# Synthesized Assessment: [Topic]
## Summary
[2-3 sentences describing the verified findings]
## Verified Findings
### Confirmed Issues
| Issue | Severity | Evidence | Models |
|-------|----------|----------|--------|
| [Issue] | High/Med/Low | [file:line or quote] | Claude, GPT |
### Refuted Claims
| Claim | Source | Reality |
|-------|--------|---------|
| [What model said] | GPT-4 | [What code actually shows] |
### Unverifiable Claims
| Claim | Source | Why Unverifiable |
|-------|--------|------------------|
| [Claim] | Claude | [Requires runtime testing / external system / etc.] |
## Consensus Recommendations
[Items where 2+ models agree AND verification supports the suggestion]
## Unique Insights Worth Considering
[Valuable suggestions from single models that weren't contradicted]
## Conflicts Resolved
| Topic | Model A | Model B | Verdict | Evidence |
|-------|---------|---------|---------|----------|
| [Topic] | [Position] | [Position] | [Which is correct] | [Code reference] |
## Action Items
### Critical (Verified, High Impact)
- [ ] [Item] — Evidence: [file:line]
### Important (Verified, Medium Impact)
- [ ] [Item] — Evidence: [file:line]
### Suggested (Unverified but Reasonable)
- [ ] [Item] — Source: [Models]
Always verify:
Trust but note source:
Mark as unverifiable:
development
Compile a plain-language task into a concise, auditable Codex or Claude Code `/goal`, or explain why a normal prompt fits better. Use when the user asks to draft, formulate, rewrite, tighten, or create a goal for multi-step work that needs a durable objective, transcript-visible proof, constraints, bounded stop conditions, host-aware operation, and risk-based review depth.
tools
Expert Unix and macOS systems engineer for shell scripting, system administration, command-line tools, launchd, Homebrew, networking, and low-level system tasks. Use when the user asks about Unix commands, shell scripts, macOS system configuration, process management, or troubleshooting system issues.
testing
Apply professional typography principles to create readable, hierarchical, and aesthetically refined interfaces. Use when setting type scales, choosing fonts, adjusting spacing, designing text-heavy layouts, implementing dark mode typography, or when asked about readability, font pairing, line height, measure, typographic hierarchy, variable fonts, font loading, or OpenType features.
development
Create visual parameter tuning panels for iterative adjustment of animations, layouts, colors, typography, physics, or any numeric/visual values. Use when the user asks to "create a tuning panel", "add parameter controls", "build a debug panel", "tweak parameters visually", "fine-tune values", "dial in the settings", or "adjust parameters interactively". Also triggers on mentions of "leva", "dat.GUI", or "tweakpane".