Candidate Compare

Data-driven candidate selection based on interview performance.

Safety Contract

Before fetching interview transcripts, saving comparison notes, or sharing recommendations, show the candidate names, role, transcript/JD sources, destination path, and whether personally sensitive hiring feedback is included. Require explicit user approval in the current turn before fetching from external systems or writing local files. Do not contact candidates, update ATS records, or make hiring decisions automatically from this skill.

When to use: After interviewing multiple candidates for the same role, compare them objectively

Prerequisites:

fathom-fetch-meetings — Get interview transcripts

workable-fetch-jd — Pull job description for evaluation criteria

Output: Ranked candidates with scores, strengths/gaps analysis, hiring recommendation

Workflow

Step 1: Parse Input

Expected format:

User: "compare Sarah, John, Ahmed for Product Manager"

Extract:

Candidate names: ["Sarah", "John", "Ahmed"]
Role: Product Manager

Alternative triggers:

"select best candidate for [role]" → ask for names
"who should we hire for [role]" → ask for names
"rank candidates" → ask for role and names

Step 2: Fetch Job Description

python3 03-skills/workable-fetch-jd/scripts/fetch_jd.py --job "[role]"

Extract key requirements from JD:

Required experience (years, type)
Technical skills
Soft skills
Nice-to-haves

Step 3: Fetch Transcripts

For each candidate, search Fathom:

# Search for meetings containing candidate name
curl -s --request GET \
  --url 'https://api.fathom.ai/external/v1/meetings?include_summary=true&include_transcript=true' \
  --header 'X-Api-Key: {FATHOM_API_KEY}' \
  | jq '.items[] | select(.title | ascii_downcase | contains("{candidate_name}"))'

If multiple meetings found for a candidate, pick the most recent or ask user to select.

Fallback: If Fathom unavailable, ask user to paste transcripts.

Step 4: Evaluate Each Candidate

Score each candidate on JD requirements:

| Dimension | Weight | What to Look For | |-----------|--------|------------------| | Experience Match | 25% | Years, industry, relevant projects | | Technical Skills | 25% | Demonstrated competency in required skills | | Problem Solving | 20% | How they approached challenges, structured thinking | | Communication | 15% | Clarity, conciseness, ability to explain complex ideas | | Culture Fit | 15% | Values alignment, work style, collaboration signals |

Scoring: 1-5 scale per dimension

5: Exceeds requirements
4: Meets requirements strongly
3: Meets requirements adequately
2: Partially meets requirements
1: Does not meet requirements

Step 4b: Barrel vs Ammunition Classification (REQUIRED)

Alongside the JD-fit score, classify each candidate as BARREL / POTENTIAL BARREL / AMMUNITION / PASS using Keith Rabois's framework. This is a pattern match, not a checklist — fire the "barrel alarm" on operator type, not on score arithmetic.

The 7 indicators (look for evidence in transcripts, project stories, summaries):

| Indicator | Signal | Anti-signal | |-----------|--------|-------------| | Takes initiative | "I noticed X and built Y without waiting" | "I was assigned to..." | | Ships high-quality work | "We iterated until it shipped at 95% accuracy" | "We hit the deadline" | | Values speed | "Rough first version live in a week, then iterated" | "We spent months designing" | | Takes accountability | "I owned it — and when it failed, here's what I did" | "The team's deliverable was..." | | Seen as a resource | "Teammates came to me when X broke" | No peer-pull evidence | | Recruits & motivates others | "I convinced 3 engineers to..." | "I worked with my assigned team" | | Handles adversity | "Compliance blocked us so I rebuilt on-prem" | "We were waiting for approval" |

Barrel arc (strongest single signal): Can the candidate credibly narrate Understand → Ideate → Take initiative → Recruit others → Deliver results on a real project? If yes, that's a BARREL flag even if individual indicators are missing.

Ownership language test: Count "I shipped" / "I owned" / "I convinced" vs "we did" / "the team" / "I was asked." Heavy "we"-framing without specifics is an Ammunition tell.

Classification rules:

| Tag | Criteria | Recommendation override | |-----|----------|------------------------| | BARREL 🛢️ | Most indicators fire AND can narrate the full arc on a real project | "Close on sight" — hire even if JD-fit weak, find or shape a role (per Rabois) | | POTENTIAL BARREL | 3–4 indicators fire OR partial arc with strong single signals | Watch-list; might emerge with right environment + stretch | | AMMUNITION | High-quality execution within defined scope; doesn't pull others | Hire if JD-fit strong. Not a put-down — necessary, just bounded throughput | | PASS | Neither barrel signals nor quality execution signals | Decline |

Critical: If a candidate classifies as BARREL but JD-fit < 3.5/5, the final recommendation must surface the Rabois override: "Close them anyway — find or shape a role later." Do not soften this; it's the user's hiring philosophy.

Evidence-or-flag rule: If an indicator can't be assessed from available data, write "insufficient evidence" — don't score it 0. Barrel classification is qualitative, not arithmetic.

Step 5: Compare & Rank

Calculate weighted JD-fit score for each candidate:

JD Fit = (Experience × 0.25) + (Technical × 0.25) + (Problem Solving × 0.20) + (Communication × 0.15) + (Culture × 0.15)

Rank candidates by JD-fit score, but report the Barrel classification independently — it is not a tiebreaker, it is a separate decision dimension that can override the JD-fit rank (see BARREL override above).

Output Format

This is THE format. Do not deviate without explicit user instruction.

# Candidate Comparison: [Role]

## Summary

| Rank | Candidate | JD Fit | Barrel | Recommendation |
|------|-----------|--------|--------|----------------|
| 1 | [Name] | X.X/5 | [BARREL 🛢️ / POTENTIAL BARREL / Ammunition / Pass] | [verdict] |
| 2 | [Name] | X.X/5 | [classification] | [verdict] |

> **Override:** [Triggered — see Recommendation] OR [Not triggered. No candidate classifies as BARREL, so JD-fit rank stands.]

Always print the Override line, even when not triggered — it confirms the check ran.

---

## [Candidate Name] — JD Fit X.X/5 · [CLASSIFICATION]

> Header embeds classification inline. No separate "Detailed Analysis" wrapper section, no rank number prefix, no ⭐ — the rank lives in the Summary table.

**JD-Fit Scores:**
| Dimension | Score | Notes |
|-----------|-------|-------|
| Experience Match | /5 | [brief note] |
| Technical Skills | /5 | [brief note] |
| Problem Solving | /5 | [brief note] |
| Communication | /5 | [brief note] |
| Culture Fit | /5 | [brief note] |

**Barrel Assessment:**

| Indicator | Fired? | Evidence |
|-----------|--------|----------|
| Takes initiative | ✅ / ⚠️ / ❌ / — | "[quote or 1-line evidence; '—' = insufficient evidence, never score 0]" |
| Ships high-quality work | ✅ / ⚠️ / ❌ / — | "[evidence]" |
| Values speed | ✅ / ⚠️ / ❌ / — | "[evidence]" |
| Takes accountability | ✅ / ⚠️ / ❌ / — | "[evidence]" |
| Seen as a resource | ✅ / ⚠️ / ❌ / — | "[evidence]" |
| Recruits & motivates | ✅ / ⚠️ / ❌ / — | "[evidence]" |
| Handles adversity | ✅ / ⚠️ / ❌ / — | "[evidence]" |

**Barrel arc** (Understand → Ideate → Initiative → Recruit → Deliver): **[Yes / Partial / No]** — [1-sentence note on whether they can credibly narrate the full loop on a real project named in the transcript]

**Ownership language ratio:** [observation on "I shipped/owned/convinced" vs "we did / the team / I was assigned"]

**Why [CLASSIFICATION] not [next-tier-up or next-tier-down]:** [REQUIRED — 2–4 sentences explaining the classification boundary call. Name the structural tells, not just the indicator counts. This is the most important paragraph in the section.]

**Strengths:**
- [Strength 1 with evidence]
- [Strength 2 with evidence]

**Gaps/Risks:**
- [Gap 1 — and how to mitigate or what to probe in next round]

**Key Quote:**
> "[memorable quote from interview]"

---

## [Next Candidate Name] — JD Fit X.X/5 · [CLASSIFICATION]

[Same structure — repeat the full block per candidate]

---

## Recommendation

**Hire / Advance / Pass:** [Name(s) and decision]

**Rationale:** [2–3 sentences. MUST cite both JD-fit AND Barrel classification. If both candidates are non-Barrel, name the structural reason for the recommendation (e.g., service-vs-product mismatch, AI/ML gap), not just the score gap.]

**Barrel override:**
- If a candidate classifies as **BARREL** but ranks below on JD-fit, surface verbatim: *"[BARREL Name] ranked below [other] on JD-fit, but is a Barrel. Per Rabois, close them — find or shape a role. Do not let JD-fit alone make this call."*
- If no override triggers, write: *"No override. JD-fit rank stands."*

**Probing questions for next round** (if any candidate is POTENTIAL BARREL):
- [Specific question to test the missing arc step, e.g., "Walk me through a project you championed against resistance — original idea, who else you needed, what you'd do differently."]
- [Specific question to confirm or downgrade the classification]

**Next Steps:**
- [ ] [Action item 1]
- [ ] [Action item 2]

---

## Comparison Matrix (vs. JD)

| Requirement (from JD) | [Candidate 1] | [Candidate 2] | [Candidate 3] |
|-----------------------|---------------|---------------|---------------|
| [Req 1] | ✅ Strong | ⚠️ Partial | ❌ Gap |
| [Req 2] | ✅ Strong | ✅ Strong | ⚠️ Partial |
| [Req 3] | ⚠️ Partial | ✅ Strong | ✅ Strong |

---

## Notes (optional but recommended)

- Use this section for structural observations: interviewer consistency, transcript quality caveats, hypotheses about why disqualified candidates were rejected vs. what the screening surfaced, etc.
- Example: "Both screens by Rabia Zaki — interviewer consistency holds." or "Adeel's summary is ~50% longer than Hannan's — could reflect substance or just verbosity; don't over-weight."

Triggers

| User Says | Action | |-----------|--------| | "compare [name1], [name2], [name3] for [role]" | Full workflow | | "compare candidates for [role]" | Ask for candidate names | | "select best candidate" | Ask for role and names | | "who should we hire for [role]" | Ask for names | | "rank candidates for [role]" | Ask for names | | "best candidate for [role]" | Ask for names |

Integration

Connected skills: | Skill | Purpose | |-------|---------| | fathom-fetch-meetings | Get interview transcripts by candidate name | | workable-fetch-jd | Pull job requirements for evaluation criteria |

Example Flow:

Manager: "compare Sarah Chen, John Smith, Ahmed Hassan for Product Manager"
→ AI fetches Product Manager JD from Workable
→ AI searches Fathom for "Sarah Chen", "John Smith", "Ahmed Hassan"
→ AI evaluates each against JD requirements
→ AI outputs ranked comparison with recommendation

Edge Cases

| Scenario | Handling | |----------|----------| | Candidate not found in Fathom | Ask user to paste transcript or skip candidate | | Multiple meetings for same candidate | Use most recent, or ask user to select | | JD not found in Workable | Ask user to paste JD or describe role requirements | | Only 2 candidates | Run comparison with 2 | | Tie in scores | Break tie with qualitative analysis, highlight trade-offs |

Bias Mitigation

Evaluate all candidates against the same criteria (JD requirements)
Use specific evidence from transcripts, not impressions
Flag when a dimension lacks data: "Insufficient evidence to score"
Present trade-offs, not just a winner

Version: 1.2 | Created: 2026-02-06 | Updated: 2026-06-12

Changelog:

v1.2: Hardcoded output format. Per-candidate header now embeds classification inline (## Name — JD Fit X.X/5 · CLASSIFICATION), no rank prefix or ⭐. Override line always printed under Summary (even when not triggered). Required "Why [CLASSIFICATION] not [next-tier]:" rationale paragraph per candidate. Probing-questions block in Recommendation when any candidate is POTENTIAL BARREL. Notes section for structural observations.
v1.1: Added Barrel vs Ammunition classification (Keith Rabois framework) as a parallel decision dimension to JD-fit. 4-tier tag (BARREL / POTENTIAL BARREL / AMMUNITION / PASS), 7-indicator pattern match + Barrel-arc test, "close on sight" override when BARREL underperforms on JD-fit.
v1.0: Initial creation — compare up to 5 candidates, weighted scoring, JD-based evaluation

Candidate Compare

Data-driven candidate selection based on interview performance.

Safety Contract

When to use: After interviewing multiple candidates for the same role, compare them objectively

Prerequisites:

fathom-fetch-meetings — Get interview transcripts

workable-fetch-jd — Pull job description for evaluation criteria

Output: Ranked candidates with scores, strengths/gaps analysis, hiring recommendation

Workflow

Step 1: Parse Input

Expected format:

User: "compare Sarah, John, Ahmed for Product Manager"

Extract:

Candidate names: ["Sarah", "John", "Ahmed"]
Role: Product Manager

Alternative triggers:

"select best candidate for [role]" → ask for names
"who should we hire for [role]" → ask for names
"rank candidates" → ask for role and names

Step 2: Fetch Job Description

python3 03-skills/workable-fetch-jd/scripts/fetch_jd.py --job "[role]"

Extract key requirements from JD:

Required experience (years, type)
Technical skills
Soft skills
Nice-to-haves

Step 3: Fetch Transcripts

For each candidate, search Fathom:

# Search for meetings containing candidate name
curl -s --request GET \
  --url 'https://api.fathom.ai/external/v1/meetings?include_summary=true&include_transcript=true' \
  --header 'X-Api-Key: {FATHOM_API_KEY}' \
  | jq '.items[] | select(.title | ascii_downcase | contains("{candidate_name}"))'

If multiple meetings found for a candidate, pick the most recent or ask user to select.

Fallback: If Fathom unavailable, ask user to paste transcripts.

Step 4: Evaluate Each Candidate

Score each candidate on JD requirements:

Scoring: 1-5 scale per dimension

5: Exceeds requirements
4: Meets requirements strongly
3: Meets requirements adequately
2: Partially meets requirements
1: Does not meet requirements

Step 4b: Barrel vs Ammunition Classification (REQUIRED)

The 7 indicators (look for evidence in transcripts, project stories, summaries):

Ownership language test: Count "I shipped" / "I owned" / "I convinced" vs "we did" / "the team" / "I was asked." Heavy "we"-framing without specifics is an Ammunition tell.

Classification rules:

Evidence-or-flag rule: If an indicator can't be assessed from available data, write "insufficient evidence" — don't score it 0. Barrel classification is qualitative, not arithmetic.

Step 5: Compare & Rank

Calculate weighted JD-fit score for each candidate:

JD Fit = (Experience × 0.25) + (Technical × 0.25) + (Problem Solving × 0.20) + (Communication × 0.15) + (Culture × 0.15)

Output Format

This is THE format. Do not deviate without explicit user instruction.

# Candidate Comparison: [Role]

## Summary

| Rank | Candidate | JD Fit | Barrel | Recommendation |
|------|-----------|--------|--------|----------------|
| 1 | [Name] | X.X/5 | [BARREL 🛢️ / POTENTIAL BARREL / Ammunition / Pass] | [verdict] |
| 2 | [Name] | X.X/5 | [classification] | [verdict] |

> **Override:** [Triggered — see Recommendation] OR [Not triggered. No candidate classifies as BARREL, so JD-fit rank stands.]

Always print the Override line, even when not triggered — it confirms the check ran.

---

## [Candidate Name] — JD Fit X.X/5 · [CLASSIFICATION]

> Header embeds classification inline. No separate "Detailed Analysis" wrapper section, no rank number prefix, no ⭐ — the rank lives in the Summary table.

**JD-Fit Scores:**
| Dimension | Score | Notes |
|-----------|-------|-------|
| Experience Match | /5 | [brief note] |
| Technical Skills | /5 | [brief note] |
| Problem Solving | /5 | [brief note] |
| Communication | /5 | [brief note] |
| Culture Fit | /5 | [brief note] |

**Barrel Assessment:**

| Indicator | Fired? | Evidence |
|-----------|--------|----------|
| Takes initiative | ✅ / ⚠️ / ❌ / — | "[quote or 1-line evidence; '—' = insufficient evidence, never score 0]" |
| Ships high-quality work | ✅ / ⚠️ / ❌ / — | "[evidence]" |
| Values speed | ✅ / ⚠️ / ❌ / — | "[evidence]" |
| Takes accountability | ✅ / ⚠️ / ❌ / — | "[evidence]" |
| Seen as a resource | ✅ / ⚠️ / ❌ / — | "[evidence]" |
| Recruits & motivates | ✅ / ⚠️ / ❌ / — | "[evidence]" |
| Handles adversity | ✅ / ⚠️ / ❌ / — | "[evidence]" |

**Barrel arc** (Understand → Ideate → Initiative → Recruit → Deliver): **[Yes / Partial / No]** — [1-sentence note on whether they can credibly narrate the full loop on a real project named in the transcript]

**Ownership language ratio:** [observation on "I shipped/owned/convinced" vs "we did / the team / I was assigned"]

**Why [CLASSIFICATION] not [next-tier-up or next-tier-down]:** [REQUIRED — 2–4 sentences explaining the classification boundary call. Name the structural tells, not just the indicator counts. This is the most important paragraph in the section.]

**Strengths:**
- [Strength 1 with evidence]
- [Strength 2 with evidence]

**Gaps/Risks:**
- [Gap 1 — and how to mitigate or what to probe in next round]

**Key Quote:**
> "[memorable quote from interview]"

---

## [Next Candidate Name] — JD Fit X.X/5 · [CLASSIFICATION]

[Same structure — repeat the full block per candidate]

---

## Recommendation

**Hire / Advance / Pass:** [Name(s) and decision]

**Rationale:** [2–3 sentences. MUST cite both JD-fit AND Barrel classification. If both candidates are non-Barrel, name the structural reason for the recommendation (e.g., service-vs-product mismatch, AI/ML gap), not just the score gap.]

**Barrel override:**
- If a candidate classifies as **BARREL** but ranks below on JD-fit, surface verbatim: *"[BARREL Name] ranked below [other] on JD-fit, but is a Barrel. Per Rabois, close them — find or shape a role. Do not let JD-fit alone make this call."*
- If no override triggers, write: *"No override. JD-fit rank stands."*

**Probing questions for next round** (if any candidate is POTENTIAL BARREL):
- [Specific question to test the missing arc step, e.g., "Walk me through a project you championed against resistance — original idea, who else you needed, what you'd do differently."]
- [Specific question to confirm or downgrade the classification]

**Next Steps:**
- [ ] [Action item 1]
- [ ] [Action item 2]

---

## Comparison Matrix (vs. JD)

| Requirement (from JD) | [Candidate 1] | [Candidate 2] | [Candidate 3] |
|-----------------------|---------------|---------------|---------------|
| [Req 1] | ✅ Strong | ⚠️ Partial | ❌ Gap |
| [Req 2] | ✅ Strong | ✅ Strong | ⚠️ Partial |
| [Req 3] | ⚠️ Partial | ✅ Strong | ✅ Strong |

---

## Notes (optional but recommended)

- Use this section for structural observations: interviewer consistency, transcript quality caveats, hypotheses about why disqualified candidates were rejected vs. what the screening surfaced, etc.
- Example: "Both screens by Rabia Zaki — interviewer consistency holds." or "Adeel's summary is ~50% longer than Hannan's — could reflect substance or just verbosity; don't over-weight."

Triggers

Integration

Example Flow:

Manager: "compare Sarah Chen, John Smith, Ahmed Hassan for Product Manager"
→ AI fetches Product Manager JD from Workable
→ AI searches Fathom for "Sarah Chen", "John Smith", "Ahmed Hassan"
→ AI evaluates each against JD requirements
→ AI outputs ranked comparison with recommendation

Edge Cases

Bias Mitigation

Evaluate all candidates against the same criteria (JD requirements)
Use specific evidence from transcripts, not impressions
Flag when a dimension lacks data: "Insufficient evidence to score"
Present trade-offs, not just a winner

Version: 1.2 | Created: 2026-02-06 | Updated: 2026-06-12

Changelog:

v1.2: Hardcoded output format. Per-candidate header now embeds classification inline (## Name — JD Fit X.X/5 · CLASSIFICATION), no rank prefix or ⭐. Override line always printed under Summary (even when not triggered). Required "Why [CLASSIFICATION] not [next-tier]:" rationale paragraph per candidate. Probing-questions block in Recommendation when any candidate is POTENTIAL BARREL. Notes section for structural observations.
v1.1: Added Barrel vs Ammunition classification (Keith Rabois framework) as a parallel decision dimension to JD-fit. 4-tier tag (BARREL / POTENTIAL BARREL / AMMUNITION / PASS), 7-indicator pattern match + Barrel-arc test, "close on sight" override when BARREL underperforms on JD-fit.
v1.0: Initial creation — compare up to 5 candidates, weighted scoring, JD-based evaluation

Adoption

beam-ai-team/candidate-compare

$ install --global

Security Scan Results

SKILL.md

Candidate Compare

Safety Contract

Workflow

Step 1: Parse Input

Step 2: Fetch Job Description

Step 3: Fetch Transcripts

Step 4: Evaluate Each Candidate

Step 4b: Barrel vs Ammunition Classification (REQUIRED)

Step 5: Compare & Rank

Output Format

Triggers

Integration

Edge Cases

Bias Mitigation

Related Skills

beam-ai-team/use-case-proposal

beam-ai-team/beam-figma-to-html-slides

beam-ai-team/beam-ai-slide-library

beam-ai-team/beam-ai-deck-templates

beam-ai-team/candidate-compare

$ install --global

Security Scan Results

SKILL.md

Candidate Compare

Safety Contract

Workflow

Step 1: Parse Input

Step 2: Fetch Job Description

Step 3: Fetch Transcripts

Step 4: Evaluate Each Candidate

Step 4b: Barrel vs Ammunition Classification (REQUIRED)

Step 5: Compare & Rank

Output Format

Triggers

Integration

Edge Cases

Bias Mitigation

Related Skills

beam-ai-team/use-case-proposal

beam-ai-team/beam-figma-to-html-slides

beam-ai-team/beam-ai-slide-library

beam-ai-team/beam-ai-deck-templates