<Purpose> Deep Interview implements Ouroboros-inspired Socratic questioning with mathematical ambiguity scoring. It replaces vague ideas with crystal-clear specifications by asking targeted questions that expose hidden assumptions, measuring clarity across weighted dimensions, and refusing to proceed until ambiguity drops below the resolved threshold for this run. The output feeds into a planning step: **deep-interview → prometheus (planning → execution)**, ensuring maximum clarity at every stage. </Purpose>

<Use_When>

User has a vague idea and wants thorough requirements gathering before execution
User says "deep interview", "interview me", "ask me everything", "don't assume", "make sure you understand"
User says "ouroboros", "socratic", "I have a vague idea", "not sure exactly what I want"
User wants to avoid "that's not what I meant" outcomes from autonomous execution
Task is complex enough that jumping to code would waste cycles on scope discovery
User wants mathematically-validated clarity before committing to execution </Use_When>

<Do_Not_Use_When>

User has a detailed, specific request with file paths, function names, or acceptance criteria -- execute directly
User wants to explore options or brainstorm -- use prometheus skill instead
User wants a quick fix or single change -- delegate to executor or sisyphus-junior
User says "just do it" or "skip the questions" -- respect their intent
User already has a PRD or plan file -- use prometheus or sisyphus with that plan </Do_Not_Use_When>

<Why_This_Exists> AI can build anything. The hard part is knowing what to build. Deep Interview applies Socratic methodology to iteratively expose assumptions and mathematically gate readiness, ensuring the AI has genuine clarity before spending execution cycles.

Inspired by the Ouroboros project which demonstrated that specification quality is the primary bottleneck in AI-assisted development. </Why_This_Exists>

<Execution_Policy>

Ask ONE question at a time -- never batch multiple questions
Target the WEAKEST clarity dimension with each question
Make weakest-dimension targeting explicit every round: name the weakest dimension, state its score/gap, and explain why the next question is aimed there
Gather codebase facts via explore agent BEFORE asking the user about them
For brownfield confirmation questions, cite the repo evidence that triggered the question (file path, symbol, or pattern) instead of asking the user to rediscover it
Score ambiguity after every answer -- display the score transparently
Keep prompt payloads budgeted: summarize or trim oversized initial context/history before composing question, scoring, spec, or handoff prompts
If the user's initial context is oversized, create a concise prompt-safe summary first and wait for that summary before ambiguity scoring, question generation, or downstream execution handoff
Do not proceed to execution until ambiguity ≤ the resolved threshold for this run
Allow early exit with a clear warning if ambiguity is still high
Persist interview state for resume across session interruptions
Challenge agents activate at specific round thresholds to shift perspective </Execution_Policy>

<Steps>

Phase 1: Initialize

Parse the user's idea from {{ARGUMENTS}}
Detect brownfield vs greenfield:
- Run explore agent: check if cwd has existing source code, package files, or git history
- If source files exist AND the user's idea references modifying/extending something: brownfield
- Otherwise: greenfield
For brownfield: Run explore agent to map relevant codebase areas, store as codebase_context 3.5. Load runtime settings:
- Read [$CLAUDE_CONFIG_DIR|~/.claude]/settings.json and ./.claude/settings.json (project overrides user)
- Resolve omt.deepInterview.ambiguityThreshold into <resolvedThreshold>; if it is undefined, use 0.2
- Derive <resolvedThresholdPercent> from <resolvedThreshold> and substitute both placeholders throughout the remaining instructions before continuing 3.6. Normalize oversized initial context before state init:
- Inspect the initial idea plus any pasted artifacts, logs, transcripts, or file excerpts for prompt-budget risk before writing state or generating the first question.
- If the initial context is oversized or likely to crowd out downstream prompts, produce a concise prompt-safe summary that preserves user intent, decisions, constraints, unknowns, cited files/symbols, and any explicit non-goals.
- Treat the summary as the canonical initial_idea and store the raw oversized material only as external/advisory context if it can be referenced safely; do not paste the raw oversized context into question-generation, ambiguity-scoring, spec-crystallization, or execution-handoff prompts.
- Wait until the summary exists before ambiguity scoring, weakest-dimension selection, brownfield exploration prompts, or any bridge to prometheus or sisyphus.
Initialize state by writing $OMT_DIR/deep-interview-active-state-{sessionId}.json via Write tool:

{
  "active": true,
  "current_phase": "deep-interview",
  "state": {
    "interview_id": "<uuid>",
    "type": "greenfield|brownfield",
    "initial_idea": "<prompt-safe initial-context summary or user input>",
    "initial_context_summary": "<summary if oversized, else null>",
    "rounds": [],
    "current_ambiguity": 1.0,
    "threshold": <resolvedThreshold>,
    "codebase_context": null,
    "challenge_modes_used": [],
    "ontology_snapshots": []
  }
}

Announce the interview to the user:

Starting deep interview. I'll ask targeted questions to understand your idea thoroughly before building anything. After each answer, I'll show your clarity score. We'll proceed to execution once ambiguity drops below <resolvedThresholdPercent>.

Your idea: "{initial_idea}" Project type: {greenfield|brownfield} Current ambiguity: 100% (we haven't started yet)

Phase 2: Interview Loop

Repeat until ambiguity ≤ threshold OR user exits early:

Step 2a: Generate Next Question

Build the question generation prompt with:

The prompt-safe initial-context summary (if one was created), otherwise the user's original idea
Prior Q&A rounds trimmed or summarized to fit the prompt budget while preserving decisions, constraints, unresolved gaps, and ontology changes
Current clarity scores per dimension (which is weakest?)
Challenge agent mode (if activated -- see Phase 3)
Brownfield codebase context (if applicable), summarized to cited paths/symbols/patterns instead of raw dumps

If any prompt input is too large, summarize it first and then continue from the summary. Do not ask the next AskUserQuestion, score ambiguity, or hand off to execution from an over-budget raw transcript.

Question targeting strategy:

Identify the dimension with the LOWEST clarity score
Generate a question that specifically improves that dimension
State, in one sentence before the question, why this dimension is now the bottleneck to reducing ambiguity
Questions should expose ASSUMPTIONS, not gather feature lists
If the scope is still conceptually fuzzy (entities keep shifting, the user is naming symptoms, or the core noun is unstable), switch to an ontology-style question that asks what the thing fundamentally IS before returning to feature/detail questions

Question styles by dimension: | Dimension | Question Style | Example | |-----------|---------------|---------| | Goal Clarity | "What exactly happens when...?" | "When you say 'manage tasks', what specific action does a user take first?" | | Constraint Clarity | "What are the boundaries?" | "Should this work offline, or is internet connectivity assumed?" | | Success Criteria | "How do we know it works?" | "If I showed you the finished product, what would make you say 'yes, that's it'?" | | Context Clarity (brownfield) | "How does this fit?" | "I found JWT auth middleware in src/auth/ (pattern: passport + JWT). Should this feature extend that path or intentionally diverge from it?" | | Scope-fuzzy / ontology stress | "What IS the core thing here?" | "You have named Tasks, Projects, and Workspaces across the last rounds. Which one is the core entity, and which are supporting views or containers?" |

Step 2b: Ask the Question

Use AskUserQuestion with the generated question. Present it clearly with the current ambiguity context:

Round {n} | Targeting: {weakest_dimension} | Why now: {one_sentence_targeting_rationale} | Ambiguity: {score}%

{question}

Options should include contextually relevant choices plus free-text.

Step 2c: Score Ambiguity

After receiving the user's answer, score clarity across all dimensions.

Scoring prompt (temperature 0.1 for consistency):

Given the following interview transcript for a {greenfield|brownfield} project, score clarity on each dimension from 0.0 to 1.0. If the initial context or transcript was summarized for prompt safety, score from that summary plus the preserved round decisions/gaps; do not re-expand raw oversized context.

Original idea or prompt-safe initial-context summary: {idea_or_initial_context_summary}

Transcript or prompt-safe transcript summary:
{all rounds Q&A or summarized transcript}

Score each dimension:
1. Goal Clarity (0.0-1.0): Is the primary objective unambiguous? Can you state it in one sentence without qualifiers? Can you name the key entities (nouns) and their relationships (verbs) without ambiguity?
2. Constraint Clarity (0.0-1.0): Are the boundaries, limitations, and non-goals clear?
3. Success Criteria Clarity (0.0-1.0): Could you write a test that verifies success? Are acceptance criteria concrete?
{4. Context Clarity (0.0-1.0): [brownfield only] Do we understand the existing system well enough to modify it safely? Do the identified entities map cleanly to existing codebase structures?}

For each dimension provide:
- score: float (0.0-1.0)
- justification: one sentence explaining the score
- gap: what's still unclear (if score < 0.9)

Also identify:
- weakest_dimension: the single lowest-confidence dimension this round
- weakest_dimension_rationale: one sentence explaining why it is the highest-leverage target for the next question

5. Ontology Extraction: Identify all key entities (nouns) discussed in the transcript.

{If round > 1, inject: "Previous round's entities: {prior_entities_json from state.ontology_snapshots[-1]}. REUSE these entity names where the concept is the same. Only introduce new names for genuinely new concepts."}

For each entity provide:
- name: string (the entity name, e.g., "User", "Order", "PaymentMethod")
- type: string (e.g., "core domain", "supporting", "external system")
- fields: string[] (key attributes mentioned)
- relationships: string[] (e.g., "User has many Orders")

Respond as JSON. Include an additional "ontology" key containing the entities array alongside the dimension scores.

Calculate ambiguity:

Greenfield: ambiguity = 1 - (goal × 0.40 + constraints × 0.30 + criteria × 0.30) Brownfield: ambiguity = 1 - (goal × 0.35 + constraints × 0.25 + criteria × 0.25 + context × 0.15)

Calculate ontology stability:

Round 1 special case: For the first round, skip stability comparison. All entities are "new". Set stability_ratio = N/A. If any round produces zero entities, set stability_ratio = N/A (avoids division by zero).

For rounds 2+, compare with the previous round's entity list:

stable_entities: entities present in both rounds with the same name
changed_entities: entities with different names but the same type AND >50% field overlap (treated as renamed, not new+removed)
new_entities: entities in this round not matched by name or fuzzy-match to any previous entity
removed_entities: entities in the previous round not matched to any current entity
stability_ratio: (stable + changed) / total_entities (0.0 to 1.0, where 1.0 = fully converged)

This formula counts renamed entities (changed) toward stability. Renamed entities indicate the concept persists even if the name shifted — this is convergence, not instability. Two entities with different names but the same type and >50% field overlap should be classified as "changed" (renamed), not as one removed and one added.

Show your work: Before reporting stability numbers, briefly list which entities were matched (by name or fuzzy) and which are new/removed. This lets the user sanity-check the matching.

Store the ontology snapshot (entities + stability_ratio + matching_reasoning) in state.ontology_snapshots[].

Step 2d: Report Progress

After scoring, show the user their progress:

Round {n} complete.

| Dimension | Score | Weight | Weighted | Gap |
|-----------|-------|--------|----------|-----|
| Goal | {s} | {w} | {s*w} | {gap or "Clear"} |
| Constraints | {s} | {w} | {s*w} | {gap or "Clear"} |
| Success Criteria | {s} | {w} | {s*w} | {gap or "Clear"} |
| Context (brownfield) | {s} | {w} | {s*w} | {gap or "Clear"} |
| **Ambiguity** | | | **{score}%** | |

**Ontology:** {entity_count} entities | Stability: {stability_ratio} | New: {new} | Changed: {changed} | Stable: {stable}

**Next target:** {weakest_dimension} — {weakest_dimension_rationale}

{score <= threshold ? "Clarity threshold met! Ready to proceed." : "Focusing next question on: {weakest_dimension}"}

Step 2e: Update State

Update interview state with the new round and scores by writing $OMT_DIR/deep-interview-active-state-{sessionId}.json via Write tool.

Step 2f: Check Soft Limits

Round 3+: Allow early exit if user says "enough", "let's go", "build it"
Round 10: Show soft warning: "We're at 10 rounds. Current ambiguity: {score}%. Continue or proceed with current clarity?"
Round 20: Hard cap: "Maximum interview rounds reached. Proceeding with current clarity level ({score}%)."

Phase 3: Challenge Agents

At specific round thresholds, shift the questioning perspective:

Round 4+: Contrarian Mode

Inject into the question generation prompt:

You are now in CONTRARIAN mode. Your next question should challenge the user's core assumption. Ask "What if the opposite were true?" or "What if this constraint doesn't actually exist?" The goal is to test whether the user's framing is correct or just habitual.

Round 6+: Simplifier Mode

Inject into the question generation prompt:

You are now in SIMPLIFIER mode. Your next question should probe whether complexity can be removed. Ask "What's the simplest version that would still be valuable?" or "Which of these constraints are actually necessary vs. assumed?" The goal is to find the minimal viable specification.

Round 8+: Ontologist Mode (if ambiguity still > 0.3)

Inject into the question generation prompt:

You are now in ONTOLOGIST mode. The ambiguity is still high after 8 rounds, suggesting we may be addressing symptoms rather than the core problem. The tracked entities so far are: {current_entities_summary from latest ontology snapshot}. Ask "What IS this, really?" or "Looking at these entities, which one is the CORE concept and which are just supporting?" The goal is to find the essence by examining the ontology.

Challenge modes are used ONCE each, then return to normal Socratic questioning. Track which modes have been used in state.

Phase 4: Crystallize Spec

When ambiguity ≤ threshold (or hard cap / early exit):

Generate the specification with the prompt-safe transcript. If the full interview transcript or initial context is too large, include the summary plus all concrete decisions, acceptance criteria, unresolved gaps, and ontology snapshots; never overflow the prompt with raw oversized context.
Write to file: $OMT_DIR/deep-interview/{slug}.md

Spec structure:

# Deep Interview Spec: {title}

## Metadata
- Interview ID: {uuid}
- Rounds: {count}
- Final Ambiguity Score: {score}%
- Type: greenfield | brownfield
- Generated: {timestamp}
- Threshold: {threshold}
- Initial Context Summarized: {yes|no}
- Status: {PASSED | BELOW_THRESHOLD_EARLY_EXIT}

## Clarity Breakdown
| Dimension | Score | Weight | Weighted |
|-----------|-------|--------|----------|
| Goal Clarity | {s} | {w} | {s*w} |
| Constraint Clarity | {s} | {w} | {s*w} |
| Success Criteria | {s} | {w} | {s*w} |
| Context Clarity | {s} | {w} | {s*w} |
| **Total Clarity** | | | **{total}** |
| **Ambiguity** | | | **{1-total}** |

## Goal
{crystal-clear goal statement derived from interview}

## Constraints
- {constraint 1}
- {constraint 2}
- ...

## Non-Goals
- {explicitly excluded scope 1}
- {explicitly excluded scope 2}

## Acceptance Criteria
- [ ] {testable criterion 1}
- [ ] {testable criterion 2}
- [ ] {testable criterion 3}
- ...

## Assumptions Exposed & Resolved
| Assumption | Challenge | Resolution |
|------------|-----------|------------|
| {assumption} | {how it was questioned} | {what was decided} |

## Technical Context
{brownfield: relevant codebase findings from explore agent}
{greenfield: technology choices and constraints}

## Ontology (Key Entities)
{Fill from the FINAL round's ontology extraction, not just crystallization-time generation}

| Entity | Type | Fields | Relationships |
|--------|------|--------|---------------|
| {entity.name} | {entity.type} | {entity.fields} | {entity.relationships} |

## Ontology Convergence
{Show how entities stabilized across interview rounds using data from ontology_snapshots in state}

| Round | Entity Count | New | Changed | Stable | Stability Ratio |
|-------|-------------|-----|---------|--------|----------------|
| 1 | {n} | {n} | - | - | - |
| 2 | {n} | {new} | {changed} | {stable} | {ratio}% |
| ... | ... | ... | ... | ... | ... |
| {final} | {n} | {new} | {changed} | {stable} | {ratio}% |

## Interview Transcript
<details>
<summary>Full Q&A ({n} rounds)</summary>

### Round 1
**Q:** {question}
**A:** {answer}
**Ambiguity:** {score}% (Goal: {g}, Constraints: {c}, Criteria: {cr})

...
</details>

Emit the handoff token in the final assistant message before proceeding to Phase 5. The literal token <deep-interview-done/> must appear in the assistant turn that announces spec completion. This signals downstream hooks that the interview phase is complete and state cleanup may proceed.

Phase 5: Execution Bridge

After the spec is written, present execution options via AskUserQuestion:

Question: "Your spec is ready (ambiguity: {score}%). How would you like to proceed?"

Options:

Plan with prometheus (Recommended)
- Description: "Generate a detailed work plan from this spec with TODO decomposition, wave parallelization, and QA scenarios. Best for complex specs."
- Action: Invoke Skill(skill: "prometheus") with the spec file path as context. Prometheus will handle the plan → execution pipeline from there.
Continue interviewing
- Description: "Continue interviewing to improve clarity (current: {score}%)"
- Action: Return to Phase 2 interview loop.

IMPORTANT: On execution selection, MUST invoke the chosen skill via Skill(). Do NOT implement directly. The deep-interview agent is a requirements agent, not an execution agent. If oversized initial context was summarized, pass the spec and prompt-safe summary forward, not the raw oversized source material.

</Steps>

<Tool_Usage>

Use AskUserQuestion for each interview question — provides clickable UI with contextual options
Use Agent(subagent_type="explore") for brownfield codebase exploration (run BEFORE asking user about codebase)
Use temperature 0.1 for ambiguity scoring — consistency is critical
Use Write tool to write and update interview state at $OMT_DIR/deep-interview-active-state-{sessionId}.json
Use Read tool to read back state when resuming an interrupted session
Use Write tool to save the final spec to $OMT_DIR/deep-interview/{slug}.md
Use Skill() to bridge to execution modes — never implement directly
Challenge agent modes are prompt injections, not separate agent spawns </Tool_Usage>

<Examples> <Good> Targeting weakest dimension: ``` Scores: Goal=0.9, Constraints=0.4, Criteria=0.7 Next question targets Constraints (lowest at 0.4): "You mentioned this should 'work on mobile'. Does that mean a native app, a responsive web app, or a PWA? And are there specific devices or OS versions you need to support?" ``` Why good: Identifies weakest dimension, explains why it is now the bottleneck, asks a specific question to improve it, and doesn't batch multiple topics. </Good> <Good> Gathering codebase facts before asking: ``` [spawns explore agent: "find authentication implementation"] [receives: "Auth is in src/auth/ using JWT with passport.js"]

Question: "I found JWT authentication with passport.js in src/auth/ (pattern match from explore). For this new feature, should we extend the existing auth middleware or create a separate authentication flow?"

Why good: Explored first, cited the repo evidence that triggered the question, then asked an informed confirmation question. Never asks the user what the code already reveals.
</Good>

<Good>
Contrarian mode activation:

Round 5 | Contrarian Mode | Ambiguity: 42%

You've said this needs to support 10,000 concurrent users. What if it only needed to handle 100? Would the architecture change fundamentally, or is the 10K number an assumption rather than a measured requirement?

Why good: Challenges a specific assumption (scale requirement) that could dramatically simplify the solution.
</Good>

<Good>
Early exit with warning:

User: "That's enough, just build it" System: "Current ambiguity is 35% (threshold: <resolvedThresholdPercent>). Areas still unclear:

Success Criteria: 0.5 (How do we verify the search ranking algorithm works correctly?)
Constraints: 0.6 (No performance targets defined yet)

Proceeding may require rework. Continue anyway?" [Yes, proceed] [Ask 2-3 more questions] [Cancel]

Why good: Respects user's desire to stop but transparently shows the risk.
</Good>

<Good>
Ontology convergence tracking:

Round 3 entities: User, Task, Project (stability: N/A → 67%) Round 4 entities: User, Task, Project, Tag (stability: 75% — 3 stable, 1 new) Round 5 entities: User, Task, Project, Tag (stability: 100% — all 4 stable)

"Ontology has converged — the same 4 entities appeared in 2 consecutive rounds with no changes. The domain model is stable."

Why good: Shows entity tracking across rounds with visible convergence. Stability ratio increases as the domain model solidifies, giving mathematical evidence that the interview is converging on a stable understanding.
</Good>

<Good>
Ontology-style question for scope-fuzzy tasks:

Round 6 | Targeting: Goal Clarity | Why now: the core entity is still unstable across rounds, so feature questions would compound ambiguity | Ambiguity: 38%

"Across the last rounds you've described this as a workflow, an inbox, and a planner. Which one is the core thing this product IS, and which ones are supporting metaphors or views?"

Why good: Uses ontology-style questioning to stabilize the core noun before drilling into features, which is the right move when the scope is fuzzy rather than merely incomplete.
</Good>

<Bad>
Batching multiple questions:

"What's the target audience? And what tech stack? And how should auth work? Also, what's the deployment target?"

Why bad: Four questions at once — causes shallow answers and makes scoring inaccurate.
</Bad>

<Bad>
Asking about codebase facts:

"What database does your project use?"

Why bad: Should have spawned explore agent to find this. Never ask the user what the code already tells you.
</Bad>

<Bad>
Proceeding despite high ambiguity:

"Ambiguity is at 45% but we've done 5 rounds, so let's start building."

Why bad: 45% ambiguity means nearly half the requirements are unclear. The mathematical gate exists to prevent exactly this.
</Bad>
</Examples>

<Escalation_And_Stop_Conditions>
- **Hard cap at 20 rounds**: Proceed with whatever clarity exists, noting the risk
- **Soft warning at 10 rounds**: Offer to continue or proceed
- **Early exit (round 3+)**: Allow with warning if ambiguity > threshold
- **User says "stop", "cancel", "abort"**: Stop immediately, save state for resume
- **Ambiguity stalls** (same score +-0.05 for 3 rounds): Activate Ontologist mode to reframe
- **All dimensions at 0.9+**: Skip to spec generation even if not at round minimum
- **Codebase exploration fails**: Proceed as greenfield, note the limitation
</Escalation_And_Stop_Conditions>

<Final_Checklist>
- [ ] Interview completed (ambiguity ≤ threshold OR user chose early exit)
- [ ] Oversized initial context/history was summarized before scoring, question generation, spec generation, or execution handoff
- [ ] Ambiguity score displayed after every round
- [ ] Every round explicitly names the weakest dimension and why it is the next target
- [ ] Challenge agents activated at correct thresholds (round 4, 6, 8)
- [ ] Spec file written to `$OMT_DIR/deep-interview/{slug}.md`
- [ ] Spec includes: goal, constraints, acceptance criteria, clarity breakdown, transcript
- [ ] Token `<deep-interview-done/>` emitted in the final assistant message before handoff
- [ ] Execution bridge presented via AskUserQuestion
- [ ] Selected execution mode invoked via Skill() (never direct implementation)
- [ ] State cleaned up after execution handoff
- [ ] Brownfield confirmation questions cite repo evidence (file/path/pattern) before asking the user to decide
- [ ] Scope-fuzzy tasks can trigger ontology-style questioning to stabilize the core entity before feature elaboration
- [ ] Per-round ambiguity report includes Ontology row with entity count and stability ratio
- [ ] Spec includes Ontology (Key Entities) table and Ontology Convergence section
</Final_Checklist>

<Advanced>
## Configuration

Optional settings in `.claude/settings.json`:

```json
{
  "omt": {
    "deepInterview": {
      "ambiguityThreshold": <resolvedThreshold>
    }
  }
}

Resume

If interrupted, run /deep-interview again. The skill reads state from $OMT_DIR/deep-interview-active-state-{sessionId}.json via Read tool and resumes from the last completed round.

Integration with Prometheus

When prometheus receives a vague input (no file paths, function names, or concrete anchors), it can redirect to deep-interview:

User: "prometheus build me a thing"
Prometheus: "Your request is quite open-ended. Would you like to run a deep interview first to clarify requirements?"
  [Yes, interview first] [No, expand directly]

If the user chooses interview, prometheus invokes /deep-interview. When the interview completes and the user selects "Plan with prometheus", the spec becomes Phase 0 output and prometheus continues from planning.

Brownfield vs Greenfield Weights

| Dimension | Greenfield | Brownfield | |-----------|-----------|------------| | Goal Clarity | 40% | 35% | | Constraint Clarity | 30% | 25% | | Success Criteria | 30% | 25% | | Context Clarity | N/A | 15% |

Brownfield adds Context Clarity because modifying existing code safely requires understanding the system being changed.

Challenge Agent Modes

| Mode | Activates | Purpose | Prompt Injection | |------|-----------|---------|-----------------| | Contrarian | Round 4+ | Challenge assumptions | "What if the opposite were true?" | | Simplifier | Round 6+ | Remove complexity | "What's the simplest version?" | | Ontologist | Round 8+ (if ambiguity > 0.3) | Find essence | "What IS this, really?" |

Each mode is used exactly once, then normal Socratic questioning resumes. Modes are tracked in state to prevent repetition.

Ambiguity Score Interpretation

| Score Range | Meaning | Action | |-------------|---------|--------| | 0.0 - 0.1 | Crystal clear | Proceed immediately | | At or below the resolved threshold | Clear enough | Proceed | | Above the resolved threshold with minor gaps | Some gaps | Continue interviewing | | Moderate ambiguity | Significant gaps | Focus on weakest dimensions | | High ambiguity | Very unclear | May need reframing (Ontologist) | | Extreme ambiguity | Almost nothing known | Early stages, keep going | </Advanced>

Task: {{ARGUMENTS}}

<Use_When>

User has a vague idea and wants thorough requirements gathering before execution
User says "deep interview", "interview me", "ask me everything", "don't assume", "make sure you understand"
User says "ouroboros", "socratic", "I have a vague idea", "not sure exactly what I want"
User wants to avoid "that's not what I meant" outcomes from autonomous execution
Task is complex enough that jumping to code would waste cycles on scope discovery
User wants mathematically-validated clarity before committing to execution </Use_When>

<Do_Not_Use_When>

User has a detailed, specific request with file paths, function names, or acceptance criteria -- execute directly
User wants to explore options or brainstorm -- use prometheus skill instead
User wants a quick fix or single change -- delegate to executor or sisyphus-junior
User says "just do it" or "skip the questions" -- respect their intent
User already has a PRD or plan file -- use prometheus or sisyphus with that plan </Do_Not_Use_When>

Inspired by the Ouroboros project which demonstrated that specification quality is the primary bottleneck in AI-assisted development. </Why_This_Exists>

<Execution_Policy>

Ask ONE question at a time -- never batch multiple questions
Target the WEAKEST clarity dimension with each question
Make weakest-dimension targeting explicit every round: name the weakest dimension, state its score/gap, and explain why the next question is aimed there
Gather codebase facts via explore agent BEFORE asking the user about them
For brownfield confirmation questions, cite the repo evidence that triggered the question (file path, symbol, or pattern) instead of asking the user to rediscover it
Score ambiguity after every answer -- display the score transparently
Keep prompt payloads budgeted: summarize or trim oversized initial context/history before composing question, scoring, spec, or handoff prompts
If the user's initial context is oversized, create a concise prompt-safe summary first and wait for that summary before ambiguity scoring, question generation, or downstream execution handoff
Do not proceed to execution until ambiguity ≤ the resolved threshold for this run
Allow early exit with a clear warning if ambiguity is still high
Persist interview state for resume across session interruptions
Challenge agents activate at specific round thresholds to shift perspective </Execution_Policy>

<Steps>

Phase 1: Initialize

Parse the user's idea from {{ARGUMENTS}}
Detect brownfield vs greenfield:
- Run explore agent: check if cwd has existing source code, package files, or git history
- If source files exist AND the user's idea references modifying/extending something: brownfield
- Otherwise: greenfield
For brownfield: Run explore agent to map relevant codebase areas, store as codebase_context 3.5. Load runtime settings:
- Read [$CLAUDE_CONFIG_DIR|~/.claude]/settings.json and ./.claude/settings.json (project overrides user)
- Resolve omt.deepInterview.ambiguityThreshold into <resolvedThreshold>; if it is undefined, use 0.2
- Derive <resolvedThresholdPercent> from <resolvedThreshold> and substitute both placeholders throughout the remaining instructions before continuing 3.6. Normalize oversized initial context before state init:
- Inspect the initial idea plus any pasted artifacts, logs, transcripts, or file excerpts for prompt-budget risk before writing state or generating the first question.
- If the initial context is oversized or likely to crowd out downstream prompts, produce a concise prompt-safe summary that preserves user intent, decisions, constraints, unknowns, cited files/symbols, and any explicit non-goals.
- Treat the summary as the canonical initial_idea and store the raw oversized material only as external/advisory context if it can be referenced safely; do not paste the raw oversized context into question-generation, ambiguity-scoring, spec-crystallization, or execution-handoff prompts.
- Wait until the summary exists before ambiguity scoring, weakest-dimension selection, brownfield exploration prompts, or any bridge to prometheus or sisyphus.
Initialize state by writing $OMT_DIR/deep-interview-active-state-{sessionId}.json via Write tool:

{
  "active": true,
  "current_phase": "deep-interview",
  "state": {
    "interview_id": "<uuid>",
    "type": "greenfield|brownfield",
    "initial_idea": "<prompt-safe initial-context summary or user input>",
    "initial_context_summary": "<summary if oversized, else null>",
    "rounds": [],
    "current_ambiguity": 1.0,
    "threshold": <resolvedThreshold>,
    "codebase_context": null,
    "challenge_modes_used": [],
    "ontology_snapshots": []
  }
}

Announce the interview to the user:

Starting deep interview. I'll ask targeted questions to understand your idea thoroughly before building anything. After each answer, I'll show your clarity score. We'll proceed to execution once ambiguity drops below <resolvedThresholdPercent>.

Your idea: "{initial_idea}" Project type: {greenfield|brownfield} Current ambiguity: 100% (we haven't started yet)

Phase 2: Interview Loop

Repeat until ambiguity ≤ threshold OR user exits early:

Step 2a: Generate Next Question

Build the question generation prompt with:

The prompt-safe initial-context summary (if one was created), otherwise the user's original idea
Prior Q&A rounds trimmed or summarized to fit the prompt budget while preserving decisions, constraints, unresolved gaps, and ontology changes
Current clarity scores per dimension (which is weakest?)
Challenge agent mode (if activated -- see Phase 3)
Brownfield codebase context (if applicable), summarized to cited paths/symbols/patterns instead of raw dumps

Question targeting strategy:

Identify the dimension with the LOWEST clarity score
Generate a question that specifically improves that dimension
State, in one sentence before the question, why this dimension is now the bottleneck to reducing ambiguity
Questions should expose ASSUMPTIONS, not gather feature lists
If the scope is still conceptually fuzzy (entities keep shifting, the user is naming symptoms, or the core noun is unstable), switch to an ontology-style question that asks what the thing fundamentally IS before returning to feature/detail questions

Step 2b: Ask the Question

Use AskUserQuestion with the generated question. Present it clearly with the current ambiguity context:

Round {n} | Targeting: {weakest_dimension} | Why now: {one_sentence_targeting_rationale} | Ambiguity: {score}%

{question}

Options should include contextually relevant choices plus free-text.

Step 2c: Score Ambiguity

After receiving the user's answer, score clarity across all dimensions.

Scoring prompt (temperature 0.1 for consistency):

Given the following interview transcript for a {greenfield|brownfield} project, score clarity on each dimension from 0.0 to 1.0. If the initial context or transcript was summarized for prompt safety, score from that summary plus the preserved round decisions/gaps; do not re-expand raw oversized context.

Original idea or prompt-safe initial-context summary: {idea_or_initial_context_summary}

Transcript or prompt-safe transcript summary:
{all rounds Q&A or summarized transcript}

Score each dimension:
1. Goal Clarity (0.0-1.0): Is the primary objective unambiguous? Can you state it in one sentence without qualifiers? Can you name the key entities (nouns) and their relationships (verbs) without ambiguity?
2. Constraint Clarity (0.0-1.0): Are the boundaries, limitations, and non-goals clear?
3. Success Criteria Clarity (0.0-1.0): Could you write a test that verifies success? Are acceptance criteria concrete?
{4. Context Clarity (0.0-1.0): [brownfield only] Do we understand the existing system well enough to modify it safely? Do the identified entities map cleanly to existing codebase structures?}

For each dimension provide:
- score: float (0.0-1.0)
- justification: one sentence explaining the score
- gap: what's still unclear (if score < 0.9)

Also identify:
- weakest_dimension: the single lowest-confidence dimension this round
- weakest_dimension_rationale: one sentence explaining why it is the highest-leverage target for the next question

5. Ontology Extraction: Identify all key entities (nouns) discussed in the transcript.

{If round > 1, inject: "Previous round's entities: {prior_entities_json from state.ontology_snapshots[-1]}. REUSE these entity names where the concept is the same. Only introduce new names for genuinely new concepts."}

For each entity provide:
- name: string (the entity name, e.g., "User", "Order", "PaymentMethod")
- type: string (e.g., "core domain", "supporting", "external system")
- fields: string[] (key attributes mentioned)
- relationships: string[] (e.g., "User has many Orders")

Respond as JSON. Include an additional "ontology" key containing the entities array alongside the dimension scores.

Calculate ambiguity:

Greenfield: ambiguity = 1 - (goal × 0.40 + constraints × 0.30 + criteria × 0.30) Brownfield: ambiguity = 1 - (goal × 0.35 + constraints × 0.25 + criteria × 0.25 + context × 0.15)

Calculate ontology stability:

For rounds 2+, compare with the previous round's entity list:

stable_entities: entities present in both rounds with the same name
changed_entities: entities with different names but the same type AND >50% field overlap (treated as renamed, not new+removed)
new_entities: entities in this round not matched by name or fuzzy-match to any previous entity
removed_entities: entities in the previous round not matched to any current entity
stability_ratio: (stable + changed) / total_entities (0.0 to 1.0, where 1.0 = fully converged)

Show your work: Before reporting stability numbers, briefly list which entities were matched (by name or fuzzy) and which are new/removed. This lets the user sanity-check the matching.

Store the ontology snapshot (entities + stability_ratio + matching_reasoning) in state.ontology_snapshots[].

Step 2d: Report Progress

After scoring, show the user their progress:

Round {n} complete.

| Dimension | Score | Weight | Weighted | Gap |
|-----------|-------|--------|----------|-----|
| Goal | {s} | {w} | {s*w} | {gap or "Clear"} |
| Constraints | {s} | {w} | {s*w} | {gap or "Clear"} |
| Success Criteria | {s} | {w} | {s*w} | {gap or "Clear"} |
| Context (brownfield) | {s} | {w} | {s*w} | {gap or "Clear"} |
| **Ambiguity** | | | **{score}%** | |

**Ontology:** {entity_count} entities | Stability: {stability_ratio} | New: {new} | Changed: {changed} | Stable: {stable}

**Next target:** {weakest_dimension} — {weakest_dimension_rationale}

{score <= threshold ? "Clarity threshold met! Ready to proceed." : "Focusing next question on: {weakest_dimension}"}

Step 2e: Update State

Update interview state with the new round and scores by writing $OMT_DIR/deep-interview-active-state-{sessionId}.json via Write tool.

Step 2f: Check Soft Limits

Round 3+: Allow early exit if user says "enough", "let's go", "build it"
Round 10: Show soft warning: "We're at 10 rounds. Current ambiguity: {score}%. Continue or proceed with current clarity?"
Round 20: Hard cap: "Maximum interview rounds reached. Proceeding with current clarity level ({score}%)."

Phase 3: Challenge Agents

At specific round thresholds, shift the questioning perspective:

Round 4+: Contrarian Mode

Inject into the question generation prompt:

You are now in CONTRARIAN mode. Your next question should challenge the user's core assumption. Ask "What if the opposite were true?" or "What if this constraint doesn't actually exist?" The goal is to test whether the user's framing is correct or just habitual.

Round 6+: Simplifier Mode

Inject into the question generation prompt:

You are now in SIMPLIFIER mode. Your next question should probe whether complexity can be removed. Ask "What's the simplest version that would still be valuable?" or "Which of these constraints are actually necessary vs. assumed?" The goal is to find the minimal viable specification.

Round 8+: Ontologist Mode (if ambiguity still > 0.3)

Inject into the question generation prompt:

You are now in ONTOLOGIST mode. The ambiguity is still high after 8 rounds, suggesting we may be addressing symptoms rather than the core problem. The tracked entities so far are: {current_entities_summary from latest ontology snapshot}. Ask "What IS this, really?" or "Looking at these entities, which one is the CORE concept and which are just supporting?" The goal is to find the essence by examining the ontology.

Challenge modes are used ONCE each, then return to normal Socratic questioning. Track which modes have been used in state.

Phase 4: Crystallize Spec

When ambiguity ≤ threshold (or hard cap / early exit):

Generate the specification with the prompt-safe transcript. If the full interview transcript or initial context is too large, include the summary plus all concrete decisions, acceptance criteria, unresolved gaps, and ontology snapshots; never overflow the prompt with raw oversized context.
Write to file: $OMT_DIR/deep-interview/{slug}.md

Spec structure:

# Deep Interview Spec: {title}

## Metadata
- Interview ID: {uuid}
- Rounds: {count}
- Final Ambiguity Score: {score}%
- Type: greenfield | brownfield
- Generated: {timestamp}
- Threshold: {threshold}
- Initial Context Summarized: {yes|no}
- Status: {PASSED | BELOW_THRESHOLD_EARLY_EXIT}

## Clarity Breakdown
| Dimension | Score | Weight | Weighted |
|-----------|-------|--------|----------|
| Goal Clarity | {s} | {w} | {s*w} |
| Constraint Clarity | {s} | {w} | {s*w} |
| Success Criteria | {s} | {w} | {s*w} |
| Context Clarity | {s} | {w} | {s*w} |
| **Total Clarity** | | | **{total}** |
| **Ambiguity** | | | **{1-total}** |

## Goal
{crystal-clear goal statement derived from interview}

## Constraints
- {constraint 1}
- {constraint 2}
- ...

## Non-Goals
- {explicitly excluded scope 1}
- {explicitly excluded scope 2}

## Acceptance Criteria
- [ ] {testable criterion 1}
- [ ] {testable criterion 2}
- [ ] {testable criterion 3}
- ...

## Assumptions Exposed & Resolved
| Assumption | Challenge | Resolution |
|------------|-----------|------------|
| {assumption} | {how it was questioned} | {what was decided} |

## Technical Context
{brownfield: relevant codebase findings from explore agent}
{greenfield: technology choices and constraints}

## Ontology (Key Entities)
{Fill from the FINAL round's ontology extraction, not just crystallization-time generation}

| Entity | Type | Fields | Relationships |
|--------|------|--------|---------------|
| {entity.name} | {entity.type} | {entity.fields} | {entity.relationships} |

## Ontology Convergence
{Show how entities stabilized across interview rounds using data from ontology_snapshots in state}

| Round | Entity Count | New | Changed | Stable | Stability Ratio |
|-------|-------------|-----|---------|--------|----------------|
| 1 | {n} | {n} | - | - | - |
| 2 | {n} | {new} | {changed} | {stable} | {ratio}% |
| ... | ... | ... | ... | ... | ... |
| {final} | {n} | {new} | {changed} | {stable} | {ratio}% |

## Interview Transcript
<details>
<summary>Full Q&A ({n} rounds)</summary>

### Round 1
**Q:** {question}
**A:** {answer}
**Ambiguity:** {score}% (Goal: {g}, Constraints: {c}, Criteria: {cr})

...
</details>

Emit the handoff token in the final assistant message before proceeding to Phase 5. The literal token <deep-interview-done/> must appear in the assistant turn that announces spec completion. This signals downstream hooks that the interview phase is complete and state cleanup may proceed.

Phase 5: Execution Bridge

After the spec is written, present execution options via AskUserQuestion:

Question: "Your spec is ready (ambiguity: {score}%). How would you like to proceed?"

Options:

Plan with prometheus (Recommended)
- Description: "Generate a detailed work plan from this spec with TODO decomposition, wave parallelization, and QA scenarios. Best for complex specs."
- Action: Invoke Skill(skill: "prometheus") with the spec file path as context. Prometheus will handle the plan → execution pipeline from there.
Continue interviewing
- Description: "Continue interviewing to improve clarity (current: {score}%)"
- Action: Return to Phase 2 interview loop.

</Steps>

<Tool_Usage>

Use AskUserQuestion for each interview question — provides clickable UI with contextual options
Use Agent(subagent_type="explore") for brownfield codebase exploration (run BEFORE asking user about codebase)
Use temperature 0.1 for ambiguity scoring — consistency is critical
Use Write tool to write and update interview state at $OMT_DIR/deep-interview-active-state-{sessionId}.json
Use Read tool to read back state when resuming an interrupted session
Use Write tool to save the final spec to $OMT_DIR/deep-interview/{slug}.md
Use Skill() to bridge to execution modes — never implement directly
Challenge agent modes are prompt injections, not separate agent spawns </Tool_Usage>

Why good: Explored first, cited the repo evidence that triggered the question, then asked an informed confirmation question. Never asks the user what the code already reveals.
</Good>

<Good>
Contrarian mode activation:

Round 5 | Contrarian Mode | Ambiguity: 42%

Why good: Challenges a specific assumption (scale requirement) that could dramatically simplify the solution.
</Good>

<Good>
Early exit with warning:

User: "That's enough, just build it" System: "Current ambiguity is 35% (threshold: <resolvedThresholdPercent>). Areas still unclear:

Success Criteria: 0.5 (How do we verify the search ranking algorithm works correctly?)
Constraints: 0.6 (No performance targets defined yet)

Proceeding may require rework. Continue anyway?" [Yes, proceed] [Ask 2-3 more questions] [Cancel]

Why good: Respects user's desire to stop but transparently shows the risk.
</Good>

<Good>
Ontology convergence tracking:

"Ontology has converged — the same 4 entities appeared in 2 consecutive rounds with no changes. The domain model is stable."

Why good: Shows entity tracking across rounds with visible convergence. Stability ratio increases as the domain model solidifies, giving mathematical evidence that the interview is converging on a stable understanding.
</Good>

<Good>
Ontology-style question for scope-fuzzy tasks:

Round 6 | Targeting: Goal Clarity | Why now: the core entity is still unstable across rounds, so feature questions would compound ambiguity | Ambiguity: 38%

"Across the last rounds you've described this as a workflow, an inbox, and a planner. Which one is the core thing this product IS, and which ones are supporting metaphors or views?"

Why good: Uses ontology-style questioning to stabilize the core noun before drilling into features, which is the right move when the scope is fuzzy rather than merely incomplete.
</Good>

<Bad>
Batching multiple questions:

"What's the target audience? And what tech stack? And how should auth work? Also, what's the deployment target?"

Why bad: Four questions at once — causes shallow answers and makes scoring inaccurate.
</Bad>

<Bad>
Asking about codebase facts:

"What database does your project use?"

Why bad: Should have spawned explore agent to find this. Never ask the user what the code already tells you.
</Bad>

<Bad>
Proceeding despite high ambiguity:

"Ambiguity is at 45% but we've done 5 rounds, so let's start building."

Why bad: 45% ambiguity means nearly half the requirements are unclear. The mathematical gate exists to prevent exactly this.
</Bad>
</Examples>

<Escalation_And_Stop_Conditions>
- **Hard cap at 20 rounds**: Proceed with whatever clarity exists, noting the risk
- **Soft warning at 10 rounds**: Offer to continue or proceed
- **Early exit (round 3+)**: Allow with warning if ambiguity > threshold
- **User says "stop", "cancel", "abort"**: Stop immediately, save state for resume
- **Ambiguity stalls** (same score +-0.05 for 3 rounds): Activate Ontologist mode to reframe
- **All dimensions at 0.9+**: Skip to spec generation even if not at round minimum
- **Codebase exploration fails**: Proceed as greenfield, note the limitation
</Escalation_And_Stop_Conditions>

<Final_Checklist>
- [ ] Interview completed (ambiguity ≤ threshold OR user chose early exit)
- [ ] Oversized initial context/history was summarized before scoring, question generation, spec generation, or execution handoff
- [ ] Ambiguity score displayed after every round
- [ ] Every round explicitly names the weakest dimension and why it is the next target
- [ ] Challenge agents activated at correct thresholds (round 4, 6, 8)
- [ ] Spec file written to `$OMT_DIR/deep-interview/{slug}.md`
- [ ] Spec includes: goal, constraints, acceptance criteria, clarity breakdown, transcript
- [ ] Token `<deep-interview-done/>` emitted in the final assistant message before handoff
- [ ] Execution bridge presented via AskUserQuestion
- [ ] Selected execution mode invoked via Skill() (never direct implementation)
- [ ] State cleaned up after execution handoff
- [ ] Brownfield confirmation questions cite repo evidence (file/path/pattern) before asking the user to decide
- [ ] Scope-fuzzy tasks can trigger ontology-style questioning to stabilize the core entity before feature elaboration
- [ ] Per-round ambiguity report includes Ontology row with entity count and stability ratio
- [ ] Spec includes Ontology (Key Entities) table and Ontology Convergence section
</Final_Checklist>

<Advanced>
## Configuration

Optional settings in `.claude/settings.json`:

```json
{
  "omt": {
    "deepInterview": {
      "ambiguityThreshold": <resolvedThreshold>
    }
  }
}

Resume

If interrupted, run /deep-interview again. The skill reads state from $OMT_DIR/deep-interview-active-state-{sessionId}.json via Read tool and resumes from the last completed round.

Integration with Prometheus

When prometheus receives a vague input (no file paths, function names, or concrete anchors), it can redirect to deep-interview:

User: "prometheus build me a thing"
Prometheus: "Your request is quite open-ended. Would you like to run a deep interview first to clarify requirements?"
  [Yes, interview first] [No, expand directly]

Brownfield vs Greenfield Weights

Brownfield adds Context Clarity because modifying existing code safely requires understanding the system being changed.

Challenge Agent Modes

Each mode is used exactly once, then normal Socratic questioning resumes. Modes are tracked in state to prevent repetition.

Ambiguity Score Interpretation

Task: {{ARGUMENTS}}

Adoption

toongri/deep-interview

$ install --global

Security Scan Results

SKILL.md

Phase 1: Initialize

Phase 2: Interview Loop

Step 2a: Generate Next Question

Step 2b: Ask the Question

Step 2c: Score Ambiguity

Step 2d: Report Progress

Step 2e: Update State

Step 2f: Check Soft Limits

Phase 3: Challenge Agents

Round 4+: Contrarian Mode

Round 6+: Simplifier Mode

Round 8+: Ontologist Mode (if ambiguity still > 0.3)

Phase 4: Crystallize Spec

Phase 5: Execution Bridge

Resume

Integration with Prometheus

Brownfield vs Greenfield Weights

Challenge Agent Modes

Ambiguity Score Interpretation

Related Skills

toongri/pin-wrap-up

toongri/pin-setup

toongri/pin-record

toongri/pin-query

toongri/deep-interview

$ install --global

Security Scan Results

SKILL.md

Phase 1: Initialize

Phase 2: Interview Loop

Step 2a: Generate Next Question

Step 2b: Ask the Question

Step 2c: Score Ambiguity

Step 2d: Report Progress

Step 2e: Update State

Step 2f: Check Soft Limits

Phase 3: Challenge Agents

Round 4+: Contrarian Mode

Round 6+: Simplifier Mode

Round 8+: Ontologist Mode (if ambiguity still > 0.3)

Phase 4: Crystallize Spec

Phase 5: Execution Bridge

Resume

Integration with Prometheus

Brownfield vs Greenfield Weights

Challenge Agent Modes

Ambiguity Score Interpretation

Related Skills

toongri/pin-wrap-up

toongri/pin-setup

toongri/pin-record

toongri/pin-query