Git Workflow Methodology

When to use this skill

Use this skill when you need to:

Run the claudikins-kernel:execute command
Decompose plans into executable tasks
Set up two-stage code review
Decide batch sizes and checkpoints
Handle stuck agents or failed tasks

Core Philosophy

"I'd use 5-7 agents per SESSION, not 30 per batch." - Boris

Execution is about isolation, verification, and human checkpoints. Not speed.

The Six Principles

One task = one branch - Isolation prevents pollution
Fresh context per task - context: fork gives a clean slate
Two-stage review - Spec compliance first, then code quality
Human checkpoints between batches - Not between individual tasks
Commands own git - Agents never checkout/merge/push
Features are the unit - Batch at feature level, not task level

Batch Size Guidance (GOSPEL)

Wrong: 30 agents for 10 tasks (3 per task micro-management)
Right: 5-7 agents total (feature-level batches)

| Scenario | Wrong | Right |
| -------------------- | -------------------------- | ------------------ |
| 10 tasks, 5 features | 30 micro-task agents | 5-7 feature agents |
| Simple refactor | 10 agents for tiny changes | 1-2 feature agents |

Default --batch 1 is correct. Features are the unit of work.

Task Decomposition

From a plan, extract tasks that are:

| Quality | Definition | Example |
| --------------- | ---------------------------------------------- | --------------------------------------------- |
| Atomic | Completable in one agent session | "Add auth middleware" not "Build auth system" |
| Testable | Has measurable acceptance criteria | "Returns 401 for invalid token" |
| Independent | Minimal dependencies on other tasks | Can be reviewed in isolation |
| Right-sized | Not too small (noise) or large (context death) | 50-200 lines of changes |

See task-decomposition.md for patterns.

Review Stages

Two reviewers with different jobs. Never skip either.

Stage 1: Spec Compliance (spec-reviewer, opus)

Question: "Did it do what was asked?"

Checks:

All acceptance criteria addressed?
Any scope creep (features not in spec)?
Any missing requirements?

Output: PASS or FAIL with line references.

Stage 2: Code Quality (code-reviewer, opus)

Question: "Is it well-written?"

Checks:

Consistency with codebase style
Error handling
Edge cases
Naming clarity
Unnecessary complexity

Output: PASS or CONCERNS with confidence scores.

See review-criteria.md for detailed checklists.

Review Enforcement (MANDATORY)

This is non-negotiable. Violations here break the entire workflow.

The Iron Rule

After EVERY task completes, you MUST spawn BOTH reviewer agents:

spec-reviewer - Spawned via Task(spec-reviewer, {...})
code-reviewer - Spawned via Task(code-reviewer, {...}) (if spec passes)

What "MUST spawn" Means

| Allowed | NOT Allowed |
|---------|-------------|
| Task(spec-reviewer, { prompt: "...", context: "fork" }) | Inline spec check by orchestrator |
| Task(code-reviewer, { prompt: "...", context: "fork" }) | "I'll just verify the code looks good" |
| Waiting for agent output JSON | Making your own compliance table |
| Reading from .claude/reviews/spec/ | Skipping because "it's a simple task" |

Inline Reviews Are VIOLATIONS

If you find yourself doing ANY of these, you are VIOLATING the methodology:

Creating a "Spec Compliance Check" table yourself
Writing "Verdict: PASS" without spawning an agent
Saying "Let me verify the implementation meets criteria"
Checking acceptance criteria in a loop instead of delegating

The orchestrator does NOT review. The orchestrator SPAWNS reviewers.

Pre-Merge Checklist (HARD GATE)

Before ANY merge decision can be offered to the user:

□ .claude/reviews/spec/{task_id}.json EXISTS for each task
□ .claude/reviews/code/{task_id}.json EXISTS for each task
□ Both files contain valid JSON with "verdict" field
□ spec-reviewer verdict is PASS (or user override documented)

If ANY file is missing: DO NOT proceed to merge. You skipped the review.
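The checklist above could be enforced mechanically. The following is a minimal sketch, not the skill's actual gate: the function name `pre_merge_gate` and the `REVIEW_ROOT` variable are assumptions made for illustration, and the documented user override is deliberately not handled. It relies on `jq`, which the document already uses elsewhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper: REVIEW_ROOT and pre_merge_gate are illustrative names.
REVIEW_ROOT="${REVIEW_ROOT:-.claude/reviews}"

# Usage: pre_merge_gate TASK_ID...
# Returns 0 only if every task has both review files, each file is valid
# JSON with a "verdict" field, and the spec verdict is PASS.
pre_merge_gate() {
  local task_id stage file
  for task_id in "$@"; do
    for stage in spec code; do
      file="$REVIEW_ROOT/$stage/$task_id.json"
      if [ ! -f "$file" ]; then
        echo "BLOCKED: missing $file" >&2
        return 1
      fi
      # jq -e exits non-zero on invalid JSON or when "verdict" is absent.
      if ! jq -e 'has("verdict")' "$file" >/dev/null 2>&1; then
        echo "BLOCKED: no verdict in $file" >&2
        return 1
      fi
    done
    if [ "$(jq -r '.verdict' "$REVIEW_ROOT/spec/$task_id.json")" != "PASS" ]; then
      echo "BLOCKED: spec verdict for $task_id is not PASS" >&2
      return 1
    fi
  done
  return 0
}
```

A command might call `pre_merge_gate task-1 task-2` before offering any merge decision and stop if it returns non-zero.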

Why This Matters

Consistency - Every task gets the same rigor, not "looks simple, I'll check it"
Auditability - Review outputs are artifacts, not orchestrator judgments
Separation of concerns - Orchestrator orchestrates, reviewers review
No rationalization - You can't convince yourself your inline check is "good enough"

Verdict Matrix

What happens when reviewers return their verdicts:

| Spec Result | Code Result | Action |
| ----------- | ----------- | ------------------------------------------------ |
| PASS | PASS | Offer [Accept] [Revise anyway] |
| PASS | CONCERNS | Offer [Accept with caveats] [Fix] [Klaus review] |
| FAIL | * | Always [Revise] or [Retry] |

See review-conflict-matrix.md for edge cases.
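As a rough illustration, the verdict matrix can be written as a single lookup. The function name `verdict_actions` is an assumption for this sketch; only the verdict values and the offered options come from the matrix.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the verdict matrix.
# Usage: verdict_actions SPEC_RESULT CODE_RESULT
verdict_actions() {
  local spec="$1" code="$2"
  case "$spec:$code" in
    # A spec FAIL wins regardless of the code result.
    FAIL:*)        echo "[Revise] [Retry]" ;;
    PASS:PASS)     echo "[Accept] [Revise anyway]" ;;
    PASS:CONCERNS) echo "[Accept with caveats] [Fix] [Klaus review]" ;;
    *)             echo "unknown verdict pair: $spec/$code" >&2; return 1 ;;
  esac
}
```

The output is only the set of options to present; the human still makes the decision.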

Batch Checkpoint Flow

All tasks in batch complete?
├── No → Wait for remaining
└── Yes →
    All reviews pass?
    ├── No →
    │   Retry count < 3?
    │   ├── Yes → Retry failed tasks
    │   └── No → Escalate to Klaus or human
    └── Yes →
        Present results to human
        └── Human decides: [Accept] [Revise] [Retry]

See batch-patterns.md for decision trees.
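The flow above can be sketched as one decision function. The name `checkpoint_next`, its arguments, and the returned labels are assumptions for illustration; the retry limit of 3 is taken from the diagram as written.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the batch checkpoint flow.
# Usage: checkpoint_next TASKS_DONE TASKS_TOTAL REVIEWS_PASSED RETRY_COUNT
# REVIEWS_PASSED is "yes" or "no".
checkpoint_next() {
  local done_count="$1" total="$2" reviews_passed="$3" retries="$4"
  if [ "$done_count" -lt "$total" ]; then
    echo "wait"                 # batch not complete yet
  elif [ "$reviews_passed" = "yes" ]; then
    echo "present-to-human"     # human decides: Accept, Revise, or Retry
  elif [ "$retries" -lt 3 ]; then
    echo "retry-failed-tasks"
  else
    echo "escalate"             # to Klaus or a human
  fi
}
```

Note that even the all-green path ends at the human checkpoint, never at an automatic merge.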

Rationalizations to Resist

Agents under pressure find excuses. These are all violations:

| Excuse | Reality |
| ------------------------------------------ | ------------------------------------------------------------------- |
| "30 agents is fine, tasks are independent" | More agents = more chaos. 5-7 per session, features as units. |
| "I'll just checkout main to compare" | Agents don't own git. Use git show main:file instead. |
| "Skip spec review, code looks correct" | Spec review catches scope creep. Never skip. |
| "I'll do the review myself, it's simple" | Spawn the reviewer agents. Inline reviews are VIOLATIONS. |
| "Both passed, auto-merge is safe" | Human checkpoint required. Always. |
| "Context is fine, I'll continue" | ACM at 60% = checkpoint offer. 75% = mandatory stop. |
| "This tiny task doesn't need a branch" | One task = one branch. No exceptions. Isolation prevents pollution. |
| "Retry limit is just a guideline" | 2 retries then escalate. Infinite retry = infinite waste. |
| "I'll merge my changes when done" | Commands own merge. You own implementation. Stay in your lane. |

All of these mean: Follow the methodology. Speed is not the goal.

Red Flags — STOP and Reassess

If you're thinking any of these, you're about to violate the methodology:

"Let me just run git checkout..."
"30 tasks, 30 agents, maximum parallelism"
"Review passed, no need for human checkpoint"
"Context is getting tight but I can finish"
"This is simple, don't need isolation"
"I'll merge it myself"
"Retry limit doesn't apply here"
"Spec review is redundant if code review passes"
"Let me verify the implementation meets criteria" (SPAWN THE AGENT)
"I'll create a quick compliance table" (SPAWN THE AGENT)

All of these mean: STOP. Commands own git. Humans own checkpoints. Reviewers own reviews. You own orchestration.

Robustness Patterns

Things go wrong. Here's how to handle them.

SubagentStop Hook Failure (A-6)

If the capture hook fails, agent output is lost.

Pattern: Write to backup location first, then move to primary.

# Always write to the backup location first
BACKUP_FILE="$BACKUP_DIR/agent-$(date +%s).json"
printf '%s\n' "$OUTPUT" > "$BACKUP_FILE"
# Then move the same file to primary
mv "$BACKUP_FILE" "$PRIMARY"

Malformed JSON Output (A-7)

Agents sometimes produce invalid JSON.

Pattern: Validate required fields before accepting.

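REQUIRED='["task_id", "status"]'
# $OUTPUT is the path to the agent's JSON file.
# Exit 2 unless it is valid JSON containing every required key.
jq -e --argjson required "$REQUIRED" '. as $doc | all($required[]; . as $key | $doc | has($key))' "$OUTPUT" >/dev/null || exit 2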

Task Branch Directory Export (A-8)

Agents need to know where to work.

Pattern: Export directory as environment variable in SubagentStart hook.

export TASK_BRANCH_DIR="$PROJECT_DIR"
export TASK_BRANCH_NAME="execute/task-${TASK_ID}-${SLUG}"

Model Rate Limiting (A-10)

Opus gets rate limited more than Sonnet.

Pattern: Offer fallback options to human.

Notify: "Opus rate limited. Options:"
Offer: [Wait 60s] [Use Sonnet fallback] [Abort]
If fallback, add caveat to review output

Context Exhaustion Mid-Task (A-11)

Agent runs out of context before finishing.

Pattern: Output partial state and mark as resumable.

{
  "status": "partial",
  "files_changed": ["completed work..."],
  "next_steps": ["what remains..."],
  "checkpoint_hash": "sha256:..."
}

Dependency Failure Chains (S-7)

Task Y depends on Task X. Task X fails. What happens to Y?

See dependency-failure-chains.md.

Branch Collision (S-8)

Two tasks accidentally get the same branch name.

See branch-collision-detection.md.

Branch Guard Recovery (S-9)

The git-branch-guard hook blocks something it shouldn't.

See branch-guard-recovery.md.

Batch Size Verification (S-10)

Validating batch sizes before execution starts.

See batch-size-verification.md.

Task Branch Recovery (S-12)

Recovering orphaned branches from crashed sessions.

See task-branch-recovery.md.

Circuit Breakers (S-13)

Preventing cascading failures when operations fail repeatedly.

Pattern: Track failure rate. If threshold exceeded, "open" the circuit - fail fast without attempting.

Circuit: agent_spawn
State: OPEN (3 failures in 60s)
Reset in: 30 seconds

[Wait for reset] [Force close] [Skip operation]

See circuit-breakers.md.
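A minimal in-memory sketch of the pattern follows. The variable and function names are assumptions for illustration, and the thresholds (3 failures in 60s, 30s reset) are borrowed from the status example above. Timestamps are passed in explicitly so the logic is easy to exercise; a real hook would likely pass `$(date +%s)`.

```shell
#!/usr/bin/env bash
# Hypothetical circuit breaker state; all names are illustrative.
CB_THRESHOLD=3      # failures that open the circuit
CB_WINDOW=60        # seconds over which failures are counted
CB_RESET=30         # seconds the circuit stays open
CB_FAILURES=""      # space-separated failure timestamps
CB_OPENED_AT=""     # when the circuit opened; empty while closed

# Record a failure at time NOW (epoch seconds); open the circuit if needed.
cb_record_failure() {
  local now="$1" t kept="" count=0
  for t in $CB_FAILURES $now; do
    # Keep only failures that fall inside the counting window.
    if [ $((now - t)) -lt "$CB_WINDOW" ]; then
      kept="$kept $t"
      count=$((count + 1))
    fi
  done
  CB_FAILURES="$kept"
  if [ "$count" -ge "$CB_THRESHOLD" ]; then
    CB_OPENED_AT="$now"
  fi
}

# Succeeds if an operation may be attempted at time NOW.
cb_allow() {
  local now="$1"
  if [ -z "$CB_OPENED_AT" ]; then
    return 0
  fi
  if [ $((now - CB_OPENED_AT)) -ge "$CB_RESET" ]; then
    CB_OPENED_AT=""   # reset period elapsed: close the circuit
    CB_FAILURES=""
    return 0
  fi
  return 1            # still open: fail fast without attempting
}
```

Both functions mutate shell variables, so they need to run in the current shell rather than in a subshell or pipeline.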

Execution Tracing (S-14)

Debugging execution graphs and understanding what happened.

Pattern: Record spans for each operation. Visualise as waterfall or dependency graph.

Trace: exec-session-xyz
├── batch_1 (45s)
│   ├── task-1 (20s) ✓
│   └── task-2 (25s) ✓
└── batch_2 (60s)
    └── task-3 (60s) ✓

Critical path: batch_1 → batch_2

See execution-tracing.md.

Stuck Detection

| Signal | Threshold | Response |
| --------------------- | ----------------------------- | ------------------- |
| Tool call flooding | 20 calls without file changes | Warning, then Klaus |
| Time without progress | 10 minutes | Warning, then Klaus |
| Repeated failures | Same error 3x | Pause, offer Klaus |
| Context burn rate | ACM at 60% | Checkpoint offer |
| Review timeout | 5 minutes per reviewer | Offer [Wait] [Skip] |
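The thresholds above lend themselves to a simple lookup. This is a sketch under assumptions: the function name, the signal identifiers, and the response labels are invented for illustration, and treating each threshold as "at or above" is an interpretation of the table rather than something it states.

```shell
#!/usr/bin/env bash
# Hypothetical helper mapping a stuck signal and its current value to a response.
# Usage: stuck_response SIGNAL VALUE
stuck_response() {
  local signal="$1" value="$2" limit response
  case "$signal" in
    tool_calls_without_changes) limit=20; response="warning-then-klaus" ;;
    minutes_without_progress)   limit=10; response="warning-then-klaus" ;;
    same_error_count)           limit=3;  response="pause-offer-klaus" ;;
    context_percent)            limit=60; response="checkpoint-offer" ;;
    review_minutes)             limit=5;  response="offer-wait-or-skip" ;;
    *) echo "unknown signal: $signal" >&2; return 1 ;;
  esac
  # Assumption: a signal fires once its value reaches the threshold.
  if [ "$value" -ge "$limit" ]; then
    echo "$response"
  else
    echo "ok"
  fi
}
```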

Anti-Patterns

Don't do these:

Running git checkout/merge/push from agents
Batching 30+ tasks in parallel
Skipping spec review because "code looks fine"
Auto-merging without human checkpoint
Ignoring stuck signals
Continuing after context warnings

References

Full documentation in this skill's references/ folder:

task-decomposition.md - How to break down plans
review-criteria.md - What reviewers check (400 LOC threshold, attack surface tracing)
batch-patterns.md - Checkpoint decision patterns (coordinated checkpoints, load shedding, deadline propagation)
dependency-failure-chains.md - When dependent tasks fail
branch-collision-detection.md - Preventing duplicate branches
branch-guard-recovery.md - Recovering from guard failures
batch-size-verification.md - Validating batch sizes
review-conflict-matrix.md - Handling reviewer disagreements (RESOLVE framework)
task-branch-recovery.md - Recovering orphaned branches
circuit-breakers.md - Preventing cascading failures (circuit breaker pattern, timeout strategies)
execution-tracing.md - Debugging execution graphs (spans, traces, critical path analysis)

