The simplest change that fixes the problem is the safest change to merge. Adding code is easy. Removing the need for code is engineering.

Justify

The Additive Bias Problem

AI models are trained to be helpful, which creates a systematic bias toward adding code rather than fixing root causes:

| AI Default Behavior | Correct Behavior |
|---------------------|------------------|
| Add a workaround | Fix the root cause |
| Modify test expectations | Fix the implementation |
| Create a new helper | Use an existing one |
| Add error handling | Prevent the error |
| Add a compatibility shim | Remove the old code |
| Wrap in try/catch | Fix the exception source |

This skill audits changes for these patterns and requires explicit justification for each.

When To Use

After completing implementation work
Before committing or creating PRs
When reviewing your own changes for quality
When scope-guard flags RED/YELLOW zone

When NOT To Use

Before writing code: this skill audits work already done (use imbue:karpathy-principles instead)
When deciding whether a feature belongs in scope (use imbue:scope-guard instead)

Audit Protocol

Step 1: Gather the Delta

# Determine base branch
base=$(git merge-base master HEAD 2>/dev/null \
  || git merge-base main HEAD 2>/dev/null)

# Get change statistics
git diff "$base" --stat
git diff "$base" --shortstat
git diff "$base" --diff-filter=A --name-only  # new files
git diff "$base" --diff-filter=M --name-only  # modified files
git diff "$base" --diff-filter=D --name-only  # deleted files
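The line ratio in Step 2 needs total additions and deletions as numbers. One way to get them, sketched here as an assumption rather than part of the original protocol, is to sum the output of `git diff --numstat`. The helper name `sum_numstat` is hypothetical.

```sh
# Sum added and deleted lines from `git diff --numstat` output on stdin.
# Each numstat row is "<added>\t<deleted>\t<path>"; binary files show "-"
# in both count columns and are skipped here.
# Prints "<total_added> <total_deleted>".
sum_numstat() {
  awk '$1 != "-" { added += $1; deleted += $2 }
       END { printf "%d %d\n", added, deleted }'
}
```

Inside a repository, with `$base` set as above, this could be run as `git diff "$base" --numstat | sum_numstat`.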

Step 2: Compute Additive Bias Score

Score each dimension 0-3 (0 = clean, 3 = high bias):

| Signal | Weight | How to Measure |
|--------|--------|----------------|
| Line ratio | 2x | additions / max(deletions, 1) |
| New files | 2x | Count of --diff-filter=A |
| Test logic changes | 3x | Test assertion/expectation diffs |
| New abstractions | 1x | New classes, functions, modules |
| Workaround patterns | 2x | Try/catch, if/else guards added |

Line Ratio Scoring:

| Ratio | Score | Interpretation |
|-------|-------|----------------|
| < 2:1 | 0 | Balanced change |
| 2:1 to 5:1 | 1 | Mildly additive |
| 5:1 to 10:1 | 2 | Additive bias likely |
| > 10:1 | 3 | Strong additive bias |
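The line ratio table can be sketched as a small function. The function name `line_ratio_score` is hypothetical, and the boundary handling is an assumption: the table does not say which band an exact 5:1 falls in, so this sketch treats each lower bound as inclusive (5:1 scores 2) and follows "> 10:1" literally (exactly 10:1 scores 2).

```sh
# Map additions and deletions to the 0-3 line ratio score.
# Usage: line_ratio_score <additions> <deletions>
line_ratio_score() (
  awk -v a="$1" -v d="$2" 'BEGIN {
    if (d < 1) d = 1            # max(deletions, 1) avoids division by zero
    r = a / d
    if (r < 2)        s = 0     # balanced change
    else if (r < 5)   s = 1     # mildly additive
    else if (r <= 10) s = 2     # additive bias likely
    else              s = 3     # strong additive bias
    print s
  }'
)
```

For example, `line_ratio_score 12 3` gives a ratio of 4:1 and prints `1`, while `line_ratio_score 50 0` is treated as 50:1 and prints `3`.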

Aggregate Score:

bias_score = sum(signal_score * weight) / sum(weights)

| Aggregate | Zone | Action |
|-----------|------|--------|
| 0.0 - 0.5 | GREEN | Proceed |
| 0.5 - 1.5 | YELLOW | Justify each signal |
| 1.5 - 2.5 | RED | Rethink approach |
| 2.5+ | STOP | Likely wrong approach |
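As a worked illustration of the aggregate formula, the sketch below applies the weights from the signal table (2, 2, 3, 1, 2, summing to 10) and maps the result to a zone. The function names are hypothetical. The zone ranges share their endpoints, so this sketch assumes a boundary value belongs to the higher zone (0.5 is YELLOW, 1.5 is RED, 2.5 is STOP).

```sh
# Weighted average of the five signal scores (each 0-3).
# Argument order: line_ratio new_files test_logic new_abstractions workarounds
bias_score() (
  awk -v lr="$1" -v nf="$2" -v tl="$3" -v na="$4" -v wa="$5" 'BEGIN {
    total = lr * 2 + nf * 2 + tl * 3 + na * 1 + wa * 2
    printf "%.2f\n", total / 10   # sum of weights is 2+2+3+1+2 = 10
  }'
)

# Map an aggregate score to its zone name.
bias_zone() (
  awk -v s="$1" 'BEGIN {
    if (s < 0.5)      print "GREEN"
    else if (s < 1.5) print "YELLOW"
    else if (s < 2.5) print "RED"
    else              print "STOP"
  }'
)
```

With scores of 1, 1, 0, 2, 1 the weighted sum is 2 + 2 + 0 + 2 + 2 = 8, so `bias_score 1 1 0 2 1` prints `0.80` and `bias_zone 0.80` prints `YELLOW`. Because test logic changes carry the 3x weight, a maximum score on that signal alone already yields 0.90.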

Step 3: Iron Law Compliance Check

The Iron Law states: tests drive implementation, not the other way around. Check for violations:

# Find test files that were modified
git diff "$base" --name-only | rg "test_|_test\.|spec\." \
  || git diff "$base" --name-only | grep -E "test_|_test\.|spec\."

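# For each modified test file, check which assertion lines changed
# (<test_file> is a placeholder; substitute a path from the list above)
git diff "$base" -- <test_file> | rg "^[-+].*assert|^[-+].*expect|^[-+].*should" \
  || git diff "$base" -- <test_file> | grep -E "^[-+].*assert|^[-+].*expect|^[-+].*should"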

Violation patterns (test logic was tampered with):

Assertion values changed (expected output modified)
Test cases removed or commented out
@skip or @pytest.mark.skip added
Error expectations weakened (broad exception types)
Mock return values changed to match new behavior
Test renamed to no longer describe original behavior

Each violation requires explicit justification:

"I changed this test assertion because the requirement changed, not because my implementation couldn't meet the original requirement."

If the requirement didn't change, the test should not change. Fix the implementation instead.
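To turn the assertion check into a number for the "Test logic changes" signal, one option is to count the changed assertion lines in a diff. This is a minimal sketch under stated assumptions: the helper name `changed_assertions` is hypothetical, it reuses the `assert|expect|should` keywords from the command above, and it drops the `---`/`+++` file header lines so a file named like `test_expect.py` is not counted by its header alone. It is a text heuristic, so a non-zero count is a prompt to read the diff, not proof of a violation.

```sh
# Count added or removed lines in a unified diff (stdin) that mention
# an assertion keyword. File header lines ("--- a/x", "+++ b/x") are excluded.
changed_assertions() {
  grep -E '^[-+]' \
    | grep -Ev '^(\+\+\+|---) ' \
    | grep -Ec 'assert|expect|should'
}
```

With `$base` set as in Step 1, it could be used as `git diff "$base" -- path/to/test_file.py | changed_assertions`, where the path is a placeholder.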

Step 4: Minimal Intervention Analysis

For each changed file, answer:

Was this change necessary? Could the goal be achieved without touching this file?
Was this the minimal change? Could fewer lines achieve the same result?
Did this change add or remove complexity? New functions, classes, or control flow = added complexity that needs justification.
Is there a subtraction-first alternative? Could removing code fix the problem instead of adding code?

Step 4.5: Invariant Impact Analysis

Changes can be minimal and still catastrophically wrong if they silently revise a load-bearing design decision. For each changed file, check whether it touches a design invariant:

What counts as an invariant:

Architectural patterns (module boundaries, layer separation, data flow direction)
Data structure choices (why a map vs list, why normalized vs denormalized)
API contracts (public interfaces, protocol formats)
Error handling strategies (fail-fast vs recovery)
Concurrency models (single-threaded assumption, actor model, shared-nothing)

Detection heuristic:

# Check for structural changes (new modules, moved
# boundaries, changed interfaces)
git diff "$base" --name-only | rg "(interface|abstract|base|core|types|schema|model)" \
  || git diff "$base" --name-only | grep -E "(interface|abstract|base|core|types|schema|model)"

# Check for pattern-breaking changes
git diff "$base" -U5 | rg "(TODO.*refactor|HACK|WORKAROUND|XXX)" \
  || git diff "$base" -U5 | grep -E "(TODO.*refactor|HACK|WORKAROUND|XXX)"

When an invariant conflict is detected:

Do NOT silently pick a resolution. Present the three options to the human:

| Option | Description | When Right |
|--------|-------------|------------|
| Preserve | Don't add the feature; the invariant pays dividends | Invariant simplifies many things; feature is marginal |
| Layer | Add feature inelegantly on top | Feature is needed; invariant is still valuable; imperfection is acceptable |
| Revise | Change the invariant itself | Genuine new learning invalidates the original decision |

Add to Justification Report:

### Invariant Impact: NONE / DETECTED

[If DETECTED:]
- **Invariant**: [name the design decision]
- **Conflict**: [what change clashes with it]
- **Option chosen**: Preserve / Layer / Revise
- **Justification**: [why this option, not the others]
- **Human reviewed**: YES / NO — if NO, flag as
  requiring review before merge

Compounding risk warning: Bad invariant decisions accumulate. If this branch has multiple invariant revisions, flag the entire branch for architectural review. Each silent invariant change multiplies the probability of an unsalvageable codebase.

Step 5: Generate Justification Report

Output a structured report:

## Justification Report

**Branch**: feature/xyz
**Base**: master
**Delta**: +N/-M lines, X files changed

### Additive Bias Score: X.X (ZONE)

| Signal | Score | Detail |
|--------|-------|--------|
| Line ratio | N | +A/-D = R:1 |
| New files | N | [list] |
| Test changes | N | [list] |
| New abstractions | N | [list] |
| Workarounds | N | [list] |

### Iron Law Compliance: PASS/FAIL

[List any test logic modifications with justification]

### Change-by-Change Justification

#### file.py (+N/-M)
- **What**: [description]
- **Why**: [root cause this addresses]
- **Alternatives considered**: [what else could work]
- **Why this is minimal**: [why fewer changes won't work]

#### test_file.py (+N/-M)
- **What**: [description]
- **Justification**: [why test logic changed, if it did]
- **Iron Law status**: PASS/VIOLATION

### Risk Assessment

| Factor | Rating |
|--------|--------|
| Lines changed | LOW/MED/HIGH |
| Files touched | LOW/MED/HIGH |
| Test modifications | NONE/JUSTIFIED/VIOLATION |
| New abstractions | NONE/JUSTIFIED/UNNECESSARY |
| Overall merge risk | LOW/MED/HIGH |

### Recommendations

[List any changes that should be reconsidered,
simpler alternatives, or unnecessary additions]

Decision Weights

When evaluating competing approaches, weight these factors:

| Factor | Weight | Rationale |
|--------|--------|-----------|
| Fewer lines changed | HIGH | Less risk, easier review |
| No new files | HIGH | No new maintenance burden |
| No test logic changes | HIGH | Iron Law compliance |
| Root cause fix | HIGH | Prevents recurrence |
| Removes code | BONUS | Reduces maintenance surface |
| Adds abstraction | PENALTY | Only justified at 3rd use |
| Adds error handling | NEUTRAL | Only at system boundaries |

The Subtraction Test: Before accepting any change, ask: "Could I achieve this by removing code instead of adding it?" If yes, prefer the subtractive approach.

Integration with Proof of Work

Justify extends proof-of-work with change-level accountability:

proof-of-work: "Did it work?" (evidence)
justify: "Was this the right way?" (reasoning)

Both are required before claiming work is complete. Run proof-of-work first, then justify.

Anti-Patterns to Flag

1. Test Mutation

Changing test expectations to match broken code. Fix: Revert the test change, fix the implementation.

2. Shotgun Addition

Adding code in many files for a single-concern fix. Fix: Find the single point of change.

3. Defensive Overengineering

Adding try/catch, null checks, or validation for scenarios that can't happen in practice. Fix: Trust internal code. Only validate at boundaries.

4. Premature Abstraction

Creating a helper/utility/base class for one use case. Fix: Inline the code. Abstract at the 3rd use.

5. Compatibility Shim

Adding backward-compatibility code instead of updating callers. Fix: Update callers directly. Delete dead paths.

6. Silent Invariant Revision

Changing an architectural pattern, data structure choice, or API contract without acknowledging that a design invariant is being revised. Fix: Name the invariant. Present the 3 options (preserve, layer, revise) to a human. Do not make the judgment call yourself: models default to the "average" of training data, and wrong invariant decisions compound into unsalvageable codebases.

Scrutiny Questions (from leyline:additive-bias-defense)

Before justifying any change, apply these questions. If the answers to questions 4 and 5 are not concrete evidence, the change is unjustified.

1. Priority alignment: Is this a deviation from the current priority?
2. Criticality: Is it critical to implement at this juncture?
3. Simplicity: Does a simpler or more elegant solution exist?
4. Evidence: What evidence proves this is needed (not assumed)?
5. Consequence: What breaks if we do not add this?

Burden of Proof Inversion

The default stance is: this addition should not exist. The change must prove its necessity; the reviewer does not have to prove it unnecessary.

When generating the Justification Report (Step 5), add a Burden of Proof section:

| Change | Scrutiny Q4 Answer | Scrutiny Q5 Answer | Verdict |
|--------|--------------------|--------------------|---------|
| file.py | [evidence] | [consequence] | justified/needs_evidence/unjustified |

Changes with unjustified verdict MUST be removed or reworked before the report passes.
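The "no unjustified verdict survives" rule can be checked mechanically once the report exists. The sketch below assumes the Burden of Proof rows keep the table layout shown above, with the verdict as the last cell. The function name `report_passes` is hypothetical.

```sh
# Read a Justification Report on stdin; succeed only if no table row
# ends with an "unjustified" verdict cell.
report_passes() {
  ! grep -Eq '\|[[:space:]]*unjustified[[:space:]]*\|[[:space:]]*$'
}
```

A possible use is `report_passes < report.md || echo "rework unjustified changes"`, where `report.md` is a placeholder file name. Rows marked `needs_evidence` do not fail this check, since only `unjustified` is named as blocking here.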

Record the Tradeoff (decision journal)

When this step settles a decision with real alternatives, record it to docs/tradeoffs.md while the reasoning is live (draft and confirm):

If leyline is installed, invoke Skill(leyline:decision-journal) and append a tradeoff entry (the decision, the options weighed, and what was sacrificed; set phase to review). Show the draft; append on confirmation.
Fallback (leyline absent): append to docs/tradeoffs.md using the in-file ENTRY TEMPLATE; assign the next TR-NNN id.

The Wise Counsel

Is what you are doing a deviation from your priority? Is it critical to implement at this juncture? Rely less on AI and initial lines of thinking. Challenge yourself to be better, to think of a more elegant implementation or a simpler solution.

Exit Criteria

[ ] An additive bias score is computed and its zone (GREEN/YELLOW/RED/STOP) is reported.
[ ] Iron Law compliance is marked PASS or FAIL, with justification for any modified test logic.
[ ] Every change carries a verdict (justified, needs_evidence, or unjustified); no unjustified verdict survives in the final report.
[ ] Any detected invariant conflict is surfaced with the chosen option and a human-review flag.
[ ] A justified non-trivial addition is recorded to docs/tradeoffs.md (or the in-file template) before the report passes.

Related Skills

imbue:karpathy-principles - "Surgical Changes" and "Goal-Driven Execution" principles invoke this audit from a higher-level synthesis
leyline:additive-bias-defense - the contract this audit enforces in detail
imbue:proof-of-work - the validation layer this audit complements (proof-of-work asks "did it work?", justify asks "did it need to exist?")
See docs/quality-gates.md#skill-level-quality-gate-composition for the full gate-skill federation graph
