Resolve Review Skill

Read all review comments (inline + summary) on an open GitHub PR, apply targeted fixes for actionable findings, commit each fix, and verify tests still pass.

Arguments

/autoskillit:resolve-review <feature_branch> <base_branch> [mode=<local|github>]

feature_branch — The PR's head branch (used to find the open PR)
base_branch — The PR's base branch (e.g., "main")
mode (optional, default: github) — Controls where findings are read from and how threads are handled:
- mode=github (or absent/unrecognized): current behavior — fetch findings from GitHub API, post deferred observations from prior local rounds, resolve threads and post inline replies.
- mode=local: read findings from local JSON (written by review-pr in mode=local), skip all GitHub API fetching, accumulate DISCUSS/REJECT to persistent local files, skip thread resolution and inline reply API calls, still run task test-check.

The cwd is provided by the recipe step's cwd: field — the clone with the feature branch already checked out.

When to Use

Called by the recipe orchestrator via run_skill after review_pr reports a changes_requested or needs_human verdict
MCP-only: not user-invocable directly

Critical Constraints

NEVER:

Create files outside {{AUTOSKILLIT_TEMP}}/resolve-review/
Merge, push, or call merge_worktree
Fix issues beyond the explicit scope of the reviewer's comments
Exceed 3 fix-and-retest iterations
Delete or discard the working directory on failure
Modify tests to suppress failures introduced by reviewer fixes
Run subagents in the background (run_in_background: true is prohibited)

ALWAYS:

Find the PR by feature branch at invocation time (not a hardcoded number)
Fetch both inline comments (pulls/{number}/comments) and top-level review bodies (pulls/{number}/reviews) via the GitHub API
Commit each distinct fix separately with a message describing what was addressed
Run {test_command} (from config, default: task test-check) after applying all fixes to catch regressions
Gracefully degrade (exit 0, report skip) if gh is unavailable or no PR is found
Report a structured summary: findings fetched, fixes applied, fixes skipped (with reasons)
Read before editing: Before issuing an Edit call on any file, ensure you have issued a Read on that file earlier in this session. Claude Code rejects Edit on unread files — the retry wastes a full API turn at current context size. If you are uncertain whether a file was read, issue a targeted Read (offset + limit to the region you plan to edit) rather than risk an error.
CWD awareness: Before running python3 or other interpreters, verify your current working directory is the worktree root (not the orchestrator's project root). Use absolute paths for imports or cd to the worktree first. A wrong-CWD import error wastes a full API turn.

Context Limit Behavior

When context is exhausted mid-execution, edits may be on disk but not committed. The recipe routes to on_context_limit (typically a re-push step), bypassing the normal commit protocol.

Before every test run and before emitting structured output tokens:

Run git -C {work_dir} status --porcelain
If any files are dirty: git -C {work_dir} add -A && git -C {work_dir} commit -m "fix: commit pending review changes"
Only then proceed with the test or structured output

This ensures that even if context exhaustion interrupts the fix loop, all applied review fixes are committed and the downstream push step receives a clean branch.

Workflow

Read test configuration from .autoskillit/config.yaml: check test_check.commands (ordered list, if set) or test_check.command (single command, default: task test-check). The test_check MCP tool runs all configured commands automatically.

Step 0: Validate Arguments

Parse two positional arguments: feature_branch and base_branch.

If either is missing, abort with: "Usage: /autoskillit:resolve-review <feature_branch> <base_branch> [mode=<local|github>]"

Parse the optional mode keyword argument:

# Extract mode from keyword arguments
MODE="github"
for arg in "$@"; do
    case "$arg" in
        mode=local)  MODE="local" ;;
        mode=github) MODE="github" ;;
    esac
done

If mode is absent or unrecognized, default to "github".

Step 1: Find the Open PR

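PR_LIST_OUTPUT=$(gh pr list --head "$feature_branch" --base "$base_branch" \
  --json number,url -q '.[0] // empty | "\(.number) \(.url)"')
# Empty output means no open PR matched; PR_NUMBER and PR_URL then stay empty
# (without "// empty" a missing PR would yield the literal string "null null")
PR_NUMBER=$(echo "$PR_LIST_OUTPUT" | awk '{print $1}')
PR_URL=$(echo "$PR_LIST_OUTPUT" | awk '{print $2}')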

Get owner/repo:

gh repo view --json nameWithOwner -q .nameWithOwner

If gh is unavailable or not authenticated, or no PR is found:

Log "No PR found or gh unavailable — skipping review resolution"
Exit 0 (graceful degradation — do not fail the pipeline)

Step 1.5: Post Accumulated Deferred Observations (github mode only)

When mode=github:

Before fetching current findings from GitHub, check for any deferred observations accumulated from prior local review rounds:

DEFERRED_FILE="{{AUTOSKILLIT_TEMP}}/resolve-review/deferred_observations_${PR_NUMBER}.json"

If the file exists and contains entries:

Load the deferred observations array from the file
Post ALL entries as a single batch review via POST /repos/{owner}/{repo}/pulls/{pr_number}/reviews:
- event: "COMMENT" (not requesting changes — these are observations for discussion)
- body: "Observations accumulated from {N} local review rounds:"
- commit_id: current HEAD commit SHA (from gh pr view {pr_number} --json headRefOid -q .headRefOid)
- comments[] array where each entry has:
  - path: from the deferred entry
  - line: from the deferred entry (if line is null, omit line and use position: 1 as file-level comment)
  - side: "RIGHT"
  - body:
```
**Observation from local review round {round}:**

{body}

**Evidence:** {evidence}


```
Use the batch review endpoint (never post individual comments unless the batch call fails)
Fallback: If the batch POST returns HTTP 422 (e.g., stale line numbers), retry by posting each observation individually via gh api repos/{owner}/{repo}/pulls/{pr_number}/comments --method POST with 1s delay between calls
After all deferred observations are posted successfully, rename the file to deferred_observations_${PR_NUMBER}_posted.json to prevent re-posting on retry
These review threads are left UNRESOLVED (same behavior as DISCUSS in github mode)

If the file does not exist or is empty, skip this step and proceed to Step 2.
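The batch review payload described in this step could be assembled roughly as follows. This is a minimal sketch, not the skill's actual implementation: the function name build_deferred_review_payload is hypothetical, the entry fields are assumed to match what Step 3.6 writes, and {N} is assumed to mean the number of distinct round values.

```python
def build_deferred_review_payload(entries, head_sha):
    """Assemble the JSON body for one batch review (hypothetical helper)."""
    # Assumption: {N} in the review body counts distinct local rounds.
    rounds = {e.get("round") for e in entries}
    comments = []
    for e in entries:
        comment = {
            "path": e["path"],
            "side": "RIGHT",
            "body": (
                f"**Observation from local review round {e.get('round')}:**\n\n"
                f"{e.get('body', '')}\n\n"
                f"**Evidence:** {e.get('evidence', '')}"
            ),
        }
        if e.get("line") is None:
            # File-level comment: omit line and use position 1 instead.
            comment["position"] = 1
        else:
            comment["line"] = e["line"]
        comments.append(comment)
    return {
        "event": "COMMENT",
        "commit_id": head_sha,
        "body": f"Observations accumulated from {len(rounds)} local review rounds:",
        "comments": comments,
    }
```

The returned dict can be serialized with json.dumps and sent as the request body of the single POST to the reviews endpoint.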

REVIEW-FLAG marker format: <!-- REVIEW-FLAG: severity={severity} dimension={dimension} -->. Matches the regex <!--\s*REVIEW-FLAG:\s*severity=(\w+)\s+dimension=(\w+)\s*-->.
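For readers who want to extract the marker programmatically, here is a small sketch. It assumes the marker closes with "-->" as in the Step 6.5 reply bodies; parse_review_flag is a hypothetical helper name.

```python
import re

# Assumption: the marker is an HTML comment closed by "-->",
# matching the markers shown in the Step 6.5 reply bodies.
REVIEW_FLAG_RE = re.compile(
    r"<!--\s*REVIEW-FLAG:\s*severity=(\w+)\s+dimension=(\w+)\s*-->"
)


def parse_review_flag(body):
    """Return (severity, dimension) from a comment body, or None if absent."""
    match = REVIEW_FLAG_RE.search(body)
    return match.groups() if match else None
```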

When mode=local: Skip this step entirely — there are no prior accumulated observations to post in local mode.

Step 2: Fetch Review Comments

MODE BRANCHING:

When mode=local:

Skip ALL GitHub API calls for fetching comments (no gh api repos/.../pulls/{N}/comments, no gh api repos/.../pulls/{N}/reviews, no GraphQL reviewThreads query)
Instead, read findings from {{AUTOSKILLIT_TEMP}}/review-pr/local_findings_{pr_number}.json
Transform the local findings format into the same internal structure used by the GitHub-sourced flow: each finding maps to path, line, body, severity, dimension
Load diff_context_{pr_number}.json as normal (mode-independent — same handoff file written by review-pr)
Set comment_id_to_thread_id = {} (no thread IDs in local mode)
Set already_replied_ids = set() (no prior replies in local mode)
Skip writing inline_comments_{pr_number}.json, reviews_{pr_number}.json, threads_{pr_number}.json (GitHub-API-specific files)
Proceed to Step 3 with the transformed local findings

When mode=github: Execute the following GitHub API fetching steps (current behavior unchanged).

Fetch inline comments (anchored to specific file lines):

gh api repos/{owner}/{repo}/pulls/{number}/comments --paginate

Fetch top-level review bodies (summary reviews):

gh api repos/{owner}/{repo}/pulls/{number}/reviews --paginate

Fetch review thread node IDs (needed for thread resolution in Step 6) using cursor-based pagination to handle PRs with more than 100 threads:

# Fetch all pages; repeat with after=$endCursor while hasNextPage is true
gh api graphql \
  -f query='query($owner:String!,$repo:String!,$number:Int!,$after:String){repository(owner:$owner,name:$repo){pullRequest(number:$number){reviewThreads(first:100,after:$after){pageInfo{hasNextPage endCursor}nodes{id isResolved comments(first:5){nodes{databaseId body}}}}}}}' \
  -F owner="$owner" \
  -F repo="$repo" \
  -F number=$number \
  -F after=""

Collect all nodes across pages into a single list. Continue fetching while pageInfo.hasNextPage is true, passing pageInfo.endCursor as $after.
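The cursor loop can be sketched in a few lines. In this sketch fetch_page is a hypothetical callable that runs the GraphQL query above with the given cursor (None for the first page) and returns the parsed JSON response; how it invokes gh is left out.

```python
def collect_review_threads(fetch_page):
    """Gather reviewThreads nodes from every page into one list."""
    nodes = []
    cursor = None  # no cursor on the first request
    while True:
        response = fetch_page(cursor)
        threads = response["data"]["repository"]["pullRequest"]["reviewThreads"]
        nodes.extend(threads["nodes"])
        page_info = threads["pageInfo"]
        if not page_info["hasNextPage"]:
            return nodes
        # Pass the previous page's endCursor as $after on the next request.
        cursor = page_info["endCursor"]
```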

Save raw responses to:

{{AUTOSKILLIT_TEMP}}/resolve-review/inline_comments_{pr_number}.json
{{AUTOSKILLIT_TEMP}}/resolve-review/reviews_{pr_number}.json
{{AUTOSKILLIT_TEMP}}/resolve-review/threads_{pr_number}.json (first page; subsequent pages merged in memory)

Build a lookup map from the threads response:

comment_id_to_thread_id: dict[int, str] — key: comment databaseId (integer), value: thread GraphQL id (string node ID)
Skip threads where isResolved is already true (no need to resolve again)

If the GraphQL call fails (e.g., token lacks read:discussion scope), log a warning and set comment_id_to_thread_id = {}. Thread resolution will be silently skipped in Step 6. Flag this in the Step 7 report for human review.
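One way to build the lookup map is sketched below. It assumes every comment returned for an unresolved thread should map to that thread's node ID; build_thread_lookup is a hypothetical name.

```python
def build_thread_lookup(thread_nodes):
    """Map comment databaseId (int) to thread node ID (str)."""
    lookup = {}
    for thread in thread_nodes:
        if thread.get("isResolved"):
            continue  # already resolved, nothing to resolve again
        for comment in thread.get("comments", {}).get("nodes", []):
            database_id = comment.get("databaseId")
            if database_id is not None:
                lookup[database_id] = thread["id"]
    return lookup
```

Passing an empty list (the GraphQL failure case) returns an empty dict, which matches the degraded behavior described above.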

Build already_replied_ids (idempotency guard):

import re

RESOLVED_MARKER_RE = re.compile(r"<!--\s*autoskillit:resolved\b")

already_replied_ids: set[int] = set()
for thread in all_thread_nodes:
    if thread.get("isResolved"):
        continue  # Already resolved — Step 3 will not see these comments anyway
    comments_in_thread = thread.get("comments", {}).get("nodes", [])
    if len(comments_in_thread) < 2:
        continue  # No replies yet
    first_comment_id = comments_in_thread[0].get("databaseId")
    if first_comment_id is None:
        continue
    for reply in comments_in_thread[1:]:
        if RESOLVED_MARKER_RE.search(reply.get("body", "")):
            already_replied_ids.add(first_comment_id)
            log(f"Skipping comment {first_comment_id} — already resolved by prior resolve-review run")
            break

already_replied_ids is a set of original-comment databaseId integers for which a prior resolve-review invocation already posted a reply. Comments in this set are skipped in Step 3 before classification.

If the GraphQL call failed and all_thread_nodes is empty, already_replied_ids defaults to set() — no skipping occurs (safe degradation: worst case is a duplicate reply on the next run, same as the current behavior).

Load Pre-Built Context (if available):

After saving the raw review responses, check for the handoff file from review-pr:

DIFF_CONTEXT_PATH="{{AUTOSKILLIT_TEMP}}/review-pr/diff_context_${PR_NUMBER}.json"

If the file exists:

Parse it as JSON
Build diff_context_map: dict[tuple[str, int], dict] where key is (entry.path, entry.line) and value is the full context entry dict (with fields: path, line, severity, dimension, message, code_region)
Log: "Loaded pre-built context for N findings from review-pr handoff (schema_version: {v})"

If the file is absent or cannot be parsed:

Set diff_context_map = {}
Log: "No pre-built context file found — will read files in Step 3.5 (fallback)"

This lookup is used in Steps 3.5 and 4 to avoid redundant file reads.
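Loading the handoff file might look like the sketch below. The file layout is an assumption: the entries are taken from a hypothetical "findings" key, or from the top level if the file is a bare list. Adjust the key to whatever review-pr actually writes.

```python
import json
from pathlib import Path


def load_diff_context_map(path):
    """Return {(path, line): entry}; empty dict if the file is absent or invalid."""
    try:
        data = json.loads(Path(path).read_text())
    except (OSError, ValueError):
        # Missing file, unreadable file, or malformed JSON: use the fallback.
        return {}
    # Assumption: entries live under "findings", or the file is a bare list.
    entries = data.get("findings", []) if isinstance(data, dict) else data
    if not isinstance(entries, list):
        return {}
    return {
        (e["path"], e["line"]): e
        for e in entries
        if isinstance(e, dict) and "path" in e and "line" in e
    }
```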

Step 3: Parse and Classify Findings

From inline comments, extract per comment:

path — file path relative to repo root
line — the line being commented on
body — the reviewer's message
diff_hunk — surrounding context
id — the comment's REST database ID (integer id field in the JSON)
thread_node_id — look up comment_id_to_thread_id.get(id) (may be None if lookup failed or thread was already resolved)

File-level comment guard: If line is null (file-level comment posted by review-pr), skip this finding entirely — file-level comments have no code anchor and cannot be resolved by code changes. Record: (path, null, reason="file-level comment — no line anchor"). See the thread_node_id tracking table in Step 4 for the no-add disposition.

Idempotency guard — already-replied comments: If comment["id"] (the REST id integer) is in already_replied_ids, skip this comment entirely. Do not classify it, do not apply fixes, do not post a reply. Record: (path, line, reason="already replied in prior round — skipped"). These skipped comments do not count toward accept_count, reject_count, or discuss_count, and must not appear in the Step 7 report's "Findings fetched" total (they were fetched but filtered before classification).

From top-level reviews, extract:

state — APPROVED, CHANGES_REQUESTED, COMMENTED
body — the review summary text (skip empty bodies and APPROVED state)

Classify each finding by severity:

critical — body contains: "must", "critical", "security", "data loss", "wrong", "broken", "incorrect", "bug", "error", "never"
warning — body contains: "should", "consider", "recommend", "prefer", "suggest", "missing", "lacks"
info — body contains: "nit", "optional", "minor", "style", "cosmetic", "could"

When a finding matches multiple tiers, use the highest severity.

Critical and warning findings proceed to intent validation (Step 3.5). Info findings are auto-classified as DISCUSS — they do not enter Step 3.5.
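The keyword tiers can be expressed as an ordered table checked from highest to lowest severity. This sketch assumes case-insensitive substring matching and returns None when no keyword matches; neither detail is specified above. Substring matching is crude (for example, "nit" also matches inside "definitely"), so word-boundary matching may be preferable in practice.

```python
# Ordered from highest to lowest severity so the first hit wins.
SEVERITY_KEYWORDS = [
    ("critical", ["must", "critical", "security", "data loss", "wrong",
                  "broken", "incorrect", "bug", "error", "never"]),
    ("warning", ["should", "consider", "recommend", "prefer", "suggest",
                 "missing", "lacks"]),
    ("info", ["nit", "optional", "minor", "style", "cosmetic", "could"]),
]


def classify_severity(body):
    """Return the highest matching severity tier, or None if nothing matches."""
    text = body.lower()
    for severity, keywords in SEVERITY_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return severity
    return None
```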

Step 3.5: Intent Validation (Parallel Sub-Agents — BEFORE any code changes)

Before applying any fix, validate every critical and warning finding against the actual codebase and git history. This analysis phase runs entirely before code changes are made.

Domain grouping: Group all critical and warning findings by the top-level path segment of their path field:

src/autoskillit/execution/headless.py → group execution
tests/skills/test_foo.py → group tests
src/autoskillit/server/tools_ci.py → group server

Inline classification shortcut: If there are 3 or fewer findings AND they all fall in a single domain group, classify them inline — use each finding's diff_hunk as the primary code context, run git log once per unique path, then emit a verdict for each finding — without spawning a Task sub-agent. Only read source files if a comment explicitly references code outside the hunk or the diff_hunk is missing. The classification criteria and output format are identical to the sub-agent path.

Domain grouping typically produces 3–6 groups per PR. When the inline shortcut does not apply, launch one sub-agent per group in parallel using the Task tool (model: "sonnet").
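The grouping rule and the inline shortcut above can be sketched as follows. The rule for picking the domain is inferred from the three examples (files under src/autoskillit/<domain>/ group by <domain>, everything else by its first path segment) and is an assumption; all function names are hypothetical.

```python
from collections import defaultdict


def domain_of(path):
    """Derive the domain group name from a repo-relative path."""
    parts = path.split("/")
    # Assumption inferred from the examples: package files group by the
    # directory directly under src/autoskillit/.
    if parts[:2] == ["src", "autoskillit"] and len(parts) > 3:
        return parts[2]
    return parts[0]


def group_by_domain(findings):
    """Group finding dicts by domain, preserving input order within a group."""
    groups = defaultdict(list)
    for finding in findings:
        groups[domain_of(finding["path"])].append(finding)
    return dict(groups)


def should_classify_inline(groups):
    """True when there are 3 or fewer findings, all in a single domain group."""
    total = sum(len(items) for items in groups.values())
    return len(groups) == 1 and total <= 3
```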

Context resolution hierarchy (applied per finding):

diff_context_map code_region — richest context (±50 annotated diff lines); used when review-pr ran in the same pipeline and wrote the handoff file.
diff_hunk from the review comment — the unified-diff snippet surrounding the commented line; always available from the GitHub API. Sufficient for most classification tasks (naming, patterns, style).
Source file read — last resort (±30 lines); used only when the comment references code outside the hunk or the hunk is truncated/missing.
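The hierarchy amounts to a three-way fallback. The sketch below covers only the availability checks; deciding whether a comment references code outside the hunk is left to the classifier. resolve_context is a hypothetical name.

```python
def resolve_context(finding, diff_context_map):
    """Return (source, text) for the richest context available for a finding."""
    entry = diff_context_map.get((finding["path"], finding["line"]), {})
    if entry.get("code_region"):
        return ("code_region", entry["code_region"])
    if finding.get("diff_hunk"):
        return ("diff_hunk", finding["diff_hunk"])
    # Last resort: the caller reads the source file around the flagged line.
    return ("file_read", None)
```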

Sub-agent prompt template — each sub-agent receives:

The list of comments in its domain group (with path, line, body, diff_hunk)
Instructions for reading code context: if a pre-built code_region for this finding's (path, line) is available in diff_context_map, include it directly in the prompt under "Pre-built code region (from review-pr, ±50 diff lines):" and instruct the sub-agent to use it — do not instruct it to read the file for context. If diff_context_map has no entry for this finding, use the comment's diff_hunk as the primary code context — include it directly in the prompt under "Code context (diff_hunk from review comment):" and instruct the sub-agent to classify the finding using this hunk. Only instruct the sub-agent to read the source file if: (a) the review comment body explicitly references code outside the hunk (e.g., "see the function above", "this conflicts with the import at line N", "look at the caller in X.py"), or (b) the diff_hunk is truncated or missing (empty string). When a file read IS needed, read each unique file once, spanning all flagged lines with ±30 lines margin — do not re-read per finding.
Instructions to run git log --follow -p --max-count=5 -- {path} once per unique path (not once per finding) to trace original intent
Instructions to classify each comment as ACCEPT, REJECT, or DISCUSS with:
- verdict: the classification (ACCEPT / REJECT / DISCUSS)
- evidence: specific references (line numbers, function names, API docs, contracts)
- category (for REJECT only): one of api_direction_misunderstanding, false_positive_intentional_pattern, design_intent_misread, stale_comment, other
- commit_sha_hint: the most recent commit touching the flagged line (from git log)

Classification criteria:

ACCEPT — the reviewer identified a real issue; a code fix is warranted
REJECT — the reviewer is factually wrong (misread a guard, misunderstood an API, failed to recognize an intentional design pattern); do NOT change the code
DISCUSS — the comment raises a valid design question that requires a human decision; flag for human review, do NOT change the code automatically

Output from each sub-agent — a JSON array of objects with fields: comment_id, path, line, verdict, evidence, category (REJECT only), commit_sha_hint.

Building sub-agent prompts with pre-built context:

When diff_context_map.get((comment.path, comment.line), {}).get("code_region") returns a non-empty value:

Pre-built code region (from review-pr, ±50 diff lines):
{diff_context_map.get((comment.path, comment.line), {}).get("code_region", "")}

Use the above region for context. Do NOT read the file — the region is already provided.
Run `git log --follow -p --max-count=5 -- {path}` for history context as usual.

When diff_context_map has no entry but diff_hunk is present (non-empty):

Code context (diff_hunk from review comment):
{comment.diff_hunk}

Use the above hunk for classification context. Only read the source file if:
(a) the comment body references code outside this hunk, or (b) you need
additional context not visible in the hunk. Run `git log` for history as usual.

When diff_context_map has no entry AND diff_hunk is empty or missing: fall back to reading the file at ±30 lines from the flagged line.

Fallback: If a sub-agent fails or times out, classify all comments in that group as DISCUSS (safe fallback — no code is changed, human reviews). Log the failure including the error message, domain group name, and affected comment IDs.

Merge results into a classification_map: dict[comment_id, verdict_entry].

Each entry must also carry two additional fields populated at merge time (not delegated to sub-agents):

severity — diff_context_map.get((path, line), {}).get("severity", locally_classified_severity) where locally_classified_severity is the severity computed in Step 3 (critical/warning/info from keyword matching). This ensures a meaningful value even when no review-pr handoff entry exists for this (path, line).
dimension — diff_context_map.get((path, line), {}).get("dimension", "unknown") (arch|tests|bugs|defense|cohesion|slop|deletion_regression|unknown). "unknown" is the correct sentinel when diff_context_map has no entry.

For auto-classified INFO findings (those classified as DISCUSS in Step 3 without entering Step 3.5): add them to classification_map with severity="info" and dimension=diff_context_map.get((path, line), {}).get("dimension", "unknown").
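The merge-time enrichment can be sketched as a small helper. enrich_entry is a hypothetical name; local_severity stands for the severity computed in Step 3.

```python
def enrich_entry(entry, diff_context_map, local_severity):
    """Return a copy of a verdict entry with severity and dimension filled in."""
    context = diff_context_map.get((entry["path"], entry["line"]), {})
    enriched = dict(entry)
    # Prefer the review-pr handoff value; fall back to the Step 3 keyword tier.
    enriched["severity"] = context.get("severity", local_severity)
    # "unknown" is the sentinel when the handoff has no entry.
    enriched["dimension"] = context.get("dimension", "unknown")
    return enriched
```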

Write analysis report to {{AUTOSKILLIT_TEMP}}/resolve-review/analysis_{pr_number}_{ts}.md before any code changes are made. The report must include a summary banner:

Analysis complete (BEFORE any code changes)
ACCEPT: N | REJECT: N | DISCUSS: N

Track: accept_count, reject_count, discuss_count.

Step 3.6: Accumulate DISCUSS Findings (local mode only)

When mode=local:

After intent validation (Step 3.5), accumulate all DISCUSS-classified findings to a persistent local file for later posting when mode switches to github.

Read the iteration field from {{AUTOSKILLIT_TEMP}}/review-pr/local_findings_{pr_number}.json to get the round number. If that file is absent, use iteration = 0.

import json, pathlib

deferred_file = pathlib.Path("{{AUTOSKILLIT_TEMP}}/resolve-review/deferred_observations_${PR_NUMBER}.json")

# Load existing entries if file exists
existing = []
if deferred_file.exists():
    existing = json.loads(deferred_file.read_text())

# Build new entries from classification_map (DISCUSS only)
discuss_entries = []
for c in classification_map.values():
    if c.get("verdict") != "DISCUSS":
        continue
    entry = {
        "round": iteration_number,
        "path": c.get("path"),
        "line": c.get("line"),
        "body": c.get("body"),
        "evidence": c.get("evidence", ""),
        "severity": c.get("severity", "warning"),
        "dimension": c.get("dimension", "unknown"),
        "verdict": "DISCUSS",
        "category": c.get("category", "design_decision"),
    }
    discuss_entries.append(entry)

# Deduplicate: skip if (path, line, body) already in existing
seen = {(e["path"], e["line"], e["body"]) for e in existing}
new_entries = [e for e in discuss_entries if (e["path"], e["line"], e["body"]) not in seen]

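# Write atomically: write to a temp file, then rename it over the target
all_entries = existing + new_entries
deferred_file.parent.mkdir(parents=True, exist_ok=True)
tmp_file = deferred_file.with_name(deferred_file.name + ".tmp")
tmp_file.write_text(json.dumps(all_entries, indent=2))
tmp_file.replace(deferred_file)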
print(f"Accumulated {len(new_entries)} new DISCUSS findings ({len(all_entries)} total)")

The round value is the iteration number from local_findings_{pr_number}.json (written by review-pr with auto-incrementing logic).

When mode=github: Skip this step. DISCUSS findings are handled by inline replies (with REVIEW-FLAG markers) in Step 6.5.

Step 4: Apply Fixes (max 3 iterations)

Initialize addressed_thread_ids: list[str] = [] before processing findings.

For each finding where the classification map shows verdict = ACCEPT (process critical findings first, then warnings):

Context for understanding: If diff_context_map.get((path, line), {}).get("code_region") returns a non-empty value, use the pre-built code_region for initial understanding — skip the ±20 line read. The pre-built region is already available from the review-pr handoff. If diff_context_map has no entry, read the referenced file and ±20 lines of context as before. In both cases, still read the file when actually applying the edit — the pre-built context covers understanding only, not the write.
Understand what the reviewer is requesting
Apply the fix

Stage and commit:

git add {file}
# If pre-commit hooks are configured:
pre-commit run --files {file} && git add {file}
git commit -m "fix(review): {brief description of reviewer's request}"

Classification gate — REJECT/DISCUSS bypass: For findings where the classification map shows verdict = REJECT or verdict = DISCUSS:

For REJECT: no code changes are applied; record (file, line, reason="classifier: REJECT — {evidence}").
For DISCUSS: record (file, line, reason="classifier: DISCUSS — {context}").

thread_node_id Tracking:

| Outcome | Append to addressed_thread_ids? |
|---------|-----------------------------------|
| ACCEPT (fix committed) | Yes (if thread_node_id is not None) |
| REJECT (no code change) | Yes (if thread_node_id is not None) |
| DISCUSS (awaiting human decision) | No (do not add DISCUSS findings to addressed_thread_ids) |
| Skipped finding (stale, missing file, unclear) | No |
| File-level comment (line is null) | No |
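The dispositions in the table above reduce to a single predicate. In this sketch the outcome labels ("ACCEPT_FIXED", "REJECT", and so on) are hypothetical names chosen for illustration.

```python
# Hypothetical outcome labels; only these two append the thread ID.
TRACKED_OUTCOMES = {"ACCEPT_FIXED", "REJECT"}


def should_track_thread(outcome, thread_node_id):
    """True when the thread ID belongs in addressed_thread_ids."""
    return outcome in TRACKED_OUTCOMES and thread_node_id is not None
```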

Skip a finding if:

The comment is a file-level comment (line is null) — these have no code anchor
The referenced file does not exist in the current branch
The finding references a line number that no longer exists (stale comment)
The fix would require a design decision beyond the reviewer's explicit guidance
The reviewer's request is contradicted by another reviewer's comment on the same location

Record each skip with: (file, line, reason).

Skip flow: when a finding is skipped for any of the reasons above, record (file, line, reason) and do not append its thread_node_id to addressed_thread_ids.

Step 5: Run Tests

{test_command}

Pass → proceed to Step 6 (Resolve Addressed Review Threads)
Fail (iteration < 3): analyze failures against the fixes applied, revert/adjust the problematic commit, re-commit and retry (increment iteration counter)
Fail (iteration >= 3): report failure, leave working directory intact, exit non-zero
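The retry policy can be sketched as a bounded loop. This assumes the iteration counter starts at 1, so the tests run at most three times. run_tests and adjust_fixes are hypothetical callables standing in for the test command and the revert/adjust step.

```python
def run_fix_loop(run_tests, adjust_fixes, max_iterations=3):
    """Return (passed, iterations_used) after at most max_iterations test runs."""
    for iteration in range(1, max_iterations + 1):
        if run_tests():
            return True, iteration
        if iteration < max_iterations:
            # Revert or adjust the problematic commit, then retry.
            adjust_fixes(iteration)
    # Out of iterations: the caller reports failure and exits non-zero,
    # leaving the working directory intact.
    return False, max_iterations
```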

Step 6: Resolve Addressed Review Threads

MODE BRANCHING:

When mode=github: Execute the following thread resolution steps (current behavior unchanged).

Batch all thread resolutions into a single GraphQL request using aliased mutations. This reduces N requests (5 pts each = 5N pts) to 1 request (5 pts total). If addressed_thread_ids has more than 50 threads, chunk into batches of 50.

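# Build aliased mutation query for all addressed threads
# (skip the call entirely when ADDRESSED_THREAD_IDS is empty)
MUTATION_QUERY="mutation {"
# Iterate over array indices; unlike seq, this is a no-op for an empty array
for i in "${!ADDRESSED_THREAD_IDS[@]}"; do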
    tid="${ADDRESSED_THREAD_IDS[$i]}"
    MUTATION_QUERY="${MUTATION_QUERY} resolve${i}: resolveReviewThread(input: {threadId: \"${tid}\"}) { thread { isResolved } }"
done
MUTATION_QUERY="${MUTATION_QUERY} }"

gh api graphql -f query="${MUTATION_QUERY}"

Parse the response: for each resolve${i} alias key, check thread.isResolved.

Success (isResolved: true): increment resolved_count.
Failure (non-zero exit code, parse error, or isResolved: false for any alias): log a warning "Warning: could not resolve thread ${tid}: {error}". Continue to the next thread. Do not modify exit code.

Track:

resolved_count: int — successfully resolved threads
resolve_failed_count: int — threads that could not be resolved (permissions, network)

This step is best-effort — failure to resolve any thread never affects the exit code. The same applies to Step 6.5 (inline replies).
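Chunking and response parsing for the aliased mutations could look like this sketch. Function names are hypothetical, and the response shape is assumed to be the standard GraphQL {"data": {alias: {...}}} envelope.

```python
import json


def build_resolve_mutations(thread_ids, chunk_size=50):
    """Return one aliased GraphQL mutation string per chunk of thread IDs."""
    queries = []
    for start in range(0, len(thread_ids), chunk_size):
        chunk = thread_ids[start:start + chunk_size]
        parts = [
            # json.dumps quotes and escapes the thread ID as a string literal.
            f"resolve{i}: resolveReviewThread(input: {{threadId: {json.dumps(tid)}}}) "
            "{ thread { isResolved } }"
            for i, tid in enumerate(chunk)
        ]
        queries.append("mutation { " + " ".join(parts) + " }")
    return queries


def count_resolved(response, chunk_len):
    """Return (resolved_count, failed_count) for one chunk's response."""
    data = response.get("data") or {}
    resolved = 0
    for i in range(chunk_len):
        thread = (data.get(f"resolve{i}") or {}).get("thread") or {}
        if thread.get("isResolved") is True:
            resolved += 1
    return resolved, chunk_len - resolved
```

An empty thread_ids list yields no queries, so no request is sent in that case.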

When mode=local:

Skip all GitHub thread resolution API calls (no GraphQL mutation, no thread resolution)
Set resolved_count = 0, resolve_failed_count = 0
The addressed_thread_ids list is not populated (there are no thread IDs in local mode)
Proceed to Step 6.5 (inline replies — also skipped in local mode)

Step 6.5: Post Inline Replies

When mode=local:

Skip all inline reply POST calls
Set reply_posted_count = 0, reply_failed_count = 0
Proceed to Step 6.6

When mode=github: Execute the following inline reply steps (current behavior unchanged).

For every finding (those classified via intent validation in Step 3.5 and info findings auto-classified as DISCUSS in Step 3), post an inline reply using the GitHub comment reply API. Each finding receives exactly one reply based on its classification.

# Build reply body based on classification:
# ACCEPT:
BODY="Agreed — fixed in ${commit_sha}. ${evidence}
<!-- autoskillit:resolved comment_id=${comment_id} verdict=ACCEPT -->"
# REJECT:
BODY="Investigated — this is intentional. ${evidence}
<!-- autoskillit:resolved comment_id=${comment_id} verdict=REJECT -->"
# DISCUSS:
BODY="Valid observation — flagged for design decision. ${evidence}
<!-- REVIEW-FLAG: severity=${severity} dimension=${dimension} -->
<!-- autoskillit:resolved comment_id=${comment_id} verdict=DISCUSS -->"
# INFO (auto-classified DISCUSS):
BODY="Acknowledged — minor suggestion noted.
<!-- REVIEW-FLAG: severity=info dimension=${dimension} -->
<!-- autoskillit:resolved comment_id=${comment_id} verdict=INFO -->"

gh api repos/{owner}/{repo}/pulls/{pr_number}/comments/{comment_id}/replies \
  --method POST \
  --field body="${BODY}"
sleep 1  # Rate-limit discipline: 1s between mutating calls

For ACCEPT replies, use the commit_sha from the most recent commit made in Step 4 (i.e., git log --format="%H" -1 after committing the fix). If the comment was classified as ACCEPT but skipped in Step 4 (stale comment, etc.), omit the commit sha reference.

For REJECT replies, include specific evidence (line numbers, design contracts, API references) from the sub-agent's evidence field so the reply is self-contained and suitable for future automated mining.

Track:

reply_posted_count: int — successfully posted replies
reply_failed_count: int — replies that failed (log warning, continue)

Step 6.6: Persist Reject Patterns

MODE BRANCHING:

When mode=local:

After Step 6.5, save all REJECT-classified comments to a stable, accumulating JSON file (without timestamp — the same file is reused across local rounds):

import json, pathlib

reject_file = pathlib.Path("{{AUTOSKILLIT_TEMP}}/resolve-review/reject_patterns_${PR_NUMBER}.json")

# Load existing entries if file exists
existing = []
if reject_file.exists():
    existing = json.loads(reject_file.read_text())

# Build new entries — use synthetic comment_id for local-mode findings
# Format: "local_{iteration}_{index}" derived from iteration in local_findings + index
reject_entries = []
for idx, c in enumerate(classification_map.values()):
    if c.get("verdict") != "REJECT":
        continue
    # Prefer the finding's own ID; otherwise build the synthetic
    # "local_{iteration}_{index}" form (iteration_number as read in Step 3.6)
    comment_id = c.get("comment_id", f"local_{iteration_number}_{idx}")
    entry = {
        "comment_id": comment_id,
        "path": c.get("path"),
        "line": c.get("line"),
        "body": c.get("body"),
        "evidence": c.get("evidence", ""),
        "category": c.get("category", "other"),
        "pr_number": ${PR_NUMBER},
        "feature_branch": "${feature_branch}",
    }
    reject_entries.append(entry)

# Deduplicate: skip if (path, line, body) already in existing
seen = {(e["path"], e["line"], e["body"]) for e in existing}
new_entries = [e for e in reject_entries if (e["path"], e["line"], e["body"]) not in seen]

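# Write atomically: write to a temp file, then rename it over the target
all_entries = existing + new_entries
reject_file.parent.mkdir(parents=True, exist_ok=True)
tmp_file = reject_file.with_name(reject_file.name + ".tmp")
tmp_file.write_text(json.dumps(all_entries, indent=2))
tmp_file.replace(reject_file)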
print(f"Accumulated {len(new_entries)} new REJECT patterns ({len(all_entries)} total)")

Note: In local mode, comment_id may not be a GitHub database ID. Use whatever ID is in the classification_map entry. If the finding came from local_findings_{pr_number}.json and has no native ID, use "local_{iteration}_{index}" as a synthetic identifier.

When mode=github: Execute the following current behavior (timestamped, one-shot write).

ts=$(date +%Y%m%d-%H%M%S)
python3 -c "
import json, pathlib
reject_entries = [
    {
        'comment_id': c['comment_id'],
        'path': c['path'],
        'line': c['line'],
        'body': c['body'],
        'evidence': c['evidence'],
        'category': c['category'],
        'pr_number': ${PR_NUMBER},
        'feature_branch': '${feature_branch}',
    }
    for c in classification_map.values()
    if c['verdict'] == 'REJECT'
]
pathlib.Path('{{AUTOSKILLIT_TEMP}}/resolve-review/reject_patterns_${PR_NUMBER}_${ts}.json').write_text(
    json.dumps(reject_entries, indent=2)
)
print(f'Saved {len(reject_entries)} reject patterns')
"

Step 7: Report

MODE INDEPENDENCE: task test-check (Step 5) runs identically in both modes. Gate token emission is mode-independent.

Print a structured summary to terminal:

resolve-review complete
PR: #{pr_number} ({feature_branch} → {base_branch})
Findings fetched: {total}
  - critical: {n}
  - warning: {n}
  - info: {n}
Intent validation (before code changes):
  - ACCEPT: {accept_count}
  - REJECT: {reject_count}
  - DISCUSS: {discuss_count}
Fixes applied: {accept_count - skipped_in_fix_phase}
Fixes skipped: {n}
  - {file}:{line} — {reason}
Threads resolved: {resolved_count}/{len(addressed_thread_ids)}
  - {resolve_failed_count} failed (warnings logged above)
Inline replies: {reply_posted_count} posted / {reply_failed_count} failed
Reject patterns saved: {{AUTOSKILLIT_TEMP}}/resolve-review/reject_patterns_{pr_number}_{ts}.json
Test iterations: {n}
Status: PASS

Save full report to:

Analysis report: {{AUTOSKILLIT_TEMP}}/resolve-review/analysis_{pr_number}_{ts}.md (written before code changes)
Final report: {{AUTOSKILLIT_TEMP}}/resolve-review/report_{pr_number}_{ts}.md

Then determine and emit the structured output tokens (required for the write_behavior: conditional contract gate and on_result: routing):

Verdict Decision:

If {accept_count - skipped_in_fix_phase} >= 1 (fixes were applied): verdict = real_fix
If no fixes were applied (there were no ACCEPT findings, or every ACCEPT finding was skipped): verdict = already_green

IMPORTANT: Emit the tokens as literal plain text with no markdown formatting. Do not wrap in bold or italic.

verdict = {verdict}
fixes_applied = {accept_count - skipped_in_fix_phase}

Where:

{verdict} is real_fix if fixes were applied, already_green otherwise
{accept_count - skipped_in_fix_phase} is the number of ACCEPT findings where code changes were actually committed
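
The decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names decide_verdict and format_tokens are hypothetical and not part of the skill, and the two counts are assumed to be plain integers tracked during Steps 3.5 and 4.

```python
from __future__ import annotations


def decide_verdict(accept_count: int, skipped_in_fix_phase: int) -> tuple[str, int]:
    """Return (verdict, fixes_applied) following the Verdict Decision rules."""
    fixes_applied = accept_count - skipped_in_fix_phase
    # real_fix only when at least one ACCEPT finding led to a committed change.
    verdict = "real_fix" if fixes_applied >= 1 else "already_green"
    return verdict, fixes_applied


def format_tokens(accept_count: int, skipped_in_fix_phase: int) -> str:
    """Render the two tokens as plain text, one per line, with no markdown."""
    verdict, fixes_applied = decide_verdict(accept_count, skipped_in_fix_phase)
    return f"verdict = {verdict}\nfixes_applied = {fixes_applied}"


# Hypothetical run: three ACCEPT findings, one of them skipped in Step 4.
print(format_tokens(accept_count=3, skipped_in_fix_phase=1))
```

With zero ACCEPT findings the same rule yields already_green and fixes_applied = 0, which matches the "otherwise" wording above.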

The Step 1 graceful degradation exit must NOT emit these tokens: when the skill skips because gh is unavailable or no PR is found, it emits no tokens at all.

Exit 0.

Output

When a PR is processed, the following structured output tokens are emitted:

verdict = real_fix|already_green
fixes_applied = {N}

Where {N} is the count of ACCEPT findings where code changes were committed. verdict = real_fix means at least one fix was committed; verdict = already_green means no code changes were committed, either because no finding was classified ACCEPT or because every ACCEPT finding was skipped.

Mode-conditional path outputs:

When mode=local, the following additional tokens are emitted:

deferred_observations_path = {{AUTOSKILLIT_TEMP}}/resolve-review/deferred_observations_{pr_number}.json
reject_patterns_path = {{AUTOSKILLIT_TEMP}}/resolve-review/reject_patterns_{pr_number}.json

When mode=github and prior local rounds accumulated observations, those observations are posted to GitHub and the file is renamed to deferred_observations_{pr_number}_posted.json. No path token is emitted for the posted state.

Summary written to: {{AUTOSKILLIT_TEMP}}/resolve-review/report_{pr_number}_{ts}.md (relative to the current working directory)
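
For illustration, here is a minimal sketch of how a consumer might read these tokens back out of the skill's terminal output. It is an assumption, not the orchestrator's actual parser: the name parse_tokens and the regular expression are hypothetical, and the sketch assumes each token sits alone on its own line in the "name = value" shape shown above.

```python
from __future__ import annotations

import re

# Assumed token shape: "<name> = <value>" alone on a line (hypothetical pattern).
TOKEN_RE = re.compile(r"^(\w+) = (.+)$")


def parse_tokens(output: str) -> dict[str, str]:
    """Collect every 'name = value' line from the output into a dict."""
    tokens: dict[str, str] = {}
    for line in output.splitlines():
        match = TOKEN_RE.match(line.strip())
        if match:
            tokens[match.group(1)] = match.group(2)
    return tokens


# Hypothetical tail of a run: summary lines followed by the two tokens.
sample = """resolve-review complete
Fixes applied: 2
Status: PASS
verdict = real_fix
fixes_applied = 2
"""
print(parse_tokens(sample))
```

A strict line pattern like this would miss a token wrapped in bold or italic markers, which is one reason to emit the tokens as literal plain text. Under the same assumption, output from the Step 1 skip path contains no tokens and parses to an empty dict.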
