Audit Review Decisions Skill

Mine merged PR review threads for agreed-but-deferred suggestions that were never implemented. Identify review debt before it compounds.

When to Use

User says "audit review decisions", "find deferred review items", "surface review debt", "what did reviewers flag for later"

Arguments

$1 — Time period (e.g. 14d, 30d, 7d). Default: 14d.
$2 — Output path. Default: ${AUTOSKILLIT_TEMP}/audit-review-decisions/review_decisions_audit_$(date +%Y-%m-%d_%H%M%S).md

Critical Constraints

NEVER:

Create files outside ${AUTOSKILLIT_TEMP}/audit-review-decisions/
Have triage or validation subagents make GitHub API calls (local data only for Step 2)
Post duplicate [AUDIT] markers — check for existing marker before posting
Run subagents in the background (run_in_background: true is prohibited)
Use gh pr list without --limit to avoid pagination truncation
Use \| in Grep patterns — use | for alternation (ERE, not BRE)

ALWAYS:

Save raw PR JSON to temp before any analysis (Step 1)
Use GraphQL alias batching (~20 PRs per query) for data collection
Include rateLimit { cost remaining resetAt } in every GraphQL query
Sleep 1s between consecutive mutating GitHub API calls (Step 5 watermark posts)
Step 2 triage subagents read local JSON files only — zero API calls
Step 3 validation subagents grep the actual current codebase
Skip threads that already contain an [AUDIT] comment
Resolve owner/repo from git remote get-url origin — never hardcode
Use /autoskillit: prefix when invoking any other skill

Workflow

Step 0: Watermark Resolution

Parse $1 for time period. Default 14d. Compute PERIOD_DAYS.
Resolve OWNER and REPO from git remote get-url origin.

Query the most recent [AUDIT] sentinel comment across recently merged PRs:

gh api graphql -f query='
  query($owner:String!, $name:String!) {
    rateLimit { cost remaining resetAt }
    repository(owner:$owner, name:$name) {
      pullRequests(first:500, states:MERGED, orderBy:{field:UPDATED_AT,direction:DESC}) {
        nodes { number
          reviewThreads(first:50) {
            nodes { comments(first:10) { nodes { body createdAt } } }
          }
        }
      }
    }
  }' -f owner="${OWNER}" -f name="${REPO}"

Extract the most recent createdAt from any comment whose body starts with [AUDIT]. Store as LAST_AUDIT_TS (empty string if none — first run).

Compute SCAN_SINCE:
- If LAST_AUDIT_TS is set: max(LAST_AUDIT_TS, date -d "now - PERIOD_DAYS days")
- Else: date -d "now - PERIOD_DAYS days" --iso-8601=seconds
Log: Scan window: ${SCAN_SINCE} to now (${PERIOD_DAYS}d configured, last audit: ${LAST_AUDIT_TS:-none})

Step 1: Data Collection (GraphQL Batch)

List merged PRs in the scan window:

SCAN_DATE=$(echo "${SCAN_SINCE}" | cut -c1-10)
PR_NUMS=$(gh pr list --state merged \
  --search "merged:>=${SCAN_DATE}" \
  --json number --limit 500 | jq -r '.[].number')

Create temp directory:

mkdir -p "${AUTOSKILLIT_TEMP}/audit-review-decisions/raw"

Batch fetch in groups of 20 using GraphQL aliases. For each batch, build a query with aliased pr${i}: pullRequest(number: ${NUM}) nodes. Each node fetches:
```
number title mergedAt
reviews(first: 100) {
  nodes { author { login } body state submittedAt }
}
reviewThreads(first: 100) {
  pageInfo { hasNextPage endCursor }
  nodes {
    isResolved
    comments(first: 100) {
      nodes { databaseId author { login } body path line createdAt }
    }
  }
}
```
Include rateLimit { cost remaining resetAt } at query root. After the initial fetch, for each PR where reviewThreads.pageInfo.hasNextPage is true, issue additional aliased queries with reviewThreads(first:100, after:$endCursor) until hasNextPage is false. Merge the nodes arrays across pages before filtering.
For each PR in the batch response:
- Filter out threads whose comments list contains any comment with body starting with [AUDIT] (already watermarked — skip entirely).
- If the PR has zero remaining threads: skip saving.
- Otherwise: save filtered data to ${AUTOSKILLIT_TEMP}/audit-review-decisions/raw/pr_${number}.json

Step 2: Triage (Haiku — Broad Pass)

List all JSON files in raw/. Split into batches of ~5 files per agent.
Launch parallel Haiku subagents (one per batch, model: "haiku"). Each agent:
- Reads its assigned JSON files only (no API calls).
- Flags a thread if it matches any signal:
  - <!-- REVIEW-FLAG: tag present
  - Body contains one of: "Valid observation — flagged for design decision", "out of scope for this fix cycle", "requires a dedicated cleanup commit", "left open for human review", "future improvement", "beyond this PR's scope", "requires team consensus"
  - Thread isResolved: false AND author acknowledged validity in a reply
  - Review body state: COMMENTED with no corresponding thread (needs_human indicator)
- Returns candidates as response text only — no file writes. Per-candidate format:
```
PR: {number}
thread_index: {N}
comment_id: {databaseId of first comment in thread}
path: {file path or empty}
line: {line number or empty}
signal: REVIEW-FLAG|KEYWORD|UNRESOLVED|NEEDS_HUMAN
severity: {from REVIEW-FLAG tag, or "unknown"}
dimension: {from REVIEW-FLAG tag, or "unknown"}
quote: {first 200 chars of flagged comment body}
```
- False positives are acceptable; false negatives are not.
Collect and parse candidate text from all agent responses.

Step 3: Validation (Sonnet — Deep Pass)

Group candidates into batches of ~10. Launch parallel Sonnet subagents (model: "sonnet") per batch.
Each Sonnet agent receives its candidate batch and, for each candidate:
- If path is set: reads the file and surrounding context.
- Greps the current codebase for the core concern from quote (judgment-based pattern, not a literal string match).
- Classifies:
  - VALID — issue still present, impactful, ticket-worthy
  - RESOLVED — code changed; concern no longer applies
  - STALE — code deleted/refactored; finding irrelevant
- Returns findings as response text only — no file writes. Per-finding format:
```
PR: {number}
comment_id: {databaseId}
classification: VALID|RESOLVED|STALE
path: {file:line or empty}
severity: {critical|warning|info|unknown}
dimension: {arch|bugs|defense|tests|cohesion|slop|unknown}
priority: HIGH|MEDIUM|LOW
impact: {one sentence}
suggested_title: {short GitHub issue title}
reviewer_quote: {verbatim first 300 chars}
```
- Priority assignment: HIGH = severity=critical OR dimension in (bugs, arch); MEDIUM = severity=warning; LOW = severity=info or unknown.
Collect and parse validated findings from all agent responses.

Step 4: Report Generation

Collect all validated findings from Step 3 subagent responses.
Sort findings: VALID first (by priority HIGH→MEDIUM→LOW), then RESOLVED, then STALE.
Resolve the output path:
- Use $2 if provided.
- Otherwise: ${AUTOSKILLIT_TEMP}/audit-review-decisions/review_decisions_audit_$(date +%Y-%m-%d_%H%M%S).md
Create parent directory: mkdir -p "$(dirname "${OUTPUT_PATH}")"
Write the markdown report to ${OUTPUT_PATH}. Structure:

PR Review Decisions Audit — {PERIOD_DAYS}d window

Generated: {ISO timestamp} Scan window: {SCAN_SINCE} to {now} PRs scanned: {N} | Threads examined: {M} | Threads skipped (already audited): {K}

Summary

| Classification | Count | |---|---| | VALID (ticket-worthy) | {N} | | RESOLVED (already fixed) | {N} | | STALE (no longer applicable) | {N} |

Priority Triage

HIGH Priority

For each HIGH VALID finding, write a section:

### {suggested_title}

**PR:** #{number} | **File:** {path}:{line} | **Severity:** {severity} | **Dimension:** {dimension}

> {reviewer_quote}

**Current relevance:** VALID — {impact}

**Suggested issue title:** {suggested_title}
**Affected files:** {path}

MEDIUM Priority

{Same structure}

LOW Priority

{Same structure}

RESOLVED Findings

{List: PR, file, one-line description of what was fixed}

STALE Findings

{List: PR, file, one-line description of why no longer applicable}

Open PR Findings

{Findings from PRs that were open (not merged) at scan time — may still be addressed. Same per-finding structure but labeled as pending.}

Pattern Analysis

Most common deferral phrases (by frequency): {Table: phrase | count | % of all candidates}

Dimensions with highest VALID rate: {Table: dimension | valid | resolved | stale | valid_rate}

Systemic escape hatches detected: {Narrative: which phrases act as systematic blockers to tracking, with counts}

Recommendations: {2–4 concrete process recommendations based on the pattern data}

After writing the file, print a terminal summary:

audit-review-decisions complete
Output: {OUTPUT_PATH}
VALID: {N} | RESOLVED: {N} | STALE: {N}
Top finding: {first HIGH priority suggested_title, or "none"}

Step 5: Watermark (Thread Annotation)

For every finding processed in Steps 2–3 (all classifications — VALID, RESOLVED, STALE):

Re-check for existing audit marker (live): fetch the thread's current comments directly from the GitHub API — do not use the Step 1 JSON cache, which already filtered out [AUDIT]-marked threads and cannot detect markers posted after Step 1:
```
gh api "repos/${OWNER}/${REPO}/pulls/${PR_NUMBER}/comments" \
  --jq "[.[] | select(.id == ${COMMENT_ID} or .in_reply_to_id == ${COMMENT_ID}) | .body | startswith(\"[AUDIT]\")] | any"
```
If the result is true: skip this thread (idempotent — no duplicate post).
Determine marker body based on classification and ticket status:

| Classification | Ticket created? | Marker body | |---|---|---| | VALID | Yes | [AUDIT] — tracked in #{issue_number} | | VALID | No | [AUDIT] — acknowledged, no action taken | | RESOLVED | — | [AUDIT] — verified resolved in current codebase | | STALE | — | [AUDIT] — no longer applicable |

Post reply comment:

gh api "repos/${OWNER}/${REPO}/pulls/${PR_NUMBER}/comments/${COMMENT_ID}/replies" \
  --method POST \
  --field body="${MARKER_BODY}"
sleep 1

COMMENT_ID is the databaseId of the first comment in the thread (from Step 1 JSON).

Thread reply constraint: These calls cannot be batched via the reviews API — each requires an individual POST. The 1s delay between calls is mandatory per GitHub API discipline.
Log progress per finding: [AUDIT] Posted marker on PR #{number} thread {comment_id}: {marker_body}

Audit Review Decisions Skill

Mine merged PR review threads for agreed-but-deferred suggestions that were never implemented. Identify review debt before it compounds.

When to Use

User says "audit review decisions", "find deferred review items", "surface review debt", "what did reviewers flag for later"

Arguments

$1 — Time period (e.g. 14d, 30d, 7d). Default: 14d.
$2 — Output path. Default: ${AUTOSKILLIT_TEMP}/audit-review-decisions/review_decisions_audit_$(date +%Y-%m-%d_%H%M%S).md

Critical Constraints

NEVER:

Create files outside ${AUTOSKILLIT_TEMP}/audit-review-decisions/
Have triage or validation subagents make GitHub API calls (local data only for Step 2)
Post duplicate [AUDIT] markers — check for existing marker before posting
Run subagents in the background (run_in_background: true is prohibited)
Use gh pr list without --limit to avoid pagination truncation
Use \| in Grep patterns — use | for alternation (ERE, not BRE)

ALWAYS:

Save raw PR JSON to temp before any analysis (Step 1)
Use GraphQL alias batching (~20 PRs per query) for data collection
Include rateLimit { cost remaining resetAt } in every GraphQL query
Sleep 1s between consecutive mutating GitHub API calls (Step 5 watermark posts)
Step 2 triage subagents read local JSON files only — zero API calls
Step 3 validation subagents grep the actual current codebase
Skip threads that already contain an [AUDIT] comment
Resolve owner/repo from git remote get-url origin — never hardcode
Use /autoskillit: prefix when invoking any other skill

Workflow

Step 0: Watermark Resolution

Parse $1 for time period. Default 14d. Compute PERIOD_DAYS.
Resolve OWNER and REPO from git remote get-url origin.

Query the most recent [AUDIT] sentinel comment across recently merged PRs:

gh api graphql -f query='
  query($owner:String!, $name:String!) {
    rateLimit { cost remaining resetAt }
    repository(owner:$owner, name:$name) {
      pullRequests(first:500, states:MERGED, orderBy:{field:UPDATED_AT,direction:DESC}) {
        nodes { number
          reviewThreads(first:50) {
            nodes { comments(first:10) { nodes { body createdAt } } }
          }
        }
      }
    }
  }' -f owner="${OWNER}" -f name="${REPO}"

Extract the most recent createdAt from any comment whose body starts with [AUDIT]. Store as LAST_AUDIT_TS (empty string if none — first run).

Compute SCAN_SINCE:
- If LAST_AUDIT_TS is set: max(LAST_AUDIT_TS, date -d "now - PERIOD_DAYS days")
- Else: date -d "now - PERIOD_DAYS days" --iso-8601=seconds
Log: Scan window: ${SCAN_SINCE} to now (${PERIOD_DAYS}d configured, last audit: ${LAST_AUDIT_TS:-none})

Step 1: Data Collection (GraphQL Batch)

List merged PRs in the scan window:

SCAN_DATE=$(echo "${SCAN_SINCE}" | cut -c1-10)
PR_NUMS=$(gh pr list --state merged \
  --search "merged:>=${SCAN_DATE}" \
  --json number --limit 500 | jq -r '.[].number')

Create temp directory:

mkdir -p "${AUTOSKILLIT_TEMP}/audit-review-decisions/raw"

Batch fetch in groups of 20 using GraphQL aliases. For each batch, build a query with aliased pr${i}: pullRequest(number: ${NUM}) nodes. Each node fetches:
```
number title mergedAt
reviews(first: 100) {
  nodes { author { login } body state submittedAt }
}
reviewThreads(first: 100) {
  pageInfo { hasNextPage endCursor }
  nodes {
    isResolved
    comments(first: 100) {
      nodes { databaseId author { login } body path line createdAt }
    }
  }
}
```
Include rateLimit { cost remaining resetAt } at query root. After the initial fetch, for each PR where reviewThreads.pageInfo.hasNextPage is true, issue additional aliased queries with reviewThreads(first:100, after:$endCursor) until hasNextPage is false. Merge the nodes arrays across pages before filtering.
For each PR in the batch response:
- Filter out threads whose comments list contains any comment with body starting with [AUDIT] (already watermarked — skip entirely).
- If the PR has zero remaining threads: skip saving.
- Otherwise: save filtered data to ${AUTOSKILLIT_TEMP}/audit-review-decisions/raw/pr_${number}.json

Step 2: Triage (Haiku — Broad Pass)

List all JSON files in raw/. Split into batches of ~5 files per agent.
Launch parallel Haiku subagents (one per batch, model: "haiku"). Each agent:
- Reads its assigned JSON files only (no API calls).
- Flags a thread if it matches any signal:
  - <!-- REVIEW-FLAG: tag present
  - Body contains one of: "Valid observation — flagged for design decision", "out of scope for this fix cycle", "requires a dedicated cleanup commit", "left open for human review", "future improvement", "beyond this PR's scope", "requires team consensus"
  - Thread isResolved: false AND author acknowledged validity in a reply
  - Review body state: COMMENTED with no corresponding thread (needs_human indicator)
- Returns candidates as response text only — no file writes. Per-candidate format:
```
PR: {number}
thread_index: {N}
comment_id: {databaseId of first comment in thread}
path: {file path or empty}
line: {line number or empty}
signal: REVIEW-FLAG|KEYWORD|UNRESOLVED|NEEDS_HUMAN
severity: {from REVIEW-FLAG tag, or "unknown"}
dimension: {from REVIEW-FLAG tag, or "unknown"}
quote: {first 200 chars of flagged comment body}
```
- False positives are acceptable; false negatives are not.
Collect and parse candidate text from all agent responses.

Step 3: Validation (Sonnet — Deep Pass)

Group candidates into batches of ~10. Launch parallel Sonnet subagents (model: "sonnet") per batch.
Each Sonnet agent receives its candidate batch and, for each candidate:
- If path is set: reads the file and surrounding context.
- Greps the current codebase for the core concern from quote (judgment-based pattern, not a literal string match).
- Classifies:
  - VALID — issue still present, impactful, ticket-worthy
  - RESOLVED — code changed; concern no longer applies
  - STALE — code deleted/refactored; finding irrelevant
- Returns findings as response text only — no file writes. Per-finding format:
```
PR: {number}
comment_id: {databaseId}
classification: VALID|RESOLVED|STALE
path: {file:line or empty}
severity: {critical|warning|info|unknown}
dimension: {arch|bugs|defense|tests|cohesion|slop|unknown}
priority: HIGH|MEDIUM|LOW
impact: {one sentence}
suggested_title: {short GitHub issue title}
reviewer_quote: {verbatim first 300 chars}
```
- Priority assignment: HIGH = severity=critical OR dimension in (bugs, arch); MEDIUM = severity=warning; LOW = severity=info or unknown.
Collect and parse validated findings from all agent responses.

Step 4: Report Generation

Collect all validated findings from Step 3 subagent responses.
Sort findings: VALID first (by priority HIGH→MEDIUM→LOW), then RESOLVED, then STALE.
Resolve the output path:
- Use $2 if provided.
- Otherwise: ${AUTOSKILLIT_TEMP}/audit-review-decisions/review_decisions_audit_$(date +%Y-%m-%d_%H%M%S).md
Create parent directory: mkdir -p "$(dirname "${OUTPUT_PATH}")"
Write the markdown report to ${OUTPUT_PATH}. Structure:

PR Review Decisions Audit — {PERIOD_DAYS}d window

Generated: {ISO timestamp} Scan window: {SCAN_SINCE} to {now} PRs scanned: {N} | Threads examined: {M} | Threads skipped (already audited): {K}

Summary

| Classification | Count | |---|---| | VALID (ticket-worthy) | {N} | | RESOLVED (already fixed) | {N} | | STALE (no longer applicable) | {N} |

Priority Triage

HIGH Priority

For each HIGH VALID finding, write a section:

### {suggested_title}

**PR:** #{number} | **File:** {path}:{line} | **Severity:** {severity} | **Dimension:** {dimension}

> {reviewer_quote}

**Current relevance:** VALID — {impact}

**Suggested issue title:** {suggested_title}
**Affected files:** {path}

MEDIUM Priority

{Same structure}

LOW Priority

{Same structure}

RESOLVED Findings

{List: PR, file, one-line description of what was fixed}

STALE Findings

{List: PR, file, one-line description of why no longer applicable}

Open PR Findings

{Findings from PRs that were open (not merged) at scan time — may still be addressed. Same per-finding structure but labeled as pending.}

Pattern Analysis

Most common deferral phrases (by frequency): {Table: phrase | count | % of all candidates}

Dimensions with highest VALID rate: {Table: dimension | valid | resolved | stale | valid_rate}

Systemic escape hatches detected: {Narrative: which phrases act as systematic blockers to tracking, with counts}

Recommendations: {2–4 concrete process recommendations based on the pattern data}

After writing the file, print a terminal summary:

audit-review-decisions complete
Output: {OUTPUT_PATH}
VALID: {N} | RESOLVED: {N} | STALE: {N}
Top finding: {first HIGH priority suggested_title, or "none"}

Step 5: Watermark (Thread Annotation)

For every finding processed in Steps 2–3 (all classifications — VALID, RESOLVED, STALE):

Re-check for existing audit marker (live): fetch the thread's current comments directly from the GitHub API — do not use the Step 1 JSON cache, which already filtered out [AUDIT]-marked threads and cannot detect markers posted after Step 1:
```
gh api "repos/${OWNER}/${REPO}/pulls/${PR_NUMBER}/comments" \
  --jq "[.[] | select(.id == ${COMMENT_ID} or .in_reply_to_id == ${COMMENT_ID}) | .body | startswith(\"[AUDIT]\")] | any"
```
If the result is true: skip this thread (idempotent — no duplicate post).
Determine marker body based on classification and ticket status:

| Classification | Ticket created? | Marker body | |---|---|---| | VALID | Yes | [AUDIT] — tracked in #{issue_number} | | VALID | No | [AUDIT] — acknowledged, no action taken | | RESOLVED | — | [AUDIT] — verified resolved in current codebase | | STALE | — | [AUDIT] — no longer applicable |

Post reply comment:

gh api "repos/${OWNER}/${REPO}/pulls/${PR_NUMBER}/comments/${COMMENT_ID}/replies" \
  --method POST \
  --field body="${MARKER_BODY}"
sleep 1

COMMENT_ID is the databaseId of the first comment in the thread (from Step 1 JSON).

Thread reply constraint: These calls cannot be batched via the reviews API — each requires an individual POST. The 1s delay between calls is mandatory per GitHub API discipline.
Log progress per finding: [AUDIT] Posted marker on PR #{number} thread {comment_id}: {marker_body}

Adoption

talont-org/audit-review-decisions

$ install --global

Security Scan Results

SKILL.md

Audit Review Decisions Skill

When to Use

Arguments

Critical Constraints

Workflow

Step 0: Watermark Resolution

Step 1: Data Collection (GraphQL Batch)

Step 2: Triage (Haiku — Broad Pass)

Step 3: Validation (Sonnet — Deep Pass)

Step 4: Report Generation

PR Review Decisions Audit — {PERIOD_DAYS}d window

Summary

Priority Triage

HIGH Priority

MEDIUM Priority

LOW Priority

RESOLVED Findings

STALE Findings

Open PR Findings

Pattern Analysis

Step 5: Watermark (Thread Annotation)

Related Skills

talont-org/write-recipe

talont-org/vis-lens-uncertainty

talont-org/vis-lens-temporal

talont-org/vis-lens-story-arc

talont-org/audit-review-decisions

$ install --global

Security Scan Results

SKILL.md

Audit Review Decisions Skill

When to Use

Arguments

Critical Constraints

Workflow

Step 0: Watermark Resolution

Step 1: Data Collection (GraphQL Batch)

Step 2: Triage (Haiku — Broad Pass)

Step 3: Validation (Sonnet — Deep Pass)

Step 4: Report Generation

PR Review Decisions Audit — {PERIOD_DAYS}d window

Summary

Priority Triage

HIGH Priority

MEDIUM Priority

LOW Priority

RESOLVED Findings

STALE Findings

Open PR Findings

Pattern Analysis

Step 5: Watermark (Thread Annotation)

Related Skills

talont-org/write-recipe

talont-org/vis-lens-uncertainty

talont-org/vis-lens-temporal

talont-org/vis-lens-story-arc