- name: gpt-pro-report
- description: Thoroughly evaluate one or more GPT Pro reports item by item, plus any user comments or handling preferences supplied with them; independently verify each recommendation, build a high-confidence plan of change, refine that plan through at least three explicit rounds, then execute it end-to-end with verification, battle testing, documentation updates, checkpoint commits, and safe pushes until the repo is clean.
GPT Pro Report
Multi-agent collaboration
- Default to using subagents when they are likely to improve speed, quality, confidence, or keep the main context clean.
- Use subagents to widen coverage, dig deeper on one thread, get a fresh second opinion, or keep the main thread clean while side work runs.
- Split work into clear packets with owners, inputs, acceptance checks, and a synthesis step when parallelizing.
- Keep the main agent focused on synthesis, unblockers, and the next critical-path step; let subagents handle bounded side work that can run in parallel.
- Use single-agent execution only when scope is small or coordination overhead outweighs gains.
GPT Pro report-specific subagent split
- Default to subagents when the report has many items, more than one source file, or mixed user comments that need separate checks.
- Suggested split:
- item-checkers each own one report item or a small item batch and update plan/current/gpt-pro-report.md with verdict, evidence, and exact recommended change.
- plan-checker gives a fresh pass on the merged plan before execution and calls out gaps, risky leaps, or duplicate work.
- verification-checker does a clean re-read after implementation so final sign-off is not based only on the main agent's context.
- The main agent owns item acceptance or rejection, the merged execution plan, execution order, and the final user-facing result.
- Use subagents only when the work splits cleanly; otherwise stay single-agent.
Proactive autonomy and knowledge compounding
- Be proactive: immediately take the next highest-value in-scope action when it is clear.
- Default to autonomous execution: do not pause for confirmation between normal in-scope steps.
- Request user input only when absolutely necessary: ambiguous requirements, material risk tradeoffs, missing required data/access, or destructive/irreversible actions outside policy.
- If blocked by command/tool/env failures, attempt high-confidence fallbacks autonomously before escalating.
- When the workflow uses plan/, ensure required plan directories exist before reading/writing them.
- Treat transient external failures as retryable by default: run bounded retries with backoff and capture failure evidence before concluding blocked.
- On repeated invocations for the same objective, resume from prior findings, decisions, plan files, and execution logs; prioritize net-new progress over repeating identical work unless verification requires reruns.
- Drive work to complete outcomes with verification, not partial handoffs.
- Treat iterative execution as the default for non-trivial work; run adaptive loop passes such as consider -> investigate -> high-confidence gate -> plan -> verify plan -> execute -> verify -> battletest -> organise-docs -> checkpoint -> re-review.
- Keep looping until actual completion criteria are met: no actionable in-scope next step remains, verification is green, and confidence is high.
- Run organise-docs frequently during execution to capture durable decisions and learnings, not only at the end.
- Create small checkpoint commits frequently with git-commit when changes are commit-eligible, checks are green, and repo policy permits commits.
- Never squash commits; always use merge commits when integrating branches.
- Prefer simplification over added complexity: aggressively remove bloat, redundancy, and over-engineering while preserving correctness.
- When you touch code, leave the touched area in a better state than you found it: clearer, simpler, tidier, and at least as performant unless the task requires an explicit trade-off.
- Use simple, plain English in user messages, docs, notes, reports, code comments, and other explanatory writing. Avoid jargon, fancy wording, and complex phrasing. When a technical term is needed for correctness, explain it in simple words the first time. Default to short user-facing responses. Think about what the user most wants to know, and lead with that. Do not dump every detail by default. Always include important changes, blockers, verification gaps, and any important assumptions, nuances, principles, or decisions that shaped the work. Add more detail only when the user asks for it or when uncertainty or risk makes it necessary.
- Compound knowledge continuously: keep docs/ accurate and up to date, and promote durable learnings and decisions from work into docs.
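The rule above about bounded retries with backoff can be sketched as follows. This is a minimal illustration, not part of the skill itself: the names `run_with_retries` and `Blocked`, the default of three attempts, and the one-second base delay are all assumptions made for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class Blocked(Exception):
    """Raised when retries run out; carries the failure evidence (hypothetical name)."""

    def __init__(self, evidence: list[str]) -> None:
        super().__init__(f"blocked after {len(evidence)} attempts")
        self.evidence = evidence


def run_with_retries(
    action: Callable[[], T],
    max_attempts: int = 3,    # assumed default, not taken from the skill
    base_delay: float = 1.0,  # assumed default, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `action`, retrying with exponential backoff and recording each failure."""
    evidence: list[str] = []
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as exc:  # this sketch treats every failure as transient
            evidence.append(f"attempt {attempt}: {exc!r}")
            if attempt < max_attempts:
                # Wait 1x, 2x, 4x... the base delay before the next try.
                sleep(base_delay * 2 ** (attempt - 1))
    raise Blocked(evidence)
```

The `evidence` list on the raised `Blocked` error is what a blocker report would quote. A real workflow would likely narrow the `except` clause to errors that are known to be transient.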
Long-task checkpoint cadence
- For any non-trivial task, run recurring checkpoint cycles instead of waiting for a single end-of-task wrap-up.
- At each meaningful milestone with commit-eligible changes, and at least once per major phase, invoke checkpoint or the equivalent organise-docs -> git-commit -> git-push-safe sequence once relevant checks are green and repo policy permits commits/pushes.
- If a checkpoint step is blocked, resolve or record the blocker immediately and retry before expanding scope.
Terminal state contract (must follow)
The skill is complete only when all of the following are true:
- Objective completion: every provided GPT Pro report has been fully processed, or the exact blocker has been documented with evidence.
- Item completion: every report item has an explicit disposition of accept, accept-with-modifications, defer, or reject, with evidence.
- Confidence completion: every accepted implementation item is explicitly high-confidence or very-high-confidence; lower-confidence items are not implemented.
- Plan completion: the consolidated plan of change exists, or the result explicitly states that no change plan is warranted.
- Refinement completion: at least three explicit plan-refinement rounds are completed and recorded before execution starts, unless there is no plan to execute.
- Step-level terminal completion: each numbered workflow step and each delegated subtask is explicitly resolved as done, blocked, or not-applicable, with brief evidence before advancing.
- Execution completion: every plan item that remains in scope is executed end-to-end, or explicitly marked blocked with concrete evidence.
- Verification completion: every implemented item is verified and battle-tested before the workflow advances to the next item, and the final plan includes three explicit end-state validation rounds.
- Docs/git completion: durable docs/notes are updated, checkpointing is performed at meaningful milestones, and the final repo state is clean; if a remote exists and policy allows push, the final state is also pushed.
- Loop completion: no actionable in-scope high-confidence next step remains.
Stop only after this terminal contract is satisfied; otherwise continue iterating.
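Part of the contract above can be checked mechanically. The sketch below covers only the item, confidence, and evidence rules. The dictionary keys (`id`, `disposition`, `confidence`, `evidence`) and the function name are assumptions for illustration; the skill does not prescribe a data format.

```python
DISPOSITIONS = {"accept", "accept-with-modifications", "defer", "reject"}
EXECUTABLE = {"high-confidence", "very-high-confidence"}


def contract_violations(items: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means these rules are satisfied."""
    problems: list[str] = []
    for item in items:
        item_id = item.get("id", "<no id>")
        disposition = item.get("disposition")
        if disposition not in DISPOSITIONS:
            problems.append(f"{item_id}: missing or unknown disposition")
        if not item.get("evidence"):
            problems.append(f"{item_id}: no evidence recorded")
        accepted = disposition in {"accept", "accept-with-modifications"}
        if accepted and item.get("confidence") not in EXECUTABLE:
            problems.append(f"{item_id}: accepted below the confidence bar")
    return problems
```

An empty result does not prove the whole contract holds. Plan, refinement, verification, and repo-state rules still need their own checks.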
Terminal state examples (adapt to skill)
done: every report item has a disposition, the high-confidence plan has been refined and executed, required verification passes are complete, and the repo is left clean.
blocked: progress cannot continue after bounded retries because a concrete dependency, access issue, or unresolved confidence gap prevents further safe execution; blocker evidence and exact unblock action are reported.
not-applicable: an optional branch is explicitly skipped with reason, such as final push in a local-only repo or execution phases when the validated outcome is a confident no-change result.
Overview
Use this skill when the user provides one or more GPT Pro reports, writeups, recommendation lists, or suggestion dumps, optionally with their own inline comments or handling preferences, and wants Codex to treat the reports as hypotheses rather than truth.
The core job is not to obey the report. The core job is:
- decompose every report into atomic items,
- capture any user-supplied comments, preferences, dislikes, or handling instructions that affect how those items should be evaluated,
- investigate every item independently,
- decide whether each item is actually correct and useful in the project context,
- build a consolidated high-confidence plan of change from only the validated items,
- refine that plan repeatedly until it is execution-ready,
- then execute the plan fully and thoroughly with verification, battle testing, docs promotion, checkpoint commits, and safe pushes.
It is valid to conclude that no changes should be made. A confident no-op is a successful outcome.
Full-cycle completion requirement (must follow)
- Treat every invocation as a full start-to-finish execution unless the user explicitly restricts scope.
- Be relentless, patient, careful, and meticulous. Long, repetitive, or inconvenient work is not a reason to stop early.
- Do not stop after analysis, planning, partial implementation, partial verification, or a single pass through the plan when safe progress can continue.
- Keep driving until every report item and every surviving plan item reaches a terminal state that satisfies this skill's terminal contract.
- When something fails or blocks progress, exhaust bounded retries, high-confidence fallbacks, and plan adjustments before concluding blocked.
- If a blocker is cleared or a previously unclear item becomes actionable, resume immediately and continue until the full validated plan/report has been completed.
Mandatory principles
- Do not treat GPT Pro output as authoritative.
- Treat explicit user instructions, preferences, and handling comments supplied with the report as first-class input and constraints for the evaluation process.
- If the user expresses an opinion about whether a report item is good or bad, take that opinion seriously and incorporate it into the evaluation, but still validate correctness and implementation worth with independent evidence.
- Do not skip items because the list is long.
- Do not merge distinct report items into one vague bucket unless they are provably the same underlying change.
- Do not implement medium-confidence or low-confidence ideas.
- Do not stop after planning unless the user explicitly limited the invocation to planning-only.
- Do not return control to the user until the validated report/plan has been carried through to full terminal completion, unless the work is genuinely blocked with evidence.
- Do not leave verification, battle testing, docs updates, or repo hygiene as implied follow-up work.
Confidence policy
Before a report item can enter the executable plan, all of these gates must pass:
- The item is understood clearly enough to act without guessing.
- The current state has been independently validated from code, docs, tests, logs, artifacts, or focused probes.
- The proposed change is the correct lever for the observed gap or opportunity.
- The likely blast radius and invariants are understood.
- There is a concrete verification and battle-test path.
- Required tools, permissions, and environment conditions exist to carry the work through.
Use this rubric:
very-high-confidence: all gates pass strongly and contradictions are absent or resolved.
high-confidence: all gates pass with only minor non-material uncertainty remaining.
medium-confidence: one or more gates are only partially satisfied or evidence is conflicting.
low-confidence: the change is speculative or multiple gates fail.
Action rule:
- Only high-confidence and very-high-confidence items may enter the executable plan.
- medium-confidence and low-confidence items must be investigated further or deferred/rejected.
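The gates, rubric, and action rule above can be read as a small decision function. The sketch below is one possible reading, not a definitive one. The gate names, the four gate states, and the choice to rate a single failed gate as medium-confidence are assumptions; the rubric itself only says that multiple failed gates mean low-confidence.

```python
from enum import Enum


class Gate(Enum):
    STRONG = "strong"    # gate passes strongly
    PASS = "pass"        # gate passes with minor, non-material uncertainty
    PARTIAL = "partial"  # gate is only partly satisfied
    FAIL = "fail"


# Hypothetical short names for the six gates listed above.
GATE_NAMES = (
    "understood",
    "current_state_validated",
    "correct_lever",
    "blast_radius_understood",
    "verification_path",
    "tools_available",
)


def classify(
    gates: dict[str, Gate],
    unresolved_conflict: bool = False,
    speculative: bool = False,
) -> str:
    """Map gate results to a confidence level from the rubric."""
    missing = [name for name in GATE_NAMES if name not in gates]
    if missing:
        raise ValueError(f"gates not assessed: {missing}")
    results = [gates[name] for name in GATE_NAMES]
    fails = results.count(Gate.FAIL)
    if speculative or fails >= 2:
        return "low-confidence"
    # Assumption: one failed gate is rated medium, which still blocks execution.
    if fails == 1 or Gate.PARTIAL in results or unresolved_conflict:
        return "medium-confidence"
    if all(result is Gate.STRONG for result in results):
        return "very-high-confidence"
    return "high-confidence"


def may_enter_plan(level: str) -> bool:
    """Action rule: only the top two levels may enter the executable plan."""
    return level in {"high-confidence", "very-high-confidence"}
```

Whatever encoding is used, the key property is that an item with any partial or failed gate can never reach the executable plan.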
Behavioral guardrails (must follow)
- Proceed without permission for standard in-scope steps such as reading, searching, summarizing, planning, testing, editing, and analysis.
- Require explicit approval only for destructive or irreversible actions outside normal repo work, executing untrusted code/installers, remote-state changes outside the scope of this skill invocation, or changes outside the repo environment.
- Run a preflight before substantial work: confirm cwd, key paths, available tools, repo policy, and whether remotes exist.
- Prefer quoted paths and explicit path checks when running shell commands.
- Prefer the simplest investigation or implementation that can establish confidence.
- Keep changes surgical and within the accepted plan.
- If a report item is too vague, reconstruct intent from the supplied reports, repo context, docs, and recent conversation/session history before asking the user.
- If nothing remains to do, say so explicitly and stop.
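The preflight guardrail above could look something like this. It is a rough sketch under stated assumptions: the function name `preflight` and the shape of the returned report are made up for the example, and repo policy is left out because it cannot be read from the file system in a general way. The only git command used is `git remote`, which lists remote names.

```python
import shutil
import subprocess
from pathlib import Path


def preflight(repo: Path, key_paths: list[str], tools: list[str]) -> dict:
    """Collect basic facts about the working environment before starting work."""
    report = {
        "cwd": str(repo.resolve()),
        # Explicit path checks instead of assuming the layout.
        "missing_paths": [p for p in key_paths if not (repo / p).exists()],
        "missing_tools": [t for t in tools if shutil.which(t) is None],
        "remotes": [],
    }
    if shutil.which("git") is not None:
        # Arguments are passed as a list, so paths with spaces need no shell quoting.
        result = subprocess.run(
            ["git", "-C", str(repo), "remote"],
            capture_output=True,
            text=True,
            check=False,
        )
        if result.returncode == 0:
            report["remotes"] = result.stdout.split()
    return report
```

An empty `remotes` list means either a local-only repo or a directory that is not a git repo at all, so a caller that needs to tell these apart would have to check that separately.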
Required output contract
Default to a short final result.
In most cases include only:
- Report source summary.
- Include both GPT Pro report sources and any user-supplied comments, annotations, or handling preferences that materially shaped the evaluation.
- Consolidated plan of change, or an explicit no-change conclusion.
- Execution summary.
- Verification and battle-test evidence.
- Remaining risks, deferred items, and exact blockers if any.
Include the full item decision register or plan-refinement round summary only when the user asks for it, when the work is blocked, or when those details are needed to justify the outcome.
Workflow
1) Preflight and scope
- Identify every GPT Pro report or report section in scope.
- Identify any user-supplied comments, annotations, preferences, dislikes, or explicit handling instructions attached to those reports.
- Identify the affected repo or repos, relevant branches, and project constraints.
- Confirm whether the invocation is full-cycle or plan-only. Default to full-cycle.
- Check whether the repo has a remote and whether commits/pushes are allowed by policy.
- Create or reuse plan/current/gpt-pro-report.md and adjacent scratch files when useful.
- Treat user handling instructions as explicit constraints for later plan building and execution unless they conflict with higher-priority repo or system policy.
2) Build the report item ledger
- Read each report fully.
- Decompose every report into atomic items.
- Assign each item a stable identifier such as R1-I01, R1-I02, R2-I01.
- Preserve the original wording in short form, then rewrite the item in concrete engineering terms.
- Record per item:
- original report id,
- short claim or recommendation,
- any relevant user annotation, preference, or explicit instruction attached to that item or report,
- affected area,
- initial confidence,
- evidence needed next.
- Do not move forward until every report item is represented in the ledger.
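The ledger described in this step can be pictured as a list of small records. The sketch below assumes the report has already been split into atomic claims by hand or by an earlier pass. The class name, field names, and the `unassessed` placeholder are illustrative choices; only the `R1-I01` identifier pattern comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class LedgerItem:
    item_id: str                            # stable identifier, e.g. R1-I01
    report_id: str                          # original report id
    claim: str                              # short claim or recommendation
    user_note: str = ""                     # user annotation or instruction, if any
    affected_area: str = ""
    initial_confidence: str = "unassessed"  # hypothetical placeholder value
    evidence_needed: str = ""


def make_item_id(report_no: int, item_no: int) -> str:
    """Build an identifier in the R<report>-I<item> pattern with a two-digit item number."""
    return f"R{report_no}-I{item_no:02d}"


def build_ledger(reports: list[list[str]]) -> list[LedgerItem]:
    """Create one ledger entry per claim, numbered per report, starting at 1."""
    ledger: list[LedgerItem] = []
    for report_no, claims in enumerate(reports, start=1):
        for item_no, claim in enumerate(claims, start=1):
            ledger.append(
                LedgerItem(
                    item_id=make_item_id(report_no, item_no),
                    report_id=f"R{report_no}",
                    claim=claim,
                )
            )
    return ledger
```

Because identifiers depend only on report order and item order, re-running the step on the same input gives the same ids, which keeps later dispositions and evidence tied to the right item.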
3) Investigate each item one by one
- Evaluate items individually, not as a hand-wavy bundle.
- For each item:
- inspect the relevant code, docs, tests, history, and recent discussion context,
- factor in any user-supplied preferences, dislikes, or handling guidance that affect desirability, risk tolerance, or acceptance criteria,
- run focused probes when needed,
- identify whether the recommendation is correct, partially correct, misguided, outdated, or mis-scoped,
- determine whether the item should be accepted, modified, deferred, or rejected.
- Record evidence and reasoning for every disposition.
- If the project context contradicts the report, prefer the project evidence.
- If the user comment conflicts with the report, treat the user comment as a signal about desired handling or project preference, but still validate factual claims independently.
4) Build the initial plan of change
- Create a consolidated plan using only accepted high-confidence and very-high-confidence items.
- If multiple reports are provided, process them one by one but merge all surviving items into one combined plan.
- For each plan item include:
- goal and rationale,
- exact scope,
- ordered tasks,
- dependencies,
- risks and mitigations,
- validation steps,
- battle-test steps,
- docs/notes updates,
- checkpoint expectations.
- If no items qualify for implementation, state that no change plan is warranted and stop.
5) Mandatory plan-refinement rounds
Before any implementation starts, perform at least these three explicit refinement rounds:
- Round 1: completeness and dependency pass.
- Check for missing prerequisites, missing files/tests/docs, bad ordering, oversized tasks, or hidden dependencies.
- Split large items into smaller execution-ready steps.
- Round 2: risk, snag, and corner-case pass.
- Search for invariants, edge cases, migration hazards, performance risks, compatibility risks, and likely user-facing regressions.
- Add mitigation steps or remove items that no longer meet the confidence bar.
- Round 3: validation, battle-test, and hygiene pass.
- Make verification explicit after every meaningful implementation phase.
- Add battle tests, docs updates, cleanup steps, checkpoint cadence, and final repo-hygiene checks.
- Ensure the plan ends with at least three explicit final validation rounds.
Repeat more rounds if needed. Do not start execution while the plan still contains vague steps, weak validation, or low-confidence items.
6) Execute the plan item by item
- Execute the plan from start to finish without unnecessary pauses.
- Default to sequential execution so each plan item is fully resolved before the next one begins.
- Parallelize only when independent work packets have disjoint scopes and clear synthesis/verification points.
- For each plan item, follow this minimum loop:
- implement,
- run the item-specific verification,
- run the item-specific battle test,
- update durable docs/notes/rationale,
- checkpoint when appropriate,
- re-check whether confidence still holds.
- If verification or battle testing drops confidence below high-confidence, reopen the item, adjust the plan, and do not force it through.
7) Final validation rounds (minimum three)
The plan must end with at least these three explicit final passes after implementation:
- Final round 1: targeted verification pass.
- Re-run the direct correctness checks for every implemented item.
- Final round 2: broader battle-test pass.
- Stress the changed workflows across realistic configs, perspectives, or user paths.
- Final round 3: final consistency and hygiene pass.
- Confirm docs, decisions, tests, artifacts, git state, and upstream state are all aligned.
- Confirm there are no leftover TODOs, stale scratch artifacts that should be removed, or uncommitted tracked changes.
If these rounds reveal issues, fix them and repeat the affected validation rounds until green.
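The git-state part of final round 3 can be checked with `git status --porcelain`, which prints one line per changed path with a two-character status code, where `??` marks untracked files. The sketch below is a minimal helper under that assumption; the function names are made up, and it does not handle quoted paths or split rename entries.

```python
import subprocess
from pathlib import Path


def split_porcelain(porcelain: str) -> tuple[list[str], list[str]]:
    """Split `git status --porcelain` output into (tracked_changes, untracked)."""
    tracked: list[str] = []
    untracked: list[str] = []
    for line in porcelain.splitlines():
        if not line.strip():
            continue
        # Format is "XY path": two status characters, one space, then the path.
        status, path = line[:2], line[3:]
        if status == "??":
            untracked.append(path)
        else:
            tracked.append(path)
    return tracked, untracked


def working_tree_state(repo: Path) -> tuple[list[str], list[str]]:
    """Run git in `repo` and report uncommitted tracked changes and untracked files."""
    result = subprocess.run(
        ["git", "-C", str(repo), "status", "--porcelain"],
        capture_output=True,
        text=True,
        check=True,
    )
    return split_porcelain(result.stdout)
```

Two empty lists mean a clean tree. Untracked files are reported separately because some scratch artifacts may be fine to keep, while uncommitted tracked changes always need a decision.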
8) Checkpointing and pushing
- During long or multi-phase execution, run checkpoint cycles frequently.
- Prefer checkpoint as the wrapper for organise-docs -> git-commit -> git-push-safe.
- If a remote exists and repo policy allows push, finish with all committed work pushed.
- If no remote exists or policy disallows push, report push as not-applicable but still finish with a clean local tree.
9) Close out
- Provide the full item decision register only when the user asks for it, when the work is blocked, or when it is needed to justify the outcome.
- Report which report items were accepted, modified, deferred, or rejected.
- Report what changed, why, and what verification/battle-test evidence was collected.
- Report any deferred items with the exact evidence gap.
- State explicitly whether the final outcome is:
- no-change,
- verified change set complete,
- or blocked.
Skill composition
When this skill is triggered, compose other skills as needed:
- Use consider to treat the report as non-authoritative evidence that must be evaluated.
- Use investigate to build independent evidence for each report item.
- Use high-confidence-changes to enforce the confidence gate before implementation.
- Use plan to write and refine the consolidated change plan.
- Use execute to carry the final plan through completion.
- Use verify for targeted checks and plan-critique passes.
- Use battletest for broader user-perspective validation.
- Use organise-docs to capture durable rationale and decisions.
- Use checkpoint at milestones.
- Use git-push-safe when final push is in scope and allowed.
If there is any conflict, the confidence gate and exhaustive per-item evaluation rules in this skill win.
Trigger phrases
Use this skill when the user asks for intents like:
gpt pro report
evaluate this GPT Pro report
consider this GPT Pro writeup item by item
turn this GPT Pro discussion into a verified plan and execute it
review these GPT recommendations, keep only high-confidence changes, then do them
process multiple GPT Pro reports into one plan of change
Prompt templates
Use these copy-paste templates:
[$gpt-pro-report] treat the attached GPT Pro report as hypotheses, evaluate every item independently, keep only high-confidence changes, refine the plan three times, then execute it end-to-end.
[$gpt-pro-report] process these GPT Pro reports one by one, build one consolidated plan of change, then implement, verify, battle-test, document, checkpoint, and push until the repo is clean.
[$gpt-pro-report] do not take the report as gospel; investigate every item, reject weak ideas, and fully execute only the high-confidence plan.
[$gpt-pro-report] process this GPT Pro report plus my inline comments/preferences; treat my handling instructions as constraints, but still validate report-item correctness independently before acting.