skills/meta-gpt-prompt-maintenance/SKILL.md
Maintain, audit, rewrite, or upgrade prompt artifacts for GPT-series models. Use when adapting existing prompts to current OpenAI GPT prompt guidance; polishing SKILL.md files, AGENTS instructions, system/developer prompts, product assistant prompts, reusable agent workflows, eval prompts, or grader prompts; or reducing legacy process-heavy prompt stacks into clearer outcome-first instructions. Do not use for ordinary prose editing, model-agnostic prompt advice, new skill creation from scratch, or deciding where durable context belongs before a target prompt artifact is chosen.
npx skillsauth add plimeor/agent-skills meta-gpt-prompt-maintenance
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Maintain prompt artifacts so GPT-series models receive clear, outcome-first instructions with the constraints, evidence guidance, validation, and stop rules needed to complete the user's requested work without unnecessary process noise.
A good prompt-maintenance result:
Reserves always, never, must, and only for true invariants: safety, exact output contracts, irreversible actions, required fields, or tool syntax.
Use this skill once a prompt artifact exists to maintain: a SKILL.md, rules file, system/developer prompt, product prompt, agent workflow prompt, eval prompt, grader prompt, or comparable reusable instruction block.
Use skill-creator instead when the main task is to create a brand-new skill, design skill evals, benchmark a skill, or optimize skill triggering from scratch.
Use context-engineering when the main question is where a rule belongs: global rules, project rules, a skill, tooling, task context, or an external system — unless the prompt artifact is already selected or the work is specifically GPT prompt quality.
Use writing or editing skills for ordinary prose polishing. Do not turn blog drafts, documentation prose, customer copy, or creative writing into prompt maintenance unless the text is itself an instruction to a model.
Do not broaden the artifact's behavior, target model family, tool access, external side effects, sync state, commits, deployment, or persistent configuration unless the user explicitly asks.
Read the target prompt artifact first. If the task mentions a current GPT model, OpenAI prompt guidance, migration, or model-specific behavior, read the user-provided guidance or current official OpenAI documentation before making model-specific edits.
Continue retrieval only when:
Stop retrieval once the core rewrite can be justified; do not search again for phrasing, decorative examples, or noncritical background.
When the source is a local file supplied by the user, prefer that file over web search. Use official OpenAI sources for external refreshes unless the user requests another source.
Optimize for role clarity, collaboration style, tool behavior, evidence discipline, validation, and stop conditions. Keep personality short. Separate how the assistant sounds from how it works.
For long-running or tool-heavy workflows, include a short user-visible preamble rule when the host supports intermediate messages. For Responses API workflows that replay assistant items manually, preserve phase values exactly when the artifact controls replay behavior.
Define the user's visible outcome, what completion means, what actions are allowed, and what the final answer should contain. Include fallback behavior for missing evidence, unavailable tools, or unsupported requests.
For customer-facing text, define tone and length, but do not let tone instructions obscure policy, evidence, or action boundaries.
Prefer decision rules over fixed sequences. Keep required order only for fragile operations, safety checks, exact tool syntax, validation integrity, or irreversible side effects.
For coding agents, require concrete validation commands when available and an explicit explanation when validation cannot run.
For SKILL.md, keep frontmatter description: focused on trigger conditions, near-miss exclusions, and routing. Put reusable workflow guidance in the body.
The body should start with outcome and constraints before process. Keep references, large examples, scripts, and domain-specific details outside SKILL.md when progressive disclosure would reduce context load.
For skills with user-stated acceptance, safety, parity, quality, or evidence invariants, promote each invariant into a hard gate. A hard gate defines when it activates, what fields or artifacts are required, which weaker substitutes are insufficient, what self-review should catch, and what evidence is required before completion.
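The parts of a hard gate listed above can be pictured as a small record, which makes it easy to see when a gate is only half specified. This is a minimal sketch, not a format the skill prescribes: the class name `HardGate`, its field names, and the helper `missing_parts` are all hypothetical.

```python
from dataclasses import dataclass, field, fields


@dataclass
class HardGate:
    """Hypothetical record for one invariant promoted into a hard gate."""
    invariant: str                      # the user-stated invariant being protected
    activates_when: str = ""            # condition under which the gate applies
    required_artifacts: list[str] = field(default_factory=list)        # fields or artifacts that must exist
    insufficient_substitutes: list[str] = field(default_factory=list)  # weaker stand-ins that do not count
    self_review_checks: list[str] = field(default_factory=list)        # what self-review should catch
    completion_evidence: list[str] = field(default_factory=list)       # evidence needed before completion


def missing_parts(gate: HardGate) -> list[str]:
    """Return the names of gate parts that are still empty, in declaration order."""
    return [f.name for f in fields(gate) if not getattr(gate, f.name)]


# Example: a gate that names its trigger and artifacts but nothing else yet.
draft = HardGate(
    invariant="Rewritten prompt keeps the original output schema",
    activates_when="The artifact defines an exact output contract",
    required_artifacts=["before/after schema comparison"],
)
print(missing_parts(draft))
```

A non-empty result suggests the gate could still be skipped by an agent that sounds compliant, which is the failure the gate is meant to prevent.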
When modifying existing skills, do not rename directories, update indexes, or sync installed skills unless the user asked for those operations.
Make the task, inputs, scoring criteria, and output schema explicit. Avoid leaking expected answers into the prompt under test unless that is the purpose of the eval. Keep grading rubrics atomic enough that pass/fail evidence is inspectable.
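One way to read "atomic enough that pass/fail evidence is inspectable" is that each criterion checks exactly one thing and reports its own result. The sketch below illustrates that idea only; `Criterion`, `grade`, and the sample checks are hypothetical and are not part of any grader API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    """Hypothetical atomic rubric item: one name, one yes/no check."""
    name: str
    check: Callable[[str], bool]


def grade(output: str, rubric: list[Criterion]) -> dict[str, bool]:
    """Evaluate every criterion separately so each verdict can be inspected."""
    return {criterion.name: criterion.check(output) for criterion in rubric}


# Illustrative rubric: each item tests a single observable property.
rubric = [
    Criterion("states_validation_status", lambda text: "validation" in text.lower()),
    Criterion("under_50_words", lambda text: len(text.split()) < 50),
]
print(grade("Validation could not run; the next best check is a manual diff.", rubric))
```

Keeping the per-criterion results separate, rather than folding them into one score, is what lets a reviewer see which requirement failed.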
Before rewriting, identify which defects actually matter for the user's goal:
always / never rules used for judgment calls
should language
Do not force every prompt into the same template. Add sections only when they change behavior or make maintenance safer.
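Among the defects worth identifying before a rewrite, absolute words used for judgment calls can be surfaced mechanically, leaving the reviewer to decide which ones guard a true invariant. A minimal sketch, assuming a plain-text prompt; the helper name `find_absolutes` and the word list are illustrative choices, not something the skill defines.

```python
import re

# Assumed word list, taken from the absolutes this skill names.
ABSOLUTE_WORDS = ("always", "never", "must", "only")
_PATTERN = re.compile(r"\b(" + "|".join(ABSOLUTE_WORDS) + r")\b", re.IGNORECASE)


def find_absolutes(prompt_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, word, stripped_line) for each absolute word found."""
    hits = []
    for number, line in enumerate(prompt_text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((number, match.group(1).lower(), line.strip()))
    return hits


sample = "Always cite sources.\nPrefer short answers.\nNever delete files."
for number, word, line in find_absolutes(sample):
    print(f"line {number}: '{word}' in: {line}")
```

The scan only locates candidates. Whether a hit protects safety, an exact output contract, or an irreversible action remains a human judgment.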
Preserve the artifact's requested behavior first. Improve clarity, ordering, and enforceability without adding new product requirements, facts, capabilities, tools, or obligations.
Use the shortest structure that covers the real risk:
Role:
Goal:
Success Criteria:
Constraints:
Evidence:
Output:
Stop Rules:
For small prompts, this can be compressed into a few paragraphs. For durable agent prompts, explicit headings are usually worth the space.
Convert brittle sequences into decision rules:
For retrieval, include a budget:
Read the directly relevant source first. Make another retrieval call only when a
required fact, source, ID, date, parameter, or comparison target is missing.
Stop once the core answer can be supported.
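The budget above amounts to a two-part decision rule, sketched here in code for readers who find that form clearer. The function name `should_retrieve_again` and the category labels are hypothetical; only the categories themselves come from the budget text.

```python
# Categories of missing information that justify another retrieval call,
# mirroring the budget text. The label spellings are assumptions.
REQUIRED_KINDS = {"fact", "source", "id", "date", "parameter", "comparison_target"}


def should_retrieve_again(missing: set[str], core_answer_supported: bool) -> bool:
    """Decide whether one more retrieval call is justified."""
    if core_answer_supported:
        # Stop rule: once the core answer is supported, do not search further.
        return False
    # Otherwise retrieve only when something required is actually missing.
    return bool(missing & REQUIRED_KINDS)


print(should_retrieve_again({"date"}, core_answer_supported=False))
print(should_retrieve_again({"nicer_phrasing"}, core_answer_supported=False))
print(should_retrieve_again({"date"}, core_answer_supported=True))
```

The stop rule is checked first, so a supported core answer ends retrieval even if some detail is still missing.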
For rewriting, summary, and customer-facing outputs, state what to preserve:
Preserve the requested artifact, length, structure, genre, and factual claims.
Improve clarity and flow without adding new claims, extra sections, or a more
promotional tone unless explicitly requested.
For creative drafting, separate source-backed facts from allowed creative wording. Use placeholders or labeled assumptions instead of inventing metrics, customer names, roadmap status, capabilities, or dates.
For reasoning guidance, ask for concise rationale, checks, or evidence in the final answer when useful. Do not ask the model to reveal hidden chain of thought.
When maintaining multiple prompt artifacts, process each artifact on its own terms. Collapse them into a shared rewrite only when the user asks for a common template.
For each artifact:
If the user asks for independent review per artifact, create one worker per artifact when available and aggregate their findings. If worker capacity is limited, batch workers and record the batching constraint. Do not pretend a single local pass was independent worker review.
Choose validation that matches the artifact:
For SKILL.md, verify that frontmatter name: matches the directory and that trigger descriptions are still focused on use and near-miss exclusions.
After rewriting a prompt that carries user acceptance, safety, parity, or quality invariants, run an adversarial check: could an agent follow the rewritten prompt and skip the invariant while still sounding compliant? If so, strengthen the gate, required output field, or stop rule before finishing.
If validation cannot run, say why and name the next best check.
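The SKILL.md check mentioned above, that the frontmatter name matches the directory, is one validation that can be scripted. This is a rough sketch under stated assumptions: the frontmatter is a leading block delimited by `---` lines with a simple `name: value` entry, and the helper names are hypothetical. A real YAML parser would handle more cases.

```python
from pathlib import Path
from typing import Optional


def frontmatter_name(skill_text: str) -> Optional[str]:
    """Extract the `name:` value from a leading `---` frontmatter block, if any."""
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no frontmatter block at the top of the file
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter reached without finding a name
        key, separator, value = line.partition(":")
        if separator and key.strip() == "name":
            return value.strip().strip("'\"")
    return None


def name_matches_directory(skill_path: Path) -> bool:
    """Compare the frontmatter name with the name of the containing directory."""
    text = skill_path.read_text(encoding="utf-8")
    return frontmatter_name(text) == skill_path.parent.name


example = "---\nname: meta-gpt-prompt-maintenance\ndescription: Maintain prompts\n---\nBody"
print(frontmatter_name(example))
```

If the file cannot be read or has no frontmatter, the honest report is that this check could not run, followed by the next best check.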
For rewrite tasks, provide:
For review-only tasks, lead with findings ordered by impact, then give concrete rewrite recommendations. Say that no files changed.
For batch work, summarize per artifact. Avoid hiding individual decisions behind a generic theme.
Stop when the requested prompt artifacts are rewritten or reviewed, evidence and validation status are stated, and out-of-scope improvements are separated.
Ask one narrow question only when missing information would change the target artifact, model family, product behavior, authorization boundary, or validation claim.
Stop optimizing wording once the prompt is clear, scoped, supported, and testable.