Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

lwlee2608/writing-system-prompts

Name: writing-system-prompts
Author: lwlee2608

skills/writing-system-prompts/SKILL.md

npx skillsauth add lwlee2608/agent-skills writing-system-prompts

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Writing LLM System Prompts

A well-built system prompt steers behavior across the whole conversation and — if structured for KV-cache reuse — costs a fraction of an uncached prefix on providers that offer prompt caching (Anthropic, OpenAI, Bedrock, Vertex, etc.).

Rules

Put the role first, in one sentence. A single descriptive sentence in the system field (not the user turn) anchors tone and vocabulary, e.g. You are a helpful coding assistant specializing in Python. Plain prose at the top — do not wrap in markup.
Order static-first, dynamic-last. Cache hits require a byte-identical prefix. Sequence: role → tools/policies → long reference docs / few-shot examples → cache breakpoint → dynamic context → user turn. Why: one changed token before the breakpoint drops you from cached-read pricing (~10% of input) to full input cost.
Mark cache breakpoints explicitly when the provider supports it. Most caching is opt-in (e.g. Anthropic cache_control, Bedrock cachePoint). Place a breakpoint at the end of each stable block to reuse. Cache writes typically cost more than base input; reads a fraction of it. TTLs vary (commonly ~5 min, sometimes longer tiers). Only cache prefixes you'll reuse within the TTL — sparse traffic pays the write penalty repeatedly.
```
# Anthropic example — adapt to your provider's syntax
system=[
    {"type": "text", "text": ROLE_AND_POLICIES},
    {"type": "text", "text": LONG_REFERENCE_DOC,
     "cache_control": {"type": "ephemeral"}},
]
```
Never put churning content at the prefix. Current time, request ID, user ID, or conversation summary at the top defeats caching for the whole prompt. Push them past the breakpoint — either a second system message (for authoritative facts like date/IDs) or the user turn (for task material like questions and documents). Both preserve cache reuse equally.
Use structural delimiters for distinct sections. Wrap longform inputs and behavioral directives in clear delimiters — XML tags (<documents><document>…</document></documents>, <default_to_action>…</default_to_action>), markdown headers, or named JSON fields. Why: delimited sections survive long contexts better than walls of prose, and several model families (notably Claude) are explicitly trained on XML structure.
Place longform data above instructions in the user turn. For 20k+ token inputs, docs at the top, question at the bottom — measured up to 30% quality lift on multi-document tasks across major models.
Write literal, scoped instructions. Modern instruction-tuned models interpret prompts increasingly literally and won't silently generalize. If a rule applies broadly, say so: Apply this formatting to every section, not just the first one. Avoid ALL-CAPS shouting (CRITICAL, MUST) — it causes over-triggering on recent Claude models and adds little on others; use normal imperative voice.
Tune verbosity, tone, tool use, and subagent spawning only when the default is wrong. Examples: Provide concise, focused responses. Skip non-essential context. / Use a warm, collaborative tone. / Spawn multiple subagents in the same turn when fanning out across items. Do not spawn a subagent for work you can complete in a single response. Override defaults only after observing a problem — don't pre-empt.

Verification procedure

After drafting or editing, check each:

Role check — One-sentence role at the very top of the system field?
Cache-order check — List segments top-to-bottom. Is every segment before the breakpoint stable across requests? Move per-request values below the breakpoint (second system message or user turn).
Breakpoint check — Is the breakpoint on the last static block? Is the cached prefix above the provider's minimum (e.g. 1024 tokens for Anthropic Opus/Sonnet, 2048 for Haiku, 1024 for OpenAI)? Below the minimum, breakpoints are silently ignored.
Structure check — Longform docs wrapped in delimiters? Behavioral blocks in named tags or sections?
Literalism check — If an instruction says "format the output" but means "format every section," rewrite with explicit scope.

Common mistakes to watch for

Timestamp at the top. Current date: {{today}} in the role line invalidates the cache every request. Move past the breakpoint — second system message or user turn.
Caching a tiny prefix. A short system prompt below the provider's minimum — breakpoint silently ignored and you still pay the write surcharge. Either grow the cached block or drop the breakpoint.
Conversation history before the breakpoint. History grows every request, so anything after it can never be cached. Put the breakpoint before the rolling history.
Copy-pasted legacy CRITICAL: YOU MUST… shouting. Causes over-triggering on recent models. Rewrite as plain imperatives.
Role in the user message. You are a… in messages[0] instead of the system field weakens steering and wastes a cacheable slot.

lwlee2608/writing-system-prompts

skills/writing-system-prompts/SKILL.md

Use when writing or editing a system prompt for any LLM API or SDK (any code passing a `system=` / `system` role parameter, or a `.txt`/`.md` file holding such a prompt). Applies prompt-engineering and prompt-caching best practices.

development

Updated May 20, 2026

$ install --global

skillsauth

npx skillsauth add lwlee2608/agent-skills writing-system-prompts

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: May 20, 2026, 8:33 AM192.4s1 file scanned

SKILL.md

name:: writing-system-prompts
description:: Use when writing or editing a system prompt for any LLM API or SDK (any code passing a `system=` / `system` role parameter, or a `.txt`/`.md` file holding such a prompt). Applies prompt-engineering and prompt-caching best practices.
user-invocable:: true

Writing LLM System Prompts

Rules

Put the role first, in one sentence. A single descriptive sentence in the system field (not the user turn) anchors tone and vocabulary, e.g. You are a helpful coding assistant specializing in Python. Plain prose at the top — do not wrap in markup.
Order static-first, dynamic-last. Cache hits require a byte-identical prefix. Sequence: role → tools/policies → long reference docs / few-shot examples → cache breakpoint → dynamic context → user turn. Why: one changed token before the breakpoint drops you from cached-read pricing (~10% of input) to full input cost.
Mark cache breakpoints explicitly when the provider supports it. Most caching is opt-in (e.g. Anthropic cache_control, Bedrock cachePoint). Place a breakpoint at the end of each stable block to reuse. Cache writes typically cost more than base input; reads a fraction of it. TTLs vary (commonly ~5 min, sometimes longer tiers). Only cache prefixes you'll reuse within the TTL — sparse traffic pays the write penalty repeatedly.
```
# Anthropic example — adapt to your provider's syntax
system=[
    {"type": "text", "text": ROLE_AND_POLICIES},
    {"type": "text", "text": LONG_REFERENCE_DOC,
     "cache_control": {"type": "ephemeral"}},
]
```
Never put churning content at the prefix. Current time, request ID, user ID, or conversation summary at the top defeats caching for the whole prompt. Push them past the breakpoint — either a second system message (for authoritative facts like date/IDs) or the user turn (for task material like questions and documents). Both preserve cache reuse equally.
Use structural delimiters for distinct sections. Wrap longform inputs and behavioral directives in clear delimiters — XML tags (<documents><document>…</document></documents>, <default_to_action>…</default_to_action>), markdown headers, or named JSON fields. Why: delimited sections survive long contexts better than walls of prose, and several model families (notably Claude) are explicitly trained on XML structure.
Place longform data above instructions in the user turn. For 20k+ token inputs, docs at the top, question at the bottom — measured up to 30% quality lift on multi-document tasks across major models.
Write literal, scoped instructions. Modern instruction-tuned models interpret prompts increasingly literally and won't silently generalize. If a rule applies broadly, say so: Apply this formatting to every section, not just the first one. Avoid ALL-CAPS shouting (CRITICAL, MUST) — it causes over-triggering on recent Claude models and adds little on others; use normal imperative voice.
Tune verbosity, tone, tool use, and subagent spawning only when the default is wrong. Examples: Provide concise, focused responses. Skip non-essential context. / Use a warm, collaborative tone. / Spawn multiple subagents in the same turn when fanning out across items. Do not spawn a subagent for work you can complete in a single response. Override defaults only after observing a problem — don't pre-empt.

Verification procedure

After drafting or editing, check each:

Role check — One-sentence role at the very top of the system field?
Cache-order check — List segments top-to-bottom. Is every segment before the breakpoint stable across requests? Move per-request values below the breakpoint (second system message or user turn).
Breakpoint check — Is the breakpoint on the last static block? Is the cached prefix above the provider's minimum (e.g. 1024 tokens for Anthropic Opus/Sonnet, 2048 for Haiku, 1024 for OpenAI)? Below the minimum, breakpoints are silently ignored.
Structure check — Longform docs wrapped in delimiters? Behavioral blocks in named tags or sections?
Literalism check — If an instruction says "format the output" but means "format every section," rewrite with explicit scope.

Common mistakes to watch for

Timestamp at the top. Current date: {{today}} in the role line invalidates the cache every request. Move past the breakpoint — second system message or user turn.
Caching a tiny prefix. A short system prompt below the provider's minimum — breakpoint silently ignored and you still pay the write surcharge. Either grow the cached block or drop the breakpoint.
Conversation history before the breakpoint. History grows every request, so anything after it can never be cached. Put the breakpoint before the rolling history.
Copy-pasted legacy CRITICAL: YOU MUST… shouting. Causes over-triggering on recent models. Rewrite as plain imperatives.
Role in the user message. You are a… in messages[0] instead of the system field weakens steering and wastes a cacheable slot.

Related Skills

lwlee2608/handoff

documentation

VerifiedTrustedCommunity

Use when the user wants to condense the current conversation into a handoff document for another agent to pick up.

SKILL.mdUpdated May 22, 2026

lwlee2608/whiteboard-explain

development

VerifiedTrustedCommunity

Use when the user asks to explain or teach a technical concept. Replies in plain language with a simple diagram instead of a wall of jargon.

SKILL.mdUpdated May 18, 2026

lwlee2608/whiteboard-explain

lwlee2608/prefer-make

tools

VerifiedTrustedCommunity

Use before running any Go toolchain command (`go build`, `go test`, `go run`, `go vet`, `go fmt`, `golangci-lint`). Substitutes make targets when a Makefile is present.

SKILL.mdUpdated Apr 16, 2026

lwlee2608/prefer-make

lwlee2608/linear

tools

VerifiedTrustedCommunity

Use when the user wants to interact with Linear.app — reading or searching issues/tickets.

SKILL.mdUpdated Apr 16, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/lwlee2608/agent-skills.git

# Copy into Claude Code skills folder (global)
cp -r agent-skills/skills/writing-system-prompts ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

lwlee2608/agent-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT