skills/prompt-engineering-guidance/SKILL.md
Use when building constrained LLM generation with Microsoft Guidance library. Use when outputs MUST match a specific format (JSON, dates, enums, code). Use when choosing between Guidance, Instructor, Outlines, or native JSON mode for structured output. Use when debugging constrained generation failures (slow grammar compilation, quality degradation from overly strict constraints). NEVER for general prompt engineering or prompt writing — this is specifically for the Guidance library and constrained generation patterns.
npx skillsauth add sharkitect-solutions/sharkitect-claude-toolkit prompt-engineering-guidance
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Think like an engineer who has shipped constrained generation in production and learned where it shines and where it silently destroys output quality.
Core insight: Constraints are a double-edged sword. They guarantee format validity but can murder content quality. A regex-constrained email field will always produce valid syntax — but the model may generate nonsense to satisfy the pattern. The art is constraining just enough.
Before reaching for Guidance or any constrained generation tool, ask:
Does the output NEED to be machine-parseable?
│
├─ YES → Use constrained generation
│ │
│ ├─ Simple structure (JSON, enum, date)?
│ │ └─ Try native JSON mode first (OpenAI, Anthropic both support it)
│ │ If native mode can't handle it → Guidance or Instructor
│ │
│ └─ Complex structure (nested, conditional, multi-step)?
│ └─ Guidance (grammar-based) or Outlines (schema-based)
│
└─ NO → Do NOT use constrained generation
Constraints on free-text tasks (creative writing, analysis,
explanations) actively hurt quality. The model fights the
constraint instead of thinking about the content.
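The decision tree above can be written down as a small function, which is handy for a design-review checklist. This is only a sketch: the `OutputNeed` fields and the returned labels are hypothetical names chosen for illustration, not part of Guidance or any other library.

```python
from dataclasses import dataclass


@dataclass
class OutputNeed:
    """Hypothetical description of one generation task."""
    machine_parseable: bool               # will code parse the output?
    complex_structure: bool = False       # nested, conditional, or multi-step?
    native_mode_sufficient: bool = True   # can provider JSON mode express it?


def choose_approach(need: OutputNeed) -> str:
    """Walk the decision tree top to bottom and return a recommendation."""
    if not need.machine_parseable:
        # Free-text tasks: constraints hurt quality.
        return "unconstrained generation"
    if need.complex_structure:
        return "Guidance or Outlines"
    if need.native_mode_sufficient:
        return "native JSON mode"
    return "Guidance or Instructor"


essay = choose_approach(OutputNeed(machine_parseable=False))
simple = choose_approach(OutputNeed(machine_parseable=True))
nested = choose_approach(OutputNeed(machine_parseable=True, complex_structure=True))
```

The order of the checks mirrors the tree: parseability first, then structural complexity, then whether native mode is enough.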
| Need | Best Tool | Why Not Others |
|------|-----------|---------------|
| JSON from API models with retries | Instructor | Pydantic validation + automatic retry on parse failure. Guidance has no retry. |
| Regex-constrained fields (emails, dates, IDs) | Guidance | Token-level regex enforcement. Instructor validates after generation (too late for streaming). |
| Complex grammar (nested structures, code syntax) | Guidance | CFG support at token level. Outlines also works but Guidance has better API model support. |
| JSON Schema validation with local models | Outlines | JSON Schema → grammar compilation. More automatic than Guidance's manual grammar. |
| Simple enum/classification | Any (or native) | All tools handle this well. Native JSON mode is simplest. |
| Multi-step agent workflows | Guidance | Stateful functions with @guidance(stateless=False) enable agentic loops with constrained actions. |
Before writing any Guidance code, check if your provider's native structured output works:
response_format={"type": "json_object"} or response_format={"type": "json_schema", ...}
Native mode is simpler, requires no library, and works with the provider's latest optimizations. Only reach for Guidance when native mode can't express your constraints (regex patterns, CFG grammars, multi-step workflows).
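As a rough illustration of the native route, the sketch below builds a `json_schema` value for `response_format` and checks a reply against the schema's required keys. The nested `json_schema` layout (`name`, `schema`, `strict`) follows the OpenAI-style shape and is an assumption here; the ticket schema and the helper names are hypothetical. No network call is made.

```python
import json

# Hypothetical schema for illustration; field names are not from the skill.
ticket_schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["bug", "feature", "question"]},
        "summary": {"type": "string"},
    },
    "required": ["category", "summary"],
    "additionalProperties": False,
}


def build_response_format(name: str, schema: dict) -> dict:
    """Build a json_schema response_format value (OpenAI-style shape, assumed)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": True},
    }


def parse_reply(raw: str, schema: dict) -> dict:
    """Parse the reply text and check required keys; raises ValueError on failure."""
    data = json.loads(raw)  # JSONDecodeError is a ValueError subclass
    missing = [key for key in schema["required"] if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


response_format = build_response_format("ticket", ticket_schema)
sample = parse_reply('{"category": "bug", "summary": "Login fails"}', ticket_schema)
```

If a payload like this covers the need, no constrained-generation library is required; check the provider's documentation for the exact fields it accepts.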
Token healing is Guidance's most underappreciated feature. When you concatenate prompt + generation, tokenization creates unnatural boundaries:
Without healing: "The answer is " + gen() → "The answer is  42" (double space)
With healing: Guidance backs up one token, regenerates → "The answer is 42"
This seems cosmetic but significantly improves generation quality. The model sees natural token sequences instead of artificial breaks. Always leave token healing enabled (it's on by default).
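To make the "back up one token" step concrete, here is a toy sketch of the general technique. It is not Guidance's implementation: the vocabulary, the greedy tokenizer, and the `heal` helper are all hypothetical. The idea shown is that the prompt's last token is dropped and the first generated token must start with the dropped text, so the model can choose a longer token that spans the boundary.

```python
# Toy vocabulary; real tokenizers have tens of thousands of entries.
VOCAB = ["The", " link", " is", " http", ":", "://", "//", " 42", " "]


def tokenize(text: str) -> list[str]:
    """Greedy longest-match tokenization over the toy vocabulary."""
    tokens = []
    while text:
        match = max((t for t in VOCAB if text.startswith(t)), key=len)
        tokens.append(match)
        text = text[len(match):]
    return tokens


def heal(prompt: str) -> tuple[str, list[str]]:
    """Drop the prompt's last token and list the tokens allowed to replace it.

    The first generated token must start with the dropped text, so the
    model can pick a longer, more natural token across the boundary.
    """
    tokens = tokenize(prompt)
    dropped = tokens.pop()
    allowed = [t for t in VOCAB if t.startswith(dropped)]
    return "".join(tokens), allowed


trimmed_prompt, allowed_first_tokens = heal("The link is http:")
```

Here the prompt ends in the token `:`, and healing reopens the choice between `:` and `://` instead of forcing the model to continue after a lone `:`.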
First-run trap: Grammar compilation is cached after first use, but the FIRST call with a new grammar pattern takes 1-5 seconds. Account for that first-call latency in production.
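One common way to keep that first-call cost away from user requests is to compile known patterns at startup. The sketch below shows the pattern with a stand-in: `re.compile` plus `functools.lru_cache` play the role of an expensive, cached grammar compile. It does not call Guidance, and the pattern list and function names are hypothetical.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=None)
def compile_constraint(pattern: str) -> re.Pattern:
    """Stand-in for an expensive grammar compile; cached per pattern string."""
    return re.compile(pattern)


# Hypothetical patterns the service expects to need.
KNOWN_PATTERNS = (r"\d{4}-\d{2}-\d{2}", r"[A-Z]{3}-\d{6}")


def warm_up(patterns=KNOWN_PATTERNS) -> int:
    """Compile every known pattern at startup; returns the cache size."""
    for pattern in patterns:
        compile_constraint(pattern)
    return compile_constraint.cache_info().currsize


warmed = warm_up()
```

With a real constrained-generation backend, the equivalent would be to run one throwaway generation per grammar during service start, assuming the backend's cache persists for the process lifetime.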
Too loose ◄──────────────────────────────────► Too strict
gen(max_tokens=100)                    gen(regex=r"^(John|Jane)$")
- No format guarantee                  - Format guaranteed
- Model generates freely               - Model forced into narrow path
- May need post-processing             - Quality may suffer severely
The sweet spot: Constrain structure, not content. Let the model fill in the meaning.
# GOOD: Structure constrained, content free
lm += '{"name": "' + gen("name", stop='"') + '", "reason": "' + gen("reason", stop='"') + '"}'
# BAD: Content over-constrained
lm += gen("name", regex=r"^(Alice|Bob|Charlie)$") # Only 3 possible outputs
Guidance's token-level constraints work perfectly with local models (Transformers, llama.cpp) because it intercepts the logit sampling. With API models (OpenAI, Anthropic), constraints are enforced differently: the library post-filters or uses provider-specific features. This means constraint support varies by backend and is weaker for API models.
Check references/backends.md for backend-specific capabilities.
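The phrase "intercepts the logit sampling" can be pictured with a minimal sketch of token-level masking. This is an illustration of the general idea under simplifying assumptions (greedy selection, a four-token vocabulary), not Guidance's code; the scores and the `constrained_pick` helper are hypothetical.

```python
import math


def constrained_pick(logits: dict[str, float], allowed: set[str]) -> str:
    """Mask disallowed tokens to -inf, then take the highest-scoring token."""
    masked = {
        token: (score if token in allowed else -math.inf)
        for token, score in logits.items()
    }
    best = max(masked, key=masked.get)
    if masked[best] == -math.inf:
        raise ValueError("no allowed token in vocabulary")
    return best


# Hypothetical next-token scores after the prompt 'Status: '
logits = {"maybe": 2.1, "yes": 1.4, "no": 0.3, "\n": -0.5}
unconstrained = max(logits, key=logits.get)
choice = constrained_pick(logits, allowed={"yes", "no"})
```

The mask needs the raw scores at every generation step, which a local backend can provide. It also shows the quality risk discussed earlier: the model's preferred token (`maybe`) is discarded, and the output is forced onto the best remaining option.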
| File | Purpose | When to Load |
|------|---------|-------------|
| SKILL.md | Decision frameworks, anti-patterns, when to use | Always (auto-loaded) |
| references/constraints.md | Regex and grammar pattern cookbook | When writing specific constraint patterns |
| references/backends.md | Backend-specific configuration | When setting up Guidance with a specific provider |
| references/examples.md | Production-ready code examples | When implementing specific patterns |
Do NOT load reference files for decision-making tasks (choosing between tools, architecture design). Load them only when implementing specific Guidance patterns.
| Rationalization | When It Appears | Why It's Wrong |
|----------------|-----------------|----------------|
| "Let me add Guidance to guarantee the output format" | Starting any LLM task | Check native JSON mode first. 80% of structured output needs are handled without a library. |
| "More constraints = better output" | Writing constraint patterns | Over-constraining kills content quality. The model spends tokens satisfying the pattern, not thinking about the answer. |
| "I'll constrain the entire response with a grammar" | Complex generation tasks | Grammar-constrained generation is significantly slower for large outputs. Constrain only the structured parts. |
| "Guidance handles everything the same across backends" | Choosing Guidance for API models | Token-level constraints work natively only with local models. API models have degraded constraint support. |
gen("name", stop='"') inside a JSON template is better than gen("name", regex=r"[A-Z][a-z]+ [A-Z][a-z]+")

When you've decided Guidance is the right tool:
1. pip install guidance (add [transformers] or [llama_cpp] for local models)
2. references/backends.md for configuration
3. select() for enums and gen(stop="\n") for single-line fields (the simplest constraints)
4. references/constraints.md for patterns
5. @guidance decorator functions; see references/examples.md