
sharkitect-solutions/prompt-engineering-guidance

Name: prompt-engineering-guidance
Author: sharkitect-solutions

skills/prompt-engineering-guidance/SKILL.md

npx skillsauth add sharkitect-solutions/sharkitect-claude-toolkit prompt-engineering-guidance


Constrained LLM Generation with Guidance

Think like an engineer who has shipped constrained generation in production and learned where it shines and where it silently destroys output quality.

Core insight: Constraints are a double-edged sword. They guarantee format validity but can murder content quality. A regex-constrained email field will always produce valid syntax — but the model may generate nonsense to satisfy the pattern. The art is constraining just enough.

When to Use Constrained Generation (vs. Not)

Before reaching for Guidance or any constrained generation tool, ask:

Does the output NEED to be machine-parseable?
│
├─ YES → Use constrained generation
│  │
│  ├─ Simple structure (JSON, enum, date)?
│  │  └─ Try native structured output first (OpenAI JSON mode; Anthropic tool use with a JSON schema)
│  │     If native mode can't handle it → Guidance or Instructor
│  │
│  └─ Complex structure (nested, conditional, multi-step)?
│     └─ Guidance (grammar-based) or Outlines (schema-based)
│
└─ NO → Do NOT use constrained generation
   Constraints on free-text tasks (creative writing, analysis,
   explanations) actively hurt quality. The model fights the
   constraint instead of thinking about the content.
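The decision tree above can be sketched as a small function. This is only an illustration of the branching: the `OutputNeed` class, its field names, and `recommend` are hypothetical names introduced here, not part of Guidance or any other library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputNeed:
    """Hypothetical description of a task's output requirements."""
    machine_parseable: bool        # must downstream code parse the output?
    complex_structure: bool        # nested, conditional, or multi-step?
    native_mode_sufficient: bool   # can native structured output express it?


def recommend(need: OutputNeed) -> str:
    """Walk the decision tree and return the suggested approach."""
    if not need.machine_parseable:
        # Free-text tasks: constraints hurt quality, so use none.
        return "unconstrained generation"
    if need.complex_structure:
        return "Guidance or Outlines"
    if need.native_mode_sufficient:
        return "native JSON mode"
    return "Guidance or Instructor"


# A simple enum classification that the provider's native mode can express.
choice = recommend(OutputNeed(True, False, True))
```

Encoding the tree this way makes the order of the questions explicit: the parseability check comes first, so a free-text task never reaches the tool-selection branches.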

Tool Selection Decision Tree

| Need | Best Tool | Why Not Others |
|------|-----------|----------------|
| JSON from API models with retries | Instructor | Pydantic validation + automatic retry on parse failure. Guidance has no retry. |
| Regex-constrained fields (emails, dates, IDs) | Guidance | Token-level regex enforcement. Instructor validates after generation (too late for streaming). |
| Complex grammar (nested structures, code syntax) | Guidance | CFG support at token level. Outlines also works but Guidance has better API model support. |
| JSON Schema validation with local models | Outlines | JSON Schema → grammar compilation. More automatic than Guidance's manual grammar. |
| Simple enum/classification | Any (or native) | All tools handle this well. Native JSON mode is simplest. |
| Multi-step agent workflows | Guidance | Stateful functions with @guidance(stateless=False) enable agentic loops with constrained actions. |

The "Just Use Native JSON Mode" Test

Before writing any Guidance code, check if your provider's native structured output works:

OpenAI: response_format={"type": "json_object"} or response_format={"type": "json_schema", ...}
Anthropic: Tool use with JSON schema for structured responses

Native mode is simpler, requires no library, and works with the provider's latest optimizations. Only reach for Guidance when native mode can't express your constraints (regex patterns, CFG grammars, multi-step workflows).
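As a rough sketch of what the native route looks like, the helpers below build the request fragments as plain dictionaries without calling any API. The schema, the helper names, and the tool name are hypothetical; the field layout reflects the providers' documented request shapes as understood at the time of writing and should be checked against current documentation.

```python
import json

# Hypothetical schema used only for illustration.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["bug", "feature", "question"]},
        "summary": {"type": "string"},
    },
    "required": ["category", "summary"],
    "additionalProperties": False,
}


def openai_response_format(name: str, schema: dict) -> dict:
    """Value for the response_format argument (json_schema variant)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": True},
    }


def anthropic_tool_request(name: str, schema: dict) -> dict:
    """tools / tool_choice arguments that force a schema-shaped tool call."""
    return {
        "tools": [
            {
                "name": name,
                "description": "Record a classified support ticket.",
                "input_schema": schema,
            }
        ],
        "tool_choice": {"type": "tool", "name": name},
    }


openai_fragment = openai_response_format("ticket", TICKET_SCHEMA)
anthropic_fragment = anthropic_tool_request("record_ticket", TICKET_SCHEMA)
# Both fragments are plain JSON-serializable data.
serialized = json.dumps({"openai": openai_fragment, "anthropic": anthropic_fragment})
```

If a schema like this covers the requirement, no constrained-generation library is needed; regex fields, CFG grammars, and multi-step workflows are the cases it cannot express.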

Guidance-Specific Expert Knowledge

Token Healing — The Hidden Feature That Matters

Guidance's most underappreciated feature. When you concatenate prompt + generation, tokenization creates unnatural boundaries:

Without healing: "The answer is " + gen() → "The answer is  42" (double space)
With healing:    Guidance backs up one token, regenerates → "The answer is 42"

This seems cosmetic but significantly improves generation quality. The model sees natural token sequences instead of artificial breaks. Always leave token healing enabled (it's on by default).
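The boundary problem can be illustrated with a toy greedy tokenizer. This is a conceptual sketch only: the vocabulary, `tokenize`, `heal`, and `allowed_first_tokens` are invented for the example and do not reflect how Guidance implements token healing internally.

```python
# Toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["http", "://", ":", "//", "example", ".com", "/", "."]


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenization over the toy vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        match = max((t for t in vocab if text.startswith(t, i)), key=len, default=None)
        if match is None:
            raise ValueError(f"cannot tokenize at position {i}")
        tokens.append(match)
        i += len(match)
    return tokens


def heal(prompt, vocab=VOCAB):
    """Drop the prompt's last token and return it as a required prefix."""
    tokens = tokenize(prompt, vocab)
    if not tokens:
        return prompt, ""
    return "".join(tokens[:-1]), tokens[-1]


def allowed_first_tokens(prefix, vocab=VOCAB):
    """Tokens the model may emit first: those that extend the removed prefix."""
    return [t for t in vocab if t.startswith(prefix)]


natural = tokenize("http://example.com")   # the full string uses "://" as one token
kept, prefix = heal("http:")               # the prompt alone ends in a bare ":"
candidates = allowed_first_tokens(prefix)  # healing lets the model pick "://" again
```

Without healing, the prompt `http:` ends in a `:` token and the model would have to continue with `//`, a sequence that differs from how the full URL is normally tokenized. Backing up one token restores the natural choice.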

Grammar Compilation Latency

First-run trap: Grammar compilation is cached after first use, but the FIRST call with a new grammar pattern takes 1-5 seconds. In production:

Pre-warm grammars at service startup
Don't create new grammar patterns per-request
Cache compiled grammars across requests
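The pre-warm and cache pattern might look like the sketch below. It uses `re.compile` purely as a stand-in for grammar compilation so the example runs without Guidance; `compile_grammar`, `prewarm`, and `KNOWN_PATTERNS` are hypothetical names.

```python
import re
from functools import lru_cache

# Patterns the service is known to need (hypothetical examples).
KNOWN_PATTERNS = (r"\d{4}-\d{2}-\d{2}", r"[A-Z]{3}-\d{6}")


@lru_cache(maxsize=128)
def compile_grammar(pattern: str) -> "re.Pattern[str]":
    """Stand-in for an expensive grammar compilation step; cached per pattern."""
    return re.compile(pattern)


def prewarm(patterns=KNOWN_PATTERNS) -> None:
    """Compile every known pattern once so no request pays the first-run cost."""
    for pattern in patterns:
        compile_grammar(pattern)


# Call once at service startup, before accepting traffic.
prewarm()
```

Keying the cache on the pattern string only helps if request handlers reuse a fixed set of patterns. Building a new pattern string per request defeats the cache, which is the trap described above.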

The Constraint Strictness Spectrum

Too loose ◄──────────────────────────────────► Too strict
gen(max_tokens=100)                          gen(regex=r"^(John|Jane)$")
- No format guarantee                        - Format guaranteed
- Model generates freely                     - Model forced into narrow path
- May need post-processing                   - Quality may suffer severely

The sweet spot: Constrain structure, not content. Let the model fill in the meaning.

# GOOD: Structure constrained, content free
lm += '{"name": "' + gen("name", stop='"') + '", "reason": "' + gen("reason", stop='"') + '"}'

# BAD: Content over-constrained
lm += gen("name", regex=r"^(Alice|Bob|Charlie)$")  # Only 3 possible outputs
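The difference between the two styles can be checked without a model by treating each constraint as a plain regular expression. This is a stdlib sketch under that assumption; `TIGHT_NAME`, `STRUCTURE`, and `fits_structure` are hypothetical names, and the patterns only approximate what the Guidance calls above would accept.

```python
import json
import re

# Content-level constraint: only three strings can ever match.
TIGHT_NAME = re.compile(r"(Alice|Bob|Charlie)")

# Structure-level constraint: fixed JSON skeleton, free text inside the quotes.
STRUCTURE = re.compile(r'\{"name": "[^"]*", "reason": "[^"]*"\}')


def fits_structure(text: str) -> bool:
    """True if the text matches the JSON skeleton, whatever the field values are."""
    return STRUCTURE.fullmatch(text) is not None


sample = '{"name": "Dana", "reason": "Most relevant experience"}'
structure_ok = fits_structure(sample)                      # skeleton accepts any name
parsed = json.loads(sample)                                # and the result is valid JSON
tight_ok = TIGHT_NAME.fullmatch(parsed["name"]) is not None  # "Dana" is rejected
```

One limitation worth noting: stopping at the first `"` (the `[^"]*` part here) means a field value can never contain a double quote, so this template suits short values rather than arbitrary text.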

When Guidance + API Models Goes Wrong

Guidance's token-level constraints work perfectly with local models (Transformers, llama.cpp) because it intercepts the logit sampling. With API models (OpenAI, Anthropic), constraints are enforced differently — the library post-filters or uses provider-specific features. This means:

Regex constraints with API models may silently fall back to post-validation (generate → validate → retry)
Grammar constraints may not work at all with some API backends
Token healing works but adds one extra API call per generation boundary

Check references/backends.md for backend-specific capabilities.
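The generate → validate → retry fallback mentioned above can be sketched as follows. This illustrates the general pattern only, not Guidance's actual behavior for any backend; `generate_validated` and the fake generator are hypothetical.

```python
import re
from typing import Callable, Optional


def generate_validated(
    generate: Callable[[], str],
    pattern: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Call generate() until its output fully matches pattern, or give up."""
    compiled = re.compile(pattern)
    for _ in range(max_attempts):
        candidate = generate().strip()
        if compiled.fullmatch(candidate):
            return candidate
    # No attempt matched: the caller must handle the failure explicitly.
    return None


# Fake generator standing in for an API call: first answer is malformed.
_replies = iter(["sometime soon", "2026-04-26"])
result = generate_validated(lambda: next(_replies), r"\d{4}-\d{2}-\d{2}")
```

Unlike token-level enforcement, this loop costs a full request per failed attempt and can still return nothing, which is why the two backends should not be assumed to behave alike.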

File Index

| File | Purpose | When to Load |
|------|---------|--------------|
| SKILL.md | Decision frameworks, anti-patterns, when to use | Always (auto-loaded) |
| references/constraints.md | Regex and grammar pattern cookbook | When writing specific constraint patterns |
| references/backends.md | Backend-specific configuration | When setting up Guidance with a specific provider |
| references/examples.md | Production-ready code examples | When implementing specific patterns |

Do NOT load reference files for decision-making tasks (choosing between tools, architecture design). Load them only when implementing specific Guidance patterns.

Rationalization Table

| Rationalization | When It Appears | Why It's Wrong |
|-----------------|-----------------|----------------|
| "Let me add Guidance to guarantee the output format" | Starting any LLM task | Check native JSON mode first. 80% of structured output needs are handled without a library. |
| "More constraints = better output" | Writing constraint patterns | Over-constraining kills content quality. The model spends tokens satisfying the pattern, not thinking about the answer. |
| "I'll constrain the entire response with a grammar" | Complex generation tasks | Grammar-constrained generation is significantly slower for large outputs. Constrain only the structured parts. |
| "Guidance handles everything the same across backends" | Choosing Guidance for API models | Token-level constraints work natively only with local models. API models have degraded constraint support. |

Red Flags

[ ] Using constrained generation for free-text tasks (analysis, explanations, creative writing) — constraints hurt quality here
[ ] Grammar compilation happening per-request instead of at startup — adds 1-5s latency
[ ] Regex pattern is so strict only a few valid outputs exist — model quality degrades when choice space is too narrow
[ ] No fallback for when constrained generation produces valid-format but nonsense-content output
[ ] Using Guidance grammar constraints with API models expecting local-model behavior
[ ] Every field in the JSON individually constrained with tight regex — constrain structure, not every value

NEVER

NEVER use constrained generation for creative or analytical free-text output — constraints fight the model's reasoning process, producing format-valid but content-poor results
NEVER compile grammars per-request in production — compilation takes 1-5s on first run; pre-warm at startup and cache
NEVER assume Guidance constraints work identically across local and API backends — local models get true token-level enforcement, API models get post-validation fallback
NEVER constrain content when you should constrain structure — gen("name", stop='"') inside a JSON template is better than gen("name", regex=r"[A-Z][a-z]+ [A-Z][a-z]+")
NEVER skip the "do I need a library?" check — native JSON mode handles most structured output needs without adding a dependency
NEVER test constrained generation only on happy-path inputs — adversarial and edge-case inputs reveal where constraints produce valid-format garbage

Implementation Quickstart

When you've decided Guidance is the right tool:

1. Install: pip install guidance (add [transformers] or [llama_cpp] for local models).
2. Choose a backend; see references/backends.md for configuration.
3. Start with select() for enums and gen(stop="\n") for single-line fields, the simplest constraints.
4. Use regex only when format matters (emails, dates, IDs); see references/constraints.md for patterns.
5. For complex structures, use @guidance decorator functions; see references/examples.md.
6. Pre-warm grammars in production; never compile per-request.
7. Always test: generate 10 outputs and check both format validity AND content quality.
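The final testing step can be sketched as a small harness that scores format and content separately. It assumes the generator returns JSON text with `name` and `reason` fields; `evaluate`, `plausible`, and the sample outputs are hypothetical, and the content check is a deliberately crude placeholder for whatever quality test fits the task.

```python
import json
from itertools import cycle
from typing import Callable, Dict


def plausible(record: dict) -> bool:
    """Crude content check: a real name and a reason of at least three words."""
    name = str(record.get("name", "")).strip()
    reason = str(record.get("reason", "")).split()
    return len(name) >= 2 and len(reason) >= 3


def evaluate(
    generate: Callable[[], str],
    content_ok: Callable[[dict], bool],
    n: int = 10,
) -> Dict[str, int]:
    """Generate n outputs and count format-valid and content-plausible ones."""
    counts = {"format_valid": 0, "content_ok": 0}
    for _ in range(n):
        try:
            record = json.loads(generate())
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        counts["format_valid"] += 1
        if content_ok(record):
            counts["content_ok"] += 1
    return counts


# Fake generator: every output parses, but half carry empty content.
_samples = cycle([
    '{"name": "Dana", "reason": "Led the migration project"}',
    '{"name": "x", "reason": ""}',
])
report = evaluate(lambda: next(_samples), plausible, n=10)
```

A gap between the two counts is the signal to look for: outputs that parse but fail the content check are the valid-format, nonsense-content case that format validation alone never catches.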


Use when building constrained LLM generation with Microsoft Guidance library. Use when outputs MUST match a specific format (JSON, dates, enums, code). Use when choosing between Guidance, Instructor, Outlines, or native JSON mode for structured output. Use when debugging constrained generation failures (slow grammar compilation, quality degradation from overly strict constraints). NEVER for general prompt engineering or prompt writing — this is specifically for the Guidance library and constrained generation patterns.

development

Updated Apr 26, 2026


npx skillsauth add sharkitect-solutions/sharkitect-claude-toolkit prompt-engineering-guidance

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Reported % |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 26, 2026, 8:11 AM (19.0s, 5 files scanned)

SKILL.md

name: prompt-engineering-guidance
version: 2.0
optimized: true
optimized_date: 2026-03-10



Install manually


# Clone the repo
git clone https://github.com/sharkitect-solutions/sharkitect-claude-toolkit.git

# Copy into Claude Code skills folder (global)
cp -r sharkitect-claude-toolkit/skills/prompt-engineering-guidance ~/.claude/skills/


Repository

sharkitect-solutions/sharkitect-claude-toolkit

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT