skills/ai-agent-development/agent-guardrails/SKILL.md
Guides implementation of AI agent guardrails: input/output validation, PII filtering, cost control, safety policies, and audit logging. Use when securing agent pipelines or adding compliance and observability.
npx skillsauth add pkuppens/pkuppens agent-guardrailsInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Patterns for securing AI agent input and output, controlling costs, protecting privacy, and maintaining audit trails. Guardrails wrap the agent core — they do not replace good prompt engineering.
Guardrails apply at two points in the agent pipeline:
User Input → [Input Guardrails] → Agent Core → [Output Guardrails] → User Output
│
[Audit Log]
| Guardrail | Purpose | Implementation | |-----------|---------|----------------| | Prompt injection detection | Block attempts to override system instructions | Pattern matching, classifier model, or dedicated guardrail LLM | | Input sanitisation | Strip dangerous content (scripts, SQL) | Regex/allowlist filtering | | Topic restriction | Reject off-topic or prohibited queries | Classifier or keyword filter | | Rate limiting | Prevent abuse and control costs | Token bucket, sliding window | | Authentication | Verify caller identity | OAuth2, API key, JWT validation |
| Guardrail | Purpose | Implementation | |-----------|---------|----------------| | PII filtering | Redact personal data from responses | Named entity recognition (spaCy, Presidio) | | Content policy | Block harmful, biased, or inappropriate output | Content classifier, keyword blocklist | | Hallucination detection | Flag unsupported claims | Cross-reference with retrieved sources | | Format validation | Ensure structured output matches expected schema | JSON schema validation, Pydantic parsing | | Citation enforcement | Require source references in responses | Post-processing check against retrieved docs |
Use Microsoft Presidio or spaCy NER for on-premises PII detection:
[PERSON], [EMAIL])Healthcare-specific: filter patient identifiers, medical record numbers, and clinical data per HIPAA/GDPR.
class CostGuard:
def __init__(self, max_tokens_per_request: int, max_cost_per_day: float):
self.max_tokens_per_request = max_tokens_per_request
self.max_cost_per_day = max_cost_per_day
self.daily_cost = 0.0
def check(self, estimated_tokens: int, cost_per_token: float) -> bool:
estimated_cost = estimated_tokens * cost_per_token
if estimated_tokens > self.max_tokens_per_request:
return False
if self.daily_cost + estimated_cost > self.max_cost_per_day:
return False
return True
Every agent interaction should produce an audit record:
| Field | Description |
|-------|-------------|
| request_id | Unique identifier for the interaction |
| timestamp | ISO 8601 timestamp |
| user_id | Authenticated caller identity |
| model | LLM model used |
| input_tokens | Token count for the input |
| output_tokens | Token count for the output |
| tools_called | List of tools invoked with arguments (redacted) |
| guardrails_triggered | Which guardrails fired and their action (block, redact, warn) |
| latency_ms | End-to-end response time |
| cost | Estimated cost of the interaction |
| Tool | Language | Scope | |------|----------|-------| | Guardrails AI | Python | Input/output validation with RAIL specs | | NeMo Guardrails (NVIDIA) | Python | Programmable guardrails for LLM apps | | Microsoft Presidio | Python | PII detection and anonymisation | | LangChain output parsers | Python | Structured output validation | | Semantic Kernel filters | C# | Pre/post-processing in the SK pipeline |
testing
Syncs remote default branch locally (checkout, fetch --prune, pull) and returns to the previous branch when it still exists. Reports stashes and worktrees not yet handled. Use when the user asks to sync main, update default branch, fetch/pull origin, or run /sync-branch.
tools
Creates, queries, updates, and links Azure Boards work items via az boards CLI. Use when filing ADO work items, running WIQL queries, or setting area path, iteration, tags, and assignee.
tools
Creates, reviews, and completes Azure Repos pull requests and branch policies via az repos CLI. Use when opening ADO PRs, setting required reviewers, or configuring build validation policies.
development
Guides Azure Pipelines YAML structure, build validation on PRs, and staged deployment with environments and approvals. Use when authoring azure-pipelines.yml or configuring CI/CD on Azure DevOps.