apps/docs/skills/identity-and-guardrails/SKILL.md
Enable prompt injection detection, PII masking, behavioral contracts, kill switch controls, and agent identity for safe production deployments.
npx skillsauth add tylerjrbuell/reactive-agents-ts identity-and-guardrails

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Produce a builder with guardrails, behavioral contracts, and safety controls correctly configured so the agent operates within defined bounds and can be stopped when needed.
import { ReactiveAgents } from "@reactive-agents/runtime";

const agent = await ReactiveAgents.create()
  .withName("assistant")
  .withProvider("anthropic")
  .withReasoning({ defaultStrategy: "adaptive" })
  .withTools({ allowedTools: ["web-search", "http-get", "file-read", "checkpoint"] })
  .withGuardrails({
    injection: true, // detect prompt injection attempts
    pii: true, // detect and mask PII in inputs/outputs
    toxicity: true, // detect toxic content
  })
  .withBehavioralContracts({
    deniedTools: ["file-write", "shell-execute"],
    maxToolCalls: 30,
    maxIterations: 20,
    requireDisclosure: true, // agent must disclose it is an AI
  })
  .withKillSwitch() // enables pause/resume/stop/terminate controls
  .withIdentity() // enables identity and session tracking
  .withAudit() // records all tool calls and decisions to audit log
  .build();

.withGuardrails()
// Enables all detectors with defaults: injection=true, pii=true, toxicity=true

.withGuardrails({
  injection: true, // detect "ignore previous instructions" attacks
  pii: false, // disable PII masking (e.g., agent legitimately processes PII)
  toxicity: true,
  customBlocklist: ["competitor-product", "internal-codename"], // substring blocklist
})

Guardrail violations abort the turn and return a structured error — the agent never processes the blocked content.
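
The customBlocklist option is described in this skill as a case-insensitive substring blocklist. The sketch below illustrates that matching rule on its own, outside the framework. The helper name `findBlockedTerms` is hypothetical and is not part of @reactive-agents/runtime; the library's actual matching and error shape may differ.

```typescript
// Hypothetical helper, not part of @reactive-agents/runtime.
// Returns every blocklist term that occurs in the input, ignoring case.
function findBlockedTerms(input: string, blocklist: readonly string[]): string[] {
  const haystack = input.toLowerCase();
  return blocklist.filter(
    // Skip empty terms, which would otherwise match every input.
    (term) => term.length > 0 && haystack.includes(term.toLowerCase()),
  );
}

const blocklist = ["competitor-product", "internal-codename"];
const hits = findBlockedTerms("Compare us with Competitor-Product please", blocklist);
// hits is ["competitor-product"]; an empty array would mean the input passed.
```

Because the match is on substrings, a short blocklist entry can also match inside longer, unrelated words, so specific terms are usually safer than generic ones.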
Full field reference for .withBehavioralContracts(contract):
.withBehavioralContracts({
  deniedTools: ["file-delete", "shell-execute"], // tools the agent may NEVER call
  allowedTools: ["web-search", "file-read"], // if set, ONLY these tools are allowed
  maxToolCalls: 50, // hard stop after N total tool calls
  maxIterations: 20, // hard stop after N reasoning iterations
  maxOutputLength: 4000, // truncate/block output over N chars
  deniedTopics: ["competitor names", "legal advice"], // topics agent must refuse
  requireDisclosure: true, // first response must disclose AI identity
})

Contract violations are enforced at runtime — violations halt the current turn with a ContractViolation error.
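
To make the tool-related contract fields concrete, here is a minimal sketch of how a single tool call could be checked against deniedTools, allowedTools, and maxToolCalls. The `ToolContract` and `ToolDecision` types and the `checkToolCall` function are hypothetical names for illustration. The order of checks (deny list first, then allow list, then call budget) is an assumption, not the framework's documented behavior.

```typescript
// Hypothetical types and helper for illustration only; the real enforcement
// happens inside the framework and may differ.
interface ToolContract {
  deniedTools?: string[];
  allowedTools?: string[];
  maxToolCalls?: number;
}

type ToolDecision = { allowed: true } | { allowed: false; reason: string };

function checkToolCall(contract: ToolContract, tool: string, callsSoFar: number): ToolDecision {
  // Assumption: the deny list is checked before the allow list.
  if (contract.deniedTools !== undefined && contract.deniedTools.includes(tool)) {
    return { allowed: false, reason: `tool "${tool}" is denied` };
  }
  // If an allow list is set, only the tools on it may be called.
  if (contract.allowedTools !== undefined && !contract.allowedTools.includes(tool)) {
    return { allowed: false, reason: `tool "${tool}" is not in allowedTools` };
  }
  // Hard stop once the total call budget is used up.
  if (contract.maxToolCalls !== undefined && callsSoFar >= contract.maxToolCalls) {
    return { allowed: false, reason: `maxToolCalls (${contract.maxToolCalls}) reached` };
  }
  return { allowed: true };
}

const contract: ToolContract = {
  deniedTools: ["shell-execute"],
  allowedTools: ["web-search", "file-read"],
  maxToolCalls: 2,
};
const first = checkToolCall(contract, "web-search", 0); // allowed
const denied = checkToolCall(contract, "shell-execute", 0); // rejected: on the deny list
const overBudget = checkToolCall(contract, "web-search", 2); // rejected: budget used up
```

In the framework itself a rejected call is reported as a ContractViolation error; this sketch returns a plain decision object so that the rule stays easy to read.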
.withKillSwitch()
// Enables runtime controls on the built agent handle:
const handle = agent.run("Do a long task...");
// Graceful pause (waits for current phase to finish)
await handle.pause();
// Resume from paused state
await handle.resume();
// Graceful stop (finishes current phase, then stops)
await handle.stop("User requested cancellation");
// Immediate termination
await handle.terminate("Emergency shutdown");
Kill switch controls are no-ops if .withKillSwitch() was not called during build.
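
The control semantics above can be pictured as a small state machine. The sketch below is a hypothetical model, not the framework's implementation. The `RunControl` class and its `enabled` flag are illustrative names, and the methods are synchronous here for brevity, whereas the real handle methods are awaited.

```typescript
// Hypothetical model of the run-control states; not the framework's code.
type RunState = "running" | "paused" | "stopped" | "terminated";

class RunControl {
  private current: RunState = "running";
  private stopReason: string | undefined = undefined;
  // Mirrors whether .withKillSwitch() was called at build time.
  private readonly enabled: boolean;

  constructor(enabled: boolean) {
    this.enabled = enabled;
  }

  state(): RunState {
    return this.current;
  }

  reason(): string | undefined {
    return this.stopReason;
  }

  pause(): void {
    if (this.enabled && this.current === "running") this.current = "paused";
  }

  resume(): void {
    if (this.enabled && this.current === "paused") this.current = "running";
  }

  stop(reason: string): void {
    this.finish("stopped", reason);
  }

  terminate(reason: string): void {
    this.finish("terminated", reason);
  }

  private finish(next: "stopped" | "terminated", reason: string): void {
    const done = this.current === "stopped" || this.current === "terminated";
    if (!this.enabled || done) return; // silently ignored, no error thrown
    this.current = next;
    this.stopReason = reason;
  }
}

const control = new RunControl(true);
control.pause(); // state() is now "paused"
control.resume(); // back to "running"
control.stop("User requested cancellation"); // state() is now "stopped"

const inert = new RunControl(false); // kill switch not enabled
inert.pause();
inert.stop("ignored"); // both calls are no-ops; state() stays "running"
```

The model does not capture the difference between a graceful stop (which waits for the current phase) and an immediate terminate; it only shows which transitions are accepted and that calls are silently ignored when the kill switch is not enabled.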
.withIdentity() // enables agent identity headers, session IDs, and persona tracking
.withAudit() // records all tool calls, guardrail decisions, and contract checks to an audit trail
Identity and audit work independently — enable both for full traceability.

Field reference for .withGuardrails():

| Field | Type | Default | Notes |
|-------|------|---------|-------|
| injection | boolean | true | Prompt injection detection |
| pii | boolean | true | PII detection and masking |
| toxicity | boolean | true | Toxic content detection |
| customBlocklist | string[] | [] | Case-insensitive substring blocklist |

Field reference for .withBehavioralContracts():

| Field | Type | Notes |
|-------|------|-------|
| deniedTools | string[] | Tools that may never be called |
| allowedTools | string[] | If set, only these tools are allowed |
| maxToolCalls | number | Hard stop after N total tool calls |
| maxIterations | number | Hard stop after N reasoning iterations |
| maxOutputLength | number | Max output characters before truncation |
| deniedTopics | string[] | Topics the agent must refuse |
| requireDisclosure | boolean | Agent must disclose AI identity |

Pitfalls:

- .withGuardrails() with no args enables ALL detectors; disable selectively if your use case legitimately handles PII.
- deniedTools in a contract and allowedTools in .withTools() are independent: a tool can be in .withTools({ allowedTools }) but still blocked by a contract's deniedTools.
- Kill switch controls (pause, resume, stop, terminate) are no-ops without .withKillSwitch(); no error is thrown and calls are silently ignored.
- requireDisclosure enforces that the agent states it is AI in its first response; this is a prompt-level enforcement, not cryptographic.
- Contract violations raise ContractViolation errors; handle these in your error callback or the agent run will throw.
- .withAudit() without a log destination writes to the observability stream; add .withObservability() to capture audit events.
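
The independence of .withTools({ allowedTools }) and a contract's deniedTools can be illustrated by computing which registered tools remain callable once a contract is applied. This is a sketch under the assumption that the two layers simply intersect; `effectiveTools` is a hypothetical helper, not a framework API.

```typescript
// Hypothetical helper: filters the tools registered via .withTools() by a
// contract's deny list and, if present, its allow list.
function effectiveTools(
  registered: string[],
  denied: string[] = [],
  allowedOnly?: string[],
): string[] {
  return registered.filter(
    (tool) =>
      !denied.includes(tool) &&
      (allowedOnly === undefined || allowedOnly.includes(tool)),
  );
}

const registered = ["web-search", "http-get", "file-read", "file-write"];
const callable = effectiveTools(registered, ["file-write", "shell-execute"]);
// callable is ["web-search", "http-get", "file-read"]:
// "file-write" is registered but still blocked by the contract.
```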