apps/docs/skills/cost-budget-enforcement/SKILL.md
Set per-request, per-session, daily, and monthly spend limits, configure rate limiting and circuit breakers, and isolate costs per user or tenant.
Produce a builder with cost tracking, budget limits, and rate limiting configured so the agent never exceeds defined spending thresholds.
import { ReactiveAgents } from "@reactive-agents/runtime";
const agent = await ReactiveAgents.create()
  .withName("assistant")
  .withProvider("anthropic")
  .withReasoning({ defaultStrategy: "adaptive", maxIterations: 15 })
  .withTools({ allowedTools: ["web-search", "http-get", "checkpoint"] })
  .withCostTracking({
    perRequest: 0.25, // max $0.25 per LLM call
    perSession: 2.0,  // max $2.00 per agent.run() call
    daily: 10.0,      // max $10.00/day across all sessions
    monthly: 100.0,   // max $100.00/month
  })
  .withRateLimiting({
    requestsPerMinute: 30,
    tokensPerMinute: 50_000,
    maxConcurrent: 3,
  })
  .withCircuitBreaker() // auto-opens on provider errors; prevents cascading failures
  .build();
.withCostTracking()
// Enables cost tracking with defaults:
// perRequest: $1.00, perSession: $5.00, daily: $25.00, monthly: $200.00
.withCostTracking({
  perRequest: 0.50, // hard stop mid-request if cost would exceed this
  perSession: 5.0,
  daily: 25.0,      // daily limit (default $25.00)
  monthly: 200.0,
})
When a budget is exceeded, the agent throws a BudgetExceededError and stops. Daily and monthly budgets reset in the timezone configured in .withGateway() if one is used, and in UTC otherwise.
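The four limits can be pictured as one guard that is consulted before each LLM request is billed. The following is a minimal sketch of that idea, not the framework's implementation: `BudgetGuard`, `BudgetExceededSketchError`, and their methods are hypothetical names introduced for illustration, and the reset of daily and monthly totals is left out.

```typescript
// Illustrative only: every name here is hypothetical, not part of @reactive-agents/runtime.
type BudgetScope = "perRequest" | "perSession" | "daily" | "monthly";
type BudgetLimits = Record<BudgetScope, number>;

class BudgetExceededSketchError extends Error {
  readonly scope: BudgetScope;

  constructor(scope: BudgetScope, limit: number, attempted: number) {
    super(`${scope} budget exceeded: $${attempted.toFixed(2)} > $${limit.toFixed(2)}`);
    this.name = "BudgetExceededSketchError";
    this.scope = scope;
  }
}

class BudgetGuard {
  private readonly limits: BudgetLimits;
  // Running totals in USD for the three cumulative scopes.
  private readonly spent = { perSession: 0, daily: 0, monthly: 0 };

  constructor(limits: BudgetLimits) {
    this.limits = limits;
  }

  // Record the cost of one LLM request, or throw if any limit would be exceeded.
  charge(costUsd: number): void {
    if (costUsd > this.limits.perRequest) {
      throw new BudgetExceededSketchError("perRequest", this.limits.perRequest, costUsd);
    }
    const scopes = ["perSession", "daily", "monthly"] as const;
    // Check every scope before mutating, so a rejected charge leaves the totals unchanged.
    for (const scope of scopes) {
      const attempted = this.spent[scope] + costUsd;
      if (attempted > this.limits[scope]) {
        throw new BudgetExceededSketchError(scope, this.limits[scope], attempted);
      }
    }
    for (const scope of scopes) {
      this.spent[scope] += costUsd;
    }
  }

  // Call when a new agent.run() starts: only the session total resets.
  resetSession(): void {
    this.spent.perSession = 0;
  }

  total(scope: "perSession" | "daily" | "monthly"): number {
    return this.spent[scope];
  }
}
```

Checking all scopes before updating any total means a request that trips the session limit does not also consume daily or monthly budget.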
.withRateLimiting()
// Defaults: 60 RPM, 100,000 TPM, 10 concurrent requests
.withRateLimiting({
  requestsPerMinute: 60,    // max LLM requests per minute
  tokensPerMinute: 100_000, // max tokens per minute (input + output)
  maxConcurrent: 10,        // max simultaneous in-flight LLM requests
})
Requests that exceed a limit are queued, not dropped: the agent waits for capacity before proceeding.
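To make the waiting behaviour concrete, here is a rough sketch of how a limiter could work out how long a request must wait under a requests-per-minute and a tokens-per-minute cap. It assumes a 60-second sliding window, which the text above does not specify, and it ignores maxConcurrent. `SlidingWindowLimiter` and its methods are hypothetical names, not the library's API.

```typescript
// Illustrative only: a sliding-window model of RPM/TPM limits with hypothetical names.
interface Usage {
  at: number;     // timestamp in ms when the request was sent
  tokens: number; // input + output tokens charged to the request
}

class SlidingWindowLimiter {
  private readonly windowMs = 60_000;
  private readonly requestsPerMinute: number;
  private readonly tokensPerMinute: number;
  private usage: Usage[] = [];

  constructor(requestsPerMinute: number, tokensPerMinute: number) {
    this.requestsPerMinute = requestsPerMinute;
    this.tokensPerMinute = tokensPerMinute;
  }

  private fits(requests: number, usedTokens: number, tokens: number): boolean {
    return requests < this.requestsPerMinute && usedTokens + tokens <= this.tokensPerMinute;
  }

  // Milliseconds to wait before a request of `tokens` fits; 0 means it can go now.
  waitMs(now: number, tokens: number): number {
    const live = this.usage.filter((u) => u.at > now - this.windowMs);
    let requests = live.length;
    let used = live.reduce((sum, u) => sum + u.tokens, 0);
    if (this.fits(requests, used, tokens)) return 0;
    // Walk oldest-first: each expiry frees one request slot and that request's tokens.
    for (const u of live) {
      requests -= 1;
      used -= u.tokens;
      if (this.fits(requests, used, tokens)) return u.at + this.windowMs - now;
    }
    return Infinity; // the request alone can never fit under the configured limits
  }

  // Record a request that was actually sent at `now` (timestamps must not go backwards).
  record(now: number, tokens: number): void {
    this.usage = this.usage.filter((u) => u.at > now - this.windowMs);
    this.usage.push({ at: now, tokens });
  }
}
```

A queueing caller would sleep for `waitMs(...)`, then call `record(...)` and send the request, so that nothing is dropped.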
.withCircuitBreaker()
// Default thresholds (open after 5 failures in 60s window, retry after 30s)
.withCircuitBreaker({
  failureThreshold: 5,  // open circuit after N failures within windowMs
  windowMs: 60_000,     // failure counting window
  retryAfterMs: 30_000, // wait before trying half-open probe
})
Circuit breaker states: closed (normal) → open (failing fast) → half-open (probing recovery).
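The three states can be sketched as a small state machine driven by an explicit clock. This is an illustration, not the framework's implementation. It assumes failures are counted inside a sliding window, that a failed probe reopens the circuit immediately, and that a successful probe closes it. `CircuitBreakerSketch` and its methods are hypothetical names.

```typescript
// Illustrative only: a closed -> open -> half-open state machine with hypothetical names.
type CircuitState = "closed" | "open" | "half-open";

class CircuitBreakerSketch {
  private readonly failureThreshold: number;
  private readonly windowMs: number;
  private readonly retryAfterMs: number;
  private state: CircuitState = "closed";
  private failures: number[] = []; // timestamps (ms) of failures inside the window
  private openedAt = 0;

  constructor(failureThreshold = 5, windowMs = 60_000, retryAfterMs = 30_000) {
    this.failureThreshold = failureThreshold;
    this.windowMs = windowMs;
    this.retryAfterMs = retryAfterMs;
  }

  // True if a call may go out at `now`; moves open -> half-open once retryAfterMs has elapsed.
  // This sketch does not limit how many probes run while half-open.
  allow(now: number): boolean {
    if (this.state === "open" && now - this.openedAt >= this.retryAfterMs) {
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  // A successful probe closes the circuit; successes while closed change nothing.
  onSuccess(): void {
    if (this.state === "half-open") {
      this.state = "closed";
      this.failures = [];
    }
  }

  onFailure(now: number): void {
    if (this.state === "half-open") {
      this.trip(now); // a failed probe reopens the circuit immediately
      return;
    }
    this.failures = this.failures.filter((t) => now - t < this.windowMs);
    this.failures.push(now);
    if (this.failures.length >= this.failureThreshold) this.trip(now);
  }

  current(): CircuitState {
    return this.state;
  }

  private trip(now: number): void {
    this.state = "open";
    this.openedAt = now;
    this.failures = [];
  }
}
```

Passing `now` in explicitly keeps the sketch deterministic; a real caller would supply `Date.now()`.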
// Create one agent per user/tenant with separate tracking contexts
const userAgent = await ReactiveAgents.create()
  .withProvider("anthropic")
  .withCostTracking({ perSession: 1.0, daily: 5.0 })
  .withName(`user-${userId}`)
  .withSystemPrompt(`You are assisting user ${userId}.`)
  .build();

// Or use per-request context injection:
const result = await agent.run(task, {
  context: { userId, tenantId }, // included in cost tracking metadata
});
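Once userId and tenantId travel with each run, spend can be attributed on the application side. Below is a minimal sketch of such a ledger. It assumes your own code receives a cost figure per run, which the text above does not show, and `TenantCostLedger` and `CostContext` are hypothetical names.

```typescript
// Illustrative only: an application-side ledger keyed by tenant and user (hypothetical names).
interface CostContext {
  userId: string;
  tenantId: string;
}

class TenantCostLedger {
  // tenantId -> (userId -> accumulated USD)
  private readonly totals = new Map<string, Map<string, number>>();

  // Add the cost of one run to the user's running total.
  record(ctx: CostContext, costUsd: number): void {
    let users = this.totals.get(ctx.tenantId);
    if (!users) {
      users = new Map<string, number>();
      this.totals.set(ctx.tenantId, users);
    }
    users.set(ctx.userId, (users.get(ctx.userId) ?? 0) + costUsd);
  }

  userTotal(ctx: CostContext): number {
    return this.totals.get(ctx.tenantId)?.get(ctx.userId) ?? 0;
  }

  // Sum over every user that belongs to the tenant; unknown tenants total 0.
  tenantTotal(tenantId: string): number {
    let sum = 0;
    this.totals.get(tenantId)?.forEach((value) => {
      sum += value;
    });
    return sum;
  }
}
```

The nested maps keep tenants separate even when two tenants reuse the same user ID.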
import { createLiteLLMPricingProvider } from "@reactive-agents/llm-provider";
.withDynamicPricing(createLiteLLMPricingProvider())
// Fetches live model prices from LiteLLM pricing API
// Required when using models whose costs are not in the built-in price table
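The reason a price source matters is that spend is derived from token counts and per-model prices, so a model with no known price cannot be budgeted. The sketch below shows that arithmetic with a fallback lookup. The model name and prices are made up, and `estimateCostUsd`, `ModelPrice`, and `builtInPrices` are hypothetical names that do not reflect the library's price table or the LiteLLM data format.

```typescript
// Illustrative only: made-up model names and prices, hypothetical function names.
interface ModelPrice {
  inputPerMTok: number;  // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

// Stand-in for a built-in price table.
const builtInPrices: Record<string, ModelPrice> = {
  "example-model-small": { inputPerMTok: 1, outputPerMTok: 5 },
};

// Look up the built-in table first, then dynamically fetched prices; fail loudly otherwise.
function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
  dynamicPrices: Record<string, ModelPrice> = {},
): number {
  const price: ModelPrice | undefined = builtInPrices[model] ?? dynamicPrices[model];
  if (!price) {
    throw new Error(`No price known for model "${model}"`);
  }
  return (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1_000_000;
}
```

Throwing on an unknown model is one possible design choice: it prevents a request from being silently counted as free.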
| Field | Type | Default | Notes |
|-------|------|---------|-------|
| perRequest | number | 1.00 | Max USD per single LLM request |
| perSession | number | 5.00 | Max USD per agent.run() call |
| daily | number | 25.00 | Max USD per calendar day |
| monthly | number | 200.00 | Max USD per calendar month |
| Field | Type | Default | Notes |
|-------|------|---------|-------|
| requestsPerMinute | number | 60 | Max LLM requests/minute |
| tokensPerMinute | number | 100_000 | Max tokens/minute (input + output) |
| maxConcurrent | number | 10 | Max simultaneous in-flight requests |
- withCostTracking() with no arguments is still useful: it enables cost telemetry while applying only the generous default limits.
- withCircuitBreaker() opens on LLM provider errors, not on budget-exceeded errors; the two systems are independent.
- Set maxConcurrent based on your provider's actual concurrency limits to avoid provider-side 429s.
- withDynamicPricing() makes an external HTTP call during build; ensure network access and handle build failures.
- Daily and monthly budgets reset in UTC unless a timezone is configured, for example .withGateway({ timezone: "America/New_York" }).
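The timezone setting matters because the same instant can fall on different calendar days depending on the zone, which changes which daily budget a request counts against. The sketch below derives a day key with the standard Intl.DateTimeFormat API. `budgetDayKey` is a hypothetical helper for illustration, not something the framework exposes.

```typescript
// Illustrative only: budgetDayKey is a hypothetical helper, not part of the framework.
// Returns the calendar day (YYYY-MM-DD) that `instant` falls on in the given IANA time zone.
function budgetDayKey(instant: Date, timeZone: string = "UTC"): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(instant);
  // Pick individual fields so the result does not depend on locale-specific ordering.
  const get = (type: string): string => parts.find((p) => p.type === type)?.value ?? "";
  return `${get("year")}-${get("month")}-${get("day")}`;
}

// 03:30 UTC on 1 March is still the evening of 29 February in New York.
const instant = new Date("2024-03-01T03:30:00Z");
const utcDay = budgetDayKey(instant);
const newYorkDay = budgetDayKey(instant, "America/New_York");
```

A daily total keyed by this string starts from zero whenever the key changes, which is one simple way to model a calendar-day reset.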