.github/skills/apim-policies/SKILL.md
Guide for creating Azure API Management (APIM) XML policies. Use when users want to create, modify, or understand APIM policies including inbound/outbound processing, authentication, rate limiting, caching, transformations, AI gateway policies, and policy expressions. This skill provides policy syntax, examples, and C# policy expressions for request/response manipulation.
This skill provides guidance for creating Azure API Management XML policies.
Every APIM policy document follows this structure:
<policies>
    <inbound>
        <base />
        <!-- Policies applied to incoming requests -->
    </inbound>
    <backend>
        <base />
        <!-- Policies applied before forwarding to backend -->
    </backend>
    <outbound>
        <base />
        <!-- Policies applied to outgoing responses -->
    </outbound>
    <on-error>
        <base />
        <!-- Policies applied when errors occur -->
    </on-error>
</policies>
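The <base /> element marks the point in a section where policies inherited from the parent scope run. Scopes nest in the order Global → Product → API → Operation, so an operation-level <base /> pulls in the API-level policies, which in turn pull in the product and global ones.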
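The effect of `<base />` placement can be sketched as follows. This is a minimal illustration, assuming an API-scope inbound section; the `X-Stage` header and its values are hypothetical and exist only to make the execution order visible.

```xml
<inbound>
    <!-- Runs before anything inherited from the parent scopes -->
    <set-header name="X-Stage" exists-action="override">
        <value>before-base</value>
    </set-header>
    <base />
    <!-- Runs after the inherited policies have executed -->
    <set-header name="X-Stage" exists-action="override">
        <value>after-base</value>
    </set-header>
</inbound>
```

Leaving `<base />` out of a section means the parent-scope policies for that section are not applied, so removing it should be a deliberate choice.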
| Category | Common Policies | Section |
|----------|-----------------|---------|
| Authentication | authentication-managed-identity, validate-azure-ad-token, validate-jwt | inbound |
| Rate Limiting | rate-limit-by-key, llm-token-limit, azure-openai-token-limit | inbound |
| Caching | azure-openai-semantic-cache-lookup/store, cache-lookup/store | inbound/outbound |
| Routing | set-backend-service, forward-request, retry | inbound/backend |
| Transformation | set-header, set-body, set-variable, rewrite-uri | any |
| AI Gateway | llm-content-safety, llm-emit-token-metric, azure-openai-emit-token-metric | inbound |
| Control Flow | choose, return-response, retry, wait | any |
Route requests to a specific backend:
<set-backend-service backend-id="my-backend" />
Authenticate to Azure services using APIM's managed identity:
<authentication-managed-identity resource="https://cognitiveservices.azure.com"
    output-token-variable-name="managed-id-access-token" ignore-error="false" />
<set-header name="Authorization" exists-action="override">
    <value>@("Bearer " + (string)context.Variables["managed-id-access-token"])</value>
</set-header>
Validate JWTs issued by Microsoft Entra ID:
<validate-azure-ad-token tenant-id="{tenant-id}">
    <client-application-ids>
        <application-id>{client-app-id}</application-id>
    </client-application-ids>
</validate-azure-ad-token>
Apply policies based on conditions:
<choose>
    <when condition="@(context.Request.Headers.GetValueOrDefault("X-Custom","") == "value")">
        <!-- policies when condition is true -->
    </when>
    <otherwise>
        <!-- fallback policies -->
    </otherwise>
</choose>
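Combining `choose` with `set-backend-service` gives header-based routing. The sketch below assumes a hypothetical `X-Model-Tier` request header and two hypothetical backend IDs (`premium-backend`, `standard-backend`) that would need to exist as APIM backend resources.

```xml
<inbound>
    <base />
    <choose>
        <!-- Header name and backend IDs are placeholders; substitute your own -->
        <when condition="@(context.Request.Headers.GetValueOrDefault("X-Model-Tier","") == "premium")">
            <set-backend-service backend-id="premium-backend" />
        </when>
        <otherwise>
            <set-backend-service backend-id="standard-backend" />
        </otherwise>
    </choose>
</inbound>
```

The `otherwise` branch keeps routing deterministic when the header is missing or carries an unexpected value.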
Return an immediate response without calling the backend:
<return-response>
    <set-status code="403" reason="Forbidden" />
    <set-header name="Content-Type" exists-action="override">
        <value>application/json</value>
    </set-header>
    <set-body>{"error": "Access denied"}</set-body>
</return-response>
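In practice `return-response` usually sits inside a `choose` branch so that it only fires for requests that fail a check. A minimal sketch, assuming a hypothetical required header named `X-Api-Version`:

```xml
<inbound>
    <base />
    <choose>
        <!-- "X-Api-Version" is a hypothetical header used only for illustration -->
        <when condition="@(!context.Request.Headers.ContainsKey("X-Api-Version"))">
            <return-response>
                <set-status code="400" reason="Bad Request" />
                <set-header name="Content-Type" exists-action="override">
                    <value>application/json</value>
                </set-header>
                <set-body>{"error": "Missing X-Api-Version header"}</set-body>
            </return-response>
        </when>
    </choose>
</inbound>
```

Requests that carry the header fall through the `choose` untouched and continue to the backend.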
Retry failed requests with conditions:
<retry count="3" interval="1" first-fast-retry="true"
    condition="@(context.Response.StatusCode == 429 || context.Response.StatusCode >= 500)">
    <forward-request buffer-request-body="true" />
</retry>
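The `retry` policy also accepts `max-interval` and `delta`. When both are set alongside `interval`, the wait between attempts grows exponentially up to `max-interval` rather than staying fixed. The values below are assumptions for illustration, not recommendations:

```xml
<!-- Assumed values; tune count and intervals (in seconds) for your backend -->
<retry condition="@(context.Response.StatusCode == 429 || context.Response.StatusCode >= 500)"
    count="5" interval="1" max-interval="30" delta="2" first-fast-retry="false">
    <!-- Buffering the request body allows it to be sent again on each attempt -->
    <forward-request buffer-request-body="true" />
</retry>
```

A growing interval tends to suit 429 responses better than a fixed one, since it gives a throttled backend time to recover.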
Limit LLM token consumption:
<llm-token-limit counter-key="@(context.Subscription.Id)"
    tokens-per-minute="10000" estimate-prompt-tokens="false"
    remaining-tokens-variable-name="remainingTokens" />
For Azure OpenAI specifically:
<azure-openai-token-limit counter-key="@(context.Subscription.Id)"
    tokens-per-minute="10000" estimate-prompt-tokens="false"
    remaining-tokens-variable-name="remainingTokens" />
Emit token usage metrics to Application Insights:
<azure-openai-emit-token-metric namespace="openai">
    <dimension name="Subscription ID" value="@(context.Subscription.Id)" />
    <dimension name="Client IP" value="@(context.Request.IpAddress)" />
    <dimension name="API ID" value="@(context.Api.Id)" />
    <dimension name="User ID" value="@(context.Request.Headers.GetValueOrDefault("x-user-id", "N/A"))" />
</azure-openai-emit-token-metric>
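For LLM backends other than Azure OpenAI, the policy table lists `llm-emit-token-metric`, which takes the same `namespace` attribute and `dimension` children. A minimal sketch; the namespace value `llm` is an arbitrary choice:

```xml
<!-- Namespace "llm" is an illustrative value -->
<llm-emit-token-metric namespace="llm">
    <dimension name="Subscription ID" value="@(context.Subscription.Id)" />
    <dimension name="API ID" value="@(context.Api.Id)" />
</llm-emit-token-metric>
```

Keeping the dimension list short limits the number of distinct metric series that Application Insights has to store.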
Cache LLM responses using semantic similarity:
<!-- Inbound: Check cache -->
<azure-openai-semantic-cache-lookup score-threshold="0.8"
    embeddings-backend-id="embeddings-backend"
    embeddings-backend-auth="system-assigned" />

<!-- Outbound: Store in cache -->
<azure-openai-semantic-cache-store duration="120" />
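The two cache policies belong in different sections of the same policy document. The sketch below shows one possible placement and adds a `vary-by` element so that cached responses are partitioned per subscription. The partitioning key is an assumption about what isolation is wanted, not a requirement.

```xml
<policies>
    <inbound>
        <base />
        <azure-openai-semantic-cache-lookup score-threshold="0.8"
            embeddings-backend-id="embeddings-backend"
            embeddings-backend-auth="system-assigned">
            <!-- Assumed partitioning: each subscription gets its own cache entries -->
            <vary-by>@(context.Subscription.Id)</vary-by>
        </azure-openai-semantic-cache-lookup>
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <!-- duration is the cache time-to-live in seconds -->
        <azure-openai-semantic-cache-store duration="120" />
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>
```

On a cache hit the lookup returns the stored response and the backend is not called, so the store policy only matters on a miss.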
Enforce content safety checks on LLM requests:
<llm-content-safety backend-id="content-safety-backend" shield-prompt="true">
    <categories output-type="EightSeverityLevels">
        <category name="SelfHarm" threshold="4" />
        <category name="Hate" threshold="4" />
        <category name="Violence" threshold="4" />
        <category name="Sexual" threshold="4" />
    </categories>
    <blocklists>
        <id>blocklist-id</id>
    </blocklists>
</llm-content-safety>
Policy expressions are written in C#. Use @(...) for a single expression, or @{...} for a multi-statement block in which every code path must end with a return statement.
// Get header value
@(context.Request.Headers.GetValueOrDefault("header-name", "default"))
// Get query parameter
@(context.Request.Url.Query.GetValueOrDefault("param-name", "default"))
// Get URL path parameter
@(context.Request.MatchedParameters.GetValueOrDefault("param-name", "default"))
// Get subscription ID
@(context.Subscription.Id)
// Get client IP
@(context.Request.IpAddress)
// Read JSON body property
@(context.Request.Body.As<JObject>(preserveContent: true)["property"]?.ToString())
// Check header existence
@(context.Request.Headers.ContainsKey("header-name"))
// Get context variable
@(context.Variables.GetValueOrDefault<string>("var-name", "default"))
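These expressions are used as attribute values or element content inside policies. A minimal sketch that reads a header into a context variable and then forwards it; the header names `x-user-id` and `X-Forwarded-User` and the variable name `userId` are illustrative:

```xml
<inbound>
    <base />
    <!-- "x-user-id", "userId" and "X-Forwarded-User" are illustrative names -->
    <set-variable name="userId"
        value="@(context.Request.Headers.GetValueOrDefault("x-user-id", "anonymous"))" />
    <set-header name="X-Forwarded-User" exists-action="override">
        <value>@((string)context.Variables["userId"])</value>
    </set-header>
</inbound>
```

Storing the value in a variable first avoids repeating the header lookup when several later policies need it.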
A multi-statement expression that Base64-decodes the first Authorization header value into a variable:
<set-variable name="result" value="@{
    string[] value;
    if (context.Request.Headers.TryGetValue("Authorization", out value))
    {
        if (value != null && value.Length > 0)
        {
            return Encoding.UTF8.GetString(Convert.FromBase64String(value[0]));
        }
    }
    return null;
}" />
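The block above decodes the header value as-is, which works only when the header contains bare Base64 text. A variant for headers that carry the `Basic ` scheme prefix might look like the following sketch; the variable name `basicCredentials` is hypothetical.

```xml
<set-variable name="basicCredentials" value="@{
    // Assumes the header is the word Basic, a space, then Base64 credentials
    string header = context.Request.Headers.GetValueOrDefault("Authorization", "");
    if (!header.StartsWith("Basic "))
    {
        return null;
    }
    // Drop the 6-character scheme prefix before decoding
    string encoded = header.Substring(6).Trim();
    return Encoding.UTF8.GetString(Convert.FromBase64String(encoded));
}" />
```

Invalid Base64 input makes `Convert.FromBase64String` throw, which would transfer processing to the `on-error` section, so callers sending malformed credentials should be expected to hit that path.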