skills/azure-ai-contentunderstanding-py/SKILL.md
Azure AI Content Understanding SDK for Python. Use for multimodal content extraction from documents, images, audio, and video. Triggers: "azure-ai-contentunderstanding", "ContentUnderstandingClient", "multimodal analysis", "document extraction", "video analysis", "audio transcription".
npx skillsauth add athility/krashitos-ai-os-portfolio azure-ai-contentunderstanding-pyInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Multimodal AI service that extracts semantic content from documents, video, audio, and image files for RAG and automated workflows.
pip install azure-ai-contentunderstanding
CONTENTUNDERSTANDING_ENDPOINT=https://<resource>.cognitiveservices.azure.com/
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.identity import DefaultAzureCredential
endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
client = ContentUnderstandingClient(endpoint=endpoint, credential=credential)
Content Understanding operations are asynchronous long-running operations:
begin_analyze() (returns a poller).result())AnalyzeResult.contents| Analyzer | Content Type | Purpose |
|----------|--------------|---------|
| prebuilt-documentSearch | Documents | Extract markdown for RAG applications |
| prebuilt-imageSearch | Images | Extract content from images |
| prebuilt-audioSearch | Audio | Transcribe audio with timing |
| prebuilt-videoSearch | Video | Extract frames, transcripts, summaries |
| prebuilt-invoice | Documents | Extract invoice fields |
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity import DefaultAzureCredential
endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
client = ContentUnderstandingClient(
endpoint=endpoint,
credential=DefaultAzureCredential()
)
# Analyze document from URL
poller = client.begin_analyze(
analyzer_id="prebuilt-documentSearch",
inputs=[AnalyzeInput(url="https://example.com/document.pdf")]
)
result = poller.result()
# Access markdown content (contents is a list)
content = result.contents[0]
print(content.markdown)
from azure.ai.contentunderstanding.models import MediaContentKind, DocumentContent
content = result.contents[0]
if content.kind == MediaContentKind.DOCUMENT:
document_content: DocumentContent = content # type: ignore
print(document_content.start_page_number)
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
analyzer_id="prebuilt-imageSearch",
inputs=[AnalyzeInput(url="https://example.com/image.jpg")]
)
result = poller.result()
content = result.contents[0]
print(content.markdown)
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
analyzer_id="prebuilt-videoSearch",
inputs=[AnalyzeInput(url="https://example.com/video.mp4")]
)
result = poller.result()
# Access video content (AudioVisualContent)
content = result.contents[0]
# Get transcript phrases with timing
for phrase in content.transcript_phrases:
print(f"[{phrase.start_time} - {phrase.end_time}]: {phrase.text}")
# Get key frames (for video)
for frame in content.key_frames:
print(f"Frame at {frame.time}: {frame.description}")
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
analyzer_id="prebuilt-audioSearch",
inputs=[AnalyzeInput(url="https://example.com/audio.mp3")]
)
result = poller.result()
# Access audio transcript
content = result.contents[0]
for phrase in content.transcript_phrases:
print(f"[{phrase.start_time}] {phrase.text}")
Create custom analyzers with field schemas for specialized extraction:
# Create custom analyzer
analyzer = client.create_analyzer(
analyzer_id="my-invoice-analyzer",
analyzer={
"description": "Custom invoice analyzer",
"base_analyzer_id": "prebuilt-documentSearch",
"field_schema": {
"fields": {
"vendor_name": {"type": "string"},
"invoice_total": {"type": "number"},
"line_items": {
"type": "array",
"items": {
"type": "object",
"properties": {
"description": {"type": "string"},
"amount": {"type": "number"}
}
}
}
}
}
}
)
# Use custom analyzer
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
analyzer_id="my-invoice-analyzer",
inputs=[AnalyzeInput(url="https://example.com/invoice.pdf")]
)
result = poller.result()
# Access extracted fields
print(result.fields["vendor_name"])
print(result.fields["invoice_total"])
# List all analyzers
analyzers = client.list_analyzers()
for analyzer in analyzers:
print(f"{analyzer.analyzer_id}: {analyzer.description}")
# Get specific analyzer
analyzer = client.get_analyzer("prebuilt-documentSearch")
# Delete custom analyzer
client.delete_analyzer("my-custom-analyzer")
import asyncio
import os
from azure.ai.contentunderstanding.aio import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity.aio import DefaultAzureCredential
async def analyze_document():
endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
async with ContentUnderstandingClient(
endpoint=endpoint,
credential=credential
) as client:
poller = await client.begin_analyze(
analyzer_id="prebuilt-documentSearch",
inputs=[AnalyzeInput(url="https://example.com/doc.pdf")]
)
result = await poller.result()
content = result.contents[0]
return content.markdown
asyncio.run(analyze_document())
| Class | For | Provides |
|-------|-----|----------|
| DocumentContent | PDF, images, Office docs | Pages, tables, figures, paragraphs |
| AudioVisualContent | Audio, video files | Transcript phrases, timing, key frames |
Both derive from MediaContent which provides basic info and markdown representation.
from azure.ai.contentunderstanding.models import (
AnalyzeInput,
AnalyzeResult,
MediaContentKind,
DocumentContent,
AudioVisualContent,
)
| Client | Purpose |
|--------|---------|
| ContentUnderstandingClient | Sync client for all operations |
| ContentUnderstandingClient (aio) | Async client for all operations |
begin_analyze with AnalyzeInput — this is the correct method signatureresult.contents[0] — results are returned as a listazure.identity.aio credentialsdevelopment
Create Zustand stores with TypeScript, subscribeWithSelector middleware, and proper state/action separation. Use when building React state management, creating global stores, or implementing reactive state patterns with Zustand.
tools
Automate Zoom meeting creation, management, recordings, webinars, and participant tracking via Rube MCP (Composio). Always search tools first for current schemas.
tools
Automate Zoho CRM tasks via Rube MCP (Composio): create/update records, search contacts, manage leads, and convert leads. Always search tools first for current schemas.
tools
Automate Zendesk tasks via Rube MCP (Composio): tickets, users, organizations, replies. Always search tools first for current schemas.