.claude/skills/bedrock-knowledge-bases/SKILL.md
Amazon Bedrock Knowledge Bases for RAG (Retrieval-Augmented Generation). Create knowledge bases with vector stores, ingest data from S3/web/Confluence/SharePoint, configure chunking strategies, query with retrieve and generate APIs, manage sessions. Use when building RAG applications, implementing semantic search, creating document Q&A systems, integrating knowledge bases with agents, optimizing chunking for accuracy, or querying enterprise knowledge.
npx skillsauth add adaptationio/skrillz bedrock-knowledge-bases
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Amazon Bedrock Knowledge Bases is a fully managed RAG (Retrieval-Augmented Generation) solution that handles data ingestion, embedding generation, vector storage, retrieval with reranking, source attribution, and session context management.
Amazon Bedrock Knowledge Bases provides:
Use this skill when you need to:
import boto3
import json
# Initialize clients
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')
# 1. Create Knowledge Base
kb_response = bedrock_agent.create_knowledge_base(
name='enterprise-docs-kb',
description='Company documentation knowledge base',
roleArn='arn:aws:iam::123456789012:role/BedrockKBRole',
knowledgeBaseConfiguration={
'type': 'VECTOR',
'vectorKnowledgeBaseConfiguration': {
'embeddingModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0'
}
},
storageConfiguration={
'type': 'OPENSEARCH_SERVERLESS',
'opensearchServerlessConfiguration': {
'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
'vectorIndexName': 'bedrock-knowledge-base-index',
'fieldMapping': {
'vectorField': 'bedrock-knowledge-base-default-vector',
'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
'metadataField': 'AMAZON_BEDROCK_METADATA'
}
}
}
)
knowledge_base_id = kb_response['knowledgeBase']['knowledgeBaseId']
print(f"Knowledge Base ID: {knowledge_base_id}")
# 2. Add S3 Data Source
ds_response = bedrock_agent.create_data_source(
knowledgeBaseId=knowledge_base_id,
name='s3-documents',
description='Company documents from S3',
dataSourceConfiguration={
'type': 'S3',
's3Configuration': {
'bucketArn': 'arn:aws:s3:::my-docs-bucket',
'inclusionPrefixes': ['documents/']
}
},
vectorIngestionConfiguration={
'chunkingConfiguration': {
'chunkingStrategy': 'FIXED_SIZE',
'fixedSizeChunkingConfiguration': {
'maxTokens': 512,
'overlapPercentage': 20
}
}
}
)
data_source_id = ds_response['dataSource']['dataSourceId']
# 3. Start Ingestion
ingestion_response = bedrock_agent.start_ingestion_job(
knowledgeBaseId=knowledge_base_id,
dataSourceId=data_source_id,
description='Initial document ingestion'
)
print(f"Ingestion Job ID: {ingestion_response['ingestionJob']['ingestionJobId']}")
# 4. Query with Retrieve and Generate
# (run this once the ingestion job has reached COMPLETE; until then the index is empty)
response = bedrock_agent_runtime.retrieve_and_generate(
input={
'text': 'What is our vacation policy?'
},
retrieveAndGenerateConfiguration={
'type': 'KNOWLEDGE_BASE',
'knowledgeBaseConfiguration': {
'knowledgeBaseId': knowledge_base_id,
'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
'retrievalConfiguration': {
'vectorSearchConfiguration': {
'numberOfResults': 5,
'overrideSearchType': 'HYBRID'
}
}
}
}
)
print(f"Answer: {response['output']['text']}")
print("\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']['s3Location']['uri']}")
OpenSearch Serverless:
Best for: Production RAG applications with auto-scaling requirements
Benefits:
Configuration:
storageConfiguration={
'type': 'OPENSEARCH_SERVERLESS',
'opensearchServerlessConfiguration': {
'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
'vectorIndexName': 'bedrock-knowledge-base-index',
'fieldMapping': {
'vectorField': 'bedrock-knowledge-base-default-vector',
'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
'metadataField': 'AMAZON_BEDROCK_METADATA'
}
}
}
S3 Vectors:
Best for: Cost-optimized, large-scale RAG applications
Benefits:
Ideal Use Cases:
Configuration:
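storageConfiguration={
    'type': 'S3_VECTORS',
    's3VectorsConfiguration': {
        # The bedrock-agent API identifies the store by vector bucket ARN plus index name
        # (or by indexArn); check the field names against your boto3 version.
        'vectorBucketArn': 'arn:aws:s3vectors:us-east-1:123456789012:bucket/my-vector-bucket',
        'indexName': 'bedrock-kb-index'
    }
}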
Limitations:
Neptune Analytics (GraphRAG):
Best for: Interconnected knowledge domains requiring relationship-aware retrieval
Benefits:
Use Cases:
Configuration:
storageConfiguration={
    'type': 'NEPTUNE_ANALYTICS',
    'neptuneAnalyticsConfiguration': {
        'graphArn': 'arn:aws:neptune-graph:us-east-1:123456789012:graph/g-12345678',
        # The API takes a text/metadata field mapping here; check against your boto3 version.
        'fieldMapping': {
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
OpenSearch Managed Cluster:
Best for: Existing OpenSearch infrastructure, advanced customization
Configuration:
storageConfiguration={
    'type': 'OPENSEARCH_MANAGED_CLUSTER',
    'opensearchManagedClusterConfiguration': {
        'domainArn': 'arn:aws:es:us-east-1:123456789012:domain/my-domain',
        # Placeholder endpoint; use the endpoint shown for your own domain
        'domainEndpoint': 'https://search-my-domain-abc123.us-east-1.es.amazonaws.com',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
Pinecone:
storageConfiguration={
'type': 'PINECONE',
'pineconeConfiguration': {
'connectionString': 'https://my-index-abc123.svc.us-west1-gcp.pinecone.io',
'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:pinecone-api-key',
'namespace': 'bedrock-kb',
'fieldMapping': {
'textField': 'text',
'metadataField': 'metadata'
}
}
}
MongoDB Atlas:
storageConfiguration={
'type': 'MONGO_DB_ATLAS',
'mongoDbAtlasConfiguration': {
'endpoint': 'https://cluster0.mongodb.net',
'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:mongodb-creds',
'databaseName': 'bedrock_kb',
'collectionName': 'vectors',
'vectorIndexName': 'vector_index',
'fieldMapping': {
'vectorField': 'embedding',
'textField': 'text',
'metadataField': 'metadata'
}
}
}
Redis Enterprise Cloud:
storageConfiguration={
'type': 'REDIS_ENTERPRISE_CLOUD',
'redisEnterpriseCloudConfiguration': {
'endpoint': 'redis-12345.c1.us-east-1-2.ec2.cloud.redislabs.com:12345',
'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:redis-creds',
'vectorIndexName': 'bedrock-kb-index',
'fieldMapping': {
'vectorField': 'embedding',
'textField': 'text',
'metadataField': 'metadata'
}
}
}
Supported File Types: PDF, TXT, MD, HTML, DOC, DOCX, CSV, XLS, XLSX
bedrock_agent.create_data_source(
knowledgeBaseId=knowledge_base_id,
name='s3-technical-docs',
description='Technical documentation from S3',
dataSourceConfiguration={
'type': 'S3',
's3Configuration': {
'bucketArn': 'arn:aws:s3:::my-docs-bucket',
'inclusionPrefixes': ['docs/technical/', 'docs/manuals/'],
'exclusionPrefixes': ['docs/archive/']
}
}
)
Automatic website scraping and indexing:
bedrock_agent.create_data_source(
knowledgeBaseId=knowledge_base_id,
name='company-website',
description='Public company website content',
dataSourceConfiguration={
'type': 'WEB',
'webConfiguration': {
'sourceConfiguration': {
'urlConfiguration': {
'seedUrls': [
{'url': 'https://www.example.com/docs'},
{'url': 'https://www.example.com/blog'}
]
}
},
'crawlerConfiguration': {
'crawlerLimits': {
'rateLimit': 300 # Pages per minute
}
}
}
}
)
bedrock_agent.create_data_source(
knowledgeBaseId=knowledge_base_id,
name='confluence-wiki',
description='Company Confluence knowledge base',
dataSourceConfiguration={
'type': 'CONFLUENCE',
'confluenceConfiguration': {
'sourceConfiguration': {
'hostUrl': 'https://company.atlassian.net/wiki',
'hostType': 'SAAS',
'authType': 'BASIC',
'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:confluence-creds'
},
'crawlerConfiguration': {
'filterConfiguration': {
'type': 'PATTERN',
'patternObjectFilter': {
'filters': [
{
'objectType': 'Space',
'inclusionFilters': ['Engineering', 'Product'],
'exclusionFilters': ['Archive']
}
]
}
}
}
}
}
)
bedrock_agent.create_data_source(
knowledgeBaseId=knowledge_base_id,
name='sharepoint-docs',
description='SharePoint document library',
dataSourceConfiguration={
'type': 'SHAREPOINT',
'sharePointConfiguration': {
'sourceConfiguration': {
'siteUrls': [
'https://company.sharepoint.com/sites/Engineering',
'https://company.sharepoint.com/sites/Product'
],
'tenantId': 'tenant-id',
'domain': 'company',
'hostType': 'ONLINE',
'authType': 'OAUTH2_CLIENT_CREDENTIALS',
'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:sharepoint-creds'
}
}
}
)
bedrock_agent.create_data_source(
knowledgeBaseId=knowledge_base_id,
name='salesforce-knowledge',
description='Salesforce knowledge articles',
dataSourceConfiguration={
'type': 'SALESFORCE',
'salesforceConfiguration': {
'sourceConfiguration': {
'hostUrl': 'https://company.my.salesforce.com',
'authType': 'OAUTH2_CLIENT_CREDENTIALS',
'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:salesforce-creds'
},
'crawlerConfiguration': {
'filterConfiguration': {
'type': 'PATTERN',
'patternObjectFilter': {
'filters': [
{
'objectType': 'Knowledge',
'inclusionFilters': ['Product_Documentation', 'Support_Articles']
}
]
}
}
}
}
}
)
Fixed-Size Chunking:
Best for: Simple documents with uniform structure
How it works: Splits text into chunks of fixed token size with overlap
Parameters:
- maxTokens: 200-8192 tokens (typically 512-1024)
- overlapPercentage: 10-50% (typically 20%)
Configuration:
vectorIngestionConfiguration={
'chunkingConfiguration': {
'chunkingStrategy': 'FIXED_SIZE',
'fixedSizeChunkingConfiguration': {
'maxTokens': 512,
'overlapPercentage': 20
}
}
}
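The windowing behind fixed-size chunking can be sketched in a few lines. This is an illustration only: whitespace-separated words stand in for tokens, and `fixed_size_chunks` is a hypothetical helper, not part of the Bedrock API. Bedrock's own tokenizer and boundary handling may differ.

```python
from typing import List


def fixed_size_chunks(tokens: List[str], max_tokens: int, overlap_percentage: int) -> List[List[str]]:
    """Split a token list into windows of max_tokens that overlap by a percentage."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    # Number of tokens shared between consecutive windows
    overlap = (max_tokens * overlap_percentage) // 100
    # Always advance by at least one token so the loop terminates
    step = max(1, max_tokens - overlap)
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


words = [f"w{i}" for i in range(10)]
chunks = fixed_size_chunks(words, max_tokens=5, overlap_percentage=20)
```

With 5-token windows and 20% overlap, each chunk repeats the last token of the previous one, which is what keeps a sentence that straddles a boundary retrievable from either side.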
Use Cases:
Pros:
Cons:
Semantic Chunking:
Best for: Documents without clear boundaries (legal, technical, academic)
How it works: Uses sentence similarity to group related content
Parameters:
- maxTokens: 20-8192 tokens (typically 300-500)
- bufferSize: Number of neighboring sentences (default: 1)
- breakpointPercentileThreshold: Similarity threshold (recommended: 95%)
Configuration:
vectorIngestionConfiguration={
'chunkingConfiguration': {
'chunkingStrategy': 'SEMANTIC',
'semanticChunkingConfiguration': {
'maxTokens': 300,
'bufferSize': 1,
'breakpointPercentileThreshold': 95
}
}
}
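To make the role of breakpointPercentileThreshold concrete, here is a rough sketch of percentile-based splitting. It assumes sentence embeddings are already available and starts a new chunk wherever the distance between neighbouring sentences reaches the chosen percentile. The function names are hypothetical, bufferSize and maxTokens are not modelled, and Bedrock's internal algorithm may differ.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def semantic_groups(sentences: List[str], embeddings: List[Sequence[float]],
                    breakpoint_percentile: float) -> List[List[str]]:
    """Group consecutive sentences, splitting at unusually large embedding distances."""
    if len(sentences) < 2:
        return [list(sentences)]
    distances = [cosine_distance(embeddings[i], embeddings[i + 1])
                 for i in range(len(sentences) - 1)]
    threshold = percentile(distances, breakpoint_percentile)
    groups = [[sentences[0]]]
    for sentence, distance in zip(sentences[1:], distances):
        if distance >= threshold:
            groups.append([sentence])   # topic shift: start a new chunk
        else:
            groups[-1].append(sentence)  # same topic: extend the current chunk
    return groups


# Toy 2-D embeddings: the first two sentences point one way, the last two another
toy_sentences = ["s0", "s1", "s2", "s3"]
toy_embeddings = [(1.0, 0.0), (1.0, 0.1), (0.0, 1.0), (0.1, 1.0)]
groups = semantic_groups(toy_sentences, toy_embeddings, breakpoint_percentile=95)
```

A higher percentile means fewer, larger chunks, because only the most pronounced topic shifts count as breakpoints.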
Use Cases:
Pros:
Cons:
Cost Consideration: Semantic chunking uses foundation models for similarity analysis, incurring additional costs beyond storage and retrieval.
Hierarchical Chunking:
Best for: Complex documents with nested structure
How it works: Creates parent and child chunks; retrieves child, returns parent for context
Parameters:
- levelConfigurations: Array of chunk sizes (parent → child)
- overlapTokens: Overlap between chunks
Configuration:
vectorIngestionConfiguration={
'chunkingConfiguration': {
'chunkingStrategy': 'HIERARCHICAL',
'hierarchicalChunkingConfiguration': {
'levelConfigurations': [
{
'maxTokens': 1500 # Parent chunk (comprehensive context)
},
{
'maxTokens': 300 # Child chunk (focused retrieval)
}
],
'overlapTokens': 60
}
}
}
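The "retrieve child, return parent" idea can be illustrated with a toy index. This is a sketch under simplifying assumptions: words stand in for tokens, an exact word match stands in for vector search, overlap is ignored, and both helpers are hypothetical names rather than Bedrock APIs.

```python
from typing import Dict, List, Tuple


def build_hierarchy(words: List[str], parent_size: int,
                    child_size: int) -> Tuple[List[List[str]], List[Dict]]:
    """Split words into parent chunks, then split each parent into child chunks."""
    parents = [words[i:i + parent_size] for i in range(0, len(words), parent_size)]
    children = []
    for parent_id, parent in enumerate(parents):
        for j in range(0, len(parent), child_size):
            children.append({
                'parent_id': parent_id,  # link back to the enclosing parent
                'text': ' '.join(parent[j:j + child_size]),
            })
    return parents, children


def retrieve_parents(query_term: str, parents: List[List[str]],
                     children: List[Dict]) -> List[str]:
    """Match against child chunks, but return the de-duplicated parent chunks."""
    hit_ids: List[int] = []
    for child in children:
        if query_term in child['text'].split() and child['parent_id'] not in hit_ids:
            hit_ids.append(child['parent_id'])
    return [' '.join(parents[i]) for i in hit_ids]


parents, children = build_hierarchy("a b c d e f g h".split(), parent_size=4, child_size=2)
context = retrieve_parents("f", parents, children)
```

Because several matching children can share one parent, the number of chunks returned may be smaller than the number of results requested.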
Use Cases:
How Retrieval Works:
Pros:
Cons:
Custom Chunking (Lambda):
Best for: Specialized domain logic, custom parsing requirements
How it works: Invoke Lambda function for custom chunking logic
Configuration:
vectorIngestionConfiguration={
'chunkingConfiguration': {
'chunkingStrategy': 'NONE' # Custom via Lambda
},
'customTransformationConfiguration': {
'intermediateStorage': {
's3Location': {
'uri': 's3://my-kb-bucket/intermediate/'
}
},
'transformations': [
{
'stepToApply': 'POST_CHUNKING',
'transformationFunction': {
'transformationLambdaConfiguration': {
'lambdaArn': 'arn:aws:lambda:us-east-1:123456789012:function:custom-chunker'
}
}
}
]
}
}
Example Lambda Handler:
# Lambda function for custom chunking
# NOTE: the event shape below is a simplified illustration. The Bedrock custom
# transformation contract exchanges content batches as files in the intermediate
# S3 location, so a production handler reads and writes those files instead.
import json


def lambda_handler(event, context):
    """
    Custom chunking logic for specialized documents
    Input: event contains document content and metadata
    Output: array of chunks with text and metadata
    """
    # Extract document content
    document = event['document']
    content = document['content']
    metadata = document.get('metadata', {})

    # Custom chunking logic (example: split by custom delimiter)
    chunks = []
    sections = content.split('---SECTION---')
    for idx, section in enumerate(sections):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {
                    **metadata,
                    'chunk_id': f'section_{idx}',
                    'chunk_type': 'custom_section'
                }
            })

    return {
        'chunks': chunks
    }
Use Cases:
Pros:
Cons:
| Document Type | Recommended Strategy | Rationale |
|--------------|---------------------|-----------|
| Blog posts, articles | Fixed-size | Simple, uniform structure |
| Legal documents | Semantic | Preserve legal reasoning flow |
| Technical manuals | Hierarchical | Nested sections and subsections |
| Academic papers | Hierarchical | Abstract, sections, subsections |
| FAQs | Fixed-size | Independent Q&A pairs |
| Medical records | Custom Lambda | Structured sections (SOAP, HL7) |
| Code documentation | Custom Lambda | Code blocks + explanations |
| Product catalogs | Fixed-size | Uniform product descriptions |
| Research reports | Semantic | Preserve research narrative |
Retrieve API:
Returns raw retrieved chunks without generation.
Use Cases:
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')
response = bedrock_agent_runtime.retrieve(
knowledgeBaseId='KB123456',
retrievalQuery={
'text': 'What are the benefits of hierarchical chunking?'
},
retrievalConfiguration={
'vectorSearchConfiguration': {
'numberOfResults': 5,
'overrideSearchType': 'HYBRID', # SEMANTIC, HYBRID
'filter': {
'andAll': [
{
'equals': {
'key': 'document_type',
'value': 'technical_guide'
}
},
{
'greaterThan': {
'key': 'publish_year',
'value': 2024
}
}
]
}
}
}
)
# Process retrieved chunks
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content: {result['content']['text']}")
    print(f"Location: {result['location']}")
    print(f"Metadata: {result.get('metadata', {})}")
    print("---")
RetrieveAndGenerate API:
Returns a generated response with source attribution.
Use Cases:
response = bedrock_agent_runtime.retrieve_and_generate(
input={
'text': 'Explain semantic chunking benefits and when to use it'
},
retrieveAndGenerateConfiguration={
'type': 'KNOWLEDGE_BASE',
'knowledgeBaseConfiguration': {
'knowledgeBaseId': 'KB123456',
'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
'retrievalConfiguration': {
'vectorSearchConfiguration': {
'numberOfResults': 5,
'overrideSearchType': 'HYBRID'
}
},
'generationConfiguration': {
'inferenceConfig': {
'textInferenceConfig': {
'temperature': 0.7,
'maxTokens': 2048,
'topP': 0.9
}
},
'promptTemplate': {
'textPromptTemplate': '''You are a helpful assistant. Answer the user's question based on the provided context.
Context: $search_results$
Question: $query$
Answer:'''
}
}
}
}
)
print(f"Generated Response: {response['output']['text']}")
print("\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']}")
        print(f"    Relevance Score: {reference.get('score', 'N/A')}")
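Printing the raw location dict works, but knowledge bases that mix S3 with web or SaaS connectors return different location shapes. The sketch below flattens citations into a de-duplicated list of source URIs. `citation_sources` is a hypothetical helper, and the non-S3 key names are an assumption about the response shape that should be checked against the boto3 documentation.

```python
from typing import Dict, List

# Location keys assumed to carry a 'url' for non-S3 sources
_URL_LOCATION_KEYS = (
    'webLocation',
    'confluenceLocation',
    'sharePointLocation',
    'salesforceLocation',
)


def citation_sources(response: Dict) -> List[str]:
    """Collect unique source URIs from a retrieve_and_generate style response."""
    sources: List[str] = []
    for citation in response.get('citations', []):
        for reference in citation.get('retrievedReferences', []):
            location = reference.get('location', {})
            # S3 sources expose 'uri'; the other connectors are assumed to expose 'url'
            uri = location.get('s3Location', {}).get('uri')
            if uri is None:
                for key in _URL_LOCATION_KEYS:
                    uri = location.get(key, {}).get('url')
                    if uri:
                        break
            if uri and uri not in sources:
                sources.append(uri)
    return sources


# Hand-built sample response for illustration; not real service output
sample_response = {
    'citations': [
        {'retrievedReferences': [
            {'location': {'type': 'S3', 's3Location': {'uri': 's3://my-docs-bucket/a.pdf'}}},
            {'location': {'type': 'WEB', 'webLocation': {'url': 'https://www.example.com/docs'}}},
        ]},
        {'retrievedReferences': [
            {'location': {'type': 'S3', 's3Location': {'uri': 's3://my-docs-bucket/a.pdf'}}},
        ]},
    ]
}
sources = citation_sources(sample_response)
```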
Bedrock manages conversation context across turns: the first call returns a sessionId, and passing it on later calls continues the same conversation.
# First turn - creates session automatically
response1 = bedrock_agent_runtime.retrieve_and_generate(
input={
'text': 'What is Amazon Bedrock Knowledge Bases?'
},
retrieveAndGenerateConfiguration={
'type': 'KNOWLEDGE_BASE',
'knowledgeBaseConfiguration': {
'knowledgeBaseId': 'KB123456',
'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
}
}
)
session_id = response1['sessionId']
print(f"Session ID: {session_id}")
print(f"Response: {response1['output']['text']}\n")
# Follow-up turn - reuse session for context
response2 = bedrock_agent_runtime.retrieve_and_generate(
input={
'text': 'What chunking strategies does it support?'
},
retrieveAndGenerateConfiguration={
'type': 'KNOWLEDGE_BASE',
'knowledgeBaseConfiguration': {
'knowledgeBaseId': 'KB123456',
'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
}
},
sessionId=session_id # Continue conversation with context
)
print(f"Follow-up Response: {response2['output']['text']}")
# Third turn
response3 = bedrock_agent_runtime.retrieve_and_generate(
input={
'text': 'Which strategy would you recommend for legal documents?'
},
retrieveAndGenerateConfiguration={
'type': 'KNOWLEDGE_BASE',
'knowledgeBaseConfiguration': {
'knowledgeBaseId': 'KB123456',
'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
}
},
sessionId=session_id
)
print(f"Third Response: {response3['output']['text']}")
Filter retrieval by metadata attributes for precision.
response = bedrock_agent_runtime.retrieve(
knowledgeBaseId='KB123456',
retrievalQuery={
'text': 'Security best practices for production deployments'
},
retrievalConfiguration={
'vectorSearchConfiguration': {
'numberOfResults': 10,
'overrideSearchType': 'HYBRID',
'filter': {
'andAll': [
{
'equals': {
'key': 'document_type',
'value': 'security_guide'
}
},
{
'greaterThanOrEquals': {
'key': 'publish_year',
'value': 2024
}
},
{
'in': {
'key': 'category',
'value': ['production', 'security', 'compliance']
}
}
]
}
}
}
)
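Filters like the one above are plain nested dicts, so they can be assembled from optional conditions. The helpers below are hypothetical conveniences, not part of boto3. They rest on the assumption that andAll expects at least two sub-filters, so a single condition is passed through unwrapped.

```python
from typing import Any, Dict, List, Optional


def equals(key: str, value: Any) -> Dict:
    """Exact-match condition on a metadata attribute."""
    return {'equals': {'key': key, 'value': value}}


def is_in(key: str, values: List[Any]) -> Dict:
    """Match when the attribute equals any of the given values."""
    return {'in': {'key': key, 'value': list(values)}}


def combine_and(*conditions: Optional[Dict]) -> Optional[Dict]:
    """AND together the conditions that are not None.

    Returns None when nothing is left, the bare condition when one is left,
    and an andAll wrapper otherwise.
    """
    active = [c for c in conditions if c]
    if not active:
        return None
    if len(active) == 1:
        return active[0]
    return {'andAll': active}


doc_filter = combine_and(
    equals('document_type', 'security_guide'),
    is_in('category', ['production', 'security']),
    None,  # e.g. an optional condition the caller did not supply
)
```

When `combine_and` returns None, leave the `filter` key out of `vectorSearchConfiguration` entirely rather than passing an empty value.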
Supported Filter Operators:
- equals: Exact match
- notEquals: Not equal
- greaterThan, greaterThanOrEquals: Numeric comparison
- lessThan, lessThanOrEquals: Numeric comparison
- in: Match any value in array
- notIn: Not match any value in array
- startsWith: String prefix match
- andAll: Combine filters with AND
- orAll: Combine filters with OR
import uuid

ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Monthly document sync',
    # Idempotency token; the API requires at least 33 characters, which a UUID string satisfies
    clientToken=str(uuid.uuid4())
)
job_id = ingestion_response['ingestionJob']['ingestionJobId']
print(f"Ingestion Job ID: {job_id}")
# Get job status
job_status = bedrock_agent.get_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    ingestionJobId=job_id
)
print(f"Status: {job_status['ingestionJob']['status']}")
print(f"Started: {job_status['ingestionJob']['startedAt']}")
print(f"Updated: {job_status['ingestionJob']['updatedAt']}")
if 'statistics' in job_status['ingestionJob']:
    stats = job_status['ingestionJob']['statistics']
    print(f"Documents Scanned: {stats.get('numberOfDocumentsScanned', 0)}")
    # Indexed counts are reported separately for new and modified documents
    print(f"New Documents Indexed: {stats.get('numberOfNewDocumentsIndexed', 0)}")
    print(f"Modified Documents Indexed: {stats.get('numberOfModifiedDocumentsIndexed', 0)}")
    print(f"Documents Failed: {stats.get('numberOfDocumentsFailed', 0)}")

# Wait for completion
import time

while True:
    status = bedrock_agent.get_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        ingestionJobId=job_id
    )
    current_status = status['ingestionJob']['status']
    # STOPPED is also terminal; without it a stopped job would be polled forever
    if current_status in ['COMPLETE', 'FAILED', 'STOPPED']:
        print(f"Ingestion job {current_status}")
        break
    print(f"Status: {current_status}, waiting...")
    time.sleep(30)
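The polling loop above has no upper bound on how long it waits. One way to add one is sketched here as a small, hypothetical helper that takes the status lookup as a callable, so it can be exercised without AWS access. The terminal status names are the ones used in this document plus STOPPED.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {'COMPLETE', 'FAILED', 'STOPPED'}


def wait_for_terminal_status(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status until it returns a terminal status or the timeout elapses."""
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status} after {timeout_seconds} seconds")
        sleep(poll_seconds)


# Example wiring (requires AWS credentials, so it is left commented out):
# final = wait_for_terminal_status(
#     lambda: bedrock_agent.get_ingestion_job(
#         knowledgeBaseId=knowledge_base_id,
#         dataSourceId=data_source_id,
#         ingestionJobId=job_id,
#     )['ingestionJob']['status']
# )
```

Injecting `sleep` and `clock` keeps the helper fast to test and leaves the production defaults untouched.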
list_response = bedrock_agent.list_ingestion_jobs(
knowledgeBaseId=knowledge_base_id,
dataSourceId=data_source_id,
maxResults=50
)
for job in list_response['ingestionJobSummaries']:
    print(f"Job ID: {job['ingestionJobId']}")
    print(f"Status: {job['status']}")
    print(f"Started: {job['startedAt']}")
    print(f"Updated: {job['updatedAt']}")
    print("---")
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')
# Create agent with knowledge base
agent_response = bedrock_agent.create_agent(
agentName='customer-support-agent',
description='Customer support agent with knowledge base access',
instruction='''You are a customer support agent. When answering questions:
1. Search the knowledge base for relevant information
2. Provide accurate answers based on retrieved context
3. Cite your sources
4. Admit when you don't know something''',
foundationModel='anthropic.claude-3-sonnet-20240229-v1:0',
agentResourceRoleArn='arn:aws:iam::123456789012:role/BedrockAgentRole'
)
agent_id = agent_response['agent']['agentId']
# Associate knowledge base with agent
kb_association = bedrock_agent.associate_agent_knowledge_base(
agentId=agent_id,
agentVersion='DRAFT',
knowledgeBaseId='KB123456',
description='Company documentation knowledge base',
knowledgeBaseState='ENABLED'
)
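# Prepare and create alias
# prepare_agent is asynchronous; in practice, poll get_agent until agentStatus
# is 'PREPARED' before creating the alias
bedrock_agent.prepare_agent(agentId=agent_id)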
alias_response = bedrock_agent.create_agent_alias(
agentId=agent_id,
agentAliasName='production',
description='Production alias'
)
agent_alias_id = alias_response['agentAlias']['agentAliasId']
# Invoke agent (automatically queries knowledge base)
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')
response = bedrock_agent_runtime.invoke_agent(
agentId=agent_id,
agentAliasId=agent_alias_id,
sessionId='session-123',
inputText='What is our return policy for defective products?'
)
for event in response['completion']:
    if 'chunk' in event:
        chunk = event['chunk']
        print(chunk['bytes'].decode())
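When the full answer is needed as one string rather than printed piece by piece, the event stream can be collected first. `collect_completion` is a hypothetical helper. It assumes, as in the loop above, that text arrives in `chunk` events carrying `bytes`, and it ignores any other event types.

```python
from typing import Dict, Iterable


def collect_completion(events: Iterable[Dict]) -> str:
    """Concatenate the byte payloads of all chunk events and decode them once."""
    parts = []
    for event in events:
        chunk = event.get('chunk')
        if chunk and 'bytes' in chunk:
            parts.append(chunk['bytes'])
    # Decoding after joining avoids splitting a multi-byte character across chunks
    return b''.join(parts).decode('utf-8')


# Hand-built events for illustration; a real call would pass response['completion']
fake_events = [
    {'chunk': {'bytes': b'Our return policy '}},
    {'trace': {'note': 'non-chunk events are skipped'}},
    {'chunk': {'bytes': b'covers defective products.'}},
]
answer = collect_completion(fake_events)
```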
# Associate multiple knowledge bases
bedrock_agent.associate_agent_knowledge_base(
agentId=agent_id,
agentVersion='DRAFT',
knowledgeBaseId='KB-PRODUCT-DOCS',
description='Product documentation'
)
bedrock_agent.associate_agent_knowledge_base(
agentId=agent_id,
agentVersion='DRAFT',
knowledgeBaseId='KB-SUPPORT-ARTICLES',
description='Support knowledge articles'
)
bedrock_agent.associate_agent_knowledge_base(
agentId=agent_id,
agentVersion='DRAFT',
knowledgeBaseId='KB-COMPANY-POLICIES',
description='Company policies and procedures'
)
# Agent automatically searches all knowledge bases and combines results
Decision Framework:
Simple, uniform documents → Fixed-size chunking
Documents without clear boundaries → Semantic chunking
Nested, hierarchical documents → Hierarchical chunking
Specialized formats → Custom Lambda chunking
Tuning Guidelines:
Number of Results:
Search Type:
Use Hybrid Search when:
Use Semantic Search when:
Metadata Filters:
S3 Vectors:
Semantic Chunking:
Ingestion Frequency:
Model Selection:
Token Usage:
Always Reuse Sessions:
- Pass the returned sessionId on follow-up turns
Session Lifecycle:
Context Limits:
When to Use:
Benefits:
Considerations:
S3 Best Practices:
Web Crawler:
Confluence/SharePoint:
Metadata Enrichment:
Enable CloudWatch Logs:
# Monitor retrieval quality
# Track: query latency, retrieval scores, generation quality
# Set alarms for: high latency, low scores, high error rates
Test Retrieval Quality:
# Use retrieve API to debug
response = bedrock_agent_runtime.retrieve(
knowledgeBaseId='KB123456',
retrievalQuery={'text': 'test query'}
)
# Analyze retrieval scores
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content preview: {result['content']['text'][:200]}")
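Eyeballing individual scores gets tedious across many test queries. A small summary makes regressions easier to spot. The helper and its 0.4 cut-off are illustrative assumptions, not values from the Bedrock documentation. Score ranges depend on the vector store and search type, so pick a threshold from your own data.

```python
from statistics import mean
from typing import Dict, List


def summarize_scores(results: List[Dict], low_score_threshold: float = 0.4) -> Dict:
    """Summarize the 'score' field of retrieval results."""
    scores = [r['score'] for r in results if 'score' in r]
    if not scores:
        return {'count': 0}
    return {
        'count': len(scores),
        'min': min(scores),
        'max': max(scores),
        'mean': mean(scores),
        # How many results fall below the (assumed) quality threshold
        'below_threshold': sum(1 for s in scores if s < low_score_threshold),
    }


# Hand-built results for illustration; a real call would pass response['retrievalResults']
summary = summarize_scores([{'score': 0.2}, {'score': 0.6}, {'score': 0.7}])
```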
Common Issues:
Low Retrieval Scores:
Irrelevant Results:
Missing Information:
Slow Retrieval:
IAM Permissions:
Data Encryption:
Access Control:
PII Handling:
import boto3
from typing import List, Dict, Optional


class BedrockKnowledgeBaseRAG:
    """Production RAG application with Amazon Bedrock Knowledge Bases"""

    def __init__(self, region_name: str = 'us-east-1'):
        self.bedrock_agent = boto3.client('bedrock-agent', region_name=region_name)
        self.bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=region_name)
    def create_knowledge_base(
        self,
        name: str,
        description: str,
        role_arn: str,
        vector_store_config: Dict,
        embedding_model: str = 'amazon.titan-embed-text-v2:0'
    ) -> str:
        """Create knowledge base with vector store"""
        # Build the model ARN from the client's region instead of hard-coding us-east-1
        region = self.bedrock_agent.meta.region_name
        response = self.bedrock_agent.create_knowledge_base(
            name=name,
            description=description,
            roleArn=role_arn,
            knowledgeBaseConfiguration={
                'type': 'VECTOR',
                'vectorKnowledgeBaseConfiguration': {
                    'embeddingModelArn': f'arn:aws:bedrock:{region}::foundation-model/{embedding_model}'
                }
            },
            storageConfiguration=vector_store_config
        )
        return response['knowledgeBase']['knowledgeBaseId']
    def add_s3_data_source(
        self,
        knowledge_base_id: str,
        name: str,
        bucket_arn: str,
        inclusion_prefixes: List[str],
        chunking_strategy: str = 'FIXED_SIZE',
        chunking_config: Optional[Dict] = None
    ) -> str:
        """Add S3 data source with chunking configuration"""
        chunking = {'chunkingStrategy': chunking_strategy}
        if chunking_strategy == 'FIXED_SIZE':
            # The 512/20 defaults only make sense for fixed-size chunking
            chunking['fixedSizeChunkingConfiguration'] = chunking_config or {
                'maxTokens': 512,
                'overlapPercentage': 20
            }
        elif chunking_strategy in ('SEMANTIC', 'HIERARCHICAL'):
            if chunking_config is None:
                raise ValueError(f'chunking_config is required for {chunking_strategy} chunking')
            key = ('semanticChunkingConfiguration' if chunking_strategy == 'SEMANTIC'
                   else 'hierarchicalChunkingConfiguration')
            chunking[key] = chunking_config
        vector_ingestion_config = {'chunkingConfiguration': chunking}

        response = self.bedrock_agent.create_data_source(
            knowledgeBaseId=knowledge_base_id,
            name=name,
            description=f'S3 data source: {name}',
            dataSourceConfiguration={
                'type': 'S3',
                's3Configuration': {
                    'bucketArn': bucket_arn,
                    'inclusionPrefixes': inclusion_prefixes
                }
            },
            vectorIngestionConfiguration=vector_ingestion_config
        )
        return response['dataSource']['dataSourceId']
    def ingest_data(self, knowledge_base_id: str, data_source_id: str) -> str:
        """Start ingestion job and wait for completion"""
        import time

        # Start ingestion
        response = self.bedrock_agent.start_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            description='Automated ingestion'
        )
        job_id = response['ingestionJob']['ingestionJobId']

        # Wait for completion
        while True:
            status_response = self.bedrock_agent.get_ingestion_job(
                knowledgeBaseId=knowledge_base_id,
                dataSourceId=data_source_id,
                ingestionJobId=job_id
            )
            status = status_response['ingestionJob']['status']
            if status == 'COMPLETE':
                print("Ingestion completed successfully")
                if 'statistics' in status_response['ingestionJob']:
                    stats = status_response['ingestionJob']['statistics']
                    # New and modified documents are counted separately
                    print(f"New documents indexed: {stats.get('numberOfNewDocumentsIndexed', 0)}")
                    print(f"Modified documents indexed: {stats.get('numberOfModifiedDocumentsIndexed', 0)}")
                break
            elif status in ('FAILED', 'STOPPED'):
                # STOPPED is terminal too; treating it as such avoids an endless loop
                print(f"Ingestion ended with status {status}")
                break
            print(f"Ingestion status: {status}")
            time.sleep(30)
        return job_id
    def query(
        self,
        knowledge_base_id: str,
        query: str,
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
        num_results: int = 5,
        search_type: str = 'HYBRID',
        metadata_filter: Optional[Dict] = None,
        session_id: Optional[str] = None
    ) -> Dict:
        """Query knowledge base with retrieve and generate"""
        retrieval_config = {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': num_results,
                        'overrideSearchType': search_type
                    }
                },
                'generationConfiguration': {
                    'inferenceConfig': {
                        'textInferenceConfig': {
                            'temperature': 0.7,
                            'maxTokens': 2048
                        }
                    }
                }
            }
        }

        # Add metadata filter if provided
        if metadata_filter:
            vector_search = retrieval_config['knowledgeBaseConfiguration']['retrievalConfiguration']['vectorSearchConfiguration']
            vector_search['filter'] = metadata_filter

        # Build request
        request = {
            'input': {'text': query},
            'retrieveAndGenerateConfiguration': retrieval_config
        }

        # Add session if provided
        if session_id:
            request['sessionId'] = session_id

        response = self.bedrock_agent_runtime.retrieve_and_generate(**request)
        return {
            'answer': response['output']['text'],
            'citations': response.get('citations', []),
            'session_id': response['sessionId']
        }
    def multi_turn_conversation(
        self,
        knowledge_base_id: str,
        queries: List[str],
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
    ) -> List[Dict]:
        """Execute multi-turn conversation with context"""
        session_id = None
        conversation = []
        for query in queries:
            result = self.query(
                knowledge_base_id=knowledge_base_id,
                query=query,
                model_arn=model_arn,
                session_id=session_id
            )
            # Carry the session forward so later turns keep the earlier context
            session_id = result['session_id']
            conversation.append({
                'query': query,
                'answer': result['answer'],
                'citations': result['citations']
            })
        return conversation
# Example Usage
if __name__ == '__main__':
    rag = BedrockKnowledgeBaseRAG(region_name='us-east-1')

    # Create knowledge base
    kb_id = rag.create_knowledge_base(
        name='production-docs-kb',
        description='Production documentation knowledge base',
        role_arn='arn:aws:iam::123456789012:role/BedrockKBRole',
        vector_store_config={
            'type': 'OPENSEARCH_SERVERLESS',
            'opensearchServerlessConfiguration': {
                'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
                'vectorIndexName': 'bedrock-kb-index',
                'fieldMapping': {
                    'vectorField': 'bedrock-knowledge-base-default-vector',
                    'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                    'metadataField': 'AMAZON_BEDROCK_METADATA'
                }
            }
        }
    )

    # Add data source
    ds_id = rag.add_s3_data_source(
        knowledge_base_id=kb_id,
        name='technical-docs',
        bucket_arn='arn:aws:s3:::my-docs-bucket',
        inclusion_prefixes=['docs/'],
        chunking_strategy='HIERARCHICAL',
        chunking_config={
            'levelConfigurations': [
                {'maxTokens': 1500},
                {'maxTokens': 300}
            ],
            'overlapTokens': 60
        }
    )

    # Ingest data
    rag.ingest_data(kb_id, ds_id)

    # Single query
    result = rag.query(
        knowledge_base_id=kb_id,
        query='What are the best practices for RAG applications?',
        metadata_filter={
            'equals': {
                'key': 'document_type',
                'value': 'best_practices'
            }
        }
    )
    print(f"Answer: {result['answer']}")
    print("\nSources:")
    for citation in result['citations']:
        for ref in citation['retrievedReferences']:
            print(f"  - {ref['location']}")

    # Multi-turn conversation
    conversation = rag.multi_turn_conversation(
        knowledge_base_id=kb_id,
        queries=[
            'What is hierarchical chunking?',
            'When should I use it?',
            'What are the configuration parameters?'
        ]
    )
    for turn in conversation:
        print(f"\nQ: {turn['query']}")
        print(f"A: {turn['answer']}")
/mnt/c/data/github/skrillz/AMAZON-BEDROCK-COMPREHENSIVE-RESEARCH-2025.md - Section 2 (Complete Knowledge Bases research)