name: ai-chatbot-contextual-knowledge
description: "Use when adding an AI assistant that answers domain-specific questions, building a documentation chatbot, or when you need contextual AI without vector database infrastructure. Implements RAG-lite: inject the relevant knowledge subset into the system prompt based on the user's question using an in-memory registry; no embeddings API required. Domain: AI Integration, LLM, RAG-lite. Level: Advanced. Tags: ai, chatbot, rag, microsoft-extensions-ai, system-prompt, knowledge-injection, streaming, temperature."

AI Chatbot with Contextual Knowledge Injection

Problem

You want to add an AI assistant to your application that answers questions about your specific domain (your library, your product, your documentation) — not just general knowledge. Sending the entire knowledge base in every prompt is too expensive and hits token limits.

Solution: Contextual Knowledge Injection (Lightweight RAG)

Instead of full vector-database RAG, inject only the relevant subset of domain knowledge into the system prompt based on the user's question. This is a "RAG-lite" approach:

Parse the user's question for relevant terms
Search your local knowledge registry for matching items
Inject matching items into the system prompt as structured context
Send to the LLM with both the context and the user's question

Why RAG-Lite Over Full RAG

| Aspect | Full RAG | RAG-Lite (this skill) |
|--------|----------|----------------------|
| Infrastructure | Vector DB + embeddings API | In-memory registry |
| Cost | Embedding API calls | Zero (local search) |
| Latency | 100-500ms retrieval | <1ms retrieval |
| Accuracy | Semantic similarity | Keyword + fuzzy match |
| Best for | Large/changing corpora | Small-medium static knowledge |

Implementation

1. Configure options

using System.ComponentModel.DataAnnotations;

public sealed class ChatbotOptions
{
    [Required]
    public string ApiKey { get; set; } = string.Empty;

    public string ModelId { get; set; } = "<your-model-id>";
    public string AssistantName { get; set; } = "Assistant";
    public string WelcomeMessage { get; set; } = "How can I help you?";
    public int MaxConversationHistory { get; set; } = 10;

    /// <summary>
    /// Sampling temperature (0.0 = most deterministic, 2.0 = most random).
    /// Use 0.0–0.3 for knowledge-grounded chatbots to keep answers close to
    /// the injected context and reduce hallucination. Default is 0.2.
    /// </summary>
    [Range(0.0f, 2.0f)]
    public float Temperature { get; set; } = 0.2f;
}

2. Define the knowledge builder abstraction

using System.Text;

/// <summary>
/// Builds system prompts with contextual domain knowledge injected
/// based on the user's question.
/// </summary>
public interface IKnowledgeBuilder
{
    string BuildContextualPrompt(string userMessage);
}

internal sealed class KnowledgeBuilder : IKnowledgeBuilder
{
    private readonly IRegistryService _registry;
    private readonly string _basePrompt;

    public KnowledgeBuilder(IRegistryService registry)
    {
        _registry = registry;
        _basePrompt = BuildBasePrompt();
    }

    /// <summary>
    /// Builds a system prompt with relevant knowledge injected.
    /// </summary>
    public string BuildContextualPrompt(string userMessage)
    {
        var relevantItems = _registry.Search(
            userMessage, maxResults: 10, useFuzzyMatching: true, useSynonyms: true)
            .ToArray();

        if (relevantItems.Length == 0)
            return _basePrompt;

        var context = FormatKnowledgeContext(relevantItems);

        return $"""
            {_basePrompt}

            ## Relevant Knowledge for This Question

            {context}
            """;
    }

    private static string BuildBasePrompt() => """
        You are a helpful assistant for this application.
        Answer questions based on the knowledge provided in the context below.
        If you don't know the answer, say so — don't make things up.
        Use code examples when helpful. Be concise.
        """;

    private static string FormatKnowledgeContext(IEnumerable<SearchResult> results)
    {
        var sb = new StringBuilder();

        foreach (var result in results)
        {
            var item = result.Item;
            sb.AppendLine($"### {item.Name} ({item.Category})");
            sb.AppendLine($"- **Full Name:** `{item.FullName}`");
            sb.AppendLine($"- **Description:** {item.Description}");

            if (item.MethodNames.Length > 0)
                sb.AppendLine($"- **Methods:** `{string.Join("`, `", item.MethodNames)}`");

            if (item.Keywords.Length > 0)
                sb.AppendLine($"- **Keywords:** {string.Join(", ", item.Keywords)}");

            sb.AppendLine();
        }

        return sb.ToString();
    }
}

IRegistryService is defined in the In-Memory Search Registry skill. The implementation (RegistryService) is registered as IRegistryService in that skill's DI wiring.
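For readers without that skill at hand, here is a minimal sketch of the contract that KnowledgeBuilder relies on, inferred only from the calls in the code above. The record shapes, the `KnowledgeItem` and `NaiveRegistryService` names, the `Score` property, and the naive substring matching are all assumptions for illustration. The real registry in the In-Memory Search Registry skill may look different, and this stand-in accepts the fuzzy and synonym flags but ignores them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes inferred from how KnowledgeBuilder uses them.
public sealed record KnowledgeItem(
    string Name,
    string Category,
    string FullName,
    string Description,
    string[] MethodNames,
    string[] Keywords);

public sealed record SearchResult(KnowledgeItem Item, int Score);

public interface IRegistryService
{
    IEnumerable<SearchResult> Search(
        string query, int maxResults, bool useFuzzyMatching, bool useSynonyms);
}

// Naive stand-in: scores an item by how many query words appear in its
// name or keywords. The fuzzy/synonym flags are accepted but ignored.
public sealed class NaiveRegistryService : IRegistryService
{
    private static readonly char[] Separators = { ' ', ',', '.', '?', '!', ';', ':' };
    private readonly IReadOnlyList<KnowledgeItem> _items;

    public NaiveRegistryService(IReadOnlyList<KnowledgeItem> items) => _items = items;

    public IEnumerable<SearchResult> Search(
        string query, int maxResults, bool useFuzzyMatching, bool useSynonyms)
    {
        var words = query
            .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
            .Where(w => w.Length > 2)   // skip very short words such as "I" or "do"
            .ToArray();

        return _items
            .Select(item => new SearchResult(item, words.Count(w => Matches(item, w))))
            .Where(r => r.Score > 0)
            .OrderByDescending(r => r.Score)
            .Take(maxResults);
    }

    private static bool Matches(KnowledgeItem item, string word) =>
        item.Name.Contains(word, StringComparison.OrdinalIgnoreCase) ||
        item.Keywords.Any(k => k.Contains(word, StringComparison.OrdinalIgnoreCase));
}
```

A stand-in like this is enough to compile and unit-test KnowledgeBuilder in isolation. Plain substring matching misses inflected forms (for example, "validate" does not match "Validator"), which is the kind of gap the fuzzy and synonym options are meant to close.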

3. Define the chatbot service abstraction

using System.Runtime.CompilerServices;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.Options;

public sealed record ChatMessageDto(string Role, string Content);
public sealed record ChatRequest(string Message, List<ChatMessageDto>? History);

/// <summary>
/// Orchestrates AI chat with contextual knowledge and streaming responses.
/// </summary>
public interface IChatbotService
{
    IAsyncEnumerable<string> StreamResponseAsync(
        string userMessage,
        List<ChatMessageDto>? history,
        CancellationToken ct = default);
}

internal sealed class ChatbotService : IChatbotService
{
    private readonly IChatClient _chatClient;
    private readonly IKnowledgeBuilder _knowledgeBuilder;
    private readonly ChatbotOptions _options;

    public ChatbotService(
        IChatClient chatClient,
        IKnowledgeBuilder knowledgeBuilder,
        IOptions<ChatbotOptions> options)
    {
        _chatClient = chatClient;
        _knowledgeBuilder = knowledgeBuilder;
        _options = options.Value;
    }

    /// <summary>
    /// Streams the AI response as chunks for real-time display.
    /// </summary>
    public async IAsyncEnumerable<string> StreamResponseAsync(
        string userMessage,
        List<ChatMessageDto>? history,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        // 1. Build prompt with injected knowledge
        var systemPrompt = _knowledgeBuilder.BuildContextualPrompt(userMessage);

        // 2. Assemble conversation
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, systemPrompt)
        };

        if (history is not null)
        {
            foreach (var msg in history.TakeLast(_options.MaxConversationHistory))
            {
                var role = msg.Role.Equals("user", StringComparison.OrdinalIgnoreCase)
                    ? ChatRole.User
                    : ChatRole.Assistant;
                messages.Add(new ChatMessage(role, msg.Content));
            }
        }

        messages.Add(new ChatMessage(ChatRole.User, userMessage));

        // 3. Stream the response; a low temperature keeps sampling close to the most likely tokens
        var chatOptions = new ChatOptions { Temperature = _options.Temperature };
        await foreach (var update in _chatClient.GetStreamingResponseAsync(messages, chatOptions, cancellationToken: ct))
        {
            if (update.Text is { Length: > 0 } text)
                yield return text;
        }
    }
}

4. Wire up in `Program.cs` (optional — only when configured)

// At the top of Program.cs:
using Microsoft.Extensions.AI;
using OpenAI;

var chatbotSection = builder.Configuration.GetSection("AIChatbot");
if (chatbotSection.GetValue<bool>("Enabled"))
{
    builder.Services
        .AddOptions<ChatbotOptions>()
        .Bind(chatbotSection)
        .ValidateDataAnnotations()
        .ValidateOnStart();

    builder.Services.AddSingleton<IKnowledgeBuilder, KnowledgeBuilder>();
    builder.Services.AddChatClient(
        new OpenAIClient(chatbotSection["ApiKey"]!)
            .GetChatClient(chatbotSection["ModelId"] ?? "<your-model-id>")
            .AsIChatClient())
        .UseLogging();
    builder.Services.AddSingleton<IChatbotService, ChatbotService>();
}

5. Configuration (`appsettings.json`)

{
  "AIChatbot": {
    "Enabled": true,
    "ApiKey": "sk-...",
    "ModelId": "<your-model-id>",
    "AssistantName": "My Assistant",
    "WelcomeMessage": "Ask me anything about this library!",
    "MaxConversationHistory": 10,
    "Temperature": 0.2
  }
}
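The `ApiKey` value above is a placeholder. In ASP.NET Core's default configuration, environment variables are loaded after `appsettings.json`, and a variable named `AIChatbot__ApiKey` (double underscore as the section separator) overrides the JSON value, so the real key never has to live in the file. The sketch below reproduces that layering with a standalone `ConfigurationBuilder`. The in-memory values stand in for `appsettings.json`, `ConfigLayeringDemo` and the key strings are hypothetical, and it assumes the Microsoft.Extensions.Configuration packages that an ASP.NET Core project already references.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

public static class ConfigLayeringDemo
{
    // Builds configuration from two layers. Providers added later win,
    // so an environment variable overrides the file-like values.
    public static IConfiguration Build()
    {
        return new ConfigurationBuilder()
            .AddInMemoryCollection(new Dictionary<string, string?>
            {
                // Stand-in for the appsettings.json section shown above.
                ["AIChatbot:Enabled"] = "true",
                ["AIChatbot:ApiKey"] = "placeholder-from-file",
            })
            .AddEnvironmentVariables()   // AIChatbot__ApiKey maps to AIChatbot:ApiKey
            .Build();
    }
}
```

With `AIChatbot__ApiKey` set in the process environment, `ConfigLayeringDemo.Build()["AIChatbot:ApiKey"]` returns the environment value rather than the placeholder.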

6. API endpoint (Razor Pages or Minimal API)

app.MapPost("/api/chat", (
    ChatRequest request,
    IChatbotService chatbot,
    CancellationToken ct) =>
{
    var chunks = chatbot.StreamResponseAsync(request.Message, request.History, ct);

    return Results.Stream(async stream =>
    {
        var writer = new StreamWriter(stream, leaveOpen: true);
        await foreach (var chunk in chunks.WithCancellation(ct))
        {
            await writer.WriteAsync(chunk);
            await writer.FlushAsync(ct);
        }
    }, "text/plain");
});
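On the consuming side, the response arrives as plain-text chunks rather than one JSON body. Below is a minimal sketch of a .NET reader for such a stream, assuming the `/api/chat` route above. `ChatStreamReader`, `ReadChunksAsync`, and the buffer sizes are illustrative names and values, not part of the skill.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading;

public static class ChatStreamReader
{
    // Reads UTF-8 text from a response stream and yields it piece by piece
    // as data becomes available, instead of waiting for the full body.
    public static async IAsyncEnumerable<string> ReadChunksAsync(
        Stream stream,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        using var reader = new StreamReader(
            stream,
            Encoding.UTF8,
            detectEncodingFromByteOrderMarks: false,
            bufferSize: 1024,
            leaveOpen: true);

        var buffer = new char[256];

        while (true)
        {
            int read = await reader.ReadAsync(buffer.AsMemory(), ct);
            if (read == 0)
                yield break;   // end of stream

            yield return new string(buffer, 0, read);
        }
    }
}
```

To use it against the endpoint, send the request with `HttpCompletionOption.ResponseHeadersRead` so that `HttpClient` does not buffer the whole body, then pass the stream from `response.Content.ReadAsStreamAsync(ct)` to `ReadChunksAsync` and append each chunk to the UI as it arrives.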

Temperature Guide

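Temperature controls how much randomness the model uses when sampling tokens. Lower values make answers more repeatable and, in practice, more likely to stay close to the injected context, although they do not guarantee grounding. For knowledge-grounded chatbots, prefer the low end of the range.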

| Temperature | Behaviour | Use When |
|-------------|-----------|----------|
| 0.0 | Fully deterministic, stays strictly on context | Compliance, support bots, technical Q&A |
| 0.1–0.3 | Factual, minimal creativity (recommended default) | Documentation assistants, API helpers |
| 0.4–0.7 | Balanced; may rephrase or extend beyond context | General-purpose assistants |
| 0.8–1.0 | Creative, may hallucinate beyond injected knowledge | Not recommended for RAG-lite |
| >1.0 | Highly unpredictable | Never for grounded chatbots |

Architecture Flow

User Question: "How do I validate an email?"
         │
         ▼
┌──────────────────────┐
│  KnowledgeBuilder    │
│  1. Search registry  │──► Finds: ContactValidator, Guard, FluentValidationHelper
│  2. Format context   │
│  3. Inject into      │
│     system prompt    │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│  System Prompt       │
│  Base instructions + │
│  Relevant types:     │
│  - ContactValidator  │
│  - Guard             │
│  - FluentValidation  │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│  LLM ({model-id})    │──► Streaming response with accurate, grounded answer
└──────────────────────┘

When to Use

Adding AI assistance to documentation sites, admin panels, or developer tools
Your knowledge base is small-to-medium (fits in memory as metadata)
You want fast, cheap contextual AI without vector database infrastructure
The knowledge is relatively static (types, API docs, feature catalogs)

When NOT to Use

Large, frequently changing knowledge bases (use full RAG with embeddings)
Answers require reading full source code or long documents (context window limits)
Offline / air-gapped environments (requires API access)

Gotchas

Temperature for grounded responses: The default temperature is 0.2. Higher values increase sampling randomness, which makes the model more likely to drift from the injected context and hallucinate. Avoid temperatures above 0.5 with RAG-lite.
API key security: Never commit API keys. Use environment variables or Azure Key Vault. The Enabled flag lets you disable the chatbot in environments without keys.
Token limits: The system prompt + knowledge context + conversation history must fit within the model's context window. Limit maxResults and MaxConversationHistory.
Optional registration: Guard the DI registration with if (Enabled) so the app works without AI configuration. Map the /api/chat endpoint inside the same guard, because IChatbotService is not registered when the chatbot is disabled.
Streaming vs. buffered: Streaming (IAsyncEnumerable) gives a ChatGPT-like progressive display. Use buffered only for non-interactive scenarios.
Microsoft.Extensions.AI: This is the official Microsoft abstraction. It decouples your code from any specific provider (OpenAI, Azure OpenAI, or others).
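One way to act on the token-limit point is to trim history against a rough budget before sending. The sketch below assumes about four characters per token, which is only a coarse heuristic for English text and not a real tokenizer. `HistoryBudget`, `EstimateTokens`, `TrimToBudget`, and any budget numbers are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class HistoryBudget
{
    // Coarse estimate: roughly 4 characters per token. Not a tokenizer.
    public static int EstimateTokens(string text) => (text.Length + 3) / 4;

    // Keeps the most recent messages whose combined estimate fits in the
    // tokens left after the system prompt; older messages are dropped first.
    public static List<string> TrimToBudget(
        string systemPrompt, IReadOnlyList<string> history, int maxTokens)
    {
        int remaining = maxTokens - EstimateTokens(systemPrompt);
        var kept = new List<string>();

        for (int i = history.Count - 1; i >= 0; i--)
        {
            int cost = EstimateTokens(history[i]);
            if (cost > remaining)
                break;   // this message and everything older is dropped

            remaining -= cost;
            kept.Add(history[i]);
        }

        kept.Reverse();   // restore chronological order
        return kept;
    }
}
```

Because the system prompt already contains the injected knowledge, its size varies per question. Trimming after the prompt is built lets a question that matches many registry items leave less room for history.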

Required Packages

<PackageReference Include="Microsoft.Extensions.AI" Version="9.5.0" />
<PackageReference Include="Microsoft.Extensions.AI.OpenAI" Version="9.5.0" />
<PackageReference Include="OpenAI" Version="2.2.0" />

Related Skills

In-Memory Search Registry (provides the knowledge source)
Streaming AI Responses (the streaming pattern in detail)

