ai-app-architecture/SKILL.md
Use when designing or building AI-powered application systems — choosing architecture style, selecting components, structuring the AI stack, making build-vs-buy decisions, and planning multi-tenant AI module gating
[...] ai-app-architecture or would be better handled by a more specific companion skill.
[...] SKILL.md first, then load only the referenced deep-dive files that are necessary for the task.
AI-powered apps are built on top of foundation models via APIs. You are NOT training models; you are orchestrating them. Your value lies in the application layer: context construction, prompt engineering, retrieval, guardrails, and user experience.
Core principle: Start with the simplest architecture that works. Evolve deliberately.

| Style | Description | When to Use |
|---|---|---|
| Wrap | Your UI + prompt engineering wraps a commercial LLM API | First project, internal tools, quick wins |
| RAG | Retriever fetches private/fresh data, injected into prompt | Apps needing company-specific or up-to-date knowledge |
| Agentic | LLM plans and executes multi-step tasks using tools | Complex automation, multi-step workflows |
| Fine-tuned | Model weights adapted for domain/style | Only when brand voice or jargon cannot be achieved via prompts |

Default path: Wrap → RAG → Agents. Only fine-tune when all else fails.
┌──────────────────────────────────┐
│  User Interface (web/mobile)     │
├──────────────────────────────────┤
│  Input Guardrail                 │ ← block PII, prompt injection, off-topic
│  Router / Intent Classifier      │ ← route to right model/solution
│  Context Builder (RAG / Tools)   │ ← feature engineering for AI
│  Model Gateway                   │ ← unified API wrapper, key mgmt, fallbacks
│  LLM API (OpenAI/Claude/Gemini)  │
│  Output Guardrail                │ ← catch toxicity, format failures, PII
│  Cache Layer                     │ ← exact + semantic caching
│  Streaming Handler               │ ← MANDATORY: never block on full generation
├──────────────────────────────────┤
│  Token Ledger (MANDATORY)        │ ← log every call: tenant_id, user_id, tokens
│  AI Module Gate (OFF by default) │ ← per-tenant enable/disable
└──────────────────────────────────┘
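
The layer order in the diagram can be sketched as a chain of plain functions. This is a minimal illustration under stated assumptions, not the skill's implementation: `echo_model` is a stand-in stub for a real streaming provider call, the e-mail regex is a deliberately crude PII check, and every function name here is hypothetical.

```python
import re
from typing import Callable, Iterator

# Crude e-mail pattern used as the only PII rule in this sketch (assumption).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def input_guardrail(query: str) -> str:
    """Mask e-mail addresses before the query leaves the application."""
    return EMAIL_RE.sub("[EMAIL]", query)


def build_context(query: str, documents: list[str]) -> str:
    """Naive retrieval: keep documents sharing at least one word with the query."""
    words = set(query.lower().split())
    hits = [d for d in documents if words & set(d.lower().split())]
    return "\n".join(hits)


def output_guardrail(chunk: str) -> str:
    """Mask e-mail addresses in a streamed chunk.

    Checking chunk by chunk misses an address split across two chunks;
    a real guardrail would buffer enough text to cover that case.
    """
    return EMAIL_RE.sub("[EMAIL]", chunk)


def handle(
    query: str,
    documents: list[str],
    call_model: Callable[[str], Iterator[str]],
) -> Iterator[str]:
    """Input guard -> context -> model -> output guard, streamed chunk by chunk."""
    safe_query = input_guardrail(query)
    context = build_context(safe_query, documents)
    prompt = f"Context:\n{context}\n\nQuestion: {safe_query}"
    for chunk in call_model(prompt):
        yield output_guardrail(chunk)


def echo_model(prompt: str) -> Iterator[str]:
    """Stand-in for a streaming LLM call: yields the prompt's last line word by word."""
    for word in prompt.splitlines()[-1].split():
        yield word + " "
```

Because `handle` is a generator, the caller can forward each chunk to the client as soon as it arrives, which is the point of the streaming handler layer.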
routing → retrieval → generation → scoring
Step 1 (Baseline): Query → Model API → Response
Step 2 (Context): Query → Retriever → [Context + Query] → Model → Response
Step 3 (Guardrails): Input Guard → Context → Model → Output Guard → Response
Step 4 (Router): Router → [Intent-specific path] → Model(s) → Response
Step 5 (Cache): Router → Cache → [miss: full pipeline] → Cache Store
Step 6 (Agents): Router → Agent Loop [Plan → Tools → Reflect] → Response
Add each layer only when its absence is causing a real problem.
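
Step 5 can be sketched as an exact-match cache wrapped around whatever pipeline already exists. This is a minimal in-memory sketch: the class name, the whitespace and case normalisation, and the choice to include `tenant_id` in the key (so one tenant never receives another tenant's cached answer) are assumptions made for the example. Semantic caching is not shown.

```python
import hashlib
from typing import Callable


class ExactCache:
    """Exact-match cache keyed on a hash of tenant, model, and normalised prompt."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(tenant_id: int, model: str, prompt: str) -> str:
        # Collapse whitespace and case so trivially different prompts share a key.
        normalised = " ".join(prompt.lower().split())
        raw = f"{tenant_id}|{model}|{normalised}"
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def get_or_call(
        self,
        tenant_id: int,
        model: str,
        prompt: str,
        pipeline: Callable[[str], str],
    ) -> str:
        """Return the cached answer, or run the full pipeline on a miss and store it."""
        key = self._key(tenant_id, model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        answer = pipeline(prompt)
        self._store[key] = answer
        return answer
```

The `hits` and `misses` counters give a quick way to check whether the cache is earning its place before moving it to Redis.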

| Option | Effort | Control | Cost | When |
|---|---|---|---|---|
| Commercial API (OpenAI/Claude) | Low | Low | Per-token | Default choice |
| Open source self-hosted (Llama) | High | Full | GPU infra | Data privacy requirement, high volume |
| Fine-tuned commercial | Medium | Partial | Training + inference | Brand voice, jargon control |
| Fine-tuned self-hosted | Very High | Full | High | Maximum control, regulated industries |

Rule: API wrap first. Justify self-hosting with actual cost/compliance numbers.
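
One way to put numbers behind that rule is a break-even estimate: the monthly token volume at which a fixed self-hosting bill equals the per-token API bill. The sketch below is illustrative only. The function names are hypothetical, and the figures in the usage note are placeholders, not real provider or GPU prices.

```python
def monthly_api_cost(tokens_per_month: int, usd_per_million_tokens: float) -> float:
    """Per-token API spend for one month at a blended price per million tokens."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens


def break_even_tokens(
    self_host_usd_per_month: float, usd_per_million_tokens: float
) -> float:
    """Monthly token volume at which API spend equals the fixed self-hosting cost."""
    return self_host_usd_per_month / usd_per_million_tokens * 1_000_000


# Placeholder figures (assumptions): 5 USD per million tokens blended,
# 4,000 USD per month for GPUs plus operations.
current_spend = monthly_api_cost(200_000_000, 5.0)
threshold = break_even_tokens(4_000.0, 5.0)
```

With these placeholder inputs, 200 million tokens a month costs 1,000 USD through the API, and self-hosting only breaks even at 800 million tokens a month, so the API remains the cheaper option unless a compliance requirement overrides cost.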
Every AI feature MUST be gated. AI costs real money per token.
-- Schema: AI module per tenant
CREATE TABLE tenant_ai_config (
    tenant_id          INT PRIMARY KEY,
    ai_enabled         BOOLEAN NOT NULL DEFAULT FALSE,  -- OFF by default
    monthly_budget_usd DECIMAL(10,2) NULL,              -- NULL = unlimited
    budget_alert_pct   INT NOT NULL DEFAULT 80,         -- alert at 80% of budget
    plan_name          VARCHAR(50),                     -- 'basic', 'pro', 'enterprise'
    enabled_at         TIMESTAMP NULL,                  -- NULL while AI has never been enabled
    created_at         TIMESTAMP DEFAULT NOW()
);
Enforcement: Every AI endpoint checks tenant_ai_config.ai_enabled before processing. Return 402 Payment Required if disabled.
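
At the HTTP boundary, that check can be sketched as a small function that every AI endpoint calls first. This is a framework-neutral sketch: `TenantAiConfig`, `gate_ai_request`, and the error body are hypothetical names, and a missing config row is treated as disabled, in line with the OFF-by-default rule.

```python
from dataclasses import dataclass
from http import HTTPStatus
from typing import Optional


@dataclass
class TenantAiConfig:
    """In-memory stand-in for one row of tenant_ai_config."""
    tenant_id: int
    ai_enabled: bool = False  # OFF by default, as in the schema


def gate_ai_request(config: Optional[TenantAiConfig]) -> tuple[int, dict]:
    """Return (status_code, body); anything other than 200 means stop before the model call."""
    if config is None or not config.ai_enabled:
        return HTTPStatus.PAYMENT_REQUIRED.value, {"error": "ai_module_disabled"}
    return HTTPStatus.OK.value, {}
```

Running the gate before any retrieval or model call means a disabled tenant never incurs token cost.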
Log every AI API call for billing, debugging, and cost visibility.
CREATE TABLE ai_token_usage (
    id           BIGINT AUTO_INCREMENT PRIMARY KEY,
    tenant_id    INT NOT NULL,
    user_id      INT NOT NULL,
    feature_name VARCHAR(100),       -- 'invoice_analysis', 'report_summary'
    model        VARCHAR(50),        -- 'gpt-4o', 'claude-3-sonnet'
    tokens_in    INT NOT NULL,
    tokens_out   INT NOT NULL,
    cost_usd     DECIMAL(10,6),      -- calculated at log time
    latency_ms   INT,
    created_at   TIMESTAMP DEFAULT NOW(),
    INDEX idx_tenant_date (tenant_id, created_at),
    INDEX idx_user_date (user_id, created_at)
);
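
The `cost_usd` column is filled at log time, which can be sketched as below. The price table is a placeholder with a made-up model name and made-up prices (an assumption, not real provider pricing). `Decimal` is used so the result matches the `DECIMAL(10,6)` column without floating-point drift.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical price table: model -> (input, output) USD per million tokens.
PRICES_PER_MILLION = {
    "example-model": (Decimal("3.00"), Decimal("15.00")),
}


def call_cost_usd(model: str, tokens_in: int, tokens_out: int) -> Decimal:
    """Cost of one call, rounded to the six decimal places of cost_usd."""
    price_in, price_out = PRICES_PER_MILLION[model]
    cost = (tokens_in * price_in + tokens_out * price_out) / Decimal(1_000_000)
    return cost.quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP)
```

Storing the computed cost with each row keeps historical invoices stable when provider prices change later.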
-- Usage by tenant (for invoicing)
SELECT tenant_id,
       SUM(tokens_in + tokens_out) AS total_tokens,
       SUM(cost_usd) AS total_cost_usd,
       DATE_FORMAT(created_at, '%Y-%m') AS month
FROM ai_token_usage
GROUP BY tenant_id, month;

-- Usage by user (for analytics)
SELECT user_id, feature_name,
       SUM(tokens_in + tokens_out) AS tokens,
       COUNT(*) AS calls
FROM ai_token_usage
WHERE tenant_id = ? AND created_at > DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY user_id, feature_name;
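function checkAiQuota(int $tenantId): void {
    // Gate first: a missing config row is treated the same as a disabled module.
    $config = TenantAiConfig::find($tenantId);
    if (!$config || !$config->ai_enabled) {
        throw new AiModuleDisabledException('AI module not enabled for this account.');
    }
    // A null budget means unlimited spend, so no further checks apply.
    if ($config->monthly_budget_usd !== null) {
        $spent = AiTokenUsage::currentMonthCost($tenantId);
        // Hard stop once this month's spend reaches the budget.
        if ($spent >= $config->monthly_budget_usd) {
            throw new AiBudgetExceededException('Monthly AI budget reached.');
        }
        // Soft alert once spend crosses budget_alert_pct percent of the budget.
        // This branch runs on every call past the threshold, so
        // notifyTenantBudgetAlert() should deduplicate its notifications.
        if ($spent >= $config->monthly_budget_usd * ($config->budget_alert_pct / 100)) {
            notifyTenantBudgetAlert($tenantId, $spent, $config->monthly_budget_usd);
        }
    }
}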

| Layer | Lightweight | Production |
|---|---|---|
| LLM | OpenAI API | API + fallback provider via gateway |
| Context | In-memory / SQLite | Vector DB (Chroma, Qdrant, Pinecone) |
| Cache | Redis | Redis Cluster |
| Queue | Sync | Kafka / RabbitMQ |
| Monitoring | Log file | Prometheus + Grafana |

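
The "API + fallback provider via gateway" cell can be sketched as a loop over providers in priority order. This is a minimal sketch, assuming each provider is wrapped as a plain callable that takes a prompt and returns text. The names are hypothetical, and retries, timeouts, and key management are left out.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every configured provider raised an error."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, answer) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real gateway would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the answer lets the token ledger record which model actually served the request.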
Chip Huyen — AI Engineering (2025); David Spuler — Generative AI Applications (2024); Andrea De Mauro — AI Applications Made Easy (2024)