plugins/vercel/skills/ai-generation-persistence/SKILL.md
AI generation persistence patterns — unique IDs, addressable URLs, database storage, and cost tracking for every LLM generation
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add openai/plugins ai-generation-persistence
AI generations are expensive, non-reproducible assets. Never discard them.
Every call to an LLM costs real money and produces unique output that cannot be exactly reproduced. Treat generations like database records — assign an ID, persist immediately, and make them retrievable.
- Unique IDs: nanoid() or createId() from @paralleldrive/cuid2
- Addressable URLs: /chat/[id], /generate/[id], /image/[id]

The standard UX flow for AI features: create the resource first, then redirect to its page.
// app/api/chat/route.ts
import { nanoid } from "nanoid";
import { db } from "@/lib/db";
import { generations } from "@/lib/db/schema";
import { redirect } from "next/navigation";

export async function POST(req: Request) {
  const { prompt, model } = await req.json();
  const id = nanoid();

  // Create the record BEFORE generation starts
  await db.insert(generations).values({
    id,
    prompt,
    model,
    status: "pending",
    createdAt: new Date(),
  });

  // Redirect to the generation page, which handles streaming
  redirect(`/chat/${id}`);
}
// app/chat/[id]/page.tsx
import { eq } from "drizzle-orm";
import { notFound } from "next/navigation";
import { db } from "@/lib/db";
import { generations } from "@/lib/db/schema";
// ChatView: your own component that renders the generation (import it from wherever it lives)

export default async function ChatPage({ params }: { params: Promise<{ id: string }> }) {
  const { id } = await params;
  const generation = await db.query.generations.findFirst({
    where: eq(generations.id, id),
  });
  if (!generation) notFound();

  // Render with streaming if still pending, or show saved result
  return <ChatView generation={generation} />;
}
This gives you: shareable URLs, back-button support, multi-tab sessions, and generation history for free.
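The create-first flow also needs a step that records the outcome once generation finishes or fails, so that a record does not stay in "pending" forever. A minimal sketch of that lifecycle, assuming an in-memory Map in place of the database and a caller-supplied generate function. GenerationRecord, createPending, and runGeneration are hypothetical names used for illustration, not part of any library:

```typescript
// Hypothetical in-memory stand-in for the generations table.
type Status = "pending" | "complete" | "error";

interface GenerationRecord {
  id: string;
  prompt: string;
  status: Status;
  result?: string;
  error?: string;
}

const store = new Map<string, GenerationRecord>();

// Step 1: persist the record before any model call is made.
function createPending(id: string, prompt: string): GenerationRecord {
  const record: GenerationRecord = { id, prompt, status: "pending" };
  store.set(id, record);
  return record;
}

// Step 2: run the caller-supplied generator and persist the outcome,
// so a failure is recorded instead of leaving the record in "pending".
async function runGeneration(
  id: string,
  generate: (prompt: string) => Promise<string>,
): Promise<GenerationRecord> {
  const record = store.get(id);
  if (!record) throw new Error(`Unknown generation: ${id}`);
  try {
    const result = await generate(record.prompt);
    const done: GenerationRecord = { ...record, status: "complete", result };
    store.set(id, done);
    return done;
  } catch (err) {
    const failed: GenerationRecord = {
      ...record,
      status: "error",
      error: err instanceof Error ? err.message : String(err),
    };
    store.set(id, failed);
    return failed;
  }
}
```

In a real app the two store.set calls would be the insert and update statements against the generations table, and generate would wrap the model call.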
// lib/db/schema.ts
import { pgTable, text, integer, timestamp, jsonb } from "drizzle-orm/pg-core";

export const generations = pgTable("generations", {
  id: text("id").primaryKey(), // nanoid
  userId: text("user_id"), // auth user
  model: text("model").notNull(), // "openai/gpt-5.4"
  prompt: text("prompt"), // input text
  result: text("result"), // generated output
  imageUrls: jsonb("image_urls"), // Blob URLs for generated images
  tokenUsage: jsonb("token_usage"), // usage object returned by the SDK (token counts)
  estimatedCostCents: integer("estimated_cost_cents"),
  status: text("status").default("pending"), // pending | streaming | complete | error
  createdAt: timestamp("created_at").defaultNow(),
});
| Data Type | Storage | Why |
|-----------|---------|-----|
| Text, metadata, history | Neon Postgres via Drizzle | Queryable, relational, supports search |
| Generated images & files | Vercel Blob (@vercel/blob) | Permanent URLs, CDN-backed, no expiry |
| Prompt dedup cache | Upstash Redis | Fast lookup, TTL-based expiry |
Never serve generated images as ephemeral base64 or temporary URLs. Save to Blob immediately:
import { put } from "@vercel/blob";
import { generateText } from "ai";
import { eq } from "drizzle-orm";
import { db } from "@/lib/db";
import { generations } from "@/lib/db/schema";

// model, prompt, and generationId come from the surrounding request handler
const result = await generateText({ model, prompt });

// Save every generated image to permanent storage
const imageUrls: string[] = [];
for (const [index, file] of (result.files ?? []).entries()) {
  if (file.mediaType?.startsWith("image/")) {
    const ext = file.mediaType.split("/")[1] || "png";
    // Include the index so multiple images don't share one pathname
    const blob = await put(`generations/${generationId}-${index}.${ext}`, file.uint8Array, {
      access: "public",
      contentType: file.mediaType,
    });
    imageUrls.push(blob.url);
  }
}

// Update the generation record with permanent URLs
await db.update(generations)
  .set({ imageUrls, status: "complete" })
  .where(eq(generations.id, generationId));
Extract usage from every generation and store it. This enables billing, budgeting, and abuse detection:
const result = await generateText({ model, prompt });
// Token counts. AI SDK 5 names them inputTokens / outputTokens / totalTokens
// (v4 used promptTokens / completionTokens).
const usage = result.usage;
const estimatedCostCents = estimateCost(model, usage); // estimateCost: your own pricing helper
await db.update(generations).set({
  result: result.text,
  tokenUsage: usage,
  estimatedCostCents,
  status: "complete",
}).where(eq(generations.id, generationId));
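The snippet above calls estimateCost without defining it. One way such a helper could look, as a sketch only: the price table holds made-up placeholder numbers for a made-up model ID (real per-token prices must come from your provider), and the inputTokens / outputTokens field names are an assumption about the usage shape that you should adapt to your SDK version.

```typescript
// Token counts for one generation. Field names are an assumption;
// map them from whatever your SDK's usage object provides.
interface TokenUsage {
  inputTokens?: number;
  outputTokens?: number;
}

// Price per one million tokens, in US cents.
interface ModelPrice {
  inputCentsPerMTok: number;
  outputCentsPerMTok: number;
}

// PLACEHOLDER prices for a made-up model ID. Replace with real values.
const PRICES: Record<string, ModelPrice> = {
  "example/model": { inputCentsPerMTok: 100, outputCentsPerMTok: 400 },
};

// Returns the estimated cost in whole cents (rounded up), or null when
// the model has no price entry, so unknown models are not recorded as free.
function estimateCost(model: string, usage: TokenUsage): number | null {
  const price = PRICES[model];
  if (!price) return null;
  const input = (usage.inputTokens ?? 0) * price.inputCentsPerMTok;
  const output = (usage.outputTokens ?? 0) * price.outputCentsPerMTok;
  return Math.ceil((input + output) / 1_000_000);
}
```

Rounding up to whole cents overstates very small generations; if that matters for billing, a finer unit than cents may be a better fit for the stored value.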
Avoid paying for identical generations. Cache by content hash:
import { Redis } from "@upstash/redis";
import { createHash } from "crypto";

const redis = Redis.fromEnv();

function hashPrompt(model: string, prompt: string): string {
  return createHash("sha256").update(`${model}:${prompt}`).digest("hex");
}

// Check cache before generating
const cacheKey = `gen:${hashPrompt(model, prompt)}`;
const cached = await redis.get<string>(cacheKey);
if (cached) return cached; // Return cached generation ID

// After generation, cache the result
await redis.set(cacheKey, generationId, { ex: 3600 }); // 1hr TTL
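Taken together, the check, generate, and cache steps form a cache-aside wrapper. A sketch under assumptions: KeyValueStore is a hypothetical minimal interface (a Redis client could be adapted to it), MemoryStore is an in-memory implementation included only so the example runs without Redis, and getOrGenerate is an illustrative name.

```typescript
// Hypothetical minimal store interface; a Redis client could be adapted to it.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// In-memory implementation with TTL, for local runs and tests only.
class MemoryStore implements KeyValueStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (Date.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

// Returns the cached generation ID for this key, or runs `generate`,
// caches the ID it returns, and returns that.
async function getOrGenerate(
  store: KeyValueStore,
  cacheKey: string,
  generate: () => Promise<string>,
  ttlSeconds = 3600,
): Promise<{ generationId: string; cached: boolean }> {
  const hit = await store.get(cacheKey);
  if (hit !== null) return { generationId: hit, cached: true };
  const generationId = await generate();
  await store.set(cacheKey, generationId, ttlSeconds);
  return { generationId, cached: false };
}
```

Two concurrent requests for the same key can both miss and both generate; this sketch does not deduplicate in-flight work.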
[id] segments: /api/chat with no ID means generations aren't addressable. Use /chat/[id].