skills/caching-strategies/SKILL.md
Caching design across HTTP, CDN, application, and client layers. Use when: adding cache headers, configuring CloudFront or CDN behavior, implementing Redis or in-memory caching, designing IndexedDB or service worker caches, choosing cache invalidation strategies, auditing an existing application for missing or misconfigured caches, or optimizing API response times. Covers HTTP cache-control, CDN edge caching, application-level caching, client-side storage, cache invalidation patterns, and cache warming.
The best cache is the one closest to the requester. Cache at the edge before the origin, at the application before the database, at the client before the network. Each cache layer reduces load on the layers behind it.
| Layer | Where | TTL range | Invalidation | Example |
|-------|-------|-----------|-------------|---------|
| Browser | Client | Seconds to years | Cache-Control headers, versioned URLs | Static assets, API responses |
| Service Worker | Client | App-controlled | Programmatic via Cache API | Offline-first PWA data |
| CDN / Edge | Edge PoP | Minutes to days | Purge API, TTL expiry | CloudFront, Cloudflare |
| API Gateway | Origin edge | Seconds to minutes | TTL expiry | API Gateway response cache |
| Application | Server memory or Redis | Seconds to hours | Explicit delete, TTL | Computed aggregations, session data |
| Database | Query layer | Automatic | Query plan cache, materialized views | PostgreSQL shared buffers and plan cache |
Shared caches (CDN, API Gateway) must never serve one user's data to another. Mark user-specific responses as private or no-store.
Cache-Control: public, max-age=31536000, immutable
Use content-hashed filenames (app.a1b2c3.js) so the URL changes when content changes. Set max-age to 1 year. The immutable directive tells browsers not to revalidate.
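As a rough illustration of content-hashed filenames, the sketch below derives the hash from the file's bytes, so any content change produces a new URL. The helper name `hashed_filename` and the 8-character digest length are assumptions for illustration; build tools such as bundlers normally do this for you.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_filename(name: str, content: bytes, digest_len: int = 8) -> str:
    """Return `name` with a short content hash inserted before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(name)
    # app.js -> app.<digest>.js, keeping the directory part unchanged
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


v1 = hashed_filename("static/app.js", b"console.log('v1');")
v2 = hashed_filename("static/app.js", b"console.log('v2');")
print(v1)
print(v2)  # differs from v1 because the content differs
```

Because the old URL is never reused for new content, the year-long `max-age` and `immutable` are safe for these files.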
Cache-Control: public, max-age=60, stale-while-revalidate=300
- max-age=60: Fresh for 60 seconds.
- stale-while-revalidate=300: Serve stale for up to 5 minutes while fetching a fresh copy in the background.

Cache-Control: private, max-age=0, must-revalidate
Prevents CDN/proxy caching. Browser may cache but must revalidate on every request.
Cache-Control: no-store
Use for sensitive data (auth tokens, PII, financial data). no-store is stronger than no-cache — it prevents storage entirely.
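The four header policies above can be collected in one place so handlers pick a policy by response type instead of hand-writing header strings. This is a minimal sketch: the `ResponseKind` enum and `cache_control_for` helper are hypothetical names, and the header values are the ones listed above.

```python
from enum import Enum


class ResponseKind(Enum):
    STATIC_ASSET = "static_asset"    # content-hashed files
    PUBLIC_API = "public_api"        # same response for every user
    USER_SPECIFIC = "user_specific"  # varies per user, must stay out of shared caches
    SENSITIVE = "sensitive"          # auth tokens, PII, financial data


CACHE_CONTROL: dict[ResponseKind, str] = {
    ResponseKind.STATIC_ASSET: "public, max-age=31536000, immutable",
    ResponseKind.PUBLIC_API: "public, max-age=60, stale-while-revalidate=300",
    ResponseKind.USER_SPECIFIC: "private, max-age=0, must-revalidate",
    ResponseKind.SENSITIVE: "no-store",
}


def cache_control_for(kind: ResponseKind) -> str:
    """Look up the Cache-Control value for a class of response."""
    return CACHE_CONTROL[kind]


print(cache_control_for(ResponseKind.PUBLIC_API))
```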
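# FastAPI example
import json
from hashlib import sha256

from fastapi import FastAPI, Request, Response

app = FastAPI()

@app.get("/api/config")
def get_config(request: Request) -> Response:
    data = load_config()  # application helper, assumed to return JSON-serializable data
    body = json.dumps(data, sort_keys=True)  # stable serialization keeps the ETag stable
    # ETag values are quoted strings
    etag = f'"{sha256(body.encode()).hexdigest()[:16]}"'
    headers = {"ETag": etag, "Cache-Control": "private, max-age=0, must-revalidate"}
    if request.headers.get("if-none-match") == etag:
        # Unchanged: send 304 with the validators but no body
        return Response(status_code=304, headers=headers)
    return Response(content=body, media_type="application/json", headers=headers)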
| Path pattern | TTL | Origin | Cache policy |
|-------------|-----|--------|-------------|
| /static/* | 1 year | S3 | CachingOptimized (query strings ignored) |
| /api/public/* | 60s | ALB/Lambda | Forward Accept, Accept-Encoding |
| /api/private/* | 0 | ALB/Lambda | CachingDisabled |
| Default (*) | 1 day | S3 | CachingOptimized |
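To make the behavior table concrete, the sketch below resolves a request path to the first matching pattern and falls back to the default behavior. It is only an approximation: `fnmatchcase` stands in for CloudFront's path-pattern matching, and the `Behavior` class and `resolve_behavior` function are hypothetical names.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Behavior:
    pattern: str
    ttl_seconds: int
    origin: str


# Ordered list: the first matching pattern wins, mirroring the table rows
BEHAVIORS = [
    Behavior("/static/*", 31_536_000, "S3"),
    Behavior("/api/public/*", 60, "ALB/Lambda"),
    Behavior("/api/private/*", 0, "ALB/Lambda"),
]
DEFAULT_BEHAVIOR = Behavior("*", 86_400, "S3")


def resolve_behavior(path: str) -> Behavior:
    """Return the first behavior whose pattern matches, else the default."""
    for behavior in BEHAVIORS:
        if fnmatchcase(path, behavior.pattern):
            return behavior
    return DEFAULT_BEHAVIOR


print(resolve_behavior("/api/public/products"))
print(resolve_behavior("/index.html"))
```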
Include in the cache key only what differentiates responses:
- Include: Accept-Language header (for localized content).
- Exclude: Authorization header (every user gets a different cache entry, which defeats the purpose).

# Invalidate specific paths
aws cloudfront create-invalidation \
  --distribution-id E1234567890 \
  --paths "/api/public/products" "/api/public/categories"
# Invalidate everything (expensive; use sparingly)
aws cloudfront create-invalidation \
  --distribution-id E1234567890 \
  --paths "/*"
Prefer versioned URLs over invalidation. Invalidation is slow (minutes), costly at scale, and cannot be undone.
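The cache-key guidance earlier in this section (include only what differentiates responses) can be sketched as a key-normalization function. The allow-list `ALLOWED_PARAMS` and the key layout are assumptions for illustration; a CDN expresses the same idea through its cache policy settings rather than application code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical allow-list: only these query parameters change the response
ALLOWED_PARAMS = frozenset({"category", "page"})


def cache_key(url: str, accept_language: str = "") -> str:
    """Build a cache key from the path, allow-listed params, and primary language."""
    parts = urlsplit(url)
    # Drop tracking parameters and sort the rest so parameter order does not matter
    params = sorted((k, v) for k, v in parse_qsl(parts.query) if k in ALLOWED_PARAMS)
    # Keep only the first language tag, without its q-value
    language = accept_language.split(",")[0].split(";")[0].strip().lower() or "default"
    return f"{parts.path}?{urlencode(params)}|lang={language}"


a = cache_key("/api/public/products?page=2&utm_source=mail&category=a", "en-US,en;q=0.9")
b = cache_key("/api/public/products?category=a&page=2", "en-US")
print(a)
print(a == b)
```

Both URLs map to the same key, so they share one cache entry instead of fragmenting the hit rate.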
import hashlib
import json
from collections.abc import Callable
from functools import wraps
from typing import Any

import redis

cache = redis.Redis(host="cache.example.com", port=6379, decode_responses=True)

def cached(prefix: str, ttl_seconds: int = 300) -> Callable[..., Any]:
    """Decorator for caching JSON-serializable function results in Redis."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(func)  # preserve the wrapped function's name and docstring
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Derive a stable key from the call arguments
            key_data = json.dumps({"args": args, "kwargs": kwargs}, sort_keys=True, default=str)
            cache_key = f"{prefix}:{hashlib.sha256(key_data.encode()).hexdigest()[:16]}"
            cached_value = cache.get(cache_key)
            if cached_value is not None:
                return json.loads(cached_value)
            result = func(*args, **kwargs)
            cache.setex(cache_key, ttl_seconds, json.dumps(result))
            return result
        return wrapper
    return decorator

@cached("products", ttl_seconds=60)
def get_products(category: str) -> list[dict[str, Any]]:
    return db.query_products(category)  # db is the application's data-access layer
from functools import lru_cache
from typing import Any

@lru_cache(maxsize=256)
def compute_expensive_result(input_hash: str) -> dict[str, Any]:
    """Cache in process memory. Use for immutable inputs only."""
    return heavy_computation(input_hash)
Warning: In-memory caches are per-process. In Lambda, each concurrent execution has its own cache. Use Redis for shared state.
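`lru_cache` never expires entries, which is why it suits immutable inputs only. For data that can change, a per-process cache needs a TTL as well as a size bound. The class below is a minimal sketch of that idea, not a library API: `TTLCache` and its parameters are hypothetical, and it is not thread-safe.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable


class TTLCache:
    """Tiny per-process cache with LRU eviction and per-entry expiry."""

    def __init__(self, maxsize: int = 256, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._maxsize = maxsize
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._data: OrderedDict[Hashable, tuple[float, Any]] = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: drop it and report a miss
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._maxsize:
            self._data.popitem(last=False)  # evict the least recently used entry


local_cache = TTLCache(maxsize=2, ttl_seconds=30)
local_cache.set("config", {"theme": "dark"})
print(local_cache.get("config"))
```

The same per-process caveat applies: every worker or Lambda execution environment holds its own copy.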
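Cache-aside is the most common pattern: the application manages the cache explicitly, reading from the cache first and populating it from the source of truth on a miss.
Write-through updates the cache on every write. Use it when read-after-write consistency is required. It is more complex, but it eliminates stale reads.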
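The difference between the two patterns is easiest to see side by side. In this sketch plain dictionaries stand in for the database and the cache, and the `CachedStore` class is a hypothetical name; a real implementation would use Redis and a database client and would need to handle failures between the two writes.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class CachedStore:
    db: dict[str, Any]                                    # stand-in for the source of truth
    cache: dict[str, Any] = field(default_factory=dict)  # stand-in for Redis

    def read(self, key: str) -> Any:
        """Cache-aside read: check the cache, load from the db on a miss."""
        if key in self.cache:
            return self.cache[key]
        value = self.db[key]
        self.cache[key] = value  # populate for the next reader
        return value

    def write_cache_aside(self, key: str, value: Any) -> None:
        """Cache-aside write: update the db, then invalidate the cached copy."""
        self.db[key] = value
        self.cache.pop(key, None)

    def write_through(self, key: str, value: Any) -> None:
        """Write-through: update the db and the cache in the same operation."""
        self.db[key] = value
        self.cache[key] = value


store = CachedStore(db={"p1": {"name": "Lamp"}})
print(store.read("p1"))  # miss, loaded from db and cached
store.write_through("p1", {"name": "Desk lamp"})
print(store.read("p1"))  # hit, already fresh
```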
const { data } = useQuery({
  queryKey: ["products", categoryId],
  queryFn: () => fetchProducts(categoryId),
  staleTime: 5 * 60 * 1000, // Fresh for 5 minutes
  gcTime: 30 * 60 * 1000, // Keep in memory for 30 minutes
  refetchOnWindowFocus: false, // Don't refetch on tab switch
});
async function getCachedOrFetch<T>(
  storeName: string,
  key: string,
  fetcher: () => Promise<T>,
  maxAgeMs: number = 5 * 60 * 1000,
): Promise<T> {
  const cached = await idb.get(storeName, key);
  if (cached && Date.now() - cached.timestamp < maxAgeMs) {
    return cached.data as T;
  }
  const fresh = await fetcher();
  await idb.put(storeName, { key, data: fresh, timestamp: Date.now() });
  return fresh;
}
// Cache-first for static assets, network-first for API
self.addEventListener("fetch", (event: FetchEvent) => {
  const url = new URL(event.request.url);
  if (url.pathname.startsWith("/static/")) {
    event.respondWith(caches.match(event.request).then((r) => r ?? fetch(event.request)));
  } else if (url.pathname.startsWith("/api/") && event.request.method === "GET") {
    // The Cache API only stores GET requests, so other methods go straight to the network
    event.respondWith(
      fetch(event.request)
        .then((response) => {
          if (response.ok) {
            // Cache only successful responses so errors are never replayed
            const clone = response.clone();
            event.waitUntil(caches.open("api-cache").then((cache) => cache.put(event.request, clone)));
          }
          return response;
        })
        .catch(() => caches.match(event.request).then((r) => r ?? new Response("Offline", { status: 503 }))),
    );
  }
});
| Strategy | How it works | Best for |
|----------|-------------|----------|
| TTL expiry | Cache entry expires after fixed duration | Tolerant of slight staleness (product listings, feeds) |
| Event-driven purge | Publish event on write → subscriber deletes cache key | Strong consistency needs (user profile, permissions) |
| Versioned keys | Include version/hash in cache key; new version = new key | Configuration, feature flags |
| Write-through | Update cache on every write | Read-heavy with frequent writes |
| Cache stampede prevention | Lock during recomputation; others wait or serve stale | Expensive computations with high concurrency |
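Two rows of the table, versioned keys and event-driven purge, can be sketched in a few lines. Everything here is illustrative: the in-process `EventBus` stands in for a real message broker, and the key formats and topic name are assumptions.

```python
from collections import defaultdict
from typing import Any, Callable


def versioned_key(namespace: str, name: str, version: str) -> str:
    """Versioned keys: a new version yields a new key, so old entries simply age out."""
    return f"{namespace}:{version}:{name}"


class EventBus:
    """Minimal in-process publish/subscribe, standing in for a message broker."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> None:
        for handler in self._handlers.get(topic, []):
            handler(payload)


# Event-driven purge: the write path publishes, a subscriber deletes the cache key
cache_store: dict[str, Any] = {"user:42:profile": {"name": "Ada"}}
bus = EventBus()
bus.subscribe("user.updated", lambda user_id: cache_store.pop(f"user:{user_id}:profile", None))
bus.publish("user.updated", 42)

print(cache_store)
print(versioned_key("flags", "checkout", "v7"))
```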
import time

# Reuses cache, json, Callable, and Any from the Redis decorator example
LOCK_TTL = 10  # seconds; should exceed the worst-case compute time

def get_with_lock(key: str, compute_fn: Callable[[], Any], ttl: int = 300) -> Any:
    """Prevent thundering herd on cache miss."""
    value = cache.get(key)
    if value is not None:
        return json.loads(value)
    lock_key = f"lock:{key}"
    # SET with NX and EX acquires the lock atomically, so only one caller recomputes
    if cache.set(lock_key, "1", nx=True, ex=LOCK_TTL):
        try:
            result = compute_fn()
            cache.setex(key, ttl, json.dumps(result))
            return result
        finally:
            cache.delete(lock_key)
    else:
        # Another process is computing: wait briefly and retry once
        time.sleep(0.1)
        value = cache.get(key)
        # Fall back to computing locally if the value is still missing
        return json.loads(value) if value is not None else compute_fn()
Warm caches on deploy to avoid cold-start latency:
def warm_cache() -> None:
    """Call after deployment to pre-populate critical caches."""
    popular_categories = ["electronics", "clothing", "home"]
    for category in popular_categories:
        get_products(category)  # Triggers cache-aside population
- When to warm: After deploys, after cache flushes, for predictably popular content.
- When not to warm: For long-tail content (millions of unique keys). Let demand drive caching.
| Anti-Pattern | Problem | Fix |
|-------------|---------|-----|
| Cache everything forever | Stale data served indefinitely | Set explicit TTLs on every cache entry |
| No cache at all | Every request hits origin | Add Cache-Control headers at minimum |
| Caching user-specific data in shared CDN | Data leaks between users | Use Cache-Control: private or no-store |
| Cache key includes irrelevant parameters | Low hit rate, wasted memory | Minimize cache key components |
| No invalidation strategy | Stale data until TTL expires | Use event-driven purge for mutable data |
| Cache stampede on popular keys | Origin overwhelmed on expiry | Use locking or stale-while-revalidate |
| Caching errors | Error responses cached and served to all | Never cache 5xx; cache 404 briefly if intentional |
| Using cache as primary data store | Data loss on eviction or restart | Cache is ephemeral — always have a source of truth |
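The "Caching errors" row can be enforced with a small guard that decides the TTL from the response status. This is a sketch under assumed values: the function name and both TTL defaults are hypothetical, and redirects are left uncached here purely for simplicity.

```python
def cache_ttl_for_status(status: int, default_ttl: int = 300, not_found_ttl: int = 30) -> int:
    """Return the TTL in seconds for a response status; 0 means do not cache."""
    if 200 <= status < 300:
        return default_ttl
    if status == 404:
        return not_found_ttl  # brief negative caching, only if intentional
    return 0  # never cache 5xx or anything else unexpected


for status in (200, 404, 500, 503):
    print(status, cache_ttl_for_status(status))
```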
When auditing an existing application for caching:
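- Static assets use a long max-age with content-hashed filenames
- API responses have explicit Cache-Control headers (not missing entirely)
- User-specific responses are marked private or no-store
- Sensitive data uses no-store
- staleTime / gcTime configured for data fetching library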
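Parts of this audit can be automated by checking captured response headers against the checklist. The sketch below covers only the header-related items. The path prefixes `/static/` and `/api/private/` are assumptions borrowed from the CDN behavior table, and `audit_cache_headers` is a hypothetical helper, not an existing tool.

```python
ONE_YEAR = 31_536_000


def parse_cache_control(value: str) -> dict[str, str]:
    """Split a Cache-Control value into lowercase directive names and arguments."""
    directives: dict[str, str] = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"')
    return directives


def audit_cache_headers(path: str, headers: dict[str, str]) -> list[str]:
    """Return a list of findings for one response; an empty list means it passed."""
    normalized = {k.lower(): v for k, v in headers.items()}
    raw = normalized.get("cache-control")
    if raw is None:
        return ["Cache-Control header is missing"]
    directives = parse_cache_control(raw)
    findings: list[str] = []
    if path.startswith("/static/"):
        max_age = directives.get("max-age", "")
        if not max_age.isdigit() or int(max_age) < ONE_YEAR:
            findings.append("static asset lacks a long max-age")
    if path.startswith("/api/private/") and not ({"private", "no-store"} & directives.keys()):
        findings.append("user-specific response is not marked private or no-store")
    return findings


print(audit_cache_headers("/static/app.js", {"Cache-Control": "public, max-age=60"}))
print(audit_cache_headers("/api/private/me", {"Cache-Control": "no-store"}))
```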