plugins/frontend-toolkit/skills/new-tech-evaluation/SKILL.md
Evaluate a new library, framework version, or AI integration with bundle size, TypeScript support, maintenance status, security/supply-chain, and migration + exit cost. POC + benchmark before adoption. Use at quarterly tech review, when a new library could solve a pain point, on a React/Next.js major version release, or on AI API major updates. Not for recording the resulting decision (use decision-records) or shrinking an already-adopted dependency (use bundle-optimization).
npx skillsauth add jaykim88/claude-ai-engineering new-tech-evaluationInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Decide whether to adopt a new library / framework version / AI capability with evidence — bundle impact, type safety, maintenance health, security, migration cost — not hype. Default to skepticism: proven "boring" tech is the baseline, and each new dependency spends an innovation token — the burden is on the candidate to beat what you'd otherwise write or already use.
Universal — evaluation rubric (size / types / maintenance / migration / license / a11y) applies to any frontend stack; size-check tools differ.
Measure bundle impact first
npx vite-bundle-visualizer against a real installTypeScript support quality
any underneath?Maintenance health
Migration cost (build vs borrow) — and exit cost
decision-records)'use client' and drags its subtree into the client bundle (see bundle-optimization)4b. License compatibility
4c. Accessibility (for UI libraries)
prefers-reduced-motion honored?4d. Security & supply-chain
npm audit / Snyk / OSV against the candidate and its transitive treesecurity-audit)POC code
Document decision in ADR (see decision-records skill for template — MADR recommended for 3+ alternatives)
Streaming UI patterns
ReadableStream → render progressively via useChat or custom reader3-state handling for AI responses
Human-in-the-loop gates
Graceful degradation
Cost monitoring
Trust boundary: protect what goes in, distrust what comes out
dangerouslySetInnerHTML it or run it as code/SQL without validation — it's an XSS/RCE vector like any user inputsecurity-audit)Pin the model + add an eval harness
test-strategy)npm audit/OSV clean, maintainer trust); Server Component compatibility verifiedpoc/<library-or-feature-name> with realistic usage (not toy example), benchmark script in scripts/poc-benchmark-<name>.tsdocs/evaluations/<library-or-feature>-YYYY-MM-DD.md with sections:
## Bundle impact (Bundlephobia + actual analyzer numbers)## TypeScript support (first-class / @types / @ts-ignore needed)## Maintenance health (last release, contributors, issue response)## Migration cost (estimated hours)## License compatibility (MIT / Apache / GPL / etc.)## A11y (if UI library)## Verdict (adopt / reject / re-evaluate in N months)docs/adr/ADR-NNN-adopt-<library>.md (or reject-) — even rejections deserve documentationnpx vite-bundle-visualizernpm audit / OSV-Scanner / Snyk; check the candidate's transitive tree'use client'? Client-only libs inflate the client bundle (cross-check bundle-optimization)npx @next/codemod)ReadableStream + useChat (Vercel AI SDK) or custom readertest-strategy)next build warns on deprecated APIsng update handles major version upgrades automatically (best-in-class)ReadableStream is a web standard — adapter pattern works across stacks; differences are in how the framework's streaming primitives consume itdecision-records — every adoption decision results in an ADR; weigh adoption as a one-way vs two-way doorbundle-optimization — size delta + Server Component compatibility are key adoption gatessecurity-audit — dependency supply-chain risk and treating AI output as untrustedtest-strategy — eval harness for non-deterministic AI featuresdevelopment
Audit and optimize third-party scripts — analytics, tag managers, chat widgets, embeds — with the right loading strategy, performance budget, facades, and CSP/consent controls. Use when adding a script, when TBT/INP regress, when a GDPR/CCPA consent requirement arises, or before shipping. Not for first-party bundle size (use bundle-optimization) or broad Core Web Vitals diagnosis (use rendering-performance).
development
Apply the Testing Trophy (mostly integration tests with RTL + MSW, sparing E2E with Playwright) and set coverage thresholds. Use before new feature work, after bug fixes, when CI coverage falls below target, or when tests are flaky or break on every refactor. Not for wiring coverage gates + Playwright into the GitHub Actions matrix (use cicd-pipeline) or auditing WCAG a11y compliance (use accessibility-audit).
development
Inventory and prioritize technical debt — TODO/FIXME/HACK, any usage, deprecated APIs, untested logic — with impact × effort matrix. Use at quarter start, before a refactoring sprint, when a new teammate joins, or when feature velocity slows. Not for actually paying down debt (use code-refactoring) or recording a migration approach (use decision-records) — this only inventories and prioritizes.
development
Decision framework for choosing the right state location — URL, server cache, local component, or shared/global store. Use when state-sync bugs appear, prop drilling gets deep (3+ levels), filters/tabs lose state on reload, or quarterly review. Not for form state specifically (use form-ux) or when the state is actually server data (use api-caching-optimization).