jaykim88/new-tech-evaluation

Name: new-tech-evaluation
Author: jaykim88

plugins/frontend-toolkit/skills/new-tech-evaluation/SKILL.md

npx skillsauth add jaykim88/claude-ai-engineering new-tech-evaluation

New Tech and AI Integration Evaluation

Purpose

Decide whether to adopt a new library / framework version / AI capability with evidence — bundle impact, type safety, maintenance health, security, migration cost — not hype. Default to skepticism: proven "boring" tech is the baseline, and each new dependency spends an innovation token — the burden is on the candidate to beat what you'd otherwise write or already use.

Universal — evaluation rubric (size / types / maintenance / migration / license / a11y) applies to any frontend stack; size-check tools differ.

Procedure

New library evaluation

1. Measure bundle impact first
- Primary: Bundlephobia for the candidate
- Fallback when Bundlephobia data is stale or missing (a known issue since ~2024): pkg-size.dev or npx vite-bundle-visualizer against a real install
- Look at min+gzip (what users actually download), NOT raw size
- Check tree-shakability: a 200KB library that tree-shakes to 8KB beats a 30KB monolith
- Gate library adoption on size delta in PR review
2. TypeScript support quality
- First-class TypeScript (types shipped with library) >> @types/* package >> @ts-ignore needed
- Check: do the types actually represent runtime behavior, or are they any underneath?
3. Maintenance health
- Last release date (within the last 6 months)
- Open issue count + median response time
- Number of contributors (bus factor > 1)
- Major-version stability (breaking changes every 6 months = high migration cost)
- Adoption momentum: downloads/week trend (not the absolute number). Is it the de-facto choice or a niche bet? A well-maintained library the community has moved away from is still a risk
4. Migration cost (build vs borrow) and exit cost
- Hours to integrate
- Hours to migrate existing code (if replacing something)
- Hours of ongoing maintenance (updates, breaking-change handling)
- Exit cost, not just entry cost: how hard is it to remove later? A thin wrapper is cheap to swap; something that metastasizes through the codebase (ORM, state lib, styling system) is high lock-in, so weigh that as a one-way-door decision (see decision-records)
- Compatibility: supported React/Node versions, peer-dep conflicts, and Server Component support. A client-only library forces 'use client' and drags its subtree into the client bundle (see bundle-optimization)
- Compare to "just write it ourselves": sometimes 200 lines is cheaper than a dep
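
The size gate in step 1 can be sketched as a small script. This is a minimal illustration, assuming Node.js and that the inputs are already-minified bundle contents; the names gzipSize and checkSizeDelta and the budget value are hypothetical, not part of this skill or of any tool named above.

```typescript
import { gzipSync } from "node:zlib";

// Size in bytes after gzip, i.e. roughly what users download.
// Assumes the input is already minified bundle output.
function gzipSize(source: string | Uint8Array): number {
  return gzipSync(source).byteLength;
}

interface SizeGateResult {
  deltaBytes: number;
  withinBudget: boolean;
}

// Compare the bundle before and after adding the candidate library.
// budgetBytes is a team-chosen threshold (hypothetical parameter).
function checkSizeDelta(
  before: string,
  after: string,
  budgetBytes: number,
): SizeGateResult {
  const deltaBytes = gzipSize(after) - gzipSize(before);
  return { deltaBytes, withinBudget: deltaBytes <= budgetBytes };
}
```

In a PR check, before and after would be the built bundle from the base branch and from the PR branch, and a failing gate would block the merge until the delta is justified.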

4b. License compatibility

Safe for commercial use: MIT, Apache 2.0, BSD-2-Clause, BSD-3-Clause, ISC
Caution required: LGPL (linking restrictions), MPL (file-level copyleft)
Avoid for closed-source products: GPL, AGPL — viral copyleft
Check transitive deps too (a dual-licensed top-level dep can still pull GPL deps)
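
The three license tiers above can be encoded as a lookup for use in a dependency review script. This is a rough sketch, assuming licenses are given as SPDX identifiers; classifyLicense and the LicenseRisk type are hypothetical names, and anything the function does not recognize (including dual-license expressions) is returned as "unknown" for manual review rather than guessed.

```typescript
type LicenseRisk = "safe" | "caution" | "avoid" | "unknown";

// Tiers taken from the list above.
const SAFE_IDS = new Set(["MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC"]);
const CAUTION_PREFIXES = ["LGPL-", "MPL-"];
const AVOID_PREFIXES = ["GPL-", "AGPL-"];

function classifyLicense(spdxId: string): LicenseRisk {
  const id = spdxId.trim();
  if (SAFE_IDS.has(id)) return "safe";
  if (CAUTION_PREFIXES.some((prefix) => id.startsWith(prefix))) return "caution";
  if (AVOID_PREFIXES.some((prefix) => id.startsWith(prefix))) return "avoid";
  // Expressions such as "(MIT OR GPL-3.0-only)" land here on purpose.
  return "unknown";
}
```

Running this over every package in the lockfile, not just the top-level candidate, is what covers the transitive-dependency point.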

4c. Accessibility (for UI libraries)

Keyboard navigation works out of the box?
ARIA attributes correct?
prefers-reduced-motion honored?
Non-negotiable for UI libraries — a "great DX" component that fails a11y becomes tech debt fast

4d. Security & supply-chain

Known vulnerabilities: npm audit / Snyk / OSV against the candidate and its transitive tree
Maintainer trust: recent ownership transfer, a typosquatted name, or a lone unverified maintainer = supply-chain risk
Each dependency is attack surface and install-time code (postinstall scripts) — fewer, well-vetted deps beat many convenient ones (see security-audit)
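
As one illustration of the install-time code concern, a review script could flag packages whose manifest declares install lifecycle hooks. A minimal sketch, assuming the package.json has already been parsed; installHooks and PackageManifest are hypothetical names, and a hit means "read the script", not "reject the package".

```typescript
interface PackageManifest {
  name: string;
  scripts?: Record<string, string>;
}

// npm lifecycle scripts that run automatically during `npm install`.
const INSTALL_HOOKS = ["preinstall", "install", "postinstall"] as const;

// Return the install-time hooks a package declares, if any.
function installHooks(manifest: PackageManifest): string[] {
  const scripts = manifest.scripts ?? {};
  return INSTALL_HOOKS.filter((hook) => hook in scripts);
}
```

Applied across the candidate's transitive tree, this gives a short list of packages whose install scripts deserve a manual look before adoption.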

5. POC code
- Use the candidate in an actual project pattern, not a toy example
- Benchmark against current solution if replacing one
- Document what worked and what didn't
6. Document decision in ADR (see decision-records skill for template; MADR recommended for 3+ alternatives)
- Even rejections deserve an ADR: it saves the team from re-evaluating the same library in 6 months

React / Next.js major upgrade

Read the official migration guide thoroughly
Run the codemod (Next.js ships codemods for major version bumps)
Audit deprecated APIs in build output
Update one feature area at a time, ship incrementally

AI integration evaluation

Streaming UI patterns
- Server Action returns ReadableStream → render progressively via useChat or custom reader
- Loading state shows partial output as it arrives (don't block UI on complete response)
3-state handling for AI responses
- Streaming: progressive render + visible "AI is thinking" indicator
- Complete: final state with regenerate button
- Failed: error message + retry + fallback path
Human-in-the-loop gates
- For high-stakes AI output (financial, legal, medical, code-deploy)
- Always show the AI output for review before applying
- Audit trail: who approved, when, what input produced it
Graceful degradation
- AI API down? App should keep working via non-AI flow
- Never make the AI a single point of failure
- Cache previous AI responses where it makes sense
Cost monitoring
- Track tokens per session, per user
- Alert on cost spikes (often signals a prompt-injection or loop bug)
Trust boundary: protect what goes in, distrust what comes out
- Prompt injection: untrusted content in the prompt can hijack instructions — don't interpolate user/third-party text into a system prompt unguarded
- Treat AI output as untrusted input: never dangerouslySetInnerHTML it or run it as code/SQL without validation — it's an XSS/RCE vector like any user input
- Data governance: user data sent to a third-party API leaves your boundary — scrub PII, check data-retention / training opt-out and region/compliance
- (see security-audit)
Pin the model + add an eval harness
- Pin the model / API version — outputs drift across versions, so an "upgrade" can silently regress your feature
- You can't assert exact strings: build an eval set (golden cases, LLM-as-judge) to catch quality regressions (see test-strategy)
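
The 3-state handling described above maps naturally onto a discriminated union with a reducer. This is a sketch under assumptions: the type and event names (AiResponseState, AiEvent, reduceAiState) are hypothetical, and keeping the partial text in the failed state is a design choice made here so the UI can still show what arrived before the error.

```typescript
type AiResponseState =
  | { status: "streaming"; partial: string }
  | { status: "complete"; text: string }
  | { status: "failed"; error: string; partial: string };

type AiEvent =
  | { type: "chunk"; text: string }
  | { type: "done" }
  | { type: "error"; message: string };

function reduceAiState(state: AiResponseState, event: AiEvent): AiResponseState {
  // Complete and failed are terminal: late events are ignored.
  if (state.status !== "streaming") return state;
  switch (event.type) {
    case "chunk":
      return { status: "streaming", partial: state.partial + event.text };
    case "done":
      return { status: "complete", text: state.partial };
    case "error":
      return { status: "failed", error: event.message, partial: state.partial };
  }
}
```

A component can switch on status to render the progress indicator, the regenerate button, or the retry and fallback path, and the compiler flags any state it forgets to handle.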

Completion Criteria

[ ] POC code exists for the candidate, not just docs reading
[ ] Bundle impact measured (Bundlephobia + actual analyzer)
[ ] Security/supply-chain checked (npm audit/OSV clean, maintainer trust); Server Component compatibility verified
[ ] Exit cost / lock-in assessed, not just integration cost
[ ] Decision documented in ADR (adopt or reject — both deserve documentation)
[ ] For AI integration: 3-state UI implemented, HITL gate for high-stakes output, graceful degradation verified
[ ] For AI: output treated as untrusted (not rendered/executed unsanitized); model version pinned + eval harness in place
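
For the last criterion, one way to treat model output as untrusted when it must be placed into an HTML string is to escape it first. A minimal sketch; escapeHtml is a hypothetical helper, not an API from any library named in this skill. In React, rendering the text as a normal child already escapes it, so a helper like this matters mainly where HTML is assembled by hand.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Escape the characters that let text break out into markup or attributes.
function escapeHtml(untrusted: string): string {
  return untrusted.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}
```

Escaping only covers the HTML-rendering case; output that is run as code or SQL needs its own validation, as the trust-boundary section notes.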

Output

POC code: branch poc/<library-or-feature-name> with realistic usage (not toy example), benchmark script in scripts/poc-benchmark-<name>.ts
Evaluation report: docs/evaluations/<library-or-feature>-YYYY-MM-DD.md with sections:
- ## Bundle impact (Bundlephobia + actual analyzer numbers)
- ## TypeScript support (first-class / @types / @ts-ignore needed)
- ## Maintenance health (last release, contributors, issue response)
- ## Migration cost (estimated hours)
- ## License compatibility (MIT / Apache / GPL / etc.)
- ## A11y (if UI library)
- ## Verdict (adopt / reject / re-evaluate in N months)
Decision ADR: docs/adr/ADR-NNN-adopt-<library>.md (or ADR-NNN-reject-<library>.md); even rejections deserve documentation
AI integration only: streaming UI implementation, HITL gate code, graceful degradation fallback
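
The graceful degradation fallback listed above could look roughly like the wrapper below. It is a sketch, assuming the AI call is exposed as a promise-returning function and that a synchronous non-AI path exists; withFallback, its parameters, and the timeout behavior are hypothetical choices for illustration.

```typescript
interface FallbackResult<T> {
  value: T;
  source: "ai" | "fallback";
}

// Try the AI path; on error or timeout, use the non-AI path instead.
async function withFallback<T>(
  aiCall: () => Promise<T>,
  fallback: () => T,
  timeoutMs: number,
): Promise<FallbackResult<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("AI call timed out")), timeoutMs);
  });
  try {
    const value = await Promise.race([aiCall(), timeout]);
    return { value, source: "ai" };
  } catch {
    // Any failure of the AI path degrades to the non-AI flow.
    return { value: fallback(), source: "fallback" };
  } finally {
    clearTimeout(timer);
  }
}
```

Returning the source alongside the value lets the UI tell users when they are seeing the non-AI result, and lets monitoring count how often the fallback fires.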

Implementation

React + Next.js (default)

Size check: Bundlephobia / pkg-size.dev / npx vite-bundle-visualizer
Security/supply-chain: npm audit / OSV-Scanner / Snyk; check the candidate's transitive tree
Server Component compat: does the lib work without 'use client'? Client-only libs inflate the client bundle (cross-check bundle-optimization)
React/Next.js major upgrade: official codemods (npx @next/codemod)
AI streaming: Server Action returning ReadableStream + useChat (Vercel AI SDK) or custom reader
AI evals: pin the model id; golden-set or LLM-as-judge eval suite for regression (cross-check test-strategy)
Build-time check: next build warns on deprecated APIs
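
The "pin the model id" and golden-set items above can be sketched as a scoring step that runs over outputs already collected from the pinned model. Everything named here is an assumption for illustration: PINNED_MODEL is a placeholder id, and GoldenCase, EvalReport, and scoreOutputs are hypothetical. Each case checks a property of the output instead of an exact string.

```typescript
// Placeholder id; substitute the exact identifier your provider documents.
const PINNED_MODEL = "example-model-2025-01-01";

interface GoldenCase {
  name: string;
  prompt: string;
  // Property check, since exact-string assertions are too brittle.
  check: (output: string) => boolean;
}

interface EvalReport {
  model: string;
  passRate: number;
  failures: string[];
}

// Score outputs keyed by case name; a missing output counts as a failure
// unless the case's check accepts an empty string.
function scoreOutputs(
  model: string,
  cases: GoldenCase[],
  outputs: Record<string, string>,
): EvalReport {
  const failures = cases
    .filter((c) => !c.check(outputs[c.name] ?? ""))
    .map((c) => c.name);
  const passRate =
    cases.length === 0 ? 1 : (cases.length - failures.length) / cases.length;
  return { model, passRate, failures };
}
```

Comparing the pass rate before and after changing the pinned id is what turns a model upgrade into a reviewed change instead of a silent one.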

Other stacks

Vue / Nuxt: Bundlephobia applies as-is; Nuxt has its own codemods for major upgrades
SvelteKit: Bundlephobia applies as-is; SvelteKit publishes migration guides per release
Angular: Bundlephobia applies as-is; Angular's ng update runs the migration schematics that packages ship for major version upgrades (best-in-class), though manual follow-up can still be needed
AI streaming (any framework): ReadableStream is a web standard — adapter pattern works across stacks; differences are in how the framework's streaming primitives consume it
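
Because ReadableStream is a web standard, the "custom reader" can be written once without framework code. A minimal sketch, assuming a runtime with global ReadableStream and TextDecoder (modern browsers, Node 18+) and a stream of UTF-8 bytes; readTextStream and onPartial are hypothetical names.

```typescript
// Consume a byte stream progressively, reporting the text so far after each chunk.
async function readTextStream(
  stream: ReadableStream<Uint8Array>,
  onPartial: (textSoFar: string) => void,
): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let text = "";
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters split across chunks intact.
      text += decoder.decode(value, { stream: true });
      onPartial(text);
    }
    text += decoder.decode(); // flush any buffered bytes
    return text;
  } finally {
    reader.releaseLock();
  }
}
```

The framework-specific part is then only the onPartial callback: a React state setter, a Vue ref assignment, or a Svelte store update.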

Related skills

decision-records — every adoption decision results in an ADR; weigh adoption as a one-way vs two-way door
bundle-optimization — size delta + Server Component compatibility are key adoption gates
security-audit — dependency supply-chain risk and treating AI output as untrusted
test-strategy — eval harness for non-deterministic AI features

Reference

Key insight encoded: Always compare on min+gzip + tree-shakability — a 200KB library that tree-shakes to 8KB beats a 30KB monolith. Gate library adoption on Bundlephobia size delta in PR review, and check Server Component compatibility (client-only libs balloon the bundle). Evaluate exit cost and supply-chain risk, not just entry cost — and default to boring/proven tech. For AI integration, the HITL gate is mandatory for high-stakes output (review + audit trail), AI is never a single point of failure, and crucially: treat AI output as untrusted input (don't render/execute it unsanitized) and pin the model behind an eval harness so version drift doesn't silently regress quality.

Evaluate a new library, framework version, or AI integration with bundle size, TypeScript support, maintenance status, security/supply-chain, and migration + exit cost. POC + benchmark before adoption. Use at quarterly tech review, when a new library could solve a pain point, on a React/Next.js major version release, or on AI API major updates. Not for recording the resulting decision (use decision-records) or shrinking an already-adopted dependency (use bundle-optimization).

development

Updated May 30, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add jaykim88/claude-ai-engineering new-tech-evaluation

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Clean (95%): Trivy, container and dependency vulnerability scanner
Clean (95%): Semgrep, static code analysis for vulnerabilities
Clean (95%): mcp-scan (Snyk), Model Context Protocol security validation
Skipped (50%): Snyk (dep), open source security scanning
Skipped (50%): Socket.dev, supply chain security analysis
Skipped (50%): VirusTotal, multi-engine malware detection
Skipped (50%): CrowdStrike, advanced threat intelligence
Skipped (50%): OSV-Scanner, Open Source Vulnerability database check
Skipped (50%): OWASP Dep-Check

Last scanned: May 30, 2026, 7:51 AM (30.8s, 1 file scanned)

SKILL.md

name: new-tech-evaluation
description: Evaluate a new library, framework version, or AI integration with bundle size, TypeScript support, maintenance status, security/supply-chain, and migration + exit cost. POC + benchmark before adoption. Use at quarterly tech review, when a new library could solve a pain point, on a React/Next.js major version release, or on AI API major updates. Not for recording the resulting decision (use decision-records) or shrinking an already-adopted dependency (use bundle-optimization).
license: MIT

Install manually

# Clone the repo
git clone https://github.com/jaykim88/claude-ai-engineering.git

# Copy into Claude Code skills folder (global)
cp -r claude-ai-engineering/plugins/frontend-toolkit/skills/new-tech-evaluation ~/.claude/skills/

Repository

jaykim88/claude-ai-engineering

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT