Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

jaykim88/data-validation

Name: data-validation
Author: jaykim88

plugins/backend-toolkit/skills/data-validation/SKILL.md

npx skillsauth add jaykim88/claude-ai-engineering data-validation


Data Validation (Parse, Don't Validate)

Purpose

Validate untrusted input exactly once, at the system boundary, and convert it into a typed value that downstream code can treat as statically guaranteed to be valid. Stop passing raw, unparsed data inward and re-checking it everywhere.

Universal — the "parse, don't validate" principle (validate at the boundary, return a typed parsed value) applies to any typed language; only the validation library differs.

Procedure

1. Identify every trust boundary
- HTTP request body / query / params / headers
- Queue / event payloads
- External API responses (don't trust them either)
- File contents, env vars
2. Define a schema per boundary input
- Co-locate with the endpoint or in a shared schema module
- The schema IS the documentation of what's accepted
3. Parse, don't validate: return a typed value, not a boolean
- const data = Schema.parse(raw) → data is now a typed, guaranteed-valid object
- Anti-pattern: if (isValid(raw)) { use raw as any }; downstream still sees untyped data
- Use safeParse at boundaries to convert failures into a 400 response, not a thrown 500
4. Validate at the boundary ONLY; trust inward
- Once parsed, inner functions take the typed value and never re-validate
- Re-validation everywhere = noise + drift; the type system carries the guarantee
5. Coerce and normalize during parse
- Trim strings, coerce numeric query params, normalize emails/dates
- Output of parse should be canonical form, ready to use
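
The parse, trust-inward, and normalize steps above can be sketched without any schema library. This is a minimal illustration, not the skill's own code: the CreateUser shape, the field rules, and the function names are all hypothetical, and parseCreateUser is a hand-rolled stand-in for what Schema.safeParse would do.

```typescript
// Hypothetical boundary input for a "create user" endpoint (illustrative only).
type CreateUser = { email: string; age: number };

type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; errors: string[] };

// Hand-rolled stand-in for a schema library's safeParse: unknown in, typed value out.
function parseCreateUser(raw: unknown): ParseResult<CreateUser> {
  if (typeof raw !== "object" || raw === null) {
    return { success: false, errors: ["body: expected an object"] };
  }
  // Field values stay `unknown` here, so each one still needs a runtime check.
  const body = raw as Record<string, unknown>;
  const errors: string[] = [];

  // Normalize during parse: trim and lowercase the email.
  const email = typeof body.email === "string" ? body.email.trim().toLowerCase() : "";
  if (!/^[^@\s]+@[^@\s]+$/.test(email)) {
    errors.push("email: expected an email address");
  }

  // Coerce short digit-only strings (e.g. from query params) to numbers.
  const rawAge = body.age;
  const coerced =
    typeof rawAge === "string" && /^\d{1,3}$/.test(rawAge.trim()) ? Number(rawAge.trim()) : rawAge;
  let age: number | undefined;
  if (typeof coerced === "number" && Number.isInteger(coerced) && coerced >= 0) {
    age = coerced;
  } else {
    errors.push("age: expected a non-negative integer");
  }

  if (errors.length > 0 || age === undefined) return { success: false, errors };
  return { success: true, data: { email, age } };
}

// Inner code accepts only the parsed type and never re-validates.
function describeUser(user: CreateUser): string {
  return `${user.email} (${user.age})`;
}

// Boundary: map parse failures to a 400 instead of letting them throw.
function handleCreateUser(raw: unknown): { status: number; body: unknown } {
  const result = parseCreateUser(raw);
  if (!result.success) return { status: 400, body: { errors: result.errors } };
  return { status: 201, body: describeUser(result.data) };
}
```

Because describeUser takes CreateUser rather than unknown, the compiler rejects any call that skips the parse step, which is the guarantee the procedure relies on.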

5b. Cap size and bound dangerous types at the boundary

- Payload size limit at the HTTP layer (e.g. a 1 MB body cap by default; raise per endpoint when justified): a 10 GB JSON request can exhaust server memory before the schema parser is ever reached
- String and array length caps in the schema, e.g. Schema.string().max(N), .array().max(N), to prevent ReDoS and pathological allocations
- Numbers: distinguish int vs float; use BigInt for IDs / counters that may exceed Number.MAX_SAFE_INTEGER; reject NaN / Infinity
- Dates / timezones: parse into a canonical UTC Date / instant; reject ambiguous local time without a zone. Date-string handling is a frequent source of silent data corruption
- Regex: avoid user-supplied regex or unbounded *? patterns (ReDoS); use a regex library with a timeout, or pre-validated patterns
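
The schema-level bounds above might look like the following dependency-free sketch. The cap values, function names, and the exact timestamp format accepted are assumptions chosen for illustration; a schema library's .max(N) and date helpers would normally replace them.

```typescript
// Hypothetical caps; real limits depend on the endpoint.
const MAX_TAGS = 20;
const MAX_TAG_LENGTH = 50;

// Reject over-long strings before doing any other work on them.
function parseBoundedString(raw: unknown, maxLength: number): string | null {
  if (typeof raw !== "string" || raw.length > maxLength) return null;
  return raw.trim();
}

// Cap the array length first, then parse each element with its own cap.
function parseTags(raw: unknown): string[] | null {
  if (!Array.isArray(raw) || raw.length > MAX_TAGS) return null;
  const tags: string[] = [];
  for (const item of raw) {
    const tag = parseBoundedString(item, MAX_TAG_LENGTH);
    if (tag === null || tag === "") return null;
    tags.push(tag);
  }
  return tags;
}

// Accept only integers a JS number represents exactly; NaN, Infinity and floats are rejected.
function parseSafeInteger(raw: unknown): number | null {
  return typeof raw === "number" && Number.isSafeInteger(raw) ? raw : null;
}

// Assumed wire format: ISO 8601 date-time with an explicit zone (Z or +HH:MM / -HH:MM).
// The pattern has a fixed shape and no nested quantifiers.
const ZONED_TIMESTAMP = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{3})?(Z|[+-]\d{2}:\d{2})$/;

// Parse to an instant; strings without a zone are rejected rather than guessed at.
function parseInstant(raw: unknown): Date | null {
  if (typeof raw !== "string" || !ZONED_TIMESTAMP.test(raw)) return null;
  const instant = new Date(raw);
  return Number.isNaN(instant.getTime()) ? null : instant;
}
```

IDs that can exceed Number.MAX_SAFE_INTEGER would need a separate string-to-BigInt parser, which is omitted here. The HTTP-layer body cap is also out of scope for this sketch because it belongs in the server or framework configuration.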

6. Validate (validation loop)
- Send malformed input to each boundary; verify a clean 400 (not a 500 or silent acceptance)
- If invalid data reaches business logic / DB → the boundary schema is incomplete; fix and re-test
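
One way to run the validation loop is a small harness that feeds malformed inputs to a boundary handler and reports every case that does not come back as a 400. The handler signature, the sample handler, and the input list below are hypothetical; a real check would call the actual endpoint.

```typescript
type BoundaryResponse = { status: number };
type BoundaryHandler = (raw: unknown) => BoundaryResponse;

// Returns a description of every malformed input that did NOT produce a 400.
function findBoundaryLeaks(handler: BoundaryHandler, malformed: unknown[]): string[] {
  const leaks: string[] = [];
  for (const input of malformed) {
    let got: number;
    try {
      got = handler(input).status;
    } catch {
      got = 500; // an uncaught exception would surface to the client as a 500
    }
    if (got !== 400) leaks.push(`${JSON.stringify(input)} -> ${got}`);
  }
  return leaks;
}

// Hypothetical boundary: accepts { quantity: positive integer } and nothing else.
function handleSetQuantity(raw: unknown): BoundaryResponse {
  if (typeof raw !== "object" || raw === null) return { status: 400 };
  const quantity = (raw as Record<string, unknown>).quantity;
  if (typeof quantity !== "number" || !Number.isInteger(quantity) || quantity < 1) {
    return { status: 400 };
  }
  return { status: 200 };
}

// An empty result means every malformed case was rejected cleanly.
const quantityLeaks = findBoundaryLeaks(handleSetQuantity, [
  null,
  "text",
  {},
  { quantity: "3" },
  { quantity: -1 },
  { quantity: 1.5 },
  { quantity: NaN },
]);
```

A non-empty result points at the schema gap to fix before re-running the loop.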

Anti-patterns

| ❌ Anti-pattern | ✅ Correct |
|---|---|
| if (isValid(x)) { use x } (boolean check) | const parsed = Schema.parse(x) (typed value) |
| Re-validating the same data in 5 inner functions | Parse once at boundary, trust the type inward |
| Trusting external API responses without parsing | Parse external responses too; they're untrusted |
| body as RequestType (type assertion, no runtime check) | Runtime parse that produces the type |
| Throwing 500 on bad input | safeParse → 400 with field errors |
| No request-body size cap (10 GB JSON OOMs the server) | Body-size limit at HTTP layer + .max() in schema |
| Unbounded string / array fields | .max(N) in the schema |
| Storing user input as a local-time date with no zone | Parse to canonical UTC; reject ambiguous local-time strings |
| User-supplied or unbounded regex | Pre-validated patterns + execution timeout |

Completion Criteria

[ ] Every trust boundary has a parse step
[ ] Parse returns typed values (no "as" type assertions on raw input)
[ ] Malformed input returns 400 (verified), never 500 or silent acceptance
[ ] No re-validation of already-parsed data in inner layers

Output

Schema modules: one per boundary input, shared where reused
Boundary parse code: safeParse → 400 mapping
Commit format: feat(validation): parse <endpoint> input at boundary

Implementation

TypeScript + NestJS (default)

Zod schemas + safeParse at the controller boundary, OR NestJS class-validator DTOs with ValidationPipe
Zod for shared client/server schemas (pairs with frontend-toolkit form-ux)
Map ZodError → RFC 9457 400 response in a global filter
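
The error-mapping step could be sketched as a pure function that a global filter calls. The Issue type below is an assumed structural subset of a schema library's issue objects (a path plus a message, which is the shape Zod exposes on ZodError.issues); the problem type URI, the title, and the errors extension member are illustrative choices, not part of the original skill.

```typescript
// Assumed minimal issue shape: where the problem is, and what it is.
type Issue = { path: ReadonlyArray<string | number>; message: string };

// RFC 9457 problem details plus a hypothetical "errors" extension member.
type ProblemDetails = {
  type: string;
  title: string;
  status: number;
  detail: string;
  errors: { pointer: string; detail: string }[];
};

// Build a JSON Pointer fragment (RFC 6901 escaping: "~" -> "~0", "/" -> "~1").
function toPointer(path: ReadonlyArray<string | number>): string {
  const segments = path.map((seg) => "/" + String(seg).replace(/~/g, "~0").replace(/\//g, "~1"));
  return "#" + segments.join("");
}

// Convert parse issues into a 400 problem-details body.
function toProblemDetails(issues: ReadonlyArray<Issue>): ProblemDetails {
  return {
    type: "https://example.com/problems/validation-error", // hypothetical problem type URI
    title: "Request validation failed",
    status: 400,
    detail: `${issues.length} field(s) failed validation.`,
    errors: issues.map((issue) => ({ pointer: toPointer(issue.path), detail: issue.message })),
  };
}
```

The filter would send this body with status 400 and the application/problem+json content type. Keeping the mapping pure makes it testable without starting the framework.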

Other stacks

Python / FastAPI: Pydantic v2 models (parsing is built into the framework — request body → typed model)
Go: go-playground/validator on structs; or parse into typed structs explicitly
Universal: "parse don't validate" is a principle (validate at boundary → typed value), implementable in any typed language

Related skills

api-contract — the contract's request schema is the validation schema
backend-security-audit — input validation is the first injection defense
authentication — validate token claims as untrusted input

Reference

Key insight encoded: Validate once at the trust boundary and return a typed parsed value (not a boolean) so downstream code is statically guaranteed valid.
Caveat: Alexis King's essay "Parse, don't validate" is Haskell-flavored; the principle is universal but the examples are FP. Pair with Zod docs for the concrete TypeScript landing.

Validate untrusted input once at the trust boundary and return a typed parsed value (parse, don't validate). Use when adding an endpoint, accepting external input, or when invalid data leaks past the boundary into business logic. Not for defining the API contract/schema itself (use api-contract) or downstream business-rule logic — parse only at the trust boundary.

development

Updated Jun 9, 2026

$ install --global

skillsauth

npx skillsauth add jaykim88/claude-ai-engineering data-validation

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy: Container and dependency vulnerability scanner. Clean (95%)
- Semgrep: Static code analysis for vulnerabilities. Clean (95%)
- mcp-scan (Snyk): Model Context Protocol security validation. Clean (95%)
- Snyk (dep): Open source security scanning. Skipped (50%)
- Socket.dev: Supply chain security analysis. Skipped (50%)
- VirusTotal: Multi-engine malware detection. Skipped (50%)
- CrowdStrike: Advanced threat intelligence. Skipped (50%)
- OSV-Scanner: Open Source Vulnerability database check. Skipped (50%)
- OWASP Dep-Check: Skipped (50%)

Last scanned: Jun 9, 2026, 8:25 AM (134.4s, 1 file scanned)

SKILL.md

name: data-validation
description: Validate untrusted input once at the trust boundary and return a typed parsed value (parse, don't validate). Use when adding an endpoint, accepting external input, or when invalid data leaks past the boundary into business logic. Not for defining the API contract/schema itself (use api-contract) or downstream business-rule logic; parse only at the trust boundary.
license: MIT

Related Skills

jaykim88/webhook-design
Category: development
Design webhooks correctly on both sides: sending (HMAC signing, retries with backoff, at-least-once) and receiving (verify signature on raw body, enqueue + 200 fast, dedupe on event id). Use when adding webhook delivery or consuming a provider's webhooks. Not for internal service-to-service events (use async-messaging) or general outbound-call retry policy (use resilience-patterns).
SKILL.md, updated Jun 9, 2026

jaykim88/transaction-management
Category: testing
Use transactions and isolation levels correctly: keep them short, no network calls inside, explicit isolation, retry on serialization conflicts, and choose optimistic vs pessimistic locking. Use when a write spans multiple tables, when concurrent updates corrupt data, or when designing money/inventory flows. Not for cross-service event delivery (use async-messaging Outbox) or schema-level constraints (use schema-design).
SKILL.md, updated Jun 9, 2026

jaykim88/test-strategy
Category: development
Backend testing pyramid: unit for pure logic, integration against a real DB (Testcontainers), and consumer-driven contract testing (Pact) for service boundaries. Use before a feature, after a bug fix, or when services break each other on deploy. Not for load testing (use performance-profiling) or security testing (use backend-security-audit).
SKILL.md, updated Jun 9, 2026

jaykim88/schema-design
Category: data-ai
Design a relational schema: normalize to 3NF then denormalize with justification, choose the right Postgres index type per data shape, enforce constraints at the DB. Use when modeling a new domain, when queries are slow, or before a migration. Not for diagnosing slow queries (use query-optimization) or shipping the change without downtime (use migration-strategy).
SKILL.md, updated Jun 9, 2026

Install manually

# Clone the repo
git clone https://github.com/jaykim88/claude-ai-engineering.git

# Copy into Claude Code skills folder (global)
cp -r claude-ai-engineering/plugins/backend-toolkit/skills/data-validation ~/.claude/skills/

Repository

jaykim88/claude-ai-engineering

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT