library/specializations/qa-testing-automation/skills/test-data-generation/SKILL.md
Synthetic test data generation and management using Faker.js and similar tools. Generate realistic test data, create data factories, implement database seeding, and manage test data anonymization.
npx skillsauth add a5c-ai/babysitter test-data-generationInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You are test-data-generation - a specialized skill for synthetic test data generation and management, providing capabilities for creating realistic, reproducible test data.
This skill enables AI-powered test data management including:
Generate realistic test data with Faker.js:
import { faker } from '@faker-js/faker';
// Generate user data
const generateUser = () => ({
id: faker.string.uuid(),
email: faker.internet.email(),
firstName: faker.person.firstName(),
lastName: faker.person.lastName(),
phone: faker.phone.number(),
address: {
street: faker.location.streetAddress(),
city: faker.location.city(),
state: faker.location.state(),
zipCode: faker.location.zipCode(),
country: faker.location.country()
},
company: faker.company.name(),
jobTitle: faker.person.jobTitle(),
avatar: faker.image.avatar(),
createdAt: faker.date.past(),
updatedAt: faker.date.recent()
});
// Generate multiple users
const users = faker.helpers.multiple(generateUser, { count: 100 });
Create reusable data factories:
import { faker } from '@faker-js/faker';
// User Factory
class UserFactory {
static defaults = {
id: () => faker.string.uuid(),
email: () => faker.internet.email(),
firstName: () => faker.person.firstName(),
lastName: () => faker.person.lastName(),
role: () => 'user',
isActive: () => true,
createdAt: () => faker.date.past()
};
static create(overrides = {}) {
const defaults = Object.fromEntries(
Object.entries(this.defaults).map(([key, fn]) => [key, fn()])
);
return { ...defaults, ...overrides };
}
static createMany(count, overrides = {}) {
return Array.from({ length: count }, () => this.create(overrides));
}
// Trait methods
static admin(overrides = {}) {
return this.create({ role: 'admin', ...overrides });
}
static inactive(overrides = {}) {
return this.create({ isActive: false, ...overrides });
}
}
// Usage
const user = UserFactory.create();
const admin = UserFactory.admin({ firstName: 'Admin' });
const users = UserFactory.createMany(50);
Using Fishery for typed factories:
import { Factory } from 'fishery';
import { faker } from '@faker-js/faker';
interface User {
id: string;
email: string;
firstName: string;
lastName: string;
role: 'user' | 'admin';
profile: Profile;
}
interface Profile {
bio: string;
avatar: string;
}
const profileFactory = Factory.define<Profile>(() => ({
bio: faker.person.bio(),
avatar: faker.image.avatar()
}));
const userFactory = Factory.define<User>(({ associations, sequence }) => ({
id: faker.string.uuid(),
email: faker.internet.email(),
firstName: faker.person.firstName(),
lastName: faker.person.lastName(),
role: 'user',
profile: associations.profile || profileFactory.build()
}));
// Usage
const user = userFactory.build();
const admin = userFactory.build({ role: 'admin' });
const usersWithProfiles = userFactory.buildList(10, {}, {
associations: { profile: profileFactory.build() }
});
Seed databases with test data:
// seed.js - Database seeding script
import { PrismaClient } from '@prisma/client';
import { faker } from '@faker-js/faker';
const prisma = new PrismaClient();
async function seed() {
// Set seed for reproducibility
faker.seed(12345);
// Clear existing data
await prisma.order.deleteMany();
await prisma.product.deleteMany();
await prisma.user.deleteMany();
// Create users
const users = await Promise.all(
Array.from({ length: 50 }, () =>
prisma.user.create({
data: {
email: faker.internet.email(),
name: faker.person.fullName(),
password: faker.internet.password()
}
})
)
);
// Create products
const products = await Promise.all(
Array.from({ length: 100 }, () =>
prisma.product.create({
data: {
name: faker.commerce.productName(),
description: faker.commerce.productDescription(),
price: parseFloat(faker.commerce.price()),
sku: faker.string.alphanumeric(8).toUpperCase(),
inStock: faker.datatype.boolean()
}
})
)
);
// Create orders
for (const user of users) {
const orderCount = faker.number.int({ min: 1, max: 5 });
for (let i = 0; i < orderCount; i++) {
await prisma.order.create({
data: {
userId: user.id,
status: faker.helpers.arrayElement(['pending', 'processing', 'shipped', 'delivered']),
total: parseFloat(faker.commerce.price({ min: 10, max: 500 })),
items: {
create: faker.helpers.arrayElements(products, { min: 1, max: 5 }).map(p => ({
productId: p.id,
quantity: faker.number.int({ min: 1, max: 3 }),
price: p.price
}))
}
}
});
}
}
console.log('Seeding complete!');
}
seed()
.catch(console.error)
.finally(() => prisma.$disconnect());
Generate edge case test data:
import { faker } from '@faker-js/faker';
const boundaryValues = {
// String boundaries
strings: {
empty: '',
singleChar: 'a',
maxLength: 'a'.repeat(255),
unicode: '日本語テスト',
emoji: '🎉🚀💡',
specialChars: '<script>alert("xss")</script>',
sqlInjection: "'; DROP TABLE users; --",
whitespace: ' spaces ',
newlines: 'line1\nline2\rline3'
},
// Number boundaries
numbers: {
zero: 0,
negative: -1,
maxInt: Number.MAX_SAFE_INTEGER,
minInt: Number.MIN_SAFE_INTEGER,
decimal: 0.1 + 0.2, // Famous floating point issue
infinity: Infinity,
nan: NaN
},
// Date boundaries
dates: {
epochStart: new Date(0),
farPast: new Date('1900-01-01'),
farFuture: new Date('2100-12-31'),
leapYear: new Date('2024-02-29'),
endOfMonth: new Date('2024-01-31'),
timezoneEdge: new Date('2024-03-10T02:30:00') // DST transition
},
// Array boundaries
arrays: {
empty: [],
single: [1],
large: Array.from({ length: 10000 }, (_, i) => i)
}
};
// Generate boundary test cases
function generateBoundaryTestCases(schema) {
const testCases = [];
for (const [field, config] of Object.entries(schema)) {
if (config.type === 'string') {
testCases.push(
{ [field]: '', expected: config.required ? 'error' : 'success' },
{ [field]: 'a'.repeat(config.maxLength + 1), expected: 'error' },
{ [field]: 'a'.repeat(config.maxLength), expected: 'success' }
);
}
if (config.type === 'number') {
testCases.push(
{ [field]: config.min - 1, expected: 'error' },
{ [field]: config.min, expected: 'success' },
{ [field]: config.max, expected: 'success' },
{ [field]: config.max + 1, expected: 'error' }
);
}
}
return testCases;
}
Anonymize production data for testing:
import { faker } from '@faker-js/faker';
import crypto from 'crypto';
const anonymize = {
// Consistent anonymization (same input = same output)
email: (email) => {
const hash = crypto.createHash('md5').update(email).digest('hex').slice(0, 8);
return `user_${hash}@example.com`;
},
// Full replacement
name: () => faker.person.fullName(),
// Partial masking
phone: (phone) => phone.replace(/\d(?=\d{4})/g, '*'),
// Format preservation
creditCard: (cc) => {
const last4 = cc.slice(-4);
return `****-****-****-${last4}`;
},
// Consistent fake data
ssn: (ssn) => {
faker.seed(crypto.createHash('md5').update(ssn).digest('hex'));
return faker.string.numeric('###-##-####');
},
// Address anonymization
address: () => ({
street: faker.location.streetAddress(),
city: faker.location.city(),
state: faker.location.state(),
zip: faker.location.zipCode()
})
};
// Anonymize dataset
function anonymizeDataset(records) {
return records.map(record => ({
...record,
email: anonymize.email(record.email),
name: anonymize.name(),
phone: anonymize.phone(record.phone),
creditCard: record.creditCard ? anonymize.creditCard(record.creditCard) : null,
address: anonymize.address()
}));
}
Generate data in different locales:
import { faker, Faker } from '@faker-js/faker';
import { de, fr, ja, es } from '@faker-js/faker';
// German locale
const fakerDE = new Faker({ locale: [de] });
const germanUser = {
name: fakerDE.person.fullName(),
address: fakerDE.location.streetAddress(),
city: fakerDE.location.city()
};
// Japanese locale
const fakerJA = new Faker({ locale: [ja] });
const japaneseUser = {
name: fakerJA.person.fullName(),
address: fakerJA.location.streetAddress(),
city: fakerJA.location.city()
};
// Generate test data for multiple locales
const locales = { de, fr, ja, es };
function generateMultiLocaleData(count = 10) {
return Object.entries(locales).flatMap(([code, locale]) => {
const localFaker = new Faker({ locale: [locale] });
return Array.from({ length: count }, () => ({
locale: code,
name: localFaker.person.fullName(),
email: localFaker.internet.email(),
phone: localFaker.phone.number(),
address: localFaker.location.streetAddress()
}));
});
}
Create reproducible test data:
import { faker } from '@faker-js/faker';
// Set global seed for reproducibility
faker.seed(12345);
// Generate same data every time
const user1 = faker.person.fullName(); // Always same name
const user2 = faker.person.fullName(); // Always same name
// Reset seed for new sequence
faker.seed(12345);
const user1Again = faker.person.fullName(); // Same as user1
// Environment-based seeding
const testSeed = process.env.TEST_SEED || Date.now();
faker.seed(testSeed);
console.log(`Using seed: ${testSeed}`);
This skill can leverage the following MCP servers for enhanced capabilities:
| Server | Description | Installation | |--------|-------------|--------------| | funsjanssen/faker-mcp | Faker.js MCP Server | GitHub |
This skill integrates with the following processes:
test-data-management.js - All phases of test data handlinge2e-test-suite.js - E2E test data setupapi-testing.js - API test data generationenvironment-management.js - Environment data seedingWhen executing operations, provide structured output:
{
"operation": "generate",
"dataType": "users",
"count": 100,
"seed": 12345,
"locale": "en",
"schema": {
"id": "uuid",
"email": "email",
"name": "fullName"
},
"outputFile": "./test-data/users.json",
"statistics": {
"generated": 100,
"uniqueEmails": 100,
"executionTime": "45ms"
}
}
development
Model documentation skill for generating model cards following Google's model card framework.
development
MLflow integration skill for experiment tracking, model registry, and artifact management. Enables LLMs to log experiments, compare runs, manage model lifecycle, and retrieve artifacts through the MLflow API.
data-ai
LIME-based local explanation skill for individual predictions across tabular, text, and image data.
devops
Kubeflow Pipelines skill for ML workflow orchestration, component management, and Kubernetes-native ML.