packages/mcp-server/skills/mbc-debug/SKILL.md
Debug and troubleshoot MBC CQRS Serverless applications. Use this when encountering errors, investigating issues, or optimizing performance in MBC CQRS Serverless projects.
npx skillsauth add mbc-net/mbc-cqrs-serverless mbc-debug
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Before executing this skill, check for updates:
- mbc install-skills --check to check if a newer version is available
- mbc install-skills --force to update

Note: Skip this check if the user explicitly says to skip updates or if you've already checked in this session.
This skill helps debug and troubleshoot issues in MBC CQRS Serverless applications.
| Error Code | Category | Quick Fix |
|------------|----------|-----------|
| MBC-CMD-001 | Command | Check pk/sk format |
| MBC-CMD-002 | Command | Verify version field |
| MBC-CMD-003 | Command | Check command type |
| MBC-DDB-001 | DynamoDB | Verify table exists |
| MBC-DDB-002 | DynamoDB | Check IAM permissions |
| MBC-DDB-003 | DynamoDB | Resolve version conflict |
| MBC-TNT-001 | Tenant | Verify tenantCode |
| MBC-TNT-002 | Tenant | Check tenant isolation |
| MBC-IMP-001 | Import | Validate CSV format |
| MBC-IMP-002 | Import | Check S3 permissions |
Symptom: Command not being processed
Debug Steps:
// Step 1: Enable verbose logging
const logger = new Logger('CommandDebug');
// Step 2: Log the command before publishing
// (NestJS Logger treats a trailing string argument as the log context,
// so the payload is interpolated into a single message.)
logger.debug(`Publishing command: ${JSON.stringify(command, null, 2)}`);
// Step 3: Wrap in try-catch with detailed logging
try {
  const result = await this.commandService.publishAsync(command, {
    source: commandSource,
    invokeContext,
  });
  logger.debug(`Command result: ${JSON.stringify(result, null, 2)}`);
  return result;
} catch (error) {
  logger.error('Command failed:', {
    error: error.message,
    name: error.name,
    command: { pk: command.pk, sk: command.sk, type: command.type },
  });
  throw error;
}
Common Causes:
Symptom: Version conflict error
Debug Steps:
// Step 1: Check current version
const existing = await this.dataService.getItem({ pk, sk });
console.log('Current version:', existing?.version); // undefined if the item does not exist yet
// Step 2: Compare with command version
console.log('Command version:', command.version);
// Step 3: Investigate concurrent modifications
// Check CloudWatch logs for other requests modifying the same item
Solution:
// Always fetch the latest version before updating
async update(key: DetailKey, dto: UpdateDto, invokeContext: IInvoke) {
  // Retry logic for version conflicts
  const maxRetries = 3;
  let retries = 0;
  while (retries < maxRetries) {
    try {
      const existing = await this.dataService.getItem(key);
      return await this.commandService.publishPartialUpdateAsync({
        pk: key.pk,
        sk: key.sk,
        version: existing.version, // Always use latest version
        ...dto,
      }, { source: commandSource, invokeContext });
    } catch (error) {
      if (error.name === 'ConditionalCheckFailedException') {
        retries++;
        if (retries >= maxRetries) {
          throw new ConflictException('Too many concurrent modifications');
        }
        // Wait before retry
        await new Promise(resolve => setTimeout(resolve, 100 * retries));
        continue;
      }
      throw error;
    }
  }
}
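The retry loop above can also be factored into a reusable helper. The following is a minimal sketch, not part of the framework: `retryOnConflict`, `isConflict`, and the option names are hypothetical, and it assumes the conflict surfaces as an error whose `name` is `ConditionalCheckFailedException`, as in the snippet above. Unlike the snippet above, it rethrows the original error once the retry budget is spent instead of wrapping it in a `ConflictException`.

```typescript
// Hypothetical helper (not a framework API): retries an operation when it
// fails with a version conflict, waiting a little longer before each retry.
type RetryOptions = { maxRetries: number; baseDelayMs: number };

const isConflict = (error: unknown): boolean =>
  error instanceof Error && error.name === 'ConditionalCheckFailedException';

async function retryOnConflict<T>(
  operation: () => Promise<T>,
  { maxRetries, baseDelayMs }: RetryOptions = { maxRetries: 3, baseDelayMs: 100 },
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      // The operation should re-read the latest version on every attempt.
      return await operation();
    } catch (error) {
      // Rethrow non-conflict errors immediately, and conflicts once
      // maxRetries attempts have been made.
      if (!isConflict(error) || attempt >= maxRetries) throw error;
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * attempt));
    }
  }
}
```

The callback passed as `operation` would contain the `getItem` call followed by `publishPartialUpdateAsync`, so that every attempt picks up the current version.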
Symptom: Data not syncing to RDS/Elasticsearch
Debug Checklist:
□ Handler is decorated with @DataSyncHandler({ type: 'ENTITY' })
□ Type matches the command's type field exactly
□ Handler is registered in module's dataSyncHandlers array
□ Handler implements IDataSyncHandler interface
□ up() method is async and properly awaited
Debug Steps:
// Step 1: Add logging to handler
@DataSyncHandler({ type: 'ORDER' })
export class OrderDataSyncRdsHandler implements IDataSyncHandler {
  private readonly logger = new Logger(OrderDataSyncRdsHandler.name);
  async up(cmd: CommandModel, data: DataModel): Promise<void> {
    this.logger.log(`DataSync triggered for ${data.id}`);
    // Interpolate the payload: a trailing string argument would be read as the log context
    this.logger.debug(`Command: ${JSON.stringify(cmd, null, 2)}`);
    this.logger.debug(`Data: ${JSON.stringify(data, null, 2)}`);
    // Your sync logic
  }
}
// Step 2: Verify type matching
// In your command:
const command = new OrderCommandDto({
  type: 'ORDER', // Must match @DataSyncHandler({ type: 'ORDER' })
});
// Step 3: Check module registration
CommandModule.register({
  tableName: 'order',
  dataSyncHandlers: [OrderDataSyncRdsHandler], // Must be included!
}),
Symptom: Cross-tenant data leakage or access denied
Debug Steps:
// Step 1: Log tenant context
const { tenantCode, userId } = getUserContext(invokeContext);
console.log('Tenant context:', { tenantCode, userId });
// Step 2: Verify PK includes tenant code
const pk = `ORDER#${tenantCode}`;
console.log('Generated PK:', pk);
// Step 3: Check data query includes tenant filter
const results = await this.dataService.listByPk({
pk: `ORDER#${tenantCode}`, // Tenant-scoped query
});
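Step 2 can be turned into a small guard that runs before any query. This is a sketch under one assumption: partition keys follow the `<ENTITY>#<tenantCode>` shape used in the steps above. The function name `isTenantScopedPk` is hypothetical and not a framework API.

```typescript
// Hypothetical guard (not a framework API). Assumes partition keys follow the
// `<ENTITY>#<tenantCode>` shape used in the debug steps.
function isTenantScopedPk(pk: string, tenantCode: string): boolean {
  // The segment right after the first '#' is expected to be the tenant code.
  const [, tenantSegment] = pk.split('#');
  return tenantCode.length > 0 && tenantSegment === tenantCode;
}
```

Logging or throwing when this returns false makes a missing or mismatched tenant segment visible at the call site rather than in the query results.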
Common Causes:
Symptom: Cannot read properties of null when accessing result of publishSync()
Cause: Since v1.2.0, publishSync() and publishPartialUpdateSync() return null when the command is not dirty (no changes detected — no-op).
Debug Steps:
// Step 1: Check if command has actual changes
const result = await this.commandService.publishSync(entity, options)
console.log('publishSync result:', result) // null = no-op (not dirty)
// Step 2: Always null-check before accessing properties
if (!result) {
  // Command was a no-op; the item is already up to date
  return existingItem
}
console.log('Published:', result.pk)
Correct Pattern:
const result = await this.commandService.publishSync(command, options)
if (!result) {
  // No changes were made, so return the existing data
  return await this.dataService.getItem({ pk, sk })
}
return result
Symptom (pre-v1.2.2): Valid CSV rows are never processed because one bad row causes the entire SQS batch to crash and retry indefinitely.
Root Cause: A persistent validation error on any row (especially row 1) caused CsvBatchProcessor to throw immediately, blocking all subsequent rows until DLQ threshold was reached.
Status: Fixed in v1.2.2 via Smart Retry pattern — each row is now processed in an independent try/catch.
If still on < v1.2.2, workaround:
// Pre-validate all rows before processing to identify bad rows
const validRows = []
const invalidRows = []
for (const row of rows) {
  try {
    await strategy.compare(row, tenantCode)
    validRows.push(row)
  } catch {
    invalidRows.push(row)
  }
}
// Log invalid rows, then process only valid ones
Upgrade recommendation: Update to v1.2.2+ to get automatic per-row error isolation.
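To illustrate what per-row error isolation means, here is a minimal sketch of the idea. It is not the framework's `CsvBatchProcessor`: the names `processRowsIndependently`, `processRow`, and `RowFailure` are hypothetical, and the actual Smart Retry implementation may differ.

```typescript
// Illustrative sketch of per-row error isolation; this is NOT the framework's
// CsvBatchProcessor. `processRow` is a hypothetical callback.
type RowFailure<R> = { index: number; row: R; reason: string };

async function processRowsIndependently<R>(
  rows: R[],
  processRow: (row: R) => Promise<void>,
): Promise<{ succeeded: number; failures: RowFailure<R>[] }> {
  let succeeded = 0;
  let index = 0;
  const failures: RowFailure<R>[] = [];
  for (const row of rows) {
    try {
      await processRow(row);
      succeeded++;
    } catch (error) {
      // Record the failure and keep going so later rows are not blocked.
      const reason = error instanceof Error ? error.message : String(error);
      failures.push({ index, row, reason });
    }
    index++;
  }
  return { succeeded, failures };
}
```

Because each row has its own try/catch, a persistent validation error on one row ends up in `failures` while the remaining rows are still processed.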
Symptom: App crashes at startup with:
Nest can't resolve dependencies of MyTaskService (?).
Please make sure that the argument TASK_QUEUE_EVENT_FACTORY at index [0] is available in the MyModule context.
Cause: Since v1.2.4, TaskModule.register() is global and must be called exactly once in the host AppModule. MasterModule no longer registers it internally.
Fix:
import { TaskModule, TaskQueueEventFactory } from '@mbc-cqrs-serverless/master'
@Module({
  imports: [
    // Register ONCE here in AppModule
    TaskModule.register({ taskQueueEventFactory: MyTaskQueueEventFactory }),
    MasterModule.register({ enableController: true, prismaService: PrismaService }),
    // Remove TaskModule.register() from all feature modules!
  ],
})
export class AppModule {}
Also check: "transformTask is not a function" at runtime indicates multiple TaskModule.register() calls creating conflicting bindings.
Symptom: Session entry deleted from {NODE_ENV}-{APP_NAME}-session table before RYW_SESSION_TTL_MINUTES elapses.
Cause (intentional, since v1.2.6): Repository proactively purges sessions once the data table catches up (existing.version >= session.version). Once your write is visible in the data table, the session is no longer needed — keeping it would cause stale overrides when external updates arrive.
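The purge rule can be expressed as a single comparison. The sketch below only illustrates the condition stated above (`existing.version >= session.version`); the `Versioned` type and `shouldPurgeSession` name are hypothetical, and the real Repository code may handle more cases.

```typescript
// Hypothetical illustration of the purge rule described above; the real
// Repository implementation may differ.
type Versioned = { version: number };

function shouldPurgeSession(existing: Versioned | null, session: Versioned): boolean {
  // Purge only once the data table has caught up with the session's write.
  return existing !== null && existing.version >= session.version;
}
```

With a session at version 3, a data-table item at version 2 keeps the session in place, while an item at version 3 or later triggers the purge.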
How to verify the purge succeeded normally:
// Check application logs. Absence of this warning means the delete succeeded:
// "Failed to delete RYW session (non-fatal): ..."
//
// You can also enable Repository debug logs to see the purge decision:
// logger.debug(`getItem session merge — version ${session.version}`, key)
When it's a real problem: If sessions disappear and Repository.getItem still returns stale data, the issue is upstream — DynamoDB Streams sync may be delayed. Check IteratorAge on the DynamoDB Streams source; a sustained non-zero value indicates the Stream → IDataSyncHandler pipeline is falling behind:
# Inspect Stream lag (high IteratorAge = sync is behind = data table stays stale)
# Note: `date -v-1H` is BSD/macOS syntax; with GNU date (Linux) use:
#   date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S
aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name IteratorAge \
  --dimensions Name=FunctionName,Value=your-data-sync-handler \
  --start-time $(date -u -v-1H +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 60 --statistics Maximum
Related: If you maintain RDS read models, supply mergeOptions.getVersion in listItems() to skip the extra DynamoDB round-trip when the RDS row already carries the latest version (see migration guide v1.2.6).
Symptom: Import job fails or gets stuck
Debug Checklist:
□ CSV file format is valid
□ S3 bucket permissions are correct
□ Step Functions execution has proper IAM role
□ Lambda timeout is sufficient
□ Memory allocation is adequate
Debug Steps:
// Step 1: Check Step Functions execution
// AWS Console → Step Functions → Executions → View details
// Step 2: Check Lambda logs
// AWS Console → CloudWatch → Log groups → /aws/lambda/import-handler
// Step 3: Validate CSV format locally
import * as fs from 'fs';
import * as csv from 'csv-parse';
const validateCsv = async (filePath: string) => {
  // Stream the file through the parser; each row becomes an object keyed by header
  const parser = fs.createReadStream(filePath).pipe(csv.parse({
    columns: true,
    skip_empty_lines: true,
  }));
  let rowCount = 0;
  const errors: string[] = [];
  for await (const row of parser) {
    rowCount++;
    // Validate required fields
    if (!row.code) errors.push(`Row ${rowCount}: missing code`);
    if (!row.name) errors.push(`Row ${rowCount}: missing name`);
  }
  return { rowCount, errors };
};
Symptom: Slow API responses
Debug Areas:
// Bad: Scanning entire table
const results = await this.dataService.scan(); // Avoid!
// Good: Query by partition key
const results = await this.dataService.listByPk({
  pk: `ORDER#${tenantCode}`,
  limit: 20,
});
// Better: Add GSI for common queries
// Check if GSI exists for your query pattern
// Bad: N+1 queries
const orders = await this.dataService.listByPk({ pk });
for (const order of orders) {
  const customer = await this.customerService.findOne(order.customerId); // N queries!
}
// Good: Batch fetch
const orders = await this.dataService.listByPk({ pk });
const customerIds = [...new Set(orders.map(o => o.customerId))];
const customers = await this.customerService.findByIds(customerIds); // 1 query
const customerMap = new Map(customers.map(c => [c.id, c]));
# Enable provisioned concurrency for critical functions
# serverless.yml
functions:
  api:
    handler: dist/lambda.handler
    provisionedConcurrency: 2
fields @timestamp, @message
| filter @requestId = "REQUEST_ID_HERE"
| sort @timestamp asc
fields @timestamp, @message
| filter @message like /ConditionalCheckFailedException/
| sort @timestamp desc
| limit 100
fields @timestamp, @duration, @message
| filter @duration > 3000
| sort @duration desc
| limit 50
fields @timestamp, @message
| filter @message like /DataSync/
| sort @timestamp desc
| limit 100
Start LocalStack:
docker-compose up -d localstack
Verify Services:
# Check DynamoDB
aws --endpoint-url=http://localhost:4566 dynamodb list-tables
# Check S3
aws --endpoint-url=http://localhost:4566 s3 ls
# Check SQS
aws --endpoint-url=http://localhost:4566 sqs list-queues
# Start with debug logging
DEBUG=* npm run offline
# Or enable specific debug namespaces
DEBUG=serverless:* npm run offline
// .vscode/launch.json
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "node",
      "request": "launch",
      "name": "Debug Serverless Offline",
      "runtimeExecutable": "npm",
      "runtimeArgs": ["run", "offline"],
      "port": 9229,
      "restart": true,
      "console": "integratedTerminal",
      "internalConsoleOptions": "neverOpen"
    }
  ]
}
npm list @mbc-cqrs-serverless/core
npm list @mbc-cqrs-serverless/cli
aws dynamodb describe-table --table-name YOUR_TABLE_NAME
aws lambda get-function-configuration --function-name YOUR_FUNCTION_NAME
aws logs tail /aws/lambda/YOUR_FUNCTION_NAME --since 1h
Error Occurred
│
├── Is it a TypeScript compilation error?
│ ├── "genNewSequence is not a function"?
│ │ └── Removed in v1.2.0 — use generateSequenceItem() instead
│ │
│ └── Check import statements and type definitions
│
├── Is it a startup crash?
│ ├── "Nest can't resolve dependencies of MyTaskService"?
│ │ └── v1.2.4: Add TaskModule.register() to AppModule (see §7)
│ │
│ └── Check module imports and provider registration
│
├── Is it a runtime error?
│ ├── "Cannot read properties of null" after publishSync?
│ │ └── v1.2.0+: publishSync returns null on no-op — add null check (see §5)
│ │
│ ├── ConditionalCheckFailedException?
│ │ └── Version mismatch - fetch latest version before updating
│ │
│ ├── "transformTask is not a function"?
│ │ └── v1.2.4: Multiple TaskModule.register() calls — keep only one in AppModule
│ │
│ ├── ResourceNotFoundException?
│ │ └── Table/Item doesn't exist - check table name
│ │
│ ├── ValidationError?
│ │ └── Check DTO validation decorators
│ │
│ └── Unknown error?
│ └── Check CloudWatch logs for stack trace
│
├── Is it a silent failure?
│ ├── CSV import rows silently blocked (pre-v1.2.2)?
│ │ └── Poison Pill problem — upgrade to v1.2.2+ (see §6)
│ │
│ ├── DataSyncHandler not running?
│ │ └── Check type matching and registration
│ │
│ ├── Event not received?
│ │ └── Check SNS/SQS configuration
│ │
│ └── Command not processing?
│ └── Check Lambda invocation and DLQ
│
└── Is it a performance issue?
├── Cold start?
│ └── Enable provisioned concurrency
│
├── Slow queries?
│ └── Add GSI or optimize query pattern
│
└── Memory issues?
└── Increase Lambda memory allocation
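For triaging log lines in bulk, the named-error branches of the tree can be encoded as a lookup. This is a hypothetical sketch: `HINTS` and `suggestFix` are not framework exports, the patterns and hints simply mirror the tree above, and the `TASK_QUEUE_EVENT_FACTORY` pattern is taken from the startup error message shown earlier in this guide.

```typescript
// Hypothetical lookup built from the decision tree above; the patterns and
// hints mirror this guide and are not emitted by the framework itself.
const HINTS: Array<{ pattern: RegExp; hint: string }> = [
  { pattern: /genNewSequence is not a function/, hint: 'Removed in v1.2.0: use generateSequenceItem() instead' },
  { pattern: /TASK_QUEUE_EVENT_FACTORY/, hint: 'v1.2.4: add TaskModule.register() to AppModule' },
  { pattern: /Cannot read properties of null/, hint: 'v1.2.0+: publishSync returns null on no-op, add a null check' },
  { pattern: /ConditionalCheckFailedException/, hint: 'Version mismatch: fetch latest version before updating' },
  { pattern: /transformTask is not a function/, hint: 'v1.2.4: keep only one TaskModule.register() call, in AppModule' },
  { pattern: /ResourceNotFoundException/, hint: 'Table or item does not exist: check table name' },
];

function suggestFix(message: string): string {
  // First matching pattern wins; unknown errors fall through to the log check.
  const match = HINTS.find(({ pattern }) => pattern.test(message));
  return match ? match.hint : 'Check CloudWatch logs for stack trace';
}
```

Calling `suggestFix(error.message)` inside a catch block gives a first pointer; it does not replace walking the tree for silent failures or performance issues, which have no error message to match.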
Symptom: appsync-event transport is configured but clients receive no events.
Checklist:
Verify NOTIFICATION_TRANSPORTS includes appsync-event
echo $NOTIFICATION_TRANSPORTS
# Expected: appsync-event or appsync-graphql,appsync-event
Verify APPSYNC_EVENTS_ENDPOINT is set and correct
# Must end with /event, not /graphql
echo $APPSYNC_EVENTS_ENDPOINT
# Expected: https://xxxx.appsync-api.ap-northeast-1.amazonaws.com/event
Check IAM permissions (most common cause)
# Lambda/ECS execution role must have appsync:EventPublish
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::ACCOUNT:role/YOUR_LAMBDA_ROLE \
  --action-names appsync:EventPublish \
  --resource-arns "arn:aws:appsync:REGION:ACCOUNT:apis/API_ID/channelNamespace/*"
If using CDK, appSyncEventsApi.grantPublish(lambdaRole) handles this automatically.
Verify APPSYNC_EVENTS_NAMESPACE matches the ChannelNamespace in AppSync
echo $APPSYNC_EVENTS_NAMESPACE
# Expected: default (or whatever namespace was created by CDK)
The value must match a pre-created ChannelNamespace in the AppSync Event API. Check the AWS Console → AppSync → your Event API → Channel Namespaces.
Check for 400 errors in CloudWatch
A 400 response from AppSync means the channel path is invalid. Channel segments must be alphanumeric + dashes only, max 50 chars each. The framework sanitizes tenantCode, action, and id automatically.
Confirm dual-publish is intentional
If NOTIFICATION_TRANSPORTS=appsync-graphql,appsync-event, the framework publishes to both transports. A failure in one transport causes the entire publish to fail — check CloudWatch for errors from either service.
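Some of the checks above can be automated as a pre-flight step. The sketch below is hypothetical (neither `checkAppSyncEventConfig` nor `isValidChannelSegment` is a framework API); it only encodes the rules stated in the checklist: the transport list must include `appsync-event`, the endpoint must end with `/event`, and channel segments must be alphanumeric plus dashes with at most 50 characters. IAM permissions and the namespace match still have to be verified against AWS.

```typescript
// Hypothetical pre-flight check (not a framework API). Variable names and
// rules are taken from the checklist above.
type Env = Record<string, string | undefined>;

function checkAppSyncEventConfig(env: Env): string[] {
  const problems: string[] = [];
  const transports = (env.NOTIFICATION_TRANSPORTS ?? '')
    .split(',')
    .map((t) => t.trim());
  if (!transports.some((t) => t === 'appsync-event')) {
    problems.push('NOTIFICATION_TRANSPORTS does not include appsync-event');
  }
  if (!(env.APPSYNC_EVENTS_ENDPOINT ?? '').endsWith('/event')) {
    problems.push('APPSYNC_EVENTS_ENDPOINT must end with /event');
  }
  return problems;
}

// Channel segments: alphanumeric characters and dashes only, 1 to 50 characters.
const isValidChannelSegment = (segment: string): boolean =>
  /^[A-Za-z0-9-]{1,50}$/.test(segment);
```

Running `checkAppSyncEventConfig(process.env)` at startup and logging the returned list surfaces misconfiguration before the first publish attempt.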
When reporting issues, include:
- Framework version (output of npm list @mbc-cqrs-serverless/core)
- Node.js version (output of node --version)

Resources: