.claude/skills/rule-document-extractor/SKILL.md
Rule mapping for document-extractor
Apply this rule whenever work touches:
- libs/shared/document-extractor/**/*.ts
- libs/shared/text-extractor/**/*.ts

The document extractor framework separates concerns between parsing (extracting fields from documents) and evaluation (deciding which fields matter). Parsers are deliberately field-neutral.
Parsers implement DocumentParser<T> and extract as many fields as possible without enforcing which ones are required:

    import type { DocumentParser, NonEmptyString } from '@carrot-fndn/shared/document-extractor';

    export class InvoiceParser implements DocumentParser<InvoiceFields> {
      parse(rawText: string, layoutId: NonEmptyString): Partial<InvoiceFields> {
        return {
          invoiceNumber: this.extractInvoiceNumber(rawText), // string | undefined
          issueDate: this.extractDate(rawText), // string | undefined
          totalAmount: this.extractAmount(rawText), // number | undefined
        };
      }
    }

Every field in the return type is optional. If extraction fails for a field, return undefined.
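
To make the split between parsing and evaluation concrete, here is a minimal, self-contained sketch. It does not use the framework's real types: the `InvoiceFields` shape, the regular expressions, and the `parseInvoice` and `missingFields` helpers are all assumptions made for illustration.

```typescript
// Hypothetical field set; the real InvoiceFields type is not shown in this document.
interface InvoiceFields {
  invoiceNumber: string;
  issueDate: string;
  totalAmount: number;
}

// Field-neutral parsing: extract whatever is present and leave the rest undefined.
// The patterns below are illustrative only.
function parseInvoice(rawText: string): Partial<InvoiceFields> {
  const invoiceNumber = /Invoice\s*#?\s*(\w[\w-]*)/i.exec(rawText)?.[1];
  const issueDate = /(\d{4}-\d{2}-\d{2})/.exec(rawText)?.[1];
  const amountText = /Total:\s*(\d+(?:\.\d+)?)/i.exec(rawText)?.[1];

  return {
    invoiceNumber,
    issueDate,
    totalAmount: amountText === undefined ? undefined : Number(amountText),
  };
}

// Evaluation lives outside the parser: the caller decides which fields are required.
function missingFields<T extends object>(
  fields: Partial<T>,
  required: ReadonlyArray<keyof T>,
): Array<keyof T> {
  return required.filter((key) => fields[key] === undefined);
}

const parsed = parseInvoice('Invoice #INV-42 issued 2024-03-01');
const missing = missingFields<InvoiceFields>(parsed, ['invoiceNumber', 'totalAmount']);
```

In this sketch the parser never rejects the document for lacking a total; the evaluator reports `totalAmount` as missing, and a different evaluator could require a different set of fields from the same parse result.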
The reviewRequired flag signals that a human should verify the extraction. It must be triggered only by extraction quality signals, such as a low layout match score or low field confidence, never by business rules:

    // Correct - confidence-based review trigger
    const reviewRequired = matchScore < 0.5 || fields.some(f => f.confidence < THRESHOLD);

    // Wrong - business rule in parser
    const reviewRequired = !fields.totalAmount || fields.totalAmount < 100;

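
The confidence-based trigger can be sketched as a small standalone function. The `ExtractedField` shape, the `isReviewRequired` name, and the `0.8` field threshold are assumptions; only the `0.5` match-score cutoff comes from the snippet above.

```typescript
// Hypothetical shape: the framework's real field type is not shown in this document.
interface ExtractedField {
  name: string;
  confidence: number; // 0..1, how sure the extractor is about this field
}

const MIN_MATCH_SCORE = 0.5; // same cutoff as the snippet above
const MIN_FIELD_CONFIDENCE = 0.8; // assumed value standing in for THRESHOLD

// Review depends only on extraction quality, never on what the values mean.
function isReviewRequired(matchScore: number, fields: readonly ExtractedField[]): boolean {
  const weakLayoutMatch = matchScore < MIN_MATCH_SCORE;
  const weakField = fields.some((field) => field.confidence < MIN_FIELD_CONFIDENCE);
  return weakLayoutMatch || weakField;
}

const confidentFields: ExtractedField[] = [
  { name: 'invoiceNumber', confidence: 0.97 },
  { name: 'totalAmount', confidence: 0.91 },
];
const shakyFields: ExtractedField[] = [
  { name: 'invoiceNumber', confidence: 0.97 },
  { name: 'totalAmount', confidence: 0.42 },
];

const cleanExtraction = isReviewRequired(0.9, confidentFields); // false
const lowConfidence = isReviewRequired(0.9, shakyFields); // true
const poorLayoutMatch = isReviewRequired(0.3, confidentFields); // true
```

Note that the function never inspects field values, so a check such as "total below 100" has nowhere to live here and stays on the evaluation side.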
Layout IDs and layout names use the NonEmptyString type. Cast string literals explicitly:

    import type { NonEmptyString } from '@carrot-fndn/shared/document-extractor';

    const LAYOUT_ID = 'invoice-standard-v2' as NonEmptyString;

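
For layout IDs that are not literals (for example, values read from configuration), a runtime check before the cast may be safer. The sketch below is an assumption throughout: it declares a local stand-in for `NonEmptyString` as a branded string, since the package's actual definition is not shown here, and `toNonEmptyString` is a hypothetical helper, not part of the package.

```typescript
// Local stand-in so the sketch is self-contained; the real type comes from
// '@carrot-fndn/shared/document-extractor' and may be defined differently.
type NonEmptyString = string & { readonly __brand: 'NonEmptyString' };

// Hypothetical helper: validate at runtime, then cast.
function toNonEmptyString(value: string): NonEmptyString {
  if (value.trim().length === 0) {
    throw new TypeError('Expected a non-empty string');
  }
  return value as NonEmptyString;
}

// Literals can be cast directly, as described above.
const LAYOUT_ID = 'invoice-standard-v2' as NonEmptyString;

// Dynamic values go through the check first.
const layoutFromConfig = toNonEmptyString('receipt-basic-v1');
```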
Only register default layouts in defaults.ts for document types that are stable and well-established. Experimental or rapidly evolving formats should be configured at the processor level instead.
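
One way to picture the split between shared defaults and processor-level configuration is the sketch below. It does not reflect the actual contents or API of defaults.ts: `DEFAULT_LAYOUTS`, `resolveLayouts`, and the document type and layout names other than `invoice-standard-v2` are hypothetical.

```typescript
type LayoutMap = Readonly<Record<string, readonly string[]>>;

// Hypothetical shared defaults: only stable, well-established document types.
const DEFAULT_LAYOUTS: LayoutMap = {
  invoice: ['invoice-standard-v2'],
};

// Processor-level layouts take precedence; shared defaults are the fallback.
function resolveLayouts(documentType: string, processorLayouts: LayoutMap = {}): readonly string[] {
  return processorLayouts[documentType] ?? DEFAULT_LAYOUTS[documentType] ?? [];
}

// A stable type resolves from the shared defaults.
const invoiceLayouts = resolveLayouts('invoice');

// An experimental format is configured by the processor that needs it.
const experimentalLayouts = resolveLayouts('experimental-receipt', {
  'experimental-receipt': ['experimental-receipt-beta-v1'],
});
```

Keeping experimental layouts out of the shared map means a format that changes often only affects the processor that opted into it.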