.ai-rulez/skills/wasm-constraints/SKILL.md
wasm constraints
npx skillsauth add kreuzberg-dev/kreuzberg .ai-rulez/skills/wasm-constraints
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
WASM target in crates/kreuzberg-wasm/. Uses wasm-bindgen with sync-only internal APIs.
[features]
wasm-target = ["pdf", "html", "xml", "email", "language-detection", "chunking", "quality", "office"]
wasm-threads = ["dep:wasm-bindgen-rayon"] # Optional
All operations must be synchronous internally. Use #[cfg(not(feature = "tokio-runtime"))] paths.
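A minimal sketch of how such a feature gate might look. The function names `decode_text` and `extract_text` are hypothetical and not part of Kreuzberg; only the `tokio-runtime` feature name comes from the text above. The sync variant is the one compiled when the feature is off, as it would be in a WASM build.

```rust
// Shared synchronous core (hypothetical helper). Both builds call this.
fn decode_text(content: &[u8]) -> String {
    String::from_utf8_lossy(content).into_owned()
}

// Sync path: compiled whenever `tokio-runtime` is disabled, e.g. for WASM.
#[cfg(not(feature = "tokio-runtime"))]
pub fn extract_text(content: &[u8]) -> String {
    decode_text(content)
}

// Native path: compiled only when `tokio-runtime` is enabled.
#[cfg(feature = "tokio-runtime")]
pub async fn extract_text(content: &[u8]) -> String {
    decode_text(content)
}

fn main() {
    // Only the sync variant can be called without an executor.
    #[cfg(not(feature = "tokio-runtime"))]
    {
        println!("{}", extract_text(b"hello wasm"));
    }
}
```

Keeping the real work in one synchronous function means the WASM build and the native build cannot drift apart; the async variant is only a thin wrapper.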
Every WASM-compatible extractor MUST implement SyncExtractor:
impl SyncExtractor for MyExtractor {
    fn extract_sync(&self, content: &[u8], mime_type: &str, config: &ExtractionConfig)
        -> Result<ExtractionResult> { /* sync implementation */ }
}
impl DocumentExtractor for MyExtractor {
    fn as_sync_extractor(&self) -> Option<&dyn SyncExtractor> {
        Some(self) // MUST return Some for WASM
    }
}
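The reason `as_sync_extractor` must return `Some` can be illustrated with a self-contained sketch. All types here are simplified, hypothetical stand-ins for the Kreuzberg ones, and the `None` default, `extract_on_wasm`, `PlainTextExtractor`, `AsyncOnlyExtractor`, and `ExtractError` are assumptions made for illustration. The sketch shows a caller that can only reach an extractor through its sync path.

```rust
// Simplified, hypothetical stand-ins for the real Kreuzberg types.
#[derive(Debug, Default)]
pub struct ExtractionConfig;

#[derive(Debug, PartialEq)]
pub struct ExtractionResult {
    pub content: String,
    pub mime_type: String,
}

#[derive(Debug, PartialEq)]
pub enum ExtractError {
    NoSyncPath,
}

pub type Result<T> = std::result::Result<T, ExtractError>;

pub trait SyncExtractor {
    fn extract_sync(&self, content: &[u8], mime_type: &str, config: &ExtractionConfig)
        -> Result<ExtractionResult>;
}

pub trait DocumentExtractor {
    // Assumed default: an extractor has no sync path unless it opts in.
    fn as_sync_extractor(&self) -> Option<&dyn SyncExtractor> {
        None
    }
}

pub struct PlainTextExtractor;

impl SyncExtractor for PlainTextExtractor {
    fn extract_sync(&self, content: &[u8], mime_type: &str, _config: &ExtractionConfig)
        -> Result<ExtractionResult> {
        Ok(ExtractionResult {
            content: String::from_utf8_lossy(content).into_owned(),
            mime_type: mime_type.to_string(),
        })
    }
}

impl DocumentExtractor for PlainTextExtractor {
    fn as_sync_extractor(&self) -> Option<&dyn SyncExtractor> {
        Some(self) // opts in to the sync path
    }
}

// An extractor that never opts in; it is unusable from the sync-only caller.
pub struct AsyncOnlyExtractor;
impl DocumentExtractor for AsyncOnlyExtractor {}

// Hypothetical WASM-side dispatch: only the sync path is available.
pub fn extract_on_wasm(
    extractor: &dyn DocumentExtractor,
    content: &[u8],
    mime_type: &str,
    config: &ExtractionConfig,
) -> Result<ExtractionResult> {
    let sync = extractor.as_sync_extractor().ok_or(ExtractError::NoSyncPath)?;
    sync.extract_sync(content, mime_type, config)
}

fn main() {
    let config = ExtractionConfig::default();
    println!("{:?}", extract_on_wasm(&PlainTextExtractor, b"hi", "text/plain", &config));
    println!("{:?}", extract_on_wasm(&AsyncOnlyExtractor, b"hi", "text/plain", &config));
}
```

Under these assumptions, an extractor that leaves `as_sync_extractor` returning `None` fails at dispatch time rather than at compile time, which is why the rule is stated as a MUST.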
const MAX_HTML_SIZE: usize = 2 * 1024 * 1024; // 2MB - stack constraint
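One way such a limit might be enforced is a guard that rejects oversized input before any parsing starts. This is a sketch only: `check_html_size` and `SizeError` are hypothetical names, and treating the limit as inclusive is an assumption.

```rust
const MAX_HTML_SIZE: usize = 2 * 1024 * 1024; // 2 MiB, from the constant above

// Hypothetical error type for illustration.
#[derive(Debug, PartialEq)]
pub enum SizeError {
    TooLarge { size: usize, limit: usize },
}

// Reject input larger than the limit before handing it to the HTML parser.
// Assumption: input of exactly MAX_HTML_SIZE bytes is still accepted.
pub fn check_html_size(content: &[u8]) -> Result<(), SizeError> {
    if content.len() > MAX_HTML_SIZE {
        Err(SizeError::TooLarge { size: content.len(), limit: MAX_HTML_SIZE })
    } else {
        Ok(())
    }
}

fn main() {
    let small = vec![b'a'; 1024];
    let large = vec![b'a'; MAX_HTML_SIZE + 1];
    println!("small: {:?}", check_html_size(&small));
    println!("large: {:?}", check_html_size(&large));
}
```

Checking the length up front keeps the failure cheap and explicit, instead of letting a deep parse exhaust the limited WASM stack.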
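import init, { initialize_pdfium_render } from './kreuzberg_wasm.js';
// Load the Kreuzberg WASM module first.
const wasm = await init();
// pdfiumModule is assumed to be the PDFium module factory, loaded separately (its import is not shown here).
const pdfium = await pdfiumModule();
initialize_pdfium_render(pdfium, wasm, false); // REQUIRED for PDF; the last argument is the debug flag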
[lib]
crate-type = ["cdylib", "rlib"]
[profile.release.package.kreuzberg-wasm]
opt-level = "z" # Size optimization
codegen-units = 1
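#[wasm_bindgen]
pub async fn extract_from_bytes(
    content: Vec<u8>,
    mime_type: String, // assumed parameter: the sync call below needs a MIME type
    config: JsValue,
) -> Result<JsValue, JsValue> {
    let config: ExtractionConfig = serde_wasm_bindgen::from_value(config)?;
    // The export is async for JS callers, but extraction itself runs synchronously.
    let result = extract_bytes_sync(&content, &mime_type, &config)?;
    Ok(serde_wasm_bindgen::to_value(&result)?)
}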
Functions can be async for JS compatibility, but internal extraction is sync.