.ai-rulez/skills/config-loading-precedence/SKILL.md
config loading precedence
npx skillsauth add kreuzberg-dev/kreuzberg .ai-rulez/skills/config-loading-precedence
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
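Extraction config precedence (highest first):
1. Individual CLI flags (--ocr, --output-format, --chunk)
2. Inline JSON (--config-json or --config-json-base64)
3. Explicit config file (--config path.toml)
4. Auto-discovered config file (kreuzberg.{toml,yaml,json} in cwd/parents)

Server config precedence (highest first):
1. CLI flags (--host, --port)
2. Environment variables (KREUZBERG_HOST, KREUZBERG_PORT)
3. Config file [server] section
4. Defaults (127.0.0.1:8000)

Auto-discovery:
Searches the current directory and its parents for kreuzberg.toml, kreuzberg.yaml, or kreuzberg.json. Stops at the first match.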
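The upward search for a config file can be sketched as follows. This is a minimal illustration using only the standard library, not Kreuzberg's actual code: the function name `discover_config` and the order in which the three file names are tried within a single directory are assumptions.

```rust
use std::path::{Path, PathBuf};

// File names taken from the text above; the order in which they are
// checked inside one directory is an assumption.
const CONFIG_NAMES: [&str; 3] = ["kreuzberg.toml", "kreuzberg.yaml", "kreuzberg.json"];

/// Hypothetical helper: walk from `start` up through its parent
/// directories and return the first config file found.
fn discover_config(start: &Path) -> Option<PathBuf> {
    // `ancestors()` yields `start` itself first, then each parent in turn.
    for dir in start.ancestors() {
        for name in CONFIG_NAMES {
            let candidate = dir.join(name);
            if candidate.is_file() {
                // Stop at the first match; directories further up are ignored.
                return Some(candidate);
            }
        }
    }
    None
}
```

Calling `discover_config(&std::env::current_dir()?)` would start the search at the working directory. A file in a nested directory shadows one further up the tree because the walk returns as soon as it finds a match.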
Field-level merge (not whole-object replacement):
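fn merge_json_into_config(base: &ExtractionConfig, json: Value) -> Result<ExtractionConfig> {
    // Serialize the base config so individual fields can be overridden.
    let mut config_json = serde_json::to_value(base)?;
    // Merge fields from json into config_json
    // Deserialize the merged value back into a typed config.
    Ok(serde_json::from_value(config_json)?)
}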
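The merge step itself is elided in the snippet above. One common way to get field-level behaviour is a recursive merge over a JSON-like tree, sketched here with a small stand-in `Value` enum so the example needs only the standard library. The enum, `merge_fields`, and `object` are hypothetical names for illustration; the real code presumably works on `serde_json::Value`, and its merge rules may differ.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// Hypothetical stand-in for a JSON value; not a Kreuzberg or serde_json type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Str(String),
    Object(BTreeMap<String, Value>),
}

/// Recursively merge `overrides` into `base`.
/// Objects are merged key by key; any other value replaces the base value.
fn merge_fields(base: &mut Value, overrides: Value) {
    match (base, overrides) {
        (Value::Object(base_map), Value::Object(override_map)) => {
            for (key, value) in override_map {
                match base_map.entry(key) {
                    // Key exists in the base: descend so sibling fields survive.
                    Entry::Occupied(mut slot) => merge_fields(slot.get_mut(), value),
                    // Key is new: insert it as-is.
                    Entry::Vacant(slot) => {
                        slot.insert(value);
                    }
                }
            }
        }
        // Scalars, or mismatched kinds: the override wins outright.
        (slot, value) => *slot = value,
    }
}

/// Convenience constructor for object values.
fn object(fields: Vec<(&str, Value)>) -> Value {
    Value::Object(fields.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}
```

With this approach, an override that sets only a nested `backend` field leaves the other fields of the same object untouched, which is the difference between field-level merging and whole-object replacement.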
Use --config-json-base64 instead of --config-json to avoid shell-escaping problems when passing inline JSON.
TOML (kreuzberg.toml):
use_cache = true
[ocr]
backend = "tesseract"
languages = ["eng", "deu"]
[security_limits]
max_archive_size = 524288000
YAML and JSON follow equivalent structure.
In commands.rs: apply_extraction_overrides() applies individual flags on top of merged config.
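The idea of applying individual flags on top of an already merged config can be sketched as below. This is not the body of `apply_extraction_overrides()`; the struct names, fields, and the function `apply_flag_overrides` are hypothetical and only mirror the flags named earlier (--ocr, --output-format, --chunk).

```rust
/// Hypothetical merged config; the fields are illustrative, not Kreuzberg's.
#[derive(Debug, Clone, PartialEq)]
struct ExampleConfig {
    ocr: bool,
    output_format: String,
    chunk: bool,
}

/// Hypothetical parsed CLI flags; `None` means the flag was not given.
#[derive(Debug, Default)]
struct ExampleFlags {
    ocr: Option<bool>,
    output_format: Option<String>,
    chunk: Option<bool>,
}

/// Overwrite only the fields for which a flag was actually passed.
fn apply_flag_overrides(mut config: ExampleConfig, flags: ExampleFlags) -> ExampleConfig {
    if let Some(ocr) = flags.ocr {
        config.ocr = ocr;
    }
    if let Some(format) = flags.output_format {
        config.output_format = format;
    }
    if let Some(chunk) = flags.chunk {
        config.chunk = chunk;
    }
    config
}
```

Modelling each flag as an `Option` keeps "not passed" distinct from "passed with a falsy value", so an absent flag never clobbers a value that came from a config file or inline JSON.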
--config-json-base64 for shell-safe JSON passing
[server] section + extraction config
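Server settings follow the same layering: CLI flags, then KREUZBERG_HOST and KREUZBERG_PORT, then the [server] section, then the 127.0.0.1:8000 default. A minimal sketch of that resolution, assuming each source has already been parsed into optional values (the `ServerSource` type and `resolve_server` function are hypothetical):

```rust
/// Hypothetical holder for the server settings read from one source.
#[derive(Debug, Clone, Default)]
struct ServerSource {
    host: Option<String>,
    port: Option<u16>,
}

/// Resolve host and port by precedence: CLI, then environment,
/// then config file, then the documented default 127.0.0.1:8000.
fn resolve_server(cli: ServerSource, env: ServerSource, file: ServerSource) -> (String, u16) {
    let host = cli
        .host
        .or(env.host)
        .or(file.host)
        .unwrap_or_else(|| "127.0.0.1".to_string());
    let port = cli.port.or(env.port).or(file.port).unwrap_or(8000);
    (host, port)
}
```

Host and port are resolved independently, so a --port flag can combine with a host taken from the environment or the config file.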