.remote-cache/kreuzberg-shared-rules/.ai-rulez/skills/security-and-vulnerability-management/SKILL.md
---
priority: critical
---

# Security & Vulnerability Management
CVE disclosure · cargo-audit · cargo-deny · Fuzzing · Unsafe code review · Security testing
## Vulnerability Management

- **cargo-audit**: Run on every commit and in CI; fail build on known vulnerabilities

  ```bash
  cargo audit
  cargo audit --deny warnings  # Fail on advisories
  ```

- **cargo-deny**: Detect vulnerable, unmaintained, and unlicensed dependencies

  ```bash
  cargo deny check advisories
  cargo deny check bans
  cargo deny check sources
  ```
- **CVE Disclosure Policy**:
  - `SECURITY.md` with patch version and workarounds

```toml
# Cargo.toml
[dependencies]
serde = "=1.0.150"  # Pin critical deps to known safe versions
serde_json = "=1.0.89"
```

```yaml
# Run lock file checks in CI
# .github/workflows/security.yml
- name: Check lock file
  run: cargo tree --locked  # Ensure lock matches Cargo.toml
```
- `cargo install cargo-fuzz && cargo fuzz init`
- `fuzz/fuzz_targets/` directory; one target per public API surface

```rust
// fuzz/fuzz_targets/parse_fuzzer.rs
#![no_main]
use libfuzzer_sys::fuzz_target;
use my_crate::parse;

fuzz_target!(|data: &[u8]| {
    // Parse should never crash or panic on arbitrary input
    let _ = parse(data);
});
```
```bash
# Run fuzzing
cargo +nightly fuzz run parse_fuzzer -- -max_len=1024 -timeout=5
# Minimize the corpus (to shrink a single failing input, use `cargo fuzz tmin` with the artifact path)
cargo +nightly fuzz cmin parse_fuzzer
```
- **SAFETY comments**: EVERY `unsafe` block MUST have a `// SAFETY:` comment explaining:
- **Isolation**: Unsafe code in dedicated modules; public API must be safe
- **Review checklist**:
```rust
/// Copies `len` bytes starting at `ptr` into a new `Vec<u8>`.
///
/// # Safety
///
/// This function assumes the pointer points to valid, initialized memory
/// for at least `len` bytes. Caller MUST ensure the pointer is:
/// - Non-null and properly aligned
/// - Pointing to valid, initialized `u8` values
/// - Valid for reads of at least `len` bytes
/// - Not aliased by any mutable reference for the duration of the call
unsafe fn copy_from_raw(ptr: *const u8, len: usize) -> Vec<u8> {
    // SAFETY: the caller upholds the contract documented above
    unsafe { std::slice::from_raw_parts(ptr, len) }.to_vec()
}

// Calling code MUST validate preconditions
// SAFETY: C function guarantees ptr is valid for buffer_len bytes
let data = unsafe { copy_from_raw(c_buffer, buffer_len) };
```
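
The isolation rule above (unsafe code in a dedicated module, safe public API) might look like the following minimal sketch. The module name `raw` and the wrapper `copy_bytes` are hypothetical names chosen for illustration; they are not part of the original rules.

```rust
/// Hypothetical private module that confines all raw-pointer handling.
mod raw {
    /// Copies `len` bytes starting at `ptr` into a new `Vec<u8>`.
    ///
    /// # Safety
    ///
    /// `ptr` must be non-null and valid for reads of `len` initialized bytes.
    pub unsafe fn copy_from_raw(ptr: *const u8, len: usize) -> Vec<u8> {
        // SAFETY: guaranteed by this function's documented contract.
        unsafe { std::slice::from_raw_parts(ptr, len) }.to_vec()
    }
}

/// Safe public entry point (hypothetical). Taking a slice means the type
/// system already guarantees a non-null, aligned pointer to `len`
/// initialized bytes, so callers never write `unsafe` themselves.
pub fn copy_bytes(input: &[u8]) -> Vec<u8> {
    // SAFETY: `input.as_ptr()` and `input.len()` come from a live slice,
    // so the pointer is valid for reads of exactly that many bytes.
    unsafe { raw::copy_from_raw(input.as_ptr(), input.len()) }
}
```

With this shape, a review of unsafe code only has to cover the `raw` module and the one wrapper that calls into it.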
- **No panics on untrusted input**: All public functions should return `Result`, never panic on bad input
- **Test adversarial inputs**:
- **Input validation**: Validate all untrusted input at boundaries (FFI, network, files)
- **Property-based testing**: Use `proptest` for fuzz-like testing in normal tests
```rust
#[cfg(test)]
mod security_tests {
    use proptest::proptest;

    #[test]
    fn parse_never_panics_on_arbitrary_input() {
        // Feed 1000 random byte strings of varying length
        for _ in 0..1000 {
            let len = rand::random::<u8>() as usize;
            let random_bytes: Vec<u8> = (0..len).map(|_| rand::random::<u8>()).collect();
            // Ok and Err are both acceptable; the test fails only if parse panics
            let _ = my_crate::parse(&random_bytes);
        }
    }

    #[test]
    fn handles_max_size_input() {
        let huge_input = vec![0u8; 1024 * 1024 * 100]; // 100MB
        // Must return (Ok or Err) without panicking
        let _ = my_crate::parse(&huge_input);
    }

    proptest! {
        #[test]
        fn parse_property_never_panics(input in ".*") {
            let _ = my_crate::parse(input.as_bytes());
        }
    }
}
```
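
As an illustration of the "return `Result`, never panic" and "validate at boundaries" rules above, here is a minimal sketch of what a panic-free `parse` could look like. The `ParseError` type, the `MAX_INPUT_LEN` limit, and the choice to treat the input as UTF-8 are all assumptions made for this example; the real `my_crate::parse` is not shown in the source.

```rust
use std::fmt;

/// Hypothetical upper bound on accepted input size (1 MiB), assumed for this sketch.
const MAX_INPUT_LEN: usize = 1024 * 1024;

/// Hypothetical error type; every rejection path is a value, not a panic.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    Empty,
    TooLarge { len: usize, max: usize },
    InvalidUtf8,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::Empty => write!(f, "input is empty"),
            ParseError::TooLarge { len, max } => {
                write!(f, "input is {len} bytes, limit is {max}")
            }
            ParseError::InvalidUtf8 => write!(f, "input is not valid UTF-8"),
        }
    }
}

impl std::error::Error for ParseError {}

/// Stand-in for a public parsing entry point: validates untrusted bytes
/// at the boundary and reports problems through `Result`.
pub fn parse(data: &[u8]) -> Result<&str, ParseError> {
    if data.is_empty() {
        return Err(ParseError::Empty);
    }
    if data.len() > MAX_INPUT_LEN {
        return Err(ParseError::TooLarge { len: data.len(), max: MAX_INPUT_LEN });
    }
    // No unwrap(): a decoding failure becomes an error value for the caller.
    std::str::from_utf8(data).map_err(|_| ParseError::InvalidUtf8)
}
```

Checking the size before decoding keeps the cost of rejecting oversized input constant, which matters when the input comes from the network or a file of unknown origin.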
- `cargo update` but within version constraints
- `cargo audit` and `cargo deny`

```toml
# deny.toml
[advisories]
vulnerability = "deny"
unmaintained = "warn"
yanked = "warn"
notice = "warn"
ignore = []

[bans]
multiple-versions = "warn"
wildcards = "warn"
allow = []
deny = [
    { name = "openssl", version = "<0.10" },
    { name = "parking_lot", version = "*" }  # Use std instead
]

[sources]
unknown-registry = "warn"
unknown-git = "warn"
allow-registry = ["https://github.com/rust-lang/crates.io-index"]
allow-git = []
```
- `unwrap()` on parsed/network data will crash in production

- `cargo audit` passes (no known vulns)
- `cargo deny check` passes (licenses, sources)
- `cargo test --release` passes (including security tests)