.remote-cache/kreuzberg-shared-rules/.ai-rulez/skills/memory-safety-optimization-patterns/SKILL.md
priority: high

# Memory Safety & Optimization Patterns
Zero-copy APIs · RAII principle · Lifetime optimization · Automatic memory management
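## Zero-Copy Patterns

- **References over ownership**: Use `&T` and `&mut T` to avoid transfers; let Rust's borrow checker enforce safety
- **Borrowed data**: Prefer `&str` over `String`, `&[T]` over `Vec<T>` in function signatures
- **Cow<T>** (Copy-on-Write): `Cow::Borrowed` for immutable access (zero cost); `Cow::Owned` for owned data; switch to owned only when needed
- Use a lifetime parameter such as `'a` to tie references to the data they borrow from

## RAII Principle

- Release resources through the `Drop` trait
- Files are closed on `Drop`; no manual `file.close()` needed

fn read_data() -> std::io::Result<()> {
    {
        let file = std::fs::File::open("data.txt")?; // Acquire
        let _reader = std::io::BufReader::new(file);
        // Use resource
    } // Automatically closed/dropped here
    Ok(())
}

// With explicit guards
{
    let data = std::sync::Mutex::new(vec![1, 2, 3]); // The mutex owns the data
    {
        let mut locked = data.lock().unwrap(); // `locked` is the guard
        locked.push(4);
    } // Unlock here automatically
}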
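The scope-based release shown above also applies to user-defined types. The following is a minimal sketch, assuming a hypothetical `TraceGuard` type and `trace_scope` function (neither comes from the original skill), that records when a guard is acquired and when `Drop` runs:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical guard that records when it is acquired and released.
struct TraceGuard {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl TraceGuard {
    fn acquire(name: &'static str, log: Rc<RefCell<Vec<String>>>) -> Self {
        log.borrow_mut().push(format!("acquire {name}"));
        TraceGuard { name, log }
    }
}

impl Drop for TraceGuard {
    // Runs automatically when the guard goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

/// Acquires two guards inside a scope and returns the recorded event order.
fn trace_scope() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = TraceGuard::acquire("a", Rc::clone(&log));
        let _b = TraceGuard::acquire("b", Rc::clone(&log));
        log.borrow_mut().push("work".to_string());
    } // `_b` is dropped first, then `_a` (reverse declaration order)
    let events = log.borrow().clone();
    events
}
```

Locals are dropped in reverse declaration order, so a guard that depends on an earlier one is released before it.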
- Preallocate with `with_capacity()` when the size is known

use std::borrow::Cow;

fn normalize_path(path: &str) -> Cow<'_, str> {
    if path.contains("//") {
        Cow::Owned(path.replace("//", "/"))
    } else {
        Cow::Borrowed(path)
    }
}

// Arc for sharing
let shared_name = std::sync::Arc::new("expensive_name".to_string());
let clone1 = std::sync::Arc::clone(&shared_name); // Cheap clone: bumps a reference count
let clone2 = std::sync::Arc::clone(&shared_name);
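To make the cost difference visible, the sketch below repeats `normalize_path` so that it compiles on its own and adds two hypothetical helpers, `allocated` and `share_count`, which are illustrations rather than part of the original skill. It checks which `Cow` variant comes back and confirms that `Arc::clone` shares one allocation:

```rust
use std::borrow::Cow;
use std::sync::Arc;

// Same function as in the section above, repeated so the sketch is self-contained.
fn normalize_path(path: &str) -> Cow<'_, str> {
    if path.contains("//") {
        Cow::Owned(path.replace("//", "/"))
    } else {
        Cow::Borrowed(path)
    }
}

/// Hypothetical helper: reports whether normalizing `path` had to allocate.
fn allocated(path: &str) -> bool {
    matches!(normalize_path(path), Cow::Owned(_))
}

/// Hypothetical helper: returns the reference count after two clones and
/// whether a clone points at the same allocation as the original.
fn share_count() -> (usize, bool) {
    let shared = Arc::new("expensive_name".to_string());
    let c1 = Arc::clone(&shared);
    let _c2 = Arc::clone(&shared);
    (Arc::strong_count(&shared), Arc::ptr_eq(&shared, &c1))
}
```

Callers that only read the result can treat a `Cow<str>` like a `&str` through deref, so the borrowed fast path needs no special handling at the call site.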
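- `split_at_mut()` (a slice method, also available on `Vec`) to avoid multiple allocations

struct DataProcessor {
    input_buffer: Vec<u8>,
    output_buffer: Vec<u8>,
}

// Placeholder for the per-byte transformation, which the original leaves undefined
fn transform(byte: u8) -> u8 {
    byte.to_ascii_uppercase()
}

impl DataProcessor {
    fn new() -> Self {
        DataProcessor {
            input_buffer: Vec::with_capacity(8192),
            output_buffer: Vec::with_capacity(8192),
        }
    }

    fn process(&mut self, data: &[u8]) -> &[u8] {
        // `clear` keeps the allocated capacity, so the buffers are reused across calls
        self.output_buffer.clear();
        self.input_buffer.clear();
        self.input_buffer.extend_from_slice(data);
        // Process in-place
        for byte in &mut self.input_buffer {
            *byte = transform(*byte);
        }
        self.input_buffer.as_slice()
    }
}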
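The `split_at_mut()` item above can be sketched as follows. `mirror_first_half` is a hypothetical helper, not something from the original skill; it borrows two non-overlapping mutable halves of a single buffer, so no temporary copy is allocated:

```rust
/// Hypothetical helper: copies the first half of `buf` into the second half
/// without allocating a temporary buffer.
fn mirror_first_half(buf: &mut [u8]) {
    let mid = buf.len() / 2;
    // `split_at_mut` yields two non-overlapping mutable slices of the same allocation.
    let (front, back) = buf.split_at_mut(mid);
    // For odd lengths `back` is one element longer; the extra element is left untouched.
    back[..mid].copy_from_slice(front);
}
```

Passing `&mut vec` works because `&mut Vec<u8>` coerces to `&mut [u8]`.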
- `'_` (anonymous lifetime) for unused lifetimes in bounds

// Avoid unnecessary lifetimes
struct BadParser<'a> {
    config: &'a str, // Parser cannot outlive the borrowed config
}

// Better: owned data or shorter borrowing
struct BetterParser {
    config: String, // Owned, no lifetime
}

// Lifetime elision (single input → output)
fn parse(input: &str) -> &str { // Output borrows from input
    // `get` avoids the panic that `&input[0..10]` raises on short input or a non-boundary index
    input.get(..10).unwrap_or(input)
}
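As a rough illustration of the `'_` item, the sketch below uses a hypothetical `ConfigView` type (an assumption, not part of the original skill). The `Display` impl never names the lifetime, so `'_` is enough; `first_key` names `'a` because its return value must be tied to the borrowed string rather than to `&self`:

```rust
use std::fmt;

/// Hypothetical borrowed view over a configuration string.
struct ConfigView<'a> {
    raw: &'a str,
}

// `'_` stands in for the lifetime because this impl never needs to name it.
impl fmt::Display for ConfigView<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "config({})", self.raw.trim())
    }
}

impl<'a> ConfigView<'a> {
    // Naming `'a` lets the returned slice outlive the `ConfigView` itself.
    fn first_key(&self) -> &'a str {
        self.raw.split('=').next().unwrap_or("").trim()
    }
}
```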
Valgrind (Linux): Detect memory leaks, use-after-free
valgrind --leak-check=full --show-leak-kinds=all ./myapp
valgrind --tool=massif ./myapp # Heap profiler
AddressSanitizer (ASan): Compile-time instrumentation for memory errors
RUSTFLAGS="-Z sanitizer=address" cargo +nightly build --target x86_64-unknown-linux-gnu
cargo-careful: Run tests against a standard library built with debug assertions and extra runtime checks
cargo +nightly careful test
Miri: Interpret tests to detect undefined behavior
MIRIFLAGS="-Zmiri-strict-provenance" cargo +nightly miri test
Clippy: Lint for lifetime issues
cargo clippy --all-targets -- -D warnings
- `fn process(data: Vec<u8>)` should take `&[u8]`; cloning is expensive and unnecessary
- Using `format!()` when `.to_string()` or borrowing suffices
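The signature anti-pattern above can be avoided along these lines. `checksum` and `label` are hypothetical functions chosen for illustration; both borrow their input, so callers keep ownership and nothing is cloned at the call site:

```rust
/// Hypothetical checksum: borrows its input instead of taking `Vec<u8>` by value.
fn checksum(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

/// Hypothetical label builder: takes `&str`, so `&String` and literals both work.
fn label(name: &str) -> String {
    // One allocation for the result, sized up front; `name` is never cloned.
    let mut out = String::with_capacity(name.len() + 2);
    out.push('[');
    out.push_str(name);
    out.push(']');
    out
}
```

A `Vec<u8>`, an array, or a sub-slice can all be passed to `checksum` by reference, which a `Vec<u8>` parameter would not allow without copying.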