.claude/skills/kafka-development-practices/SKILL.md
Applies general coding standards and best practices for Kafka development with Scala.
TopologyTestDriver (unit-test) plus an integration test against local Kafka.
</instructions>

- Set acks=all and min.insync.replicas=2 for production producers. acks=1 (the producer default before Kafka 3.0) loses messages if the leader fails before replication; acks=0 provides no delivery guarantee.
- auto.offset.reset=earliest replays the entire topic history from the beginning when a consumer group with no committed offsets first starts; use latest for new consumers on existing topics.
- Set max.poll.interval.ms to a value larger than your maximum processing time. If processing takes longer than max.poll.interval.ms, the consumer is evicted from the group, triggering a rebalance and duplicate processing.

| Anti-Pattern | Why It Fails | Correct Approach |
| ------------------------------------------------------- | --------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- |
| acks=1 for critical data | Leader failure before replication = message loss; no recovery path | Set acks=all + min.insync.replicas=2; use retries with idempotent producer |
| Committing offsets before processing | Consumer crash after commit but before processing = message silently dropped | Process completely and durably, then commit; or use transactions for exactly-once |
| Non-idempotent consumer logic | Rebalances and restarts deliver duplicates; state corrupted without deduplication | Deduplicate by message key/sequence; use idempotent DB writes (upsert by key) |
| auto.offset.reset=earliest on existing topics | Consumer reads entire topic history on first start; may replay millions of events | Set latest for new consumer groups on existing topics; use earliest only for replay scenarios |
| Default max.poll.interval.ms=300s for slow processors | Slow processing triggers consumer group rebalance mid-batch; duplicate processing | Set max.poll.interval.ms > worst-case processing time; reduce batch size if needed |
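
The configuration and deduplication rows above can be sketched in Scala roughly as follows. This is a minimal illustration under stated assumptions, not the skill's own code: the object name `KafkaSafetySketch`, its helper methods, and the `UpsertSink` class (an in-memory stand-in for an idempotent database write) are hypothetical. The property keys are standard Kafka client configuration names, written as plain strings so the block runs without the Kafka client on the classpath. Note that min.insync.replicas is a topic or broker setting, so it does not appear in the producer properties.

```scala
import java.util.Properties
import scala.collection.mutable

// Hypothetical helper object; all names here are illustrative.
object KafkaSafetySketch {

  // Producer settings for critical data: wait for all in-sync replicas
  // and let the idempotent producer make retries safe.
  // min.insync.replicas=2 must be set on the topic or broker, not here.
  def producerProps(bootstrap: String): Properties = {
    val p = new Properties()
    p.setProperty("bootstrap.servers", bootstrap) // placeholder address
    p.setProperty("acks", "all")
    p.setProperty("enable.idempotence", "true")
    p.setProperty("retries", Int.MaxValue.toString)
    p
  }

  // Consumer settings: commit manually after processing, start new groups
  // at the log end, and allow enough time between polls.
  // maxPollIntervalMs should exceed the worst-case batch processing time.
  def consumerProps(bootstrap: String, groupId: String, maxPollIntervalMs: Int): Properties = {
    val p = new Properties()
    p.setProperty("bootstrap.servers", bootstrap)
    p.setProperty("group.id", groupId)
    p.setProperty("enable.auto.commit", "false")
    p.setProperty("auto.offset.reset", "latest")
    p.setProperty("max.poll.interval.ms", maxPollIntervalMs.toString)
    p
  }

  // In-memory stand-in for an idempotent "upsert by key" write.
  // A record is applied only if its sequence number is newer than the one
  // already stored, so redelivered duplicates leave the state unchanged.
  final class UpsertSink {
    private val rows = mutable.Map.empty[String, (Long, String)]

    def upsert(key: String, seq: Long, value: String): Unit =
      rows.get(key) match {
        case Some((seen, _)) if seen >= seq => () // duplicate or stale: ignore
        case _                              => rows.update(key, (seq, value))
      }

    def get(key: String): Option[String] = rows.get(key).map(_._2)
    def size: Int = rows.size
  }
}
```

In a real consumer loop, the call to `upsert` (or the equivalent durable write) would complete before the offset commit, so a crash between the two steps causes a harmless redelivery instead of a dropped message.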
Before starting:
cat .claude/context/memory/learnings.md
After completing: Record any new patterns or exceptions discovered.
ASSUME INTERRUPTION: Your context may reset. If it's not in memory, it didn't happen.