Level 2 · 25 min
Kafka Fundamentals
Apache Kafka is a distributed event streaming platform built around an append-only commit log. It decouples producers from consumers, handles millions of events per second, and guarantees ordered delivery within partitions. Understanding Kafka starts with topics, partitions, offsets, and the consumer model.
Topics and Partitions
A topic is a logical channel for events of a given type (orders, user-clicks, payments). Topics are divided into partitions — each partition is an ordered, immutable sequence of records stored as segment files on disk: a `.log` file (raw records), `.index` file (offset → physical position), and `.timeindex` file (timestamp → offset). Partitions are the unit of parallelism: more partitions = more consumers can read in parallel. Each record in a partition has a monotonically increasing offset. Records are appended at the tail; consumers read from any offset. Partition key determines which partition a record goes to — records with the same key always go to the same partition, guaranteeing order per key. Each partition has one leader broker and N-1 follower replicas; the leader handles all reads and writes. Leader election is managed by Kafka's KRaft controller (Raft-based, replaced ZooKeeper in Kafka 3.3+ as production-ready). KRaft uses a leader epoch counter — followers reject writes from stale leaders whose epoch is lower than the current epoch, preventing split-brain. The ISR (In-Sync Replicas) set tracks which followers are caught up; a message is committed only after all ISR members acknowledge it (when acks=all).
Producers and Consumers
At a major e-commerce company during a Black Friday peak, the order-events topic received 4.2M messages/min. Three of twelve partitions mapped to a single hot broker because the team had used customerId as partition key and their top 3 enterprise accounts generated 60% of volume. Those broker CPUs hit 95% while others sat at 15%. Consumer lag on the analytics group climbed to 850K messages within 40 minutes — read queries started returning stale data from an hour prior, causing the pricing dashboard to show outdated stock levels. Root cause: key skew. Fix: composite key (customerId + orderBucket) using a murmur2 hash with bucket modulo, spreading the load. Producers write records to topics — by default, the sticky partitioner batches to one partition then rotates (Kafka 2.4+); with a key, consistent murmur2 hash. Kafka uses a pull model — consumers poll rather than Kafka pushing to them, so slow consumers don't back-pressure producers. Consumers commit offsets to the __consumer_offsets internal topic (compacted). On restart, a consumer resumes from the last committed offset. Key insight from Designing Data-Intensive Applications: "Even though these message brokers write all messages to disk, they are able to achieve throughput of millions of messages per second by partitioning across multiple machines, and fault tolerance by replicating messages." The log-based approach works because sequential disk writes are far faster than random writes — a modern NVMe drive sustains 3 GB/s sequential throughput. Kleppmann further notes that the consumer offset is "very similar to the log sequence number that is commonly found in single-leader database replication" — the same mechanism used to resume followers after a restart applies directly to resuming Kafka consumers. The broker only needs to periodically record consumer offsets rather than tracking per-message acknowledgments, which is why Kafka's bookkeeping overhead is dramatically lower than traditional message brokers like RabbitMQ.
Delivery Semantics
Kafka supports three delivery semantics. At-most-once: commit offset before processing — if consumer crashes after commit but before processing, the record is lost. At-least-once: commit after processing — if consumer crashes after processing but before commit, the record is reprocessed. Exactly-once: combine idempotent producers + transactional API + read-process-write atomically. Most systems use at-least-once with idempotent consumers (deduplication by record ID) as a pragmatic balance. Key producer configs for durability and throughput: `acks=all` (wait for all ISR), `retries=Integer.MAX_VALUE`, `max.in.flight.requests.per.connection=5` (safe with idempotence on), `linger.ms=5` (batch records for 5ms), `batch.size=65536`, `compression.type=lz4` (best throughput/compression tradeoff). Retention: `log.retention.ms` (time-based) or `log.retention.bytes` (size-based per partition). `replica.lag.time.max.ms=10000` — a follower lagging more than this is removed from ISR.
Code example
// Idempotent producer — production config
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092,broker2:9092");
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true); // dedup retries
props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for ISR
props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
props.put(ProducerConfig.LINGER_MS_CONFIG, 5); // batch 5ms
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536); // 64KB
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4"); // ~60% size reduction
Producer<String, String> producer = new KafkaProducer<>(props);
ProducerRecord<String, String> record = new ProducerRecord<>(
"orders", customerId, orderJson // key → partition determinism
);
producer.send(record, (meta, ex) -> {
if (ex != null) log.error("Send failed partition={}", meta.partition(), ex);
else log.info("offset={} partition={}", meta.offset(), meta.partition());
});