What is a Partition in Kafka?
- A partition is the basic unit of storage and parallelism in Kafka.
- Each Kafka topic is split into one or more partitions.
- A partition is essentially an append-only log file in which messages are stored in order.
- Each message in a partition is identified by a unique, sequential offset (like a line number).
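The append-only log and its offsets can be sketched in a few lines of Python. This is only a toy in-memory model, not how a broker stores data on disk; the class name `PartitionLog` and its methods are hypothetical and chosen for illustration.

```python
class PartitionLog:
    """Toy in-memory model of one partition: an append-only list (hypothetical)."""

    def __init__(self):
        self._records = []  # the index in this list doubles as the offset

    def append(self, value):
        """Append a record at the end and return the offset it was assigned."""
        offset = len(self._records)
        self._records.append(value)
        return offset

    def read(self, start_offset, max_records=10):
        """Return (offset, value) pairs starting at start_offset."""
        end = min(start_offset + max_records, len(self._records))
        return [(i, self._records[i]) for i in range(start_offset, end)]


log = PartitionLog()
offsets = [log.append(v) for v in ["created", "paid", "shipped"]]
print(offsets)      # [0, 1, 2]
print(log.read(1))  # [(1, 'paid'), (2, 'shipped')]
```

Records are only ever added at the end, so an offset, once assigned, always refers to the same record.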
Why Partitions?
- Scalability: multiple partitions allow a topic to be spread across brokers, which raises throughput.
- Parallelism: different consumers in a group can read different partitions in parallel.
- Ordering guarantee: Kafka guarantees ordering only within a single partition (not across partitions).
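To make the parallelism point concrete, here is a minimal sketch of how the partitions of a topic might be divided among the consumers of one group. It assumes a simple round-robin split; Kafka's real assignors are more involved, and the function name `assign_partitions` is hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Give each partition exactly one owner, cycling through the consumers.

    Simplified round-robin sketch, not Kafka's actual assignor logic.
    """
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(p)
    return assignment


print(assign_partitions([0, 1, 2], ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1]}
print(assign_partitions([0, 1, 2], ["c1", "c2", "c3", "c4"]))
# {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}
```

The second call shows why the partition count caps parallelism within a group: with three partitions, a fourth consumer has nothing to read.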
Example
Topic orders with 3 partitions:
orders-topic:
Partition-0: [0,1,2,3,...]
Partition-1: [0,1,2,3,...]
Partition-2: [0,1,2,3,...]
- Each partition is stored on a broker.
- Messages inside Partition-0 have offsets 0, 1, 2, ... in order.
- Consumers read from specific partitions.
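The example can be mimicked with plain Python lists to show that every partition numbers its own messages from 0, and that a consumer tracks one read position per partition. The `topic` dictionary and the `append` and `poll` helpers are hypothetical stand-ins, not Kafka client calls.

```python
# Three partitions of a hypothetical "orders" topic, each an independent list.
topic = {0: [], 1: [], 2: []}


def append(partition, value):
    """Append to one partition and return (partition, offset)."""
    topic[partition].append(value)
    return partition, len(topic[partition]) - 1


a = append(0, "order-A created")
b = append(1, "order-B created")
c = append(0, "order-A paid")
print(a, b, c)  # (0, 0) (1, 0) (0, 1)

# A consumer keeps one read position per partition it is reading from.
positions = {0: 0, 1: 0}


def poll(partition):
    """Return the unread records of a partition and advance the position."""
    records = topic[partition][positions[partition]:]
    positions[partition] += len(records)
    return records


print(poll(0))  # ['order-A created', 'order-A paid']
print(poll(0))  # []
```

Offset 0 exists in both Partition-0 and Partition-1, so an offset identifies a message only together with its partition.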
Partition Assignment (How Kafka decides where a message goes)
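When a producer sends a message, the partition is chosen in one of these ways:
- Keyed messages:
  - Partition = hash(key) % numPartitions.
  - Ensures messages with the same key go to the same partition, which preserves order for that key (as long as the number of partitions does not change).
- Round-robin (no key):
  - Messages are distributed evenly across partitions. Since Kafka 2.4 the default partitioner does this batch by batch ("sticky" partitioning) rather than message by message.
- Custom partitioner:
  - The producer can define its own logic.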
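The three strategies above can be sketched as follows. Note the assumptions: `zlib.crc32` is used here only as a stand-in hash (Kafka's default partitioner hashes the serialized key bytes with murmur2), the unkeyed path is plain per-message round-robin, and `SketchPartitioner` and `vip_first` are hypothetical names.

```python
import itertools
import zlib


class SketchPartitioner:
    """Toy partitioner: hash for keyed messages, round-robin otherwise."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = itertools.count()  # used only for unkeyed messages

    def partition(self, key=None):
        if key is None:
            # No key: cycle through the partitions in turn.
            return next(self._counter) % self.num_partitions
        # Keyed: stand-in hash (crc32, NOT Kafka's murmur2) modulo partition count.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


def vip_first(key, num_partitions):
    """Hypothetical custom rule: keys starting with 'vip-' always go to partition 0."""
    if key is not None and key.startswith("vip-"):
        return 0
    return zlib.crc32((key or "").encode("utf-8")) % num_partitions


p = SketchPartitioner(3)
same = {p.partition("customer-42") for _ in range(5)}
print(len(same))                          # 1 (same key, same partition)
print([p.partition() for _ in range(6)])  # [0, 1, 2, 0, 1, 2]
print(vip_first("vip-7", 3))              # 0
```

Because the result depends on `num_partitions`, adding partitions to a topic later can move a key to a different partition.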
Replication of Partitions
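- Each partition can be replicated across brokers for fault tolerance; the number of copies is the replication factor.
- One replica is the leader, which handles all writes and, by default, all reads.
- The others are followers, which replicate data from the leader. Followers that are caught up belong, together with the leader, to the in-sync replica set (ISR).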
Example:
Partition-0 (Replication Factor = 3):
- Leader on Broker-1
- Followers on Broker-2 and Broker-3
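The leader and follower roles, and what happens when the leader's broker fails, can be sketched with a toy model. In a real cluster the controller performs leader election; here the class `ReplicatedPartition` and its methods are hypothetical, and the rule "pick the first surviving in-sync replica" is a simplification.

```python
class ReplicatedPartition:
    """Toy model of one partition's replicas, leader, and in-sync set."""

    def __init__(self, replicas):
        self.replicas = list(replicas)  # broker names, preferred leader first
        self.leader = self.replicas[0]
        self.isr = set(self.replicas)   # in-sync replicas, leader included

    def mark_lagging(self, broker):
        """Drop a follower that has fallen behind from the ISR."""
        if broker != self.leader:
            self.isr.discard(broker)

    def fail_broker(self, broker):
        """Remove a failed broker; if it led, promote an in-sync follower."""
        self.isr.discard(broker)
        if broker == self.leader:
            candidates = [b for b in self.replicas if b in self.isr]
            self.leader = candidates[0] if candidates else None
        return self.leader


p0 = ReplicatedPartition(["Broker-1", "Broker-2", "Broker-3"])
print(p0.leader)                   # Broker-1
print(p0.fail_broker("Broker-1"))  # Broker-2
print(sorted(p0.isr))              # ['Broker-2', 'Broker-3']
```

Only in-sync followers are eligible for promotion in this sketch, so the new leader already holds the data that was replicated to it.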
Benefits of Partitions
- Enable horizontal scaling of Kafka.
- Provide parallelism (consumers read different partitions).
- Ensure data durability with replication.
- Allow ordering guarantees per key.
In Short
A Partition in Kafka is:
- An ordered, immutable log of messages within a topic.
- Identified by sequential offsets.
- The key to scalability, parallelism, and fault tolerance in Kafka.
- Ordering is guaranteed only inside a partition.