Major Components of Apache Kafka

1. Producer

  • Applications that publish (write) messages to Kafka topics.

  • Decide which topic (and, via the record key or an explicit choice, which partition) each message goes to.

  • Can send data synchronously or asynchronously.

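For example, a minimal producer sketch using the Java client, showing both an asynchronous and a synchronous send (the broker address, topic name, keys, and payloads are all illustrative):

import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class OrderProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                new ProducerRecord<>("orders", "order-123", "{\"amount\": 42}");

            // Asynchronous send: the callback runs once the broker acknowledges the write.
            producer.send(record, (metadata, exception) -> {
                if (exception == null) {
                    System.out.printf("orders-%d @ offset %d%n", metadata.partition(), metadata.offset());
                }
            });

            // Synchronous send: block on the returned Future until the write is acknowledged.
            producer.send(new ProducerRecord<>("orders", "order-124", "{\"amount\": 7}")).get();
        }
    }
}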

2. Consumer

  • Applications that subscribe to Kafka topics and read (consume) messages from them.

  • Belong to consumer groups → Kafka distributes partitions among consumers in the group (parallel processing).

  • Track offsets (their position in the log) to know which messages have already been processed.

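A matching consumer sketch (group ID, topic name, and broker address are illustrative). Each consumer in the "order-processors" group is assigned a subset of the topic's partitions, and commitSync() stores its offsets back in Kafka:

import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class OrderConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processors");          // consumer group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");           // commit offsets manually

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                        record.partition(), record.offset(), record.value());
                }
                consumer.commitSync();   // record the group's progress (offsets)
            }
        }
    }
}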

3. Topic

  • A logical category (like a channel or stream) where records are published.

  • Example: "orders", "payments".

  • Topics are split into partitions for scalability.

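Topics can be created with the kafka-topics CLI or the Admin API. A sketch using the Java AdminClient, where the topic name "orders", 3 partitions, and replication factor 2 are illustrative choices:

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.List;
import java.util.Properties;

public class CreateOrdersTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // "orders" with 3 partitions, each copied to 2 brokers
            NewTopic orders = new NewTopic("orders", 3, (short) 2);
            admin.createTopics(List.of(orders)).all().get();
        }
    }
}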

4. Partition

  • An ordered, append-only log within a topic; message order is guaranteed only within a single partition.

  • Each message gets an offset (a monotonically increasing ID marking its position within the partition).

  • Multiple partitions = parallelism + higher throughput.

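By default the producer hashes the serialized record key (murmur2) and takes it modulo the partition count, so records with the same key always land in the same partition and keep their relative order. A simplified sketch of that idea (Java's hashCode stands in for the real hash):

public class PartitionSketch {
    // Simplified stand-in for the default partitioner's key hashing.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        for (String key : new String[] {"customer-1", "customer-2", "customer-1"}) {
            // Identical keys map to the same partition, preserving per-key ordering.
            System.out.printf("%s -> partition %d%n", key, partitionFor(key, numPartitions));
        }
    }
}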

5. Broker

  • A Kafka server that stores topics and partitions.

  • Handles requests from producers (write) and consumers (read).

  • A cluster usually has multiple brokers (for scalability & fault-tolerance).


6. Cluster

  • A group of brokers working together.

  • Topics are distributed across brokers → partitions are replicated for reliability.

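The Admin API can show which brokers make up a cluster and which one is currently the controller; a sketch (the bootstrap address is an assumption):

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DescribeClusterResult;
import org.apache.kafka.common.Node;
import java.util.Properties;

public class DescribeCluster {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            DescribeClusterResult cluster = admin.describeCluster();
            System.out.println("Cluster ID: " + cluster.clusterId().get());
            System.out.println("Controller: " + cluster.controller().get().id());
            for (Node broker : cluster.nodes().get()) {
                System.out.printf("Broker %d at %s:%d%n", broker.id(), broker.host(), broker.port());
            }
        }
    }
}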

7. ZooKeeper (legacy deployments, before KRaft)

  • Manages broker metadata, cluster coordination, leader election.

  • 📌 Newer Kafka releases (KRaft mode) remove ZooKeeper → the cluster manages its own metadata through a Raft-based controller quorum.


8. Controller

  • A special broker that manages partition leaders and handles failover.

  • Ensures that if a broker or partition leader fails, a new leader is elected from the ISR (in-sync replicas).


9. Log (Commit Log / Message Store)

  • Each partition is an append-only commit log on disk where records are written in order.

  • Data is immutable and retained for a configured time or size limit (e.g., 7 days).

  • Consumers read sequentially using offsets.

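Retention is configured per topic (retention.ms for time, retention.bytes for size). A sketch that sets a 7-day time limit on the illustrative "orders" topic through the Admin API:

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class SetRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "orders");
            AlterConfigOp setRetention = new AlterConfigOp(
                new ConfigEntry("retention.ms", "604800000"),   // 7 days in milliseconds
                AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(setRetention))).all().get();
        }
    }
}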

10. Replication & ISR (In-Sync Replicas)

  • Kafka replicates partitions across brokers for durability.

  • ISR = the set of replicas fully caught up with the leader → a new leader can be elected from the ISR, ensuring safe failover.

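On the producer side, acks=all tells the broker to acknowledge a write only after all in-sync replicas have it; pairing this with the topic setting min.insync.replicas gives a durability guarantee. A sketch of the relevant producer settings (broker address, topic, and payload are illustrative):

import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class DurableProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");                            // wait for all in-sync replicas
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");             // no duplicates on retry

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("payments", "pay-1", "{\"amount\": 99}")).get();
        }
    }
}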

11. Kafka Connect

  • A framework to integrate Kafka with external systems (DBs, Elasticsearch, S3, etc.) using source/sink connectors.

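Connectors are registered through Connect's REST API. A sketch that posts a file sink connector to an assumed Connect worker on localhost:8083 (connector name, topic, and output file are illustrative):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterConnector {
    public static void main(String[] args) throws Exception {
        // Sink connector: copy every record from the "orders" topic into a local file.
        String config = """
            {
              "name": "orders-file-sink",
              "config": {
                "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
                "tasks.max": "1",
                "topics": "orders",
                "file": "/tmp/orders.txt"
              }
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8083/connectors"))   // assumed Connect worker address
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(config))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}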

12. Kafka Streams / ksqlDB

  • Kafka Streams → a Java library for building real-time stream processing apps.

  • ksqlDB → SQL-like interface for querying and transforming Kafka streams.

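A small Kafka Streams sketch that reads the illustrative "orders" topic, keeps only some records, and writes them to another topic (application ID, topic names, and the filter condition are all assumptions):

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import java.util.Properties;

public class LargeOrdersApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "large-orders-app");     // illustrative app ID
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");    // assumed broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> orders = builder.stream("orders");
        // Naive string check purely for the sake of the sketch.
        orders.filter((key, value) -> value != null && value.contains("\"amount\": 99"))
              .to("large-orders");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}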

🔹 Quick Diagram (Textual)

Producers ---> [ Topic: orders ]
               |   Partition-0 [Leader+Replicas]
               |   Partition-1 [Leader+Replicas]
               |   Partition-2 [Leader+Replicas]
Consumers <--- |

  • Producers write messages into topics (partitions).

  • Brokers store them.

  • Consumers read them (via consumer groups).

  • Replication + ISR ensure fault-tolerance.


In short:
Kafka’s major components are: Producers, Consumers, Topics, Partitions, Brokers, Cluster, Controller, ZooKeeper (legacy), Logs, ISR, Kafka Connect, and Kafka Streams.
