What is Apache Kafka?

Apache Kafka is a distributed publish-subscribe messaging system.

What does Publisher-Subscriber do?

A publisher publishes messages, and a subscriber subscribes to receive those messages.

What is a Messaging System?

Apache Kafka's main purpose is to store the messages sent by publishers and to supply those messages to subscribers whenever they request them.
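The store-then-supply idea can be sketched in a few lines of Python. This is only an in-memory illustration, not Kafka's API: the class ToyMessageStore and its publish and fetch methods are hypothetical names chosen for this example.

```python
class ToyMessageStore:
    """Hypothetical in-memory stand-in for a messaging system (not Kafka's API)."""

    def __init__(self):
        # Messages are kept in arrival order.
        self._messages = []

    def publish(self, message):
        """Store a message sent by a publisher."""
        self._messages.append(message)

    def fetch(self, start=0):
        """Return a copy of the stored messages from position `start` onward."""
        return list(self._messages[start:])


store = ToyMessageStore()
store.publish("order-1 created")
store.publish("order-2 created")

# Subscribers ask for messages when they want them; the store keeps
# the messages, so two subscribers can read the same data independently.
subscriber_a = store.fetch()
subscriber_b = store.fetch(start=1)
print(subscriber_a)
print(subscriber_b)
```

The point of the sketch is that publishing and reading are decoupled: the publisher does not need to know who will read a message, or when.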

Kafka Overview

Producer / Publisher

In Apache Kafka, publishers are referred to as producers. Producers create messages and send them to Kafka topics. The messages are stored on the brokers and later read by subscribers (also known as consumers); producers do not send messages to consumers directly.

Consumer / Subscriber

Subscribers are called consumers because they consume (read) messages from Kafka brokers.

Broker

Brokers store messages and serve both publishers and subscribers. A Kafka broker stores messages in files on disk: producers append messages to the end of those files, and consumers read from them.
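As a rough illustration of this file-based storage, the sketch below keeps one message per line in a plain text file. It assumes messages contain no newline characters, and the FileLog class with its append and read_from methods is hypothetical; Kafka's real on-disk format is binary and considerably more involved.

```python
import os
import tempfile


class FileLog:
    """Hypothetical file-backed log: one message per line (not Kafka's real format)."""

    def __init__(self, path):
        self.path = path
        # Create the file if it does not exist yet.
        open(self.path, "a", encoding="utf-8").close()

    def append(self, message):
        """Producer side: add a message to the end of the file."""
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(message + "\n")

    def read_from(self, position=0):
        """Consumer side: return messages starting at line index `position`."""
        with open(self.path, "r", encoding="utf-8") as f:
            lines = f.read().splitlines()
        return lines[position:]


log_dir = tempfile.mkdtemp()
log = FileLog(os.path.join(log_dir, "demo.log"))
log.append("first")
log.append("second")
log.append("third")

all_messages = log.read_from(0)
later_messages = log.read_from(2)
print(all_messages)
print(later_messages)
```

Because writers only ever add to the end of the file, a reader can resume from any position it remembers without disturbing other readers.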

ZooKeeper

In ZooKeeper-based Kafka clusters, ZooKeeper manages the configuration of topics and partitions. When a topic is created, ZooKeeper:

  • Stores the topic configuration
  • Distributes the configuration to all brokers in the cluster

In summary, in clusters that rely on it, ZooKeeper maintains and coordinates the configuration of topics and partitions across the cluster.
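The store-and-distribute role can be pictured with a toy registry. This is a conceptual sketch only: ToyConfigRegistry, ToyBroker, and on_topic_config are hypothetical names, and the real coordination service works differently in detail (brokers watch for changes rather than being called directly like this).

```python
class ToyBroker:
    """Hypothetical broker that remembers the topic configs it has been told about."""

    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.known_topics = {}

    def on_topic_config(self, name, config):
        self.known_topics[name] = config


class ToyConfigRegistry:
    """Hypothetical stand-in for the coordination role described above."""

    def __init__(self):
        self._topic_configs = {}
        self._brokers = []

    def register_broker(self, broker):
        self._brokers.append(broker)
        # A broker that joins late still receives the existing configs.
        for name, config in self._topic_configs.items():
            broker.on_topic_config(name, dict(config))

    def create_topic(self, name, partitions):
        if name in self._topic_configs:
            raise ValueError(f"topic {name!r} already exists")
        config = {"partitions": partitions}
        # Step 1: store the topic configuration.
        self._topic_configs[name] = config
        # Step 2: distribute it to every broker in the cluster.
        for broker in self._brokers:
            broker.on_topic_config(name, dict(config))


registry = ToyConfigRegistry()
broker_1 = ToyBroker(1)
broker_2 = ToyBroker(2)
registry.register_broker(broker_1)
registry.register_broker(broker_2)
registry.create_topic("orders", partitions=3)

late_broker = ToyBroker(3)
registry.register_broker(late_broker)
print(broker_1.known_topics)
print(late_broker.known_topics)
```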

Kafka Topic and Partition

Kafka Topic: Messages sent by publishers to brokers are stored in a named entity called a topic.

Key Points about Kafka Topics

  1. Unique Name: Each topic is identified by its name.
  2. Cluster-Wide Uniqueness: The topic name must be unique across the entire Kafka cluster.
  3. Offset Number: Each message is assigned a sequential number called an offset, which identifies its position within its partition of the topic.
  4. Offset Assignment: The offset is assigned by the broker when the message arrives and is appended to the log.
  5. Append-Only: Producers can only append messages to the end of the log, not insert them in the middle or at the beginning.

Together, these points describe a topic as a uniquely named, append-only log in which every stored message has a fixed, numbered position.
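The key points above can be sketched with a small in-memory model. It assumes a topic with a single partition, so there is one offset sequence starting at 0, and the names ToyCluster and ToyTopicLog are hypothetical, not part of any Kafka client library.

```python
class ToyTopicLog:
    """Hypothetical single-partition topic: an append-only list with offsets."""

    def __init__(self, name):
        self.name = name
        self._records = []

    def append(self, message):
        """Append to the end of the log and return the offset assigned on arrival."""
        offset = len(self._records)
        self._records.append(message)
        return offset

    def read(self, offset):
        """Return the message stored at the given offset."""
        return self._records[offset]


class ToyCluster:
    """Hypothetical cluster that enforces unique topic names."""

    def __init__(self):
        self._topics = {}

    def create_topic(self, name):
        if name in self._topics:
            raise ValueError(f"topic {name!r} already exists")
        topic = ToyTopicLog(name)
        self._topics[name] = topic
        return topic


cluster = ToyCluster()
payments = cluster.create_topic("payments")

# Each appended message gets the next offset; there is no insert operation.
offsets = [payments.append(m) for m in ("paid: 10", "paid: 25", "refund: 10")]
print(offsets)
print(payments.read(1))

# Creating a second topic with the same name is rejected.
try:
    cluster.create_topic("payments")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Note that the model deliberately offers no way to insert or overwrite a message, which mirrors the append-only rule.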

Kafka Partition