Unravelling Kafka — Part 2: Getting Started with Apache Kafka (CLI)
Welcome back to Part 2 of our Kafka series! In the preceding article, we explored the foundational concepts of Kafka. In this installment, we’ll dive into the practical side of working with Apache Kafka.
We’ll explore how to get started with Kafka, from setting up your environment to creating topics and producing/consuming messages on Mac. Additionally, we’ll delve into the powerful Kafka Command Line Interface (CLI), discovering its capabilities and how it can streamline your Kafka workflows. Let’s embark on this journey together as we unlock the potential of Apache Kafka.
Before diving into the installation process, let’s explore the foundational aspects of Apache Kafka, notably its dependency on Zookeeper in earlier versions and the introduction of Kafka KRaft mode in newer releases.
Zookeeper Dependency in Kafka:
Before version 2.8.0, Apache Kafka relied on Apache Zookeeper for distributed coordination and metadata management. Zookeeper played a critical role in maintaining the state of the Kafka cluster, including topics, partitions, and leader election. It acted as a centralized registry for storing cluster configuration, ensuring fault tolerance, and handling cluster reconfiguration events.
However, managing Zookeeper added complexity to Kafka deployments, requiring additional operational overhead and potential single points of failure. As a result, there was a growing demand for simplifying Kafka’s architecture by reducing its dependency on Zookeeper.
Kafka KRaft Mode (Raft Metadata Mode):
Starting from version 2.8.0, Apache Kafka introduced the Kafka Raft Metadata mode, also known as KRaft mode (initially as an early-access feature). This mode offers an alternative to using Zookeeper for cluster coordination and metadata management.
In Kraft mode, Kafka utilizes a Raft consensus protocol implementation to manage cluster metadata, such as topics, partitions, and broker configuration. The Raft protocol ensures strong consistency and fault tolerance, similar to Zookeeper, but with reduced complexity and overhead.
Enabling KRaft mode involves configuring the broker’s roles in the Kafka server configuration via the process.roles property (for example, process.roles=broker,controller), together with a node.id and a controller.quorum.voters list, instead of a zookeeper.connect entry. By doing so, Kafka eliminates the dependency on Zookeeper for cluster coordination, resulting in a simpler and more streamlined deployment architecture. [To set up Kafka without Zookeeper, refer to this documentation.]
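As a rough sketch of what a minimal single-node KRaft setup looks like: Kafka 3.x distributions ship a sample configuration under config/kraft/server.properties, and the property values below are illustrative, not prescriptive.

```shell
# Key KRaft properties in config/kraft/server.properties (illustrative values):
#   process.roles=broker,controller
#   node.id=1
#   controller.quorum.voters=1@localhost:9093

# A KRaft metadata log directory must be formatted once before the first start:
KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
bin/kafka-storage.sh format -t "$KAFKA_CLUSTER_ID" -c config/kraft/server.properties

# Then start Kafka directly -- no Zookeeper process required:
bin/kafka-server-start.sh config/kraft/server.properties
```

Note that the rest of this article uses the classic Zookeeper-based setup, which is what the downloads default to.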
Prerequisites:
- Java Development Kit (JDK) installed on your system. You can download it from the official Oracle website or use a package manager like Homebrew (brew install openjdk).
- Make sure your JAVA_HOME environment variable is set.
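A quick way to confirm both prerequisites from the terminal (the output will vary with your installation):

```shell
# Verify a JDK is on the PATH; prints the version string to stderr.
java -version

# Verify JAVA_HOME is set.
echo "${JAVA_HOME:-JAVA_HOME is not set}"

# macOS-only helper: prints the path of the default JDK on your machine.
/usr/libexec/java_home
```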
Installation
- Download the latest version of Apache Kafka from the official website: Apache Kafka Downloads
- Extract the downloaded archive to your desired location on your Mac.
- Navigate to the Kafka directory in your terminal. We will run all the following commands from this Apache Kafka directory.
Once you have extracted the Apache Kafka tarball, you will see several scripts inside the bin directory. We will use these scripts to perform the CLI operations.
Start Zookeeper
As mentioned earlier, cluster management in Apache Kafka relies on Zookeeper. Therefore, before initiating Kafka, it’s essential to start Zookeeper.
bin/zookeeper-server-start.sh config/zookeeper.properties
Start Apache Kafka
Open another terminal window and run the following command to start Kafka.
bin/kafka-server-start.sh config/server.properties
Kafka has now started. Keep both terminal windows open; closing either one will shut down Kafka or Zookeeper.
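When you are finished, the distribution also ships stop scripts, so you can shut the processes down cleanly instead of killing the terminal windows (run these from the Kafka directory):

```shell
# Stop the Kafka broker first, then Zookeeper.
bin/kafka-server-stop.sh
bin/zookeeper-server-stop.sh
```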
Kafka topics
The kafka-topics.sh script enables users to create, delete, describe, or modify topics within Kafka.
Create the topic using the following command.
bin/kafka-topics.sh --bootstrap-server localhost:9092 --topic first_topic --create --partitions 3 --replication-factor 1
List the topics using the following command.
bin/kafka-topics.sh --bootstrap-server localhost:9092 --list
Describe a Kafka topic using the following command.
bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic first_topic
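For the three-partition topic created above, --describe prints a summary line followed by one line per partition. The exact formatting varies by Kafka version, but it looks roughly like this (the leader/replica broker ids will differ on your machine):

```shell
bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic first_topic
# Topic: first_topic  PartitionCount: 3  ReplicationFactor: 1
#     Topic: first_topic  Partition: 0  Leader: 0  Replicas: 0  Isr: 0
#     Topic: first_topic  Partition: 1  Leader: 0  Replicas: 0  Isr: 0
#     Topic: first_topic  Partition: 2  Leader: 0  Replicas: 0  Isr: 0
```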
Delete the Kafka topic using the following command.
bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic first_topic
Kafka Producer
The kafka-console-producer.sh script reads data from standard input and publishes it to a Kafka topic.
To produce messages into Kafka, you first need a topic to write to in your Kafka cluster. In this exercise, we will produce to a topic named “new_topic”.
bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic new_topic
After the producer starts, you should see a > prompt. Each line of text you type is sent to the Kafka topic when you press Enter. Use Ctrl+C to exit.
Ideally, a topic with the name provided should already exist. If you specify a topic that does not yet exist, Kafka will create it automatically with the default number of partitions and replication factor (provided automatic topic creation is enabled on the broker, which it is by default).
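The console producer can also attach keys to messages, which determines the partition each message lands in. A sketch using the standard parse.key and key.separator properties (the separator character here is an arbitrary choice):

```shell
# Each input line is split on ':' into a key and a value, e.g. typing
# "user1:hello" sends key "user1" with value "hello".
bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic new_topic \
  --property parse.key=true \
  --property key.separator=:
```

Messages with the same key are always routed to the same partition, which preserves per-key ordering.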
Kafka Consumers
The kafka-console-consumer.sh script is used to create a Kafka consumer from the CLI.
Each partition is assigned to at most one consumer within a group, so any consumers beyond the number of partitions will sit idle. Thus, it’s essential to create a Kafka topic with an adequate number of partitions before adding consumers to the group.
One can launch the Kafka consumer using the following command.
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic new_topic --group my-first-application
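Two options worth knowing: by default the console consumer only shows messages produced after it starts, and it does not display message keys. A sketch with the standard flags that change both behaviours:

```shell
# Replay the topic from the earliest available offset and print keys with values.
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic new_topic \
  --from-beginning \
  --property print.key=true \
  --property key.separator=:
```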
Here I have launched 3 Kafka consumers in the same group and 1 producer. When you stop a consumer, its partitions are automatically reassigned to the remaining consumers through the consumer group’s rebalance.
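You can watch this rebalance happen using the kafka-consumer-groups.sh script that ships in the same bin directory:

```shell
# List all consumer groups known to the cluster.
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list

# Show per-partition assignments, offsets, and lag for our group.
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group my-first-application
```

Running the --describe command before and after stopping a consumer shows the partitions moving to the surviving members of the group.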
Conclusion
You have successfully set up Apache Kafka with Zookeeper on your Mac environment. You’ve also learned how to create a Kafka topic and run a basic demo with producers and consumers. In addition, we explored the Kafka KRaft mode as an alternative to Zookeeper for cluster coordination, offering greater simplicity and scalability. Keep exploring the capabilities of Apache Kafka to unlock its full potential in handling real-time data streams. In the upcoming part, we will delve deeper into Kafka’s functionality by exploring how to integrate it with Java/Spring Boot.