How to Install Apache Kafka on Ubuntu 18.04

In this tutorial, we will show you how to install and set up Apache Kafka on a VPS running Ubuntu 18.04.

Kafka, or Apache Kafka, is a distributed messaging system based on the publish-subscribe (pub-sub) model. It allows us to publish and subscribe to streams of records, organized into categories. It is an incredibly fast, highly scalable, fault-tolerant system designed to process large amounts of data in real time. Apache Kafka can also be used as an alternative to a message broker, allowing us to process or transform streams of records. In other words, Kafka works as a messaging system, but at an incomparably larger scale. Overall, Apache Kafka is a very powerful tool when used correctly.

Prerequisites

  • A Server running Ubuntu 18.04 with at least 4GB of memory. For the purposes of this tutorial, we’ll be using one of our Managed Ubuntu 18.04 VPSes.
  • SSH access with root privileges, or access to the “root” user itself

Step 1: Log in via SSH and Update the System

Log in to your Ubuntu 18.04 VPS with SSH as the root user:

ssh root@IP_Address -p Port_number

Replace “root” with a user that has sudo privileges if necessary. Additionally, replace “IP_Address” and “Port_number” with your server’s actual IP address and SSH port number.

Once that is done, you can check whether you have the proper Ubuntu version installed on your server with the following command:

# lsb_release -a

You should get this output:

Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS
Release: 18.04
Codename: bionic

Then, run the following command to make sure that all installed packages on the server are updated to their latest available versions:

# apt update && apt upgrade

Step 2: Add a System User

Let’s create a new user called ‘kafka’, after which we will add this new user as a sudoer.

# adduser kafka
# usermod -aG sudo kafka
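
To verify that the user was created and belongs to the sudo group, you can check its group membership:

# id kafka

The output should list sudo among the user’s groups.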

Step 3: Install Java

Kafka is written in Java, so a JVM is required to get it working. In this tutorial, we will use OpenJDK 11, which has been the default Java version on Ubuntu since September 2018. The default-jre package will install it:

# apt install default-jre
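
Once the installation finishes, you can verify that a JVM is available. The exact version number in the output may differ, but it should report an OpenJDK runtime:

# java -version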

Step 4: Download Apache Kafka

Now let’s download Kafka. You can visit the official Apache Kafka downloads page and get the latest release if necessary. The download link used below was the latest at the time of writing.

# su - kafka
$ wget https://www-us.apache.org/dist/kafka/2.2.0/kafka_2.12-2.2.0.tgz -O kafka.tgz

Now that the Apache Kafka binary has been downloaded, we need to extract it into our kafka user’s home directory:

$ tar -xzvf kafka.tgz --strip-components=1
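
After extraction, the Kafka files (including the bin, config, and libs directories) should sit directly in the kafka user’s home directory. The systemd unit files we create later rely on this layout, which you can confirm with:

$ ls ~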

Step 5: Configure Apache Kafka

It is time to configure Apache Kafka. By default, we are not allowed to delete topics, the categories or feed names to which messages are published. To change this behavior, we need to edit the default configuration.

$ nano ~/config/server.properties

Append the following line to the end of the configuration file.

delete.topic.enable = true
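
With this setting in place, topics can later be removed using the --delete flag of the kafka-topics.sh script. As an illustration, assuming the services are running and a topic named OldTopic exists:

$ bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic OldTopic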

Step 6: Create a System Unit File for Apache Kafka

Kafka uses ZooKeeper, so a ZooKeeper server needs to be started before the Apache Kafka service. In this tutorial, we will use the convenience script packaged with Kafka to get a quick-and-dirty single-node ZooKeeper instance.

Create a new file at /etc/systemd/system/zookeeper.service and open it in your preferred text editor. We’ll be using nano for this tutorial.

$ sudo nano /etc/systemd/system/zookeeper.service

Paste the following lines into it:

[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/zookeeper-server-start.sh /home/kafka/config/zookeeper.properties
ExecStop=/home/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
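
Optionally, you can verify the ZooKeeper unit on its own before continuing:

$ sudo systemctl start zookeeper
$ sudo systemctl status zookeeper

The unit should be reported as active (running).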

Now, let’s create a system unit file for kafka at the filepath /etc/systemd/system/kafka.service:

$ sudo nano /etc/systemd/system/kafka.service

Paste the following lines into the file:

[Unit]
Requires=zookeeper.service
After=zookeeper.service
[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/home/kafka/bin/kafka-server-start.sh /home/kafka/config/server.properties > /home/kafka/kafka.log 2>&1'
ExecStop=/home/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target

The new systemd units have been added, so let’s enable Apache Kafka to run automatically on boot, and then start the service. Because the kafka unit requires zookeeper.service, starting Kafka will bring up ZooKeeper as well.

$ sudo systemctl enable kafka
$ sudo systemctl start kafka
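
You can confirm that both services started correctly by querying their status:

$ sudo systemctl status zookeeper kafka

Both units should be shown as active (running). If Kafka fails to start, its log output is written to /home/kafka/kafka.log, as configured in the unit file above.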

Step 7: Create a Topic

In this step, we will create a topic named “FirstTopic”, with a single partition and only one replica:

$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic FirstTopic

Created topic "FirstTopic".

The replication-factor value describes how many copies of the data will be created. Since we are running with a single instance, we set the value to 1.

The partitions value describes the number of partitions the topic will be split into. Partitions are spread across the brokers in the cluster, and since we are running a single broker, we set the value to 1.
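You can verify both of these settings for the new topic with the --describe flag, which prints the partition count, the replication factor, and the broker assigned to each partition:

$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic FirstTopic
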

You can also see the created topic on Kafka by running the list command:

$ bin/kafka-topics.sh --list --zookeeper localhost:2181

FirstTopic

Step 8: Send Messages using Apache Kafka

Apache Kafka comes with a command line client that takes input from a file or from standard input and sends it out as messages to the Kafka cluster. The “producer” is the process responsible for putting data into our Kafka service. By default, Kafka sends each line as a separate message.

Let’s run the producer and then type a few messages into the console to send to the server.

$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic FirstTopic

>Welcome to kafka
>This is the content of our first topic
>
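
Since the producer reads from standard input, you can also publish the contents of a file, with each line sent as a separate message. For example, with a hypothetical file named messages.txt:

$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic FirstTopic < messages.txt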

Keep the producer terminal open, and let’s proceed to the next step.

Step 9: Use Apache Kafka as a Consumer

Apache Kafka also comes with a command line consumer that reads data from the Kafka cluster and displays the messages on standard output.

Run the following command in a new SSH session.

$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic FirstTopic --from-beginning

Welcome to kafka
This is the content of our first topic

That’s it! Apache Kafka has been successfully installed and set up. We can now type some messages in the producer terminal, as described in the previous step, and they will immediately appear in the consumer terminal.
