How to Install Apache Kafka on RHEL 7

Apache Kafka is a distributed event streaming platform designed for high-throughput, fault-tolerant, real-time data pipelines. Originally developed at LinkedIn and later open-sourced under the Apache Software Foundation, Kafka is used by thousands of companies to process trillions of events per day. Unlike traditional message queues, Kafka persists messages to disk in an ordered, append-only log and lets consumer groups read from any point in the retained history. This makes it well suited to log aggregation, real-time analytics, event sourcing, and data pipelines between microservices.

This guide covers installing the Java prerequisite on RHEL 7, downloading and extracting the Kafka distribution, configuring and starting ZooKeeper and the Kafka broker as systemd services, creating a topic, and sending and receiving messages with the bundled command-line tools.

Prerequisites

RHEL 7 server with at least 2 GB RAM and 10 GB available disk space
Root or sudo access
Active Red Hat subscription or configured yum repositories
Internet access to download the Kafka tarball from Apache mirrors
Basic familiarity with systemctl and Linux file management

Step 1: Install Java

Kafka is written in Scala and Java and runs on the Java Virtual Machine. Kafka 3.x supports Java 8, 11, and 17 (Java 8 support is deprecated), and OpenJDK 11 is available from the standard RHEL 7 repositories:

sudo yum install -y java-11-openjdk java-11-openjdk-devel

If your workload requires Kafka 2.x for compatibility reasons, OpenJDK 8 also works:

sudo yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel

Verify Java is installed and check the version:

java -version

Expected output (for OpenJDK 11):

openjdk version "11.0.22" 2024-01-16 LTS
OpenJDK Runtime Environment (Red Hat) (build 11.0.22+7-LTS)
OpenJDK 64-Bit Server VM (Red Hat) (build 11.0.22+7-LTS, mixed mode, sharing)

If you have multiple Java versions installed, set the default with:

sudo alternatives --config java
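
For scripted installs, the version check above can be automated. The following is a minimal sketch rather than anything shipped with Kafka: java_major is a hypothetical helper name, and it assumes the version string follows either the legacy 1.8.0_x scheme or the modern 11.0.x scheme.

```shell
#!/usr/bin/env bash
# java_major (hypothetical helper): reduce a Java version string to its major number.
java_major() {
  local v=$1
  case $v in
    1.*) v=${v#1.}; echo "${v%%.*}" ;;  # legacy scheme: 1.8.0_402 -> 8
    *)   echo "${v%%[.+-]*}" ;;         # modern scheme: 11.0.22 -> 11
  esac
}

# Only query the JVM if one is on the PATH; java -version prints to stderr.
if command -v java >/dev/null 2>&1; then
  version=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  major=$(java_major "$version")
  if [ "${major:-0}" -ge 11 ]; then
    echo "Java $major detected"
  else
    echo "Java ${major:-unknown} detected; the Kafka 3.x steps in this guide assume 11" >&2
  fi
fi
```

The helper takes the version as an argument instead of calling java itself, so it can be exercised on hosts where no JVM is installed yet.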

Step 2: Create a Kafka System User

Run Kafka as a dedicated non-root user to limit the blast radius of any security issue:

sudo useradd --no-create-home --shell /bin/false kafka

Step 3: Download and Extract Kafka

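Check the Apache Kafka downloads page for the latest stable release. This guide uses Kafka 3.7.0 with the Scala 2.13 build. Download the tarball to /tmp. Once a release is superseded, Apache moves it from downloads.apache.org to https://archive.apache.org/dist/kafka/, so adjust the URL if the download returns a 404: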

cd /tmp
curl -LO https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz

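Verify the integrity of the download against the SHA512 checksum published alongside it. Some Apache checksum files use GPG's multi-line layout rather than the layout sha512sum expects; if the check below reports that no properly formatted checksum lines were found, run sha512sum kafka_2.13-3.7.0.tgz and compare the digest manually: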

curl -LO https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz.sha512
sha512sum --check kafka_2.13-3.7.0.tgz.sha512
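
If the checksum file turns out to be in GPG's layout (file name, colon, then uppercase hex groups spread over several lines), the comparison can be scripted. The sketch below is one possible approach; expected_sha512 and verify_sha512 are hypothetical helper names, and the parsing assumes the digest is the first 128 hex digits left after the file-name prefix and whitespace are removed.

```shell
#!/usr/bin/env bash
# expected_sha512 (hypothetical helper): extract the 128-hex-digit digest from a
# checksum file laid out either as "<digest>  <file>" or as "<file>: ABCD1234 ...".
expected_sha512() {
  sed 's/^[^:]*://' "$1" | tr -d ' \t\r\n' | tr 'A-F' 'a-f' | grep -oE '^[0-9a-f]{128}'
}

# verify_sha512 (hypothetical helper): succeed only if the file's digest matches.
verify_sha512() {
  local file=$1 sumfile=$2 expected actual
  expected=$(expected_sha512 "$sumfile")
  actual=$(sha512sum "$file" | awk '{print $1}')
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example:
#   verify_sha512 kafka_2.13-3.7.0.tgz kafka_2.13-3.7.0.tgz.sha512 && echo "checksum OK"
```

A non-zero exit status means either the digest did not match or no digest could be parsed, so a script should stop before extracting the archive in both cases.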

Extract the archive and move it to /opt/kafka:

tar xzf kafka_2.13-3.7.0.tgz
sudo mv kafka_2.13-3.7.0 /opt/kafka
sudo chown -R kafka:kafka /opt/kafka

Add Kafka’s bin directory to the system PATH by creating a profile script:

echo 'export PATH=$PATH:/opt/kafka/bin' | sudo tee /etc/profile.d/kafka.sh > /dev/null
source /etc/profile.d/kafka.sh

Step 4: Configure Kafka Data Directories

Create dedicated directories for Kafka and ZooKeeper data storage. Keeping data on a separate mount point from the OS is recommended in production:

sudo mkdir -p /var/kafka/data /var/kafka/zookeeper
sudo chown -R kafka:kafka /var/kafka

Update the ZooKeeper configuration to point to the new data directory:

sudo sed -i 's|dataDir=/tmp/zookeeper|dataDir=/var/kafka/zookeeper|' \
  /opt/kafka/config/zookeeper.properties

Update the Kafka broker configuration:

sudo vi /opt/kafka/config/server.properties

Locate and update the following settings:

# Data directory
log.dirs=/var/kafka/data

# Listener configuration - 0.0.0.0 binds to all interfaces so external clients can connect
# For localhost-only testing use: PLAINTEXT://localhost:9092
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://YOUR_SERVER_IP:9092

# Retention settings
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000

# Performance tuning
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600

Replace YOUR_SERVER_IP with the actual IP address or hostname of your server.
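
For repeatable installs, the same settings can be applied without an interactive editor. The following is a sketch under stated assumptions, not a Kafka tool: set_property is a hypothetical helper that rewrites an existing key=value line (including a commented-out one such as #listeners=...) or appends the key if it is absent. It assumes GNU sed and values that contain no |, & or backslash characters.

```shell
#!/usr/bin/env bash
# set_property (hypothetical helper): set key=value in a Java properties file.
set_property() {
  local file=$1 key=$2 value=$3 pattern
  # Escape dots so "log.dirs" is matched literally rather than as a regex.
  pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
  if grep -qE "^#?${pattern}=" "$file"; then
    # Replace the existing (possibly commented-out) line in place.
    sed -i -E "s|^#?${pattern}=.*|${key}=${value}|" "$file"
  else
    # Key not present at all: append it.
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example (run as root or as the kafka user, since the file is owned by kafka):
#   set_property /opt/kafka/config/server.properties log.dirs /var/kafka/data
#   set_property /opt/kafka/config/server.properties listeners PLAINTEXT://0.0.0.0:9092
```

Running the helper twice with the same arguments leaves the file unchanged, which makes it safe to call from provisioning scripts.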

Step 5: Create the ZooKeeper systemd Service

Create a systemd unit file to manage ZooKeeper as a service. Kafka 2.x requires ZooKeeper for cluster coordination, and Kafka 3.7.0 still supports it alongside the newer KRaft mode; this guide uses ZooKeeper mode:

sudo vi /etc/systemd/system/zookeeper.service

Add the following content:

[Unit]
Description=Apache ZooKeeper (Kafka dependency)
Documentation=https://zookeeper.apache.org
Requires=network.target
After=network.target

[Service]
Type=simple
User=kafka
Group=kafka
Environment="JAVA_HOME=/usr/lib/jvm/java-11-openjdk"
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-failure
RestartSec=10s
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

Step 6: Create the Kafka systemd Service

sudo vi /etc/systemd/system/kafka.service

Add the following content:

[Unit]
Description=Apache Kafka Broker
Documentation=https://kafka.apache.org
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
Group=kafka
Environment="JAVA_HOME=/usr/lib/jvm/java-11-openjdk"
Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-failure
RestartSec=10s
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

Adjust KAFKA_HEAP_OPTS based on your available RAM. A minimum of 1 GB is recommended; production brokers often use 4–6 GB.
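
As a starting point for choosing a value, the sketch below derives a heap size from the machine's total memory. The rule it applies (one quarter of RAM, clamped to the 1 GB minimum and 6 GB upper figure mentioned above) is an illustrative assumption rather than an official Kafka recommendation, and suggest_heap_gb is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# suggest_heap_gb (hypothetical helper): take total memory in MB and print a
# heap size in GB using an assumed rule: RAM / 4, clamped to the range 1..6.
suggest_heap_gb() {
  local total_mb=$1
  local gb=$(( total_mb / 1024 / 4 ))
  if [ "$gb" -lt 1 ]; then gb=1; fi
  if [ "$gb" -gt 6 ]; then gb=6; fi
  echo "$gb"
}

# On Linux, read MemTotal (reported in kB) and print a matching unit-file line.
if [ -r /proc/meminfo ]; then
  total_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
  heap=$(suggest_heap_gb "$total_mb")
  echo "Environment=\"KAFKA_HEAP_OPTS=-Xmx${heap}G -Xms${heap}G\""
fi
```

Kafka relies heavily on the operating system page cache, so leaving most of the memory outside the JVM heap is deliberate in this sketch.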

Step 7: Start ZooKeeper and Kafka

sudo systemctl daemon-reload
sudo systemctl enable zookeeper kafka
sudo systemctl start zookeeper

Wait 10–15 seconds for ZooKeeper to fully initialize, then start Kafka:

sudo systemctl start kafka
sudo systemctl status kafka
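
Instead of waiting a fixed number of seconds, a script can poll until ZooKeeper accepts connections. The following is a minimal sketch: wait_for_port is a hypothetical helper, it relies on bash's /dev/tcp pseudo-device (so it needs bash, not plain sh), and it only proves that the port accepts TCP connections, not that the service is fully healthy.

```shell
#!/usr/bin/env bash
# wait_for_port (hypothetical helper): poll host:port once per second until a
# TCP connection succeeds or the timeout (in seconds, default 30) expires.
wait_for_port() {
  local host=$1 port=$2 timeout=${3:-30} elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # The subshell opens the connection and closes it again on exit.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Example: start Kafka only once ZooKeeper's client port (2181) is reachable.
#   wait_for_port localhost 2181 30 && sudo systemctl start kafka
```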

Confirm that both services are active, then check the Kafka log for errors:

sudo systemctl is-active zookeeper kafka
sudo tail -50 /opt/kafka/logs/server.log

Look for a line containing [KafkaServer id=0] started which confirms the broker is ready.
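
The same check can be scripted, for example in a deployment job. This is a small sketch: broker_started is a hypothetical helper, and the pattern assumes the log line has the form shown above with a numeric broker id.

```shell
#!/usr/bin/env bash
# broker_started (hypothetical helper): succeed if the given log file contains
# the broker's "started" line for any numeric broker id.
broker_started() {
  grep -qE '\[KafkaServer id=[0-9]+\] started' "$1"
}

# Check the log path used in this guide, if it is readable by the current user.
log=/opt/kafka/logs/server.log
if [ -r "$log" ]; then
  if broker_started "$log"; then
    echo "Kafka broker is ready"
  else
    echo "Kafka broker has not logged a start-up message yet" >&2
  fi
fi
```

Note that the log accumulates across restarts, so a match can come from an earlier run; combine it with the systemctl status check when that matters.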

Step 8: Create a Topic

Create a test topic named events with 3 partitions and a replication factor of 1 (appropriate for a single-broker setup):

kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --topic events \
  --partitions 3 \
  --replication-factor 1

List all topics to confirm creation:

kafka-topics.sh --list --bootstrap-server localhost:9092

Describe the topic to see partition and leader information:

kafka-topics.sh --describe --topic events --bootstrap-server localhost:9092

Step 9: Produce and Consume Messages

Open a terminal and start the console producer to send messages to the events topic. Type a message and press Enter to send it:

kafka-console-producer.sh --topic events --bootstrap-server localhost:9092
>Hello Kafka
>This is a test message
>From RHEL 7

Press Ctrl+C to exit the producer. In a separate terminal, start a console consumer to read all messages from the beginning of the topic:

kafka-console-consumer.sh \
  --topic events \
  --bootstrap-server localhost:9092 \
  --from-beginning

You should see the three messages appear. Consumer groups allow multiple consumers to share the load of reading from a topic: each partition is read by only one consumer in the group at a time. To start two consumers in the same group, run the following command in two separate terminals:

kafka-console-consumer.sh \
  --topic events \
  --bootstrap-server localhost:9092 \
  --group myapp-consumers \
  --from-beginning
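
To make the load sharing concrete, the sketch below simulates how the 3 partitions of the events topic could be divided between group members. It illustrates a simple range-style split and is not Kafka's implementation; the real assignment is decided by the strategy configured in the consumer's partition.assignment.strategy setting. assign_partitions is a hypothetical helper name, and the sketch assumes GNU seq.

```shell
#!/usr/bin/env bash
# assign_partitions (hypothetical helper): split partitions 0..N-1 into
# contiguous ranges, one per consumer, giving the first consumers any remainder.
assign_partitions() {
  local partitions=$1 consumers=$2
  local base=$((partitions / consumers)) extra=$((partitions % consumers))
  local c=0 start=0 count
  while [ "$c" -lt "$consumers" ]; do
    count=$base
    # The first "extra" consumers take one additional partition each.
    if [ "$c" -lt "$extra" ]; then count=$((count + 1)); fi
    if [ "$count" -gt 0 ]; then
      echo "consumer-$c: $(seq -s ' ' "$start" $((start + count - 1)))"
    else
      echo "consumer-$c: (idle)"
    fi
    start=$((start + count))
    c=$((c + 1))
  done
}

assign_partitions 3 2
# prints:
#   consumer-0: 0 1
#   consumer-1: 2
```

With more consumers than partitions, the surplus consumers sit idle, which is why the partition count sets the upper limit on parallelism within a group.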

Step 10: Configure the Firewall

Open the Kafka broker port to application servers on your internal network. Do not expose Kafka to the public internet without TLS and SASL authentication:

sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port protocol="tcp" port="9092" accept'
sudo firewall-cmd --reload

ZooKeeper’s port 2181 should remain accessible only on localhost or to other Kafka brokers in the cluster.

Conclusion

You have installed Apache Kafka on RHEL 7 with systemd service management, configured data directories outside the default /tmp paths, created a topic, and verified message production and consumption using the bundled command-line tools. This single-broker setup is suitable for development, testing, and non-critical low-volume workloads, but it offers no fault tolerance. For a production deployment, expand to at least three brokers, set a replication factor of 3 on critical topics, enable TLS encryption for client connections, configure SASL/SCRAM authentication, and define retention or log compaction policies for each topic. Kafka's ecosystem, including Kafka Connect for data integration and Kafka Streams for stream processing, makes it a versatile foundation for building resilient, real-time data architectures.
