How to Install Apache Kafka on RHEL 7
Apache Kafka is a distributed event streaming platform designed for high-throughput, fault-tolerant, real-time data pipelines. Originally developed at LinkedIn and later open-sourced under the Apache Software Foundation, Kafka is used by thousands of companies to process trillions of events per day. Unlike traditional message queues, Kafka persists messages to disk in an ordered, append-only log and lets consumer groups read from any point in the retained history. This makes it well suited to log aggregation, real-time analytics, event sourcing, and data pipelines between microservices.
This guide covers installing the Java prerequisite on RHEL 7, downloading and extracting the Kafka distribution, starting ZooKeeper and the Kafka broker, creating topics, and sending and receiving messages with the bundled command-line tools. It also provides systemd unit files so that ZooKeeper and Kafka run as managed system services.
Prerequisites
- RHEL 7 server with at least 2 GB RAM and 10 GB available disk space
- Root or sudo access
- Active Red Hat subscription or configured yum repositories
- Internet access to download the Kafka tarball from Apache mirrors
- Basic familiarity with systemctl and Linux file management
Step 1: Install Java
Kafka is written in Scala and Java and runs on the Java Virtual Machine. Install OpenJDK 11, which Kafka 3.x supports (Java 8 has been deprecated since Kafka 3.0):
sudo yum install -y java-11-openjdk java-11-openjdk-devel
If your workload requires Kafka 2.x for compatibility reasons, OpenJDK 8 also works:
sudo yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel
Verify Java is installed and check the version:
java -version
Expected output (for OpenJDK 11):
openjdk version "11.0.22" 2024-01-16 LTS
OpenJDK Runtime Environment (Red Hat) (build 11.0.22+7-LTS)
OpenJDK 64-Bit Server VM (Red Hat) (build 11.0.22+7-LTS, mixed mode, sharing)
If you have multiple Java versions installed, set the default with:
sudo alternatives --config java
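When several JDKs are installed, it can help to check the active major version in a script rather than by eye. The following is a minimal sketch; the function name java_major is hypothetical, and it assumes the banner line has the same shape as the expected output shown above.

```shell
# Print the major Java version found in a `java -version` banner line.
java_major() {
  # $1 looks like: openjdk version "11.0.22" 2024-01-16 LTS
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;  # legacy scheme: 1.8.0_402 -> 8
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;  # current scheme: 11.0.22 -> 11
  esac
}

# Only query the JVM when one is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  major=$(java_major "$(java -version 2>&1 | grep 'version "' | head -n 1)")
  echo "Detected Java major version: $major"
fi
```

If the reported version is not the one you intended, rerun the alternatives command above and select the desired JDK.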
Step 2: Create a Kafka System User
Run Kafka as a dedicated non-root user to limit the blast radius of any security issue:
sudo useradd --no-create-home --shell /bin/false kafka
Step 3: Download and Extract Kafka
Check the Apache Kafka downloads page for the latest stable release. This guide uses Kafka 3.7.0 built for Scala 2.13. Download the tarball to /tmp (if the URL below returns a 404, the release has probably been moved to the archive at archive.apache.org/dist/kafka/):
cd /tmp
curl -LO https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz
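Verify the integrity of the download against the SHA512 checksum published on the downloads page. Apache publishes Kafka checksums as space-separated groups of hex digits, a layout that sha512sum --check may reject, so print both values and compare them (ignoring spaces, line breaks, and letter case):
curl -LO https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz.sha512
sha512sum kafka_2.13-3.7.0.tgz
cat kafka_2.13-3.7.0.tgz.sha512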
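If the published checksum file is not in the format that sha512sum expects, the comparison can be scripted by normalizing both sides first. This is a sketch only: the helper name verify_sha512 is hypothetical, and it assumes the checksum file uses a "filename: HEX HEX ..." layout, possibly wrapped across several lines.

```shell
# Compare a file's SHA-512 digest with a checksum file in "name: HEX HEX ..." layout.
verify_sha512() {
  # $1: downloaded file, $2: published checksum file
  actual=$(sha512sum "$1" | cut -d' ' -f1)
  # Strip the "name:" prefix, remove all whitespace, and lowercase the hex digits.
  expected=$(sed 's/^[^:]*://' "$2" | tr -d ' \t\r\n' | tr 'A-F' 'a-f')
  [ "$actual" = "$expected" ]
}

# Run the check only when both files from this step are present.
if [ -f kafka_2.13-3.7.0.tgz ] && [ -f kafka_2.13-3.7.0.tgz.sha512 ]; then
  if verify_sha512 kafka_2.13-3.7.0.tgz kafka_2.13-3.7.0.tgz.sha512; then
    echo "checksum OK"
  else
    echo "checksum MISMATCH" >&2
  fi
fi
```

Do not extract or run the archive if the check reports a mismatch; download it again instead.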
Extract the archive and move it to /opt/kafka:
tar xzf kafka_2.13-3.7.0.tgz
sudo mv kafka_2.13-3.7.0 /opt/kafka
sudo chown -R kafka:kafka /opt/kafka
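Add Kafka’s bin directory to the system PATH by creating a profile script. The backslash stops $PATH from being expanded while the file is written, so it is evaluated at login instead of freezing root's current PATH into the script:
sudo bash -c 'echo "export PATH=\$PATH:/opt/kafka/bin" > /etc/profile.d/kafka.sh'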
source /etc/profile.d/kafka.sh
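One detail worth checking in any generated profile script is when $PATH gets expanded: if the shell that writes the file expands it, that shell's PATH is frozen into the script for every user. The sketch below writes the line with $PATH kept literal; the function name write_kafka_profile is hypothetical, and the demo targets a scratch file rather than /etc/profile.d/kafka.sh.

```shell
# Write a profile snippet that appends Kafka's bin directory to PATH at login.
write_kafka_profile() {
  # $1: destination file (this guide uses /etc/profile.d/kafka.sh)
  # Single quotes keep $PATH literal, so it is expanded when the file is sourced.
  printf '%s\n' 'export PATH="$PATH:/opt/kafka/bin"' > "$1"
}

# Demonstrate against a scratch file instead of the real /etc/profile.d path.
demo_profile=$(mktemp)
write_kafka_profile "$demo_profile"
cat "$demo_profile"
```

The printed line should contain the literal text $PATH, not a list of directories.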
Step 4: Configure Kafka Data Directories
Create dedicated directories for Kafka and ZooKeeper data storage. Keeping data on a separate mount point from the OS is recommended in production:
sudo mkdir -p /var/kafka/data /var/kafka/zookeeper
sudo chown -R kafka:kafka /var/kafka
Update the ZooKeeper configuration to point to the new data directory:
sudo sed -i 's|dataDir=/tmp/zookeeper|dataDir=/var/kafka/zookeeper|' \
  /opt/kafka/config/zookeeper.properties
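A sed substitution like this only takes effect if the file still contains the default dataDir line. For repeatable setups, a small helper that replaces a key when it exists and appends it otherwise may be more robust. This is a sketch under assumptions: set_prop is a hypothetical name, it relies on GNU sed's -i option, and the demo runs against a scratch file rather than the real configuration.

```shell
# Set key=value in a Java-style properties file, replacing an existing
# (possibly commented-out) entry or appending one if none exists.
set_prop() {
  file=$1 key=$2 value=$3
  if grep -q "^#\{0,1\}${key}=" "$file"; then
    # "|" is the sed delimiter, so values must not contain "|" or "&".
    sed -i "s|^#\{0,1\}${key}=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Demonstrate on a scratch file rather than the real configuration files.
demo_conf=$(mktemp)
printf '%s\n' 'dataDir=/tmp/zookeeper' '#listeners=PLAINTEXT://:9092' > "$demo_conf"
set_prop "$demo_conf" dataDir /var/kafka/zookeeper
set_prop "$demo_conf" listeners PLAINTEXT://0.0.0.0:9092
set_prop "$demo_conf" log.dirs /var/kafka/data
cat "$demo_conf"
```

Against the real files, the helper would need to run with sudo, for example from a root shell, because /opt/kafka/config is owned by the kafka user.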
Update the Kafka broker configuration:
sudo vi /opt/kafka/config/server.properties
Locate and update the following settings:
# Data directory
log.dirs=/var/kafka/data
# Listener configuration - bind to the server's IP for external clients
# For localhost-only testing use: PLAINTEXT://localhost:9092
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://YOUR_SERVER_IP:9092
# Retention settings
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
# Performance tuning
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
Replace YOUR_SERVER_IP with the actual IP address or hostname of your server.
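The retention and buffer settings above are expressed in raw hours, bytes, and milliseconds. The short calculation below converts them into friendlier units, which makes it easier to sanity-check a value before changing it. The variable names are illustrative only.

```shell
# Convert the raw values from server.properties into human-friendly units.
retention_days=$((168 / 24))                      # log.retention.hours
segment_gib=$((1073741824 / 1024 / 1024 / 1024))  # log.segment.bytes
check_minutes=$((300000 / 1000 / 60))             # log.retention.check.interval.ms
socket_buf_kib=$((102400 / 1024))                 # socket.send/receive.buffer.bytes
max_request_mib=$((104857600 / 1024 / 1024))      # socket.request.max.bytes

echo "retention:        ${retention_days} days"
echo "segment size:     ${segment_gib} GiB"
echo "retention check:  every ${check_minutes} minutes"
echo "socket buffers:   ${socket_buf_kib} KiB"
echo "max request size: ${max_request_mib} MiB"
```

In other words, the broker keeps messages for 7 days, rolls log segments at 1 GiB, and checks for expired segments every 5 minutes.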
Step 5: Create the ZooKeeper systemd Service
Create a systemd unit file to manage ZooKeeper as a service. This guide runs Kafka in ZooKeeper mode, which Kafka 2.x requires for cluster coordination and Kafka 3.x still supports alongside the newer KRaft mode:
sudo vi /etc/systemd/system/zookeeper.service
Add the following content:
[Unit]
Description=Apache ZooKeeper (Kafka dependency)
Documentation=https://zookeeper.apache.org
Requires=network.target
After=network.target
[Service]
Type=simple
User=kafka
Group=kafka
Environment="JAVA_HOME=/usr/lib/jvm/java-11-openjdk"
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-failure
RestartSec=10s
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Step 6: Create the Kafka systemd Service
sudo vi /etc/systemd/system/kafka.service
Add the following content:
[Unit]
Description=Apache Kafka Broker
Documentation=https://kafka.apache.org
Requires=zookeeper.service
After=zookeeper.service
[Service]
Type=simple
User=kafka
Group=kafka
Environment="JAVA_HOME=/usr/lib/jvm/java-11-openjdk"
Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-failure
RestartSec=10s
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Adjust KAFKA_HEAP_OPTS based on your available RAM. A minimum of 1 GB is recommended; production brokers often use 4–6 GB.
Step 7: Start ZooKeeper and Kafka
sudo systemctl daemon-reload
sudo systemctl enable zookeeper kafka
sudo systemctl start zookeeper
Wait 10–15 seconds for ZooKeeper to fully initialize, then start Kafka:
sudo systemctl start kafka
sudo systemctl status zookeeper kafka
Both services should report active (running). Then check the Kafka log for errors:
sudo tail -50 /opt/kafka/logs/server.log
Look for a line containing [KafkaServer id=0] started which confirms the broker is ready.
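In a provisioning script, polling the log for that line is usually more reliable than a fixed sleep. The helper below is a sketch; wait_for_log_line is a hypothetical name, and the demo uses a scratch file with a sample log line rather than the real server.log.

```shell
# Poll a log file until it contains a fixed string or the timeout expires.
wait_for_log_line() {
  # $1: log file, $2: literal text to look for, $3: timeout in seconds
  file=$1 needle=$2 remaining=$3
  while [ "$remaining" -gt 0 ]; do
    if grep -qF -- "$needle" "$file" 2>/dev/null; then
      return 0
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
  return 1
}

# Demonstrate with a scratch log containing a sample startup line.
demo_log=$(mktemp)
echo 'INFO [KafkaServer id=0] started (kafka.server.KafkaServer)' > "$demo_log"
if wait_for_log_line "$demo_log" '[KafkaServer id=0] started' 5; then
  echo "broker ready"
fi
```

For the real broker, the call would be wait_for_log_line /opt/kafka/logs/server.log '[KafkaServer id=0] started' 60. Note that server.log can still hold a matching line from an earlier run, so this check is most meaningful on a first start or after the log has rotated.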
Step 8: Create a Topic
Create a test topic named events with 3 partitions and a replication factor of 1 (appropriate for a single-broker setup):
kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --topic events \
  --partitions 3 \
  --replication-factor 1
List all topics to confirm creation:
kafka-topics.sh --list --bootstrap-server localhost:9092
Describe the topic to see partition and leader information:
kafka-topics.sh --describe --topic events --bootstrap-server localhost:9092
Step 9: Produce and Consume Messages
Open a terminal and start the console producer to send messages to the events topic. Type a message and press Enter to send it:
kafka-console-producer.sh --topic events --bootstrap-server localhost:9092
>Hello Kafka
>This is a test message
>From RHEL 7
Press Ctrl+C to exit the producer. In a separate terminal, start a console consumer to read all messages from the beginning of the topic:
kafka-console-consumer.sh \
  --topic events \
  --bootstrap-server localhost:9092 \
  --from-beginning
You should see the three messages appear. Consumer groups allow multiple consumers to share the load of reading from a topic. To start two consumers in the same group, run the following command in two separate terminals:
kafka-console-consumer.sh \
  --topic events \
  --bootstrap-server localhost:9092 \
  --group myapp-consumers \
  --from-beginning
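To build intuition for how a group shares a topic, the sketch below splits a number of partitions across a number of consumers in contiguous ranges, with earlier consumers taking any remainder. It is a simplified illustration of range-style assignment, not Kafka's own code, and the function name range_assign is hypothetical.

```shell
# Split partitions 0..P-1 across C consumers in contiguous ranges.
range_assign() {
  # $1: number of partitions, $2: number of consumers in the group
  partitions=$1 consumers=$2
  per=$((partitions / consumers))
  extra=$((partitions % consumers))
  start=0
  c=0
  while [ "$c" -lt "$consumers" ]; do
    count=$per
    # The first "extra" consumers each take one additional partition.
    if [ "$c" -lt "$extra" ]; then
      count=$((count + 1))
    fi
    list=""
    p=$start
    while [ "$p" -lt $((start + count)) ]; do
      list="$list $p"
      p=$((p + 1))
    done
    echo "consumer-$c:$list"
    start=$((start + count))
    c=$((c + 1))
  done
}

# The events topic has 3 partitions; the group above has 2 consumers.
range_assign 3 2
```

With 3 partitions and 2 consumers, one consumer reads two partitions and the other reads one; a fourth consumer in the same group would sit idle. The actual assignment on a running broker can be inspected with kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group myapp-consumers.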
Step 10: Configure the Firewall
Open the Kafka broker port to application servers on your internal network. Do not expose Kafka to the public internet without TLS and SASL authentication:
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port protocol="tcp" port="9092" accept'
sudo firewall-cmd --reload
ZooKeeper’s port 2181 should remain accessible only on localhost or to other Kafka brokers in the cluster.
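When several application subnets need access, generating the rich rules from a list keeps them consistent. The sketch below only prints the firewall-cmd commands for review; kafka_rich_rule is a hypothetical helper name, and the second subnet is a made-up example.

```shell
# Build a firewalld rich rule that allows one IPv4 subnet to reach one TCP port.
kafka_rich_rule() {
  # $1: source subnet in CIDR form, $2: TCP port
  printf 'rule family="ipv4" source address="%s" port protocol="tcp" port="%s" accept\n' "$1" "$2"
}

# Print the commands for review instead of running them.
# 10.0.5.0/24 is a hypothetical second application subnet.
for subnet in 192.168.1.0/24 10.0.5.0/24; do
  echo "sudo firewall-cmd --permanent --add-rich-rule='$(kafka_rich_rule "$subnet" 9092)'"
done
echo "sudo firewall-cmd --reload"
```

Review the printed commands, then run them manually or pipe the output to a shell once the subnets are confirmed.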
Conclusion
You have installed Apache Kafka on RHEL 7 with proper systemd service management, configured data directories outside the default /tmp paths, created a topic, and verified message production and consumption using the bundled command-line tools. This single-broker setup is suitable for development, testing, and low-volume production workloads. For a production deployment, you would expand to at least three brokers for fault tolerance, set a replication factor of 3 on critical topics, enable TLS encryption for client connections, configure SASL/SCRAM authentication, and implement regular log compaction or retention policies. Kafka’s ecosystem — including Kafka Connect for data integration and Kafka Streams for stream processing — makes it a versatile foundation for building resilient, real-time data architectures.