ClickHouse is a column-oriented database management system (DBMS) designed for Online Analytical Processing (OLAP). Unlike row-oriented databases (MySQL, PostgreSQL) where each row is stored together — efficient for retrieving full records — ClickHouse stores each column separately, making it extremely efficient for aggregation queries that scan specific columns across billions of rows. ClickHouse can process hundreds of millions of rows per second on a single server, making it the fastest open-source OLAP database for analytics workloads. It excels at log analysis, time-series data, user behaviour analytics, financial reporting, and any scenario involving large-scale aggregations (GROUP BY, COUNT, SUM, AVG) over immense datasets. This guide covers installing ClickHouse on RHEL 9, creating tables with appropriate engines, and running analytical queries.
Prerequisites
- RHEL 9 with at least 4 GB RAM
- At least 50 GB free disk space for data storage
Step 1 — Add the ClickHouse Repository
curl -fsSL https://packages.clickhouse.com/rpm/lts/repodata/repomd.xml.key | gpg --dearmor -o /usr/share/keyrings/clickhouse.gpg
cat > /etc/yum.repos.d/clickhouse.repo <<'EOF'
[clickhouse-stable]
name=ClickHouse Stable
baseurl=https://packages.clickhouse.com/rpm/stable/
gpgcheck=1
gpgkey=https://packages.clickhouse.com/rpm/lts/repodata/repomd.xml.key
enabled=1
EOF
Step 2 — Install ClickHouse
dnf install -y clickhouse-server clickhouse-client
systemctl enable --now clickhouse-server
systemctl status clickhouse-server
Step 3 — Secure the Installation
# Set a password for the default user
# /etc/clickhouse-server/users.d/default-password.xml
cat > /etc/clickhouse-server/users.d/default-password.xml <<'EOF'
StrongClickHousePass!
EOF
systemctl restart clickhouse-server
Step 4 — Create a Table and Insert Data
clickhouse-client --password StrongClickHousePass!
-- Create a database
CREATE DATABASE analytics;
USE analytics;
-- MergeTree is the primary ClickHouse table engine for large datasets
CREATE TABLE web_events (
event_date Date,
event_time DateTime,
user_id UInt64,
page_url String,
bytes_sent UInt32,
status_code UInt16
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id);
-- Insert sample data
INSERT INTO web_events VALUES
('2024-01-15', '2024-01-15 10:00:00', 1001, '/home', 42500, 200),
('2024-01-15', '2024-01-15 10:01:00', 1002, '/products', 85000, 200),
('2024-01-15', '2024-01-15 10:02:00', 1001, '/checkout', 12000, 404);
-- Aggregation query
SELECT
page_url,
count() AS visits,
sum(bytes_sent) AS total_bytes,
avg(bytes_sent) AS avg_bytes
FROM web_events
GROUP BY page_url
ORDER BY visits DESC;
Step 5 — Import from CSV
# ClickHouse can ingest CSV directly
clickhouse-client --password StrongClickHousePass!
--query "INSERT INTO analytics.web_events FORMAT CSV"
< /tmp/events.csv
Step 6 — Open Firewall for Remote Access
# Default ClickHouse ports: 8123 (HTTP), 9000 (native protocol)
firewall-cmd --permanent --add-port=8123/tcp --add-port=9000/tcp
firewall-cmd --reload
# HTTP API test
curl "http://localhost:8123/?query=SELECT+version()&password=StrongClickHousePass!"
Conclusion
ClickHouse on RHEL 9 delivers exceptional analytical query performance through columnar storage and vectorised query execution. The MergeTree engine family (MergeTree, ReplacingMergeTree, SummingMergeTree, AggregatingMergeTree) provides different data management strategies for different use cases. ClickHouse is typically used alongside an OLTP database (MySQL/PostgreSQL) — the transactional data flows into ClickHouse for analysis via Kafka, direct inserts, or ETL pipelines.
Next steps: How to Install Elasticsearch on RHEL 9, How to Install Apache Kafka on RHEL 9, and How to Install Redis on RHEL 9.