How to Install ClickHouse Columnar Database on RHEL 7
ClickHouse is an open-source columnar database management system, originally developed at Yandex and designed specifically for online analytical processing (OLAP) workloads. Unlike traditional row-oriented databases, ClickHouse stores data by column, so it reads only the columns a query needs and compresses similar values together, which makes analytical queries over billions of rows dramatically faster. This guide covers installing ClickHouse on RHEL 7, configuring the service, and getting started with table creation, data insertion, and queries using the MergeTree engine family.
Prerequisites
- RHEL 7 server with root or sudo access
- At least 4 GB of RAM (8 GB or more recommended for meaningful workloads)
- At least 20 GB of available disk space for data files
- An internet connection to reach the ClickHouse yum repository, or a local mirror
- Ports 8123 (HTTP interface) and 9000 (native TCP interface) reachable if you plan to query remotely
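The RAM and disk minimums above can be checked before installing. The following is a minimal sketch, not part of any ClickHouse tooling: the function names are hypothetical, the 3500 MiB threshold is an assumption that allows for a 4 GB machine reporting slightly less than 4096 MiB, and /var/lib is assumed to be the filesystem that will hold the data directory.

```shell
#!/bin/sh
# Preflight sketch: compare RAM and free disk space against this guide's minimums.
# All function names and thresholds here are illustrative assumptions.

# at_least VALUE MINIMUM -> succeeds when VALUE >= MINIMUM (integers)
at_least() {
    [ "$1" -ge "$2" ]
}

# Total RAM in MiB, read from /proc/meminfo (Linux only)
total_ram_mib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' /proc/meminfo
}

# Free space in MiB on the filesystem that holds the given path
free_disk_mib() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 }'
}

# preflight [PATH] -> prints a warning for each minimum that is not met
preflight() {
    ram=$(total_ram_mib)
    disk=$(free_disk_mib "${1:-/var/lib}")
    at_least "$ram" 3500 || echo "WARN: only ${ram} MiB RAM (guide suggests 4 GB)"
    at_least "$disk" 20480 || echo "WARN: only ${disk} MiB free (guide suggests 20 GB)"
}
```

Calling preflight /var/lib prints nothing when both checks pass.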
Step 1: Add the ClickHouse YUM Repository
ClickHouse provides an official RPM repository for RHEL-based distributions. Add the repository definition and import the GPG key.
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo
Alternatively, create the repo file manually:
sudo tee /etc/yum.repos.d/clickhouse.repo <<'EOF'
[clickhouse-stable]
name=ClickHouse - Stable Repository
baseurl=https://packages.clickhouse.com/rpm/stable/
gpgcheck=0
repo_gpgcheck=1
gpgkey=https://packages.clickhouse.com/rpm/stable/repodata/repomd.xml.key
enabled=1
EOF
Verify the repository is available:
sudo yum repolist | grep clickhouse
Step 2: Install ClickHouse Server and Client
Install both the server daemon and the command-line client. The server handles all storage and query execution; the client provides an interactive SQL shell.
sudo yum install -y clickhouse-server clickhouse-client
The package installs:
- ClickHouse server binary at /usr/bin/clickhouse-server, plus a systemd unit for the service
- Configuration files under /etc/clickhouse-server/
- Default data directory at /var/lib/clickhouse/
- Log files written to /var/log/clickhouse-server/
Step 3: Review the Main Configuration File
The primary configuration is in /etc/clickhouse-server/config.xml. For most single-node installations the defaults work well, but review a few key settings.
sudo vi /etc/clickhouse-server/config.xml
Key sections to review:
<!-- Uncomment to listen on all interfaces instead of localhost only -->
<listen_host>0.0.0.0</listen_host>
<!-- HTTP port (used by JDBC drivers and many tools) -->
<http_port>8123</http_port>
<!-- Native TCP port (used by clickhouse-client) -->
<tcp_port>9000</tcp_port>
<!-- Data storage location -->
<path>/var/lib/clickhouse/</path>
<!-- Log locations (these two elements sit inside the <logger> section) -->
<log>/var/log/clickhouse-server/clickhouse-server.log</log>
<errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
User accounts and passwords are configured separately in /etc/clickhouse-server/users.xml. By default, a user named default exists with no password and full access; set a password or restrict that account before exposing ClickHouse to a network.
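One way to set a password without storing it in plain text is to put its SHA-256 digest in the password_sha256_hex element of users.xml. The sketch below only computes the digest; the function name is hypothetical, and it assumes sha256sum from coreutils is available.

```shell
#!/bin/sh
# sha256_hex PASSWORD -> prints the SHA-256 hex digest of PASSWORD.
# printf is used instead of echo so that no trailing newline gets hashed.
sha256_hex() {
    printf '%s' "$1" | sha256sum | awk '{ print $1 }'
}

# Example (placeholder password, pick your own):
# sha256_hex 'change-me'
```

The resulting digest goes inside a password_sha256_hex element for the user, in place of the empty password element. Restart the server afterwards if the change is not picked up.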
Step 4: Start and Enable the ClickHouse Service
sudo systemctl enable clickhouse-server
sudo systemctl start clickhouse-server
sudo systemctl status clickhouse-server
If the status shows active (running), the server is ready. Check the log for any startup errors:
sudo tail -30 /var/log/clickhouse-server/clickhouse-server.log
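For scripted installs, readiness can also be probed over the HTTP interface, whose /ping endpoint answers with the body "Ok." once the server is up. The helper below is a sketch with a hypothetical name; the one-second polling interval and the default of 30 attempts are arbitrary assumptions.

```shell
#!/bin/sh
# wait_for_ping URL [TRIES] -> polls URL once per second until the body is "Ok."
# Returns 0 on success, 1 if the server never answered within TRIES attempts.
wait_for_ping() {
    url=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        # Command substitution strips the trailing newline from the response
        if [ "$(curl -s --max-time 2 "$url" 2>/dev/null)" = "Ok." ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example:
# wait_for_ping http://localhost:8123/ping 30 && echo "server is up"
```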
Open firewall ports if remote access is needed:
sudo firewall-cmd --permanent --add-port=8123/tcp
sudo firewall-cmd --permanent --add-port=9000/tcp
sudo firewall-cmd --reload
Step 5: Connect with clickhouse-client
Launch the interactive SQL client. By default it connects to localhost on port 9000.
clickhouse-client
You will see the ClickHouse prompt. Run a quick test query:
SELECT version();
SELECT number, number * number AS square
FROM system.numbers
LIMIT 10;
The system.numbers table is a built-in virtual table useful for testing. Explore other system tables:
SELECT name, engine, total_rows, total_bytes
FROM system.tables
WHERE database = 'system'
ORDER BY total_bytes DESC
LIMIT 20;
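The client can also run a single statement non-interactively through its --query option, which is convenient in scripts. The wrapper below is a small sketch; ch_query and the CH_CLIENT variable are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# CH_CLIENT is a hypothetical override so the wrapper can point at another binary
CH_CLIENT=${CH_CLIENT:-clickhouse-client}

# ch_query SQL -> runs one statement non-interactively and prints the result
ch_query() {
    "$CH_CLIENT" --query "$1"
}

# Example:
# ch_query "SELECT version()"
```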
Step 6: Create a Database and Table with MergeTree Engine
MergeTree is ClickHouse’s most important table engine. It supports primary keys, ordering, and partitioning, and it is the basis for the ReplacingMergeTree, SummingMergeTree, and ReplicatedMergeTree variants used in production clusters.
-- Create a dedicated database
CREATE DATABASE analytics;
-- Switch to the new database
USE analytics;
-- Create a web events table using MergeTree
CREATE TABLE web_events
(
    event_date Date,
    event_time DateTime,
    user_id UInt64,
    page String,
    country LowCardinality(String),
    bytes_sent UInt32,
    status_code UInt16
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id)
SETTINGS index_granularity = 8192;
Key design decisions in this table:
- PARTITION BY toYYYYMM(event_date): partitions data by month, making it easy to drop old months of data
- ORDER BY (event_date, user_id): the primary sort key; ClickHouse physically sorts data on disk by this key, enabling fast range scans
- LowCardinality(String) for country: encodes repeated string values as integers internally, saving space and improving performance for low-cardinality columns
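To illustrate why monthly partitioning makes old data easy to drop: toYYYYMM() turns a date such as 2026-05-01 into the partition value 202605, and a whole month can then be removed with ALTER TABLE ... DROP PARTITION. The sketch below only builds and prints that statement so it can be reviewed first; both function names are hypothetical.

```shell
#!/bin/sh
# partition_id_for YYYY-MM-DD -> YYYYMM, the value toYYYYMM() yields for that date
partition_id_for() {
    printf '%s\n' "$1" | cut -c1-4,6-7
}

# drop_partition_sql TABLE YYYY-MM-DD -> prints the ALTER statement, does not run it
drop_partition_sql() {
    printf 'ALTER TABLE %s DROP PARTITION %s\n' "$1" "$(partition_id_for "$2")"
}

# Example: review the output before passing it to clickhouse-client
# drop_partition_sql analytics.web_events 2026-05-01
```

Dropping a partition deletes every row in that month, so it is worth printing and checking the statement before executing it.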
Step 7: Insert Data and Query
Insert sample rows to test the table:
INSERT INTO web_events VALUES
('2026-05-01', '2026-05-01 10:15:00', 1001, '/home', 'US', 4096, 200),
('2026-05-01', '2026-05-01 10:16:30', 1002, '/products', 'GB', 8192, 200),
('2026-05-01', '2026-05-01 10:17:45', 1001, '/checkout', 'US', 2048, 200),
('2026-05-02', '2026-05-02 09:00:00', 1003, '/home', 'DE', 4096, 200),
('2026-05-02', '2026-05-02 09:05:00', 1003, '/about', 'DE', 1024, 404);
Run analytical queries:
-- Count events per country
SELECT country, count() AS events
FROM web_events
GROUP BY country
ORDER BY events DESC;
-- Daily traffic summary
SELECT
    event_date,
    count() AS total_requests,
    sum(bytes_sent) AS total_bytes,
    countIf(status_code = 200) AS successful
FROM web_events
GROUP BY event_date
ORDER BY event_date;
-- Unique visitors per day
SELECT event_date, uniqExact(user_id) AS unique_visitors
FROM web_events
GROUP BY event_date
ORDER BY event_date;
Step 8: Inspect Table Storage Metadata
ClickHouse’s system.tables and system.parts tables expose internals about data storage.
-- See all tables in the analytics database
SELECT name, engine, total_rows, formatReadableSize(total_bytes) AS size
FROM system.tables
WHERE database = 'analytics';
-- See the physical data parts created by MergeTree
SELECT
    table,
    partition,
    name,
    rows,
    formatReadableSize(bytes_on_disk) AS disk_size,
    marks
FROM system.parts
WHERE database = 'analytics' AND active = 1;
Step 9: Load Data from a CSV File
ClickHouse can ingest data from files using the HTTP interface, which is ideal for bulk loads from scripts or ETL pipelines.
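A CSV load fails if a row does not match the table's column count, so it can help to define a quick sanity check up front. The helper below is a rough sketch with hypothetical function names. It splits naively on commas, so it assumes no quoted fields that contain commas.

```shell
#!/bin/sh
# bad_csv_rows FILE EXPECTED -> prints "line: text" for rows whose field count differs.
# Naive comma split: quoted fields containing commas are not handled.
bad_csv_rows() {
    awk -F',' -v want="$2" 'NF != want { printf "%d: %s\n", NR, $0 }' "$1"
}

# csv_ok FILE EXPECTED -> succeeds only when every row has EXPECTED fields
csv_ok() {
    [ -z "$(bad_csv_rows "$1" "$2")" ]
}

# Example: web_events has 7 columns
# csv_ok /tmp/events.csv 7 && echo "safe to load"
```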
# Create a CSV file
cat > /tmp/events.csv <<'EOF'
2026-05-03,2026-05-03 11:00:00,2001,/home,FR,4096,200
2026-05-03,2026-05-03 11:05:00,2002,/blog,CA,6144,200
2026-05-03,2026-05-03 11:10:00,2003,/contact,AU,2048,200
EOF
# Load via the HTTP interface: the URL-encoded query goes in the URL, the CSV in the POST body
curl -s "http://localhost:8123/?query=INSERT%20INTO%20analytics.web_events%20FORMAT%20CSV" \
  --data-binary @/tmp/events.csv
Conclusion
You have installed ClickHouse on RHEL 7, started the server with systemctl, and worked through the basics: creating a MergeTree table with partitioning and ordering, inserting data, running aggregation queries, inspecting internal system tables, and loading a CSV file over HTTP. ClickHouse’s columnar storage, vectorized query execution, and MergeTree engine family are built for analytical workloads over billions of rows, which makes it a strong choice for website analytics, IoT event streams, financial time series, and log analysis at scale.