How to Install ClickHouse Columnar Database on RHEL 7

ClickHouse is an open-source columnar database management system, originally developed at Yandex, designed for online analytical processing (OLAP) workloads. Unlike traditional row-oriented databases, ClickHouse stores data by column, so a query reads only the columns it needs and similar values compress well together. The result is much faster analytical queries over billions of rows. This guide covers installing ClickHouse on RHEL 7, configuring the service, and getting started with table creation, data insertion, and queries using the MergeTree engine family.

Prerequisites

  • RHEL 7 server with root or sudo access
  • At least 4 GB of RAM (8 GB or more recommended for meaningful workloads)
  • At least 20 GB of available disk space for data files
  • An internet connection to reach the ClickHouse yum repository, or a local mirror
  • Ports 8123 (HTTP interface) and 9000 (native TCP interface) accessible if querying remotely
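
The memory and disk minimums above can be checked before installing. The following is a minimal sketch, assuming a Linux host with /proc/meminfo and a POSIX df; the function names are illustrative and not part of ClickHouse, and /var is used only because the data directory does not exist yet.

```shell
#!/usr/bin/env bash
# Preflight sketch: compare this host against the prerequisite minimums.
# Function names are illustrative assumptions, not ClickHouse tooling.

# Print total RAM rounded to the nearest GiB, read from a meminfo-style file.
# Rounding is used because a "4 GB" host reports slightly less than 4 GiB.
mem_total_gib() {
    awk '/^MemTotal:/ { printf "%.0f\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Print free space in whole GiB on the filesystem that holds the given path.
disk_free_gib() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Succeed when the measured value meets the required minimum.
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# Print one OK/WARN line for a named resource.
report() {
    if meets_minimum "$2" "$3"; then
        echo "OK   $1: $2 GiB (minimum $3)"
    else
        echo "WARN $1: $2 GiB (minimum $3)"
    fi
}

report "memory" "$(mem_total_gib)" 4
report "disk" "$(disk_free_gib /var)" 20
```

A WARN line does not block the installation; it only flags that the host is below the sizes listed above.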

Step 1: Add the ClickHouse YUM Repository

ClickHouse provides an official RPM repository for RHEL-based distributions. Add the repository definition, which also tells yum where to fetch the repository signing key.

sudo yum install -y yum-utils

sudo yum-config-manager --add-repo \
  https://packages.clickhouse.com/rpm/clickhouse.repo

Alternatively, create the repo file manually:

sudo tee /etc/yum.repos.d/clickhouse.repo <<'EOF'
[clickhouse-stable]
name=ClickHouse - Stable Repository
baseurl=https://packages.clickhouse.com/rpm/stable/
gpgcheck=1
gpgkey=https://packages.clickhouse.com/rpm/stable/repodata/repomd.xml.key
enabled=1
EOF

Verify the repository is available:

sudo yum repolist | grep clickhouse

Step 2: Install ClickHouse Server and Client

Install both the server daemon and the command-line client. The server handles all storage and query execution; the client provides an interactive SQL shell.

sudo yum install -y clickhouse-server clickhouse-client

The package installs:

  • ClickHouse server binary at /usr/bin/clickhouse-server, plus a systemd unit named clickhouse-server
  • Configuration files under /etc/clickhouse-server/
  • Default data directory at /var/lib/clickhouse/
  • Log files written to /var/log/clickhouse-server/
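
To confirm that the locations listed above exist after installation, a small check along these lines may help. It is a sketch only: missing_paths is a hypothetical helper, and the paths are the ones named in this guide.

```shell
#!/usr/bin/env bash
# Print each given path that does not exist; print nothing if all are present.
# missing_paths is an illustrative helper, not a ClickHouse command.
missing_paths() {
    local p
    for p in "$@"; do
        [ -e "$p" ] || echo "$p"
    done
}

missing=$(missing_paths \
    /usr/bin/clickhouse-server \
    /etc/clickhouse-server/config.xml \
    /var/lib/clickhouse \
    /var/log/clickhouse-server)

if [ -z "$missing" ]; then
    echo "All expected ClickHouse paths are present."
else
    echo "Missing paths:"
    echo "$missing"
fi
```

Running rpm -q clickhouse-server clickhouse-client is a complementary check that both packages are registered with the package manager.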

Step 3: Review the Main Configuration File

The primary configuration is in /etc/clickhouse-server/config.xml. For most single-node installations the defaults work well, but review a few key settings.

sudo vi /etc/clickhouse-server/config.xml

Key sections to review:

<!-- Uncomment to listen on all IPv4 interfaces instead of localhost only -->
<listen_host>0.0.0.0</listen_host>

<!-- HTTP port (used by JDBC drivers and many tools) -->
<http_port>8123</http_port>

<!-- Native TCP port (used by clickhouse-client) -->
<tcp_port>9000</tcp_port>

<!-- Data storage location -->
<path>/var/lib/clickhouse/</path>

<!-- Log locations (these two elements sit inside the <logger> section) -->
<log>/var/log/clickhouse-server/clickhouse-server.log</log>
<errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
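
Instead of editing config.xml in place, a setting such as listen_host can be kept in a separate override file under /etc/clickhouse-server/config.d/, which ClickHouse merges into the main configuration. The sketch below writes such a file; the file name listen.xml and the helper name are assumptions, and the clickhouse root element is the one used by current releases (compare with an existing file on your system if you run an older version).

```shell
#!/usr/bin/env bash
# Write a listen_host override into a config.d-style directory.
# $1 = target directory, $2 = address to listen on.
# The file name listen.xml is an arbitrary choice for this sketch.
write_listen_override() {
    local dir=$1 host=$2
    mkdir -p "$dir"
    cat > "$dir/listen.xml" <<EOF
<clickhouse>
    <listen_host>${host}</listen_host>
</clickhouse>
EOF
}

# Demo against a scratch directory so nothing under /etc is touched.
scratch=$(mktemp -d)
write_listen_override "$scratch" "0.0.0.0"
cat "$scratch/listen.xml"
```

On a real server the target directory would be /etc/clickhouse-server/config.d, written as root, followed by a restart of the clickhouse-server service.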

User accounts and passwords are configured separately in /etc/clickhouse-server/users.xml. Out of the box, a user named default exists with no password and full access. Set a password or restrict this account before exposing ClickHouse to a network.
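
One way to give the default user a password is an override file under /etc/clickhouse-server/users.d/ that stores a SHA-256 hash rather than the plain text. The following sketch only prints such a file; the helper names and the sample password "change-me" are placeholders, and it assumes sha256sum from coreutils is available.

```shell
#!/usr/bin/env bash
# Print the SHA-256 hex digest of a password (no trailing newline is hashed).
password_sha256_hex() {
    printf '%s' "$1" | sha256sum | awk '{ print $1 }'
}

# Emit a users.d override for the "default" user.
# The remove attribute drops the empty <password> inherited from users.xml,
# so that only one password field remains after the files are merged.
render_default_password_override() {
    cat <<EOF
<clickhouse>
    <users>
        <default>
            <password remove="1"/>
            <password_sha256_hex>$(password_sha256_hex "$1")</password_sha256_hex>
        </default>
    </users>
</clickhouse>
EOF
}

# "change-me" is a placeholder; substitute a real secret.
render_default_password_override "change-me"
```

Redirecting the output to a file such as /etc/clickhouse-server/users.d/default-password.xml (the name is arbitrary) applies the change; afterwards clickhouse-client needs its --password option to connect.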

Step 4: Start and Enable the ClickHouse Service

sudo systemctl enable clickhouse-server
sudo systemctl start clickhouse-server
sudo systemctl status clickhouse-server

If the status shows active (running), the server is ready. Check the log for any startup errors:

sudo tail -30 /var/log/clickhouse-server/clickhouse-server.log
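
For scripted installs, the server can also be probed over HTTP: the /ping endpoint on port 8123 answers with "Ok." once ClickHouse accepts requests. The helper below is a sketch under that assumption; wait_for_ping is a hypothetical name and the retry defaults are arbitrary.

```shell
#!/usr/bin/env bash
# Poll a URL until it answers "Ok." or the attempts run out.
# $1 = URL, $2 = maximum attempts (default 30), $3 = seconds between attempts (default 1).
# Returns 0 as soon as the expected reply is seen, 1 otherwise.
wait_for_ping() {
    local url=$1 tries=${2:-30} delay=${3:-1} i
    for ((i = 1; i <= tries; i++)); do
        if [ "$(curl -s --max-time 2 "$url" 2>/dev/null)" = "Ok." ]; then
            return 0
        fi
        sleep "$delay"
    done
    return 1
}

# Usage on the server:
#   wait_for_ping "http://localhost:8123/ping" 30 1 && echo "ClickHouse is up"
```

This complements systemctl status: the unit can be active a moment before the HTTP listener is ready.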

Open firewall ports if remote access is needed:

sudo firewall-cmd --permanent --add-port=8123/tcp
sudo firewall-cmd --permanent --add-port=9000/tcp
sudo firewall-cmd --reload

Step 5: Connect with clickhouse-client

Launch the interactive SQL client. By default it connects to localhost on port 9000.

clickhouse-client

You will see the ClickHouse prompt. Run a quick test query:

SELECT version();

SELECT number, number * number AS square
FROM system.numbers
LIMIT 10;

The system.numbers table is a built-in virtual table useful for testing. Explore other system tables:

SELECT name, engine, total_rows, total_bytes
FROM system.tables
WHERE database = 'system'
ORDER BY total_bytes DESC
LIMIT 20;

Step 6: Create a Database and Table with MergeTree Engine

MergeTree is ClickHouse’s most important table engine. It supports primary keys, ordering, and partitioning, and it is the basis for the ReplacingMergeTree, SummingMergeTree, and ReplicatedMergeTree variants used in production clusters.

-- Create a dedicated database
CREATE DATABASE analytics;

-- Switch to the new database
USE analytics;

-- Create a web events table using MergeTree
CREATE TABLE web_events
(
    event_date  Date,
    event_time  DateTime,
    user_id     UInt64,
    page        String,
    country     LowCardinality(String),
    bytes_sent  UInt32,
    status_code UInt16
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id)
SETTINGS index_granularity = 8192;

Key design decisions in this table:

  • PARTITION BY toYYYYMM(event_date): partitions data by month, making it easy to drop old months of data
  • ORDER BY (event_date, user_id): the sorting key, which also serves as the primary key because no separate PRIMARY KEY clause is given; ClickHouse physically sorts data on disk by this key, enabling fast range scans
  • LowCardinality(String) for country: dictionary-encodes repeated string values as integers internally, saving space and improving performance for low-cardinality columns
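
To make the monthly partitioning concrete, the sketch below maps a date to the value toYYYYMM() produces and builds the statement that would drop that month. It only prints text and executes nothing against the server; both function names are illustrative.

```shell
#!/usr/bin/env bash
# Map a YYYY-MM-DD date to the partition value that toYYYYMM() would produce.
partition_for_date() {
    local d=$1
    echo "${d:0:4}${d:5:2}"
}

# Build (but do not run) the statement that drops one monthly partition.
# $1 = table name, $2 = any date inside the month to drop.
drop_partition_sql() {
    echo "ALTER TABLE $1 DROP PARTITION $(partition_for_date "$2")"
}

partition_for_date "2026-05-01"
drop_partition_sql "analytics.web_events" "2026-05-02"
```

Both sample dates fall in partition 202605, so dropping that partition would remove every row from May 2026 in one operation.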

Step 7: Insert Data and Query

Insert sample rows to test the table:

INSERT INTO web_events VALUES
('2026-05-01', '2026-05-01 10:15:00', 1001, '/home', 'US', 4096, 200),
('2026-05-01', '2026-05-01 10:16:30', 1002, '/products', 'GB', 8192, 200),
('2026-05-01', '2026-05-01 10:17:45', 1001, '/checkout', 'US', 2048, 200),
('2026-05-02', '2026-05-02 09:00:00', 1003, '/home', 'DE', 4096, 200),
('2026-05-02', '2026-05-02 09:05:00', 1003, '/about', 'DE', 1024, 404);

Run analytical queries:

-- Count events per country
SELECT country, count() AS events
FROM web_events
GROUP BY country
ORDER BY events DESC;

-- Daily traffic summary
SELECT
    event_date,
    count() AS total_requests,
    sum(bytes_sent) AS total_bytes,
    countIf(status_code = 200) AS successful
FROM web_events
GROUP BY event_date
ORDER BY event_date;

-- Unique visitors per day
SELECT event_date, uniqExact(user_id) AS unique_visitors
FROM web_events
GROUP BY event_date
ORDER BY event_date;
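
As a cross-check, the first two aggregations can be reproduced outside ClickHouse with awk over the same five sample rows. This is only a sanity check of what the queries compute, not a description of how ClickHouse executes them; the function names are made up for this sketch.

```shell
#!/usr/bin/env bash
# The five sample rows from the INSERT statement, as CSV.
sample_rows() {
    cat <<'EOF'
2026-05-01,2026-05-01 10:15:00,1001,/home,US,4096,200
2026-05-01,2026-05-01 10:16:30,1002,/products,GB,8192,200
2026-05-01,2026-05-01 10:17:45,1001,/checkout,US,2048,200
2026-05-02,2026-05-02 09:00:00,1003,/home,DE,4096,200
2026-05-02,2026-05-02 09:05:00,1003,/about,DE,1024,404
EOF
}

# Events per country, highest count first (ties broken alphabetically).
events_per_country() {
    awk -F, '{ n[$5]++ } END { for (c in n) print c, n[c] }' | sort -k2,2nr -k1,1
}

# Per date: total requests, total bytes, and requests with status 200.
daily_summary() {
    awk -F, '{ r[$1]++; b[$1] += $6; if ($7 == 200) ok[$1]++ }
             END { for (d in r) print d, r[d], b[d], ok[d] + 0 }' | sort
}

sample_rows | events_per_country
sample_rows | daily_summary
```

For the sample data this prints two events each for DE and US and one for GB, then 3 requests and 14336 bytes for 2026-05-01 and 2 requests and 5120 bytes for 2026-05-02. ClickHouse may order the tied DE and US rows either way.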

Step 8: Inspect Table Storage Metadata

ClickHouse’s system.tables and system.parts tables expose internals about data storage.

-- See all tables in the analytics database
SELECT name, engine, total_rows, formatReadableSize(total_bytes) AS size
FROM system.tables
WHERE database = 'analytics';

-- See the physical data parts created by MergeTree
SELECT
    table,
    partition,
    name,
    rows,
    formatReadableSize(bytes_on_disk) AS disk_size,
    marks
FROM system.parts
WHERE database = 'analytics' AND active = 1;

Step 9: Load Data from a CSV File

ClickHouse can ingest data from files using the HTTP interface, which is ideal for bulk loads from scripts or ETL pipelines.

# Create a CSV file
cat > /tmp/events.csv <<'EOF'
2026-05-03,2026-05-03 11:00:00,2001,/home,FR,4096,200
2026-05-03,2026-05-03 11:05:00,2002,/blog,CA,6144,200
2026-05-03,2026-05-03 11:10:00,2003,/contact,AU,2048,200
EOF

# Load via the HTTP interface: the INSERT statement goes in the URL
# (spaces encoded as %20) and the CSV data is sent as the POST body
curl -sS "http://localhost:8123/?query=INSERT%20INTO%20analytics.web_events%20FORMAT%20CSV" \
  --data-binary @/tmp/events.csv

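For repeated loads, the HTTP call can be wrapped in a small script that checks the file first. The sketch below assumes simple CSV without quoted commas; csv_has_fields, insert_url, and load_csv are hypothetical helper names, and the default host localhost:8123 matches the setup in this guide.

```shell
#!/usr/bin/env bash
# Succeed only if every line of a CSV file has exactly the expected field count.
# Naive comma split: assumes no quoted fields that contain commas.
csv_has_fields() {
    awk -F, -v want="$2" 'NF != want { bad = 1 } END { exit bad + 0 }' "$1"
}

# Build the HTTP URL for an INSERT ... FORMAT query, encoding spaces as %20.
# $1 = host:port, $2 = table, $3 = format name.
insert_url() {
    local query="INSERT INTO $2 FORMAT $3"
    echo "http://$1/?query=${query// /%20}"
}

# Validate the file, then send it as the POST body.
# ClickHouse reports insert errors in the response body, which curl prints.
load_csv() {
    local file=$1 table=$2 columns=$3 host=${4:-localhost:8123}
    csv_has_fields "$file" "$columns" || {
        echo "unexpected field count in $file" >&2
        return 1
    }
    curl -sS "$(insert_url "$host" "$table" CSV)" --data-binary "@$file"
}

# Show the URL that load_csv would call for the table in this guide.
insert_url "localhost:8123" "analytics.web_events" "CSV"
```

With the server running, load_csv /tmp/events.csv analytics.web_events 7 would validate the seven-column file created above and then insert it.
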
Conclusion

You have installed ClickHouse on RHEL 7, started the server with systemctl, and performed basic operations: creating a MergeTree table with partitioning and ordering, inserting data, running aggregation queries, inspecting internal system tables, and bulk-loading a CSV file over HTTP. ClickHouse’s columnar storage, vectorized query execution, and MergeTree engine family are designed for analytical workloads over billions of rows, which makes it a strong fit for website analytics, IoT event streams, financial time series, and log analysis at scale.