Every Linux administrator needs a reliable toolkit for answering the question “why is this server slow?” The answer almost always lies in one of four resources: CPU, memory, disk I/O, or network. Several tools are available on RHEL 9 for diagnosing resource contention. top is included in every base installation and provides a real-time process table. htop (from EPEL) adds colour, mouse support, tree views, and an easier interface for killing and renicing processes. vmstat provides concise columnar snapshots of CPU, memory, swap, block I/O, and context switches in a single view. iostat gives per-device disk throughput and utilisation. sar records these metrics historically, so you can investigate what happened at 3 AM rather than only what is happening right now; iostat and sar both ship in the sysstat package. Knowing these tools and when to reach for each one is a core competency for RHEL 9 administration.

Prerequisites

  • RHEL 9 server with root or sudo access
  • EPEL repository enabled (for htop)

Step 1 — Install the Required Tools

# htop is in EPEL
dnf install -y htop

# sysstat provides iostat, sar, mpstat, pidstat (vmstat ships in procps-ng, installed by default)
dnf install -y sysstat

# Enable sysstat data collection
systemctl enable --now sysstat

Step 2 — Use top for Real-Time Process Monitoring

top is available on every RHEL system without any installation. Launch it:

top

Key interactive commands while top is running:

  • M — sort by memory usage
  • P — sort by CPU usage (default)
  • T — sort by cumulative time
  • k — kill a process by PID
  • r — renice a process
  • 1 — toggle showing all CPU cores individually
  • H — show individual threads instead of processes
  • f — field manager: add/remove columns
  • W — save current settings to ~/.toprc
  • q — quit

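The header shows the load average over the last 1, 5, and 15 minutes. On Linux, load counts processes that are running or waiting for a CPU as well as those in uninterruptible sleep (usually waiting on I/O). A load average roughly equal to the number of CPU cores suggests the CPUs are fully utilised; values above the core count indicate queueing, with processes waiting for CPU time or I/O.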
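
As a rough illustration of the rule of thumb above, the sketch below divides a load average by the core count. The function name load_ratio is hypothetical and is not part of top or procps-ng.

```shell
# load_ratio LOAD CORES
# Print LOAD divided by CORES to two decimals. A result near or above 1.00
# suggests the CPUs are fully busy or that work is queueing.
# (Hypothetical helper for illustration only.)
load_ratio() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}
```

On a live system, assuming /proc/loadavg and nproc are available, it could be called as: load_ratio "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"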

Non-interactive usage for scripting:

# Take 1 snapshot (batch mode, 1 iteration)
top -b -n 1 | head -30

# Batch snapshot limited to a specific user's processes
top -b -n 1 -u appuser
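
Batch output is plain text, so it can be filtered in a script. The sketch below is one possible approach: it reads a batch snapshot on stdin and prints processes at or above a CPU threshold. The name top_hot_pids and the default threshold of 50 are assumptions for illustration, and the parser assumes PID is the first column of the process table, as in the default layout.

```shell
# top_hot_pids [MIN_CPU]
# Read `top -b -n 1` output on stdin and print "PID %CPU COMMAND" for every
# process at or above MIN_CPU percent (default 50, an arbitrary threshold).
# Columns are looked up by header name; PID must be the first column.
# (Hypothetical helper for illustration only.)
top_hot_pids() {
    awk -v limit="${1:-50}" '
        !body && $1 == "PID" {          # header row of the process table
            for (i = 1; i <= NF; i++) col[$i] = i
            body = 1
            next
        }
        body && NF > 0 && $(col["%CPU"]) + 0 >= limit {
            print $(col["PID"]), $(col["%CPU"]), $(col["COMMAND"])
        }
    '
}
```

Example usage: LC_ALL=C top -b -n 1 | top_hot_pids 80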

Step 3 — Use htop for a Better Interactive Experience

htop provides the same information as top with a far more navigable interface:

htop

The top panel shows a horizontal bar graph for each CPU core and memory/swap usage bars. Key features:

  • F2 (Setup) — customise which meters appear in the header, column layout, and colour scheme
  • F3 (Search) — search for a process by name
  • F4 (Filter) — filter the process list (e.g., show only processes matching “nginx”)
  • F5 (Tree) — toggle tree view to see parent/child process relationships
  • F6 (SortBy) — pick the sort column
  • F9 (Kill) — send a signal to the selected process
  • Space — select/tag multiple processes for bulk actions

Useful command-line flags:

# Show only a specific user's processes
htop -u appuser

# Start in tree view
htop -t

# Start with a specific sort column (PERCENT_CPU = default)
htop --sort-key PERCENT_MEM

Step 4 — Use vmstat for CPU, Memory, and I/O Snapshots

vmstat is the quickest way to get an overview of all resource types at once. Each sample is a single row; the first row reports averages since boot, so read the later rows for current activity:

# Show 10 snapshots every 2 seconds
vmstat 2 10

Output columns explained:

  • procs: r — runnable processes (running or waiting for CPU time). A value consistently above your CPU core count means the CPU is a bottleneck.
  • procs: b — processes in uninterruptible sleep (usually waiting for I/O). High values mean disk I/O is a bottleneck.
  • memory: swpd — amount of swap space in use (KiB)
  • memory: free — idle memory
  • memory: buff — memory used for block device buffers
  • memory: cache — memory used for file cache
  • swap: si — swap in from disk (KB/s). Non-zero values mean the system is actively swapping.
  • swap: so — swap out to disk (KB/s)
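  • io: bi — blocks received from block devices per second (read throughput in 1 KiB blocks, not a count of read operations)
  • io: bo — blocks sent to block devices per second (write throughput in 1 KiB blocks, not a count of write operations)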
  • cpu: us — time in user mode (application code)
  • cpu: sy — time in kernel mode (system calls)
  • cpu: id — idle time. Values near 0 mean the CPU is saturated.
  • cpu: wa — time waiting for I/O. High values indicate disk is the bottleneck.
  • cpu: st — steal time: CPU cycles stolen by the hypervisor. Relevant on virtual machines.

# Show active/inactive memory
vmstat -a 2 5

# Show disk statistics (similar to iostat)
vmstat -d 2 5

# Show slabinfo (kernel memory allocator detail; requires root)
vmstat -m
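
To turn the columns described above into a quick summary, the following sketch averages the run queue (r), I/O wait (wa), and swap traffic (si + so) over a run of samples. It skips the first sample because that row holds since-boot averages. The function name vmstat_summary is hypothetical, and the parser assumes the standard two-line vmstat header printed once (vmstat -n).

```shell
# vmstat_summary
# Read `vmstat -n INTERVAL COUNT` output on stdin and print the average run
# queue, I/O wait percentage, and swap traffic (si + so, KiB/s).
# Columns are looked up by name from the second header line.
# (Hypothetical helper for illustration only.)
vmstat_summary() {
    awk '
        NR == 2 { for (i = 1; i <= NF; i++) col[$i] = i; next }  # column names
        NR <= 3 { next }   # line 1 is the banner, line 3 the since-boot sample
        {
            n++
            r    += $(col["r"])
            wa   += $(col["wa"])
            swap += $(col["si"]) + $(col["so"])
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "avg_r=%.1f avg_wa=%.1f avg_swap_kbs=%.1f\n", r / n, wa / n, swap / n
        }
    '
}
```

Example usage: vmstat -n 2 10 | vmstat_summary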

Step 5 — Use iostat for Disk I/O Analysis

# CPU summary plus per-device stats, refresh every 2 seconds, 5 iterations
iostat -x 2 5

Key iostat -x columns:

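  • %util — percentage of time the device was busy. Values near 100% mean the disk is saturated, although devices that serve requests in parallel (SSDs, NVMe, RAID arrays) may still have headroom at high %util.
  • r_await and w_await — average time (ms) for read and write requests to complete, including time spent queued. High values indicate a slow disk or an overloaded I/O queue. Older sysstat releases showed a single combined await column.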
  • r/s and w/s — reads and writes per second
  • rkB/s and wkB/s — KB read and written per second

# Show only a specific device
iostat -x sda 2 5

# Display in megabytes
iostat -xm 2 5
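
When a server has many block devices, scanning the %util column by eye is tedious. The sketch below prints only the devices at or above a utilisation threshold. The name busy_devices and the default threshold of 80 are assumptions for illustration; the parser locates the %util column by its header, so it does not depend on the exact column set of a given sysstat version.

```shell
# busy_devices [MIN_UTIL]
# Read `iostat -x` output on stdin and print "DEVICE %util" for each device
# at or above MIN_UTIL percent (default 80, an arbitrary threshold).
# (Hypothetical helper for illustration only.)
busy_devices() {
    awk -v limit="${1:-80}" '
        $1 == "Device" || $1 == "Device:" {      # header of a device table
            for (i = 1; i <= NF; i++) if ($i == "%util") u = i
            next
        }
        NF == 0 { u = 0; next }                  # a blank line ends the table
        u && $u + 0 >= limit { print $1, $u }
    '
}
```

Example usage: LC_ALL=C iostat -xy 2 5 | busy_devices 90 (the -y flag omits the first report, which holds since-boot averages).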

Step 6 — Use sar for Historical Data

sar collects and reports system activity. The sysstat service runs a collection job every 10 minutes by default, storing data in /var/log/sa/:

# CPU usage for today
sar -u

# CPU usage yesterday
sar -u -1

# Memory usage for today
sar -r

# Swap activity for today
sar -W

# I/O transfer rate
sar -b

# Load average and run queue
sar -q

# Per-CPU statistics
sar -P ALL -u

# CPU usage for a specific time window
sar -u -s 14:00:00 -e 16:00:00
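
When investigating a past incident, it can help to list only the intervals in which the CPU was nearly saturated. The sketch below filters sar -u output for samples whose %idle falls below a threshold. The name low_idle_intervals and the default threshold of 20 are assumptions for illustration; the parser relies on %idle being the last column of the report.

```shell
# low_idle_intervals [MAX_IDLE]
# Read `sar -u` output on stdin and print "TIME %idle" for each sample whose
# %idle is below MAX_IDLE percent (default 20, an arbitrary threshold).
# Header rows, restart markers, and the Average: line are skipped.
# (Hypothetical helper for illustration only.)
low_idle_intervals() {
    awk -v limit="${1:-20}" '
        $NF == "%idle" { seen = 1; next }        # column header (may repeat)
        seen && $1 != "Average:" && $NF ~ /^[0-9.]+$/ && $NF + 0 < limit {
            ts = $1
            if ($2 == "AM" || $2 == "PM") ts = ts " " $2   # 12-hour locales
            print ts, $NF
        }
    '
}
```

Example usage: LC_ALL=C sar -u | low_idle_intervals 10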

Step 7 — Use mpstat for Per-CPU Core Analysis

# Show per-core CPU stats every 2 seconds for 5 iterations
mpstat -P ALL 2 5

This is useful for identifying CPU affinity issues where one core is at 100% while others are idle, which can indicate single-threaded application bottlenecks or interrupt handling imbalance.

Step 8 — Quick Diagnostic Checklist

# 1-second snapshot of all key metrics
echo "=== Load average ===" && uptime
echo "=== Memory ===" && free -h
echo "=== Swap ===" && swapon --show
echo "=== Top CPU consumers ===" && ps aux --sort=-%cpu | head -6
echo "=== Top RAM consumers ===" && ps aux --sort=-%mem | head -6
# -y skips the since-boot averages so the report reflects the last second
echo "=== Disk I/O ===" && iostat -xy 1 1 | tail -10
# file-nr reports allocated handles, unused handles, and the system-wide maximum
echo "=== Open file handles ===" && cat /proc/sys/fs/file-nr

Troubleshooting Decision Tree

  • High load average, low CPU idle (vmstat cpu: id near 0) — CPU bottleneck. Check htop to identify which processes are consuming CPU. Consider optimising the application, adding CPU resources, or distributing load.
  • High load average, high CPU wa (vmstat cpu: wa) — disk I/O bottleneck. Check iostat -x for devices with %util near 100%. Consider faster storage, I/O scheduling tuning, or offloading I/O to a separate disk.
  • High swap activity (vmstat swap: si/so non-zero) — memory pressure. The system is swapping, which can slow affected workloads by orders of magnitude because disk access is far slower than RAM. Identify which processes use the most RAM and either tune them, restart them, or add RAM.
  • High CPU st (steal time) — the hypervisor is stealing CPU cycles. This is a cloud instance resource contention issue — upgrade the instance type or move to a dedicated host.
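
The decision tree above can be sketched as a small shell function that takes four integer values read from a vmstat row: id, wa, si + so, and st. The function name classify_bottleneck and every threshold in it are assumptions chosen for illustration, not recommended limits; tune them to your workload.

```shell
# classify_bottleneck IDLE WA SWAP STEAL
# Classify a vmstat sample. Arguments are integers as printed by vmstat:
# cpu id, cpu wa, swap si + so, and cpu st.
# All thresholds are illustrative assumptions.
# (Hypothetical helper for illustration only.)
classify_bottleneck() {
    local idle="$1" wa="$2" swap="$3" steal="$4"
    if   [ "$swap"  -gt 0  ]; then echo "memory pressure (swapping)"
    elif [ "$steal" -ge 10 ]; then echo "hypervisor steal"
    elif [ "$wa"    -ge 20 ]; then echo "disk I/O bottleneck"
    elif [ "$idle"  -le 10 ]; then echo "CPU bottleneck"
    else                           echo "no obvious bottleneck"
    fi
}
```

Swap and steal are checked first because either one can inflate the other figures; for example, heavy swapping also raises wa.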

Conclusion

You now have a complete toolkit for diagnosing server performance on RHEL 9. Use htop for interactive exploration and process management, vmstat for a quick multi-resource snapshot, iostat for disk I/O analysis, sar for historical data when investigating past incidents, and mpstat for per-core CPU analysis. Combining these tools gives you full visibility into every layer of system resource utilisation.

Next steps: How to Configure Automatic Security Updates on RHEL 9, How to Set Up a Prometheus Node Exporter on RHEL 9, and How to Analyse Application Performance with perf on RHEL 9.