Prometheus and Grafana are the de facto standard observability stack for Kubernetes, providing metrics collection, long-term storage, alerting, and rich dashboarding in a single integrated solution. The kube-prometheus-stack Helm chart bundles Prometheus, Grafana, Alertmanager, and a comprehensive set of pre-built Kubernetes dashboards and alert rules into a single deployment that takes minutes to install. This tutorial walks you through installing the stack on RHEL 8, exposing Grafana for browser access, exploring the built-in Kubernetes dashboards, defining custom PrometheusRule alerts, and configuring Alertmanager routing in the values file. After completing this guide you will have end-to-end visibility into your cluster’s health and performance.
Prerequisites
- A running Kubernetes cluster on RHEL 8
- Helm 3 installed (helm version should show v3.x)
- kubectl configured with cluster-admin privileges
- At least 4 GB free memory across the cluster for the full stack
- Persistent volume provisioner available (or use --set prometheus.prometheusSpec.storageSpec= to disable PVCs for testing)
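As a quick preflight for the prerequisites above, the following sketch reports whether the two required CLIs are on your PATH. The helper name check_cli is made up for this example, and the script only inspects the local machine; it does not contact the cluster or verify versions.

```shell
# Preflight sketch: report whether each required CLI is on the PATH.
# check_cli is a hypothetical helper name, not part of kubectl or Helm.
check_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check the two tools this tutorial relies on; keep going if one is missing.
for cli in kubectl helm; do
  check_cli "$cli" || true
done
```

If either tool is reported as missing, install it before continuing with Step 1.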
Step 1 — Add the Prometheus Community Helm Repository
The kube-prometheus-stack chart is maintained by the Prometheus community and distributed through their Helm repository. Add and update it before installing.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm search repo prometheus-community/kube-prometheus-stack
Step 2 — Install kube-prometheus-stack
Create a dedicated namespace and install the chart. The installation includes Prometheus, Grafana, Alertmanager, and ServiceMonitors for the core Kubernetes components. The chart defaults to 10 days of Prometheus retention; the command below raises that to 15 days and sets the Grafana admin password explicitly.
kubectl create namespace monitoring
helm install kube-prometheus-stack \
  prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --set grafana.adminPassword='ChangeMe123!' \
  --set prometheus.prometheusSpec.retention=15d \
  --wait
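If you would rather keep the password out of your shell history, the same two overrides can live in a values file. This is a minimal sketch: stack-values.yaml is a file name chosen for this example, and the keys simply mirror the --set flags used in the install command.

```shell
# Sketch: the same two overrides as the --set flags, kept in a file.
# stack-values.yaml is a hypothetical file name chosen for this example.
cat > stack-values.yaml << 'EOF'
grafana:
  adminPassword: 'ChangeMe123!'
prometheus:
  prometheusSpec:
    retention: 15d
EOF

# Restrict permissions because the file contains a password.
chmod 600 stack-values.yaml

# Install with the file instead of --set (run this against your cluster):
#   helm install kube-prometheus-stack \
#     prometheus-community/kube-prometheus-stack \
#     --namespace monitoring --values stack-values.yaml --wait
```

A values file is also easier to review and version than a long list of flags, provided the password is replaced or handled separately before the file is committed anywhere.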
Verify all pods come up in the monitoring namespace:
kubectl get pods -n monitoring
You should see pods for prometheus-kube-prometheus-stack-prometheus, alertmanager-kube-prometheus-stack-alertmanager, kube-prometheus-stack-grafana, and several exporters.
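To spot stragglers in a long pod listing, you can filter the output for pods that are not fully ready. The sketch below assumes the default kubectl get pods column layout (NAME, READY, STATUS, and so on); not_ready_pods is a hypothetical helper name, and it will also flag pods that finished normally with a Completed status.

```shell
# Sketch: list pods that are not fully ready, given `kubectl get pods` output.
# not_ready_pods is a hypothetical helper; it assumes the default column
# layout (NAME READY STATUS ...) and skips the header row.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")   # READY column looks like "1/2"
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Usage against a live cluster:
#   kubectl get pods -n monitoring | not_ready_pods
```

Empty output means every listed pod is Running with all of its containers ready.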
Step 3 — Expose Grafana for Browser Access
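You can expose Grafana using either kubectl port-forwarding or a NodePort Service. Port-forwarding lasts only as long as the kubectl process runs; for access that survives your terminal session, patch the service to NodePort. Keep in mind that a manual patch is not recorded in the Helm release, so a later helm upgrade can revert it.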
# Option A: port-forward (temporary)
kubectl port-forward svc/kube-prometheus-stack-grafana \
  -n monitoring 3000:80 &
# Option B: change service type to NodePort (persistent)
kubectl patch svc kube-prometheus-stack-grafana \
  -n monitoring \
  -p '{"spec":{"type":"NodePort"}}'
kubectl get svc kube-prometheus-stack-grafana -n monitoring
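With port-forwarding, open http://localhost:3000 in your browser. With NodePort, open http://<node-ip>:<node-port>, using the port shown in the service listing. Log in with username admin and the password set during installation. If you use NodePort, also open the port in firewalld on the nodes: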
NODEPORT=$(kubectl get svc kube-prometheus-stack-grafana -n monitoring \
  -o jsonpath='{.spec.ports[0].nodePort}')
sudo firewall-cmd --permanent --add-port="${NODEPORT}/tcp"
sudo firewall-cmd --reload
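One way to keep the NodePort setting across chart upgrades is to declare it in the chart values instead of patching the live Service. The sketch below assumes the bundled Grafana subchart reads its service type from grafana.service.type; grafana-service-values.yaml is a file name chosen for this example.

```shell
# Sketch: declare the NodePort service type in values so upgrades keep it.
# grafana-service-values.yaml is a hypothetical file name.
cat > grafana-service-values.yaml << 'EOF'
grafana:
  service:
    type: NodePort
EOF

# Apply it to the existing release (run against your cluster):
#   helm upgrade kube-prometheus-stack \
#     prometheus-community/kube-prometheus-stack \
#     --namespace monitoring --reuse-values \
#     --values grafana-service-values.yaml
```

Kubernetes assigns the port number, so read it back from the Service before adjusting the firewall rule.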
Step 4 — Explore Built-in Dashboards
The kube-prometheus-stack ships with over 30 pre-built Grafana dashboards. Navigate to Dashboards → Browse in the Grafana UI. Key dashboards to review:
- Kubernetes / Cluster (by Namespace) — namespace-level CPU, memory, and network usage
- Kubernetes / Nodes — per-node CPU saturation, memory pressure, disk I/O, and network
- Kubernetes / Pods — individual pod resource consumption and restart counts
- Kubernetes / API server — request latency, error rates, and etcd metrics
# Query Prometheus directly via port-forward
kubectl port-forward svc/kube-prometheus-stack-prometheus \
  -n monitoring 9090:9090 &
# Example PromQL: CPU usage per namespace
# Open http://localhost:9090 and enter:
# sum(rate(container_cpu_usage_seconds_total{namespace!=""}[5m])) by (namespace)
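The same query can be run from the command line through the Prometheus HTTP API instead of the web UI. The sketch below only builds the PromQL string with an adjustable rate window; cpu_by_namespace_query is a hypothetical helper name, and the curl call in the comment assumes the port-forward to localhost:9090 is still running.

```shell
# Sketch: build the per-namespace CPU query with a configurable rate window.
# cpu_by_namespace_query is a hypothetical helper; $1 is the window (default 5m).
cpu_by_namespace_query() {
  window="${1:-5m}"
  printf 'sum(rate(container_cpu_usage_seconds_total{namespace!=""}[%s])) by (namespace)\n' "$window"
}

# Print the query for a 5-minute window.
cpu_by_namespace_query 5m

# Usage with the HTTP API (requires the port-forward above):
#   curl -sG http://localhost:9090/api/v1/query \
#     --data-urlencode "query=$(cpu_by_namespace_query 5m)"
```

A longer window such as 15m smooths out short spikes, while a shorter one reacts faster to changes in load.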
Step 5 — Create a Custom PrometheusRule Alert
Define a custom alert that fires when a pod has been restarting frequently. Save the following as pod-restart-alert.yaml:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pod-restart-alerts
  namespace: monitoring
  labels:
    release: kube-prometheus-stack
spec:
  groups:
    - name: pod.rules
      rules:
        - alert: PodHighRestartCount
          expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} restarting frequently"
            description: "Container {{ $labels.container }} in pod {{ $labels.pod }} has restarted more than 5 times in the last hour."
kubectl apply -f pod-restart-alert.yaml
# Confirm the rule object exists; once loaded, the alert also appears on the Alerts page of the Prometheus UI
kubectl get prometheusrule -n monitoring
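A rule that exists as an object but never shows up in Prometheus is often missing the release label from the manifest above, since the chart's default rule selector matches on it. The sketch below checks a manifest for that label before applying it; has_release_label is a hypothetical helper, and the line-based grep is an approximation rather than a real YAML parse.

```shell
# Sketch: check that a manifest carries the release label before applying it.
# has_release_label is a hypothetical helper: $1 = file, $2 = release name.
# It matches a "release: <name>" line anywhere in the file (no YAML parsing).
has_release_label() {
  grep -Eq "^[[:space:]]*release:[[:space:]]*$2[[:space:]]*\$" "$1"
}

# Usage:
#   has_release_label pod-restart-alert.yaml kube-prometheus-stack \
#     && kubectl apply -f pod-restart-alert.yaml
```

If you installed the chart under a different release name, pass that name instead and update the label in the manifest to match.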
Step 6 — Configure Alertmanager via values.yaml
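Configure Alertmanager to route warning severity alerts to a Slack channel by creating a custom values file and upgrading the Helm release. Note that the top-level route in this example also uses slack-warnings as its default receiver, so alerts that match no child route are delivered to the same channel.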
cat > alertmanager-values.yaml << 'EOF'
alertmanager:
  config:
    global:
      slack_api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
    route:
      receiver: 'slack-warnings'
      group_by: ['alertname', 'namespace']
      routes:
        - match:
            severity: warning
          receiver: 'slack-warnings'
    receivers:
      - name: 'slack-warnings'
        slack_configs:
          - channel: '#k8s-alerts'
            title: '{{ .CommonAnnotations.summary }}'
            text: '{{ .CommonAnnotations.description }}'
            send_resolved: true
EOF
helm upgrade kube-prometheus-stack \
  prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --values alertmanager-values.yaml \
  --reuse-values
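To check the Slack routing without waiting for a real incident, you can post a synthetic alert to Alertmanager's v2 API. The sketch below only builds the JSON body; test_alert_payload is a hypothetical helper, and the service name and port in the usage comment are assumptions based on the release name used in this tutorial.

```shell
# Sketch: build a JSON body for Alertmanager's v2 alerts endpoint.
# test_alert_payload is a hypothetical helper: $1 = alert name, $2 = summary.
# Arguments are inserted verbatim, so avoid quotes or backslashes in them.
test_alert_payload() {
  printf '[{"labels":{"alertname":"%s","severity":"warning","namespace":"monitoring"},' "$1"
  printf '"annotations":{"summary":"%s","description":"Manual routing test"}}]\n' "$2"
}

# Usage (assumes this service name and port match your release):
#   kubectl port-forward svc/kube-prometheus-stack-alertmanager \
#     -n monitoring 9093:9093 &
#   curl -s -X POST http://localhost:9093/api/v2/alerts \
#     -H 'Content-Type: application/json' \
#     -d "$(test_alert_payload RoutingTest 'Test alert from tutorial')"
```

Because the payload carries severity warning, it should follow the warning route defined in the values file and arrive in the configured Slack channel if the webhook URL is valid.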
Conclusion
You have deployed the full kube-prometheus-stack on RHEL 8, exposed Grafana with browser access, explored built-in Kubernetes cluster and node dashboards, created a custom PrometheusRule alert for pod restart detection, and configured Alertmanager to forward warnings to Slack. This observability stack gives you the metrics visibility needed to catch performance degradation and capacity issues before they become outages.
Next steps: set up persistent storage for Prometheus with a PersistentVolumeClaim for long-term metric retention, add application-level metrics by instrumenting your services with a Prometheus client library, and configure Grafana datasource and dashboard provisioning via ConfigMaps for reproducible setups.