*The author selected Girst Who Code to receive a donation as part of the Write for DOnations program.*

Introduction

Deploying a PostgreSQL database on a Kubernetes cluster has become a popular approach for managing scalable, resilient, and dynamic database environments. Kubernetes has container orchestration capabilities that offer a robust framework for deploying and managing applications, including databases like PostgreSQL, in a distributed environment. This integration provides significant scalability, resilience, and efficient resource utilization advantages. By leveraging Kubernetes features such as scalability, automated deployment, and self-healing capabilities, users can ensure the seamless operation of their PostgreSQL databases in a containerized environment.

This guide will explore the step-by-step process of deploying PostgreSQL on a Kubernetes cluster. Whether you are a developer, DevOps engineer, or system administrator looking to deploy PostgreSQL in a Kubernetes environment effectively, this guide aims to provide comprehensive insights and practical steps to successfully set up and manage PostgreSQL databases within a Kubernetes cluster.

Prerequisites

postgres illustration for: Prerequisites

Before you begin this tutorial, you will need the following:

  • A development server or local machine from which you will deploy the PostgreSQL.

Create a ConfigMap to Store Database Details

In Kubernetes, a ConfigMap is an API object that stores configuration data in key-value pairs, which pods or containers can use in a cluster. ConfigMaps helps decouple configuration details from the application code, making it easier to manage and update configuration settings without changing the application's code.

Let's create a ConfigMap configuration file to store PostgreSQL connection details such as hostname, database name, username, and other settings.

				
					nano postgres-configmap.yaml
				
			

Add the following configuration. Define the default database name, user, and password.

				
					apiVersion: v1
kind: ConfigMap
metadata:
 name: postgres-secret
 labels:
 app: postgres
data:
 POSTGRES_DB: ps_db
 POSTGRES_USER: ps_user
 POSTGRES_PASSWORD: SecurePassword
				
			

Let's break down the above configuration:

  • apiVersion: v1 specifies the Kubernetes API version used for this ConfigMap.
  • kind: ConfigMap defines the Kubernetes resource type.
  • Under metadata, the name field specifies the name of the ConfigMap, set as "postgres-secret." Additionally, labels are applied to the ConfigMap to help identify and organize resources.
  • The data section contains the configuration data as key-value pairs.
  • POSTGRES_DB: Specify the default database name for PostgreSQL.
  • POSTGRES_USER: Specify the default username for PostgreSQL.
  • POSTGRES_PASSWORD: Specify the default password for the PostgreSQL user.

Storing sensitive data in a ConfigMap is not recommended due to security concerns. When handling sensitive data within Kubernetes, it's essential to use Secrets and follow security best practices to ensure the protection and confidentiality of your data.

Save and close the file, then apply the ConfigMap configuration to the Kubernetes.

				
					kubectl apply -f postgres-configmap.yaml
				
			

You can verify the ConfigMap deployment using the following command.

				
					kubectl get configmap
				
			

Output.

				
					NAME DATA AGE
kube-root-ca.crt 1 116s
postgres-secret 3 12s
				
			

Create a PersistentVolume (PV) and a PersistentVolumeClaim (PVC)

PersistentVolume (PV) and PersistentVolumeClaim (PVC) are Kubernetes resources that provide and claim persistent storage in a cluster. A PersistentVolume provides storage resources in the cluster, while a PersistentVolumeClaim allows pods to request specific storage resources.

First, create a YAML file for PersistentVolume.

				
					nano psql-pv.yaml
				
			

Add the following configuration.

				
					apiVersion: v1
kind: PersistentVolume
metadata:
 name: postgres-volume
 labels:
 type: local
 app: postgres
spec:
 storageClassName: manual
 capacity:
 storage: 10Gi
 accessModes:
 - ReadWriteMany
 hostPath:
 path: /data/postgresql
				
			

Here is the explanation of each component:

  • storageClassName: manual specifies the StorageClass for this PersistentVolume. The StorageClass named "manual" indicates that provisioning of the storage is done manually.
  • Capacity specifies the desired capacity of the PersistentVolume.
  • accessModes defines the access modes that the PersistentVolume supports. In this case, it is set to ReadWriteMany, allowing multiple Pods to read and write to the volume simultaneously.
  • hostPath is the volume type created directly on the node's filesystem. It is a directory on the host machine's filesystem (path: "/data/postgresql") that will be used as the storage location for the PersistentVolume. This path refers to a location on the host where the data for the PersistentVolume will be stored.

Save the file, then apply the above configuration to the Kubernetes.

				
					kubectl apply -f psql-pv.yaml
				
			

Next, create a YAML for PersistentVolumeClaim.

				
					nano psql-claim.yaml
				
			

Add the following configurations.

				
					apiVersion: v1
kind: PersistentVolumeClaim
metadata:
 name: postgres-volume-claim
 labels:
 app: postgres
spec:
 storageClassName: manual
 accessModes:
 - ReadWriteMany
 resources:
 requests:
 storage: 10Gi
				
			

Let's break down the components:

  • kind: PersistentVolumeClaim indicates that this YAML defines a PersistentVolumeClaim resource.
  • storageClassName: manual specifies the desired StorageClass for this PersistentVolumeClaim.
  • accessModes specifies the access mode required by the PersistentVolumeClaim.
  • Resources define the requested resources for the PersistentVolumeClaim:
  • The requests section specifies the amount of storage requested.

Save the file, then apply the configuration to the Kubernetes.

				
					kubectl apply -f psql-claim.yaml
				
			

Now, use the following command to list all the PersistentVolumes created in your Kubernetes cluster:

				
					kubectl get pv
				
			

This command will display details about each PersistentVolume, including its name, capacity, access modes, status, reclaim policy, and storage class.

				
					NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE

postgres-volume 10Gi RWX Retain Bound default/postgres-volume-claim manual 34s
				
			

To list all the PersistentVolumeClaims in the cluster, use the following command:

				
					kubectl get pvc
				
			

This command will show information about the PersistentVolumeClaims, including their names, statuses, requested storage, bound volumes, and their corresponding PersistentVolume if they are bound.

				
					NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE

postgres-volume-claim Bound postgres-volume 10Gi RWX manual 22s
				
			

Create a PostgreSQL Deployment

Creating a PostgreSQL deployment in Kubernetes involves defining a Deployment manifest to orchestrate the PostgreSQL pods.

Create a YAML file ps-deployment.yaml to define the PostgreSQL Deployment.

				
					nano ps-deployment.yaml
				
			

Add the following content.

				
					apiVersion: apps/v1
kind: Deployment
metadata:
 name: postgres
spec:
 replicas: 3
 selector:
 matchLabels:
 app: postgres
 template:
 metadata:
 labels:
 app: postgres
 spec:
 containers:
 - name: postgres
 image: 'postgres:14'
 imagePullPolicy: IfNotPresent
 ports:
 - containerPort: 5432
 envFrom:
 - configMapRef:
 name: postgres-secret
 volumeMounts:
 - mountPath: /var/lib/postgresql/data
 name: postgresdata
 volumes:
 - name: postgresdata
 persistentVolumeClaim:
 claimName: postgres-volume-claim
				
			

Here is a brief explanation of each parameter:

  • replicas: 3 specifies the desired number of replicas.
  • selector specifies how the Deployment identifies which Pods it manages.
  • template defines the Pod template used for creating new Pods controlled by this Deployment. Under metadata, the labels field assigns labels to the Pods created from this template, with app: postgres.
  • containers specify the containers within the Pod.
  • name: postgres is the name assigned to the container.
  • image: postgres:14 specifies the Docker image for the PostgreSQL database.
  • imagePullPolicy: "IfNotPresent" specifies the policy for pulling the container image.
  • ports specify the ports that the container exposes.
  • envFrom allows the container to load environment variables from a ConfigMap.
  • volumeMounts allows mounting volumes into the container.
  • volumes define the volumes that can be mounted into the Pod.
  • name: postgresdata specifies the name of the volume.
  • persistentVolumeClaim refers to a PersistentVolumeClaim named "postgres-volume-claim". This claim is likely used to provide persistent storage to the PostgreSQL container so that data is retained across Pod restarts or rescheduling.

Save and close the file, then apply the deployment.

				
					kubectl apply -f ps-deployment.yaml
				
			

This command creates the PostgreSQL Deployment based on the specifications provided in the YAML file.

To check the status of the created deployment:

				
					kubectl get deployments
				
			

The following output confirms that the PostgreSQL Deployment has been successfully created.

				
					NAME READY UP-TO-DATE AVAILABLE AGE
postgres 3/3 3 3 17s
				
			

To check the running pods, run the following command.

				
					kubectl get pods
				
			

You will see the running pods in the following output.

				
					NAME READY STATUS RESTARTS AGE
postgres-665b7554dc-cddgq 1/1 Running 0 28s
postgres-665b7554dc-kh4tr 1/1 Running 0 28s
postgres-665b7554dc-mgprp 1/1 Running 1 (11s ago) 28s
				
			

Create a Service for PostgreSQL

In Kubernetes, a Service is used to define a logical set of Pods that enable other Pods within the cluster to communicate with a set of Pods without needing to know the specific IP addresses of those Pods.

Let's create a service manifest file to expose PostgreSQL internally within the Kubernetes cluster:

				
					nano ps-service.yaml
				
			

Add the following configuration.

				
					apiVersion: v1
kind: Service
metadata:
 name: postgres
 labels:
 app: postgres
spec:
 type: NodePort
 ports:
 - port: 5432
 selector:
 app: postgres
				
			

Save the file, then apply this YAML configuration to Kubernetes.

				
					kubectl apply -f ps-service.yaml
				
			

Once the service is created, other applications or services within the Kubernetes cluster can communicate with the PostgreSQL database using the Postgres name and port 5432 as the entry point.

You can verify the service deployment using the following command.

				
					kubectl get svc
				
			

Output.

				
					NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 6m6s
postgres NodePort 10.98.119.102 <none> 5432:30344/TCP 6s
				
			

Connect to PostgreSQL via kubectl

First, list the available Pods in your namespace to find the PostgreSQL Pod:

				
					kubectl get pods
				
			

You will see the running pods in the following output.

				
					NAME READY STATUS RESTARTS AGE
postgres-665b7554dc-cddgq 1/1 Running 0 28s
postgres-665b7554dc-kh4tr 1/1 Running 0 28s
postgres-665b7554dc-mgprp 1/1 Running 1 (11s ago) 28s
				
			

Locate the name of the PostgreSQL Pod from the output.

Once you have identified the PostgreSQL Pod, use the kubectl exec command to connect the PostgreSQL pod.

				
					kubectl exec -it postgres-665b7554dc-cddgq -- psql -h localhost -U ps_user --password -p 5432 ps_db
				
			
  • postgres-665b7554dc-cddgq: This is the pod's name where the PostgreSQL container is running.
  • ps_user: Specifies the username that will be used to connect to the PostgreSQL database.
  • –password: Prompts for the password interactively.
  • ps_db: Specifies the database name to connect to once authenticated with the provided user.

You will be asked to provide a password for Postgres users. After the successful authentication, you will get into the Postgres shell.

				
					Password:
psql (14.10 (Debian 14.10-1.pgdg120+1))
Type "help" for help.
ps_db=#
				
			

Next, verify the PostgreSQL connection using the following command.

				
					ps_db=# \conninfo
				
			

You will see the following output.

				
					You are connected to database "ps_db" as user "ps_user" on host "localhost" (address "::1") at port "5432".
				
			

You can exit from the PostgreSQL shell using the following command.

				
					exit
				
			

Scale PostgreSQL Deployment

Scaling a PostgreSQL deployment in Kubernetes involves adjusting the number of replicas in the Deployment or StatefulSet that manages the PostgreSQL Pods.

First, check the current state of your PostgreSQL deployment:

				
					kubectl get pods -l app=postgres
				
			

Output.

				
					postgres-665b7554dc-cddgq 1/1 Running 0 2m12s
postgres-665b7554dc-kh4tr 1/1 Running 0 2m12s
postgres-665b7554dc-mgprp 1/1 Running 1 (115s ago) 2m12s
				
			

To scale the PostgreSQL deployment to 5 replicas, use the kubectl scale command:

				
					kubectl scale deployment --replicas=5 postgres
				
			

Replace 5 with the number of replicas you want for your PostgreSQL deployment.

Next, recheck the status of your deployment to ensure that the scaling operation was successful:

				
					kubectl get pods -l app=postgres
				
			

You will see that the number of pods increased to 5:

				
					NAME READY STATUS RESTARTS AGE
postgres-665b7554dc-cddgq 1/1 Running 0 3m56s
postgres-665b7554dc-ftxbl 1/1 Running 0 10s
postgres-665b7554dc-g2nh6 1/1 Running 0 10s
postgres-665b7554dc-kh4tr 1/1 Running 0 3m56s
postgres-665b7554dc-mgprp 1/1 Running 1 (3m39s ago) 3m56s
				
			

Backup and Restore PostgreSQL Database

You can back up a PostgreSQL database running in a Kubernetes Pod using the kubectl exec command in conjunction with the pg_dump tool directly within the Pod.

First, List all Pods to find the name of your PostgreSQL Pod:

				
					kubectl get pods
				
			

Next, use the kubectl exec command to run the pg_dump command inside the PostgreSQL Pod:

				
					kubectl exec -it postgres-665b7554dc-cddgq -- pg_dump -U ps_user -d ps_db > db_backup.sql
				
			

This command dumps the database and redirects the output to a file named db_backup.sql in the local directory.

To restore the database back to the Kubernetes pod, you will need the SQL dump file and the use of the psql command to execute the restore process.

First, use the kubectl cp command to copy the SQL dump file from your local machine into the PostgreSQL Pod:

				
					kubectl cp db_backup.sql postgres-665b7554dc-cddgq:/tmp/db_backup.sql
				
			

Next, connect to the PostgreSQL pod using the following command.

				
					kubectl exec -it postgres-665b7554dc-cddgq -- /bin/bash
				
			

Next, run the psql command to restore the backup from the dump file.

				
					psql -U ps_user -d ps_db -f /tmp/db_backup.sql
				
			

Conclusion

This guide outlined the fundamental steps required to set up PostgreSQL successfully within a Kubernetes environment. By leveraging Kubernetes' orchestration capabilities, organizations can efficiently manage PostgreSQL instances, dynamically scale resources, ensure high availability, and streamline maintenance operations.

If you want to learn more about Kubernetes and Helm, please check out our community page's Kubernetes section.