Highly available and scalable PostgreSQL on Kubernetes (k8s) with the Crunchy PostgreSQL Operator (pgo).
Following on from our post of getting Kubernetes stood up atop VMware’s vSphere. The next stage in deploying Zercurity to Kubernetes is to have a reliable backend database. Which can normally be a headache. However, the team over at Crunchy Data have built an awesome tool to deploy and manage a PostgreSQL cluster.
The Crunchy PostgreSQL Operator (
pgo) can build PostgreSQL replica sets, manage backups and scale up and scale down additional nodes as well as scale the CPU and memory requirements of running Kubernetes pods. Backups can also be managed and scheduled via the
pgo command line tool. Crunchy PostgreSQL also provides a number of add-in pods such as performance metrics via Grafana and Prometheus. As well as pgadmin (a web based GUI to manage your PostgreSQL cluster).
A working and up-to-date Kubernetes cluster. Whilst
pgo is possible to install without helm. We’ll be using helm to deploy and manage our pgo operator nodes.
There are two main parts to
pgo. There is the management layer (control plane) which provides APIs to the
pgo command line tool. Which also exposes a RESTful API to manage your PostgreSQL clusters which is nice. The other part are your deployed database clusters which span a number of other pod types which we’re going to explore.
There are a few ways to deploy
pgo. However, whilst installing helm is an additional step is does make the deployment and upgrade-ability of
pgo a whole lot easier.
curl https://baltocdn.com/helm/signing.asc | sudo apt-key add -
sudo apt-get install apt-transport-https --yes
echo "deb https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm
From the helm releases page you can download the pre-complied binary. Or if you have homebrew installed you can just run
brew install helm .
curl -o helm.tar.gz https://get.helm.sh/helm-v3.4.1-darwin-amd64.tar.gz
tar -zxvf helm-v3.4.1-darwin-amd64.tar.gz
sudo mv helm /usr/local/bin/helm
sudo chmod +x /usr/local/bin/helm
In order to administer the cluster, Crunchy Data provide a helper utility called
pgo. This will connect to the operator control plane API on port 8443 to create and mange database clusters and users amongst other utilities.
Start by checking out the most recent version of Crunchy Data’s PostgreSQL operator. You’ll need
git installed (
apt install git).
Once cloned we can run there client setup script to download and install the
pgo utility for us.
chmod +x client-setup.sh
Note: In the next part, for Mac OSX users, you must use
~/.bash_profile instead of
Once the script has finished running you’ll see something like this:
You’ll then need to go ahead and append this into the end of your
.bashrc file. All we’re doing here is letting your system know to set some additional environmental variables for each new session you create so you won’t have to enter them in every time. When using the
Simply run the
source command which will apply the new configuration to your current session.
You should now be able to run the
pgo version command to check everything is working (you’ll get an error about fetching the apirserver version. Which we’re going to setup next).
Installing pgo operator
pgo client is install we can now deploy the PostgreSQL operator. Which is super simple with helm:
helm install postgres-operator . -n pgo
This will take a few minutes to deploy everything and once its done. You’ll be able to query the new pods deployed. To check its running correctly.
kubectl -n pgo get pods
You should see a single pod like so:
NAME READY STATUS RESTARTS AGE
postgres-operator-56d6ccb97-tmz7m 4/4 Running 0 2m
Lastly, in order for your local
pgo client to work correctly we need to port-forward the remote API server to your local machine like so:
kubectl -n pgo port-forward svc/postgres-operator 8443:8443
You can now re-run
pgo version which will now yield an apiserver version:
pgo client version 4.5.1
pgo-apiserver version 4.5.1
If you get the following error:
Error: Get “https://127.0.0.1:8443/version": dial tcp 127.0.0.1:8443: connect: connection refused it’ll be because your port-forwarding request (shown above) has died and you’ll need to restart it.
Installing pgo metrics
This stage is optional and can be performed at a later stage. As of version 4.5.1 you can now easily deploy metrics to monitor the performance of both your PostgreSQL hosts and pods with Grafana and Prometheus out-of-the-box with a single command. Awesome:
To deploy support for metrics simply run the following from the root of the postgres operator git repository.
helm install metrics . -n pgo
Once helm has finished running you can check everything has been deployed correctly like so:
kubectl -n pgo get poNAME READY STATUS RESTARTS AGE
postgres-operator-689598d795 4/4 Running 1 7h
Once you’ve spun up your first cluster as shown in the next part of this post the dashboard will be automatically populated. To access the metrics dashboard you’ll need to port-forward the Grafana service to your local machine:
kubectl -n pgo port-forward --address 0.0.0.0 svc/crunchy-grafana 3000:3000
The default login for Grafana is admin and use admin again for the password. Your Grafana dashboard will be available from:
Creating your first cluster
With all that setup done you can now deploy your first PostgreSQL cluster. I’m going to name our new cluster “zercurity” and manually give it a user and database name otherwise, one is generated for you. Using the name of your cluster for the databse name and “testuser” for the username.
This will deploy two additional servers to act as replicas. Usually to provide additional redundancy and read throughput. This can be scaled up and down at a later stage using the
This provides a haproxy style service to load balance and rate limit connecting clients.
This enables support for metrics. As soon as the cluster is up and running you’ll be able to see metrics appear in your Grafana dashboard.
As we’re using replica nodes in our cluster. You can turn on replication synchronization, which is useful for workloads that are sensitive to losing transactions. PostgreSQL will not consider a transaction to be committed until it is committed to all synchronous replicas connected to a primary. This comes at the cost of performance.
Lastly but probably most importantly the
pvc-sizeis the amount of disk space we’re going to allocate for each of our database nodes. In this example we’re allocating 150GB of data. So for 3 nodes (1 master and 2 replicas) that’ll be 450GB in total.
This needs to really exceed your
pvc-size. If your running incremental backup jobs the total size will more or less match the
pvc-size. However, if you’re planning on running complete backups you’ll wan to make sure you have a few multiples of your
pvc-sizeallocated. As the allocated space used will compound over time.
pgo create cluster zercurity --replica-count=2 \
--pgbouncer --pgbadger --metrics --username zercurity \
--sync-replication --database=zercurity --pvc-size=150Gi \
This command will return your new database credentials. However, the cluster will not yet be live and will take a few minutes to come online whilst it configures itself. You can check the status of your cluster like so (where zercurity is your cluster name):
pgo test <zercurity>cluster : zercurity
primary (100.66.14.31:5432): UP
pgbouncer (100.69.208.231:5432): UP
replica (100.67.75.10:5432): UP
replica (zercurity-79cf8bc9c-499jp): UP
replica (zercurity-vcvm-66cfb945f5-5f4hx): UP
replica (zercurity-xfrx-69b4f4bbc-69vtr): UP
You can also grab your database credentials at any time with:
pgo show pgbouncer <zercurity>CLUSTER SERVICE USERNAME PASSWORD CLUSTER IP
--------- ------------------- --------- ------------------------
zercurity zercurity-pgbouncer pgbouncer password 100.69.208.231
If you’ve not opted to use pgbouncer (the flag
-pgbouncer ) or you’d like to connect directly your database instances. As the bgbouncer does not have permission to create new databases. You’ll need to request your login using information with this command:
pgo show user -n pgo <zercurity> --show-system-accounts
You can also quickly rotate your password too:
pgo update pgbouncer zercurity --rotate-password
Connecting to your cluster
Now that you’ve got your flashy new cluster stood up you’re going to want to kick the tires on it. As we’ve done in prior steps we’re going to need to port forward from our pgbouncer service (or primary node) to our local machine like so:
kubectl -n pgo port-forward svc/<zercurity> 5432:5432
With the port exposed locally you can now connect in with the
psql client or any database management tool.
psql -h localhost -p 5432 -U <username> <zercurity>
Once your cluster has been deployed you can add a database web GUI with pgadmin. This can be done with a single command:
pgo create pgadmin <zercurity>
You can then check the deployment status with (it should only take a few minutes):
kubectl -n pgo get poNAME READY STATUS RESTARTS AGE
zercurity-pgadmin-5464c6545-k99n4 1/1 Running 0 7h
After the status has changed to
Running you’ll need to once again port forward the remote service to your local machine:
kubectl port-forward svc/zercurity-pgadmin 5050:5050
You should then be able to access it from your local machine.
Updating & scaling your cluster
-metrics support within the Postgres Operator cluster will give you a strong indication if your database cluster is starting to become starved of resources. Using the
pgo utility you can update all the pods within your cluster:
pgo update cluster <zercurity> --cpu=2.0 --cpu-limit=4.0 \
You can scale up your cluster’s replica count quite easily.
pgo scale <zercurity> --replica-count=1
Scaling down your cluster is a little more involved as you need to let
pgo know which replica pod you wish to terminate.
pgo scaledown <zercurity> --query
pgo scaledown <zercurity> --target=zercurity-replica-xxxx
To create a backup simply run:
pgo backup <zercurity>
This will automatically create an incremental backup from its last backup point. If you wish to force a full backup you can run:
pgo backup <zercurity> --backup-opts="--type=full"
You can also see what backups have been made with the
pgo backup show <zercurity> . Or remove outdated backups with
pgo delete backup <zercurity>
pgo show backup <zercurity>cluster: zercurity
storage type: localstanza: db
cipher: nonedb (current)
wal archive min/max (12-1)full backup: 20201120-222158F
timestamp start/stop: 2020-11-20 / 2020-11-20
wal start/stop: 00000001 / 00000001
database size: 31.3MiB, backup size: 31.3MiB
repository size: 3.8MiB, repository backup size: 3.8MiB
backup reference list:Upgrading pgo
Upgrading the Postgres Operator & pgo
As we’re using helm — upgrading the Postgres Operator is super simple.
helm upgrade postgres-operator . -n pgo
The last part is just to update the
pgo client binary itself. Which can just be downloaded directly from the releases page and replaced with your system binary.
wget -o /usr/local/bin/pgo https://github.com/CrunchyData/postgres-operator/releases/download/v4.4.2/pgo
chmod +x /usr/local/bin/pgo
Once installed veirfy the versions of both the client and remote server with
Its all over!
Hopefully that’s given you a quick dive into how to deploy PostgreSQL on top of Kubernetes. We’ll be following up on this topic in the near future. As well as how Zercurity uses Kubernetes at scale in production with Crunchy Data’s Postgres Operator. However, that’s all for now. Please feel free to get in touch if you have any questions.