Highly available and scalable PostgreSQL on Kubernetes (k8s) with the Crunchy PostgreSQL Operator (pgo).

Zercurity · 8 min read · Nov 29, 2020

Following on from our post on getting Kubernetes stood up atop VMware’s vSphere, the next stage in deploying Zercurity to Kubernetes is a reliable backend database, which can normally be a headache. Fortunately, the team over at Crunchy Data have built an awesome tool to deploy and manage a PostgreSQL cluster.

The Crunchy PostgreSQL Operator (pgo) can build PostgreSQL replica sets, manage backups, add and remove replica nodes, and scale the CPU and memory allocations of running Kubernetes pods. Backups can also be managed and scheduled via the pgo command line tool. Crunchy PostgreSQL also provides a number of add-on pods, such as performance metrics via Grafana and Prometheus, as well as pgAdmin (a web-based GUI for managing your PostgreSQL cluster).

Prerequisites

A working and up-to-date Kubernetes cluster. Whilst it is possible to install pgo without Helm, we’ll be using Helm to deploy and manage our pgo operator nodes.

There are two main parts to pgo. The first is the management layer (control plane), which provides the APIs used by the pgo command line tool and also exposes a RESTful API for managing your PostgreSQL clusters, which is nice. The other part is your deployed database clusters, which span a number of other pod types that we’re going to explore.

Installing helm

There are a few ways to deploy pgo. However, whilst installing Helm is an additional step, it does make the deployment and upgradability of pgo a whole lot easier.

Linux (Debian/Ubuntu)

curl https://baltocdn.com/helm/signing.asc | sudo apt-key add -
sudo apt-get install apt-transport-https --yes
echo "deb https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm

macOS

From the Helm releases page you can download the pre-compiled binary. Or, if you have Homebrew installed, you can just run brew install helm.

https://github.com/helm/helm/releases

curl -o helm.tar.gz https://get.helm.sh/helm-v3.4.1-darwin-amd64.tar.gz
tar -zxvf helm.tar.gz
cd darwin-amd64/
sudo mv helm /usr/local/bin/helm
sudo chmod +x /usr/local/bin/helm
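On either platform you can verify the install by asking Helm for its version:

helm version --short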

Installing pgo

In order to administer the cluster, Crunchy Data provide a helper utility called pgo. This connects to the operator control plane API on port 8443 to create and manage database clusters and users, amongst other things.

Start by checking out the most recent version of Crunchy Data’s PostgreSQL operator. You’ll need git installed (apt install git).

git clone https://github.com/CrunchyData/postgres-operator.git

Once cloned, we can run their client setup script to download and install the pgo utility for us.

cd postgres-operator/installers/kubectl
chmod +x client-setup.sh
./client-setup.sh

Note: In the next part, macOS users must use ~/.bash_profile instead of ~/.bashrc.

Once the script has finished running you’ll see something like this:

export PATH="$PATH:${HOME?}/.pgo/pgo/"
export PGOUSER="${HOME?}/.pgo/pgo/pgouser"
export PGO_CA_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_KEY="${HOME?}/.pgo/pgo/client.key"
export PGO_APISERVER_URL='https://127.0.0.1:8443'
export PGO_NAMESPACE=pgo

You’ll then need to go ahead and append this to the end of your .bashrc file. All we’re doing here is setting some additional environment variables for each new session you create, so you won’t have to enter them every time you use the pgo utility.
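One way to do this in a single step is with a heredoc (macOS users: target ~/.bash_profile instead):

cat <<'EOF' >> ~/.bashrc
export PATH="$PATH:${HOME?}/.pgo/pgo/"
export PGOUSER="${HOME?}/.pgo/pgo/pgouser"
export PGO_CA_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_KEY="${HOME?}/.pgo/pgo/client.key"
export PGO_APISERVER_URL='https://127.0.0.1:8443'
export PGO_NAMESPACE=pgo
EOF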

Simply run the source command which will apply the new configuration to your current session.

source ~/.bashrc

You should now be able to run the pgo version command to check everything is working (you’ll get an error about fetching the apiserver version, which we’re going to set up next).

Installing pgo operator

Once the pgo client is installed we can deploy the PostgreSQL operator itself, which is super simple with Helm:

kubectl create namespace pgo  # skip if the pgo namespace already exists
cd postgres-operator/installers/helm
helm install postgres-operator . -n pgo

This will take a few minutes to deploy everything. Once it’s done, you’ll be able to query the newly deployed pods to check it’s running correctly.

kubectl -n pgo get pods

You should see a single pod like so:

NAME                                READY   STATUS    RESTARTS   AGE
postgres-operator-56d6ccb97-tmz7m   4/4     Running   0          2m

Lastly, in order for your local pgo client to work correctly we need to port-forward the remote API server to your local machine like so:

kubectl -n pgo port-forward svc/postgres-operator 8443:8443

You can now re-run pgo version which will now yield an apiserver version:

pgo client version 4.5.1
pgo-apiserver version 4.5.1

If you get the following error: Error: Get "https://127.0.0.1:8443/version": dial tcp 127.0.0.1:8443: connect: connection refused, it’ll be because your port-forwarding request (shown above) has died and you’ll need to restart it.
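As the control plane exposes a RESTful API, you can also talk to it directly. A rough sketch with curl, assuming the port-forward above is running and that the pgouser file installed by client-setup.sh holds user:password credentials:

curl -sk \
  --cert "${PGO_CLIENT_CERT}" \
  --key "${PGO_CLIENT_KEY}" \
  -u "$(cat "${PGOUSER}")" \
  "${PGO_APISERVER_URL}/version"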

Installing pgo metrics

This stage is optional and can be performed later. As of version 4.5.1 you can easily deploy metrics to monitor the performance of both your PostgreSQL hosts and pods, with Grafana and Prometheus working out-of-the-box after a single command. Awesome.

Showing the Grafana dashboard for pgo

To deploy support for metrics simply run the following from the root of the postgres operator git repository.

cd postgres-operator/installers/metrics/helm
helm install metrics . -n pgo

Once helm has finished running you can check everything has been deployed correctly like so:

kubectl -n pgo get po

NAME                           READY   STATUS    RESTARTS   AGE
postgres-operator-689598d795   4/4     Running   1          7h

Once you’ve spun up your first cluster, as shown in the next part of this post, the dashboard will be populated automatically. To access the metrics dashboard you’ll need to port-forward the Grafana service to your local machine:

kubectl -n pgo port-forward --address 0.0.0.0 svc/crunchy-grafana 3000:3000

The default login for Grafana is admin, with admin again for the password. Your Grafana dashboard will be available from:

http://localhost:3000

Creating your first cluster

With all that setup done you can now deploy your first PostgreSQL cluster. I’m going to name our new cluster “zercurity” and manually give it a username and database name; otherwise, these are generated for you, using the name of your cluster for the database name and “testuser” for the username. The flags we’ll be using are described below, followed by the full command.

  • --replica-count=2
    This will deploy two additional servers to act as replicas, usually to provide additional redundancy and read throughput. This can be scaled up and down at a later stage using the pgo scale command.
  • --pgbouncer
    This deploys pgBouncer, a connection pooling service that sits in front of the cluster to pool and rate limit connecting clients.
  • --pgbadger
    This adds a pgBadger sidecar for analyzing PostgreSQL logs.
  • --metrics
    This enables support for metrics. As soon as the cluster is up and running you’ll be able to see metrics appear in your Grafana dashboard.
  • --sync-replication (optional)
    As we’re using replica nodes in our cluster, you can turn on synchronous replication, which is useful for workloads that are sensitive to losing transactions. PostgreSQL will not consider a transaction to be committed until it is committed to all synchronous replicas connected to the primary. This comes at the cost of performance.
  • --pvc-size
    Lastly, but probably most importantly, the pvc-size is the amount of disk space we’re going to allocate for each of our database nodes. In this example we’re allocating 150GiB, so for 3 nodes (1 primary and 2 replicas) that’ll be 450GiB in total.
  • --pgbackrest-pvc-size
    This really needs to exceed your pvc-size. If you’re running incremental backup jobs the total size will more or less match the pvc-size. However, if you’re planning on running complete backups you’ll want to make sure you have a few multiples of your pvc-size allocated, as the space used will compound over time.
pgo create cluster zercurity --replica-count=2 \
--pgbouncer --pgbadger --metrics --username zercurity \
--sync-replication --database=zercurity --pvc-size=150Gi \
--pgbackrest-pvc-size=500Gi

If you have support for the LoadBalancer service type and require external database access, you can add the following parameter: --service-type=LoadBalancer
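Once the load balancer has been provisioned, one way to find the externally reachable address is to list the cluster’s services (their names contain the cluster name, as seen in the pgo test output below):

kubectl -n pgo get svc | grep zercurity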

This command will return your new database credentials. However, the cluster will not yet be live and will take a few minutes to come online whilst it configures itself. You can check the status of your cluster like so (where zercurity is your cluster name):

pgo test <zercurity>

cluster : zercurity
    Services
        primary (100.66.14.31:5432): UP
        pgbouncer (100.69.208.231:5432): UP
        replica (100.67.75.10:5432): UP
    Instances
        replica (zercurity-79cf8bc9c-499jp): UP
        replica (zercurity-vcvm-66cfb945f5-5f4hx): UP
        replica (zercurity-xfrx-69b4f4bbc-69vtr): UP

You can also grab your database credentials at any time with:

pgo show pgbouncer <zercurity>

CLUSTER     SERVICE               USERNAME    PASSWORD    CLUSTER IP
---------   -------------------   ---------   ---------   --------------
zercurity   zercurity-pgbouncer   pgbouncer   password    100.69.208.231

If you’ve not opted to use pgBouncer (the --pgbouncer flag), or you’d like to connect directly to your database instances (as pgBouncer does not have permission to create new databases), you’ll need to retrieve your login information with this command:

pgo show user -n pgo <zercurity> --show-system-accounts

You can also quickly rotate the pgBouncer password: pgo update pgbouncer zercurity --rotate-password

Connecting to your cluster

Now that you’ve got your flashy new cluster stood up, you’re going to want to kick the tires on it. As in prior steps, we’re going to need to port-forward from our pgbouncer service (or primary node) to our local machine like so:

kubectl -n pgo port-forward svc/<zercurity> 5432:5432
# or, to go via pgBouncer instead of the primary:
kubectl -n pgo port-forward svc/<zercurity>-pgbouncer 5432:5432

With the port exposed locally you can now connect in with the psql client or any database management tool.

psql -h localhost -p 5432 -U <username> <zercurity>
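As a quick smoke test you can run a throwaway query against the cluster; the query here is just an example:

psql -h localhost -p 5432 -U <username> <zercurity> \
  -c "SELECT version(), current_database();"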

Installing pgadmin

Once your cluster has been deployed you can add a web-based database GUI with pgAdmin. This can be done with a single command:

pgo create pgadmin <zercurity>

You can then check the deployment status with (it should only take a few minutes):

kubectl -n pgo get po

NAME                                READY   STATUS    RESTARTS   AGE
...
zercurity-pgadmin-5464c6545-k99n4   1/1     Running   0          7h

After the status has changed to Running you’ll need to once again port forward the remote service to your local machine:

kubectl -n pgo port-forward svc/zercurity-pgadmin 5050:5050

You should then be able to access it from your local machine.

http://localhost:5050

pgAdmin installed with pgo.

Updating & scaling your cluster

The --metrics support within the Postgres Operator cluster will give you a strong indication of whether your database cluster is starting to become starved of resources. Using the pgo utility you can update all the pods within your cluster:

pgo update cluster <zercurity> --cpu=2.0 --cpu-limit=4.0 \
--memory=4Gi --memory-limit=6Gi
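To confirm the new values have been rolled out you can inspect each pod’s resource requests and limits. A quick sketch, assuming the pg-cluster label that pgo applies to a cluster’s pods:

kubectl -n pgo get pods -l pg-cluster=zercurity \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].resources}{"\n"}{end}'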

Scaling up

You can scale up your cluster quite easily; --replica-count here is the number of additional replicas to create.

pgo scale <zercurity> --replica-count=1
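Once the new replica has been provisioned, it should show up in the instance list when you re-run the status check from earlier:

pgo test <zercurity>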

Scaling down

Scaling down your cluster is a little more involved, as you need to let pgo know which replica you wish to terminate. The --query flag lists the candidate replicas; pass your chosen one to --target.

pgo scaledown <zercurity> --query

pgo scaledown <zercurity> --target=zercurity-replica-xxxx

Backups

To create a backup simply run:

pgo backup <zercurity>

This will automatically create an incremental backup from its last backup point. If you wish to force a full backup you can run:

pgo backup <zercurity> --backup-opts="--type=full"
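Backups can also be run on a schedule via pgo, as mentioned in the introduction. A minimal sketch using pgo’s pgBackRest schedule support; the cron expression (a full backup nightly at 1am) is just an example:

pgo create schedule <zercurity> --schedule="0 1 * * *" \
  --schedule-type=pgbackrest --pgbackrest-backup-type=full

Existing schedules can be listed with pgo show schedule <zercurity> and removed with pgo delete schedule <zercurity>.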

You can also see what backups have been made with pgo show backup <zercurity>, or remove outdated backups with pgo delete backup <zercurity>.

pgo show backup <zercurity>

cluster: zercurity
storage type: local

stanza: db
    status: ok
    cipher: none

    db (current)
        wal archive min/max (12-1)

        full backup: 20201120-222158F
            timestamp start/stop: 2020-11-20 / 2020-11-20
            wal start/stop: 00000001 / 00000001
            database size: 31.3MiB, backup size: 31.3MiB
            repository size: 3.8MiB, repository backup size: 3.8MiB
            backup reference list:

Upgrading the Postgres Operator & pgo

As we’re using Helm, upgrading the Postgres Operator is super simple.

cd postgres-operator/installers/helm
helm upgrade postgres-operator . -n pgo

The last part is to update the pgo client binary itself, which can be downloaded directly from the releases page and swapped in for your existing binary.

https://github.com/CrunchyData/postgres-operator/releases

sudo wget -O /usr/local/bin/pgo https://github.com/CrunchyData/postgres-operator/releases/download/v4.5.1/pgo
sudo chmod +x /usr/local/bin/pgo

Once installed, verify the versions of both the client and remote server with pgo version.

It’s all over!

Hopefully that’s given you a quick dive into how to deploy PostgreSQL on top of Kubernetes. We’ll be following up on this topic in the near future, including how Zercurity uses Kubernetes at scale in production with Crunchy Data’s Postgres Operator. That’s all for now; please feel free to get in touch if you have any questions.
