Highly available and scalable PostgreSQL on Kubernetes (k8s) with the Crunchy PostgreSQL Operator (pgo).
Following on from our post on getting Kubernetes stood up atop VMware’s vSphere, the next stage in deploying Zercurity to Kubernetes is a reliable backend database, which can normally be a headache. However, the team over at Crunchy Data have built an awesome tool to deploy and manage a PostgreSQL cluster.
The Crunchy PostgreSQL Operator (pgo) can build PostgreSQL replica sets, manage backups, scale additional nodes up and down, and adjust the CPU and memory requirements of running Kubernetes pods. Backups can also be managed and scheduled via the pgo command line tool. Crunchy PostgreSQL also provides a number of add-on pods, such as performance metrics via Grafana and Prometheus, as well as pgadmin (a web-based GUI to manage your PostgreSQL cluster).
Prerequisites
A working and up-to-date Kubernetes cluster. Whilst it is possible to install pgo without helm, we’ll be using helm to deploy and manage our pgo operator nodes.
There are two main parts to pgo. The first is the management layer (control plane), which exposes a RESTful API used by the pgo command line tool to manage your PostgreSQL clusters, which is nice. The other part is your deployed database clusters, which span a number of other pod types that we’re going to explore.
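Since the control plane speaks plain HTTPS, you can also query its RESTful API directly. As a minimal sketch (assuming the port-forward and the ~/.pgo/pgo/pgouser credentials file we’ll set up later in this post, and using -k because the certificate is self-signed), the same /version endpoint the pgo client calls can be hit with curl:
curl -k -u "$(cat ${HOME}/.pgo/pgo/pgouser)" https://127.0.0.1:8443/version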
Installing helm
There are a few ways to deploy pgo. However, whilst installing helm is an additional step, it does make the deployment and upgradeability of pgo a whole lot easier.
Linux (Debian/Ubuntu)
curl https://baltocdn.com/helm/signing.asc | sudo apt-key add -
sudo apt-get install apt-transport-https --yes
echo "deb https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm
macOS
From the helm releases page you can download the pre-compiled binary. Or, if you have homebrew installed, you can just run brew install helm.
https://github.com/helm/helm/releases
curl -o helm.tar.gz https://get.helm.sh/helm-v3.4.1-darwin-amd64.tar.gz
tar -zxvf helm.tar.gz
cd darwin-amd64/
sudo mv helm /usr/local/bin/helm
sudo chmod +x /usr/local/bin/helm
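Whichever platform you’re on, a quick sanity check confirms helm is installed and on your PATH (it should print something like version.BuildInfo{Version:"v3.4.1", ...}):
helm version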
Installing pgo
In order to administer the cluster, Crunchy Data provide a helper utility called pgo. This connects to the operator control plane API on port 8443 to create and manage database clusters and users, amongst other utilities.
Start by checking out the most recent version of Crunchy Data’s PostgreSQL operator. You’ll need git installed (apt install git).
git clone https://github.com/CrunchyData/postgres-operator.git
Once cloned we can run their client setup script to download and install the pgo utility for us.
cd postgres-operator/installers/kubectl
chmod +x client-setup.sh
./client-setup.sh
Note: In the next part, macOS users must use ~/.bash_profile instead of ~/.bashrc.
Once the script has finished running you’ll see something like this:
export PATH="$PATH:${HOME?}/.pgo/pgo/"
export PGOUSER="${HOME?}/.pgo/pgo/pgouser"
export PGO_CA_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_KEY="${HOME?}/.pgo/pgo/client.key"
export PGO_APISERVER_URL='https://127.0.0.1:8443'
export PGO_NAMESPACE=pgo
You’ll then need to go ahead and append this to the end of your .bashrc file. All we’re doing here is setting some additional environment variables for each new session you create, so that you won’t have to enter them every time you use the pgo utility.
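For example, you could append the exports in one go with a heredoc (macOS users: substitute ~/.bash_profile). Quoting 'EOF' stops the shell expanding the variables at append time:
cat >> ~/.bashrc <<'EOF'
export PATH="$PATH:${HOME?}/.pgo/pgo/"
export PGOUSER="${HOME?}/.pgo/pgo/pgouser"
export PGO_CA_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_KEY="${HOME?}/.pgo/pgo/client.key"
export PGO_APISERVER_URL='https://127.0.0.1:8443'
export PGO_NAMESPACE=pgo
EOF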
Simply run the source command, which will apply the new configuration to your current session.
source ~/.bashrc
You should now be able to run the pgo version command to check everything is working (you’ll get an error about fetching the apiserver version, which we’re going to set up next).
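For reference, until the apiserver is reachable the output will look something like this:
pgo client version 4.5.1
Error: Get "https://127.0.0.1:8443/version": dial tcp 127.0.0.1:8443: connect: connection refused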
Installing pgo operator
Once the pgo client is installed we can deploy the PostgreSQL operator, which is super simple with helm:
cd postgres-operator/installers/helm
helm install postgres-operator . -n pgo
This will take a few minutes to deploy everything. Once it’s done, you’ll be able to query the newly deployed pods to check it’s running correctly.
kubectl -n pgo get pods
You should see a single pod like so:
NAME                                READY   STATUS    RESTARTS   AGE
postgres-operator-56d6ccb97-tmz7m   4/4     Running   0          2m
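You can also confirm that the helm release itself was deployed:
helm list -n pgo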
Lastly, in order for your local pgo client to work correctly, we need to port-forward the remote API server to your local machine like so:
kubectl -n pgo port-forward svc/postgres-operator 8443:8443
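The port-forward runs in the foreground and ties up your terminal. If you prefer, you can background it (a convenience, not a requirement):
kubectl -n pgo port-forward svc/postgres-operator 8443:8443 >/dev/null 2>&1 &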
You can now re-run pgo version, which will now yield an apiserver version:
pgo client version 4.5.1
pgo-apiserver version 4.5.1
If you get the following error: Error: Get "https://127.0.0.1:8443/version": dial tcp 127.0.0.1:8443: connect: connection refused, it’ll be because your port-forwarding request (shown above) has died and you’ll need to restart it.
Installing pgo metrics
This stage is optional and can be performed later. As of version 4.5.1 you can easily deploy metrics to monitor the performance of both your PostgreSQL hosts and pods, with Grafana and Prometheus out-of-the-box, using a single command. Awesome.
To deploy support for metrics simply run the following from the root of the postgres operator git repository.
cd postgres-operator/installers/metrics/helm
helm install metrics . -n pgo
Once helm has finished running you can check everything has been deployed correctly like so:
kubectl -n pgo get pods
NAME                           READY   STATUS    RESTARTS   AGE
postgres-operator-689598d795   4/4     Running   1          7h
Once you’ve spun up your first cluster as shown in the next part of this post the dashboard will be automatically populated. To access the metrics dashboard you’ll need to port-forward the Grafana service to your local machine:
kubectl -n pgo port-forward --address 0.0.0.0 svc/crunchy-grafana 3000:3000
The default login for Grafana is admin, with admin again for the password. Your Grafana dashboard will be available at http://localhost:3000.
Creating your first cluster
With all that setup done you can now deploy your first PostgreSQL cluster. I’m going to name our new cluster “zercurity” and manually give it a username and database name; otherwise, one is generated for you, using the name of your cluster for the database name and “testuser” for the username.
We’re also going to pass in a few extra flags, which are worth explaining before running the full command shown after this list.
--replica-count=2
This will deploy two additional servers to act as replicas, usually to provide additional redundancy and read throughput. The count can be scaled up and down at a later stage using the pgo scale command.
--pgbouncer
This provides a haproxy-style service to load balance and rate limit connecting clients.
--metrics
This enables support for metrics. As soon as the cluster is up and running you’ll be able to see metrics appear in your Grafana dashboard.
--sync-replication (optional)
As we’re using replica nodes in our cluster, you can turn on synchronous replication, which is useful for workloads that are sensitive to losing transactions. PostgreSQL will not consider a transaction to be committed until it is committed to all synchronous replicas connected to the primary. This comes at the cost of performance.
--pvc-size
Last, but probably most importantly, the pvc-size is the amount of disk space we’re going to allocate for each of our database nodes. In this example we’re allocating 150GB per node, so for 3 nodes (1 primary and 2 replicas) that’ll be 450GB in total.
--pgbackrest-pvc-size
This really needs to exceed your pvc-size. If you’re running incremental backup jobs the total size will more or less match the pvc-size. However, if you’re planning on running complete backups you’ll want to make sure you have a few multiples of your pvc-size allocated, as the space used will compound over time.
pgo create cluster zercurity --replica-count=2 \
--pgbouncer --pgbadger --metrics --username zercurity \
--sync-replication --database=zercurity --pvc-size=150Gi \
--pgbackrest-pvc-size=500Gi
If you have support for the LoadBalancer service type and require external database access, you can add the following parameter: --service-type=LoadBalancer
This command will return your new database credentials. However, the cluster will not yet be live and will take a few minutes to come online whilst it configures itself. You can check the status of your cluster like so (where zercurity is your cluster name):
pgo test <zercurity>

cluster : zercurity
    Services
        primary (100.66.14.31:5432): UP
        pgbouncer (100.69.208.231:5432): UP
        replica (100.67.75.10:5432): UP
    Instances
        replica (zercurity-79cf8bc9c-499jp): UP
        replica (zercurity-vcvm-66cfb945f5-5f4hx): UP
        replica (zercurity-xfrx-69b4f4bbc-69vtr): UP
You can also grab your database credentials at any time with:
pgo show pgbouncer <zercurity>

CLUSTER     SERVICE               USERNAME    PASSWORD    CLUSTER IP
---------   -------------------   ---------   ---------   --------------
zercurity   zercurity-pgbouncer   pgbouncer   password    100.69.208.231
If you’ve not opted to use pgbouncer (the --pgbouncer flag), or you’d like to connect directly to your database instances (as pgbouncer does not have permission to create new databases), you’ll need to request your login information with this command:
pgo show user -n pgo <zercurity> --show-system-accounts
You can also quickly rotate your password: pgo update pgbouncer zercurity --rotate-password
Connecting to your cluster
Now that you’ve got your flashy new cluster stood up, you’re going to want to kick the tires on it. As we’ve done in prior steps, we’re going to need to port-forward from our pgbouncer service (or primary node) to our local machine like so:
kubectl -n pgo port-forward svc/<zercurity> 5432:5432
With the port exposed locally you can now connect with the psql client or any database management tool.
psql -h localhost -p 5432 -U <username> <zercurity>
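As a quick sanity check that your replicas are streaming, you can connect to the primary (not pgbouncer) and query PostgreSQL’s built-in pg_stat_replication view. This assumes you’ve port-forwarded the primary service as above and fetched the postgres system account’s password with the pgo show user command shown earlier:
psql -h localhost -p 5432 -U postgres <zercurity> -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"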
Installing pgadmin
Once your cluster has been deployed you can add a database web GUI with pgadmin. This can be done with a single command:
pgo create pgadmin <zercurity>
You can then check the deployment status with (it should only take a few minutes):
kubectl -n pgo get pods
NAME                                READY   STATUS    RESTARTS   AGE
...
zercurity-pgadmin-5464c6545-k99n4   1/1     Running   0          7h
After the status has changed to Running, you’ll need to once again port-forward the remote service to your local machine:
kubectl -n pgo port-forward svc/zercurity-pgadmin 5050:5050
You should then be able to access it from your local machine at http://localhost:5050.
Updating & scaling your cluster
The --metrics support within the Postgres Operator cluster will give you a strong indication if your database cluster is starting to become starved of resources. Using the pgo utility you can update the resources for all the pods within your cluster:
pgo update cluster <zercurity> --cpu=2.0 --cpu-limit=4.0 \
--memory=4Gi --memory-limit=6Gi
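To confirm the new limits have been applied, you can inspect the pods themselves. This assumes the operator labels cluster pods with pg-cluster=<cluster name>, which recent 4.x releases do:
kubectl -n pgo describe pods -l pg-cluster=zercurity | grep -A2 Limits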
Scaling up
You can scale up your cluster’s replica count quite easily.
pgo scale <zercurity> --replica-count=1
Scaling down
Scaling down your cluster is a little more involved, as you need to let pgo know which replica pod you wish to terminate.
pgo scaledown <zercurity> --query
pgo scaledown <zercurity> --target=zercurity-replica-xxxx
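The --query flag prints the candidate replicas along with their status so you can pick a target. The exact columns vary by version, but the output looks roughly like this (illustrative, with hypothetical node names):
Cluster: zercurity
REPLICA           STATUS     NODE        REPLICATION LAG
zercurity-vcvm    running    worker-01   0 MB
zercurity-xfrx    running    worker-02   0 MB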
Backups
To create a backup simply run:
pgo backup <zercurity>
This will automatically create an incremental backup from its last backup point. If you wish to force a full backup you can run:
pgo backup <zercurity> --backup-opts="--type=full"
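As mentioned at the start of this post, backups can also be scheduled. A minimal sketch using the pgo create schedule command, assuming the cron-style syntax from the 4.x docs (here, a full backup every night at 1am):
pgo create schedule <zercurity> --schedule="0 1 * * *" \
  --schedule-type=pgbackrest --pgbackrest-backup-type=full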
You can also see what backups have been made with pgo show backup <zercurity>, or remove outdated backups with pgo delete backup <zercurity>.
pgo show backup <zercurity>

cluster: zercurity
storage type: local

stanza: db
    status: ok
    cipher: none

    db (current)
        wal archive min/max (12-1)

        full backup: 20201120-222158F
            timestamp start/stop: 2020-11-20 / 2020-11-20
            wal start/stop: 00000001 / 00000001
            database size: 31.3MiB, backup size: 31.3MiB
            repository size: 3.8MiB, repository backup size: 3.8MiB
            backup reference list:
Upgrading the Postgres Operator & pgo
As we’re using helm, upgrading the Postgres Operator is super simple.
cd postgres-operator/installers/helm
helm upgrade postgres-operator . -n pgo
The last part is just to update the pgo client binary itself, which can be downloaded directly from the releases page and used to replace your system binary.
https://github.com/CrunchyData/postgres-operator/releases
wget -O /usr/local/bin/pgo https://github.com/CrunchyData/postgres-operator/releases/download/v4.5.1/pgo
chmod +x /usr/local/bin/pgo
Once installed, verify the versions of both the client and remote server with pgo version.
It’s all over!
Hopefully that’s given you a quick dive into how to deploy PostgreSQL on top of Kubernetes. We’ll be following up on this topic in the near future, as well as covering how Zercurity uses Kubernetes at scale in production with Crunchy Data’s Postgres Operator. That’s all for now; please feel free to get in touch if you have any questions.