Deploying Kubernetes (k8s) on vSphere 7 with Tanzu Kubernetes Grid (TKG)

For a change of gear away from our usual Osquery posts: over the last year we’ve done a number of Zercurity deployments onto Kubernetes, with the most common being on-prem with VMware’s vSphere. Fortunately, as of the most recent release of VMware’s vCenter you can easily deploy Kubernetes with VMware’s Tanzu Kubernetes Grid (TKG).

This post will form part of a series of posts on running Zercurity on top of Kubernetes in a production environment.

As with all things, there are a number of ways to deploy and manage Kubernetes on VMware. If you’re running the latest release of vCenter (7.0.1.00100) you can deploy a TKG cluster straight from the Workload Management screen, right from the main dashboard, which includes a full guide to walk you through the setup process. This also works alongside NSX-T Data Center for additional management functionality and networking.

TKG Deployment setup on vCenter 7.0.1.00100 via Workload Management

However, for the purposes of this post, and to support older versions of ESXi (vSphere 6.7u3 and vSphere 7.0) and vCenter, we’re going to be using the TKG client utility, which spins up its own simple-to-use web UI for deploying Kubernetes.

Installing Tanzu Kubernetes Grid (TKG)

Note: You will require a VMware account to download these files.

  • VMware Tanzu Kubernetes Grid 1.2.0 CLI
    tkg is used to install, manage and upgrade the Kubernetes cluster running on top of vCenter.
  • VMware Tanzu Kubernetes Grid 1.2.0 OVAs for Kubernetes
    In order to deploy TKG, the setup requires the Photon container image photon-3-v1.17.3_vmware.2.ova, which is used for both the worker and management VMs.
  • Kubectl 1.19.1 for VMware Tanzu Kubernetes Grid 1.2.0
    kubectl is the command-line tool used to administer your Kubernetes cluster.
  • VMware Tanzu Kubernetes Grid Extensions Manifest 1.2.0
    Additional services, configuration and RBAC changes that are applied to your cluster post installation.

Prerequisites

Installing kubectl

wget https://download2.vmware.com/software/TKG/1.2.0/kubectl-linux-v1.19.1-vmware.2.gz
gunzip kubectl-linux-v1.19.1-vmware.2.gz
sudo mv kubectl-linux-v1.19.1-vmware.2 /usr/local/bin/kubectl
sudo chmod +x /usr/local/bin/kubectl
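
You can quickly confirm the binary is on your PATH and executable:

kubectl version --client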

Installing docker

If you’re running macOS you can make use of the Docker Desktop app instead.

sudo apt-get update
sudo apt-get install apt-transport-https \
ca-certificates curl gnupg-agent software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io

Lastly, you’ll need to give your current user permission to interact with the docker daemon.

sudo usermod -aG docker <your-user>
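
The group change only applies to new sessions, so you may need to log out and back in (or run newgrp docker) before it takes effect. A quick check that the daemon is reachable:

docker run --rm hello-world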

Installing Tanzu (tkg)

wget https://download2.vmware.com/software/TKG/1.2.0/tkg-linux-amd64-v1.2.0-vmware.1.tar.gz
tar -zxvf tkg-linux-amd64-v1.2.0-vmware.1.tar.gz
cd tkg
sudo mv tkg-linux-amd64-v1.2.0+vmware.1 /usr/local/bin/tkg
sudo chmod +x /usr/local/bin/tkg

Once installed, you can run tkg version to check that tkg is working and has been installed into your system PATH.
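
For example (the exact version string will depend on the build you downloaded):

tkg version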

Importing the OVA images

To import the OVA, right-click on your cluster (or datacenter) in the vSphere web client, choose “Deploy OVF Template” and select the Photon OVA downloaded earlier. Once the OVA has been imported it’s deployed as a VM. Do not power on the VM. The last step is to convert it back into a template so it can be used by the TKG installer.

Right click on the imported VM photon-3-kube-v1.19.1+vmware.2, select the “Template” menu item and choose “Convert to template”. This will take a few moments and the template will then be visible under the “VMs and Templates” view.
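
If you’d rather script this step, the govc CLI (not part of the TKG downloads above, so treat this as an optional alternative) can import the OVA and mark it as a template. A minimal sketch, assuming govc is installed and GOVC_URL, GOVC_USERNAME and GOVC_PASSWORD point at your vCenter; the datastore and resource pool names here are placeholders:

# import the Photon OVA into a datastore and resource pool of your choosing
govc import.ova -ds=vsanDatastore -pool=k8s-prod -name=photon-3-kube-v1.19.1+vmware.2 ./photon-3-kube-v1.19.1+vmware.2.ova
# convert the resulting VM into a template without powering it on
govc vm.markastemplate photon-3-kube-v1.19.1+vmware.2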

Optional prerequisites

Installing Tanzu Kubernetes Grid

tkg init --ui

Once you run the command your browser should automatically open and point to: http://127.0.0.1:8080/
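
If you’re running the tkg CLI on a remote or headless machine, you can bind the UI to an address you can reach and skip the automatic browser launch. These flags are documented for the 1.x CLI, but double-check tkg init --help on your version:

tkg init --ui --bind 0.0.0.0:8080 --browser none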

The Tanzu Kubernetes Grid installation screen.

Choose the “VMware vSphere” deploy option. The next series of steps will help configure the TKG deployment.

The first step is to connect to your vSphere vCenter instance with your administrator credentials. Upon clicking “connect” you’ll see your available data-centers show up. TKG also requires your SSH RSA public key, which is used to access the management cluster nodes.

TKG installation setup.

Your SSH RSA key is usually located within your home directory:

cat .ssh/id_rsa.pub

If the file doesn't exist or you need to create a new RSA key you can generate one like so:

ssh-keygen -t rsa -C "your@email.com"

Once the command has run you’ll see two files created (id_rsa and id_rsa.pub, unless you changed the default filename). You need to copy and paste the contents of your public key (the .pub file).

The next stage is to name your cluster and provide the sizing details of both the management instances and worker instances for your cluster. The virtual IP address is the static address used as the load-balanced endpoint for the cluster’s Kubernetes API server, i.e. the entry point into the cluster.

Choosing the cluster instance types for vSphere Kubernetes

For the next stage you can provide some optional metadata or labels to make it easier to identify your VMs.

The next stage is to define the resource location. This is where your Kubernetes cluster will reside and the datastore used by the virtual machines. We’re using the root VM folder, our vSAN datastore and, lastly, a separate resource pool we’ve created called k8s-prod to manage the cluster’s CPU, storage and memory limits.

Managing the resources for your Kubernetes cluster.

For the networking configuration you can use the defaults provided here. However, we’ve created a separate Distributed Switch called VM Tanzu Prod, which is connected via its own segregated VLAN back into our network.

Defining the Tanzu network resources.

The final stage is to select the Photon Kube OVA we downloaded earlier as the base image for the worker and management virtual machines. If nothing is listed here, make sure you have imported the OVA and converted the VM into a template. Use the refresh icon to reload the list without starting over.

Choosing the photon OVA for deployment.

Finally, review your configuration and click “Deploy management cluster”. This can take around 8–10 minutes, or even longer depending on your internet connection, as the setup needs to pull down and deploy multiple Docker container images which are used to bootstrap the Tanzu management cluster.

Configuration review for the Tanzu Kubernetes cluster.

Once the installation has finished you’ll see several VMs within the vSphere web client named something similar to: tkg-mgmt-vsphere-20200927183052-control-plane-6rp25. At this point the Tanzu management plane has been deployed.

Success. \o/

Our deployed Tanzu cluster.
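
The management cluster only runs the TKG and Cluster API machinery; your actual workloads run in a separate workload cluster, which is what the zercurity cluster referenced below is. A minimal sketch for creating it (the plan and worker count are assumptions, adjust as needed):

# create a workload cluster named "zercurity" using the dev plan with three workers
tkg create cluster zercurity --plan dev --worker-machine-count 3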

Configuring kubectl

tkg get credentials zercurity

Credentials of workload cluster 'zercurity' have been saved
You can now access the cluster by running 'kubectl config use-context zercurity-admin@zercurity'

Copy and paste the kubectl command from the output above to set your new context. This is useful for switching between multiple clusters:

kubectl config use-context zercurity-admin@zercurity
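
If you’re managing several clusters you can list all of the contexts kubectl knows about:

kubectl config get-contexts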

With kubectl connected to our cluster let’s create our first namespace to check everything is working correctly.

kubectl version
kubectl create namespace zercurity

At this stage you’re almost ready to go and you can start deploying non-persistent containers to test out the cluster.
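
As a quick smoke test you can spin up a throwaway stateless deployment in the new namespace and check that a pod gets scheduled (the nginx image here is just an example):

kubectl create deployment nginx --image=nginx --namespace zercurity
kubectl get pods --namespace zercurity
# clean up once you're done
kubectl delete deployment nginx --namespace zercurity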

Installing the VMware TKG extensions

The extensions archive should have been downloaded already from www.vmware.com/go/get-tkg.

wget https://download2.vmware.com/software/TKG/1.2.0/tkg-extensions-manifests-v1.2.0-vmware.1.tar.gz
tar -zxvf tkg-extensions-manifests-v1.2.0-vmware.1.tar.gz
cd tkg-extensions-v1.2.0+vmware.1/

I’d recommend applying the following extensions. There are a few more contained within the archive. However, I’d argue these are the primary extensions you’re going to want to add.

kubectl apply -f cert-manager/
kubectl apply -f ingress/contour/
kubectl apply -f monitoring/grafana/
kubectl apply -f monitoring/prometheus/
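
The extensions deploy into their own namespaces, so you can watch the pods come up across the cluster while everything settles:

kubectl get pods --all-namespaces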

Configuring vSAN storage

Tanzu VMware vSAN persistent volume claims.

In order to let the Kubernetes cluster know to use vSAN as its storage backend we need to create a new StorageClass. To make this change, simply copy and paste the command below:

cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: thin
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.vsphere.vmware.com
allowVolumeExpansion: true
parameters:
  storagepolicyname: "vSAN Default Storage Policy"
EOF

You can then check your StorageClass has been correctly applied like so:

kubectl get sc

NAME            PROVISIONER     RECLAIM  BINDINGMODE  EXPANSION  AGE
thin (default)  csi.vsphere..   Delete   Immediate    true       2s

kubectl describe sc thin

You can also test your StorageClass config is working by creating a quick PersistentVolumeClaim — again, copy and paste the command below.

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: testing
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF

This PersistentVolumeClaim will be created within the default namespace using 1Gi of disk space.

In the event the claim’s status shows Pending for more than 30 seconds or so, this usually means provisioning has run into a problem. You can use kubectl describe to get additional information on the state of both the claim and the StorageClass.
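
For example:

# events on the claim usually point at the provisioning failure
kubectl describe pvc testing
# double-check the StorageClass parameters
kubectl describe sc thin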

kubectl get pvc

NAME     STATUS  VOLUME        CAPACITY  ACCESS  STORAGECLASS  AGE
testing  Bound   pvc-3974d60f  1Gi       RWO     default       6s
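
Once the claim shows Bound you can remove the test volume:

kubectl delete pvc testing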

Upgrading Kubernetes on TKG

tkg upgrade management-cluster tkg-mgmt-vsphere-20200927183052

Upgrading management cluster 'tkg-mgmt-vsphere-20200927183052' to TKG version 'v1.2.0' with Kubernetes version 'v1.19.1+vmware.2'. Are you sure? [y/N]: y
Upgrading management cluster providers...
Checking cert-manager version...
Deleting cert-manager Version="v0.11.0"
Installing cert-manager Version="v0.16.1"
Waiting for cert-manager to be available...
Performing upgrade...
Deleting Provider="cluster-api" Version="" TargetNamespace="capi-system"
Installing Provider="cluster-api" Version="v0.3.10" TargetNamespace="capi-system"
Deleting Provider="bootstrap-kubeadm" Version="" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.10" TargetNamespace="capi-kubeadm-bootstrap-system"
Deleting Provider="control-plane-kubeadm" Version="" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.10" TargetNamespace="capi-kubeadm-control-plane-system"
Deleting Provider="infrastructure-vsphere" Version="" TargetNamespace="capv-system"
Installing Provider="infrastructure-vsphere" Version="v0.7.1" TargetNamespace="capv-system"
Management cluster providers upgraded successfully...
Upgrading management cluster kubernetes version...
Verifying kubernetes version...
Retrieving configuration for upgrade cluster...
Create InfrastructureTemplate for upgrade...
Upgrading control plane nodes...
Patching KubeadmControlPlane with the kubernetes version v1.19.1+vmware.2...
Waiting for kubernetes version to be updated for control plane nodes
Upgrading worker nodes...
Patching MachineDeployment with the kubernetes version v1.19.1+vmware.2...
Waiting for kubernetes version to be updated for worker nodes...
updating 'metadata/tkg' add-on...
Management cluster 'tkg-mgmt-vsphere-20200927183052' successfully upgraded to TKG version 'v1.2.0' with kubernetes version 'v1.19.1+vmware.2'
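
Workload clusters are upgraded separately once the management cluster is done. A sketch, using the zercurity cluster created earlier:

# upgrade the workload cluster to the Kubernetes version shipped with this TKG release
tkg upgrade cluster zercurity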

All finished

If you’ve got stuck or have a few suggestions for us to add don’t hesitate to get in touch via our website or leave a comment below.

Real-time security and compliance delivered.
