Tanzu Mission Control - Data Protection

Overview

Lately I've been playing around in Tanzu Mission Control, TMC for short. TMC is VMware's SaaS offering for managing Kubernetes clusters across different clouds and providers.

So far we've seen how to attach existing clusters and how to create new clusters.

In this post we'll take a look at how to enable and use the Data Protection capabilities in TMC.

Note that when preparing for this post I came across this post from fellow vExpert Dean Lewis. He has also done a series of blog posts around TMC which are worth checking out.

Project Velero

The Data protection capabilities in TMC are backed by the open-source tool Velero. This tool can be used to back up Kubernetes clusters both on-premises and in a public cloud, with the ability to select a subset of the resources in the cluster. It differs from a few other solutions in that it uses the Kubernetes API to capture (and restore) the state of the resources, where other solutions use the etcd database directly.

To back up Persistent volumes, Velero uses either the storage platform's native snapshot capabilities or the integrated file-level backup tool restic.

Backup process

This won't be a detailed post on the Velero solution, but I thought it was interesting to see how the process works.

Velero defines a few new Kubernetes custom resources when installed in your cluster. It uses its own controller with a watch loop to look for new or modified resources. This follows how Kubernetes manages other resources in the cluster with controller loops.

The backup workflow for a Velero backup explained high-level:

  1. Through the Velero client a new backup resource object is created and sent to the Kubernetes API server. The resource specifies what to back up
  2. The Backup controller validates the new object
  3. The Backup controller starts the backup. It queries the API server for the state of the resources it wants to back up
  4. The Backup controller uploads the backed up data to an object storage service, e.g. AWS S3
  5. The Backup controller creates snapshots of Persistent volumes through the storage provider's API. (This is enabled by default, but can be disabled in the backup job)

Velero backup process
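
To make the ordering of these steps a bit more concrete, here is a toy Python model of the workflow. This is not Velero's code: the CLUSTER data, the in-memory object store (a plain dict) and all function names are made up for illustration. Only the order of the steps and the option to turn off volume snapshots come from the description above.

```python
"""Toy model of the backup workflow described above (not Velero's actual code)."""

# Hypothetical in-memory "cluster": namespace -> list of resource names.
CLUSTER = {
    "web": ["deployment/nginx", "service/nginx"],
    "db": ["statefulset/postgres", "pvc/data-postgres-0"],
}


def validate(backup):
    """Step 2: reject backup objects that have no name."""
    return bool(backup.get("name"))


def collect(backup):
    """Step 3: query the (fake) API for the resources in scope."""
    namespaces = backup.get("includedNamespaces") or list(CLUSTER)
    return {ns: CLUSTER[ns] for ns in namespaces if ns in CLUSTER}


def run_backup(backup, object_store):
    """Walk one backup object through steps 2-5 and return its final phase."""
    if not validate(backup):
        return "FailedValidation"
    resources = collect(backup)
    # Step 4: upload the captured state to object storage (a dict here).
    object_store[backup["name"]] = resources
    # Step 5: snapshot persistent volumes unless the backup opts out.
    if backup.get("snapshotVolumes", True):
        pvcs = [r for rs in resources.values() for r in rs if r.startswith("pvc/")]
        object_store[backup["name"] + "-snapshots"] = pvcs
    return "Completed"


store = {}
# Step 1: the client creates a backup object describing what to back up.
phase = run_backup({"name": "demo", "includedNamespaces": ["db"]}, store)
print(phase, store)
```

The real controller obviously does a lot more (hooks, plugins, error handling), but the shape is the same: a declarative object goes in, and a controller works through it until it reaches a final phase.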

Data Protection in TMC

To enable Data Protection in TMC we need to set up an Account Credential for Data Protection. This credential will specify where the Velero extension should store the backed up data. Currently the only supported object storage provider for this is AWS S3.

Create Account credential

The creation of the Account credential follows the same procedure as we did for setting up a credential for our AWS provider to be able to deploy Tanzu Kubernetes clusters on AWS. Check out this blog post or the documentation for more info.

The difference here is that instead of setting it up through the Management cluster's details view, we use the Accounts tab in the Administration section and specify that we want to create an AWS data protection credential

Start account credential wizard

This fires off the credential wizard, which instructs us to create a CloudFormation stack in AWS. Although it's the same process as we saw with the credential used for deploying clusters, we should use a specific account for Data Protection with a new Template.

I won't cover the steps in AWS; again, check out my previous post on this or the VMware documentation.

After the CloudFormation stack has been created we fetch the AWS ARN that TMC needs to create the credential for Data protection

Cloud Formation role output

This ARN goes into the credential wizard in TMC so we can create the credential

Add AWS ARN role to credential

And with this in place we should see our newly created Data protection credential

Data protection credential

Enable Data protection for a cluster

Data protection is disabled by default, so to use the capability we need to enable it for a cluster

Enable Data protection

When we hit Enable Data protection we get a dialog box where we select the Account credential. This allows us to use different credentials for different clusters.

Select credential

After selecting the credential we want to use, Data protection is enabled and we can create a backup

Data protection enabled

In our Kubernetes cluster we now have a Velero namespace with some resources created

Velero namespace

Create backup

Let's create a backup of one of our clusters by starting the Create backup wizard

The first step of the wizard is to select what we want to back up: either the entire cluster, selected namespaces, or resources matching a label selector

Select what to back up
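
In Velero terms, these three choices correspond to fields on the Backup spec. The sketch below builds the spec for each case in Python. includedNamespaces and labelSelector are fields from Velero's Backup API (leaving includedNamespaces out means all namespaces), while the function name backup_spec and the example values are my own. I haven't verified exactly what TMC generates under the hood, so treat the mapping as an assumption.

```python
def backup_spec(namespaces=None, labels=None):
    """Return a Velero Backup spec fragment for the chosen scope.

    namespaces: list of namespace names, or None for all namespaces.
    labels: dict of label key/values to match, or None for no selector.
    """
    spec = {}
    if namespaces:
        spec["includedNamespaces"] = list(namespaces)
    if labels:
        spec["labelSelector"] = {"matchLabels": dict(labels)}
    return spec


print(backup_spec())                          # entire cluster
print(backup_spec(namespaces=["demo"]))       # selected namespaces
print(backup_spec(labels={"app": "nginx"}))   # label selector
```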

Next, we'll set the schedule. In my example I just want to do a one-time backup

Select backup schedule
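
For recurring backups, Velero has a separate Schedule resource that wraps an ordinary Backup spec in a cron expression. Here is a minimal sketch of what such an object looks like, built in Python. The name daily-demo, the demo namespace and the 02:00 schedule are hypothetical, and whether TMC creates exactly this object for a scheduled backup is an assumption on my part.

```python
import json


def schedule_manifest(name, cron, template):
    """Build a Velero Schedule object; `template` is an ordinary Backup spec."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {"schedule": cron, "template": template},
    }


manifest = schedule_manifest(
    "daily-demo",                      # hypothetical name
    "0 2 * * *",                       # cron expression: every day at 02:00
    {"includedNamespaces": ["demo"], "ttl": "720h0m0s"},
)
# kubectl accepts JSON as well as YAML, so the output is a valid manifest format.
print(json.dumps(manifest, indent=2))
```

In a TMC-managed cluster the template would most likely also need a storageLocation pointing at the location TMC has set up.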

Next, we'll set the retention of the backup

Set retention
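
On the Velero side, retention ends up as the ttl field of the backup, written as a duration in hours (the YAML example later in this post uses 720h0m0s, which is 30 days). A tiny helper like the one below does the conversion; the function name days_to_ttl is my own and not part of any tool.

```python
def days_to_ttl(days):
    """Convert a retention in whole days to a Velero-style ttl string."""
    if days <= 0:
        raise ValueError("retention must be at least one day")
    return f"{days * 24}h0m0s"


print(days_to_ttl(30))  # 720h0m0s
```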

The final step is to give the backup a name

Set name

Now TMC will invoke the Velero backup process and Velero will put our backup in an AWS S3 bucket

AWS S3 buckets

The vmware-tmc-dp-xyz bucket has been created automatically for us, and inside here we'll have folders for our clusters, and the backups created for them. The backup we just created consists of the following files

Backup files

If we create a backup for a different cluster we'll get a folder for this as well under the same vmware-tmc-dp-xyz bucket

Backup folder per cluster

Back in our TMC console we can see the backups created and the status of them.

Backup details

Since this creates backup resources in our Kubernetes cluster, we can find the same backups with kubectl. This is a nice feature where an Administrator can control this through the TMC console, while Developers can get the same details through their favorite tool, kubectl

Backup resources through kubectl
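
If you want to use this in a script, the JSON output of kubectl -n velero get backups.velero.io -o json is easy to work with. The sketch below pulls out the name and status phase of each backup. The sample data is a hypothetical, heavily trimmed stand-in for the real output, and the function name summarize_backups is my own.

```python
import json


def summarize_backups(kubectl_json):
    """Return (name, phase) pairs from `kubectl get ... -o json` output."""
    items = json.loads(kubectl_json).get("items", [])
    return [
        (item["metadata"]["name"], item.get("status", {}).get("phase", "Unknown"))
        for item in items
    ]


# Hypothetical, heavily trimmed sample of what the command might return.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "tkgs-cluster-1-bck-1"}, "status": {"phase": "Completed"}},
        {"metadata": {"name": "tkgs-cluster-1-bck-2"}, "status": {}},
    ]
})
for name, phase in summarize_backups(sample):
    print(f"{name}: {phase}")
```

A backup that Velero hasn't picked up yet has no phase set, which is why the sketch falls back to "Unknown".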

To create a backup or restore a backup we can use the tmc command-line utility, but I've also created a backup through kubectl with a yaml file. You should also be able to use the velero command-line utility

Note that I haven't found any documentation covering this, so I'm not sure whether it's supported with TMC.

apiVersion: velero.io/v1
kind: Backup
metadata:
  labels:
    velero.io/storage-location: tmc-dataprotection-rhmlab
    manager: velero-server
    operation: Create
  name: tkgs-cluster-1-bck-3
  namespace: velero
spec:
  hooks: {}
  storageLocation: tmc-dataprotection-rhmlab
  ttl: 720h0m0s

kubectl -n velero apply -f backup.yaml

Backup initiated through kubectl

Restore

Now, let's do a simple demo of a restore.

I've created a Namespace in my cluster, and have deployed a Pod running an nginx container

Namespace to be deleted

Now, let's create a backup of this namespace through the TMC console

Created backup of namespace

And then, disaster happens. The namespace gets deleted and our Pod is lost

Namespace deleted

From TMC I can now hit the Restore backup button from the backup created to start the Restore wizard.

First we select what to restore from the backup. In this example I can go with the entire backup since it only contains the deleted namespace, but we could also restore just that namespace from a full cluster backup.

Select what to restore

Give the restore a name and hit Restore

Start restore
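
Behind the wizard, a restore is also just a Velero custom resource that points at a backup. Here is a minimal sketch of such a Restore object built in Python. backupName and includedNamespaces are fields from Velero's Restore API; the names used in the example are hypothetical, and as with the backup YAML above I can't say whether creating restores outside of TMC is supported.

```python
import json


def restore_manifest(name, backup_name, namespaces=None):
    """Build a Velero Restore object that restores from `backup_name`."""
    spec = {"backupName": backup_name}
    if namespaces:
        # Restrict the restore to these namespaces from a larger backup.
        spec["includedNamespaces"] = list(namespaces)
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Restore",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": spec,
    }


# Hypothetical names: restore only the "demo" namespace from a full backup.
print(json.dumps(restore_manifest("demo-restore", "full-cluster-bck", ["demo"]), indent=2))
```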

After a while our Restore operation has finished

Restore finished

And we have our Namespace and Pod back in the cluster

Resources back

Note, for an example of a restore with a volume as well, check out Dean Lewis' post mentioned at the start of this blog post.

Summary

Wow, another lengthy post, but hopefully this has been a useful intro to what we get from the integrated Data protection feature in Tanzu Mission Control.

I really like the visibility we get both from the TMC console, which is more of an Admin console, and from the kubectl and velero command-line utilities, which can help Developers in their daily tasks.

Hopefully we will get the ability to set other object storage locations as targets going forward, as not everyone wants to, or is even allowed to, use AWS as a backup location.

Thanks for reading and feel free to reach out if you have any questions or comments.

This page was modified on February 23, 2021: TMC data protection post publish