Backup and Restore in Percona Kubernetes Operator for Percona XtraDB Cluster

Backup and Restore in Percona Kubernetes Operator for Percona XtraDB Cluster

Backup and Restore in Percona Kubernetes Operator for Percona XtraDB ClusterDatabase backups are a fundamental requirement in almost every implementation, no matter the size of the company or the nature of the application. Taking a backup should be a simple task that can be automated to ensure it’s done consistently and on schedule. Percona has an enterprise-grade backup tool, Percona XtraBackup, that can be used to accomplish these tasks. Percona also has a Percona Kubernetes Operator for Percona XtraDB Cluster (PXC Operator), which has Percona XtraBackup built into it. Percona XtraBackup has the ability for both automated and on-demand backups. Today we will explore taking backups and restoring these backups using the PXC Operator deployed on Google Cloud Platform’s Google Kubernetes Engine.

Backup Types

There are two different storage types and backup methods we can use to store backups via the PXC Operator. The first storage type we can use is a Persistent Volume Claim (PVC), which is essentially a ‘ticket’ that requests to use a storage class defined as a Persistent Volume (PV). A PV can be a variety of storage types such as local disk, NFS, or most commonly a block storage (GCP Persistent Disk, AWS EBS, etc.). The second type of storage for backups is by using the S3 protocol or S3 protocol compatible object storage.

There are also two types of backups we can take in regards to the timing of backups. We can take on-demand backups or scheduled backups. The PXC Operator’s deploy/cr.yaml can be edited to schedule backups that are automated.  We can configure the backup schedule using Unix cron string format. We can also take a backup on-demand, by running a single command which we will demonstrate below.

Persistent Volume Backups

When a backup job is launched using a PVC, the PXC Operator will launch a Pod that will connect to the PXC as a 4th member and leverage PXC’s native auto-join capability by starting a State Snapshot (SST). An SST is a full data copy from one node (donor) to the joining node (joiner). The SST is received as a single xbstream file and stored in the volume along with an MD5 checksum of the backup. Xbstream is a custom streaming format that supports simultaneous compression and streaming.

Object Storage Backups (S3)

Like the PVC method, when an S3-based backup job is launched, the PXC Operator will launch a new Pod that will connect to the PXC as a 4th member and will start an SST. The SST is streamed from a donor node as an xbstream to the configured S3 endpoint using xbcloud. The purpose of xbcloud is to download and upload a full or part of xbstream archive to or from the cloud.

S3 Backup Configuration

deploy/backup-s3.yaml

The first step is to add in our access and secret access keys. Each cloud service provider has a different method of distributing these keys.

deploy/cr.yaml

Next, we can go to our deploy/cr.yaml and edit the bucket information. If we wanted to edit the automated backup schedule, we would simply go down a few more lines in deploy/cr.yaml and edit the schedule.

deploy/backup/backup.yaml

Finally, we need to ensure that our storageName in deploy/backup/backup.yaml matches the name that is in our deploy/cr.yaml. We are now ready to take backups!

 

On-Demand S3 Backup and Restore

We will explore taking an on-demand S3 backup. If you want to use a PVC, this can also be configured in deploy/cr.yaml. Keep in mind that the PVC for the filesystem-type backup storage should request enough space to fit all of the backup data.

To take an on-demand backup we would run the command below and can display the backup Pod’s name as well.

kubectl apply -f deploy/backup.yaml
kubectl get pods | grep backup1

In order to list all of the backups, you would run the following command.

kubectl get pxc-backups

Now that we have the name of the backup, we would edit the backupName in deploy/backup/restore.yaml In order to restore a backup, we would run the following command. We can also get the name of the restoring Pod.

kubectl apply -f deploy/restore.yaml
kubectl get pods | grep restore

As we saw, backups can be accomplished very quickly and easily thanks to the efforts of Percona’s Engineer Team. If you’d like to learn more about this process, you can take a look at the documentation.


by Stephen Thorn via Percona Database Performance Blog

Comments