Project author: sreesanpd

Project description:
Scheduled backup and disaster recovery solution for Google Filestore using kubernetes cronjobs
Primary language: Dockerfile
Project URL: git://github.com/sreesanpd/google-filestore-backup-kubernetes-cronjobs.git


google-filestore-backup-kubernetes-cronjobs

Scheduled backup and disaster recovery solution for Google Filestore using kubernetes cronjobs

This project creates scheduled backups of Google Cloud Filestore contents to Google Cloud Storage (GCS) buckets at regular intervals. It also replicates the contents of a Filestore instance in one location to a Filestore instance in another location at scheduled intervals, for disaster recovery (DR) purposes. It uses kubernetes cronjobs to schedule the backups.

This is an ideal solution if you are using Filestore instances as storage volumes for your kubernetes containers in Google Kubernetes Engine (GKE). At the time of writing, the native backup and snapshot features for Filestore are in alpha and did not meet our use case, so I created my own solution, inspired by Benjamin Maynard's kubernetes-cloud-mysql-backup solution.

Environment Variables

The table below lists all of the environment variables that are configurable for google-filestore-backup-kubernetes-cronjobs.

Environment Variable    Purpose

GCP_GCLOUD_AUTH - Base64 encoded service account key exported as JSON. Example of how to generate: base64 ~/service-key.json
BACKUP_PROVIDER - Backend to use for the filestore backups. Set this to GCP.
GCP_BUCKET_NAME - Name of the Google Cloud Storage (GCS) bucket where the filestore backups will be stored.
FILESHARE_MOUNT_PRIMARY - Directory name under /mnt where the primary filestore is mounted in the container. The default is primary-filestore, i.e. the mount path /mnt/primary-filestore. If you change it, make the corresponding changes in volumeMounts under the container spec of the kubernetes cronjob. Do not change the /mnt prefix of the mount path.
FILESHARE_MOUNT_SECONDARY - Directory name under /mnt where the secondary filestore is mounted in the container. The default is secondary-filestore, i.e. the mount path /mnt/secondary-filestore. If you change it, make the corresponding changes in volumeMounts under the container spec of the kubernetes cronjob. Do not change the /mnt prefix of the mount path.
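
If you want to sanity check these variables before deploying to Kubernetes, you can run the image directly. This is a minimal sketch: the image tag, key path, and host directories are assumptions, and it presumes the image's entrypoint runs the backup script. Note that Kubernetes decodes Secret data before injecting it as an environment variable, so the raw JSON key is passed here.

  # Local smoke test (sketch): host dirs stand in for the Filestore mounts
  docker run --rm \
      -e GCP_GCLOUD_AUTH="$(cat ~/filestore-backup-storage-sa-key.json)" \
      -e BACKUP_PROVIDER="gcp" \
      -e GCP_BUCKET_NAME="my-filestore-backup-bucket" \
      -e FILESHARE_MOUNT_PRIMARY="primary-filestore" \
      -e FILESHARE_MOUNT_SECONDARY="secondary-filestore" \
      -v /tmp/primary:/mnt/primary-filestore \
      -v /tmp/secondary:/mnt/secondary-filestore \
      gcr.io/my-project-123456/gcp-filestore-k8s-backup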

GCS Backend Configuration

The subheadings below detail how to configure Filestore backups to a Google GCS backend.

GCS - Configuring the Service Account

In order to back up to a GCS bucket, you must create a Service Account in Google Cloud Platform that has the necessary permissions to write to the destination bucket (for example the Storage Object Creator role).

Once created, you must create a key for the Service Account in JSON format. This key should then be base64 encoded and set in the GCP_GCLOUD_AUTH environment variable. For example, to encode ~/service-key.json you would run base64 ~/service-key.json in your terminal and set the output as the GCP_GCLOUD_AUTH environment variable.
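
A minimal sketch of the two steps (the key file name and service account address are placeholders; the exact commands used in this walkthrough appear under Pre-Requisites below):

  # Create a JSON key for the service account, then Base64 encode it;
  # stripping newlines makes the value safe to paste into the yaml
  gcloud iam service-accounts keys create ~/service-key.json \
      --iam-account my-sa@my-project.iam.gserviceaccount.com
  base64 ~/service-key.json | tr -d '\n'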

GCS - Example Kubernetes Cronjob

An example of how to schedule this container in Kubernetes as a cronjob is below. It configures a filestore backup to run each day at 01:00. The GCP Service Account key is stored in a kubernetes secret, and Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) for the primary and secondary filestores are created.

  • replace the nfs server with the IP address of each Filestore instance
  • replace the nfs path with the file share name of each Filestore instance (the path must be absolute, e.g. /vol1)
  • replace gcp_gcloud_auth with the Base64 encoded service account key
  • replace the docker image url with the image you have built
  • replace GCP_BUCKET_NAME with the GCS bucket you have created for storing filestore backups
  apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: primary-filestore-pv
  spec:
    storageClassName: primary-filestore
    capacity:
      storage: 1000Gi
    accessModes:
      - ReadWriteMany
    nfs:
      server: "<IP address of google filestore primary instance>"
      path: "<fileshare name of the google filestore primary instance>"
  ---
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: primary-filestore-pvc
  spec:
    accessModes:
      - ReadWriteMany
    storageClassName: primary-filestore
    resources:
      requests:
        storage: 1000Gi
  ---
  apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: secondary-filestore-pv
  spec:
    storageClassName: secondary-filestore
    capacity:
      storage: 1000Gi
    accessModes:
      - ReadWriteMany
    nfs:
      server: "<IP address of google filestore secondary instance>"
      path: "<fileshare name of the google filestore secondary instance>"
  ---
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: secondary-filestore-pvc
  spec:
    accessModes:
      - ReadWriteMany
    storageClassName: secondary-filestore
    resources:
      requests:
        storage: 1000Gi
  ---
  apiVersion: v1
  kind: Secret
  metadata:
    name: filestore-backup
  type: Opaque
  data:
    gcp_gcloud_auth: "<Base64 encoded Service Account Key>"
  ---
  apiVersion: batch/v1beta1
  kind: CronJob
  metadata:
    name: filestore-backup
  spec:
    schedule: "0 01 * * *"
    jobTemplate:
      spec:
        template:
          spec:
            containers:
              - name: filestore-backup
                image: "<Docker Image URL>"
                imagePullPolicy: Always
                volumeMounts:
                  - name: primary-filestore
                    mountPath: "/mnt/primary-filestore"
                  - name: secondary-filestore
                    mountPath: "/mnt/secondary-filestore"
                env:
                  - name: GCP_GCLOUD_AUTH
                    valueFrom:
                      secretKeyRef:
                        name: filestore-backup
                        key: gcp_gcloud_auth
                  - name: BACKUP_PROVIDER
                    value: "gcp"
                  - name: GCP_BUCKET_NAME
                    value: "<Name of the GCS bucket where filestore backups need to be stored>"
                  - name: FILESHARE_MOUNT_PRIMARY
                    value: "primary-filestore"
                  - name: FILESHARE_MOUNT_SECONDARY
                    value: "secondary-filestore"
            restartPolicy: Never
            volumes:
              - name: primary-filestore
                persistentVolumeClaim:
                  claimName: primary-filestore-pvc
              - name: secondary-filestore
                persistentVolumeClaim:
                  claimName: secondary-filestore-pvc
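
After applying this manifest you do not need to wait for the 01:00 schedule to test it; kubectl can trigger a one-off run from the cronjob (a sketch, assuming the CronJob name from the manifest above):

  # Create a one-off job from the cronjob and check its status
  kubectl create job --from=cronjob/filestore-backup filestore-backup-manual-test
  kubectl get jobs filestore-backup-manual-test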

Pre-Requisites

In a production environment, we expect these pre-requisites already exist. If not, create them as follows.

I use Google Cloud Shell for running the commands below. If you are not using Cloud Shell, make sure you have installed the necessary tools (gcloud, gsutil, kubectl, docker) before running them.

Set the variables

  project=my-project-123456
  vpcname=gke-vpc
  subnet1=subnet1
  subnet2=subnet2
  storagebucket=$project-filestore-backup$RANDOM
  primaryfilestore=filestore-primary
  secondaryfilestore=filestore-secondary
  primaryfileshare=vol1
  secondaryfileshare=vol1
  gkecluster=gke-cluster1
  serviceaccount=filestore-backup-storage-sa

1. Create VPC & Subnets

Create VPC

  gcloud compute networks create $vpcname --subnet-mode=custom --bgp-routing-mode=regional

Create Firewall Rules

  gcloud compute firewall-rules create allow-all-access-gke --network $vpcname --allow all

Create Subnet1 with a secondary IP range for the GKE pods

  gcloud compute networks subnets create $subnet1 --network=$vpcname --range=10.128.0.0/19 --region=europe-north1 --secondary-range=pods-$subnet1=10.130.0.0/19

Create Subnet2 with a secondary IP range for the GKE pods

  gcloud compute networks subnets create $subnet2 --network=$vpcname --range=10.132.0.0/19 --region=europe-west4 --secondary-range=pods-$subnet2=10.134.0.0/19
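
To confirm both subnets and their secondary pod ranges were created as expected, you can list them (optional check):

  gcloud compute networks subnets list --network=$vpcname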

2. Create Storage Bucket for filestore backup

  gsutil mb -p $project -c STANDARD -l eu gs://$storagebucket

3. Create Primary and Secondary Filestore instances

Create Primary Filestore Instance

  gcloud filestore instances create $primaryfilestore --project=$project --zone=europe-north1-b --tier=STANDARD --file-share=name=$primaryfileshare,capacity=1TB --network=name=$vpcname

Create Secondary Filestore Instance

  gcloud filestore instances create $secondaryfilestore --project=$project --zone=europe-west4-a --tier=STANDARD --file-share=name=$secondaryfileshare,capacity=1TB --network=name=$vpcname
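
The PersistentVolume manifests in the cronjob example need the IP address of each instance. You can read it from the instance description; the --format projection below assumes the networks[].ipAddresses[] fields this command normally returns:

  # NFS server IPs to paste into the PV manifests
  gcloud filestore instances describe $primaryfilestore --zone=europe-north1-b \
      --format="value(networks[0].ipAddresses[0])"
  gcloud filestore instances describe $secondaryfilestore --zone=europe-west4-a \
      --format="value(networks[0].ipAddresses[0])"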

4. Create GKE Cluster

  gcloud container clusters create $gkecluster \
      --region europe-north1 --node-locations europe-north1-a,europe-north1-b \
      --network $vpcname \
      --subnetwork $subnet1 \
      --cluster-secondary-range-name pods-$subnet1 \
      --services-ipv4-cidr 10.131.128.0/24 \
      --enable-private-nodes \
      --enable-ip-alias \
      --master-ipv4-cidr 10.131.248.0/28 \
      --num-nodes 1 \
      --default-max-pods-per-node 64 \
      --no-enable-basic-auth \
      --no-issue-client-certificate \
      --enable-master-authorized-networks \
      --master-authorized-networks=35.201.7.129/32

For --master-authorized-networks, replace the value with the public IP address of the machine from which you will run kubectl commands.

If you are using google cloud shell, you can run the below command to get the public ip address of your cloud shell:

  curl icanhazip.com
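
Once the cluster is up, fetch its credentials so that kubectl targets it (same region as the create command):

  gcloud container clusters get-credentials $gkecluster --region europe-north1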

5. Create Container Registry for storing container images

Refer to this document to enable Container Registry and authenticate to it: https://cloud.google.com/container-registry/docs/quickstart

6. Create Service Account for filestore backup & set permissions

Create Service Account

  gcloud iam service-accounts create $serviceaccount --description="sa for filestore backup gcs storage" --display-name="filestore-backup-storage-sa"

Set ACL for the Service Account on the filestore backup storage bucket

  gsutil iam ch serviceAccount:$serviceaccount@$project.iam.gserviceaccount.com:objectAdmin gs://$storagebucket/
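
You can verify the binding took effect with:

  gsutil iam get gs://$storagebucket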

Create JSON Key for the Service Account

  gcloud iam service-accounts keys create ~/filestore-backup-storage-sa-key.json --iam-account $serviceaccount@$project.iam.gserviceaccount.com

Convert the JSON key to Base64 encoded format for use in the kubernetes secret

  base64 ~/filestore-backup-storage-sa-key.json | tr -d '\n' | tr -d '\r' > ~/filestore-backup-storage-sa-key-base64.txt

This output should be used as the value of gcp_gcloud_auth (which feeds the GCP_GCLOUD_AUTH environment variable) in filestore-backups-cronjob-sample.yaml.

Note: Make sure there are no newline characters when you copy and paste this value into the yaml.
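
Alternatively, you can let kubectl do the Base64 encoding for you by creating the secret directly from the key file; if you take this route, remove the Secret block from the sample yaml (a sketch, run after fetching the cluster credentials):

  kubectl create secret generic filestore-backup \
      --from-file=gcp_gcloud_auth=$HOME/filestore-backup-storage-sa-key.json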

Using the Solution

Clone the repository to the cloud shell

  git clone https://github.com/sreesanpd/google-filestore-backup-kubernetes-cronjobs

Change Directory to repository folder

  cd google-filestore-backup-kubernetes-cronjobs

Change directory to the Dockerfile folder

  cd docker-resources

Docker build and push to container registry

  docker build . -t gcr.io/$project/gcp-filestore-k8s-backup
  docker push gcr.io/$project/gcp-filestore-k8s-backup

Note: Make sure that you have enabled Container Registry and authenticated to it as per step 5 of the pre-requisites.
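
To confirm the image landed in the registry:

  gcloud container images list --repository=gcr.io/$project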

Change directory to kubernetes folder

  cd ../kubernetes-resources

Modify the yaml with the correct values for your environment

Refer to GCS - Example Kubernetes Cronjob in this document

Create the cronjob in the GKE cluster

  kubectl apply -f filestore-backups-cronjob-sample.yaml
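
To confirm the cronjob is registered and, after a run, that a timestamped backup folder appeared in the bucket:

  kubectl get cronjob filestore-backup
  gsutil ls gs://$storagebucket/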

Troubleshooting

If there is a problem with the cronjob container, you can inspect it as follows:

  1. Add a sleep timer to the container by editing google-filestore-backup-kubernetes-cronjobs/docker-resources/resources/filestore-backup.sh. Otherwise the container gets deleted immediately after the job has run:

  #!/bin/bash

  ## Create the GCloud Authentication file if set ##
  if [ ! -z "$GCP_GCLOUD_AUTH" ]
  then
      echo "$GCP_GCLOUD_AUTH" > "$HOME"/gcloud.json
      gcloud auth activate-service-account --key-file="$HOME"/gcloud.json
  fi

  ## backup filestore to GCS ##
  DATE=$(date +"%m-%d-%Y-%T")
  gsutil rsync -r /mnt/$FILESHARE_MOUNT_PRIMARY/ gs://$GCP_BUCKET_NAME/$DATE/

  ## rsync primary filestore to secondary filestore ##
  rsync -avz --delete /mnt/$FILESHARE_MOUNT_PRIMARY/ /mnt/$FILESHARE_MOUNT_SECONDARY/

  ## keep the pod alive for inspection; remove once troubleshooting is done ##
  sleep 1000

  2. Build and push the container image to the container repository.

  3. Deploy the cronjob.

  4. Log in to the cronjob container shell by running the command below, replacing podname with your cronjob's pod name:

  kubectl exec -it podname -- /bin/bash

  5. Run the bash script manually using:

  bash -x /filestore-backup.sh

  6. You can also check whether the filestore has been properly mounted by running:

  df -h

If there is any problem deploying the cronjob itself, run:

  kubectl describe cronjob filestore-backup
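
To find the pod name for the exec and logs commands, list the pods; pods spawned by the cronjob's jobs have names starting with the job name:

  kubectl get pods | grep filestore-backup
  kubectl logs -f <pod name>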

Clean Up

Delete GKE Cluster

  gcloud container clusters delete $gkecluster --region europe-north1 --quiet

Delete Filestore Instances

  gcloud filestore instances delete $primaryfilestore --zone europe-north1-b --quiet
  gcloud filestore instances delete $secondaryfilestore --zone europe-west4-a --quiet

Delete Storage Bucket

  gsutil rm -r gs://$storagebucket
  gsutil rb -f gs://$storagebucket

Delete Subnets

  gcloud compute networks subnets delete $subnet1 --region europe-north1 --quiet
  gcloud compute networks subnets delete $subnet2 --region europe-west4 --quiet

Delete Firewall Rules

  gcloud compute firewall-rules delete allow-all-access-gke --quiet

Delete VPC

  gcloud compute networks delete $vpcname --quiet

Delete Service Account

  gcloud iam service-accounts delete $serviceaccount@$project.iam.gserviceaccount.com --quiet

TODO

  • replace the Ubuntu base image with Alpine Linux
  • integrate Microsoft Teams notifications
  • integrate Slack notifications