When managing Kubernetes workloads, you may encounter scenarios where you must migrate data from one persistent volume to another. This can happen due to storage class changes, resizing constraints, cloud provider migrations, or performance optimizations. Ensuring a smooth transition while maintaining data integrity is crucial.
In practice, migrating persistent volume (PV) data in Kubernetes can be complex, especially when dealing with large datasets or when minimal downtime is required.
PV-Migrate is an open-source tool designed to simplify this process by providing a reliable, automated way to transfer data between two persistent volumes in the same or different namespaces.
It leverages rsync over Kubernetes jobs to efficiently copy data while preserving permissions, file structures, and symbolic links. It works seamlessly across different storage classes, allowing users to migrate data without needing manual intervention or external backup tools.
This guide explores strategies to migrate Kubernetes PersistentVolumeClaims
from one Amazon EKS (Elastic Kubernetes Service) cluster to another using PV-Migrate.
Installation
There are various installation methods for different use cases. You can follow your preferred installation method here:
https://github.com/utkuozdemir/pv-migrate/blob/master/INSTALL.md
Usage
Once pv-migrate is installed, we can explore the command we'll use to start our migration.
Here's what you need to know about the command, its usage, and its flags:
https://github.com/utkuozdemir/pv-migrate/blob/master/USAGE.md#usage
Notable Flags
--source and --dest - Using these flags, we specify which PersistentVolumeClaims we're copying from and to.
--source-kubeconfig and --dest-kubeconfig - Using these flags, we specify the kubeconfig files of the clusters we're working with.
--source-context and --dest-context - Using these flags, we specify the contexts of the clusters we're working with; these go hand in hand with the flags above.
--source-namespace and --dest-namespace - Specify the namespaces where the source and destination PersistentVolumeClaims reside.
--helm-set - Using this flag, we can pass rsync extra arguments and other pv-migrate Helm configuration values.
--strategies - Using this flag, you can specify the order of the strategies you want pv-migrate to attempt in your migration.
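Putting a few of these flags together, and mirroring the full command used later in this guide, a minimal same-cluster invocation might look like the sketch below (the namespace and PVC names are placeholders):

pv-migrate \
  --source-namespace "apps" \
  --source "old-pvc" \
  --dest-namespace "apps" \
  --dest "new-pvc"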
Strategies
PV-Migrate offers a variety of strategies for migrating your volume contents. During the migration it tries these strategies in order until one of them works or all of them fail. The exception is the local strategy,
which is experimental at the moment and not attempted by default.
mnt2 (Mount both) - Mounts both PVCs in a single pod and runs a regular rsync, without using SSH or the network. Only applicable if source and destination PVCs are in the same namespace and both can be mounted from a single pod.
svc (Service) - Runs rsync+ssh over a Kubernetes Service (ClusterIP). Only applicable when source and destination PVCs are in the same Kubernetes cluster.
lbsvc (Load Balancer Service) - Runs rsync+ssh over a Kubernetes Service of type LoadBalancer. Always applicable (will fail if LoadBalancer IP is not assigned for a long period).
local (Local Transfer) - Runs sshd on both source and destination, then uses a combination of kubectl port-forward logic and an SSH reverse proxy to tunnel all the traffic over the client device (the device which runs pv-migrate, e.g. your laptop). Requires ssh command to be available on the client device.
Note that this strategy is experimental (and not enabled by default), potentially can put heavy load on both apiservers and is not as resilient as others. It is recommended for small amounts of data and/or when the only access to both clusters seems to be through kubectl (e.g. for air-gapped clusters, on jump hosts etc.).
https://github.com/utkuozdemir/pv-migrate/blob/master/USAGE.md#strategies
Setup
In our scenario, our EKS clusters live in different accounts and different VPCs, so there are additional steps to configure network accessibility compared to working inside the same cluster.
Configure EKS Cluster Config file
You can generate or update your Kubernetes config file using the commands below.
Don't forget to replace the value of each flag with the relevant value for your source and destination clusters.
Generate
aws eks update-kubeconfig --region <region-code> --name <eks-cluster-name> --kubeconfig ./<kube-config-file-name> --profile <aws-config-profile>
Update
aws eks update-kubeconfig --region <region-code> --name <eks-cluster-name> --profile <aws-config-profile>
Whether you generated or updated your kube-config file, you can either open the file or use the command below to find the contexts of your source and destination clusters.
kubectl config get-contexts
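The output looks roughly like this (the cluster and context names below are purely illustrative; with EKS the context name is often the cluster ARN):

CURRENT   NAME                 CLUSTER              AUTHINFO             NAMESPACE
*         source-eks-cluster   source-eks-cluster   source-eks-user
          dest-eks-cluster     dest-eks-cluster     dest-eks-user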
Take note of the values of your source and destination kube-config
files and contexts.
Configure Networking
As previously stated, our EKS clusters live in different accounts and different VPCs.
In this scenario, we have a couple of options for configuring our networks: either the communication will be public, or we keep the communication private by leveraging the AWS backbone network.
In our case, we chose the latter, and below is our configuration.
VPC Peering
We've established VPC Peering between our VPCs.
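As a rough sketch, the peering connection can be requested from one account and accepted from the other using the AWS CLI; the VPC IDs, account ID, region, and profiles below are placeholders:

aws ec2 create-vpc-peering-connection \
  --vpc-id <source-vpc-id> \
  --peer-vpc-id <destination-vpc-id> \
  --peer-owner-id <destination-account-id> \
  --peer-region <destination-region> \
  --profile <source-aws-profile>

aws ec2 accept-vpc-peering-connection \
  --vpc-peering-connection-id <peering-connection-id> \
  --profile <destination-aws-profile>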
Route Table
We've modified the Route Tables of the relevant subnets and created routes to direct traffic between the VPCs using their CIDRs.
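For example, a route toward the peer VPC's CIDR can be added to a route table like this (IDs, CIDR, and profile are placeholders; repeat on both sides):

aws ec2 create-route \
  --route-table-id <route-table-id> \
  --destination-cidr-block <peer-vpc-cidr> \
  --vpc-peering-connection-id <peering-connection-id> \
  --profile <aws-profile>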
Security Groups
We've modified the Control Plane Security Group of each EKS cluster to allow communication with the other.
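For illustration, an ingress rule allowing traffic from the peer VPC's CIDR could be added as follows; the SSH port shown assumes the rsync-over-ssh traffic used by the lbsvc strategy, so adjust the port and group to your setup:

aws ec2 authorize-security-group-ingress \
  --group-id <cluster-security-group-id> \
  --protocol tcp \
  --port 22 \
  --cidr <peer-vpc-cidr> \
  --profile <aws-profile>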
Configure Permissions
As our EKS clusters live in different accounts, we need to allow cross-account access between the EKS clusters.
This simply means configuring either the aws-auth
ConfigMap or EKS IAM access entries.
Using aws-auth
Using aws-auth,
you can create Kubernetes Roles and RoleBindings to map the Users or Roles that you are using in this migration.
Note: This option will be deprecated soon.
Here's the example aws-auth
ConfigMap:
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::111122223333:role/my-role
      username: system:node:{{EC2PrivateDNSName}}
    - groups:
      - eks-console-dashboard-full-access-group
      rolearn: arn:aws:iam::111122223333:role/my-console-viewer-role
      username: my-console-viewer-role
  mapUsers: |
    - groups:
      - system:masters
      userarn: arn:aws:iam::111122223333:user/admin
      username: admin
    - groups:
      - eks-console-dashboard-restricted-access-group
      userarn: arn:aws:iam::444455556666:user/my-user
      username: my-user
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
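The ConfigMap lives in the kube-system namespace and can be edited in each cluster directly, for example:

kubectl edit configmap aws-auth -n kube-system --kubeconfig <kube-config-file> --context <context>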
For more details, you can follow the AWS documentation.
Using EKS IAM access entries
This is mostly similar to the former, but it can be managed outside your EKS cluster.
Fundamentally, an EKS access entry associates a set of Kubernetes permissions with an IAM identity, such as an IAM role.
Considering that aws-auth
will be deprecated soon, it's better to start using this approach.
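As a sketch, an access entry for an IAM role used during the migration can be created and granted cluster-wide admin access like this (the cluster name, role ARN, and profile are placeholders; pick a narrower access policy or scope if that fits your case better):

aws eks create-access-entry \
  --cluster-name <eks-cluster-name> \
  --principal-arn arn:aws:iam::<account-id>:role/<migration-role> \
  --profile <aws-config-profile>

aws eks associate-access-policy \
  --cluster-name <eks-cluster-name> \
  --principal-arn arn:aws:iam::<account-id>:role/<migration-role> \
  --policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy \
  --access-scope type=cluster \
  --profile <aws-config-profile>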
For more details, you can follow the AWS documentation.
Migration
Once the above steps are configured, we can start the migration. In the previous steps, we gathered all the values we need; we just have to substitute them into the command below.
Note: Before starting the migration, consider stopping or scaling down the relevant Kubernetes resources to zero. This prevents new data from being written during the migration, ensuring data consistency and avoiding potential conflicts.
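For example, a Deployment writing to the source volume can be scaled down like this (the Deployment name is a placeholder):

kubectl --kubeconfig <source-kube-config-file> --context <source-context> \
  -n <source-namespace> scale deployment <deployment-name> --replicas=0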
pv-migrate \
--strategies "lbsvc" \
--lbsvc-timeout 10m0s \
--helm-timeout 10m0s \
--helm-set rsync.extraArgs="--ignore-times --checksum" \
--helm-set rsync.maxRetries=20 \
--helm-set rsync.retryPeriodSeconds=60 \
--log-level "DEBUG" \
--source-kubeconfig "<source-kube-config-file>" \
--source-context "<source-context>" \
--source-namespace "<source-namespace>" \
--source "<source-persistent-volume-claim-name>" \
--dest-kubeconfig "<destination-kube-config-file>" \
--dest-context "<destination-context>" \
--dest-namespace "<destination-namespace>" \
--dest "<destination-persistent-volume-claim-name>" \
--dest-delete-extraneous-files
In the command above, we've prioritized the lbsvc
strategy. As stated in the Strategies section, this implementation leverages a Kubernetes Service of type LoadBalancer,
which in AWS provisions an AWS Load Balancer over which the migration runs.
This setup does not work with the defaults as-is; we need to override the default values of pv-migrate
and tailor them to how the AWS Load Balancer interacts with the EKS cluster.
Specifically, AWS takes time to create the Load Balancer and assign it an address; sometimes Load Balancer creation takes up to 10 minutes to complete. Hence we've added a couple of --helm-set
flags.
--helm-set rsync.maxRetries=20 and --helm-set rsync.retryPeriodSeconds=60 - Using these flags, we give EKS sufficient time to wait for AWS to assign an address to the Load Balancer. Since this process can take a while, EKS may initially be unable to communicate using the FQDN.
--helm-set rsync.extraArgs="--ignore-times --checksum" - Using this flag, we ensure a reliable sync by verifying all file contents, even if timestamps match.
Conclusion
PV-Migrate is a powerful tool for migrating Persistent Volume Claims (PVCs) between EKS clusters across different AWS accounts. By leveraging rsync over Kubernetes jobs, it ensures efficient and reliable data transfer while preserving file integrity and permissions. Unlike manual methods or snapshot-based approaches, pv-migrate simplifies the migration process without requiring downtime or complex configurations.
With its ability to handle cross-cluster and cross-account migrations, it's an excellent choice for EKS administrators looking for a seamless and automated way to transfer persistent data between environments.
Before we embark further into our cloud journey, I invite you to stay connected with me on social media platforms. Follow along on Twitter, LinkedIn. Let's continue this exploration together and build a thriving community of cloud enthusiasts. Join me on this exciting adventure!