Protecting Magento e-commerce platform in AKS against disasters with Astra Control Service

Abstract

Introduction

Scenario

Deploying Magento

Protecting Magento with Astra Control Service

Custom execution hooks for Elasticsearch and Magento

Simulating a disaster and recovering the application to another cluster

Disaster recovery simulation

Summary

Resources

Abstract

In this article, we describe how to use NetApp® Astra™ Control Service to protect a multi-tier application with multiple components (like Magento, now Adobe Commerce) running on Azure Kubernetes Service against disasters such as the complete loss of a region. We demonstrate how pre-snapshot execution hooks in Astra Control Service enable us to create application-consistent snapshots and backups across all application tiers and to recover the application to a different region in case of a disaster.

Co-authors: Patric Uebele, Sayan Saha

Introduction

NetApp® Astra™ Control is a solution that makes it easier to manage, protect, and move data-rich Kubernetes workloads within and across public clouds and on-premises. Astra Control provides persistent container storage that leverages NetApp’s proven and expansive storage portfolio in the public cloud and on premises, supporting Azure managed disks as storage backend options as well.

Astra Control also offers a rich set of application-aware data management functionality (like snapshot and restore, backup and restore, activity logs, and active cloning) for local data protection, disaster recovery, data audit, and mobility use cases for your modern apps. Astra Control provides complete protection of stateful Kubernetes applications by saving both the data and the metadata (deployments, config maps, services, secrets, and so on) that constitute an application in Kubernetes. Astra Control can be managed via its user interface, accessible from any web browser, or via its REST API.

For a set of validated applications (MySQL, MariaDB, PostgreSQL, and Jenkins), Astra Control already includes the necessary hooks to guarantee application-consistent snapshots and backups. For other applications, Astra Control allows us to add custom hooks that are executed before and after taking snapshots of applications managed by Astra Control. With the Owner, Admin, or Member role in Astra Control, we can define custom execution hooks for non-validated applications to guarantee consistent snapshots. Templates for execution hook scripts can be found in the Astra Control documentation.

Astra Control has two variants:

  1. Astra Control Service (ACS) – A fully managed application-aware data management service that supports Azure Kubernetes Service (AKS), Azure Disk Storage, and Azure NetApp Files (ANF).
  2. Astra Control Center (ACC) – application-aware data management for on-premises Kubernetes clusters, delivered as a customer-managed Kubernetes application from NetApp.

To showcase Astra Control’s backup and recovery capabilities in AKS, we use Magento, an open-source e-commerce platform written in PHP. Magento consists of a web-based front end, an Elasticsearch instance for search and analysis features, and a MariaDB database that tracks all the shopping inventory and transaction details. Every pod in the application uses persistent volumes to store data: ReadWriteOnce (RWO) volumes backed by Azure Disk for Elasticsearch and MariaDB, and a ReadWriteMany (RWX) volume backed by Azure NetApp Files for the web front end, which stores media files like product images.

Scenario

In the following, we will demonstrate how custom execution hooks enable us to take consistent snapshots and backups across all the components of Magento. Based on the templates for custom execution hooks in the ACS documentation, we’ll write simple hook scripts for Elasticsearch and Magento, add the scripts as pre- and post-snapshot execution hooks to ACS, and test their functionality in a disaster recovery simulation with a running Magento instance across two AKS clusters in separate regions.

Deploying Magento

We deploy Magento on the AKS cluster pu-aks-1 in location eastus. The cluster is already managed by our ACS account, with Azure Disk (default) chosen as the default storage class. ACS also automatically installed Astra Trident as the storage provisioner for RWX volumes backed by Azure NetApp Files at the Premium service level (storage class netapp-anf-perf-premium):

GeertVanTeylingen_0-1649929130568.jpeg

To deploy the Magento application, we use the appropriate helm chart from the Bitnami Helm chart repository, specifying the parameters to use RWX access mode volumes with storage class netapp-anf-perf-premium for the Magento PV:
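The original command is not reproduced here, so the following is only a minimal sketch of what such a deployment might look like. The release and namespace name (passed as the script argument), the repository URL, and the `--set` keys are assumptions; the keys in particular differ between chart versions and should be checked with `helm show values bitnami/magento` before use.

```shell
#!/bin/sh
# Usage: deploy-magento.sh <release-name>     e.g. deploy-magento.sh myshop
# Without an argument the script only defines its functions.

# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then printf '%s\n' "+ $*"; else "$@"; fi
}

# Install the chart into a namespace named after the release (an assumption).
# The persistence.* keys are assumed chart parameters; verify them for your chart version.
deploy_magento() {
  run helm repo add bitnami https://charts.bitnami.com/bitnami &&
  run helm install "$1" bitnami/magento \
    --namespace "$1" --create-namespace \
    --set persistence.storageClass=netapp-anf-perf-premium \
    --set 'persistence.accessModes[0]=ReadWriteMany'
}

[ $# -eq 0 ] || deploy_magento "$1"
```

Running it with `DRY_RUN=1` first prints the two helm commands for review without touching the cluster.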

After some minutes, all the pods are up and running and one can connect via the external LoadBalancer IP:

For a realistic experience, we install Magento 2 sample data, following the steps here and try connecting to Magento via its external IP address:

GeertVanTeylingen_1-1649929130602.jpeg

Protecting Magento with Astra Control Service

Switching to the ACS UI, we see that the myshop Magento application was discovered by ACS, and we can start managing and protecting it:

GeertVanTeylingen_2-1649929130607.jpeg

Looking at the detail of the managed myshop app in ACS, we see that ACS already provides execution hooks for MariaDB, as it’s one of the applications validated with ACS:

GeertVanTeylingen_3-1649929130616.jpeg

Custom execution hooks for Elasticsearch and Magento

To ensure that snapshots and backups are application-consistent across all Magento tiers, we add custom execution hooks for Elasticsearch and Magento that quiesce the application or flush its caches before a snapshot or backup is taken.

For Elasticsearch, we use the script below as the pre-snapshot hook:

and as post-snapshot hook:
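The original hook scripts are not reproduced here. As a hedged illustration of the idea, the sketch below covers both hooks in one script: called with `pre` it flushes Elasticsearch and blocks writes on all indices, and called with `post` it removes the write block again. It assumes that `curl` is available in the Elasticsearch container, that the node answers on `http://localhost:9200` without authentication, and that a write block is an acceptable way to quiesce the indices; none of this is confirmed by the article.

```shell
#!/bin/sh
# Usage: es-hook.sh pre|post
# Without an argument the script only defines its functions.
ES_URL="${ES_URL:-http://localhost:9200}"   # assumed endpoint, no authentication

# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then printf '%s\n' "+ $*"; else "$@"; fi
}

# Issue one REST call: method, path, optional JSON body.
es_request() {
  if [ -n "${3:-}" ]; then
    run curl -sS -f -X "$1" "$ES_URL$2" -H 'Content-Type: application/json' -d "$3"
  else
    run curl -sS -f -X "$1" "$ES_URL$2"
  fi
}

main() {
  case "${1:-}" in
    pre)   # flush in-memory data to disk, then block writes on all indices
      es_request POST /_flush &&
      es_request PUT /_all/_settings '{"index.blocks.write":true}' ;;
    post)  # allow writes again once the snapshot exists
      es_request PUT /_all/_settings '{"index.blocks.write":false}' ;;
    *)
      echo "usage: es-hook.sh pre|post" >&2
      return 2 ;;
  esac
}

[ $# -eq 0 ] || main "$1"
```

Keeping both actions in one script means the pre- and post-snapshot hooks in ACS can reference the same file and differ only in the argument.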

For Magento, flushing its cache before taking a snapshot should be sufficient, so we want to add this script as pre-snapshot hook:
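The original script is not reproduced here; a minimal sketch of such a hook could look like the following. It relies on the Magento CLI command `cache:flush`. The path to `bin/magento` is an assumption based on the usual layout of the Bitnami image and may differ in other images.

```shell
#!/bin/sh
# Usage: magento-hook.sh pre
# Without an argument the script only defines its functions.
MAGENTO_BIN="${MAGENTO_BIN:-/opt/bitnami/magento/bin/magento}"   # assumed path

# Set DRY_RUN=1 to print the command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then printf '%s\n' "+ $*"; else "$@"; fi
}

main() {
  case "${1:-}" in
    pre)   # flush all Magento cache storage before the snapshot is taken
      run "$MAGENTO_BIN" cache:flush ;;
    *)
      echo "usage: magento-hook.sh pre" >&2
      return 2 ;;
  esac
}

[ $# -eq 0 ] || main "$1"
```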

To add the above custom execution hooks to ACS, we follow the ACS documentation and the steps in this blog post.

Now execution hooks for all Magento components are in place:

GeertVanTeylingen_4-1649929130626.jpeg

And we can start a backup to test proper execution of the hooks:

GeertVanTeylingen_5-1649929130644.jpeg

In the Activity log, we can confirm that all the pre-snapshot hooks are executed before the snapshot process starts:

GeertVanTeylingen_6-1649929130654.jpeg

The post-snapshot hooks follow. They are executed before the backup process begins, which moves the data from the snapshots to the object storage bucket:

GeertVanTeylingen_7-1649929130657.jpeg

We can also check the details of each hook execution, such as container, image, and duration:

GeertVanTeylingen_8-1649929130663.jpeg

Simulating a disaster and recovering the application to another cluster

In the next step, we want to test the recovery of the Magento e-commerce platform after a simulated disaster.

Let’s first start some activity on our sample shopping platform by creating a user account:

GeertVanTeylingen_9-1649929130671.jpeg

GeertVanTeylingen_10-1649929130682.jpeg

And add some items to the user’s shopping basket and wish list:

GeertVanTeylingen_11-1649929130686.jpeg

GeertVanTeylingen_12-1649929130693.jpeg

After doing the above updates on the shopping platform, we create a snapshot of the myshop application (with all the execution hooks still enabled):

GeertVanTeylingen_13-1649929130701.jpeg

And take a backup from this snapshot:

GeertVanTeylingen_14-1649929130708.jpeg

The backup myshop-backup-20220405131329 now contains the most recent updates to the shopping platform:

GeertVanTeylingen_15-1649929130713.jpeg

Disaster recovery simulation

To simulate the complete loss of the cluster pu-aks-1 hosting the myshop application, we delete the cluster from the Azure console.

ACS detects that both the application and the cluster are not reachable anymore:

GeertVanTeylingen_16-1649929130725.jpeg

Both the cluster and the application will be put in the state Removed by ACS:

GeertVanTeylingen_17-1649929130728.jpeg

GeertVanTeylingen_18-1649929130732.jpeg

Because the backups are stored in object storage, and we can add buckets with a very high level of redundancy to Astra Control (see the ACS documentation and this blog post for instructions on how to add buckets for storing your backups), the backups remain available even after the loss of a region, and we can recover the application from an existing backup:

GeertVanTeylingen_19-1649929130739.jpeg

To recover the shopping application from our simulated loss of a complete Azure region, we bring up a new AKS cluster pu-aks-dr in the Azure region westeurope and add it to ACS (to reduce Recovery Time Objective (RTO) we can keep an Astra Control managed AKS cluster in westeurope ready to run the restored application):

GeertVanTeylingen_20-1649929130749.jpeg
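The article does not show how the DR cluster was created. One possible way to do it from the command line is sketched below with the Azure CLI. The resource group (passed as the script argument) and the node count are assumptions, and any ACS-specific cluster requirements still have to be met separately.

```shell
#!/bin/sh
# Usage: create-dr-cluster.sh <resource-group>
# Without an argument the script only defines its functions.

# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then printf '%s\n' "+ $*"; else "$@"; fi
}

# Create the AKS cluster pu-aks-dr in westeurope and fetch its kubeconfig entry.
# The resource group must already exist; the node count of 3 is an assumption.
create_dr_cluster() {
  run az aks create --resource-group "$1" --name pu-aks-dr \
    --location westeurope --node-count 3 --generate-ssh-keys &&
  run az aks get-credentials --resource-group "$1" --name pu-aks-dr
}

[ $# -eq 0 ] || create_dr_cluster "$1"
```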

As we explicitly specified the storage class netapp-anf-perf-premium for the Magento PV during installation via the helm chart, we must make sure that the same storage classes are available on our recovery cluster (this is a known limitation in ACS, see here) and set the same default storage class when adding the DR cluster to ACS:

GeertVanTeylingen_21-1649929130766.jpeg

GeertVanTeylingen_22-1649929130776.jpeg
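Before starting a restore, the presence of the required storage class on the recovery cluster can also be verified from the command line. The sketch below is one way to do this, assuming the kubeconfig context is named after the cluster (for example pu-aks-dr); the script name and the `REQUIRED_SC` variable are hypothetical.

```shell
#!/bin/sh
# Usage: check-storage-class.sh <kube-context>     e.g. check-storage-class.sh pu-aks-dr
# Without an argument the script only defines its functions.
REQUIRED_SC="${REQUIRED_SC:-netapp-anf-perf-premium}"

# Reads `kubectl get storageclass -o name` output on stdin and succeeds
# only if a storage class with exactly the name $1 is listed.
has_storage_class() {
  grep -qxF "storageclass.storage.k8s.io/$1"
}

check_context() {
  if kubectl --context "$1" get storageclass -o name | has_storage_class "$REQUIRED_SC"; then
    echo "$REQUIRED_SC found on $1"
  else
    echo "$REQUIRED_SC missing on $1" >&2
    return 1
  fi
}

[ $# -eq 0 ] || check_context "$1"
```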

Once the DR cluster pu-aks-dr is managed by ACS and Astra Trident has been installed and configured by ACS (i.e., the storage class netapp-anf-perf-premium backed by Azure NetApp Files is available on the cluster), we can initiate the restore of the myshop application, choosing its recent backup myshop-backup-20220405131329 as the restore source:

GeertVanTeylingen_23-1649929130785.jpeg

We select pu-aks-dr as the destination cluster and restore into the same namespace myshop in which the original application was deployed:

GeertVanTeylingen_24-1649929130799.jpeg

GeertVanTeylingen_25-1649929130810.jpeg

The restore to pu-aks-dr starts immediately:

GeertVanTeylingen_26-1649929130815.jpeg

It finishes after a few minutes, resulting in a second managed application myshop in Healthy state running on pu-aks-dr:

GeertVanTeylingen_27-1649929130819.jpeg

Checking on the command line, we see that all pods of the restored app are ready on cluster pu-aks-dr:
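The original command output is not reproduced here. A hedged sketch of such a check follows: it lists the pods in the restored namespace and reports whether every pod shows all of its containers as ready. The context and namespace are passed as arguments; the script name is hypothetical.

```shell
#!/bin/sh
# Usage: check-pods.sh <kube-context> <namespace>     e.g. check-pods.sh pu-aks-dr myshop
# With fewer than two arguments the script only defines its functions.

# Reads `kubectl get pods --no-headers` output on stdin. Succeeds only if at
# least one pod is listed and every pod's READY column has the form n/n.
all_pods_ready() {
  awk '{ split($2, r, "/"); if (r[1] != r[2]) bad = 1; n++ }
       END { exit (bad || n == 0) }'
}

check_namespace() {
  if kubectl --context "$1" get pods --namespace "$2" --no-headers | all_pods_ready; then
    echo "all pods in $2 are ready on $1"
  else
    echo "some pods in $2 are not ready on $1" >&2
    return 1
  fi
}

[ $# -lt 2 ] || check_namespace "$1" "$2"
```

Note that pods of completed jobs report 0/1 and would be counted as not ready by this simple check. The new external IP of the restored service can be looked up with `kubectl get service --namespace myshop`.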

Nevertheless, there’s one last manual step to do before we can access the shopping platform again. As we can see above, the restored Magento service has, for obvious reasons, a different external IP address from the original installation. Because we deployed Magento with the helm chart, the external IP address is stored in Magento’s base URLs (see here, e.g.). Checking the Magento configuration in the restored Magento pod, we see that the base_url parameters still point to the original IP address 20.232.249.211:

We must update both the secure and unsecure base_url setting with the new external IP address 20.31.226.13 in the Magento pod:
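The exact commands used are not reproduced here. The following sketch shows one way to inspect and update the base URLs with the Magento CLI (`config:show`, `config:set`, `cache:flush`) through `kubectl exec`. The workload name `deploy/myshop-magento` and the path to `bin/magento` are assumptions, and depending on the image the CLI may need to run as the web server user rather than as root.

```shell
#!/bin/sh
# Usage: fix-base-url.sh <new-external-ip>     e.g. fix-base-url.sh 20.31.226.13
# Without an argument the script only defines its functions.
NAMESPACE="${NAMESPACE:-myshop}"
TARGET="${TARGET:-deploy/myshop-magento}"                        # assumed workload name
MAGENTO_BIN="${MAGENTO_BIN:-/opt/bitnami/magento/bin/magento}"   # assumed path

# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then printf '%s\n' "+ $*"; else "$@"; fi
}

# Run one Magento CLI command inside the restored Magento pod.
magento() {
  run kubectl exec -n "$NAMESPACE" "$TARGET" -- "$MAGENTO_BIN" "$@"
}

update_base_urls() {
  # Show the stale value first, then set both URLs (trailing slash required).
  magento config:show web/unsecure/base_url
  magento config:set web/unsecure/base_url "http://$1/" &&
  magento config:set web/secure/base_url "https://$1/" &&
  magento cache:flush
}

[ $# -eq 0 ] || update_base_urls "$1"
```

The final cache flush matters because Magento caches its configuration, so the new base URLs may not take effect until the cache is cleared.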

Alternatively, we could have installed the primary Magento application on the source cluster with the MagentoHost parameter pointing to an FQDN instead of the LoadBalancer service IP, and then added a static DNS entry that points the FQDN to the assigned LoadBalancer service IP. When restoring Magento to a different cluster, we would then only need to update the DNS entry with the new LoadBalancer IP on the destination cluster.

With the base_url parameter set to the new IP address of the restored application, we can connect to the restored sample shop again:

GeertVanTeylingen_28-1649929130832.jpeg

and log in as the customer we created initially (using the same login credentials):

GeertVanTeylingen_29-1649929130841.jpeg

The content of the shopping cart was preserved:

GeertVanTeylingen_30-1649929130856.jpeg

as well as the wish list:

GeertVanTeylingen_31-1649929130863.jpeg

We can continue the shopping process where we left off, e.g., by adding the wish list content to the shopping cart:

GeertVanTeylingen_32-1649929130872.jpeg

GeertVanTeylingen_33-1649929130883.jpeg

Summary

In this article, we described how to make Magento (an e-commerce platform) running on AKS with Azure Disk Storage and Azure NetApp Files resilient to disasters, enabling us to provide business continuity for the platform. NetApp® Astra™ Control makes it easy to protect business-critical AKS workloads (stateful and stateless) with just a few clicks. Get started with Astra Control Service today with a free plan.

Resources

  1. https://docs.netapp.com/us-en/astra-control-service/index.html
  2. https://docs.microsoft.com/azure/architecture/example-scenario/magento/magento-azure
  3. https://cloud.netapp.com/blog/astra-blg-easily-integrate-protection-into-your-kubernetes-ci/cd-pipeline-with-netapp-astra-control
  4. https://www.cloudways.com/blog/magento-2-sample-data/
  5. https://techcommunity.microsoft.com/t5/azure-architecture-blog/protecting-mongodb-on-aks-anf-with-astra-control-service-using/ba-p/3057574
  6. https://kubernetes.io/docs/concepts/storage/persistent-volumes/

https://techcommunity.microsoft.com/t5/azure-architecture-blog/protecting-magento-e-commerce-platform-in-aks-against-disasters/ba-p/3285525
