November 08, 2023

5 Ways to Determine the Right Disaster Recovery as a Service Solution for You

DRaaS in the public cloud offers advantages like cloud economics, reliability, ease of use and the flexibility to support infrequent, unpredictable disaster scenarios. Here are five factors to consider when evaluating public cloud as a solution for your DR needs.

Using an on-premises site as a disaster recovery target can be complex, expensive and unreliable. Disaster Recovery as a Service (DRaaS) delivered in the public cloud offers advantages such as cloud economics, reliability, ease of use and the flexibility to support infrequent but unpredictable disaster scenarios.

DRaaS running on an elastic public cloud with built-in automation reduces the amount of underutilized hardware and maintenance tasks, simplifies the deployment in case of a disaster recovery (DR) event and increases the reliability of the DR solution with non-disruptive testing.

Here are five key factors to consider when evaluating public cloud as a solution for your DR needs.

1: Identify the RTOs required by different applications

When determining which applications to back up in the cloud, categorize them by recovery time objective (RTO), the acceptable amount of time before an application comes back online. Some DRaaS solutions deliver RTOs measured in minutes, others in hours and others in days.

Different business needs dictate different RTO levels. A revenue-generating application can rarely be down for long, while HR applications can typically take eight hours or more to come back online without meaningful business impact. Naturally, the shorter the RTO required for an application, the more expensive it is to recover that application within the required timeline.
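As a rough illustration of this categorization, the sketch below groups applications into RTO tiers. The tier boundaries (under 1 hour, under 24 hours, otherwise days), the `App` type and the application names are hypothetical; real thresholds should come from your own business impact analysis.

```python
from dataclasses import dataclass

# Hypothetical tier boundaries in hours; adjust to your own requirements.
TIERS = [
    ("minutes", 1),   # RTO under 1 hour
    ("hours", 24),    # RTO under 24 hours
]


@dataclass
class App:
    name: str
    rto_hours: float


def rto_tier(app):
    """Return the first tier whose upper limit exceeds the app's RTO."""
    for label, limit_hours in TIERS:
        if app.rto_hours < limit_hours:
            return label
    return "days"


def group_by_tier(apps):
    """Map each tier label to the names of the applications in it."""
    groups = {}
    for app in apps:
        groups.setdefault(rto_tier(app), []).append(app.name)
    return groups


# Illustrative inventory, not taken from any real environment.
apps = [App("storefront", 0.25), App("hr-portal", 8), App("archive", 72)]
print(group_by_tier(apps))
```

Grouping applications this way makes it easier to match each tier to a DRaaS option with a suitable recovery time and cost.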

2: Determine if the service offers failover automation and orchestration

If data is only copied to the cloud, organizations are left with the task of setting up a full environment, spinning up compute instances, moving data to the right cloud storage service and setting up networking. Many of these tasks are highly manual and require significant time to execute. For applications with an RTO of two days or more, that might not be an issue. However, for revenue-generating applications, that is typically too long.

For more critical applications, organizations should choose cloud-based services that offer DR failover orchestration and automation. Such services deploy a DR environment based on a pre-defined runbook. They spin up the required nodes, power on VMs in the correct sequence according to the right dependencies, run scripts and map IP networks automatically, all with very little human intervention. This ensures that critical applications can be powered up in time and minimizes any business impact of a disaster.
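To make the sequencing idea concrete, here is a minimal sketch of how a power-on order could be derived from VM dependencies using a topological sort from Python's standard library. It is not how any particular DRaaS product implements its runbooks; the `runbook` mapping and the VM names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical runbook: each VM maps to the set of VMs it depends on.
runbook = {
    "database": set(),
    "app-server": {"database"},
    "web-frontend": {"app-server"},
    "monitoring": set(),
}


def power_on_order(dependencies):
    """Return an order in which every VM starts after its dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = power_on_order(runbook)
print(order)
```

Any order the function returns starts the database before the application server and the application server before the web front end; VMs without dependencies between them may appear in any relative order.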

3: Evaluate the length of time needed for VM format conversions

Traditional applications, which are still very dominant in many organizations, are typically deployed as virtual machines (VMs). Different hypervisors have different VM formats, and many public clouds do not have the same VM formats as an organization’s on-premises VMs.

For applications deployed on one hypervisor to run on another, the VM disk format needs to be converted. VM format conversion is typically a long and complicated process, and organizations can spend many months on it. More importantly, applications are not protected against a disaster while the conversion is under way.

4: Check how often you can run non-disruptive DR tests

Data centres are not static – existing applications get updated or replaced, and more applications are added over time.

This results in drift between an organization’s original DR plan and the plan it actually needs to cover its current applications. To prevent this, test your DR plan often, at least once per quarter. Since these tests are not real disasters, they should not affect an organization’s running applications.

A good DRaaS solution should offer extensive non-disruptive testing and provide detailed reports generated by these tests.
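Drift between the running environment and the DR plan can also be spotted between tests with a simple inventory comparison. The sketch below assumes you can export two lists of application names, one from production and one from the DR plan; the function name and sample data are hypothetical.

```python
def plan_drift(running_apps, dr_plan_apps):
    """Compare the production inventory with the applications in the DR plan."""
    running, planned = set(running_apps), set(dr_plan_apps)
    return {
        # Running in production but missing from the DR plan.
        "unprotected": sorted(running - planned),
        # Still in the DR plan but no longer running.
        "stale": sorted(planned - running),
    }


# Illustrative data only.
drift = plan_drift(
    running_apps=["storefront", "hr-portal", "analytics"],
    dr_plan_apps=["storefront", "hr-portal", "legacy-crm"],
)
print(drift)
```

A check like this complements regular DR tests rather than replacing them, since it only compares names and does not prove that recovery works.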

5: Ensure cost efficiency compared to existing DR solutions

DRaaS solutions need storage components in steady state to store the data that needs to be protected. Organizations need a highly efficient storage layer in the cloud to store this data to optimize their costs. To further reduce the costs, you should be able to spin up the infrastructure in the cloud only when needed during a DR testing or failover event.

Consider three questions. Does the DRaaS solution always require bringing back all the data upon failback, or does it allow optimized failback? What is the price of protecting data, and by what metric is it charged? What are the cloud provider’s egress charges for transferring data to and from the cloud infrastructure? These factors will significantly affect DR costs.
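A back-of-the-envelope model can show how these factors interact. The sketch below is a simplification under stated assumptions: storage is billed per TB per month, egress per TB transferred during failback, and compute per hour only during tests or failover. All parameter names and prices are made-up placeholders, not any provider's actual rates.

```python
def annual_dr_cost(protected_tb, storage_price_per_tb_month,
                   failback_tb, egress_price_per_tb,
                   on_demand_hours, compute_price_per_hour):
    """Estimate yearly DR cost from steady-state storage, failback egress
    and on-demand compute used only during tests or failover."""
    storage = protected_tb * storage_price_per_tb_month * 12
    egress = failback_tb * egress_price_per_tb
    compute = on_demand_hours * compute_price_per_hour
    return {
        "storage": storage,
        "egress": egress,
        "compute": compute,
        "total": storage + egress + compute,
    }


# Placeholder figures for illustration only.
full_failback = annual_dr_cost(100, 20, 100, 90, 48, 10)
optimized_failback = annual_dr_cost(100, 20, 10, 90, 48, 10)
print(full_failback["total"], optimized_failback["total"])
```

With these placeholder figures, the only difference between the two scenarios is the amount of data transferred on failback, which is why optimized failback and egress pricing deserve attention when comparing offerings.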

VMware DRaaS solutions on VMware Cloud

VMware Cloud Disaster Recovery offers on-demand disaster recovery, delivered as an easy-to-use SaaS solution with cloud economics. It combines cost-efficient cloud storage with simple SaaS-based management for IT resiliency at scale. You can benefit from consistent VMware operations across production and DR sites and a ‘pay when you need’ failover capacity model for disaster recovery resources.

VMware Cloud Disaster Recovery can protect a very broad set of IT services in a cost-efficient manner, with fast recovery capabilities.

VMware Site Recovery™ for VMware Cloud™ on AWS offers a complete DR service. VMware Site Recovery can protect mission critical IT services that require very low RPO and RTO. With Site Recovery, customers have access to global, reliable infrastructure, with the familiar interface of vSphere and vCenter, and without the need for re-platforming.

Additionally, by leveraging widely proven and tested DR solutions such as VMware Site Recovery Manager™ (SRM), customers can orchestrate and automate failover, failback and IP network remapping, and conduct non-disruptive testing that generates extensive reports.