Rethinking Disaster Recovery: Top Recovery Scenarios for AWS

Click to learn more about author Andrew Langsam.

Regardless of what AWS and other cloud providers offer in their service-level agreements (SLAs), it is ultimately up to the enterprise to protect its own applications and data in the cloud. Still, the number of companies we have seen brought to their knees by disasters of one kind or another is astonishing. Figures such as 60 percent of small companies closing within six months of being hacked, and nearly 30 percent of businesses losing revenue to outages last year, show that many fail to understand that the built-in redundancy of the cloud is not enough to protect and recover their information.

Whether the cause is a natural disaster, an outage, or a malicious attack, having no safety net to recover data in the cloud and keep the business running is unthinkable. Without a disaster recovery plan to fall back on, infrastructure, data, and backups are all susceptible to complete loss.

AWS Top Recovery Scenarios

Despite all the precautions taken, organizations have to be aware of the challenges they will face – specifically in AWS infrastructures – as this is the most popular and largest cloud used by companies worldwide. It is important to plan for both the complications we could potentially foresee and the issues that might be completely out of our hands and will take us by surprise.

Below are the five most common scenarios that organizations need to be aware of, along with how to respond to each in order to mitigate incidents and avoid downtime.

Accidents happen: Human mistakes lead to the accidental deletion of resources or volumes. Protecting data against accidental termination is crucial for business continuity. The first step is to back up data regularly and ensure that all mission-critical data is covered by a reliable backup process. AWS users should also enable termination protection, which helps prevent mistakes such as an administrator terminating an EC2 instance and, with it, any attached Amazon EBS volumes that are set to be deleted on termination.
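As a minimal sketch of the termination protection step, the function below turns the setting on for a list of instances. It assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`); the function name and the instance IDs are hypothetical and chosen only for illustration.

```python
def enable_termination_protection(ec2, instance_ids):
    """Enable termination protection where it is off; return the IDs changed.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    changed = []
    for instance_id in instance_ids:
        attr = ec2.describe_instance_attribute(
            InstanceId=instance_id, Attribute="disableApiTermination"
        )
        if attr["DisableApiTermination"]["Value"]:
            continue  # already protected, nothing to do
        ec2.modify_instance_attribute(
            InstanceId=instance_id, DisableApiTermination={"Value": True}
        )
        changed.append(instance_id)
    return changed
```

Checking the current value first keeps the call safe to re-run, so it could be scheduled as a periodic guard rather than a one-off fix.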

Thwarting thieves: Enterprises continue to overlook network security in AWS, and by extension they leave their environments open to malicious attacks. To stay ahead of these, customers must have strict policies in place: no ports should be open other than those the applications need; networks should be segmented and protected by services like AWS WAF; and an additional application-level control should be maintained on the instances themselves. Furthermore, having a secondary account ready to take over allows IT to quickly redeploy infrastructure using CloudFormation templates and recover data from backups, putting you back in business with minimum downtime.
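The open-ports policy can be checked mechanically. The sketch below scans security groups, in the shape returned under the `SecurityGroups` key of boto3's `describe_security_groups`, and reports inbound rules that are open to the whole internet on ports outside an allow-list. The function name and the default allow-list of port 443 are assumptions for illustration, not a recommendation for any particular workload.

```python
def find_exposed_rules(security_groups, allowed_ports=(443,)):
    """Return (group_id, port_range) pairs for world-open inbound rules.

    A rule is reported when it allows 0.0.0.0/0 or ::/0 and its port range
    is not entirely inside `allowed_ports` (a hypothetical allow-list).
    """
    allowed = set(allowed_ports)
    findings = []
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            open_v4 = any(r.get("CidrIp") == "0.0.0.0/0"
                          for r in perm.get("IpRanges", []))
            open_v6 = any(r.get("CidrIpv6") == "::/0"
                          for r in perm.get("Ipv6Ranges", []))
            if not (open_v4 or open_v6):
                continue
            low, high = perm.get("FromPort"), perm.get("ToPort")
            if low is None or high is None:
                # Rules for all protocols carry no port range at all.
                findings.append((group["GroupId"], "all ports"))
            elif not all(p in allowed for p in range(low, high + 1)):
                findings.append((group["GroupId"], f"{low}-{high}"))
    return findings
```

The check is deliberately conservative: any world-open rule that is not fully covered by the allow-list is reported, which includes ICMP rules, so the output is a review list rather than a verdict.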

Unlawful access: A compromised account is one of the worst things that can happen to a business running on AWS. While Amazon support is resolving the compromise, users can rely on a secondary account to take over. In preparation, backups should be stored safely away from the primary AWS account. Alternatively, customers can architect their cloud infrastructure to span multiple AWS accounts, each one dedicated to a specific environment or team, which reduces the attack surface.
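One way to get backups out of the primary account, sketched here under stated assumptions, is to share EBS snapshots with a separate backup account. `ec2` is assumed to be a boto3 EC2 client for the primary account; the function name is hypothetical, and the account ID passed in would be your own secondary account.

```python
def share_snapshots_with_account(ec2, snapshot_ids, backup_account_id):
    """Grant a second AWS account permission to use the given EBS snapshots.

    `ec2` is assumed to be a boto3 EC2 client in the primary account.
    Returns the snapshot IDs that were shared.
    """
    shared = []
    for snapshot_id in snapshot_ids:
        ec2.modify_snapshot_attribute(
            SnapshotId=snapshot_id,
            Attribute="createVolumePermission",
            OperationType="add",
            UserIds=[backup_account_id],
        )
        shared.append(snapshot_id)
    return shared
```

Sharing alone leaves the snapshot owned by the primary account, so an attacker there could still delete it. The backup account would need to copy each shared snapshot into its own ownership for the backup to survive a compromise.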

Failover preparation: We have to be prepared for all scenarios, including those out of our hands, such as service disruptions at a cloud provider. Although they do not happen often in AWS, disruptions such as the DynamoDB outage of 2015 shed light on single points of failure and on the need to have data backed up and ready to be served from other regions, not only from different availability zones.

As an additional layer of protection, organizations can use DynamoDB global tables, which automatically replicate their tables across multiple regions and fully support multi-master writes.
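A rough sketch of adding a replica region to an existing table follows. It assumes `dynamodb` is a boto3 DynamoDB client, that the table uses the current global tables version, and that the table already meets the service's prerequisites for replication; the function name is hypothetical.

```python
def add_replica_if_missing(dynamodb, table_name, region):
    """Request a replica of `table_name` in `region` unless one is listed.

    `dynamodb` is assumed to be a boto3 DynamoDB client.
    Returns True if a replica was requested, False if it already existed.
    """
    table = dynamodb.describe_table(TableName=table_name)["Table"]
    existing = {replica["RegionName"] for replica in table.get("Replicas", [])}
    if region in existing:
        return False
    dynamodb.update_table(
        TableName=table_name,
        ReplicaUpdates=[{"Create": {"RegionName": region}}],
    )
    return True
```

Replica creation is asynchronous, so a real rollout would wait for the table to return to an active state before requesting another region.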

Facing natural disasters: Although there are new methods for predicting when a disaster will hit, we cannot know in advance how it will affect the business. In the face of hurricanes, storms, and other natural disasters, enterprises have to be prepared for the worst. Identifying and prioritizing the mission-critical applications and data is the first step. AWS customers should replicate those critical components across multiple availability zones, where common points of failure such as generators, UPS units, and air conditioning are not shared. This will ensure uptime during a disaster and the continuity of the business.
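To see which critical components are not yet spread across availability zones, a check along these lines may help. It takes the `Reservations` list returned by boto3's `describe_instances` and groups instances by a tag; the function name and the `app` tag key are assumptions made for this example.

```python
from collections import defaultdict


def single_az_applications(reservations, tag_key="app"):
    """Return the sorted names of applications running in only one zone.

    `reservations` is the "Reservations" list from describe_instances.
    Instances without the (hypothetical) `tag_key` tag are ignored.
    """
    zones_by_app = defaultdict(set)
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            app = tags.get(tag_key)
            if app is None:
                continue
            zones_by_app[app].add(instance["Placement"]["AvailabilityZone"])
    return sorted(app for app, zones in zones_by_app.items() if len(zones) < 2)
```

Any application this returns has all of its instances in a single availability zone and is a candidate for replication.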

Completely avoiding disasters in today’s IT world is almost impossible: No matter how prepared a company is, eventually something will go wrong. And when it does, the effectiveness of your disaster recovery plan will be put to the test. Whether you choose to create a secondary account or maintain an environment in a separate region, you should always back up data for everything that is mission-critical to minimize downtime and keep your business up and running.
