This post is a continuation of the series on Solid State Drives (SSDs) and their role in enterprise storage. In the first post, I discussed the differences between traditional HDDs and SSDs. In the second post, I looked at the challenges associated with data destruction and asset disposal. In this post, I will discuss how to minimize the risks associated with data loss and data destruction.
There are a few areas that can make or break an organization in the event of a data disaster, for both HDDs and SSDs. Studies have suggested that data loss costs companies more than $18 billion per year and that 50 percent of companies that have an outage lasting 10 days or more will go out of business within 5 years, with 70 percent of them closing within the first 12 months. Understanding your storage and taking steps to minimize the impact of a data disaster can greatly improve your organization’s chances of surviving.
I have broken the basics into five categories:
- Disaster recovery plan documentation
  - Map storage, servers, networking, physical layout
  - Electronic vs. physical
  - Denoting a Data Recovery provider
- Understanding how the storage works and how the data is laid out
  - Dynamic/pool allocation vs. static allocation
  - RAID configurations
  - Minimizing writes to SSD
- Operating system and file system choices
  - File system choices have consequences
  - Disk utilities and their impact
- Choosing a backup vendor
  - Restoring data to a different location in a crisis
  - Testing backups
- Data center organization and design
A good disaster recovery plan is essential in an emergency. A good plan documents all of the assets in the data centers, how the systems are configured (including storage, servers and networking) and the physical layout of the data center. It also lays out the plan for recovery, including contact trees, equipment lists, vendor contacts and priority lists. A natural disaster may make it impossible to access a copy of the plan on the server, so avoid keeping only a single copy, especially if it is electronic. Multiple copies, both electronic and physical, are recommended.
Free templates for creating your own plan are available on the web, and there are some great companies that will do an onsite assessment, map all of your resources and help you create a disaster recovery plan tailored to your specific needs. Every good disaster recovery plan names a data recovery company that can assist as needed. It is a good idea to work with the recovery company ahead of time and execute all necessary agreements before a disaster occurs, so time isn’t wasted when data needs to be recovered.
It is important to understand your storage and how the data is laid out before a disaster hits. Find a storage vendor that will sit down and walk through how their storage works, including what type of RAID and file system is being used. As an example, RAID 0 will be faster than RAID 5, but does not include any redundancy, whereas RAID 6 can survive a two-drive failure, but write performance may suffer. It is important to get the right blend of performance and protection in the storage you select. One of the current trends in storage is the dynamic allocation of volumes, where disks are put together in a common pool and then carved into volumes as needed. While this makes the allocation of storage convenient for the storage administrators, it can make data recovery very complicated, and in some cases impossible, if the maps that allocate the volumes are damaged or overwritten. Another area to consider is whether your storage uses traditional HDDs, SSDs or some hybrid combination of the two. Storage vendors should treat SSDs and HDDs differently to maximize the longevity of the storage, minimize the impact of failures (NAND flash fails more frequently) and maximize performance. In short, do your homework before you choose a storage vendor!
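To make the RAID trade-off above concrete, here is a minimal Python sketch that compares usable capacity and the number of drive failures each level can survive. It assumes the textbook definitions of RAID 0, 5 and 6 with equal-sized drives; the names `RaidProfile`, `RAID_LEVELS` and `summarize` are made up for illustration, and real arrays (hot spares, vendor-specific layouts, pooled allocation) will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaidProfile:
    """Textbook properties of a RAID level (illustrative, not vendor-specific)."""
    min_drives: int      # fewest drives the level is normally built from
    parity_drives: int   # drives' worth of capacity spent on parity


# Assumption: standard striped/parity levels only, equal-sized drives.
RAID_LEVELS = {
    "RAID 0": RaidProfile(min_drives=2, parity_drives=0),
    "RAID 5": RaidProfile(min_drives=3, parity_drives=1),
    "RAID 6": RaidProfile(min_drives=4, parity_drives=2),
}


def summarize(level: str, drives: int, drive_tb: float) -> dict:
    """Return usable capacity and tolerated drive failures for one array."""
    profile = RAID_LEVELS[level]
    if drives < profile.min_drives:
        raise ValueError(f"{level} needs at least {profile.min_drives} drives")
    return {
        "level": level,
        "usable_tb": (drives - profile.parity_drives) * drive_tb,
        # For these levels, the array survives as many failures as it has parity drives.
        "tolerated_failures": profile.parity_drives,
    }


if __name__ == "__main__":
    for level in RAID_LEVELS:
        print(summarize(level, drives=8, drive_tb=4.0))
```

The sketch deliberately leaves out performance, which depends heavily on the controller and workload; it only shows what each level gives up in capacity in exchange for protection.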
Operating/File System Choices
Carefully choose the operating system and the file system that are right for both your storage and your data. I often see corrupted volumes that trace back to a poor match between the file system selected and the data that is written to the disk. Make sure to ask the storage vendor which operating systems have been tested and which are preferred on a particular storage platform. Be wary of proprietary solutions that do not have documentation or real-world testing. Many vendors allow you to “test” a system before purchase, so dedicate time and resources to this “test” phase to ensure you select the proper storage for your environment.
Testing backups is often one of the most overlooked areas in a disaster recovery plan. I often hear from potential data recovery customers, “We back up our data.” When I ask whether they have tested restoring from their backups, the answer is often “no” or a look of confusion. One of the best programs I have seen for testing backups was at a company that included restores as part of the compensation plan for its employees. The system admins had to do a bare metal restore of all of the critical systems in a lab environment on a quarterly basis. Failure to fully restore a system was cause for a verbal warning, a second failure resulted in a written warning, and a third failure in a calendar year was grounds for termination. Successful restores were rewarded with a quarterly bonus, and if all four quarters in the year were successful, an additional annual bonus was given. A good way to avoid a complete data disaster is first to make sure the backup solution you purchase is actually backing up the data you need to protect and second to make sure that the data that was backed up can be restored. Finally, backups should be stored offsite so that a natural disaster, fire or flood at the primary site does not destroy them too. Different backup solution vendors approach the backup and restore tasks differently, so try out a backup vendor before you buy to make sure that the solution offered meets your individual needs.
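A restore test only counts if someone checks that what came back matches what went in. As a rough sketch of that check, the Python below hashes every file in a source directory and in a restored copy, then reports what is missing, changed or unexpected. The function names (`file_digest`, `tree_digests`, `compare_restore`) are hypothetical, and the approach assumes the source data has not changed since the backup was taken; it is not a substitute for whatever verification your backup product provides.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of one file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_restore(source: Path, restored: Path) -> dict[str, list[str]]:
    """Report files that are missing, changed or unexpected in the restore."""
    src = tree_digests(source)
    dst = tree_digests(restored)
    return {
        "missing": sorted(src.keys() - dst.keys()),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
        "unexpected": sorted(dst.keys() - src.keys()),
    }
```

A restore would pass this check only when all three lists come back empty, for example `compare_restore(Path("/data/critical"), Path("/lab/restore/critical"))`, where both paths are placeholders for your own environment.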
Data Center Organization and Design
This is such a huge area (power, cooling, physical layout, density, security, etc.), and so much has been written about it by specialists in the area that it is not worth a deep dive. However, it is worth noting that poor design and documentation can lead to much larger issues in the event of a disaster. I strongly recommend a data center audit to help identify strengths and weaknesses in your current setup. Catch the little problems before they become much bigger challenges.
So much of preventing and responding to a data disaster is about planning. A good plan can eliminate many of the challenges that individuals and companies face when confronted with data loss. Take the time to prepare for the worst when times are good, and you will greatly reduce the chances of losing data for good.