Data Insecurity: RAID

In our series on protecting data against physical loss, we've so far discussed choosing the proper drives for a particular scenario. However, no matter the quality of the drive, we still have to address the looming possibility of a drive failure.

A popular way to protect data against a single hard drive failure is to spread the risk across logical groups of drives in a RAID set, or Redundant Array of Inexpensive (or Independent) Disks. RAID sets configure drives in certain arrangements to maximize data integrity and performance, all while appearing as a single volume at the software level.

RAID is normally configured at the hardware level, such as by a drive controller in a server. Many desktop motherboards have RAID options available, and some operating systems offer software-defined RAID. These are usually seen as inferior to a dedicated hardware RAID controller, though some OSes are designed specifically for managing software-defined storage, such as FreeNAS and Unraid.

The most common “levels” of standard RAID are 0, 1, 5, 6, 10. There are other iterations and proprietary variations, but most are based on these levels.

RAID 0

In a RAID 0 set, known as striping, two or more drives are grouped together into a single logical volume, increasing read/write speed and using all of the available drive space. Two 1TB drives would be recognized as a single 2TB volume.
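
To make the idea concrete, here is a minimal Python sketch — a toy model, not how a real controller works — that deals fixed-size chunks of data out round-robin across two simulated drives; the stripe size and drive names are invented for the example.

    # Toy illustration of RAID 0 striping: data is dealt out in fixed-size
    # chunks across every drive, so capacity and speed add up, but losing
    # any one drive destroys the whole volume.

    STRIPE_SIZE = 4  # bytes per stripe; real controllers use far larger stripes

    def stripe_write(data: bytes, num_drives: int) -> list[list[bytes]]:
        """Deal fixed-size chunks of data out round-robin across the drives."""
        drives = [[] for _ in range(num_drives)]
        chunks = [data[i:i + STRIPE_SIZE] for i in range(0, len(data), STRIPE_SIZE)]
        for i, chunk in enumerate(chunks):
            drives[i % num_drives].append(chunk)
        return drives

    drives = stripe_write(b"ABCDEFGHIJKLMNOP", num_drives=2)
    print(drives[0])  # [b'ABCD', b'IJKL']
    print(drives[1])  # [b'EFGH', b'MNOP']
    # Capacity adds up: two 1TB drives striped together appear as one 2TB volume.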

Unfortunately, RAID 0 is actually the opposite of data protection. Because the volume is spanned across more than one drive, the entire logical volume is destroyed if a single drive in the array fails. The likelihood of data loss is therefore far higher than with a single drive.

With the advent of high-speed SSD/NVMe drives, there is limited benefit to using RAID 0; however, it can still be useful for non-critical data that requires faster access. Flash-based storage is still relatively expensive, so in some cases RAID 0 remains a viable option.

RAID 1

RAID 1 sets, known as mirrored sets, are exactly what they sound like. The data is duplicated across two or more drives of the same size, which still read as a single volume. If and when a drive fails, all the information is still available on the remaining drives. The failed drive can be replaced and the array rebuilt with no data loss.

RAID 1 is an exceptionally robust way to store active data: the set remains intact as long as a single drive survives. This level of security comes at a cost: first, write speeds will be no faster than the slowest drive in the set; second, it is a very inefficient way to store data.

For example, if two 1TB drives are set up in a RAID 1 set, the drives are recognized as a single 1TB volume. It doesn't simply halve the storage, though: three 1TB drives in RAID 1 would still be seen as a 1TB volume, but there would effectively be three copies of the data.
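
A minimal sketch of the same idea, with drives of identical size represented as plain Python lists: every write lands on every member, so adding drives adds copies, not capacity.

    # Toy illustration of RAID 1 mirroring: every block is written to every
    # member drive, so the set survives as long as one drive remains, but
    # usable capacity never exceeds the size of a single drive.

    def mirror_write(block: bytes, drives: list[list[bytes]]) -> None:
        """Duplicate the block onto every member of the mirror set."""
        for drive in drives:
            drive.append(block)

    drives = [[], [], []]                      # three mirrored 1TB drives
    mirror_write(b"important-record", drives)
    print(all(d == [b"important-record"] for d in drives))  # True: three identical copies

    # Usable capacity is one drive's worth, no matter how many members:
    # two 1TB drives   -> 1TB volume (two copies of the data)
    # three 1TB drives -> still a 1TB volume (three copies of the data)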

RAID 5

RAID 5 was developed to strike a balance between the speed and efficiency of RAID 0 and the fault tolerance of RAID 1. Known as striping with distributed parity, it splits the data across sets of 3 or more drives (striping) but reserves the equivalent of a single disk in the array for protection. Effectively, you lose the capacity of only one drive in the set.

In the event of a drive failure in a RAID 5 set, the parity data spread across the remaining drives can be used to rebuild the contents of the lost drive. RAID rebuild algorithms are outside the scope of this article, but as drives become larger, rebuild times can become significant when a RAID 5 set loses a drive.

A minimum of 3 drives is required for a RAID 5 set. In a standard 3-drive array, ⅓ of each drive would be reserved for parity. So a set of three 1TB drives would be seen as a 2TB volume and a set of four 1TB drives would create a 3TB volume.
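
Single parity of this kind is based on XOR. The toy sketch below, which ignores the real on-disk layout and parity rotation, computes a parity block for one stripe and then rebuilds a lost data block from the surviving pieces.

    # Toy single-parity (RAID 5 style) stripe: the parity block is the XOR
    # of the data blocks, so any one missing block can be rebuilt by
    # XOR-ing everything that survived.

    from functools import reduce

    def xor_blocks(blocks: list[bytes]) -> bytes:
        """XOR equal-length blocks together, byte by byte."""
        return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

    data = [b"AAAA", b"BBBB", b"CCCC"]   # one stripe's worth of data on three drives
    parity = xor_blocks(data)            # stored on the fourth drive's slice of the stripe

    # Simulate losing the second drive, then rebuild its block from what survived:
    surviving = [data[0], data[2], parity]
    rebuilt = xor_blocks(surviving)
    print(rebuilt == data[1])            # True: the lost block is recovered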

RAID 5 is vastly more efficient than RAID 1 for data storage and is generally faster; however, it is much slower to recover from a failed drive. Rebuild times are increasingly important as common drives begin exceeding 10TB. If a second drive fails during a RAID 5 rebuild, it is almost guaranteed that all data in the array will be lost. RAID 5 is used less commonly these days because of how vulnerable the data is during a rebuild.

RAID 6

RAID 6 is known as striping with double distributed parity. It's essentially RAID 5 with the equivalent of a second parity drive. Because parity is doubled, RAID 6 sets can withstand two concurrent drive failures. For larger drive sets, RAID 6 is probably the most well-balanced of all the RAID levels.

There is a fairly high cost for a RAID 6 set: it requires a minimum of four drives, with the equivalent of two dedicated to parity. In arrays with a small number of drives it can be just as inefficient as RAID 1; however, RAID 6 sets can scale to dozens of drives while still giving up only two drives' worth of capacity to parity.
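
The efficiency trade-off is easiest to see from the capacity arithmetic alone. The sketch below is a rough illustration with made-up drive counts; it deliberately skips the double-parity math itself, which uses more involved coding than simple XOR.

    # Usable capacity for N identical drives of size_tb each, showing how
    # the RAID 6 parity overhead shrinks as the set grows.

    def usable_tb(level: str, num_drives: int, size_tb: float) -> float:
        if level == "RAID 1":
            return size_tb                      # one drive's worth, however many copies
        if level == "RAID 5":
            return (num_drives - 1) * size_tb   # one drive's worth of parity
        if level == "RAID 6":
            return (num_drives - 2) * size_tb   # two drives' worth of parity
        raise ValueError(f"unsupported level: {level}")

    for n in (4, 8, 12):
        print(f"{n} x 1TB in RAID 6 -> {usable_tb('RAID 6', n, 1.0)}TB usable")
    # 4 x 1TB in RAID 6  -> 2.0TB usable  (50% overhead, comparable to a two-drive mirror)
    # 8 x 1TB in RAID 6  -> 6.0TB usable  (25% overhead)
    # 12 x 1TB in RAID 6 -> 10.0TB usable (~17% overhead)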

Nested RAID 10

RAID 10, or 1+0, arrays are a combination of RAID 1 and RAID 0. RAID 10 nests two or more RAID 1 sets inside a single RAID 0 set, coupling the reliability of mirrored arrays with the speed of striped arrays.

RAID 10 has a high cost of entry, requiring a minimum of four drives. For example, four 1TB drives in a RAID 10 set form a stripe of two 1TB mirrored arrays, so those four 1TB drives yield only 2TB of usable space. However, RAID 10 can be extremely fast for both drive access and array rebuilds.

Depending on the configuration and which drives fail, a RAID 10 set may also be able to withstand multiple drive failures, as long as no mirrored pair loses both of its members.
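
As a rough model (the drive names below are invented for the example), a RAID 10 set can be thought of as mirrored pairs striped together: usable capacity is half the raw total, and the set survives any combination of failures that leaves one working drive in every pair.

    # Toy RAID 10 model: drives are grouped into mirrored pairs, and the
    # pairs are striped together.

    def usable_capacity_tb(num_drives: int, size_tb: float) -> float:
        """Half the raw capacity: every drive has a mirror partner."""
        return (num_drives // 2) * size_tb

    def survives(pairs: list[tuple[str, str]], failed: set[str]) -> bool:
        """The set survives as long as no mirrored pair has lost both members."""
        return all(not (a in failed and b in failed) for a, b in pairs)

    pairs = [("disk0", "disk1"), ("disk2", "disk3")]   # four 1TB drives, two mirrors
    print(usable_capacity_tb(4, 1.0))                  # 2.0 TB usable
    print(survives(pairs, {"disk0", "disk2"}))         # True: one failure in each mirror
    print(survives(pairs, {"disk0", "disk1"}))         # False: an entire mirror is gone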

The number of nested RAID combinations is virtually unlimited, and many higher-end cards can even span an array across multiple controllers.

More?

In addition, many vendors offer their own proprietary RAID solutions, most of which are based on RAID 6. Some systems use distributed storage networks that can span hundreds or thousands of sites, but that is well beyond the scope of this introductory-level article.

Bonus: Hot Spares

Hot spares are available at most RAID levels (not 0), depending on the controller and configuration. A hot spare is a drive assigned to an array that sits idle until needed, ready to automatically join the set and take the place of a drive that has failed.

For example, if a drive fails in a RAID 6 set that has a hot spare available, the rebuild process can begin as soon as the controller detects the fault, with no human intervention required.
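
The behaviour is simple enough to caricature in a few lines of Python; the drive names and reporting below are purely illustrative, since the real work happens inside the controller.

    # Toy hot-spare promotion: when the controller notices a failed member,
    # it swaps in an available spare and starts the rebuild immediately.

    def on_drive_failure(array: list[str], failed: str, spares: list[str]) -> str:
        """Swap a hot spare in for the failed member and report what happened."""
        if not spares:
            return f"{failed} failed; no spare available, array is running degraded"
        spare = spares.pop(0)
        array[array.index(failed)] = spare
        return f"{failed} failed; rebuild started automatically onto {spare}"

    members = ["disk0", "disk1", "disk2", "disk3"]     # e.g. a small RAID 6 set
    print(on_drive_failure(members, "disk2", spares=["spare0"]))
    # disk2 failed; rebuild started automatically onto spare0
    print(members)                                     # ['disk0', 'disk1', 'spare0', 'disk3']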

Many controllers support multiple hot spares, providing an even greater degree of data integrity.