RAID5 implementation aspects

ZAR has been discontinued

After about twenty years, I felt ZAR can no longer be updated to match the modern requirements, and I decided to retire it.

ZAR is replaced by Klennet Recovery, my new general-purpose DIY data recovery software.

If you are looking specifically for recovery of image files (like JPEG, CR2, and NEF), take a look at Klennet Carver, a separate video and photo recovery software.

The theory behind RAID level 5 is simple, but implementations are not. More parameters control the exact layout than in RAID 0, and as a result more variations are possible. Here is a brief discussion of these additional RAID5 parameters (assuming that the stripe size and the order of member disks in the array are already known).

Parity placement

Let's first consider parity location. Parity should be distributed evenly across member disks, which leaves us with two parameters:

Starting disk number - the number of the disk containing parity in row 0 (at the very start of the array). The most typical implementation puts the first parity block at the end of the row (last column). This is shown in the example below (parity starts at Disk 2 in both examples).
Rotation - the increment applied to the number of the disk containing parity when moving to the next row. Typical values are either +1 (to the right, forward layout) or -1 (to the left, inverted layout).
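
The two parameters above can be combined into a single formula. The following is a minimal sketch, assuming disks are numbered from 0 (so that Disk 2 is the last column of a 3-disk array); the function name parity_disk is made up for illustration.

```python
def parity_disk(row, num_disks, start_disk, rotation):
    """Return the 0-based number of the disk holding parity in the given row."""
    # Python's % always yields a non-negative result for a positive modulus,
    # so a rotation of -1 wraps around correctly.
    return (start_disk + rotation * row) % num_disks


# 3-disk array, parity starts on Disk 2 (the last column).
forward = [parity_disk(r, 3, 2, +1) for r in range(6)]
inverted = [parity_disk(r, 3, 2, -1) for r in range(6)]
print(forward)   # [2, 0, 1, 2, 0, 1]
print(inverted)  # [2, 1, 0, 2, 1, 0]
```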


Parity placement examples: forward and inverted layouts, parity starts at Disk 2 in both cases.

Once the parity has been placed, we then need to define how the data is distributed (data interleaving rule). Two widespread approaches are discussed below.

Placing data - checkerboard layout

In a "checkerboard" layout, data is placed into the array blocks (stripes) left-to-right, skipping parity. Two examples of the resulting placement are provided below:

The placement of adjacent blocks is worth some discussion. In the 3-disk forward checkerboard layout, adjacent data blocks can end up on the same physical disk, effectively doubling the stripe size - see blocks 8/9 and 14/15 on the upper left chart. This may cause an undesirable performance hit.
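
The checkerboard rule and the same-disk effect can be reproduced with a short sketch. It assumes 0-based disk numbers and 1-based data block numbers, which may differ from the numbering used in the charts; the function names are hypothetical.

```python
def checkerboard_layout(num_disks, rows, start_disk, rotation):
    """Build a checkerboard RAID5 layout.

    Each row is a list with one cell per disk: "P" for parity,
    or a 1-based data block number.
    """
    layout = []
    block = 1
    for row in range(rows):
        parity = (start_disk + rotation * row) % num_disks
        cells = []
        # Fill the row left to right, skipping the parity cell.
        for disk in range(num_disks):
            if disk == parity:
                cells.append("P")
            else:
                cells.append(block)
                block += 1
        layout.append(cells)
    return layout


def same_disk_neighbours(layout):
    """Return pairs (n, n + 1) of consecutive data blocks stored on one disk."""
    disk_of = {}
    for cells in layout:
        for disk, cell in enumerate(cells):
            if cell != "P":
                disk_of[cell] = disk
    return [(b, b + 1) for b in sorted(disk_of)
            if disk_of.get(b + 1) == disk_of[b]]


forward = checkerboard_layout(num_disks=3, rows=8, start_disk=2, rotation=+1)
inverted = checkerboard_layout(num_disks=3, rows=8, start_disk=2, rotation=-1)
for cells in forward:
    print(cells)
print(same_disk_neighbours(forward))   # includes (8, 9) and (14, 15)
print(same_disk_neighbours(inverted))  # []
```

With this numbering, the 3-disk inverted checkerboard layout does not show the effect at all, while the forward one repeats it every three rows.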

Placing data - Microsoft (LDM) layout

Windows 2000 and later (or, more precisely, the Logical Disk Manager - LDM - component) uses a more sophisticated algorithm to place data, which achieves better block spacing. The process of selecting what goes where is best described in two steps:

Write parity into the first column of the array, followed by the data blocks in ascending order.
Rotate the rows until parity "slips" into its place.
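
The two steps above translate almost literally into code. This is a sketch only, not the actual LDM code: it assumes 0-based disk numbers, 1-based data block numbers, and a hypothetical function name ldm_layout.

```python
def ldm_layout(num_disks, rows, start_disk, rotation):
    """Build an LDM-style RAID5 layout, one list of cells per row.

    "P" marks parity; integers are 1-based data block numbers.
    """
    layout = []
    block = 1
    for row in range(rows):
        parity = (start_disk + rotation * row) % num_disks
        # Step 1: parity in the first column, then data in ascending order.
        cells = ["P"] + list(range(block, block + num_disks - 1))
        block += num_disks - 1
        # Step 2: rotate the row to the right until parity reaches its disk.
        if parity:
            cells = cells[-parity:] + cells[:-parity]
        layout.append(cells)
    return layout


# 3-disk inverted LDM layout, parity starts on Disk 2.
for cells in ldm_layout(num_disks=3, rows=4, start_disk=2, rotation=-1):
    print(cells)
# [1, 2, 'P']
# [4, 'P', 3]
# ['P', 5, 6]
# [7, 8, 'P']
```

Note that, unlike in the checkerboard layout, the data blocks within a row are not necessarily in ascending order from left to right (see the second row).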

This results in an ordering different from the "checkerboard" layout, as illustrated below:

LDM RAID5 parity and data placement

Please note that the "forward LDM" layout results in the minimum distance between blocks in the same column, which is inefficient performance-wise, so it is not used. By contrast, the "inverted LDM" layout gives the largest possible difference between two blocks in the same column (equal to the number of member disks in the array). This is the layout Windows uses to achieve maximum performance of the RAID5 array. The net result is as follows:
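
The spacing of blocks within a column can be measured directly. The sketch below is an illustration under the same assumptions as before (0-based disks, 1-based data blocks, parity starting on the last disk); ldm_layout and column_gaps are hypothetical names, and the layout is built from the equivalent rule that data block k of a row lands k + 1 disks after parity.

```python
def ldm_layout(num_disks, rows, start_disk, rotation):
    """LDM-style layout: "P" for parity, integers for 1-based data blocks."""
    layout = []
    for row in range(rows):
        parity = (start_disk + rotation * row) % num_disks
        cells = ["P"] * num_disks
        for k in range(num_disks - 1):
            block = row * (num_disks - 1) + k + 1
            cells[(parity + 1 + k) % num_disks] = block
        layout.append(cells)
    return layout


def column_gaps(layout):
    """Return the set of differences between successive data blocks per disk."""
    gaps = set()
    for disk in range(len(layout[0])):
        column = [row[disk] for row in layout if row[disk] != "P"]
        gaps.update(b - a for a, b in zip(column, column[1:]))
    return gaps


# The 3-disk inverted LDM layout, row by row.
for cells in ldm_layout(3, 4, 2, -1):
    print(cells)

# Block spacing within a column for several array sizes.
for n in (3, 4, 5):
    inverted = column_gaps(ldm_layout(n, 4 * n, n - 1, -1))
    forward = column_gaps(ldm_layout(n, 4 * n, n - 1, +1))
    print(n, "inverted:", sorted(inverted), "forward:", sorted(forward))
```

In this model every gap in the inverted layout equals the number of disks, while the smallest gap in the forward layout is the number of disks minus two, which for a 3-disk array means consecutive blocks on the same disk.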

Detecting an array layout automatically

A human, especially an experienced one, can solve the problem easily by looking at the overall picture of the broken array. Automated software lacks human perception and has to use different approaches, which mostly rely on statistical properties of the array. This imposes some requirements on the broken array.

Arrays containing large amounts of data are easier to reconstruct because they provide more samples for the statistics.
Arrays with more drives require more data (for more samples).
RAID5 is more complicated because it has more parameters to determine (the layout type and the parity parameters).
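
As an illustration of what a statistical property can look like, the sketch below uses byte entropy to guess which disk holds parity in each row. This is a commonly described heuristic and not necessarily what ZAR does: it assumes that the XOR of several data blocks looks more random than any single data block, which holds for the synthetic low-entropy data generated here but fails on compressed or encrypted data. All names and the test data are made up for the example.

```python
import math
import random
from collections import Counter


def entropy(block):
    """Shannon entropy of a bytes object, in bits per byte."""
    total = len(block)
    counts = Counter(block)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (the RAID5 parity block)."""
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)


def guess_parity_disks(rows):
    """For each row (a list of per-disk blocks), pick the highest-entropy disk."""
    return [max(range(len(row)), key=lambda d: entropy(row[d])) for row in rows]


# Synthetic 4-disk array with inverted parity rotation (assumed test data).
random.seed(0)
ALPHABET = b" Aet"          # low-entropy "text-like" data bytes
NUM_DISKS, BLOCK_SIZE = 4, 4096
rows, true_parity = [], []
for r in range(8):
    p = (NUM_DISKS - 1 - r) % NUM_DISKS
    data = [bytes(random.choice(ALPHABET) for _ in range(BLOCK_SIZE))
            for _ in range(NUM_DISKS - 1)]
    rows.append(data[:p] + [xor_blocks(data)] + data[p:])
    true_parity.append(p)

guessed = guess_parity_disks(rows)
print(guessed)
print(guessed == true_parity)
```

A single row proves little on real data, which is why more rows (more samples) make the guess more reliable.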

ZAR RAID Recovery Software can detect RAID5 configuration automatically. For more information about RAID5 recovery please refer to this page.