RAID recovery performance
As storage systems grew, we saw an influx of complaints that ZAR
performance is inadequate for the tasks it has to deal with. The most
recent example is a report of ZAR taking impractically long, and finally
running out of memory, while attempting to recover a 1TB RAID5 array
(set up exactly following our RAID5 recovery tutorial) holding about three million files.
There are three major aspects to this:
- Scanning 1TB of disk space takes quite a long time.
- Processing 3,000,000 files requires a massive computational effort.
- Memory requirements also become a concern with 3,000,000 files.
So we started working on it and achieved significant progress as of
version 8.3 build 7. The measurements were made mostly out of curiosity;
still, they give you an idea of what the speed is like.
Test system setup
The machine is a 3.06GHz hyperthreaded Pentium 4 with 2GB of physical
memory. The array is created on two 40GB Seagate U6 drives in a RAID 0
(stripe set). These 5400RPM drives (made circa 2002) are capable of
delivering about 40MB/sec combined throughput in linear reads. Tests
were performed with ZAR version 8.3 build 7.
Additional tweaks were:
- Cache sizes in ZAR were set to the minimum allowed values.
- The /3GB switch was added to the boot.ini entry to allow ZAR
to use 3GB of virtual address space.
[Table: Number of files | Number of directories | Peak memory usage, MB]
(1) In the best case, disk scan time for a RAID array is
proportional to the member disk size. In the worst case, disk scan
time is proportional to the array size.
(2) This does not include the time required to actually copy the files.
(3) Starting at nine million files, we ran out of available
physical memory and massive swapping started. It was impractical to
continue testing under these conditions, and just one more run was
performed to confirm that we can actually handle 11M files.
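Note (1) can be turned into a rough back-of-envelope estimate. The sketch below is illustrative only (the helper name is ours, not part of ZAR); it assumes the scan is limited by sequential read throughput, using the 40MB/sec figure from the test setup above.

```python
# Rough scan-time estimate, assuming sequential read speed dominates.
# 40 MB/s is the combined throughput of the test drives described above.

def scan_time_hours(size_gb, throughput_mb_s=40):
    """Time to read `size_gb` gigabytes once at the given throughput."""
    seconds = size_gb * 1024 / throughput_mb_s
    return seconds / 3600

# Best case: one 40GB member disk -> about 17 minutes.
# Worst case: the full 1TB array -> about 7.3 hours.
print(scan_time_hours(40))    # best case, hours
print(scan_time_hours(1024))  # worst case, hours
```

This matches the observation that scan time on a large array is dominated by raw disk throughput rather than CPU speed.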
Performance and hardware guidelines
Recovering a huge RAID volume requires some prior planning and setup.
You do not want the system to run out of memory and start swapping,
as the recovery will grind to a halt. So to recover a really big
array, you need to:
- add the /3GB switch to your boot.ini file,
- have at least 3GB (preferably 4GB) of physical memory installed,
- make sure you have somewhere to copy the recovered data.
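For reference, a boot.ini entry with the /3GB switch applied might look like the following. The disk/partition path and OS description vary from machine to machine; this is only an example, not a drop-in replacement for your file.

```ini
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS

[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows XP Professional" /fastdetect /3GB
```

Edit the line for the OS entry you actually boot, and keep a backup copy of the original boot.ini in case of a typo.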
Under these conditions, ZAR will recover an array containing about
12,000,000 objects (files and directories combined). This number is a
hard limit: if the volume contains more than 12M objects, ZAR will
eventually run out of memory.
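The 12M-object ceiling implies a rough per-object memory cost. The figures below are derived from the numbers quoted above, assuming memory use scales roughly linearly with object count; the per-object cost is our estimate, not a published ZAR internal.

```python
# Back-of-envelope memory budget. The 3GB address space (with /3GB) and
# the 12M-object ceiling are from the text; bytes-per-object is derived.

ADDRESS_SPACE_BYTES = 3 * 1024**3   # user address space with /3GB
MAX_OBJECTS = 12_000_000            # observed ceiling

bytes_per_object = ADDRESS_SPACE_BYTES / MAX_OBJECTS
print(f"~{bytes_per_object:.0f} bytes per object")  # ~268 bytes

def max_objects(address_space_gb, per_object=bytes_per_object):
    """Estimated object ceiling for a given amount of address space."""
    return int(address_space_gb * 1024**3 / per_object)
```

At roughly a few hundred bytes of bookkeeping per file, it becomes clear why a multi-million-file volume exhausts a 32-bit address space long before it exhausts the disk.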
On the other hand, a typical NTFS volume, something you most likely
have at home, contains 500,000 files or less. For this typical setup,
512MB of physical memory is enough with the low cache size setting, and
1GB is enough with the cache size at maximum. Most reasonably modern
systems will thus perform quite well on a typical volume. In these
cases, no additional tweaks are necessary.