
Redundant Array of Non-Striped Really Independent Disks

Note: RAnsrID is still under heavy development! Some of the features described here may not work yet.
RAnsrID was first presented at LinuxTag 2010. Slides are available on my publications page.

Upstream code is available on gitorious.org, project ransrid. Get the source with
    git clone git://gitorious.org/ransrid/ransrid.git

Patches are welcome, as always :-) My TODO list is still rather long...

RAnsrID is a RAID-lookalike multi-disk block device for storing large amounts of data in a highly redundant way. It differs from regular RAID in a number of properties that make it the better-suited solution for some applications, mostly related to home use and small business cases. Additionally, it adds a few features that regular software RAID implementations do not have.

Expected use cases include, but are not limited to:

RAnsrID is not a reasonable block device to use as a system or main data disk! In fact, it is recommended to keep its journal on a RAID 1 array (e.g. the system disk(s)). Don't mistake this for a filesystem journal: RAnsrID needs its own journal to get rid of the so-called "RAID write hole" (see Wikipedia's article about RAID).
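
To illustrate why a separate journal closes the write hole, here is a minimal in-memory sketch. It is not RAnsrID's journal format, which is not described here. It assumes a toy array of two data disks plus one XOR parity disk (standing in for the Reed-Solomon redundancy), and all names (`toy_array`, `journal_begin`, `apply_update`, `journal_recover`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK_SIZE = 8 };

/* Toy array: two data disks and one XOR parity disk, one block each. */
struct toy_array {
    uint8_t  data[2][BLOCK_SIZE];
    uint8_t  parity[BLOCK_SIZE];
    /* Journal record, imagined to live on separate redundant storage. */
    bool     pending;
    unsigned disk;
    uint8_t  new_data[BLOCK_SIZE];
    uint8_t  new_parity[BLOCK_SIZE];
};

/* Step 1: record the intended update before touching the array. */
static void journal_begin(struct toy_array *a, unsigned disk,
                          const uint8_t *block)
{
    assert(disk < 2);
    a->disk = disk;
    memcpy(a->new_data, block, BLOCK_SIZE);
    for (int i = 0; i < BLOCK_SIZE; i++)
        a->new_parity[i] = (uint8_t)(block[i] ^ a->data[1u - disk][i]);
    a->pending = true; /* a real journal would flush to stable storage here */
}

/* Step 2: apply the journaled update. crash_after_data simulates a
 * power loss between the data write and the parity write. */
static void apply_update(struct toy_array *a, bool crash_after_data)
{
    memcpy(a->data[a->disk], a->new_data, BLOCK_SIZE);
    if (crash_after_data)
        return; /* parity is now stale: this is the write hole */
    memcpy(a->parity, a->new_parity, BLOCK_SIZE);
    a->pending = false;
}

/* On startup: replay a pending record so data and parity agree again. */
static void journal_recover(struct toy_array *a)
{
    if (a->pending)
        apply_update(a, false);
}

/* True if the parity block matches the XOR of both data blocks. */
static bool parity_consistent(const struct toy_array *a)
{
    for (int i = 0; i < BLOCK_SIZE; i++)
        if ((a->data[0][i] ^ a->data[1][i]) != a->parity[i])
            return false;
    return true;
}
```

Calling `journal_begin`, then `apply_update(&a, true)` leaves the array inconsistent; a subsequent `journal_recover` replays the record and restores consistency. Without the journal record there would be no way to tell after the crash whether the data or the parity is the stale side.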

Also, even a highly redundant disk system doesn't protect against user errors (e.g. file deletions), so it is not an alternative to a backup. However, if the size of the data makes backups prohibitively expensive, RAnsrID used with a filesystem that provides cheap snapshots may be considered safe enough for non-mission-critical data.

RAnsrID is implemented as a multi-threaded network block device (nbd) server in user space, written in C. The main reason for this model was ease of implementation: debugging in user space is far easier, and the Reed-Solomon algorithm and journaling technique used are difficult beasts. The algorithms have been written from scratch so that they fit RAnsrID's use case best (it's also good to write the algorithms personally in order to fully understand the characteristics of Galois field mathematics, and a mild form of NIH syndrome is involved, too). Additionally, I still have little experience hacking the Linux kernel (I had more experience hacking Solaris in the past).

This model has one big disadvantage at the moment: the block device server and the block device user must run on separate hosts for read/write mounts, otherwise the machine can (and probably will) deadlock upon writes. I will experiment with a virtual file server host in the future and verify whether that is enough.
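
For readers unfamiliar with the Galois field arithmetic behind Reed-Solomon redundancy, the following is a minimal sketch, not RAnsrID's actual code. It assumes GF(2^8) with the reduction polynomial 0x11d (the one Linux md RAID 6 uses) and a Cauchy coefficient matrix; the text above names neither, and all function names are hypothetical. It shows how a redundancy block is computed from non-striped data disks, and how it can be updated after a write to a single data disk without reading the other data disks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Multiply in GF(2^8), reducing by x^8+x^4+x^3+x^2+1 (0x11d).
 * The polynomial is an assumption of this sketch. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t product = 0;
    while (b) {
        if (b & 1)
            product ^= a;
        /* multiply a by x and reduce if bit 7 overflows */
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        b >>= 1;
    }
    return product;
}

/* Multiplicative inverse: a^254, because a^255 == 1 for a != 0. */
static uint8_t gf_inv(uint8_t a)
{
    uint8_t result = 1;
    assert(a != 0);
    for (int i = 0; i < 254; i++)
        result = gf_mul(result, a);
    return result;
}

/* Cauchy coefficient for redundancy disk r and data disk d:
 * 1 / (r + (n_red + d)), where "+" in GF(2^8) is XOR.
 * Needs n_red + n_data <= 256 so that all field elements are distinct. */
static uint8_t coef(unsigned r, unsigned d, unsigned n_red)
{
    assert(r < n_red && n_red + d < 256);
    return gf_inv((uint8_t)(r ^ (n_red + d)));
}

/* Compute redundancy block r from scratch (reads every data disk). */
static void encode_full(unsigned r, unsigned n_red, unsigned n_data,
                        const uint8_t *const *data, size_t len, uint8_t *out)
{
    memset(out, 0, len);
    for (unsigned d = 0; d < n_data; d++) {
        uint8_t c = coef(r, d, n_red);
        for (size_t i = 0; i < len; i++)
            out[i] ^= gf_mul(c, data[d][i]);
    }
}

/* Update redundancy block r in place after data disk d changed.
 * Only the old and new contents of that one data block are needed. */
static void encode_delta(unsigned r, unsigned n_red, unsigned d,
                         const uint8_t *old_data, const uint8_t *new_data,
                         size_t len, uint8_t *red)
{
    uint8_t c = coef(r, d, n_red);
    for (size_t i = 0; i < len; i++)
        red[i] ^= gf_mul(c, (uint8_t)(old_data[i] ^ new_data[i]));
}
```

Because the code is linear over GF(2^8), `encode_delta` produces the same result as rerunning `encode_full` on the changed data. A write therefore only has to touch the selected data disk and the redundancy disks. A real implementation would use lookup tables instead of the bitwise `gf_mul` loop for speed.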

So which is the best-suited solution for you: hardware RAID, software RAID, or RAnsrID? The following table tries to summarize the typical constraints of each; the solution you actually purchase/use might have additional restrictions, though.

| Property | Hardware RAID | Software RAID | RAnsrID |
|---|---|---|---|
| Implementation | Dedicated hardware | Kernel driver | User space network block device |
| Visibility to user | One large device | One large device | One device with one partition per data disk |
| Max. no. of disks | 257 (1) | 257 (1) | 256 (2) |
| Max. no. of redundancy disks | Typ. 2 (RAID6) (3) | 2 (RAID6) | 255 (typically 16) (2) |
| Live adding/removing of data disks | No (4) | No (4) | Yes (up to limit on creation time) |
| Live adding/removing of redundancy disks | No | No | Yes (up to limit on creation time) |
| Live attaching/detaching disks w/o rebuilding | No | No | Yes (write depending on policy) |
| Possibility to mount detached disks solo | No | No | Yes (strict r/o mounting policy!) |
| Write on (partially) broken arrays | Yes | Yes (?) | Yes (depending on policy) |
| Cost | High | Low (5) | Low - Very low (6) |
| Read performance | > 100 MB/s | ≈ 100 MB/s (5) | ≈ 30 MB/s (6) (7) |
| Write performance | > 100 MB/s | ≈ 50-100 MB/s (5) | ≈ 10-20 MB/s (6) (7) |
| CPU usage | Very low | Medium | High during writes |
| Disk access during read | All disks | All disks | Only selected data disk |
| Disk access during write | All disks | All disks | Selected data disk and all redundancy disks |
| Power usage while idle | High | Medium-Low | Very low (up to < 0.1 W / disk) (6) |
| Data integrity | Extremely High | Medium | Very high (8) |
| Data integrity: Erasure resilience | Yes | Yes | Yes |
| Data integrity: Error resilience | RAID6: Yes (full speed, 1 error) | RAID6: (?) | Yes (no. red. disks / 2 errors) (9) |
| Data integrity: Write hole | No (buffered cache) | Yes | No (write journal) |

(1) RAID 5: Unlimited data disks, 1 redundancy disk. RAID 6: 255 data disks, 2 redundancy disks.
(2) The total number of disks has to be split into a maximum number of data disks and a maximum number of redundancy disks upon initial creation, e.g. 240/16.
(3) Some hardware RAIDs have additional (non-standard) levels with more than 2 redundancy disks.
(4) With the exception of RAID 4, which is considered dead nowadays. Also, it's typically possible to add/remove disks by triggering a live rebuild of the array.
(5) Assuming use with internal hard disks. Performance numbers and general usage quality drop significantly with external disks.
(6) Assuming use with external USB2 hard disks.
(7) Numbers are estimates; no performance test has been performed yet. Numbers also depend on CPU speed. Large block writes are expected to be much faster than small block writes.
(8) With device state journaling, write journaling, rebuild journaling and add/remove journaling turned on.
(9) All disks have to run for error detection. This also potentially slows down reads (computational overhead).
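
The two resilience rows of the table can be combined into a single rule of thumb. The sketch below encodes the standard bound for Reed-Solomon style codes: a missing disk (erasure) costs one redundancy disk, while a disk silently returning wrong data (error) costs two, because it must first be located. The table only states the two pure cases; the mixed case is the general coding-theory bound and not a statement about what RAnsrID currently implements. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* True if n_red redundancy disks suffice to repair the given damage.
 * erasures: disks known to be missing or detached.
 * errors:   disks returning wrong data without reporting it.
 * Bound for maximum-distance-separable codes: 2*errors + erasures <= n_red. */
static bool can_recover(unsigned n_red, unsigned erasures, unsigned errors)
{
    return 2u * errors + erasures <= n_red;
}

/* Largest number of silent errors correctable when no disk is missing. */
static unsigned max_silent_errors(unsigned n_red)
{
    return n_red / 2;
}
```

With the typical 16 redundancy disks this gives up to 16 missing disks or up to 8 silently corrupted ones. With 2 redundancy disks it gives the single correctable error listed for RAID 6.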