Tuesday, August 29, 2006

Creating a 4-disk RAID10 using mdadm

Since I can't seem to find instructions on how to do this (yet)...

I'm going to create a 4-disk RAID10 array using Linux Software RAID and mdadm. The old way is to create individual RAID1 volumes and then stripe a RAID0 volume over the RAID1 arrays. That requires creating extra /dev/mdN nodes which can be confusing to the admin that follows you.

1) Create the /dev/mdN node for the new RAID10 array. In my case, I already have /dev/md0 to /dev/md4 so I'm going to create /dev/md5 (note that "5" appears twice in the command).

# mknod /dev/md5 b 9 5
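
The "b 9 5" part means a block device with major number 9 (the number Linux assigns to md devices) and a minor number that matches the N in /dev/mdN. As a minimal sketch, the helper below only prints the matching command for any array number so you can review it before running it as root. The name md_node_cmd is my own invention, not part of mdadm.

```shell
# Print the mknod command for /dev/mdN (dry run only, nothing is created).
# md_node_cmd is a hypothetical helper name.
md_node_cmd() {
    n="$1"
    # Block device, major 9 (md driver), minor equal to the array number.
    echo "mknod /dev/md${n} b 9 ${n}"
}

md_node_cmd 5   # prints: mknod /dev/md5 b 9 5
```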

2) Use fdisk on each of the four drives to create a single primary partition of type "fd" (Linux raid autodetect). Note that I have *nothing* on these brand new drives, so I don't care if it wipes out data.
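
If you would rather script this step than walk through fdisk four times, sfdisk can read a partition description from standard input. The sketch below only prints the commands for review and does not touch any disk. It assumes a reasonably recent util-linux sfdisk that accepts a "type=fd" script line, and the helper name print_partition_cmds is hypothetical. Check your sfdisk man page first, because the printed commands are destructive when actually run.

```shell
# Dry run: print one sfdisk command per drive instead of executing it.
# print_partition_cmds is a hypothetical helper name.
print_partition_cmds() {
    for disk in "$@"; do
        # "type=fd" requests one partition with default start and size,
        # tagged as Linux raid autodetect (assumes a recent sfdisk).
        echo "echo 'type=fd' | sfdisk ${disk}"
    done
}

# The four drives used for the array in this post.
print_partition_cmds /dev/sdc /dev/sdd /dev/sde /dev/sdf
```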

3) Create the mdadm RAID set using 4 devices and a level of RAID10.

# mdadm --create /dev/md5 -v --raid-devices=4 --chunk=32 --level=raid10 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1

Which will result in the following output:

mdadm: layout defaults to n1
mdadm: size set to 732571904K
mdadm: array /dev/md5 started.

# cat /proc/mdstat

Personalities : [raid1] [raid10]
md5 : active raid10 sdf1[3] sde1[2] sdd1[1] sdc1[0]
1465143808 blocks 32K chunks 2 near-copies [4/4] [UUUU]
[>....................] resync = 0.2% (3058848/1465143808) finish=159.3min speed=152942K/sec


As you can see, the RAID10 array resyncs at around 150MB/s. The regular RAID1 arrays only manage about 75MB/s (the same as a single 750GB drive). Keep in mind that this is the resync rate, not a benchmark of normal read or write throughput.
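
Two of the figures in the output above can be checked by hand: the array size is the per-device size times the number of devices divided by the number of copies, and the finish= estimate is the remaining block count divided by the current speed. A rough sketch of that arithmetic follows. The helper names are hypothetical, and it assumes the 1K block units that /proc/mdstat reports.

```shell
# Both helpers are hypothetical names, not part of mdadm.

# Usable size in 1K blocks: member devices * per-device size / copies.
raid10_usable_kib() {
    echo $(( $1 * $2 / $3 ))
}

# Minutes left in a resync: (total - finished) blocks at speed K/sec.
resync_minutes_left() {
    awk -v total="$1" -v finished="$2" -v speed="$3" \
        'BEGIN { printf "%.1f\n", (total - finished) / speed / 60 }'
}

raid10_usable_kib 4 732571904 2                 # prints 1465143808
resync_minutes_left 1465143808 3058848 152942   # prints 159.3
```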

A final note. My mdadm.conf file is completely empty on this system. That works well for simple systems, but you'll want to create a configuration file in more complex setups.
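
For those more complex setups, a common starting point is to let mdadm describe the running arrays itself with "mdadm --detail --scan" and append the result to the configuration file. The wrapper below is only a sketch: save_md_config is a made-up name, and the default path is an assumption (some distributions use /etc/mdadm/mdadm.conf instead).

```shell
# Append an ARRAY line for every running array to the given config file.
# save_md_config is a hypothetical helper; the default path is an assumption.
save_md_config() {
    conf="${1:-/etc/mdadm.conf}"
    mdadm --detail --scan >> "$conf"
}

# Usage (as root), after reviewing what "mdadm --detail --scan" prints:
#   save_md_config /etc/mdadm.conf
```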

Updates:

Most of the arrays that I've built have been based on 7200 RPM SATA drives. For small arrays (4 disks with a hot spare), you can often find enough ports on the motherboard. For larger arrays, you'll need to look for PCIe SATA controllers. I've used Promise and 3ware SATA RAID cards. Basically any card that allows the SATA drives to be seen and is supported directly in the Linux kernel is a good bet (going forward we're going to switch to Areca at work).

Starting in 2011 we switched over to using LSI SAS adapters with either 8 or 16 ports (broken out through 1:4 mini-SAS cables). The latest one that I had good results with is the LSI SAS 9201-16i. We are using the SuperMicro SAS enclosure along with a total of ten 300GB 15k RPM SAS drives.

(Note that the SuperMicro CSE-M35T-1 or CSE-M35T-1B is the SATA version.  If you want the SAS version you have to look for CSE-M35TQB or CSE-M35TQ.  These are very good enclosures which fit five 3.5" SAS drives into the space of three 5.25" bays.  The TQB enclosure can hold SAS or SATA drives, while the other version is SATA only.)

Example of creating a 6-drive RAID10 array:

# mdadm --create /dev/md5 --raid-devices=6 --spare-devices=1 --layout=n2 --level=raid10 /dev/sd[a-g]1

In this case, we're setting up a 6-drive RAID10 array along with 1 hot-spare. Disks sda to sdg all have a single partition on them, tagged as "fd" Linux RAID in fdisk.
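
One thing that is easy to get wrong here is the device count: the list of partitions has to contain exactly raid-devices plus spare-devices entries (6 + 1 = 7 in this example). A small sanity check, sketched with a hypothetical helper name, could look like this:

```shell
# Check that the number of member devices equals raid + spare devices.
# check_member_count is a hypothetical helper name.
check_member_count() {
    raid="$1"; spare="$2"; shift 2
    want=$(( raid + spare ))
    if [ "$#" -eq "$want" ]; then
        echo "ok: $# devices for ${raid} active + ${spare} spare"
    else
        echo "mismatch: got $# devices, expected ${want}"
        return 1
    fi
}

# /dev/sd[a-g]1 expands to seven names, sda1 through sdg1.
check_member_count 6 1 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 \
    /dev/sde1 /dev/sdf1 /dev/sdg1
```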

"n2" is the default RAID10 layout for mdadm and is a good default that provides balanced performance for reads and writes.  It is also the most common RAID 1+0 definition where each of the mirror disks is identical to the other mirror disk with no offset.

"o2" is the offset layout: each stripe is written twice, and the second copy is shifted over by one disk.

"f2" is the far layout, which keeps the second copy of the data in a distant region of each disk. It has better sequential read performance, but worse write performance.

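To make the "n2" description concrete, here is a small sketch that prints which chunk lands on which disk when every chunk is repeated on consecutive devices, which is how the near layout is described in the md documentation. It covers only the near layout. The function name and the chunk numbering (starting at 0) are my own choices for illustration.

```shell
# Print which chunk lands on which device for a "near" layout.
# near_layout DEVICES COPIES ROWS; near_layout is a hypothetical name.
near_layout() {
    devices="$1"; copies="$2"; rows="$3"
    pos=0
    row=0
    while [ "$row" -lt "$rows" ]; do
        line="row ${row}:"
        dev=0
        while [ "$dev" -lt "$devices" ]; do
            # Every chunk is repeated $copies times on consecutive devices.
            line="${line} $(( pos / copies ))"
            pos=$(( pos + 1 ))
            dev=$(( dev + 1 ))
        done
        echo "$line"
        row=$(( row + 1 ))
    done
}

near_layout 4 2 2
# row 0: 0 0 1 1
# row 1: 2 2 3 3
```

With four disks and two copies, disks 1 and 2 hold identical data and so do disks 3 and 4, which is the classic RAID 1+0 picture.
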
4 comments:

Robert said...

After futzing with doing RAID10 for 2 days now I found this page... Some of the other pages I found had me doing RAID1 in mdadm and then using lvm for the RAID0 part. Which if you mess it up, you're hosed... While I'm VERY fond of LVM - I've been using it on HP-UX since 1995 - this was so simple for this small 1U server that won't be changing anytime soon. Thanks!

Anonymous said...

This howto seems extremely simple... yet entirely logical. I'll try it out tomorrow, since I've spent many days trying to set up this level 10. Thank you!

Petrus said...

A very useful how-to :D

Could you also explain what type of hardware (especially hard drives and controllers) you are using for this RAID10?
I'd like to build a NAS server that will see heavy transfer and load.

Thanks in advance.

Anonymous said...

Thanks for writing this (5 years ago!) -- I found it helpful yesterday. However, I would note that the resync speed is not actually the RAID speed, but how fast the slowest drive is syncing to the array. When I build a RAID10 out of WD caviar black drives, I get 1/2 the benchmark write for resync (equivalent to a single disk write speed). Also, the resync speed will change if you've already made the drives equal. In an array of WD caviar green drives, randomized disks get a resync speed of 50MB/s (under the specs for write speed on these drives). If I mirror the drives with a "dd if=/dev/sda of=/dev/sdb" before building the array, I get a resync speed of 700MB/s (well over the write speed of 4 of these drives combined). I'd like to let people know not to worry if the resync is slow... especially using cheap disks.