RAID arrays are an important part of any mission-critical enterprise architecture. When we talk RAID here we mean mirrored RAID, or mirrored and striped RAID, not simple striping, which just gives you a larger drive from several smaller ones. Striping alone may be fine for some home or desktop applications, but for an enterprise application it simply doubles your chances of a failed system.
We often spec out RAID 1 or higher mirrored systems, with RAID 1+0 (mirrored and striped) being the most common, so that you increase access performance AND keep the system up if a single drive fails (even on a 3-drive RAID 1+0 configuration). Along the way we’ve learned some tips & tricks that may help you out. To start with we’ll post some info on Linux RAID and eventually expand this article to include Windows information.
Fake vs. Real RAID
One thing we’ve learned recently is that with the flood of new low-cost servers has come a flood of servers shipping with on-board RAID controllers. Unfortunately, many of these controllers use a low-cost solution that basically pretends to be a RAID controller through the BIOS firmware. In essence they are software RAID controllers posing as hardware RAID controllers, which means you get the bad features of both approaches.
One easy way to tell if you have a server with “fake RAID” is to configure the drives in RAID mode from the BIOS, then boot and install Linux. If the Linux installer sees both drives instead of a single drive, the “on-board RAID” is a poser. Skip it: configure the BIOS in standard drive mode and use Linux software RAID.
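A quick way to double-check from a running system (assuming two SATA drives that would show up as sda and sdb) is to ask the kernel what disks it actually sees:
[root@dev:~]# fdisk -l | grep "^Disk /dev/sd"
If two separate physical disks are listed rather than one RAID volume, the controller is fake and Linux is looking straight past it.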
Most current Linux distros have RAID setup and configuration built into the setup and installation process. We’ll leave the details to other web articles.
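That said, if you ever need to build a mirror by hand on a running system, the basic mdadm command looks something like this (a minimal sketch; the device and partition names are assumptions, adjust them to your layout):
[root@dev:~]# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1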
MDADM – Linux RAID Utility
mdadm is the Linux utility used to manage and monitor software RAID arrays. After configuration, a pair of drives or partitions, typically /dev/sda1 and /dev/sdb1, show up to the rest of the system as a single device such as /dev/md0. They are “paired up” to make the one RAID drive that most of your applications care about.
mdadm is how you look “inside” the single RAID array and see what is going on. Here is an example of a simple “show me the status” command on the RAID array. In this case we have a failed secondary drive in a 2-disk RAID1 array:
[root@dev:log]# mdadm --detail /dev/md0
Version : 00.90.03
Creation Time : Thu Jan 8 12:20:13 2009
Raid Level : raid1
Array Size : 104320 (101.89 MiB 106.82 MB)
Used Dev Size : 104320 (101.89 MiB 106.82 MB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Wed Jul 28 07:27:08 2010
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : a6ef9671:2a98f9e9:d1146f90:29b5d7da
Events : 0.826
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 0 0 1 removed
For a quick overview of every array on the system, /proc/mdstat gives a one-screen summary:
[root@dev:~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb sda1
104320 blocks [2/2] [UU]
md1 : active raid1 sda2
1020032 blocks [2/1] [U_]
md2 : active raid1 sda5
482431808 blocks [2/1] [U_]
unused devices: <none>
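Since nobody stares at a console all day, mdadm can also run in monitor mode and send mail when an array degrades. A hedged example (the mail address and polling interval are placeholders, adjust to taste):
[root@dev:~]# mdadm --monitor --scan --mail=root@localhost --delay=300 --daemonise
On Red Hat style systems the mdmonitor service does this for you based on the MAILADDR line in /etc/mdadm.conf.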
Rebuild An Array
Shut down the system with the failed drive, unless you have a hot-swap drive setup. Pull the bad drive, install the replacement, partition it if necessary (see the fdisk section below), and tell mdadm to add it back so the array can rebuild.
[root@dev:~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1
104320 blocks [2/1] [U_]
md1 : active raid1 sda2
1020032 blocks [2/1] [U_]
md2 : active raid1 sda5
482431808 blocks [2/1] [U_]
unused devices: <none>
[root@dev:~]# mdadm --add /dev/md0 /dev/sdb1
mdadm: added /dev/sdb1
[root@dev:~]# mdadm --add /dev/md1 /dev/sdb2
mdadm: added /dev/sdb2
[root@dev:~]# mdadm --add /dev/md2 /dev/sdb5
mdadm: added /dev/sdb5
These commands add the partitions of the replacement drive (/dev/sdb, our second SATA drive in this case) back to their respective arrays, starting with the first RAID array, md0.
Remove A Drive
To remove a drive it must be marked faulty, then removed.
[root@dev:~]# mdadm --fail /dev/md0 /dev/sdb
[root@dev:~]# mdadm --remove /dev/md0 /dev/sdb
We had to do this on our drive because we forgot to partition it into its separate boot and data partitions (/boot and /) first. Hence /dev/sdb instead of /dev/sdb1, etc., as is the norm for a partitioned drive.
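One related tip: if you are recycling a drive that was previously part of an array (as we were), it can save confusion to wipe the old RAID superblock before re-partitioning it. A hedged example, assuming the recycled drive is /dev/sdb and it is no longer part of any active array:
[root@dev:~]# mdadm --zero-superblock /dev/sdb
Triple-check the device name first; pointed at the wrong drive this destroys its RAID metadata.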
Checking Rebuild Progress
[root@dev:~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1 sda1
104320 blocks [2/2] [UU]
md1 : active raid1 sdb2 sda2
1020032 blocks [2/2] [UU]
md2 : active raid1 sdb5 sda5
482431808 blocks [2/1] [U_]
[>....................] recovery = 0.8% (4050176/482431808) finish=114.5min speed=69592K/sec
unused devices: <none>
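Rather than re-running cat over and over, you can let watch refresh the status for you (the 5-second interval is just an example):
[root@dev:~]# watch -n 5 cat /proc/mdstat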
FDISK – Drive Partitioning
To properly re-add a drive to an array you will need to set the partitions correctly. You do this with fdisk. First, look at the partitions on the valid drive, then copy that layout to the new drive that is replacing the failed one.
[root@dev:~]# fdisk /dev/sda

The number of cylinders for this disk is set to 60801.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *            1          13      104391   fd  Linux raid autodetect
/dev/sda2               14         140     1020127+  fd  Linux raid autodetect
/dev/sda3              141         741     4827532+  8e  Linux LVM
/dev/sda4              742       60801   482431950    5  Extended
/dev/sda5              742       60801   482431918+  fd  Linux raid autodetect
Use "n" to create the new partitions, and "t" to set the type to match above.
That should get you started. Google and the Linux man pages are your friends. As we have time we’ll publish more Linux RAID tricks here.