Rebuild mdadm RAID array

This is just a short note for myself, because I couldn't remember the mdadm commands to replace a failed disk and rebuild the array. This is probably the 100,000th post about mdadm :-)

So what happened: on one of my Linux boxes the RAID array degraded because of a broken disk. Thank god I had a spare disk lying next to the machine.

# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid5 sda1[0](F) sdb1[4] sdd1[3] sdc1[2]
      5860141056 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
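
The two markers to look for in this output are the (F) behind a member device and the underscore in the [_UUU] status field. As a rough sketch (not something I ran during the incident), they can be pulled out with awk. The function names degraded_arrays and failed_members are made up for this note, and the parsing assumes the mdstat layout shown above.

```shell
#!/bin/sh
# Sketch: helper names are hypothetical; parsing assumes the mdstat layout above.
# Both functions read mdstat text from the file in $1 (default: /proc/mdstat).

# Print every array whose status field contains "_" (a missing member).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1; next }    # remember the current array name
        # the status field, e.g. [_UUU], is the last field of the "blocks" line
        name != "" && $NF ~ /^\[[U_]*_[U_]*\]$/ { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}

# Print "array device" for every member flagged as failed with (F).
failed_members() {
    awk '
        /^md[0-9]+ :/ {
            for (i = 4; i <= NF; i++)
                if ($i ~ /\(F\)$/) {
                    dev = $i
                    sub(/\[[0-9]+\]\(F\)$/, "", dev)   # sda1[0](F) -> sda1
                    print $1, dev
                }
        }
    ' "${1:-/proc/mdstat}"
}
```

Called without an argument, both functions read /proc/mdstat directly. A saved copy of the output can be passed as the first argument instead.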

First of all, I collect all the information I can get about the failed disk, to be sure which physical disk I have to replace.

Checking with mdadm whether the failed disk still returns any RAID information. In my case it did not:

# mdadm --examine /dev/sda1
mdadm: No md superblock detected on /dev/sda1.

Using hdparm to get the serial number of the device.

# hdparm -i /dev/sda
/dev/sda:
 Model=WDC WD20EARS-00MVWB0, FwRev=51.0AB51, SerialNo=WD-WCAZA3223068
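
If you only want the serial number (for example to compare it with the label on the drive), it can be cut out of the hdparm output. This is a minimal sketch that assumes the SerialNo= field looks like the line above. The function name serial_from_hdparm is made up for this note.

```shell
#!/bin/sh
# Sketch: reads "hdparm -i" output on stdin and prints only the serial number.
# Assumes the "SerialNo=<value>" field format shown above.
serial_from_hdparm() {
    sed -n 's/.*SerialNo=\([^,[:space:]]*\).*/\1/p'
}
```

Usage would be something like hdparm -i /dev/sda | serial_from_hdparm. On systems with udev, the symlinks in /dev/disk/by-id/ usually contain the model and serial number as well, which is another way to match a device name to a physical disk.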

I also checked the disk with smartctl, to be sure that something is really broken.

# smartctl -a /dev/sda
[...]
Error 9216 occurred at disk power-on lifetime: 22877 hours (953 days + 5 hours)
[...]

Then I removed the failed disk from the existing RAID array.

# mdadm /dev/md0 --remove /dev/sda1
mdadm: hot removed /dev/sda1 from /dev/md0

Second, I shut down the machine and replaced the broken disk, identifying it by its serial number. Third, I inserted the new disk and booted the machine. You can check via dmesg whether the new disk was detected by the system.

Copy the partition table from an existing disk (sdb) to the new one (sda).

# sfdisk -d /dev/sdb | sfdisk /dev/sda

Assemble the RAID array and add the new disk.

# mdadm --assemble --run /dev/md0
# mdadm /dev/md0 --manage --add /dev/sda1 
mdadm: added /dev/sda1

You can follow the progress of the rebuild via cat /proc/mdstat.

# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid5 sda1[5] sdb1[4] sdc1[2] sdd1[3]
      5860141056 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
      [>....................]  recovery =  0.0% (185816/1953380352) finish=1051.0min speed=30969K/sec
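
For a quick one-line summary instead of the full output, the progress line can be reduced to array name, percentage and estimated time left. Again, this is only a sketch that assumes the recovery line format shown above. The function name rebuild_progress is made up for this note.

```shell
#!/bin/sh
# Sketch: prints "array percent eta" for every array that is rebuilding.
# Reads mdstat text from the file in $1 (default: /proc/mdstat).
rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }              # remember the current array name
        /(recovery|resync) *=/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/) pct = $i          # e.g. 0.0%
                if ($i ~ /^finish=/) {           # e.g. finish=1051.0min
                    eta = $i
                    sub(/^finish=/, "", eta)
                }
            }
            print name, pct, eta
        }
    ' "${1:-/proc/mdstat}"
}
```

If you just want to block until the rebuild is done, mdadm --wait /dev/md0 does that. For a live view, watch cat /proc/mdstat works fine.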

Posted

December 28, 2013, 11:26 am
