HOWTO Replace a failing disk on Linux Software RAID-5

Signs of impending disk failure

Stage 1: SMART Pre-failure warnings

According to SMART, one of the drives developed an unreadable sector, and the smartd daemon sent me a warning email: "SMART error (CurrentPendingSector) detected":
The following warning/error was logged by the smartd daemon:
   
   Device: /dev/sdb, 1 Currently unreadable (pending) sectors
To determine the extent of the problem, execute a long selftest:
# smartctl -t long /dev/sdb
Some of the symptoms during the pre-failure period are:
  • "Wait for IO" state (%wa in top) goes though the roof; the disk subsystem is effectively halted while the disk tries to recover from the read-error.
  • The system becomes very sluggish at times.
As always, check that your backups are up-to-date and properly readable.
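Once the long self-test completes (it can take several hours), the results and the pending-sector count can be reviewed with smartctl; attribute names vary between drives, so treat the grep pattern as illustrative:
[root@hal ~]# smartctl -l selftest /dev/sdb
   [root@hal ~]# smartctl -A /dev/sdb | grep -i pending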

Stage 2: mdadm Fail event

After encountering some more read errors, the RAID monitoring software decided to fail the partition that was causing problems, and sent another email:
A Fail event had been detected on md device /dev/md1.
 
 It could be related to component device /dev/sdb2.
 
 Faithfully yours, etc.
 
 P.S. The /proc/mdstat file currently contains the following:
 
 Personalities : [raid6] [raid5] [raid4] [raid1] 
 md0 : active raid1 sda1[0] sdb1[1] sdc1[2] sdd1[3]
      128384 blocks [4/4] [UUUU]
 
 md1 : active raid5 sdd2[3] sdc2[2] sdb2[4](F) sda2[0]
      1464765696 blocks level 5, 256k chunk, algorithm 2 [4/3] [U_UU]
 
 unused devices: <none>
You can verify this information using mdadm:
[root@hal ~]# mdadm --detail /dev/md1
   /dev/md1:
           Version : 0.90
     Creation Time : Sun Sep  2 18:56:50 2007
        Raid Level : raid5
        Array Size : 1464765696 (1396.91 GiB 1499.92 GB)
     Used Dev Size : 488255232 (465.64 GiB 499.97 GB)
      Raid Devices : 4
     Total Devices : 4
   Preferred Minor : 1
       Persistence : Superblock is persistent
   
       Update Time : Tue Feb  2 12:16:34 2010
             State : clean, degraded
    Active Devices : 3
   Working Devices : 3
    Failed Devices : 1
     Spare Devices : 0
   
            Layout : left-symmetric
        Chunk Size : 256K
   
              UUID : f8a183b6:6748af53:6c1c8c11:87458cf7
            Events : 0.964132
   
       Number   Major   Minor   RaidDevice State
          0       8        2        0      active sync   /dev/sda2
          1       0        0        1      removed
          2       8       34        2      active sync   /dev/sdc2
          3       8       50        3      active sync   /dev/sdd2
   
          4       8       18        -      faulty spare   /dev/sdb2
The RAID array is running in degraded mode - the failed disk needs to be replaced as soon as possible. Again, check your backups!

Prepare for disk replacement

The faulty disk, /dev/sdb, was partitioned and used in two RAID arrays (/dev/md0 and /dev/md1). Every occurrence of this disk must be marked as faulty before the disk is physically replaced; /dev/sdb2 has already been failed automatically, so only /dev/sdb1 in /dev/md0 remains:
[root@hal ~]# mdadm --fail /dev/md0 /dev/sdb1
   mdadm: set /dev/sdb1 faulty in /dev/md0
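Optionally, the failed partitions can also be removed from the arrays explicitly before the machine is powered down and the drive is pulled:
[root@hal ~]# mdadm --remove /dev/md0 /dev/sdb1
   [root@hal ~]# mdadm --remove /dev/md1 /dev/sdb2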
The faulty disk is no longer in active use by any RAID array; erase all data before sending it in for replacement:
[root@hal ~]# dd if=/dev/zero of=/dev/sdb1
   [root@hal ~]# dd if=/dev/zero of=/dev/sdb2
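Before opening the case, it also helps to know exactly which physical unit /dev/sdb is; one way is to note the drive's model and serial number (as reported by smartctl) and match them against the label on the disk:
[root@hal ~]# smartctl -i /dev/sdb | grep -E 'Model|Serial'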

Recovering from disk failure

Partitioning

Replace the broken hard disk with a new one, create the appropriate partitions, and set the bootable flag on partition 1:
[root@hal ~]# fdisk -l /dev/sdb
   
   Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
   255 heads, 63 sectors/track, 121601 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   
      Device Boot      Start         End      Blocks   Id  System
   /dev/sdb1   *           1          16      128488+  fd  Linux raid autodetect
   /dev/sdb2              17      121601   976631512+  fd  Linux raid autodetect
Note that the replacement disk is 1TB, while the original disk was 500GB. You can safely create a partition that is larger than the original one - in fact, this may be useful when doing a "rolling upgrade" to a set of larger disks.
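If the replacement disk had been the same size as the original, one shortcut is to copy the partition table from a surviving disk with sfdisk; with a larger disk, as here, the partitions are better created by hand:
[root@hal ~]# sfdisk -d /dev/sda | sfdisk /dev/sdb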

Add to RAID arrays

Add the new disk's partitions to the RAID arrays:
[root@hal ~]# mdadm --add /dev/md0 /dev/sdb1
   mdadm: added /dev/sdb1
   
   [root@hal ~]# mdadm --add /dev/md1 /dev/sdb2
   mdadm: added /dev/sdb2
Array reconstruction should now begin automatically:
[root@hal ~]# cat /proc/mdstat 
   Personalities : [raid6] [raid5] [raid4] [raid1] 
   md0 : active raid1 sdb1[1] sda1[0] sdc1[2] sdd1[3]
         128384 blocks [4/4] [UUUU]
         
   md1 : active raid5 sdb2[4] sdd2[3] sdc2[2] sda2[0]
         1464765696 blocks level 5, 256k chunk, algorithm 2 [4/3] [U_UU]
         [>....................]  recovery =  0.0% (84068/488255232) finish=193.4min speed=42034K/sec
         
   unused devices: <none>
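To follow the rebuild without retyping the command, something like watch comes in handy:
[root@hal ~]# watch -n 10 cat /proc/mdstat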

Check boot-loader

The new disk is bootable, but GRUB is not yet installed:
[root@hal ~]# file -s /dev/sda
   /dev/sda: x86 boot sector; partition 1: ID=0xfd, active, starthead 1, startsector 63, 256977 sectors;
            partition 2: ID=0xfd, starthead 0, startsector 257040, 976511025 sectors, code offset 0x48
   
   [root@hal ~]# file -s /dev/sdb
   /dev/sdb: x86 boot sector; partition 1: ID=0xfd, active, starthead 1, startsector 63, 256977 sectors;
            partition 2: ID=0xfd, starthead 0, startsector 257040, 1953263025 sectors
Install GRUB on the new disk, /dev/sdb (GRUB uses different names for your disks, /dev/sda => hd0, /dev/sdb => hd1 etc.):
[root@hal ~]# grub
   <...snip...>
   grub> root (hd1,0)
   root (hd1,0)
    Filesystem type is ext2fs, partition type 0xfd
   grub> setup (hd1)
   setup (hd1)
    Checking if "/boot/grub/stage1" exists... no
    Checking if "/grub/stage1" exists... yes
    Checking if "/grub/stage2" exists... yes
    Checking if "/grub/e2fs_stage1_5" exists... yes
    Running "embed /grub/e2fs_stage1_5 (hd1)"...  15 sectors are embedded.
   succeeded
    Running "install /grub/stage1 (hd1) (hd1)1+15 p (hd1,0)/grub/stage2 /grub/grub.conf"... succeeded
   Done.
   grub> quit
   quit
Verify the result:
[root@hal ~]# file -s /dev/sdb
   /dev/sdb: x86 boot sector; partition 1: ID=0xfd, active, starthead 1, startsector 63, 256977 sectors;
            partition 2: ID=0xfd, starthead 0, startsector 257040, 1953263025 sectors, code offset 0x48
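If in doubt about how GRUB's (hd0), (hd1) names map to Linux devices, the legacy-GRUB device map (its location may vary by distribution) shows the assignment, typically along these lines:
[root@hal ~]# cat /boot/grub/device.map
   (hd0)   /dev/sda
   (hd1)   /dev/sdb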

Wait for reconstruction to complete

During array reconstruction, performance may suffer. Check RAID status for an estimated time of completion:
[root@hal ~]# cat /proc/mdstat 
   Personalities : [raid6] [raid5] [raid4] [raid1] 
   md0 : active raid1 sdb1[1] sda1[0] sdc1[2] sdd1[3]
         128384 blocks [4/4] [UUUU]
         
   md1 : active raid5 sdb2[4] sdd2[3] sdc2[2] sda2[0]
         1464765696 blocks level 5, 256k chunk, algorithm 2 [4/3] [U_UU]
         [===>.................]  recovery = 18.5% (90480896/488255232) finish=147.6min speed=44912K/sec
         
   unused devices: <none>
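If the rebuild is too slow (or is starving normal I/O), the kernel's md speed limits can be tuned; the values are in KB/s, so adjust them to your hardware:
[root@hal ~]# cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
   [root@hal ~]# echo 50000 > /proc/sys/dev/raid/speed_limit_min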
