Replace hard or solid state drive with a bigger one and grow software (mdadm) raid

Published on June 7th 2019 - Listed in Linux Hardware

In my last post I announced a new release of the check_smart monitoring plugin, which now checks additional SMART attributes (not just Current_Pending_Sector). As soon as I rolled out the new version to the servers, I was alerted to a failing SSD:

Model Family:     Samsung based SSDs
Device Model:     SAMSUNG SSD PM810 2.5" 7mm 128GB
Serial Number:    XXXXXXXXXXXXXX
LU WWN Device Id: 5 0000f0 000000000
Firmware Version: AXM08D1Q
User Capacity:    128,035,676,160 bytes [128 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS, ATA/ATAPI-7 T13/1532D revision 1
SATA Version is:  SATA 2.6, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Jun  6 20:32:23 2019 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
  5 Reallocated_Sector_Ct   0x0033   099   099   ---    Pre-fail  Always       -       16
  9 Power_On_Hours          0x0032   095   095   ---    Old_age   Always       -       25108
 12 Power_Cycle_Count       0x0032   099   099   ---    Old_age   Always       -       890
175 Program_Fail_Count_Chip 0x0032   099   099   ---    Old_age   Always       -       11
176 Erase_Fail_Count_Chip   0x0032   100   100   ---    Old_age   Always       -       0
177 Wear_Leveling_Count     0x0013   075   075   ---    Pre-fail  Always       -       877
178 Used_Rsvd_Blk_Cnt_Chip  0x0013   080   080   ---    Pre-fail  Always       -       396
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   082   082   ---    Pre-fail  Always       -       722
180 Unused_Rsvd_Blk_Cnt_Tot 0x0013   082   082   ---    Pre-fail  Always       -       3310
181 Program_Fail_Cnt_Total  0x0032   099   099   ---    Old_age   Always       -       16
182 Erase_Fail_Count_Total  0x0032   100   100   ---    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   099   099   ---    Pre-fail  Always       -       16
187 Uncorrectable_Error_Cnt 0x0032   067   067   ---    Old_age   Always       -       33281
195 ECC_Error_Rate          0x001a   001   001   ---    Old_age   Always       -       33281
198 Offline_Uncorrectable   0x0030   100   100   ---    Old_age   Offline      -       0
199 CRC_Error_Count         0x003e   253   253   ---    Old_age   Always       -       0
232 Available_Reservd_Space 0x0013   080   080   ---    Pre-fail  Always       -       1620
241 Total_LBAs_Written      0x0032   037   037   ---    Old_age   Always       -       2708399316
242 Total_LBAs_Read         0x0032   035   035   ---    Old_age   Always       -       2781759092

16 sectors have already been reallocated (on this drive they are also counted as Program_Fail_Cnt_Total and Runtime_Bad_Block) and there are more than 33'000 uncorrectable errors! The SSD is quite "old", however: it has been running for more than 25'000 hours and belongs to an older solid state generation.
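
To spot such values quickly in a long attribute table, a small filter can help. The following is only a sketch: the function name smart_nonzero and the list of watched attributes are my own illustrative choices, not the logic check_smart actually uses.

```shell
# Print watched SMART attributes whose raw value (last column) is non-zero.
# Reads `smartctl -A`-style text on stdin. The attribute list below is an
# assumption made for illustration only.
smart_nonzero() {
    awk -v watch="Reallocated_Sector_Ct Program_Fail_Cnt_Total Runtime_Bad_Block Uncorrectable_Error_Cnt Offline_Uncorrectable" '
        BEGIN {
            n = split(watch, names, " ")
            for (i = 1; i <= n; i++) want[names[i]] = 1
        }
        # Attribute rows start with a numeric ID; the raw value is the last field.
        $1 ~ /^[0-9]+$/ && ($2 in want) && ($NF + 0) > 0 { print $2 "=" $NF }
    '
}
```

On a live system it could be fed with the output of smartctl -A /dev/sdc; for the table above it would list the reallocated sectors, program failures, bad blocks and uncorrectable errors, but not Offline_Uncorrectable, whose raw value is 0.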

That drive is part of a software RAID-1 managed by mdadm, which in turn serves as a physical volume (PV) for the Logical Volume Manager (LVM):

# cat /proc/mdstat

md3 : active raid1 sdc1[1] sdb1[0]
      124968256 blocks super 1.2 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>

# pvs | grep md3
  /dev/md3   vgssd    lvm2 a--  119.18g      0

I thought this was a great moment to grow the raid by replacing both 128GB drives with two newer 224GB drives.

Replacing the drives

To keep the data, I first removed the drive /dev/sdc (the one with the large number of errors in the SMART table above) from the raid, following a step-by-step guide I once wrote (Some notes on how to replace a HDD in software raid).
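
The referenced guide is not reproduced here. As a rough sketch, taking a member out of an array is usually done by marking it as failed and then removing it. The helper names run and remove_member and the DRY_RUN switch are hypothetical and only there so the commands can be previewed safely; the exact steps in the guide may differ.

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical safety switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mark a raid member as failed, then remove it from the array.
# Usage: remove_member /dev/md3 /dev/sdc1
remove_member() {
    md="$1"
    member="$2"
    run mdadm "$md" --fail "$member" &&
    run mdadm "$md" --remove "$member"
}
```

With DRY_RUN=1 set, remove_member /dev/md3 /dev/sdc1 only prints the two mdadm commands, which is a cheap way to double check the device names before touching the array.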

After I physically replaced the drive, I did one step differently than in the mentioned guide: instead of copying the partition table from the remaining drive (/dev/sdb), I manually created a new partition that fills the whole drive:

# fdisk /dev/sdc

Welcome to fdisk (util-linux 2.29.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1):
First sector (2048-468877311, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-468877311, default 468877311):

Created a new partition 1 of type 'Linux' and of size 223.6 GiB.

Command (m for help): t
Selected partition 1
Partition type (type L to list all types): da
Changed type of partition 'Linux' to 'Non-FS data'.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

Note: I set the partition type to "Non-FS data" (da), because fd (Linux raid autodetect), the partition type previously used for software raids, is now deprecated.
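
The 223.6 GiB reported by fdisk can be verified from the sector range of the new partition. This is a small worked calculation; the function name partition_gib is made up for this example and the 512-byte sector size is taken from the SMART output of the old drive, so it is an assumption for the new one.

```shell
# Size of a partition in GiB from its first and last sector (inclusive).
# The sector size defaults to 512 bytes (assumed, see lead-in).
partition_gib() {
    first="$1"
    last="$2"
    sector_bytes="${3:-512}"
    awk -v f="$first" -v l="$last" -v s="$sector_bytes" \
        'BEGIN { printf "%.1f\n", (l - f + 1) * s / (1024 ^ 3) }'
}

partition_gib 2048 468877311   # prints 223.6
```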

Then I added the new drive /dev/sdc into the still existing raid-1 (md3):

# mdadm /dev/md3 -a /dev/sdc1
mdadm: added /dev/sdc1

Of course this raid device now needs to rebuild:

# cat /proc/mdstat

md3 : active raid1 sdb1[3] sdc1[2]
      124968256 blocks super 1.2 [2/1] [_U]
      [>....................]  recovery =  0.6% (801792/124968256) finish=10.3min speed=200448K/sec
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>

I waited until the raid was rebuilt. At that point the raid still ran with the old size, because the other drive (/dev/sdb) was still a 128GB drive.
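
Instead of re-running cat by hand, the progress can be extracted from the mdstat text. The following sketch assumes the stanza layout shown above; the function name md_sync_progress is my own.

```shell
# Print the recovery/resync percentage of one md device.
# Reads /proc/mdstat-style text on stdin; prints nothing if no sync is running.
md_sync_progress() {
    awk -v dev="$1" '
        $1 == dev && $2 == ":" { in_dev = 1; next }   # start of the device stanza
        NF == 0                { in_dev = 0 }         # a blank line ends the stanza
        in_dev && /(recovery|resync) *=/ && match($0, /[0-9]+\.[0-9]+%/) {
            print substr($0, RSTART, RLENGTH - 1)     # strip the trailing %
        }
    '
}
```

It would be called as md_sync_progress md3 < /proc/mdstat. If the goal is simply to block until the rebuild has finished, mdadm --wait /dev/md3 does that without any parsing.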

Now it was time to replace the second drive (/dev/sdb). I followed exactly the same steps as for /dev/sdc, using the mdadm drive replacement guide but with the bigger partition.
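
Pulling the second drive before the first rebuild has finished would leave the raid without a complete copy of the data, so it is worth checking the member state first. A minimal sketch, again based on the mdstat layout shown above and with a made-up function name md_is_healthy:

```shell
# Succeed only if the given md device shows all members up (e.g. [UU]).
# Reads /proc/mdstat-style text on stdin; an underscore marks a missing member.
md_is_healthy() {
    awk -v dev="$1" '
        $1 == dev && $2 == ":" { in_dev = 1; next }
        NF == 0                { in_dev = 0 }
        in_dev && match($0, /\[[U_]+\]/) {
            found = 1
            if (substr($0, RSTART, RLENGTH) ~ /_/) degraded = 1
        }
        END {
            if (found && !degraded) exit 0
            exit 1
        }
    '
}
```

Used as md_is_healthy md3 < /proc/mdstat, it returns a non-zero exit code while the raid is degraded or the device is not listed at all, so it can guard the removal of the second drive in a script.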

Growing the mdadm raid device

Once the raid was rebuilt a second time, it ran on two 224GB drives, yet it was still limited to the old 128GB size. Growing (expanding) a raid device is actually very easy:

# mdadm --grow /dev/md3 --size=max
mdadm: component size of /dev/md3 has been set to 234372096K

This triggers a resync on the raid device:

# cat /proc/mdstat

md3 : active raid1 sdc1[2] sdb1[3]
      234372096 blocks super 1.2 [2/2] [UU]
      [===============>.....]  resync = 78.2% (183370304/234372096) finish=6.3min speed=134554K/sec
      bitmap: 1/1 pages [4KB], 131072KB chunk

unused devices: <none>

Note the larger sizes (in 1 KiB blocks) in parentheses after the sync progress percentage.

Once the resync has completed, the PV can be resized:

# pvresize /dev/md3
  Physical volume "/dev/md3" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized

Voilà, thanks to the grown PV, the volume group (VG) now has more space available:

# pvs | grep md3
  /dev/md3   vgssd    lvm2 a--  223.51g 104.34g
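
As a plausibility check, the block counts from /proc/mdstat can be converted to GiB and compared with what pvs reports. The helper kib_to_gib is a made-up name for this example. LVM reserves a little space for its own metadata, so the numbers only have to match approximately, which they do here.

```shell
# Convert a size in 1 KiB blocks (as shown in /proc/mdstat) to GiB.
kib_to_gib() {
    awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib / (1024 * 1024) }'
}

kib_to_gib 124968256   # old raid size, prints 119.18
kib_to_gib 234372096   # new raid size, prints 223.51
```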

Comments (newest first)

Cristian wrote on Oct 16th, 2019:

works like a charm