January 12, 2020

ZFS drive replacement

ZFS

Drive failure imminent

Recently one of my hard drives started throwing errors:

Jun  4 22:51:48 system kernel: [182083.672985] ata7.00: exception Emask 0x0 SAct 0x60 SErr 0x0 action 0x0
Jun  4 22:51:48 system kernel: [182083.673073] ata7.00: irq_stat 0x40000008
Jun  4 22:51:48 system kernel: [182083.673123] ata7.00: failed command: READ FPDMA QUEUED
Jun  4 22:51:48 system kernel: [182083.673191] ata7.00: cmd 60/00:28:50:b0:96/01:00:01:00:00/40 tag 5 ncq 131072 in
Jun  4 22:51:48 system kernel: [182083.673191]          res 41/40:00:48:b1:96/00:00:01:00:00/40 Emask 0x409 (media error) <F>
Jun  4 22:51:48 system kernel: [182083.673361] ata7.00: status: { DRDY ERR }
Jun  4 22:51:48 system kernel: [182083.673409] ata7.00: error: { UNC }
Jun  4 22:51:48 system kernel: [182083.674899] ata7.00: configured for UDMA/133
Jun  4 22:51:48 system kernel: [182083.674924] sd 6:0:0:0: [sdc] tag#5 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jun  4 22:51:48 system kernel: [182083.674932] sd 6:0:0:0: [sdc] tag#5 Sense Key : Medium Error [current] [descriptor]
Jun  4 22:51:48 system kernel: [182083.674939] sd 6:0:0:0: [sdc] tag#5 Add.  Sense: Unrecovered read error - auto reallocate failed
Jun  4 22:51:48 system kernel: [182083.674947] sd 6:0:0:0: [sdc] tag#5 CDB: Read(16) 88 00 00 00 00 00 01 96 b0 50 00 00 01 00 00 00
Jun  4 22:51:48 system kernel: [182083.674951] blk_update_request: I/O error, dev sdc, sector 26653000
Jun  4 22:51:48 system kernel: [182083.675063] ata7: EH complete
Jun  4 22:51:50 system zed: eid=11 class=io pool=vpool

These types of errors started showing up during zpool scrub events.
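
A scrub can also be kicked off by hand to confirm the drive is still misbehaving. A minimal check, using the pool name vpool from the log above:

$ zpool scrub vpool
$ zpool status -v vpool   # shows per-device READ/WRITE/CKSUM error counters and any affected files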

Replacing a drive in a ZFS pool

Once the new drive arrived and was attached to the system, it was time to add it to an existing mirror. The steps are basically: (1) partition the drive, (2) encrypt the drive, (3) add the drive into the zpool, and (4) remove the failing drive from the zpool. ZFS can combine steps (3) and (4) via zpool replace, but I preferred to keep them separate so I could follow the process closely.
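
For reference, the combined form would look roughly like the following (device names match the ones used later in this post):

$ zpool replace vpool vault2_crypt /dev/mapper/vault5_crypt   # resilver onto the new device, then detach the old one automatically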

Use lsblk to determine the drive letter sd* for the new drive. Then look up the drive’s stable (persistent) identifier in /dev/disk/by-id. For example, to find the identifier for disk sda:

$ ls -l /dev/disk/by-id/ | grep sda
lrwxrwxrwx 1 root root  9 Jun  9 20:04 wwn-0x84221ea347353432 -> ../../sda

(Multiple identifiers for the same drive may be listed; I’ve arbitrarily chosen to use the one in the format above.)
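
The lsblk step can be as simple as the following; the column list is just one reasonable choice:

$ lsblk -o NAME,SIZE,MODEL,SERIAL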

Now partition the new drive using parted. In this case, I’ll be using vault5 as the partition name:

$ parted -a optimal /dev/disk/by-id/wwn-0x84221ea347353432
GNU Parted 3.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Error: /dev/sda: unrecognised disk label
Model: ATA WDC WD40EFRX-68N (scsi)
Disk /dev/sda: 4001GB
Sector size (logical/physical): 512B/4096B
Partition Table: unknown
Disk Flags:
(parted) mklabel gpt
(parted) unit MiB
(parted) mkpart vault5 1 -1
(parted) print
Model: ATA WDC WD40EFRX-68N (scsi)
Disk /dev/sda: 3815448MiB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start    End         Size        File system  Name    Flags
 1      1.00MiB  3815447MiB  3815446MiB               vault5

(parted) quit
Information: You may need to update /etc/fstab.
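
The same partitioning can also be done non-interactively; a rough one-liner equivalent (not the route taken above) would be:

$ parted -s -a optimal /dev/disk/by-id/wwn-0x84221ea347353432 \
  mklabel gpt mkpart vault5 1MiB 100%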

Encrypt the entire drive using LUKS:

$ cryptsetup luksFormat --cipher aes-xts-plain64 --hash sha512 --key-size 512 \
  --iter-time 5000 --use-random --verify-passphrase \
  /dev/disk/by-id/wwn-0x84221ea347353432-part1
$ cryptsetup luksOpen /dev/disk/by-id/wwn-0x84221ea347353432-part1 vault5_crypt

Optionally add a secondary passphrase:

$ cryptsetup luksAddKey --iter-time 5000 /dev/disk/by-id/wwn-0x84221ea347353432-part1
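
Before touching the pool, it’s worth confirming the new mapping is actually active:

$ cryptsetup status vault5_crypt
$ ls -l /dev/mapper/vault5_crypt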

Prior to adding the new drive, my pool layout looked like:

  NAME            
  vpool           
    mirror-0      
      vault1_crypt
      vault2_crypt
    mirror-1      
      vault3_crypt
      vault4_crypt
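
(The layout above is just the NAME column of the vdev tree; either of these would show it:)

$ zpool status vpool
$ zpool list -v vpool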

To add the new vault5 drive as a mirror of vault1 and vault2, I zpool attach it to one of the existing drives in that mirror. This grows the mirror from a 2-way mirror to a 3-way mirror:

$ zpool attach -o ashift=12 vpool vault2_crypt /dev/mapper/vault5_crypt
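
After the attach (and resilver), the layout would look roughly like this, with vault5_crypt joining mirror-0:

  vpool
    mirror-0
      vault1_crypt
      vault2_crypt
      vault5_crypt
    mirror-1
      vault3_crypt
      vault4_crypt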

While the resilvering process is underway, I might run watch zpool status -v to follow along.
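
For example (the 30-second interval is arbitrary):

$ watch -n 30 zpool status -v vpool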

Once I’m satisfied with the new drive in the mirror, I can remove one of the old drives with zpool detach. The mirror shrinks from a 3-way mirror back to a 2-way mirror:

$ zpool detach vpool vault2_crypt
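
With vault2_crypt detached, mirror-0 is back to two devices:

  vpool
    mirror-0
      vault1_crypt
      vault5_crypt
    mirror-1
      vault3_crypt
      vault4_crypt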

If replacing both drives in a 2-way mirror with larger ones, the additional capacity will not be immediately available after both new drives have been attached and the old drives detached from the pool. There are two options to expand the pool (both sketched below):

Option 1) Export and then re-import the pool.

Option 2) Run zpool online -e vpool <new_device> for each new device, e.g. zpool online -e vpool vault5_crypt.
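
In shell form, using the device name from this post:

# option 1: export the pool and re-import it
$ zpool export vpool
$ zpool import -d /dev/mapper vpool   # -d points the search at the LUKS mappings; it may not be needed in every setup

# option 2: expand each new device in place
$ zpool online -e vpool vault5_crypt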

References