From linux-btrfs Thu Sep 02 09:23:55 2021
From: Tomasz Chmielewski
Date: Thu, 02 Sep 2021 09:23:55 +0000
To: linux-btrfs
Subject: Re: how to replace a failed drive?
Message-Id: <15ee35137c360d54f0ee1f80579a7614 () wpkg ! org>
X-MARC-Message: https://marc.info/?l=linux-btrfs&m=163057463719814

On 2021-09-02 10:00, Andrei Borzenkov wrote:
> On 02.09.2021 10:45, Anand Jain wrote:
>> On 02/09/2021 06:07, Tomasz Chmielewski wrote:
>>> I'm trying to follow
>>> https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Replacing_failed_devices
>>> to replace a failed drive. But it seems to be written by a person
>>> who never attempted to replace a failed drive in btrfs filesystem,
>>> and who never used mdadm RAID (to see how good RAID experience
>>> should look like).
>>>
>>> What I have:
>>>
>>> - RAID-10 over 4 devices (/dev/sd[a-d]2)
>>> - 1 disk (/dev/sdb2) crashed and was no longer seen by the
>>>   operating system
>>> - it was replaced using hot-swapping - new drive registered itself
>>>   as /dev/sde
>>> - I've partitioned /dev/sde, so that /dev/sde2 matches the size of
>>>   other btrfs devices
>>> - because I couldn't remove the faulty device (it wouldn't go below
>>>   my current number of devices) I've added the new device to btrfs
>>>   filesystem:
>>>
>>
>>
>>> btrfs device add /dev/sde2 /data/lxd
>>
>>  Wiki is correct.
>>
>>  $ btrfs replace start 7 /dev/sdf1 /mnt
>>
>
> Where exactly user is supposed to find out the correct number of
> missing device? Because
> ...
>
>>>
>>> # btrfs filesystem show /data/lxd
>>> Label: 'lxd5'  uuid: 2b77b498-a644-430b-9dd9-2ad3d381448a
>>>          Total devices 5 FS bytes used 2.84TiB
>>>          devid    1 size 1.73TiB used 1.60TiB path /dev/sda2
>>>          devid    3 size 1.73TiB used 1.60TiB path /dev/sdd2
>>>          devid    4 size 1.73TiB used 1.60TiB path /dev/sdc2
>>>          devid    6 size 1.73TiB used 0.00B path /dev/sde2
>>>          *** Some devices missing
>>>
>
> It only shows existing devices. "Some devices missing" is not exactly
> helping. More useful would be "devid 7 missing".

Exactly this!

The fine documentation says:

Now replace the absent device with the new drive /dev/sdf1 on the
filesystem currently mounted on /mnt (since the device is absent, you
can use any devid number that isn't present; 2,5,7,9 would all work
the same):

sudo btrfs replace start 7 /dev/sdf1 /mnt

I saw devid 1, 3, 4 and 6 in my "btrfs filesystem show ..." output.

Pairing that with "you can use any devid number that isn't present"
from the documentation, I used "2", as it was a devid number which
wasn't present. This failed with an error:

btrfs replace start 2 /dev/sde2 /data/lxd

This did work:

btrfs replace start 5 /dev/sde2 /data/lxd

Highly confusing, and again, not what the documentation says.


Tomasz Chmielewski
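
To make the ambiguity described above concrete, here is a minimal sketch that reads the "btrfs filesystem show" text from this message and lists which devids are present and which lower numbers are absent. The function names (present_devids, absent_devids, devices_missing) are hypothetical helpers written for this illustration and are not part of btrfs-progs. The sketch only parses text; it cannot tell which absent number belongs to the missing device. In this thread both 2 and 5 were absent, yet only 5 was accepted by "btrfs replace start".

```python
import re

# Sample output copied from the message above. On a live system this text
# would come from running `btrfs filesystem show <mountpoint>`.
SHOW_OUTPUT = """\
Label: 'lxd5'  uuid: 2b77b498-a644-430b-9dd9-2ad3d381448a
         Total devices 5 FS bytes used 2.84TiB
         devid    1 size 1.73TiB used 1.60TiB path /dev/sda2
         devid    3 size 1.73TiB used 1.60TiB path /dev/sdd2
         devid    4 size 1.73TiB used 1.60TiB path /dev/sdc2
         devid    6 size 1.73TiB used 0.00B path /dev/sde2
         *** Some devices missing
"""


def present_devids(show_output):
    """Return the sorted devids that appear on 'devid N ...' lines."""
    matches = re.finditer(r"^\s*devid\s+(\d+)\s", show_output, re.MULTILINE)
    return sorted(int(m.group(1)) for m in matches)


def absent_devids(show_output):
    """Return devids below the highest listed one that are not listed.

    These are only candidates: a gap may belong to the missing device or
    to a device that was removed from the filesystem earlier.
    """
    present = present_devids(show_output)
    if not present:
        return []
    return [d for d in range(1, max(present)) if d not in present]


def devices_missing(show_output):
    """Return True if the output carries the 'Some devices missing' note."""
    return "Some devices missing" in show_output


if __name__ == "__main__":
    print("present devids:", present_devids(SHOW_OUTPUT))
    print("absent devids (candidates only):", absent_devids(SHOW_OUTPUT))
    print("reports missing devices:", devices_missing(SHOW_OUTPUT))
```

For the output quoted in this message the candidates are 2 and 5, which is exactly the guess the user was left to make by trial and error.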