
List:       linux-btrfs
Subject:    Re: how to replace a failed drive?
From:       Tomasz Chmielewski <mangoo () wpkg ! org>
Date:       2021-09-02 9:23:55
Message-ID: 15ee35137c360d54f0ee1f80579a7614 () wpkg ! org

On 2021-09-02 10:00, Andrei Borzenkov wrote:
> On 02.09.2021 10:45, Anand Jain wrote:
>> On 02/09/2021 06:07, Tomasz Chmielewski wrote:
>>> I'm trying to follow
>>> https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Replacing_failed_devices
>>> to replace a failed drive. But it seems to be written by a person who
>>> never attempted to replace a failed drive in btrfs filesystem, and who
>>> never used mdadm RAID (to see how good RAID experience should look like).
>>> 
>>> What I have:
>>> 
>>> - RAID-10 over 4 devices (/dev/sd[a-d]2)
>>> - 1 disk (/dev/sdb2) crashed and was no longer seen by the operating
>>> system
>>> - it was replaced using hot-swapping - new drive registered itself as
>>> /dev/sde
>>> - I've partitioned /dev/sde, so that /dev/sde2 matches the size of
>>> other btrfs devices
>>> - because I couldn't remove the faulty device (it wouldn't go below my
>>> current number of devices) I've added the new device to btrfs filesystem:
>>> 
>> 
>> 
>>> btrfs device add /dev/sde2 /data/lxd
>> 
>>  Wiki is correct.
>> 
>>  $ btrfs replace start 7 /dev/sdf1 /mnt
>> 
> 
> Where exactly user is supposed to find out the correct number of missing
> device? Because
> ...
> 
>>> 
>>> # btrfs filesystem show /data/lxd
>>> Label: 'lxd5'  uuid: 2b77b498-a644-430b-9dd9-2ad3d381448a
>>>          Total devices 5 FS bytes used 2.84TiB
>>>          devid    1 size 1.73TiB used 1.60TiB path /dev/sda2
>>>          devid    3 size 1.73TiB used 1.60TiB path /dev/sdd2
>>>          devid    4 size 1.73TiB used 1.60TiB path /dev/sdc2
>>>          devid    6 size 1.73TiB used 0.00B path /dev/sde2
>>>          *** Some devices missing
>>> 
> 
> It only shows existing devices. "Some devices missing" is not exactly
> helping. More useful would be "devid 7 missing".

Exactly this!

The fine documentation says:

    Now replace the absent device with the new drive /dev/sdf1 on the
    filesystem currently mounted on /mnt (since the device is absent, you can
    use any devid number that isn't present; 2,5,7,9 would all work the same):

      sudo btrfs replace start 7 /dev/sdf1 /mnt



I saw devid 1, 3, 4 and 6 in my "btrfs filesystem show ..." output.
Pairing that with "you can use any devid number that isn't present" from
the documentation, I used "2", as it was a devid number which wasn't
present.

That failed with an error:

btrfs replace start 2 /dev/sde2 /data/lxd


This did work:

btrfs replace start 5 /dev/sde2 /data/lxd



Highly confusing, and again, not what the documentation says.
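
To illustrate why the output is not enough to pick the right number, here is a rough Python sketch that parses the "btrfs filesystem show" output quoted above. The function names (present_devids, missing_count, absent_candidates) are made up for this example and are not part of btrfs-progs. It also assumes the missing devid is a gap below the highest devid shown, which need not hold in general (the wiki example uses 7).

```python
import re

# Sample copied from the "btrfs filesystem show /data/lxd" output quoted above.
SAMPLE = """\
Label: 'lxd5'  uuid: 2b77b498-a644-430b-9dd9-2ad3d381448a
         Total devices 5 FS bytes used 2.84TiB
         devid    1 size 1.73TiB used 1.60TiB path /dev/sda2
         devid    3 size 1.73TiB used 1.60TiB path /dev/sdd2
         devid    4 size 1.73TiB used 1.60TiB path /dev/sdc2
         devid    6 size 1.73TiB used 0.00B path /dev/sde2
         *** Some devices missing
"""


def present_devids(show_output):
    """Return the sorted devids that appear on 'devid N ...' lines."""
    found = re.findall(r"^\s*devid\s+(\d+)\s", show_output, re.MULTILINE)
    return sorted(int(d) for d in found)


def missing_count(show_output):
    """Return 'Total devices' minus the number of devid lines shown."""
    total = re.search(r"Total devices\s+(\d+)", show_output)
    if total is None:
        raise ValueError("no 'Total devices' line found")
    return int(total.group(1)) - len(present_devids(show_output))


def absent_candidates(show_output):
    """Return devids absent below the highest devid shown.

    Assumption for this sketch only: the missing devid is one of these
    gaps. The output itself does not say which one, if any.
    """
    present = present_devids(show_output)
    if not present:
        return []
    return [d for d in range(1, max(present) + 1) if d not in present]


if __name__ == "__main__":
    print("present devids:   ", present_devids(SAMPLE))
    print("devices missing:  ", missing_count(SAMPLE))
    print("absent candidates:", absent_candidates(SAMPLE))
```

With my output this reports one missing device but two absent devids, 2 and 5, and nothing in the output says which of the two is the right one. In my case only 5 was accepted.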


Tomasz Chmielewski