Message ID | 1570175403-4073-4-git-send-email-anand.jain@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: fix issues due to alien device | expand |
On 4.10.19 г. 10:50 ч., Anand Jain wrote: > btrfs_free_extra_devids() reorgs fs_devices::latest_bdev > to point to the bdev with greatest device::generation number. > For a typical-missing device the generation number is zero so > fs_devices::latest_bdev will never point to it. > > But if the missing device is due to alienating [1], then > device::generation is not-zero and if it is >= to rest of > device::generation in the list, then fs_devices::latest_bdev > ends up pointing to the missing device and reports the error > like this [2] > > [1] > mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs > mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc > sleep 3 # avoid racing with udev's useless scans if needed > btrfs dev add -f /dev/sdb /btrfs Hm, here I think the correct way is to refuse adding /dev/sdb to an existing fs if it's detected to be part of a different one. I.e it should require wipefs to be done. > > mount -o degraded /dev/sdc /btrfs1 > > [2] > mount: wrong fs type, bad option, bad superblock on /dev/sdc, > missing codepage or helper program, or other error > > In some cases useful info is found in syslog - try > dmesg | tail or so. > > kernel: BTRFS warning (device sdc): devid 1 uuid 072a0192-675b-4d5a-8640-a5cf2b2c704d is missing > kernel: BTRFS error (device sdc): failed to read devices > kernel: BTRFS error (device sdc): open_ctree failed > > Fix the root of the issue, by checking if the the device is not > missing before it can be a contender for the fs_devices::latest_bdev > title. > > Signed-off-by: Anand Jain <anand.jain@oracle.com> > --- > fs/btrfs/volumes.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 05ade8c7342b..6c42048ec099 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -1186,6 +1186,8 @@ void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices, int step) > &device->dev_state)) { > if (!test_bit(BTRFS_DEV_STATE_REPLACE_TGT, > &device->dev_state) && > + !test_bit(BTRFS_DEV_STATE_MISSING, > + &device->dev_state) && > (!latest_dev || > device->generation > latest_dev->generation)) { > latest_dev = device; >
On 04/10/2019 09:11, Nikolay Borisov wrote: > > > On 4.10.19 г. 10:50 ч., Anand Jain wrote: >> btrfs_free_extra_devids() reorgs fs_devices::latest_bdev >> to point to the bdev with greatest device::generation number. >> For a typical-missing device the generation number is zero so >> fs_devices::latest_bdev will never point to it. >> >> But if the missing device is due to alienating [1], then >> device::generation is not-zero and if it is >= to rest of >> device::generation in the list, then fs_devices::latest_bdev >> ends up pointing to the missing device and reports the error >> like this [2] >> >> [1] >> mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs >> mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc >> sleep 3 # avoid racing with udev's useless scans if needed >> btrfs dev add -f /dev/sdb /btrfs > > Hm, here I think the correct way is to refuse adding /dev/sdb to an > existing fs if it's detected to be part of a different one. I.e it > should require wipefs to be done. I disagree. -f means "force overwrite of existing filesystem on the given disk(s)". It shouldn't be any different whether the existing fs is btrfs or something else. Graham
On 4.10.19 г. 12:08 ч., Graham Cobb wrote: > On 04/10/2019 09:11, Nikolay Borisov wrote: >> >> >> On 4.10.19 г. 10:50 ч., Anand Jain wrote: >>> btrfs_free_extra_devids() reorgs fs_devices::latest_bdev >>> to point to the bdev with greatest device::generation number. >>> For a typical-missing device the generation number is zero so >>> fs_devices::latest_bdev will never point to it. >>> >>> But if the missing device is due to alienating [1], then >>> device::generation is not-zero and if it is >= to rest of >>> device::generation in the list, then fs_devices::latest_bdev >>> ends up pointing to the missing device and reports the error >>> like this [2] >>> >>> [1] >>> mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs >>> mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc >>> sleep 3 # avoid racing with udev's useless scans if needed >>> btrfs dev add -f /dev/sdb /btrfs >> >> Hm, here I think the correct way is to refuse adding /dev/sdb to an >> existing fs if it's detected to be part of a different one. I.e it >> should require wipefs to be done. > > I disagree. -f means "force overwrite of existing filesystem on the > given disk(s)". It shouldn't be any different whether the existing fs is > btrfs or something else. You are right, looking at btrfs device add implementation though it seems that wipefs is being done on the device to be added: cmd_device_add btrfs_prepare_device btrfs_wipe_existing_sb So if the device to be added is formatted and then it goes through btrfs_init_new_device which commits the transaction which in turns rewrites the sb then why is the error happening in the first place? > > Graham >
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 05ade8c7342b..6c42048ec099 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1186,6 +1186,8 @@ void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices, int step) &device->dev_state)) { if (!test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state) && + !test_bit(BTRFS_DEV_STATE_MISSING, + &device->dev_state) && (!latest_dev || device->generation > latest_dev->generation)) { latest_dev = device;
btrfs_free_extra_devids() reorgs fs_devices::latest_bdev to point to the bdev with greatest device::generation number. For a typical-missing device the generation number is zero so fs_devices::latest_bdev will never point to it. But if the missing device is due to alienating [1], then device::generation is not-zero and if it is >= to rest of device::generation in the list, then fs_devices::latest_bdev ends up pointing to the missing device and reports the error like this [2] [1] mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc sleep 3 # avoid racing with udev's useless scans if needed btrfs dev add -f /dev/sdb /btrfs mount -o degraded /dev/sdc /btrfs1 [2] mount: wrong fs type, bad option, bad superblock on /dev/sdc, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so. kernel: BTRFS warning (device sdc): devid 1 uuid 072a0192-675b-4d5a-8640-a5cf2b2c704d is missing kernel: BTRFS error (device sdc): failed to read devices kernel: BTRFS error (device sdc): open_ctree failed Fix the root of the issue, by checking if the the device is not missing before it can be a contender for the fs_devices::latest_bdev title. Signed-off-by: Anand Jain <anand.jain@oracle.com> --- fs/btrfs/volumes.c | 2 ++ 1 file changed, 2 insertions(+)