Message ID | 3eb88dc3914667123ebb0823bbab9e07b24cf099.1660299977.git.anand.jain@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fix issues during suspended replace operation | expand |
On Fri, Aug 12, 2022 at 11:56 AM Anand Jain <anand.jain@oracle.com> wrote: > > If the filesystem mounts with the replace-operation in a suspended state > and try to cancel the suspended replace-operation, we hit the assert. The > assert came from the commit fe97e2e173af ("btrfs: dev-replace: replace's > scrub must not be running in suspended state") that was actually not > required. So just remove it. > > $ mount /dev/sda5 /btrfs > > BTRFS info (device sda5): cannot continue dev_replace, tgtdev is missing > BTRFS info (device sda5): you may cancel the operation after 'mount -o degraded' > > $ mount -o degraded /dev/sda5 /btrfs <-- success. > > $ btrfs replace cancel /btrfs > > kernel: assertion failed: ret != -ENOTCONN, in fs/btrfs/dev-replace.c:1131 > kernel: ------------[ cut here ]------------ > kernel: kernel BUG at fs/btrfs/ctree.h:3750! > > After the patch: > > $ btrfs replace cancel /btrfs > > BTRFS info (device sda5): suspended dev_replace from /dev/sda5 (devid 1) to <missing disk> canceled Anand, can you please add a test case to fstests? This is a scenario with no coverage at all in fstests, therefore specially useful to have it there. Thanks. > > Signed-off-by: Anand Jain <anand.jain@oracle.com> > --- > fs/btrfs/dev-replace.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c > index 488f2105c5d0..9d46a702bc11 100644 > --- a/fs/btrfs/dev-replace.c > +++ b/fs/btrfs/dev-replace.c > @@ -1124,8 +1124,7 @@ int btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info) > up_write(&dev_replace->rwsem); > > /* Scrub for replace must not be running in suspended state */ > - ret = btrfs_scrub_cancel(fs_info); > - ASSERT(ret != -ENOTCONN); > + btrfs_scrub_cancel(fs_info); > > trans = btrfs_start_transaction(root, 0); > if (IS_ERR(trans)) { > -- > 2.33.1 >
On 22/09/2022 18:00, Filipe Manana wrote: > On Fri, Aug 12, 2022 at 11:56 AM Anand Jain <anand.jain@oracle.com> wrote: >> >> If the filesystem mounts with the replace-operation in a suspended state >> and try to cancel the suspended replace-operation, we hit the assert. The >> assert came from the commit fe97e2e173af ("btrfs: dev-replace: replace's >> scrub must not be running in suspended state") that was actually not >> required. So just remove it. >> >> $ mount /dev/sda5 /btrfs >> >> BTRFS info (device sda5): cannot continue dev_replace, tgtdev is missing >> BTRFS info (device sda5): you may cancel the operation after 'mount -o degraded' >> >> $ mount -o degraded /dev/sda5 /btrfs <-- success. >> >> $ btrfs replace cancel /btrfs >> >> kernel: assertion failed: ret != -ENOTCONN, in fs/btrfs/dev-replace.c:1131 >> kernel: ------------[ cut here ]------------ >> kernel: kernel BUG at fs/btrfs/ctree.h:3750! >> >> After the patch: >> >> $ btrfs replace cancel /btrfs >> >> BTRFS info (device sda5): suspended dev_replace from /dev/sda5 (devid 1) to <missing disk> canceled > > Anand, can you please add a test case to fstests? > This is a scenario with no coverage at all in fstests, therefore > specially useful to have it there. > I thought about it before and found that unless we implement the replace-pause sub-command, we can't get a pending replace item in an unmounted btrfs using a script. However, to test it manually, I did an abrupt reboot (or power-off, I think). Thanks. > Thanks. > >> >> Signed-off-by: Anand Jain <anand.jain@oracle.com> >> --- >> fs/btrfs/dev-replace.c | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c >> index 488f2105c5d0..9d46a702bc11 100644 >> --- a/fs/btrfs/dev-replace.c >> +++ b/fs/btrfs/dev-replace.c >> @@ -1124,8 +1124,7 @@ int btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info) >> up_write(&dev_replace->rwsem); >> >> /* Scrub for replace must not be running in suspended state */ >> - ret = btrfs_scrub_cancel(fs_info); >> - ASSERT(ret != -ENOTCONN); >> + btrfs_scrub_cancel(fs_info); >> >> trans = btrfs_start_transaction(root, 0); >> if (IS_ERR(trans)) { >> -- >> 2.33.1 >> > >
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c index 488f2105c5d0..9d46a702bc11 100644 --- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -1124,8 +1124,7 @@ int btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info) up_write(&dev_replace->rwsem); /* Scrub for replace must not be running in suspended state */ - ret = btrfs_scrub_cancel(fs_info); - ASSERT(ret != -ENOTCONN); + btrfs_scrub_cancel(fs_info); trans = btrfs_start_transaction(root, 0); if (IS_ERR(trans)) {
If the filesystem mounts with the replace-operation in a suspended state and try to cancel the suspended replace-operation, we hit the assert. The assert came from the commit fe97e2e173af ("btrfs: dev-replace: replace's scrub must not be running in suspended state") that was actually not required. So just remove it. $ mount /dev/sda5 /btrfs BTRFS info (device sda5): cannot continue dev_replace, tgtdev is missing BTRFS info (device sda5): you may cancel the operation after 'mount -o degraded' $ mount -o degraded /dev/sda5 /btrfs <-- success. $ btrfs replace cancel /btrfs kernel: assertion failed: ret != -ENOTCONN, in fs/btrfs/dev-replace.c:1131 kernel: ------------[ cut here ]------------ kernel: kernel BUG at fs/btrfs/ctree.h:3750! After the patch: $ btrfs replace cancel /btrfs BTRFS info (device sda5): suspended dev_replace from /dev/sda5 (devid 1) to <missing disk> canceled Signed-off-by: Anand Jain <anand.jain@oracle.com> --- fs/btrfs/dev-replace.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)