Message ID | 20211027135334.19880-1-nborisov@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: Don't set balance as a running exclusive op in case of skip_balance | expand |
On 27/10/2021 21:53, Nikolay Borisov wrote: > Currently balance is set as a running exclusive op in > btrfs_recover_balance in case of remount and a paused balance. That's a > bit eager because btrfs_recover_balance executes always and is not > affected by the 'skip_balance' mount option. This can lead to cases in > which a user has mounted the filesystem with 'skip_balance' due to > running out of space yet is unable to add a device to rectify the ENOSPC > because balance is set as a running exclusive op. > > Fix this by setting balance in btrfs_resume_balance_async which takes > into consideration whether 'skip_balance' has been added or not. > Hmm, no. I roughly remember it was purposefully to avoid replacing intervened with the half-completed/paused balance. As below. Not sure if it is ok now? Before patch: Can't replace with a paused balance. ------------ $ btrfs bal start --full-balance /btrfs balance paused by user $ btrfs rep start /dev/vg/scratch1 /dev/vg/scratch0 /btrfs ERROR: unable to start device replace, another exclusive operation 'balance' in progress $ mount -o remount,ro /dev/vg/scratch1 /btrfs $ mount -o remount,rw,skip_balance /dev/vg/scratch1 /btrfs $ btrfs rep start /dev/vg/scratch1 /dev/vg/scratch0 /btrfs ERROR: unable to start device replace, another exclusive operation 'balance' in progress $ umount /btrfs $ mount -o skip_balance /dev/vg/scratch1 /btrfs $ btrfs rep start /dev/vg/scratch1 /dev/vg/scratch0 /btrfs ERROR: unable to start device replace, another exclusive operation 'balance' in progress ------------ After patch: Replace is successful even if there is a paused balance. ------------ $ mount -o skip_balance /dev/vg/scratch1 /btrfs [ 1367.567379] BTRFS info (device dm-1): balance: resume skipped $ btrfs rep start /dev/vg/scratch1 /dev/vg/scratch0 /btrfs <---- [ 1391.318192] BTRFS info (device dm-1): dev_replace from /dev/mapper/vg-scratch1 (devid 1) to /dev/mapper/vg-scratch0 started [ 1408.243475] BTRFS info (device dm-1): dev_replace from /dev/mapper/vg-scratch1 (devid 1) to /dev/mapper/vg-scratch0 finished $ btrfs bal resume /btrfs Done, had to relocate 5 out of 12 chunks ------------ Thanks, Anand > Fixes: > Signed-off-by: Nikolay Borisov <nborisov@suse.com> > --- > > I'm inclined to put a Fixes: ed0fb78fb6aa ("Btrfs: bring back balance pause/resume logic") > tag since this is the commit that moved the exclusive op tracking of > resume_balance_async to btrfs_recover_balance. However that's far too back in the > past and given the commit message of that commit I wonder if doing this > re-arrangement in older kernels is correct. David what's your take on this, > perhahps just a stable tag would be sufficient? > > > fs/btrfs/volumes.c | 28 ++++++++++++++-------------- > 1 file changed, 14 insertions(+), 14 deletions(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 546bf1146b2d..bff52fa1b255 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -4432,6 +4432,20 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info) > return 0; > } > > + /* > + * This should never happen, as the paused balance state is recovered > + * during mount without any chance of other exclusive ops to collide. > + * > + * This gives the exclusive op status to balance and keeps in paused > + * state until user intervention (cancel or umount). If the ownership > + * cannot be assigned, show a message but do not fail. The balance > + * is in a paused state and must have fs_info::balance_ctl properly > + * set up. > + */ > + if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE)) > + btrfs_warn(fs_info, > + "balance: cannot set exclusive op status, resume manually"); > + > /* > * A ro->rw remount sequence should continue with the paused balance > * regardless of who pauses it, system or the user as of now, so set > @@ -4490,20 +4504,6 @@ int btrfs_recover_balance(struct btrfs_fs_info *fs_info) > btrfs_balance_sys(leaf, item, &disk_bargs); > btrfs_disk_balance_args_to_cpu(&bctl->sys, &disk_bargs); > > - /* > - * This should never happen, as the paused balance state is recovered > - * during mount without any chance of other exclusive ops to collide. > - * > - * This gives the exclusive op status to balance and keeps in paused > - * state until user intervention (cancel or umount). If the ownership > - * cannot be assigned, show a message but do not fail. The balance > - * is in a paused state and must have fs_info::balance_ctl properly > - * set up. > - */ > - if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE)) > - btrfs_warn(fs_info, > - "balance: cannot set exclusive op status, resume manually"); > - > btrfs_release_path(path); > > mutex_lock(&fs_info->balance_mutex); >
On 28.10.21 г. 10:57, Anand Jain wrote: > On 27/10/2021 21:53, Nikolay Borisov wrote: >> Currently balance is set as a running exclusive op in >> btrfs_recover_balance in case of remount and a paused balance. That's a >> bit eager because btrfs_recover_balance executes always and is not >> affected by the 'skip_balance' mount option. This can lead to cases in >> which a user has mounted the filesystem with 'skip_balance' due to >> running out of space yet is unable to add a device to rectify the ENOSPC >> because balance is set as a running exclusive op. >> >> Fix this by setting balance in btrfs_resume_balance_async which takes >> into consideration whether 'skip_balance' has been added or not. >> > > Hmm, no. I roughly remember it was purposefully to avoid replacing > intervened with the half-completed/paused balance. As below. > Not sure if it is ok now? > > Before patch: Can't replace with a paused balance. > > ------------ > $ btrfs bal start --full-balance /btrfs > balance paused by user > > $ btrfs rep start /dev/vg/scratch1 /dev/vg/scratch0 /btrfs > ERROR: unable to start device replace, another exclusive operation > 'balance' in progress > > $ mount -o remount,ro /dev/vg/scratch1 /btrfs > $ mount -o remount,rw,skip_balance /dev/vg/scratch1 /btrfs > > $ btrfs rep start /dev/vg/scratch1 /dev/vg/scratch0 /btrfs > ERROR: unable to start device replace, another exclusive operation > 'balance' in progress > > $ umount /btrfs > $ mount -o skip_balance /dev/vg/scratch1 /btrfs > > $ btrfs rep start /dev/vg/scratch1 /dev/vg/scratch0 /btrfs > ERROR: unable to start device replace, another exclusive operation > 'balance' in progress > ------------ > > > After patch: Replace is successful even if there is a paused balance. That's a fair point, so the patch needs to be augmented such that only ADD is allowed but other exclusive ops are not. I will look into it. <snip>
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 546bf1146b2d..bff52fa1b255 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -4432,6 +4432,20 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info) return 0; } + /* + * This should never happen, as the paused balance state is recovered + * during mount without any chance of other exclusive ops to collide. + * + * This gives the exclusive op status to balance and keeps in paused + * state until user intervention (cancel or umount). If the ownership + * cannot be assigned, show a message but do not fail. The balance + * is in a paused state and must have fs_info::balance_ctl properly + * set up. + */ + if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE)) + btrfs_warn(fs_info, + "balance: cannot set exclusive op status, resume manually"); + /* * A ro->rw remount sequence should continue with the paused balance * regardless of who pauses it, system or the user as of now, so set @@ -4490,20 +4504,6 @@ int btrfs_recover_balance(struct btrfs_fs_info *fs_info) btrfs_balance_sys(leaf, item, &disk_bargs); btrfs_disk_balance_args_to_cpu(&bctl->sys, &disk_bargs); - /* - * This should never happen, as the paused balance state is recovered - * during mount without any chance of other exclusive ops to collide. - * - * This gives the exclusive op status to balance and keeps in paused - * state until user intervention (cancel or umount). If the ownership - * cannot be assigned, show a message but do not fail. The balance - * is in a paused state and must have fs_info::balance_ctl properly - * set up. - */ - if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE)) - btrfs_warn(fs_info, - "balance: cannot set exclusive op status, resume manually"); - btrfs_release_path(path); mutex_lock(&fs_info->balance_mutex);
Currently balance is set as a running exclusive op in btrfs_recover_balance in case of remount and a paused balance. That's a bit eager because btrfs_recover_balance executes always and is not affected by the 'skip_balance' mount option. This can lead to cases in which a user has mounted the filesystem with 'skip_balance' due to running out of space yet is unable to add a device to rectify the ENOSPC because balance is set as a running exclusive op. Fix this by setting balance in btrfs_resume_balance_async which takes into consideration whether 'skip_balance' has been added or not. Fixes: Signed-off-by: Nikolay Borisov <nborisov@suse.com> --- I'm inclined to put a Fixes: ed0fb78fb6aa ("Btrfs: bring back balance pause/resume logic") tag since this is the commit that moved the exclusive op tracking of resume_balance_async to btrfs_recover_balance. However that's far too back in the past and given the commit message of that commit I wonder if doing this re-arrangement in older kernels is correct. David what's your take on this, perhahps just a stable tag would be sufficient? fs/btrfs/volumes.c | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-)