diff mbox series

btrfs: skip reclaim if block_group is empty

Message ID 8f825fce9d2968034da43e09a4ebc38ec19a2e49.1665427766.git.boris@bur.io (mailing list archive)
State New, archived
Headers show
Series btrfs: skip reclaim if block_group is empty | expand

Commit Message

Boris Burkov Oct. 10, 2022, 6:49 p.m. UTC
As we delete extents from a block group, at some deletion we cross below
the reclaim threshold. It is possible we are still in the middle of
deleting more extents and might soon hit 0. If that occurs, we would
leave the block group on the reclaim list, not in the care of unused
deletion or async discard.

It is pointless and wasteful to relocate empty block groups, so if we do
notice that case (we might not if the reclaim worker runs *before* we
finish emptying it), don't bother with relocating the block group.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/block-group.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Filipe Manana Oct. 11, 2022, 9:43 a.m. UTC | #1
On Mon, Oct 10, 2022 at 8:25 PM Boris Burkov <boris@bur.io> wrote:
>
> As we delete extents from a block group, at some deletion we cross below
> the reclaim threshold. It is possible we are still in the middle of
> deleting more extents and might soon hit 0. If that occurs, we would
> leave the block group on the reclaim list, not in the care of unused
> deletion or async discard.
>
> It is pointless and wasteful to relocate empty block groups, so if we do

Hum? Why pointless and wasteful?
Relocating an empty block group results in deleting it.

In fact, before we tracked unused block groups and had the cleaner
kthread remove them, that was the only
way to delete unused block groups - trigger relocation from user space.

btrfs_relocate_chunk() explicitly calls btrfs_remove_chunk() at the
end, and the relocation itself
will do nothing except:

1) commit the current transaction when it starts
2) search the extent tree for extents in this block group - here it
will not find anything, and therefore do nothing.
3) commit another transaction

So I don't quite understand what this patch is trying to accomplish.

At the very least the changelog needs to be more detailed.

As it is, it gives the wrong idea that the relocation will leave the
block group around.
If your goal is to avoid the 2 transaction commits and the search on
the extent tree, then please be explicit in
the changelog.

Thanks.

> notice that case (we might not if the reclaim worker runs *before* we
> finish emptying it), don't bother with relocating the block group.
>
> Signed-off-by: Boris Burkov <boris@bur.io>
> ---
>  fs/btrfs/block-group.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 11fd52657b76..c3ea627d2457 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -1608,6 +1608,25 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
>                         up_write(&space_info->groups_sem);
>                         goto next;
>                 }
> +               if (bg->used == 0) {
> +                       /*
> +                        * It is possible that we trigger relocation on a block
> +                        * group as its extents are deleted and it first goes
> +                        * below the threshold, then shortly goes empty. In that
> +                        * case, we will do relocation, even though we could
> +                        * more cheaply just delete the unused block group. Try
> +                        * to catch that case here, though of course it is
> +                        * possible there is a delete still coming the future,
> +                        * so we can't avoid needless relocation of this sort
> +                        * altogether. We can at least avoid relocating empty
> +                        * block groups.
> +                        */
> +                       if (!btrfs_test_opt(fs_info, DISCARD_ASYNC))
> +                               btrfs_mark_bg_unused(bg);
> +                       spin_unlock(&bg->lock);
> +                       up_write(&space_info->groups_sem);
> +                       goto next;
> +               }
>                 spin_unlock(&bg->lock);
>
>                 /* Get out fast, in case we're unmounting the filesystem */
> --
> 2.37.2
>
Boris Burkov Oct. 11, 2022, 6:05 p.m. UTC | #2
On Tue, Oct 11, 2022 at 10:43:33AM +0100, Filipe Manana wrote:
> On Mon, Oct 10, 2022 at 8:25 PM Boris Burkov <boris@bur.io> wrote:
> >
> > As we delete extents from a block group, at some deletion we cross below
> > the reclaim threshold. It is possible we are still in the middle of
> > deleting more extents and might soon hit 0. If that occurs, we would
> > leave the block group on the reclaim list, not in the care of unused
> > deletion or async discard.
> >
> > It is pointless and wasteful to relocate empty block groups, so if we do
> 
> Hum? Why pointless and wasteful?
> Relocating an empty block group results in deleting it.
> 
> In fact, before we tracked unused block groups and had the cleaner
> kthread remove them, that was the only
> way to delete unused block groups - trigger relocation from user space.
> 
> btrfs_relocate_chunk() explicitly calls btrfs_remove_chunk() at the
> end, and the relocation itself
> will do nothing except:
> 
> 1) commit the current transaction when it starts
> 2) search the extent tree for extents in this block group - here it
> will not find anything, and therefore do nothing.
> 3) commit another transaction
> 
> So I don't quite understand what this patch is trying to accomplish.
> 
> At the very least the changelog needs to be more detailed.
> 
> As it is, it gives the wrong idea that the relocation will leave the
> block group around.
> If your goal is to avoid the 2 transaction commits and the search on
> the extent tree, then please be explicit in
> the changelog.
> 
> Thanks.

Apologies for the exuberant and unclear language. I'm happy to tone it
down and make it more precise. Thanks for the feedback.

We see a lot of these empty reclaims in practice and I thought this
could be a helpful little cleanup, nothing major.

My aim was essentially what you said. In my mind, delete unused is less
complicated and has less overhead than an empty relocation. I agree
that the empty relocation does accomplish the task of getting rid of the
bg, but I figure if we have a more direct way of accomplishing it
already, we should go with that. To be completely clear, I was also 
thinking about creating the relocation inode as un-needed overhead.
For what it's worth, I have not benchmarked btrfs_delete_unused_bgs
against relocating empty bgs.

Thanks,
Boris

> 
> > notice that case (we might not if the reclaim worker runs *before* we
> > finish emptying it), don't bother with relocating the block group.
> >
> > Signed-off-by: Boris Burkov <boris@bur.io>
> > ---
> >  fs/btrfs/block-group.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> > index 11fd52657b76..c3ea627d2457 100644
> > --- a/fs/btrfs/block-group.c
> > +++ b/fs/btrfs/block-group.c
> > @@ -1608,6 +1608,25 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
> >                         up_write(&space_info->groups_sem);
> >                         goto next;
> >                 }
> > +               if (bg->used == 0) {
> > +                       /*
> > +                        * It is possible that we trigger relocation on a block
> > +                        * group as its extents are deleted and it first goes
> > +                        * below the threshold, then shortly goes empty. In that
> > +                        * case, we will do relocation, even though we could
> > +                        * more cheaply just delete the unused block group. Try
> > +                        * to catch that case here, though of course it is
> > +                        * possible there is a delete still coming the future,
> > +                        * so we can't avoid needless relocation of this sort
> > +                        * altogether. We can at least avoid relocating empty
> > +                        * block groups.
> > +                        */
> > +                       if (!btrfs_test_opt(fs_info, DISCARD_ASYNC))
> > +                               btrfs_mark_bg_unused(bg);
> > +                       spin_unlock(&bg->lock);
> > +                       up_write(&space_info->groups_sem);
> > +                       goto next;
> > +               }
> >                 spin_unlock(&bg->lock);
> >
> >                 /* Get out fast, in case we're unmounting the filesystem */
> > --
> > 2.37.2
> >
Filipe Manana Oct. 12, 2022, 9:51 a.m. UTC | #3
On Tue, Oct 11, 2022 at 11:05:15AM -0700, Boris Burkov wrote:
> On Tue, Oct 11, 2022 at 10:43:33AM +0100, Filipe Manana wrote:
> > On Mon, Oct 10, 2022 at 8:25 PM Boris Burkov <boris@bur.io> wrote:
> > >
> > > As we delete extents from a block group, at some deletion we cross below
> > > the reclaim threshold. It is possible we are still in the middle of
> > > deleting more extents and might soon hit 0. If that occurs, we would
> > > leave the block group on the reclaim list, not in the care of unused
> > > deletion or async discard.
> > >
> > > It is pointless and wasteful to relocate empty block groups, so if we do
> > 
> > Hum? Why pointless and wasteful?
> > Relocating an empty block group results in deleting it.
> > 
> > In fact, before we tracked unused block groups and had the cleaner
> > kthread remove them, that was the only
> > way to delete unused block groups - trigger relocation from user space.
> > 
> > btrfs_relocate_chunk() explicitly calls btrfs_remove_chunk() at the
> > end, and the relocation itself
> > will do nothing except:
> > 
> > 1) commit the current transaction when it starts
> > 2) search the extent tree for extents in this block group - here it
> > will not find anything, and therefore do nothing.
> > 3) commit another transaction
> > 
> > So I don't quite understand what this patch is trying to accomplish.
> > 
> > At the very least the changelog needs to be more detailed.
> > 
> > As it is, it gives the wrong idea that the relocation will leave the
> > block group around.
> > If your goal is to avoid the 2 transaction commits and the search on
> > the extent tree, then please be explicit in
> > the changelog.
> > 
> > Thanks.
> 
> Apologies for the exuberant and unclear language. I'm happy to tone it
> down and make it more precise. Thanks for the feedback.
> 
> We see a lot of these empty reclaims in practice and I thought this
> could be a helpful little cleanup, nothing major.
> 
> My aim was essentially what you said. In my mind, delete unused is less
> complicated and has less overhead than an empty relocation. I agree
> that the empty relocation does accomplish the task of getting rid of the
> bg, but I figure if we have a more direct way of accomplishing it
> already, we should go with that. To be completely clear, I was also 
> thinking about creating the relocation inode as un-needed overhead.
> For what it's worth, I have not benchmarked btrfs_delete_unused_bgs
> against relocating empty bgs.

Ok. It was not clear to me if you were experiencing empty block groups not
being deleted and/or some significant performance problem.

The idea looks fine to me.

I only think we should have the changelog more explicit and detailed.

I.e., mention that although the block group gets deleted by relecation,
it's less efficient due to those transaction commits, extent tree search
(plus anything else I may have missed).

Plus that through the cleaner kthread (btrfs_delete_unused_bgs()) we also
do other important things like dealing with trim/discard and zone finishing,
and that's why we defer the task to the cleaner instead of calling
btrfs_remove_chunk() directly from the block group reclaim worker.

Thanks.

> 
> Thanks,
> Boris
> 
> > 
> > > notice that case (we might not if the reclaim worker runs *before* we
> > > finish emptying it), don't bother with relocating the block group.
> > >
> > > Signed-off-by: Boris Burkov <boris@bur.io>
> > > ---
> > >  fs/btrfs/block-group.c | 19 +++++++++++++++++++
> > >  1 file changed, 19 insertions(+)
> > >
> > > diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> > > index 11fd52657b76..c3ea627d2457 100644
> > > --- a/fs/btrfs/block-group.c
> > > +++ b/fs/btrfs/block-group.c
> > > @@ -1608,6 +1608,25 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
> > >                         up_write(&space_info->groups_sem);
> > >                         goto next;
> > >                 }
> > > +               if (bg->used == 0) {
> > > +                       /*
> > > +                        * It is possible that we trigger relocation on a block
> > > +                        * group as its extents are deleted and it first goes
> > > +                        * below the threshold, then shortly goes empty. In that
> > > +                        * case, we will do relocation, even though we could
> > > +                        * more cheaply just delete the unused block group. Try
> > > +                        * to catch that case here, though of course it is
> > > +                        * possible there is a delete still coming the future,
> > > +                        * so we can't avoid needless relocation of this sort
> > > +                        * altogether. We can at least avoid relocating empty
> > > +                        * block groups.
> > > +                        */
> > > +                       if (!btrfs_test_opt(fs_info, DISCARD_ASYNC))
> > > +                               btrfs_mark_bg_unused(bg);
> > > +                       spin_unlock(&bg->lock);
> > > +                       up_write(&space_info->groups_sem);
> > > +                       goto next;
> > > +               }
> > >                 spin_unlock(&bg->lock);
> > >
> > >                 /* Get out fast, in case we're unmounting the filesystem */
> > > --
> > > 2.37.2
> > >
diff mbox series

Patch

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 11fd52657b76..c3ea627d2457 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1608,6 +1608,25 @@  void btrfs_reclaim_bgs_work(struct work_struct *work)
 			up_write(&space_info->groups_sem);
 			goto next;
 		}
+		if (bg->used == 0) {
+			/*
+			 * It is possible that we trigger relocation on a block
+			 * group as its extents are deleted and it first goes
+			 * below the threshold, then shortly goes empty. In that
+			 * case, we will do relocation, even though we could
+			 * more cheaply just delete the unused block group. Try
+			 * to catch that case here, though of course it is
+			 * possible there is a delete still coming the future,
+			 * so we can't avoid needless relocation of this sort
+			 * altogether. We can at least avoid relocating empty
+			 * block groups.
+			 */
+			if (!btrfs_test_opt(fs_info, DISCARD_ASYNC))
+				btrfs_mark_bg_unused(bg);
+			spin_unlock(&bg->lock);
+			up_write(&space_info->groups_sem);
+			goto next;
+		}
 		spin_unlock(&bg->lock);
 
 		/* Get out fast, in case we're unmounting the filesystem */