btrfs: qgroup: Don't trace subtree if we're dropping tree reloc tree

Message ID 20180905050339.5356-1-wqu@suse.com (mailing list archive)
State New, archived
Series btrfs: qgroup: Don't trace subtree if we're dropping tree reloc tree

Commit Message

Qu Wenruo Sept. 5, 2018, 5:03 a.m. UTC
Tree reloc tree doesn't contribute to qgroup numbers, as we have
accounted them at balance time (check replace_path()).

Skipping such unneeded subtree tracing should reduce some performance
overhead.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent-tree.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

David Sterba Sept. 5, 2018, 1:11 p.m. UTC | #1
On Wed, Sep 05, 2018 at 01:03:39PM +0800, Qu Wenruo wrote:
> Tree reloc tree doesn't contribute to qgroup numbers, as we have

I think you can call it just 'reloc tree', I'm fixing that in all
changelogs and comments anyway.

> accounted them at balance time (check replace_path()).
> 
> Skip such unneeded subtree trace should reduce some performance
> overhead.

Please provide some numbers or description of the improvement. There are
several performance problems caused by qgroups so it would be good to
get a better idea how much this patch is going to help. Thanks.
Qu Wenruo Sept. 6, 2018, 1:41 a.m. UTC | #2
On 2018/9/5 9:11 PM, David Sterba wrote:
> On Wed, Sep 05, 2018 at 01:03:39PM +0800, Qu Wenruo wrote:
>> Tree reloc tree doesn't contribute to qgroup numbers, as we have
> 
> I think you can call it just 'reloc tree', I'm fixing that in all
> changelogs and comments anyway.

But there is another tree called the data reloc tree.
That's why I'm sticking to tree reloc tree, to distinguish it from the
data reloc tree.
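
For reference, these are the two object ids involved (quoting the
defines from memory, so please double check the uapi header):

/* include/uapi/linux/btrfs_tree.h (values quoted from memory) */
#define BTRFS_TREE_RELOC_OBJECTID	-8ULL	/* per-subvolume "tree reloc tree" */
#define BTRFS_DATA_RELOC_TREE_OBJECTID	-9ULL	/* the single "data reloc tree" */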

> 
>> accounted them at balance time (check replace_path()).
>>
>> Skip such unneeded subtree trace should reduce some performance
>> overhead.
> 
> Please provide some numbers or description of the improvement. There are
> several performance problems caused by qgroups so it would be good to
> get a better idea how much this patch is going to help. Thanks.

That's the problem.
In my internal test, with 3000+ tree blocks, metadata balance could
save about 1~2%.
But according to dump-tree, the tree layout is almost the worst case
scenario: just one metadata block group owns all the tree blocks.

To get a real world scenario, I would need a filesystem of hundreds of
GB or even several TB, populated with a good amount of inline files and
enough CoW to fragment the metadata usage.
I don't have that much free space, though.
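
Just to make it concrete, a hypothetical sketch of what such a
population run could look like (the directory, file count and sizes are
made-up placeholders; it assumes the default max_inline of 2048 bytes,
so the 1 KiB writes stay inline):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define NR_FILES 100000

int main(int argc, char **argv)
{
	const char *dir = argc > 1 ? argv[1] : "/mnt/btrfs/populate";
	char path[4096];
	char buf[1024];		/* below the default max_inline of 2048 bytes */
	int fd, i;

	memset(buf, 'a', sizeof(buf));

	/* Pass 1: create lots of small files that become inline extents. */
	for (i = 0; i < NR_FILES; i++) {
		snprintf(path, sizeof(path), "%s/file.%d", dir, i);
		fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
		if (fd < 0) {
			perror("open");
			return 1;
		}
		if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
			perror("write");
			close(fd);
			return 1;
		}
		close(fd);
	}

	/* Commit, so the rewrites below CoW already committed tree blocks. */
	sync();

	/* Pass 2: rewrite a quarter of the files to fragment the metadata. */
	for (i = 0; i < NR_FILES; i += 4) {
		snprintf(path, sizeof(path), "%s/file.%d", dir, i);
		fd = open(path, O_WRONLY | O_TRUNC);
		if (fd < 0)
			continue;
		write(fd, buf, sizeof(buf) / 2);
		close(fd);
	}
	return 0;
}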

To anyone who is still struggling with balance + quota: any test data is
appreciated.

Thanks,
Qu
David Sterba Sept. 6, 2018, 11:15 a.m. UTC | #3
On Thu, Sep 06, 2018 at 09:41:14AM +0800, Qu Wenruo wrote:
> 
> 
> On 2018/9/5 9:11 PM, David Sterba wrote:
> > On Wed, Sep 05, 2018 at 01:03:39PM +0800, Qu Wenruo wrote:
> >> Tree reloc tree doesn't contribute to qgroup numbers, as we have
> > 
> > I think you can call it just 'reloc tree', I'm fixing that in all
> > changelogs and comments anyway.
> 
> But there is another tree called data reloc tree.
> That why I'm sticking to tree reloc tree to distinguish from data reloc
> tree.

So call them 'reloc tree' and 'data reloc tree'. The naming of
BTRFS_TREE_RELOC_OBJECTID does not follow the other tree naming and I
don't know if the intention was to name it after the 'tree log' scheme
or the 'extent tree', or if there was an intention behind the naming at
all.

If this is going to bother us too much, I won't mind renaming it to
BTRFS_RELOC_TREE_OBJECTID everywhere. For consistency.

> >> accounted them at balance time (check replace_path()).
> >>
> >> Skip such unneeded subtree trace should reduce some performance
> >> overhead.
> > 
> > Please provide some numbers or description of the improvement. There are
> > several performance problems caused by qgroups so it would be good to
> > get a better idea how much this patch is going to help. Thanks.
> 
> That's the problem.
> For my internal test, with 3000+ tree blocks, metadata balance could
> save about 1~2%.
> But according to dump-tree, the tree layout is almost the worst case
> scenario, just one metadata block group owns all the tree blocks.

If you do such a test, it's also a good example for the changelog. It
describes the worst case, and this information can be used to prepare a
testing environment on large data samples.
Qu Wenruo Sept. 6, 2018, 11:20 a.m. UTC | #4
On 2018/9/6 7:15 PM, David Sterba wrote:
> On Thu, Sep 06, 2018 at 09:41:14AM +0800, Qu Wenruo wrote:
>>
>>
>> On 2018/9/5 9:11 PM, David Sterba wrote:
>>> On Wed, Sep 05, 2018 at 01:03:39PM +0800, Qu Wenruo wrote:
>>>> Tree reloc tree doesn't contribute to qgroup numbers, as we have
>>>
>>> I think you can call it just 'reloc tree', I'm fixing that in all
>>> changelogs and comments anyway.
>>
>> But there is another tree called data reloc tree.
>> That why I'm sticking to tree reloc tree to distinguish from data reloc
>> tree.
> 
> So call it 'reloc tree' and 'data reloc tree', the naming of
> BTRFS_TREE_RELOC_OBJECTID does not follow the other tree naming and I
> don't know if the inention was to name it after the 'tree log' scheme or
> the 'extent tree'. Or if there was an intention behind the naming at
> all.
> 
> If this is going to bother us too much, I won't mind renaming it to
> BTRFS_RELOC_TREE_OBJECTID everywhere. For consistency.
> 

I'll go with 'reloc tree' for later reference.

Renaming makes sense, although I'd like to change its name to something
more distinctive (without the reloc part).

For now, I'm fine using reloc tree.

>>>> accounted them at balance time (check replace_path()).
>>>>
>>>> Skip such unneeded subtree trace should reduce some performance
>>>> overhead.
>>>
>>> Please provide some numbers or description of the improvement. There are
>>> several performance problems caused by qgroups so it would be good to
>>> get a better idea how much this patch is going to help. Thanks.
>>
>> That's the problem.
>> For my internal test, with 3000+ tree blocks, metadata balance could
>> save about 1~2%.
>> But according to dump-tree, the tree layout is almost the worst case
>> scenario, just one metadata block group owns all the tree blocks.
> 
> If you do such test, it's also a good example for the changelog. It
> describes the worst case and this information can be used to prepare
> testing environment on large data samples.

I'm already preparing the real world case (great thanks to the Fujitsu
guys for providing 1TB of storage for the test).

I'll provide the data after the test is done.

Thanks,
Qu

Patch

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index de6f75f5547b..4588153f414c 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8643,7 +8643,13 @@  static noinline int do_walk_down(struct btrfs_trans_handle *trans,
 			parent = 0;
 		}
 
-		if (need_account) {
+		/*
+		 * Tree reloc tree doesn't contribute to qgroup numbers, and
+		 * we have already accounted them at merge time (replace_path),
+		 * thus we could skip expensive subtree trace here.
+		 */
+		if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID &&
+		    need_account) {
 			ret = btrfs_qgroup_trace_subtree(trans, next,
 							 generation, level - 1);
 			if (ret) {