[0/2] Drop some mis-uses of READA

Message ID 20200313210954.148686-1-josef@toxicpanda.com (mailing list archive)

Message

Josef Bacik March 13, 2020, 9:09 p.m. UTC
In debugging Zygo's huge commit delays I noticed we were burning a bunch of time
doing READA in cases where we don't need to.  The way READA works in btrfs is
we'll load up adjacent nodes and leaves as we walk down.  This is useful for
operations where we're going to be reading sequentially across the tree.

But for delayed refs we're looking up one bytenr, and then another one which
could be elsewhere in the tree.  With large enough extent trees this results in
a lot of unneeded latency.
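
As a rough illustration of the difference (a toy model, not btrfs code), the sketch below counts how many leaves get read, and how many of those are never used, for a sequential walk versus scattered lookups. The function name `simulate`, the flat leaf numbering, and the readahead width of 8 are all assumptions made up for this example. It only counts blocks and does not model latency or the real tree layout.

```python
import random


def simulate(lookups, readahead, num_leaves):
    """Return (leaves_read, leaves_wasted) for a sequence of leaf lookups.

    Toy model: every lookup needs exactly one leaf. With readahead > 0 the
    following `readahead` leaves are fetched as well. A fetched leaf counts
    as wasted if no lookup in the sequence ever asks for it.
    """
    fetched = set()  # leaves read from disk, each counted once
    needed = set()   # leaves some lookup actually asked for
    for leaf in lookups:
        needed.add(leaf)
        last = min(leaf + 1 + readahead, num_leaves)
        fetched.update(range(leaf, last))
    return len(fetched), len(fetched - needed)


if __name__ == "__main__":
    num_leaves = 1_000_000
    rng = random.Random(0)  # fixed seed so runs are repeatable

    sequential = range(1000)
    scattered = [rng.randrange(num_leaves) for _ in range(1000)]

    for name, lookups in (("sequential", sequential), ("scattered", scattered)):
        for width in (0, 8):
            read, wasted = simulate(lookups, width, num_leaves)
            print(f"{name:10s} readahead={width}: read={read} wasted={wasted}")
```

In this model the sequential walk ends up using every leaf it prefetched, while for scattered lookups nearly all of the extra reads are for leaves nobody asks for.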

The same applies to build_backref_tree, but that's even worse because we're
looking up backrefs, which are essentially randomly spread out across the extent
root.  Thanks,

Josef

Comments

Qu Wenruo March 14, 2020, 2:56 a.m. UTC | #1
On 2020/3/14 5:09 AM, Josef Bacik wrote:
> In debugging Zygo's huge commit delays I noticed we were burning a bunch of time
> doing READA in cases where we don't need to.  The way READA works in btrfs is
> we'll load up adjacent nodes and leaves as we walk down.  This is useful for
> operations where we're going to be reading sequentially across the tree.
> 
> But for delayed refs we're looking up one bytenr, and then another one which
> could be elsewhere in the tree.  With large enough extent trees this results in
> a lot of unneeded latency.
> 
> The same applies to build_backref_tree, but that's even worse because we're
> looking up backrefs, which are essentially randomly spread out across the extent
> root.  Thanks,

There are quite a few other locations abusing READA.

E.g. btrfs_read_block_groups(), where we're just searching for block
group items. There is no guarantee that the next block group item is in
the next few leaves.

I guess it's a good time to review all READA abuse. Or would you mind if
I did that?

Thanks,
Qu
> 
> Josef
>

David Sterba March 18, 2020, 2:40 p.m. UTC | #2
On Fri, Mar 13, 2020 at 05:09:52PM -0400, Josef Bacik wrote:
> In debugging Zygo's huge commit delays I noticed we were burning a bunch of time
> doing READA in cases where we don't need to.  The way READA works in btrfs is
> we'll load up adjacent nodes and leaves as we walk down.  This is useful for
> operations where we're going to be reading sequentially across the tree.
> 
> But for delayed refs we're looking up one bytenr, and then another one which
> could be elsewhere in the tree.  With large enough extent trees this results in
> a lot of unneeded latency.
> 
> The same applies to build_backref_tree, but that's even worse because we're
> looking up backrefs, which are essentially randomly spread out across the extent
> root.  Thanks,

Makes sense, I'll add it to misc-next. Thanks.

David Sterba March 18, 2020, 2:44 p.m. UTC | #3
On Sat, Mar 14, 2020 at 10:56:06AM +0800, Qu Wenruo wrote:
> 
> 
> On 2020/3/14 5:09 AM, Josef Bacik wrote:
> > In debugging Zygo's huge commit delays I noticed we were burning a bunch of time
> > doing READA in cases where we don't need to.  The way READA works in btrfs is
> > we'll load up adjacent nodes and leaves as we walk down.  This is useful for
> > operations where we're going to be reading sequentially across the tree.
> > 
> > But for delayed refs we're looking up one bytenr, and then another one which
> > could be elsewhere in the tree.  With large enough extent trees this results in
> > a lot of unneeded latency.
> > 
> > The same applies to build_backref_tree, but that's even worse because we're
> > looking up backrefs, which are essentially randomly spread out across the extent
> > root.  Thanks,
> 
> There are quite a few other locations abusing READA.
> 
> E.g. btrfs_read_block_groups(), where we're just searching for block
> group items. There is no guarantee that the next block group item is in
> the next few leaves.
> 
> I guess it's a good time to review all READA abuse. Or would you mind if
> I did that?

If you find some clear example where the items are scattered over the
tree then yes. For the rest it would be good to put a comment that the
readahead really helps. Thanks.