ceph: have ceph_writepages_start call pagevec_lookup_range_tag

Message ID 20200914183452.378189-1-jlayton@kernel.org (mailing list archive)
State New, archived

Commit Message

Jeff Layton Sept. 14, 2020, 6:34 p.m. UTC
Currently it calls pagevec_lookup_range_nr_tag(), but Willy pointed out
that that is probably inefficient, as we might end up having to search
several times if we get down to looking for one more page to fill a
write.

"I think ceph is misusing pagevec_lookup_range_nr_tag().  Let's suppose
 you get a range which is AAAAbbbbAAAAbbbbAAAAbbbbbbbb(...)bbbbAAAA and
 you try to fetch max_pages=13.  First loop will get AAAAbbbbAAAAb and
 have 8 locked_pages.  The next call will get bbbAA and now
 locked_pages=10.  Next call gets AAb ... and now you're iterating your
 way through all the 'b' one page at a time until you find that first A."

'A' here refers to pages that are eligible for writeback and 'b'
represents ones that aren't (for whatever reason).
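
The cost difference in the quoted example can be checked with a small userspace model. This is only a sketch, not kernel code: count_lookups() is a hypothetical helper, the page cache is reduced to a string of 'A' and 'b' characters, and PAGEVEC_SIZE is assumed to be 15. With the cap flag set, each lookup asks for at most (max_pages - locked) pages, which is how the pagevec_lookup_range_nr_tag() call behaves here; with it clear, each lookup returns up to a full pagevec.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGEVEC_SIZE 15	/* assumed pagevec batch size for this model */

/*
 * Hypothetical model: count the lookup calls needed to lock max_pages
 * eligible ('A') pages from pattern, or to run off its end.
 * cap_to_remaining != 0 models the _nr_ variant, which never asks for
 * more than the number of pages still missing.
 */
static int count_lookups(const char *pattern, int max_pages,
			 int cap_to_remaining)
{
	size_t len = strlen(pattern);
	size_t index = 0;
	int locked = 0, calls = 0;

	assert(max_pages > 0);

	while (locked < max_pages && index < len) {
		size_t want = PAGEVEC_SIZE;
		size_t got, i;

		if (cap_to_remaining && (size_t)(max_pages - locked) < want)
			want = (size_t)(max_pages - locked);

		/* One lookup: returns the next 'want' pages, if present. */
		got = (len - index < want) ? len - index : want;
		calls++;

		/* Only 'A' pages can be locked; 'b' pages are skipped. */
		for (i = 0; i < got && locked < max_pages; i++)
			if (pattern[index + i] == 'A')
				locked++;

		/* The lookup advances the index past everything returned. */
		index += got;
	}
	return calls;
}
```

For the pattern AAAAbbbbAAAAbbbbAAAA followed by 40 'b' pages and a final AAAA, with max_pages=13, this model needs 43 lookups when capped and 5 when uncapped.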

Ceph is also the only caller of pagevec_lookup_range_nr_tag(), so
changing this code to use pagevec_lookup_range_tag() should allow us to
eliminate that call as well. That may mean that we sometimes find more
pages than are needed, but the extra references will just get put at the
end regardless.

Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

I'm still testing this, but it looks good so far. If it's OK, we'll get
this in for v5.10, and then I'll send a patch to remove
pagevec_lookup_range_nr_tag.

Comments

Matthew Wilcox Sept. 14, 2020, 6:45 p.m. UTC | #1
On Mon, Sep 14, 2020 at 02:34:52PM -0400, Jeff Layton wrote:
> Ceph is also the only caller of pagevec_lookup_range_nr_tag(), so
> changing this code to use pagevec_lookup_range_tag() should allow us to
> eliminate that call as well. That may mean that we sometimes find more
> pages than are needed, but the extra references will just get put at the
> end regardless.

That was the part I wasn't clear about!

So, let's suppose max_pages is 10 and we get 15 pages back.

We'll run the

for (i = 0; i < pvec_pages && locked_pages < max_pages; i++) {
}
loop ten times, then hit:

if (i) {
	for (j = 0; j < pvec_pages; j++) {
		if (!pvec.pages[j])
			continue;
OK, we do that ten times, then
		if (n < j)
			pvec.pages[n] = pvec.pages[j];
so we now have five pages clustered at the bottom of pvec
                        pvec.nr = n;
... then we do the new_request: stanza ...
ah, and then we call pagevec_release(&pvec);
and everything is good!

Excellent.  I was overwhelmed by the amount of code in this function.
Glad to see the patch was so simple in the end.
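
The walk-through above can be modelled in userspace as follows. This is a sketch under stated assumptions, not the code in fs/ceph/addr.c: struct page and struct pagevec_model are stand-ins, compact_unused() is a hypothetical name, PAGEVEC_SIZE is assumed to be 15, and the n++ step is inferred from the description rather than quoted from the function. Entries that were taken for the write are assumed to have been set to NULL beforehand.

```c
#include <assert.h>
#include <stddef.h>

#define PAGEVEC_SIZE 15	/* assumed pagevec batch size for this model */

/* Stand-in types; the real kernel structures are not used here. */
struct page {
	unsigned long index;
};

struct pagevec_model {
	unsigned int nr;
	struct page *pages[PAGEVEC_SIZE];
};

/*
 * Shift the entries that were not taken (non-NULL) down to the bottom
 * of the vector and record how many remain, so that a later release
 * step only has to walk pages[0..nr-1].
 */
static unsigned int compact_unused(struct pagevec_model *pvec,
				   unsigned int pvec_pages)
{
	unsigned int j, n = 0;

	assert(pvec_pages <= PAGEVEC_SIZE);

	for (j = 0; j < pvec_pages; j++) {
		if (!pvec->pages[j])
			continue;	/* already taken for the write */
		if (n < j)
			pvec->pages[n] = pvec->pages[j];
		n++;
	}
	pvec->nr = n;
	return n;
}
```

With 15 pages returned and the first ten taken, the model leaves five entries at the bottom of the vector, which matches the case where pagevec_release() then drops the extra references.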

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Patch

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 6ea761c84494..b03dbaa9d345 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -962,9 +962,8 @@  static int ceph_writepages_start(struct address_space *mapping,
 		max_pages = wsize >> PAGE_SHIFT;
 
 get_more_pages:
-		pvec_pages = pagevec_lookup_range_nr_tag(&pvec, mapping, &index,
-						end, PAGECACHE_TAG_DIRTY,
-						max_pages - locked_pages);
+		pvec_pages = pagevec_lookup_range_tag(&pvec, mapping, &index,
+						end, PAGECACHE_TAG_DIRTY);
 		dout("pagevec_lookup_range_tag got %d\n", pvec_pages);
 		if (!pvec_pages && !locked_pages)
 			break;