Message ID | 20200914183452.378189-1-jlayton@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | ceph: have ceph_writepages_start call pagevec_lookup_range_tag |
On Mon, Sep 14, 2020 at 02:34:52PM -0400, Jeff Layton wrote:
> Ceph is also the only caller of pagevec_lookup_range_nr_tag(), so
> changing this code to use pagevec_lookup_range_tag() should allow us to
> eliminate that call as well. That may mean that we sometimes find more
> pages than are needed, but the extra references will just get put at the
> end regardless.

That was the part I wasn't clear about!  So, let's suppose max_pages is 10
and we get 15 pages back.  We'll run the

	for (i = 0; i < pvec_pages && locked_pages < max_pages; i++) {
	}

loop ten times, then hit:

	if (i) {
		for (j = 0; j < pvec_pages; j++) {
			if (!pvec.pages[j])
				continue;

OK, we do that ten times, then

			if (n < j)
				pvec.pages[n] = pvec.pages[j];

so we now have five pages clustered at the bottom of pvec

		pvec.nr = n;

... then we do the new_request: stanza ...

ah, and then we call pagevec_release(&pvec); and everything is good!

Excellent.  I was overwhelmed by the amount of code in this function.
Glad to see the patch was so simple in the end.

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>

> Reported-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/ceph/addr.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> I'm still testing this, but it looks good so far. If it's OK, we'll get
> this in for v5.10, and then I'll send a patch to remove
> pagevec_lookup_range_nr_tag.
>
> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> index 6ea761c84494..b03dbaa9d345 100644
> --- a/fs/ceph/addr.c
> +++ b/fs/ceph/addr.c
> @@ -962,9 +962,8 @@ static int ceph_writepages_start(struct address_space *mapping,
>  		max_pages = wsize >> PAGE_SHIFT;
>
>  get_more_pages:
> -		pvec_pages = pagevec_lookup_range_nr_tag(&pvec, mapping, &index,
> -						end, PAGECACHE_TAG_DIRTY,
> -						max_pages - locked_pages);
> +		pvec_pages = pagevec_lookup_range_tag(&pvec, mapping, &index,
> +						end, PAGECACHE_TAG_DIRTY);
>  		dout("pagevec_lookup_range_tag got %d\n", pvec_pages);
>  		if (!pvec_pages && !locked_pages)
>  			break;
> --
> 2.26.2
>
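The walk-through in the reply can be checked with a small user-space model. This is only a sketch, not the kernel code: `demo_page`, `demo_pagevec`, `demo_consume_and_compact()` and `demo_release()` are hypothetical stand-ins invented for illustration, the 15-slot array is chosen to match the "15 pages back" example, and every returned page is assumed to be eligible for the write (the real loop in ceph_writepages_start() can skip pages).

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGEVEC_SIZE 15	/* hypothetical; matches the 15-page example */

/* Hypothetical stand-in for struct page: only a reference count. */
struct demo_page {
	int refcount;
};

/* Hypothetical stand-in for struct pagevec. */
struct demo_pagevec {
	unsigned int nr;
	struct demo_page *pages[DEMO_PAGEVEC_SIZE];
};

/*
 * Take pages from the front of pvec until max_pages are locked, NULLing
 * each consumed slot, then shift the leftover entries to the bottom of
 * the vector. Returns the number of pages taken.
 */
static unsigned int demo_consume_and_compact(struct demo_pagevec *pvec,
					     unsigned int locked_pages,
					     unsigned int max_pages)
{
	unsigned int i, j, n = 0, taken = 0;

	for (i = 0; i < pvec->nr && locked_pages < max_pages; i++) {
		pvec->pages[i] = NULL;	/* page now belongs to the write */
		locked_pages++;
		taken++;
	}

	if (i) {
		for (j = 0; j < pvec->nr; j++) {
			if (!pvec->pages[j])
				continue;
			if (n < j)
				pvec->pages[n] = pvec->pages[j];
			n++;
		}
		pvec->nr = n;
	}
	return taken;
}

/*
 * Model of the final release step: drop one reference on every page
 * still in the vector. Returns the number of references dropped.
 */
static unsigned int demo_release(struct demo_pagevec *pvec)
{
	unsigned int i, put = 0;

	for (i = 0; i < pvec->nr; i++) {
		assert(pvec->pages[i] != NULL);
		pvec->pages[i]->refcount--;
		put++;
	}
	pvec->nr = 0;
	return put;
}
```

Filling a 15-entry `demo_pagevec` and calling `demo_consume_and_compact(&pvec, 0, 10)` leaves five entries clustered at the bottom of the vector; a following `demo_release(&pvec)` drops exactly those five references, which is the "extra references will just get put at the end" behavior under the assumptions above.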
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 6ea761c84494..b03dbaa9d345 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -962,9 +962,8 @@ static int ceph_writepages_start(struct address_space *mapping,
 		max_pages = wsize >> PAGE_SHIFT;
 
 get_more_pages:
-		pvec_pages = pagevec_lookup_range_nr_tag(&pvec, mapping, &index,
-						end, PAGECACHE_TAG_DIRTY,
-						max_pages - locked_pages);
+		pvec_pages = pagevec_lookup_range_tag(&pvec, mapping, &index,
+						end, PAGECACHE_TAG_DIRTY);
 		dout("pagevec_lookup_range_tag got %d\n", pvec_pages);
 		if (!pvec_pages && !locked_pages)
 			break;
Currently it calls pagevec_lookup_range_nr_tag(), but Willy pointed out
that that is probably inefficient, as we might end up having to search
several times if we get down to looking for one more page to fill a
write.

    "I think ceph is misusing pagevec_lookup_range_nr_tag(). Let's suppose
    you get a range which is AAAAbbbbAAAAbbbbAAAAbbbbbbbb(...)bbbbAAAA and
    you try to fetch max_pages=13. First loop will get AAAAbbbbAAAAb and
    have 8 locked_pages. The next call will get bbbAA and now
    locked_pages=10. Next call gets AAb ... and now you're iterating your
    way through all the 'b' one page at a time until you find that first
    A."

'A' here refers to pages that are eligible for writeback and 'b'
represents ones that aren't (for whatever reason).

Ceph is also the only caller of pagevec_lookup_range_nr_tag(), so
changing this code to use pagevec_lookup_range_tag() should allow us to
eliminate that call as well. That may mean that we sometimes find more
pages than are needed, but the extra references will just get put at the
end regardless.

Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

I'm still testing this, but it looks good so far. If it's OK, we'll get
this in for v5.10, and then I'll send a patch to remove
pagevec_lookup_range_nr_tag.
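As a rough illustration of the cost Willy describes, the toy model below counts lookups for a range written as a string of 'A' and 'b' pages. It is a sketch under loud assumptions: `demo_count_lookups()` and `DEMO_BATCH` are made up for this example, every page in the string is treated as dirty-tagged, a lookup is modeled as returning the next consecutive pages of the string, and nothing here calls or reproduces the real pagevec API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BATCH 15	/* hypothetical number of pages per full lookup */

/*
 * Count the lookups needed to lock max_pages eligible ('A') pages from
 * range, where 'b' marks a page that is returned but not eligible.
 *
 * limit_to_need != 0: each lookup asks only for the pages still needed
 *                     (max_pages - locked), as with the _nr_ variant.
 * limit_to_need == 0: each lookup asks for a full batch and the caller
 *                     stops taking pages once max_pages are locked.
 */
static unsigned int demo_count_lookups(const char *range,
				       unsigned int max_pages,
				       int limit_to_need)
{
	size_t len, index = 0;
	unsigned int locked = 0, lookups = 0;

	assert(range != NULL);
	len = strlen(range);

	while (index < len && locked < max_pages) {
		size_t want = limit_to_need ? (size_t)(max_pages - locked)
					    : (size_t)DEMO_BATCH;
		size_t left = len - index;
		size_t got = left < want ? left : want;
		size_t i;

		lookups++;
		for (i = 0; i < got && locked < max_pages; i++) {
			if (range[index + i] == 'A')
				locked++;
		}
		index += got;	/* the next lookup starts past this batch */
	}
	return lookups;
}
```

For a range of AAAAbbbbAAAAbbbbAAAA, then forty 'b' pages, then AAAA, with max_pages = 13, this model needs 43 lookups when each call asks only for the pages still needed and 5 lookups when each call asks for a full batch. The first three limited calls return AAAAbbbbAAAAb, bbbAA and AAb, matching the quoted example, after which the model crawls through the 'b' pages one at a time.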