
iomap: Do not use GFP_NORETRY to allocate BIOs

Message ID 20200323131244.29435-1-willy@infradead.org (mailing list archive)
State New, archived
Series: iomap: Do not use GFP_NORETRY to allocate BIOs

Commit Message

Matthew Wilcox March 23, 2020, 1:12 p.m. UTC
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

If we use GFP_NORETRY, we have to be able to handle failures, and it's
tricky to handle failure here.  Other implementations of ->readpages
do not attempt to handle BIO allocation failures, so this is no worse.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Christoph Hellwig March 23, 2020, 1:20 p.m. UTC | #1
On Mon, Mar 23, 2020 at 06:12:44AM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> If we use GFP_NORETRY, we have to be able to handle failures, and it's
> tricky to handle failure here.  Other implementations of ->readpages
> do not attempt to handle BIO allocation failures, so this is no worse.

do_mpage_readpage tries to use it, I guess that is where I copied it
from.

But I don't think it is a bad idea, so the patch itself looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

It probably wants a fixes tag, though:

Fixes: 72b4daa24129 ("iomap: add an iomap-based readpage and readpages implementation")
Matthew Wilcox March 23, 2020, 1:40 p.m. UTC | #2
On Mon, Mar 23, 2020 at 06:20:52AM -0700, Christoph Hellwig wrote:
> On Mon, Mar 23, 2020 at 06:12:44AM -0700, Matthew Wilcox wrote:
> > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > 
> > If we use GFP_NORETRY, we have to be able to handle failures, and it's
> > tricky to handle failure here.  Other implementations of ->readpages
> > do not attempt to handle BIO allocation failures, so this is no worse.
> 
> do_mpage_readpage tries to use it, I guess that is where I copied it
> from.

Oh, I see that now.  It uses readahead_gfp_mask(), and I was grepping for
GFP_NORETRY so I didn't spot it.  It falls back to block_read_full_page()
which we can't do.  That will allocate smaller BIOs, so there's an argument
that we should do the same.  How about this:

+++ b/fs/iomap/buffered-io.c
@@ -302,6 +302,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
        if (!ctx->bio || !is_contig || bio_full(ctx->bio, plen)) {
                gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
+               gfp_t orig_gfp = gfp;
                int nr_vecs = (length + PAGE_SIZE - 1) >> PAGE_SHIFT;
 
                if (ctx->bio)
@@ -310,6 +311,8 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
                if (ctx->is_readahead) /* same as readahead_gfp_mask */
                        gfp |= __GFP_NORETRY | __GFP_NOWARN;
                ctx->bio = bio_alloc(gfp, min(BIO_MAX_PAGES, nr_vecs));
+               if (!ctx->bio)
+                       ctx->bio = bio_alloc(orig_gfp, 1);
                ctx->bio->bi_opf = REQ_OP_READ;
                if (ctx->is_readahead)
                        ctx->bio->bi_opf |= REQ_RAHEAD;
Christoph Hellwig March 23, 2020, 1:55 p.m. UTC | #3
On Mon, Mar 23, 2020 at 06:40:32AM -0700, Matthew Wilcox wrote:
> Oh, I see that now.  It uses readahead_gfp_mask(), and I was grepping for
> GFP_NORETRY so I didn't spot it.  It falls back to block_read_full_page()
> which we can't do.  That will allocate smaller BIOs, so there's an argument
> that we should do the same.  How about this:

That looks silly to me.  This just means we'll keep iterating over
small bios for readahead..  Either we just ignore the different gfp
mask, or we need to go all the way and handle errors, although that
doesn't really look nice.
Matthew Wilcox March 23, 2020, 3:10 p.m. UTC | #4
On Mon, Mar 23, 2020 at 06:55:00AM -0700, Christoph Hellwig wrote:
> On Mon, Mar 23, 2020 at 06:40:32AM -0700, Matthew Wilcox wrote:
> > Oh, I see that now.  It uses readahead_gfp_mask(), and I was grepping for
> > GFP_NORETRY so I didn't spot it.  It falls back to block_read_full_page()
> > which we can't do.  That will allocate smaller BIOs, so there's an argument
> > that we should do the same.  How about this:
> 
> That looks silly to me.  This just means we'll keep iterating over
> small bios for readahead..  Either we just ignore the different gfp
> mask, or we need to go all the way and handle errors, although that
> doesn't really look nice.

I'm not sure it's silly, although I'd love to see bio_alloc() support
nr_iovecs == 0 meaning "allocate me any size biovec and tell me what
size I got in ->bi_max_vecs".  By allocating a small biovec this time,
we do one allocation rather than two, and maybe by the time we come to
allocate the next readahead bio, kswapd will have succeeded in freeing
up more memory for us.
Darrick J. Wong March 23, 2020, 3:38 p.m. UTC | #5
On Mon, Mar 23, 2020 at 06:12:44AM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> If we use GFP_NORETRY, we have to be able to handle failures, and it's
> tricky to handle failure here.  Other implementations of ->readpages
> do not attempt to handle BIO allocation failures, so this is no worse.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

This fixes the regression I saw, though I see what looks like a new
patch forming further down in this thread, so please let me know if
you'd prefer I take this change, the other one, or both.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/iomap/buffered-io.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 417115bfaf6b..2336642d7390 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -307,8 +307,6 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
>  		if (ctx->bio)
>  			submit_bio(ctx->bio);
>  
> -		if (ctx->is_readahead) /* same as readahead_gfp_mask */
> -			gfp |= __GFP_NORETRY | __GFP_NOWARN;
>  		ctx->bio = bio_alloc(gfp, min(BIO_MAX_PAGES, nr_vecs));
>  		ctx->bio->bi_opf = REQ_OP_READ;
>  		if (ctx->is_readahead)
> -- 
> 2.25.1
>
Christoph Hellwig March 23, 2020, 4:51 p.m. UTC | #6
On Mon, Mar 23, 2020 at 08:10:54AM -0700, Matthew Wilcox wrote:
> > That looks silly to me.  This just means we'll keep iterating over
> > small bios for readahead..  Either we just ignore the different gfp
> > mask, or we need to go all the way and handle errors, although that
> > doesn't really look nice.
> 
> I'm not sure it's silly,

Oh well, I'm not going to be in the way of fixing a bug I added.  So
feel free to go ahead with this and mention it matches mpage_readpages.

> although I'd love to see bio_alloc() support
> nr_iovecs == 0 meaning "allocate me any size biovec and tell me what
> size I got in ->bi_max_vecs".  By allocating a small biovec this time,
> we do one allocation rather than two, and maybe by the time we come to
> allocate the next readahead bio, kswapd will have succeeded in freeing
> up more memory for us.

Sounds easy enough - especially as callers don't need to look at
bi_max_vecs anyway, that is the job of bio_add_page and friends.  That
being said, an upper bound still sounds useful - no need to allocate
a gigantic bio if we know we only need a few pages.
Christoph Hellwig March 31, 2020, 9:21 a.m. UTC | #7
On Mon, Mar 23, 2020 at 09:51:54AM -0700, Christoph Hellwig wrote:
> On Mon, Mar 23, 2020 at 08:10:54AM -0700, Matthew Wilcox wrote:
> > > That looks silly to me.  This just means we'll keep iterating over
> > > small bios for readahead..  Either we just ignore the different gfp
> > > mask, or we need to go all the way and handle errors, although that
> > > doesn't really look nice.
> > 
> > I'm not sure it's silly,
> 
> Oh well, I'm not going to be in the way of fixing a bug I added.  So
> feel free to go ahead with this and mention it matches mpage_readpages.

Are you going to submit this as a formal patch?

Patch

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 417115bfaf6b..2336642d7390 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -307,8 +307,6 @@  iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		if (ctx->bio)
 			submit_bio(ctx->bio);
 
-		if (ctx->is_readahead) /* same as readahead_gfp_mask */
-			gfp |= __GFP_NORETRY | __GFP_NOWARN;
 		ctx->bio = bio_alloc(gfp, min(BIO_MAX_PAGES, nr_vecs));
 		ctx->bio->bi_opf = REQ_OP_READ;
 		if (ctx->is_readahead)