iomap: Handle memory allocation failure in readahead

Message ID 20200401030421.17195-1-willy@infradead.org

Commit Message

Matthew Wilcox April 1, 2020, 3:04 a.m. UTC
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

bio_alloc() can fail when we use GFP_NORETRY.  If it does, allocate
a bio large enough for a single page like mpage_readpages() does.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Darrick J. Wong April 1, 2020, 4:31 a.m. UTC | #1
On Tue, Mar 31, 2020 at 08:04:21PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> bio_alloc() can fail when we use GFP_NORETRY.  If it does, allocate
> a bio large enough for a single page like mpage_readpages() does.

Why does mpage_readpages() do that?

Is this a means to guarantee some kind of forward (readahead?) progress?
Forgive my ignorance, but if memory is so tight we can't allocate a bio
for readahead then why not exit having accomplished nothing?

--D

Matthew Wilcox April 1, 2020, 11:23 a.m. UTC | #2
On Tue, Mar 31, 2020 at 09:31:25PM -0700, Darrick J. Wong wrote:
> Why does mpage_readpages() do that?
> 
> Is this a means to guarantee some kind of forward (readahead?) progress?
> Forgive my ignorance, but if memory is so tight we can't allocate a bio
> for readahead then why not exit having accomplished nothing?

As far as I can tell, it's just a general fallback in mpage_readpages().

 * If anything unusual happens, such as:
 *
 * - encountering a page which has buffers
 * - encountering a page which has a non-hole after a hole
 * - encountering a page with non-contiguous blocks
 *
 * then this code just gives up and calls the buffer_head-based read function.

The actual code for that is:

                args->bio = mpage_alloc(bdev, blocks[0] << (blkbits - 9),
                                        min_t(int, args->nr_pages,
                                              BIO_MAX_PAGES),
                                        gfp);
                if (args->bio == NULL)
                        goto confused;
...
confused:
        if (args->bio)
                args->bio = mpage_bio_submit(REQ_OP_READ, op_flags, args->bio);
        if (!PageUptodate(page))
                block_read_full_page(page, args->get_block);
        else
                unlock_page(page);

As the comment implies, there are a lot of 'goto confused' cases in
do_mpage_readpage().
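
To make the shape of that fallback concrete, here is a rough userspace sketch of the 'goto confused' pattern. It is not the code in fs/mpage.c; struct sketch_batch, sketch_submit() and sketch_read_page() are made-up names, and the single `unusual` flag stands in for all of the conditions listed in the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping; these names are not from fs/mpage.c. */
struct sketch_batch {
	int pending;	/* pages queued in the current batch */
	int submitted;	/* pages sent down the fast path so far */
	int slow_reads;	/* pages handed to the per-page fallback */
};

/* Flush whatever the fast path has accumulated. */
static void sketch_submit(struct sketch_batch *b)
{
	b->submitted += b->pending;
	b->pending = 0;
}

/*
 * Queue one page on the fast path, or bail out to the slow path when
 * `unusual` is set (standing in for buffers, holes, failed allocation, ...).
 */
static void sketch_read_page(struct sketch_batch *b, bool unusual,
			     bool uptodate)
{
	assert(b != NULL);

	if (unusual)
		goto confused;

	b->pending++;
	return;

confused:
	/* Submit the batch built so far, then read this page on its own. */
	if (b->pending)
		sketch_submit(b);
	if (!uptodate)
		b->slow_reads++;
}
```

Queueing three ordinary pages and then one unusual, not-yet-uptodate page leaves the batch flushed (three submitted, none pending) and one page on the slow path.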

Ideally, yes, we'd just give up on reading this page because it's
only readahead, and we shouldn't stall actual work in order to reclaim
memory so we can finish doing readahead.  However, handling a partial
page read is painful.  Allocating a bio big enough for a single page is
much easier on the mm than allocating a larger bio (for a start, it's a
single allocation, not a pair of allocations), so this is a reasonable
compromise between simplicity of code and quality of implementation.
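
As a rough illustration of that compromise, the two-step allocation can be modelled in userspace as below. This is only a sketch under stated assumptions: struct sketch_bio, sketch_bio_alloc() and alloc_read_bio() are hypothetical names, and the `memory_tight` flag is a toy substitute for what the page allocator would actually decide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bio; none of this is kernel API. */
struct sketch_bio {
	unsigned int max_vecs;	/* how many pages this bio can hold */
};

/*
 * Toy allocator: a no-retry request fails whenever the caller says memory
 * is tight; a request that may retry succeeds unless malloc itself fails.
 */
static struct sketch_bio *sketch_bio_alloc(bool may_retry, bool memory_tight,
					   unsigned int nr_vecs)
{
	struct sketch_bio *bio;

	if (!may_retry && memory_tight)
		return NULL;
	bio = malloc(sizeof(*bio));
	if (bio)
		bio->max_vecs = nr_vecs;
	return bio;
}

/* Try a large bio cheaply first, then fall back to a one-page bio. */
static struct sketch_bio *alloc_read_bio(bool is_readahead, bool memory_tight,
					 unsigned int nr_vecs)
{
	struct sketch_bio *bio;

	/* Readahead is opportunistic, so its first attempt must not retry. */
	bio = sketch_bio_alloc(!is_readahead, memory_tight, nr_vecs);
	if (!bio)
		/* Original constraints, smallest useful size. */
		bio = sketch_bio_alloc(true, memory_tight, 1);
	return bio;
}

/* Usage: under memory pressure, readahead still gets a one-page bio. */
static unsigned int demo_tight_readahead(void)
{
	struct sketch_bio *bio = alloc_read_bio(true, true, 16);
	unsigned int vecs;

	assert(bio != NULL);	/* holds unless malloc itself fails */
	vecs = bio->max_vecs;
	free(bio);
	return vecs;
}
```

The point of the model is that the caller never has to handle a half-read page: either the large bio is available, or a one-page bio is obtained under the original, less restrictive constraints.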

Christoph Hellwig April 1, 2020, 3:50 p.m. UTC | #3
On Tue, Mar 31, 2020 at 08:04:21PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> bio_alloc() can fail when we use GFP_NORETRY.  If it does, allocate
> a bio large enough for a single page like mpage_readpages() does.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Looks ok - not because I'm a fan of the pattern, but because we have
a real bug and this seems to be the quickest fix, and it is similar to
the mpage code.

Reviewed-by: Christoph Hellwig <hch@lst.de>

Darrick J. Wong April 1, 2020, 4:48 p.m. UTC | #4
On Wed, Apr 01, 2020 at 04:23:21AM -0700, Matthew Wilcox wrote:
> Ideally, yes, we'd just give up on reading this page because it's
> only readahead, and we shouldn't stall actual work in order to reclaim
> memory so we can finish doing readahead.  However, handling a partial
> page read is painful.  Allocating a bio big enough for a single page is
> much easier on the mm than allocating a larger bio (for a start, it's a
> single allocation, not a pair of allocations), so this is a reasonable
> compromise between simplicity of code and quality of implementation.

Hmm, ok.  I'll add a comment about that:

		/*
		 * If the bio_alloc fails, try it again for a single page to
		 * avoid having to deal with partial page reads.  This emulates
		 * what do_mpage_readpage does.
		 */
		if (!ctx->bio)
			ctx->bio = bio_alloc(orig_gfp, 1);

...in the hope that if anyone ever makes partial page reads less
painful, they'll find this breadcrumb and clean up iomap too.

If that's ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

Matthew Wilcox April 1, 2020, 4:58 p.m. UTC | #5
On Wed, Apr 01, 2020 at 09:48:25AM -0700, Darrick J. Wong wrote:
> Hmm, ok.  I'll add a comment about that:
> 
> 		/*
> 		 * If the bio_alloc fails, try it again for a single page to
> 		 * avoid having to deal with partial page reads.  This emulates
> 		 * what do_mpage_readpage does.
> 		 */
> 		if (!ctx->bio)
> 			ctx->bio = bio_alloc(orig_gfp, 1);
> 
> ...in the hope that if anyone ever makes partial page reads less
> painful, they'll find this breadcrumb and clean up iomap too.
> 
> If that's ok,
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

That makes perfect sense; thank you.  I assume you'll just apply it with
that change.

Patch

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 417115bfaf6b..c258801f18d4 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -302,6 +302,7 @@  iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
 	if (!ctx->bio || !is_contig || bio_full(ctx->bio, plen)) {
 		gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
+		gfp_t orig_gfp = gfp;
 		int nr_vecs = (length + PAGE_SIZE - 1) >> PAGE_SHIFT;
 
 		if (ctx->bio)
@@ -310,6 +311,8 @@  iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		if (ctx->is_readahead) /* same as readahead_gfp_mask */
 			gfp |= __GFP_NORETRY | __GFP_NOWARN;
 		ctx->bio = bio_alloc(gfp, min(BIO_MAX_PAGES, nr_vecs));
+		if (!ctx->bio)
+			ctx->bio = bio_alloc(orig_gfp, 1);
 		ctx->bio->bi_opf = REQ_OP_READ;
 		if (ctx->is_readahead)
 			ctx->bio->bi_opf |= REQ_RAHEAD;
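
For readers unfamiliar with the nr_vecs computation in the hunk above, a small userspace sketch of the arithmetic follows. The constants are assumptions chosen for illustration (4 KiB pages, a 256-page cap standing in for BIO_MAX_PAGES), and sketch_nr_vecs() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/* Assumed values for illustration: 4 KiB pages and a 256-page bio limit. */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PAGE_SIZE	(1LL << SKETCH_PAGE_SHIFT)
#define SKETCH_BIO_MAX_PAGES	256

/* Pages needed to cover `length` bytes, rounded up, capped at the limit. */
static int sketch_nr_vecs(long long length)
{
	long long nr_vecs;

	assert(length > 0);
	/* Adding PAGE_SIZE - 1 before shifting rounds up to whole pages. */
	nr_vecs = (length + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	if (nr_vecs > SKETCH_BIO_MAX_PAGES)
		nr_vecs = SKETCH_BIO_MAX_PAGES;
	return (int)nr_vecs;
}
```

With these assumed values, 4096 bytes needs one vec, 4097 bytes needs two, and a 2 MiB mapping (512 pages) is capped at 256. The fallback path in the patch skips this sizing entirely and asks for a single vec.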