Message ID | 20231109210608.2252323-2-willy@infradead.org (mailing list archive) |
---|---|
State | New |
Series | More buffer_head cleanups |
On Fri, Nov 10, 2023 at 6:07 AM Matthew Wilcox (Oracle) wrote:
>
> Rename grow_dev_page() to grow_dev_folio() and make it return a bool.
> Document what that bool means; it's more subtle than it first appears.
> Also rename the 'failed' label to 'unlock' because it's not exactly
> 'failed'. It just hasn't succeeded.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  fs/buffer.c | 50 +++++++++++++++++++++++++-------------------------
>  1 file changed, 25 insertions(+), 25 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 967f34b70aa8..8dad6c691e14 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1024,40 +1024,43 @@ static sector_t folio_init_buffers(struct folio *folio,
>  }
>
>  /*
> - * Create the page-cache page that contains the requested block.
> + * Create the page-cache folio that contains the requested block.
>   *
>   * This is used purely for blockdev mappings.
> + *
> + * Returns false if we have a 'permanent' failure. Returns true if
> + * we succeeded, or the caller should retry.
>   */
> -static int
> -grow_dev_page(struct block_device *bdev, sector_t block,
> -	      pgoff_t index, int size, int sizebits, gfp_t gfp)
> +static bool grow_dev_folio(struct block_device *bdev, sector_t block,
> +		pgoff_t index, unsigned size, int sizebits, gfp_t gfp)
>  {
>  	struct inode *inode = bdev->bd_inode;
>  	struct folio *folio;
>  	struct buffer_head *bh;
> -	sector_t end_block;
> -	int ret = 0;
> +	sector_t end_block = 0;
>
>  	folio = __filemap_get_folio(inode->i_mapping, index,
>  			FGP_LOCK | FGP_ACCESSED | FGP_CREAT, gfp);
>  	if (IS_ERR(folio))
> -		return PTR_ERR(folio);
> +		return false;
>
>  	bh = folio_buffers(folio);
>  	if (bh) {
>  		if (bh->b_size == size) {
>  			end_block = folio_init_buffers(folio, bdev,
>  					(sector_t)index << sizebits, size);
> -			goto done;
> +			goto unlock;
>  		}
> +
> +		/* Caller should retry if this call fails */
> +		end_block = ~0ULL;
>  		if (!try_to_free_buffers(folio))
> -			goto failed;
> +			goto unlock;
>  	}
>
> -	ret = -ENOMEM;
>  	bh = folio_alloc_buffers(folio, size, gfp | __GFP_ACCOUNT);
>  	if (!bh)
> -		goto failed;
> +		goto unlock;

Regarding this folio_alloc_buffers() error path,
If folio_buffers() was NULL, here end_block is 0, so this function
returns false (which means "have a permanent failure").

But, if folio_buffers() existed and they were freed with
try_to_free_buffers() because of bh->b_size != size, here end_block
has been set to ~0ULL, so it seems to return true ("succeeded").

Does this semantic change match your intent?

Otherwise, I think end_block should be set to 0 just before calling
folio_alloc_buffers().

Regards,
Ryusuke Konishi

>
>  	/*
>  	 * Link the folio to the buffers and initialise them. Take the
> @@ -1069,20 +1072,19 @@ grow_dev_page(struct block_device *bdev, sector_t block,
>  	end_block = folio_init_buffers(folio, bdev,
>  			(sector_t)index << sizebits, size);
>  	spin_unlock(&inode->i_mapping->private_lock);
> -done:
> -	ret = (block < end_block) ? 1 : -ENXIO;
> -failed:
> +unlock:
>  	folio_unlock(folio);
>  	folio_put(folio);
> -	return ret;
> +	return block < end_block;
>  }
>
>  /*
> - * Create buffers for the specified block device block's page. If
> - * that page was dirty, the buffers are set dirty also.
> + * Create buffers for the specified block device block's folio. If
> + * that folio was dirty, the buffers are set dirty also. Returns false
> + * if we've hit a permanent error.
>   */
> -static int
> -grow_buffers(struct block_device *bdev, sector_t block, int size, gfp_t gfp)
> +static bool grow_buffers(struct block_device *bdev, sector_t block,
> +		unsigned size, gfp_t gfp)
>  {
>  	pgoff_t index;
>  	int sizebits;
> @@ -1099,11 +1101,11 @@ grow_buffers(struct block_device *bdev, sector_t block, int size, gfp_t gfp)
>  			"device %pg\n",
>  			__func__, (unsigned long long)block,
>  			bdev);
> -		return -EIO;
> +		return false;
>  	}
>
> -	/* Create a page with the proper size buffers.. */
> -	return grow_dev_page(bdev, block, index, size, sizebits, gfp);
> +	/* Create a folio with the proper size buffers */
> +	return grow_dev_folio(bdev, block, index, size, sizebits, gfp);
>  }
>
>  static struct buffer_head *
> @@ -1124,14 +1126,12 @@ __getblk_slow(struct block_device *bdev, sector_t block,
>
>  	for (;;) {
>  		struct buffer_head *bh;
> -		int ret;
>
>  		bh = __find_get_block(bdev, block, size);
>  		if (bh)
>  			return bh;
>
> -		ret = grow_buffers(bdev, block, size, gfp);
> -		if (ret < 0)
> +		if (!grow_buffers(bdev, block, size, gfp))
>  			return NULL;
>  	}
>  }
> --
> 2.42.0
>
>
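The path questioned in this reply can be traced with a small userspace model. This is only a sketch, not kernel code: `struct outcome` and `grow_v1_model()` are hypothetical names, and the boolean fields merely stand in for the results of folio_buffers(), try_to_free_buffers(), folio_alloc_buffers() and folio_init_buffers(). Only the end_block bookkeeping of the posted patch is mirrored.

```c
#include <assert.h>	/* for callers that want to assert on the results */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* assumption: 64-bit sector numbers */

/* Hypothetical description of what each kernel helper would report. */
struct outcome {
	bool has_buffers;	/* folio_buffers() returned non-NULL */
	bool size_matches;	/* bh->b_size == size */
	bool free_ok;		/* try_to_free_buffers() succeeded */
	bool alloc_ok;		/* folio_alloc_buffers() succeeded */
	sector_t init_end;	/* value folio_init_buffers() would return */
};

/* Mirrors the patch as posted: the sentinel is set before the free attempt. */
bool grow_v1_model(sector_t block, struct outcome o)
{
	sector_t end_block = 0;

	if (o.has_buffers) {
		if (o.size_matches) {
			end_block = o.init_end;
			goto unlock;
		}
		/* Caller should retry if the free fails */
		end_block = ~0ULL;
		if (!o.free_ok)
			goto unlock;
	}

	if (!o.alloc_ok)
		goto unlock;	/* end_block may still hold ~0ULL here */

	end_block = o.init_end;
unlock:
	return block < end_block;
}
```

In this model, wrong-sized buffers that are freed successfully followed by a failed allocation leave end_block at ~0ULL, so the function returns true, whereas the same allocation failure without pre-existing buffers returns false.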
On Fri, Nov 10, 2023 at 12:43:43PM +0900, Ryusuke Konishi wrote:
> On Fri, Nov 10, 2023 at 6:07 AM Matthew Wilcox (Oracle) wrote:
> > +		/* Caller should retry if this call fails */
> > +		end_block = ~0ULL;
> >  		if (!try_to_free_buffers(folio))
> > -			goto failed;
> > +			goto unlock;
> >  	}
> >
>
> > -	ret = -ENOMEM;
> >  	bh = folio_alloc_buffers(folio, size, gfp | __GFP_ACCOUNT);
> >  	if (!bh)
> > -		goto failed;
> > +		goto unlock;
>
> Regarding this folio_alloc_buffers() error path,
> If folio_buffers() was NULL, here end_block is 0, so this function
> returns false (which means "have a permanent failure").
>
> But, if folio_buffers() existed and they were freed with
> try_to_free_buffers() because of bh->b_size != size, here end_block
> has been set to ~0ULL, so it seems to return true ("succeeded").
>
> Does this semantic change match your intent?
>
> Otherwise, I think end_block should be set to 0 just before calling
> folio_alloc_buffers().

Thanks for the review, and sorry for taking so long to get back to you.
The change was unintentional (but memory allocation failure with GFP_KERNEL
happens so rarely that our testing was never going to catch it)

I think I should just move the assignment to end_block inside the 'if'.
It's just more obvious what's going on. Andrew prodded me to be more
explicit about why memory allocation is a "permanent" failure, but
failing to free buffers is not.

I'll turn this into a proper patch submission later.

diff --git a/fs/buffer.c b/fs/buffer.c
index d5ce6b29c893..d3bcf601d3e5 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1028,8 +1028,8 @@ static sector_t folio_init_buffers(struct folio *folio,
  *
  * This is used purely for blockdev mappings.
  *
- * Returns false if we have a 'permanent' failure. Returns true if
- * we succeeded, or the caller should retry.
+ * Returns false if we have a failure which cannot be cured by retrying
+ * without sleeping. Returns true if we succeeded, or the caller should retry.
  */
 static bool grow_dev_folio(struct block_device *bdev, sector_t block,
 		pgoff_t index, unsigned size, gfp_t gfp)
@@ -1051,10 +1051,17 @@ static bool grow_dev_folio(struct block_device *bdev, sector_t block,
 			goto unlock;
 		}

-		/* Caller should retry if this call fails */
-		end_block = ~0ULL;
-		if (!try_to_free_buffers(folio))
+		/*
+		 * Retrying may succeed; for example the folio may finish
+		 * writeback, or buffers may be cleaned. This should not
+		 * happen very often; maybe we have old buffers attached to
+		 * this blockdev's page cache and we're trying to change
+		 * the block size?
+		 */
+		if (!try_to_free_buffers(folio)) {
+			end_block = ~0ULL;
 			goto unlock;
+		}
 	}

 	bh = folio_alloc_buffers(folio, size, gfp | __GFP_ACCOUNT);
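The effect of moving the assignment inside the 'if' can be checked with a userspace model of the control flow. This is a sketch under stated assumptions, not the kernel function: `struct outcome` and `grow_fixed_model()` are hypothetical names, and the boolean fields stand in for what folio_buffers(), try_to_free_buffers(), folio_alloc_buffers() and folio_init_buffers() would report.

```c
#include <assert.h>	/* for callers that want to assert on the results */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* assumption: 64-bit sector numbers */

/* Hypothetical description of what each kernel helper would report. */
struct outcome {
	bool has_buffers;	/* folio_buffers() returned non-NULL */
	bool size_matches;	/* bh->b_size == size */
	bool free_ok;		/* try_to_free_buffers() succeeded */
	bool alloc_ok;		/* folio_alloc_buffers() succeeded */
	sector_t init_end;	/* value folio_init_buffers() would return */
};

/* Mirrors the follow-up diff: the sentinel is set only when the free fails. */
bool grow_fixed_model(sector_t block, struct outcome o)
{
	sector_t end_block = 0;

	if (o.has_buffers) {
		if (o.size_matches) {
			end_block = o.init_end;
			goto unlock;
		}
		if (!o.free_ok) {
			end_block = ~0ULL;	/* retrying may succeed */
			goto unlock;
		}
	}

	if (!o.alloc_ok)
		goto unlock;	/* end_block is still 0: reported as false */

	end_block = o.init_end;
unlock:
	return block < end_block;
}
```

With this ordering, an allocation failure returns false whether or not old buffers had to be freed first, and only a failed try_to_free_buffers() asks the caller to retry.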
On Sat, Dec 30, 2023 at 6:23 PM Matthew Wilcox wrote:
>
> On Fri, Nov 10, 2023 at 12:43:43PM +0900, Ryusuke Konishi wrote:
> > On Fri, Nov 10, 2023 at 6:07 AM Matthew Wilcox (Oracle) wrote:
> > > +		/* Caller should retry if this call fails */
> > > +		end_block = ~0ULL;
> > >  		if (!try_to_free_buffers(folio))
> > > -			goto failed;
> > > +			goto unlock;
> > >  	}
> > >
> >
> > > -	ret = -ENOMEM;
> > >  	bh = folio_alloc_buffers(folio, size, gfp | __GFP_ACCOUNT);
> > >  	if (!bh)
> > > -		goto failed;
> > > +		goto unlock;
> >
> > Regarding this folio_alloc_buffers() error path,
> > If folio_buffers() was NULL, here end_block is 0, so this function
> > returns false (which means "have a permanent failure").
> >
> > But, if folio_buffers() existed and they were freed with
> > try_to_free_buffers() because of bh->b_size != size, here end_block
> > has been set to ~0ULL, so it seems to return true ("succeeded").
> >
> > Does this semantic change match your intent?
> >
> > Otherwise, I think end_block should be set to 0 just before calling
> > folio_alloc_buffers().
>
> Thanks for the review, and sorry for taking so long to get back to you.
> The change was unintentional (but memory allocation failure with GFP_KERNEL
> happens so rarely that our testing was never going to catch it)
>
> I think I should just move the assignment to end_block inside the 'if'.
> It's just more obvious what's going on.

Agree. I also think this fix is better.

Regards,
Ryusuke Konishi

> Andrew prodded me to be more
> explicit about why memory allocation is a "permanent" failure, but
> failing to free buffers is not.
>
> I'll turn this into a proper patch submission later.
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index d5ce6b29c893..d3bcf601d3e5 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1028,8 +1028,8 @@ static sector_t folio_init_buffers(struct folio *folio,
>   *
>   * This is used purely for blockdev mappings.
>   *
> - * Returns false if we have a 'permanent' failure. Returns true if
> - * we succeeded, or the caller should retry.
> + * Returns false if we have a failure which cannot be cured by retrying
> + * without sleeping. Returns true if we succeeded, or the caller should retry.
>   */
>  static bool grow_dev_folio(struct block_device *bdev, sector_t block,
>  		pgoff_t index, unsigned size, gfp_t gfp)
> @@ -1051,10 +1051,17 @@ static bool grow_dev_folio(struct block_device *bdev, sector_t block,
>  			goto unlock;
>  		}
>
> -		/* Caller should retry if this call fails */
> -		end_block = ~0ULL;
> -		if (!try_to_free_buffers(folio))
> +		/*
> +		 * Retrying may succeed; for example the folio may finish
> +		 * writeback, or buffers may be cleaned. This should not
> +		 * happen very often; maybe we have old buffers attached to
> +		 * this blockdev's page cache and we're trying to change
> +		 * the block size?
> +		 */
> +		if (!try_to_free_buffers(folio)) {
> +			end_block = ~0ULL;
>  			goto unlock;
> +		}
>  	}
>
>  	bh = folio_alloc_buffers(folio, size, gfp | __GFP_ACCOUNT);
diff --git a/fs/buffer.c b/fs/buffer.c
index 967f34b70aa8..8dad6c691e14 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1024,40 +1024,43 @@ static sector_t folio_init_buffers(struct folio *folio,
 }

 /*
- * Create the page-cache page that contains the requested block.
+ * Create the page-cache folio that contains the requested block.
  *
  * This is used purely for blockdev mappings.
+ *
+ * Returns false if we have a 'permanent' failure. Returns true if
+ * we succeeded, or the caller should retry.
  */
-static int
-grow_dev_page(struct block_device *bdev, sector_t block,
-	      pgoff_t index, int size, int sizebits, gfp_t gfp)
+static bool grow_dev_folio(struct block_device *bdev, sector_t block,
+		pgoff_t index, unsigned size, int sizebits, gfp_t gfp)
 {
 	struct inode *inode = bdev->bd_inode;
 	struct folio *folio;
 	struct buffer_head *bh;
-	sector_t end_block;
-	int ret = 0;
+	sector_t end_block = 0;

 	folio = __filemap_get_folio(inode->i_mapping, index,
 			FGP_LOCK | FGP_ACCESSED | FGP_CREAT, gfp);
 	if (IS_ERR(folio))
-		return PTR_ERR(folio);
+		return false;

 	bh = folio_buffers(folio);
 	if (bh) {
 		if (bh->b_size == size) {
 			end_block = folio_init_buffers(folio, bdev,
 					(sector_t)index << sizebits, size);
-			goto done;
+			goto unlock;
 		}
+
+		/* Caller should retry if this call fails */
+		end_block = ~0ULL;
 		if (!try_to_free_buffers(folio))
-			goto failed;
+			goto unlock;
 	}

-	ret = -ENOMEM;
 	bh = folio_alloc_buffers(folio, size, gfp | __GFP_ACCOUNT);
 	if (!bh)
-		goto failed;
+		goto unlock;

 	/*
 	 * Link the folio to the buffers and initialise them. Take the
@@ -1069,20 +1072,19 @@ grow_dev_page(struct block_device *bdev, sector_t block,
 	end_block = folio_init_buffers(folio, bdev,
 			(sector_t)index << sizebits, size);
 	spin_unlock(&inode->i_mapping->private_lock);
-done:
-	ret = (block < end_block) ? 1 : -ENXIO;
-failed:
+unlock:
 	folio_unlock(folio);
 	folio_put(folio);
-	return ret;
+	return block < end_block;
 }

 /*
- * Create buffers for the specified block device block's page. If
- * that page was dirty, the buffers are set dirty also.
+ * Create buffers for the specified block device block's folio. If
+ * that folio was dirty, the buffers are set dirty also. Returns false
+ * if we've hit a permanent error.
  */
-static int
-grow_buffers(struct block_device *bdev, sector_t block, int size, gfp_t gfp)
+static bool grow_buffers(struct block_device *bdev, sector_t block,
+		unsigned size, gfp_t gfp)
 {
 	pgoff_t index;
 	int sizebits;
@@ -1099,11 +1101,11 @@ grow_buffers(struct block_device *bdev, sector_t block, int size, gfp_t gfp)
 			"device %pg\n",
 			__func__, (unsigned long long)block,
 			bdev);
-		return -EIO;
+		return false;
 	}

-	/* Create a page with the proper size buffers.. */
-	return grow_dev_page(bdev, block, index, size, sizebits, gfp);
+	/* Create a folio with the proper size buffers */
+	return grow_dev_folio(bdev, block, index, size, sizebits, gfp);
 }

 static struct buffer_head *
@@ -1124,14 +1126,12 @@ __getblk_slow(struct block_device *bdev, sector_t block,

 	for (;;) {
 		struct buffer_head *bh;
-		int ret;

 		bh = __find_get_block(bdev, block, size);
 		if (bh)
 			return bh;

-		ret = grow_buffers(bdev, block, size, gfp);
-		if (ret < 0)
+		if (!grow_buffers(bdev, block, size, gfp))
 			return NULL;
 	}
 }
Rename grow_dev_page() to grow_dev_folio() and make it return a bool.
Document what that bool means; it's more subtle than it first appears.
Also rename the 'failed' label to 'unlock' because it's not exactly
'failed'. It just hasn't succeeded.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/buffer.c | 50 +++++++++++++++++++++++++-------------------------
 1 file changed, 25 insertions(+), 25 deletions(-)
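What the bool means to the caller is easiest to see from the retry loop in __getblk_slow(). The following is a hedged userspace sketch of that loop only: `getblk_slow_model()` and its scripted inputs are hypothetical, with `grow_result[]` standing in for successive grow_buffers() return values and `found_after` standing in for the point at which __find_get_block() would start finding the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the __getblk_slow() loop.
 * grow_result[i] scripts the i-th return value of grow_buffers();
 * found_after is the number of grow calls after which the lookup succeeds.
 * Returns the number of grow calls made before the buffer was found,
 * or -1 where the kernel loop would return NULL.
 */
int getblk_slow_model(const bool *grow_result, size_t n, size_t found_after)
{
	size_t calls = 0;

	assert(grow_result != NULL || n == 0);
	for (;;) {
		if (calls >= found_after)
			return (int)calls;	/* lookup found the buffer */
		if (calls >= n)
			return -1;		/* script exhausted: the model gives up */
		if (!grow_result[calls++])
			return -1;		/* false: permanent failure */
		/* true: buffers were created, or a retry is worthwhile */
	}
}
```

A true return never ends the loop by itself; the loop only exits when the lookup succeeds or when false is returned, which is why true has to cover both "succeeded" and "retry".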