Message ID | 20230811161528.506437-3-willy@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Series | Add and use bdev_getblk() |
On Fri 11-08-23 17:15:27, Matthew Wilcox (Oracle) wrote:
> grow_dev_page() is only called by grow_buffers(). grow_buffers()
> is only called by __getblk_slow() and __getblk_slow() is only called
> from __getblk_gfp(), so it is safe to move the GFP flags setting
> all the way up. With that done, add a new bdev_getblk() entry point
> that leaves the GFP flags the way the caller specified them.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Can't we just finish this gfp parameter conversion for all the users?
There are five __getblk_gfp() users, three in buffer_head.h directly
generate gfp mask, two (__bread_gfp() and sb_getblk_gfp()) pass it from the
caller. All three __bread_gfp() callers are in buffer_head.h and directly
generate gfp mask. sb_getblk_gfp() has five callers, all in ext4 and easily
convertable as well.

This results not only in cleaner code but also just checking
sb_getblk_gfp() callers shows how confused they currently are about the gfp
argument (passing NOFS, NOFAIL and other pointless flags). Secondly, we can
keep using sb_getblk_gfp() from the filesystems instead of having to decide
between sb_getblk_gfp() and bdev_getblk().

If you don't have time for this, I guess I can find some...

Honza

> ---
>  fs/buffer.c                 | 60 ++++++++++++++++++++++++-------------
>  include/linux/buffer_head.h |  2 ++
>  2 files changed, 41 insertions(+), 21 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 7326acc29541..122b7d16befb 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1048,20 +1048,11 @@ grow_dev_page(struct block_device *bdev, sector_t block,
>  	struct buffer_head *bh;
>  	sector_t end_block;
>  	int ret = 0;
> -	gfp_t gfp_mask;
> -
> -	gfp_mask = mapping_gfp_constraint(inode->i_mapping, ~__GFP_FS) | gfp;
> -
> -	/*
> -	 * XXX: __getblk_slow() can not really deal with failure and
> -	 * will endlessly loop on improvised global reclaim. Prefer
> -	 * looping in the allocator rather than here, at least that
> -	 * code knows what it's doing.
> -	 */
> -	gfp_mask |= __GFP_NOFAIL;
>
>  	folio = __filemap_get_folio(inode->i_mapping, index,
> -			FGP_LOCK | FGP_ACCESSED | FGP_CREAT, gfp_mask);
> +			FGP_LOCK | FGP_ACCESSED | FGP_CREAT, gfp);
> +	if (IS_ERR(folio))
> +		return PTR_ERR(folio);
>
>  	bh = folio_buffers(folio);
>  	if (bh) {
> @@ -1074,7 +1065,9 @@ grow_dev_page(struct block_device *bdev, sector_t block,
>  			goto failed;
>  	}
>
> -	bh = folio_alloc_buffers(folio, size, gfp_mask);
> +	bh = folio_alloc_buffers(folio, size, gfp);
> +	if (!bh)
> +		goto failed;
>
>  	/*
>  	 * Link the folio to the buffers and initialise them. Take the
> @@ -1426,24 +1419,49 @@ __find_get_block(struct block_device *bdev, sector_t block, unsigned size)
>  }
>  EXPORT_SYMBOL(__find_get_block);
>
> +/**
> + * bdev_getblk - Get a buffer_head in a block device's buffer cache.
> + * @bdev: The block device.
> + * @block: The block number.
> + * @size: The size of buffer_heads for this @bdev.
> + * @gfp: The memory allocation flags to use.
> + *
> + * In contrast to __getblk_gfp(), the @gfp flags must be all of the flags;
> + * they are not augmented with the mapping's GFP flags.
> + *
> + * Return: The buffer head, or NULL if memory could not be allocated.
> + */
> +struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block,
> +		unsigned size, gfp_t gfp)
> +{
> +	struct buffer_head *bh = __find_get_block(bdev, block, size);
> +
> +	might_alloc(gfp);
> +	if (bh)
> +		return bh;
> +
> +	return __getblk_slow(bdev, block, size, gfp);
> +}
> +EXPORT_SYMBOL(bdev_getblk);
> +
>  /*
>   * __getblk_gfp() will locate (and, if necessary, create) the buffer_head
>   * which corresponds to the passed block_device, block and size. The
>   * returned buffer has its reference count incremented.
> - *
> - * __getblk_gfp() will lock up the machine if grow_dev_page's
> - * try_to_free_buffers() attempt is failing. FIXME, perhaps?
>   */
>  struct buffer_head *
>  __getblk_gfp(struct block_device *bdev, sector_t block,
>  	     unsigned size, gfp_t gfp)
>  {
> -	struct buffer_head *bh = __find_get_block(bdev, block, size);
> +	gfp |= mapping_gfp_constraint(bdev->bd_inode->i_mapping, ~__GFP_FS);
>
> -	might_sleep();
> -	if (bh == NULL)
> -		bh = __getblk_slow(bdev, block, size, gfp);
> -	return bh;
> +	/*
> +	 * Prefer looping in the allocator rather than here, at least that
> +	 * code knows what it's doing.
> +	 */
> +	gfp |= __GFP_NOFAIL;
> +
> +	return bdev_getblk(bdev, block, size, gfp);
>  }
>  EXPORT_SYMBOL(__getblk_gfp);
>
> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index d17efb8b7976..01110db9213c 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -233,6 +233,8 @@ void __wait_on_buffer(struct buffer_head *);
>  wait_queue_head_t *bh_waitq_head(struct buffer_head *bh);
>  struct buffer_head *__find_get_block(struct block_device *bdev, sector_t block,
>  			unsigned size);
> +struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block,
> +		unsigned size, gfp_t gfp);
>  struct buffer_head *__getblk_gfp(struct block_device *bdev, sector_t block,
>  				  unsigned size, gfp_t gfp);
>  void __brelse(struct buffer_head *);
> --
> 2.40.1
>
On Thu, Sep 14, 2023 at 11:16:25AM +0200, Jan Kara wrote:
> On Fri 11-08-23 17:15:27, Matthew Wilcox (Oracle) wrote:
> > grow_dev_page() is only called by grow_buffers(). grow_buffers()
> > is only called by __getblk_slow() and __getblk_slow() is only called
> > from __getblk_gfp(), so it is safe to move the GFP flags setting
> > all the way up. With that done, add a new bdev_getblk() entry point
> > that leaves the GFP flags the way the caller specified them.
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>
> Can't we just finish this gfp parameter conversion for all the users?
> There are five __getblk_gfp() users, three in buffer_head.h directly
> generate gfp mask, two (__bread_gfp() and sb_getblk_gfp()) pass it from the
> caller. All three __bread_gfp() callers are in buffer_head.h and directly
> generate gfp mask. sb_getblk_gfp() has five callers, all in ext4 and easily
> convertable as well.
>
> This results not only in cleaner code but also just checking
> sb_getblk_gfp() callers shows how confused they currently are about the gfp
> argument (passing NOFS, NOFAIL and other pointless flags). Secondly, we can
> keep using sb_getblk_gfp() from the filesystems instead of having to decide
> between sb_getblk_gfp() and bdev_getblk().

I didn't do __bread_gfp() because it's basically an internal interface.
All users call sb_bread(), sb_bread_unmovable() or __bread(). It doesn't
seem worth doing. Now, if we start to see people actually using
__bread_gfp() outside of those three interfaces, I'd agree we need to make
it use GFP flags properly.

BTW, Andrew has taken the bdev_getblk() series into the mm tree, so
testing that tree might be a good idea for the ext4 developers (and other
filesystems; an earlier revision of this patchset had a bug which would
have only affected nilfs2).
diff --git a/fs/buffer.c b/fs/buffer.c
index 7326acc29541..122b7d16befb 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1048,20 +1048,11 @@ grow_dev_page(struct block_device *bdev, sector_t block,
 	struct buffer_head *bh;
 	sector_t end_block;
 	int ret = 0;
-	gfp_t gfp_mask;
-
-	gfp_mask = mapping_gfp_constraint(inode->i_mapping, ~__GFP_FS) | gfp;
-
-	/*
-	 * XXX: __getblk_slow() can not really deal with failure and
-	 * will endlessly loop on improvised global reclaim. Prefer
-	 * looping in the allocator rather than here, at least that
-	 * code knows what it's doing.
-	 */
-	gfp_mask |= __GFP_NOFAIL;
 
 	folio = __filemap_get_folio(inode->i_mapping, index,
-			FGP_LOCK | FGP_ACCESSED | FGP_CREAT, gfp_mask);
+			FGP_LOCK | FGP_ACCESSED | FGP_CREAT, gfp);
+	if (IS_ERR(folio))
+		return PTR_ERR(folio);
 
 	bh = folio_buffers(folio);
 	if (bh) {
@@ -1074,7 +1065,9 @@ grow_dev_page(struct block_device *bdev, sector_t block,
 			goto failed;
 	}
 
-	bh = folio_alloc_buffers(folio, size, gfp_mask);
+	bh = folio_alloc_buffers(folio, size, gfp);
+	if (!bh)
+		goto failed;
 
 	/*
 	 * Link the folio to the buffers and initialise them. Take the
@@ -1426,24 +1419,49 @@ __find_get_block(struct block_device *bdev, sector_t block, unsigned size)
 }
 EXPORT_SYMBOL(__find_get_block);
 
+/**
+ * bdev_getblk - Get a buffer_head in a block device's buffer cache.
+ * @bdev: The block device.
+ * @block: The block number.
+ * @size: The size of buffer_heads for this @bdev.
+ * @gfp: The memory allocation flags to use.
+ *
+ * In contrast to __getblk_gfp(), the @gfp flags must be all of the flags;
+ * they are not augmented with the mapping's GFP flags.
+ *
+ * Return: The buffer head, or NULL if memory could not be allocated.
+ */
+struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block,
+		unsigned size, gfp_t gfp)
+{
+	struct buffer_head *bh = __find_get_block(bdev, block, size);
+
+	might_alloc(gfp);
+	if (bh)
+		return bh;
+
+	return __getblk_slow(bdev, block, size, gfp);
+}
+EXPORT_SYMBOL(bdev_getblk);
+
 /*
  * __getblk_gfp() will locate (and, if necessary, create) the buffer_head
  * which corresponds to the passed block_device, block and size. The
  * returned buffer has its reference count incremented.
- *
- * __getblk_gfp() will lock up the machine if grow_dev_page's
- * try_to_free_buffers() attempt is failing. FIXME, perhaps?
  */
 struct buffer_head *
 __getblk_gfp(struct block_device *bdev, sector_t block,
 	     unsigned size, gfp_t gfp)
 {
-	struct buffer_head *bh = __find_get_block(bdev, block, size);
+	gfp |= mapping_gfp_constraint(bdev->bd_inode->i_mapping, ~__GFP_FS);
 
-	might_sleep();
-	if (bh == NULL)
-		bh = __getblk_slow(bdev, block, size, gfp);
-	return bh;
+	/*
+	 * Prefer looping in the allocator rather than here, at least that
+	 * code knows what it's doing.
+	 */
+	gfp |= __GFP_NOFAIL;
+
+	return bdev_getblk(bdev, block, size, gfp);
 }
 EXPORT_SYMBOL(__getblk_gfp);
 
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index d17efb8b7976..01110db9213c 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -233,6 +233,8 @@ void __wait_on_buffer(struct buffer_head *);
 wait_queue_head_t *bh_waitq_head(struct buffer_head *bh);
 struct buffer_head *__find_get_block(struct block_device *bdev, sector_t block,
 			unsigned size);
+struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block,
+		unsigned size, gfp_t gfp);
 struct buffer_head *__getblk_gfp(struct block_device *bdev, sector_t block,
 				  unsigned size, gfp_t gfp);
 void __brelse(struct buffer_head *);
grow_dev_page() is only called by grow_buffers(). grow_buffers()
is only called by __getblk_slow() and __getblk_slow() is only called
from __getblk_gfp(), so it is safe to move the GFP flags setting
all the way up. With that done, add a new bdev_getblk() entry point
that leaves the GFP flags the way the caller specified them.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/buffer.c                 | 60 ++++++++++++++++++++++++-------------
 include/linux/buffer_head.h |  2 ++
 2 files changed, 41 insertions(+), 21 deletions(-)
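
For readers who want to see the flag handling in isolation, here is a rough user-space model of what the commit message describes. It is not kernel code: the `DEMO_*` bit values and every `demo_*` function are hypothetical stand-ins invented for this sketch, and the mapping's GFP mask is passed in as a plain argument instead of being read from an address_space. The only thing it tries to show is that, after the patch, the __getblk_gfp() path ORs in the mapping's mask (with the FS bit filtered out) plus NOFAIL, while the bdev_getblk() path uses the caller's flags unchanged.

```c
#include <assert.h>

/* Hypothetical flag bits for illustration only; NOT the kernel's values. */
typedef unsigned int gfp_t;
#define DEMO_GFP_IO      0x01u
#define DEMO_GFP_FS      0x02u
#define DEMO_GFP_MOVABLE 0x08u
#define DEMO_GFP_NOFAIL  0x10u

/* Stand-in for mapping_gfp_constraint(): mapping mask limited to 'allowed'. */
static gfp_t demo_mapping_gfp_constraint(gfp_t mapping_mask, gfp_t allowed)
{
	return mapping_mask & allowed;
}

/* Models the bdev_getblk() path: the caller's flags are used as given. */
static gfp_t demo_bdev_getblk_flags(gfp_t gfp)
{
	return gfp;
}

/*
 * Models the __getblk_gfp() path after the patch: augment the caller's
 * flags with the mapping's mask (minus the FS bit) and NOFAIL, then hand
 * the result to the bdev_getblk() path.
 */
static gfp_t demo_getblk_gfp_flags(gfp_t mapping_mask, gfp_t gfp)
{
	gfp |= demo_mapping_gfp_constraint(mapping_mask, ~DEMO_GFP_FS);
	gfp |= DEMO_GFP_NOFAIL;
	return demo_bdev_getblk_flags(gfp);
}

/* Small self-check; call it from a main() of your own to exercise the model. */
static void demo_self_check(void)
{
	gfp_t mapping = DEMO_GFP_IO | DEMO_GFP_FS | DEMO_GFP_MOVABLE;

	/* New entry point: nothing is added behind the caller's back. */
	assert(demo_bdev_getblk_flags(DEMO_GFP_IO) == DEMO_GFP_IO);

	/* Old entry point: mapping bits (without FS) and NOFAIL are ORed in. */
	assert(demo_getblk_gfp_flags(mapping, DEMO_GFP_IO) ==
	       (DEMO_GFP_IO | DEMO_GFP_MOVABLE | DEMO_GFP_NOFAIL));

	/* A caller passing NOFAIL itself changes nothing on the old path. */
	assert(demo_getblk_gfp_flags(mapping, DEMO_GFP_IO | DEMO_GFP_NOFAIL) ==
	       demo_getblk_gfp_flags(mapping, DEMO_GFP_IO));
}
```

The last assertion mirrors the review comment about callers passing NOFAIL: on the augmenting path that bit is set regardless, so supplying it is redundant in this model.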