Message ID: 20190901055130.30572-1-hsiangkao@aol.com (mailing list archive)
Series: erofs: patchset addressing Christoph's comments
On Sun, Sep 01, 2019 at 01:51:09PM +0800, Gao Xiang wrote:
> Hi,
>
> This patchset is based on the following patch by Pratik Shinde,
> https://lore.kernel.org/linux-erofs/20190830095615.10995-1-pratikshinde320@gmail.com/
>
> All patches addressing Christoph's comments on v6, which are trivial,
> most deleted code are from erofs specific fault injection, which was
> followed f2fs and previously discussed in earlier topic [1], but
> let's follow what Christoph's said now.

I like where the cleanups are going.  But this is really just basic
code quality stuff.  We're not addressing the issues with large amounts
of functionality duplicating VFS helpers.
Hi Christoph,

On Mon, Sep 02, 2019 at 05:46:45AM -0700, Christoph Hellwig wrote:
> On Sun, Sep 01, 2019 at 01:51:09PM +0800, Gao Xiang wrote:
> > Hi,
> >
> > This patchset is based on the following patch by Pratik Shinde,
> > https://lore.kernel.org/linux-erofs/20190830095615.10995-1-pratikshinde320@gmail.com/
> >
> > All patches addressing Christoph's comments on v6, which are trivial,
> > most deleted code are from erofs specific fault injection, which was
> > followed f2fs and previously discussed in earlier topic [1], but
> > let's follow what Christoph's said now.
>
> I like where the cleanups are going.  But this is really just basic

For now, I think it will go to Greg's tree. Before that, I will do
patchset v2 to address all remaining comments...

> code quality stuff.  We're not addressing the issues with large amounts
> of functionality duplicating VFS helpers.

Do you mean killing erofs_get_meta_page, or avoiding erofs_read_raw_page?

 - For killing erofs_get_meta_page, here is the current erofs_get_meta_page:

struct page *erofs_get_meta_page(struct super_block *sb, erofs_blk_t blkaddr)
{
	struct inode *const bd_inode = sb->s_bdev->bd_inode;
	struct address_space *const mapping = bd_inode->i_mapping;
	/* prefer retrying in the allocator to blindly looping below */
	const gfp_t gfp = mapping_gfp_constraint(mapping, ~__GFP_FS);
	struct page *page;
	int err;

repeat:
	page = find_or_create_page(mapping, blkaddr, gfp);
	if (!page)
		return ERR_PTR(-ENOMEM);

	DBG_BUGON(!PageLocked(page));

	if (!PageUptodate(page)) {
		struct bio *bio;

		bio = erofs_grab_bio(sb, REQ_OP_READ | REQ_META,
				     blkaddr, 1, sb, erofs_readendio);

		if (bio_add_page(bio, page, PAGE_SIZE, 0) != PAGE_SIZE) {
			err = -EFAULT;
			goto err_out;
		}

		submit_bio(bio);
		lock_page(page);

		/* this page has been truncated by others */
		if (page->mapping != mapping) {
			unlock_page(page);
			put_page(page);
			goto repeat;
		}

		/* more likely a read error */
		if (!PageUptodate(page)) {
			err = -EIO;
			goto err_out;
		}
	}
	return page;

err_out:
	unlock_page(page);
	put_page(page);
	return ERR_PTR(err);
}

I think it is simple enough. read_cache_page needs a similar filler
written for it, and read_cache_page_gfp will call .readpage, which then
introduces buffer_heads; that is what I'd like to avoid for now (no need
for these bd_inode buffer_heads in memory...)

 - For erofs_read_raw_page, it can be avoided after the iomap tail-end
   packing feature is done... If we remove it now, it will make EROFS
   broken. It is not urgent, and Chao will focus on the iomap tail-end
   packing feature.

Thanks,
Gao Xiang
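For context, a hypothetical caller sketch for the helper quoted above: since erofs_get_meta_page() returns a locked, uptodate page via find_or_create_page(), a caller maps it, then unlocks and drops its reference when done (function and variable names here are illustrative, not from the erofs tree):

```c
/* Hypothetical caller sketch; names are illustrative only. */
static int read_meta_example(struct super_block *sb, erofs_blk_t blkaddr)
{
	struct page *page = erofs_get_meta_page(sb, blkaddr);
	void *kaddr;

	if (IS_ERR(page))
		return PTR_ERR(page);

	kaddr = kmap_atomic(page);
	/* ... parse on-disk metadata at kaddr ... */
	kunmap_atomic(kaddr);

	unlock_page(page);	/* the helper returns the page locked */
	put_page(page);
	return 0;
}
```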
On Mon, Sep 02, 2019 at 10:24:52PM +0800, Gao Xiang wrote:
> > code quality stuff.  We're not addressing the issues with large amounts
> > of functionality duplicating VFS helpers.
>
> You means killing erofs_get_meta_page or avoid erofs_read_raw_page?
>
> - For killing erofs_get_meta_page, here is the current erofs_get_meta_page:
>
> I think it is simple enough. read_cache_page need write a similar
> filler, or read_cache_page_gfp will call .readpage, and then
> introduce buffer_heads, that is what I'd like to avoid now (no need these
> bd_inode buffer_heads in memory...)

If using read_cache_page_gfp and ->readpage works, please do.  The
fact that the block device inode uses buffer heads is an implementation
detail that might not last very long and should be invisible to you.
It also means you can get rid of a lot of code that you don't have
to maintain and others don't have to update for global API changes.

> - For erofs_read_raw_page, it can be avoided after iomap tail-end packing
>   feature is done... If we remove it now, it will make EROFS broken.
>   It is no urgent and Chao will focus on iomap tail-end packing feature.

Ok.  I wish we would have just sorted this out beforehand, which we
could have trivially done without all that staging mess.
Hi Christoph,

On Mon, Sep 02, 2019 at 08:23:23AM -0700, Christoph Hellwig wrote:
> On Mon, Sep 02, 2019 at 10:24:52PM +0800, Gao Xiang wrote:
> > > code quality stuff.  We're not addressing the issues with large amounts
> > > of functionality duplicating VFS helpers.
> >
> > You means killing erofs_get_meta_page or avoid erofs_read_raw_page?
> >
> > - For killing erofs_get_meta_page, here is the current erofs_get_meta_page:
> >
> > I think it is simple enough. read_cache_page need write a similar
> > filler, or read_cache_page_gfp will call .readpage, and then
> > introduce buffer_heads, that is what I'd like to avoid now (no need these
> > bd_inode buffer_heads in memory...)
>
> If using read_cache_page_gfp and ->readpage works, please do.  The
> fact that the block device inode uses buffer heads is an implementation
> detail that might not last very long and should be invisible to you.
> It also means you can get rid of a lot of code that you don't have
> to maintain and others don't have to update for global API changes.

I care about those useless buffer_heads in memory for our products...

Since we are a nobh filesystem (a little request: could I use it once
buffer_heads are fully avoided? I have no idea why I need those
buffer_heads in memory... but I do think bd_inode is good for caching
metadata...)

> > - For erofs_read_raw_page, it can be avoided after iomap tail-end packing
> >   feature is done... If we remove it now, it will make EROFS broken.
> >   It is no urgent and Chao will focus on iomap tail-end packing feature.
>
> Ok.  I wish we would have just sorted this out beforehand, which we
> could have trivially done without all that staging mess.

Firstly, I'd like to introduce EROFS as a self-contained filesystem,
bring the new fixed-sized output compression upstream, and promote it...
Then we can do many improvements for EROFS in parallel... (If we
introduced EROFS and touched many core modules like iomap, the mm
readahead code, or the LZ4 code all at once, we would need to be much
more careful... Let's improve it step by step... We are a dedicated
team; if the Linux community needs us, we will still be here... It will
be actively maintained.)

Thanks,
Gao Xiang
On Mon, Sep 02, 2019 at 11:50:38PM +0800, Gao Xiang wrote:
> > > You means killing erofs_get_meta_page or avoid erofs_read_raw_page?
> > >
> > > - For killing erofs_get_meta_page, here is the current erofs_get_meta_page:
> > >
> > > I think it is simple enough. read_cache_page need write a similar
> > > filler, or read_cache_page_gfp will call .readpage, and then
> > > introduce buffer_heads, that is what I'd like to avoid now (no need these
> > > bd_inode buffer_heads in memory...)
> >
> > If using read_cache_page_gfp and ->readpage works, please do.  The
> > fact that the block device inode uses buffer heads is an implementation
> > detail that might not last very long and should be invisible to you.
> > It also means you can get rid of a lot of code that you don't have
> > to maintain and others don't have to update for global API changes.
>
> I care about those useless buffer_heads in memory for our products...
>
> Since we are a nobh filesystem (a little request: could I use it once
> buffer_heads are fully avoided? I have no idea why I need those
> buffer_heads in memory... but I do think bd_inode is good for caching
> metadata...)

Then please use read_cache_page with iomap_readpage(s), and write a
comment explaining why you are not using read_cache_page_gfp.
Hi Christoph,

On Mon, Sep 02, 2019 at 11:58:03PM -0700, Christoph Hellwig wrote:
> On Mon, Sep 02, 2019 at 11:50:38PM +0800, Gao Xiang wrote:
> > I care about those useless buffer_heads in memory for our products...
> >
> > Since we are a nobh filesystem (a little request: could I use it once
> > buffer_heads are fully avoided? I have no idea why I need those
> > buffer_heads in memory... but I do think bd_inode is good for caching
> > metadata...)
>
> Then please use read_cache_page with iomap_readpage(s), and write a
> comment explaining why you are not using read_cache_page_gfp.

I implemented a preliminary version, but I am not sure it really counts
as a cleanup for now.
From 001e3e64c81e4ced0d22b147e6abf90060e704b5 Mon Sep 17 00:00:00 2001
From: Gao Xiang <gaoxiang25@huawei.com>
Date: Tue, 3 Sep 2019 16:13:00 +0800
Subject: [PATCH] erofs: use iomap_readpage for erofs_get_meta_page

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 fs/erofs/Kconfig |  1 +
 fs/erofs/data.c  | 91 ++++++++++++++++++++++++++----------------------
 2 files changed, 51 insertions(+), 41 deletions(-)

diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
index 9d634d3a1845..c9eeb0bf4737 100644
--- a/fs/erofs/Kconfig
+++ b/fs/erofs/Kconfig
@@ -3,6 +3,7 @@
 config EROFS_FS
 	tristate "EROFS filesystem support"
 	depends on BLOCK
+	select FS_IOMAP
 	help
 	  EROFS (Enhanced Read-Only File System) is a lightweight
 	  read-only file system with modern designs (eg. page-sized
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 3881c0689134..34c6e05fab71 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -5,6 +5,9 @@
  * Created by Gao Xiang <gaoxiang25@huawei.com>
  */
 #include "internal.h"
+#include <linux/iomap.h>
+#include <linux/mpage.h>
+#include <linux/sched/mm.h>
 #include <linux/prefetch.h>

 #include <trace/events/erofs.h>
@@ -51,54 +54,60 @@ static struct bio *erofs_grab_raw_bio(struct super_block *sb,
 	return bio;
 }

-struct page *erofs_get_meta_page(struct super_block *sb, erofs_blk_t blkaddr)
+static int erofs_meta_iomap_begin(struct inode *inode, loff_t pos,
+				  loff_t length, unsigned int flags,
+				  struct iomap *iomap)
 {
-	struct inode *const bd_inode = sb->s_bdev->bd_inode;
-	struct address_space *const mapping = bd_inode->i_mapping;
-	/* prefer retrying in the allocator to blindly looping below */
-	const gfp_t gfp = mapping_gfp_constraint(mapping, ~__GFP_FS);
-	struct page *page;
-	int err;
-
-repeat:
-	page = find_or_create_page(mapping, blkaddr, gfp);
-	if (!page)
-		return ERR_PTR(-ENOMEM);
-
-	DBG_BUGON(!PageLocked(page));
-
-	if (!PageUptodate(page)) {
-		struct bio *bio;
+	const unsigned int blkbits = inode->i_blkbits;
+
+	iomap->flags = 0;
+	iomap->bdev = I_BDEV(inode);
+	iomap->offset = round_down(pos, 1 << blkbits);
+	iomap->addr = iomap->offset;
+	iomap->length = round_up(length, 1 << blkbits);
+	iomap->type = IOMAP_MAPPED;
+	return 0;
+}

-		bio = erofs_grab_raw_bio(sb, blkaddr, 1, true);
+static const struct iomap_ops erofs_meta_iomap_ops = {
+	.iomap_begin = erofs_meta_iomap_begin,
+};

-		if (bio_add_page(bio, page, PAGE_SIZE, 0) != PAGE_SIZE) {
-			err = -EFAULT;
-			goto err_out;
-		}
+static int
+erofs_meta_get_block(struct inode *inode, sector_t iblock,
+		     struct buffer_head *bh, int create)
+{
+	bh->b_bdev = I_BDEV(inode);
+	bh->b_blocknr = iblock;
+	set_buffer_mapped(bh);
+	return 0;
+}

-		submit_bio(bio);
-		lock_page(page);
+static int erofs_read_meta_page(void *file, struct page *page)
+{
+	/* in case of getting some pages with buffer_heads */
+	if (i_blocksize(page->mapping->host) == PAGE_SIZE &&
+	    !page_has_buffers(page))
+		return iomap_readpage(page, &erofs_meta_iomap_ops);
+
+	/*
+	 * cannot use blkdev_readpage or blkdev_get_block directly
+	 * since static in block_dev.c
+	 */
+	return mpage_readpage(page, erofs_meta_get_block);
+}

-		/* this page has been truncated by others */
-		if (page->mapping != mapping) {
-			unlock_page(page);
-			put_page(page);
-			goto repeat;
-		}
+struct page *erofs_get_meta_page(struct super_block *sb, erofs_blk_t blkaddr)
+{
+	struct inode *const bd_inode = sb->s_bdev->bd_inode;
+	struct address_space *const mapping = bd_inode->i_mapping;
+	unsigned int nofs_flag;
+	struct page *page;

-		/* more likely a read error */
-		if (!PageUptodate(page)) {
-			err = -EIO;
-			goto err_out;
-		}
-	}
+	nofs_flag = memalloc_nofs_save();
+	page = read_cache_page(mapping, blkaddr, erofs_read_meta_page, NULL);
+	memalloc_nofs_restore(nofs_flag);
 	return page;
-
-err_out:
-	unlock_page(page);
-	put_page(page);
-	return ERR_PTR(err);
 }

 static int erofs_map_blocks_flatmode(struct inode *inode,
On Tue, Sep 03, 2019 at 04:17:49PM +0800, Gao Xiang wrote:
> I implemented a preliminary version, but I am not sure it really counts
> as a cleanup for now.

The fact that this has to guess the block device address_space
implementation is indeed pretty ugly.  I'd much prefer to just use
read_cache_page_gfp, and live with the fact that this allocates
bufferheads behind you for now.  I'll try to speed up my attempts to
get rid of the buffer heads on the block device mapping instead.
Hi Christoph,

On Tue, Sep 03, 2019 at 08:37:04AM -0700, Christoph Hellwig wrote:
> On Tue, Sep 03, 2019 at 04:17:49PM +0800, Gao Xiang wrote:
> > I implemented a preliminary version, but I am not sure it really counts
> > as a cleanup for now.
>
> The fact that this has to guess the block device address_space
> implementation is indeed pretty ugly.  I'd much prefer to just use
> read_cache_page_gfp, and live with the fact that this allocates
> bufferheads behind you for now.  I'll try to speed up my attempts to
> get rid of the buffer heads on the block device mapping instead.

Fully agree with your point. Let me use read_cache_page_gfp instead for
now, hoping the block device mapping will be rid of buffer_heads soon...

Thanks,
Gao Xiang
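For reference, a minimal sketch of what the agreed read_cache_page_gfp-based erofs_get_meta_page could look like; this is only an illustration of the approach discussed above, and the code that eventually lands upstream may differ:

```c
/*
 * Sketch only: erofs_get_meta_page() on top of read_cache_page_gfp(),
 * as agreed in the thread above.  The gfp constraint mirrors the one
 * used with find_or_create_page() in the original version.
 */
struct page *erofs_get_meta_page(struct super_block *sb, erofs_blk_t blkaddr)
{
	struct address_space *const mapping = sb->s_bdev->bd_inode->i_mapping;

	/*
	 * read_cache_page_gfp() invokes mapping->a_ops->readpage for
	 * !Uptodate pages (allocating bufferheads behind the scenes on
	 * the block device mapping, for now) and returns an uptodate
	 * page or an ERR_PTR on I/O failure.
	 */
	return read_cache_page_gfp(mapping, blkaddr,
				   mapping_gfp_constraint(mapping, ~__GFP_FS));
}
```

Note that read_cache_page_gfp() returns an unlocked page, unlike the find_or_create_page()-based version earlier in the thread, so callers that relied on receiving a locked page would need adjusting.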