Message ID | 20190925005214.27240-3-willy@infradead.org (mailing list archive)
---|---
State | New, archived
Series | Large pages in the page-cache
On Tue, Sep 24, 2019 at 05:52:01PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> This helper is useful for both large pages in the page cache and for
> supporting block size larger than page size.  Convert some example
> users (we have a few different ways of writing this idiom).
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

I'm actually working on abstracting this code from both block size
and page size via the helpers below. We have a need to support block
size > page size, and so that requires touching a bunch of the same
code as this patchset. I'm currently trying to combine your last
patch set with my patchset so I can easily test allocating 64k page
cache pages on a 64k block size filesystem on a 4k page size machine
with XFS....

/*
 * Return the chunk size we should use for page cache based operations.
 * This supports both large block sizes and variable page sizes based on the
 * restriction that order-n blocks and page cache pages are order-n file offset
 * aligned.
 *
 * This will return the inode block size for block size < page_size(page),
 * otherwise it will return page_size(page).
 */
static inline unsigned
iomap_chunk_size(struct inode *inode, struct page *page)
{
	return min_t(unsigned, page_size(page), i_blocksize(inode));
}

static inline unsigned
iomap_chunk_bits(struct inode *inode, struct page *page)
{
	return min_t(unsigned, page_shift(page), inode->i_blkbits);
}

static inline unsigned
iomap_chunks_per_page(struct inode *inode, struct page *page)
{
	return page_size(page) >> inode->i_blkbits;
}

Basically, the process is to convert the iomap code over to iterating
"chunks" rather than blocks or pages, and then allocate a struct
iomap_page according to the difference between page and block size....
> ---
>  fs/iomap/buffered-io.c  |  4 ++--
>  fs/jfs/jfs_metapage.c   |  2 +-
>  fs/xfs/xfs_aops.c       |  8 ++++----
>  include/linux/pagemap.h | 13 +++++++++++++
>  4 files changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index e25901ae3ff4..0e76a4b6d98a 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -24,7 +24,7 @@ iomap_page_create(struct inode *inode, struct page *page)
>  {
>  	struct iomap_page *iop = to_iomap_page(page);
>
> -	if (iop || i_blocksize(inode) == PAGE_SIZE)
> +	if (iop || i_blocks_per_page(inode, page) <= 1)
>  		return iop;

That also means checks like these become:

	if (iop || iomap_chunks_per_page(inode, page) <= 1)

as a single file can now have multiple pages per block, a page per
block and multiple blocks per page as the page size changes...

I'd like to only have to make one pass over this code to abstract
out page and block sizes, so I'm guessing we'll need to do some
co-ordination here....

> @@ -636,4 +636,17 @@ static inline unsigned long dir_pages(struct inode *inode)
>  		PAGE_SHIFT;
>  }
>
> +/**
> + * i_blocks_per_page - How many blocks fit in this page.
> + * @inode: The inode which contains the blocks.
> + * @page: The (potentially large) page.
> + *
> + * Context: Any context.
> + * Return: The number of filesystem blocks covered by this page.
> + */
> +static inline
> +unsigned int i_blocks_per_page(struct inode *inode, struct page *page)
> +{
> +	return page_size(page) >> inode->i_blkbits;
> +}
>  #endif /* _LINUX_PAGEMAP_H */

It also means that we largely don't need to touch mm headers as all
the helpers end up being iomap specific and private...

Cheers,

Dave.
On Wed, Sep 25, 2019 at 06:36:50PM +1000, Dave Chinner wrote:
> I'm actually working on abstracting this code from both block size
> and page size via the helpers below. We have a need to support block
> size > page size, and so that requires touching a bunch of the same
> code as this patchset. I'm currently trying to combine your
> last patch set with my patchset so I can easily test allocating 64k
> page cache pages on a 64k block size filesystem on a 4k page size
> machine with XFS....

This all makes sense ...

> > -	if (iop || i_blocksize(inode) == PAGE_SIZE)
> > +	if (iop || i_blocks_per_page(inode, page) <= 1)
> >  		return iop;
>
> That also means checks like these become:
>
> 	if (iop || iomap_chunks_per_page(inode, page) <= 1)
>
> as a single file can now have multiple pages per block, a page per
> block and multiple blocks per page as the page size changes...
>
> I'd like to only have to make one pass over this code to abstract
> out page and block sizes, so I'm guessing we'll need to do some
> co-ordination here....

Yup.  I'm happy if you want to send your patches out; I'll keep going
with the patches I have for the moment, and we'll figure out how to
merge the two series in a way that makes sense.
On Fri, Oct 04, 2019 at 12:28:12PM -0700, Matthew Wilcox wrote:
> On Wed, Sep 25, 2019 at 06:36:50PM +1000, Dave Chinner wrote:
> > I'm actually working on abstracting this code from both block size
> > and page size via the helpers below. We have a need to support block
> > size > page size, and so that requires touching a bunch of the same
> > code as this patchset. I'm currently trying to combine your
> > last patch set with my patchset so I can easily test allocating 64k
> > page cache pages on a 64k block size filesystem on a 4k page size
> > machine with XFS....
>
> This all makes sense ...
>
> > > -	if (iop || i_blocksize(inode) == PAGE_SIZE)
> > > +	if (iop || i_blocks_per_page(inode, page) <= 1)
> > >  		return iop;
> >
> > That also means checks like these become:
> >
> > 	if (iop || iomap_chunks_per_page(inode, page) <= 1)
> >
> > as a single file can now have multiple pages per block, a page per
> > block and multiple blocks per page as the page size changes...
> >
> > I'd like to only have to make one pass over this code to abstract
> > out page and block sizes, so I'm guessing we'll need to do some
> > co-ordination here....
>
> Yup.  I'm happy if you want to send your patches out; I'll keep going
> with the patches I have for the moment, and we'll figure out how to
> merge the two series in a way that makes sense.

I'm waiting for the xfs -> iomap writeback changes to land in a
stable branch so I don't have to do things twice in slightly
different ways in the patchset. Once we get that in an iomap-next
branch I'll rebase my patches on top of it and go from there...

Cheers,

Dave.
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e25901ae3ff4..0e76a4b6d98a 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -24,7 +24,7 @@ iomap_page_create(struct inode *inode, struct page *page)
 {
 	struct iomap_page *iop = to_iomap_page(page);
 
-	if (iop || i_blocksize(inode) == PAGE_SIZE)
+	if (iop || i_blocks_per_page(inode, page) <= 1)
 		return iop;
 
 	iop = kmalloc(sizeof(*iop), GFP_NOFS | __GFP_NOFAIL);
@@ -128,7 +128,7 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 	bool uptodate = true;
 
 	if (iop) {
-		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
+		for (i = 0; i < i_blocks_per_page(inode, page); i++) {
 			if (i >= first && i <= last)
 				set_bit(i, iop->uptodate);
 			else if (!test_bit(i, iop->uptodate))
diff --git a/fs/jfs/jfs_metapage.c b/fs/jfs/jfs_metapage.c
index a2f5338a5ea1..176580f54af9 100644
--- a/fs/jfs/jfs_metapage.c
+++ b/fs/jfs/jfs_metapage.c
@@ -473,7 +473,7 @@ static int metapage_readpage(struct file *fp, struct page *page)
 	struct inode *inode = page->mapping->host;
 	struct bio *bio = NULL;
 	int block_offset;
-	int blocks_per_page = PAGE_SIZE >> inode->i_blkbits;
+	int blocks_per_page = i_blocks_per_page(inode, page);
 	sector_t page_start;	/* address of page in fs blocks */
 	sector_t pblock;
 	int xlen;
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index f16d5f196c6b..102cfd8a97d6 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -68,7 +68,7 @@ xfs_finish_page_writeback(
 		mapping_set_error(inode->i_mapping, -EIO);
 	}
 
-	ASSERT(iop || i_blocksize(inode) == PAGE_SIZE);
+	ASSERT(iop || i_blocks_per_page(inode, bvec->bv_page) <= 1);
 	ASSERT(!iop || atomic_read(&iop->write_count) > 0);
 
 	if (!iop || atomic_dec_and_test(&iop->write_count))
@@ -839,7 +839,7 @@ xfs_aops_discard_page(
 			page, ip->i_ino, offset);
 
 	error = xfs_bmap_punch_delalloc_range(ip, start_fsb,
-			PAGE_SIZE / i_blocksize(inode));
+			i_blocks_per_page(inode, page));
 	if (error && !XFS_FORCED_SHUTDOWN(mp))
 		xfs_alert(mp, "page discard unable to remove delalloc mapping.");
out_invalidate:
@@ -877,7 +877,7 @@ xfs_writepage_map(
 	uint64_t		file_offset;	/* file offset of page */
 	int			error = 0, count = 0, i;
 
-	ASSERT(iop || i_blocksize(inode) == PAGE_SIZE);
+	ASSERT(iop || i_blocks_per_page(inode, page) <= 1);
 	ASSERT(!iop || atomic_read(&iop->write_count) == 0);
 
 	/*
@@ -886,7 +886,7 @@ xfs_writepage_map(
 	 * one.
 	 */
 	for (i = 0, file_offset = page_offset(page);
-	     i < (PAGE_SIZE >> inode->i_blkbits) && file_offset < end_offset;
+	     i < i_blocks_per_page(inode, page) && file_offset < end_offset;
 	     i++, file_offset += len) {
 		if (iop && !test_bit(i, iop->uptodate))
 			continue;
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 37a4d9e32cd3..750770a2c685 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -636,4 +636,17 @@ static inline unsigned long dir_pages(struct inode *inode)
 			       PAGE_SHIFT;
 }
 
+/**
+ * i_blocks_per_page - How many blocks fit in this page.
+ * @inode: The inode which contains the blocks.
+ * @page: The (potentially large) page.
+ *
+ * Context: Any context.
+ * Return: The number of filesystem blocks covered by this page.
+ */
+static inline
+unsigned int i_blocks_per_page(struct inode *inode, struct page *page)
+{
+	return page_size(page) >> inode->i_blkbits;
+}
 #endif /* _LINUX_PAGEMAP_H */