Message ID | 20201014030357.21898-6-willy@infradead.org (mailing list archive) |
---|---|
State | Deferred, archived |
Series | Transparent Huge Page support for XFS |
On Wed, Oct 14, 2020 at 04:03:48AM +0100, Matthew Wilcox (Oracle) wrote:
> If we're punching a hole in a THP, we need to remove the per-page
> iomap data as the THP is about to be split and each page will need
> its own. This means that writepage can now come across a page with
> no iop allocated, so remove the assertion that there is already one,
> and just create one (with the uptodate bits set) if there isn't one.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  fs/iomap/buffered-io.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 95ac66731297..4633ebd03a3f 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -60,6 +60,8 @@ iomap_page_create(struct inode *inode, struct page *page)
>  	iop = kzalloc(struct_size(iop, uptodate, BITS_TO_LONGS(nr_blocks)),
>  			GFP_NOFS | __GFP_NOFAIL);
>  	spin_lock_init(&iop->uptodate_lock);
> +	if (PageUptodate(page))
> +		bitmap_fill(iop->uptodate, nr_blocks);
>  	attach_page_private(page, iop);
>  	return iop;
>  }
> @@ -494,10 +496,14 @@ iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
>  	 * If we are invalidating the entire page, clear the dirty state from it
>  	 * and release it to avoid unnecessary buildup of the LRU.
>  	 */
> -	if (offset == 0 && len == PAGE_SIZE) {
> +	if (offset == 0 && len == thp_size(page)) {
>  		WARN_ON_ONCE(PageWriteback(page));
>  		cancel_dirty_page(page);
>  		iomap_page_release(page);
> +	} else if (PageTransHuge(page)) {
> +		/* Punching a hole in a THP requires releasing the iop */
> +		WARN_ON_ONCE(!PageUptodate(page) && PageDirty(page));
> +		iomap_page_release(page);
>  	}
>  }
>  EXPORT_SYMBOL_GPL(iomap_invalidatepage);
> @@ -1363,14 +1369,13 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  		struct writeback_control *wbc, struct inode *inode,
>  		struct page *page, u64 end_offset)
>  {
> -	struct iomap_page *iop = to_iomap_page(page);
> +	struct iomap_page *iop = iomap_page_create(inode, page);
>  	struct iomap_ioend *ioend, *next;
>  	unsigned len = i_blocksize(inode);
>  	u64 file_offset; /* file offset of page */
>  	int error = 0, count = 0, i;
>  	LIST_HEAD(submit_list);
>
> -	WARN_ON_ONCE(i_blocks_per_page(inode, page) > 1 && !iop);
>  	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);
>
>  	/*
> @@ -1415,7 +1420,6 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  		 */
>  		if (wpc->ops->discard_page)
>  			wpc->ops->discard_page(page);
> -		ClearPageUptodate(page);

Er, I don't get it -- why do we now leave the page up to date after
writeback fails?

--D

>  		unlock_page(page);
>  		goto done;
>  	}
> --
> 2.28.0
>

On Wed, Oct 14, 2020 at 09:33:47AM -0700, Darrick J. Wong wrote:
> > @@ -1415,7 +1420,6 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> >  		 */
> >  		if (wpc->ops->discard_page)
> >  			wpc->ops->discard_page(page);
> > -		ClearPageUptodate(page);
>
> Er, I don't get it -- why do we now leave the page up to date after
> writeback fails?

The page is still uptodate -- every byte in this page is at least as new
as the corresponding bytes on disk.

On Wed, Oct 14, 2020 at 06:26:34PM +0100, Matthew Wilcox wrote:
> On Wed, Oct 14, 2020 at 09:33:47AM -0700, Darrick J. Wong wrote:
> > > @@ -1415,7 +1420,6 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> > >  		 */
> > >  		if (wpc->ops->discard_page)
> > >  			wpc->ops->discard_page(page);
> > > -		ClearPageUptodate(page);
> >
> > Er, I don't get it -- why do we now leave the page up to date after
> > writeback fails?
>
> The page is still uptodate -- every byte in this page is at least as new
> as the corresponding bytes on disk.
>

That seems rather odd if the preceding ->discard_page() turned an
underlying delalloc block into a hole. Technically the original written
data is still in the page, but it's no longer allocated/mapped or dirty
so really no longer in sync with on-disk state. Hm?

Brian
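
To make the disagreement above concrete, here is a rough userspace model of the failed-writeback path touched by the last hunk. It is only a sketch: struct model_state and model_writeback_error() are hypothetical names, not kernel code, and the model assumes what Brian's message describes, namely that ->discard_page() turned the delalloc block into a hole and that the page is no longer dirty by the time the error path finishes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for page and block state; not kernel types. */
struct model_state {
	bool uptodate;	/* page cache contents are considered valid */
	bool dirty;	/* page is still queued for writeback */
	bool mapped;	/* backing block is allocated (delalloc or real) */
};

/*
 * Model of the error path in the last hunk. clear_uptodate selects the
 * old behaviour (true) or the behaviour with this patch applied (false).
 */
static struct model_state model_writeback_error(struct model_state s,
						bool clear_uptodate)
{
	s.mapped = false;	/* assumption: discard_page() left a hole */
	s.dirty = false;	/* assumption: the data will not be retried */
	if (clear_uptodate)
		s.uptodate = false;
	return s;
}
```

In this model the old code ends with a page that is neither uptodate nor mapped, while the patched code ends with an uptodate page over a hole. That second state is the one being questioned in the reply above.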

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 95ac66731297..4633ebd03a3f 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -60,6 +60,8 @@ iomap_page_create(struct inode *inode, struct page *page)
 	iop = kzalloc(struct_size(iop, uptodate, BITS_TO_LONGS(nr_blocks)),
 			GFP_NOFS | __GFP_NOFAIL);
 	spin_lock_init(&iop->uptodate_lock);
+	if (PageUptodate(page))
+		bitmap_fill(iop->uptodate, nr_blocks);
 	attach_page_private(page, iop);
 	return iop;
 }
@@ -494,10 +496,14 @@ iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
 	 * If we are invalidating the entire page, clear the dirty state from it
 	 * and release it to avoid unnecessary buildup of the LRU.
 	 */
-	if (offset == 0 && len == PAGE_SIZE) {
+	if (offset == 0 && len == thp_size(page)) {
 		WARN_ON_ONCE(PageWriteback(page));
 		cancel_dirty_page(page);
 		iomap_page_release(page);
+	} else if (PageTransHuge(page)) {
+		/* Punching a hole in a THP requires releasing the iop */
+		WARN_ON_ONCE(!PageUptodate(page) && PageDirty(page));
+		iomap_page_release(page);
 	}
 }
 EXPORT_SYMBOL_GPL(iomap_invalidatepage);
@@ -1363,14 +1369,13 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct inode *inode,
 		struct page *page, u64 end_offset)
 {
-	struct iomap_page *iop = to_iomap_page(page);
+	struct iomap_page *iop = iomap_page_create(inode, page);
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
 	u64 file_offset; /* file offset of page */
 	int error = 0, count = 0, i;
 	LIST_HEAD(submit_list);

-	WARN_ON_ONCE(i_blocks_per_page(inode, page) > 1 && !iop);
 	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);

 	/*
@@ -1415,7 +1420,6 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		 */
 		if (wpc->ops->discard_page)
 			wpc->ops->discard_page(page);
-		ClearPageUptodate(page);
 		unlock_page(page);
 		goto done;
 	}

If we're punching a hole in a THP, we need to remove the per-page
iomap data as the THP is about to be split and each page will need
its own. This means that writepage can now come across a page with
no iop allocated, so remove the assertion that there is already one,
and just create one (with the uptodate bits set) if there isn't one.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
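
The create-on-demand behaviour described in the commit message can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: struct model_page, struct model_iop, model_iop_create() and model_iop_release() are hypothetical stand-ins, the bitmap is limited to a single 64-bit word, and the early return for single-block pages is an assumption about the part of iomap_page_create() that the diff does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_MAX_BLOCKS 64	/* hypothetical cap: bitmap fits in one word */

struct model_iop {
	unsigned int nr_blocks;
	unsigned long long uptodate;	/* bit i set => block i is uptodate */
};

struct model_page {
	bool uptodate;			/* stand-in for PageUptodate() */
	struct model_iop *private;	/* stand-in for the attached iop */
};

/* Return the existing iop, or allocate one that inherits the page state. */
static struct model_iop *model_iop_create(struct model_page *page,
					  unsigned int nr_blocks)
{
	struct model_iop *iop = page->private;

	/* Reuse an existing iop; single-block pages need none (assumption). */
	if (iop || nr_blocks <= 1 || nr_blocks > MODEL_MAX_BLOCKS)
		return iop;

	iop = calloc(1, sizeof(*iop));
	if (!iop)
		return NULL;
	iop->nr_blocks = nr_blocks;

	/* Mirrors the added hunk: an uptodate page marks every block uptodate. */
	if (page->uptodate)
		iop->uptodate = nr_blocks == MODEL_MAX_BLOCKS ?
				~0ULL : (1ULL << nr_blocks) - 1;

	page->private = iop;
	return iop;
}

/* Stand-in for releasing the iop when a hole is punched in a THP. */
static void model_iop_release(struct model_page *page)
{
	free(page->private);
	page->private = NULL;
}
```

In the model, releasing the iop and then calling model_iop_create() again on an uptodate page yields a fresh iop with all block bits set. That matches the commit message's description of writepage creating an iop "with the uptodate bits set" when none is attached.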