Message ID | 070b1d025ef6eb292638bb97683cd5c35ffe42eb.1721775142.git.boris@bur.io (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: fixes for buffered write qgroup rsv leaks | expand |
在 2024/7/24 08:25, Boris Burkov 写道: > In the buffered write path, dirty pages can be said to "own" the qgroup > reservation until they create an ordered_extent. It is possible for > there to be outstanding dirty pages when a transaction is aborted, in > which case there is no cancellation path for freeing this reservation > and it is leaked. > > We do already walk the list of outstanding delalloc inodes in > btrfs_destroy_delalloc_inodes and call invalidate_inode_pages2 on them. > > This does *not* call btrfs_invalidate_folio, as one might guess, but > rather calls launder_folio and release_folio. Since this is a > reservation associated with dirty pages only, rather than something > associated with the private bit (ordered_extent is cancelled separately > already in the cleanup txn path), implementing this release should be > done via launder_folio. Well, the launder_folio() is completely new to me. But indeed invalidate_inode_pages2_range() calls folio_launder() then invalidate_complete_folio2() -> filemap_release_folio() -> btrfs_release_folio(). Thus we lack the handling for such cases. > > Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: Qu Wenruo <wqu@suse.com> Thanks, Qu > --- > fs/btrfs/inode.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index 8f38eefc8acd..c5155981f99a 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -7198,6 +7198,11 @@ static void wait_subpage_spinlock(struct page *page) > spin_unlock_irq(&subpage->lock); > } > > +static int btrfs_launder_folio(struct folio *folio) > +{ > + return btrfs_qgroup_free_data(folio_to_inode(folio), NULL, folio_pos(folio), PAGE_SIZE, NULL); > +} > + > static bool __btrfs_release_folio(struct folio *folio, gfp_t gfp_flags) > { > if (try_release_extent_mapping(&folio->page, gfp_flags)) { > @@ -10133,6 +10138,7 @@ static const struct address_space_operations btrfs_aops = { > .writepages = btrfs_writepages, > .readahead = btrfs_readahead, > .invalidate_folio = btrfs_invalidate_folio, > + .launder_folio = btrfs_launder_folio, > .release_folio = btrfs_release_folio, > .migrate_folio = btrfs_migrate_folio, > .dirty_folio = filemap_dirty_folio,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 8f38eefc8acd..c5155981f99a 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7198,6 +7198,11 @@ static void wait_subpage_spinlock(struct page *page) spin_unlock_irq(&subpage->lock); } +static int btrfs_launder_folio(struct folio *folio) +{ + return btrfs_qgroup_free_data(folio_to_inode(folio), NULL, folio_pos(folio), PAGE_SIZE, NULL); +} + static bool __btrfs_release_folio(struct folio *folio, gfp_t gfp_flags) { if (try_release_extent_mapping(&folio->page, gfp_flags)) { @@ -10133,6 +10138,7 @@ static const struct address_space_operations btrfs_aops = { .writepages = btrfs_writepages, .readahead = btrfs_readahead, .invalidate_folio = btrfs_invalidate_folio, + .launder_folio = btrfs_launder_folio, .release_folio = btrfs_release_folio, .migrate_folio = btrfs_migrate_folio, .dirty_folio = filemap_dirty_folio,
In the buffered write path, dirty pages can be said to "own" the qgroup reservation until they create an ordered_extent. It is possible for there to be outstanding dirty pages when a transaction is aborted, in which case there is no cancellation path for freeing this reservation and it is leaked. We do already walk the list of outstanding delalloc inodes in btrfs_destroy_delalloc_inodes and call invalidate_inode_pages2 on them. This does *not* call btrfs_invalidate_folio, as one might guess, but rather calls launder_folio and release_folio. Since this is a reservation associated with dirty pages only, rather than something associated with the private bit (ordered_extent is cancelled separately already in the cleanup txn path), implementing this release should be done via launder_folio. Signed-off-by: Boris Burkov <boris@bur.io> --- fs/btrfs/inode.c | 6 ++++++ 1 file changed, 6 insertions(+)