diff mbox series

btrfs: qgroup: Fix data leakage caused by falloc and truncate

Message ID 20200716073624.7774-1-wqu@suse.com
State New, archived
Headers show
Series btrfs: qgroup: Fix data leakage caused by falloc and truncate | expand

Commit Message

Qu Wenruo July 16, 2020, 7:36 a.m. UTC
When running tests like generic/013 on test device with btrfs quota
enabled, it can normally lead to data leakage, detected at unmount time:

  BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type 0 rsv 4096
  ------------[ cut here ]------------
  WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142 close_ctree+0x1dc/0x323 [btrfs]
  RIP: 0010:close_ctree+0x1dc/0x323 [btrfs]
  Call Trace:
   btrfs_put_super+0x15/0x17 [btrfs]
   btrfs_kill_super+0x17/0x30 [btrfs]
  ---[ end trace caf08beafeca2392 ]---
  BTRFS error (device dm-3): qgroup reserved space leaked

In the offending case, the offending operations are:
2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0
2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0

The first writev will cause btrfs to reserve space for range
[980K, 980K + 24K).

Then the next truncate should free all the reserved qgroup data space of
that range.

However if memory cleaning writeback happens for the first page of that
extent, then that page can have dirty bit cleaned, thus the
btrfs_qgroup_free_data() call in btrfs_invalidatepage() will skip it as
it's not a dirty page.

Instead of checking the dirty bit of a page, call
btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage().

As qgroup rsv are completely binded to the QGROUP_RESERVED bit of
io_tree, not binded to page status, thus we won't cause double freeing

Fixes: 0b34c261e235 ("btrfs: qgroup: Prevent qgroup->reserved from going subzero")
Signed-off-by: Qu Wenruo <wqu@suse.com>
 fs/btrfs/inode.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)
diff mbox series


diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7c03b402529e..0c9251b500b6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8157,21 +8157,15 @@  static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 	 * Qgroup reserved space handler
 	 * Page here will be either
-	 * 1) Already written to disk
+	 * 1) Already written to disk or ordered extent already submitted
 	 *    In this case, its reserved space is released from data rsv map
 	 *    and will be freed by delayed_ref handler finally.
 	 *    So even we call qgroup_free_data(), it won't decrease reserved
 	 *    space.
 	 * 2) Not written to disk
-	 *    This means the reserved space should be freed here. However,
-	 *    if a truncate invalidates the page (by clearing PageDirty)
-	 *    and the page is accounted for while allocating extent
-	 *    in btrfs_check_data_free_space() we let delayed_ref to
-	 *    free the entire extent.
+	 *    This means the reserved space should be freed here.
-	if (PageDirty(page))
-		btrfs_qgroup_free_data(BTRFS_I(inode), NULL, page_start,
-				       PAGE_SIZE);
+	btrfs_qgroup_free_data(BTRFS_I(inode), NULL, page_start, PAGE_SIZE);
 	if (!inode_evicting) {
 		clear_extent_bit(tree, page_start, page_end, EXTENT_LOCKED |