diff mbox

Btrfs: really fix trim 0 bytes after a device delete

Message ID 20644.2675.653732.24816@localhost.localdomain (mailing list archive)
State New, archived
Headers show

Commit Message

Lutz Euler Nov. 14, 2012, 9:17 p.m. UTC
Commit 2cac13e41bf5b99ffc426bd28dfd2248df1dfa67, "fix trim 0 bytes after
a device delete", said:
  A user reported a bug of btrfs's trim, that is we will trim 0 bytes
  after a device delete.
The commit didn't attack the root of the problem so did not fix the bug
except for a special case.

For block discard, btrfs_trim_fs directly compares the range passed in
against the filesystem's objectids. The former is bounded by the sum of
the sizes of the devices of the filesystem, the latter is a completely
unrelated set of intervals of 64-bit integers. The bug reported occurred
as the smallest objectid was larger than the sum of the device sizes.
The previous commit only fixes the case where the smallest objectid is
nonzero and the largest objectid less than the sum of the device sizes,
but it would still trim too little if the largest objectid is larger
than that and nothing in the reported situation.

The current mapping between the given range and the objectids is thus
clearly broken, so, to fix the bug and as a first step towards a
complete solution, simply ignore the range parameter's start and length
fields and always trim the whole filesystem. (While this makes it
impossible to only partly trim a filesystem, due to the broken mapping
this often didn't work anyway.)

Reported-by: Lutz Euler <lutz.euler@freenet.de>
Signed-off-by: Lutz Euler <lutz.euler@freenet.de>
---
 fs/btrfs/extent-tree.c |   55 +++++++++++++++++++++--------------------------
 1 files changed, 25 insertions(+), 30 deletions(-)
diff mbox

Patch

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 3d3e2c1..9c2bb59 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8092,44 +8092,39 @@  int btrfs_trim_fs(struct btrfs_root *root, struct fstrim_range *range)
 	u64 start;
 	u64 end;
 	u64 trimmed = 0;
-	u64 total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
 	int ret = 0;
 
 	/*
-	 * try to trim all FS space, our block group may start from non-zero.
+	 * The range passed in is a subinterval of the interval from 0
+	 * to the sum of the sizes of the devices of the filesystem.
+	 * The objectid's used in the filesystem can span any set of
+	 * subintervals of the interval from 0 to (u64)-1. As there is
+	 * neither a simple nor an agreed upon mapping between these
+	 * two ranges we ignore the range parameter's start and len
+	 * fields and always trim the whole filesystem (that is, only
+	 * the free space in allocated chunks).
 	 */
-	if (range->len == total_bytes)
-		cache = btrfs_lookup_first_block_group(fs_info, range->start);
-	else
-		cache = btrfs_lookup_block_group(fs_info, range->start);
+	cache = btrfs_lookup_first_block_group(fs_info, 0);
 
 	while (cache) {
-		if (cache->key.objectid >= (range->start + range->len)) {
-			btrfs_put_block_group(cache);
-			break;
-		}
-
-		start = max(range->start, cache->key.objectid);
-		end = min(range->start + range->len,
-				cache->key.objectid + cache->key.offset);
+		start = cache->key.objectid;
+		end = cache->key.objectid + cache->key.offset;
 
-		if (end - start >= range->minlen) {
-			if (!block_group_cache_done(cache)) {
-				ret = cache_block_group(cache, NULL, root, 0);
-				if (!ret)
-					wait_block_group_cache_done(cache);
-			}
-			ret = btrfs_trim_block_group(cache,
-						     &group_trimmed,
-						     start,
-						     end,
-						     range->minlen);
+		if (!block_group_cache_done(cache)) {
+			ret = cache_block_group(cache, NULL, root, 0);
+			if (!ret)
+				wait_block_group_cache_done(cache);
+		}
+		ret = btrfs_trim_block_group(cache,
+					     &group_trimmed,
+					     start,
+					     end,
+					     range->minlen);
 
-			trimmed += group_trimmed;
-			if (ret) {
-				btrfs_put_block_group(cache);
-				break;
-			}
+		trimmed += group_trimmed;
+		if (ret) {
+			btrfs_put_block_group(cache);
+			break;
 		}
 
 		cache = next_block_group(fs_info->tree_root, cache);