[v2,2/3] btrfs: Split remaining space to discard in chunks

Message ID	20240902205828.943155-3-luca.stefani.ge1@gmail.com (mailing list archive)
State	New, archived
Headers	show Received: from mail-wm1-f44.google.com (mail-wm1-f44.google.com [209.85.128.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 582AF21019F; Mon, 2 Sep 2024 21:00:03 +0000 (UTC) From: Luca Stefani <luca.stefani.ge1@gmail.com> To: Cc: Luca Stefani <luca.stefani.ge1@gmail.com>, Qu Wenruo <wqu@suse.com>, Jens Axboe <axboe@kernel.dk>, Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>, David Sterba <dsterba@suse.com>, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org Subject: [PATCH v2 2/3] btrfs: Split remaining space to discard in chunks Date: Mon, 2 Sep 2024 22:56:11 +0200 Message-ID: <20240902205828.943155-3-luca.stefani.ge1@gmail.com> In-Reply-To: <20240902205828.943155-1-luca.stefani.ge1@gmail.com> References: <20240902205828.943155-1-luca.stefani.ge1@gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	btrfs: Don't block suspend during fstrim \| expand [v2,0/3] btrfs: Don't block suspend during fstrim [v2,1/3] block: Export bio_discard_limit [v2,2/3] btrfs: Split remaining space to discard in chunks [v2,3/3] btrfs: Don't block system suspend during fstrim

Message ID

20240902205828.943155-3-luca.stefani.ge1@gmail.com (mailing list archive)

State

New, archived

Headers

From: Luca Stefani <luca.stefani.ge1@gmail.com>
To: 
Cc: Luca Stefani <luca.stefani.ge1@gmail.com>,
	Qu Wenruo <wqu@suse.com>,
	Jens Axboe <axboe@kernel.dk>,
	Chris Mason <clm@fb.com>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>,
	linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org
Subject: [PATCH v2 2/3] btrfs: Split remaining space to discard in chunks
Date: Mon,  2 Sep 2024 22:56:11 +0200
Message-ID: <20240902205828.943155-3-luca.stefani.ge1@gmail.com>
In-Reply-To: <20240902205828.943155-1-luca.stefani.ge1@gmail.com>
References: <20240902205828.943155-1-luca.stefani.ge1@gmail.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

btrfs: Don't block suspend during fstrim | expand

Commit Message

Luca Stefani Sept. 2, 2024, 8:56 p.m. UTC

Per Qu Wenruo in case we have a very large disk, e.g. 8TiB device,
mostly empty although we will do the split according to our super block
locations, the last super block ends at 256G, we can submit a huge
discard for the range [256G, 8T), causing a super large delay.

We now split the space left to discard based the block discard limit
in preparation of introduction of cancellation signals handling.

Reported-by: Qu Wenruo <wqu@suse.com>
Closes: https://lore.kernel.org/lkml/2e15214b-7e95-4e64-a899-725de12c9037@gmail.com/T/#mdfef1d8b36334a15c54cd009f6aadf49e260e105
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
---
 fs/btrfs/extent-tree.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

Comments

Qu Wenruo Sept. 2, 2024, 10:31 p.m. UTC | #1

在 2024/9/3 06:26, Luca Stefani 写道:
> Per Qu Wenruo in case we have a very large disk, e.g. 8TiB device,
> mostly empty although we will do the split according to our super block
> locations, the last super block ends at 256G, we can submit a huge
> discard for the range [256G, 8T), causing a super large delay.
> 
> We now split the space left to discard based the block discard limit
> in preparation of introduction of cancellation signals handling.
> 
> Reported-by: Qu Wenruo <wqu@suse.com>

Well, I'd say who ever reported the original fstrim and suspension 
failure should be the reporter, not me.

And David's advice is indeed pretty good for the new split according to 
the discard limit.

> Closes: https://lore.kernel.org/lkml/2e15214b-7e95-4e64-a899-725de12c9037@gmail.com/T/#mdfef1d8b36334a15c54cd009f6aadf49e260e105
> Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
> ---
>   fs/btrfs/extent-tree.c | 23 +++++++++++++++++------
>   1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index feec49e6f9c8..894684f4f497 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -1239,8 +1239,9 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>   			       u64 *discarded_bytes)
>   {
>   	int j, ret = 0;
> -	u64 bytes_left, end;
> +	u64 bytes_left, bytes_to_discard, end;

You can calculate the discard limit here, no need to recalculate inside 
the loop.

For the other variables only utilized inside the loop, you can always 
declare them inside the loop.

Otherwise looks good to me.

Thanks,
Qu
>   	u64 aligned_start = ALIGN(start, 1 << SECTOR_SHIFT);
> +	sector_t sector, nr_sects, bio_sects;
>   
>   	/* Adjust the range to be aligned to 512B sectors if necessary. */
>   	if (start != aligned_start) {
> @@ -1300,13 +1301,23 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>   		bytes_left = end - start;
>   	}
>   
> -	if (bytes_left) {
> -		ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
> -					   bytes_left >> SECTOR_SHIFT,
> -					   GFP_NOFS);
> +	while (bytes_left) {
> +		sector = start >> SECTOR_SHIFT;
> +		nr_sects = bytes_left >> SECTOR_SHIFT;
> +		bio_sects = min(nr_sects, bio_discard_limit(bdev, sector));
> +		bytes_to_discard = bio_sects << SECTOR_SHIFT;
> +
> +		ret = blkdev_issue_discard(bdev, sector, bio_sects, GFP_NOFS);
> +
>   		if (!ret)
> -			*discarded_bytes += bytes_left;
> +			*discarded_bytes += bytes_to_discard;
> +		else if (ret != -EOPNOTSUPP)
> +			return ret;
> +
> +		start += bytes_to_discard;
> +		bytes_left -= bytes_to_discard;
>   	}
> +
>   	return ret;
>   }
>

David Sterba Sept. 2, 2024, 11:41 p.m. UTC | #2

On Tue, Sep 03, 2024 at 08:01:16AM +0930, Qu Wenruo wrote:
> 
> 
> 在 2024/9/3 06:26, Luca Stefani 写道:
> > Per Qu Wenruo in case we have a very large disk, e.g. 8TiB device,
> > mostly empty although we will do the split according to our super block
> > locations, the last super block ends at 256G, we can submit a huge
> > discard for the range [256G, 8T), causing a super large delay.
> > 
> > We now split the space left to discard based the block discard limit
> > in preparation of introduction of cancellation signals handling.
> > 
> > Reported-by: Qu Wenruo <wqu@suse.com>
> 
> Well, I'd say who ever reported the original fstrim and suspension 
> failure should be the reporter, not me.

Once the patch is finalized I'll add the links to reports but yeah if
they're added when the patches are submitted it's helpful.

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index feec49e6f9c8..894684f4f497 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1239,8 +1239,9 @@  static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
 			       u64 *discarded_bytes)
 {
 	int j, ret = 0;
-	u64 bytes_left, end;
+	u64 bytes_left, bytes_to_discard, end;
 	u64 aligned_start = ALIGN(start, 1 << SECTOR_SHIFT);
+	sector_t sector, nr_sects, bio_sects;
 
 	/* Adjust the range to be aligned to 512B sectors if necessary. */
 	if (start != aligned_start) {
@@ -1300,13 +1301,23 @@  static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
 		bytes_left = end - start;
 	}
 
-	if (bytes_left) {
-		ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
-					   bytes_left >> SECTOR_SHIFT,
-					   GFP_NOFS);
+	while (bytes_left) {
+		sector = start >> SECTOR_SHIFT;
+		nr_sects = bytes_left >> SECTOR_SHIFT;
+		bio_sects = min(nr_sects, bio_discard_limit(bdev, sector));
+		bytes_to_discard = bio_sects << SECTOR_SHIFT;
+
+		ret = blkdev_issue_discard(bdev, sector, bio_sects, GFP_NOFS);
+
 		if (!ret)
-			*discarded_bytes += bytes_left;
+			*discarded_bytes += bytes_to_discard;
+		else if (ret != -EOPNOTSUPP)
+			return ret;
+
+		start += bytes_to_discard;
+		bytes_left -= bytes_to_discard;
 	}
+
 	return ret;
 }

[v2,2/3] btrfs: Split remaining space to discard in chunks

Commit Message

Comments

Patch