Message ID | 20240902205828.943155-3-luca.stefani.ge1@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: Don't block suspend during fstrim | expand |
在 2024/9/3 06:26, Luca Stefani 写道: > Per Qu Wenruo in case we have a very large disk, e.g. 8TiB device, > mostly empty although we will do the split according to our super block > locations, the last super block ends at 256G, we can submit a huge > discard for the range [256G, 8T), causing a super large delay. > > We now split the space left to discard based the block discard limit > in preparation of introduction of cancellation signals handling. > > Reported-by: Qu Wenruo <wqu@suse.com> Well, I'd say who ever reported the original fstrim and suspension failure should be the reporter, not me. And David's advice is indeed pretty good for the new split according to the discard limit. > Closes: https://lore.kernel.org/lkml/2e15214b-7e95-4e64-a899-725de12c9037@gmail.com/T/#mdfef1d8b36334a15c54cd009f6aadf49e260e105 > Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com> > --- > fs/btrfs/extent-tree.c | 23 +++++++++++++++++------ > 1 file changed, 17 insertions(+), 6 deletions(-) > > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c > index feec49e6f9c8..894684f4f497 100644 > --- a/fs/btrfs/extent-tree.c > +++ b/fs/btrfs/extent-tree.c > @@ -1239,8 +1239,9 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len, > u64 *discarded_bytes) > { > int j, ret = 0; > - u64 bytes_left, end; > + u64 bytes_left, bytes_to_discard, end; You can calculate the discard limit here, no need to recalculate inside the loop. For the other variables only utilized inside the loop, you can always declare them inside the loop. Otherwise looks good to me. Thanks, Qu > u64 aligned_start = ALIGN(start, 1 << SECTOR_SHIFT); > + sector_t sector, nr_sects, bio_sects; > > /* Adjust the range to be aligned to 512B sectors if necessary. */ > if (start != aligned_start) { > @@ -1300,13 +1301,23 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len, > bytes_left = end - start; > } > > - if (bytes_left) { > - ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT, > - bytes_left >> SECTOR_SHIFT, > - GFP_NOFS); > + while (bytes_left) { > + sector = start >> SECTOR_SHIFT; > + nr_sects = bytes_left >> SECTOR_SHIFT; > + bio_sects = min(nr_sects, bio_discard_limit(bdev, sector)); > + bytes_to_discard = bio_sects << SECTOR_SHIFT; > + > + ret = blkdev_issue_discard(bdev, sector, bio_sects, GFP_NOFS); > + > if (!ret) > - *discarded_bytes += bytes_left; > + *discarded_bytes += bytes_to_discard; > + else if (ret != -EOPNOTSUPP) > + return ret; > + > + start += bytes_to_discard; > + bytes_left -= bytes_to_discard; > } > + > return ret; > } >
On Tue, Sep 03, 2024 at 08:01:16AM +0930, Qu Wenruo wrote: > > > 在 2024/9/3 06:26, Luca Stefani 写道: > > Per Qu Wenruo in case we have a very large disk, e.g. 8TiB device, > > mostly empty although we will do the split according to our super block > > locations, the last super block ends at 256G, we can submit a huge > > discard for the range [256G, 8T), causing a super large delay. > > > > We now split the space left to discard based the block discard limit > > in preparation of introduction of cancellation signals handling. > > > > Reported-by: Qu Wenruo <wqu@suse.com> > > Well, I'd say who ever reported the original fstrim and suspension > failure should be the reporter, not me. Once the patch is finalized I'll add the links to reports but yeah if they're added when the patches are submitted it's helpful.
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index feec49e6f9c8..894684f4f497 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1239,8 +1239,9 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len, u64 *discarded_bytes) { int j, ret = 0; - u64 bytes_left, end; + u64 bytes_left, bytes_to_discard, end; u64 aligned_start = ALIGN(start, 1 << SECTOR_SHIFT); + sector_t sector, nr_sects, bio_sects; /* Adjust the range to be aligned to 512B sectors if necessary. */ if (start != aligned_start) { @@ -1300,13 +1301,23 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len, bytes_left = end - start; } - if (bytes_left) { - ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT, - bytes_left >> SECTOR_SHIFT, - GFP_NOFS); + while (bytes_left) { + sector = start >> SECTOR_SHIFT; + nr_sects = bytes_left >> SECTOR_SHIFT; + bio_sects = min(nr_sects, bio_discard_limit(bdev, sector)); + bytes_to_discard = bio_sects << SECTOR_SHIFT; + + ret = blkdev_issue_discard(bdev, sector, bio_sects, GFP_NOFS); + if (!ret) - *discarded_bytes += bytes_left; + *discarded_bytes += bytes_to_discard; + else if (ret != -EOPNOTSUPP) + return ret; + + start += bytes_to_discard; + bytes_left -= bytes_to_discard; } + return ret; }
Per Qu Wenruo in case we have a very large disk, e.g. 8TiB device, mostly empty although we will do the split according to our super block locations, the last super block ends at 256G, we can submit a huge discard for the range [256G, 8T), causing a super large delay. We now split the space left to discard based the block discard limit in preparation of introduction of cancellation signals handling. Reported-by: Qu Wenruo <wqu@suse.com> Closes: https://lore.kernel.org/lkml/2e15214b-7e95-4e64-a899-725de12c9037@gmail.com/T/#mdfef1d8b36334a15c54cd009f6aadf49e260e105 Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com> --- fs/btrfs/extent-tree.c | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-)