
btrfs: simplify the reserved space handling inside copy_one_range()

Message ID dd15d8ede1b17f86d2be14390c3927b1633b1a72.1742776906.git.wqu@suse.com (mailing list archive)
State New
Series btrfs: simplify the reserved space handling inside copy_one_range()

Commit Message

Qu Wenruo March 24, 2025, 12:41 a.m. UTC
Inside that function the following variables are all involved in
handling the reserved space:

- block_offset
- reserve_bytes
- dirty_blocks
- num_blocks
- release_bytes

Several of them (block_offset, dirty_blocks, num_blocks) are only used
once or twice as temporary variables.

Furthermore the variable @release_bytes is used for two different
purposes:

- As a temporary variable to release the excess range if a short copy
  happened
  After a short copy, @release_bytes is immediately re-calculated to
  cancel that temporary usage.

- As a variable to record the length that will be released

To remove all those unnecessary variables and fix the inconsistent
variable usage:

- Introduce @reserved_start and @reserved_len variables
  Both are used to track the current reserved range (block aligned).

- Use the above variables to calculate the range which needs to be shrunk
  When a short copy happens, we need to shrink the reserved space
  beyond the last block.

  The range to be released is always [@last_block, @reserved_start +
  @reserved_len), and the new @reserved_len will always be
  @last_block - @reserved_start.
  (@last_block is the exclusive, block-aligned end of the range that
  stays reserved; see the worked example after this list.)

- Remove the five variables we no longer need

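A worked example with hypothetical numbers (4K block size; start at
file offset 4096 with the two blocks [4096, 12288) reserved, so
@reserved_start = 4096 and @reserved_len = 8192):

- Short copy of 100 bytes:
  @last_block = round_up(4096 + 100, 4096) = 8192
  Release [8192, 12288), i.e. 4096 + 8192 - 8192 = 4096 bytes, and the
  new @reserved_len is 8192 - 4096 = 4096.

- Nothing copied (copied == 0):
  @last_block = round_down(4096, 4096) = 4096 = @reserved_start
  Release the whole [4096, 12288) range, and the new @reserved_len is 0.
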
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/file.c | 71 ++++++++++++++++++++++---------------------------
 1 file changed, 32 insertions(+), 39 deletions(-)

Comments

Qu Wenruo March 24, 2025, 5:45 a.m. UTC | #1
On 2025/3/24 11:11, Qu Wenruo wrote:
> [...]

This will need a fix at least.

It works fine on x86_64 but not on aarch64 with subpage cases.

Will get it fixed in the next update.

Thanks,
Qu
Qu Wenruo March 24, 2025, 6:32 a.m. UTC | #2
On 2025/3/24 11:11, Qu Wenruo wrote:
> [...]
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index b72fc00bc2f6..4d83962ec8d6 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1164,15 +1164,13 @@ static int copy_one_range(struct btrfs_inode *inode, struct iov_iter *i,
>   {
>   	struct btrfs_fs_info *fs_info = inode->root->fs_info;
>   	struct extent_state *cached_state = NULL;
> -	const size_t block_offset = start & (fs_info->sectorsize - 1);
> +	const u32 blocksize = fs_info->sectorsize;
>   	size_t write_bytes = min(iov_iter_count(i), PAGE_SIZE - offset_in_page(start));
> -	size_t reserve_bytes;
>   	size_t copied;
> -	size_t dirty_blocks;
> -	size_t num_blocks;
>   	struct folio *folio = NULL;
> -	u64 release_bytes;
>   	int extents_locked;
> +	const u64 reserved_start = round_down(start, blocksize);
> +	u64 reserved_len;
>   	u64 lockstart;
>   	u64 lockend;
>   	bool only_release_metadata = false;
> @@ -1190,23 +1188,22 @@ static int copy_one_range(struct btrfs_inode *inode, struct iov_iter *i,
>   			    &only_release_metadata);
>   	if (ret < 0)
>   		return ret;
> -	reserve_bytes = ret;
> -	release_bytes = reserve_bytes;
> +	reserved_len = ret;
>   
>   again:
>   	ret = balance_dirty_pages_ratelimited_flags(inode->vfs_inode.i_mapping,
>   						    bdp_flags);
>   	if (ret) {
> -		btrfs_delalloc_release_extents(inode, reserve_bytes);
> -		release_space(inode, *data_reserved, start, release_bytes,
> +		btrfs_delalloc_release_extents(inode, reserved_len);
> +		release_space(inode, *data_reserved, reserved_start, reserved_len,
>   			      only_release_metadata);
>   		return ret;
>   	}
>   
>   	ret = prepare_one_folio(&inode->vfs_inode, &folio, start, write_bytes, false);
>   	if (ret) {
> -		btrfs_delalloc_release_extents(inode, reserve_bytes);
> -		release_space(inode, *data_reserved, start, release_bytes,
> +		btrfs_delalloc_release_extents(inode, reserved_len);
> +		release_space(inode, *data_reserved, reserved_start, reserved_len,
>   			      only_release_metadata);
>   		return ret;
>   	}
> @@ -1217,8 +1214,8 @@ static int copy_one_range(struct btrfs_inode *inode, struct iov_iter *i,
>   		if (!nowait && extents_locked == -EAGAIN)
>   			goto again;
>   
> -		btrfs_delalloc_release_extents(inode, reserve_bytes);
> -		release_space(inode, *data_reserved, start, release_bytes,
> +		btrfs_delalloc_release_extents(inode, reserved_len);
> +		release_space(inode, *data_reserved, reserved_start, reserved_len,
>   			      only_release_metadata);
>   		ret = extents_locked;
>   		return ret;
> @@ -1235,36 +1232,32 @@ static int copy_one_range(struct btrfs_inode *inode, struct iov_iter *i,
>   	 * sure they don't happen by forcing retry this copy.
>   	 */
>   	if (unlikely(copied < write_bytes)) {
> +		u64 last_block;
> +		u64 release_len;
> +
>   		if (!folio_test_uptodate(folio)) {
>   			iov_iter_revert(i, copied);
>   			copied = 0;
>   		}
> +
> +		if (copied == 0)
> +			last_block = round_down(start, blocksize);

This changed the release range for the short-copy case.

Originally, for the zero-copied case, we only released from
round_up(start, blocksize) to the end.

The current block was then released later by that "release_bytes =
round_up(copied + block_offset, fs_info->sectorsize);" line.

But now we release the whole range in one go.

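For example (hypothetical numbers, 4K blocks, start = 5000, the two
blocks [4096, 12288) reserved, copied == 0): the old code only released
[8192, 12288) here, leaving the block at [4096, 8192) to the
re-calculated release_bytes, while the new code computes last_block =
round_down(5000, 4096) = 4096 and releases the whole [4096, 12288) at
once, leaving reserved_len at zero.
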
Normally this should be fine, but the later
btrfs_delalloc_release_extents() call itself cannot handle a zero
length properly.


So the fix is pretty simple: just skip the
btrfs_delalloc_release_extents() call if our reserved_len is already
zero.
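
A minimal sketch of such a guard (not the actual follow-up fix), based
on the call already in copy_one_range():

	/* reserved_len may be zero after a fully-failed short copy. */
	if (reserved_len)
		btrfs_delalloc_release_extents(inode, reserved_len);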

Thanks,
Qu

> +		else
> +			last_block = round_up(start + copied, blocksize);
> +		release_len = reserved_start + reserved_len - last_block;
> +
> +		/*
> +		 * Since we got a short copy, release the reserved bytes
> +		 * beyond the last block.
> +		 */
> +		if (only_release_metadata)
> +			btrfs_delalloc_release_metadata(inode, release_len, true);
> +		else
> +			btrfs_delalloc_release_space(inode, *data_reserved,
> +					last_block, release_len, true);
> +		reserved_len = last_block - reserved_start;
>   	}
>   
> -	num_blocks = BTRFS_BYTES_TO_BLKS(fs_info, reserve_bytes);
> -	dirty_blocks = round_up(copied + block_offset, fs_info->sectorsize);
> -	dirty_blocks = BTRFS_BYTES_TO_BLKS(fs_info, dirty_blocks);
> -
> -	if (copied == 0)
> -		dirty_blocks = 0;
> -
> -	if (num_blocks > dirty_blocks) {
> -		/* Release everything except the sectors we dirtied. */
> -		release_bytes -= dirty_blocks << fs_info->sectorsize_bits;
> -		if (only_release_metadata) {
> -			btrfs_delalloc_release_metadata(inode,
> -						release_bytes, true);
> -		} else {
> -			const u64 release_start = round_up(start + copied,
> -							   fs_info->sectorsize);
> -
> -			btrfs_delalloc_release_space(inode,
> -					*data_reserved, release_start,
> -					release_bytes, true);
> -		}
> -	}
> -	release_bytes = round_up(copied + block_offset, fs_info->sectorsize);
> -
>   	ret = btrfs_dirty_folio(inode, folio, start, copied,
>   				&cached_state, only_release_metadata);
>   	/*
> @@ -1280,10 +1273,10 @@ static int copy_one_range(struct btrfs_inode *inode, struct iov_iter *i,
>   	else
>   		free_extent_state(cached_state);
>   
> -	btrfs_delalloc_release_extents(inode, reserve_bytes);
> +	btrfs_delalloc_release_extents(inode, reserved_len);
>   	if (ret) {
>   		btrfs_drop_folio(fs_info, folio, start, copied);
> -		release_space(inode, *data_reserved, start, release_bytes,
> +		release_space(inode, *data_reserved, reserved_start, reserved_len,
>   			      only_release_metadata);
>   		return ret;
>   	}

Patch

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index b72fc00bc2f6..4d83962ec8d6 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1164,15 +1164,13 @@  static int copy_one_range(struct btrfs_inode *inode, struct iov_iter *i,
 {
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	struct extent_state *cached_state = NULL;
-	const size_t block_offset = start & (fs_info->sectorsize - 1);
+	const u32 blocksize = fs_info->sectorsize;
 	size_t write_bytes = min(iov_iter_count(i), PAGE_SIZE - offset_in_page(start));
-	size_t reserve_bytes;
 	size_t copied;
-	size_t dirty_blocks;
-	size_t num_blocks;
 	struct folio *folio = NULL;
-	u64 release_bytes;
 	int extents_locked;
+	const u64 reserved_start = round_down(start, blocksize);
+	u64 reserved_len;
 	u64 lockstart;
 	u64 lockend;
 	bool only_release_metadata = false;
@@ -1190,23 +1188,22 @@  static int copy_one_range(struct btrfs_inode *inode, struct iov_iter *i,
 			    &only_release_metadata);
 	if (ret < 0)
 		return ret;
-	reserve_bytes = ret;
-	release_bytes = reserve_bytes;
+	reserved_len = ret;
 
 again:
 	ret = balance_dirty_pages_ratelimited_flags(inode->vfs_inode.i_mapping,
 						    bdp_flags);
 	if (ret) {
-		btrfs_delalloc_release_extents(inode, reserve_bytes);
-		release_space(inode, *data_reserved, start, release_bytes,
+		btrfs_delalloc_release_extents(inode, reserved_len);
+		release_space(inode, *data_reserved, reserved_start, reserved_len,
 			      only_release_metadata);
 		return ret;
 	}
 
 	ret = prepare_one_folio(&inode->vfs_inode, &folio, start, write_bytes, false);
 	if (ret) {
-		btrfs_delalloc_release_extents(inode, reserve_bytes);
-		release_space(inode, *data_reserved, start, release_bytes,
+		btrfs_delalloc_release_extents(inode, reserved_len);
+		release_space(inode, *data_reserved, reserved_start, reserved_len,
 			      only_release_metadata);
 		return ret;
 	}
@@ -1217,8 +1214,8 @@  static int copy_one_range(struct btrfs_inode *inode, struct iov_iter *i,
 		if (!nowait && extents_locked == -EAGAIN)
 			goto again;
 
-		btrfs_delalloc_release_extents(inode, reserve_bytes);
-		release_space(inode, *data_reserved, start, release_bytes,
+		btrfs_delalloc_release_extents(inode, reserved_len);
+		release_space(inode, *data_reserved, reserved_start, reserved_len,
 			      only_release_metadata);
 		ret = extents_locked;
 		return ret;
@@ -1235,36 +1232,32 @@  static int copy_one_range(struct btrfs_inode *inode, struct iov_iter *i,
 	 * sure they don't happen by forcing retry this copy.
 	 */
 	if (unlikely(copied < write_bytes)) {
+		u64 last_block;
+		u64 release_len;
+
 		if (!folio_test_uptodate(folio)) {
 			iov_iter_revert(i, copied);
 			copied = 0;
 		}
+
+		if (copied == 0)
+			last_block = round_down(start, blocksize);
+		else
+			last_block = round_up(start + copied, blocksize);
+		release_len = reserved_start + reserved_len - last_block;
+
+		/*
+		 * Since we got a short copy, release the reserved bytes
+		 * beyond the last block.
+		 */
+		if (only_release_metadata)
+			btrfs_delalloc_release_metadata(inode, release_len, true);
+		else
+			btrfs_delalloc_release_space(inode, *data_reserved,
+					last_block, release_len, true);
+		reserved_len = last_block - reserved_start;
 	}
 
-	num_blocks = BTRFS_BYTES_TO_BLKS(fs_info, reserve_bytes);
-	dirty_blocks = round_up(copied + block_offset, fs_info->sectorsize);
-	dirty_blocks = BTRFS_BYTES_TO_BLKS(fs_info, dirty_blocks);
-
-	if (copied == 0)
-		dirty_blocks = 0;
-
-	if (num_blocks > dirty_blocks) {
-		/* Release everything except the sectors we dirtied. */
-		release_bytes -= dirty_blocks << fs_info->sectorsize_bits;
-		if (only_release_metadata) {
-			btrfs_delalloc_release_metadata(inode,
-						release_bytes, true);
-		} else {
-			const u64 release_start = round_up(start + copied,
-							   fs_info->sectorsize);
-
-			btrfs_delalloc_release_space(inode,
-					*data_reserved, release_start,
-					release_bytes, true);
-		}
-	}
-	release_bytes = round_up(copied + block_offset, fs_info->sectorsize);
-
 	ret = btrfs_dirty_folio(inode, folio, start, copied,
 				&cached_state, only_release_metadata);
 	/*
@@ -1280,10 +1273,10 @@  static int copy_one_range(struct btrfs_inode *inode, struct iov_iter *i,
 	else
 		free_extent_state(cached_state);
 
-	btrfs_delalloc_release_extents(inode, reserve_bytes);
+	btrfs_delalloc_release_extents(inode, reserved_len);
 	if (ret) {
 		btrfs_drop_folio(fs_info, folio, start, copied);
-		release_space(inode, *data_reserved, start, release_bytes,
+		release_space(inode, *data_reserved, reserved_start, reserved_len,
 			      only_release_metadata);
 		return ret;
 	}