diff mbox series

[v3,06/12] ext4: update delalloc data reserve spcae in ext4_es_insert_extent()

Message ID 20240813123452.2824659-7-yi.zhang@huaweicloud.com (mailing list archive)
State New
Headers show
Series ext4: simplify the counting and management of delalloc reserved blocks | expand

Commit Message

Zhang Yi Aug. 13, 2024, 12:34 p.m. UTC
From: Zhang Yi <yi.zhang@huawei.com>

Now that we update data reserved space for delalloc after allocating
new blocks in ext4_{ind|ext}_map_blocks(), and if bigalloc feature is
enabled, we also need to query the extents_status tree to calculate the
exact reserved clusters. This is complicated now and it appears that
it's better to do this job in ext4_es_insert_extent(), because
__es_remove_extent() have already count delalloc blocks when removing
delalloc extents and __revise_pending() return new adding pending count,
we could update the reserved blocks easily in ext4_es_insert_extent().

We direct reduce the reserved cluster count when replacing a delalloc
extent. However, thers are two special cases need to concern about the
quota claiming when doing direct block allocation (e.g. from fallocate).

A),
fallocate a range that covers a delalloc extent but start with
non-delayed allocated blocks, e.g. a hole.

  hhhhhhh+ddddddd+ddddddd
  ^^^^^^^^^^^^^^^^^^^^^^^  fallocate this range

Current ext4_map_blocks() can't always trim the extent since it may
release i_data_sem before calling ext4_map_create_blocks() and raced by
another delayed allocation. Hence the EXT4_GET_BLOCKS_DELALLOC_RESERVE
may not set even when we are replacing a delalloc extent, without this
flag set, the quota has already been claimed by ext4_mb_new_blocks(), so
we should release the quota reservations instead of claim them again.

B),
bigalloc feature is enabled, fallocate a range that contains non-delayed
allocated blocks.

  |<         one cluster       >|
  hhhhhhh+hhhhhhh+hhhhhhh+ddddddd
  ^^^^^^^  fallocate this range

This case is similar to above case, the EXT4_GET_BLOCKS_DELALLOC_RESERVE
flag is also not set.

Hence we should release the quota reservations if we replace a delalloc
extent but without EXT4_GET_BLOCKS_DELALLOC_RESERVE set.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/extents.c        | 37 -------------------------------------
 fs/ext4/extents_status.c | 25 ++++++++++++++++++++++++-
 fs/ext4/indirect.c       |  7 -------
 3 files changed, 24 insertions(+), 45 deletions(-)

Comments

Jan Kara Sept. 4, 2024, 10:22 a.m. UTC | #1
On Tue 13-08-24 20:34:46, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> Now that we update data reserved space for delalloc after allocating
> new blocks in ext4_{ind|ext}_map_blocks(), and if bigalloc feature is
> enabled, we also need to query the extents_status tree to calculate the
> exact reserved clusters. This is complicated now and it appears that
> it's better to do this job in ext4_es_insert_extent(), because
> __es_remove_extent() have already count delalloc blocks when removing
> delalloc extents and __revise_pending() return new adding pending count,
> we could update the reserved blocks easily in ext4_es_insert_extent().
> 
> We direct reduce the reserved cluster count when replacing a delalloc
> extent. However, thers are two special cases need to concern about the
> quota claiming when doing direct block allocation (e.g. from fallocate).
> 
> A),
> fallocate a range that covers a delalloc extent but start with
> non-delayed allocated blocks, e.g. a hole.
> 
>   hhhhhhh+ddddddd+ddddddd
>   ^^^^^^^^^^^^^^^^^^^^^^^  fallocate this range
> 
> Current ext4_map_blocks() can't always trim the extent since it may
> release i_data_sem before calling ext4_map_create_blocks() and raced by
> another delayed allocation. Hence the EXT4_GET_BLOCKS_DELALLOC_RESERVE
> may not set even when we are replacing a delalloc extent, without this
> flag set, the quota has already been claimed by ext4_mb_new_blocks(), so
> we should release the quota reservations instead of claim them again.
> 
> B),
> bigalloc feature is enabled, fallocate a range that contains non-delayed
> allocated blocks.
> 
>   |<         one cluster       >|
>   hhhhhhh+hhhhhhh+hhhhhhh+ddddddd
>   ^^^^^^^  fallocate this range
> 
> This case is similar to above case, the EXT4_GET_BLOCKS_DELALLOC_RESERVE
> flag is also not set.
> 
> Hence we should release the quota reservations if we replace a delalloc
> extent but without EXT4_GET_BLOCKS_DELALLOC_RESERVE set.
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

I really like how simple this ended being! Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/extents.c        | 37 -------------------------------------
>  fs/ext4/extents_status.c | 25 ++++++++++++++++++++++++-
>  fs/ext4/indirect.c       |  7 -------
>  3 files changed, 24 insertions(+), 45 deletions(-)
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 671dacd7c873..db8f9d79477c 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4357,43 +4357,6 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
>  		goto out;
>  	}
>  
> -	/*
> -	 * Reduce the reserved cluster count to reflect successful deferred
> -	 * allocation of delayed allocated clusters or direct allocation of
> -	 * clusters discovered to be delayed allocated.  Once allocated, a
> -	 * cluster is not included in the reserved count.
> -	 */
> -	if (test_opt(inode->i_sb, DELALLOC) && allocated_clusters) {
> -		if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) {
> -			/*
> -			 * When allocating delayed allocated clusters, simply
> -			 * reduce the reserved cluster count and claim quota
> -			 */
> -			ext4_da_update_reserve_space(inode, allocated_clusters,
> -							1);
> -		} else {
> -			ext4_lblk_t lblk, len;
> -			unsigned int n;
> -
> -			/*
> -			 * When allocating non-delayed allocated clusters
> -			 * (from fallocate, filemap, DIO, or clusters
> -			 * allocated when delalloc has been disabled by
> -			 * ext4_nonda_switch), reduce the reserved cluster
> -			 * count by the number of allocated clusters that
> -			 * have previously been delayed allocated.  Quota
> -			 * has been claimed by ext4_mb_new_blocks() above,
> -			 * so release the quota reservations made for any
> -			 * previously delayed allocated clusters.
> -			 */
> -			lblk = EXT4_LBLK_CMASK(sbi, map->m_lblk);
> -			len = allocated_clusters << sbi->s_cluster_bits;
> -			n = ext4_es_delayed_clu(inode, lblk, len);
> -			if (n > 0)
> -				ext4_da_update_reserve_space(inode, (int) n, 0);
> -		}
> -	}
> -
>  	/*
>  	 * Cache the extent and update transaction to commit on fdatasync only
>  	 * when it is _not_ an unwritten extent.
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index 0580bc4bc762..41adf0d69959 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -853,6 +853,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
>  	struct extent_status newes;
>  	ext4_lblk_t end = lblk + len - 1;
>  	int err1 = 0, err2 = 0, err3 = 0;
> +	int resv_used = 0, pending = 0;
>  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>  	struct extent_status *es1 = NULL;
>  	struct extent_status *es2 = NULL;
> @@ -891,7 +892,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
>  		pr = __alloc_pending(true);
>  	write_lock(&EXT4_I(inode)->i_es_lock);
>  
> -	err1 = __es_remove_extent(inode, lblk, end, NULL, es1);
> +	err1 = __es_remove_extent(inode, lblk, end, &resv_used, es1);
>  	if (err1 != 0)
>  		goto error;
>  	/* Free preallocated extent if it didn't get used. */
> @@ -921,9 +922,31 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
>  			__free_pending(pr);
>  			pr = NULL;
>  		}
> +		pending = err3;
>  	}
>  error:
>  	write_unlock(&EXT4_I(inode)->i_es_lock);
> +	/*
> +	 * Reduce the reserved cluster count to reflect successful deferred
> +	 * allocation of delayed allocated clusters or direct allocation of
> +	 * clusters discovered to be delayed allocated.  Once allocated, a
> +	 * cluster is not included in the reserved count.
> +	 *
> +	 * When direct allocating (from fallocate, filemap, DIO, or clusters
> +	 * allocated when delalloc has been disabled by ext4_nonda_switch())
> +	 * an extent either 1) contains delayed blocks but start with
> +	 * non-delayed allocated blocks (e.g. hole) or 2) contains non-delayed
> +	 * allocated blocks which belong to delayed allocated clusters when
> +	 * bigalloc feature is enabled, quota has already been claimed by
> +	 * ext4_mb_new_blocks(), so release the quota reservations made for
> +	 * any previously delayed allocated clusters instead of claim them
> +	 * again.
> +	 */
> +	resv_used += pending;
> +	if (resv_used)
> +		ext4_da_update_reserve_space(inode, resv_used,
> +				flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE);
> +
>  	if (err1 || err2 || err3 < 0)
>  		goto retry;
>  
> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
> index d8ca7f64f952..7404f0935c90 100644
> --- a/fs/ext4/indirect.c
> +++ b/fs/ext4/indirect.c
> @@ -652,13 +652,6 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
>  	ext4_update_inode_fsync_trans(handle, inode, 1);
>  	count = ar.len;
>  
> -	/*
> -	 * Update reserved blocks/metadata blocks after successful block
> -	 * allocation which had been deferred till now.
> -	 */
> -	if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
> -		ext4_da_update_reserve_space(inode, count, 1);
> -
>  got_it:
>  	map->m_flags |= EXT4_MAP_MAPPED;
>  	map->m_pblk = le32_to_cpu(chain[depth-1].key);
> -- 
> 2.39.2
>
diff mbox series

Patch

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 671dacd7c873..db8f9d79477c 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4357,43 +4357,6 @@  int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 		goto out;
 	}
 
-	/*
-	 * Reduce the reserved cluster count to reflect successful deferred
-	 * allocation of delayed allocated clusters or direct allocation of
-	 * clusters discovered to be delayed allocated.  Once allocated, a
-	 * cluster is not included in the reserved count.
-	 */
-	if (test_opt(inode->i_sb, DELALLOC) && allocated_clusters) {
-		if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) {
-			/*
-			 * When allocating delayed allocated clusters, simply
-			 * reduce the reserved cluster count and claim quota
-			 */
-			ext4_da_update_reserve_space(inode, allocated_clusters,
-							1);
-		} else {
-			ext4_lblk_t lblk, len;
-			unsigned int n;
-
-			/*
-			 * When allocating non-delayed allocated clusters
-			 * (from fallocate, filemap, DIO, or clusters
-			 * allocated when delalloc has been disabled by
-			 * ext4_nonda_switch), reduce the reserved cluster
-			 * count by the number of allocated clusters that
-			 * have previously been delayed allocated.  Quota
-			 * has been claimed by ext4_mb_new_blocks() above,
-			 * so release the quota reservations made for any
-			 * previously delayed allocated clusters.
-			 */
-			lblk = EXT4_LBLK_CMASK(sbi, map->m_lblk);
-			len = allocated_clusters << sbi->s_cluster_bits;
-			n = ext4_es_delayed_clu(inode, lblk, len);
-			if (n > 0)
-				ext4_da_update_reserve_space(inode, (int) n, 0);
-		}
-	}
-
 	/*
 	 * Cache the extent and update transaction to commit on fdatasync only
 	 * when it is _not_ an unwritten extent.
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 0580bc4bc762..41adf0d69959 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -853,6 +853,7 @@  void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	struct extent_status newes;
 	ext4_lblk_t end = lblk + len - 1;
 	int err1 = 0, err2 = 0, err3 = 0;
+	int resv_used = 0, pending = 0;
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct extent_status *es1 = NULL;
 	struct extent_status *es2 = NULL;
@@ -891,7 +892,7 @@  void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 		pr = __alloc_pending(true);
 	write_lock(&EXT4_I(inode)->i_es_lock);
 
-	err1 = __es_remove_extent(inode, lblk, end, NULL, es1);
+	err1 = __es_remove_extent(inode, lblk, end, &resv_used, es1);
 	if (err1 != 0)
 		goto error;
 	/* Free preallocated extent if it didn't get used. */
@@ -921,9 +922,31 @@  void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 			__free_pending(pr);
 			pr = NULL;
 		}
+		pending = err3;
 	}
 error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
+	/*
+	 * Reduce the reserved cluster count to reflect successful deferred
+	 * allocation of delayed allocated clusters or direct allocation of
+	 * clusters discovered to be delayed allocated.  Once allocated, a
+	 * cluster is not included in the reserved count.
+	 *
+	 * When direct allocating (from fallocate, filemap, DIO, or clusters
+	 * allocated when delalloc has been disabled by ext4_nonda_switch())
+	 * an extent either 1) contains delayed blocks but start with
+	 * non-delayed allocated blocks (e.g. hole) or 2) contains non-delayed
+	 * allocated blocks which belong to delayed allocated clusters when
+	 * bigalloc feature is enabled, quota has already been claimed by
+	 * ext4_mb_new_blocks(), so release the quota reservations made for
+	 * any previously delayed allocated clusters instead of claim them
+	 * again.
+	 */
+	resv_used += pending;
+	if (resv_used)
+		ext4_da_update_reserve_space(inode, resv_used,
+				flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE);
+
 	if (err1 || err2 || err3 < 0)
 		goto retry;
 
diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index d8ca7f64f952..7404f0935c90 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -652,13 +652,6 @@  int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
 	ext4_update_inode_fsync_trans(handle, inode, 1);
 	count = ar.len;
 
-	/*
-	 * Update reserved blocks/metadata blocks after successful block
-	 * allocation which had been deferred till now.
-	 */
-	if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
-		ext4_da_update_reserve_space(inode, count, 1);
-
 got_it:
 	map->m_flags |= EXT4_MAP_MAPPED;
 	map->m_pblk = le32_to_cpu(chain[depth-1].key);