diff mbox series

[2/2] btrfs: reduce div64 calls by limiting the number of stripes of a chunk to u32

Message ID 275da997c70615256bed3cff8e74ab2b3fecbafc.1675586554.git.wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: reduce div64 calls for __btrfs_map_block() and its variants | expand

Commit Message

Qu Wenruo Feb. 5, 2023, 8:53 a.m. UTC
There are quite a few div64 calls inside btrfs_map_block() and its
variants.

One such call is for @stripe_nr, the number of stripes before our
logical bytenr inside a chunk.

We can eliminate these div64 calls by reducing the width of
@stripe_nr from 64 bits to 32.

This is possible because our chunk size limit is already 10G, with a
fixed stripe length of 64K, so a u32 is more than enough to hold the
number of stripes.

With this width reduction, we get rid of the slower div64 calls and the
extra warnings on certain 32-bit arches.

This patch does the following:

- Reduce the width of various stripe_nr variables

- Add a new tree-checker validation on chunk length
  Make sure no chunk can reach 256G, which also acts as a bitflip
  check.

- Replace unnecessary div64 calls with regular division and modulo
  32-bit division and modulo are much faster than their 64-bit
  counterparts, and the involved functions are finally free of div64
  concerns.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/block-group.c  | 14 ++++----
 fs/btrfs/tree-checker.c | 14 ++++++++
 fs/btrfs/volumes.c      | 79 ++++++++++++++++++++++-------------------
 fs/btrfs/volumes.h      |  2 +-
 4 files changed, 65 insertions(+), 44 deletions(-)

Comments

kernel test robot Feb. 5, 2023, 12:09 p.m. UTC | #1
Hi Qu,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on v6.2-rc6]
[also build test ERROR on linus/master]
[cannot apply to kdave/for-next next-20230203]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Qu-Wenruo/btrfs-remove-map_lookup-stripe_len/20230205-165612
patch link:    https://lore.kernel.org/r/275da997c70615256bed3cff8e74ab2b3fecbafc.1675586554.git.wqu%40suse.com
patch subject: [PATCH 2/2] btrfs: reduce div64 calls by limiting the number of stripes of a chunk to u32
config: i386-randconfig-a003 (https://download.01.org/0day-ci/archive/20230205/202302051950.KouT9cVY-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/dc356ab4c1f118e55634c17c1e086bfc0536198d
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Qu-Wenruo/btrfs-remove-map_lookup-stripe_len/20230205-165612
        git checkout dc356ab4c1f118e55634c17c1e086bfc0536198d
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=i386 olddefconfig
        make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "__udivdi3" [fs/btrfs/btrfs.ko] undefined!
kernel test robot Feb. 5, 2023, 12:20 p.m. UTC | #2
Hi Qu,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on v6.2-rc6]
[also build test ERROR on linus/master]
[cannot apply to kdave/for-next next-20230203]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Qu-Wenruo/btrfs-remove-map_lookup-stripe_len/20230205-165612
patch link:    https://lore.kernel.org/r/275da997c70615256bed3cff8e74ab2b3fecbafc.1675586554.git.wqu%40suse.com
patch subject: [PATCH 2/2] btrfs: reduce div64 calls by limiting the number of stripes of a chunk to u32
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20230205/202302052002.9uoFbMlg-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/dc356ab4c1f118e55634c17c1e086bfc0536198d
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Qu-Wenruo/btrfs-remove-map_lookup-stripe_len/20230205-165612
        git checkout dc356ab4c1f118e55634c17c1e086bfc0536198d
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   m68k-linux-ld: fs/btrfs/block-group.o: in function `btrfs_rmap_block':
>> block-group.c:(.text+0x396c): undefined reference to `__udivdi3'
Christoph Hellwig Feb. 6, 2023, 6:44 a.m. UTC | #3
> +	if (unlikely(((u64)U32_MAX << BTRFS_STRIPE_LEN_SHIFT) <= length)) {

Nit, at least to me this order reads weird and I had to stop and think
a while.   This version:

	if (unlikely(length >= ((u64)U32_MAX << BTRFS_STRIPE_LEN_SHIFT))) {

would be more obvious.

> @@ -6344,7 +6345,7 @@ int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *em,
>  
>  	io_geom->len = len;
>  	io_geom->offset = offset;
> -	io_geom->stripe_len = stripe_len;
> +	io_geom->stripe_len = BTRFS_STRIPE_LEN;

This conflicts with the bio split series that has been in for-next
for a few weeks (but not in misc-next yet).
Qu Wenruo Feb. 6, 2023, 10:16 a.m. UTC | #4
On 2023/2/6 14:44, Christoph Hellwig wrote:
>> +	if (unlikely(((u64)U32_MAX << BTRFS_STRIPE_LEN_SHIFT) <= length)) {
> 
> Nit, at least to me this order reads weird and I had to stop and think
> a while.   This version:
> 
> 	if (unlikely(length >= ((u64)U32_MAX << BTRFS_STRIPE_LEN_SHIFT))) {
> 
> would be more obvious.
> 
>> @@ -6344,7 +6345,7 @@ int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *em,
>>   
>>   	io_geom->len = len;
>>   	io_geom->offset = offset;
>> -	io_geom->stripe_len = stripe_len;
>> +	io_geom->stripe_len = BTRFS_STRIPE_LEN;
> 
> This conflicts with the bio split series that has been in for-next
> for a few weeks (but not in misc-next yet).

I really hope David can resolve it manually.

I'm not sure we should base most of our development on for-next, as
that branch is normally not updated as frequently as misc-next.

Thanks,
Qu

Patch

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 1c74af23f54c..4d7c62e5a4d8 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2020,20 +2020,20 @@  int btrfs_rmap_block(struct btrfs_fs_info *fs_info, u64 chunk_start,
 		if (bdev && map->stripes[i].dev->bdev != bdev)
 			continue;
 
-		stripe_nr = physical - map->stripes[i].physical;
-		stripe_nr = div64_u64_rem(stripe_nr, BTRFS_STRIPE_LEN, &offset);
+		stripe_nr = (physical - map->stripes[i].physical) >>
+			     BTRFS_STRIPE_LEN_SHIFT;
+		offset = (physical - map->stripes[i].physical) &
+			 ((1 << BTRFS_STRIPE_LEN_SHIFT) - 1);
 
 		if (map->type & (BTRFS_BLOCK_GROUP_RAID0 |
-				 BTRFS_BLOCK_GROUP_RAID10)) {
-			stripe_nr = stripe_nr * map->num_stripes + i;
-			stripe_nr = div_u64(stripe_nr, map->sub_stripes);
-		}
+				 BTRFS_BLOCK_GROUP_RAID10))
+			stripe_nr = (stripe_nr * map->num_stripes + i) /
+				     map->sub_stripes;
 		/*
 		 * The remaining case would be for RAID56, multiply by
 		 * nr_data_stripes().  Alternatively, just use rmap_len below
 		 * instead of map->stripe_len
 		 */
-
 		bytenr = chunk_start + stripe_nr * io_stripe_size + offset;
 
 		/* Ensure we don't add duplicate addresses */
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index baad1ed7e111..7968fc89f278 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -849,6 +849,20 @@  int btrfs_check_chunk_valid(struct extent_buffer *leaf,
 			  stripe_len);
 		return -EUCLEAN;
 	}
+	/*
+	 * We artificially limit the chunk size, so that the number of stripes
+	 * inside a chunk can fit into a u32.
+	 * The new limit (256G) would be way too large for real world usage
+	 * anyway, and it's already way larger than our existing limit (10G).
+	 *
+	 * Thus it should be a good way to catch obvious bitflips.
+	 */
+	if (unlikely(((u64)U32_MAX << BTRFS_STRIPE_LEN_SHIFT) <= length)) {
+		chunk_err(leaf, chunk, logical,
+			  "chunk length too large: have %llu limit %llu",
+			  length, (u64)U32_MAX << BTRFS_STRIPE_LEN_SHIFT);
+		return -EUCLEAN;
+	}
 	if (unlikely(type & ~(BTRFS_BLOCK_GROUP_TYPE_MASK |
 			      BTRFS_BLOCK_GROUP_PROFILE_MASK))) {
 		chunk_err(leaf, chunk, logical,
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 4c91cc6471c3..ce073083b0dc 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5940,15 +5940,15 @@  struct btrfs_discard_stripe *btrfs_map_discard(struct btrfs_fs_info *fs_info,
 	struct btrfs_discard_stripe *stripes;
 	u64 length = *length_ret;
 	u64 offset;
-	u64 stripe_nr;
-	u64 stripe_nr_end;
+	u32 stripe_nr;
+	u32 stripe_nr_end;
+	u32 stripe_cnt;
 	u64 stripe_end_offset;
-	u64 stripe_cnt;
 	u64 stripe_offset;
 	u32 stripe_index;
 	u32 factor = 0;
 	u32 sub_stripes = 0;
-	u64 stripes_per_dev = 0;
+	u32 stripes_per_dev = 0;
 	u32 remaining_stripes = 0;
 	u32 last_stripe = 0;
 	int ret;
@@ -5974,13 +5974,13 @@  struct btrfs_discard_stripe *btrfs_map_discard(struct btrfs_fs_info *fs_info,
 	 * stripe_nr counts the total number of stripes we have to stride
 	 * to get to this block
 	 */
-	stripe_nr = div64_u64(offset, BTRFS_STRIPE_LEN);
+	stripe_nr = offset >> BTRFS_STRIPE_LEN_SHIFT;
 
 	/* stripe_offset is the offset of this block in its stripe */
 	stripe_offset = offset - (stripe_nr << BTRFS_STRIPE_LEN_SHIFT);
 
-	stripe_nr_end = round_up(offset + length, BTRFS_STRIPE_LEN);
-	stripe_nr_end = div64_u64(stripe_nr_end, BTRFS_STRIPE_LEN);
+	stripe_nr_end = round_up(offset + length, BTRFS_STRIPE_LEN) >>
+			BTRFS_STRIPE_LEN_SHIFT;
 	stripe_cnt = stripe_nr_end - stripe_nr;
 	stripe_end_offset = (stripe_nr_end << BTRFS_STRIPE_LEN_SHIFT) -
 			    (offset + length);
@@ -6001,18 +6001,18 @@  struct btrfs_discard_stripe *btrfs_map_discard(struct btrfs_fs_info *fs_info,
 		factor = map->num_stripes / sub_stripes;
 		*num_stripes = min_t(u64, map->num_stripes,
 				    sub_stripes * stripe_cnt);
-		stripe_nr = div_u64_rem(stripe_nr, factor, &stripe_index);
+		stripe_index = stripe_nr % factor;
+		stripe_nr /= factor;
 		stripe_index *= sub_stripes;
 		stripes_per_dev = div_u64_rem(stripe_cnt, factor,
 					      &remaining_stripes);
-		div_u64_rem(stripe_nr_end - 1, factor, &last_stripe);
-		last_stripe *= sub_stripes;
+		last_stripe = (stripe_nr_end - 1) % factor * sub_stripes;
 	} else if (map->type & (BTRFS_BLOCK_GROUP_RAID1_MASK |
 				BTRFS_BLOCK_GROUP_DUP)) {
 		*num_stripes = map->num_stripes;
 	} else {
-		stripe_nr = div_u64_rem(stripe_nr, map->num_stripes,
-					&stripe_index);
+		stripe_index = stripe_nr % map->num_stripes;
+		stripe_nr /= map->num_stripes;
 	}
 
 	stripes = kcalloc(*num_stripes, sizeof(*stripes), GFP_NOFS);
@@ -6286,11 +6286,10 @@  int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *em,
 			  struct btrfs_io_geometry *io_geom)
 {
 	struct map_lookup *map;
-	const u32 stripe_len = BTRFS_STRIPE_LEN;
 	u64 len;
 	u64 offset;
 	u64 stripe_offset;
-	u64 stripe_nr;
+	u32 stripe_nr;
 	u64 raid56_full_stripe_start = (u64)-1;
 	int data_stripes;
 
@@ -6303,20 +6302,22 @@  int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *em,
 	 * Stripe_nr is where this block falls in
 	 * stripe_offset is the offset of this block in its stripe.
 	 */
-	stripe_nr = div64_u64_rem(offset, stripe_len, &stripe_offset);
+	stripe_offset = offset & ((1 << BTRFS_STRIPE_LEN_SHIFT) - 1);
+	stripe_nr = offset >> BTRFS_STRIPE_LEN_SHIFT;
 	ASSERT(stripe_offset < U32_MAX);
 
 	data_stripes = nr_data_stripes(map);
 
 	/* Only stripe based profiles needs to check against stripe length. */
 	if (map->type & BTRFS_BLOCK_GROUP_STRIPE_MASK) {
-		u64 max_len = stripe_len - stripe_offset;
+		u64 max_len = BTRFS_STRIPE_LEN - stripe_offset;
 
 		/*
 		 * In case of raid56, we need to know the stripe aligned start
 		 */
 		if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
-			unsigned long full_stripe_len = stripe_len * data_stripes;
+			unsigned long full_stripe_len = data_stripes <<
+							BTRFS_STRIPE_LEN_SHIFT;
 			raid56_full_stripe_start = offset;
 
 			/*
@@ -6333,7 +6334,7 @@  int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *em,
 			 * reads, just allow a single stripe (on a single disk).
 			 */
 			if (op == BTRFS_MAP_WRITE) {
-				max_len = stripe_len * data_stripes -
+				max_len = (data_stripes << BTRFS_STRIPE_LEN_SHIFT) -
 					  (offset - raid56_full_stripe_start);
 			}
 		}
@@ -6344,7 +6345,7 @@  int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *em,
 
 	io_geom->len = len;
 	io_geom->offset = offset;
-	io_geom->stripe_len = stripe_len;
+	io_geom->stripe_len = BTRFS_STRIPE_LEN;
 	io_geom->stripe_nr = stripe_nr;
 	io_geom->stripe_offset = stripe_offset;
 	io_geom->raid56_stripe_offset = raid56_full_stripe_start;
@@ -6353,7 +6354,7 @@  int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *em,
 }
 
 static void set_io_stripe(struct btrfs_io_stripe *dst, const struct map_lookup *map,
-		          u32 stripe_index, u64 stripe_offset, u64 stripe_nr)
+			  u32 stripe_index, u64 stripe_offset, u32 stripe_nr)
 {
 	dst->dev = map->stripes[stripe_index].dev;
 	dst->physical = map->stripes[stripe_index].physical +
@@ -6369,8 +6370,8 @@  int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
 	struct extent_map *em;
 	struct map_lookup *map;
 	u64 stripe_offset;
-	u64 stripe_nr;
 	u64 stripe_len;
+	u32 stripe_nr;
 	u32 stripe_index;
 	int data_stripes;
 	int i;
@@ -6433,8 +6434,8 @@  int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
 	num_stripes = 1;
 	stripe_index = 0;
 	if (map->type & BTRFS_BLOCK_GROUP_RAID0) {
-		stripe_nr = div_u64_rem(stripe_nr, map->num_stripes,
-				&stripe_index);
+		stripe_index = stripe_nr % map->num_stripes;
+		stripe_nr /= map->num_stripes;
 		if (!need_full_stripe(op))
 			mirror_num = 1;
 	} else if (map->type & BTRFS_BLOCK_GROUP_RAID1_MASK) {
@@ -6460,8 +6461,8 @@  int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
 	} else if (map->type & BTRFS_BLOCK_GROUP_RAID10) {
 		u32 factor = map->num_stripes / map->sub_stripes;
 
-		stripe_nr = div_u64_rem(stripe_nr, factor, &stripe_index);
-		stripe_index *= map->sub_stripes;
+		stripe_index = stripe_nr % factor * map->sub_stripes;
+		stripe_nr /= factor;
 
 		if (need_full_stripe(op))
 			num_stripes = map->sub_stripes;
@@ -6477,9 +6478,16 @@  int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
 
 	} else if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
 		if (need_raid_map && (need_full_stripe(op) || mirror_num > 1)) {
-			/* push stripe_nr back to the start of the full stripe */
-			stripe_nr = div64_u64(raid56_full_stripe_start,
-					stripe_len * data_stripes);
+			/*
+			 * Push stripe_nr back to the start of the full stripe.
+			 * For those cases needing a full stripe, @stripe_nr
+			 * is the full stripe number.
+			 *
+			 * Originally we went with raid56_full_stripe_start /
+			 * full_stripe_len, but that can be expensive.
+			 * Here we just divide @stripe_nr by @data_stripes.
+			 */
+			stripe_nr /= data_stripes;
 
 			/* RAID[56] write or recovery. Return all stripes */
 			num_stripes = map->num_stripes;
@@ -6497,25 +6505,24 @@  int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
 			 * Mirror #2 is RAID5 parity block.
 			 * Mirror #3 is RAID6 Q block.
 			 */
-			stripe_nr = div_u64_rem(stripe_nr,
-					data_stripes, &stripe_index);
+			stripe_index = stripe_nr % data_stripes;
+			stripe_nr /= data_stripes;
 			if (mirror_num > 1)
 				stripe_index = data_stripes + mirror_num - 2;
 
 			/* We distribute the parity blocks across stripes */
-			div_u64_rem(stripe_nr + stripe_index, map->num_stripes,
-					&stripe_index);
+			stripe_index = (stripe_nr + stripe_index) % map->num_stripes;
 			if (!need_full_stripe(op) && mirror_num <= 1)
 				mirror_num = 1;
 		}
 	} else {
 		/*
-		 * after this, stripe_nr is the number of stripes on this
+		 * After this, stripe_nr is the number of stripes on this
 		 * device we have to walk to find the data, and stripe_index is
 		 * the number of our device in the stripe array
 		 */
-		stripe_nr = div_u64_rem(stripe_nr, map->num_stripes,
-				&stripe_index);
+		stripe_index = stripe_nr % map->num_stripes;
+		stripe_nr /= map->num_stripes;
 		mirror_num = stripe_index + 1;
 	}
 	if (stripe_index >= map->num_stripes) {
@@ -6577,7 +6584,7 @@  int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
 		unsigned rot;
 
 		/* Work out the disk rotation on this stripe-set */
-		div_u64_rem(stripe_nr, num_stripes, &rot);
+		rot = stripe_nr % num_stripes;
 
 		/* Fill in the logical address of each stripe */
 		tmp = stripe_nr * data_stripes;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 7d0d9f25864c..96dda0f4c564 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -67,7 +67,7 @@  struct btrfs_io_geometry {
 	/* offset of address in stripe */
 	u32 stripe_offset;
 	/* number of stripe where address falls */
-	u64 stripe_nr;
+	u32 stripe_nr;
 	/* offset of raid56 stripe into the chunk */
 	u64 raid56_stripe_offset;
 };