diff mbox series

[1/2] btrfs: zoned: count fresh BG region as zone unusable

Message ID bdcc434abd271dfd2b75c1b018ed6dbe425f562e.1678689012.git.naohiro.aota@wdc.com (mailing list archive)
State New, archived
Headers show
Series btrfs: zoned: replace active_total_bytes counter | expand

Commit Message

Naohiro Aota March 13, 2023, 7:06 a.m. UTC
The naming of space_info->active_total_bytes is misleading. It counts not
only active block groups but also full ones which were previously active
but are now inactive. That confusion results in a bug where the full BGs
are not counted into active_total_bytes at mount time.

As background, there are three kinds of block groups in terms of
activation:

1. Block groups never activated
2. Block groups currently active
3. Block groups previously active and currently inactive (due to fully
   written or zone finish)

What we really want to exclude from "total_bytes" is the total size of the
BGs in #1. They seem empty and allocatable, but since they are not
activated, we cannot rely on them for space reservation.

And, since BGs #1 never get activated, they should have no "used",
"reserved" and "pinned" bytes.

OTOH, the BGs in #3 can be counted in the "total": since they are already
full, we cannot allocate from them anyway. For them, "total_bytes == used +
reserved + pinned + zone_unusable" should hold.
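[Editorial note] The invariant above can be illustrated with a minimal,
hypothetical sketch. The struct and field names below only mimic the btrfs
counters for illustration; this is not kernel code:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model of a zoned block group's space accounting.
 * Field names mirror the btrfs counters discussed in the commit
 * message, but this struct is purely illustrative.
 */
struct bg_model {
	uint64_t length;        /* this BG's contribution to total_bytes */
	uint64_t zone_capacity; /* writable capacity within the zone */
	uint64_t used;
	uint64_t reserved;
	uint64_t pinned;
	uint64_t zone_unusable;
};

/*
 * For a fully written, now-inactive BG (kind #3), the whole length
 * is accounted for: used + reserved + pinned + zone_unusable adds
 * up to the BG's total length.
 */
static int invariant_holds(const struct bg_model *bg)
{
	return bg->length ==
	       bg->used + bg->reserved + bg->pinned + bg->zone_unusable;
}
```

For example, a fully written 256 MiB BG with 224 MiB of zone capacity
would have used == 224 MiB and zone_unusable == 32 MiB (the region past
the capacity), satisfying the invariant.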

Tracking #2 and #3 as "active_total_bytes" (the current implementation) is
confusing. And tracking #1 and properly subtracting it from "total_bytes"
every time space reservation is needed is cumbersome.

Instead, we can count the whole region of a newly allocated block group as
zone_unusable. Then, once that block group is activated, release [0 ..
zone_capacity] from the zone_unusable counters. With this, we can eliminate
the confusing ->active_total_bytes, and the code becomes common between the
regular and zoned modes. Also, no additional counter is needed with this
approach.
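[Editorial note] The proposed accounting flow can be sketched as follows.
This is a hypothetical model with made-up helper names, not the actual
kernel implementation; it only demonstrates the "count everything as
zone_unusable, then release zone_capacity on activation" idea:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model of the proposed fresh-BG accounting. */
struct fresh_bg {
	uint64_t length;        /* total region of the block group */
	uint64_t zone_capacity; /* writable capacity within the zone */
	uint64_t zone_unusable;
	int active;
};

/* Kind #1: a never-activated BG starts with its whole region unusable. */
static void bg_init_fresh(struct fresh_bg *bg, uint64_t length, uint64_t cap)
{
	bg->length = length;
	bg->zone_capacity = cap;
	bg->zone_unusable = length; /* count the whole region as unusable */
	bg->active = 0;
}

/*
 * Activation releases [0 .. zone_capacity] from zone_unusable, leaving
 * only the region beyond the capacity (length - zone_capacity) unusable.
 */
static void bg_activate(struct fresh_bg *bg)
{
	if (!bg->active && bg->zone_unusable == bg->length) {
		bg->zone_unusable -= bg->zone_capacity;
		bg->active = 1;
	}
}
```

A fresh 256 MiB BG with 224 MiB capacity would contribute all 256 MiB to
zone_unusable; after activation it contributes only the 32 MiB past the
capacity, with no extra counter involved.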

Fixes: 6a921de58992 ("btrfs: zoned: introduce space_info->active_total_bytes")
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/free-space-cache.c |  8 +++++++-
 fs/btrfs/zoned.c            | 23 +++++++++++++++++++----
 2 files changed, 26 insertions(+), 5 deletions(-)

Comments

kernel test robot March 13, 2023, 9:44 a.m. UTC | #1
Hi Naohiro,

I love your patch! Yet something to improve:

[auto build test ERROR on kdave/for-next]
[also build test ERROR on linus/master v6.3-rc2 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Naohiro-Aota/btrfs-zoned-count-fresh-BG-region-as-zone-unusable/20230313-150709
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link:    https://lore.kernel.org/r/bdcc434abd271dfd2b75c1b018ed6dbe425f562e.1678689012.git.naohiro.aota%40wdc.com
patch subject: [PATCH 1/2] btrfs: zoned: count fresh BG region as zone unusable
config: i386-debian-10.3 (https://download.01.org/0day-ci/archive/20230313/202303131759.8M0pXaNR-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/c266ae6ba937488d8339e586ebcc8c93d3389eb0
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Naohiro-Aota/btrfs-zoned-count-fresh-BG-region-as-zone-unusable/20230313-150709
        git checkout c266ae6ba937488d8339e586ebcc8c93d3389eb0
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=i386 olddefconfig
        make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash fs/btrfs/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303131759.8M0pXaNR-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from include/linux/kernel.h:22,
                    from arch/x86/include/asm/percpu.h:27,
                    from arch/x86/include/asm/preempt.h:6,
                    from include/linux/preempt.h:78,
                    from include/linux/spinlock.h:56,
                    from include/linux/mmzone.h:8,
                    from include/linux/gfp.h:7,
                    from include/linux/mm.h:7,
                    from include/linux/pagemap.h:8,
                    from fs/btrfs/free-space-cache.c:6:
   fs/btrfs/free-space-cache.c: In function '__btrfs_add_free_space_zoned':
>> fs/btrfs/free-space-cache.c:2700:27: error: 'BTRFS_FS_ACTIVE_ZONE_TRACKING' undeclared (first use in this function)
    2700 |                  test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) &&
         |                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/bitops.h:49:32: note: in definition of macro 'bitop'
      49 |         ((__builtin_constant_p(nr) &&                                   \
         |                                ^~
   fs/btrfs/free-space-cache.c:2700:18: note: in expansion of macro 'test_bit'
    2700 |                  test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) &&
         |                  ^~~~~~~~
   fs/btrfs/free-space-cache.c:2700:27: note: each undeclared identifier is reported only once for each function it appears in
    2700 |                  test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) &&
         |                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/bitops.h:49:32: note: in definition of macro 'bitop'
      49 |         ((__builtin_constant_p(nr) &&                                   \
         |                                ^~
   fs/btrfs/free-space-cache.c:2700:18: note: in expansion of macro 'test_bit'
    2700 |                  test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) &&
         |                  ^~~~~~~~
--
   In file included from fs/btrfs/zoned.c:3:
   fs/btrfs/zoned.c: In function 'btrfs_calc_zone_unusable':
>> fs/btrfs/zoned.c:1586:22: error: 'BTRFS_FS_ACTIVE_ZONE_TRACKING' undeclared (first use in this function)
    1586 |         if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &cache->fs_info->flags) &&
         |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/bitops.h:49:32: note: in definition of macro 'bitop'
      49 |         ((__builtin_constant_p(nr) &&                                   \
         |                                ^~
   fs/btrfs/zoned.c:1586:13: note: in expansion of macro 'test_bit'
    1586 |         if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &cache->fs_info->flags) &&
         |             ^~~~~~~~
   fs/btrfs/zoned.c:1586:22: note: each undeclared identifier is reported only once for each function it appears in
    1586 |         if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &cache->fs_info->flags) &&
         |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/bitops.h:49:32: note: in definition of macro 'bitop'
      49 |         ((__builtin_constant_p(nr) &&                                   \
         |                                ^~
   fs/btrfs/zoned.c:1586:13: note: in expansion of macro 'test_bit'
    1586 |         if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &cache->fs_info->flags) &&
         |             ^~~~~~~~


vim +/BTRFS_FS_ACTIVE_ZONE_TRACKING +2700 fs/btrfs/free-space-cache.c

  2678	
  2679	static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
  2680						u64 bytenr, u64 size, bool used)
  2681	{
  2682		struct btrfs_space_info *sinfo = block_group->space_info;
  2683		struct btrfs_free_space_ctl *ctl = block_group->free_space_ctl;
  2684		u64 offset = bytenr - block_group->start;
  2685		u64 to_free, to_unusable;
  2686		int bg_reclaim_threshold = 0;
  2687		bool initial = (size == block_group->length);
  2688		u64 reclaimable_unusable;
  2689	
  2690		WARN_ON(!initial && offset + size > block_group->zone_capacity);
  2691	
  2692		if (!initial)
  2693			bg_reclaim_threshold = READ_ONCE(sinfo->bg_reclaim_threshold);
  2694	
  2695		spin_lock(&ctl->tree_lock);
  2696		/* Count initial region as zone_unusable until it gets activated. */
  2697		if (!used)
  2698			to_free = size;
  2699		else if (initial &&
> 2700			 test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) &&
  2701			 block_group->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM))
  2702			to_free = 0;
  2703		else if (initial)
  2704			to_free = block_group->zone_capacity;
  2705		else if (offset >= block_group->alloc_offset)
  2706			to_free = size;
  2707		else if (offset + size <= block_group->alloc_offset)
  2708			to_free = 0;
  2709		else
  2710			to_free = offset + size - block_group->alloc_offset;
  2711		to_unusable = size - to_free;
  2712	
  2713		ctl->free_space += to_free;
  2714		/*
  2715		 * If the block group is read-only, we should account freed space into
  2716		 * bytes_readonly.
  2717		 */
  2718		if (!block_group->ro)
  2719			block_group->zone_unusable += to_unusable;
  2720		spin_unlock(&ctl->tree_lock);
  2721		if (!used) {
  2722			spin_lock(&block_group->lock);
  2723			block_group->alloc_offset -= size;
  2724			spin_unlock(&block_group->lock);
  2725		}
  2726	
  2727		reclaimable_unusable = block_group->zone_unusable -
  2728				       (block_group->length - block_group->zone_capacity);
  2729		/* All the region is now unusable. Mark it as unused and reclaim */
  2730		if (block_group->zone_unusable == block_group->length &&
  2731		    block_group->alloc_offset) {
  2732			btrfs_mark_bg_unused(block_group);
  2733		} else if (bg_reclaim_threshold &&
  2734			   reclaimable_unusable >=
  2735			   mult_perc(block_group->zone_capacity, bg_reclaim_threshold)) {
  2736			btrfs_mark_bg_to_reclaim(block_group);
  2737		}
  2738	
  2739		return 0;
  2740	}
  2741
kernel test robot March 13, 2023, 12:08 p.m. UTC | #2
Hi Naohiro,

I love your patch! Yet something to improve:

[auto build test ERROR on kdave/for-next]
[also build test ERROR on linus/master v6.3-rc2 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Naohiro-Aota/btrfs-zoned-count-fresh-BG-region-as-zone-unusable/20230313-150709
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link:    https://lore.kernel.org/r/bdcc434abd271dfd2b75c1b018ed6dbe425f562e.1678689012.git.naohiro.aota%40wdc.com
patch subject: [PATCH 1/2] btrfs: zoned: count fresh BG region as zone unusable
config: x86_64-randconfig-r025-20230313 (https://download.01.org/0day-ci/archive/20230313/202303131915.APpbxwaX-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/c266ae6ba937488d8339e586ebcc8c93d3389eb0
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Naohiro-Aota/btrfs-zoned-count-fresh-BG-region-as-zone-unusable/20230313-150709
        git checkout c266ae6ba937488d8339e586ebcc8c93d3389eb0
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash fs/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303131915.APpbxwaX-lkp@intel.com/

All errors (new ones prefixed by >>):

>> fs/btrfs/free-space-cache.c:2700:13: error: use of undeclared identifier 'BTRFS_FS_ACTIVE_ZONE_TRACKING'
                    test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) &&
                             ^
>> fs/btrfs/free-space-cache.c:2700:13: error: use of undeclared identifier 'BTRFS_FS_ACTIVE_ZONE_TRACKING'
>> fs/btrfs/free-space-cache.c:2700:13: error: use of undeclared identifier 'BTRFS_FS_ACTIVE_ZONE_TRACKING'
   3 errors generated.



Patch

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 0d250d052487..4962d7bf1e3a 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2693,8 +2693,13 @@  static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
 		bg_reclaim_threshold = READ_ONCE(sinfo->bg_reclaim_threshold);
 
 	spin_lock(&ctl->tree_lock);
+	/* Count initial region as zone_unusable until it gets activated. */
 	if (!used)
 		to_free = size;
+	else if (initial &&
+		 test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) &&
+		 block_group->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM))
+		to_free = 0;
 	else if (initial)
 		to_free = block_group->zone_capacity;
 	else if (offset >= block_group->alloc_offset)
@@ -2722,7 +2727,8 @@  static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
 	reclaimable_unusable = block_group->zone_unusable -
 			       (block_group->length - block_group->zone_capacity);
 	/* All the region is now unusable. Mark it as unused and reclaim */
-	if (block_group->zone_unusable == block_group->length) {
+	if (block_group->zone_unusable == block_group->length &&
+	    block_group->alloc_offset) {
 		btrfs_mark_bg_unused(block_group);
 	} else if (bg_reclaim_threshold &&
 		   reclaimable_unusable >=
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 808cfa3091c5..c733383bbaeb 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1580,9 +1580,19 @@  void btrfs_calc_zone_unusable(struct btrfs_block_group *cache)
 		return;
 
 	WARN_ON(cache->bytes_super != 0);
-	unusable = (cache->alloc_offset - cache->used) +
-		   (cache->length - cache->zone_capacity);
-	free = cache->zone_capacity - cache->alloc_offset;
+
+	/* Check for block groups never get activated */
+	if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &cache->fs_info->flags) &&
+	    cache->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM) &&
+	    !test_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags) &&
+	    cache->alloc_offset == 0) {
+		unusable = cache->length;
+		free = 0;
+	} else {
+		unusable = (cache->alloc_offset - cache->used) +
+			   (cache->length - cache->zone_capacity);
+		free = cache->zone_capacity - cache->alloc_offset;
+	}
 
 	/* We only need ->free_space in ALLOC_SEQ block groups */
 	cache->cached = BTRFS_CACHE_FINISHED;
@@ -1901,7 +1911,11 @@  bool btrfs_zone_activate(struct btrfs_block_group *block_group)
 
 	/* Successfully activated all the zones */
 	set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &block_group->runtime_flags);
-	space_info->active_total_bytes += block_group->length;
+	WARN_ON(block_group->alloc_offset != 0);
+	if (block_group->zone_unusable == block_group->length) {
+		block_group->zone_unusable = block_group->length - block_group->zone_capacity;
+		space_info->bytes_zone_unusable -= block_group->zone_capacity;
+	}
 	spin_unlock(&block_group->lock);
 	btrfs_try_granting_tickets(fs_info, space_info);
 	spin_unlock(&space_info->lock);
@@ -2256,6 +2270,7 @@  int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info)
 
 		spin_lock(&block_group->lock);
 		if (block_group->reserved ||
+		    block_group->alloc_offset == 0 ||
 		    (block_group->flags & BTRFS_BLOCK_GROUP_SYSTEM)) {
 			spin_unlock(&block_group->lock);
 			continue;