diff mbox series

[8/9] xfs: remove xfs_defer_agfl_block

Message ID: 171892418834.3183906.376857417040987772.stgit@frogsfrogsfrogs
State: Accepted, archived
Series: [1/9] xfs: clean up extent free log intent item tracepoint callsites

Commit Message

Darrick J. Wong June 20, 2024, 11:06 p.m. UTC
From: Christoph Hellwig <hch@lst.de>

xfs_free_extent_later can handle the extra AGFL special casing with
very little extra logic.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_alloc.c |   67 +++++++++++++++------------------------------
 1 file changed, 22 insertions(+), 45 deletions(-)

Comments

kernel test robot July 3, 2024, 8:07 a.m. UTC | #1
Hello,

kernel test robot noticed "Assertion_failed" on:

commit: f53305b8c490815f244c0d44b096abd4f2a63aeb ("[PATCH 8/9] xfs: remove xfs_defer_agfl_block")
url: https://github.com/intel-lab-lkp/linux/commits/Darrick-J-Wong/xfs-convert-skip_discard-to-a-proper-flags-bitset/20240625-204930
base: https://git.kernel.org/cgit/fs/xfs/xfs-linux.git for-next
patch link: https://lore.kernel.org/all/171892418834.3183906.376857417040987772.stgit@frogsfrogsfrogs/
patch subject: [PATCH 8/9] xfs: remove xfs_defer_agfl_block

in testcase: stress-ng
version: stress-ng-x86_64-ecd3fe291-1_20240612
with following parameters:

	nr_threads: 100%
	disk: 1HDD
	testtime: 60s
	fs: xfs
	test: copy-file
	cpufreq_governor: performance



compiler: gcc-13
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202407031556.d271bd4c-oliver.sang@intel.com



user  :err   : [   88.899876] [ perf record: Woken up 5 times to write data ]

user  :err   : [   88.979592] [ perf record: Captured and wrote 9.304 MB /tmp/lkp/perf_c2c.data (5470 samples) ]

kern  :warn  : [  101.832173] XFS: Assertion failed: type != XFS_AG_RESV_AGFL, file: fs/xfs/libxfs/xfs_alloc.c, line: 2558
kern  :warn  : [  101.842834] ------------[ cut here ]------------
kern :warn : [  101.848538] WARNING: CPU: 22 PID: 536 at fs/xfs/xfs_message.c:89 asswarn (kbuild/src/consumer/fs/xfs/xfs_message.c:89 (discriminator 1)) xfs
kern  :warn  : [  101.857842] Modules linked in: xfs intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal coretemp btrfs blake2b_generic kvm_intel ipmi_ssif xor raid6_pq libcrc32c kvm crct10dif_pclmul crc32_pclmul crc32c_intel sd_mod ghash_clmulni_intel sg sha512_ssse3 nvme ahci rapl libahci ast nvme_core binfmt_misc t10_pi intel_cstate mei_me drm_shmem_helper acpi_power_meter intel_th_gth crc64_rocksoft_generic i2c_i801 crc64_rocksoft ioatdma intel_th_pci libata intel_uncore drm_kms_helper megaraid_sas i2c_smbus ipmi_si mei intel_pch_thermal acpi_ipmi dax_hmem crc64 intel_th dca wmi ipmi_devintf ipmi_msghandler joydev drm fuse loop dm_mod ip_tables
user  :notice: [  101.860115] stress-ng: metrc: [2914] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
kern  :warn  : [  101.914497] CPU: 22 PID: 536 Comm: kworker/22:1 Not tainted 6.10.0-rc4-00009-gf53305b8c490 #1

kern  :warn  : [  101.929361] Hardware name: Inspur NF5180M6/NF5180M6, BIOS 06.00.04 04/12/2022
user  :notice: [  101.940509] stress-ng: metrc: [2914]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)
kern  :warn  : [  101.940764] Workqueue: xfs-inodegc/sdb1 xfs_inodegc_worker [xfs]


kern :warn : [  101.962311] RIP: 0010:asswarn (kbuild/src/consumer/fs/xfs/xfs_message.c:89 (discriminator 1)) xfs
user  :notice: [  101.970762] stress-ng: metrc: [2914] copy-file         10938     60.17      0.14      4.61       181.79        2300.90         0.12          3244
kern :warn : [ 101.971200] Code: 90 90 66 0f 1f 00 0f 1f 44 00 00 49 89 d0 41 89 c9 48 c7 c2 90 ed 01 c1 48 89 f1 48 89 fe 48 c7 c7 20 07 01 c1 e8 18 fd ff ff <0f> 0b c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
All code
========
   0:	90                   	nop
   1:	90                   	nop
   2:	66 0f 1f 00          	nopw   (%rax)
   6:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
   b:	49 89 d0             	mov    %rdx,%r8
   e:	41 89 c9             	mov    %ecx,%r9d
  11:	48 c7 c2 90 ed 01 c1 	mov    $0xffffffffc101ed90,%rdx
  18:	48 89 f1             	mov    %rsi,%rcx
  1b:	48 89 fe             	mov    %rdi,%rsi
  1e:	48 c7 c7 20 07 01 c1 	mov    $0xffffffffc1010720,%rdi
  25:	e8 18 fd ff ff       	callq  0xfffffffffffffd42
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	c3                   	retq   
  2d:	cc                   	int3   
  2e:	cc                   	int3   
  2f:	cc                   	int3   
  30:	cc                   	int3   
  31:	90                   	nop
  32:	90                   	nop
  33:	90                   	nop
  34:	90                   	nop
  35:	90                   	nop
  36:	90                   	nop
  37:	90                   	nop
  38:	90                   	nop
  39:	90                   	nop
  3a:	90                   	nop
  3b:	90                   	nop
  3c:	90                   	nop
  3d:	90                   	nop
  3e:	90                   	nop
  3f:	90                   	nop

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	c3                   	retq   
   3:	cc                   	int3   
   4:	cc                   	int3   
   5:	cc                   	int3   
   6:	cc                   	int3   
   7:	90                   	nop
   8:	90                   	nop
   9:	90                   	nop
   a:	90                   	nop
   b:	90                   	nop
   c:	90                   	nop
   d:	90                   	nop
   e:	90                   	nop
   f:	90                   	nop
  10:	90                   	nop
  11:	90                   	nop
  12:	90                   	nop
  13:	90                   	nop
  14:	90                   	nop
  15:	90                   	nop

user  :notice: [  101.974008] stress-ng: metrc: [2914] miscellaneous metrics:
kern  :warn  : [  101.978448] RSP: 0018:ffa000000db6f9b8 EFLAGS: 00010246

user  :notice: [  101.993576] stress-ng: metrc: [2914] copy-file           2629.63 MB per sec copy rate (harmonic mean of 64 instances)


kern  :warn  : [  102.020066] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000007fffffff
kern  :warn  : [  102.020067] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffc1010720
user  :notice: [  102.026590] stress-ng: info:  [2914] for a 60.26s run time:
kern  :warn  : [  102.028177] RBP: ffa000000db6f9f8 R08: 0000000000000000 R09: 000000000000000a
kern  :warn  : [  102.028178] R10: 000000000000000a R11: 0fffffffffffffff R12: ffa000000db6faa0
kern  :warn  : [  102.028179] R13: ff11004060da7790 R14: 0000000000000000 R15: 0000000000000001

user  :notice: [  102.040178] stress-ng: info:  [2914]    3856.62s available CPU time
kern  :warn  : [  102.041662] FS:  0000000000000000(0000) GS:ff11003fc0900000(0000) knlGS:0000000000000000

user  :notice: [  102.044574] stress-ng: info:  [2914]       0.14s user time   (  0.00%)
kern  :warn  : [  102.051679] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kern  :warn  : [  102.051681] CR2: 000056061f02f700 CR3: 000000407de1c002 CR4: 0000000000771ef0

user  :notice: [  102.060248] stress-ng: info:  [2914]       4.63s system time (  0.12%)
kern  :warn  : [  102.065773] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000

kern  :warn  : [  102.081424] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kern  :warn  : [  102.081426] PKRU: 55555554
user  :notice: [  102.089981] stress-ng: info:  [2914]       4.77s total time  (  0.12%)
kern  :warn  : [  102.091444] Call Trace:
kern  :warn  : [  102.091446]  <TASK>

user  :notice: [  102.099108] stress-ng: info:  [2914] load average: 42.09 12.22 4.21
kern :warn : [  102.107184] ? __warn (kbuild/src/consumer/kernel/panic.c:693) 



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240703/202407031556.d271bd4c-oliver.sang@intel.com
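
One possible reading of this report, offered as an assumption and not a confirmed diagnosis: the hunk headers in the patch place xfs_defer_extent_free around line 2582 of the patched xfs_alloc.c, so the assertion at line 2558 may sit at the top of that same function, left over from when AGFL frees went through the separate xfs_defer_agfl_block path. The user-space sketch below models that situation. Every name in it (resv_type, MODEL_ASSERT, model_defer_free, warn_count) is a hypothetical stand-in and not a kernel symbol.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's AG reservation types. */
enum resv_type { RESV_NONE, RESV_AGFL };

/* Counts failed checks, loosely modelling a warn-only assertion. */
static int warn_count;

#define MODEL_ASSERT(expr) do { if (!(expr)) warn_count++; } while (0)

/*
 * Model of a common "defer this free" entry point.
 *
 * keep_old_assert == true models an entry check written when AGFL
 * frees never reached this function. Returns 1 when the item would
 * be queued as an AGFL free and 0 for an ordinary extent free.
 */
static int
model_defer_free(enum resv_type type, bool keep_old_assert)
{
	if (keep_old_assert)
		MODEL_ASSERT(type != RESV_AGFL);

	return type == RESV_AGFL ? 1 : 0;
}
```

In this model, calling model_defer_free(RESV_AGFL, true) bumps warn_count while still queuing the item as an AGFL free, which matches the shape of the report: a warning, not a crash. Whether the real assertion is in that function is for the patch author to confirm.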

Patch

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 03a0a4289d943..1da3b1f741300 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2535,48 +2535,6 @@  xfs_agfl_reset(
 	clear_bit(XFS_AGSTATE_AGFL_NEEDS_RESET, &pag->pag_opstate);
 }
 
-/*
- * Defer an AGFL block free. This is effectively equivalent to
- * xfs_free_extent_later() with some special handling particular to AGFL blocks.
- *
- * Deferring AGFL frees helps prevent log reservation overruns due to too many
- * allocation operations in a transaction. AGFL frees are prone to this problem
- * because for one they are always freed one at a time. Further, an immediate
- * AGFL block free can cause a btree join and require another block free before
- * the real allocation can proceed. Deferring the free disconnects freeing up
- * the AGFL slot from freeing the block.
- */
-static int
-xfs_defer_agfl_block(
-	struct xfs_trans		*tp,
-	xfs_agnumber_t			agno,
-	xfs_agblock_t			agbno,
-	struct xfs_owner_info		*oinfo)
-{
-	struct xfs_mount		*mp = tp->t_mountp;
-	struct xfs_extent_free_item	*xefi;
-	xfs_fsblock_t			fsbno = XFS_AGB_TO_FSB(mp, agno, agbno);
-
-	ASSERT(xfs_extfree_item_cache != NULL);
-	ASSERT(oinfo != NULL);
-
-	if (XFS_IS_CORRUPT(mp, !xfs_verify_fsbno(mp, fsbno)))
-		return -EFSCORRUPTED;
-
-	xefi = kmem_cache_zalloc(xfs_extfree_item_cache,
-			       GFP_KERNEL | __GFP_NOFAIL);
-	xefi->xefi_startblock = fsbno;
-	xefi->xefi_blockcount = 1;
-	xefi->xefi_owner = oinfo->oi_owner;
-	xefi->xefi_agresv = XFS_AG_RESV_AGFL;
-
-	trace_xfs_agfl_free_defer(mp, xefi);
-
-	xfs_extent_free_get_group(mp, xefi);
-	xfs_defer_add(tp, &xefi->xefi_list, &xfs_agfl_free_defer_type);
-	return 0;
-}
-
 /*
  * Add the extent to the list of extents to be free at transaction end.
  * The list is maintained sorted (by block number).
@@ -2624,7 +2582,13 @@  xfs_defer_extent_free(
 	trace_xfs_extent_free_defer(mp, xefi);
 
 	xfs_extent_free_get_group(mp, xefi);
-	*dfpp = xfs_defer_add(tp, &xefi->xefi_list, &xfs_extent_free_defer_type);
+
+	if (xefi->xefi_agresv == XFS_AG_RESV_AGFL)
+		*dfpp = xfs_defer_add(tp, &xefi->xefi_list,
+				&xfs_agfl_free_defer_type);
+	else
+		*dfpp = xfs_defer_add(tp, &xefi->xefi_list,
+				&xfs_extent_free_defer_type);
 	return 0;
 }
 
@@ -2882,8 +2846,21 @@  xfs_alloc_fix_freelist(
 		if (error)
 			goto out_agbp_relse;
 
-		/* defer agfl frees */
-		error = xfs_defer_agfl_block(tp, args->agno, bno, &targs.oinfo);
+		/*
+		 * Defer the AGFL block free.
+		 *
+		 * This helps to prevent log reservation overruns due to too
+		 * many allocation operations in a transaction. AGFL frees are
+		 * prone to this problem because for one they are always freed
+		 * one at a time.  Further, an immediate AGFL block free can
+		 * cause a btree join and require another block free before the
+		 * real allocation can proceed.
+		 * Deferring the free disconnects freeing up the AGFL slot from
+		 * freeing the block.
+		 */
+		error = xfs_free_extent_later(tp,
+				XFS_AGB_TO_FSB(mp, args->agno, bno), 1,
+				&targs.oinfo, XFS_AG_RESV_AGFL, 0);
 		if (error)
 			goto out_agbp_relse;
 	}
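
The dispatch that the second hunk adds to xfs_defer_extent_free can be illustrated outside the kernel. This is a minimal user-space sketch under stated assumptions, not the kernel code: the enum, the structs, and pick_defer_type are hypothetical stand-ins for XFS_AG_RESV_*, struct xfs_extent_free_item, and the two xfs_defer_add() calls in the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's AG reservation types. */
enum resv_type { RESV_NONE, RESV_AGFL, RESV_METADATA };

/* Hypothetical stand-in for struct xfs_extent_free_item. */
struct free_item {
	unsigned long long	startblock;
	unsigned int		blockcount;
	enum resv_type		agresv;
};

/* Hypothetical stand-in for a deferred-operation type descriptor. */
struct defer_type {
	const char		*name;
};

static const struct defer_type extent_free_defer_type = { "extent_free" };
static const struct defer_type agfl_free_defer_type = { "agfl_free" };

/*
 * Pick the deferred-operation type from the item's reservation type,
 * mirroring the if/else the patch adds: AGFL frees get their own
 * type, everything else uses the ordinary extent-free type.
 */
static const struct defer_type *
pick_defer_type(const struct free_item *item)
{
	if (item->agresv == RESV_AGFL)
		return &agfl_free_defer_type;
	return &extent_free_defer_type;
}
```

With this shape, a caller such as xfs_alloc_fix_freelist only has to pass the AGFL reservation type and a block count of 1, and the shared function selects the matching deferred-operation type.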