[RFC,v5,00/21] DEPT(Dependency Tracker)

Message ID

1647397593-16747-1-git-send-email-byungchul.park@lge.com (mailing list archive)

Headers

From: Byungchul Park <byungchul.park@lge.com>
To: torvalds@linux-foundation.org
Cc: damien.lemoal@opensource.wdc.com, linux-ide@vger.kernel.org,
        adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org,
        mingo@redhat.com, linux-kernel@vger.kernel.org,
        peterz@infradead.org, will@kernel.org, tglx@linutronix.de,
        rostedt@goodmis.org, joel@joelfernandes.org, sashal@kernel.org,
        daniel.vetter@ffwll.ch, chris@chris-wilson.co.uk,
        duyuyang@gmail.com, johannes.berg@intel.com, tj@kernel.org,
        tytso@mit.edu, willy@infradead.org, david@fromorbit.com,
        amir73il@gmail.com, bfields@fieldses.org,
        gregkh@linuxfoundation.org, kernel-team@lge.com,
        linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@kernel.org,
        minchan@kernel.org, hannes@cmpxchg.org, vdavydov.dev@gmail.com,
        sj@kernel.org, jglisse@redhat.com, dennis@kernel.org, cl@linux.com,
        penberg@kernel.org, rientjes@google.com, vbabka@suse.cz,
        ngupta@vflare.org, linux-block@vger.kernel.org,
        paolo.valente@linaro.org, josef@toxicpanda.com,
        linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
        jack@suse.cz, jack@suse.com, jlayton@kernel.org,
        dan.j.williams@intel.com, hch@infradead.org, djwong@kernel.org,
        dri-devel@lists.freedesktop.org, airlied@linux.ie,
        rodrigosiqueiramelo@gmail.com, melissa.srw@gmail.com,
        hamohammed.sa@gmail.com
Subject: [PATCH RFC v5 00/21] DEPT(Dependency Tracker)
Date: Wed, 16 Mar 2022 11:26:12 +0900
Message-Id: <1647397593-16747-1-git-send-email-byungchul.park@lge.com>
Precedence: bulk

Message

Byungchul Park March 16, 2022, 2:26 a.m. UTC

I'm gonna re-add RFC for a while at Ted's request. But hard testing is
still needed to find false alarms, because I no longer see any on my
system. I'm gonna look for other systems that might produce false
alarms. And I'd appreciate it if you share any alarms you see on
yours.

---

Hi Linus and folks,

I've been developing a tool for detecting deadlock possibilities by
tracking wait/event rather than lock(?) acquisition order to try to
cover all synchronization mechanisms. It's based on the v5.17-rc7 tag.

https://github.com/lgebyungchulpark/linux-dept/commits/dept1.18_on_v5.17-rc7

Benefits:

	0. Works with all lock primitives.
	1. Works with wait_for_completion()/complete().
	2. Works with 'wait' on PG_locked.
	3. Works with 'wait' on PG_writeback.
	4. Works with swait/wakeup.
	5. Works with waitqueue.
	6. Multiple reports are allowed.
	7. Deduplication control on multiple reports.
	8. Withstands false positives thanks to 6.
	9. Easy to tag any wait/event.

Future work:

	0. To make it more stable.
	1. To separate Dept from Lockdep.
	2. To improve performance in terms of time and space.
	3. To use Dept as a dependency engine for Lockdep.
	4. To add any missing tags of wait/event in the kernel.
	5. To deduplicate stack trace.

How to interpret reports:

	1. E(event) in each context cannot be triggered because of the
	   W(wait) that cannot be woken.
	2. The stack trace that helps locate the problematic code is in
	   each context's detail.
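
As a rough illustration of the kind of cycle such a report describes,
the toy user-space sketch below records "this event cannot fire until
that wait finishes" edges between classes and checks whether a new edge
would close a cycle. It is only a model built on assumptions, not
Dept's actual data structures or algorithm; MAX_CLASSES, dep_reset(),
dep_add() and dep_would_deadlock() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8	/* hypothetical limit for this toy graph */

/* dep[a][b] is true when the event of class a cannot fire until the wait on class b ends. */
static bool dep[MAX_CLASSES][MAX_CLASSES];

/* Forget all recorded dependencies. */
static void dep_reset(void)
{
	memset(dep, 0, sizeof(dep));
}

/* Record the edge held -> waited. */
static void dep_add(int held, int waited)
{
	assert(held >= 0 && held < MAX_CLASSES);
	assert(waited >= 0 && waited < MAX_CLASSES);
	dep[held][waited] = true;
}

/* Depth-first search: can 'target' be reached from 'cur'? */
static bool reaches(int cur, int target, bool *seen)
{
	if (cur == target)
		return true;
	if (seen[cur])
		return false;
	seen[cur] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (dep[cur][next] && reaches(next, target, seen))
			return true;
	return false;
}

/* Would adding held -> waited close a cycle, i.e. a possible deadlock? */
static bool dep_would_deadlock(int held, int waited)
{
	bool seen[MAX_CLASSES] = { false };

	return reaches(waited, held, seen);
}
```

For example, if one context holds a mutex (class 0) while waiting for a
completion (class 1), dep_add(0, 1) is recorded. A second context that
has to take the same mutex before it can signal the completion then
makes dep_would_deadlock(1, 0) return true, which is the kind of
wait/event cycle this series is meant to report.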

Thanks,
Byungchul

---

Changes from v4:

	1. Fix some bugs that produce false alarms.
	2. Distinguish each syscall context from another *for arm64*.
	3. Just print a message instead of warning when the Dept ring
	   buffer gets exhausted. (feedback from Hyeonggon)
	4. Explicitly describe "EXPERIMENTAL" and "Dept might produce
	   false positive reports" in Kconfig. (feedback from Ted)

Changes from v3:

	1. Dept shouldn't create dependencies between different depths
	   of a class that were indicated by *_lock_nested(). Dept
	   normally doesn't but it does once another lock class comes
	   in. So fixed it. (feedback from Hyeonggon)
	2. Dept considered a wait as a real wait once getting to
	   __schedule() even if the task had already been set to
	   TASK_RUNNING by a wakeup source. Fixed it so that Dept
	   doesn't consider the case as a real wait. (feedback from
	   Jan Kara)
	3. Stop tracking dependencies with a map once the event
	   associated with the map has been handled. Dept will start to
	   work with the map again, on the next sleep.

Changes from v2:

	1. Disable Dept on bit_wait_table[] in sched/wait_bit.c, which
	   was reporting a lot of false positives. That is my fault:
	   wait/event for bit_wait_table[] should've been tagged in a
	   higher layer to work well, which is left as future work.
	   (feedback from Jan Kara)
	2. Disable Dept on crypto_larval's completion to prevent a false
	   positive.

Changes from v1:

	1. Fix coding style and typo. (feedback from Steven)
	2. Distinguish each work context from another in workqueue.
	3. Skip checking lock acquisition with nest_lock, which is about
	   correct lock usage that should be checked by Lockdep.

Changes from RFC:

	1. Add the wait tag at __schedule() rather than at
	   prepare_to_wait(). (feedback from Linus and Matthew)
	2. Use the try version of the annotation at
	   lockdep_acquire_cpus_lock().
	3. Distinguish each syscall context from another.

Byungchul Park (21):
  llist: Move llist_{head,node} definition to types.h
  dept: Implement Dept(Dependency Tracker)
  dept: Embed Dept data in Lockdep
  dept: Apply Dept to spinlock
  dept: Apply Dept to mutex families
  dept: Apply Dept to rwlock
  dept: Apply Dept to wait_for_completion()/complete()
  dept: Apply Dept to seqlock
  dept: Apply Dept to rwsem
  dept: Add proc knobs to show stats and dependency graph
  dept: Introduce split map concept and new APIs for them
  dept: Apply Dept to wait/event of PG_{locked,writeback}
  dept: Apply SDT to swait
  dept: Apply SDT to wait(waitqueue)
  locking/lockdep, cpu/hotplus: Use a weaker annotation in AP thread
  dept: Distinguish each syscall context from another
  dept: Distinguish each work from another
  dept: Disable Dept within the wait_bit layer by default
  dept: Add nocheck version of init_completion()
  dept: Disable Dept on struct crypto_larval's completion for now
  dept: Don't create dependencies between different depths in any case

 arch/arm64/kernel/syscall.c        |    2 +
 arch/x86/entry/common.c            |    4 +
 crypto/api.c                       |    7 +-
 include/linux/completion.h         |   50 +-
 include/linux/dept.h               |  544 +++++++
 include/linux/dept_page.h          |   78 +
 include/linux/dept_sdt.h           |   62 +
 include/linux/hardirq.h            |    3 +
 include/linux/irqflags.h           |   33 +-
 include/linux/llist.h              |    8 -
 include/linux/lockdep.h            |  157 ++-
 include/linux/lockdep_types.h      |    3 +
 include/linux/mutex.h              |   32 +
 include/linux/page-flags.h         |   45 +-
 include/linux/pagemap.h            |    7 +-
 include/linux/percpu-rwsem.h       |   10 +-
 include/linux/rtmutex.h            |    7 +
 include/linux/rwlock.h             |   50 +
 include/linux/rwlock_api_smp.h     |    8 +-
 include/linux/rwlock_types.h       |    7 +
 include/linux/rwsem.h              |   32 +
 include/linux/sched.h              |    7 +
 include/linux/seqlock.h            |   68 +-
 include/linux/spinlock.h           |   25 +
 include/linux/spinlock_types_raw.h |   13 +
 include/linux/swait.h              |    4 +
 include/linux/types.h              |    8 +
 include/linux/wait.h               |    6 +-
 init/init_task.c                   |    2 +
 init/main.c                        |    4 +
 kernel/Makefile                    |    1 +
 kernel/cpu.c                       |    2 +-
 kernel/dependency/Makefile         |    4 +
 kernel/dependency/dept.c           | 2743 ++++++++++++++++++++++++++++++++++++
 kernel/dependency/dept_hash.h      |   10 +
 kernel/dependency/dept_internal.h  |   26 +
 kernel/dependency/dept_object.h    |   13 +
 kernel/dependency/dept_proc.c      |   92 ++
 kernel/exit.c                      |    1 +
 kernel/fork.c                      |    2 +
 kernel/locking/lockdep.c           |   12 +-
 kernel/module.c                    |    2 +
 kernel/sched/completion.c          |   12 +-
 kernel/sched/core.c                |    8 +
 kernel/sched/swait.c               |   10 +
 kernel/sched/wait.c                |   16 +
 kernel/sched/wait_bit.c            |    5 +-
 kernel/softirq.c                   |    6 +-
 kernel/trace/trace_preemptirq.c    |   19 +-
 kernel/workqueue.c                 |    3 +
 lib/Kconfig.debug                  |   27 +
 mm/filemap.c                       |   68 +
 mm/page_ext.c                      |    5 +
 53 files changed, 4313 insertions(+), 60 deletions(-)
 create mode 100644 include/linux/dept.h
 create mode 100644 include/linux/dept_page.h
 create mode 100644 include/linux/dept_sdt.h
 create mode 100644 kernel/dependency/Makefile
 create mode 100644 kernel/dependency/dept.c
 create mode 100644 kernel/dependency/dept_hash.h
 create mode 100644 kernel/dependency/dept_internal.h
 create mode 100644 kernel/dependency/dept_object.h
 create mode 100644 kernel/dependency/dept_proc.c

Comments

Theodore Ts'o March 17, 2022, 3:39 a.m. UTC | #1

On Wed, Mar 16, 2022 at 11:26:12AM +0900, Byungchul Park wrote:
> I'm gonna re-add RFC for a while at Ted's request. But hard testing is
> still needed to find false alarms, because I no longer see any on my
> system. I'm gonna look for other systems that might produce false
> alarms. And I'd appreciate it if you share any alarms you see on
> yours.

Is dept1.18_on_v5.17-rc7 roughly equivalent to the v5 version sent to
the list?  The commit date is March 16th, so I assume it was.  I tried
merging it with the ext4 dev branch, and tried enabling CONFIG_DEPT
and running xfstests.  The result was nearly every test failing because
of a DEPT warning.

I assume that this is due to some misconfiguration of DEPT on my part?
And I'm curious why DEPT_WARN_ONCE is apparently getting triggered
many, many times?

[  760.990409] DEPT_WARN_ONCE: Pool(ecxt) is empty.
[  770.319656] DEPT_WARN_ONCE: Pool(ecxt) is empty.
[  772.460360] DEPT_WARN_ONCE: Pool(ecxt) is empty.
[  784.039676] DEPT_WARN_ONCE: Pool(ecxt) is empty.

(and this goes on over and over...)

Here's the full output of the DEPT warning from trying to run
generic/001.  There is a similar warning for generic/002, generic/003,
etc., for a total of 468 failures out of 495 tests run.

[  760.945068] run fstests generic/001 at 2022-03-16 08:16:53
[  760.985440] ------------[ cut here ]------------
[  760.990409] DEPT_WARN_ONCE: Pool(ecxt) is empty.
[  760.995166] WARNING: CPU: 1 PID: 73369 at kernel/dependency/dept.c:297 from_pool+0xc2/0x110
[  761.003915] CPU: 1 PID: 73369 Comm: bash Tainted: G        W         5.17.0-rc7-xfstests-00649-g5456f2312272 #520
[  761.014389] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[  761.024363] RIP: 0010:from_pool+0xc2/0x110
[  761.028598] Code: 3d 32 62 96 01 00 75 c2 48 6b db 38 48 c7 c7 00 94 f1 ad 48 89 04 24 c6 05 1a 62 96 01 01 48 8b b3 20 9a 2f ae e8 2f dd bf 00 <0f> 0b 48 8b 04 24 eb 98 48 63 c2 48 0f af 86 28 9a 2f ae 48 03 86
[  761.048189] RSP: 0018:ffffa7ce4425fd48 EFLAGS: 00010086
[  761.053617] RAX: 0000000000000000 RBX: 00000000000000a8 RCX: 0000000000000000
[  761.060965] RDX: 0000000000000001 RSI: ffffffffadfb95e0 RDI: 00000000ffffffff
[  761.068322] RBP: 00000000001dc598 R08: 0000000000000000 R09: ffffa7ce4425fb90
[  761.075789] R10: fffffffffffe0aa0 R11: fffffffffffe0ae8 R12: ffff9768e07f0600
[  761.083063] R13: 0000000000000000 R14: 0000000000000246 R15: 0000000000000000
[  761.090312] FS:  00007fd4ecc4c740(0000) GS:ffff976999400000(0000) knlGS:0000000000000000
[  761.098623] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  761.104580] CR2: 0000563c61657eb0 CR3: 00000001328fa001 CR4: 00000000003706e0
[  761.111921] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  761.119171] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  761.126617] Call Trace:
[  761.129175]  <TASK>
[  761.131385]  add_ecxt+0x54/0x1c0
[  761.134736]  ? simple_attr_write+0x87/0x100
[  761.139063]  dept_event+0xaa/0x1d0
[  761.142687]  ? simple_attr_write+0x87/0x100
[  761.147089]  __mutex_unlock_slowpath+0x60/0x2d0
[  761.151866]  simple_attr_write+0x87/0x100
[  761.155997]  debugfs_attr_write+0x40/0x60
[  761.160124]  vfs_write+0xec/0x390
[  761.163557]  ksys_write+0x68/0xe0
[  761.167004]  do_syscall_64+0x43/0x90
[  761.170782]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  761.176204] RIP: 0033:0x7fd4ecd3df33
[  761.180010] Code: 8b 15 61 ef 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18
[  761.199551] RSP: 002b:00007ffe772d4808 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  761.207240] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fd4ecd3df33
[  761.214583] RDX: 0000000000000002 RSI: 0000563c61657eb0 RDI: 0000000000000001
[  761.221835] RBP: 0000563c61657eb0 R08: 000000000000000a R09: 0000000000000001
[  761.229537] R10: 0000563c61902240 R11: 0000000000000246 R12: 0000000000000002
[  761.237239] R13: 00007fd4ece0e6a0 R14: 0000000000000002 R15: 00007fd4ece0e8a0
[  761.245283]  </TASK>
[  761.247586] ---[ end trace 0000000000000000 ]---
[  761.761829] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
[  769.903489] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.

Let me know what I should do in order to fix this DEPT_WARN_ONCE?

Thanks,

						- Ted

Byungchul Park March 18, 2022, 7:49 a.m. UTC | #2

On Wed, Mar 16, 2022 at 11:39:19PM -0400, Theodore Ts'o wrote:
> On Wed, Mar 16, 2022 at 11:26:12AM +0900, Byungchul Park wrote:
> > I'm gonna re-add RFC for a while at Ted's request. But hard testing is
> > still needed to find false alarms, because I no longer see any on my
> > system. I'm gonna look for other systems that might produce false
> > alarms. And I'd appreciate it if you share any alarms you see on
> > yours.
> 
> Is dept1.18_on_v5.17-rc7 roughly equivalent to the v5 version sent to

Yes.

> the list?  The commit date is March 16th, so I assume it was.  I tried
> merging it with the ext4 dev branch, and tried enabling CONFIG_DEPT
> and running xfstests.  The result was nearly every test failing because
> of a DEPT warning.
> 
> I assume that this is due to some misconfiguration of DEPT on my part?

I guess it was because of the commit b1fca27d384e8("kernel debug:
support resetting WARN*_ONCE"). Your script seems to reset WARN*_ONCE
repeatedly.
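
To make the effect concrete, here is a small user-space model of a
warn-once flag that a test harness clears before every test. It is only
a sketch under assumptions: warn_once_sketch(),
clear_warn_once_sketch() and run_tests() are hypothetical names, and
the kernel's real once-state handling is different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a WARN_ONCE-style flag; not kernel code. */
static bool warned_once;
static int nr_warnings;

/* Emit the warning only if it has not been emitted since the last reset. */
static void warn_once_sketch(const char *msg)
{
	if (warned_once)
		return;
	warned_once = true;
	nr_warnings++;
	fprintf(stderr, "WARN_ONCE sketch: %s\n", msg);
}

/* Models a harness clearing the once-state between tests. */
static void clear_warn_once_sketch(void)
{
	warned_once = false;
}

/* Run nr_tests tests, each hitting the same condition three times. */
static int run_tests(int nr_tests, bool clear_between_tests)
{
	assert(nr_tests >= 0);
	nr_warnings = 0;
	warned_once = false;
	for (int t = 0; t < nr_tests; t++) {
		if (clear_between_tests)
			clear_warn_once_sketch();
		for (int hit = 0; hit < 3; hit++)
			warn_once_sketch("Pool(ecxt) is empty.");
	}
	return nr_warnings;
}
```

In this model, run_tests(4, false) counts a single warning, while
run_tests(4, true) counts one warning per test, which matches the
pattern of the same message showing up again for each test.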

But, yeah. It's *too much* for Dept to warn like that just because a
pool ran empty. I will switch it to just pr_warn_once().

Plus, I will implement a way to expand the pools so that this situation
is avoided in the first place.

> And I'm curious why DEPT_WARN_ONCE is apparently getting triggered
> many, many times?
> 
> [  760.990409] DEPT_WARN_ONCE: Pool(ecxt) is empty.
> [  770.319656] DEPT_WARN_ONCE: Pool(ecxt) is empty.
> [  772.460360] DEPT_WARN_ONCE: Pool(ecxt) is empty.
> [  784.039676] DEPT_WARN_ONCE: Pool(ecxt) is empty.
> 
> (and this goes on over and over...)
> 
> Here's the full output of the DEPT warning from trying to run
> generic/001.  There is a similar warning for generic/002, generic/003,
> etc., for a total of 468 failures out of 495 tests run.

Sorry for the noise. I will prevent this as described above.

> Let me know what I should do in order to fix this DEPT_WARN_ONCE?

I will let you know once all the work is done.

Thank you very much for all your feedback,
Byungchul

Theodore Ts'o March 19, 2022, 10:49 p.m. UTC | #3

On Fri, Mar 18, 2022 at 04:49:45PM +0900, Byungchul Park wrote:
> 
> I guess it was because of the commit b1fca27d384e8("kernel debug:
> support resetting WARN*_ONCE"). Your script seems to reset WARN*_ONCE
> repeatedly.

I wasn't aware this was being done, but your guess was correct.  The
WARN_ONCE state is getting cleared between each test in xfstests, with
the rationale (which IMO is quite reasonable) for why this is done
given in the xfstests commit description:

commit c67ea2347454aebbe8eb6e825e9314d099b683da
Author: Lukas Czerner <lczerner@redhat.com>
Date:   Wed Jul 15 13:42:19 2020 +0200

    check: clear WARN_ONCE state before each test
    
    clear WARN_ONCE state before each test to allow a potential problem
    to be reported for each test
    
    [Eryu: replace "/sys/kernel/debug" with $DEBUGFS_MNT ]
    
    Signed-off-by: Lukas Czerner <lczerner@redhat.com>
    Reviewed-by: Zorro Lang <zlang@redhat.com>
    Signed-off-by: Eryu Guan <guaneryu@gmail.com>

Cheers,

						- Ted

Byungchul Park March 20, 2022, 10:55 a.m. UTC | #4

On Fri, Mar 18, 2022 at 04:49:45PM +0900, Byungchul Park wrote:
> I will let you know once all the work is done.

I have yet to decide on the design for expanding the pools on demand. I
should be careful with it because Dept works in a very low layer. I
will have it done later.

However, I temporarily sized up the pools for heavily loaded systems.
Besides that, all the work has been done. I've just updated the same
branch.
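
For readers unfamiliar with the trade-off, a fixed-capacity object pool
can be sketched in user space as below. This is only an illustration
under assumptions, not the code in kernel/dependency/dept.c: struct
pool, pool_init(), pool_get(), pool_put() and POOL_SIZE are
hypothetical names. Allocation never calls into an allocator, which
suits a low layer, but it fails once the fixed capacity is used up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define POOL_SIZE 4	/* hypothetical capacity; "sizing up" means raising this */

struct obj {
	struct obj *next_free;	/* link used only while the object is free */
	int payload;
};

struct pool {
	struct obj slots[POOL_SIZE];
	struct obj *free_list;
	bool warned;		/* print the "empty" message only once */
};

/* Put every slot on the free list. */
static void pool_init(struct pool *p)
{
	p->free_list = NULL;
	p->warned = false;
	for (int i = 0; i < POOL_SIZE; i++) {
		p->slots[i].next_free = p->free_list;
		p->free_list = &p->slots[i];
	}
}

/* Take one object, or return NULL (and print once) when none are left. */
static struct obj *pool_get(struct pool *p)
{
	struct obj *o = p->free_list;

	if (!o) {
		if (!p->warned) {
			p->warned = true;
			fprintf(stderr, "pool is empty\n");
		}
		return NULL;
	}
	p->free_list = o->next_free;
	return o;
}

/* Return an object that came from this pool. */
static void pool_put(struct pool *p, struct obj *o)
{
	assert(o >= p->slots && o < p->slots + POOL_SIZE);
	o->next_free = p->free_list;
	p->free_list = o;
}
```

With POOL_SIZE objects taken, the next pool_get() returns NULL until
something is given back with pool_put(). A larger POOL_SIZE only delays
that point, which is why expanding on demand is still worth designing.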

https://github.com/lgebyungchulpark/linux-dept/commits/dept1.18_on_v5.17-rc7

This is just for your information.

Thanks,
Byungchul