PM: sleep: Avoid calling put_device() under dpm_list_mtx

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

It is generally unsafe to call put_device() with dpm_list_mtx held,
because the given device's release routine may carry out an action
depending on that lock which then may deadlock, so modify the
system-wide suspend and resume of devices to always drop dpm_list_mtx
before calling put_device() (and adjust white space somewhat while
at it).

For instance, this prevents the following splat from showing up in
the kernel log after a system resume in certain configurations:

[ 3290.969514] ======================================================
[ 3290.969517] WARNING: possible circular locking dependency detected
[ 3290.969519] 5.15.0+ #2420 Tainted: G S               
[ 3290.969523] ------------------------------------------------------
[ 3290.969525] systemd-sleep/4553 is trying to acquire lock:
[ 3290.969529] ffff888117ab1138 ((wq_completion)hci0#2){+.+.}-{0:0}, at: flush_workqueue+0x87/0x4a0
[ 3290.969554] 
               but task is already holding lock:
[ 3290.969556] ffffffff8280fca8 (dpm_list_mtx){+.+.}-{3:3}, at: dpm_resume+0x12e/0x3e0
[ 3290.969571] 
               which lock already depends on the new lock.

[ 3290.969573] 
               the existing dependency chain (in reverse order) is:
[ 3290.969575] 
               -> #3 (dpm_list_mtx){+.+.}-{3:3}:
[ 3290.969583]        __mutex_lock+0x9d/0xa30
[ 3290.969591]        device_pm_add+0x2e/0xe0
[ 3290.969597]        device_add+0x4d5/0x8f0
[ 3290.969605]        hci_conn_add_sysfs+0x43/0xb0 [bluetooth]
[ 3290.969689]        hci_conn_complete_evt.isra.71+0x124/0x750 [bluetooth]
[ 3290.969747]        hci_event_packet+0xd6c/0x28a0 [bluetooth]
[ 3290.969798]        hci_rx_work+0x213/0x640 [bluetooth]
[ 3290.969842]        process_one_work+0x2aa/0x650
[ 3290.969851]        worker_thread+0x39/0x400
[ 3290.969859]        kthread+0x142/0x170
[ 3290.969865]        ret_from_fork+0x22/0x30
[ 3290.969872] 
               -> #2 (&hdev->lock){+.+.}-{3:3}:
[ 3290.969881]        __mutex_lock+0x9d/0xa30
[ 3290.969887]        hci_event_packet+0xba/0x28a0 [bluetooth]
[ 3290.969935]        hci_rx_work+0x213/0x640 [bluetooth]
[ 3290.969978]        process_one_work+0x2aa/0x650
[ 3290.969985]        worker_thread+0x39/0x400
[ 3290.969993]        kthread+0x142/0x170
[ 3290.969999]        ret_from_fork+0x22/0x30
[ 3290.970004] 
               -> #1 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0}:
[ 3290.970013]        process_one_work+0x27d/0x650
[ 3290.970020]        worker_thread+0x39/0x400
[ 3290.970028]        kthread+0x142/0x170
[ 3290.970033]        ret_from_fork+0x22/0x30
[ 3290.970038] 
               -> #0 ((wq_completion)hci0#2){+.+.}-{0:0}:
[ 3290.970047]        __lock_acquire+0x15cb/0x1b50
[ 3290.970054]        lock_acquire+0x26c/0x300
[ 3290.970059]        flush_workqueue+0xae/0x4a0
[ 3290.970066]        drain_workqueue+0xa1/0x130
[ 3290.970073]        destroy_workqueue+0x34/0x1f0
[ 3290.970081]        hci_release_dev+0x49/0x180 [bluetooth]
[ 3290.970130]        bt_host_release+0x1d/0x30 [bluetooth]
[ 3290.970195]        device_release+0x33/0x90
[ 3290.970201]        kobject_release+0x63/0x160
[ 3290.970211]        dpm_resume+0x164/0x3e0
[ 3290.970215]        dpm_resume_end+0xd/0x20
[ 3290.970220]        suspend_devices_and_enter+0x1a4/0xba0
[ 3290.970229]        pm_suspend+0x26b/0x310
[ 3290.970236]        state_store+0x42/0x90
[ 3290.970243]        kernfs_fop_write_iter+0x135/0x1b0
[ 3290.970251]        new_sync_write+0x125/0x1c0
[ 3290.970257]        vfs_write+0x360/0x3c0
[ 3290.970263]        ksys_write+0xa7/0xe0
[ 3290.970269]        do_syscall_64+0x3a/0x80
[ 3290.970276]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 3290.970284] 
               other info that might help us debug this:

[ 3290.970285] Chain exists of:
                 (wq_completion)hci0#2 --> &hdev->lock --> dpm_list_mtx

[ 3290.970297]  Possible unsafe locking scenario:

[ 3290.970299]        CPU0                    CPU1
[ 3290.970300]        ----                    ----
[ 3290.970302]   lock(dpm_list_mtx);
[ 3290.970306]                                lock(&hdev->lock);
[ 3290.970310]                                lock(dpm_list_mtx);
[ 3290.970314]   lock((wq_completion)hci0#2);
[ 3290.970319] 
                *** DEADLOCK ***

[ 3290.970321] 7 locks held by systemd-sleep/4553:
[ 3290.970325]  #0: ffff888103bcd448 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xa7/0xe0
[ 3290.970341]  #1: ffff888115a14488 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x103/0x1b0
[ 3290.970355]  #2: ffff888100f719e0 (kn->active#233){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x10c/0x1b0
[ 3290.970369]  #3: ffffffff82661048 (autosleep_lock){+.+.}-{3:3}, at: state_store+0x12/0x90
[ 3290.970384]  #4: ffffffff82658ac8 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x9f/0x310
[ 3290.970399]  #5: ffffffff827f2a48 (acpi_scan_lock){+.+.}-{3:3}, at: acpi_suspend_begin+0x4c/0x80
[ 3290.970416]  #6: ffffffff8280fca8 (dpm_list_mtx){+.+.}-{3:3}, at: dpm_resume+0x12e/0x3e0
[ 3290.970428] 
               stack backtrace:
[ 3290.970431] CPU: 3 PID: 4553 Comm: systemd-sleep Tainted: G S                5.15.0+ #2420
[ 3290.970438] Hardware name: Dell Inc. XPS 13 9380/0RYJWW, BIOS 1.5.0 06/03/2019
[ 3290.970441] Call Trace:
[ 3290.970446]  dump_stack_lvl+0x44/0x57
[ 3290.970454]  check_noncircular+0x105/0x120
[ 3290.970468]  ? __lock_acquire+0x15cb/0x1b50
[ 3290.970474]  __lock_acquire+0x15cb/0x1b50
[ 3290.970487]  lock_acquire+0x26c/0x300
[ 3290.970493]  ? flush_workqueue+0x87/0x4a0
[ 3290.970503]  ? __raw_spin_lock_init+0x3b/0x60
[ 3290.970510]  ? lockdep_init_map_type+0x58/0x240
[ 3290.970519]  flush_workqueue+0xae/0x4a0
[ 3290.970526]  ? flush_workqueue+0x87/0x4a0
[ 3290.970544]  ? drain_workqueue+0xa1/0x130
[ 3290.970552]  drain_workqueue+0xa1/0x130
[ 3290.970561]  destroy_workqueue+0x34/0x1f0
[ 3290.970572]  hci_release_dev+0x49/0x180 [bluetooth]
[ 3290.970624]  bt_host_release+0x1d/0x30 [bluetooth]
[ 3290.970687]  device_release+0x33/0x90
[ 3290.970695]  kobject_release+0x63/0x160
[ 3290.970705]  dpm_resume+0x164/0x3e0
[ 3290.970710]  ? dpm_resume_early+0x251/0x3b0
[ 3290.970718]  dpm_resume_end+0xd/0x20
[ 3290.970723]  suspend_devices_and_enter+0x1a4/0xba0
[ 3290.970737]  pm_suspend+0x26b/0x310
[ 3290.970746]  state_store+0x42/0x90
[ 3290.970755]  kernfs_fop_write_iter+0x135/0x1b0
[ 3290.970764]  new_sync_write+0x125/0x1c0
[ 3290.970777]  vfs_write+0x360/0x3c0
[ 3290.970785]  ksys_write+0xa7/0xe0
[ 3290.970794]  do_syscall_64+0x3a/0x80
[ 3290.970803]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 3290.970811] RIP: 0033:0x7f41b1328164
[ 3290.970819] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 4a d2 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83
[ 3290.970824] RSP: 002b:00007ffe6ae21b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3290.970831] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f41b1328164
[ 3290.970836] RDX: 0000000000000004 RSI: 000055965e651070 RDI: 0000000000000004
[ 3290.970839] RBP: 000055965e651070 R08: 000055965e64f390 R09: 00007f41b1e3d1c0
[ 3290.970843] R10: 000000000000000a R11: 0000000000000246 R12: 0000000000000004
[ 3290.970846] R13: 0000000000000001 R14: 000055965e64f2b0 R15: 0000000000000004

Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c |   84 +++++++++++++++++++++++++++++++---------------
 1 file changed, 57 insertions(+), 27 deletions(-)

Message ID	4693240.31r3eYUQgx@kreacher (mailing list archive)
State	Not Applicable
Headers	show Return-Path: <linux-bluetooth-owner@kernel.org> From: "Rafael J. Wysocki" <rjw@rjwysocki.net> To: Linux PM <linux-pm@vger.kernel.org> Cc: LKML <linux-kernel@vger.kernel.org>, Ulf Hansson <ulf.hansson@linaro.org>, Alan Stern <stern@rowland.harvard.edu>, linux-bluetooth@vger.kernel.org, Marcel Holtmann <marcel@holtmann.org>, Johan Hedberg <johan.hedberg@gmail.com>, Luiz Augusto von Dentz <luiz.dentz@gmail.com> Subject: [PATCH] PM: sleep: Avoid calling put_device() under dpm_list_mtx Date: Thu, 04 Nov 2021 18:26:26 +0100 Message-ID: <4693240.31r3eYUQgx@kreacher> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="UTF-8" Precedence: bulk
Series	PM: sleep: Avoid calling put_device() under dpm_list_mtx \| expand PM: sleep: Avoid calling put_device() under dpm_list_mtx

Context	Check	Description
tedd_an/checkpatch	fail	PM: sleep: Avoid calling put_device() under dpm_list_mtx\WARNING:BAD_STABLE_ADDRESS_STYLE: Invalid email format for stable: 'All applicable <stable@vger.kernel.org>', prefer 'stable@vger.kernel.org' #206: Cc: All applicable <stable@vger.kernel.org> total: 0 errors, 1 warnings, 194 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. /github/workspace/src/12603575.patch has style problems, please review. NOTE: Ignored message types: UNKNOWN_COMMIT_ID NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS.
tedd_an/gitlint	fail	PM: sleep: Avoid calling put_device() under dpm_list_mtx 17: B2 Line has trailing whitespace: "[ 3290.969519] 5.15.0+ #2420 Tainted: G S " 20: B1 Line exceeds max length (99>80): "[ 3290.969529] ffff888117ab1138 ((wq_completion)hci0#2){+.+.}-{0:0}, at: flush_workqueue+0x87/0x4a0" 21: B2 Line has trailing whitespace: "[ 3290.969554] " 23: B1 Line exceeds max length (86>80): "[ 3290.969556] ffffffff8280fca8 (dpm_list_mtx){+.+.}-{3:3}, at: dpm_resume+0x12e/0x3e0" 24: B2 Line has trailing whitespace: "[ 3290.969571] " 27: B2 Line has trailing whitespace: "[ 3290.969573] " 29: B2 Line has trailing whitespace: "[ 3290.969575] " 42: B2 Line has trailing whitespace: "[ 3290.969872] " 51: B2 Line has trailing whitespace: "[ 3290.970004] " 57: B2 Line has trailing whitespace: "[ 3290.970038] " 79: B2 Line has trailing whitespace: "[ 3290.970284] " 93: B2 Line has trailing whitespace: "[ 3290.970319] " 97: B1 Line exceeds max length (89>80): "[ 3290.970325] #0: ffff888103bcd448 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xa7/0xe0" 98: B1 Line exceeds max length (100>80): "[ 3290.970341] #1: ffff888115a14488 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x103/0x1b0" 99: B1 Line exceeds max length (104>80): "[ 3290.970355] #2: ffff888100f719e0 (kn->active#233){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x10c/0x1b0" 100: B1 Line exceeds max length (92>80): "[ 3290.970369] #3: ffffffff82661048 (autosleep_lock){+.+.}-{3:3}, at: state_store+0x12/0x90" 101: B1 Line exceeds max length (101>80): "[ 3290.970384] #4: ffffffff82658ac8 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x9f/0x310" 102: B1 Line exceeds max length (99>80): "[ 3290.970399] #5: ffffffff827f2a48 (acpi_scan_lock){+.+.}-{3:3}, at: acpi_suspend_begin+0x4c/0x80" 103: B1 Line exceeds max length (91>80): "[ 3290.970416] #6: ffffffff8280fca8 (dpm_list_mtx){+.+.}-{3:3}, at: dpm_resume+0x12e/0x3e0" 104: B2 Line has trailing whitespace: "[ 3290.970428] " 106: B1 Line exceeds max length (93>80): "[ 3290.970431] CPU: 3 PID: 4553 Comm: systemd-sleep Tainted: G S 5.15.0+ #2420" 107: B1 Line exceeds max length (81>80): "[ 3290.970438] Hardware name: Dell Inc. XPS 13 9380/0RYJWW, BIOS 1.5.0 06/03/2019" 139: B1 Line exceeds max length (214>80): "[ 3290.970819] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 4a d2 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83" 140: B1 Line exceeds max length (85>80): "[ 3290.970824] RSP: 002b:00007ffe6ae21b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001"
tedd_an/buildkernel	success	Build Kernel PASS
tedd_an/testrunnersetup	success	Test Runner Setup PASS
tedd_an/testrunnerl2cap-tester	success	Total: 40, Passed: 40 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnerbnep-tester	success	Total: 1, Passed: 1 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnermgmt-tester	success	Total: 492, Passed: 492 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnerrfcomm-tester	success	Total: 9, Passed: 9 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnersco-tester	success	Total: 12, Passed: 12 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnersmp-tester	success	Total: 8, Passed: 8 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunneruserchan-tester	success	Total: 4, Passed: 4 (100.0%), Failed: 0, Not Run: 0

PM: sleep: Avoid calling put_device() under dpm_list_mtx

Checks

Commit Message

Comments

Patch