diff mbox series

[v1] Revert "Bluetooth: core: Fix missing power_on work cancel on HCI close"

Message ID 20220614181706.26513-1-max.oss.09@gmail.com (mailing list archive)
State New, archived
Headers show
Series [v1] Revert "Bluetooth: core: Fix missing power_on work cancel on HCI close" | expand

Checks

Context Check Description
tedd_an/pre-ci_am fail error: patch failed: net/bluetooth/hci_core.c:2675 error: net/bluetooth/hci_core.c: patch does not apply hint: Use 'git am --show-current-patch' to see the failed patch

Commit Message

Max Krummenacher June 14, 2022, 6:17 p.m. UTC
From: Max Krummenacher <max.krummenacher@toradex.com>

This reverts commit ff7f2926114d3a50f5ffe461a9bce8d761748da5.

The commit ff7f2926114d ("Bluetooth: core: Fix missing power_on work
cancel on HCI close") introduced between v5.18 and v5.19-rc1 makes
going to suspend freeze. v5.19-rc2 is equally affected.

This has been seen on a Colibri iMX6ULL WB which has a Marvell 8997
based WiFi / Bluetooth module connected over SDIO.

With 'v5.18' or 'v5.19-rc1 with said commit reverted' a suspend/resume
cycle looks as follows and the device is functional after the resume:

root@imx6ull:~# rfkill
ID TYPE      DEVICE    SOFT      HARD
 0 bluetooth hci0   blocked unblocked
 1 wlan      phy0   blocked unblocked
root@imx6ull:~# echo enabled > /sys/class/tty/ttymxc0/power/wakeup
root@imx6ull:~# date;echo mem > /sys/power/state;date
Tue Jun 14 14:43:03 UTC 2022
[ 6393.464497] PM: suspend entry (deep)
[ 6393.529398] Filesystems sync: 0.064 seconds
[ 6393.594006] Freezing user space processes ... (elapsed 0.015 seconds) done.
[ 6393.610266] OOM killer disabled.
[ 6393.610285] Freezing remaining freezable tasks ... (elapsed 0.013 seconds) done.
[ 6393.623727] printk: Suspending console(s) (use no_console_suspend to debug)

~~ suspended until console initiates the resume

[ 6394.023552] fec 20b4000.ethernet eth0: Link is Down
[ 6394.049902] PM: suspend devices took 0.300 seconds
[ 6394.091654] Disabling non-boot CPUs ...
[ 6394.565896] PM: resume devices took 0.440 seconds
[ 6394.681350] OOM killer enabled.
[ 6394.681369] Restarting tasks ... done.
[ 6394.741157] random: crng reseeded on system resumption
[ 6394.813135] PM: suspend exit
Tue Jun 14 14:43:11 UTC 2022
[ 6396.403873] fec 20b4000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
[ 6396.404347] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
root@imx6ull:~#

With 'v5.19-rc1' suspend freezes in the suspend phase, i.e. power
consumption is not lowered and no wakeup source initiates a wakup.

root@imx6ull:~# rfkill
ID TYPE      DEVICE    SOFT      HARD
 0 bluetooth hci0   blocked unblocked
 1 wlan      phy0   blocked unblocked
root@imx6ull:~# echo enabled > /sys/class/tty/ttymxc0/power/wakeup
root@imx6ull:~# date;echo mem > /sys/power/state;date
Tue Jun 14 12:40:38 UTC 2022
[  122.476333] PM: suspend entry (deep)
[  122.556012] Filesystems sync: 0.079 seconds

~~ no further kernel output

If one first unbinds the bluetooth device driver, suspend / resume works
as expected also with 'v5.19-rc1':

root@imx6ull:~# echo mmc1:0001:2 > /sys/bus/sdio/drivers/btmrvl_sdio/unbind
root@imx6ull:~# rfkill
ID TYPE DEVICE    SOFT      HARD
 1 wlan phy0   blocked unblocked
root@imx6ull:~# echo enabled > /sys/class/tty/ttymxc0/power/wakeup
root@imx6ull:~# date;echo mem > /sys/power/state;date
Tue Jun 14 14:59:26 UTC 2022
[  123.530310] PM: suspend entry (deep)
[  123.595432] Filesystems sync: 0.064 seconds
[  123.672478] Freezing user space processes ... (elapsed 0.028 seconds) done.
[  123.701848] OOM killer disabled.
[  123.701869] Freezing remaining freezable tasks ... (elapsed 0.007 seconds) done.
[  123.709993] printk: Suspending console(s) (use no_console_suspend to debug)
[  124.097772] fec 20b4000.ethernet eth0: Link is Down
[  124.124795] PM: suspend devices took 0.280 seconds
[  124.165893] Disabling non-boot CPUs ...
[  124.632959] PM: resume devices took 0.430 seconds
[  124.750164] OOM killer enabled.
[  124.750187] Restarting tasks ... done.
[  124.827899] random: crng reseeded on system resumption
[  124.923183] PM: suspend exit
Tue Jun 14 14:59:31 UTC 2022
[  127.520321] fec 20b4000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
[  127.520514] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
root@imx6ull:~#

Signed-off-by: Max Krummenacher <max.krummenacher@toradex.com>

---

 net/bluetooth/hci_core.c | 2 ++
 net/bluetooth/hci_sync.c | 1 -
 2 files changed, 2 insertions(+), 1 deletion(-)

Comments

bluez.test.bot@gmail.com June 14, 2022, 6:53 p.m. UTC | #1
This is an automated email and please do not reply to this email.

Dear Submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
While preparing the CI tests, the patches you submitted couldn't be applied to the current HEAD of the repository.

----- Output -----
error: patch failed: net/bluetooth/hci_core.c:2675
error: net/bluetooth/hci_core.c: patch does not apply
hint: Use 'git am --show-current-patch' to see the failed patch


Please resolve the issue and submit the patches again.


---
Regards,
Linux Bluetooth
Francesco Dolcini June 21, 2022, 11:55 a.m. UTC | #2
Marcel, Vasyl,
any comment on this?

Francesco

On Tue, Jun 14, 2022 at 08:17:06PM +0200, Max Krummenacher wrote:
> From: Max Krummenacher <max.krummenacher@toradex.com>
> 
> This reverts commit ff7f2926114d3a50f5ffe461a9bce8d761748da5.
> 
> The commit ff7f2926114d ("Bluetooth: core: Fix missing power_on work
> cancel on HCI close") introduced between v5.18 and v5.19-rc1 makes
> going to suspend freeze. v5.19-rc2 is equally affected.
> 
> This has been seen on a Colibri iMX6ULL WB which has a Marvell 8997
> based WiFi / Bluetooth module connected over SDIO.
> 
> With 'v5.18' or 'v5.19-rc1 with said commit reverted' a suspend/resume
> cycle looks as follows and the device is functional after the resume:
> 
> root@imx6ull:~# rfkill
> ID TYPE      DEVICE    SOFT      HARD
>  0 bluetooth hci0   blocked unblocked
>  1 wlan      phy0   blocked unblocked
> root@imx6ull:~# echo enabled > /sys/class/tty/ttymxc0/power/wakeup
> root@imx6ull:~# date;echo mem > /sys/power/state;date
> Tue Jun 14 14:43:03 UTC 2022
> [ 6393.464497] PM: suspend entry (deep)
> [ 6393.529398] Filesystems sync: 0.064 seconds
> [ 6393.594006] Freezing user space processes ... (elapsed 0.015 seconds) done.
> [ 6393.610266] OOM killer disabled.
> [ 6393.610285] Freezing remaining freezable tasks ... (elapsed 0.013 seconds) done.
> [ 6393.623727] printk: Suspending console(s) (use no_console_suspend to debug)
> 
> ~~ suspended until console initiates the resume
> 
> [ 6394.023552] fec 20b4000.ethernet eth0: Link is Down
> [ 6394.049902] PM: suspend devices took 0.300 seconds
> [ 6394.091654] Disabling non-boot CPUs ...
> [ 6394.565896] PM: resume devices took 0.440 seconds
> [ 6394.681350] OOM killer enabled.
> [ 6394.681369] Restarting tasks ... done.
> [ 6394.741157] random: crng reseeded on system resumption
> [ 6394.813135] PM: suspend exit
> Tue Jun 14 14:43:11 UTC 2022
> [ 6396.403873] fec 20b4000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> [ 6396.404347] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> root@imx6ull:~#
> 
> With 'v5.19-rc1' suspend freezes in the suspend phase, i.e. power
> consumption is not lowered and no wakeup source initiates a wakup.
> 
> root@imx6ull:~# rfkill
> ID TYPE      DEVICE    SOFT      HARD
>  0 bluetooth hci0   blocked unblocked
>  1 wlan      phy0   blocked unblocked
> root@imx6ull:~# echo enabled > /sys/class/tty/ttymxc0/power/wakeup
> root@imx6ull:~# date;echo mem > /sys/power/state;date
> Tue Jun 14 12:40:38 UTC 2022
> [  122.476333] PM: suspend entry (deep)
> [  122.556012] Filesystems sync: 0.079 seconds
> 
> ~~ no further kernel output
> 
> If one first unbinds the bluetooth device driver, suspend / resume works
> as expected also with 'v5.19-rc1':
> 
> root@imx6ull:~# echo mmc1:0001:2 > /sys/bus/sdio/drivers/btmrvl_sdio/unbind
> root@imx6ull:~# rfkill
> ID TYPE DEVICE    SOFT      HARD
>  1 wlan phy0   blocked unblocked
> root@imx6ull:~# echo enabled > /sys/class/tty/ttymxc0/power/wakeup
> root@imx6ull:~# date;echo mem > /sys/power/state;date
> Tue Jun 14 14:59:26 UTC 2022
> [  123.530310] PM: suspend entry (deep)
> [  123.595432] Filesystems sync: 0.064 seconds
> [  123.672478] Freezing user space processes ... (elapsed 0.028 seconds) done.
> [  123.701848] OOM killer disabled.
> [  123.701869] Freezing remaining freezable tasks ... (elapsed 0.007 seconds) done.
> [  123.709993] printk: Suspending console(s) (use no_console_suspend to debug)
> [  124.097772] fec 20b4000.ethernet eth0: Link is Down
> [  124.124795] PM: suspend devices took 0.280 seconds
> [  124.165893] Disabling non-boot CPUs ...
> [  124.632959] PM: resume devices took 0.430 seconds
> [  124.750164] OOM killer enabled.
> [  124.750187] Restarting tasks ... done.
> [  124.827899] random: crng reseeded on system resumption
> [  124.923183] PM: suspend exit
> Tue Jun 14 14:59:31 UTC 2022
> [  127.520321] fec 20b4000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> [  127.520514] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> root@imx6ull:~#
> 
> Signed-off-by: Max Krummenacher <max.krummenacher@toradex.com>
> 
> ---
> 
>  net/bluetooth/hci_core.c | 2 ++
>  net/bluetooth/hci_sync.c | 1 -
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index 59a5c1341c26..19df3905c5f8 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -2675,6 +2675,8 @@ void hci_unregister_dev(struct hci_dev *hdev)
>  	list_del(&hdev->list);
>  	write_unlock(&hci_dev_list_lock);
>  
> +	cancel_work_sync(&hdev->power_on);
> +
>  	hci_cmd_sync_clear(hdev);
>  
>  	if (!test_bit(HCI_QUIRK_NO_SUSPEND_NOTIFIER, &hdev->quirks))
> diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
> index 286d6767f017..1739e8cb3291 100644
> --- a/net/bluetooth/hci_sync.c
> +++ b/net/bluetooth/hci_sync.c
> @@ -4088,7 +4088,6 @@ int hci_dev_close_sync(struct hci_dev *hdev)
>  
>  	bt_dev_dbg(hdev, "");
>  
> -	cancel_work_sync(&hdev->power_on);
>  	cancel_delayed_work(&hdev->power_off);
>  	cancel_delayed_work(&hdev->ncmd_timer);
>  
> -- 
> 2.20.1
>
Jakub Kicinski June 23, 2022, 7:51 p.m. UTC | #3
On Tue, 21 Jun 2022 13:55:14 +0200 Francesco Dolcini wrote:
> Marcel, Vasyl,
> any comment on this?

+1 is there an ETA on the fix getting into net?

> On Tue, Jun 14, 2022 at 08:17:06PM +0200, Max Krummenacher wrote:
> > From: Max Krummenacher <max.krummenacher@toradex.com>
> > 
> > This reverts commit ff7f2926114d3a50f5ffe461a9bce8d761748da5.
> > 
> > The commit ff7f2926114d ("Bluetooth: core: Fix missing power_on work
> > cancel on HCI close") introduced between v5.18 and v5.19-rc1 makes
> > going to suspend freeze. v5.19-rc2 is equally affected.
> > 
> > This has been seen on a Colibri iMX6ULL WB which has a Marvell 8997
> > based WiFi / Bluetooth module connected over SDIO.
> > 
> > With 'v5.18' or 'v5.19-rc1 with said commit reverted' a suspend/resume
> > cycle looks as follows and the device is functional after the resume:
Mateusz Jończyk July 4, 2022, 7:56 p.m. UTC | #4
W dniu 14.06.2022 o 20:17, Max Krummenacher pisze:
> From: Max Krummenacher <max.krummenacher@toradex.com>
>
> This reverts commit ff7f2926114d3a50f5ffe461a9bce8d761748da5.
>
> The commit ff7f2926114d ("Bluetooth: core: Fix missing power_on work
> cancel on HCI close") introduced between v5.18 and v5.19-rc1 makes
> going to suspend freeze. v5.19-rc2 is equally affected.
>
> This has been seen on a Colibri iMX6ULL WB which has a Marvell 8997
> based WiFi / Bluetooth module connected over SDIO.
> [...]

Hello,

commit ff7f2926114d ("Bluetooth: core: Fix missing power_on work cancel on HCI close")

causes problems also on my laptop (HP 17-by0001nw with a Realtek Bluetooth adapter).

I have Bluetooth disabled by default on startup (via systemd-rfkill.service ) and
vanilla kernel 5.19.0-rc5 fails to suspend (the screen turns black, but I am then able to
touch a trackpad and log in). Reverting that commit on top of 5.19.0-rc5 fixes the issue.

On bare 5.19.0-rc5, after startup, the kworker/u9:0+hci0 process hangs indefinitely
with this stacktrace (obtained through "cat /proc/163/stack" )

        [<0>] __flush_work+0x143/0x220
        [<0>] __cancel_work_timer+0x122/0x1a0
        [<0>] cancel_work_sync+0x10/0x20
        [<0>] hci_dev_close_sync+0x2a/0x550 [bluetooth]
        [<0>] hci_dev_do_close+0x2a/0x60 [bluetooth]
        [<0>] hci_power_on+0x91/0x200 [bluetooth]
        [<0>] process_one_work+0x21c/0x3c0
        [<0>] worker_thread+0x4a/0x3a0
        [<0>] kthread+0xcf/0xf0
        [<0>] ret_from_fork+0x22/0x30

It appears that the hci_power_on() function calls hci_dev_do_close() in this block:

    if (hci_dev_test_flag(hdev, HCI_RFKILLED) ||
         /* [...] */ ) {
        hci_dev_clear_flag(hdev, HCI_AUTO_OFF);
        hci_dev_do_close(hdev);
    } else /* [...] */

which then calls hci_dev_close_sync(). With the problematic commit, that function
calls

       cancel_work_sync(&hdev->power_on)

which tries to cancel the execution of the hci_power_on() function's itself, which leads to a deadlock.

When trying to suspend, the "/lib/systemd/systemd-sleep suspend" process hangs on

        [<0>] hci_suspend_dev+0x87/0xf0 [bluetooth]
        [<0>] hci_suspend_notifier+0x38/0x80 [bluetooth]
        [<0>] notifier_call_chain_robust+0x5e/0xc0
        [<0>] blocking_notifier_call_chain_robust+0x42/0x60
        [<0>] pm_notifier_call_chain_robust+0x1d/0x40
        [<0>] pm_suspend+0x116/0x5a0
        [<0>] state_store+0x82/0xe0
        [<0>] kobj_attr_store+0x12/0x20
        [<0>] sysfs_kf_write+0x3e/0x50
        [<0>] kernfs_fop_write_iter+0x138/0x1c0
        [<0>] new_sync_write+0x104/0x180
        [<0>] vfs_write+0x1d7/0x260
        [<0>] ksys_write+0x67/0xe0
        [<0>] __x64_sys_write+0x1a/0x20
        [<0>] do_syscall_64+0x3b/0x90
        [<0>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

My device is:

Bus 001 Device 005: ID 0bda:b00b Realtek Semiconductor Corp. Bluetooth Radio

So probably the
commit ff7f2926114d ("Bluetooth: core: Fix missing power_on work cancel on HCI close")
should be reverted or corrected.

Greetings,

Mateusz Jończyk

> Signed-off-by: Max Krummenacher <max.krummenacher@toradex.com>
> ---
>
>  net/bluetooth/hci_core.c | 2 ++
>  net/bluetooth/hci_sync.c | 1 -
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index 59a5c1341c26..19df3905c5f8 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -2675,6 +2675,8 @@ void hci_unregister_dev(struct hci_dev *hdev)
>  	list_del(&hdev->list);
>  	write_unlock(&hci_dev_list_lock);
>  
> +	cancel_work_sync(&hdev->power_on);
> +
>  	hci_cmd_sync_clear(hdev);
>  
>  	if (!test_bit(HCI_QUIRK_NO_SUSPEND_NOTIFIER, &hdev->quirks))
> diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
> index 286d6767f017..1739e8cb3291 100644
> --- a/net/bluetooth/hci_sync.c
> +++ b/net/bluetooth/hci_sync.c
> @@ -4088,7 +4088,6 @@ int hci_dev_close_sync(struct hci_dev *hdev)
>  
>  	bt_dev_dbg(hdev, "");
>  
> -	cancel_work_sync(&hdev->power_on);
>  	cancel_delayed_work(&hdev->power_off);
>  	cancel_delayed_work(&hdev->ncmd_timer);
>
Max Krummenacher July 5, 2022, 2:09 p.m. UTC | #5
On Mon, Jul 4, 2022 at 9:57 PM Mateusz Jończyk <mat.jonczyk@o2.pl> wrote:
>
> W dniu 14.06.2022 o 20:17, Max Krummenacher pisze:
> > From: Max Krummenacher <max.krummenacher@toradex.com>
> >
> > This reverts commit ff7f2926114d3a50f5ffe461a9bce8d761748da5.
> >
> > The commit ff7f2926114d ("Bluetooth: core: Fix missing power_on work
> > cancel on HCI close") introduced between v5.18 and v5.19-rc1 makes
> > going to suspend freeze. v5.19-rc2 is equally affected.

The following follow up commit fixes the issue for me without the revert.
https://lore.kernel.org/all/20220705125931.3601-1-vasyl.vavrychuk@opensynergy.com/

Thanks
Max
diff mbox series

Patch

diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 59a5c1341c26..19df3905c5f8 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -2675,6 +2675,8 @@  void hci_unregister_dev(struct hci_dev *hdev)
 	list_del(&hdev->list);
 	write_unlock(&hci_dev_list_lock);
 
+	cancel_work_sync(&hdev->power_on);
+
 	hci_cmd_sync_clear(hdev);
 
 	if (!test_bit(HCI_QUIRK_NO_SUSPEND_NOTIFIER, &hdev->quirks))
diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
index 286d6767f017..1739e8cb3291 100644
--- a/net/bluetooth/hci_sync.c
+++ b/net/bluetooth/hci_sync.c
@@ -4088,7 +4088,6 @@  int hci_dev_close_sync(struct hci_dev *hdev)
 
 	bt_dev_dbg(hdev, "");
 
-	cancel_work_sync(&hdev->power_on);
 	cancel_delayed_work(&hdev->power_off);
 	cancel_delayed_work(&hdev->ncmd_timer);