diff mbox series

[v4] USB: Fix the issue of task recovery failure caused by USB status when S4 wakes up

Message ID 20241024024038.26157-1-duanchenghao@kylinos.cn (mailing list archive)
State New
Headers show
Series [v4] USB: Fix the issue of task recovery failure caused by USB status when S4 wakes up | expand

Commit Message

duanchenghao Oct. 24, 2024, 2:40 a.m. UTC
When a device is inserted into the USB port and an S4 wakeup is initiated,
after the USB-hub initialization is completed, it will automatically enter
suspend mode. Upon detecting a device on the USB port, it will proceed with
resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state. During the S4
wakeup process, peripherals are put into suspend mode, followed by task
recovery. However, upon detecting that the hcd is in the
HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY status, causing the
S4 suspend to fail and subsequent task recovery to not proceed.
-
[   27.594598][ 1]  PM: pci_pm_freeze(): hcd_pci_suspend+0x0/0x28 returns -16
[   27.594601][ 1]  PM: dpm_run_callback(): pci_pm_freeze+0x0/0x100 returns -16
[   27.603420][ 1]  ehci-pci 0000:00:04.1: pci_pm_freeze+0x0/0x100 returned 0 after 3 usecs
[   27.612233][ 1]  ehci-pci 0000:00:05.1: pci_pm_freeze+0x0/0x100 returned -16 after 17223 usecs
[   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce async: error -16
[   27.816988][ 1]  PM: quiesce of devices aborted after 1833.282 msecs
[   27.823302][ 1]  PM: start quiesce of devices aborted after 1839.975 msecs
......
[   31.303172][ 1]  PM: recover of devices complete after 3473.039 msecs
[   31.309818][ 1]  PM: Failed to load hibernation image, recovering.
[   31.348188][ 1]  PM: Basic memory bitmaps freed
[   31.352686][ 1]  OOM killer enabled.
[   31.356232][ 1]  Restarting tasks ... done.
[   31.360609][ 1]  PM: resume from hibernation failed (0)
[   31.365800][ 1]  PM: Hibernation image not present or could not be loaded.

The "do_wakeup" is determined based on whether the controller's
power/wakeup attribute is set. The current issue necessitates considering
the type of suspend that is occurring. If the suspend type is either
PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should be set to
false.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Duan Chenghao <duanchenghao@kylinos.cn>
---
 drivers/usb/core/hcd-pci.c | 15 +++++++++++++--
 include/linux/pm.h         |  3 ++-
 2 files changed, 15 insertions(+), 3 deletions(-)

Comments

duanchenghao Oct. 24, 2024, 2:53 a.m. UTC | #1
Hi Alan,

The current patch has been tested on 6 machines and has passed the
tests, with each machine undergoing 500 cycles of testing.

Saranya also replied in the email that she conducted 1,500 tests and
all were successful.


Thanks
Duan Chenghao 

在 2024-10-24星期四的 10:40 +0800,Duan Chenghao写道:
> When a device is inserted into the USB port and an S4 wakeup is
> initiated,
> after the USB-hub initialization is completed, it will automatically
> enter
> suspend mode. Upon detecting a device on the USB port, it will
> proceed with
> resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state. During
> the S4
> wakeup process, peripherals are put into suspend mode, followed by
> task
> recovery. However, upon detecting that the hcd is in the
> HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY status,
> causing the
> S4 suspend to fail and subsequent task recovery to not proceed.
> -
> [   27.594598][ 1]  PM: pci_pm_freeze(): hcd_pci_suspend+0x0/0x28
> returns -16
> [   27.594601][ 1]  PM: dpm_run_callback(): pci_pm_freeze+0x0/0x100
> returns -16
> [   27.603420][ 1]  ehci-pci 0000:00:04.1: pci_pm_freeze+0x0/0x100
> returned 0 after 3 usecs
> [   27.612233][ 1]  ehci-pci 0000:00:05.1: pci_pm_freeze+0x0/0x100
> returned -16 after 17223 usecs
> [   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce async:
> error -16
> [   27.816988][ 1]  PM: quiesce of devices aborted after 1833.282
> msecs
> [   27.823302][ 1]  PM: start quiesce of devices aborted after
> 1839.975 msecs
> ......
> [   31.303172][ 1]  PM: recover of devices complete after 3473.039
> msecs
> [   31.309818][ 1]  PM: Failed to load hibernation image, recovering.
> [   31.348188][ 1]  PM: Basic memory bitmaps freed
> [   31.352686][ 1]  OOM killer enabled.
> [   31.356232][ 1]  Restarting tasks ... done.
> [   31.360609][ 1]  PM: resume from hibernation failed (0)
> [   31.365800][ 1]  PM: Hibernation image not present or could not be
> loaded.
> 
> The "do_wakeup" is determined based on whether the controller's
> power/wakeup attribute is set. The current issue necessitates
> considering
> the type of suspend that is occurring. If the suspend type is either
> PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should be set
> to
> false.
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Closes:
> https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> Signed-off-by: Duan Chenghao <duanchenghao@kylinos.cn>
> ---
>  drivers/usb/core/hcd-pci.c | 15 +++++++++++++--
>  include/linux/pm.h         |  3 ++-
>  2 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
> index a08f3f228e6d..390b1225f3cc 100644
> --- a/drivers/usb/core/hcd-pci.c
> +++ b/drivers/usb/core/hcd-pci.c
> @@ -422,7 +422,12 @@ static int suspend_common(struct device *dev,
> pm_message_t msg)
>         bool                    do_wakeup;
>         int                     retval;
>  
> -       do_wakeup = PMSG_IS_AUTO(msg) ? true :
> device_may_wakeup(dev);
> +       if (PMSG_IS_AUTO(msg))
> +               do_wakeup = true;
> +       else if (PMSG_NO_WAKEUP(msg))
> +               do_wakeup = false;
> +       else
> +               do_wakeup = device_may_wakeup(dev);
>  
>         /* Root hub suspend should have stopped all downstream
> traffic,
>          * and all bus master traffic.  And done so for both the
> interface
> @@ -521,6 +526,11 @@ static int hcd_pci_suspend(struct device *dev)
>         return suspend_common(dev, PMSG_SUSPEND);
>  }
>  
> +static int hcd_pci_freeze(struct device *dev)
> +{
> +       return suspend_common(dev, PMSG_FREEZE);
> +}
> +
>  static int hcd_pci_suspend_noirq(struct device *dev)
>  {
>         struct pci_dev          *pci_dev = to_pci_dev(dev);
> @@ -590,6 +600,7 @@ static int hcd_pci_restore(struct device *dev)
>  #else
>  
>  #define hcd_pci_suspend                NULL
> +#define hcd_pci_freeze                 NULL
>  #define hcd_pci_suspend_noirq  NULL
>  #define hcd_pci_poweroff_late  NULL
>  #define hcd_pci_resume_noirq   NULL
> @@ -624,7 +635,7 @@ const struct dev_pm_ops usb_hcd_pci_pm_ops = {
>         .suspend_noirq  = hcd_pci_suspend_noirq,
>         .resume_noirq   = hcd_pci_resume_noirq,
>         .resume         = hcd_pci_resume,
> -       .freeze         = hcd_pci_suspend,
> +       .freeze         = hcd_pci_freeze,
>         .freeze_noirq   = check_root_hub_suspended,
>         .thaw_noirq     = NULL,
>         .thaw           = hcd_pci_resume,
> diff --git a/include/linux/pm.h b/include/linux/pm.h
> index 97b0e23363c8..2a676b2cb699 100644
> --- a/include/linux/pm.h
> +++ b/include/linux/pm.h
> @@ -570,7 +570,8 @@ const struct dev_pm_ops name = { \
>                                         { .event =
> PM_EVENT_AUTO_RESUME, })
>  
>  #define PMSG_IS_AUTO(msg)      (((msg).event & PM_EVENT_AUTO) != 0)
> -
> +#define PMSG_NO_WAKEUP(msg)    (((msg).event & \
> +                               (PM_EVENT_FREEZE | PM_EVENT_QUIESCE))
> != 0)
>  /*
>   * Device run-time power management status.
>   *
Greg Kroah-Hartman Oct. 24, 2024, 7:05 a.m. UTC | #2
On Thu, Oct 24, 2024 at 10:40:38AM +0800, Duan Chenghao wrote:
> When a device is inserted into the USB port and an S4 wakeup is initiated,
> after the USB-hub initialization is completed, it will automatically enter
> suspend mode. Upon detecting a device on the USB port, it will proceed with
> resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state. During the S4
> wakeup process, peripherals are put into suspend mode, followed by task
> recovery. However, upon detecting that the hcd is in the
> HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY status, causing the
> S4 suspend to fail and subsequent task recovery to not proceed.
> -
> [   27.594598][ 1]  PM: pci_pm_freeze(): hcd_pci_suspend+0x0/0x28 returns -16
> [   27.594601][ 1]  PM: dpm_run_callback(): pci_pm_freeze+0x0/0x100 returns -16
> [   27.603420][ 1]  ehci-pci 0000:00:04.1: pci_pm_freeze+0x0/0x100 returned 0 after 3 usecs
> [   27.612233][ 1]  ehci-pci 0000:00:05.1: pci_pm_freeze+0x0/0x100 returned -16 after 17223 usecs
> [   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce async: error -16
> [   27.816988][ 1]  PM: quiesce of devices aborted after 1833.282 msecs
> [   27.823302][ 1]  PM: start quiesce of devices aborted after 1839.975 msecs
> ......
> [   31.303172][ 1]  PM: recover of devices complete after 3473.039 msecs
> [   31.309818][ 1]  PM: Failed to load hibernation image, recovering.
> [   31.348188][ 1]  PM: Basic memory bitmaps freed
> [   31.352686][ 1]  OOM killer enabled.
> [   31.356232][ 1]  Restarting tasks ... done.
> [   31.360609][ 1]  PM: resume from hibernation failed (0)
> [   31.365800][ 1]  PM: Hibernation image not present or could not be loaded.
> 
> The "do_wakeup" is determined based on whether the controller's
> power/wakeup attribute is set. The current issue necessitates considering
> the type of suspend that is occurring. If the suspend type is either
> PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should be set to
> false.
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> Signed-off-by: Duan Chenghao <duanchenghao@kylinos.cn>

What commit id does this fix?

And I missed where Alan provided a signed-off-by, where was that?

thanks,

greg k-h
duanchenghao Oct. 24, 2024, 8:46 a.m. UTC | #3
hi greg k-h,

在 2024-10-24星期四的 09:05 +0200,Greg KH写道:
> On Thu, Oct 24, 2024 at 10:40:38AM +0800, Duan Chenghao wrote:
> > When a device is inserted into the USB port and an S4 wakeup is
> > initiated,
> > after the USB-hub initialization is completed, it will
> > automatically enter
> > suspend mode. Upon detecting a device on the USB port, it will
> > proceed with
> > resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state. During
> > the S4
> > wakeup process, peripherals are put into suspend mode, followed by
> > task
> > recovery. However, upon detecting that the hcd is in the
> > HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY status,
> > causing the
> > S4 suspend to fail and subsequent task recovery to not proceed.
> > -
> > [   27.594598][ 1]  PM: pci_pm_freeze(): hcd_pci_suspend+0x0/0x28
> > returns -16
> > [   27.594601][ 1]  PM: dpm_run_callback(): pci_pm_freeze+0x0/0x100
> > returns -16
> > [   27.603420][ 1]  ehci-pci 0000:00:04.1: pci_pm_freeze+0x0/0x100
> > returned 0 after 3 usecs
> > [   27.612233][ 1]  ehci-pci 0000:00:05.1: pci_pm_freeze+0x0/0x100
> > returned -16 after 17223 usecs
> > [   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce
> > async: error -16
> > [   27.816988][ 1]  PM: quiesce of devices aborted after 1833.282
> > msecs
> > [   27.823302][ 1]  PM: start quiesce of devices aborted after
> > 1839.975 msecs
> > ......
> > [   31.303172][ 1]  PM: recover of devices complete after 3473.039
> > msecs
> > [   31.309818][ 1]  PM: Failed to load hibernation image,
> > recovering.
> > [   31.348188][ 1]  PM: Basic memory bitmaps freed
> > [   31.352686][ 1]  OOM killer enabled.
> > [   31.356232][ 1]  Restarting tasks ... done.
> > [   31.360609][ 1]  PM: resume from hibernation failed (0)
> > [   31.365800][ 1]  PM: Hibernation image not present or could not
> > be loaded.
> > 
> > The "do_wakeup" is determined based on whether the controller's
> > power/wakeup attribute is set. The current issue necessitates
> > considering
> > the type of suspend that is occurring. If the suspend type is
> > either
> > PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should be set
> > to
> > false.
> > 
> > Reported-by: kernel test robot <lkp@intel.com>
> > Closes:
> > https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
> > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > Signed-off-by: Duan Chenghao <duanchenghao@kylinos.cn>
> 
> What commit id does this fix?

The current patch is not intended to fix an issue with a specific
commit, but rather to address a long-standing problem in the USB core.

> 
> And I missed where Alan provided a signed-off-by, where was that?

In the following email, Alan proposed using "Signed-off-by" for
signing.
https://lore.kernel.org/all/489805e7-c19c-4b57-9cd7-713e075261cd@rowland.harvard.edu/

Thanks
Duan Chenghao

> 
> thanks,
> 
> greg k-h
diff mbox series

Patch

diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
index a08f3f228e6d..390b1225f3cc 100644
--- a/drivers/usb/core/hcd-pci.c
+++ b/drivers/usb/core/hcd-pci.c
@@ -422,7 +422,12 @@  static int suspend_common(struct device *dev, pm_message_t msg)
 	bool			do_wakeup;
 	int			retval;
 
-	do_wakeup = PMSG_IS_AUTO(msg) ? true : device_may_wakeup(dev);
+	if (PMSG_IS_AUTO(msg))
+		do_wakeup = true;
+	else if (PMSG_NO_WAKEUP(msg))
+		do_wakeup = false;
+	else
+		do_wakeup = device_may_wakeup(dev);
 
 	/* Root hub suspend should have stopped all downstream traffic,
 	 * and all bus master traffic.  And done so for both the interface
@@ -521,6 +526,11 @@  static int hcd_pci_suspend(struct device *dev)
 	return suspend_common(dev, PMSG_SUSPEND);
 }
 
+static int hcd_pci_freeze(struct device *dev)
+{
+	return suspend_common(dev, PMSG_FREEZE);
+}
+
 static int hcd_pci_suspend_noirq(struct device *dev)
 {
 	struct pci_dev		*pci_dev = to_pci_dev(dev);
@@ -590,6 +600,7 @@  static int hcd_pci_restore(struct device *dev)
 #else
 
 #define hcd_pci_suspend		NULL
+#define hcd_pci_freeze			NULL
 #define hcd_pci_suspend_noirq	NULL
 #define hcd_pci_poweroff_late	NULL
 #define hcd_pci_resume_noirq	NULL
@@ -624,7 +635,7 @@  const struct dev_pm_ops usb_hcd_pci_pm_ops = {
 	.suspend_noirq	= hcd_pci_suspend_noirq,
 	.resume_noirq	= hcd_pci_resume_noirq,
 	.resume		= hcd_pci_resume,
-	.freeze		= hcd_pci_suspend,
+	.freeze		= hcd_pci_freeze,
 	.freeze_noirq	= check_root_hub_suspended,
 	.thaw_noirq	= NULL,
 	.thaw		= hcd_pci_resume,
diff --git a/include/linux/pm.h b/include/linux/pm.h
index 97b0e23363c8..2a676b2cb699 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -570,7 +570,8 @@  const struct dev_pm_ops name = { \
 					{ .event = PM_EVENT_AUTO_RESUME, })
 
 #define PMSG_IS_AUTO(msg)	(((msg).event & PM_EVENT_AUTO) != 0)
-
+#define PMSG_NO_WAKEUP(msg)	(((msg).event & \
+				(PM_EVENT_FREEZE | PM_EVENT_QUIESCE)) != 0)
 /*
  * Device run-time power management status.
  *