Message ID | 21111815.riZgbS9lI9@aspire.rjw.lan (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Mon, Jun 05, 2017 at 03:08:57PM +0200, Rafael J. Wysocki wrote: > On Monday, June 05, 2017 11:05:21 AM Dominik Brodowski wrote: > > unfortunately, commit eed4d47efe95 (ACPI / sleep: Ignore spurious SCI > > wakeups from suspend-to-idle) breaks suspend-to-mem on my Dell XPS 13 > > (9343) up to v4.12-rc4: Issuing > > > > $ echo "mem" > /sys/power/state > > > > in the initramfs returns, after a while, with "write error: Resource > > busy", and the system *not* having entered the sleep state in between. > > Why initramfs? Easiest and quickest to test, with no userspace in between to interfere. During normal operations, I see suspend-to-mem fail, then a fall-back to s2idle... > > In contrast thereto, 8a537ece3d94 (PM / wakeup: Integrate mechanism to abort > > transitions in progress) still works fine, and allows to enter > > suspend-to-mem. No real difference is to be seen in dmesg, with the notable > > exception of > > > > ACPI: Low-level resume complete > > ACPI : EC: EC started > > PM: Restoring platform NVS memory > > Suspended for N.NNNN seconds > > > > only showing up on working kernels. Reverting eed4d47efe95 on top of > > v4.12-rc4 restores suspend-to-mem to work as expected. > > I'm sure it is not necessary to revert all of it. Thought so, just wanted to confirm that eed4d47efe95 is indeed the culprit. > I guess what happens is that you get a wakeup event during suspend which is > aborted as a result. > > Please apply the partial revert below and see if it makes the issue go away. Yes it does -- with this patch applied on top of v4.12-rc4, everything works as expected. Both in initramfs and during normal runtime operations. Only an (probably unrelated) issue still appears during resume, namely "cache" complaining that the non-boot CPUs should not be sleeping: [ 140.773931] Suspended for 2.244 seconds [ 140.776090] Enabling non-boot CPUs ... [ 140.780214] x86: Booting SMP configuration: [ 140.780216] smpboot: Booting Node 0 Processor 1 APIC 0x2 [ 140.781116] cache: parent cpu1 should not be sleeping [ 140.781325] microcode: sig=0x306d4, pf=0x40, revision=0x1f [ 140.782400] CPU1 is up [ 140.788172] smpboot: Booting Node 0 Processor 2 APIC 0x1 [ 140.788865] cache: parent cpu2 should not be sleeping [ 140.788987] microcode: sig=0x306d4, pf=0x40, revision=0x24 [ 140.789095] CPU2 is up [ 140.797493] smpboot: Booting Node 0 Processor 3 APIC 0x3 [ 140.798208] cache: parent cpu3 should not be sleeping [ 140.798551] CPU3 is up [ 140.802735] ACPI: Waking up from system sleep state S3 Thanks! Dominik > drivers/acpi/battery.c | 2 +- > drivers/acpi/button.c | 4 ++-- > drivers/acpi/device_pm.c | 2 +- > 3 files changed, 4 insertions(+), 4 deletions(-) > > Index: linux-pm/drivers/acpi/battery.c > =================================================================== > --- linux-pm.orig/drivers/acpi/battery.c > +++ linux-pm/drivers/acpi/battery.c > @@ -782,7 +782,7 @@ static int acpi_battery_update(struct ac > if ((battery->state & ACPI_BATTERY_STATE_CRITICAL) || > (test_bit(ACPI_BATTERY_ALARM_PRESENT, &battery->flags) && > (battery->capacity_now <= battery->alarm))) > - pm_wakeup_hard_event(&battery->device->dev); > + pm_wakeup_event(&battery->device->dev, 0); > > return result; > } > Index: linux-pm/drivers/acpi/device_pm.c > =================================================================== > --- linux-pm.orig/drivers/acpi/device_pm.c > +++ linux-pm/drivers/acpi/device_pm.c > @@ -400,7 +400,7 @@ static void acpi_pm_notify_handler(acpi_ > mutex_lock(&acpi_pm_notifier_lock); > > if (adev->wakeup.flags.notifier_present) { > - pm_wakeup_ws_event(adev->wakeup.ws, 0, true); > + __pm_wakeup_event(adev->wakeup.ws, 0); > if (adev->wakeup.context.work.func) > queue_pm_work(&adev->wakeup.context.work); > } > Index: linux-pm/drivers/acpi/button.c > =================================================================== > --- linux-pm.orig/drivers/acpi/button.c > +++ linux-pm/drivers/acpi/button.c > @@ -217,7 +217,7 @@ static int acpi_lid_notify_state(struct > } > > if (state) > - pm_wakeup_hard_event(&device->dev); > + pm_wakeup_event(&device->dev, 0); > > ret = blocking_notifier_call_chain(&acpi_lid_notifier, state, device); > if (ret == NOTIFY_DONE) > @@ -402,7 +402,7 @@ static void acpi_button_notify(struct ac > } else { > int keycode; > > - pm_wakeup_hard_event(&device->dev); > + pm_wakeup_event(&device->dev, 0); > if (button->suspended) > break; >
On Mon, Jun 5, 2017 at 5:09 PM, Dominik Brodowski <linux@dominikbrodowski.net> wrote: > On Mon, Jun 05, 2017 at 03:08:57PM +0200, Rafael J. Wysocki wrote: >> On Monday, June 05, 2017 11:05:21 AM Dominik Brodowski wrote: >> > unfortunately, commit eed4d47efe95 (ACPI / sleep: Ignore spurious SCI >> > wakeups from suspend-to-idle) breaks suspend-to-mem on my Dell XPS 13 >> > (9343) up to v4.12-rc4: Issuing >> > >> > $ echo "mem" > /sys/power/state >> > >> > in the initramfs returns, after a while, with "write error: Resource >> > busy", and the system *not* having entered the sleep state in between. >> >> Why initramfs? > > Easiest and quickest to test, with no userspace in between to interfere. > During normal operations, I see suspend-to-mem fail, then a fall-back to > s2idle... I see, OK. >> > In contrast thereto, 8a537ece3d94 (PM / wakeup: Integrate mechanism to abort >> > transitions in progress) still works fine, and allows to enter >> > suspend-to-mem. No real difference is to be seen in dmesg, with the notable >> > exception of >> > >> > ACPI: Low-level resume complete >> > ACPI : EC: EC started >> > PM: Restoring platform NVS memory >> > Suspended for N.NNNN seconds >> > >> > only showing up on working kernels. Reverting eed4d47efe95 on top of >> > v4.12-rc4 restores suspend-to-mem to work as expected. >> >> I'm sure it is not necessary to revert all of it. > > Thought so, just wanted to confirm that eed4d47efe95 is indeed the culprit. > >> I guess what happens is that you get a wakeup event during suspend which is >> aborted as a result. >> >> Please apply the partial revert below and see if it makes the issue go away. > > Yes it does -- with this patch applied on top of v4.12-rc4, everything works > as expected. Both in initramfs and during normal runtime operations. Only an > (probably unrelated) issue still appears during resume, namely "cache" > complaining that the non-boot CPUs should not be sleeping: That's unrelated. OK, this still leaves us with three possibilities. Can you please try to apply the device_pm.c part of the partial revert alone and see if that helps (and if not, same for the button.c and battery.c changes)? Thanks, Rafael
On Mon, Jun 05, 2017 at 10:32:15PM +0200, Rafael J. Wysocki wrote: > On Mon, Jun 5, 2017 at 5:09 PM, Dominik Brodowski > <linux@dominikbrodowski.net> wrote: > > On Mon, Jun 05, 2017 at 03:08:57PM +0200, Rafael J. Wysocki wrote: > >> On Monday, June 05, 2017 11:05:21 AM Dominik Brodowski wrote: > >> > unfortunately, commit eed4d47efe95 (ACPI / sleep: Ignore spurious SCI > >> > wakeups from suspend-to-idle) breaks suspend-to-mem on my Dell XPS 13 > >> > (9343) up to v4.12-rc4: Issuing > >> > > >> > $ echo "mem" > /sys/power/state > >> > > >> > in the initramfs returns, after a while, with "write error: Resource > >> > busy", and the system *not* having entered the sleep state in between. > >> > >> Why initramfs? > > > > Easiest and quickest to test, with no userspace in between to interfere. > > During normal operations, I see suspend-to-mem fail, then a fall-back to > > s2idle... > > I see, OK. > > >> > In contrast thereto, 8a537ece3d94 (PM / wakeup: Integrate mechanism to abort > >> > transitions in progress) still works fine, and allows to enter > >> > suspend-to-mem. No real difference is to be seen in dmesg, with the notable > >> > exception of > >> > > >> > ACPI: Low-level resume complete > >> > ACPI : EC: EC started > >> > PM: Restoring platform NVS memory > >> > Suspended for N.NNNN seconds > >> > > >> > only showing up on working kernels. Reverting eed4d47efe95 on top of > >> > v4.12-rc4 restores suspend-to-mem to work as expected. > >> > >> I'm sure it is not necessary to revert all of it. > > > > Thought so, just wanted to confirm that eed4d47efe95 is indeed the culprit. > > > >> I guess what happens is that you get a wakeup event during suspend which is > >> aborted as a result. > >> > >> Please apply the partial revert below and see if it makes the issue go away. > > > > Yes it does -- with this patch applied on top of v4.12-rc4, everything works > > as expected. Both in initramfs and during normal runtime operations. Only an > > (probably unrelated) issue still appears during resume, namely "cache" > > complaining that the non-boot CPUs should not be sleeping: > > That's unrelated. > > OK, this still leaves us with three possibilities. > > Can you please try to apply the device_pm.c part of the partial revert > alone and see if that helps (and if not, same for the button.c and > battery.c changes)? Indeed, the device_pm.c part of the partial revert helps on its own. For completeness, applying only the chunks for button.c or battery.c does not help; whether the following stack trace (with battery.c applied) is of any help, seems doubtful, but here it is: [ 11.312194] PM: Suspending system (mem) [ 11.312301] Suspending console(s) (use no_console_suspend to debug) [ 15.879041] psmouse serio1: Failed to deactivate mouse on isa0060/serio1 [ 16.351046] psmouse serio1: Failed to enable mouse on isa0060/serio1 [ 20.695329] input: PS/2 Generic Mouse as /devices/platform/i8042/serio1/input/input7 [ 20.903110] psmouse serio1: Failed to enable mouse on isa0060/serio1 [ 20.923324] sd 3:0:0:0: [sda] Synchronizing SCSI cache [ 20.925306] sd 3:0:0:0: [sda] Stopping disk [ 21.126993] psmouse serio1: Failed to disable mouse on isa0060/serio1 [ 21.572782] ACPI : EC: event blocked [ 22.745084] PM: suspend of devices complete after 1821.805 msecs [ 22.745085] PM: suspend devices took 11.433 seconds [ 22.745086] Component: suspend devices, time: 11433 [ 22.745092] ------------[ cut here ]------------ [ 22.745096] WARNING: CPU: 1 PID: 515 at /home/brodo/local/kernel/git/linux/kernel/power/suspend_test.c:55 suspend_test_finish+0x76/0x80 [ 22.745096] Modules linked in: [ 22.745099] CPU: 1 PID: 515 Comm: sh Not tainted 4.12.0-rc4+ #5 [ 22.745100] Hardware name: Dell Inc. XPS 13 9343/0TM99H, BIOS A11 12/08/2016 <snip> [ 22.745111] Call Trace: [ 22.745114] suspend_devices_and_enter+0xdb/0x860 [ 22.745116] ? console_unlock.part.8+0x261/0x4a0 [ 22.745118] pm_suspend+0x3c3/0x590 [ 22.745119] state_store+0x85/0x100 [ 22.745122] kobj_attr_store+0xf/0x20 [ 22.745124] sysfs_kf_write+0x37/0x40 [ 22.745125] kernfs_fop_write+0x11c/0x1a0 [ 22.745128] __vfs_write+0x37/0x140 [ 22.745130] ? preempt_count_sub+0x9b/0x100 [ 22.745132] vfs_write+0xb1/0x170 [ 22.745134] SyS_write+0x55/0xc0 [ 22.745137] entry_SYSCALL_64_fastpath+0x18/0xa8 <snip> [ 22.745173] ---[ end trace beda9cb50fa83d31 ]--- [ 22.758152] PM: late suspend of devices complete after 12.961 msecs A second suspend then does not take as long, so I presume that there is much noise in the messages above: [ 36.538507] PM: Preparing system for sleep (mem) [ 36.538846] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 36.540248] OOM killer disabled. [ 36.540250] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 36.541671] PM: Suspending system (mem) [ 36.541759] Suspending console(s) (use no_console_suspend to debug) [ 36.561114] sd 3:0:0:0: [sda] Synchronizing SCSI cache [ 36.563755] sd 3:0:0:0: [sda] Stopping disk [ 36.593838] ACPI : EC: event blocked [ 37.785272] PM: suspend of devices complete after 1224.215 msecs [ 37.785274] PM: suspend devices took 1.244 seconds [ 37.798224] PM: late suspend of devices complete after 12.946 msecs > Thanks, Thank you! Dominik
On Mon, Jun 5, 2017 at 11:14 PM, Dominik Brodowski <linux@dominikbrodowski.net> wrote: > On Mon, Jun 05, 2017 at 10:32:15PM +0200, Rafael J. Wysocki wrote: >> On Mon, Jun 5, 2017 at 5:09 PM, Dominik Brodowski >> <linux@dominikbrodowski.net> wrote: >> > On Mon, Jun 05, 2017 at 03:08:57PM +0200, Rafael J. Wysocki wrote: >> >> On Monday, June 05, 2017 11:05:21 AM Dominik Brodowski wrote: >> >> > unfortunately, commit eed4d47efe95 (ACPI / sleep: Ignore spurious SCI >> >> > wakeups from suspend-to-idle) breaks suspend-to-mem on my Dell XPS 13 >> >> > (9343) up to v4.12-rc4: Issuing >> >> > >> >> > $ echo "mem" > /sys/power/state >> >> > >> >> > in the initramfs returns, after a while, with "write error: Resource >> >> > busy", and the system *not* having entered the sleep state in between. >> >> >> >> Why initramfs? >> > >> > Easiest and quickest to test, with no userspace in between to interfere. >> > During normal operations, I see suspend-to-mem fail, then a fall-back to >> > s2idle... >> >> I see, OK. >> >> >> > In contrast thereto, 8a537ece3d94 (PM / wakeup: Integrate mechanism to abort >> >> > transitions in progress) still works fine, and allows to enter >> >> > suspend-to-mem. No real difference is to be seen in dmesg, with the notable >> >> > exception of >> >> > >> >> > ACPI: Low-level resume complete >> >> > ACPI : EC: EC started >> >> > PM: Restoring platform NVS memory >> >> > Suspended for N.NNNN seconds >> >> > >> >> > only showing up on working kernels. Reverting eed4d47efe95 on top of >> >> > v4.12-rc4 restores suspend-to-mem to work as expected. >> >> >> >> I'm sure it is not necessary to revert all of it. >> > >> > Thought so, just wanted to confirm that eed4d47efe95 is indeed the culprit. >> > >> >> I guess what happens is that you get a wakeup event during suspend which is >> >> aborted as a result. >> >> >> >> Please apply the partial revert below and see if it makes the issue go away. >> > >> > Yes it does -- with this patch applied on top of v4.12-rc4, everything works >> > as expected. Both in initramfs and during normal runtime operations. Only an >> > (probably unrelated) issue still appears during resume, namely "cache" >> > complaining that the non-boot CPUs should not be sleeping: >> >> That's unrelated. >> >> OK, this still leaves us with three possibilities. >> >> Can you please try to apply the device_pm.c part of the partial revert >> alone and see if that helps (and if not, same for the button.c and >> battery.c changes)? > > Indeed, the device_pm.c part of the partial revert helps on its own. > For completeness, applying only the chunks for button.c or battery.c > does not help; That's kind of expected. :-) Anyway, I decided to revert commit eed4d47efe95 entirely as it turns out to be premature and causes too many problems to happen all over. Thanks, Rafael
Index: linux-pm/drivers/acpi/battery.c =================================================================== --- linux-pm.orig/drivers/acpi/battery.c +++ linux-pm/drivers/acpi/battery.c @@ -782,7 +782,7 @@ static int acpi_battery_update(struct ac if ((battery->state & ACPI_BATTERY_STATE_CRITICAL) || (test_bit(ACPI_BATTERY_ALARM_PRESENT, &battery->flags) && (battery->capacity_now <= battery->alarm))) - pm_wakeup_hard_event(&battery->device->dev); + pm_wakeup_event(&battery->device->dev, 0); return result; } Index: linux-pm/drivers/acpi/device_pm.c =================================================================== --- linux-pm.orig/drivers/acpi/device_pm.c +++ linux-pm/drivers/acpi/device_pm.c @@ -400,7 +400,7 @@ static void acpi_pm_notify_handler(acpi_ mutex_lock(&acpi_pm_notifier_lock); if (adev->wakeup.flags.notifier_present) { - pm_wakeup_ws_event(adev->wakeup.ws, 0, true); + __pm_wakeup_event(adev->wakeup.ws, 0); if (adev->wakeup.context.work.func) queue_pm_work(&adev->wakeup.context.work); } Index: linux-pm/drivers/acpi/button.c =================================================================== --- linux-pm.orig/drivers/acpi/button.c +++ linux-pm/drivers/acpi/button.c @@ -217,7 +217,7 @@ static int acpi_lid_notify_state(struct } if (state) - pm_wakeup_hard_event(&device->dev); + pm_wakeup_event(&device->dev, 0); ret = blocking_notifier_call_chain(&acpi_lid_notifier, state, device); if (ret == NOTIFY_DONE) @@ -402,7 +402,7 @@ static void acpi_button_notify(struct ac } else { int keycode; - pm_wakeup_hard_event(&device->dev); + pm_wakeup_event(&device->dev, 0); if (button->suspended) break;