Message ID | 20211001161854.19802-1-mario.limonciello@amd.com (mailing list archive) |
---|---|
State | RFC, archived |
Headers | show |
Series | [RFC] acpi/x86: s2idle: Use constraints to enforce behavior | expand |
On Fri, Oct 1, 2021 at 6:19 PM Mario Limonciello <mario.limonciello@amd.com> wrote: > > Currently s2idle constraints are only verified when pm_debug_messages > is set. Very much intentionally. > Although useful for debugging, it does have a tendency to paper > over some underlying issues. What do you mean "paper over"? It is not expected to do anything except for providing information. > Of particular note, I found a case that a system that has two physical > SATA controllers that share the same ACPI Power Resource. When a drive is > connected to one of the controllers then it will bind with PCI devices with > the ahci driver and form a relationship with the firmware node and physical > node. During s2idle I see that the constraints are met for this device as > it's transitioned into the appropriate state. However the second ACPI node > doesn't have any relationship with a physical node and so it stays in "D0": So the debug information is actually useful. > > ``` > ACPI: \_SB_.PCI0.GP18.SATA: ACPI: PM: Power state change: D0 -> D3cold > ACPI: PM: Power resource [P0SA] still in use > acpi device:2a: Power state changed to D3cold > ``` > > Due to the refcounting used on the shared power resource putting the > device with a physical node into D3 doesn't result in the _OFF method > being called. > > To help this problem, make the following changes to the constraint > checking: > 1) Make constraint checking compulsory That would break things AFAICS due to false-positives resulting from it. The rule of thumb is to check the constraints only if you don't get the expected state. > but gate the output on pm_debug_messages > 2) As part of constraint checking verify if the ACPI device has a physical > node or not. > 3) If a device doesn't have a physical node, but does have a requirement > set the power state for the device to allow shared resources to transition. This fixes your particular issue, but is it generally safe? Also, I think that this needs to be taken care of during system initialization rather than here. Device objects without physical nodes that prevent power resources from being turned off are generally not useful at all and they prevent the SoC from getting into lower-power states through active-state power management as well. > After making this change, here is what the flow looks like: > ``` > ACPI: \_SB_.PCI0.GP18.SATA: ACPI: PM: Power state change: D0 -> D3cold > ACPI: PM: Power resource [P0SA] still in use > acpi device:2a: Power state changed to Dcold > <snip> > ACPI: \_SB_.PCI0.GP18.SAT1: LPI: Device is not physically present - forcing transition from D0 to D3cold > ACPI: \_SB_.PCI0.GP18.SAT1: ACPI: PM: Power state change: D0 -> D3cold > ACPI: PM: Power resource [P0SA] turned off > acpi device:2c: Power state changed to D3cold > ``` > > BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214091 > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > --- > drivers/acpi/x86/s2idle.c | 22 ++++++++++++++++------ > 1 file changed, 16 insertions(+), 6 deletions(-) > > diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c > index bd92b549fd5a..fbfac40733eb 100644 > --- a/drivers/acpi/x86/s2idle.c > +++ b/drivers/acpi/x86/s2idle.c > @@ -304,14 +304,25 @@ static void lpi_check_constraints(void) > acpi_power_state_string(adev->power.state)); > > if (!adev->flags.power_manageable) { > - acpi_handle_info(handle, "LPI: Device not power manageable\n"); > + if (pm_debug_messages_on) > + acpi_handle_info(handle, "LPI: Device not power manageable\n"); > lpi_constraints_table[i].handle = NULL; > continue; > } > > - if (adev->power.state < lpi_constraints_table[i].min_dstate) > - acpi_handle_info(handle, > - "LPI: Constraint not met; min power state:%s current power state:%s\n", > + if (adev->power.state >= lpi_constraints_table[i].min_dstate) > + continue; > + /* make sure that anything with shared resources isn't blocked */ > + if (!acpi_get_first_physical_node(adev)) { > + if (pm_debug_messages_on) > + acpi_handle_info(handle, "LPI: Device is not physically present - forcing transition from %s to %s\n", > + acpi_power_state_string(adev->power.state), > + acpi_power_state_string(ACPI_STATE_D3_COLD)); > + acpi_device_set_power(adev, ACPI_STATE_D3_COLD); > + continue; > + } > + if (pm_debug_messages_on) > + acpi_handle_info(handle, "LPI: Constraint not met; min power state:%s current power state:%s\n", > acpi_power_state_string(lpi_constraints_table[i].min_dstate), > acpi_power_state_string(adev->power.state)); > } > @@ -446,8 +457,7 @@ int acpi_s2idle_prepare_late(void) > if (!lps0_device_handle || sleep_no_lps0) > return 0; > > - if (pm_debug_messages_on) > - lpi_check_constraints(); > + lpi_check_constraints(); > > /* Screen off */ > if (lps0_dsm_func_mask > 0) > -- > 2.25.1 >
On 10/1/2021 11:41, Rafael J. Wysocki wrote: > On Fri, Oct 1, 2021 at 6:19 PM Mario Limonciello > <mario.limonciello@amd.com> wrote: >> >> Currently s2idle constraints are only verified when pm_debug_messages >> is set. > > Very much intentionally. > >> Although useful for debugging, it does have a tendency to paper >> over some underlying issues. > > What do you mean "paper over"? It is not expected to do anything > except for providing information. There is a difference here in Windows and Linux for how this works - In Windows the failure of constraints checking actually gates the final LPS0 notification and any actions that have arisen from that (ACPI_LPS0_ENTRY, ACPI_LPS0_MS_ENTRY, ACPI_LPS0_ENTRY_AMD). > >> Of particular note, I found a case that a system that has two physical >> SATA controllers that share the same ACPI Power Resource. When a drive is >> connected to one of the controllers then it will bind with PCI devices with >> the ahci driver and form a relationship with the firmware node and physical >> node. During s2idle I see that the constraints are met for this device as >> it's transitioned into the appropriate state. However the second ACPI node >> doesn't have any relationship with a physical node and so it stays in "D0": > > So the debug information is actually useful. Very much so. > >> >> ``` >> ACPI: \_SB_.PCI0.GP18.SATA: ACPI: PM: Power state change: D0 -> D3cold >> ACPI: PM: Power resource [P0SA] still in use >> acpi device:2a: Power state changed to D3cold >> ``` >> >> Due to the refcounting used on the shared power resource putting the >> device with a physical node into D3 doesn't result in the _OFF method >> being called. >> >> To help this problem, make the following changes to the constraint >> checking: >> 1) Make constraint checking compulsory > > That would break things AFAICS due to false-positives resulting from it. For the RFC patch I kept things the same functionally - only emit messages if pm_debug_messages is set. I would like know your thoughts on more closely aligning to the Windows behavior though. > > The rule of thumb is to check the constraints only if you don't get > the expected state > >> but gate the output on pm_debug_messages >> 2) As part of constraint checking verify if the ACPI device has a physical >> node or not. >> 3) If a device doesn't have a physical node, but does have a requirement >> set the power state for the device to allow shared resources to transition. > > This fixes your particular issue, but is it generally safe? It doesn't actually fix my issue, but I found it as part of investigating the issue. I don't know if it's generally safe or not - that's part of why the patch is RFC :) > > Also, I think that this needs to be taken care of during system > initialization rather than here. Device objects without physical > nodes that prevent power resources from being turned off are generally > not useful at all and they prevent the SoC from getting into > lower-power states through active-state power management as well. OK, I'll try to find a better time during initialization to do this. > >> After making this change, here is what the flow looks like: >> ``` >> ACPI: \_SB_.PCI0.GP18.SATA: ACPI: PM: Power state change: D0 -> D3cold >> ACPI: PM: Power resource [P0SA] still in use >> acpi device:2a: Power state changed to Dcold >> <snip> >> ACPI: \_SB_.PCI0.GP18.SAT1: LPI: Device is not physically present - forcing transition from D0 to D3cold >> ACPI: \_SB_.PCI0.GP18.SAT1: ACPI: PM: Power state change: D0 -> D3cold >> ACPI: PM: Power resource [P0SA] turned off >> acpi device:2c: Power state changed to D3cold >> ``` >> >> BugLink: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D214091&data=04%7C01%7Cmario.limonciello%40amd.com%7C763dfde3eb0a4f9fcb2408d984fa6102%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637687033171350513%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=DayEDSMdysYB%2FdpHpzjZMVsSYr9IZ1U55YEjUpw%2Fj88%3D&reserved=0 >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >> --- >> drivers/acpi/x86/s2idle.c | 22 ++++++++++++++++------ >> 1 file changed, 16 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c >> index bd92b549fd5a..fbfac40733eb 100644 >> --- a/drivers/acpi/x86/s2idle.c >> +++ b/drivers/acpi/x86/s2idle.c >> @@ -304,14 +304,25 @@ static void lpi_check_constraints(void) >> acpi_power_state_string(adev->power.state)); >> >> if (!adev->flags.power_manageable) { >> - acpi_handle_info(handle, "LPI: Device not power manageable\n"); >> + if (pm_debug_messages_on) >> + acpi_handle_info(handle, "LPI: Device not power manageable\n"); >> lpi_constraints_table[i].handle = NULL; >> continue; >> } >> >> - if (adev->power.state < lpi_constraints_table[i].min_dstate) >> - acpi_handle_info(handle, >> - "LPI: Constraint not met; min power state:%s current power state:%s\n", >> + if (adev->power.state >= lpi_constraints_table[i].min_dstate) >> + continue; >> + /* make sure that anything with shared resources isn't blocked */ >> + if (!acpi_get_first_physical_node(adev)) { >> + if (pm_debug_messages_on) >> + acpi_handle_info(handle, "LPI: Device is not physically present - forcing transition from %s to %s\n", >> + acpi_power_state_string(adev->power.state), >> + acpi_power_state_string(ACPI_STATE_D3_COLD)); >> + acpi_device_set_power(adev, ACPI_STATE_D3_COLD); >> + continue; >> + } >> + if (pm_debug_messages_on) >> + acpi_handle_info(handle, "LPI: Constraint not met; min power state:%s current power state:%s\n", >> acpi_power_state_string(lpi_constraints_table[i].min_dstate), >> acpi_power_state_string(adev->power.state)); >> } >> @@ -446,8 +457,7 @@ int acpi_s2idle_prepare_late(void) >> if (!lps0_device_handle || sleep_no_lps0) >> return 0; >> >> - if (pm_debug_messages_on) >> - lpi_check_constraints(); >> + lpi_check_constraints(); >> >> /* Screen off */ >> if (lps0_dsm_func_mask > 0) >> -- >> 2.25.1 >>
diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c index bd92b549fd5a..fbfac40733eb 100644 --- a/drivers/acpi/x86/s2idle.c +++ b/drivers/acpi/x86/s2idle.c @@ -304,14 +304,25 @@ static void lpi_check_constraints(void) acpi_power_state_string(adev->power.state)); if (!adev->flags.power_manageable) { - acpi_handle_info(handle, "LPI: Device not power manageable\n"); + if (pm_debug_messages_on) + acpi_handle_info(handle, "LPI: Device not power manageable\n"); lpi_constraints_table[i].handle = NULL; continue; } - if (adev->power.state < lpi_constraints_table[i].min_dstate) - acpi_handle_info(handle, - "LPI: Constraint not met; min power state:%s current power state:%s\n", + if (adev->power.state >= lpi_constraints_table[i].min_dstate) + continue; + /* make sure that anything with shared resources isn't blocked */ + if (!acpi_get_first_physical_node(adev)) { + if (pm_debug_messages_on) + acpi_handle_info(handle, "LPI: Device is not physically present - forcing transition from %s to %s\n", + acpi_power_state_string(adev->power.state), + acpi_power_state_string(ACPI_STATE_D3_COLD)); + acpi_device_set_power(adev, ACPI_STATE_D3_COLD); + continue; + } + if (pm_debug_messages_on) + acpi_handle_info(handle, "LPI: Constraint not met; min power state:%s current power state:%s\n", acpi_power_state_string(lpi_constraints_table[i].min_dstate), acpi_power_state_string(adev->power.state)); } @@ -446,8 +457,7 @@ int acpi_s2idle_prepare_late(void) if (!lps0_device_handle || sleep_no_lps0) return 0; - if (pm_debug_messages_on) - lpi_check_constraints(); + lpi_check_constraints(); /* Screen off */ if (lps0_dsm_func_mask > 0)
Currently s2idle constraints are only verified when pm_debug_messages is set. Although useful for debugging, it does have a tendency to paper over some underlying issues. Of particular note, I found a case that a system that has two physical SATA controllers that share the same ACPI Power Resource. When a drive is connected to one of the controllers then it will bind with PCI devices with the ahci driver and form a relationship with the firmware node and physical node. During s2idle I see that the constraints are met for this device as it's transitioned into the appropriate state. However the second ACPI node doesn't have any relationship with a physical node and so it stays in "D0": ``` ACPI: \_SB_.PCI0.GP18.SATA: ACPI: PM: Power state change: D0 -> D3cold ACPI: PM: Power resource [P0SA] still in use acpi device:2a: Power state changed to D3cold ``` Due to the refcounting used on the shared power resource putting the device with a physical node into D3 doesn't result in the _OFF method being called. To help this problem, make the following changes to the constraint checking: 1) Make constraint checking compulsory but gate the output on pm_debug_messages 2) As part of constraint checking verify if the ACPI device has a physical node or not. 3) If a device doesn't have a physical node, but does have a requirement set the power state for the device to allow shared resources to transition. After making this change, here is what the flow looks like: ``` ACPI: \_SB_.PCI0.GP18.SATA: ACPI: PM: Power state change: D0 -> D3cold ACPI: PM: Power resource [P0SA] still in use acpi device:2a: Power state changed to Dcold <snip> ACPI: \_SB_.PCI0.GP18.SAT1: LPI: Device is not physically present - forcing transition from D0 to D3cold ACPI: \_SB_.PCI0.GP18.SAT1: ACPI: PM: Power state change: D0 -> D3cold ACPI: PM: Power resource [P0SA] turned off acpi device:2c: Power state changed to D3cold ``` BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214091 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> --- drivers/acpi/x86/s2idle.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-)