Message ID | 20230802201013.910-3-mario.limonciello@amd.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Fix wakeup problems on some AMD platforms | expand |
Hi, On Wed, Aug 02, 2023 at 03:10:13PM -0500, Mario Limonciello wrote: > @@ -3036,11 +3044,8 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) > if (dmi_check_system(bridge_d3_blacklist)) > return false; > > - /* > - * It should be safe to put PCIe ports from 2015 or newer > - * to D3. > - */ > - if (dmi_get_bios_year() >= 2015) > + /* the platform indicates in a device constraint that D3 is needed */ > + if (platform_constraint_d3(bridge)) This for sure causes some sort of power regression on the Intel platforms made after 2015. Why not check for the constraint and: - If present and enabled, use the desired D-state - If present and disabled, leave the device in D0 - If not present use the existing cutoff date ?
Hi Mario, kernel test robot noticed the following build errors: [auto build test ERROR on rafael-pm/linux-next] [also build test ERROR on pci/next pci/for-linus westeri-thunderbolt/next linus/master v6.5-rc4 next-20230802] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Mario-Limonciello/ACPI-Add-comments-to-clarify-some-ifdef-statements/20230803-041214 base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next patch link: https://lore.kernel.org/r/20230802201013.910-3-mario.limonciello%40amd.com patch subject: [PATCH v8 2/2] PCI/ACPI: Use device constraints instead of dates to opt devices into D3 config: i386-randconfig-i005-20230731 (https://download.01.org/0day-ci/archive/20230803/202308031243.Mmwm1IHI-lkp@intel.com/config) compiler: gcc-12 (Debian 12.2.0-14) 12.2.0 reproduce: (https://download.01.org/0day-ci/archive/20230803/202308031243.Mmwm1IHI-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202308031243.Mmwm1IHI-lkp@intel.com/ All errors (new ones prefixed by >>): ld: drivers/pci/pci-acpi.o: in function `acpi_pci_device_constraint_d3': >> drivers/pci/pci-acpi.c:1055: undefined reference to `acpi_get_lps0_constraint' vim +1055 drivers/pci/pci-acpi.c 1045 1046 /* 1047 * acpi_pci_device_constraint_d3 - determine if device constraints require D3 1048 * @dev: PCI device to check 1049 * 1050 * Returns true if the PEP constraints for the device is enabled and 1051 * requires D3. 1052 */ 1053 bool acpi_pci_device_constraint_d3(struct pci_dev *dev) 1054 { > 1055 int constraint = acpi_get_lps0_constraint(&dev->dev); 1056 1057 if (constraint < 0) { 1058 pci_dbg(dev, "ACPI device constraint not present\n"); 1059 return false; 1060 } 1061 1062 return constraint >= 3; 1063 } 1064
Hi Mario, kernel test robot noticed the following build errors: [auto build test ERROR on rafael-pm/linux-next] [also build test ERROR on pci/next pci/for-linus westeri-thunderbolt/next linus/master v6.5-rc4 next-20230803] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Mario-Limonciello/ACPI-Add-comments-to-clarify-some-ifdef-statements/20230803-041214 base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next patch link: https://lore.kernel.org/r/20230802201013.910-3-mario.limonciello%40amd.com patch subject: [PATCH v8 2/2] PCI/ACPI: Use device constraints instead of dates to opt devices into D3 config: x86_64-randconfig-x003-20230731 (https://download.01.org/0day-ci/archive/20230803/202308031702.maQwVjYc-lkp@intel.com/config) compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07) reproduce: (https://download.01.org/0day-ci/archive/20230803/202308031702.maQwVjYc-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202308031702.maQwVjYc-lkp@intel.com/ All errors (new ones prefixed by >>): >> ld.lld: error: undefined symbol: acpi_get_lps0_constraint >>> referenced by pci-acpi.c:1055 (drivers/pci/pci-acpi.c:1055) >>> vmlinux.o:(pci_bridge_d3_possible) >>> referenced by pci-acpi.c:1055 (drivers/pci/pci-acpi.c:1055) >>> vmlinux.o:(acpi_pci_device_constraint_d3)
On 8/3/23 00:01, Mika Westerberg wrote: > Hi, > > On Wed, Aug 02, 2023 at 03:10:13PM -0500, Mario Limonciello wrote: >> @@ -3036,11 +3044,8 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) >> if (dmi_check_system(bridge_d3_blacklist)) >> return false; >> >> - /* >> - * It should be safe to put PCIe ports from 2015 or newer >> - * to D3. >> - */ >> - if (dmi_get_bios_year() >= 2015) >> + /* the platform indicates in a device constraint that D3 is needed */ >> + if (platform_constraint_d3(bridge)) > > This for sure causes some sort of power regression on the Intel > platforms made after 2015. Why not check for the constraint and: > Are you sure? I saw it as an explanation of how Windows could put the systems into D3 when there is no other PM related ACPI objects. > - If present and enabled, use the desired D-state > - If present and disabled, leave the device in D0 > - If not present use the existing cutoff date > > ? Thanks! That sounds very reasonable to me. I'll double check it in my case.
On Wed, Aug 02, 2023 at 03:10:13PM -0500, Mario Limonciello wrote: > Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") > PCIe ports from modern machines (>=2015) are allowed to be put into D3 by > storing a value to the `bridge_d3` variable in the `struct pci_dev` > structure. > > pci_power_manageable() uses this variable to indicate a PCIe port can > enter D3. > pci_pm_suspend_noirq() uses the return from pci_power_manageable() to > decide whether to try to put a device into its target state for a sleep > cycle via pci_prepare_to_sleep(). > > For devices that support D3, the target state is selected by this policy: > 1. If platform_pci_power_manageable(): > Use platform_pci_choose_state() > 2. If the device is armed for wakeup: > Select the deepest D-state that supports a PME. > 3. Else: > Use D3hot. > > Devices are considered power manageable by the platform when they have > one or more objects described in the table in section 7.3 of the ACPI 6.5 > specification. > > When devices are not considered power manageable; specs are ambiguous as > to what should happen. In this situation Windows 11 seems to leave PCIe > root ports in D0 while Linux puts them into D3 due to the above mentioned > commit. > > In Windows systems that support Modern Standby specify hardware > pre-conditions for the SoC to achieve the lowest power state by device > constraints in a SOC specific "Power Engine Plugin" (PEP) [2] [3]. > They can be marked as disabled or enabled and when enabled can specify > the minimum power state required for an ACPI device. > > Instead of using a time based heuristic to decide if a port should go > into D3 use device constraints to decide. > * If the constraint is not present or disabled then choose D0. > * If the constraint is enabled, then enable D3 if the constraint is set > to 3 or greater. ... > +/* > + * acpi_get_lps0_constraint - get any LPS0 constraint for an acpi device > + * @handle: ACPI handle of the device > + * > + * If a constraint has been specified in the _DSM method for the device, > + * return it. Otherwise, return -ENODEV. > + */ > +int acpi_get_lps0_constraint(struct device *dev) > +{ > + acpi_handle handle = ACPI_HANDLE(dev); (see below) > + int i; > + > + if (!handle) > + return -ENODEV; > + > + for (i = 0; i < lpi_constraints_table_size; ++i) { > + if (lpi_constraints_table[i].handle != handle) Maybe device_match_acpi_handle() ? > + continue; > + return lpi_constraints_table[i].min_dstate; > + } > + > + return -ENODEV; > +} ... > +/* Is it deliberately non-kernel-doc while mimicking it? > + * acpi_pci_device_constraint_d3 - determine if device constraints require D3 > + * @dev: PCI device to check > + * > + * Returns true if the PEP constraints for the device is enabled and > + * requires D3. > + */ > +bool acpi_pci_device_constraint_d3(struct pci_dev *dev) > +{ > + int constraint = acpi_get_lps0_constraint(&dev->dev); > + > + if (constraint < 0) { I slightly prefer int constraint; constraint = acpi_get_lps0_constraint(&dev->dev); if (constraint < 0) { > + pci_dbg(dev, "ACPI device constraint not present\n"); > + return false; > + } > + > + return constraint >= 3; > +}
On Thu, Aug 03, 2023 at 06:38:45AM -0500, Mario Limonciello wrote: > On 8/3/23 00:01, Mika Westerberg wrote: > > Hi, > > > > On Wed, Aug 02, 2023 at 03:10:13PM -0500, Mario Limonciello wrote: > > > @@ -3036,11 +3044,8 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) > > > if (dmi_check_system(bridge_d3_blacklist)) > > > return false; > > > - /* > > > - * It should be safe to put PCIe ports from 2015 or newer > > > - * to D3. > > > - */ > > > - if (dmi_get_bios_year() >= 2015) > > > + /* the platform indicates in a device constraint that D3 is needed */ > > > + if (platform_constraint_d3(bridge)) > > > > This for sure causes some sort of power regression on the Intel > > platforms made after 2015. Why not check for the constraint and: > > > Are you sure? I saw it as an explanation of how Windows could put the > systems into D3 when there is no other PM related ACPI objects. I'm concerned if there are no PEP constraints on some of the affected systems this now leaves root ports into D0 then, no?
On 8/3/2023 10:14, Mika Westerberg wrote: > On Thu, Aug 03, 2023 at 06:38:45AM -0500, Mario Limonciello wrote: >> On 8/3/23 00:01, Mika Westerberg wrote: >>> Hi, >>> >>> On Wed, Aug 02, 2023 at 03:10:13PM -0500, Mario Limonciello wrote: >>>> @@ -3036,11 +3044,8 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) >>>> if (dmi_check_system(bridge_d3_blacklist)) >>>> return false; >>>> - /* >>>> - * It should be safe to put PCIe ports from 2015 or newer >>>> - * to D3. >>>> - */ >>>> - if (dmi_get_bios_year() >= 2015) >>>> + /* the platform indicates in a device constraint that D3 is needed */ >>>> + if (platform_constraint_d3(bridge)) >>> >>> This for sure causes some sort of power regression on the Intel >>> platforms made after 2015. Why not check for the constraint and: >>> >> Are you sure? I saw it as an explanation of how Windows could put the >> systems into D3 when there is no other PM related ACPI objects. > > I'm concerned if there are no PEP constraints on some of the affected > systems this now leaves root ports into D0 then, no? Do you have any idea if any of the affected systems were something that didn't ship with Windows? Like an Apple system or a Chromebook? If so; I'd think it's better to treat those as "quirks" rather than make a blanket policy from the timing.
On 8/3/23 06:38, Mario Limonciello wrote: > On 8/3/23 00:01, Mika Westerberg wrote: >> Hi, >> >> On Wed, Aug 02, 2023 at 03:10:13PM -0500, Mario Limonciello wrote: >>> @@ -3036,11 +3044,8 @@ bool pci_bridge_d3_possible(struct pci_dev >>> *bridge) >>> if (dmi_check_system(bridge_d3_blacklist)) >>> return false; >>> - /* >>> - * It should be safe to put PCIe ports from 2015 or newer >>> - * to D3. >>> - */ >>> - if (dmi_get_bios_year() >= 2015) >>> + /* the platform indicates in a device constraint that D3 is >>> needed */ >>> + if (platform_constraint_d3(bridge)) >> >> This for sure causes some sort of power regression on the Intel >> platforms made after 2015. Why not check for the constraint and: >> > Are you sure? I saw it as an explanation of how Windows could put the > systems into D3 when there is no other PM related ACPI objects. > >> - If present and enabled, use the desired D-state >> - If present and disabled, leave the device in D0 >> - If not present use the existing cutoff date >> >> ? > > Thanks! That sounds very reasonable to me. I'll double check it in my > case. I've played with this a bit, and I found that I can make it work by moving the constraints check into pci_target_state() in the non-ACPI power manageable case. To me this works pretty well to reflect spec policy ambiguity but should avoid regressions dropping the 2015 check. I'll send out a v9 with this approach.
On Thu, Aug 03, 2023 at 10:18:07AM -0500, Mario Limonciello wrote: > On 8/3/2023 10:14, Mika Westerberg wrote: > > On Thu, Aug 03, 2023 at 06:38:45AM -0500, Mario Limonciello wrote: > > > On 8/3/23 00:01, Mika Westerberg wrote: > > > > Hi, > > > > > > > > On Wed, Aug 02, 2023 at 03:10:13PM -0500, Mario Limonciello wrote: > > > > > @@ -3036,11 +3044,8 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) > > > > > if (dmi_check_system(bridge_d3_blacklist)) > > > > > return false; > > > > > - /* > > > > > - * It should be safe to put PCIe ports from 2015 or newer > > > > > - * to D3. > > > > > - */ > > > > > - if (dmi_get_bios_year() >= 2015) > > > > > + /* the platform indicates in a device constraint that D3 is needed */ > > > > > + if (platform_constraint_d3(bridge)) > > > > > > > > This for sure causes some sort of power regression on the Intel > > > > platforms made after 2015. Why not check for the constraint and: > > > > > > > Are you sure? I saw it as an explanation of how Windows could put the > > > systems into D3 when there is no other PM related ACPI objects. > > > > I'm concerned if there are no PEP constraints on some of the affected > > systems this now leaves root ports into D0 then, no? > > Do you have any idea if any of the affected systems were something that > didn't ship with Windows? Like an Apple system or a Chromebook? Some of them, at least the Apollo Lake ones were used in IVI systems that did not run Windows IIRC. > If so; I'd think it's better to treat those as "quirks" rather than make a > blanket policy from the timing. It is possible that the quirk list ends up being rather big (or not) so it may be considered something that is painful to maintain.
On Fri, Aug 04, 2023 at 09:07:43AM +0300, Mika Westerberg wrote: > On Thu, Aug 03, 2023 at 10:18:07AM -0500, Mario Limonciello wrote: ... > Some of them, at least the Apollo Lake ones were used in IVI systems > that did not run Windows IIRC. And if it matters, they even don't have EFI complaint BIOS.
On 8/4/23 01:12, Andy Shevchenko wrote: > On Fri, Aug 04, 2023 at 09:07:43AM +0300, Mika Westerberg wrote: >> On Thu, Aug 03, 2023 at 10:18:07AM -0500, Mario Limonciello wrote: > > ... > >> Some of them, at least the Apollo Lake ones were used in IVI systems >> that did not run Windows IIRC. > > And if it matters, they even don't have EFI complaint BIOS. > Thanks. I agree then; to avoid causing regressions or building a monster list we should at least try to work in the confines of improving the situation with an extra optional check that doesn't fail if not present. Hopefully the approach from my v9 series works for this.
diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c index ce62e61a9605e..60f681fa05ed3 100644 --- a/drivers/acpi/x86/s2idle.c +++ b/drivers/acpi/x86/s2idle.c @@ -133,7 +133,7 @@ static void lpi_device_get_constraints_amd(void) union acpi_object *obj = &info_obj->package.elements[k]; list = &lpi_constraints_table[lpi_constraints_table_size]; - list->min_dstate = -1; + list->min_dstate = -EINVAL; switch (k) { case 0: @@ -244,7 +244,7 @@ static void lpi_device_get_constraints(void) acpi_handle_debug(lps0_device_handle, "index:%d Name:%s\n", i, info.name); - constraint->min_dstate = -1; + constraint->min_dstate = -EINVAL; for (j = 0; j < package_count; ++j) { union acpi_object *info_obj = &info.package[j]; @@ -291,6 +291,30 @@ static void lpi_device_get_constraints(void) ACPI_FREE(out_obj); } +/* + * acpi_get_lps0_constraint - get any LPS0 constraint for an acpi device + * @handle: ACPI handle of the device + * + * If a constraint has been specified in the _DSM method for the device, + * return it. Otherwise, return -ENODEV. + */ +int acpi_get_lps0_constraint(struct device *dev) +{ + acpi_handle handle = ACPI_HANDLE(dev); + int i; + + if (!handle) + return -ENODEV; + + for (i = 0; i < lpi_constraints_table_size; ++i) { + if (lpi_constraints_table[i].handle != handle) + continue; + return lpi_constraints_table[i].min_dstate; + } + + return -ENODEV; +} + static void lpi_check_constraints(void) { int i; diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c index a05350a4e49cb..40900754404ff 100644 --- a/drivers/pci/pci-acpi.c +++ b/drivers/pci/pci-acpi.c @@ -1043,6 +1043,25 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev) return false; } +/* + * acpi_pci_device_constraint_d3 - determine if device constraints require D3 + * @dev: PCI device to check + * + * Returns true if the PEP constraints for the device is enabled and + * requires D3. + */ +bool acpi_pci_device_constraint_d3(struct pci_dev *dev) +{ + int constraint = acpi_get_lps0_constraint(&dev->dev); + + if (constraint < 0) { + pci_dbg(dev, "ACPI device constraint not present\n"); + return false; + } + + return constraint >= 3; +} + static void acpi_pci_config_space_access(struct pci_dev *dev, bool enable) { int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT; diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 60230da957e0c..dce313cd3b8f0 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1082,6 +1082,14 @@ static inline bool platform_pci_bridge_d3(struct pci_dev *dev) return acpi_pci_bridge_d3(dev); } +static inline bool platform_constraint_d3(struct pci_dev *dev) +{ + if (pci_use_mid_pm()) + return false; + + return acpi_pci_device_constraint_d3(dev); +} + /** * pci_update_current_state - Read power state of given device and cache it * @dev: PCI device to handle. @@ -3036,11 +3044,8 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) if (dmi_check_system(bridge_d3_blacklist)) return false; - /* - * It should be safe to put PCIe ports from 2015 or newer - * to D3. - */ - if (dmi_get_bios_year() >= 2015) + /* the platform indicates in a device constraint that D3 is needed */ + if (platform_constraint_d3(bridge)) return true; break; } diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index a4c3974340576..2162f243bcc25 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -707,6 +707,7 @@ void pci_set_acpi_fwnode(struct pci_dev *dev); int pci_dev_acpi_reset(struct pci_dev *dev, bool probe); bool acpi_pci_power_manageable(struct pci_dev *dev); bool acpi_pci_bridge_d3(struct pci_dev *dev); +bool acpi_pci_device_constraint_d3(struct pci_dev *dev); int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state); pci_power_t acpi_pci_get_power_state(struct pci_dev *dev); void acpi_pci_refresh_power_state(struct pci_dev *dev); @@ -731,6 +732,10 @@ static inline bool acpi_pci_bridge_d3(struct pci_dev *dev) { return false; } +static inline bool acpi_pci_device_constraint_d3(struct pci_dev *dev) +{ + return false; +} static inline int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state) { return -ENODEV; diff --git a/include/linux/acpi.h b/include/linux/acpi.h index 0d5277b7c6323..353c0b910c2cd 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -1109,6 +1109,12 @@ struct acpi_s2idle_dev_ops { }; int acpi_register_lps0_dev(struct acpi_s2idle_dev_ops *arg); void acpi_unregister_lps0_dev(struct acpi_s2idle_dev_ops *arg); +int acpi_get_lps0_constraint(struct device *dev); +#else +static inline int acpi_get_lps0_constraint(struct device *dev) +{ + return false; +} #endif /* CONFIG_X86 */ #ifndef CONFIG_IA64 void arch_reserve_mem_area(acpi_physical_address addr, size_t size);
Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") PCIe ports from modern machines (>=2015) are allowed to be put into D3 by storing a value to the `bridge_d3` variable in the `struct pci_dev` structure. pci_power_manageable() uses this variable to indicate a PCIe port can enter D3. pci_pm_suspend_noirq() uses the return from pci_power_manageable() to decide whether to try to put a device into its target state for a sleep cycle via pci_prepare_to_sleep(). For devices that support D3, the target state is selected by this policy: 1. If platform_pci_power_manageable(): Use platform_pci_choose_state() 2. If the device is armed for wakeup: Select the deepest D-state that supports a PME. 3. Else: Use D3hot. Devices are considered power manageable by the platform when they have one or more objects described in the table in section 7.3 of the ACPI 6.5 specification. When devices are not considered power manageable; specs are ambiguous as to what should happen. In this situation Windows 11 seems to leave PCIe root ports in D0 while Linux puts them into D3 due to the above mentioned commit. In Windows systems that support Modern Standby specify hardware pre-conditions for the SoC to achieve the lowest power state by device constraints in a SOC specific "Power Engine Plugin" (PEP) [2] [3]. They can be marked as disabled or enabled and when enabled can specify the minimum power state required for an ACPI device. Instead of using a time based heuristic to decide if a port should go into D3 use device constraints to decide. * If the constraint is not present or disabled then choose D0. * If the constraint is enabled, then enable D3 if the constraint is set to 3 or greater. Link: https://uefi.org/specs/ACPI/6.5/07_Power_and_Performance_Mgmt.html#device-power-management-objects [1] Link: https://learn.microsoft.com/en-us/windows-hardware/design/device-experiences/platform-design-for-modern-standby#low-power-core-silicon-cpu-soc-dram [2] Link: https://uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf [3] Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Reported-by: Iain Lane <iain@orangesquash.org.uk> Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> --- v7->v8: * Use device constraints instead * Update commit message and links --- drivers/acpi/x86/s2idle.c | 28 ++++++++++++++++++++++++++-- drivers/pci/pci-acpi.c | 19 +++++++++++++++++++ drivers/pci/pci.c | 15 ++++++++++----- drivers/pci/pci.h | 5 +++++ include/linux/acpi.h | 6 ++++++ 5 files changed, 66 insertions(+), 7 deletions(-)