Message ID | 20220902233543.390890-4-helgaas@kernel.org (mailing list archive) |
---|---|
State | Handled Elsewhere, archived |
Series | PCI/PM: Always disable PTM for all devices during suspend |
On 9/2/22 4:35 PM, Bjorn Helgaas wrote: > From: Bjorn Helgaas <bhelgaas@google.com> > > We want to disable PTM on Root Ports because that allows some chips, e.g., > Intel mobile chips since Coffee Lake, to enter a lower-power PM state. > > That means we also have to disable PTM on downstream devices. PCIe r6.0, > sec 2.2.8, recommends that functions support generation of messages in > non-D0 states, so we have to assume Switch Upstream Ports or Endpoints may > send PTM Requests while in D1, D2, and D3hot. A PTM message received by a > Downstream Port (including a Root Port) with PTM disabled must be treated > as an Unsupported Request (sec 6.21.3). > > PTM was previously disabled only for Root Ports, and it was disabled in > pci_prepare_to_sleep(), which is not called at all if a driver supports > legacy PM or does its own state saving. > > Instead, disable PTM early in pci_pm_suspend() and pci_pm_runtime_suspend() > so we do it in all cases. > > Previously PTM was disabled *after* saving device state, so the state > restore on resume automatically re-enabled it. Since we now disable PTM > *before* saving state, we must explicitly re-enable it. > > Here's a sample of errors that occur when PTM is disabled only on the Root > Port. With this topology: > > 0000:00:1d.0 Root Port to [bus 08-71] > 0000:08:00.0 Switch Upstream Port to [bus 09-71] > > Kai-Heng reported errors like this: > > pcieport 0000:00:1d.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:1d.0 > pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID) > pcieport 0000:00:1d.0: device [8086:7ab0] error status/mask=00100000/00004000 > pcieport 0000:00:1d.0: [20] UnsupReq (First) > pcieport 0000:00:1d.0: AER: TLP Header: 34000000 08000052 00000000 00000000 > > Decoding TLP header 0x34...... 
(0011 0100b) and 0x08000052: > > Fmt 001b 4 DW header, no data > Type 1 0100b Msg (Local - Terminate at Receiver) > Requester ID 0x0800 Bus 08 Devfn 00.0 > Message Code 0x52 0101 0010b PTM Request > > The 00:1d.0 Root Port logged an Unsupported Request error when it received > a PTM Request with Requester ID 08:00.0. > > Fixes: a697f072f5da ("PCI: Disable PTM during suspend to save power") > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215453 > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216210 > Based-on: https://lore.kernel.org/r/20220706123244.18056-1-kai.heng.feng@canonical.com > Based-on-patch-by: Kai-Heng Feng <kai.heng.feng@canonical.com> > Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> > --- > drivers/pci/pci-driver.c | 14 ++++++++++++++ > drivers/pci/pci.c | 20 -------------------- > 2 files changed, 14 insertions(+), 20 deletions(-) > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c > index 2815922ac525..115febaa7e0b 100644 > --- a/drivers/pci/pci-driver.c > +++ b/drivers/pci/pci-driver.c > @@ -772,6 +772,12 @@ static int pci_pm_suspend(struct device *dev) > struct pci_dev *pci_dev = to_pci_dev(dev); > const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; > > + /* > + * Disabling PTM allows some systems, e.g., Intel mobile chips > + * since Coffee Lake, to enter a lower-power PM state. > + */ > + pci_disable_ptm(pci_dev); I think you can use "if (pci_dev->ptm_enabled)" check for pci_disable_ptm() as well. This will avoid unnecessary checks in pci_disable_ptm(). 
> + > pci_dev->skip_bus_pm = false; > > if (pci_has_legacy_pm_support(pci_dev)) > @@ -982,6 +988,9 @@ static int pci_pm_resume(struct device *dev) > if (pci_dev->state_saved) > pci_restore_standard_config(pci_dev); > > + if (pci_dev->ptm_enabled) > + pci_enable_ptm(pci_dev, NULL); > + > if (pci_has_legacy_pm_support(pci_dev)) > return pci_legacy_resume(dev); > > @@ -1269,6 +1278,8 @@ static int pci_pm_runtime_suspend(struct device *dev) > pci_power_t prev = pci_dev->current_state; > int error; > > + pci_disable_ptm(pci_dev); > + > /* > * If pci_dev->driver is not set (unbound), we leave the device in D0, > * but it may go to D3cold when the bridge above it runtime suspends. > @@ -1331,6 +1342,9 @@ static int pci_pm_runtime_resume(struct device *dev) > */ > pci_pm_default_resume_early(pci_dev); > > + if (pci_dev->ptm_enabled) > + pci_enable_ptm(pci_dev, NULL); > + > if (!pci_dev->driver) > return 0; > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 95bc329e74c0..b0e2968c8cca 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -2706,16 +2706,6 @@ int pci_prepare_to_sleep(struct pci_dev *dev) > if (target_state == PCI_POWER_ERROR) > return -EIO; > > - /* > - * There are systems (for example, Intel mobile chips since Coffee > - * Lake) where the power drawn while suspended can be significantly > - * reduced by disabling PTM on PCIe root ports as this allows the > - * port to enter a lower-power PM state and the SoC to reach a > - * lower-power idle state as a whole. 
> - */ > - if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) > - pci_disable_ptm(dev); > - > pci_enable_wake(dev, target_state, wakeup); > > error = pci_set_power_state(dev, target_state); > @@ -2764,16 +2754,6 @@ int pci_finish_runtime_suspend(struct pci_dev *dev) > if (target_state == PCI_POWER_ERROR) > return -EIO; > > - /* > - * There are systems (for example, Intel mobile chips since Coffee > - * Lake) where the power drawn while suspended can be significantly > - * reduced by disabling PTM on PCIe root ports as this allows the > - * port to enter a lower-power PM state and the SoC to reach a > - * lower-power idle state as a whole. > - */ > - if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) > - pci_disable_ptm(dev); > - > __pci_enable_wake(dev, target_state, pci_dev_run_wake(dev)); > > error = pci_set_power_state(dev, target_state);
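The TLP header decode in the commit log above can be reproduced with a few shifts and masks. The following is a minimal user-space sketch, not kernel code: the struct and function names (`tlp_msg_hdr`, `tlp_decode_msg`, `rid_bus`, `rid_dev`, `rid_fn`) are made up for illustration, and only the fields the commit log discusses are extracted from the first two DWs of the logged header.

```c
#include <stdint.h>

/* Fields of interest from the first two DWs of a Message TLP header. */
struct tlp_msg_hdr {
	uint8_t fmt;		/* DW0 bits 31:29 */
	uint8_t type;		/* DW0 bits 28:24 */
	uint16_t requester_id;	/* DW1 bits 31:16 */
	uint8_t msg_code;	/* DW1 bits 7:0 */
};

/* Split the two logged DWs into the fields listed in the commit log. */
static inline struct tlp_msg_hdr tlp_decode_msg(uint32_t dw0, uint32_t dw1)
{
	struct tlp_msg_hdr h;

	h.fmt = (dw0 >> 29) & 0x7;
	h.type = (dw0 >> 24) & 0x1f;
	h.requester_id = (dw1 >> 16) & 0xffff;
	h.msg_code = dw1 & 0xff;
	return h;
}

/* A Requester ID is bus[15:8], device[7:3], function[2:0]. */
static inline uint8_t rid_bus(uint16_t rid) { return rid >> 8; }
static inline uint8_t rid_dev(uint16_t rid) { return (rid >> 3) & 0x1f; }
static inline uint8_t rid_fn(uint16_t rid)  { return rid & 0x7; }
```

Feeding it the first two DWs from the AER log, `tlp_decode_msg(0x34000000, 0x08000052)`, yields Fmt 001b, Type 1 0100b, Requester ID 0x0800 (bus 08, device 00, function 0) and Message Code 0x52, matching the table in the commit log.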
Hi Bjorn, I love your patch! Yet something to improve: [auto build test ERROR on helgaas-pci/next] [also build test ERROR on linus/master v6.0-rc3 next-20220901] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Bjorn-Helgaas/PCI-PM-Always-disable-PTM-for-all-devices-during-suspend/20220903-073808 base: https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next config: riscv-randconfig-r042-20220901 (https://download.01.org/0day-ci/archive/20220903/202209031120.PkNnRSt8-lkp@intel.com/config) compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project c55b41d5199d2394dd6cdb8f52180d8b81d809d4) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install riscv cross compiling tool for clang build # apt-get install binutils-riscv64-linux-gnu # https://github.com/intel-lab-lkp/linux/commit/809e1c954b459ee37193c4ab9fa843243fbd7fa9 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Bjorn-Helgaas/PCI-PM-Always-disable-PTM-for-all-devices-during-suspend/20220903-073808 git checkout 809e1c954b459ee37193c4ab9fa843243fbd7fa9 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash drivers/pci/ If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): >> drivers/pci/pci-driver.c:1350:15: error: no member named 'ptm_enabled' in 'struct pci_dev' if (pci_dev->ptm_enabled) ~~~~~~~ ^ 1 error generated. 
vim +1350 drivers/pci/pci-driver.c 1335 1336 static int pci_pm_runtime_resume(struct device *dev) 1337 { 1338 struct pci_dev *pci_dev = to_pci_dev(dev); 1339 const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; 1340 pci_power_t prev_state = pci_dev->current_state; 1341 int error = 0; 1342 1343 /* 1344 * Restoring config space is necessary even if the device is not bound 1345 * to a driver because although we left it in D0, it may have gone to 1346 * D3cold when the bridge above it runtime suspended. 1347 */ 1348 pci_pm_default_resume_early(pci_dev); 1349 > 1350 if (pci_dev->ptm_enabled) 1351 pci_enable_ptm(pci_dev, NULL); 1352 1353 if (!pci_dev->driver) 1354 return 0; 1355 1356 pci_fixup_device(pci_fixup_resume_early, pci_dev); 1357 pci_pm_default_resume(pci_dev); 1358 1359 if (prev_state == PCI_D3cold) 1360 pci_pm_bridge_power_up_actions(pci_dev); 1361 1362 if (pm && pm->runtime_resume) 1363 error = pm->runtime_resume(dev); 1364 1365 return error; 1366 } 1367
Hi Bjorn, I love your patch! Yet something to improve: [auto build test ERROR on helgaas-pci/next] [also build test ERROR on linus/master v6.0-rc3 next-20220901] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Bjorn-Helgaas/PCI-PM-Always-disable-PTM-for-all-devices-during-suspend/20220903-073808 base: https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next config: x86_64-randconfig-a015 (https://download.01.org/0day-ci/archive/20220903/202209031256.McjuB8mz-lkp@intel.com/config) compiler: gcc-11 (Debian 11.3.0-5) 11.3.0 reproduce (this is a W=1 build): # https://github.com/intel-lab-lkp/linux/commit/809e1c954b459ee37193c4ab9fa843243fbd7fa9 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Bjorn-Helgaas/PCI-PM-Always-disable-PTM-for-all-devices-during-suspend/20220903-073808 git checkout 809e1c954b459ee37193c4ab9fa843243fbd7fa9 # save the config file mkdir build_dir && cp config build_dir/.config make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): drivers/pci/pci-driver.c: In function 'pci_pm_runtime_resume': >> drivers/pci/pci-driver.c:1350:22: error: 'struct pci_dev' has no member named 'ptm_enabled'; did you mean 'ats_enabled'? 1350 | if (pci_dev->ptm_enabled) | ^~~~~~~~~~~ | ats_enabled vim +1350 drivers/pci/pci-driver.c 1335 1336 static int pci_pm_runtime_resume(struct device *dev) 1337 { 1338 struct pci_dev *pci_dev = to_pci_dev(dev); 1339 const struct dev_pm_ops *pm = dev->driver ? 
dev->driver->pm : NULL; 1340 pci_power_t prev_state = pci_dev->current_state; 1341 int error = 0; 1342 1343 /* 1344 * Restoring config space is necessary even if the device is not bound 1345 * to a driver because although we left it in D0, it may have gone to 1346 * D3cold when the bridge above it runtime suspended. 1347 */ 1348 pci_pm_default_resume_early(pci_dev); 1349 > 1350 if (pci_dev->ptm_enabled) 1351 pci_enable_ptm(pci_dev, NULL); 1352 1353 if (!pci_dev->driver) 1354 return 0; 1355 1356 pci_fixup_device(pci_fixup_resume_early, pci_dev); 1357 pci_pm_default_resume(pci_dev); 1358 1359 if (prev_state == PCI_D3cold) 1360 pci_pm_bridge_power_up_actions(pci_dev); 1361 1362 if (pm && pm->runtime_resume) 1363 error = pm->runtime_resume(dev); 1364 1365 return error; 1366 } 1367
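Both robot reports fail on the same line. A plausible reading, not stated in the reports themselves, is that `ptm_enabled` is only declared in `struct pci_dev` when `CONFIG_PCIE_PTM` is set, so a randconfig without that option cannot compile a direct `pci_dev->ptm_enabled` access. The sketch below shows the usual way around this kind of error: keep the field access inside a helper that has an empty stub when the option is off. Everything here is a hypothetical user-space stand-in (`SKETCH_PCIE_PTM`, `struct sketch_pci_dev`, `sketch_resume_ptm()`, `sketch_runtime_resume()`), not the kernel's actual definitions.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for a Kconfig option such as CONFIG_PCIE_PTM.
 * Leave it undefined to model the failing randconfig builds.
 */
/* #define SKETCH_PCIE_PTM */

struct sketch_pci_dev {
	bool state_saved;
#ifdef SKETCH_PCIE_PTM
	bool ptm_enabled;	/* exists only when the option is on */
	bool ptm_hw_on;		/* models the PTM Enable bit */
#endif
};

#ifdef SKETCH_PCIE_PTM
static inline void sketch_resume_ptm(struct sketch_pci_dev *dev)
{
	if (dev->ptm_enabled)
		dev->ptm_hw_on = true;
}
#else
/* Stub: callers compile unchanged when the field does not exist. */
static inline void sketch_resume_ptm(struct sketch_pci_dev *dev)
{
	(void)dev;
}
#endif

/* The caller never touches ptm_enabled directly, so it builds either way. */
static inline int sketch_runtime_resume(struct sketch_pci_dev *dev)
{
	sketch_resume_ptm(dev);
	return 0;
}
```

With this shape the resume path compiles with or without the option, because the only code that names the guarded field is itself guarded.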
On Sat, Sep 3, 2022 at 1:59 AM Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com> wrote: > > > > On 9/2/22 4:35 PM, Bjorn Helgaas wrote: > > From: Bjorn Helgaas <bhelgaas@google.com> > > > > We want to disable PTM on Root Ports because that allows some chips, e.g., > > Intel mobile chips since Coffee Lake, to enter a lower-power PM state. > > > > That means we also have to disable PTM on downstream devices. PCIe r6.0, > > sec 2.2.8, recommends that functions support generation of messages in > > non-D0 states, so we have to assume Switch Upstream Ports or Endpoints may > > send PTM Requests while in D1, D2, and D3hot. A PTM message received by a > > Downstream Port (including a Root Port) with PTM disabled must be treated > > as an Unsupported Request (sec 6.21.3). > > > > PTM was previously disabled only for Root Ports, and it was disabled in > > pci_prepare_to_sleep(), which is not called at all if a driver supports > > legacy PM or does its own state saving. > > > > Instead, disable PTM early in pci_pm_suspend() and pci_pm_runtime_suspend() > > so we do it in all cases. > > > > Previously PTM was disabled *after* saving device state, so the state > > restore on resume automatically re-enabled it. Since we now disable PTM > > *before* saving state, we must explicitly re-enable it. > > > > Here's a sample of errors that occur when PTM is disabled only on the Root > > Port. 
With this topology: > > > > 0000:00:1d.0 Root Port to [bus 08-71] > > 0000:08:00.0 Switch Upstream Port to [bus 09-71] > > > > Kai-Heng reported errors like this: > > > > pcieport 0000:00:1d.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:1d.0 > > pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID) > > pcieport 0000:00:1d.0: device [8086:7ab0] error status/mask=00100000/00004000 > > pcieport 0000:00:1d.0: [20] UnsupReq (First) > > pcieport 0000:00:1d.0: AER: TLP Header: 34000000 08000052 00000000 00000000 > > > > Decoding TLP header 0x34...... (0011 0100b) and 0x08000052: > > > > Fmt 001b 4 DW header, no data > > Type 1 0100b Msg (Local - Terminate at Receiver) > > Requester ID 0x0800 Bus 08 Devfn 00.0 > > Message Code 0x52 0101 0010b PTM Request > > > > The 00:1d.0 Root Port logged an Unsupported Request error when it received > > a PTM Request with Requester ID 08:00.0. > > > > Fixes: a697f072f5da ("PCI: Disable PTM during suspend to save power") > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215453 > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216210 > > Based-on: https://lore.kernel.org/r/20220706123244.18056-1-kai.heng.feng@canonical.com > > Based-on-patch-by: Kai-Heng Feng <kai.heng.feng@canonical.com> > > Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> > > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> > > --- > > drivers/pci/pci-driver.c | 14 ++++++++++++++ > > drivers/pci/pci.c | 20 -------------------- > > 2 files changed, 14 insertions(+), 20 deletions(-) > > > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c > > index 2815922ac525..115febaa7e0b 100644 > > --- a/drivers/pci/pci-driver.c > > +++ b/drivers/pci/pci-driver.c > > @@ -772,6 +772,12 @@ static int pci_pm_suspend(struct device *dev) > > struct pci_dev *pci_dev = to_pci_dev(dev); > > const struct dev_pm_ops *pm = dev->driver ? 
dev->driver->pm : NULL; > > > > + /* > > + * Disabling PTM allows some systems, e.g., Intel mobile chips > > + * since Coffee Lake, to enter a lower-power PM state. > > + */ > > + pci_disable_ptm(pci_dev); > > I think you can use "if (pci_dev->ptm_enabled)" check for pci_disable_ptm() > as well. This will avoid unnecessary checks in pci_disable_ptm(). Or use that check in pci_disable_ptm() instead of the pci_is_pcie() one. Also, I would rename pci_disable_ptm() to pci_suspend_ptm() (because its role is to temporarily disable PTM for system-wide suspend) and introduce pci_resume_ptm() that will do if (pci_dev->ptm_enabled) pci_enable_ptm(pci_dev, NULL); > > + > > pci_dev->skip_bus_pm = false; > > > > if (pci_has_legacy_pm_support(pci_dev)) > > @@ -982,6 +988,9 @@ static int pci_pm_resume(struct device *dev) > > if (pci_dev->state_saved) > > pci_restore_standard_config(pci_dev); > > > > + if (pci_dev->ptm_enabled) > > + pci_enable_ptm(pci_dev, NULL); > > + > > if (pci_has_legacy_pm_support(pci_dev)) > > return pci_legacy_resume(dev); > > > > @@ -1269,6 +1278,8 @@ static int pci_pm_runtime_suspend(struct device *dev) > > pci_power_t prev = pci_dev->current_state; > > int error; > > > > + pci_disable_ptm(pci_dev); > > + > > /* > > * If pci_dev->driver is not set (unbound), we leave the device in D0, > > * but it may go to D3cold when the bridge above it runtime suspends. 
> > @@ -1331,6 +1342,9 @@ static int pci_pm_runtime_resume(struct device *dev) > > */ > > pci_pm_default_resume_early(pci_dev); > > > > + if (pci_dev->ptm_enabled) > > + pci_enable_ptm(pci_dev, NULL); > > + > > if (!pci_dev->driver) > > return 0; > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > index 95bc329e74c0..b0e2968c8cca 100644 > > --- a/drivers/pci/pci.c > > +++ b/drivers/pci/pci.c > > @@ -2706,16 +2706,6 @@ int pci_prepare_to_sleep(struct pci_dev *dev) > > if (target_state == PCI_POWER_ERROR) > > return -EIO; > > > > - /* > > - * There are systems (for example, Intel mobile chips since Coffee > > - * Lake) where the power drawn while suspended can be significantly > > - * reduced by disabling PTM on PCIe root ports as this allows the > > - * port to enter a lower-power PM state and the SoC to reach a > > - * lower-power idle state as a whole. > > - */ > > - if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) > > - pci_disable_ptm(dev); > > - > > pci_enable_wake(dev, target_state, wakeup); > > > > error = pci_set_power_state(dev, target_state); > > @@ -2764,16 +2754,6 @@ int pci_finish_runtime_suspend(struct pci_dev *dev) > > if (target_state == PCI_POWER_ERROR) > > return -EIO; > > > > - /* > > - * There are systems (for example, Intel mobile chips since Coffee > > - * Lake) where the power drawn while suspended can be significantly > > - * reduced by disabling PTM on PCIe root ports as this allows the > > - * port to enter a lower-power PM state and the SoC to reach a > > - * lower-power idle state as a whole. > > - */ > > - if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) > > - pci_disable_ptm(dev); > > - > > __pci_enable_wake(dev, target_state, pci_dev_run_wake(dev)); > > > > error = pci_set_power_state(dev, target_state); > > -- > Sathyanarayanan Kuppuswamy > Linux Kernel Developer
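The suspend/resume pairing suggested in the reply above, together with the commit log's point that disabling PTM before the state save means the restore no longer turns it back on, can be modelled in a few lines. This is a hypothetical user-space model only: `struct demo_dev` and every `demo_*` function are invented for illustration and are not the kernel's `pci_suspend_ptm()`/`pci_resume_ptm()`, which did not exist in the posted patch.

```c
#include <stdbool.h>

/* Hypothetical model; none of these names are kernel API. */
struct demo_dev {
	bool ptm_enabled;	/* software intent: PTM should be on while awake */
	bool ctrl_enable;	/* models the PTM Enable bit in config space */
	bool saved_ctrl_enable;	/* models the saved copy of that bit */
};

/* Turn PTM off in "hardware" but remember that it was wanted. */
static inline void demo_suspend_ptm(struct demo_dev *d)
{
	d->ctrl_enable = false;
}

/* Turn PTM back on only if it was enabled before suspend. */
static inline void demo_resume_ptm(struct demo_dev *d)
{
	if (d->ptm_enabled)
		d->ctrl_enable = true;
}

static inline void demo_save_state(struct demo_dev *d)
{
	d->saved_ctrl_enable = d->ctrl_enable;
}

static inline void demo_restore_state(struct demo_dev *d)
{
	d->ctrl_enable = d->saved_ctrl_enable;
}

/* Order used by the patch: disable PTM first, then save state. */
static inline void demo_suspend(struct demo_dev *d)
{
	demo_suspend_ptm(d);
	demo_save_state(d);
}

/* The restore brings back "PTM off", so an explicit re-enable follows. */
static inline void demo_resume(struct demo_dev *d)
{
	demo_restore_state(d);
	demo_resume_ptm(d);
}
```

In this model, a device with `ptm_enabled` set still has `ctrl_enable` cleared after `demo_restore_state()` alone; it takes the `demo_resume_ptm()` call to set it again, which is the role of the explicit `pci_enable_ptm()` calls the patch adds to the resume paths.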
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index 2815922ac525..115febaa7e0b 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -772,6 +772,12 @@ static int pci_pm_suspend(struct device *dev) struct pci_dev *pci_dev = to_pci_dev(dev); const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; + /* + * Disabling PTM allows some systems, e.g., Intel mobile chips + * since Coffee Lake, to enter a lower-power PM state. + */ + pci_disable_ptm(pci_dev); + pci_dev->skip_bus_pm = false; if (pci_has_legacy_pm_support(pci_dev)) @@ -982,6 +988,9 @@ static int pci_pm_resume(struct device *dev) if (pci_dev->state_saved) pci_restore_standard_config(pci_dev); + if (pci_dev->ptm_enabled) + pci_enable_ptm(pci_dev, NULL); + if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_resume(dev); @@ -1269,6 +1278,8 @@ static int pci_pm_runtime_suspend(struct device *dev) pci_power_t prev = pci_dev->current_state; int error; + pci_disable_ptm(pci_dev); + /* * If pci_dev->driver is not set (unbound), we leave the device in D0, * but it may go to D3cold when the bridge above it runtime suspends. @@ -1331,6 +1342,9 @@ static int pci_pm_runtime_resume(struct device *dev) */ pci_pm_default_resume_early(pci_dev); + if (pci_dev->ptm_enabled) + pci_enable_ptm(pci_dev, NULL); + if (!pci_dev->driver) return 0; diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 95bc329e74c0..b0e2968c8cca 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -2706,16 +2706,6 @@ int pci_prepare_to_sleep(struct pci_dev *dev) if (target_state == PCI_POWER_ERROR) return -EIO; - /* - * There are systems (for example, Intel mobile chips since Coffee - * Lake) where the power drawn while suspended can be significantly - * reduced by disabling PTM on PCIe root ports as this allows the - * port to enter a lower-power PM state and the SoC to reach a - * lower-power idle state as a whole. 
- */ - if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) - pci_disable_ptm(dev); - pci_enable_wake(dev, target_state, wakeup); error = pci_set_power_state(dev, target_state); @@ -2764,16 +2754,6 @@ int pci_finish_runtime_suspend(struct pci_dev *dev) if (target_state == PCI_POWER_ERROR) return -EIO; - /* - * There are systems (for example, Intel mobile chips since Coffee - * Lake) where the power drawn while suspended can be significantly - * reduced by disabling PTM on PCIe root ports as this allows the - * port to enter a lower-power PM state and the SoC to reach a - * lower-power idle state as a whole. - */ - if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) - pci_disable_ptm(dev); - __pci_enable_wake(dev, target_state, pci_dev_run_wake(dev)); error = pci_set_power_state(dev, target_state);