diff mbox series

nvme-pci: Shutdown the device if D3Cold is allowed by the user

Message ID 20241118082344.8146-1-manivannan.sadhasivam@linaro.org (mailing list archive)
State New

Commit Message

Manivannan Sadhasivam Nov. 18, 2024, 8:23 a.m. UTC
PCI core allows users to configure the D3Cold state for each PCI device
through the sysfs attribute '/sys/bus/pci/devices/.../d3cold_allowed'. This
attribute sets the 'pci_dev:d3cold_allowed' flag and could be used by users
to allow/disallow the PCI devices to enter D3Cold during system suspend.

So make use of this flag in the NVMe driver to shut down the NVMe device
during system suspend if the user has allowed D3Cold for the device.
Existing checks in the NVMe driver decide whether to shut down the device
(based on platform/device limitations), so use this flag as the last resort
to keep the existing behavior.

The default behavior of the 'pci_dev:d3cold_allowed' flag is to allow
D3Cold and the users can disallow it through sysfs if they want.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
 drivers/nvme/host/pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
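
For readers who want to inspect the sysfs attribute described in the commit
message from user space, here is a minimal sketch in C. It assumes the
attribute reads back as a single '0' or '1' character followed by a newline.
The function name read_d3cold_allowed() is hypothetical, and the caller
supplies the full path, because the device part of the path is elided in the
commit message.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a d3cold_allowed attribute file.
 * Returns 1 if D3Cold is allowed, 0 if it is disallowed, -1 on any error
 * (file cannot be opened, or the first character is neither '0' nor '1').
 * attr_path is provided by the caller; no concrete device path is assumed.
 */
int read_d3cold_allowed(const char *attr_path)
{
	FILE *f = fopen(attr_path, "r");
	int c;

	if (!f)
		return -1;

	/* Only the first character carries the flag value. */
	c = fgetc(f);
	fclose(f);

	if (c == '0')
		return 0;
	if (c == '1')
		return 1;
	return -1;
}
```

A caller would pass the path of the attribute for the device of interest and
treat -1 as "unknown" rather than as "disallowed".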

Comments

Christoph Hellwig Nov. 18, 2024, 12:58 p.m. UTC | #1
On Mon, Nov 18, 2024 at 01:53:44PM +0530, Manivannan Sadhasivam wrote:
> PCI core allows users to configure the D3Cold state for each PCI device
> through the sysfs attribute '/sys/bus/pci/devices/.../d3cold_allowed'. This
> attribute sets the 'pci_dev:d3cold_allowed' flag and could be used by users
> to allow/disallow the PCI devices to enter D3Cold during system suspend.
>
> So make use of this flag in the NVMe driver to shut down the NVMe device
> during system suspend if the user has allowed D3Cold for the device.
> Existing checks in the NVMe driver decide whether to shut down the device
> (based on platform/device limitations), so use this flag as the last resort
> to keep the existing behavior.

Umm, what?  The documentation of this attribute says:

"d3cold_allowed is bit to control whether the corresponding PCI
 device can be put into D3Cold state.  If it is cleared, the
 device will never be put into D3Cold state.  If it is set, the
 device may be put into D3Cold state if other requirements are
 satisfied too.  Reading this attribute will show the current
 value of d3cold_allowed bit. Writing this attribute will set
 the value of d3cold_allowed bit."

Which honestly already sounds rather non-specific, but is anything but
a mandate for drivers to act on it.

The only place currently checking it is pci_dev_check_d3cold in the
PCI core, which is used to set the bridge_d3 attribute.

So blindly using it in a driver to force a different PM strategy feels
completely wrong.  Even if the attribute should have that effect, it
needs to happen through a well documented PCI or PM layer helper and
not be open coded like this.

Manivannan Sadhasivam Nov. 18, 2024, 2:58 p.m. UTC | #2
On Mon, Nov 18, 2024 at 01:58:17PM +0100, Christoph Hellwig wrote:
> On Mon, Nov 18, 2024 at 01:53:44PM +0530, Manivannan Sadhasivam wrote:
> > PCI core allows users to configure the D3Cold state for each PCI device
> > through the sysfs attribute '/sys/bus/pci/devices/.../d3cold_allowed'. This
> > attribute sets the 'pci_dev:d3cold_allowed' flag and could be used by users
> > to allow/disallow the PCI devices to enter D3Cold during system suspend.
> >
> > So make use of this flag in the NVMe driver to shut down the NVMe device
> > during system suspend if the user has allowed D3Cold for the device.
> > Existing checks in the NVMe driver decide whether to shut down the device
> > (based on platform/device limitations), so use this flag as the last resort
> > to keep the existing behavior.
> 
> Umm, what?  The documentation of this attribute says:
> 
> "d3cold_allowed is bit to control whether the corresponding PCI
>  device can be put into D3Cold state.  If it is cleared, the
>  device will never be put into D3Cold state.  If it is set, the
>  device may be put into D3Cold state if other requirements are
>  satisfied too.  Reading this attribute will show the current
>  value of d3cold_allowed bit. Writing this attribute will set
>  the value of d3cold_allowed bit."
> 
> Which honestly already sounds rather non-specific, but is anything but
> a mandate for drivers to act on it.
> 
> The only place currently checking it is pci_dev_check_d3cold in the
> PCI core, which is used to set the bridge_d3 attribute.
> 

Yeah, it has pretty much only been used internally until now. But the attribute
looked like the closest match I could find for this use case, and that's why I
used it.

> So blindly using it in a driver to force a different PM strategy feels
> completely wrong.  Even if the attribute should have that effect, it
> needs to happen through a well documented PCI or PM layer helper and
> not be open coded like this.
> 

Ok. I'd like to get some feedback from Bjorn H (PCI maintainer) about using this
attribute before moving forward with a helper.

Thanks!

- Mani
Patch

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4b9fda0b1d9a..a4d4687854bf 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3287,7 +3287,8 @@  static int nvme_suspend(struct device *dev)
 	 */
 	if (pm_suspend_via_firmware() || !ctrl->npss ||
 	    !pcie_aspm_enabled(pdev) ||
-	    (ndev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND))
+	    (ndev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND) ||
+	    pdev->d3cold_allowed)
 		return nvme_disable_prepare_reset(ndev, true);
 
 	nvme_start_freeze(ctrl);
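
To make the effect of the hunk above easier to reason about, here is a
user-space sketch in C that models only the boolean condition, before and
after the patch. It is not kernel code: struct suspend_inputs and both
function names are hypothetical stand-ins, and each field merely mirrors one
term of the condition in nvme_suspend().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the inputs the condition in nvme_suspend() reads. */
struct suspend_inputs {
	bool suspend_via_firmware;	/* models pm_suspend_via_firmware() */
	unsigned int npss;		/* models ctrl->npss */
	bool aspm_enabled;		/* models pcie_aspm_enabled(pdev) */
	bool simple_suspend_quirk;	/* models NVME_QUIRK_SIMPLE_SUSPEND being set */
	bool d3cold_allowed;		/* models pdev->d3cold_allowed */
};

/* Condition as it stands before the patch: true means "shut down the device". */
bool shutdown_before_patch(const struct suspend_inputs *in)
{
	return in->suspend_via_firmware || !in->npss ||
	       !in->aspm_enabled || in->simple_suspend_quirk;
}

/* Condition with the patch applied: d3cold_allowed is checked last. */
bool shutdown_after_patch(const struct suspend_inputs *in)
{
	return shutdown_before_patch(in) || in->d3cold_allowed;
}
```

In this model, a device for which none of the existing checks fire (no
firmware suspend, non-zero npss, ASPM enabled, no quirk) moves to the shutdown
path after the patch whenever d3cold_allowed is set. Since the commit message
states that the flag defaults to allowing D3Cold, that would be the outcome
unless the user clears the attribute.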