Message ID | 20241118082344.8146-1-manivannan.sadhasivam@linaro.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | nvme-pci: Shutdown the device if D3Cold is allowed by the user | expand |
On Mon, Nov 18, 2024 at 01:53:44PM +0530, Manivannan Sadhasivam wrote: > PCI core allows users to configure the D3Cold state for each PCI device > through the sysfs attribute '/sys/bus/pci/devices/.../d3cold_allowed'. This > attribute sets the 'pci_dev:d3cold_allowed' flag and could be used by users > to allow/disallow the PCI devices to enter D3Cold during system suspend. > > So make use of this flag in the NVMe driver to shutdown the NVMe device > during system suspend if the user has allowed D3Cold for the device. > Existing checks in the NVMe driver decide whether to shut down the device > (based on platform/device limitations), so use this flag as the last resort > to keep the existing behavior. Umm, what? The documentation of this attribute says: "d3cold_allowed is bit to control whether the corresponding PCI device can be put into D3Cold state. If it is cleared, the device will never be put into D3Cold state. If it is set, the device may be put into D3Cold state if other requirements are satisfied too. Reading this attribute will show the current value of d3cold_allowed bit. Writing this attribute will the value of d3cold_allowed bit." Which honestly already sounds rather non-specific, but everything but a mandate for drivers to act on it. The only place currently checking it is pci_dev_check_d3cold in the PCI core, which is used to set the bridge_d3 attibute. So blindly using it in a driver to force a different PM strategy feels completely wrong. Even if the attrite should have that effect it needs to happen through a well documented PCI or PM layer helper and open coded like this.
On Mon, Nov 18, 2024 at 01:58:17PM +0100, Christoph Hellwig wrote: > On Mon, Nov 18, 2024 at 01:53:44PM +0530, Manivannan Sadhasivam wrote: > > PCI core allows users to configure the D3Cold state for each PCI device > > through the sysfs attribute '/sys/bus/pci/devices/.../d3cold_allowed'. This > > attribute sets the 'pci_dev:d3cold_allowed' flag and could be used by users > > to allow/disallow the PCI devices to enter D3Cold during system suspend. > > > > So make use of this flag in the NVMe driver to shutdown the NVMe device > > during system suspend if the user has allowed D3Cold for the device. > > Existing checks in the NVMe driver decide whether to shut down the device > > (based on platform/device limitations), so use this flag as the last resort > > to keep the existing behavior. > > Umm, what? The documentation of this attribute says: > > "d3cold_allowed is bit to control whether the corresponding PCI > device can be put into D3Cold state. If it is cleared, the > device will never be put into D3Cold state. If it is set, the > device may be put into D3Cold state if other requirements are > satisfied too. Reading this attribute will show the current > value of d3cold_allowed bit. Writing this attribute will > the value of d3cold_allowed bit." > > Which honestly already sounds rather non-specific, but everything but > a mandate for drivers to act on it. > > The only place currently checking it is pci_dev_check_d3cold in the > PCI core, which is used to set the bridge_d3 attibute. > Yeah, it is pretty much used internally up until now. But the attribute looks like a close match of what I could find for this usecase and that's why I used it. > So blindly using it in a driver to force a different PM strategy feels > completely wrong. Even if the attrite should have that effect it > needs to happen through a well documented PCI or PM layer helper and > open coded like this. > Ok. I'd like to get some feedback from Bjorn H (PCI maintainer) about using this attribute before moving forward with a helper. Thanks! - Mani
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 4b9fda0b1d9a..a4d4687854bf 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -3287,7 +3287,8 @@ static int nvme_suspend(struct device *dev) */ if (pm_suspend_via_firmware() || !ctrl->npss || !pcie_aspm_enabled(pdev) || - (ndev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND)) + (ndev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND) || + pdev->d3cold_allowed) return nvme_disable_prepare_reset(ndev, true); nvme_start_freeze(ctrl);
PCI core allows users to configure the D3Cold state for each PCI device through the sysfs attribute '/sys/bus/pci/devices/.../d3cold_allowed'. This attribute sets the 'pci_dev:d3cold_allowed' flag and could be used by users to allow/disallow the PCI devices to enter D3Cold during system suspend. So make use of this flag in the NVMe driver to shutdown the NVMe device during system suspend if the user has allowed D3Cold for the device. Existing checks in the NVMe driver decide whether to shut down the device (based on platform/device limitations), so use this flag as the last resort to keep the existing behavior. The default behavior of the 'pci_dev:d3cold_allowed' flag is to allow D3Cold and the users can disallow it through sysfs if they want. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> --- drivers/nvme/host/pci.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)