Message ID | 20240802-pci-bridge-d3-v5-4-2426dd9e8e27@linaro.org (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | PCI: Allow D3Hot for PCI bridges in Devicetree based platforms | expand |
On Fri, Aug 02, 2024 at 11:25:03AM +0530, Manivannan Sadhasivam via B4 Relay wrote: > Unlike ACPI based platforms, there are no known issues with D3Hot for the > PCI bridges in the Devicetree based platforms. So let's allow the PCI > bridges to go to D3Hot during runtime. It should be noted that the bridges > need to be defined in Devicetree for this to work. [...] > + if (state == PCI_D3hot && dev_of_node(&bridge->dev)) > + return true; For such a simple change which several parties are interested in, I think it would be better to move it to the front of the series. Patch [1/4] looks like an optimization (using a cached value) which this patch doesn't depend on. Patch [2/4] looks like a change of bikeshed color which isn't strictly necessary for this fourth patch either. If you want to propose those changes, fine, but by making this fourth patch depend on them, you risk delaying its acceptance. As an upstreaming strategy it's usually smarter to move potentially controversial or unnecessary material to the back of the series, or submit it separately if it can be applied standalone. > Currently, D3Cold is not allowed since Vcc supply which is required for > transitioning the device to D3Cold is not exposed on all Devicetree based > platforms. The PCI core cannot put devices into D3cold without help from the platform. Checking whether D3cold is possible (or allowed or whatever) thus requires asking platform support code via platform_pci_power_manageable(), platform_pci_choose_state() etc. I think patch [3/4] is a little confusing because it creates infrastructure to decide whether D3cold is supported (allowed?) but we already have that in the platform_pci_*() functions. So I'm not sure if patch [3/4] adds value. I think generally speaking if D3hot isn't possible (allowed?), D3cold is assumed to not be possible either. Thanks, Lukas
On Fri, Aug 02, 2024 at 12:13:31PM +0200, Lukas Wunner wrote: > On Fri, Aug 02, 2024 at 11:25:03AM +0530, Manivannan Sadhasivam via B4 Relay wrote: > > Unlike ACPI based platforms, there are no known issues with D3Hot for the > > PCI bridges in the Devicetree based platforms. So let's allow the PCI > > bridges to go to D3Hot during runtime. It should be noted that the bridges > > need to be defined in Devicetree for this to work. > [...] > > + if (state == PCI_D3hot && dev_of_node(&bridge->dev)) > > + return true; > > For such a simple change which several parties are interested in, > I think it would be better to move it to the front of the series. > > Patch [1/4] looks like an optimization (using a cached value) > which this patch doesn't depend on. Patch [2/4] looks like a > change of bikeshed color which isn't strictly necessary for > this fourth patch either. If you want to propose those changes, > fine, but by making this fourth patch depend on them, you risk > delaying its acceptance. As an upstreaming strategy it's usually > smarter to move potentially controversial or unnecessary material > to the back of the series, or submit it separately if it can be > applied standalone. > Agree with you! Even after doing upstreaming for this much time, I tend to ignore this... > > > Currently, D3Cold is not allowed since Vcc supply which is required for > > transitioning the device to D3Cold is not exposed on all Devicetree based > > platforms. > > The PCI core cannot put devices into D3cold without help from the > platform. Checking whether D3cold is possible (or allowed or > whatever) thus requires asking platform support code via > platform_pci_power_manageable(), platform_pci_choose_state() etc. > > I think patch [3/4] is a little confusing because it creates > infrastructure to decide whether D3cold is supported (allowed?) > but we already have that in the platform_pci_*() functions. > So I'm not sure if patch [3/4] adds value. I think generally > speaking if D3hot isn't possible (allowed?), D3cold is assumed > to not be possible either. > Why? D3Hot is useful for runtime PM and if the platform doesn't want to do runtime PM, it can always skip D3Hot (not ideal though). But D3Cold is a power off state, and the platform may choose to use it for the case of system suspend. So I still feel that decoupling D3Hot and D3Cold is necessary. - Mani
On Mon, Aug 05, 2024 at 07:05:55PM +0530, Manivannan Sadhasivam wrote: > On Fri, Aug 02, 2024 at 12:13:31PM +0200, Lukas Wunner wrote: > > The PCI core cannot put devices into D3cold without help from the > > platform. Checking whether D3cold is possible (or allowed or > > whatever) thus requires asking platform support code via > > platform_pci_power_manageable(), platform_pci_choose_state() etc. > > > > I think patch [3/4] is a little confusing because it creates > > infrastructure to decide whether D3cold is supported (allowed?) > > but we already have that in the platform_pci_*() functions. > > So I'm not sure if patch [3/4] adds value. I think generally > > speaking if D3hot isn't possible (allowed?), D3cold is assumed > > to not be possible either. > > Why? D3Hot is useful for runtime PM and if the platform doesn't want to do > runtime PM, it can always skip D3Hot (not ideal though). AFAICS we always program the device to go to D3hot and the platform then cuts power, thereby putting it into D3cold. So D3hot is never skipped. See __pci_set_power_state(): if (state == PCI_D3cold) { /* * To put the device in D3cold, put it into D3hot in the native * way, then put it into D3cold using platform ops. */ error = pci_set_low_power_state(dev, PCI_D3hot, locked); if (pci_platform_power_transition(dev, PCI_D3cold)) return error; Thanks, Lukas
On Tue, Aug 06, 2024 at 08:53:01AM +0200, Lukas Wunner wrote: > On Mon, Aug 05, 2024 at 07:05:55PM +0530, Manivannan Sadhasivam wrote: > > On Fri, Aug 02, 2024 at 12:13:31PM +0200, Lukas Wunner wrote: > > > The PCI core cannot put devices into D3cold without help from the > > > platform. Checking whether D3cold is possible (or allowed or > > > whatever) thus requires asking platform support code via > > > platform_pci_power_manageable(), platform_pci_choose_state() etc. > > > > > > I think patch [3/4] is a little confusing because it creates > > > infrastructure to decide whether D3cold is supported (allowed?) > > > but we already have that in the platform_pci_*() functions. > > > So I'm not sure if patch [3/4] adds value. I think generally > > > speaking if D3hot isn't possible (allowed?), D3cold is assumed > > > to not be possible either. > > > > Why? D3Hot is useful for runtime PM and if the platform doesn't want to do > > runtime PM, it can always skip D3Hot (not ideal though). > > AFAICS we always program the device to go to D3hot and the platform > then cuts power, thereby putting it into D3cold. So D3hot is never > skipped. See __pci_set_power_state(): > > if (state == PCI_D3cold) { > /* > * To put the device in D3cold, put it into D3hot in the native > * way, then put it into D3cold using platform ops. > */ > error = pci_set_low_power_state(dev, PCI_D3hot, locked); > > if (pci_platform_power_transition(dev, PCI_D3cold)) > return error; > This is applicable only to pci_set_power_state(), but AFAIK PCIe spec doesn't mandate switching to D3Hot for entering D3Cold. So the PCIe host controller drivers (especically non-ACPI platforms) may just send PME_Turn_Off followed by removing the slot power (which again is not controlled by pci_set_power_state()) as there are no non-ACPI related hooks as of now. - Mani
On Tue, Aug 06, 2024 at 06:11:07PM +0530, Manivannan Sadhasivam wrote: > On Tue, Aug 06, 2024 at 08:53:01AM +0200, Lukas Wunner wrote: > > AFAICS we always program the device to go to D3hot and the platform > > then cuts power, thereby putting it into D3cold. So D3hot is never > > skipped. See __pci_set_power_state(): > > > > if (state == PCI_D3cold) { > > /* > > * To put the device in D3cold, put it into D3hot in the native > > * way, then put it into D3cold using platform ops. > > */ > > error = pci_set_low_power_state(dev, PCI_D3hot, locked); > > > > if (pci_platform_power_transition(dev, PCI_D3cold)) > > return error; > > > > This is applicable only to pci_set_power_state(), but AFAIK PCIe spec > doesn't mandate switching to D3Hot for entering D3Cold. Per PCI Bus Power Management Interface Specification r1.2 sec 5.5 fig 5-1, the only supported state transition to D3cold is from D3hot. Per PCIe r6.2 sec 5.2, "PM is compatible with the PCI Bus Power Management Interface Specification". Granted, PCI-PM is an ancient spec, so I think anyone can be forgiven for not knowing its intricacies off-the-cuff. :) > So the PCIe host controller drivers (especically non-ACPI platforms) > may just send PME_Turn_Off followed by removing the slot power > (which again is not controlled by pci_set_power_state()) > as there are no non-ACPI related hooks as of now. Ideally, devicetree-based platforms should be brought into the platform_pci_*() fold to align them with ACPI and get common behavior across all platforms. Thanks, Lukas
On Tue, Aug 06, 2024 at 03:02:39PM +0200, Lukas Wunner wrote: > On Tue, Aug 06, 2024 at 06:11:07PM +0530, Manivannan Sadhasivam wrote: > > On Tue, Aug 06, 2024 at 08:53:01AM +0200, Lukas Wunner wrote: > > > AFAICS we always program the device to go to D3hot and the platform > > > then cuts power, thereby putting it into D3cold. So D3hot is never > > > skipped. See __pci_set_power_state(): > > > > > > if (state == PCI_D3cold) { > > > /* > > > * To put the device in D3cold, put it into D3hot in the native > > > * way, then put it into D3cold using platform ops. > > > */ > > > error = pci_set_low_power_state(dev, PCI_D3hot, locked); > > > > > > if (pci_platform_power_transition(dev, PCI_D3cold)) > > > return error; > > > > > > > This is applicable only to pci_set_power_state(), but AFAIK PCIe spec > > doesn't mandate switching to D3Hot for entering D3Cold. > > Per PCI Bus Power Management Interface Specification r1.2 sec 5.5 fig 5-1, > the only supported state transition to D3cold is from D3hot. > > Per PCIe r6.2 sec 5.2, "PM is compatible with the PCI Bus Power Management > Interface Specification". > > Granted, PCI-PM is an ancient spec, so I think anyone can be forgiven > for not knowing its intricacies off-the-cuff. :) > Ah, the grand old PCI-PM... I don't remember the last time I looked into it :) > > > So the PCIe host controller drivers (especically non-ACPI platforms) > > may just send PME_Turn_Off followed by removing the slot power > > (which again is not controlled by pci_set_power_state()) > > as there are no non-ACPI related hooks as of now. > > Ideally, devicetree-based platforms should be brought into the > platform_pci_*() fold to align them with ACPI and get common > behavior across all platforms. > Yeah, that would be the ideal case. Unfortunately, there is no ideal ground for DT :/ We do not even have the supplies populated properly. But with the advent of power sequencing framework, I think this can be fixed. Regarding your comment on patch 3/4, we already have the sysfs attribute to control whether the device can be put into D3Cold or not and that is directly coming from userspace. So there is no guarantee to assume that D3Hot support is considered. - Mani
On Tue, Aug 06, 2024 at 08:09:18PM +0530, Manivannan Sadhasivam wrote: > Regarding your comment on patch 3/4, we already have the sysfs attribute > to control whether the device can be put into D3Cold or not and that is > directly coming from userspace. So there is no guarantee to assume that > D3Hot support is considered. If a device is not allowed to be suspended to D3cold, it will only be suspended to D3hot. If a port is put into anything deeper than D0, its secondary bus is no longer in B0 (PCI-PM r1.2 table 6-1) and children are inaccessible, so they're "effectively" in D3cold. Hence if a device cannot be suspended to D3cold, it will block all parent bridges from being suspended. That's what the logic in pci_bridge_d3_update() and pci_dev_check_d3cold() is doing. The d3cold_allowed attribute in sysfs is just one of several reasons why a device may not go to D3cold (see pci_dev_check_d3cold() for details). The d3cold_allowed attribute was originally intended to disable D3cold on devices where it was known to not work. Nowadays this should all be handled automatically, which is why we've discussed moving the attribute to debugfs: https://lore.kernel.org/all/20230918132424.GA11357@wunner.de/ https://lore.kernel.org/all/20231002181025.82746-1-mario.limonciello@amd.com/ Thanks, Lukas
On Tue, Aug 6, 2024 at 7:39 AM Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote: > > On Tue, Aug 06, 2024 at 03:02:39PM +0200, Lukas Wunner wrote: > > On Tue, Aug 06, 2024 at 06:11:07PM +0530, Manivannan Sadhasivam wrote: > > > On Tue, Aug 06, 2024 at 08:53:01AM +0200, Lukas Wunner wrote: > > > > AFAICS we always program the device to go to D3hot and the platform > > > > then cuts power, thereby putting it into D3cold. So D3hot is never > > > > skipped. See __pci_set_power_state(): > > > > > > > > if (state == PCI_D3cold) { > > > > /* > > > > * To put the device in D3cold, put it into D3hot in the native > > > > * way, then put it into D3cold using platform ops. > > > > */ > > > > error = pci_set_low_power_state(dev, PCI_D3hot, locked); > > > > > > > > if (pci_platform_power_transition(dev, PCI_D3cold)) > > > > return error; > > > > > > > > > > This is applicable only to pci_set_power_state(), but AFAIK PCIe spec > > > doesn't mandate switching to D3Hot for entering D3Cold. > > > > Per PCI Bus Power Management Interface Specification r1.2 sec 5.5 fig 5-1, > > the only supported state transition to D3cold is from D3hot. > > > > Per PCIe r6.2 sec 5.2, "PM is compatible with the PCI Bus Power Management > > Interface Specification". > > > > Granted, PCI-PM is an ancient spec, so I think anyone can be forgiven > > for not knowing its intricacies off-the-cuff. :) > > > > Ah, the grand old PCI-PM... I don't remember the last time I looked into it :) > > > > > > So the PCIe host controller drivers (especically non-ACPI platforms) > > > may just send PME_Turn_Off followed by removing the slot power > > > (which again is not controlled by pci_set_power_state()) > > > as there are no non-ACPI related hooks as of now. > > > > Ideally, devicetree-based platforms should be brought into the > > platform_pci_*() fold to align them with ACPI and get common > > behavior across all platforms. > > > > Yeah, that would be the ideal case. Unfortunately, there is no ideal ground for > DT :/ We do not even have the supplies populated properly. But with the advent > of power sequencing framework, I think this can be fixed. > Looking in acpi_pci_bridge_d3(), it has several checkings about whether d3 is supported, including reading power_manageable flag (acpi_device_power_manageable) and reading the root port property. For DT, does it make sense to have a chosen property about this? > Regarding your comment on patch 3/4, we already have the sysfs attribute to > control whether the device can be put into D3Cold or not and that is directly > coming from userspace. So there is no guarantee to assume that D3Hot support is > considered. > > - Mani > > -- > மணிவண்ணன் சதாசிவம்
On Tue, Aug 06, 2024 at 10:20:39PM +0200, Lukas Wunner wrote: > On Tue, Aug 06, 2024 at 08:09:18PM +0530, Manivannan Sadhasivam wrote: > > Regarding your comment on patch 3/4, we already have the sysfs attribute > > to control whether the device can be put into D3Cold or not and that is > > directly coming from userspace. So there is no guarantee to assume that > > D3Hot support is considered. > > If a device is not allowed to be suspended to D3cold, it will only be > suspended to D3hot. > > If a port is put into anything deeper than D0, its secondary bus is > no longer in B0 (PCI-PM r1.2 table 6-1) and children are inaccessible, > so they're "effectively" in D3cold. Hence if a device cannot be > suspended to D3cold, it will block all parent bridges from being > suspended. That's what the logic in pci_bridge_d3_update() and > pci_dev_check_d3cold() is doing. > Agree. But patch 3/4 is mostly based on the suggestion from Bjorn [1] for earlier revision. He specifically mentioned that the platform_pci_bridge_d3() function doesn't differentiate between D3Hot and D3Cold and that's why I splitted them: "These are two vastly different scenarios, and I would really like to untangle them so they aren't conflated. I see that you're extending platform_pci_bridge_d3(), which apparently has that conflation baked into it already, but my personal experience is that this is really hard to maintain." I agree with your point that if D3Hot is not possible, then D3Cold is also not possible as per the PCI PM reference you quoted. But here, D3Hot is possible, but D3Cold is not. And platform_pci_power_manageable(), platform_pci_choose_state() are already returning false for DT platforms. So if 'true' is returned from platform_pci_bridge_d3(), then it implies that D3Cold is also supported, but it doesn't on DT platforms. So a split seems to be necessary IMO. - Mani [1] https://lore.kernel.org/linux-pci/20240221182000.GA1533634@bhelgaas/ > The d3cold_allowed attribute in sysfs is just one of several reasons > why a device may not go to D3cold (see pci_dev_check_d3cold() for > details). > > The d3cold_allowed attribute was originally intended to disable D3cold > on devices where it was known to not work. Nowadays this should all > be handled automatically, which is why we've discussed moving the > attribute to debugfs: > > https://lore.kernel.org/all/20230918132424.GA11357@wunner.de/ > https://lore.kernel.org/all/20231002181025.82746-1-mario.limonciello@amd.com/ > > Thanks, > > Lukas
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index c7a4f961ec28..bc1e1ca673f1 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -2992,6 +2992,18 @@ static bool pci_bridge_d3_allowed(struct pci_dev *bridge, pci_power_t state) if (pci_bridge_d3_force) return true; + /* + * Allow D3Hot for all Devicetree based platforms having a + * separate node for the bridge. We don't allow D3Cold for now + * since not all platforms are exposing the Vcc supply in + * Devicetree which is required for transitioning the bridge to + * D3Cold. + * + * NOTE: The bridge is expected to be defined in Devicetree. + */ + if (state == PCI_D3hot && dev_of_node(&bridge->dev)) + return true; + /* Even the oldest 2010 Thunderbolt controller supports D3. */ if (bridge->is_thunderbolt) return true; @@ -3042,7 +3054,7 @@ bool pci_bridge_d3cold_allowed(struct pci_dev *bridge) * * This function checks if the bridge is allowed to move to D3Hot. * Currently we only allow D3Hot for recent enough PCIe ports on ACPI based - * platforms and Thunderbolt. + * platforms, Thunderbolt and Devicetree based platforms. */ bool pci_bridge_d3hot_allowed(struct pci_dev *bridge) {