diff mbox series

[v4] PCI / ACPI: Assume `HotPlugSupportInD3` only if device can wake from D3

Message ID 20220315153252.4880-1-mario.limonciello@amd.com (mailing list archive)
State Superseded
Headers show
Series [v4] PCI / ACPI: Assume `HotPlugSupportInD3` only if device can wake from D3 | expand

Commit Message

Mario Limonciello March 15, 2022, 3:32 p.m. UTC
According to the Microsoft spec the _DSD `HotPlugSupportInD3` is
indicates the ability for a bridge to be able to wakeup from D3:

  This ACPI object [HotPlugSupportInD3] enables the operating system
  to identify and power manage PCIe Root Ports that are capable of
  handling hot plug events while in D3 state.

This however is static information in the ACPI table at BIOS compilation
time and on some platforms it's possible to configure the firmware at boot
up such that _S0W returns "0" indicating the inability to wake up the
device from D3 as explained in the ACPI specification:

  7.3.20 _S0W (S0 Device Wake State)

  This object evaluates to an integer that conveys to OSPM the deepest
  D-state supported by this device in the S0 system sleeping state
  where the device can wake itself.

This mismatch may lead to being unable to enumerate devices behind the
hotplug bridge when a device is plugged in. To remedy these situations
that `HotPlugSupportInD3` is specified by _S0W returns 0, explicitly
check that the ACPI companion has returned _S0W greater than or equal
to 3 and the device has a GPE allowing the device to generate wakeup
signals handled by the platform in `acpi_pci_bridge_d3`.

Windows 10 and Windows 11 both will prevent the bridge from going in D3
when the firmware is configured this way and this changes aligns the
handling of the situation to be the same.

Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/07_Power_and_Performance_Mgmt/device-power-management-objects.html?highlight=s0w#s0w-s0-device-wake-state
Link: https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-pcie-root-ports-supporting-hot-plug-in-d3
Fixes: 26ad34d510a87 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
changes from v3->v4:
 * rework comment
 * only evaluate _S0W if necessary
 * drop static function with only one caller

 drivers/pci/pci-acpi.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

Comments

Rafael J. Wysocki March 15, 2022, 4:35 p.m. UTC | #1
On Tue, Mar 15, 2022 at 4:33 PM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> According to the Microsoft spec the _DSD `HotPlugSupportInD3` is
> indicates the ability for a bridge to be able to wakeup from D3:
>
>   This ACPI object [HotPlugSupportInD3] enables the operating system
>   to identify and power manage PCIe Root Ports that are capable of
>   handling hot plug events while in D3 state.
>
> This however is static information in the ACPI table at BIOS compilation
> time and on some platforms it's possible to configure the firmware at boot
> up such that _S0W returns "0" indicating the inability to wake up the
> device from D3 as explained in the ACPI specification:
>
>   7.3.20 _S0W (S0 Device Wake State)
>
>   This object evaluates to an integer that conveys to OSPM the deepest
>   D-state supported by this device in the S0 system sleeping state
>   where the device can wake itself.
>
> This mismatch may lead to being unable to enumerate devices behind the
> hotplug bridge when a device is plugged in. To remedy these situations
> that `HotPlugSupportInD3` is specified by _S0W returns 0, explicitly
> check that the ACPI companion has returned _S0W greater than or equal
> to 3 and the device has a GPE allowing the device to generate wakeup
> signals handled by the platform in `acpi_pci_bridge_d3`.
>
> Windows 10 and Windows 11 both will prevent the bridge from going in D3
> when the firmware is configured this way and this changes aligns the
> handling of the situation to be the same.
>
> Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/07_Power_and_Performance_Mgmt/device-power-management-objects.html?highlight=s0w#s0w-s0-device-wake-state
> Link: https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-pcie-root-ports-supporting-hot-plug-in-d3
> Fixes: 26ad34d510a87 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports")
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

No more comments from me:

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
> changes from v3->v4:
>  * rework comment
>  * only evaluate _S0W if necessary
>  * drop static function with only one caller
>
>  drivers/pci/pci-acpi.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index a42dbf448860..e535dab2c888 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -977,6 +977,7 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>         const union acpi_object *obj;
>         struct acpi_device *adev;
>         struct pci_dev *rpdev;
> +       unsigned long long ret;
>
>         if (acpi_pci_disabled || !dev->is_hotplug_bridge)
>                 return false;
> @@ -1003,7 +1004,21 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>                                    ACPI_TYPE_INTEGER, &obj) < 0)
>                 return false;
>
> -       return obj->integer.value == 1;
> +       if (!obj->integer.value)
> +               return false;
> +
> +       /*
> +        * If 'HotPlugSupportInD3' is set, but wakeup is not actually supported,
> +        * the former cannot be trusted anyway, so validate it by verifying the
> +        * latter.
> +        */
> +       if (!adev->wakeup.flags.valid)
> +               return false;
> +
> +       if (ACPI_FAILURE(acpi_evaluate_integer(adev->handle, "_S0W", NULL, &ret)))
> +               return false;
> +
> +       return ret >= ACPI_STATE_D3_HOT;
>  }
>
>  int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
> --
> 2.34.1
>
Rafael J. Wysocki March 15, 2022, 4:36 p.m. UTC | #2
On Tue, Mar 15, 2022 at 5:35 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Mar 15, 2022 at 4:33 PM Mario Limonciello
> <mario.limonciello@amd.com> wrote:
> >
> > According to the Microsoft spec the _DSD `HotPlugSupportInD3` is
> > indicates the ability for a bridge to be able to wakeup from D3:
> >
> >   This ACPI object [HotPlugSupportInD3] enables the operating system
> >   to identify and power manage PCIe Root Ports that are capable of
> >   handling hot plug events while in D3 state.
> >
> > This however is static information in the ACPI table at BIOS compilation
> > time and on some platforms it's possible to configure the firmware at boot
> > up such that _S0W returns "0" indicating the inability to wake up the
> > device from D3 as explained in the ACPI specification:
> >
> >   7.3.20 _S0W (S0 Device Wake State)
> >
> >   This object evaluates to an integer that conveys to OSPM the deepest
> >   D-state supported by this device in the S0 system sleeping state
> >   where the device can wake itself.
> >
> > This mismatch may lead to being unable to enumerate devices behind the
> > hotplug bridge when a device is plugged in. To remedy these situations
> > that `HotPlugSupportInD3` is specified by _S0W returns 0, explicitly
> > check that the ACPI companion has returned _S0W greater than or equal
> > to 3 and the device has a GPE allowing the device to generate wakeup
> > signals handled by the platform in `acpi_pci_bridge_d3`.
> >
> > Windows 10 and Windows 11 both will prevent the bridge from going in D3
> > when the firmware is configured this way and this changes aligns the
> > handling of the situation to be the same.
> >
> > Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/07_Power_and_Performance_Mgmt/device-power-management-objects.html?highlight=s0w#s0w-s0-device-wake-state
> > Link: https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-pcie-root-ports-supporting-hot-plug-in-d3
> > Fixes: 26ad34d510a87 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports")
> > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>
> No more comments from me:
>
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Or please let me know if I should pick it up.

> > ---
> > changes from v3->v4:
> >  * rework comment
> >  * only evaluate _S0W if necessary
> >  * drop static function with only one caller
> >
> >  drivers/pci/pci-acpi.c | 17 ++++++++++++++++-
> >  1 file changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> > index a42dbf448860..e535dab2c888 100644
> > --- a/drivers/pci/pci-acpi.c
> > +++ b/drivers/pci/pci-acpi.c
> > @@ -977,6 +977,7 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
> >         const union acpi_object *obj;
> >         struct acpi_device *adev;
> >         struct pci_dev *rpdev;
> > +       unsigned long long ret;
> >
> >         if (acpi_pci_disabled || !dev->is_hotplug_bridge)
> >                 return false;
> > @@ -1003,7 +1004,21 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
> >                                    ACPI_TYPE_INTEGER, &obj) < 0)
> >                 return false;
> >
> > -       return obj->integer.value == 1;
> > +       if (!obj->integer.value)
> > +               return false;
> > +
> > +       /*
> > +        * If 'HotPlugSupportInD3' is set, but wakeup is not actually supported,
> > +        * the former cannot be trusted anyway, so validate it by verifying the
> > +        * latter.
> > +        */
> > +       if (!adev->wakeup.flags.valid)
> > +               return false;
> > +
> > +       if (ACPI_FAILURE(acpi_evaluate_integer(adev->handle, "_S0W", NULL, &ret)))
> > +               return false;
> > +
> > +       return ret >= ACPI_STATE_D3_HOT;
> >  }
> >
> >  int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
> > --
> > 2.34.1
> >
Mario Limonciello March 23, 2022, 4:01 p.m. UTC | #3
On 3/15/22 11:36, Rafael J. Wysocki wrote:
> On Tue, Mar 15, 2022 at 5:35 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>
>> On Tue, Mar 15, 2022 at 4:33 PM Mario Limonciello
>> <mario.limonciello@amd.com> wrote:
>>>
>>> According to the Microsoft spec the _DSD `HotPlugSupportInD3` is
>>> indicates the ability for a bridge to be able to wakeup from D3:
>>>
>>>    This ACPI object [HotPlugSupportInD3] enables the operating system
>>>    to identify and power manage PCIe Root Ports that are capable of
>>>    handling hot plug events while in D3 state.
>>>
>>> This however is static information in the ACPI table at BIOS compilation
>>> time and on some platforms it's possible to configure the firmware at boot
>>> up such that _S0W returns "0" indicating the inability to wake up the
>>> device from D3 as explained in the ACPI specification:
>>>
>>>    7.3.20 _S0W (S0 Device Wake State)
>>>
>>>    This object evaluates to an integer that conveys to OSPM the deepest
>>>    D-state supported by this device in the S0 system sleeping state
>>>    where the device can wake itself.
>>>
>>> This mismatch may lead to being unable to enumerate devices behind the
>>> hotplug bridge when a device is plugged in. To remedy these situations
>>> that `HotPlugSupportInD3` is specified by _S0W returns 0, explicitly
>>> check that the ACPI companion has returned _S0W greater than or equal
>>> to 3 and the device has a GPE allowing the device to generate wakeup
>>> signals handled by the platform in `acpi_pci_bridge_d3`.
>>>
>>> Windows 10 and Windows 11 both will prevent the bridge from going in D3
>>> when the firmware is configured this way and this changes aligns the
>>> handling of the situation to be the same.
>>>
>>> Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fuefi.org%2Fhtmlspecs%2FACPI_Spec_6_4_html%2F07_Power_and_Performance_Mgmt%2Fdevice-power-management-objects.html%3Fhighlight%3Ds0w%23s0w-s0-device-wake-state&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7Cae5d5d8f4ac1452dac0c08da06a200b2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637829590116901616%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=3cTSyBgSEOBo5mPeUcNDrc9qhdMJeZ4cVtLqroDRZqY%3D&amp;reserved=0
>>> Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fwindows-hardware%2Fdrivers%2Fpci%2Fdsd-for-pcie-root-ports%23identifying-pcie-root-ports-supporting-hot-plug-in-d3&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7Cae5d5d8f4ac1452dac0c08da06a200b2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637829590116901616%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=rZHk%2FgII0qOvr9cmmB8N1auNc1nLSFViVTG9f%2FwEREY%3D&amp;reserved=0
>>> Fixes: 26ad34d510a87 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports")
>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>
>> No more comments from me:
>>
>> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Or please let me know if I should pick it up.

Bjorn,

Friendly reminder on this one.

> 
>>> ---
>>> changes from v3->v4:
>>>   * rework comment
>>>   * only evaluate _S0W if necessary
>>>   * drop static function with only one caller
>>>
>>>   drivers/pci/pci-acpi.c | 17 ++++++++++++++++-
>>>   1 file changed, 16 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
>>> index a42dbf448860..e535dab2c888 100644
>>> --- a/drivers/pci/pci-acpi.c
>>> +++ b/drivers/pci/pci-acpi.c
>>> @@ -977,6 +977,7 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>>>          const union acpi_object *obj;
>>>          struct acpi_device *adev;
>>>          struct pci_dev *rpdev;
>>> +       unsigned long long ret;
>>>
>>>          if (acpi_pci_disabled || !dev->is_hotplug_bridge)
>>>                  return false;
>>> @@ -1003,7 +1004,21 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>>>                                     ACPI_TYPE_INTEGER, &obj) < 0)
>>>                  return false;
>>>
>>> -       return obj->integer.value == 1;
>>> +       if (!obj->integer.value)
>>> +               return false;
>>> +
>>> +       /*
>>> +        * If 'HotPlugSupportInD3' is set, but wakeup is not actually supported,
>>> +        * the former cannot be trusted anyway, so validate it by verifying the
>>> +        * latter.
>>> +        */
>>> +       if (!adev->wakeup.flags.valid)
>>> +               return false;
>>> +
>>> +       if (ACPI_FAILURE(acpi_evaluate_integer(adev->handle, "_S0W", NULL, &ret)))
>>> +               return false;
>>> +
>>> +       return ret >= ACPI_STATE_D3_HOT;
>>>   }
>>>
>>>   int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>>> --
>>> 2.34.1
>>>
Bjorn Helgaas March 23, 2022, 8:52 p.m. UTC | #4
On Wed, Mar 23, 2022 at 11:01:05AM -0500, Mario Limonciello wrote:
> On 3/15/22 11:36, Rafael J. Wysocki wrote:
> > On Tue, Mar 15, 2022 at 5:35 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > 
> > > On Tue, Mar 15, 2022 at 4:33 PM Mario Limonciello
> > > <mario.limonciello@amd.com> wrote:
> > > > 
> > > > According to the Microsoft spec the _DSD `HotPlugSupportInD3` is
> > > > indicates the ability for a bridge to be able to wakeup from D3:
> > > > 
> > > >    This ACPI object [HotPlugSupportInD3] enables the operating system
> > > >    to identify and power manage PCIe Root Ports that are capable of
> > > >    handling hot plug events while in D3 state.
> > > > 
> > > > This however is static information in the ACPI table at BIOS compilation
> > > > time and on some platforms it's possible to configure the firmware at boot
> > > > up such that _S0W returns "0" indicating the inability to wake up the
> > > > device from D3 as explained in the ACPI specification:
> > > > 
> > > >    7.3.20 _S0W (S0 Device Wake State)
> > > > 
> > > >    This object evaluates to an integer that conveys to OSPM the deepest
> > > >    D-state supported by this device in the S0 system sleeping state
> > > >    where the device can wake itself.
> > > > 
> > > > This mismatch may lead to being unable to enumerate devices behind the
> > > > hotplug bridge when a device is plugged in. To remedy these situations
> > > > that `HotPlugSupportInD3` is specified by _S0W returns 0, explicitly
> > > > check that the ACPI companion has returned _S0W greater than or equal
> > > > to 3 and the device has a GPE allowing the device to generate wakeup
> > > > signals handled by the platform in `acpi_pci_bridge_d3`.
> > > > 
> > > > Windows 10 and Windows 11 both will prevent the bridge from going in D3
> > > > when the firmware is configured this way and this changes aligns the
> > > > handling of the situation to be the same.
> > > > 
> > > > Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fuefi.org%2Fhtmlspecs%2FACPI_Spec_6_4_html%2F07_Power_and_Performance_Mgmt%2Fdevice-power-management-objects.html%3Fhighlight%3Ds0w%23s0w-s0-device-wake-state&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7Cae5d5d8f4ac1452dac0c08da06a200b2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637829590116901616%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=3cTSyBgSEOBo5mPeUcNDrc9qhdMJeZ4cVtLqroDRZqY%3D&amp;reserved=0
> > > > Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fwindows-hardware%2Fdrivers%2Fpci%2Fdsd-for-pcie-root-ports%23identifying-pcie-root-ports-supporting-hot-plug-in-d3&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7Cae5d5d8f4ac1452dac0c08da06a200b2%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637829590116901616%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=rZHk%2FgII0qOvr9cmmB8N1auNc1nLSFViVTG9f%2FwEREY%3D&amp;reserved=0
> > > > Fixes: 26ad34d510a87 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports")
> > > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > > 
> > > No more comments from me:
> > > 
> > > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Or please let me know if I should pick it up.
> 
> Bjorn,
> 
> Friendly reminder on this one.

Thanks; we're in the middle of the merge window now, so I'll wait till
that's over unless this is an emergency.

IIUC the bug this fixes is that "when a bridge is in D3cold, we don't
notice when a device is hot-added below it," right?  So we need to
avoid putting the bridge in D3cold?

Is there a typical scenario where this bites users?  I don't think we
ever saw an actual problem report?

> > > > ---
> > > > changes from v3->v4:
> > > >   * rework comment
> > > >   * only evaluate _S0W if necessary
> > > >   * drop static function with only one caller
> > > > 
> > > >   drivers/pci/pci-acpi.c | 17 ++++++++++++++++-
> > > >   1 file changed, 16 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> > > > index a42dbf448860..e535dab2c888 100644
> > > > --- a/drivers/pci/pci-acpi.c
> > > > +++ b/drivers/pci/pci-acpi.c
> > > > @@ -977,6 +977,7 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
> > > >          const union acpi_object *obj;
> > > >          struct acpi_device *adev;
> > > >          struct pci_dev *rpdev;
> > > > +       unsigned long long ret;
> > > > 
> > > >          if (acpi_pci_disabled || !dev->is_hotplug_bridge)
> > > >                  return false;
> > > > @@ -1003,7 +1004,21 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
> > > >                                     ACPI_TYPE_INTEGER, &obj) < 0)
> > > >                  return false;
> > > > 
> > > > -       return obj->integer.value == 1;
> > > > +       if (!obj->integer.value)
> > > > +               return false;
> > > > +
> > > > +       /*
> > > > +        * If 'HotPlugSupportInD3' is set, but wakeup is not actually supported,
> > > > +        * the former cannot be trusted anyway, so validate it by verifying the
> > > > +        * latter.
> > > > +        */
> > > > +       if (!adev->wakeup.flags.valid)
> > > > +               return false;
> > > > +
> > > > +       if (ACPI_FAILURE(acpi_evaluate_integer(adev->handle, "_S0W", NULL, &ret)))
> > > > +               return false;
> > > > +
> > > > +       return ret >= ACPI_STATE_D3_HOT;
> > > >   }
> > > > 
> > > >   int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
> > > > --
> > > > 2.34.1
> > > > 
>
Mario Limonciello March 23, 2022, 9:26 p.m. UTC | #5
[Public]

> 
> On Wed, Mar 23, 2022 at 11:01:05AM -0500, Mario Limonciello wrote:
> > On 3/15/22 11:36, Rafael J. Wysocki wrote:
> > > On Tue, Mar 15, 2022 at 5:35 PM Rafael J. Wysocki <rafael@kernel.org>
> wrote:
> > > >
> > > > On Tue, Mar 15, 2022 at 4:33 PM Mario Limonciello
> > > > <mario.limonciello@amd.com> wrote:
> > > > >
> > > > > According to the Microsoft spec the _DSD `HotPlugSupportInD3` is
> > > > > indicates the ability for a bridge to be able to wakeup from D3:
> > > > >
> > > > >    This ACPI object [HotPlugSupportInD3] enables the operating system
> > > > >    to identify and power manage PCIe Root Ports that are capable of
> > > > >    handling hot plug events while in D3 state.
> > > > >
> > > > > This however is static information in the ACPI table at BIOS compilation
> > > > > time and on some platforms it's possible to configure the firmware at
> boot
> > > > > up such that _S0W returns "0" indicating the inability to wake up the
> > > > > device from D3 as explained in the ACPI specification:
> > > > >
> > > > >    7.3.20 _S0W (S0 Device Wake State)
> > > > >
> > > > >    This object evaluates to an integer that conveys to OSPM the deepest
> > > > >    D-state supported by this device in the S0 system sleeping state
> > > > >    where the device can wake itself.
> > > > >
> > > > > This mismatch may lead to being unable to enumerate devices behind the
> > > > > hotplug bridge when a device is plugged in. To remedy these situations
> > > > > that `HotPlugSupportInD3` is specified by _S0W returns 0, explicitly
> > > > > check that the ACPI companion has returned _S0W greater than or equal
> > > > > to 3 and the device has a GPE allowing the device to generate wakeup
> > > > > signals handled by the platform in `acpi_pci_bridge_d3`.
> > > > >
> > > > > Windows 10 and Windows 11 both will prevent the bridge from going in
> D3
> > > > > when the firmware is configured this way and this changes aligns the
> > > > > handling of the situation to be the same.
> > > > >
> > > > > Link:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fuefi.org
> %2Fhtmlspecs%2FACPI_Spec_6_4_html%2F07_Power_and_Performance_Mgmt
> %2Fdevice-power-management-objects.html%3Fhighlight%3Ds0w%23s0w-s0-
> device-wake-
> state&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7C1f96c7aa37a
> c47640d6208da0d0efd67%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0
> %7C637836655281536690%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw
> MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata
> =Cw1OTYiX9BD3gh8eN3Zyz6%2FK8YFgqbn6bgi9%2FjNsnrM%3D&amp;reserved
> =0
> > > > > Link:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.mi
> crosoft.com%2Fen-us%2Fwindows-hardware%2Fdrivers%2Fpci%2Fdsd-for-pcie-
> root-ports%23identifying-pcie-root-ports-supporting-hot-plug-in-
> d3&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7C1f96c7aa37ac4
> 7640d6208da0d0efd67%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%
> 7C637836655281536690%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwM
> DAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=
> UkB5lmz1QBUzWVM6%2BuNzJsleP%2Fi%2BDCJJuSgINiNbymo%3D&amp;reserv
> ed=0
> > > > > Fixes: 26ad34d510a87 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug
> ports")
> > > > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > > >
> > > > No more comments from me:
> > > >
> > > > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > Or please let me know if I should pick it up.
> >
> > Bjorn,
> >
> > Friendly reminder on this one.
> 
> Thanks; we're in the middle of the merge window now, so I'll wait till
> that's over unless this is an emergency.

Actually it's a pretty important problem.  We waffled on the nuts and
bolts of this commit during the 5.17 RC's, but I didn't think we would
make it out to the merge window before it got picked.  I guess I should
have spoke up on the urgency earlier.

> 
> IIUC the bug this fixes is that "when a bridge is in D3cold, we don't
> notice when a device is hot-added below it," right?  So we need to
> avoid putting the bridge in D3cold?

When the ASL has been configured this way (to return 0 for _S0W) the
lower level hardware implementation will lead to hotplugged devices
not being detected.
Without this commit the hardware will expect to be in D0, but the kernel
will select D3hot.  So yes; the outcome is that hotplugged PCIe devices
don't work.

> 
> Is there a typical scenario where this bites users?  I don't think we
> ever saw an actual problem report?

This is the common way that these systems are being shipped.  I have
plenty of private reports related to this, but nothing public to link to.

FYI: the earlier version of this was:
https://patchwork.kernel.org/project/linux-usb/patch/1646658319-59532-1-git-send-email-Sanju.Mehta@amd.com/
This is basically to intentionally pull the device out of D3hot when a
tunnel is created.

> 
> > > > > ---
> > > > > changes from v3->v4:
> > > > >   * rework comment
> > > > >   * only evaluate _S0W if necessary
> > > > >   * drop static function with only one caller
> > > > >
> > > > >   drivers/pci/pci-acpi.c | 17 ++++++++++++++++-
> > > > >   1 file changed, 16 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> > > > > index a42dbf448860..e535dab2c888 100644
> > > > > --- a/drivers/pci/pci-acpi.c
> > > > > +++ b/drivers/pci/pci-acpi.c
> > > > > @@ -977,6 +977,7 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
> > > > >          const union acpi_object *obj;
> > > > >          struct acpi_device *adev;
> > > > >          struct pci_dev *rpdev;
> > > > > +       unsigned long long ret;
> > > > >
> > > > >          if (acpi_pci_disabled || !dev->is_hotplug_bridge)
> > > > >                  return false;
> > > > > @@ -1003,7 +1004,21 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
> > > > >                                     ACPI_TYPE_INTEGER, &obj) < 0)
> > > > >                  return false;
> > > > >
> > > > > -       return obj->integer.value == 1;
> > > > > +       if (!obj->integer.value)
> > > > > +               return false;
> > > > > +
> > > > > +       /*
> > > > > +        * If 'HotPlugSupportInD3' is set, but wakeup is not actually
> supported,
> > > > > +        * the former cannot be trusted anyway, so validate it by verifying
> the
> > > > > +        * latter.
> > > > > +        */
> > > > > +       if (!adev->wakeup.flags.valid)
> > > > > +               return false;
> > > > > +
> > > > > +       if (ACPI_FAILURE(acpi_evaluate_integer(adev->handle, "_S0W",
> NULL, &ret)))
> > > > > +               return false;
> > > > > +
> > > > > +       return ret >= ACPI_STATE_D3_HOT;
> > > > >   }
> > > > >
> > > > >   int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
> > > > > --
> > > > > 2.34.1
> > > > >
> >
Bjorn Helgaas March 23, 2022, 9:42 p.m. UTC | #6
On Wed, Mar 23, 2022 at 09:26:15PM +0000, Limonciello, Mario wrote:
> [Public]
> 
> > 
> > On Wed, Mar 23, 2022 at 11:01:05AM -0500, Mario Limonciello wrote:
> > > On 3/15/22 11:36, Rafael J. Wysocki wrote:
> > > > On Tue, Mar 15, 2022 at 5:35 PM Rafael J. Wysocki <rafael@kernel.org>
> > wrote:
> > > > >
> > > > > On Tue, Mar 15, 2022 at 4:33 PM Mario Limonciello
> > > > > <mario.limonciello@amd.com> wrote:
> > > > > >
> > > > > > According to the Microsoft spec the _DSD `HotPlugSupportInD3` is
> > > > > > indicates the ability for a bridge to be able to wakeup from D3:
> > > > > >
> > > > > >    This ACPI object [HotPlugSupportInD3] enables the operating system
> > > > > >    to identify and power manage PCIe Root Ports that are capable of
> > > > > >    handling hot plug events while in D3 state.
> > > > > >
> > > > > > This however is static information in the ACPI table at BIOS compilation
> > > > > > time and on some platforms it's possible to configure the firmware at
> > boot
> > > > > > up such that _S0W returns "0" indicating the inability to wake up the
> > > > > > device from D3 as explained in the ACPI specification:
> > > > > >
> > > > > >    7.3.20 _S0W (S0 Device Wake State)
> > > > > >
> > > > > >    This object evaluates to an integer that conveys to OSPM the deepest
> > > > > >    D-state supported by this device in the S0 system sleeping state
> > > > > >    where the device can wake itself.
> > > > > >
> > > > > > This mismatch may lead to being unable to enumerate devices behind the
> > > > > > hotplug bridge when a device is plugged in. To remedy these situations
> > > > > > that `HotPlugSupportInD3` is specified by _S0W returns 0, explicitly
> > > > > > check that the ACPI companion has returned _S0W greater than or equal
> > > > > > to 3 and the device has a GPE allowing the device to generate wakeup
> > > > > > signals handled by the platform in `acpi_pci_bridge_d3`.
> > > > > >
> > > > > > Windows 10 and Windows 11 both will prevent the bridge from going in
> > D3
> > > > > > when the firmware is configured this way and this changes aligns the
> > > > > > handling of the situation to be the same.
> > > > > >
> > > > > > Link:
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fuefi.org
> > %2Fhtmlspecs%2FACPI_Spec_6_4_html%2F07_Power_and_Performance_Mgmt
> > %2Fdevice-power-management-objects.html%3Fhighlight%3Ds0w%23s0w-s0-
> > device-wake-
> > state&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7C1f96c7aa37a
> > c47640d6208da0d0efd67%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0
> > %7C637836655281536690%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw
> > MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata
> > =Cw1OTYiX9BD3gh8eN3Zyz6%2FK8YFgqbn6bgi9%2FjNsnrM%3D&amp;reserved
> > =0
> > > > > > Link:
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.mi
> > crosoft.com%2Fen-us%2Fwindows-hardware%2Fdrivers%2Fpci%2Fdsd-for-pcie-
> > root-ports%23identifying-pcie-root-ports-supporting-hot-plug-in-
> > d3&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7C1f96c7aa37ac4
> > 7640d6208da0d0efd67%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%
> > 7C637836655281536690%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwM
> > DAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=
> > UkB5lmz1QBUzWVM6%2BuNzJsleP%2Fi%2BDCJJuSgINiNbymo%3D&amp;reserv
> > ed=0
> > > > > > Fixes: 26ad34d510a87 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug
> > ports")
> > > > > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > > > >
> > > > > No more comments from me:
> > > > >
> > > > > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > >
> > > > Or please let me know if I should pick it up.
> > >
> > > Bjorn,
> > >
> > > Friendly reminder on this one.
> > 
> > Thanks; we're in the middle of the merge window now, so I'll wait till
> > that's over unless this is an emergency.
> 
> Actually it's a pretty important problem.  We waffled on the nuts and
> bolts of this commit during the 5.17 RC's, but I didn't think we would
> make it out to the merge window before it got picked.  I guess I should
> have spoke up on the urgency earlier.
> 
> > IIUC the bug this fixes is that "when a bridge is in D3cold, we don't
> > notice when a device is hot-added below it," right?  So we need to
> > avoid putting the bridge in D3cold?
> 
> When the ASL has been configured this way (to return 0 for _S0W) the
> lower level hardware implementation will lead to hotplugged devices
> not being detected.
> Without this commit the hardware will expect to be in D0, but the kernel
> will select D3hot.  So yes; the outcome is that hotplugged PCIe devices
> don't work.
> 
> > Is there a typical scenario where this bites users?  I don't think we
> > ever saw an actual problem report?
> 
> This is the common way that these systems are being shipped.  I have
> plenty of private reports related to this, but nothing public to link to.

I still have no clue how many people this affects, what kernel
versions, etc.  If there are no public reports, it suggests that it
doesn't affect distro kernels or production machines yet.

Can we at least list some common scenarios?  E.g., it affects kernels
after commit X, or it affects machines with CPUs newer than Y, or it
affects a certain kind of tunneling, etc?  Distros need this
information so they can figure whether and how far to backport things
like this.

> FYI: the earlier version of this was:
> https://patchwork.kernel.org/project/linux-usb/patch/1646658319-59532-1-git-send-email-Sanju.Mehta@amd.com/
> This is basically to intentionally pull the device out of D3hot when a
> tunnel is created.
Mario Limonciello March 23, 2022, 9:51 p.m. UTC | #7
[Public]



> -----Original Message-----
> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: Wednesday, March 23, 2022 16:43
> To: Limonciello, Mario <Mario.Limonciello@amd.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>; open list:PCI SUBSYSTEM <linux-
> pci@vger.kernel.org>; Linux PM <linux-pm@vger.kernel.org>; Mehta, Sanju
> <Sanju.Mehta@amd.com>; Rafael J. Wysocki <rafael@kernel.org>
> Subject: Re: [PATCH v4] PCI / ACPI: Assume `HotPlugSupportInD3` only if device
> can wake from D3
> 
> On Wed, Mar 23, 2022 at 09:26:15PM +0000, Limonciello, Mario wrote:
> > [Public]
> >
> > >
> > > On Wed, Mar 23, 2022 at 11:01:05AM -0500, Mario Limonciello wrote:
> > > > On 3/15/22 11:36, Rafael J. Wysocki wrote:
> > > > > On Tue, Mar 15, 2022 at 5:35 PM Rafael J. Wysocki <rafael@kernel.org>
> > > wrote:
> > > > > >
> > > > > > On Tue, Mar 15, 2022 at 4:33 PM Mario Limonciello
> > > > > > <mario.limonciello@amd.com> wrote:
> > > > > > >
> > > > > > > According to the Microsoft spec the _DSD `HotPlugSupportInD3` is
> > > > > > > indicates the ability for a bridge to be able to wakeup from D3:
> > > > > > >
> > > > > > >    This ACPI object [HotPlugSupportInD3] enables the operating
> system
> > > > > > >    to identify and power manage PCIe Root Ports that are capable of
> > > > > > >    handling hot plug events while in D3 state.
> > > > > > >
> > > > > > > This however is static information in the ACPI table at BIOS
> compilation
> > > > > > > time and on some platforms it's possible to configure the firmware at
> > > boot
> > > > > > > up such that _S0W returns "0" indicating the inability to wake up the
> > > > > > > device from D3 as explained in the ACPI specification:
> > > > > > >
> > > > > > >    7.3.20 _S0W (S0 Device Wake State)
> > > > > > >
> > > > > > >    This object evaluates to an integer that conveys to OSPM the
> deepest
> > > > > > >    D-state supported by this device in the S0 system sleeping state
> > > > > > >    where the device can wake itself.
> > > > > > >
> > > > > > > This mismatch may lead to being unable to enumerate devices behind
> the
> > > > > > > hotplug bridge when a device is plugged in. To remedy these
> situations
> > > > > > > that `HotPlugSupportInD3` is specified by _S0W returns 0, explicitly
> > > > > > > check that the ACPI companion has returned _S0W greater than or
> equal
> > > > > > > to 3 and the device has a GPE allowing the device to generate wakeup
> > > > > > > signals handled by the platform in `acpi_pci_bridge_d3`.
> > > > > > >
> > > > > > > Windows 10 and Windows 11 both will prevent the bridge from going
> in
> > > D3
> > > > > > > when the firmware is configured this way and this changes aligns the
> > > > > > > handling of the situation to be the same.
> > > > > > >
> > > > > > > Link:
> > >
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fuefi.org
> %2F&amp;data=04%7C01%7CMario.Limonciello%40amd.com%7C095e9cb1cb3
> 34458a08d08da0d1610bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0
> %7C637836685684066439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw
> MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata
> =aAj91EajV77J7GjWEGGhAYLAb9Al2TMsH1baasdVdyI%3D&amp;reserved=0
> > >
> %2Fhtmlspecs%2FACPI_Spec_6_4_html%2F07_Power_and_Performance_Mgmt
> > > %2Fdevice-power-management-objects.html%3Fhighlight%3Ds0w%23s0w-
> s0-
> > > device-wake-
> > >
> state&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7C1f96c7aa37a
> > >
> c47640d6208da0d0efd67%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0
> > >
> %7C637836655281536690%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw
> > >
> MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata
> > >
> =Cw1OTYiX9BD3gh8eN3Zyz6%2FK8YFgqbn6bgi9%2FjNsnrM%3D&amp;reserved
> > > =0
> > > > > > > Link:
> > >
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.mi
> %2F&amp;data=04%7C01%7CMario.Limonciello%40amd.com%7C095e9cb1cb3
> 34458a08d08da0d1610bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0
> %7C637836685684066439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw
> MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata
> =v8YYoxTCp1drb7ZxgbDkrgA2Nc3TxbENLBfPxbo1CTI%3D&amp;reserved=0
> > > crosoft.com%2Fen-us%2Fwindows-hardware%2Fdrivers%2Fpci%2Fdsd-for-
> pcie-
> > > root-ports%23identifying-pcie-root-ports-supporting-hot-plug-in-
> > >
> d3&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7C1f96c7aa37ac4
> > >
> 7640d6208da0d0efd67%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%
> > >
> 7C637836655281536690%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwM
> > >
> DAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=
> > >
> UkB5lmz1QBUzWVM6%2BuNzJsleP%2Fi%2BDCJJuSgINiNbymo%3D&amp;reserv
> > > ed=0
> > > > > > > Fixes: 26ad34d510a87 ("PCI / ACPI: Whitelist D3 for more PCIe
> hotplug
> > > ports")
> > > > > > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > > > > >
> > > > > > No more comments from me:
> > > > > >
> > > > > > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > >
> > > > > Or please let me know if I should pick it up.
> > > >
> > > > Bjorn,
> > > >
> > > > Friendly reminder on this one.
> > >
> > > Thanks; we're in the middle of the merge window now, so I'll wait till
> > > that's over unless this is an emergency.
> >
> > Actually it's a pretty important problem.  We waffled on the nuts and
> > bolts of this commit during the 5.17 RC's, but I didn't think we would
> > make it out to the merge window before it got picked.  I guess I should
> > have spoke up on the urgency earlier.
> >
> > > IIUC the bug this fixes is that "when a bridge is in D3cold, we don't
> > > notice when a device is hot-added below it," right?  So we need to
> > > avoid putting the bridge in D3cold?
> >
> > When the ASL has been configured this way (to return 0 for _S0W) the
> > lower level hardware implementation will lead to hotplugged devices
> > not being detected.
> > Without this commit the hardware will expect to be in D0, but the kernel
> > will select D3hot.  So yes; the outcome is that hotplugged PCIe devices
> > don't work.
> >
> > > Is there a typical scenario where this bites users?  I don't think we
> > > ever saw an actual problem report?
> >
> > This is the common way that these systems are being shipped.  I have
> > plenty of private reports related to this, but nothing public to link to.
> 
> I still have no clue how many people this affects, what kernel
> versions, etc.  If there are no public reports, it suggests that it
> doesn't affect distro kernels or production machines yet.
> 
> Can we at least list some common scenarios?  E.g., it affects kernels
> after commit X, or it affects machines with CPUs newer than Y, or it
> affects a certain kind of tunneling, etc?  Distros need this
> information so they can figure whether and how far to backport things
> like this.

It's going to affect any retail machine with the SOC we refer to in the kernel
as "Yellow Carp".  This is one of the first non-Intel USB4 hosts and will be
using the USB4 SW CM in the kernel.

Without this change, effectively PCIe tunneling will not work when any downstream
PCIe device is hotplugged.
In the right circumstances it might work if it's coldbooted (if the paths selected
by the pre-boot firmware connection manager are identical to that selected
by SW CM).

Yellow carp isn't supported in the kernel until 5.14, so should this be backported
it shouldn't need to go any further than 5.14.  (Realistically though no distro should be on
5.14, they should have gone to the 5.15 LTS).  

Should it behave well baking in mainline, I would also later suggest to bring it back to
5.15.y and later as well for distros to pickup.

> 
> > FYI: the earlier version of this was:
> >
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchw
> ork.kernel.org%2Fproject%2Flinux-usb%2Fpatch%2F1646658319-59532-1-git-
> send-email-
> Sanju.Mehta%40amd.com%2F&amp;data=04%7C01%7CMario.Limonciello%40a
> md.com%7C095e9cb1cb334458a08d08da0d1610bf%7C3dd8961fe4884e608e11
> a82d994e183d%7C0%7C0%7C637836685684066439%7CUnknown%7CTWFpbGZ
> sb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0
> %3D%7C3000&amp;sdata=tMSmfEl2%2FcvcFwNr7tg8QBi3s6HA8ZBgg5rCa7DZD
> Hg%3D&amp;reserved=0
> > This is basically to intentionally pull the device out of D3hot when a
> > tunnel is created.
Bjorn Helgaas March 24, 2022, 4:34 p.m. UTC | #8
[+cc Mika, "HotPlugSupportInD3" scope question below]

On Wed, Mar 23, 2022 at 09:51:18PM +0000, Limonciello, Mario wrote:
> > -----Original Message-----
> > From: Bjorn Helgaas <helgaas@kernel.org>
> > Sent: Wednesday, March 23, 2022 16:43
> > To: Limonciello, Mario <Mario.Limonciello@amd.com>
> > Cc: Bjorn Helgaas <bhelgaas@google.com>; open list:PCI SUBSYSTEM <linux-
> > pci@vger.kernel.org>; Linux PM <linux-pm@vger.kernel.org>; Mehta, Sanju
> > <Sanju.Mehta@amd.com>; Rafael J. Wysocki <rafael@kernel.org>
> > Subject: Re: [PATCH v4] PCI / ACPI: Assume `HotPlugSupportInD3` only if device
> > can wake from D3
> > 
> > On Wed, Mar 23, 2022 at 09:26:15PM +0000, Limonciello, Mario wrote:
> > > [Public]
> > >
> > > >
> > > > On Wed, Mar 23, 2022 at 11:01:05AM -0500, Mario Limonciello wrote:
> > > > > On 3/15/22 11:36, Rafael J. Wysocki wrote:
> > > > > > On Tue, Mar 15, 2022 at 5:35 PM Rafael J. Wysocki <rafael@kernel.org>
> > > > wrote:
> > > > > > >
> > > > > > > On Tue, Mar 15, 2022 at 4:33 PM Mario Limonciello
> > > > > > > <mario.limonciello@amd.com> wrote:
> > > > > > > >
> > > > > > > > According to the Microsoft spec the _DSD `HotPlugSupportInD3` is
> > > > > > > > indicates the ability for a bridge to be able to wakeup from D3:
> > > > > > > >
> > > > > > > >    This ACPI object [HotPlugSupportInD3] enables the operating
> > system
> > > > > > > >    to identify and power manage PCIe Root Ports that are capable of
> > > > > > > >    handling hot plug events while in D3 state.
> > > > > > > >
> > > > > > > > This however is static information in the ACPI table at BIOS
> > compilation
> > > > > > > > time and on some platforms it's possible to configure the firmware at
> > > > boot
> > > > > > > > up such that _S0W returns "0" indicating the inability to wake up the
> > > > > > > > device from D3 as explained in the ACPI specification:
> > > > > > > >
> > > > > > > >    7.3.20 _S0W (S0 Device Wake State)
> > > > > > > >
> > > > > > > >    This object evaluates to an integer that conveys to OSPM the
> > deepest
> > > > > > > >    D-state supported by this device in the S0 system sleeping state
> > > > > > > >    where the device can wake itself.
> > > > > > > >
> > > > > > > > This mismatch may lead to being unable to enumerate devices behind
> > the
> > > > > > > > hotplug bridge when a device is plugged in. To remedy these
> > situations
> > > > > > > > that `HotPlugSupportInD3` is specified by _S0W returns 0, explicitly
> > > > > > > > check that the ACPI companion has returned _S0W greater than or
> > equal
> > > > > > > > to 3 and the device has a GPE allowing the device to generate wakeup
> > > > > > > > signals handled by the platform in `acpi_pci_bridge_d3`.
> > > > > > > >
> > > > > > > > Windows 10 and Windows 11 both will prevent the bridge from going
> > in
> > > > D3
> > > > > > > > when the firmware is configured this way and this changes aligns the
> > > > > > > > handling of the situation to be the same.
> > > > > > > >
> > > > > > > > Link:
> > > >
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fuefi.org
> > %2F&amp;data=04%7C01%7CMario.Limonciello%40amd.com%7C095e9cb1cb3
> > 34458a08d08da0d1610bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0
> > %7C637836685684066439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw
> > MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata
> > =aAj91EajV77J7GjWEGGhAYLAb9Al2TMsH1baasdVdyI%3D&amp;reserved=0
> > > >
> > %2Fhtmlspecs%2FACPI_Spec_6_4_html%2F07_Power_and_Performance_Mgmt
> > > > %2Fdevice-power-management-objects.html%3Fhighlight%3Ds0w%23s0w-
> > s0-
> > > > device-wake-
> > > >
> > state&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7C1f96c7aa37a
> > > >
> > c47640d6208da0d0efd67%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0
> > > >
> > %7C637836655281536690%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw
> > > >
> > MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata
> > > >
> > =Cw1OTYiX9BD3gh8eN3Zyz6%2FK8YFgqbn6bgi9%2FjNsnrM%3D&amp;reserved
> > > > =0
> > > > > > > > Link:
> > > >
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.mi
> > %2F&amp;data=04%7C01%7CMario.Limonciello%40amd.com%7C095e9cb1cb3
> > 34458a08d08da0d1610bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0
> > %7C637836685684066439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw
> > MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata
> > =v8YYoxTCp1drb7ZxgbDkrgA2Nc3TxbENLBfPxbo1CTI%3D&amp;reserved=0
> > > > crosoft.com%2Fen-us%2Fwindows-hardware%2Fdrivers%2Fpci%2Fdsd-for-
> > pcie-
> > > > root-ports%23identifying-pcie-root-ports-supporting-hot-plug-in-
> > > >
> > d3&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7C1f96c7aa37ac4
> > > >
> > 7640d6208da0d0efd67%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%
> > > >
> > 7C637836655281536690%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwM
> > > >
> > DAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=
> > > >
> > UkB5lmz1QBUzWVM6%2BuNzJsleP%2Fi%2BDCJJuSgINiNbymo%3D&amp;reserv
> > > > ed=0
> > > > > > > > Fixes: 26ad34d510a87 ("PCI / ACPI: Whitelist D3 for more PCIe
> > hotplug
> > > > ports")
> > > > > > > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > > > > > >
> > > > > > > No more comments from me:
> > > > > > >
> > > > > > > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > > >
> > > > > > Or please let me know if I should pick it up.
> > > > >
> > > > > Bjorn,
> > > > >
> > > > > Friendly reminder on this one.
> > > >
> > > > Thanks; we're in the middle of the merge window now, so I'll
> > > > wait till that's over unless this is an emergency.
> > >
> > > Actually it's a pretty important problem.  We waffled on the
> > > nuts and bolts of this commit during the 5.17 RC's, but I didn't
> > > think we would make it out to the merge window before it got
> > > picked.  I guess I should have spoke up on the urgency earlier.
> > >
> > > > IIUC the bug this fixes is that "when a bridge is in D3cold,
> > > > we don't notice when a device is hot-added below it," right?
> > > > So we need to avoid putting the bridge in D3cold?
> > >
> > > When the ASL has been configured this way (to return 0 for _S0W)
> > > the lower level hardware implementation will lead to hotplugged
> > > devices not being detected.  Without this commit the hardware
> > > will expect to be in D0, but the kernel will select D3hot.  So
> > > yes; the outcome is that hotplugged PCIe devices don't work.
> > >
> > > > Is there a typical scenario where this bites users?  I don't
> > > > think we ever saw an actual problem report?
> > >
> > > This is the common way that these systems are being shipped.  I
> > > have plenty of private reports related to this, but nothing
> > > public to link to.
> > 
> > I still have no clue how many people this affects, what kernel
> > versions, etc.  If there are no public reports, it suggests that
> > it doesn't affect distro kernels or production machines yet.
> > 
> > Can we at least list some common scenarios?  E.g., it affects
> > kernels after commit X, or it affects machines with CPUs newer
> > than Y, or it affects a certain kind of tunneling, etc?  Distros
> > need this information so they can figure whether and how far to
> > backport things like this.
> 
> It's going to affect any retail machine with the SOC we refer to in
> the kernel as "Yellow Carp".  This is one of the first non-Intel
> USB4 hosts and will be using the USB4 SW CM in the kernel.
> 
> Without this change, effectively PCIe tunneling will not work when
> any downstream PCIe device is hotplugged.  In the right
> circumstances it might work if it's coldbooted (if the paths
> selected by the pre-boot firmware connection manager are identical
> to that selected by SW CM).

Thanks a lot for this context!  As far as I can tell from grubbing
through the git history, there are no PCI, USB4, or Thunderbolt
changes related to Yellow Carp, so I assume this has to do with Yellow
Carp firmware doing things differently than previous platforms.

Previously, if a Root Port implemented the HotPlugSupportInD3
property, we assumed that the Root Port and any downstream bridges
could handle hot-plug events while in D3hot.

I guess the difference here is that on Yellow Carp firmware, even if
there is a HotPlugSupportInD3 property on the Root Port, the Root Port
cannot handle hot-plug events in D3hot UNLESS there is also an _S0W
method AND that _S0W says the Root Port can wakeup from D3hot or
D3cold, right?

I have some heartburn about this that's only partly related to this
patch.  The Microsoft doc clearly says "HotPlugSupportInD3" must be
implemented on Root Ports and its presence tells us that the *Root
Port* can handle hot-plug events while in D3.

But acpi_pci_bridge_d3() looks up the Root Port at the top of the
hierarchy and assume that its "HotPlugSupportInD3" applies to any
switch ports that may be below that Root Port (added by 26ad34d510a8
("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports") [1]).

That seems wrong to me.  How can the platform firmware know how
downstream switches behave?

I have the same question about this patch because it looks for _S0W on
the Root Port, even if acpi_pci_bridge_d3() was called for some switch
port below the root.

> Yellow carp isn't supported in the kernel until 5.14, so should this
> be backported it shouldn't need to go any further than 5.14.
> (Realistically though no distro should be on 5.14, they should have
> gone to the 5.15 LTS).  

The question of moving from v5.14 to v5.15 is outside the scope of
this patch.  I think we just have to mention Yellow Carp and the way
its firmware differs from previous practice, and distros can make
their own decisions.

[1] https://git.kernel.org/linus/26ad34d510a8
Mario Limonciello March 24, 2022, 5:14 p.m. UTC | #9
[Public]

> -----Original Message-----
> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: Thursday, March 24, 2022 11:35
> To: Limonciello, Mario <Mario.Limonciello@amd.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>; open list:PCI SUBSYSTEM <linux-
> pci@vger.kernel.org>; Linux PM <linux-pm@vger.kernel.org>; Mehta, Sanju
> <Sanju.Mehta@amd.com>; Rafael J. Wysocki <rafael@kernel.org>; Mika
> Westerberg <mika.westerberg@linux.intel.com>
> Subject: Re: [PATCH v4] PCI / ACPI: Assume `HotPlugSupportInD3` only if
> device can wake from D3
> 
> [+cc Mika, "HotPlugSupportInD3" scope question below]

FYI - Mika had approved some earlier versions of this, so I expect conceptual
Alignment at least with the latest one.

<snip>

> > > Can we at least list some common scenarios?  E.g., it affects
> > > kernels after commit X, or it affects machines with CPUs newer
> > > than Y, or it affects a certain kind of tunneling, etc?  Distros
> > > need this information so they can figure whether and how far to
> > > backport things like this.
> >
> > It's going to affect any retail machine with the SOC we refer to in
> > the kernel as "Yellow Carp".  This is one of the first non-Intel
> > USB4 hosts and will be using the USB4 SW CM in the kernel.
> >
> > Without this change, effectively PCIe tunneling will not work when
> > any downstream PCIe device is hotplugged.  In the right
> > circumstances it might work if it's coldbooted (if the paths
> > selected by the pre-boot firmware connection manager are identical
> > to that selected by SW CM).
> 
> Thanks a lot for this context!  As far as I can tell from grubbing
> through the git history, there are no PCI, USB4, or Thunderbolt
> changes related to Yellow Carp, so I assume this has to do with Yellow
> Carp firmware doing things differently than previous platforms.

There have been a variety of Thunderbolt/USB4 changes as a result of
Yellow Carp development and findings, but they have not been quirks;
they have been done as generic changes that still make sense for all
USB4 devices.

Sanju (on CC) has submitted a majority of these, so if you want to see
a sense of what these are you can look for his commits in drivers/thunderbolt.

> 
> Previously, if a Root Port implemented the HotPlugSupportInD3
> property, we assumed that the Root Port and any downstream bridges
> could handle hot-plug events while in D3hot.
> 
> I guess the difference here is that on Yellow Carp firmware, even if
> there is a HotPlugSupportInD3 property on the Root Port, the Root Port
> cannot handle hot-plug events in D3hot UNLESS there is also an _S0W
> method AND that _S0W says the Root Port can wakeup from D3hot or
> D3cold, right?

Yes, correct!

> 
> I have some heartburn about this that's only partly related to this
> patch.  The Microsoft doc clearly says "HotPlugSupportInD3" must be
> implemented on Root Ports and its presence tells us that the *Root
> Port* can handle hot-plug events while in D3.
> 
> But acpi_pci_bridge_d3() looks up the Root Port at the top of the
> hierarchy and assume that its "HotPlugSupportInD3" applies to any
> switch ports that may be below that Root Port (added by 26ad34d510a8
> ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports") [1]).
> 
> That seems wrong to me.  How can the platform firmware know how
> downstream switches behave?

Mika might be able to better answer this than me.

> 
> I have the same question about this patch because it looks for _S0W on
> the Root Port, even if acpi_pci_bridge_d3() was called for some switch
> port below the root.
> 
> > Yellow carp isn't supported in the kernel until 5.14, so should this
> > be backported it shouldn't need to go any further than 5.14.
> > (Realistically though no distro should be on 5.14, they should have
> > gone to the 5.15 LTS).
> 
> The question of moving from v5.14 to v5.15 is outside the scope of
> this patch.  I think we just have to mention Yellow Carp and the way
> its firmware differs from previous practice, and distros can make
> their own decisions.
> 

OK.
Rafael J. Wysocki March 24, 2022, 6:31 p.m. UTC | #10
On Thu, Mar 24, 2022 at 6:15 PM Limonciello, Mario
<Mario.Limonciello@amd.com> wrote:
>
> [Public]
>
> > -----Original Message-----
> > From: Bjorn Helgaas <helgaas@kernel.org>
> > Sent: Thursday, March 24, 2022 11:35
> > To: Limonciello, Mario <Mario.Limonciello@amd.com>
> > Cc: Bjorn Helgaas <bhelgaas@google.com>; open list:PCI SUBSYSTEM <linux-
> > pci@vger.kernel.org>; Linux PM <linux-pm@vger.kernel.org>; Mehta, Sanju
> > <Sanju.Mehta@amd.com>; Rafael J. Wysocki <rafael@kernel.org>; Mika
> > Westerberg <mika.westerberg@linux.intel.com>
> > Subject: Re: [PATCH v4] PCI / ACPI: Assume `HotPlugSupportInD3` only if
> > device can wake from D3
> >
> > [+cc Mika, "HotPlugSupportInD3" scope question below]
>
> FYI - Mika had approved some earlier versions of this, so I expect conceptual
> Alignment at least with the latest one.
>
> <snip>
>
> > > > Can we at least list some common scenarios?  E.g., it affects
> > > > kernels after commit X, or it affects machines with CPUs newer
> > > > than Y, or it affects a certain kind of tunneling, etc?  Distros
> > > > need this information so they can figure whether and how far to
> > > > backport things like this.
> > >
> > > It's going to affect any retail machine with the SOC we refer to in
> > > the kernel as "Yellow Carp".  This is one of the first non-Intel
> > > USB4 hosts and will be using the USB4 SW CM in the kernel.
> > >
> > > Without this change, effectively PCIe tunneling will not work when
> > > any downstream PCIe device is hotplugged.  In the right
> > > circumstances it might work if it's coldbooted (if the paths
> > > selected by the pre-boot firmware connection manager are identical
> > > to that selected by SW CM).
> >
> > Thanks a lot for this context!  As far as I can tell from grubbing
> > through the git history, there are no PCI, USB4, or Thunderbolt
> > changes related to Yellow Carp, so I assume this has to do with Yellow
> > Carp firmware doing things differently than previous platforms.
>
> There have been a variety of Thunderbolt/USB4 changes as a result of
> Yellow Carp development and findings, but they have not been quirks;
> they have been done as generic changes that still make sense for all
> USB4 devices.
>
> Sanju (on CC) has submitted a majority of these, so if you want to see
> a sense of what these are you can look for his commits in drivers/thunderbolt.
>
> >
> > Previously, if a Root Port implemented the HotPlugSupportInD3
> > property, we assumed that the Root Port and any downstream bridges
> > could handle hot-plug events while in D3hot.
> >
> > I guess the difference here is that on Yellow Carp firmware, even if
> > there is a HotPlugSupportInD3 property on the Root Port, the Root Port
> > cannot handle hot-plug events in D3hot UNLESS there is also an _S0W
> > method AND that _S0W says the Root Port can wakeup from D3hot or
> > D3cold, right?
>
> Yes, correct!
>
> >
> > I have some heartburn about this that's only partly related to this
> > patch.  The Microsoft doc clearly says "HotPlugSupportInD3" must be
> > implemented on Root Ports and its presence tells us that the *Root
> > Port* can handle hot-plug events while in D3.
> >
> > But acpi_pci_bridge_d3() looks up the Root Port at the top of the
> > hierarchy and assume that its "HotPlugSupportInD3" applies to any
> > switch ports that may be below that Root Port (added by 26ad34d510a8
> > ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports") [1]).

Not really.

"HotPlugSupportInD3" applies to the root port only and the platform
firmware may not know about any ports below it.

However, the presence of "HotPlugSupportInD3" is used as an indicator
that the entire hierarchy is "D3cold-aware", so to speak.
Essentially, this boils down to the "Is the hardware modern enough?"
consideration and the answer is assumed to be "yes" if the property in
question is present for the root port.

But if "HotPlugSupportInD3" is not consistent with the other pieces of
information regarding the root port available from the firmware (_PRW
and _S0W in this particular case), the presence of it is questionable
in the first place, so IMO the approach here makes sense.
Bjorn Helgaas March 24, 2022, 6:52 p.m. UTC | #11
On Thu, Mar 24, 2022 at 07:31:56PM +0100, Rafael J. Wysocki wrote:
> On Thu, Mar 24, 2022 at 6:15 PM Limonciello, Mario
> <Mario.Limonciello@amd.com> wrote:
> > > -----Original Message-----
> > > From: Bjorn Helgaas <helgaas@kernel.org>
> > > Sent: Thursday, March 24, 2022 11:35
> > > To: Limonciello, Mario <Mario.Limonciello@amd.com>
> > > Cc: Bjorn Helgaas <bhelgaas@google.com>; open list:PCI SUBSYSTEM <linux-
> > > pci@vger.kernel.org>; Linux PM <linux-pm@vger.kernel.org>; Mehta, Sanju
> > > <Sanju.Mehta@amd.com>; Rafael J. Wysocki <rafael@kernel.org>; Mika
> > > Westerberg <mika.westerberg@linux.intel.com>
> > > Subject: Re: [PATCH v4] PCI / ACPI: Assume `HotPlugSupportInD3` only if
> > > device can wake from D3
> > >
> > > [+cc Mika, "HotPlugSupportInD3" scope question below]
> >
> > FYI - Mika had approved some earlier versions of this, so I expect conceptual
> > Alignment at least with the latest one.
> >
> > <snip>
> >
> > > > > Can we at least list some common scenarios?  E.g., it affects
> > > > > kernels after commit X, or it affects machines with CPUs newer
> > > > > than Y, or it affects a certain kind of tunneling, etc?  Distros
> > > > > need this information so they can figure whether and how far to
> > > > > backport things like this.
> > > >
> > > > It's going to affect any retail machine with the SOC we refer to in
> > > > the kernel as "Yellow Carp".  This is one of the first non-Intel
> > > > USB4 hosts and will be using the USB4 SW CM in the kernel.
> > > >
> > > > Without this change, effectively PCIe tunneling will not work when
> > > > any downstream PCIe device is hotplugged.  In the right
> > > > circumstances it might work if it's coldbooted (if the paths
> > > > selected by the pre-boot firmware connection manager are identical
> > > > to that selected by SW CM).
> > >
> > > Thanks a lot for this context!  As far as I can tell from grubbing
> > > through the git history, there are no PCI, USB4, or Thunderbolt
> > > changes related to Yellow Carp, so I assume this has to do with Yellow
> > > Carp firmware doing things differently than previous platforms.
> >
> > There have been a variety of Thunderbolt/USB4 changes as a result of
> > Yellow Carp development and findings, but they have not been quirks;
> > they have been done as generic changes that still make sense for all
> > USB4 devices.
> >
> > Sanju (on CC) has submitted a majority of these, so if you want to see
> > a sense of what these are you can look for his commits in drivers/thunderbolt.
> >
> > > Previously, if a Root Port implemented the HotPlugSupportInD3
> > > property, we assumed that the Root Port and any downstream bridges
> > > could handle hot-plug events while in D3hot.
> > >
> > > I guess the difference here is that on Yellow Carp firmware, even if
> > > there is a HotPlugSupportInD3 property on the Root Port, the Root Port
> > > cannot handle hot-plug events in D3hot UNLESS there is also an _S0W
> > > method AND that _S0W says the Root Port can wakeup from D3hot or
> > > D3cold, right?
> >
> > Yes, correct!
> >
> > > I have some heartburn about this that's only partly related to this
> > > patch.  The Microsoft doc clearly says "HotPlugSupportInD3" must be
> > > implemented on Root Ports and its presence tells us that the *Root
> > > Port* can handle hot-plug events while in D3.
> > >
> > > But acpi_pci_bridge_d3() looks up the Root Port at the top of the
> > > hierarchy and assume that its "HotPlugSupportInD3" applies to any
> > > switch ports that may be below that Root Port (added by 26ad34d510a8
> > > ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports") [1]).
> 
> Not really.
> 
> "HotPlugSupportInD3" applies to the root port only and the platform
> firmware may not know about any ports below it.
> 
> However, the presence of "HotPlugSupportInD3" is used as an indicator
> that the entire hierarchy is "D3cold-aware", so to speak.
> Essentially, this boils down to the "Is the hardware modern enough?"
> consideration and the answer is assumed to be "yes" if the property in
> question is present for the root port.

Seems weird to me since we're talking about a hot-plug Root Port and
anything at all could be plugged into it.  We're basically saying that
we can assume a property of an arbitrary downstream device based on
something we know about the upstream device.  I'm still not
comfortable with that.

At a minimum we should add a comment about this assumption.  The
existing "... we know the hierarchy behind it supports D3 just fine"
seems a little too strong.

> But if "HotPlugSupportInD3" is not consistent with the other pieces of
> information regarding the root port available from the firmware (_PRW
> and _S0W in this particular case), the presence of it is questionable
> in the first place, so IMO the approach here makes sense.

This part seems reasonable to me, as long as we have good confidence
that requiring "HotPlugSupportInD3" + _PRW + _S0W where we used to
require only "HotPlugSupportInD3" is unlikely to break anything.

I can't judge that, but I assume you know that we don't use the
acpi_pci_bridge_d3() result unless _PRW and _S0W exist, so I'll take
your word for it :)

Bjorn
Rafael J. Wysocki March 24, 2022, 7:10 p.m. UTC | #12
On Thu, Mar 24, 2022 at 7:52 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Mar 24, 2022 at 07:31:56PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Mar 24, 2022 at 6:15 PM Limonciello, Mario
> > <Mario.Limonciello@amd.com> wrote:
> > > > -----Original Message-----
> > > > From: Bjorn Helgaas <helgaas@kernel.org>
> > > > Sent: Thursday, March 24, 2022 11:35
> > > > To: Limonciello, Mario <Mario.Limonciello@amd.com>
> > > > Cc: Bjorn Helgaas <bhelgaas@google.com>; open list:PCI SUBSYSTEM <linux-
> > > > pci@vger.kernel.org>; Linux PM <linux-pm@vger.kernel.org>; Mehta, Sanju
> > > > <Sanju.Mehta@amd.com>; Rafael J. Wysocki <rafael@kernel.org>; Mika
> > > > Westerberg <mika.westerberg@linux.intel.com>
> > > > Subject: Re: [PATCH v4] PCI / ACPI: Assume `HotPlugSupportInD3` only if
> > > > device can wake from D3
> > > >
> > > > [+cc Mika, "HotPlugSupportInD3" scope question below]
> > >
> > > FYI - Mika had approved some earlier versions of this, so I expect conceptual
> > > Alignment at least with the latest one.
> > >
> > > <snip>
> > >
> > > > > > Can we at least list some common scenarios?  E.g., it affects
> > > > > > kernels after commit X, or it affects machines with CPUs newer
> > > > > > than Y, or it affects a certain kind of tunneling, etc?  Distros
> > > > > > need this information so they can figure whether and how far to
> > > > > > backport things like this.
> > > > >
> > > > > It's going to affect any retail machine with the SOC we refer to in
> > > > > the kernel as "Yellow Carp".  This is one of the first non-Intel
> > > > > USB4 hosts and will be using the USB4 SW CM in the kernel.
> > > > >
> > > > > Without this change, effectively PCIe tunneling will not work when
> > > > > any downstream PCIe device is hotplugged.  In the right
> > > > > circumstances it might work if it's coldbooted (if the paths
> > > > > selected by the pre-boot firmware connection manager are identical
> > > > > to that selected by SW CM).
> > > >
> > > > Thanks a lot for this context!  As far as I can tell from grubbing
> > > > through the git history, there are no PCI, USB4, or Thunderbolt
> > > > changes related to Yellow Carp, so I assume this has to do with Yellow
> > > > Carp firmware doing things differently than previous platforms.
> > >
> > > There have been a variety of Thunderbolt/USB4 changes as a result of
> > > Yellow Carp development and findings, but they have not been quirks;
> > > they have been done as generic changes that still make sense for all
> > > USB4 devices.
> > >
> > > Sanju (on CC) has submitted a majority of these, so if you want to see
> > > a sense of what these are you can look for his commits in drivers/thunderbolt.
> > >
> > > > Previously, if a Root Port implemented the HotPlugSupportInD3
> > > > property, we assumed that the Root Port and any downstream bridges
> > > > could handle hot-plug events while in D3hot.
> > > >
> > > > I guess the difference here is that on Yellow Carp firmware, even if
> > > > there is a HotPlugSupportInD3 property on the Root Port, the Root Port
> > > > cannot handle hot-plug events in D3hot UNLESS there is also an _S0W
> > > > method AND that _S0W says the Root Port can wakeup from D3hot or
> > > > D3cold, right?
> > >
> > > Yes, correct!
> > >
> > > > I have some heartburn about this that's only partly related to this
> > > > patch.  The Microsoft doc clearly says "HotPlugSupportInD3" must be
> > > > implemented on Root Ports and its presence tells us that the *Root
> > > > Port* can handle hot-plug events while in D3.
> > > >
> > > > But acpi_pci_bridge_d3() looks up the Root Port at the top of the
> > > > hierarchy and assume that its "HotPlugSupportInD3" applies to any
> > > > switch ports that may be below that Root Port (added by 26ad34d510a8
> > > > ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports") [1]).
> >
> > Not really.
> >
> > "HotPlugSupportInD3" applies to the root port only and the platform
> > firmware may not know about any ports below it.
> >
> > However, the presence of "HotPlugSupportInD3" is used as an indicator
> > that the entire hierarchy is "D3cold-aware", so to speak.
> > Essentially, this boils down to the "Is the hardware modern enough?"
> > consideration and the answer is assumed to be "yes" if the property in
> > question is present for the root port.
>
> Seems weird to me since we're talking about a hot-plug Root Port and
> anything at all could be plugged into it.  We're basically saying that
> we can assume a property of an arbitrary downstream device based on
> something we know about the upstream device.  I'm still not
> comfortable with that.
>
> At a minimum we should add a comment about this assumption.  The
> existing "... we know the hierarchy behind it supports D3 just fine"
> seems a little too strong.
>
> > But if "HotPlugSupportInD3" is not consistent with the other pieces of
> > information regarding the root port available from the firmware (_PRW
> > and _S0W in this particular case), the presence of it is questionable
> > in the first place, so IMO the approach here makes sense.
>
> This part seems reasonable to me, as long as we have good confidence
> that requiring "HotPlugSupportInD3" + _PRW + _S0W where we used to
> require only "HotPlugSupportInD3" is unlikely to break anything.
>
> I can't judge that, but I assume you know that we don't use the
> acpi_pci_bridge_d3() result unless _PRW and _S0W exist, so I'll take
> your word for it :)

Actually, that is an extremely good point I didn't think about.

Thinking about it now, one thing is missing: a check if _S0W is
present, because the lack of it means "any power state should be
fine".

With this check in place we would only avoid taking
"HotPlugSupportInD3" into account if it were not consistent with the
other settings and if that didn't work, we would end up in the quirks
territory this way or another.
Mario Limonciello March 24, 2022, 11:13 p.m. UTC | #13
[Public]



> -----Original Message-----
> From: Rafael J. Wysocki <rafael@kernel.org>
> Sent: Thursday, March 24, 2022 14:10
> To: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Rafael J. Wysocki <rafael@kernel.org>; Limonciello, Mario
> <Mario.Limonciello@amd.com>; Bjorn Helgaas <bhelgaas@google.com>;
> open list:PCI SUBSYSTEM <linux-pci@vger.kernel.org>; Linux PM <linux-
> pm@vger.kernel.org>; Mehta, Sanju <Sanju.Mehta@amd.com>; Mika
> Westerberg <mika.westerberg@linux.intel.com>
> Subject: Re: [PATCH v4] PCI / ACPI: Assume `HotPlugSupportInD3` only if
> device can wake from D3
> 
> On Thu, Mar 24, 2022 at 7:52 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > On Thu, Mar 24, 2022 at 07:31:56PM +0100, Rafael J. Wysocki wrote:
> > > On Thu, Mar 24, 2022 at 6:15 PM Limonciello, Mario
> > > <Mario.Limonciello@amd.com> wrote:
> > > > > -----Original Message-----
> > > > > From: Bjorn Helgaas <helgaas@kernel.org>
> > > > > Sent: Thursday, March 24, 2022 11:35
> > > > > To: Limonciello, Mario <Mario.Limonciello@amd.com>
> > > > > Cc: Bjorn Helgaas <bhelgaas@google.com>; open list:PCI SUBSYSTEM
> <linux-
> > > > > pci@vger.kernel.org>; Linux PM <linux-pm@vger.kernel.org>;
> Mehta, Sanju
> > > > > <Sanju.Mehta@amd.com>; Rafael J. Wysocki <rafael@kernel.org>;
> Mika
> > > > > Westerberg <mika.westerberg@linux.intel.com>
> > > > > Subject: Re: [PATCH v4] PCI / ACPI: Assume `HotPlugSupportInD3`
> only if
> > > > > device can wake from D3
> > > > >
> > > > > [+cc Mika, "HotPlugSupportInD3" scope question below]
> > > >
> > > > FYI - Mika had approved some earlier versions of this, so I expect
> conceptual
> > > > Alignment at least with the latest one.
> > > >
> > > > <snip>
> > > >
> > > > > > > Can we at least list some common scenarios?  E.g., it affects
> > > > > > > kernels after commit X, or it affects machines with CPUs newer
> > > > > > > than Y, or it affects a certain kind of tunneling, etc?  Distros
> > > > > > > need this information so they can figure whether and how far to
> > > > > > > backport things like this.
> > > > > >
> > > > > > It's going to affect any retail machine with the SOC we refer to in
> > > > > > the kernel as "Yellow Carp".  This is one of the first non-Intel
> > > > > > USB4 hosts and will be using the USB4 SW CM in the kernel.
> > > > > >
> > > > > > Without this change, effectively PCIe tunneling will not work when
> > > > > > any downstream PCIe device is hotplugged.  In the right
> > > > > > circumstances it might work if it's coldbooted (if the paths
> > > > > > selected by the pre-boot firmware connection manager are
> identical
> > > > > > to that selected by SW CM).
> > > > >
> > > > > Thanks a lot for this context!  As far as I can tell from grubbing
> > > > > through the git history, there are no PCI, USB4, or Thunderbolt
> > > > > changes related to Yellow Carp, so I assume this has to do with Yellow
> > > > > Carp firmware doing things differently than previous platforms.
> > > >
> > > > There have been a variety of Thunderbolt/USB4 changes as a result of
> > > > Yellow Carp development and findings, but they have not been quirks;
> > > > they have been done as generic changes that still make sense for all
> > > > USB4 devices.
> > > >
> > > > Sanju (on CC) has submitted a majority of these, so if you want to see
> > > > a sense of what these are you can look for his commits in
> drivers/thunderbolt.
> > > >
> > > > > Previously, if a Root Port implemented the HotPlugSupportInD3
> > > > > property, we assumed that the Root Port and any downstream
> bridges
> > > > > could handle hot-plug events while in D3hot.
> > > > >
> > > > > I guess the difference here is that on Yellow Carp firmware, even if
> > > > > there is a HotPlugSupportInD3 property on the Root Port, the Root
> Port
> > > > > cannot handle hot-plug events in D3hot UNLESS there is also an _S0W
> > > > > method AND that _S0W says the Root Port can wakeup from D3hot or
> > > > > D3cold, right?
> > > >
> > > > Yes, correct!
> > > >
> > > > > I have some heartburn about this that's only partly related to this
> > > > > patch.  The Microsoft doc clearly says "HotPlugSupportInD3" must be
> > > > > implemented on Root Ports and its presence tells us that the *Root
> > > > > Port* can handle hot-plug events while in D3.
> > > > >
> > > > > But acpi_pci_bridge_d3() looks up the Root Port at the top of the
> > > > > hierarchy and assume that its "HotPlugSupportInD3" applies to any
> > > > > switch ports that may be below that Root Port (added by
> 26ad34d510a8
> > > > > ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports") [1]).
> > >
> > > Not really.
> > >
> > > "HotPlugSupportInD3" applies to the root port only and the platform
> > > firmware may not know about any ports below it.
> > >
> > > However, the presence of "HotPlugSupportInD3" is used as an indicator
> > > that the entire hierarchy is "D3cold-aware", so to speak.
> > > Essentially, this boils down to the "Is the hardware modern enough?"
> > > consideration and the answer is assumed to be "yes" if the property in
> > > question is present for the root port.
> >
> > Seems weird to me since we're talking about a hot-plug Root Port and
> > anything at all could be plugged into it.  We're basically saying that
> > we can assume a property of an arbitrary downstream device based on
> > something we know about the upstream device.  I'm still not
> > comfortable with that.
> >
> > At a minimum we should add a comment about this assumption.  The
> > existing "... we know the hierarchy behind it supports D3 just fine"
> > seems a little too strong.
> >
> > > But if "HotPlugSupportInD3" is not consistent with the other pieces of
> > > information regarding the root port available from the firmware (_PRW
> > > and _S0W in this particular case), the presence of it is questionable
> > > in the first place, so IMO the approach here makes sense.
> >
> > This part seems reasonable to me, as long as we have good confidence
> > that requiring "HotPlugSupportInD3" + _PRW + _S0W where we used to
> > require only "HotPlugSupportInD3" is unlikely to break anything.
> >
> > I can't judge that, but I assume you know that we don't use the
> > acpi_pci_bridge_d3() result unless _PRW and _S0W exist, so I'll take
> > your word for it :)
> 
> Actually, that is an extremely good point I didn't think about.
> 
> Thinking about it now, one thing is missing: a check if _S0W is
> present, because the lack of it means "any power state should be
> fine".
> 
> With this check in place we would only avoid taking
> "HotPlugSupportInD3" into account if it were not consistent with the
> other settings and if that didn't work, we would end up in the quirks
> territory this way or another.

In that case something like this instead?

+	if (!adev->wakeup.flags.valid)
+		return false;
+
+	if (ACPI_SUCCESS(acpi_evaluate_integer(adev->handle, "_S0W", NULL, &ret)))
+		return ret >= ACPI_STATE_D3_HOT;
+
+	return true;
Rafael J. Wysocki March 25, 2022, 1:55 p.m. UTC | #14
On Fri, Mar 25, 2022 at 12:13 AM Limonciello, Mario
<Mario.Limonciello@amd.com> wrote:
>
> [Public]
>
>
>
> > -----Original Message-----
> > From: Rafael J. Wysocki <rafael@kernel.org>
> > Sent: Thursday, March 24, 2022 14:10
> > To: Bjorn Helgaas <helgaas@kernel.org>
> > Cc: Rafael J. Wysocki <rafael@kernel.org>; Limonciello, Mario
> > <Mario.Limonciello@amd.com>; Bjorn Helgaas <bhelgaas@google.com>;
> > open list:PCI SUBSYSTEM <linux-pci@vger.kernel.org>; Linux PM <linux-
> > pm@vger.kernel.org>; Mehta, Sanju <Sanju.Mehta@amd.com>; Mika
> > Westerberg <mika.westerberg@linux.intel.com>
> > Subject: Re: [PATCH v4] PCI / ACPI: Assume `HotPlugSupportInD3` only if
> > device can wake from D3
> >
> > On Thu, Mar 24, 2022 at 7:52 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > >
> > > On Thu, Mar 24, 2022 at 07:31:56PM +0100, Rafael J. Wysocki wrote:
> > > > On Thu, Mar 24, 2022 at 6:15 PM Limonciello, Mario
> > > > <Mario.Limonciello@amd.com> wrote:
> > > > > > -----Original Message-----
> > > > > > From: Bjorn Helgaas <helgaas@kernel.org>
> > > > > > Sent: Thursday, March 24, 2022 11:35
> > > > > > To: Limonciello, Mario <Mario.Limonciello@amd.com>
> > > > > > Cc: Bjorn Helgaas <bhelgaas@google.com>; open list:PCI SUBSYSTEM
> > <linux-
> > > > > > pci@vger.kernel.org>; Linux PM <linux-pm@vger.kernel.org>;
> > Mehta, Sanju
> > > > > > <Sanju.Mehta@amd.com>; Rafael J. Wysocki <rafael@kernel.org>;
> > Mika
> > > > > > Westerberg <mika.westerberg@linux.intel.com>
> > > > > > Subject: Re: [PATCH v4] PCI / ACPI: Assume `HotPlugSupportInD3`
> > only if
> > > > > > device can wake from D3
> > > > > >
> > > > > > [+cc Mika, "HotPlugSupportInD3" scope question below]
> > > > >
> > > > > FYI - Mika had approved some earlier versions of this, so I expect
> > conceptual
> > > > > Alignment at least with the latest one.
> > > > >
> > > > > <snip>
> > > > >
> > > > > > > > Can we at least list some common scenarios?  E.g., it affects
> > > > > > > > kernels after commit X, or it affects machines with CPUs newer
> > > > > > > > than Y, or it affects a certain kind of tunneling, etc?  Distros
> > > > > > > > need this information so they can figure whether and how far to
> > > > > > > > backport things like this.
> > > > > > >
> > > > > > > It's going to affect any retail machine with the SOC we refer to in
> > > > > > > the kernel as "Yellow Carp".  This is one of the first non-Intel
> > > > > > > USB4 hosts and will be using the USB4 SW CM in the kernel.
> > > > > > >
> > > > > > > Without this change, effectively PCIe tunneling will not work when
> > > > > > > any downstream PCIe device is hotplugged.  In the right
> > > > > > > circumstances it might work if it's coldbooted (if the paths
> > > > > > > selected by the pre-boot firmware connection manager are
> > identical
> > > > > > > to that selected by SW CM).
> > > > > >
> > > > > > Thanks a lot for this context!  As far as I can tell from grubbing
> > > > > > through the git history, there are no PCI, USB4, or Thunderbolt
> > > > > > changes related to Yellow Carp, so I assume this has to do with Yellow
> > > > > > Carp firmware doing things differently than previous platforms.
> > > > >
> > > > > There have been a variety of Thunderbolt/USB4 changes as a result of
> > > > > Yellow Carp development and findings, but they have not been quirks;
> > > > > they have been done as generic changes that still make sense for all
> > > > > USB4 devices.
> > > > >
> > > > > Sanju (on CC) has submitted a majority of these, so if you want to see
> > > > > a sense of what these are you can look for his commits in
> > drivers/thunderbolt.
> > > > >
> > > > > > Previously, if a Root Port implemented the HotPlugSupportInD3
> > > > > > property, we assumed that the Root Port and any downstream
> > bridges
> > > > > > could handle hot-plug events while in D3hot.
> > > > > >
> > > > > > I guess the difference here is that on Yellow Carp firmware, even if
> > > > > > there is a HotPlugSupportInD3 property on the Root Port, the Root
> > Port
> > > > > > cannot handle hot-plug events in D3hot UNLESS there is also an _S0W
> > > > > > method AND that _S0W says the Root Port can wakeup from D3hot or
> > > > > > D3cold, right?
> > > > >
> > > > > Yes, correct!
> > > > >
> > > > > > I have some heartburn about this that's only partly related to this
> > > > > > patch.  The Microsoft doc clearly says "HotPlugSupportInD3" must be
> > > > > > implemented on Root Ports and its presence tells us that the *Root
> > > > > > Port* can handle hot-plug events while in D3.
> > > > > >
> > > > > > But acpi_pci_bridge_d3() looks up the Root Port at the top of the
> > > > > > hierarchy and assume that its "HotPlugSupportInD3" applies to any
> > > > > > switch ports that may be below that Root Port (added by
> > 26ad34d510a8
> > > > > > ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports") [1]).
> > > >
> > > > Not really.
> > > >
> > > > "HotPlugSupportInD3" applies to the root port only and the platform
> > > > firmware may not know about any ports below it.
> > > >
> > > > However, the presence of "HotPlugSupportInD3" is used as an indicator
> > > > that the entire hierarchy is "D3cold-aware", so to speak.
> > > > Essentially, this boils down to the "Is the hardware modern enough?"
> > > > consideration and the answer is assumed to be "yes" if the property in
> > > > question is present for the root port.
> > >
> > > Seems weird to me since we're talking about a hot-plug Root Port and
> > > anything at all could be plugged into it.  We're basically saying that
> > > we can assume a property of an arbitrary downstream device based on
> > > something we know about the upstream device.  I'm still not
> > > comfortable with that.
> > >
> > > At a minimum we should add a comment about this assumption.  The
> > > existing "... we know the hierarchy behind it supports D3 just fine"
> > > seems a little too strong.
> > >
> > > > But if "HotPlugSupportInD3" is not consistent with the other pieces of
> > > > information regarding the root port available from the firmware (_PRW
> > > > and _S0W in this particular case), the presence of it is questionable
> > > > in the first place, so IMO the approach here makes sense.
> > >
> > > This part seems reasonable to me, as long as we have good confidence
> > > that requiring "HotPlugSupportInD3" + _PRW + _S0W where we used to
> > > require only "HotPlugSupportInD3" is unlikely to break anything.
> > >
> > > I can't judge that, but I assume you know that we don't use the
> > > acpi_pci_bridge_d3() result unless _PRW and _S0W exist, so I'll take
> > > your word for it :)
> >
> > Actually, that is an extremely good point I didn't think about.
> >
> > Thinking about it now, one thing is missing: a check if _S0W is
> > present, because the lack of it means "any power state should be
> > fine".
> >
> > With this check in place we would only avoid taking
> > "HotPlugSupportInD3" into account if it were not consistent with the
> > other settings and if that didn't work, we would end up in the quirks
> > territory this way or another.
>
> In that case something like this instead?

Works for me.

> +       if (!adev->wakeup.flags.valid)
> +               return false;
> +
> +       if (ACPI_SUCCESS(acpi_evaluate_integer(adev->handle, "_S0W", NULL, &ret)))
> +               return ret >= ACPI_STATE_D3_HOT;
> +
> +       return true;
diff mbox series

Patch

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index a42dbf448860..e535dab2c888 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -977,6 +977,7 @@  bool acpi_pci_bridge_d3(struct pci_dev *dev)
 	const union acpi_object *obj;
 	struct acpi_device *adev;
 	struct pci_dev *rpdev;
+	unsigned long long ret;
 
 	if (acpi_pci_disabled || !dev->is_hotplug_bridge)
 		return false;
@@ -1003,7 +1004,21 @@  bool acpi_pci_bridge_d3(struct pci_dev *dev)
 				   ACPI_TYPE_INTEGER, &obj) < 0)
 		return false;
 
-	return obj->integer.value == 1;
+	if (!obj->integer.value)
+		return false;
+
+	/*
+	 * If 'HotPlugSupportInD3' is set, but wakeup is not actually supported,
+	 * the former cannot be trusted anyway, so validate it by verifying the
+	 * latter.
+	 */
+	if (!adev->wakeup.flags.valid)
+		return false;
+
+	if (ACPI_FAILURE(acpi_evaluate_integer(adev->handle, "_S0W", NULL, &ret)))
+		return false;
+
+	return ret >= ACPI_STATE_D3_HOT;
 }
 
 int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)