diff mbox series

[v4] PCI: Call _REG when transitioning D-states

Message ID 20230620140451.21007-1-mario.limonciello@amd.com (mailing list archive)
State Handled Elsewhere, archived
Headers show
Series [v4] PCI: Call _REG when transitioning D-states | expand

Commit Message

Limonciello, Mario June 20, 2023, 2:04 p.m. UTC
Section 6.5.4 of the ACPI 6.4 spec describes how AML is unable to access
an OperationRegion unless `_REG` has been called.

"The OS runs _REG control methods to inform AML code of a change in the
availability of an operation region. When an operation region handler
is unavailable, AML cannot access data fields in that region.
(Operation region writes will be ignored and reads will return
indeterminate data.)"

The PCI core does not call `_REG` at anytime, leading to the undefined
behavior mentioned in the spec.

The spec explains that _REG should be executed to indicate whether a
given region can be accessed.

"Once _REG has been executed for a particular operation region, indicating
that the operation region handler is ready, a control method can
access fields in the operation region. Conversely, control methods
must not access fields in operation regions when _REG method execution
has not indicated that the operation region handler is ready."

An example included in the spec demonstrates calling _REG when devices are
turned off: "when the host controller or bridge controller is turned off
or disabled, PCI Config Space Operation Regions for child devices are
no longer available. As such, ETH0’s _REG method will be run when it
is turned off and will again be run when PCI1 is turned off.".

It is reported that ASMedia PCIe GPIO controllers fail functional tests
after the system has returning from suspend (S3 or s2idle). This is
because the BIOS checks whether the OSPM has called the `_REG` method
to determine whether it can interact with the OperationRegion assigned
to the device as part of the other AML called for the device.

To fix this issue, call acpi_evaluate_reg() when devices are
transitioning to D3cold or D0.

Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
v3->v4:
 * Only enable config space access for D0 state instead of D1/D2 states.
 * Only disallow config space access for D3cold state.
v2->v3:
 * Move call site
 * Rename static function to better describe behavior
 * call _REG with disconnect /before/ going into D3
 * call _REG with connect /after/ going into D0/D1/D2
 * Update commit message
v1->v2:
 * Handle case of no CONFIG_ACPI
 * Rename function
 * Update commit message
 * Move ACPI calling code into pci-acpi.c instead
 * Cite the ACPI spec
---
 drivers/pci/pci-acpi.c | 44 ++++++++++++++++++++++++++++--------------
 1 file changed, 29 insertions(+), 15 deletions(-)

Comments

Rafael J. Wysocki June 21, 2023, 11:25 a.m. UTC | #1
On Tue, Jun 20, 2023 at 4:23 PM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> Section 6.5.4 of the ACPI 6.4 spec describes how AML is unable to access
> an OperationRegion unless `_REG` has been called.
>
> "The OS runs _REG control methods to inform AML code of a change in the
> availability of an operation region. When an operation region handler
> is unavailable, AML cannot access data fields in that region.
> (Operation region writes will be ignored and reads will return
> indeterminate data.)"
>
> The PCI core does not call `_REG` at anytime, leading to the undefined
> behavior mentioned in the spec.
>
> The spec explains that _REG should be executed to indicate whether a
> given region can be accessed.
>
> "Once _REG has been executed for a particular operation region, indicating
> that the operation region handler is ready, a control method can
> access fields in the operation region. Conversely, control methods
> must not access fields in operation regions when _REG method execution
> has not indicated that the operation region handler is ready."
>
> An example included in the spec demonstrates calling _REG when devices are
> turned off: "when the host controller or bridge controller is turned off
> or disabled, PCI Config Space Operation Regions for child devices are
> no longer available. As such, ETH0’s _REG method will be run when it
> is turned off and will again be run when PCI1 is turned off.".
>
> It is reported that ASMedia PCIe GPIO controllers fail functional tests
> after the system has returning from suspend (S3 or s2idle). This is
> because the BIOS checks whether the OSPM has called the `_REG` method
> to determine whether it can interact with the OperationRegion assigned
> to the device as part of the other AML called for the device.
>
> To fix this issue, call acpi_evaluate_reg() when devices are
> transitioning to D3cold or D0.
>
> Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> v3->v4:
>  * Only enable config space access for D0 state instead of D1/D2 states.
>  * Only disallow config space access for D3cold state.

I would ACPI to the patch subject, like "PCI: ACPI: Call ...", but
that can be done when applying the patch, and it is the only comment I
have to this version, so

Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>

> v2->v3:
>  * Move call site
>  * Rename static function to better describe behavior
>  * call _REG with disconnect /before/ going into D3
>  * call _REG with connect /after/ going into D0/D1/D2
>  * Update commit message
> v1->v2:
>  * Handle case of no CONFIG_ACPI
>  * Rename function
>  * Update commit message
>  * Move ACPI calling code into pci-acpi.c instead
>  * Cite the ACPI spec
> ---
>  drivers/pci/pci-acpi.c | 44 ++++++++++++++++++++++++++++--------------
>  1 file changed, 29 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index 052a611081ec..182cac535250 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -1043,6 +1043,16 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>         return false;
>  }
>
> +static void acpi_pci_config_space_access(struct pci_dev *dev, bool enable)
> +{
> +       int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT;
> +       int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev),
> +                                   ACPI_ADR_SPACE_PCI_CONFIG, val);
> +       if (ret)
> +               pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n",
> +                       enable ? "connect" : "disconnect", ret);
> +}
> +
>  int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>  {
>         struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
> @@ -1053,32 +1063,36 @@ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>                 [PCI_D3hot] = ACPI_STATE_D3_HOT,
>                 [PCI_D3cold] = ACPI_STATE_D3_COLD,
>         };
> -       int error = -EINVAL;
> +       int ret;
>
>         /* If the ACPI device has _EJ0, ignore the device */
>         if (!adev || acpi_has_method(adev->handle, "_EJ0"))
>                 return -ENODEV;
>
>         switch (state) {
> +       case PCI_POWER_ERROR:
> +       case PCI_UNKNOWN:
> +               return -EINVAL;
>         case PCI_D3cold:
>                 if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
> -                               PM_QOS_FLAGS_ALL) {
> -                       error = -EBUSY;
> -                       break;
> -               }
> -               fallthrough;
> -       case PCI_D0:
> -       case PCI_D1:
> -       case PCI_D2:
> -       case PCI_D3hot:
> -               error = acpi_device_set_power(adev, state_conv[state]);
> +                                    PM_QOS_FLAGS_ALL)
> +                       return -EBUSY;
> +               /* Notify AML lack of PCI config space availability */
> +               acpi_pci_config_space_access(dev, false);
> +               break;
>         }
>
> -       if (!error)
> -               pci_dbg(dev, "power state changed by ACPI to %s\n",
> -                       acpi_power_state_string(adev->power.state));
> +       ret = acpi_device_set_power(adev, state_conv[state]);
> +       if (ret)
> +               return ret;
> +       pci_dbg(dev, "power state changed by ACPI to %s\n",
> +               acpi_power_state_string(adev->power.state));
>
> -       return error;
> +       /* Notify AML of PCI config space availability */
> +       if (state == PCI_D0)
> +               acpi_pci_config_space_access(dev, true);
> +
> +       return 0;
>  }
>
>  pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
> --
> 2.34.1
>
Bjorn Helgaas June 21, 2023, 10:28 p.m. UTC | #2
On Tue, Jun 20, 2023 at 09:04:51AM -0500, Mario Limonciello wrote:
> Section 6.5.4 of the ACPI 6.4 spec describes how AML is unable to access
> an OperationRegion unless `_REG` has been called.
> 
> "The OS runs _REG control methods to inform AML code of a change in the
> availability of an operation region. When an operation region handler
> is unavailable, AML cannot access data fields in that region.
> (Operation region writes will be ignored and reads will return
> indeterminate data.)"
> 
> The PCI core does not call `_REG` at anytime, leading to the undefined
> behavior mentioned in the spec.

I got lost in the maze of users of ACPI_ADR_SPACE_PCI_CONFIG, but is
it really true that we never call _REG for PCI config space at all
today?

If so, I guess AML that uses ACPI_ADR_SPACE_PCI_CONFIG won't work
until after we set the relevant device to D0?

Do we explicitly set devices to D0 during enumeration, e.g., somewhere
in the pci_scan_device() path?  If not, should we?

If we don't set things to D0 during enumeration, it seems like this
AML won't work until we suspend and resume the device.

Separately, I propose a minor restructuring to avoid the need for
mentioning PCI_POWER_ERROR and PCI_UNKNOWN.  Checking for those means
we need to look at the definitions to be sure we cover all cases, and
it also doesn't solve the problem that a caller can pass undefined
pci_power_t values that would index outside the state_conv[].

Possible rework attached below.  I also like the fact that it makes
the _REG patch very simple and specific to _REG.

> The spec explains that _REG should be executed to indicate whether a
> given region can be accessed.
> 
> "Once _REG has been executed for a particular operation region, indicating
> that the operation region handler is ready, a control method can
> access fields in the operation region. Conversely, control methods
> must not access fields in operation regions when _REG method execution
> has not indicated that the operation region handler is ready."
> 
> An example included in the spec demonstrates calling _REG when devices are
> turned off: "when the host controller or bridge controller is turned off
> or disabled, PCI Config Space Operation Regions for child devices are
> no longer available. As such, ETH0’s _REG method will be run when it
> is turned off and will again be run when PCI1 is turned off.".
> 
> It is reported that ASMedia PCIe GPIO controllers fail functional tests
> after the system has returning from suspend (S3 or s2idle). This is
> because the BIOS checks whether the OSPM has called the `_REG` method
> to determine whether it can interact with the OperationRegion assigned
> to the device as part of the other AML called for the device.
> 
> To fix this issue, call acpi_evaluate_reg() when devices are
> transitioning to D3cold or D0.
> 
> Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index 052a611081ec..182cac535250 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -1043,6 +1043,16 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>  	return false;
>  }
>  
> +static void acpi_pci_config_space_access(struct pci_dev *dev, bool enable)
> +{
> +	int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT;
> +	int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev),
> +				    ACPI_ADR_SPACE_PCI_CONFIG, val);
> +	if (ret)
> +		pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n",
> +			enable ? "connect" : "disconnect", ret);
> +}
> +
>  int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>  {
>  	struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
> @@ -1053,32 +1063,36 @@ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>  		[PCI_D3hot] = ACPI_STATE_D3_HOT,
>  		[PCI_D3cold] = ACPI_STATE_D3_COLD,
>  	};
> -	int error = -EINVAL;
> +	int ret;
>  
>  	/* If the ACPI device has _EJ0, ignore the device */
>  	if (!adev || acpi_has_method(adev->handle, "_EJ0"))
>  		return -ENODEV;
>  
>  	switch (state) {
> +	case PCI_POWER_ERROR:
> +	case PCI_UNKNOWN:
> +		return -EINVAL;
>  	case PCI_D3cold:
>  		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
> -				PM_QOS_FLAGS_ALL) {
> -			error = -EBUSY;
> -			break;
> -		}
> -		fallthrough;
> -	case PCI_D0:
> -	case PCI_D1:
> -	case PCI_D2:
> -	case PCI_D3hot:
> -		error = acpi_device_set_power(adev, state_conv[state]);
> +				     PM_QOS_FLAGS_ALL)
> +			return -EBUSY;
> +		/* Notify AML lack of PCI config space availability */
> +		acpi_pci_config_space_access(dev, false);
> +		break;
>  	}
>  
> -	if (!error)
> -		pci_dbg(dev, "power state changed by ACPI to %s\n",
> -		        acpi_power_state_string(adev->power.state));
> +	ret = acpi_device_set_power(adev, state_conv[state]);
> +	if (ret)
> +		return ret;
> +	pci_dbg(dev, "power state changed by ACPI to %s\n",
> +		acpi_power_state_string(adev->power.state));
>  
> -	return error;
> +	/* Notify AML of PCI config space availability */
> +	if (state == PCI_D0)
> +		acpi_pci_config_space_access(dev, true);
> +
> +	return 0;
>  }
>  
>  pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)


commit 79d4fdf58711 ("PCI/PM: Validate acpi_pci_set_power_state() parameter")
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Jun 21 16:36:12 2023 -0500

    PCI/PM: Validate acpi_pci_set_power_state() parameter
    
    Previously acpi_pci_set_power_state() assumed the requested power state was
    valid (PCI_D0 ... PCI_D3cold).  If a caller supplied something else, we
    could index outside the state_conv[] array and pass junk to
    acpi_device_set_power().
    
    Validate the pci_power_t parameter and return -EINVAL if it's invalid.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 052a611081ec..bf545f719182 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -1053,32 +1053,37 @@ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
 		[PCI_D3hot] = ACPI_STATE_D3_HOT,
 		[PCI_D3cold] = ACPI_STATE_D3_COLD,
 	};
-	int error = -EINVAL;
+	int error;
 
 	/* If the ACPI device has _EJ0, ignore the device */
 	if (!adev || acpi_has_method(adev->handle, "_EJ0"))
 		return -ENODEV;
 
 	switch (state) {
-	case PCI_D3cold:
-		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
-				PM_QOS_FLAGS_ALL) {
-			error = -EBUSY;
-			break;
-		}
-		fallthrough;
 	case PCI_D0:
 	case PCI_D1:
 	case PCI_D2:
 	case PCI_D3hot:
-		error = acpi_device_set_power(adev, state_conv[state]);
+	case PCI_D3cold:
+		break;
+	default:
+		return -EINVAL;
 	}
 
-	if (!error)
-		pci_dbg(dev, "power state changed by ACPI to %s\n",
-		        acpi_power_state_string(adev->power.state));
+	if (state == PCI_D3cold) {
+		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
+				PM_QOS_FLAGS_ALL)
+			return -EBUSY;
+	}
 
-	return error;
+	error = acpi_device_set_power(adev, state_conv[state]);
+	if (error)
+		return error;
+
+	pci_dbg(dev, "power state changed by ACPI to %s\n",
+	        acpi_power_state_string(adev->power.state));
+
+	return 0;
 }
 
 pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
commit 746652bd0376 ("PCI/PM: Call _REG when transitioning D-states")
Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Tue Jun 20 09:04:51 2023 -0500

    PCI/PM: Call _REG when transitioning D-states
    
    ACPI r6.5, sec 6.5.4, describes how AML is unable to access an
    OperationRegion unless _REG has been called to connect a handler:
    
      The OS runs _REG control methods to inform AML code of a change in the
      availability of an operation region. When an operation region handler is
      unavailable, AML cannot access data fields in that region.  (Operation
      region writes will be ignored and reads will return indeterminate data.)
    
    The PCI core does not call _REG at any time, leading to the undefined
    behavior mentioned in the spec.
    
    The spec explains that _REG should be executed to indicate whether a
    given region can be accessed:
    
      Once _REG has been executed for a particular operation region, indicating
      that the operation region handler is ready, a control method can access
      fields in the operation region. Conversely, control methods must not
      access fields in operation regions when _REG method execution has not
      indicated that the operation region handler is ready.
    
    An example included in the spec demonstrates calling _REG when devices are
    turned off: "when the host controller or bridge controller is turned off
    or disabled, PCI Config Space Operation Regions for child devices are
    no longer available. As such, ETH0’s _REG method will be run when it
    is turned off and will again be run when PCI1 is turned off."
    
    It is reported that ASMedia PCIe GPIO controllers fail functional tests
    after the system has returning from suspend (S3 or s2idle). This is because
    the BIOS checks whether the OSPM has called the _REG method to determine
    whether it can interact with the OperationRegion assigned to the device as
    part of the other AML called for the device.
    
    To fix this issue, call acpi_evaluate_reg() when devices are transitioning
    to D3cold or D0.
    
    Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region
    Link: https://lore.kernel.org/r/20230620140451.21007-1-mario.limonciello@amd.com
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>


diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index bf545f719182..a05350a4e49c 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -1043,6 +1043,16 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
 	return false;
 }
 
+static void acpi_pci_config_space_access(struct pci_dev *dev, bool enable)
+{
+	int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT;
+	int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev),
+				    ACPI_ADR_SPACE_PCI_CONFIG, val);
+	if (ret)
+		pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n",
+			enable ? "connect" : "disconnect", ret);
+}
+
 int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
 {
 	struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
@@ -1074,6 +1084,9 @@ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
 		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
 				PM_QOS_FLAGS_ALL)
 			return -EBUSY;
+
+		/* Notify AML lack of PCI config space availability */
+		acpi_pci_config_space_access(dev, false);
 	}
 
 	error = acpi_device_set_power(adev, state_conv[state]);
@@ -1083,6 +1096,15 @@ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
 	pci_dbg(dev, "power state changed by ACPI to %s\n",
 	        acpi_power_state_string(adev->power.state));
 
+	/*
+	 * Notify AML of PCI config space availability.  Config space is
+	 * accessible in all states except D3cold; the only transitions
+	 * that change availability are transitions to D3cold and from
+	 * D3cold to D0.
+	 */
+	if (state == PCI_D0)
+		acpi_pci_config_space_access(dev, true);
+
 	return 0;
 }
Limonciello, Mario June 21, 2023, 10:52 p.m. UTC | #3
On 6/21/2023 5:28 PM, Bjorn Helgaas wrote:
> On Tue, Jun 20, 2023 at 09:04:51AM -0500, Mario Limonciello wrote:
>> Section 6.5.4 of the ACPI 6.4 spec describes how AML is unable to access
>> an OperationRegion unless `_REG` has been called.
>>
>> "The OS runs _REG control methods to inform AML code of a change in the
>> availability of an operation region. When an operation region handler
>> is unavailable, AML cannot access data fields in that region.
>> (Operation region writes will be ignored and reads will return
>> indeterminate data.)"
>>
>> The PCI core does not call `_REG` at anytime, leading to the undefined
>> behavior mentioned in the spec.
> I got lost in the maze of users of ACPI_ADR_SPACE_PCI_CONFIG, but is
> it really true that we never call _REG for PCI config space at all
> today?
I double checked a BIOS debug log which shows ACPI calls
to confirm and didn't see a single _REG call for any device
before this patch across a boot/suspend/resume cycle.
>
> If so, I guess AML that uses ACPI_ADR_SPACE_PCI_CONFIG won't work
> until after we set the relevant device to D0?
The particular problem that that exposed this issue doesn't
happen until suspend/resume time, but yes I think this should
be called when setting the device to D0.
>
> Do we explicitly set devices to D0 during enumeration, e.g., somewhere
> in the pci_scan_device() path?  If not, should we?
AFAICT it's happening for PCIe ports as part of:
pcie_portdrv_probe
->pcie_port_device_register
->->pci_enable_device
->->->pci_enable_device_flags
->->->->do_pci_enable_device
->->->->->pci_set_power_state(pci_dev, PCI_D0)
>
> If we don't set things to D0 during enumeration, it seems like this
> AML won't work until we suspend and resume the device.
>
> Separately, I propose a minor restructuring to avoid the need for
> mentioning PCI_POWER_ERROR and PCI_UNKNOWN.  Checking for those means
> we need to look at the definitions to be sure we cover all cases, and
> it also doesn't solve the problem that a caller can pass undefined
> pci_power_t values that would index outside the state_conv[].
>
> Possible rework attached below.  I also like the fact that it makes
> the _REG patch very simple and specific to _REG.

I like your rework as well.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
For the new patch.

>
>> The spec explains that _REG should be executed to indicate whether a
>> given region can be accessed.
>>
>> "Once _REG has been executed for a particular operation region, indicating
>> that the operation region handler is ready, a control method can
>> access fields in the operation region. Conversely, control methods
>> must not access fields in operation regions when _REG method execution
>> has not indicated that the operation region handler is ready."
>>
>> An example included in the spec demonstrates calling _REG when devices are
>> turned off: "when the host controller or bridge controller is turned off
>> or disabled, PCI Config Space Operation Regions for child devices are
>> no longer available. As such, ETH0’s _REG method will be run when it
>> is turned off and will again be run when PCI1 is turned off.".
>>
>> It is reported that ASMedia PCIe GPIO controllers fail functional tests
>> after the system has returning from suspend (S3 or s2idle). This is
>> because the BIOS checks whether the OSPM has called the `_REG` method
>> to determine whether it can interact with the OperationRegion assigned
>> to the device as part of the other AML called for the device.
>>
>> To fix this issue, call acpi_evaluate_reg() when devices are
>> transitioning to D3cold or D0.
>>
>> Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
>> index 052a611081ec..182cac535250 100644
>> --- a/drivers/pci/pci-acpi.c
>> +++ b/drivers/pci/pci-acpi.c
>> @@ -1043,6 +1043,16 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>>   	return false;
>>   }
>>   
>> +static void acpi_pci_config_space_access(struct pci_dev *dev, bool enable)
>> +{
>> +	int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT;
>> +	int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev),
>> +				    ACPI_ADR_SPACE_PCI_CONFIG, val);
>> +	if (ret)
>> +		pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n",
>> +			enable ? "connect" : "disconnect", ret);
>> +}
>> +
>>   int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>>   {
>>   	struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
>> @@ -1053,32 +1063,36 @@ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>>   		[PCI_D3hot] = ACPI_STATE_D3_HOT,
>>   		[PCI_D3cold] = ACPI_STATE_D3_COLD,
>>   	};
>> -	int error = -EINVAL;
>> +	int ret;
>>   
>>   	/* If the ACPI device has _EJ0, ignore the device */
>>   	if (!adev || acpi_has_method(adev->handle, "_EJ0"))
>>   		return -ENODEV;
>>   
>>   	switch (state) {
>> +	case PCI_POWER_ERROR:
>> +	case PCI_UNKNOWN:
>> +		return -EINVAL;
>>   	case PCI_D3cold:
>>   		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
>> -				PM_QOS_FLAGS_ALL) {
>> -			error = -EBUSY;
>> -			break;
>> -		}
>> -		fallthrough;
>> -	case PCI_D0:
>> -	case PCI_D1:
>> -	case PCI_D2:
>> -	case PCI_D3hot:
>> -		error = acpi_device_set_power(adev, state_conv[state]);
>> +				     PM_QOS_FLAGS_ALL)
>> +			return -EBUSY;
>> +		/* Notify AML lack of PCI config space availability */
>> +		acpi_pci_config_space_access(dev, false);
>> +		break;
>>   	}
>>   
>> -	if (!error)
>> -		pci_dbg(dev, "power state changed by ACPI to %s\n",
>> -		        acpi_power_state_string(adev->power.state));
>> +	ret = acpi_device_set_power(adev, state_conv[state]);
>> +	if (ret)
>> +		return ret;
>> +	pci_dbg(dev, "power state changed by ACPI to %s\n",
>> +		acpi_power_state_string(adev->power.state));
>>   
>> -	return error;
>> +	/* Notify AML of PCI config space availability */
>> +	if (state == PCI_D0)
>> +		acpi_pci_config_space_access(dev, true);
>> +
>> +	return 0;
>>   }
>>   
>>   pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
>
> commit 79d4fdf58711 ("PCI/PM: Validate acpi_pci_set_power_state() parameter")
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Wed Jun 21 16:36:12 2023 -0500
>
>      PCI/PM: Validate acpi_pci_set_power_state() parameter
>      
>      Previously acpi_pci_set_power_state() assumed the requested power state was
>      valid (PCI_D0 ... PCI_D3cold).  If a caller supplied something else, we
>      could index outside the state_conv[] array and pass junk to
>      acpi_device_set_power().
>      
>      Validate the pci_power_t parameter and return -EINVAL if it's invalid.
>      
>      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
>
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index 052a611081ec..bf545f719182 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -1053,32 +1053,37 @@ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>   		[PCI_D3hot] = ACPI_STATE_D3_HOT,
>   		[PCI_D3cold] = ACPI_STATE_D3_COLD,
>   	};
> -	int error = -EINVAL;
> +	int error;
>   
>   	/* If the ACPI device has _EJ0, ignore the device */
>   	if (!adev || acpi_has_method(adev->handle, "_EJ0"))
>   		return -ENODEV;
>   
>   	switch (state) {
> -	case PCI_D3cold:
> -		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
> -				PM_QOS_FLAGS_ALL) {
> -			error = -EBUSY;
> -			break;
> -		}
> -		fallthrough;
>   	case PCI_D0:
>   	case PCI_D1:
>   	case PCI_D2:
>   	case PCI_D3hot:
> -		error = acpi_device_set_power(adev, state_conv[state]);
> +	case PCI_D3cold:
> +		break;
> +	default:
> +		return -EINVAL;
>   	}
>   
> -	if (!error)
> -		pci_dbg(dev, "power state changed by ACPI to %s\n",
> -		        acpi_power_state_string(adev->power.state));
> +	if (state == PCI_D3cold) {
> +		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
> +				PM_QOS_FLAGS_ALL)
> +			return -EBUSY;
> +	}
>   
> -	return error;
> +	error = acpi_device_set_power(adev, state_conv[state]);
> +	if (error)
> +		return error;
> +
> +	pci_dbg(dev, "power state changed by ACPI to %s\n",
> +	        acpi_power_state_string(adev->power.state));
> +
> +	return 0;
>   }
>   
>   pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
> commit 746652bd0376 ("PCI/PM: Call _REG when transitioning D-states")
> Author: Mario Limonciello <mario.limonciello@amd.com>
> Date:   Tue Jun 20 09:04:51 2023 -0500
>
>      PCI/PM: Call _REG when transitioning D-states
>      
>      ACPI r6.5, sec 6.5.4, describes how AML is unable to access an
>      OperationRegion unless _REG has been called to connect a handler:
>      
>        The OS runs _REG control methods to inform AML code of a change in the
>        availability of an operation region. When an operation region handler is
>        unavailable, AML cannot access data fields in that region.  (Operation
>        region writes will be ignored and reads will return indeterminate data.)
>      
>      The PCI core does not call _REG at any time, leading to the undefined
>      behavior mentioned in the spec.
>      
>      The spec explains that _REG should be executed to indicate whether a
>      given region can be accessed:
>      
>        Once _REG has been executed for a particular operation region, indicating
>        that the operation region handler is ready, a control method can access
>        fields in the operation region. Conversely, control methods must not
>        access fields in operation regions when _REG method execution has not
>        indicated that the operation region handler is ready.
>      
>      An example included in the spec demonstrates calling _REG when devices are
>      turned off: "when the host controller or bridge controller is turned off
>      or disabled, PCI Config Space Operation Regions for child devices are
>      no longer available. As such, ETH0’s _REG method will be run when it
>      is turned off and will again be run when PCI1 is turned off."
>      
>      It is reported that ASMedia PCIe GPIO controllers fail functional tests
>      after the system has returning from suspend (S3 or s2idle). This is because
>      the BIOS checks whether the OSPM has called the _REG method to determine
>      whether it can interact with the OperationRegion assigned to the device as
>      part of the other AML called for the device.
>      
>      To fix this issue, call acpi_evaluate_reg() when devices are transitioning
>      to D3cold or D0.
>      
>      Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region
>      Link: https://lore.kernel.org/r/20230620140451.21007-1-mario.limonciello@amd.com
>      Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>      Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
>
>
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index bf545f719182..a05350a4e49c 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -1043,6 +1043,16 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>   	return false;
>   }
>   
> +static void acpi_pci_config_space_access(struct pci_dev *dev, bool enable)
> +{
> +	int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT;
> +	int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev),
> +				    ACPI_ADR_SPACE_PCI_CONFIG, val);
> +	if (ret)
> +		pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n",
> +			enable ? "connect" : "disconnect", ret);
> +}
> +
>   int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>   {
>   	struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
> @@ -1074,6 +1084,9 @@ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>   		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
>   				PM_QOS_FLAGS_ALL)
>   			return -EBUSY;
> +
> +		/* Notify AML lack of PCI config space availability */
> +		acpi_pci_config_space_access(dev, false);
>   	}
>   
>   	error = acpi_device_set_power(adev, state_conv[state]);
> @@ -1083,6 +1096,15 @@ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>   	pci_dbg(dev, "power state changed by ACPI to %s\n",
>   	        acpi_power_state_string(adev->power.state));
>   
> +	/*
> +	 * Notify AML of PCI config space availability.  Config space is
> +	 * accessible in all states except D3cold; the only transitions
> +	 * that change availability are transitions to D3cold and from
> +	 * D3cold to D0.
> +	 */
> +	if (state == PCI_D0)
> +		acpi_pci_config_space_access(dev, true);
> +
>   	return 0;
>   }
>
Limonciello, Mario June 22, 2023, 5:43 p.m. UTC | #4
On 6/21/2023 5:52 PM, Limonciello, Mario wrote:
>
> On 6/21/2023 5:28 PM, Bjorn Helgaas wrote:
>> On Tue, Jun 20, 2023 at 09:04:51AM -0500, Mario Limonciello wrote:
>>> Section 6.5.4 of the ACPI 6.4 spec describes how AML is unable to 
>>> access
>>> an OperationRegion unless `_REG` has been called.
>>>
>>> "The OS runs _REG control methods to inform AML code of a change in the
>>> availability of an operation region. When an operation region handler
>>> is unavailable, AML cannot access data fields in that region.
>>> (Operation region writes will be ignored and reads will return
>>> indeterminate data.)"
>>>
>>> The PCI core does not call `_REG` at anytime, leading to the undefined
>>> behavior mentioned in the spec.
>> I got lost in the maze of users of ACPI_ADR_SPACE_PCI_CONFIG, but is
>> it really true that we never call _REG for PCI config space at all
>> today?
> I double checked a BIOS debug log which shows ACPI calls
> to confirm and didn't see a single _REG call for any device
> before this patch across a boot/suspend/resume cycle.
>>
>> If so, I guess AML that uses ACPI_ADR_SPACE_PCI_CONFIG won't work
>> until after we set the relevant device to D0?
> The particular problem that that exposed this issue doesn't
> happen until suspend/resume time, but yes I think this should
> be called when setting the device to D0.
>>
>> Do we explicitly set devices to D0 during enumeration, e.g., somewhere
>> in the pci_scan_device() path?  If not, should we?
> AFAICT it's happening for PCIe ports as part of:
> pcie_portdrv_probe
> ->pcie_port_device_register
> ->->pci_enable_device
> ->->->pci_enable_device_flags
> ->->->->do_pci_enable_device
> ->->->->->pci_set_power_state(pci_dev, PCI_D0)


Just to add to this; I double checked and one of the devices that doesn't
have a driver on my system has the power_state set to "unknown". I don't
think it would be appropriate to explicitly put "all devices without 
drivers"
into D0 as this could block low power states for the SOC.

On the Intel side, there is some special stuff in intel-pmc-core for example
that explicitly puts specific driverless devices into D3 to ensure low power
states.  If we put everything without a driver into D0 we may break 
stuff like
that.


>>
>> If we don't set things to D0 during enumeration, it seems like this
>> AML won't work until we suspend and resume the device.
>>
>> Separately, I propose a minor restructuring to avoid the need for
>> mentioning PCI_POWER_ERROR and PCI_UNKNOWN.  Checking for those means
>> we need to look at the definitions to be sure we cover all cases, and
>> it also doesn't solve the problem that a caller can pass undefined
>> pci_power_t values that would index outside the state_conv[].
>>
>> Possible rework attached below.  I also like the fact that it makes
>> the _REG patch very simple and specific to _REG.
>
> I like your rework as well.
>
> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
> For the new patch.
>
>>
>>> The spec explains that _REG should be executed to indicate whether a
>>> given region can be accessed.
>>>
>>> "Once _REG has been executed for a particular operation region, 
>>> indicating
>>> that the operation region handler is ready, a control method can
>>> access fields in the operation region. Conversely, control methods
>>> must not access fields in operation regions when _REG method execution
>>> has not indicated that the operation region handler is ready."
>>>
>>> An example included in the spec demonstrates calling _REG when 
>>> devices are
>>> turned off: "when the host controller or bridge controller is turned 
>>> off
>>> or disabled, PCI Config Space Operation Regions for child devices are
>>> no longer available. As such, ETH0’s _REG method will be run when it
>>> is turned off and will again be run when PCI1 is turned off.".
>>>
>>> It is reported that ASMedia PCIe GPIO controllers fail functional tests
>>> after the system has returning from suspend (S3 or s2idle). This is
>>> because the BIOS checks whether the OSPM has called the `_REG` method
>>> to determine whether it can interact with the OperationRegion assigned
>>> to the device as part of the other AML called for the device.
>>>
>>> To fix this issue, call acpi_evaluate_reg() when devices are
>>> transitioning to D3cold or D0.
>>>
>>> Link: 
>>> https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region
>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
>>> index 052a611081ec..182cac535250 100644
>>> --- a/drivers/pci/pci-acpi.c
>>> +++ b/drivers/pci/pci-acpi.c
>>> @@ -1043,6 +1043,16 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>>>       return false;
>>>   }
>>>   +static void acpi_pci_config_space_access(struct pci_dev *dev, 
>>> bool enable)
>>> +{
>>> +    int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT;
>>> +    int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev),
>>> +                    ACPI_ADR_SPACE_PCI_CONFIG, val);
>>> +    if (ret)
>>> +        pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n",
>>> +            enable ? "connect" : "disconnect", ret);
>>> +}
>>> +
>>>   int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>>>   {
>>>       struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
>>> @@ -1053,32 +1063,36 @@ int acpi_pci_set_power_state(struct pci_dev 
>>> *dev, pci_power_t state)
>>>           [PCI_D3hot] = ACPI_STATE_D3_HOT,
>>>           [PCI_D3cold] = ACPI_STATE_D3_COLD,
>>>       };
>>> -    int error = -EINVAL;
>>> +    int ret;
>>>         /* If the ACPI device has _EJ0, ignore the device */
>>>       if (!adev || acpi_has_method(adev->handle, "_EJ0"))
>>>           return -ENODEV;
>>>         switch (state) {
>>> +    case PCI_POWER_ERROR:
>>> +    case PCI_UNKNOWN:
>>> +        return -EINVAL;
>>>       case PCI_D3cold:
>>>           if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
>>> -                PM_QOS_FLAGS_ALL) {
>>> -            error = -EBUSY;
>>> -            break;
>>> -        }
>>> -        fallthrough;
>>> -    case PCI_D0:
>>> -    case PCI_D1:
>>> -    case PCI_D2:
>>> -    case PCI_D3hot:
>>> -        error = acpi_device_set_power(adev, state_conv[state]);
>>> +                     PM_QOS_FLAGS_ALL)
>>> +            return -EBUSY;
>>> +        /* Notify AML lack of PCI config space availability */
>>> +        acpi_pci_config_space_access(dev, false);
>>> +        break;
>>>       }
>>>   -    if (!error)
>>> -        pci_dbg(dev, "power state changed by ACPI to %s\n",
>>> - acpi_power_state_string(adev->power.state));
>>> +    ret = acpi_device_set_power(adev, state_conv[state]);
>>> +    if (ret)
>>> +        return ret;
>>> +    pci_dbg(dev, "power state changed by ACPI to %s\n",
>>> +        acpi_power_state_string(adev->power.state));
>>>   -    return error;
>>> +    /* Notify AML of PCI config space availability */
>>> +    if (state == PCI_D0)
>>> +        acpi_pci_config_space_access(dev, true);
>>> +
>>> +    return 0;
>>>   }
>>>     pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
>>
>> commit 79d4fdf58711 ("PCI/PM: Validate acpi_pci_set_power_state() 
>> parameter")
>> Author: Bjorn Helgaas <bhelgaas@google.com>
>> Date:   Wed Jun 21 16:36:12 2023 -0500
>>
>>      PCI/PM: Validate acpi_pci_set_power_state() parameter
>>           Previously acpi_pci_set_power_state() assumed the requested 
>> power state was
>>      valid (PCI_D0 ... PCI_D3cold).  If a caller supplied something 
>> else, we
>>      could index outside the state_conv[] array and pass junk to
>>      acpi_device_set_power().
>>           Validate the pci_power_t parameter and return -EINVAL if 
>> it's invalid.
>>           Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>
>>
>> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
>> index 052a611081ec..bf545f719182 100644
>> --- a/drivers/pci/pci-acpi.c
>> +++ b/drivers/pci/pci-acpi.c
>> @@ -1053,32 +1053,37 @@ int acpi_pci_set_power_state(struct pci_dev 
>> *dev, pci_power_t state)
>>           [PCI_D3hot] = ACPI_STATE_D3_HOT,
>>           [PCI_D3cold] = ACPI_STATE_D3_COLD,
>>       };
>> -    int error = -EINVAL;
>> +    int error;
>>         /* If the ACPI device has _EJ0, ignore the device */
>>       if (!adev || acpi_has_method(adev->handle, "_EJ0"))
>>           return -ENODEV;
>>         switch (state) {
>> -    case PCI_D3cold:
>> -        if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
>> -                PM_QOS_FLAGS_ALL) {
>> -            error = -EBUSY;
>> -            break;
>> -        }
>> -        fallthrough;
>>       case PCI_D0:
>>       case PCI_D1:
>>       case PCI_D2:
>>       case PCI_D3hot:
>> -        error = acpi_device_set_power(adev, state_conv[state]);
>> +    case PCI_D3cold:
>> +        break;
>> +    default:
>> +        return -EINVAL;
>>       }
>>   -    if (!error)
>> -        pci_dbg(dev, "power state changed by ACPI to %s\n",
>> -                acpi_power_state_string(adev->power.state));
>> +    if (state == PCI_D3cold) {
>> +        if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
>> +                PM_QOS_FLAGS_ALL)
>> +            return -EBUSY;
>> +    }
>>   -    return error;
>> +    error = acpi_device_set_power(adev, state_conv[state]);
>> +    if (error)
>> +        return error;
>> +
>> +    pci_dbg(dev, "power state changed by ACPI to %s\n",
>> +            acpi_power_state_string(adev->power.state));
>> +
>> +    return 0;
>>   }
>>     pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
>> commit 746652bd0376 ("PCI/PM: Call _REG when transitioning D-states")
>> Author: Mario Limonciello <mario.limonciello@amd.com>
>> Date:   Tue Jun 20 09:04:51 2023 -0500
>>
>>      PCI/PM: Call _REG when transitioning D-states
>>           ACPI r6.5, sec 6.5.4, describes how AML is unable to access an
>>      OperationRegion unless _REG has been called to connect a handler:
>>             The OS runs _REG control methods to inform AML code of a 
>> change in the
>>        availability of an operation region. When an operation region 
>> handler is
>>        unavailable, AML cannot access data fields in that region.  
>> (Operation
>>        region writes will be ignored and reads will return 
>> indeterminate data.)
>>           The PCI core does not call _REG at any time, leading to the 
>> undefined
>>      behavior mentioned in the spec.
>>           The spec explains that _REG should be executed to indicate 
>> whether a
>>      given region can be accessed:
>>             Once _REG has been executed for a particular operation 
>> region, indicating
>>        that the operation region handler is ready, a control method 
>> can access
>>        fields in the operation region. Conversely, control methods 
>> must not
>>        access fields in operation regions when _REG method execution 
>> has not
>>        indicated that the operation region handler is ready.
>>           An example included in the spec demonstrates calling _REG 
>> when devices are
>>      turned off: "when the host controller or bridge controller is 
>> turned off
>>      or disabled, PCI Config Space Operation Regions for child 
>> devices are
>>      no longer available. As such, ETH0’s _REG method will be run 
>> when it
>>      is turned off and will again be run when PCI1 is turned off."
>>           It is reported that ASMedia PCIe GPIO controllers fail 
>> functional tests
>>      after the system has returning from suspend (S3 or s2idle). This 
>> is because
>>      the BIOS checks whether the OSPM has called the _REG method to 
>> determine
>>      whether it can interact with the OperationRegion assigned to the 
>> device as
>>      part of the other AML called for the device.
>>           To fix this issue, call acpi_evaluate_reg() when devices 
>> are transitioning
>>      to D3cold or D0.
>>           Link: 
>> https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region
>>      Link: 
>> https://lore.kernel.org/r/20230620140451.21007-1-mario.limonciello@amd.com
>>      Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>      Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
>>
>>
>> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
>> index bf545f719182..a05350a4e49c 100644
>> --- a/drivers/pci/pci-acpi.c
>> +++ b/drivers/pci/pci-acpi.c
>> @@ -1043,6 +1043,16 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>>       return false;
>>   }
>>   +static void acpi_pci_config_space_access(struct pci_dev *dev, bool 
>> enable)
>> +{
>> +    int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT;
>> +    int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev),
>> +                    ACPI_ADR_SPACE_PCI_CONFIG, val);
>> +    if (ret)
>> +        pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n",
>> +            enable ? "connect" : "disconnect", ret);
>> +}
>> +
>>   int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>>   {
>>       struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
>> @@ -1074,6 +1084,9 @@ int acpi_pci_set_power_state(struct pci_dev 
>> *dev, pci_power_t state)
>>           if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
>>                   PM_QOS_FLAGS_ALL)
>>               return -EBUSY;
>> +
>> +        /* Notify AML lack of PCI config space availability */
>> +        acpi_pci_config_space_access(dev, false);
>>       }
>>         error = acpi_device_set_power(adev, state_conv[state]);
>> @@ -1083,6 +1096,15 @@ int acpi_pci_set_power_state(struct pci_dev 
>> *dev, pci_power_t state)
>>       pci_dbg(dev, "power state changed by ACPI to %s\n",
>>               acpi_power_state_string(adev->power.state));
>>   +    /*
>> +     * Notify AML of PCI config space availability.  Config space is
>> +     * accessible in all states except D3cold; the only transitions
>> +     * that change availability are transitions to D3cold and from
>> +     * D3cold to D0.
>> +     */
>> +    if (state == PCI_D0)
>> +        acpi_pci_config_space_access(dev, true);
>> +
>>       return 0;
>>   }
Bjorn Helgaas June 22, 2023, 10:08 p.m. UTC | #5
On Thu, Jun 22, 2023 at 12:43:56PM -0500, Limonciello, Mario wrote:
> On 6/21/2023 5:52 PM, Limonciello, Mario wrote:
> > On 6/21/2023 5:28 PM, Bjorn Helgaas wrote:
> > > On Tue, Jun 20, 2023 at 09:04:51AM -0500, Mario Limonciello wrote:
> > > > Section 6.5.4 of the ACPI 6.4 spec describes how AML is unable
> > > > to access an OperationRegion unless `_REG` has been called.

> > > If so, I guess AML that uses ACPI_ADR_SPACE_PCI_CONFIG won't work
> > > until after we set the relevant device to D0?
> >
> > The particular problem that that exposed this issue doesn't
> > happen until suspend/resume time, but yes I think this should
> > be called when setting the device to D0.

> > > Do we explicitly set devices to D0 during enumeration, e.g., somewhere
> > > in the pci_scan_device() path?  If not, should we?
> >
> > AFAICT it's happening for PCIe ports as part of:
> > pcie_portdrv_probe
> > ->pcie_port_device_register
> > ->->pci_enable_device
> > ->->->pci_enable_device_flags
> > ->->->->do_pci_enable_device
> > ->->->->->pci_set_power_state(pci_dev, PCI_D0)
> 
> Just to add to this; I double checked and one of the devices that
> doesn't have a driver on my system has the power_state set to
> "unknown". I don't think it would be appropriate to explicitly put
> "all devices without drivers" into D0 as this could block low power
> states for the SOC.

I think we probably *should* be evaluating _REG for bridges and
endpoints during enumeration, but I don't think we should depend on
pci_enable_device() to do it.

If a method uses the PCI config OpRegion, I think it should work at
any time, but with this patch it looks like it will only work after a
transition to D0.

> On the Intel side, there is some special stuff in intel-pmc-core for example
> that explicitly puts specific driverless devices into D3 to ensure low power
> states.  If we put everything without a driver into D0 we may break stuff
> like
> that.
Bjorn Helgaas June 22, 2023, 10:10 p.m. UTC | #6
On Wed, Jun 21, 2023 at 05:52:52PM -0500, Limonciello, Mario wrote:
> 
> On 6/21/2023 5:28 PM, Bjorn Helgaas wrote:
> > On Tue, Jun 20, 2023 at 09:04:51AM -0500, Mario Limonciello wrote:
> > > Section 6.5.4 of the ACPI 6.4 spec describes how AML is unable to access
> > > an OperationRegion unless `_REG` has been called.
> > > 
> > > "The OS runs _REG control methods to inform AML code of a change in the
> > > availability of an operation region. When an operation region handler
> > > is unavailable, AML cannot access data fields in that region.
> > > (Operation region writes will be ignored and reads will return
> > > indeterminate data.)"
> > > 
> > > The PCI core does not call `_REG` at anytime, leading to the undefined
> > > behavior mentioned in the spec.

> I double checked a BIOS debug log which shows ACPI calls
> to confirm and didn't see a single _REG call for any device
> before this patch across a boot/suspend/resume cycle.

OK, thanks, I didn't see one either, which surprised me.

Based on the weird exception in sec 6.5.4:

  Since the [PCI_Config operation region on a PCI root bus containing
  a _BBN object] is permanently available, no _REG methods are
  required, nor will OSPM evaluate any _REG methods that appear in the
  same scope as the operation region declaration(s) of this type.

it seems like when we add a PCI host bridge, we should evaluate any
_REG in the host bridge scope if it does not include _BBN.

The example:

  It should be noted that PCI Config Space Operation Regions are ready
  as soon the host controller or bridge controller has been programmed
  with a bus number. PCI1’s _REG method would not be run until the
  PCI-PCI bridge has been properly configured.

suggests that when we set a PCI-PCI bridge's secondary bus number, we
should evaluate any _REG in its scope.

  At the same time, the OS will also run ETH0’s _REG method since its
  PCI Config Space would be also available. The OS will again run
  ETH0’s _REG method when the ETH0 device is started.

So evidently we should evaluate ETH0._REG once after setting PCI1's
secondary bus number, and again when "ETH0 is started."  I have no
idea what "started" means.

> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
> For the new patch.

Thanks for taking a look at it!

I think we're missing some other _REG stuff, but it seems like what
this patch adds does match the per-endpoint power management requires,
so I applied these to pci/pm for v6.5.

Bjorn
Bjorn Helgaas June 23, 2023, 5:35 p.m. UTC | #7
On Wed, Jun 21, 2023 at 05:52:52PM -0500, Limonciello, Mario wrote:
> On 6/21/2023 5:28 PM, Bjorn Helgaas wrote:
> > On Tue, Jun 20, 2023 at 09:04:51AM -0500, Mario Limonciello wrote:
> > > Section 6.5.4 of the ACPI 6.4 spec describes how AML is unable to access
> > > an OperationRegion unless `_REG` has been called.
> > > ...

> > > It is reported that ASMedia PCIe GPIO controllers fail
> > > functional tests after the system has returning from suspend (S3
> > > or s2idle). This is because the BIOS checks whether the OSPM has
> > > called the `_REG` method to determine whether it can interact
> > > with the OperationRegion assigned to the device as part of the
> > > other AML called for the device.

> I double checked a BIOS debug log which shows ACPI calls
> to confirm and didn't see a single _REG call for any device
> before this patch across a boot/suspend/resume cycle.

Sorry to follow up on this again.

The commit log says these GPIO controllers fail functional tests after
returning from suspend.  Do those functional tests pass *before*
suspend?  If so, why?

Without this patch, we *never* call _REG, so the fact that calling
_REG when we return the device to D0 while resuming fixes something
suggests that it might have been broken even before the suspend.

Bjorn
Limonciello, Mario June 23, 2023, 6:08 p.m. UTC | #8
On 6/23/2023 12:35 PM, Bjorn Helgaas wrote:
> On Wed, Jun 21, 2023 at 05:52:52PM -0500, Limonciello, Mario wrote:
>> On 6/21/2023 5:28 PM, Bjorn Helgaas wrote:
>>> On Tue, Jun 20, 2023 at 09:04:51AM -0500, Mario Limonciello wrote:
>>>> Section 6.5.4 of the ACPI 6.4 spec describes how AML is unable to access
>>>> an OperationRegion unless `_REG` has been called.
>>>> ...
>>>> It is reported that ASMedia PCIe GPIO controllers fail
>>>> functional tests after the system has returning from suspend (S3
>>>> or s2idle). This is because the BIOS checks whether the OSPM has
>>>> called the `_REG` method to determine whether it can interact
>>>> with the OperationRegion assigned to the device as part of the
>>>> other AML called for the device.
>> I double checked a BIOS debug log which shows ACPI calls
>> to confirm and didn't see a single _REG call for any device
>> before this patch across a boot/suspend/resume cycle.
> Sorry to follow up on this again.
>
> The commit log says these GPIO controllers fail functional tests after
> returning from suspend.  Do those functional tests pass *before*
> suspend?  If so, why?
>
> Without this patch, we *never* call _REG, so the fact that calling
> _REG when we return the device to D0 while resuming fixes something
> suggests that it might have been broken even before the suspend.
>
> Bjorn
The reason it works is because the only time the AML attempts
to access the config space is part of suspend/resume.  If it
attempted to access config space as part of startup then I'd
expect the same failure at startup.
diff mbox series

Patch

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 052a611081ec..182cac535250 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -1043,6 +1043,16 @@  bool acpi_pci_bridge_d3(struct pci_dev *dev)
 	return false;
 }
 
+static void acpi_pci_config_space_access(struct pci_dev *dev, bool enable)
+{
+	int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT;
+	int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev),
+				    ACPI_ADR_SPACE_PCI_CONFIG, val);
+	if (ret)
+		pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n",
+			enable ? "connect" : "disconnect", ret);
+}
+
 int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
 {
 	struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
@@ -1053,32 +1063,36 @@  int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
 		[PCI_D3hot] = ACPI_STATE_D3_HOT,
 		[PCI_D3cold] = ACPI_STATE_D3_COLD,
 	};
-	int error = -EINVAL;
+	int ret;
 
 	/* If the ACPI device has _EJ0, ignore the device */
 	if (!adev || acpi_has_method(adev->handle, "_EJ0"))
 		return -ENODEV;
 
 	switch (state) {
+	case PCI_POWER_ERROR:
+	case PCI_UNKNOWN:
+		return -EINVAL;
 	case PCI_D3cold:
 		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
-				PM_QOS_FLAGS_ALL) {
-			error = -EBUSY;
-			break;
-		}
-		fallthrough;
-	case PCI_D0:
-	case PCI_D1:
-	case PCI_D2:
-	case PCI_D3hot:
-		error = acpi_device_set_power(adev, state_conv[state]);
+				     PM_QOS_FLAGS_ALL)
+			return -EBUSY;
+		/* Notify AML lack of PCI config space availability */
+		acpi_pci_config_space_access(dev, false);
+		break;
 	}
 
-	if (!error)
-		pci_dbg(dev, "power state changed by ACPI to %s\n",
-		        acpi_power_state_string(adev->power.state));
+	ret = acpi_device_set_power(adev, state_conv[state]);
+	if (ret)
+		return ret;
+	pci_dbg(dev, "power state changed by ACPI to %s\n",
+		acpi_power_state_string(adev->power.state));
 
-	return error;
+	/* Notify AML of PCI config space availability */
+	if (state == PCI_D0)
+		acpi_pci_config_space_access(dev, true);
+
+	return 0;
 }
 
 pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)