[RFC,v3,01/21] ACPI: Only enumerate enabled (or functional) devices

Message ID E1rDOfs-00DvjY-HQ@rmk-PC.armlinux.org.uk (mailing list archive)
State New, archived
Series ACPI/arm64: add support for virtual cpu hotplug

Commit Message

Russell King (Oracle) Dec. 13, 2023, 12:49 p.m. UTC
From: James Morse <james.morse@arm.com>

Today the ACPI enumeration code 'visits' all devices that are present.

This is a problem for arm64, where CPUs are always present, but not
always enabled. When a device-check occurs because the firmware-policy
has changed and a CPU is now enabled, the following error occurs:
| acpi ACPI0007:48: Enumeration failure

This is ultimately because acpi_dev_ready_for_enumeration() returns
true for a device that is not enabled. The ACPI Processor driver
will not register such CPUs as they are not 'decoding their resources'.

Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
ACPI allows a device to be functional instead of maintaining the
present and enabled bit. Make this behaviour an explicit check with
a reference to the spec, and then check the present and enabled bits.
This is needed to avoid enumerating present && functional devices that
are not enabled.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
If this change causes problems on deployed hardware, I suggest an
arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
acpi_dev_ready_for_enumeration() to only check the present bit.

Changes since RFC v2:
 * Incorporate comment suggestion by Gavin Shan.
Other review comments from Jonathan Cameron not yet addressed.
---
 drivers/acpi/device_pm.c    |  2 +-
 drivers/acpi/device_sysfs.c |  2 +-
 drivers/acpi/internal.h     |  1 -
 drivers/acpi/property.c     |  2 +-
 drivers/acpi/scan.c         | 24 ++++++++++++++----------
 5 files changed, 17 insertions(+), 14 deletions(-)

Comments

Jonathan Cameron Dec. 14, 2023, 5:32 p.m. UTC | #1
On Wed, 13 Dec 2023 12:49:16 +0000
Russell King (Oracle) <rmk+kernel@armlinux.org.uk> wrote:

> From: James Morse <james.morse@arm.com>
> 
> Today the ACPI enumeration code 'visits' all devices that are present.
> 
> This is a problem for arm64, where CPUs are always present, but not
> always enabled. When a device-check occurs because the firmware-policy
> has changed and a CPU is now enabled, the following error occurs:
> | acpi ACPI0007:48: Enumeration failure
> 
> This is ultimately because acpi_dev_ready_for_enumeration() returns
> true for a device that is not enabled. The ACPI Processor driver
> will not register such CPUs as they are not 'decoding their resources'.
> 
> Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> ACPI allows a device to be functional instead of maintaining the
> present and enabled bit. Make this behaviour an explicit check with
> a reference to the spec, and then check the present and enabled bits.
> This is needed to avoid enumerating present && functional devices that
> are not enabled.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
> If this change causes problems on deployed hardware, I suggest an
> arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> acpi_dev_ready_for_enumeration() to only check the present bit.

My gut feeling (having made ACPI 'fixes' in the past that ran into
horribly broken firmware and had to be reverted) is to reduce the blast
radius preemptively from the start. I'd love to live in a world where
that wasn't necessary, but I don't trust all the generators of ACPI tables.
I'll leave it to Rafael and other ACPI experts to suggest how narrow we should
make it though - arch opt-in might be narrow enough.

> 
> Changes since RFC v2:
>  * Incorporate comment suggestion by Gavin Shan.
> Other review comments from Jonathan Cameron not yet addressed.

Looking back, I think this was mainly a suggestion for a minor
possible optimization by ignoring the case of !present && enabled
when designing the logic because that's not allowed by the spec.

You made that change in v3.

Otherwise, comments were trivial comment clarifications that I'm not
that worried about.

One comment typo inline.

On the assumption that others will comment on when this change should be
chicken bit'd out:

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


> ---
>  drivers/acpi/device_pm.c    |  2 +-
>  drivers/acpi/device_sysfs.c |  2 +-
>  drivers/acpi/internal.h     |  1 -
>  drivers/acpi/property.c     |  2 +-
>  drivers/acpi/scan.c         | 24 ++++++++++++++----------
>  5 files changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> index 3b4d048c4941..e3c80f3b3b57 100644
> --- a/drivers/acpi/device_pm.c
> +++ b/drivers/acpi/device_pm.c
> @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
>  		return -EINVAL;
>  
>  	device->power.state = ACPI_STATE_UNKNOWN;
> -	if (!acpi_device_is_present(device)) {
> +	if (!acpi_dev_ready_for_enumeration(device)) {
>  		device->flags.initialized = false;
>  		return -ENXIO;
>  	}
> diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
> index 23373faa35ec..a0256d2493a7 100644
> --- a/drivers/acpi/device_sysfs.c
> +++ b/drivers/acpi/device_sysfs.c
> @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
>  	struct acpi_hardware_id *id;
>  
>  	/* Avoid unnecessarily loading modules for non present devices. */
> -	if (!acpi_device_is_present(acpi_dev))
> +	if (!acpi_dev_ready_for_enumeration(acpi_dev))
>  		return 0;
>  
>  	/*
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index 866c7c4ed233..a1b45e345bcc 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -107,7 +107,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
>  void acpi_device_remove_files(struct acpi_device *dev);
>  void acpi_device_add_finalize(struct acpi_device *device);
>  void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
> -bool acpi_device_is_present(const struct acpi_device *adev);
>  bool acpi_device_is_battery(struct acpi_device *adev);
>  bool acpi_device_is_first_physical_node(struct acpi_device *adev,
>  					const struct device *dev);
> diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> index 6979a3f9f90a..14d6948fd88a 100644
> --- a/drivers/acpi/property.c
> +++ b/drivers/acpi/property.c
> @@ -1420,7 +1420,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
>  	if (!is_acpi_device_node(fwnode))
>  		return false;
>  
> -	return acpi_device_is_present(to_acpi_device_node(fwnode));
> +	return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
>  }
>  
>  static const void *
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 02bb2cce423f..728649a2a251 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
>  	int error;
>  
>  	acpi_bus_get_status(adev);
> -	if (acpi_device_is_present(adev)) {
> +	if (acpi_dev_ready_for_enumeration(adev)) {
>  		/*
>  		 * This function is only called for device objects for which
>  		 * matching scan handlers exist.  The only situation in which
> @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
>  	int error;
>  
>  	acpi_bus_get_status(adev);
> -	if (!acpi_device_is_present(adev)) {
> +	if (!acpi_dev_ready_for_enumeration(adev)) {
>  		acpi_scan_device_not_enumerated(adev);
>  		return 0;
>  	}
> @@ -1913,11 +1913,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
>  	return true;
>  }
>  
> -bool acpi_device_is_present(const struct acpi_device *adev)
> -{
> -	return adev->status.present || adev->status.functional;
> -}
> -
>  static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>  				       const char *idstr,
>  				       const struct acpi_device_id **matchid)
> @@ -2381,16 +2376,25 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
>   * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
>   * @device: Pointer to the &struct acpi_device to check
>   *
> - * Check if the device is present and has no unmet dependencies.
> + * Check if the device is functional or enabled and has no unmet dependencies.
>   *
> - * Return true if the device is ready for enumeratino. Otherwise, return false.
> + * Return true if the device is ready for enumeration. Otherwise, return false.
>   */
>  bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
>  {
>  	if (device->flags.honor_deps && device->dep_unmet)
>  		return false;
>  
> -	return acpi_device_is_present(device);
> +	/*
> +	 * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> +	 * (!present && functional) for certain types of devices that should be
> +	 * enumerated. Note that the enabled bit can't be sert until the present

set until

> +	 * bit is set.
> +	 */
> +	if (device->status.present)
> +		return device->status.enabled;
> +	else
> +		return device->status.functional;
>  }
>  EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
>
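
For anyone following the bit logic outside the kernel tree, the change in
this hunk can be modelled in a few lines of standalone C. This is only a
sketch: ready_before() and ready_v3() are hypothetical names, the
STA_* macros mirror the _STA bit positions from the spec rather than any
kernel header, and the honor_deps/dep_unmet check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* _STA bit positions as laid out in the ACPI specification. */
#define STA_PRESENT    (1u << 0)
#define STA_ENABLED    (1u << 1)
#define STA_FUNCTIONAL (1u << 3)

static_assert(STA_FUNCTIONAL == 0x08u,
	      "functional is bit 3; bit 2 is the show-in-UI bit");

/* Hypothetical model of the check before this patch (present || functional). */
static bool ready_before(uint32_t sta)
{
	return (sta & (STA_PRESENT | STA_FUNCTIONAL)) != 0;
}

/* Hypothetical model of the check as posted in v3. */
static bool ready_v3(uint32_t sta)
{
	/* A present device is only enumerated once it is also enabled. */
	if (sta & STA_PRESENT)
		return (sta & STA_ENABLED) != 0;

	/* Not present: fall back to the functional bit. */
	return (sta & STA_FUNCTIONAL) != 0;
}
```

The arm64 case from the commit message is a _STA value with only the
present bit set: ready_before() accepts it, ready_v3() does not.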
Rafael J. Wysocki Dec. 14, 2023, 5:47 p.m. UTC | #2
On Thu, Dec 14, 2023 at 6:32 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Wed, 13 Dec 2023 12:49:16 +0000
> Russell King (Oracle) <rmk+kernel@armlinux.org.uk> wrote:
>
> > From: James Morse <james.morse@arm.com>
> >
> > Today the ACPI enumeration code 'visits' all devices that are present.
> >
> > This is a problem for arm64, where CPUs are always present, but not
> > always enabled. When a device-check occurs because the firmware-policy
> > has changed and a CPU is now enabled, the following error occurs:
> > | acpi ACPI0007:48: Enumeration failure
> >
> > This is ultimately because acpi_dev_ready_for_enumeration() returns
> > true for a device that is not enabled. The ACPI Processor driver
> > will not register such CPUs as they are not 'decoding their resources'.
> >
> > Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> > ACPI allows a device to be functional instead of maintaining the
> > present and enabled bit. Make this behaviour an explicit check with
> > a reference to the spec, and then check the present and enabled bits.
> > This is needed to avoid enumerating present && functional devices that
> > are not enabled.
> >
> > Signed-off-by: James Morse <james.morse@arm.com>
> > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > ---
> > If this change causes problems on deployed hardware, I suggest an
> > arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> > acpi_dev_ready_for_enumeration() to only check the present bit.
>
> My gut feeling (having made ACPI 'fixes' in the past that ran into
> horribly broken firmware and had to be reverted) is reduce the blast
> radius preemptively from the start. I'd love to live in a world were
> that wasn't necessary but I don't trust all the generators of ACPI tables.
> I'll leave it to Rafael and other ACPI experts suggest how narrow we should
> make it though - arch opt in might be narrow enough.

A chicken bit wouldn't help much IMO, especially in the cases when
working setups get broken.

I would very much prefer to limit the scope of it, say to processors
only, in the first place.
Russell King (Oracle) Dec. 14, 2023, 5:55 p.m. UTC | #3
On Thu, Dec 14, 2023 at 05:32:41PM +0000, Jonathan Cameron wrote:
> On Wed, 13 Dec 2023 12:49:16 +0000
> Russell King (Oracle) <rmk+kernel@armlinux.org.uk> wrote:
> 
> > From: James Morse <james.morse@arm.com>
> > 
> > Today the ACPI enumeration code 'visits' all devices that are present.
> > 
> > This is a problem for arm64, where CPUs are always present, but not
> > always enabled. When a device-check occurs because the firmware-policy
> > has changed and a CPU is now enabled, the following error occurs:
> > | acpi ACPI0007:48: Enumeration failure
> > 
> > This is ultimately because acpi_dev_ready_for_enumeration() returns
> > true for a device that is not enabled. The ACPI Processor driver
> > will not register such CPUs as they are not 'decoding their resources'.
> > 
> > Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> > ACPI allows a device to be functional instead of maintaining the
> > present and enabled bit. Make this behaviour an explicit check with
> > a reference to the spec, and then check the present and enabled bits.
> > This is needed to avoid enumerating present && functional devices that
> > are not enabled.
> > 
> > Signed-off-by: James Morse <james.morse@arm.com>
> > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > ---
> > If this change causes problems on deployed hardware, I suggest an
> > arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> > acpi_dev_ready_for_enumeration() to only check the present bit.
> 
> My gut feeling (having made ACPI 'fixes' in the past that ran into
> horribly broken firmware and had to be reverted) is reduce the blast
> radius preemptively from the start. I'd love to live in a world were
> that wasn't necessary but I don't trust all the generators of ACPI tables.
> I'll leave it to Rafael and other ACPI experts suggest how narrow we should
> make it though - arch opt in might be narrow enough.

Yes, I think an arch opt-in would be the most sensible way forward, if
Rafael concurs with that idea. I notice that what I wrote there was
actually an opt-out. I'll fix that.

> > +	/*
> > +	 * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> > +	 * (!present && functional) for certain types of devices that should be
> > +	 * enumerated. Note that the enabled bit can't be sert until the present
> 
> set until

Thanks for spotting that, fixed.
Russell King (Oracle) Dec. 14, 2023, 6:10 p.m. UTC | #4
On Thu, Dec 14, 2023 at 06:47:00PM +0100, Rafael J. Wysocki wrote:
> On Thu, Dec 14, 2023 at 6:32 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Wed, 13 Dec 2023 12:49:16 +0000
> > Russell King (Oracle) <rmk+kernel@armlinux.org.uk> wrote:
> >
> > > From: James Morse <james.morse@arm.com>
> > >
> > > Today the ACPI enumeration code 'visits' all devices that are present.
> > >
> > > This is a problem for arm64, where CPUs are always present, but not
> > > always enabled. When a device-check occurs because the firmware-policy
> > > has changed and a CPU is now enabled, the following error occurs:
> > > | acpi ACPI0007:48: Enumeration failure
> > >
> > > This is ultimately because acpi_dev_ready_for_enumeration() returns
> > > true for a device that is not enabled. The ACPI Processor driver
> > > will not register such CPUs as they are not 'decoding their resources'.
> > >
> > > Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> > > ACPI allows a device to be functional instead of maintaining the
> > > present and enabled bit. Make this behaviour an explicit check with
> > > a reference to the spec, and then check the present and enabled bits.
> > > This is needed to avoid enumerating present && functional devices that
> > > are not enabled.
> > >
> > > Signed-off-by: James Morse <james.morse@arm.com>
> > > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > ---
> > > If this change causes problems on deployed hardware, I suggest an
> > > arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> > > acpi_dev_ready_for_enumeration() to only check the present bit.
> >
> > My gut feeling (having made ACPI 'fixes' in the past that ran into
> > horribly broken firmware and had to be reverted) is reduce the blast
> > radius preemptively from the start. I'd love to live in a world were
> > that wasn't necessary but I don't trust all the generators of ACPI tables.
> > I'll leave it to Rafael and other ACPI experts suggest how narrow we should
> > make it though - arch opt in might be narrow enough.
> 
> A chicken bit wouldn't help much IMO, especially in the cases when
> working setups get broken.
> 
> I would very much prefer to limit the scope of it, say to processors
> only, in the first place.

Thanks for the feedback and the idea.

I guess we need something like:

	if (device->status.present)
		return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
		       device->status.enabled;
	else
		return device->status.functional;

so we only check device->status.enabled for processor-type devices?
Rafael J. Wysocki Dec. 14, 2023, 6:16 p.m. UTC | #5
On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Thu, Dec 14, 2023 at 06:47:00PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Dec 14, 2023 at 6:32 PM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:
> > >
> > > On Wed, 13 Dec 2023 12:49:16 +0000
> > > Russell King (Oracle) <rmk+kernel@armlinux.org.uk> wrote:
> > >
> > > > From: James Morse <james.morse@arm.com>
> > > >
> > > > Today the ACPI enumeration code 'visits' all devices that are present.
> > > >
> > > > This is a problem for arm64, where CPUs are always present, but not
> > > > always enabled. When a device-check occurs because the firmware-policy
> > > > has changed and a CPU is now enabled, the following error occurs:
> > > > | acpi ACPI0007:48: Enumeration failure
> > > >
> > > > This is ultimately because acpi_dev_ready_for_enumeration() returns
> > > > true for a device that is not enabled. The ACPI Processor driver
> > > > will not register such CPUs as they are not 'decoding their resources'.
> > > >
> > > > Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> > > > ACPI allows a device to be functional instead of maintaining the
> > > > present and enabled bit. Make this behaviour an explicit check with
> > > > a reference to the spec, and then check the present and enabled bits.
> > > > This is needed to avoid enumerating present && functional devices that
> > > > are not enabled.
> > > >
> > > > Signed-off-by: James Morse <james.morse@arm.com>
> > > > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > > > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > > > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > > ---
> > > > If this change causes problems on deployed hardware, I suggest an
> > > > arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> > > > acpi_dev_ready_for_enumeration() to only check the present bit.
> > >
> > > My gut feeling (having made ACPI 'fixes' in the past that ran into
> > > horribly broken firmware and had to be reverted) is reduce the blast
> > > radius preemptively from the start. I'd love to live in a world were
> > > that wasn't necessary but I don't trust all the generators of ACPI tables.
> > > I'll leave it to Rafael and other ACPI experts suggest how narrow we should
> > > make it though - arch opt in might be narrow enough.
> >
> > A chicken bit wouldn't help much IMO, especially in the cases when
> > working setups get broken.
> >
> > I would very much prefer to limit the scope of it, say to processors
> > only, in the first place.
>
> Thanks for the feedback and the idea.
>
> I guess we need something like:
>
>         if (device->status.present)
>                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
>                        device->status.enabled;
>         else
>                 return device->status.functional;
>
> so we only check device->status.enabled for processor-type devices?

Yes, something like this.
Rafael J. Wysocki Dec. 14, 2023, 6:37 p.m. UTC | #6
On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Thu, Dec 14, 2023 at 06:47:00PM +0100, Rafael J. Wysocki wrote:
> > > On Thu, Dec 14, 2023 at 6:32 PM Jonathan Cameron
> > > <Jonathan.Cameron@huawei.com> wrote:
> > > >
> > > > On Wed, 13 Dec 2023 12:49:16 +0000
> > > > Russell King (Oracle) <rmk+kernel@armlinux.org.uk> wrote:
> > > >
> > > > > From: James Morse <james.morse@arm.com>
> > > > >
> > > > > Today the ACPI enumeration code 'visits' all devices that are present.
> > > > >
> > > > > This is a problem for arm64, where CPUs are always present, but not
> > > > > always enabled. When a device-check occurs because the firmware-policy
> > > > > has changed and a CPU is now enabled, the following error occurs:
> > > > > | acpi ACPI0007:48: Enumeration failure
> > > > >
> > > > > This is ultimately because acpi_dev_ready_for_enumeration() returns
> > > > > true for a device that is not enabled. The ACPI Processor driver
> > > > > will not register such CPUs as they are not 'decoding their resources'.
> > > > >
> > > > > Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> > > > > ACPI allows a device to be functional instead of maintaining the
> > > > > present and enabled bit. Make this behaviour an explicit check with
> > > > > a reference to the spec, and then check the present and enabled bits.
> > > > > This is needed to avoid enumerating present && functional devices that
> > > > > are not enabled.
> > > > >
> > > > > Signed-off-by: James Morse <james.morse@arm.com>
> > > > > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > > > > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > > > > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > > > ---
> > > > > If this change causes problems on deployed hardware, I suggest an
> > > > > arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> > > > > acpi_dev_ready_for_enumeration() to only check the present bit.
> > > >
> > > > My gut feeling (having made ACPI 'fixes' in the past that ran into
> > > > horribly broken firmware and had to be reverted) is reduce the blast
> > > > radius preemptively from the start. I'd love to live in a world were
> > > > that wasn't necessary but I don't trust all the generators of ACPI tables.
> > > > I'll leave it to Rafael and other ACPI experts suggest how narrow we should
> > > > make it though - arch opt in might be narrow enough.
> > >
> > > A chicken bit wouldn't help much IMO, especially in the cases when
> > > working setups get broken.
> > >
> > > I would very much prefer to limit the scope of it, say to processors
> > > only, in the first place.
> >
> > Thanks for the feedback and the idea.
> >
> > I guess we need something like:
> >
> >         if (device->status.present)
> >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> >                        device->status.enabled;
> >         else
> >                 return device->status.functional;
> >
> > so we only check device->status.enabled for processor-type devices?
>
> Yes, something like this.

However, that is not sufficient, because there are
ACPI_BUS_TYPE_DEVICE devices representing processors.

I'm not sure about a clean way to do it ATM.
Russell King (Oracle) Dec. 15, 2023, 3:31 p.m. UTC | #7
On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:
> On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > <linux@armlinux.org.uk> wrote:
> > > I guess we need something like:
> > >
> > >         if (device->status.present)
> > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > >                        device->status.enabled;
> > >         else
> > >                 return device->status.functional;
> > >
> > > so we only check device->status.enabled for processor-type devices?
> >
> > Yes, something like this.
> 
> However, that is not sufficient, because there are
> ACPI_BUS_TYPE_DEVICE devices representing processors.
> 
> I'm not sure about a clean way to do it ATM.

Ok, how about:

static bool acpi_dev_is_processor(const struct acpi_device *device)
{
	struct acpi_hardware_id *hwid;

	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
		return true;

	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
		return false;

	list_for_each_entry(hwid, &device->pnp.ids, list)
		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
			return true;

	return false;
}

and then:

	if (device->status.present)
		return !acpi_dev_is_processor(device) || device->status.enabled;
	else
		return device->status.functional;

?
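
To make it easier to poke at the proposal above without a kernel build,
here is a userspace model of it. It is a sketch under assumptions:
struct fake_dev, enum bus_type and the function names are hypothetical
stand-ins, the ID list is a NULL-terminated array instead of pnp.ids,
and the two HID strings are the values I believe the kernel macros
expand to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed values of ACPI_PROCESSOR_OBJECT_HID / ACPI_PROCESSOR_DEVICE_HID. */
#define PROCESSOR_OBJECT_HID "LNXCPU"
#define PROCESSOR_DEVICE_HID "ACPI0007"

/* Hypothetical stand-in for the kernel's bus type enumeration. */
enum bus_type { BUS_TYPE_DEVICE, BUS_TYPE_PROCESSOR, BUS_TYPE_THERMAL };

/* Hypothetical stand-in for the parts of struct acpi_device used here. */
struct fake_dev {
	enum bus_type type;
	const char *const *ids;	/* NULL-terminated list of hardware IDs */
	bool present, enabled, functional;
};

static bool dev_is_processor(const struct fake_dev *d)
{
	assert(d != NULL);

	if (d->type == BUS_TYPE_PROCESSOR)
		return true;

	if (d->type != BUS_TYPE_DEVICE)
		return false;

	/* Device objects count as processors if any ID matches. */
	for (const char *const *id = d->ids; id && *id; id++)
		if (!strcmp(*id, PROCESSOR_OBJECT_HID) ||
		    !strcmp(*id, PROCESSOR_DEVICE_HID))
			return true;

	return false;
}

/* Only processors have their enabled bit checked. */
static bool ready(const struct fake_dev *d)
{
	if (d->present)
		return !dev_is_processor(d) || d->enabled;

	return d->functional;
}
```

With this model a present but not enabled "ACPI0007" device is held
back, while a present but not enabled device with any other ID is still
enumerated as before.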
Jonathan Cameron Dec. 15, 2023, 4:15 p.m. UTC | #8
On Fri, 15 Dec 2023 15:31:55 +0000
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:  
> > >
> > > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > > <linux@armlinux.org.uk> wrote:  
> > > > I guess we need something like:
> > > >
> > > >         if (device->status.present)
> > > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > > >                        device->status.enabled;
> > > >         else
> > > >                 return device->status.functional;
> > > >
> > > > so we only check device->status.enabled for processor-type devices?  
> > >
> > > Yes, something like this.  
> > 
> > However, that is not sufficient, because there are
> > ACPI_BUS_TYPE_DEVICE devices representing processors.
> > 
> > I'm not sure about a clean way to do it ATM.  
> 
> Ok, how about:
> 
> static bool acpi_dev_is_processor(const struct acpi_device *device)
> {
> 	struct acpi_hardware_id *hwid;
> 
> 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
> 		return true;
> 
> 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
> 		return false;
> 
> 	list_for_each_entry(hwid, &device->pnp.ids, list)
> 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
> 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
> 			return true;
> 
> 	return false;
> }
> 
> and then:
> 
> 	if (device->status.present)
> 		return !acpi_dev_is_processor(device) || device->status.enabled;
> 	else
> 		return device->status.functional;
> 
> ?
> 
Changing it to CPU-only for now makes sense to me, and I think this code
snippet should do the job.  Nice and simple.
Rafael J. Wysocki Dec. 15, 2023, 7:47 p.m. UTC | #9
On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:
> On Fri, 15 Dec 2023 15:31:55 +0000
> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> 
> > On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:
> > > On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:  
> > > >
> > > > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > > > <linux@armlinux.org.uk> wrote:  
> > > > > I guess we need something like:
> > > > >
> > > > >         if (device->status.present)
> > > > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > > > >                        device->status.enabled;
> > > > >         else
> > > > >                 return device->status.functional;
> > > > >
> > > > > so we only check device->status.enabled for processor-type devices?  
> > > >
> > > > Yes, something like this.  
> > > 
> > > However, that is not sufficient, because there are
> > > ACPI_BUS_TYPE_DEVICE devices representing processors.
> > > 
> > > I'm not sure about a clean way to do it ATM.  
> > 
> > Ok, how about:
> > 
> > static bool acpi_dev_is_processor(const struct acpi_device *device)
> > {
> > 	struct acpi_hardware_id *hwid;
> > 
> > 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > 		return true;
> > 
> > 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
> > 		return false;
> > 
> > 	list_for_each_entry(hwid, &device->pnp.ids, list)
> > 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
> > 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
> > 			return true;
> > 
> > 	return false;
> > }
> > 
> > and then:
> > 
> > 	if (device->status.present)
> > 		return !acpi_dev_is_processor(device) || device->status.enabled;
> > 	else
> > 		return device->status.functional;
> > 
> > ?
> > 
> Changing it to CPU only for now makes sense to me and I think this code snippet should do the
> job.  Nice and simple.

Well, except that it repeats checks that are already done elsewhere, and
does them slightly differently, which from the maintenance POV is not nice.

Maybe something like the appended patch (untested).

---
 drivers/acpi/acpi_processor.c |   11 +++++++++++
 drivers/acpi/internal.h       |    3 +++
 drivers/acpi/scan.c           |   24 +++++++++++++++++++++++-
 3 files changed, 37 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/acpi/acpi_processor.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpi_processor.c
+++ linux-pm/drivers/acpi/acpi_processor.c
@@ -644,6 +644,17 @@ static struct acpi_scan_handler processo
 	},
 };
 
+bool acpi_device_is_processor(const struct acpi_device *adev)
+{
+	if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
+		return true;
+
+	if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
+		return false;
+
+	return acpi_scan_check_handler(adev, &processor_handler);
+}
+
 static int acpi_processor_container_attach(struct acpi_device *dev,
 					   const struct acpi_device_id *id)
 {
Index: linux-pm/drivers/acpi/internal.h
===================================================================
--- linux-pm.orig/drivers/acpi/internal.h
+++ linux-pm/drivers/acpi/internal.h
@@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(stru
 int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
 				       const char *hotplug_profile_name);
 void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
+bool acpi_scan_check_handler(const struct acpi_device *adev,
+			     struct acpi_scan_handler *handler);
 
 #ifdef CONFIG_DEBUG_FS
 extern struct dentry *acpi_debugfs_dir;
@@ -133,6 +135,7 @@ int acpi_bus_register_early_device(int t
 const struct acpi_device *acpi_companion_match(const struct device *dev);
 int __acpi_device_uevent_modalias(const struct acpi_device *adev,
 				  struct kobj_uevent_env *env);
+bool acpi_device_is_processor(const struct acpi_device *adev);
 
 /* --------------------------------------------------------------------------
                                   Power Resource
Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -1938,6 +1938,19 @@ static bool acpi_scan_handler_matching(s
 	return false;
 }
 
+bool acpi_scan_check_handler(const struct acpi_device *adev,
+			     struct acpi_scan_handler *handler)
+{
+	struct acpi_hardware_id *hwid;
+
+	list_for_each_entry(hwid, &adev->pnp.ids, list) {
+		if (acpi_scan_handler_matching(handler, hwid->id, NULL))
+			return true;
+	}
+
+	return false;
+}
+
 static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
 					const struct acpi_device_id **matchid)
 {
@@ -2410,7 +2423,16 @@ bool acpi_dev_ready_for_enumeration(cons
 	if (device->flags.honor_deps && device->dep_unmet)
 		return false;
 
-	return acpi_device_is_present(device);
+	if (device->status.functional)
+		return true;
+
+	if (!device->status.present)
+		return false;
+
+	if (device->status.enabled)
+		return true; /* Fast path. */
+
+	return !acpi_device_is_processor(device);
 }
 EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
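
As a rough userspace illustration of the processor test in the patch above (not the kernel code itself): accept the processor bus type outright, reject anything that is not a generic device, and otherwise look for a processor ID among the device's hardware IDs. The sketch assumes a small fixed array in place of the kernel's pnp.ids list, and a two-entry ID table ("LNXCPU", "ACPI0007") standing in for the ID list of processor_handler. The enum, struct and function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel types; all names are hypothetical. */
enum bus_type { BUS_TYPE_DEVICE, BUS_TYPE_PROCESSOR, BUS_TYPE_THERMAL };

struct model_device {
	enum bus_type type;
	const char *ids[4];	/* up to four _HID/_CID strings, NULL-padded */
};

/* Assumed contents of the processor scan handler's ID table. */
static const char *const processor_ids[] = { "LNXCPU", "ACPI0007", NULL };

/* Model of acpi_scan_check_handler(): does any device ID match the table? */
bool model_check_handler(const struct model_device *dev,
			 const char *const *table)
{
	for (size_t i = 0; i < 4 && dev->ids[i]; i++)
		for (size_t j = 0; table[j]; j++)
			if (!strcmp(dev->ids[i], table[j]))
				return true;
	return false;
}

/* Model of acpi_device_is_processor(): bus type first, then ID match. */
bool model_is_processor(const struct model_device *dev)
{
	if (dev->type == BUS_TYPE_PROCESSOR)
		return true;
	if (dev->type != BUS_TYPE_DEVICE)
		return false;
	return model_check_handler(dev, processor_ids);
}

/* Exercise the three paths through model_is_processor(). */
void model_self_check(void)
{
	struct model_device cpu = { BUS_TYPE_DEVICE, { "ACPI0007", NULL } };
	struct model_device cxl = { BUS_TYPE_DEVICE, { "ACPI0017", NULL } };
	struct model_device tz = { BUS_TYPE_THERMAL, { "LNXCPU", NULL } };

	assert(model_is_processor(&cpu));	/* generic device, CPU ID */
	assert(!model_is_processor(&cxl));	/* generic device, other ID */
	assert(!model_is_processor(&tz));	/* wrong bus type, IDs ignored */
}
```

Reusing the handler's own ID table, rather than a second hard-coded list, is what keeps the check in step with what the processor driver actually binds to.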
Jonathan Cameron Jan. 2, 2024, 2:39 p.m. UTC | #10
On Fri, 15 Dec 2023 20:47:31 +0100
"Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:

> On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:
> > On Fri, 15 Dec 2023 15:31:55 +0000
> > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> >   
> > > On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:  
> > > > On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:    
> > > > >
> > > > > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > > > > <linux@armlinux.org.uk> wrote:    
> > > > > > I guess we need something like:
> > > > > >
> > > > > >         if (device->status.present)
> > > > > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > > > > >                        device->status.enabled;
> > > > > >         else
> > > > > >                 return device->status.functional;
> > > > > >
> > > > > > so we only check device->status.enabled for processor-type devices?    
> > > > >
> > > > > Yes, something like this.    
> > > > 
> > > > However, that is not sufficient, because there are
> > > > ACPI_BUS_TYPE_DEVICE devices representing processors.
> > > > 
> > > > I'm not sure about a clean way to do it ATM.    
> > > 
> > > Ok, how about:
> > > 
> > > static bool acpi_dev_is_processor(const struct acpi_device *device)
> > > {
> > > 	struct acpi_hardware_id *hwid;
> > > 
> > > 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > > 		return true;
> > > 
> > > 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
> > > 		return false;
> > > 
> > > 	list_for_each_entry(hwid, &device->pnp.ids, list)
> > > 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
> > > 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
> > > 			return true;
> > > 
> > > 	return false;
> > > }
> > > 
> > > and then:
> > > 
> > > 	if (device->status.present)
> > > 		return !acpi_dev_is_processor(device) || device->status.enabled;
> > > 	else
> > > 		return device->status.functional;
> > > 
> > > ?
> > >   
> > Changing it to CPU only for now makes sense to me and I think this code snippet should do the
> > job.  Nice and simple.  
> 
> Well, except that it does checks that are done elsewhere slightly
> differently, which from the maintenance POV is not nice.
> 
> Maybe something like the appended patch (untested).

Hi Rafael,

As far as I can see that's functionally equivalent, so looks good to me.
I'm not set up to test this today though, so will defer to Russell on whether
there is anything missing.

Thanks for putting this together.

Jonathan

> 
> ---
>  drivers/acpi/acpi_processor.c |   11 +++++++++++
>  drivers/acpi/internal.h       |    3 +++
>  drivers/acpi/scan.c           |   24 +++++++++++++++++++++++-
>  3 files changed, 37 insertions(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/acpi/acpi_processor.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/acpi_processor.c
> +++ linux-pm/drivers/acpi/acpi_processor.c
> @@ -644,6 +644,17 @@ static struct acpi_scan_handler processo
>  	},
>  };
>  
> +bool acpi_device_is_processor(const struct acpi_device *adev)
> +{
> +	if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
> +		return true;
> +
> +	if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
> +		return false;
> +
> +	return acpi_scan_check_handler(adev, &processor_handler);
> +}
> +
>  static int acpi_processor_container_attach(struct acpi_device *dev,
>  					   const struct acpi_device_id *id)
>  {
> Index: linux-pm/drivers/acpi/internal.h
> ===================================================================
> --- linux-pm.orig/drivers/acpi/internal.h
> +++ linux-pm/drivers/acpi/internal.h
> @@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(stru
>  int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
>  				       const char *hotplug_profile_name);
>  void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +			     struct acpi_scan_handler *handler);
>  
>  #ifdef CONFIG_DEBUG_FS
>  extern struct dentry *acpi_debugfs_dir;
> @@ -133,6 +135,7 @@ int acpi_bus_register_early_device(int t
>  const struct acpi_device *acpi_companion_match(const struct device *dev);
>  int __acpi_device_uevent_modalias(const struct acpi_device *adev,
>  				  struct kobj_uevent_env *env);
> +bool acpi_device_is_processor(const struct acpi_device *adev);
>  
>  /* --------------------------------------------------------------------------
>                                    Power Resource
> Index: linux-pm/drivers/acpi/scan.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/scan.c
> +++ linux-pm/drivers/acpi/scan.c
> @@ -1938,6 +1938,19 @@ static bool acpi_scan_handler_matching(s
>  	return false;
>  }
>  
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +			     struct acpi_scan_handler *handler)
> +{
> +	struct acpi_hardware_id *hwid;
> +
> +	list_for_each_entry(hwid, &adev->pnp.ids, list) {
> +		if (acpi_scan_handler_matching(handler, hwid->id, NULL))
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
>  static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
>  					const struct acpi_device_id **matchid)
>  {
> @@ -2410,7 +2423,16 @@ bool acpi_dev_ready_for_enumeration(cons
>  	if (device->flags.honor_deps && device->dep_unmet)
>  		return false;
>  
> -	return acpi_device_is_present(device);
> +	if (device->status.functional)
> +		return true;
> +
> +	if (!device->status.present)
> +		return false;
> +
> +	if (device->status.enabled)
> +		return true; /* Fast path. */
> +
> +	return !acpi_device_is_processor(device);
>  }
>  EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
>  
> 
> 
>
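
For readers comparing the two proposals quoted above, the following is a small exhaustive check over the four inputs involved (present, enabled, functional, and the result of the processor test). It is a userspace sketch with invented names, not kernel code. Working through the combinations suggests the two orderings agree everywhere except for a processor that reports present and functional but not enabled; whether firmware actually reports that combination for a disabled CPU is not something this thread establishes.

```c
#include <assert.h>
#include <stdbool.h>

/* The _STA-derived bits that matter here, plus the processor test result. */
struct sta_model {
	bool present, enabled, functional, is_processor;
};

/* Ordering of the earlier snippet: 'present' is tested first. */
bool ready_present_first(struct sta_model s)
{
	if (s.present)
		return !s.is_processor || s.enabled;
	return s.functional;
}

/* Ordering used in the appended patch: 'functional' is tested first. */
bool ready_functional_first(struct sta_model s)
{
	if (s.functional)
		return true;
	if (!s.present)
		return false;
	if (s.enabled)
		return true;
	return !s.is_processor;
}

/* Count the input combinations on which the two variants disagree. */
int count_differences(void)
{
	int diffs = 0;

	for (int bits = 0; bits < 16; bits++) {
		struct sta_model s = {
			.present = bits & 1,
			.enabled = bits & 2,
			.functional = bits & 4,
			.is_processor = bits & 8,
		};

		if (ready_present_first(s) != ready_functional_first(s)) {
			/* Only present + functional, disabled processors differ. */
			assert(s.present && s.functional &&
			       !s.enabled && s.is_processor);
			diffs++;
		}
	}
	return diffs;
}
```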
Jonathan Cameron Jan. 11, 2024, 10:19 a.m. UTC | #11
On Tue, 2 Jan 2024 14:39:25 +0000
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Fri, 15 Dec 2023 20:47:31 +0100
> "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
> 
> > On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:  
> > > On Fri, 15 Dec 2023 15:31:55 +0000
> > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > >     
> > > > On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:    
> > > > > On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:      
> > > > > >
> > > > > > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > > > > > <linux@armlinux.org.uk> wrote:      
> > > > > > > I guess we need something like:
> > > > > > >
> > > > > > >         if (device->status.present)
> > > > > > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > > > > > >                        device->status.enabled;
> > > > > > >         else
> > > > > > >                 return device->status.functional;
> > > > > > >
> > > > > > > so we only check device->status.enabled for processor-type devices?      
> > > > > >
> > > > > > Yes, something like this.      
> > > > > 
> > > > > However, that is not sufficient, because there are
> > > > > ACPI_BUS_TYPE_DEVICE devices representing processors.
> > > > > 
> > > > > I'm not sure about a clean way to do it ATM.      
> > > > 
> > > > Ok, how about:
> > > > 
> > > > static bool acpi_dev_is_processor(const struct acpi_device *device)
> > > > {
> > > > 	struct acpi_hardware_id *hwid;
> > > > 
> > > > 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > > > 		return true;
> > > > 
> > > > 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
> > > > 		return false;
> > > > 
> > > > 	list_for_each_entry(hwid, &device->pnp.ids, list)
> > > > 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
> > > > 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
> > > > 			return true;
> > > > 
> > > > 	return false;
> > > > }
> > > > 
> > > > and then:
> > > > 
> > > > 	if (device->status.present)
> > > > 		return !acpi_dev_is_processor(device) || device->status.enabled;
> > > > 	else
> > > > 		return device->status.functional;
> > > > 
> > > > ?
> > > >     
> > > Changing it to CPU only for now makes sense to me and I think this code snippet should do the
> > > job.  Nice and simple.    
> > 
> > Well, except that it does checks that are done elsewhere slightly
> > differently, which from the maintenance POV is not nice.
> > 
> > Maybe something like the appended patch (untested).  
> 
> Hi Rafael,
> 
> As far as I can see that's functionally equivalent, so looks good to me.
> I'm not set up to test this today though, so will defer to Russell on whether
> there is anything missing.
> 
> Thanks for putting this together.

This is rather embarrassing...

I spun this up on a QEMU instance with some prints and found that we do need
the !acpi_device_is_processor() restriction.
On my 'random' test setup it fails on one device. ACPI0017 - which I
happen to know rather well. It's the weird pseudo device that lets
a CXL aware OS know there is a CEDT table to probe.

Whilst I really don't like that hack (it is all about making software
distribution of out of tree modules easier rather than something
fundamental), I'm the CXL QEMU maintainer :(

Will fix that, but it shows there is at least one broken firmware out
there.

On the plus side, Rafael's code seems to work as expected and lets that
buggy firmware carry on working :) So let's pretend the bug in QEMU
is a deliberate test case!

Jonathan

p.s. My test setup blows up later for an unrelated reason with latest
kernel, so I'll be off debugging that for a while :(
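
The failure mode described above can be illustrated with a small userspace model. It decodes a raw _STA value using the bit positions from the ACPI specification (bit 0 present, bit 1 enabled, bit 3 functional) and compares an unconditional enabled-bit check with the processor-restricted one. The _STA value 0x01 is an assumption made for illustration, since the thread does not state what QEMU returned for ACPI0017, and the "strict" variant is only an assumed shape of a check applied to all devices. Function and macro names are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* _STA bit positions as defined by the ACPI specification. */
#define STA_PRESENT	(1u << 0)
#define STA_ENABLED	(1u << 1)
#define STA_FUNCTIONAL	(1u << 3)

/* Model of the processor-restricted check; is_processor comes from the caller. */
bool model_ready(uint32_t sta, bool is_processor)
{
	if (sta & STA_FUNCTIONAL)
		return true;
	if (!(sta & STA_PRESENT))
		return false;
	if (sta & STA_ENABLED)
		return true;
	return !is_processor;
}

/* Assumed shape of an enabled-bit check applied to every device type. */
bool model_ready_strict(uint32_t sta)
{
	if (sta & STA_FUNCTIONAL)
		return true;
	return (sta & STA_PRESENT) && (sta & STA_ENABLED);
}

void sta_example(void)
{
	const uint32_t sta = 0x01;	/* assumed: present, not enabled */

	assert(!model_ready_strict(sta));	/* would no longer be enumerated */
	assert(model_ready(sta, false));	/* non-processor: still enumerated */
	assert(!model_ready(sta, true));	/* processor: held back */
}
```

Under these assumptions, a non-processor device whose firmware forgets the enabled bit keeps working, while a processor in the same state is skipped until it is enabled.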


> 
> Jonathan
> 
> > 
> > ---
> >  drivers/acpi/acpi_processor.c |   11 +++++++++++
> >  drivers/acpi/internal.h       |    3 +++
> >  drivers/acpi/scan.c           |   24 +++++++++++++++++++++++-
> >  3 files changed, 37 insertions(+), 1 deletion(-)
> > 
> > Index: linux-pm/drivers/acpi/acpi_processor.c
> > ===================================================================
> > --- linux-pm.orig/drivers/acpi/acpi_processor.c
> > +++ linux-pm/drivers/acpi/acpi_processor.c
> > @@ -644,6 +644,17 @@ static struct acpi_scan_handler processo
> >  	},
> >  };
> >  
> > +bool acpi_device_is_processor(const struct acpi_device *adev)
> > +{
> > +	if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > +		return true;
> > +
> > +	if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
> > +		return false;
> > +
> > +	return acpi_scan_check_handler(adev, &processor_handler);
> > +}
> > +
> >  static int acpi_processor_container_attach(struct acpi_device *dev,
> >  					   const struct acpi_device_id *id)
> >  {
> > Index: linux-pm/drivers/acpi/internal.h
> > ===================================================================
> > --- linux-pm.orig/drivers/acpi/internal.h
> > +++ linux-pm/drivers/acpi/internal.h
> > @@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(stru
> >  int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
> >  				       const char *hotplug_profile_name);
> >  void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
> > +bool acpi_scan_check_handler(const struct acpi_device *adev,
> > +			     struct acpi_scan_handler *handler);
> >  
> >  #ifdef CONFIG_DEBUG_FS
> >  extern struct dentry *acpi_debugfs_dir;
> > @@ -133,6 +135,7 @@ int acpi_bus_register_early_device(int t
> >  const struct acpi_device *acpi_companion_match(const struct device *dev);
> >  int __acpi_device_uevent_modalias(const struct acpi_device *adev,
> >  				  struct kobj_uevent_env *env);
> > +bool acpi_device_is_processor(const struct acpi_device *adev);
> >  
> >  /* --------------------------------------------------------------------------
> >                                    Power Resource
> > Index: linux-pm/drivers/acpi/scan.c
> > ===================================================================
> > --- linux-pm.orig/drivers/acpi/scan.c
> > +++ linux-pm/drivers/acpi/scan.c
> > @@ -1938,6 +1938,19 @@ static bool acpi_scan_handler_matching(s
> >  	return false;
> >  }
> >  
> > +bool acpi_scan_check_handler(const struct acpi_device *adev,
> > +			     struct acpi_scan_handler *handler)
> > +{
> > +	struct acpi_hardware_id *hwid;
> > +
> > +	list_for_each_entry(hwid, &adev->pnp.ids, list) {
> > +		if (acpi_scan_handler_matching(handler, hwid->id, NULL))
> > +			return true;
> > +	}
> > +
> > +	return false;
> > +}
> > +
> >  static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
> >  					const struct acpi_device_id **matchid)
> >  {
> > @@ -2410,7 +2423,16 @@ bool acpi_dev_ready_for_enumeration(cons
> >  	if (device->flags.honor_deps && device->dep_unmet)
> >  		return false;
> >  
> > -	return acpi_device_is_present(device);
> > +	if (device->status.functional)
> > +		return true;
> > +
> > +	if (!device->status.present)
> > +		return false;
> > +
> > +	if (device->status.enabled)
> > +		return true; /* Fast path. */
> > +
> > +	return !acpi_device_is_processor(device);
> >  }
> >  EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
> >  
> > 
> > 
> >   
> 
> 
Russell King (Oracle) Jan. 11, 2024, 10:26 a.m. UTC | #12
On Thu, Jan 11, 2024 at 10:19:49AM +0000, Jonathan Cameron wrote:
> On Tue, 2 Jan 2024 14:39:25 +0000
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> > On Fri, 15 Dec 2023 20:47:31 +0100
> > "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
> > 
> > > On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:  
> > > > On Fri, 15 Dec 2023 15:31:55 +0000
> > > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > >     
> > > > > On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:    
> > > > > > On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:      
> > > > > > >
> > > > > > > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > > > > > > <linux@armlinux.org.uk> wrote:      
> > > > > > > > I guess we need something like:
> > > > > > > >
> > > > > > > >         if (device->status.present)
> > > > > > > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > > > > > > >                        device->status.enabled;
> > > > > > > >         else
> > > > > > > >                 return device->status.functional;
> > > > > > > >
> > > > > > > > so we only check device->status.enabled for processor-type devices?      
> > > > > > >
> > > > > > > Yes, something like this.      
> > > > > > 
> > > > > > However, that is not sufficient, because there are
> > > > > > ACPI_BUS_TYPE_DEVICE devices representing processors.
> > > > > > 
> > > > > > I'm not sure about a clean way to do it ATM.      
> > > > > 
> > > > > Ok, how about:
> > > > > 
> > > > > static bool acpi_dev_is_processor(const struct acpi_device *device)
> > > > > {
> > > > > 	struct acpi_hardware_id *hwid;
> > > > > 
> > > > > 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > > > > 		return true;
> > > > > 
> > > > > 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
> > > > > 		return false;
> > > > > 
> > > > > 	list_for_each_entry(hwid, &device->pnp.ids, list)
> > > > > 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
> > > > > 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
> > > > > 			return true;
> > > > > 
> > > > > 	return false;
> > > > > }
> > > > > 
> > > > > and then:
> > > > > 
> > > > > 	if (device->status.present)
> > > > > 		return !acpi_dev_is_processor(device) || device->status.enabled;
> > > > > 	else
> > > > > 		return device->status.functional;
> > > > > 
> > > > > ?
> > > > >     
> > > > Changing it to CPU only for now makes sense to me and I think this code snippet should do the
> > > > job.  Nice and simple.    
> > > 
> > > Well, except that it does checks that are done elsewhere slightly
> > > differently, which from the maintenance POV is not nice.
> > > 
> > > Maybe something like the appended patch (untested).  
> > 
> > Hi Rafael,
> > 
> > As far as I can see that's functionally equivalent, so looks good to me.
> > I'm not set up to test this today though, so will defer to Russell on whether
> > there is anything missing.
> > 
> > Thanks for putting this together.
> 
> This is rather embarrassing...
> 
> I spun this up on a QEMU instance with some prints and found that we do need
> the !acpi_device_is_processor() restriction.
> On my 'random' test setup it fails on one device. ACPI0017 - which I
> happen to know rather well. It's the weird pseudo device that lets
> a CXL aware OS know there is a CEDT table to probe.
> 
> Whilst I really don't like that hack (it is all about making software
> distribution of out of tree modules easier rather than something
> fundamental), I'm the CXL QEMU maintainer :(
> 
> Will fix that, but it shows there is at least one broken firmware out
> there.
> 
> On the plus side, Rafael's code seems to work as expected and lets that
> buggy firmware carry on working :) So let's pretend the bug in QEMU
> is a deliberate test case!

Lol, thanks for a test case and showing that Rafael's approach is
indeed necessary.

Would your test qualify for a tested-by for this? For reference, this
is my current version below with Rafael's update:

8<====
From: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Subject: [PATCH] ACPI: Only enumerate enabled (or functional) processor
 devices

From: James Morse <james.morse@arm.com>

Today the ACPI enumeration code 'visits' all devices that are present.

This is a problem for arm64, where CPUs are always present, but not
always enabled. When a device-check occurs because the firmware-policy
has changed and a CPU is now enabled, the following error occurs:
| acpi ACPI0007:48: Enumeration failure

This is ultimately because acpi_dev_ready_for_enumeration() returns
true for a device that is not enabled. The ACPI Processor driver
will not register such CPUs as they are not 'decoding their resources'.

ACPI allows a device to be functional instead of maintaining the
present and enabled bit, but we can't simply check the enabled bit
for all devices since firmware can be buggy.

If ACPI indicates that the device is present and enabled, then all well
and good, we can enumerate it. However, if the device is present and not
enabled, then we also check whether the device is a processor device
to limit the impact of this new check to just processor devices.

This avoids enumerating present && functional processor devices that
are not enabled.

Signed-off-by: James Morse <james.morse@arm.com>
Co-developed-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
Changes since RFC v2:
 * Incorporate comment suggestion by Gavin Shan.
Changes since RFC v3:
 * Fixed "sert" typo.
Changes since RFC v3 (smaller series):
 * Restrict checking the enabled bit to processor devices, update
   commit comments.
 * Use Rafael's suggestion in
   https://lore.kernel.org/r/5760569.DvuYhMxLoT@kreacher
---
 drivers/acpi/acpi_processor.c | 11 ++++++++
 drivers/acpi/device_pm.c      |  2 +-
 drivers/acpi/device_sysfs.c   |  2 +-
 drivers/acpi/internal.h       |  4 ++-
 drivers/acpi/property.c       |  2 +-
 drivers/acpi/scan.c           | 49 ++++++++++++++++++++++++++++-------
 6 files changed, 56 insertions(+), 14 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 4fe2ef54088c..cf7c1cca69dd 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -626,6 +626,17 @@ static struct acpi_scan_handler processor_handler = {
 	},
 };
 
+bool acpi_device_is_processor(const struct acpi_device *adev)
+{
+	if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
+		return true;
+
+	if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
+		return false;
+
+	return acpi_scan_check_handler(adev, &processor_handler);
+}
+
 static int acpi_processor_container_attach(struct acpi_device *dev,
 					   const struct acpi_device_id *id)
 {
diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index 3b4d048c4941..e3c80f3b3b57 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
 		return -EINVAL;
 
 	device->power.state = ACPI_STATE_UNKNOWN;
-	if (!acpi_device_is_present(device)) {
+	if (!acpi_dev_ready_for_enumeration(device)) {
 		device->flags.initialized = false;
 		return -ENXIO;
 	}
diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
index 23373faa35ec..a0256d2493a7 100644
--- a/drivers/acpi/device_sysfs.c
+++ b/drivers/acpi/device_sysfs.c
@@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
 	struct acpi_hardware_id *id;
 
 	/* Avoid unnecessarily loading modules for non present devices. */
-	if (!acpi_device_is_present(acpi_dev))
+	if (!acpi_dev_ready_for_enumeration(acpi_dev))
 		return 0;
 
 	/*
diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 866c7c4ed233..9388d4c8674a 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(struct acpi_hotplug_profile *hotplug,
 int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
 				       const char *hotplug_profile_name);
 void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
+bool acpi_scan_check_handler(const struct acpi_device *adev,
+			     struct acpi_scan_handler *handler);
 
 #ifdef CONFIG_DEBUG_FS
 extern struct dentry *acpi_debugfs_dir;
@@ -107,7 +109,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
 void acpi_device_remove_files(struct acpi_device *dev);
 void acpi_device_add_finalize(struct acpi_device *device);
 void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
-bool acpi_device_is_present(const struct acpi_device *adev);
 bool acpi_device_is_battery(struct acpi_device *adev);
 bool acpi_device_is_first_physical_node(struct acpi_device *adev,
 					const struct device *dev);
@@ -119,6 +120,7 @@ int acpi_bus_register_early_device(int type);
 const struct acpi_device *acpi_companion_match(const struct device *dev);
 int __acpi_device_uevent_modalias(const struct acpi_device *adev,
 				  struct kobj_uevent_env *env);
+bool acpi_device_is_processor(const struct acpi_device *adev);
 
 /* --------------------------------------------------------------------------
                                   Power Resource
diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
index 6979a3f9f90a..14d6948fd88a 100644
--- a/drivers/acpi/property.c
+++ b/drivers/acpi/property.c
@@ -1420,7 +1420,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
 	if (!is_acpi_device_node(fwnode))
 		return false;
 
-	return acpi_device_is_present(to_acpi_device_node(fwnode));
+	return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
 }
 
 static const void *
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 02bb2cce423f..f94d1f744bcc 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
 	int error;
 
 	acpi_bus_get_status(adev);
-	if (acpi_device_is_present(adev)) {
+	if (acpi_dev_ready_for_enumeration(adev)) {
 		/*
 		 * This function is only called for device objects for which
 		 * matching scan handlers exist.  The only situation in which
@@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
 	int error;
 
 	acpi_bus_get_status(adev);
-	if (!acpi_device_is_present(adev)) {
+	if (!acpi_dev_ready_for_enumeration(adev)) {
 		acpi_scan_device_not_enumerated(adev);
 		return 0;
 	}
@@ -1913,11 +1913,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
 	return true;
 }
 
-bool acpi_device_is_present(const struct acpi_device *adev)
-{
-	return adev->status.present || adev->status.functional;
-}
-
 static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
 				       const char *idstr,
 				       const struct acpi_device_id **matchid)
@@ -1938,6 +1933,18 @@ static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
 	return false;
 }
 
+bool acpi_scan_check_handler(const struct acpi_device *adev,
+			     struct acpi_scan_handler *handler)
+{
+	struct acpi_hardware_id *hwid;
+
+	list_for_each_entry(hwid, &adev->pnp.ids, list)
+		if (acpi_scan_handler_matching(handler, hwid->id, NULL))
+			return true;
+
+	return false;
+}
+
 static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
 					const struct acpi_device_id **matchid)
 {
@@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
  * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
  * @device: Pointer to the &struct acpi_device to check
  *
- * Check if the device is present and has no unmet dependencies.
+ * Check if the device is functional or enabled and has no unmet dependencies.
  *
- * Return true if the device is ready for enumeratino. Otherwise, return false.
+ * Return true if the device is ready for enumeration. Otherwise, return false.
  */
 bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
 {
 	if (device->flags.honor_deps && device->dep_unmet)
 		return false;
 
-	return acpi_device_is_present(device);
+	/*
+	 * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
+	 * (!present && functional) for certain types of devices that should be
+	 * enumerated. Note that the enabled bit should not be set unless the
+	 * present bit is set.
+	 *
+	 * However, limit this only to processor devices to reduce possible
+	 * regressions with firmware.
+	 */
+	if (device->status.functional)
+		return true;
+
+	if (!device->status.present)
+		return false;
+
+	/*
+	 * Fast path - if enabled is set, avoid the more expensive test to
+	 * check whether this device is a processor.
+	 */
+	if (device->status.enabled)
+		return true;
+
+	return !acpi_device_is_processor(device);
 }
 EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
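
To get a feel for how narrow the behaviour change is meant to be, the sketch below compares the removed acpi_device_is_present() logic with the status-bit part of the reworked function over every combination of the inputs. It is a userspace model with invented names, and it deliberately leaves out the honor_deps/dep_unmet test, which the converted call sites also gain by switching to acpi_dev_ready_for_enumeration().

```c
#include <assert.h>
#include <stdbool.h>

struct status_bits {
	bool present, enabled, functional;
};

/* Behaviour of the removed acpi_device_is_present() helper. */
bool old_is_present(struct status_bits s)
{
	return s.present || s.functional;
}

/* Status-bit part of the reworked acpi_dev_ready_for_enumeration(). */
bool new_ready(struct status_bits s, bool is_processor)
{
	if (s.functional)
		return true;
	if (!s.present)
		return false;
	if (s.enabled)
		return true;
	return !is_processor;
}

/* Return how many of the 16 input combinations change their answer. */
int count_behaviour_changes(void)
{
	int changed = 0;

	for (int bits = 0; bits < 16; bits++) {
		struct status_bits s = {
			.present = bits & 1,
			.enabled = bits & 2,
			.functional = bits & 4,
		};
		bool is_processor = bits & 8;

		if (old_is_present(s) != new_ready(s, is_processor)) {
			/* Only a present, disabled, non-functional processor. */
			assert(is_processor && s.present &&
			       !s.enabled && !s.functional);
			changed++;
		}
	}
	return changed;
}
```

In this model the only input whose answer changes is a processor that is present, not enabled and not functional, which matches the stated intent of limiting the new check to processor devices.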
Jonathan Cameron Jan. 12, 2024, 11:52 a.m. UTC | #13
On Thu, 11 Jan 2024 10:26:15 +0000
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Thu, Jan 11, 2024 at 10:19:49AM +0000, Jonathan Cameron wrote:
> > On Tue, 2 Jan 2024 14:39:25 +0000
> > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> >   
> > > On Fri, 15 Dec 2023 20:47:31 +0100
> > > "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
> > >   
> > > > On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:    
> > > > > On Fri, 15 Dec 2023 15:31:55 +0000
> > > > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > > >       
> > > > > > On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:      
> > > > > > > On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:        
> > > > > > > >
> > > > > > > > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > > > > > > > <linux@armlinux.org.uk> wrote:        
> > > > > > > > > I guess we need something like:
> > > > > > > > >
> > > > > > > > >         if (device->status.present)
> > > > > > > > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > > > > > > > >                        device->status.enabled;
> > > > > > > > >         else
> > > > > > > > >                 return device->status.functional;
> > > > > > > > >
> > > > > > > > > so we only check device->status.enabled for processor-type devices?        
> > > > > > > >
> > > > > > > > Yes, something like this.        
> > > > > > > 
> > > > > > > However, that is not sufficient, because there are
> > > > > > > ACPI_BUS_TYPE_DEVICE devices representing processors.
> > > > > > > 
> > > > > > > I'm not sure about a clean way to do it ATM.        
> > > > > > 
> > > > > > Ok, how about:
> > > > > > 
> > > > > > static bool acpi_dev_is_processor(const struct acpi_device *device)
> > > > > > {
> > > > > > 	struct acpi_hardware_id *hwid;
> > > > > > 
> > > > > > 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > > > > > 		return true;
> > > > > > 
> > > > > > 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
> > > > > > 		return false;
> > > > > > 
> > > > > > 	list_for_each_entry(hwid, &device->pnp.ids, list)
> > > > > > 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
> > > > > > 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
> > > > > > 			return true;
> > > > > > 
> > > > > > 	return false;
> > > > > > }
> > > > > > 
> > > > > > and then:
> > > > > > 
> > > > > > 	if (device->status.present)
> > > > > > 		return !acpi_dev_is_processor(device) || device->status.enabled;
> > > > > > 	else
> > > > > > 		return device->status.functional;
> > > > > > 
> > > > > > ?
> > > > > >       
> > > > > Changing it to CPU only for now makes sense to me and I think this code snippet should do the
> > > > > job.  Nice and simple.      
> > > > 
> > > > Well, except that it does checks that are done elsewhere slightly
> > > > differently, which from the maintenance POV is not nice.
> > > > 
> > > > Maybe something like the appended patch (untested).    
> > > 
> > > Hi Rafael,
> > > 
> > > As far as I can see that's functionally equivalent, so looks good to me.
> > > I'm not set up to test this today though, so will defer to Russell on whether
> > > there is anything missing.
> > > 
> > > Thanks for putting this together.  
> > 
> > This is rather embarrassing...
> > 
> > I spun this up on a QEMU instance with some prints and found that we do need
> > the !acpi_device_is_processor() restriction.
> > On my 'random' test setup it fails on one device. ACPI0017 - which I
> > happen to know rather well. It's the weird pseudo device that lets
> > a CXL aware OS know there is a CEDT table to probe.
> > 
> > Whilst I really don't like that hack (it is all about making software
> > distribution of out of tree modules easier rather than something
> > fundamental), I'm the CXL QEMU maintainer :(
> > 
> > Will fix that, but it shows there is at least one broken firmware out
> > there.
> > 
> > On the plus side, Rafael's code seems to work as expected and lets that
> > buggy firmware carry on working :) So let's pretend the bug in QEMU
> > is a deliberate test case!  
> 
> Lol, thanks for a test case and showing that Rafael's approach is
> indeed necessary.
> 
> Would your test qualify for a tested-by for this? For reference, this
> is my current version below with Rafael's update:

Sure. This matches what I have.

Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


> 
> 8<====
> From: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> Subject: [PATCH] ACPI: Only enumerate enabled (or functional) processor
>  devices
> 
> From: James Morse <james.morse@arm.com>
> 
> Today the ACPI enumeration code 'visits' all devices that are present.
> 
> This is a problem for arm64, where CPUs are always present, but not
> always enabled. When a device-check occurs because the firmware-policy
> has changed and a CPU is now enabled, the following error occurs:
> | acpi ACPI0007:48: Enumeration failure
> 
> This is ultimately because acpi_dev_ready_for_enumeration() returns
> true for a device that is not enabled. The ACPI Processor driver
> will not register such CPUs as they are not 'decoding their resources'.
> 
> ACPI allows a device to be functional instead of maintaining the
> present and enabled bit, but we can't simply check the enabled bit
> for all devices since firmware can be buggy.
> 
> If ACPI indicates that the device is present and enabled, then all well
> and good, we can enumerate it. However, if the device is present and not
> enabled, then we also check whether the device is a processor device
> to limit the impact of this new check to just processor devices.
> 
> This avoids enumerating present && functional processor devices that
> are not enabled.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Co-developed-by: Rafael J. Wysocki <rjw@rjwysocki.net>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
> Changes since RFC v2:
>  * Incorporate comment suggestion by Gavin Shan.
> Changes since RFC v3:
>  * Fixed "sert" typo.
> Changes since RFC v3 (smaller series):
>  * Restrict checking the enabled bit to processor devices, update
>    commit comments.
>  * Use Rafael's suggestion in
>    https://lore.kernel.org/r/5760569.DvuYhMxLoT@kreacher
> ---
>  drivers/acpi/acpi_processor.c | 11 ++++++++
>  drivers/acpi/device_pm.c      |  2 +-
>  drivers/acpi/device_sysfs.c   |  2 +-
>  drivers/acpi/internal.h       |  4 ++-
>  drivers/acpi/property.c       |  2 +-
>  drivers/acpi/scan.c           | 49 ++++++++++++++++++++++++++++-------
>  6 files changed, 56 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 4fe2ef54088c..cf7c1cca69dd 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -626,6 +626,17 @@ static struct acpi_scan_handler processor_handler = {
>  	},
>  };
>  
> +bool acpi_device_is_processor(const struct acpi_device *adev)
> +{
> +	if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
> +		return true;
> +
> +	if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
> +		return false;
> +
> +	return acpi_scan_check_handler(adev, &processor_handler);
> +}
> +
>  static int acpi_processor_container_attach(struct acpi_device *dev,
>  					   const struct acpi_device_id *id)
>  {
> diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> index 3b4d048c4941..e3c80f3b3b57 100644
> --- a/drivers/acpi/device_pm.c
> +++ b/drivers/acpi/device_pm.c
> @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
>  		return -EINVAL;
>  
>  	device->power.state = ACPI_STATE_UNKNOWN;
> -	if (!acpi_device_is_present(device)) {
> +	if (!acpi_dev_ready_for_enumeration(device)) {
>  		device->flags.initialized = false;
>  		return -ENXIO;
>  	}
> diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
> index 23373faa35ec..a0256d2493a7 100644
> --- a/drivers/acpi/device_sysfs.c
> +++ b/drivers/acpi/device_sysfs.c
> @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
>  	struct acpi_hardware_id *id;
>  
>  	/* Avoid unnecessarily loading modules for non present devices. */
> -	if (!acpi_device_is_present(acpi_dev))
> +	if (!acpi_dev_ready_for_enumeration(acpi_dev))
>  		return 0;
>  
>  	/*
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index 866c7c4ed233..9388d4c8674a 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(struct acpi_hotplug_profile *hotplug,
>  int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
>  				       const char *hotplug_profile_name);
>  void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +			     struct acpi_scan_handler *handler);
>  
>  #ifdef CONFIG_DEBUG_FS
>  extern struct dentry *acpi_debugfs_dir;
> @@ -107,7 +109,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
>  void acpi_device_remove_files(struct acpi_device *dev);
>  void acpi_device_add_finalize(struct acpi_device *device);
>  void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
> -bool acpi_device_is_present(const struct acpi_device *adev);
>  bool acpi_device_is_battery(struct acpi_device *adev);
>  bool acpi_device_is_first_physical_node(struct acpi_device *adev,
>  					const struct device *dev);
> @@ -119,6 +120,7 @@ int acpi_bus_register_early_device(int type);
>  const struct acpi_device *acpi_companion_match(const struct device *dev);
>  int __acpi_device_uevent_modalias(const struct acpi_device *adev,
>  				  struct kobj_uevent_env *env);
> +bool acpi_device_is_processor(const struct acpi_device *adev);
>  
>  /* --------------------------------------------------------------------------
>                                    Power Resource
> diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> index 6979a3f9f90a..14d6948fd88a 100644
> --- a/drivers/acpi/property.c
> +++ b/drivers/acpi/property.c
> @@ -1420,7 +1420,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
>  	if (!is_acpi_device_node(fwnode))
>  		return false;
>  
> -	return acpi_device_is_present(to_acpi_device_node(fwnode));
> +	return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
>  }
>  
>  static const void *
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 02bb2cce423f..f94d1f744bcc 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
>  	int error;
>  
>  	acpi_bus_get_status(adev);
> -	if (acpi_device_is_present(adev)) {
> +	if (acpi_dev_ready_for_enumeration(adev)) {
>  		/*
>  		 * This function is only called for device objects for which
>  		 * matching scan handlers exist.  The only situation in which
> @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
>  	int error;
>  
>  	acpi_bus_get_status(adev);
> -	if (!acpi_device_is_present(adev)) {
> +	if (!acpi_dev_ready_for_enumeration(adev)) {
>  		acpi_scan_device_not_enumerated(adev);
>  		return 0;
>  	}
> @@ -1913,11 +1913,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
>  	return true;
>  }
>  
> -bool acpi_device_is_present(const struct acpi_device *adev)
> -{
> -	return adev->status.present || adev->status.functional;
> -}
> -
>  static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>  				       const char *idstr,
>  				       const struct acpi_device_id **matchid)
> @@ -1938,6 +1933,18 @@ static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>  	return false;
>  }
>  
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +			     struct acpi_scan_handler *handler)
> +{
> +	struct acpi_hardware_id *hwid;
> +
> +	list_for_each_entry(hwid, &adev->pnp.ids, list)
> +		if (acpi_scan_handler_matching(handler, hwid->id, NULL))
> +			return true;
> +
> +	return false;
> +}
> +
>  static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
>  					const struct acpi_device_id **matchid)
>  {
> @@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
>   * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
>   * @device: Pointer to the &struct acpi_device to check
>   *
> - * Check if the device is present and has no unmet dependencies.
> + * Check if the device is functional or enabled and has no unmet dependencies.
>   *
> - * Return true if the device is ready for enumeratino. Otherwise, return false.
> + * Return true if the device is ready for enumeration. Otherwise, return false.
>   */
>  bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
>  {
>  	if (device->flags.honor_deps && device->dep_unmet)
>  		return false;
>  
> -	return acpi_device_is_present(device);
> +	/*
> +	 * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> +	 * (!present && functional) for certain types of devices that should be
> +	 * enumerated. Note that the enabled bit should not be set unless the
> +	 * present bit is set.
> +	 *
> +	 * However, limit this only to processor devices to reduce possible
> +	 * regressions with firmware.
> +	 */
> +	if (device->status.functional)
> +		return true;
> +
> +	if (!device->status.present)
> +		return false;
> +
> +	/*
> +	 * Fast path - if enabled is set, avoid the more expensive test to
> +	 * check whether this device is a processor.
> +	 */
> +	if (device->status.enabled)
> +		return true;
> +
> +	return !acpi_device_is_processor(device);
>  }
>  EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
>
Gavin Shan Jan. 22, 2024, 7:31 a.m. UTC | #14
On 1/11/24 20:26, Russell King (Oracle) wrote:
> On Thu, Jan 11, 2024 at 10:19:49AM +0000, Jonathan Cameron wrote:
>> On Tue, 2 Jan 2024 14:39:25 +0000
>> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
>>
>>> On Fri, 15 Dec 2023 20:47:31 +0100
>>> "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
>>>
>>>> On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:
>>>>> On Fri, 15 Dec 2023 15:31:55 +0000
>>>>> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
>>>>>      
>>>>>> On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:
>>>>>>> On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>>>>>>>
>>>>>>>> On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
>>>>>>>> <linux@armlinux.org.uk> wrote:
>>>>>>>>> I guess we need something like:
>>>>>>>>>
>>>>>>>>>          if (device->status.present)
>>>>>>>>>                  return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
>>>>>>>>>                         device->status.enabled;
>>>>>>>>>          else
>>>>>>>>>                  return device->status.functional;
>>>>>>>>>
>>>>>>>>> so we only check device->status.enabled for processor-type devices?
>>>>>>>>
>>>>>>>> Yes, something like this.
>>>>>>>
>>>>>>> However, that is not sufficient, because there are
>>>>>>> ACPI_BUS_TYPE_DEVICE devices representing processors.
>>>>>>>
>>>>>>> I'm not sure about a clean way to do it ATM.
>>>>>>
>>>>>> Ok, how about:
>>>>>>
>>>>>> static bool acpi_dev_is_processor(const struct acpi_device *device)
>>>>>> {
>>>>>> 	struct acpi_hardware_id *hwid;
>>>>>>
>>>>>> 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
>>>>>> 		return true;
>>>>>>
>>>>>> 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
>>>>>> 		return false;
>>>>>>
>>>>>> 	list_for_each_entry(hwid, &device->pnp.ids, list)
>>>>>> 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
>>>>>> 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
>>>>>> 			return true;
>>>>>>
>>>>>> 	return false;
>>>>>> }
>>>>>>
>>>>>> and then:
>>>>>>
>>>>>> 	if (device->status.present)
>>>>>> 		return !acpi_dev_is_processor(device) || device->status.enabled;
>>>>>> 	else
>>>>>> 		return device->status.functional;
>>>>>>
>>>>>> ?
>>>>>>      
>>>>> Changing it to CPU only for now makes sense to me and I think this code snippet should do the
>>>>> job.  Nice and simple.
>>>>
>>>> Well, except that it does checks that are done elsewhere slightly
>>>> differently, which from the maintenance POV is not nice.
>>>>
>>>> Maybe something like the appended patch (untested).
>>>
>>> Hi Rafael,
>>>
>>> As far as I can see that's functionally equivalent, so looks good to me.
>>> I'm not set up to test this today though, so will defer to Russell on whether
>>> there is anything missing
>>>
>>> Thanks for putting this together.
>>
>> This is rather embarrassing...
>>
>> I span this up on a QEMU instance with some prints to find out we need
>> the !acpi_device_is_processor() restriction.
>> On my 'random' test setup it fails on one device. ACPI0017 - which I
>> happen to know rather well. It's the weird pseudo device that lets
>> a CXL aware OS know there is a CEDT table to probe.
>>
>> Whilst I really don't like that hack (it is all about making software
>> distribution of out of tree modules easier rather than something
>> fundamental), I'm the CXL QEMU maintainer :(
>>
>> Will fix that, but it shows there is at least one broken firmware out
>> there.
>>
>> On the plus side, Rafael's code seems to work as expected and lets that
>> buggy firmware carry on working :) So let's pretend the bug in QEMU
>> is a deliberate test case!
> 
> Lol, thanks for a test case and showing that Rafael's approach is
> indeed necessary.
> 
> Would your test qualify for a tested-by for this? For reference, this
> is my current version below with Rafael's update:
> 
> 8<====
> From: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> Subject: [PATCH] ACPI: Only enumerate enabled (or functional) processor
>   devices
> 
> From: James Morse <james.morse@arm.com>
> 
> Today the ACPI enumeration code 'visits' all devices that are present.
> 
> This is a problem for arm64, where CPUs are always present, but not
> always enabled. When a device-check occurs because the firmware-policy
> has changed and a CPU is now enabled, the following error occurs:
> | acpi ACPI0007:48: Enumeration failure
> 
> This is ultimately because acpi_dev_ready_for_enumeration() returns
> true for a device that is not enabled. The ACPI Processor driver
> will not register such CPUs as they are not 'decoding their resources'.
> 
> ACPI allows a device to be functional instead of maintaining the
> present and enabled bit, but we can't simply check the enabled bit
> for all devices since firmware can be buggy.
> 
> If ACPI indicates that the device is present and enabled, then all well
> and good, we can enumerate it. However, if the device is present and not
> enabled, then we also check whether the device is a processor device
> to limit the impact of this new check to just processor devices.
> 
> This avoids enumerating present && functional processor devices that
> are not enabled.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Co-developed-by: Rafael J. Wysocki <rjw@rjwysocki.net>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
> Changes since RFC v2:
>   * Incorporate comment suggestion by Gavin Shan.
> Changes since RFC v3:
>   * Fixed "sert" typo.
> Changes since RFC v3 (smaller series):
>   * Restrict checking the enabled bit to processor devices, update
>     commit comments.
>   * Use Rafael's suggestion in
>     https://lore.kernel.org/r/5760569.DvuYhMxLoT@kreacher
> ---
>   drivers/acpi/acpi_processor.c | 11 ++++++++
>   drivers/acpi/device_pm.c      |  2 +-
>   drivers/acpi/device_sysfs.c   |  2 +-
>   drivers/acpi/internal.h       |  4 ++-
>   drivers/acpi/property.c       |  2 +-
>   drivers/acpi/scan.c           | 49 ++++++++++++++++++++++++++++-------
>   6 files changed, 56 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 4fe2ef54088c..cf7c1cca69dd 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -626,6 +626,17 @@ static struct acpi_scan_handler processor_handler = {
>   	},
>   };
>   
> +bool acpi_device_is_processor(const struct acpi_device *adev)
> +{
> +	if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
> +		return true;
> +
> +	if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
> +		return false;
> +
> +	return acpi_scan_check_handler(adev, &processor_handler);
> +}
> +
>   static int acpi_processor_container_attach(struct acpi_device *dev,
>   					   const struct acpi_device_id *id)
>   {
> diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> index 3b4d048c4941..e3c80f3b3b57 100644
> --- a/drivers/acpi/device_pm.c
> +++ b/drivers/acpi/device_pm.c
> @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
>   		return -EINVAL;
>   
>   	device->power.state = ACPI_STATE_UNKNOWN;
> -	if (!acpi_device_is_present(device)) {
> +	if (!acpi_dev_ready_for_enumeration(device)) {
>   		device->flags.initialized = false;
>   		return -ENXIO;
>   	}
> diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
> index 23373faa35ec..a0256d2493a7 100644
> --- a/drivers/acpi/device_sysfs.c
> +++ b/drivers/acpi/device_sysfs.c
> @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
>   	struct acpi_hardware_id *id;
>   
>   	/* Avoid unnecessarily loading modules for non present devices. */
> -	if (!acpi_device_is_present(acpi_dev))
> +	if (!acpi_dev_ready_for_enumeration(acpi_dev))
>   		return 0;
>   
>   	/*
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index 866c7c4ed233..9388d4c8674a 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(struct acpi_hotplug_profile *hotplug,
>   int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
>   				       const char *hotplug_profile_name);
>   void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +			     struct acpi_scan_handler *handler);
>   
>   #ifdef CONFIG_DEBUG_FS
>   extern struct dentry *acpi_debugfs_dir;
> @@ -107,7 +109,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
>   void acpi_device_remove_files(struct acpi_device *dev);
>   void acpi_device_add_finalize(struct acpi_device *device);
>   void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
> -bool acpi_device_is_present(const struct acpi_device *adev);
>   bool acpi_device_is_battery(struct acpi_device *adev);
>   bool acpi_device_is_first_physical_node(struct acpi_device *adev,
>   					const struct device *dev);
> @@ -119,6 +120,7 @@ int acpi_bus_register_early_device(int type);
>   const struct acpi_device *acpi_companion_match(const struct device *dev);
>   int __acpi_device_uevent_modalias(const struct acpi_device *adev,
>   				  struct kobj_uevent_env *env);
> +bool acpi_device_is_processor(const struct acpi_device *adev);
>   
>   /* --------------------------------------------------------------------------
>                                     Power Resource
> diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> index 6979a3f9f90a..14d6948fd88a 100644
> --- a/drivers/acpi/property.c
> +++ b/drivers/acpi/property.c
> @@ -1420,7 +1420,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
>   	if (!is_acpi_device_node(fwnode))
>   		return false;
>   
> -	return acpi_device_is_present(to_acpi_device_node(fwnode));
> +	return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
>   }
>   
>   static const void *
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 02bb2cce423f..f94d1f744bcc 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
>   	int error;
>   
>   	acpi_bus_get_status(adev);
> -	if (acpi_device_is_present(adev)) {
> +	if (acpi_dev_ready_for_enumeration(adev)) {
>   		/*
>   		 * This function is only called for device objects for which
>   		 * matching scan handlers exist.  The only situation in which
> @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
>   	int error;
>   
>   	acpi_bus_get_status(adev);
> -	if (!acpi_device_is_present(adev)) {
> +	if (!acpi_dev_ready_for_enumeration(adev)) {
>   		acpi_scan_device_not_enumerated(adev);
>   		return 0;
>   	}
> @@ -1913,11 +1913,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
>   	return true;
>   }
>   
> -bool acpi_device_is_present(const struct acpi_device *adev)
> -{
> -	return adev->status.present || adev->status.functional;
> -}
> -
>   static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>   				       const char *idstr,
>   				       const struct acpi_device_id **matchid)
> @@ -1938,6 +1933,18 @@ static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>   	return false;
>   }
>   
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +			     struct acpi_scan_handler *handler)
> +{
> +	struct acpi_hardware_id *hwid;
> +
> +	list_for_each_entry(hwid, &adev->pnp.ids, list)
> +		if (acpi_scan_handler_matching(handler, hwid->id, NULL))
> +			return true;
> +
> +	return false;
> +}
> +
>   static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
>   					const struct acpi_device_id **matchid)
>   {
> @@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
>    * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
>    * @device: Pointer to the &struct acpi_device to check
>    *
> - * Check if the device is present and has no unmet dependencies.
> + * Check if the device is functional or enabled and has no unmet dependencies.
>    *
> - * Return true if the device is ready for enumeratino. Otherwise, return false.
> + * Return true if the device is ready for enumeration. Otherwise, return false.
>    */
>   bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
>   {
>   	if (device->flags.honor_deps && device->dep_unmet)
>   		return false;
>   
> -	return acpi_device_is_present(device);
> +	/*
> +	 * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> +	 * (!present && functional) for certain types of devices that should be
> +	 * enumerated. Note that the enabled bit should not be set unless the
> +	 * present bit is set.
> +	 *
> +	 * However, limit this only to processor devices to reduce possible
> +	 * regressions with firmware.
> +	 */
> +	if (device->status.functional)
> +		return true;
> +
> +	if (!device->status.present)
> +		return false;
> +
> +	/*
> +	 * Fast path - if enabled is set, avoid the more expensive test to
> +	 * check whether this device is a processor.
> +	 */
> +	if (device->status.enabled)
> +		return true;
> +

It may be worth replacing 'if enabled is set' with 'if the enabled bit is set',
to be consistent with the terminology used in the comments above.

Apart from that, the patch itself looks good to me.

> +	return !acpi_device_is_processor(device);
>   }
>   EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
>   

Thanks,
Gavin
Russell King (Oracle) Jan. 29, 2024, 2:55 p.m. UTC | #15
Hi Jonathan,

On Fri, Jan 12, 2024 at 11:52:05AM +0000, Jonathan Cameron wrote:
> On Thu, 11 Jan 2024 10:26:15 +0000
> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > @@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> >   * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> >   * @device: Pointer to the &struct acpi_device to check
> >   *
> > - * Check if the device is present and has no unmet dependencies.
> > + * Check if the device is functional or enabled and has no unmet dependencies.
> >   *
> > - * Return true if the device is ready for enumeratino. Otherwise, return false.
> > + * Return true if the device is ready for enumeration. Otherwise, return false.
> >   */
> >  bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> >  {
> >  	if (device->flags.honor_deps && device->dep_unmet)
> >  		return false;
> >  
> > -	return acpi_device_is_present(device);
> > +	/*
> > +	 * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> > +	 * (!present && functional) for certain types of devices that should be
> > +	 * enumerated. Note that the enabled bit should not be set unless the
> > +	 * present bit is set.
> > +	 *
> > +	 * However, limit this only to processor devices to reduce possible
> > +	 * regressions with firmware.
> > +	 */
> > +	if (device->status.functional)
> > +		return true;

I have a report from within Oracle that this causes testing failures
with QEMU using -smp cpus=2,maxcpus=4. I think it needs to be:

	if (!device->status.present)
		return device->status.functional;

	if (device->status.enabled)
		return true;

	return !acpi_device_is_processor(device);

So we can better understand the history here, let's list it as a
truth table. P=present, F=functional, E=enabled, Orig=how the code
is in mainline, James=James' original proposal, Rafael=the proposed
replacement but seems to be buggy, Rmk=the fixed version that passes
tests:

P F E	Orig	James	Rafael		Rmk
0 0 0	0	0	0		0
0 0 1	0	0	0		0
0 1 0	1	1	1		1
0 1 1	1	0	1		1
1 0 0	1	0	!processor	!processor
1 0 1	1	1	1		1
1 1 0	1	0	1		!processor
1 1 1	1	1	1		1

Any objections to this?
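
For anyone who wants to check the table, the three columns whose code is
quoted in this thread (Orig, Rafael, Rmk) can be reproduced with a small
standalone sketch. This is not kernel code: struct sta_bits, the ready_*()
helpers and the is_processor argument are hypothetical stand-ins for the
acpi_device status bits and acpi_device_is_processor(). The James column is
left out because that version of the code is not quoted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the three _STA-derived status bits. */
struct sta_bits {
	bool present;
	bool functional;
	bool enabled;
};

/* Orig: what mainline acpi_device_is_present() computes. */
bool ready_orig(struct sta_bits s)
{
	return s.present || s.functional;
}

/* Rafael: the quoted patch, which tests the functional bit first. */
bool ready_rafael(struct sta_bits s, bool is_processor)
{
	if (s.functional)
		return true;
	if (!s.present)
		return false;
	if (s.enabled)
		return true;
	return !is_processor;
}

/* Rmk: functional only matters when the device is not present. */
bool ready_rmk(struct sta_bits s, bool is_processor)
{
	if (!s.present)
		return s.functional;
	if (s.enabled)
		return true;
	return !is_processor;
}

/* Print one row per P/F/E combination for the given device kind. */
void print_table(bool is_processor)
{
	printf("P F E\tOrig\tRafael\tRmk\n");
	for (int i = 0; i < 8; i++) {
		struct sta_bits s = {
			.present = (i & 4) != 0,
			.functional = (i & 2) != 0,
			.enabled = (i & 1) != 0,
		};

		printf("%d %d %d\t%d\t%d\t%d\n",
		       s.present, s.functional, s.enabled,
		       ready_orig(s),
		       ready_rafael(s, is_processor),
		       ready_rmk(s, is_processor));
	}
}

/* The only row where Rafael and Rmk differ: P=1 F=1 E=0. */
void self_check(void)
{
	struct sta_bits s = { .present = true, .functional = true, .enabled = false };

	assert(ready_orig(s));
	assert(ready_rafael(s, true));
	assert(!ready_rmk(s, true));	/* processor: not enumerated */
	assert(ready_rmk(s, false));	/* anything else: enumerated */
}
```

Calling print_table(true) from a main() gives the processor view of the
table and print_table(false) the view for every other device; the
"!processor" entries above correspond to the rows that differ between the
two calls.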
Rafael J. Wysocki Jan. 29, 2024, 3:05 p.m. UTC | #16
On Mon, Jan 29, 2024 at 3:55 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> Hi Jonathan,
>
> On Fri, Jan 12, 2024 at 11:52:05AM +0000, Jonathan Cameron wrote:
> > On Thu, 11 Jan 2024 10:26:15 +0000
> > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > @@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> > >   * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> > >   * @device: Pointer to the &struct acpi_device to check
> > >   *
> > > - * Check if the device is present and has no unmet dependencies.
> > > + * Check if the device is functional or enabled and has no unmet dependencies.
> > >   *
> > > - * Return true if the device is ready for enumeratino. Otherwise, return false.
> > > + * Return true if the device is ready for enumeration. Otherwise, return false.
> > >   */
> > >  bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> > >  {
> > >     if (device->flags.honor_deps && device->dep_unmet)
> > >             return false;
> > >
> > > -   return acpi_device_is_present(device);
> > > +   /*
> > > +    * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> > > +    * (!present && functional) for certain types of devices that should be
> > > +    * enumerated. Note that the enabled bit should not be set unless the
> > > +    * present bit is set.
> > > +    *
> > > +    * However, limit this only to processor devices to reduce possible
> > > +    * regressions with firmware.
> > > +    */
> > > +   if (device->status.functional)
> > > +           return true;
>
> I have a report from within Oracle that this causes testing failures
> with QEMU using -smp cpus=2,maxcpus=4. I think it needs to be:
>
>         if (!device->status.present)
>                 return device->status.functional;
>
>         if (device->status.enabled)
>                 return true;
>
>         return !acpi_device_is_processor(device);

The above is fine by me.

> So we can better understand the history here, let's list it as a
> truth table. P=present, F=functional, E=enabled, Orig=how the code
> is in mainline, James=James' original proposal, Rafael=the proposed
> replacement but seems to be buggy, Rmk=the fixed version that passes
> tests:
>
> P F E   Orig    James   Rafael          Rmk
> 0 0 0   0       0       0               0
> 0 0 1   0       0       0               0
> 0 1 0   1       1       1               1
> 0 1 1   1       0       1               1
> 1 0 0   1       0       !processor      !processor
> 1 0 1   1       1       1               1
> 1 1 0   1       0       1               !processor
> 1 1 1   1       1       1               1
>
> Any objections to this?

So AFAIAC it can return false if not enabled, but present and
functional.  [Side note: I'm wondering what "functional" means then,
but whatever.]
Russell King (Oracle) Jan. 29, 2024, 3:16 p.m. UTC | #17
On Mon, Jan 29, 2024 at 04:05:42PM +0100, Rafael J. Wysocki wrote:
> On Mon, Jan 29, 2024 at 3:55 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > Hi Jonathan,
> >
> > On Fri, Jan 12, 2024 at 11:52:05AM +0000, Jonathan Cameron wrote:
> > > On Thu, 11 Jan 2024 10:26:15 +0000
> > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > > @@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> > > >   * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> > > >   * @device: Pointer to the &struct acpi_device to check
> > > >   *
> > > > - * Check if the device is present and has no unmet dependencies.
> > > > + * Check if the device is functional or enabled and has no unmet dependencies.
> > > >   *
> > > > - * Return true if the device is ready for enumeratino. Otherwise, return false.
> > > > + * Return true if the device is ready for enumeration. Otherwise, return false.
> > > >   */
> > > >  bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> > > >  {
> > > >     if (device->flags.honor_deps && device->dep_unmet)
> > > >             return false;
> > > >
> > > > -   return acpi_device_is_present(device);
> > > > +   /*
> > > > +    * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> > > > +    * (!present && functional) for certain types of devices that should be
> > > > +    * enumerated. Note that the enabled bit should not be set unless the
> > > > +    * present bit is set.
> > > > +    *
> > > > +    * However, limit this only to processor devices to reduce possible
> > > > +    * regressions with firmware.
> > > > +    */
> > > > +   if (device->status.functional)
> > > > +           return true;
> >
> > I have a report from within Oracle that this causes testing failures
> > with QEMU using -smp cpus=2,maxcpus=4. I think it needs to be:
> >
> >         if (!device->status.present)
> >                 return device->status.functional;
> >
> >         if (device->status.enabled)
> >                 return true;
> >
> >         return !acpi_device_is_processor(device);
> 
> The above is fine by me.
> 
> > So we can better understand the history here, let's list it as a
> > truth table. P=present, F=functional, E=enabled, Orig=how the code
> > is in mainline, James=James' original proposal, Rafael=the proposed
> > replacement but seems to be buggy, Rmk=the fixed version that passes
> > tests:
> >
> > P F E   Orig    James   Rafael          Rmk
> > 0 0 0   0       0       0               0
> > 0 0 1   0       0       0               0
> > 0 1 0   1       1       1               1
> > 0 1 1   1       0       1               1
> > 1 0 0   1       0       !processor      !processor
> > 1 0 1   1       1       1               1
> > 1 1 0   1       0       1               !processor
> > 1 1 1   1       1       1               1
> >
> > Any objections to this?
> 
> So AFAIAC it can return false if not enabled, but present and
> functional.  [Side note: I'm wondering what "functional" means then,
> but whatever.]

From ACPI v6.5 (bit 3 is our "status.functional"):

 _STA may return bit 0 clear (not present) with bit [3] set (device is
 functional). This case is used to indicate a valid device for which no
 device driver should be loaded (for example, a bridge device.) Children
 of this device may be present and valid. OSPM should continue
 enumeration below a device whose _STA returns this bit combination.

So, for this case, acpi_dev_ready_for_enumeration() returning true is
correct, since we're supposed to enumerate it and its child devices.

It's probably also worth pointing out that in the above table, the two
combinations with P=0 E=1 go against the spec, but are included for
completeness.
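
As a rough illustration of the two points above, here is a minimal sketch
that works on a raw _STA value. The bit positions (0 = present, 1 = enabled,
3 = functional) are those given in the spec's _STA description; the STA_*
macro names and both helper functions are made up for this example and are
not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

/* _STA bit positions; the macro names are local to this sketch. */
#define STA_PRESENT	(1u << 0)
#define STA_ENABLED	(1u << 1)
#define STA_FUNCTIONAL	(1u << 3)

/* Enabled set without present set: the P=0 E=1 rows of the table. */
bool sta_violates_spec(uint32_t sta)
{
	return (sta & STA_ENABLED) && !(sta & STA_PRESENT);
}

/*
 * Not present but functional: a valid device for which no driver should
 * be loaded, while enumeration continues below it.
 */
bool sta_enumerate_children_only(uint32_t sta)
{
	return !(sta & STA_PRESENT) && (sta & STA_FUNCTIONAL);
}
```

With these helpers, a value of 0x08 (functional only) is the bridge-like
case from the quoted text, and 0x0a (enabled and functional, not present) is
one of the two combinations that the spec does not allow.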
Rafael J. Wysocki Jan. 29, 2024, 3:34 p.m. UTC | #18
On Mon, Jan 29, 2024 at 4:17 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Mon, Jan 29, 2024 at 04:05:42PM +0100, Rafael J. Wysocki wrote:
> > On Mon, Jan 29, 2024 at 3:55 PM Russell King (Oracle)
> > <linux@armlinux.org.uk> wrote:
> > >
> > > Hi Jonathan,
> > >
> > > On Fri, Jan 12, 2024 at 11:52:05AM +0000, Jonathan Cameron wrote:
> > > > On Thu, 11 Jan 2024 10:26:15 +0000
> > > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > > > @@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> > > > >   * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> > > > >   * @device: Pointer to the &struct acpi_device to check
> > > > >   *
> > > > > - * Check if the device is present and has no unmet dependencies.
> > > > > + * Check if the device is functional or enabled and has no unmet dependencies.
> > > > >   *
> > > > > - * Return true if the device is ready for enumeratino. Otherwise, return false.
> > > > > + * Return true if the device is ready for enumeration. Otherwise, return false.
> > > > >   */
> > > > >  bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> > > > >  {
> > > > >     if (device->flags.honor_deps && device->dep_unmet)
> > > > >             return false;
> > > > >
> > > > > -   return acpi_device_is_present(device);
> > > > > +   /*
> > > > > +    * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> > > > > +    * (!present && functional) for certain types of devices that should be
> > > > > +    * enumerated. Note that the enabled bit should not be set unless the
> > > > > +    * present bit is set.
> > > > > +    *
> > > > > +    * However, limit this only to processor devices to reduce possible
> > > > > +    * regressions with firmware.
> > > > > +    */
> > > > > +   if (device->status.functional)
> > > > > +           return true;
> > >
> > > I have a report from within Oracle that this causes testing failures
> > > with QEMU using -smp cpus=2,maxcpus=4. I think it needs to be:
> > >
> > >         if (!device->status.present)
> > >                 return device->status.functional;
> > >
> > >         if (device->status.enabled)
> > >                 return true;
> > >
> > >         return !acpi_device_is_processor(device);
> >
> > The above is fine by me.
> >
> > > So we can better understand the history here, let's list it as a
> > > truth table. P=present, F=functional, E=enabled, Orig=how the code
> > > is in mainline, James=James' original proposal, Rafael=the proposed
> > > replacement but seems to be buggy, Rmk=the fixed version that passes
> > > tests:
> > >
> > > P F E   Orig    James   Rafael          Rmk
> > > 0 0 0   0       0       0               0
> > > 0 0 1   0       0       0               0
> > > 0 1 0   1       1       1               1
> > > 0 1 1   1       0       1               1
> > > 1 0 0   1       0       !processor      !processor
> > > 1 0 1   1       1       1               1
> > > 1 1 0   1       0       1               !processor
> > > 1 1 1   1       1       1               1
> > >
> > > Any objections to this?
> >
> > So AFAIAC it can return false if not enabled, but present and
> > functional.  [Side note: I'm wondering what "functional" means then,
> > but whatever.]
>
> From ACPI v6.5 (bit 3 is our "status.functional"):
>
>  _STA may return bit 0 clear (not present) with bit [3] set (device is
>  functional). This case is used to indicate a valid device for which no
>  device driver should be loaded (for example, a bridge device.) Children
>  of this device may be present and valid. OSPM should continue
>  enumeration below a device whose _STA returns this bit combination.
>
> So, for this case, acpi_dev_ready_for_enumeration() returning true is
> correct, since we're supposed to enumerate it and its child devices.
>
> It's probably also worth pointing out that in the above table, the two
> combinations with P=0 E=1 go against the spec, but are included for
> completeness.

AFAICS, the difference between the last two columns is the present and
functional, but not enabled combination, for which my patch just
returned true, but the firmware disagrees with that.

It is kind of analogous to the "not present and functional" case
covered by the spec, which is why it is fine by me to return "false"
then (for processors), but the spec is not crystal clear about it.
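
For reference, the "Orig" and "Rmk" columns of the truth table quoted above
can be modelled in userspace roughly as follows. This is only a sketch:
struct sta_model and the ready_*() helpers are hypothetical names, the
is_processor field stands in for acpi_device_is_processor(), and the
honor_deps/dep_unmet check is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the status bits of struct acpi_device. */
struct sta_model {
	bool present;
	bool functional;
	bool enabled;
	bool is_processor;	/* models acpi_device_is_processor() */
};

/* "Orig" column: what acpi_device_is_present() checks in mainline. */
static bool ready_orig(const struct sta_model *d)
{
	return d->present || d->functional;
}

/* "Rmk" column: the fixed version quoted above. */
static bool ready_rmk(const struct sta_model *d)
{
	/* Not present: enumerate only if firmware says "functional". */
	if (!d->present)
		return d->functional;

	/* Present and enabled: always enumerate. */
	if (d->enabled)
		return true;

	/* Present but not enabled: skip processors only. */
	return !d->is_processor;
}
```

For a processor reporting P=1 F=1 E=0, ready_orig() returns true while
ready_rmk() returns false, which is the row where the last two columns of
the table differ.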

Patch

diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index 3b4d048c4941..e3c80f3b3b57 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -313,7 +313,7 @@  int acpi_bus_init_power(struct acpi_device *device)
 		return -EINVAL;
 
 	device->power.state = ACPI_STATE_UNKNOWN;
-	if (!acpi_device_is_present(device)) {
+	if (!acpi_dev_ready_for_enumeration(device)) {
 		device->flags.initialized = false;
 		return -ENXIO;
 	}
diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
index 23373faa35ec..a0256d2493a7 100644
--- a/drivers/acpi/device_sysfs.c
+++ b/drivers/acpi/device_sysfs.c
@@ -141,7 +141,7 @@  static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
 	struct acpi_hardware_id *id;
 
 	/* Avoid unnecessarily loading modules for non present devices. */
-	if (!acpi_device_is_present(acpi_dev))
+	if (!acpi_dev_ready_for_enumeration(acpi_dev))
 		return 0;
 
 	/*
diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 866c7c4ed233..a1b45e345bcc 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -107,7 +107,6 @@  int acpi_device_setup_files(struct acpi_device *dev);
 void acpi_device_remove_files(struct acpi_device *dev);
 void acpi_device_add_finalize(struct acpi_device *device);
 void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
-bool acpi_device_is_present(const struct acpi_device *adev);
 bool acpi_device_is_battery(struct acpi_device *adev);
 bool acpi_device_is_first_physical_node(struct acpi_device *adev,
 					const struct device *dev);
diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
index 6979a3f9f90a..14d6948fd88a 100644
--- a/drivers/acpi/property.c
+++ b/drivers/acpi/property.c
@@ -1420,7 +1420,7 @@  static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
 	if (!is_acpi_device_node(fwnode))
 		return false;
 
-	return acpi_device_is_present(to_acpi_device_node(fwnode));
+	return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
 }
 
 static const void *
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 02bb2cce423f..728649a2a251 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -304,7 +304,7 @@  static int acpi_scan_device_check(struct acpi_device *adev)
 	int error;
 
 	acpi_bus_get_status(adev);
-	if (acpi_device_is_present(adev)) {
+	if (acpi_dev_ready_for_enumeration(adev)) {
 		/*
 		 * This function is only called for device objects for which
 		 * matching scan handlers exist.  The only situation in which
@@ -338,7 +338,7 @@  static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
 	int error;
 
 	acpi_bus_get_status(adev);
-	if (!acpi_device_is_present(adev)) {
+	if (!acpi_dev_ready_for_enumeration(adev)) {
 		acpi_scan_device_not_enumerated(adev);
 		return 0;
 	}
@@ -1913,11 +1913,6 @@  static bool acpi_device_should_be_hidden(acpi_handle handle)
 	return true;
 }
 
-bool acpi_device_is_present(const struct acpi_device *adev)
-{
-	return adev->status.present || adev->status.functional;
-}
-
 static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
 				       const char *idstr,
 				       const struct acpi_device_id **matchid)
@@ -2381,16 +2376,25 @@  EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
  * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
  * @device: Pointer to the &struct acpi_device to check
  *
- * Check if the device is present and has no unmet dependencies.
+ * Check if the device is functional or enabled and has no unmet dependencies.
  *
- * Return true if the device is ready for enumeratino. Otherwise, return false.
+ * Return true if the device is ready for enumeration. Otherwise, return false.
  */
 bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
 {
 	if (device->flags.honor_deps && device->dep_unmet)
 		return false;
 
-	return acpi_device_is_present(device);
+	/*
+	 * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
+	 * (!present && functional) for certain types of devices that should be
+	 * enumerated. Note that the enabled bit can't be set until the present
+	 * bit is set.
+	 */
+	if (device->status.present)
+		return device->status.enabled;
+	else
+		return device->status.functional;
 }
 EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);