diff mbox series

[RFC,v4,02/15] ACPI: processor: Register all CPUs from acpi_processor_get_info()

Message ID E1rVDmU-0027YP-Jz@rmk-PC.armlinux.org.uk (mailing list archive)
State Superseded, archived
Headers show
Series ACPI/arm64: add support for virtual cpu hotplug | expand

Commit Message

Russell King (Oracle) Jan. 31, 2024, 4:49 p.m. UTC
From: James Morse <james.morse@arm.com>

To allow ACPI to skip the call to arch_register_cpu() when the _STA
value indicates the CPU can't be brought online right now, move the
arch_register_cpu() call into acpi_processor_get_info().

Systems can still be booted with 'acpi=off', or not include an
ACPI description at all. For these, the CPUs continue to be
registered by cpu_dev_register_generic().

This moves the CPU register logic back to a subsys_initcall(),
while the memory nodes will have been registered earlier.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
Changes since RFC v2:
 * Fixup comment in acpi_processor_get_info() (Gavin Shan)
 * Add comment in cpu_dev_register_generic() (Gavin Shan)
---
 drivers/acpi/acpi_processor.c | 12 ++++++++++++
 drivers/base/cpu.c            |  6 +++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

Comments

Rafael J. Wysocki Feb. 15, 2024, 7:22 p.m. UTC | #1
On Wed, Jan 31, 2024 at 5:50 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
>
> From: James Morse <james.morse@arm.com>
>
> To allow ACPI to skip the call to arch_register_cpu() when the _STA
> value indicates the CPU can't be brought online right now, move the
> arch_register_cpu() call into acpi_processor_get_info().
>
> Systems can still be booted with 'acpi=off', or not include an
> ACPI description at all. For these, the CPUs continue to be
> registered by cpu_dev_register_generic().
>
> This moves the CPU register logic back to a subsys_initcall(),
> while the memory nodes will have been registered earlier.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
> Changes since RFC v2:
>  * Fixup comment in acpi_processor_get_info() (Gavin Shan)
>  * Add comment in cpu_dev_register_generic() (Gavin Shan)
> ---
>  drivers/acpi/acpi_processor.c | 12 ++++++++++++
>  drivers/base/cpu.c            |  6 +++++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index cf7c1cca69dd..a68c475cdea5 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -314,6 +314,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
>                         cpufreq_add_device("acpi-cpufreq");
>         }
>
> +       /*
> +        * Register CPUs that are present. get_cpu_device() is used to skip
> +        * duplicate CPU descriptions from firmware.
> +        */
> +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> +           !get_cpu_device(pr->id)) {
> +               int ret = arch_register_cpu(pr->id);
> +
> +               if (ret)
> +                       return ret;
> +       }
> +
>         /*
>          *  Extra Processor objects may be enumerated on MP systems with
>          *  less than the max # of CPUs. They should be ignored _iff

This is interesting, because right below there is the following code:

    if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
        int ret = acpi_processor_hotadd_init(pr);

        if (ret)
            return ret;
    }

and acpi_processor_hotadd_init() essentially calls arch_register_cpu()
with some extra things around it (more about that below).

I do realize that acpi_processor_hotadd_init() is defined under
CONFIG_ACPI_HOTPLUG_CPU, so for the sake of the argument let's
consider an architecture where CONFIG_ACPI_HOTPLUG_CPU is set.

So why are the two conditionals that almost contradict each other both
needed?  It looks like the new code could be combined with
acpi_processor_hotadd_init() to do the right thing in all cases.

Now, acpi_processor_hotadd_init() does some extra things that look
like they should be done by the new code too.

1. It checks invalid_phys_cpuid() which appears to be a good idea to me.

2. It uses locking around arch_register_cpu() which doesn't seem
unreasonable either.

3. It calls acpi_map_cpu() and I'm not sure why this is not done by
the new code.

The only thing that can be dropped from it is the _STA check AFAICS,
because acpi_processor_add() won't even be called if the CPU is not
present (and not enabled after the first patch).

So why does the code not do 1 - 3 above?

> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 47de0f140ba6..13d052bf13f4 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -553,7 +553,11 @@ static void __init cpu_dev_register_generic(void)
>  {
>         int i, ret;
>
> -       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
> +       /*
> +        * When ACPI is enabled, CPUs are registered via
> +        * acpi_processor_get_info().
> +        */
> +       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
>                 return;

Honestly, this looks like a quick hack to me and it absolutely
requires an ACK from the x86 maintainers to go anywhere.

>
>         for_each_present_cpu(i) {
> --
Russell King (Oracle) Feb. 20, 2024, 11:27 a.m. UTC | #2
On Thu, Feb 15, 2024 at 08:22:29PM +0100, Rafael J. Wysocki wrote:
> On Wed, Jan 31, 2024 at 5:50 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index cf7c1cca69dd..a68c475cdea5 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -314,6 +314,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >                         cpufreq_add_device("acpi-cpufreq");
> >         }
> >
> > +       /*
> > +        * Register CPUs that are present. get_cpu_device() is used to skip
> > +        * duplicate CPU descriptions from firmware.
> > +        */
> > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > +           !get_cpu_device(pr->id)) {
> > +               int ret = arch_register_cpu(pr->id);
> > +
> > +               if (ret)
> > +                       return ret;
> > +       }
> > +
> >         /*
> >          *  Extra Processor objects may be enumerated on MP systems with
> >          *  less than the max # of CPUs. They should be ignored _iff
> 
> This is interesting, because right below there is the following code:
> 
>     if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>         int ret = acpi_processor_hotadd_init(pr);
> 
>         if (ret)
>             return ret;
>     }
> 
> and acpi_processor_hotadd_init() essentially calls arch_register_cpu()
> with some extra things around it (more about that below).
> 
> I do realize that acpi_processor_hotadd_init() is defined under
> CONFIG_ACPI_HOTPLUG_CPU, so for the sake of the argument let's
> consider an architecture where CONFIG_ACPI_HOTPLUG_CPU is set.
> 
> So why are the two conditionals that almost contradict each other both
> needed?  It looks like the new code could be combined with
> acpi_processor_hotadd_init() to do the right thing in all cases.
> 
> Now, acpi_processor_hotadd_init() does some extra things that look
> like they should be done by the new code too.
> 
> 1. It checks invalid_phys_cpuid() which appears to be a good idea to me.
> 
> 2. It uses locking around arch_register_cpu() which doesn't seem
> unreasonable either.
> 
> 3. It calls acpi_map_cpu() and I'm not sure why this is not done by
> the new code.
> 
> The only thing that can be dropped from it is the _STA check AFAICS,
> because acpi_processor_add() won't even be called if the CPU is not
> present (and not enabled after the first patch).
> 
> So why does the code not do 1 - 3 above?

Honestly, I'm out of my depth with this and can't answer your
questions - and I really don't want to try fiddling with this code
because it's just too icky (even in its current form in mainline)
to be understandable to anyone who hasn't gained a detailed knowledge
of this code.

It's going to require a lot of analysis - how acpi_map_cpuid() behaves
in all circumstances, what this means for invalid_logical_cpuid() and
invalid_phys_cpuid(), what paths will be taken in each case. This code
is already just too hairy for someone who isn't an experienced ACPI
hacker to be able to follow and I don't see an obvious way to make it
more readable.

James' additions make it even more complex and less readable.
Russell King (Oracle) Feb. 20, 2024, 3:13 p.m. UTC | #3
On Tue, Feb 20, 2024 at 11:27:15AM +0000, Russell King (Oracle) wrote:
> On Thu, Feb 15, 2024 at 08:22:29PM +0100, Rafael J. Wysocki wrote:
> > On Wed, Jan 31, 2024 at 5:50 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > index cf7c1cca69dd..a68c475cdea5 100644
> > > --- a/drivers/acpi/acpi_processor.c
> > > +++ b/drivers/acpi/acpi_processor.c
> > > @@ -314,6 +314,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > >                         cpufreq_add_device("acpi-cpufreq");
> > >         }
> > >
> > > +       /*
> > > +        * Register CPUs that are present. get_cpu_device() is used to skip
> > > +        * duplicate CPU descriptions from firmware.
> > > +        */
> > > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > > +           !get_cpu_device(pr->id)) {
> > > +               int ret = arch_register_cpu(pr->id);
> > > +
> > > +               if (ret)
> > > +                       return ret;
> > > +       }
> > > +
> > >         /*
> > >          *  Extra Processor objects may be enumerated on MP systems with
> > >          *  less than the max # of CPUs. They should be ignored _iff
> > 
> > This is interesting, because right below there is the following code:
> > 
> >     if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> >         int ret = acpi_processor_hotadd_init(pr);
> > 
> >         if (ret)
> >             return ret;
> >     }
> > 
> > and acpi_processor_hotadd_init() essentially calls arch_register_cpu()
> > with some extra things around it (more about that below).
> > 
> > I do realize that acpi_processor_hotadd_init() is defined under
> > CONFIG_ACPI_HOTPLUG_CPU, so for the sake of the argument let's
> > consider an architecture where CONFIG_ACPI_HOTPLUG_CPU is set.
> > 
> > So why are the two conditionals that almost contradict each other both
> > needed?  It looks like the new code could be combined with
> > acpi_processor_hotadd_init() to do the right thing in all cases.
> > 
> > Now, acpi_processor_hotadd_init() does some extra things that look
> > like they should be done by the new code too.
> > 
> > 1. It checks invalid_phys_cpuid() which appears to be a good idea to me.
> > 
> > 2. It uses locking around arch_register_cpu() which doesn't seem
> > unreasonable either.
> > 
> > 3. It calls acpi_map_cpu() and I'm not sure why this is not done by
> > the new code.
> > 
> > The only thing that can be dropped from it is the _STA check AFAICS,
> > because acpi_processor_add() won't even be called if the CPU is not
> > present (and not enabled after the first patch).
> > 
> > So why does the code not do 1 - 3 above?
> 
> Honestly, I'm out of my depth with this and can't answer your
> questions - and I really don't want to try fiddling with this code
> because it's just too icky (even in its current form in mainline)
> to be understandable to anyone who hasn't gained a detailed knowledge
> of this code.
> 
> It's going to require a lot of analysis - how acpi_map_cpuid() behaves
> in all circumstances, what this means for invalid_logical_cpuid() and
> invalid_phys_cpuid(), what paths will be taken in each case. This code
> is already just too hairy for someone who isn't an experienced ACPI
> hacker to be able to follow and I don't see an obvious way to make it
> more readable.
> 
> James' additions make it even more complex and less readable.

As an illustration of the problems I'm having here, I was just writing
a reply to this with a suggestion of transforming this code ultimately
to:

	if (!get_cpu_device(pr->id)) {
		int ret;

		if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id))
			ret = acpi_processor_make_enabled(pr);
		else
			ret = acpi_processor_make_present(pr);

		if (ret)
			return ret;
	}

(acpi_processor_make_present() would be acpi_processor_hotadd_init()
and acpi_processor_make_enabled() would be arch_register_cpu() at this
point.)

Then I realised that's a bad idea - because we really need to check
that pr->id is valid before calling get_cpu_device() on it, so this
won't work. That leaves us with:

	int ret;

	if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
		/* x86 et.al. path */
		ret = acpi_processor_make_present(pr);
	} else if (!get_cpu_device(pr->id)) {
		/* Arm64 path */
		ret = acpi_processor_make_enabled(pr);
	} else {
		ret = 0;
	}

	if (ret)
		return ret;

Now, the next transformation would be to move !get_cpu_device(pr->id)
into acpi_processor_make_enabled() which would eliminate one of those
if() legs.

Now, if we want to somehow make the call to arch_regster_cpu() common
in these two paths, the next question is what are the _precise_
semantics of acpi_map_cpu(), particularly with respect to it
modifying pr->id. Is it guaranteed to always give the same result
for the same processor described in ACPI? What acpi_map_cpu() anyway,
I can find no documentation for it.

Then there's the question whether calling acpi_unmap_cpu() should be
done on the failure path if arch_register_cpu() fails, which is done
for the x86 path but not the Arm64 path. Should it be done for the
Arm64 path? I've no idea, but as Arm64 doesn't implement either of
these two functions, I guess they could be stubbed out and thus be
no-ops - but then we open a hole where if pr->id is invalid, we
end up passing that invalid value to arch_register_cpu() which I'm
quite sure will explode with a negative CPU number.

So, to my mind, what you're effectively asking for is a total rewrite
of all the code in and called by acpi_processor_get_info()... and that
is not something I am willing to do (because it's too far outside of
my knowledge area.)

As I said in my reply to patch 1, I think your comments on patch 2
make Arm64 vcpu hotplug unachievable in a reasonable time frame, and
certainly outside the bounds of what I can do to progress this.

So, at this point I'm going to stand down from further participation
with this patch set as I believe I've reached the limit of what I can
do to progress it.
Jonathan Cameron Feb. 20, 2024, 4:24 p.m. UTC | #4
On Tue, 20 Feb 2024 15:13:58 +0000
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Tue, Feb 20, 2024 at 11:27:15AM +0000, Russell King (Oracle) wrote:
> > On Thu, Feb 15, 2024 at 08:22:29PM +0100, Rafael J. Wysocki wrote:  
> > > On Wed, Jan 31, 2024 at 5:50 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:  
> > > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > > index cf7c1cca69dd..a68c475cdea5 100644
> > > > --- a/drivers/acpi/acpi_processor.c
> > > > +++ b/drivers/acpi/acpi_processor.c
> > > > @@ -314,6 +314,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > >                         cpufreq_add_device("acpi-cpufreq");
> > > >         }
> > > >
> > > > +       /*
> > > > +        * Register CPUs that are present. get_cpu_device() is used to skip
> > > > +        * duplicate CPU descriptions from firmware.
> > > > +        */
> > > > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > > > +           !get_cpu_device(pr->id)) {
> > > > +               int ret = arch_register_cpu(pr->id);
> > > > +
> > > > +               if (ret)
> > > > +                       return ret;
> > > > +       }
> > > > +
> > > >         /*
> > > >          *  Extra Processor objects may be enumerated on MP systems with
> > > >          *  less than the max # of CPUs. They should be ignored _iff  
> > > 
> > > This is interesting, because right below there is the following code:
> > > 
> > >     if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> > >         int ret = acpi_processor_hotadd_init(pr);
> > > 
> > >         if (ret)
> > >             return ret;
> > >     }
> > > 
> > > and acpi_processor_hotadd_init() essentially calls arch_register_cpu()
> > > with some extra things around it (more about that below).
> > > 
> > > I do realize that acpi_processor_hotadd_init() is defined under
> > > CONFIG_ACPI_HOTPLUG_CPU, so for the sake of the argument let's
> > > consider an architecture where CONFIG_ACPI_HOTPLUG_CPU is set.
> > > 
> > > So why are the two conditionals that almost contradict each other both
> > > needed?  It looks like the new code could be combined with
> > > acpi_processor_hotadd_init() to do the right thing in all cases.
> > > 
> > > Now, acpi_processor_hotadd_init() does some extra things that look
> > > like they should be done by the new code too.
> > > 
> > > 1. It checks invalid_phys_cpuid() which appears to be a good idea to me.
> > > 
> > > 2. It uses locking around arch_register_cpu() which doesn't seem
> > > unreasonable either.
> > > 
> > > 3. It calls acpi_map_cpu() and I'm not sure why this is not done by
> > > the new code.
> > > 
> > > The only thing that can be dropped from it is the _STA check AFAICS,
> > > because acpi_processor_add() won't even be called if the CPU is not
> > > present (and not enabled after the first patch).
> > > 
> > > So why does the code not do 1 - 3 above?  
> > 
> > Honestly, I'm out of my depth with this and can't answer your
> > questions - and I really don't want to try fiddling with this code
> > because it's just too icky (even in its current form in mainline)
> > to be understandable to anyone who hasn't gained a detailed knowledge
> > of this code.
> > 
> > It's going to require a lot of analysis - how acpi_map_cpuid() behaves
> > in all circumstances, what this means for invalid_logical_cpuid() and
> > invalid_phys_cpuid(), what paths will be taken in each case. This code
> > is already just too hairy for someone who isn't an experienced ACPI
> > hacker to be able to follow and I don't see an obvious way to make it
> > more readable.
> > 
> > James' additions make it even more complex and less readable.  
> 
> As an illustration of the problems I'm having here, I was just writing
> a reply to this with a suggestion of transforming this code ultimately
> to:
> 
> 	if (!get_cpu_device(pr->id)) {
> 		int ret;
> 
> 		if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id))
> 			ret = acpi_processor_make_enabled(pr);
> 		else
> 			ret = acpi_processor_make_present(pr);
> 
> 		if (ret)
> 			return ret;
> 	}
> 
> (acpi_processor_make_present() would be acpi_processor_hotadd_init()
> and acpi_processor_make_enabled() would be arch_register_cpu() at this
> point.)
> 
> Then I realised that's a bad idea - because we really need to check
> that pr->id is valid before calling get_cpu_device() on it, so this
> won't work. That leaves us with:
> 
> 	int ret;
> 
> 	if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> 		/* x86 et.al. path */
> 		ret = acpi_processor_make_present(pr);
> 	} else if (!get_cpu_device(pr->id)) {
> 		/* Arm64 path */
> 		ret = acpi_processor_make_enabled(pr);
> 	} else {
> 		ret = 0;
> 	}
> 
> 	if (ret)
> 		return ret;
> 
> Now, the next transformation would be to move !get_cpu_device(pr->id)
> into acpi_processor_make_enabled() which would eliminate one of those
> if() legs.
> 
> Now, if we want to somehow make the call to arch_regster_cpu() common
> in these two paths, the next question is what are the _precise_
> semantics of acpi_map_cpu(), particularly with respect to it
> modifying pr->id. Is it guaranteed to always give the same result
> for the same processor described in ACPI? What acpi_map_cpu() anyway,
> I can find no documentation for it.
> 
> Then there's the question whether calling acpi_unmap_cpu() should be
> done on the failure path if arch_register_cpu() fails, which is done
> for the x86 path but not the Arm64 path. Should it be done for the
> Arm64 path? I've no idea, but as Arm64 doesn't implement either of
> these two functions, I guess they could be stubbed out and thus be
> no-ops - but then we open a hole where if pr->id is invalid, we
> end up passing that invalid value to arch_register_cpu() which I'm
> quite sure will explode with a negative CPU number.
> 
> So, to my mind, what you're effectively asking for is a total rewrite
> of all the code in and called by acpi_processor_get_info()... and that
> is not something I am willing to do (because it's too far outside of
> my knowledge area.)
> 
> As I said in my reply to patch 1, I think your comments on patch 2
> make Arm64 vcpu hotplug unachievable in a reasonable time frame, and
> certainly outside the bounds of what I can do to progress this.
> 
> So, at this point I'm going to stand down from further participation
> with this patch set as I believe I've reached the limit of what I can
> do to progress it.
> 

Thanks for your hard work on this Russell - we have moved forwards.

Short of anyone else stepping up I'll pick this up with
the help of some my colleagues. As such I'm keen on getting patch
1 upstream ASAP so that we can exclude the need for some of the
other workarounds from earlier versions of this series (the ones
dropped before now).

We will need a little time to get up to speed on the current status
and discussion points Russell raises above.

Jonathan
Rafael J. Wysocki Feb. 20, 2024, 7:59 p.m. UTC | #5
On Tue, Feb 20, 2024 at 5:24 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Tue, 20 Feb 2024 15:13:58 +0000
> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
>
> > On Tue, Feb 20, 2024 at 11:27:15AM +0000, Russell King (Oracle) wrote:
> > > On Thu, Feb 15, 2024 at 08:22:29PM +0100, Rafael J. Wysocki wrote:
> > > > On Wed, Jan 31, 2024 at 5:50 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> > > > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > > > index cf7c1cca69dd..a68c475cdea5 100644
> > > > > --- a/drivers/acpi/acpi_processor.c
> > > > > +++ b/drivers/acpi/acpi_processor.c
> > > > > @@ -314,6 +314,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > >                         cpufreq_add_device("acpi-cpufreq");
> > > > >         }
> > > > >
> > > > > +       /*
> > > > > +        * Register CPUs that are present. get_cpu_device() is used to skip
> > > > > +        * duplicate CPU descriptions from firmware.
> > > > > +        */
> > > > > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > > > > +           !get_cpu_device(pr->id)) {
> > > > > +               int ret = arch_register_cpu(pr->id);
> > > > > +
> > > > > +               if (ret)
> > > > > +                       return ret;
> > > > > +       }
> > > > > +
> > > > >         /*
> > > > >          *  Extra Processor objects may be enumerated on MP systems with
> > > > >          *  less than the max # of CPUs. They should be ignored _iff
> > > >
> > > > This is interesting, because right below there is the following code:
> > > >
> > > >     if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> > > >         int ret = acpi_processor_hotadd_init(pr);
> > > >
> > > >         if (ret)
> > > >             return ret;
> > > >     }
> > > >
> > > > and acpi_processor_hotadd_init() essentially calls arch_register_cpu()
> > > > with some extra things around it (more about that below).
> > > >
> > > > I do realize that acpi_processor_hotadd_init() is defined under
> > > > CONFIG_ACPI_HOTPLUG_CPU, so for the sake of the argument let's
> > > > consider an architecture where CONFIG_ACPI_HOTPLUG_CPU is set.
> > > >
> > > > So why are the two conditionals that almost contradict each other both
> > > > needed?  It looks like the new code could be combined with
> > > > acpi_processor_hotadd_init() to do the right thing in all cases.
> > > >
> > > > Now, acpi_processor_hotadd_init() does some extra things that look
> > > > like they should be done by the new code too.
> > > >
> > > > 1. It checks invalid_phys_cpuid() which appears to be a good idea to me.
> > > >
> > > > 2. It uses locking around arch_register_cpu() which doesn't seem
> > > > unreasonable either.
> > > >
> > > > 3. It calls acpi_map_cpu() and I'm not sure why this is not done by
> > > > the new code.
> > > >
> > > > The only thing that can be dropped from it is the _STA check AFAICS,
> > > > because acpi_processor_add() won't even be called if the CPU is not
> > > > present (and not enabled after the first patch).
> > > >
> > > > So why does the code not do 1 - 3 above?
> > >
> > > Honestly, I'm out of my depth with this and can't answer your
> > > questions - and I really don't want to try fiddling with this code
> > > because it's just too icky (even in its current form in mainline)
> > > to be understandable to anyone who hasn't gained a detailed knowledge
> > > of this code.
> > >
> > > It's going to require a lot of analysis - how acpi_map_cpuid() behaves
> > > in all circumstances, what this means for invalid_logical_cpuid() and
> > > invalid_phys_cpuid(), what paths will be taken in each case. This code
> > > is already just too hairy for someone who isn't an experienced ACPI
> > > hacker to be able to follow and I don't see an obvious way to make it
> > > more readable.
> > >
> > > James' additions make it even more complex and less readable.
> >
> > As an illustration of the problems I'm having here, I was just writing
> > a reply to this with a suggestion of transforming this code ultimately
> > to:
> >
> >       if (!get_cpu_device(pr->id)) {
> >               int ret;
> >
> >               if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id))
> >                       ret = acpi_processor_make_enabled(pr);
> >               else
> >                       ret = acpi_processor_make_present(pr);
> >
> >               if (ret)
> >                       return ret;
> >       }
> >
> > (acpi_processor_make_present() would be acpi_processor_hotadd_init()
> > and acpi_processor_make_enabled() would be arch_register_cpu() at this
> > point.)
> >
> > Then I realised that's a bad idea - because we really need to check
> > that pr->id is valid before calling get_cpu_device() on it, so this
> > won't work. That leaves us with:
> >
> >       int ret;
> >
> >       if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> >               /* x86 et.al. path */
> >               ret = acpi_processor_make_present(pr);
> >       } else if (!get_cpu_device(pr->id)) {
> >               /* Arm64 path */
> >               ret = acpi_processor_make_enabled(pr);
> >       } else {
> >               ret = 0;
> >       }
> >
> >       if (ret)
> >               return ret;
> >
> > Now, the next transformation would be to move !get_cpu_device(pr->id)
> > into acpi_processor_make_enabled() which would eliminate one of those
> > if() legs.
> >
> > Now, if we want to somehow make the call to arch_regster_cpu() common
> > in these two paths, the next question is what are the _precise_
> > semantics of acpi_map_cpu(), particularly with respect to it
> > modifying pr->id. Is it guaranteed to always give the same result
> > for the same processor described in ACPI? What acpi_map_cpu() anyway,
> > I can find no documentation for it.
> >
> > Then there's the question whether calling acpi_unmap_cpu() should be
> > done on the failure path if arch_register_cpu() fails, which is done
> > for the x86 path but not the Arm64 path. Should it be done for the
> > Arm64 path? I've no idea, but as Arm64 doesn't implement either of
> > these two functions, I guess they could be stubbed out and thus be
> > no-ops - but then we open a hole where if pr->id is invalid, we
> > end up passing that invalid value to arch_register_cpu() which I'm
> > quite sure will explode with a negative CPU number.
> >
> > So, to my mind, what you're effectively asking for is a total rewrite
> > of all the code in and called by acpi_processor_get_info()... and that
> > is not something I am willing to do (because it's too far outside of
> > my knowledge area.)
> >
> > As I said in my reply to patch 1, I think your comments on patch 2
> > make Arm64 vcpu hotplug unachievable in a reasonable time frame, and
> > certainly outside the bounds of what I can do to progress this.
> >
> > So, at this point I'm going to stand down from further participation
> > with this patch set as I believe I've reached the limit of what I can
> > do to progress it.
> >
>
> Thanks for your hard work on this Russell - we have moved forwards.
>
> Short of anyone else stepping up I'll pick this up with
> the help of some my colleagues. As such I'm keen on getting patch
> 1 upstream ASAP so that we can exclude the need for some of the
> other workarounds from earlier versions of this series (the ones
> dropped before now).

Applied (as 6.9 material).

> We will need a little time to get up to speed on the current status
> and discussion points Russell raises above.

Sure.

I'm planning to send comments for some other patches in the series this week.

Thanks!
Thomas Gleixner Feb. 20, 2024, 8:59 p.m. UTC | #6
On Thu, Feb 15 2024 at 20:22, Rafael J. Wysocki wrote:
> On Wed, Jan 31, 2024 at 5:50 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
>> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
>> index 47de0f140ba6..13d052bf13f4 100644
>> --- a/drivers/base/cpu.c
>> +++ b/drivers/base/cpu.c
>> @@ -553,7 +553,11 @@ static void __init cpu_dev_register_generic(void)
>>  {
>>         int i, ret;
>>
>> -       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
>> +       /*
>> +        * When ACPI is enabled, CPUs are registered via
>> +        * acpi_processor_get_info().
>> +        */
>> +       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
>>                 return;
>
> Honestly, this looks like a quick hack to me and it absolutely
> requires an ACK from the x86 maintainers to go anywhere.

That should work, but yes I agree it's all but pretty.
Rafael J. Wysocki Feb. 21, 2024, 12:04 p.m. UTC | #7
On Tue, Feb 20, 2024 at 8:59 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Feb 20, 2024 at 5:24 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Tue, 20 Feb 2024 15:13:58 +0000
> > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> >
> > > On Tue, Feb 20, 2024 at 11:27:15AM +0000, Russell King (Oracle) wrote:
> > > > On Thu, Feb 15, 2024 at 08:22:29PM +0100, Rafael J. Wysocki wrote:
> > > > > On Wed, Jan 31, 2024 at 5:50 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> > > > > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > > > > index cf7c1cca69dd..a68c475cdea5 100644
> > > > > > --- a/drivers/acpi/acpi_processor.c
> > > > > > +++ b/drivers/acpi/acpi_processor.c
> > > > > > @@ -314,6 +314,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > > >                         cpufreq_add_device("acpi-cpufreq");
> > > > > >         }
> > > > > >
> > > > > > +       /*
> > > > > > +        * Register CPUs that are present. get_cpu_device() is used to skip
> > > > > > +        * duplicate CPU descriptions from firmware.
> > > > > > +        */
> > > > > > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > > > > > +           !get_cpu_device(pr->id)) {
> > > > > > +               int ret = arch_register_cpu(pr->id);
> > > > > > +
> > > > > > +               if (ret)
> > > > > > +                       return ret;
> > > > > > +       }
> > > > > > +
> > > > > >         /*
> > > > > >          *  Extra Processor objects may be enumerated on MP systems with
> > > > > >          *  less than the max # of CPUs. They should be ignored _iff
> > > > >
> > > > > This is interesting, because right below there is the following code:
> > > > >
> > > > >     if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> > > > >         int ret = acpi_processor_hotadd_init(pr);
> > > > >
> > > > >         if (ret)
> > > > >             return ret;
> > > > >     }
> > > > >
> > > > > and acpi_processor_hotadd_init() essentially calls arch_register_cpu()
> > > > > with some extra things around it (more about that below).
> > > > >
> > > > > I do realize that acpi_processor_hotadd_init() is defined under
> > > > > CONFIG_ACPI_HOTPLUG_CPU, so for the sake of the argument let's
> > > > > consider an architecture where CONFIG_ACPI_HOTPLUG_CPU is set.
> > > > >
> > > > > So why are the two conditionals that almost contradict each other both
> > > > > needed?  It looks like the new code could be combined with
> > > > > acpi_processor_hotadd_init() to do the right thing in all cases.
> > > > >
> > > > > Now, acpi_processor_hotadd_init() does some extra things that look
> > > > > like they should be done by the new code too.
> > > > >
> > > > > 1. It checks invalid_phys_cpuid() which appears to be a good idea to me.
> > > > >
> > > > > 2. It uses locking around arch_register_cpu() which doesn't seem
> > > > > unreasonable either.
> > > > >
> > > > > 3. It calls acpi_map_cpu() and I'm not sure why this is not done by
> > > > > the new code.
> > > > >
> > > > > The only thing that can be dropped from it is the _STA check AFAICS,
> > > > > because acpi_processor_add() won't even be called if the CPU is not
> > > > > present (and not enabled after the first patch).
> > > > >
> > > > > So why does the code not do 1 - 3 above?
> > > >
> > > > Honestly, I'm out of my depth with this and can't answer your
> > > > questions - and I really don't want to try fiddling with this code
> > > > because it's just too icky (even in its current form in mainline)
> > > > to be understandable to anyone who hasn't gained a detailed knowledge
> > > > of this code.
> > > >
> > > > It's going to require a lot of analysis - how acpi_map_cpuid() behaves
> > > > in all circumstances, what this means for invalid_logical_cpuid() and
> > > > invalid_phys_cpuid(), what paths will be taken in each case. This code
> > > > is already just too hairy for someone who isn't an experienced ACPI
> > > > hacker to be able to follow and I don't see an obvious way to make it
> > > > more readable.
> > > >
> > > > James' additions make it even more complex and less readable.
> > >
> > > As an illustration of the problems I'm having here, I was just writing
> > > a reply to this with a suggestion of transforming this code ultimately
> > > to:
> > >
> > >       if (!get_cpu_device(pr->id)) {
> > >               int ret;
> > >
> > >               if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id))
> > >                       ret = acpi_processor_make_enabled(pr);
> > >               else
> > >                       ret = acpi_processor_make_present(pr);
> > >
> > >               if (ret)
> > >                       return ret;
> > >       }
> > >
> > > (acpi_processor_make_present() would be acpi_processor_hotadd_init()
> > > and acpi_processor_make_enabled() would be arch_register_cpu() at this
> > > point.)
> > >
> > > Then I realised that's a bad idea - because we really need to check
> > > that pr->id is valid before calling get_cpu_device() on it, so this
> > > won't work. That leaves us with:
> > >
> > >       int ret;
> > >
> > >       if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> > >               /* x86 et.al. path */
> > >               ret = acpi_processor_make_present(pr);
> > >       } else if (!get_cpu_device(pr->id)) {
> > >               /* Arm64 path */
> > >               ret = acpi_processor_make_enabled(pr);
> > >       } else {
> > >               ret = 0;
> > >       }
> > >
> > >       if (ret)
> > >               return ret;
> > >
> > > Now, the next transformation would be to move !get_cpu_device(pr->id)
> > > into acpi_processor_make_enabled() which would eliminate one of those
> > > if() legs.
> > >
> > > Now, if we want to somehow make the call to arch_regster_cpu() common
> > > in these two paths, the next question is what are the _precise_
> > > semantics of acpi_map_cpu(), particularly with respect to it
> > > modifying pr->id. Is it guaranteed to always give the same result
> > > for the same processor described in ACPI? What acpi_map_cpu() anyway,
> > > I can find no documentation for it.
> > >
> > > Then there's the question whether calling acpi_unmap_cpu() should be
> > > done on the failure path if arch_register_cpu() fails, which is done
> > > for the x86 path but not the Arm64 path. Should it be done for the
> > > Arm64 path? I've no idea, but as Arm64 doesn't implement either of
> > > these two functions, I guess they could be stubbed out and thus be
> > > no-ops - but then we open a hole where if pr->id is invalid, we
> > > end up passing that invalid value to arch_register_cpu() which I'm
> > > quite sure will explode with a negative CPU number.
> > >
> > > So, to my mind, what you're effectively asking for is a total rewrite
> > > of all the code in and called by acpi_processor_get_info()... and that
> > > is not something I am willing to do (because it's too far outside of
> > > my knowledge area.)
> > >
> > > As I said in my reply to patch 1, I think your comments on patch 2
> > > make Arm64 vcpu hotplug unachievable in a reasonable time frame, and
> > > certainly outside the bounds of what I can do to progress this.
> > >
> > > So, at this point I'm going to stand down from further participation
> > > with this patch set as I believe I've reached the limit of what I can
> > > do to progress it.
> > >
> >
> > Thanks for your hard work on this Russell - we have moved forwards.
> >
> > Short of anyone else stepping up I'll pick this up with
> > the help of some my colleagues. As such I'm keen on getting patch
> > 1 upstream ASAP so that we can exclude the need for some of the
> > other workarounds from earlier versions of this series (the ones
> > dropped before now).
>
> Applied (as 6.9 material).

And I'm going to drop it, because it is not correct.

The problem is that it is going to affect non-processor devices, but
let me comment on that patch itself.
Jonathan Cameron March 22, 2024, 6:53 p.m. UTC | #8
On Thu, 15 Feb 2024 20:22:29 +0100
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Wed, Jan 31, 2024 at 5:50 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> >
> > From: James Morse <james.morse@arm.com>
> >
> > To allow ACPI to skip the call to arch_register_cpu() when the _STA
> > value indicates the CPU can't be brought online right now, move the
> > arch_register_cpu() call into acpi_processor_get_info().
> >
> > Systems can still be booted with 'acpi=off', or not include an
> > ACPI description at all. For these, the CPUs continue to be
> > registered by cpu_dev_register_generic().
> >
> > This moves the CPU register logic back to a subsys_initcall(),
> > while the memory nodes will have been registered earlier.
> >
> > Signed-off-by: James Morse <james.morse@arm.com>
> > Reviewed-by: Gavin Shan <gshan@redhat.com>
> > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > ---
> > Changes since RFC v2:
> >  * Fixup comment in acpi_processor_get_info() (Gavin Shan)
> >  * Add comment in cpu_dev_register_generic() (Gavin Shan)
> > ---
> >  drivers/acpi/acpi_processor.c | 12 ++++++++++++
> >  drivers/base/cpu.c            |  6 +++++-
> >  2 files changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index cf7c1cca69dd..a68c475cdea5 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -314,6 +314,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >                         cpufreq_add_device("acpi-cpufreq");
> >         }
> >
> > +       /*
> > +        * Register CPUs that are present. get_cpu_device() is used to skip
> > +        * duplicate CPU descriptions from firmware.
> > +        */
> > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > +           !get_cpu_device(pr->id)) {
> > +               int ret = arch_register_cpu(pr->id);
> > +
> > +               if (ret)
> > +                       return ret;
> > +       }
> > +
> >         /*
> >          *  Extra Processor objects may be enumerated on MP systems with
> >          *  less than the max # of CPUs. They should be ignored _iff  
> 
> This is interesting, because right below there is the following code:
> 
>     if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>         int ret = acpi_processor_hotadd_init(pr);
> 
>         if (ret)
>             return ret;
>     }
> 
> and acpi_processor_hotadd_init() essentially calls arch_register_cpu()
> with some extra things around it (more about that below).
> 
> I do realize that acpi_processor_hotadd_init() is defined under
> CONFIG_ACPI_HOTPLUG_CPU, so for the sake of the argument let's
> consider an architecture where CONFIG_ACPI_HOTPLUG_CPU is set.
> 
> So why are the two conditionals that almost contradict each other both
> needed?  It looks like the new code could be combined with
> acpi_processor_hotadd_init() to do the right thing in all cases.

I jumped on to the end of this series to look at this as the two legs
look more similar at that point. I'll figure out how to drive
any changes through the series once the end goal is clear.

To make testing easy I made the acpi_process_make_enabled() look as
much like acpi_process_make_present() as possible.

> 
> Now, acpi_processor_hotadd_init() does some extra things that look
> like they should be done by the new code too.
> 
> 1. It checks invalid_phys_cpuid() which appears to be a good idea to me.

Indeed that is sensible. Not sure there is a path to here where it fails,
but defense in depth is good.

> 
> 2. It uses locking around arch_register_cpu() which doesn't seem
> unreasonable either.

Seems reasonable, though exactly what this protecting is unclear to me
- is the arch_register_cpu() and/or the acpi_map_cpu().
Whilst it would be nice to be sure, appears harmless, so let us
take it for consistency if nothing else.

The cpu_maps_update_begin()/end() calls though aren't necessary as
we aren't touching the cpu_present or cpu_online masks.


> 
> 3. It calls acpi_map_cpu() and I'm not sure why this is not done by
> the new code.

Doesn't exist except on x86 and longarch as Russell mentioned. So let's
see what it does (on x86)  So we are into the realm of interfaces that
look generic but really aren't :(  I particularly like the
generic_processor_info() which isn't particularly generic.

1. cpu = acpi_register_lapic()

Docs say: Register a local apic and generates a logic cpu number

2. generic_processor_info() in arch/x86/kernel/acpi/acpi.c

Checks against nr_cpus_ids - maybe that bit is useful

Allocate_logical_cpuid().
Digging in, it seems to do similar to setting __cpu_logical_map on arm64.
That's done in acpi_map_gic_cpu_interface, which happens when MADT is
parsed and I believe it's one of the the things we need to do whether
or not the CPU is enabled at boot. So already done.

acpi_processor_set_pdc() -- configure _PDC support (which I'd never heard
of before now).  Deprecated in ACPI 3.0. Given we are using stuff only added
in 6.5 we can probably skip that even if it would be harmless.

acpi_map_cpu2node() -- evalulate _PXM and set __apicid_to_node[]
entry. That is only used from x86 code. Not sure what equivalent would be.
Also numa_set_node(cpu, nid);  Which again sounds a lot more generic than
it is. Load of x86 specific stuff + set_cpu_numa_node() which is generic
and for ARM64 (and anything using CONFIG_GENERIC_ARCH_NUMA) is called
by numa_store_cpu_info() either from early_map_cpu_to_node() or smp_prepare_cpus()
which is called for_each_possible_cpu() and hence has already been done.

So conclusion on this one is there doesn't seem to be anything to do.
We could provide a __weak function or an ARM64 specific one that does
nothing or gate it on an appropriate config variable.  However, given
I presume 'future' ARM64 support for CPU hotplug will want to do something
in these calls, perhaps a better bet is to pass a bool into the function
to indicate these should be skipped if present is not changing.

Having done that, we end up with code that is messy enough we are
better off keeping them as separate functions, though they may
look a little more similar than in this version.

There is a final thing in here you didn't mention
setting pr->flags.need_hotplug_init
which causes extra stuff to occur in processor_driver.c
The extra stuff doesn't seem to be necessary for the enable case
despite being needed for change of present status.
I haven't figured this bit out yet (I need to mess around on x86
to understand what goes wrong if you don't use that flag).


> 
> The only thing that can be dropped from it is the _STA check AFAICS,
> because acpi_processor_add() won't even be called if the CPU is not
> present (and not enabled after the first patch).
> 
> So why does the code not do 1 - 3 above?
I agree with 1 and 2, reasoning for 3 given above.

> 
> > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> > index 47de0f140ba6..13d052bf13f4 100644
> > --- a/drivers/base/cpu.c
> > +++ b/drivers/base/cpu.c
> > @@ -553,7 +553,11 @@ static void __init cpu_dev_register_generic(void)
> >  {
> >         int i, ret;
> >
> > -       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
> > +       /*
> > +        * When ACPI is enabled, CPUs are registered via
> > +        * acpi_processor_get_info().
> > +        */
> > +       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
> >                 return;  
> 
> Honestly, this looks like a quick hack to me and it absolutely
> requires an ACK from the x86 maintainers to go anywhere.
Will address this separately.

> 
> >
> >         for_each_present_cpu(i) {
> > --
Jonathan Cameron April 10, 2024, 12:43 p.m. UTC | #9
> >   
> > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> > > index 47de0f140ba6..13d052bf13f4 100644
> > > --- a/drivers/base/cpu.c
> > > +++ b/drivers/base/cpu.c
> > > @@ -553,7 +553,11 @@ static void __init cpu_dev_register_generic(void)
> > >  {
> > >         int i, ret;
> > >
> > > -       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
> > > +       /*
> > > +        * When ACPI is enabled, CPUs are registered via
> > > +        * acpi_processor_get_info().
> > > +        */
> > > +       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
> > >                 return;    
> > 
> > Honestly, this looks like a quick hack to me and it absolutely
> > requires an ACK from the x86 maintainers to go anywhere.  
> Will address this separately.
> 

So do people prefer this hack, or something along lines of the following?

static int __init cpu_dev_register_generic(void)
{
        int i, ret = 0;

        for_each_online_cpu(i) {
                if (!get_cpu_device(i)) {
                        ret = arch_register_cpu(i);
                        if (ret)
                                pr_warn("register_cpu %d failed (%d)\n", i, ret);
                }
        }
	//Probably just eat the error.
        return 0;
}
subsys_initcall_sync(cpu_dev_register_generic);

Which may look familiar at it's effectively patch 3 from v3 which was dealing
with CPUs missing from DSDT (something we think doesn't happen).

It might be possible to elide the arch_register_cpu() in
make_present() but that will mean we use different flows in this patch set
for the hotplug and initially present cases which is a bit messy.

I've tested this lightly on arm64 and x86 ACPI + DT booting and it "seems" fine.

Jonathan

> >   
> > >
> > >         for_each_present_cpu(i) {
> > > --    
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Rafael J. Wysocki April 10, 2024, 1:28 p.m. UTC | #10
On Wed, Apr 10, 2024 at 2:43 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> > >
> > > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> > > > index 47de0f140ba6..13d052bf13f4 100644
> > > > --- a/drivers/base/cpu.c
> > > > +++ b/drivers/base/cpu.c
> > > > @@ -553,7 +553,11 @@ static void __init cpu_dev_register_generic(void)
> > > >  {
> > > >         int i, ret;
> > > >
> > > > -       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
> > > > +       /*
> > > > +        * When ACPI is enabled, CPUs are registered via
> > > > +        * acpi_processor_get_info().
> > > > +        */
> > > > +       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
> > > >                 return;
> > >
> > > Honestly, this looks like a quick hack to me and it absolutely
> > > requires an ACK from the x86 maintainers to go anywhere.
> > Will address this separately.
> >
>
> So do people prefer this hack, or something along lines of the following?
>
> static int __init cpu_dev_register_generic(void)
> {
>         int i, ret = 0;
>
>         for_each_online_cpu(i) {
>                 if (!get_cpu_device(i)) {
>                         ret = arch_register_cpu(i);
>                         if (ret)
>                                 pr_warn("register_cpu %d failed (%d)\n", i, ret);
>                 }
>         }
>         //Probably just eat the error.
>         return 0;
> }
> subsys_initcall_sync(cpu_dev_register_generic);

I would prefer something like the above.

I actually thought that arch_register_cpu() might return something
like -EPROBE_DEFER when it cannot determine whether or not the CPU is
really available.

Then, the ACPI processor enumeration path may take care of registering
CPU that have not been registered so far and in the more-or-less the
same way regardless of the architecture (modulo some arch-specific
stuff).

In the end, it should be possible to avoid changing the behavior of
x86 and loongarch in this series.

> Which may look familiar at it's effectively patch 3 from v3 which was dealing
> with CPUs missing from DSDT (something we think doesn't happen).
>
> It might be possible to elide the arch_register_cpu() in
> make_present() but that will mean we use different flows in this patch set
> for the hotplug and initially present cases which is a bit messy.
>
> I've tested this lightly on arm64 and x86 ACPI + DT booting and it "seems" fine.

Sounds promising.

Thanks!
Jonathan Cameron April 10, 2024, 1:50 p.m. UTC | #11
On Wed, 10 Apr 2024 15:28:18 +0200
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Wed, Apr 10, 2024 at 2:43 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >  
> > > >  
> > > > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> > > > > index 47de0f140ba6..13d052bf13f4 100644
> > > > > --- a/drivers/base/cpu.c
> > > > > +++ b/drivers/base/cpu.c
> > > > > @@ -553,7 +553,11 @@ static void __init cpu_dev_register_generic(void)
> > > > >  {
> > > > >         int i, ret;
> > > > >
> > > > > -       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
> > > > > +       /*
> > > > > +        * When ACPI is enabled, CPUs are registered via
> > > > > +        * acpi_processor_get_info().
> > > > > +        */
> > > > > +       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
> > > > >                 return;  
> > > >
> > > > Honestly, this looks like a quick hack to me and it absolutely
> > > > requires an ACK from the x86 maintainers to go anywhere.  
> > > Will address this separately.
> > >  
> >
> > So do people prefer this hack, or something along lines of the following?
> >
> > static int __init cpu_dev_register_generic(void)
> > {
> >         int i, ret = 0;
> >
> >         for_each_online_cpu(i) {
> >                 if (!get_cpu_device(i)) {
> >                         ret = arch_register_cpu(i);
> >                         if (ret)
> >                                 pr_warn("register_cpu %d failed (%d)\n", i, ret);
> >                 }
> >         }
> >         //Probably just eat the error.
> >         return 0;
> > }
> > subsys_initcall_sync(cpu_dev_register_generic);  
> 
> I would prefer something like the above.
> 
> I actually thought that arch_register_cpu() might return something
> like -EPROBE_DEFER when it cannot determine whether or not the CPU is
> really available.

Ok. That would end up looking much more like the original code I think.
So we wouldn't have this late registration at all, or keep it for DT
on arm64?  I'm not sure that's a clean solution though leaves
the x86 path alone.

If we get rid of this catch all, solution would be to move the
!acpi_disabled check into the arm64 version of arch_cpu_register()
because we would only want the delayed registration path to be
used on ACPI cases where the question of CPU availability can't
yet be resolved.

> 
> Then, the ACPI processor enumeration path may take care of registering
> CPU that have not been registered so far and in the more-or-less the
> same way regardless of the architecture (modulo some arch-specific
> stuff).

If I understand correctly, in acpi_processor_get_info() we'd end up
with a similar check on whether it was already registered (the x86 path)
or had be deferred (arm64 / acpi).
 
> 
> In the end, it should be possible to avoid changing the behavior of
> x86 and loongarch in this series.

Possible, yes, but result if I understand correctly is we end up with
very different flows and replication of functionality between the
early registration and the late one. I'm fine with that if you prefer it!

> 
> > Which may look familiar at it's effectively patch 3 from v3 which was dealing
> > with CPUs missing from DSDT (something we think doesn't happen).
> >
> > It might be possible to elide the arch_register_cpu() in
> > make_present() but that will mean we use different flows in this patch set
> > for the hotplug and initially present cases which is a bit messy.
> >
> > I've tested this lightly on arm64 and x86 ACPI + DT booting and it "seems" fine.  
> 
> Sounds promising.

Possibly not that relevant though if proposal is to drop this approach. :(
At least I now have test setups!

Jonathan
> 
> Thanks!
Rafael J. Wysocki April 10, 2024, 2:19 p.m. UTC | #12
On Wed, Apr 10, 2024 at 3:50 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Wed, 10 Apr 2024 15:28:18 +0200
> "Rafael J. Wysocki" <rafael@kernel.org> wrote:
>
> > On Wed, Apr 10, 2024 at 2:43 PM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:
> > >
> > > > >
> > > > > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> > > > > > index 47de0f140ba6..13d052bf13f4 100644
> > > > > > --- a/drivers/base/cpu.c
> > > > > > +++ b/drivers/base/cpu.c
> > > > > > @@ -553,7 +553,11 @@ static void __init cpu_dev_register_generic(void)
> > > > > >  {
> > > > > >         int i, ret;
> > > > > >
> > > > > > -       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
> > > > > > +       /*
> > > > > > +        * When ACPI is enabled, CPUs are registered via
> > > > > > +        * acpi_processor_get_info().
> > > > > > +        */
> > > > > > +       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
> > > > > >                 return;
> > > > >
> > > > > Honestly, this looks like a quick hack to me and it absolutely
> > > > > requires an ACK from the x86 maintainers to go anywhere.
> > > > Will address this separately.
> > > >
> > >
> > > So do people prefer this hack, or something along lines of the following?
> > >
> > > static int __init cpu_dev_register_generic(void)
> > > {
> > >         int i, ret = 0;
> > >
> > >         for_each_online_cpu(i) {
> > >                 if (!get_cpu_device(i)) {
> > >                         ret = arch_register_cpu(i);
> > >                         if (ret)
> > >                                 pr_warn("register_cpu %d failed (%d)\n", i, ret);
> > >                 }
> > >         }
> > >         //Probably just eat the error.
> > >         return 0;
> > > }
> > > subsys_initcall_sync(cpu_dev_register_generic);
> >
> > I would prefer something like the above.
> >
> > I actually thought that arch_register_cpu() might return something
> > like -EPROBE_DEFER when it cannot determine whether or not the CPU is
> > really available.
>
> Ok. That would end up looking much more like the original code I think.
> So we wouldn't have this late registration at all, or keep it for DT
> on arm64?  I'm not sure that's a clean solution though leaves
> the x86 path alone.

I'm not sure why DT on arm64 would need to do late registration.

There is this chain of calls in the mainline today:

driver_init()
  cpu_dev_init()
    cpu_dev_register_generic()

the last of which registers CPUs on arm64/DT systems IIUC. I don't see
a need to change this behavior.

On arm64/ACPI, though, arch_register_cpu() cannot make progress until
it can look into the ACPI Namespace, so I would just make it return
-EPROBE_DEFER or equivalent then and the ACPI enumeration will find
the CPU and basically treat it as one that has just appeared.

> If we get rid of this catch all, solution would be to move the
> !acpi_disabled check into the arm64 version of arch_cpu_register()
> because we would only want the delayed registration path to be
> used on ACPI cases where the question of CPU availability can't
> yet be resolved.

Exactly.

This is similar (if not equivalent even) to a CPU becoming available
between the cpu_dev_register_generic() call and the ACPI enumeration.

> >
> > Then, the ACPI processor enumeration path may take care of registering
> > CPU that have not been registered so far and in the more-or-less the
> > same way regardless of the architecture (modulo some arch-specific
> > stuff).
>
> If I understand correctly, in acpi_processor_get_info() we'd end up
> with a similar check on whether it was already registered (the x86 path)
> or had be deferred (arm64 / acpi).
>
> >
> > In the end, it should be possible to avoid changing the behavior of
> > x86 and loongarch in this series.
>
> Possible, yes, but result if I understand correctly is we end up with
> very different flows and replication of functionality between the
> early registration and the late one. I'm fine with that if you prefer it!

But that's what is there today, isn't it?

I think this can be changed to reduce the duplication, but I'd prefer
to do that later, when the requisite functionality is in place and we
just do the consolidation.  In that case, if anything goes wrong, we
can take a step back and reconsider without deferring the arm64 CPU
hotplug support.

> >
> > > Which may look familiar at it's effectively patch 3 from v3 which was dealing
> > > with CPUs missing from DSDT (something we think doesn't happen).
> > >
> > > It might be possible to elide the arch_register_cpu() in
> > > make_present() but that will mean we use different flows in this patch set
> > > for the hotplug and initially present cases which is a bit messy.
> > >
> > > I've tested this lightly on arm64 and x86 ACPI + DT booting and it "seems" fine.
> >
> > Sounds promising.
>
> Possibly not that relevant though if proposal is to drop this approach. :(
> At least I now have test setups!

Great!
Jonathan Cameron April 10, 2024, 3:58 p.m. UTC | #13
On Wed, 10 Apr 2024 16:19:50 +0200
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Wed, Apr 10, 2024 at 3:50 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Wed, 10 Apr 2024 15:28:18 +0200
> > "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> >  
> > > On Wed, Apr 10, 2024 at 2:43 PM Jonathan Cameron
> > > <Jonathan.Cameron@huawei.com> wrote:  
> > > >  
> > > > > >  
> > > > > > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> > > > > > > index 47de0f140ba6..13d052bf13f4 100644
> > > > > > > --- a/drivers/base/cpu.c
> > > > > > > +++ b/drivers/base/cpu.c
> > > > > > > @@ -553,7 +553,11 @@ static void __init cpu_dev_register_generic(void)
> > > > > > >  {
> > > > > > >         int i, ret;
> > > > > > >
> > > > > > > -       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
> > > > > > > +       /*
> > > > > > > +        * When ACPI is enabled, CPUs are registered via
> > > > > > > +        * acpi_processor_get_info().
> > > > > > > +        */
> > > > > > > +       if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
> > > > > > >                 return;  
> > > > > >
> > > > > > Honestly, this looks like a quick hack to me and it absolutely
> > > > > > requires an ACK from the x86 maintainers to go anywhere.  
> > > > > Will address this separately.
> > > > >  
> > > >
> > > > So do people prefer this hack, or something along lines of the following?
> > > >
> > > > static int __init cpu_dev_register_generic(void)
> > > > {
> > > >         int i, ret = 0;
> > > >
> > > >         for_each_online_cpu(i) {
> > > >                 if (!get_cpu_device(i)) {
> > > >                         ret = arch_register_cpu(i);
> > > >                         if (ret)
> > > >                                 pr_warn("register_cpu %d failed (%d)\n", i, ret);
> > > >                 }
> > > >         }
> > > >         //Probably just eat the error.
> > > >         return 0;
> > > > }
> > > > subsys_initcall_sync(cpu_dev_register_generic);  
> > >
> > > I would prefer something like the above.
> > >
> > > I actually thought that arch_register_cpu() might return something
> > > like -EPROBE_DEFER when it cannot determine whether or not the CPU is
> > > really available.  
> >
> > Ok. That would end up looking much more like the original code I think.
> > So we wouldn't have this late registration at all, or keep it for DT
> > on arm64?  I'm not sure that's a clean solution though leaves
> > the x86 path alone.  
> 
> I'm not sure why DT on arm64 would need to do late registration.

This was me falsely thinking better to do it close together for
DT and ACPI. It definitely doesn't need to (or it wouldn't work today!)

> 
> There is this chain of calls in the mainline today:
> 
> driver_init()
>   cpu_dev_init()
>     cpu_dev_register_generic()
> 
> the last of which registers CPUs on arm64/DT systems IIUC. I don't see
> a need to change this behavior.
> 
> On arm64/ACPI, though, arch_register_cpu() cannot make progress until
> it can look into the ACPI Namespace, so I would just make it return
> -EPROBE_DEFER or equivalent then and the ACPI enumeration will find
> the CPU and basically treat it as one that has just appeared.

Ok so giving this a go...

Arm 64 version of arch_register_cpu() ended up as the following
(obviously needs cleaning up, bikeshedding of naming etc)

int arch_register_cpu(int cpu)
{
        struct cpu *c = &per_cpu(cpu_devices, cpu);
        acpi_handle acpi_handle = ACPI_HANDLE(&c->dev);
        int ret;

	printk("!!!!! INTO arch_register_cpu() %px\n", ACPI_HANDLE(&c->dev));

        if (!acpi_disabled && !acpi_handle)
                return -EPROBE_DEFER;
        if (acpi_handle) {
                ret = acpi_sta_enabled(acpi_handle);
                if (ret) {
                        printk("Have handle, not enabled\n");
                        /* Not enabled */
                        return ret;
                }
        }
        printk("!!!!! onwards arch_register_cpu()\n");

        c->hotpluggable = arch_cpu_is_hotpluggable(cpu);

        return register_cpu(c, cpu);
}

with new utility function in drivers/acpi/utils.c

int acpi_sta_enabled(acpi_handle handle)
{
       unsigned long long sta;
       bool present, enabled;
       acpi_status status;

       if (acpi_has_method(handle, "_STA")) {
               status = acpi_evaluate_integer(handle, "_STA", NULL, &sta);
               if (ACPI_FAILURE(status))
                       return -ENODEV;

               present = sta & ACPI_STA_DEVICE_PRESENT;
               enabled = sta & ACPI_STA_DEVICE_ENABLED;
               if (!present || !enabled) {
                       return -EPROBE_DEFER;
               }
               return 0;
       }
       return 0; /* No _STA means always on! */
}
	struct cpu *c = &per_cpu(cpu_devices, pr->id);	
	ACPI_COMPANION_SET(&c->dev, device);

in acpi_processor_get_info() and that calls

static int acpi_processor_make_enabled(struct acpi_processor *pr)
{
        int ret;

        if (invalid_phys_cpuid(pr->phys_id))
                return -ENODEV;

        cpus_write_lock();
        ret = arch_register_cpu(pr->id);
        cpus_write_unlock();

        return ret;
}

I think setting the ACPI handle should be harmless on other architectures.
It seems like the obvious one to set it to for cpu->dev.

Brief tests on same set of DT and ACPI on x86 and arm64 seem fine.

> 
> > If we get rid of this catch all, solution would be to move the
> > !acpi_disabled check into the arm64 version of arch_cpu_register()
> > because we would only want the delayed registration path to be
> > used on ACPI cases where the question of CPU availability can't
> > yet be resolved.  
> 
> Exactly.
> 
> This is similar (if not equivalent even) to a CPU becoming available
> between the cpu_dev_register_generic() call and the ACPI enumeration.

> 
> > >
> > > Then, the ACPI processor enumeration path may take care of registering
> > > CPU that have not been registered so far and in the more-or-less the
> > > same way regardless of the architecture (modulo some arch-specific
> > > stuff).  
> >
> > If I understand correctly, in acpi_processor_get_info() we'd end up
> > with a similar check on whether it was already registered (the x86 path)
> > or had be deferred (arm64 / acpi).
> >  
> > >
> > > In the end, it should be possible to avoid changing the behavior of
> > > x86 and loongarch in this series.  
> >
> > Possible, yes, but result if I understand correctly is we end up with
> > very different flows and replication of functionality between the
> > early registration and the late one. I'm fine with that if you prefer it!  
> 
> But that's what is there today, isn't it?

Agreed - but I was previously thinking we could move everything late.
I'm fine with just keeping the two flows separate.

> 
> I think this can be changed to reduce the duplication, but I'd prefer
> to do that later, when the requisite functionality is in place and we
> just do the consolidation.  In that case, if anything goes wrong, we
> can take a step back and reconsider without deferring the arm64 CPU
> hotplug support.

Great. That plan certainly works for me :)

Thanks for quick replies and help getting this headed in right direction.

+CC Miguel who is also looking at some of this series. Sorry Miguel,
was assuming you were on the thread and never checked :(

Jonathan
Russell King (Oracle) April 10, 2024, 6:56 p.m. UTC | #14
On Wed, Apr 10, 2024 at 02:50:05PM +0100, Jonathan Cameron wrote:
> If we get rid of this catch all, solution would be to move the
> !acpi_disabled check into the arm64 version of arch_cpu_register()
> because we would only want the delayed registration path to be
> used on ACPI cases where the question of CPU availability can't
> yet be resolved.

Aren't we then needing two arch_register_cpu() implementations?
I'm assuming that you're suggesting that the !acpi_disabled, then
do nothing check is moved into arch_register_cpu() - or to put it
another way, arch_register_cpu() does nothing if ACPI is enabled.

If arch_register_cpu() does nothing if ACPI is enabled, how do
CPUs get registered (and sysfs files get created to control them)
on ACPI systems? ACPI wouldn't be able to call arch_register_cpu(),
so I suspect you'll need an ACPI-specific version of this function.
Rafael J. Wysocki April 10, 2024, 7:08 p.m. UTC | #15
On Wed, Apr 10, 2024 at 8:56 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Wed, Apr 10, 2024 at 02:50:05PM +0100, Jonathan Cameron wrote:
> > If we get rid of this catch all, solution would be to move the
> > !acpi_disabled check into the arm64 version of arch_cpu_register()
> > because we would only want the delayed registration path to be
> > used on ACPI cases where the question of CPU availability can't
> > yet be resolved.
>
> Aren't we then needing two arch_register_cpu() implementations?
> I'm assuming that you're suggesting that the !acpi_disabled, then
> do nothing check is moved into arch_register_cpu() - or to put it
> another way, arch_register_cpu() does nothing if ACPI is enabled.
>
> If arch_register_cpu() does nothing if ACPI is enabled, how do
> CPUs get registered (and sysfs files get created to control them)
> on ACPI systems? ACPI wouldn't be able to call arch_register_cpu(),
> so I suspect you'll need an ACPI-specific version of this function.

arch_register_cpu() will do what it does, but it will check (upfront)
if ACPI is enabled and if so, if the ACPI Namespace is available.  In
the case when ACPI is enabled and the ACPI Namespace is not ready, it
will return -EPROBE_DEFER (say).
Jonathan Cameron April 10, 2024, 9:07 p.m. UTC | #16
On Wed, 10 Apr 2024 21:08:06 +0200
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Wed, Apr 10, 2024 at 8:56 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Wed, Apr 10, 2024 at 02:50:05PM +0100, Jonathan Cameron wrote:  
> > > If we get rid of this catch all, solution would be to move the
> > > !acpi_disabled check into the arm64 version of arch_cpu_register()
> > > because we would only want the delayed registration path to be
> > > used on ACPI cases where the question of CPU availability can't
> > > yet be resolved.  
> >
> > Aren't we then needing two arch_register_cpu() implementations?
> > I'm assuming that you're suggesting that the !acpi_disabled, then
> > do nothing check is moved into arch_register_cpu() - or to put it
> > another way, arch_register_cpu() does nothing if ACPI is enabled.
> >
> > If arch_register_cpu() does nothing if ACPI is enabled, how do
> > CPUs get registered (and sysfs files get created to control them)
> > on ACPI systems? ACPI wouldn't be able to call arch_register_cpu(),
> > so I suspect you'll need an ACPI-specific version of this function.  
> 
> arch_register_cpu() will do what it does, but it will check (upfront)
> if ACPI is enabled and if so, if the ACPI Namespace is available.  In
> the case when ACPI is enabled and the ACPI Namespace is not ready, it
> will return -EPROBE_DEFER (say).

Exactly.  I oversimplified and wasn't clear enough.
The check is there in the arch_register_cpu() and is one of the ways
that function can decide to actually register the cpu but not the only one.

I think we may later want to consider breaking it into 2 arch calls
(check if ready to register + register) to reduce code duplication
in with the hotplug path where there is a little extra to do
inbetween.

Hopefully that can wait though.

Jonathan
diff mbox series

Patch

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index cf7c1cca69dd..a68c475cdea5 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -314,6 +314,18 @@  static int acpi_processor_get_info(struct acpi_device *device)
 			cpufreq_add_device("acpi-cpufreq");
 	}
 
+	/*
+	 * Register CPUs that are present. get_cpu_device() is used to skip
+	 * duplicate CPU descriptions from firmware.
+	 */
+	if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
+	    !get_cpu_device(pr->id)) {
+		int ret = arch_register_cpu(pr->id);
+
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 *  Extra Processor objects may be enumerated on MP systems with
 	 *  less than the max # of CPUs. They should be ignored _iff
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 47de0f140ba6..13d052bf13f4 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -553,7 +553,11 @@  static void __init cpu_dev_register_generic(void)
 {
 	int i, ret;
 
-	if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
+	/*
+	 * When ACPI is enabled, CPUs are registered via
+	 * acpi_processor_get_info().
+	 */
+	if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
 		return;
 
 	for_each_present_cpu(i) {