Message ID | 20240418135412.14730-5-Jonathan.Cameron@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Series | ACPI/arm64: add support for virtual cpu hotplug |
On Thu, Apr 18, 2024 at 3:56 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> Make the per_cpu(processors, cpu) entries available earlier so that
> they are available in arch_register_cpu() as ARM64 will need access
> to the acpi_handle to distinguish between acpi_processor_add()
> and earlier registration attempts (which will fail as _STA cannot
> be checked).
>
> Reorder the remove flow to clear this per_cpu() after
> arch_unregister_cpu() has completed, allowing it to be used in
> there as well.
>
> Note that on x86 for the CPU hotplug case, the pr->id prior to
> acpi_map_cpu() may be invalid. Thus the per_cpu() structures
> must be initialized after that call or after checking the ID
> is valid (not hotplug path).
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> v7: Swap order with acpi_unmap_cpu() in acpi_processor_remove()
>     to keep it in reverse order of the setup path. (thanks Salil)
>     Fix an issue with placement of CONFIG_ACPI_HOTPLUG_CPU guards.
> v6: As per discussion in v5 thread, don't use the cpu->dev and
>     make this data available earlier by moving the assignment checks
>     into acpi_processor_get_info().
> ---
>  drivers/acpi/acpi_processor.c | 78 +++++++++++++++++++++--------------
>  1 file changed, 46 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index ba0a6f0ac841..ac7ddb30f10e 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -183,8 +183,36 @@ static void __init acpi_pcc_cpufreq_init(void) {}
>  #endif /* CONFIG_X86 */
>
>  /* Initialization */
> +static DEFINE_PER_CPU(void *, processor_device_array);
> +
> +static void acpi_processor_set_per_cpu(struct acpi_processor *pr,
> +				       struct acpi_device *device)
> +{
> +	BUG_ON(pr->id >= nr_cpu_ids);
> +	/*
> +	 * Buggy BIOS check.
> +	 * ACPI id of processors can be reported wrongly by the BIOS.
> +	 * Don't trust it blindly
> +	 */
> +	if (per_cpu(processor_device_array, pr->id) != NULL &&
> +	    per_cpu(processor_device_array, pr->id) != device) {
> +		dev_warn(&device->dev,
> +			 "BIOS reported wrong ACPI id %d for the processor\n",
> +			 pr->id);
> +		/* Give up, but do not abort the namespace scan. */
> +		return;

In this case the caller should make acpi_processor_add() return 0, I
think, because otherwise it will attempt to acpi_bind_one() "pr" to
"device" which will confuse things.

So I would make this return false to indicate that.

Or just fold it into the caller and do the error handling there.
> @@ -232,6 +263,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
>  	acpi_status status = AE_OK;
>  	static int cpu0_initialized;
>  	unsigned long long value;
> +	int ret;
>
>  	acpi_processor_errata();
>
> @@ -316,10 +348,12 @@ static int acpi_processor_get_info(struct acpi_device *device)
>  	 * because cpuid <-> apicid mapping is persistent now.
>  	 */
>  	if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> -		int ret = acpi_processor_hotadd_init(pr);
> +		ret = acpi_processor_hotadd_init(pr, device);
>
>  		if (ret)
> -			return ret;
> +			goto err;
> +	} else {
> +		acpi_processor_set_per_cpu(pr, device);
>  	}
>
>  	/*
> @@ -357,6 +391,10 @@ static int acpi_processor_get_info(struct acpi_device *device)
>  		arch_fix_phys_package_id(pr->id, value);
>
>  	return 0;
> +
> +err:
> +	per_cpu(processors, pr->id) = NULL;

...

> +	return ret;
>  }
>
>  /*
> @@ -365,8 +403,6 @@ static int acpi_processor_get_info(struct acpi_device *device)
>   * (cpu_data(cpu)) values, like CPU feature flags, family, model, etc.
>   * Such things have to be put in and set up by the processor driver's .probe().
>   */
> -static DEFINE_PER_CPU(void *, processor_device_array);
> -
>  static int acpi_processor_add(struct acpi_device *device,
>  					const struct acpi_device_id *id)
>  {
> @@ -395,28 +431,6 @@ static int acpi_processor_add(struct acpi_device *device,
>  	if (result) /* Processor is not physically present or unavailable */
>  		return 0;
>
> -	BUG_ON(pr->id >= nr_cpu_ids);
> -
> -	/*
> -	 * Buggy BIOS check.
> -	 * ACPI id of processors can be reported wrongly by the BIOS.
> -	 * Don't trust it blindly
> -	 */
> -	if (per_cpu(processor_device_array, pr->id) != NULL &&
> -	    per_cpu(processor_device_array, pr->id) != device) {
> -		dev_warn(&device->dev,
> -			 "BIOS reported wrong ACPI id %d for the processor\n",
> -			 pr->id);
> -		/* Give up, but do not abort the namespace scan. */
> -		goto err;
> -	}
> -	/*
> -	 * processor_device_array is not cleared on errors to allow buggy BIOS
> -	 * checks.
> -	 */
> -	per_cpu(processor_device_array, pr->id) = device;
> -	per_cpu(processors, pr->id) = pr;

Nit: seems we need to remove the duplicated
per_cpu(processors, pr->id) = NULL; in acpi_processor_add():

--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -446,7 +446,6 @@ static int acpi_processor_add(struct acpi_device *device,
 err:
 	free_cpumask_var(pr->throttling.shared_cpu_map);
 	device->driver_data = NULL;
-	per_cpu(processors, pr->id) = NULL;
 err_free_pr:
 	kfree(pr);
 	return result;

Thanks
Hanjun
On Mon, 22 Apr 2024 20:56:55 +0200
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Thu, Apr 18, 2024 at 3:56 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > Make the per_cpu(processors, cpu) entries available earlier so that
> > they are available in arch_register_cpu() as ARM64 will need access
> > to the acpi_handle to distinguish between acpi_processor_add()
> > and earlier registration attempts (which will fail as _STA cannot
> > be checked).
> >
> > Reorder the remove flow to clear this per_cpu() after
> > arch_unregister_cpu() has completed, allowing it to be used in
> > there as well.
> >
> > Note that on x86 for the CPU hotplug case, the pr->id prior to
> > acpi_map_cpu() may be invalid. Thus the per_cpu() structures
> > must be initialized after that call or after checking the ID
> > is valid (not hotplug path).
> >
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> > v7: Swap order with acpi_unmap_cpu() in acpi_processor_remove()
> >     to keep it in reverse order of the setup path. (thanks Salil)
> >     Fix an issue with placement of CONFIG_ACPI_HOTPLUG_CPU guards.
> > v6: As per discussion in v5 thread, don't use the cpu->dev and
> >     make this data available earlier by moving the assignment checks
> >     into acpi_processor_get_info().
> > ---
> >  drivers/acpi/acpi_processor.c | 78 +++++++++++++++++++++--------------
> >  1 file changed, 46 insertions(+), 32 deletions(-)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index ba0a6f0ac841..ac7ddb30f10e 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -183,8 +183,36 @@ static void __init acpi_pcc_cpufreq_init(void) {}
> >  #endif /* CONFIG_X86 */
> >
> >  /* Initialization */
> > +static DEFINE_PER_CPU(void *, processor_device_array);
> > +
> > +static void acpi_processor_set_per_cpu(struct acpi_processor *pr,
> > +				       struct acpi_device *device)
> > +{
> > +	BUG_ON(pr->id >= nr_cpu_ids);
> > +	/*
> > +	 * Buggy BIOS check.
> > +	 * ACPI id of processors can be reported wrongly by the BIOS.
> > +	 * Don't trust it blindly
> > +	 */
> > +	if (per_cpu(processor_device_array, pr->id) != NULL &&
> > +	    per_cpu(processor_device_array, pr->id) != device) {
> > +		dev_warn(&device->dev,
> > +			 "BIOS reported wrong ACPI id %d for the processor\n",
> > +			 pr->id);
> > +		/* Give up, but do not abort the namespace scan. */
> > +		return;
>
> In this case the caller should make acpi_processor_add() return 0, I
> think, because otherwise it will attempt to acpi_bind_one() "pr" to
> "device" which will confuse things.
>
> So I would make this return false to indicate that.
>
> Or just fold it into the caller and do the error handling there.

The BIOS bug mentioned in reply to patch 14 (DSDT entries for nonexistent
CPUs that have no _STA entries) showed me that we need to know if this
succeeded (I'd not read this at that point).

I'll make it return a bool to say whether this succeeded, and in both call
sites return 0 if not, to deal with the BIOS bug here, making sure not to
clear the per_cpu() structures unless we get past that call. If we do and
arch_register_cpu() fails, we need to clear these two pointers.

Doing so means that acpi_processor_hotadd_init() is side effect free and
hence we can return directly in acpi_processor_get_info(), which avoids the
need to clear pointers when we don't have a valid pr->id to do it with.

So fully agree we need to bail out properly if this fails.

Jonathan
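
The bool-returning variant described above can be sketched in userspace C. This is only a model under stated assumptions, not the code that was eventually posted: plain arrays stand in for the kernel's per_cpu() variables, `NR_CPUS_MODEL` stands in for nr_cpu_ids, and every `model_*` name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NR_CPUS_MODEL 4	/* hypothetical stand-in for nr_cpu_ids */

struct model_device { const char *name; };
struct model_processor { unsigned int id; };

/* Plain arrays model the per_cpu() variables used by the patch. */
static struct model_device *processor_device_array[NR_CPUS_MODEL];
static struct model_processor *processors[NR_CPUS_MODEL];

/*
 * Returns true if the slots for pr->id now belong to this device,
 * false if a different device already claimed the id (buggy BIOS).
 */
static bool model_set_per_cpu(struct model_processor *pr,
			      struct model_device *device)
{
	if (pr->id >= NR_CPUS_MODEL)
		return false;	/* the kernel patch uses BUG_ON() here */

	if (processor_device_array[pr->id] != NULL &&
	    processor_device_array[pr->id] != device) {
		fprintf(stderr,
			"%s: BIOS reported wrong ACPI id %u for the processor\n",
			device->name, pr->id);
		/* Give up; the caller skips this device without an error. */
		return false;
	}

	/* Not cleared on later errors, so duplicates stay detectable. */
	processor_device_array[pr->id] = device;
	processors[pr->id] = pr;
	return true;
}
```

In this model a second device that reports an already claimed id gets `false` back and leaves the first owner's pointers untouched, which is the signal a caller would need in order to return 0 instead of binding.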
On Tue, 23 Apr 2024 19:53:34 +0800
Hanjun Guo <guohanjun@huawei.com> wrote:

> > @@ -232,6 +263,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >  	acpi_status status = AE_OK;
> >  	static int cpu0_initialized;
> >  	unsigned long long value;
> > +	int ret;
> >
> >  	acpi_processor_errata();
> >
> > @@ -316,10 +348,12 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >  	 * because cpuid <-> apicid mapping is persistent now.
> >  	 */
> >  	if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> > -		int ret = acpi_processor_hotadd_init(pr);
> > +		ret = acpi_processor_hotadd_init(pr, device);
> >
> >  		if (ret)
> > -			return ret;
> > +			goto err;
> > +	} else {
> > +		acpi_processor_set_per_cpu(pr, device);
> >  	}
> >
> >  	/*
> > @@ -357,6 +391,10 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >  		arch_fix_phys_package_id(pr->id, value);
> >
> >  	return 0;
> > +
> > +err:
> > +	per_cpu(processors, pr->id) = NULL;
>
> ...
>
> > +	return ret;
> >  }
> >
> >  /*
> > @@ -365,8 +403,6 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >   * (cpu_data(cpu)) values, like CPU feature flags, family, model, etc.
> >   * Such things have to be put in and set up by the processor driver's .probe().
> >   */
> > -static DEFINE_PER_CPU(void *, processor_device_array);
> > -
> >  static int acpi_processor_add(struct acpi_device *device,
> >  					const struct acpi_device_id *id)
> >  {
> > @@ -395,28 +431,6 @@ static int acpi_processor_add(struct acpi_device *device,
> >  	if (result) /* Processor is not physically present or unavailable */
> >  		return 0;
> >
> > -	BUG_ON(pr->id >= nr_cpu_ids);
> > -
> > -	/*
> > -	 * Buggy BIOS check.
> > -	 * ACPI id of processors can be reported wrongly by the BIOS.
> > -	 * Don't trust it blindly
> > -	 */
> > -	if (per_cpu(processor_device_array, pr->id) != NULL &&
> > -	    per_cpu(processor_device_array, pr->id) != device) {
> > -		dev_warn(&device->dev,
> > -			 "BIOS reported wrong ACPI id %d for the processor\n",
> > -			 pr->id);
> > -		/* Give up, but do not abort the namespace scan. */
> > -		goto err;
> > -	}
> > -	/*
> > -	 * processor_device_array is not cleared on errors to allow buggy BIOS
> > -	 * checks.
> > -	 */
> > -	per_cpu(processor_device_array, pr->id) = device;
> > -	per_cpu(processors, pr->id) = pr;
>
> Nit: seems we need to remove the duplicated
> per_cpu(processors, pr->id) = NULL; in acpi_processor_add():
>
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -446,7 +446,6 @@ static int acpi_processor_add(struct acpi_device
> *device,
>  err:
>  	free_cpumask_var(pr->throttling.shared_cpu_map);
>  	device->driver_data = NULL;
> -	per_cpu(processors, pr->id) = NULL;

I don't follow. This path is used if acpi_processor_get_info() succeeded
and we later fail. I don't see where the duplication is?

>  err_free_pr:
>  	kfree(pr);
>  	return result;
>
> Thanks
> Hanjun
On 2024/4/25 1:18, Jonathan Cameron wrote:
> On Tue, 23 Apr 2024 19:53:34 +0800
> Hanjun Guo <guohanjun@huawei.com> wrote:
>
>>> @@ -232,6 +263,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
>>>  	acpi_status status = AE_OK;
>>>  	static int cpu0_initialized;
>>>  	unsigned long long value;
>>> +	int ret;
>>>
>>>  	acpi_processor_errata();
>>>
>>> @@ -316,10 +348,12 @@ static int acpi_processor_get_info(struct acpi_device *device)
>>>  	 * because cpuid <-> apicid mapping is persistent now.
>>>  	 */
>>>  	if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>>> -		int ret = acpi_processor_hotadd_init(pr);
>>> +		ret = acpi_processor_hotadd_init(pr, device);
>>>
>>>  		if (ret)
>>> -			return ret;
>>> +			goto err;
>>> +	} else {
>>> +		acpi_processor_set_per_cpu(pr, device);
>>>  	}
>>>
>>>  	/*
>>> @@ -357,6 +391,10 @@ static int acpi_processor_get_info(struct acpi_device *device)
>>>  		arch_fix_phys_package_id(pr->id, value);
>>>
>>>  	return 0;
>>> +
>>> +err:
>>> +	per_cpu(processors, pr->id) = NULL;
>>
>> ...
>>
>>> +	return ret;
>>>  }
>>>
>>>  /*
>>> @@ -365,8 +403,6 @@ static int acpi_processor_get_info(struct acpi_device *device)
>>>   * (cpu_data(cpu)) values, like CPU feature flags, family, model, etc.
>>>   * Such things have to be put in and set up by the processor driver's .probe().
>>>   */
>>> -static DEFINE_PER_CPU(void *, processor_device_array);
>>> -
>>>  static int acpi_processor_add(struct acpi_device *device,
>>>  					const struct acpi_device_id *id)
>>>  {
>>> @@ -395,28 +431,6 @@ static int acpi_processor_add(struct acpi_device *device,
>>>  	if (result) /* Processor is not physically present or unavailable */
>>>  		return 0;
>>>
>>> -	BUG_ON(pr->id >= nr_cpu_ids);
>>> -
>>> -	/*
>>> -	 * Buggy BIOS check.
>>> -	 * ACPI id of processors can be reported wrongly by the BIOS.
>>> -	 * Don't trust it blindly
>>> -	 */
>>> -	if (per_cpu(processor_device_array, pr->id) != NULL &&
>>> -	    per_cpu(processor_device_array, pr->id) != device) {
>>> -		dev_warn(&device->dev,
>>> -			 "BIOS reported wrong ACPI id %d for the processor\n",
>>> -			 pr->id);
>>> -		/* Give up, but do not abort the namespace scan. */
>>> -		goto err;
>>> -	}
>>> -	/*
>>> -	 * processor_device_array is not cleared on errors to allow buggy BIOS
>>> -	 * checks.
>>> -	 */
>>> -	per_cpu(processor_device_array, pr->id) = device;
>>> -	per_cpu(processors, pr->id) = pr;
>>
>> Nit: seems we need to remove the duplicated
>> per_cpu(processors, pr->id) = NULL; in acpi_processor_add():
>>
>> --- a/drivers/acpi/acpi_processor.c
>> +++ b/drivers/acpi/acpi_processor.c
>> @@ -446,7 +446,6 @@ static int acpi_processor_add(struct acpi_device
>> *device,
>>  err:
>>  	free_cpumask_var(pr->throttling.shared_cpu_map);
>>  	device->driver_data = NULL;
>> -	per_cpu(processors, pr->id) = NULL;
>
> I don't follow. This path is used if processor_get_info() succeeded and
> we later fail. I don't see where the duplication is?

It is! Thanks for the clarification.

Thanks
Hanjun
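
Why the two clears are not duplicates can be seen in a small userspace model of the two error paths. This is a simplified sketch, not the kernel code: the `path_*` names are hypothetical, an array stands in for per_cpu(processors, ...), and the `hotadd_ret` and `later_ret` parameters merely inject failures where acpi_processor_hotadd_init() or a later step such as get_cpu_device() would fail.

```c
#include <stddef.h>

#define PATH_NR_CPUS 4	/* hypothetical stand-in for nr_cpu_ids */

struct path_processor { unsigned int id; };

/* Array models per_cpu(processors, cpu). */
static struct path_processor *path_processors[PATH_NR_CPUS];

/* Models acpi_processor_get_info(): on failure it undoes its own work. */
static int path_get_info(struct path_processor *pr, int hotadd_ret)
{
	path_processors[pr->id] = pr;
	if (hotadd_ret) {
		/* Corresponds to the "err:" label added to get_info. */
		path_processors[pr->id] = NULL;
		return hotadd_ret;
	}
	return 0;
}

/* Models acpi_processor_add(): its "err:" label runs only after get_info succeeded. */
static int path_add(struct path_processor *pr, int hotadd_ret, int later_ret)
{
	int result = path_get_info(pr, hotadd_ret);

	if (result)	/* not present or unavailable: nothing left to undo */
		return 0;

	result = later_ret;	/* stands in for the later setup steps */
	if (result)
		goto err;

	return 1;	/* 1 simply marks success in this model */

err:
	/* Reached only with the slot still set, so this clear is needed. */
	path_processors[pr->id] = NULL;
	return result;
}
```

Each failure takes exactly one of the two paths, so the slot is cleared once in either case.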
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index ba0a6f0ac841..ac7ddb30f10e 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -183,8 +183,36 @@ static void __init acpi_pcc_cpufreq_init(void) {}
 #endif /* CONFIG_X86 */
 
 /* Initialization */
+static DEFINE_PER_CPU(void *, processor_device_array);
+
+static void acpi_processor_set_per_cpu(struct acpi_processor *pr,
+				       struct acpi_device *device)
+{
+	BUG_ON(pr->id >= nr_cpu_ids);
+	/*
+	 * Buggy BIOS check.
+	 * ACPI id of processors can be reported wrongly by the BIOS.
+	 * Don't trust it blindly
+	 */
+	if (per_cpu(processor_device_array, pr->id) != NULL &&
+	    per_cpu(processor_device_array, pr->id) != device) {
+		dev_warn(&device->dev,
+			 "BIOS reported wrong ACPI id %d for the processor\n",
+			 pr->id);
+		/* Give up, but do not abort the namespace scan. */
+		return;
+	}
+	/*
+	 * processor_device_array is not cleared on errors to allow buggy BIOS
+	 * checks.
+	 */
+	per_cpu(processor_device_array, pr->id) = device;
+	per_cpu(processors, pr->id) = pr;
+}
+
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
-static int acpi_processor_hotadd_init(struct acpi_processor *pr)
+static int acpi_processor_hotadd_init(struct acpi_processor *pr,
+				      struct acpi_device *device)
 {
 	int ret;
 
@@ -198,6 +226,8 @@ static int acpi_processor_hotadd_init(struct acpi_processor *pr)
 	if (ret)
 		goto out;
 
+	acpi_processor_set_per_cpu(pr, device);
+
 	ret = arch_register_cpu(pr->id);
 	if (ret) {
 		acpi_unmap_cpu(pr->id);
@@ -217,7 +247,8 @@ static int acpi_processor_hotadd_init(struct acpi_processor *pr)
 	return ret;
 }
 #else
-static inline int acpi_processor_hotadd_init(struct acpi_processor *pr)
+static inline int acpi_processor_hotadd_init(struct acpi_processor *pr,
+					     struct acpi_device *device)
 {
 	return -ENODEV;
 }
@@ -232,6 +263,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
 	acpi_status status = AE_OK;
 	static int cpu0_initialized;
 	unsigned long long value;
+	int ret;
 
 	acpi_processor_errata();
 
@@ -316,10 +348,12 @@ static int acpi_processor_get_info(struct acpi_device *device)
 	 * because cpuid <-> apicid mapping is persistent now.
 	 */
 	if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
-		int ret = acpi_processor_hotadd_init(pr);
+		ret = acpi_processor_hotadd_init(pr, device);
 
 		if (ret)
-			return ret;
+			goto err;
+	} else {
+		acpi_processor_set_per_cpu(pr, device);
 	}
 
 	/*
@@ -357,6 +391,10 @@ static int acpi_processor_get_info(struct acpi_device *device)
 		arch_fix_phys_package_id(pr->id, value);
 
 	return 0;
+
+err:
+	per_cpu(processors, pr->id) = NULL;
+	return ret;
 }
 
 /*
@@ -365,8 +403,6 @@ static int acpi_processor_get_info(struct acpi_device *device)
  * (cpu_data(cpu)) values, like CPU feature flags, family, model, etc.
  * Such things have to be put in and set up by the processor driver's .probe().
  */
-static DEFINE_PER_CPU(void *, processor_device_array);
-
 static int acpi_processor_add(struct acpi_device *device,
 					const struct acpi_device_id *id)
 {
@@ -395,28 +431,6 @@ static int acpi_processor_add(struct acpi_device *device,
 	if (result) /* Processor is not physically present or unavailable */
 		return 0;
 
-	BUG_ON(pr->id >= nr_cpu_ids);
-
-	/*
-	 * Buggy BIOS check.
-	 * ACPI id of processors can be reported wrongly by the BIOS.
-	 * Don't trust it blindly
-	 */
-	if (per_cpu(processor_device_array, pr->id) != NULL &&
-	    per_cpu(processor_device_array, pr->id) != device) {
-		dev_warn(&device->dev,
-			 "BIOS reported wrong ACPI id %d for the processor\n",
-			 pr->id);
-		/* Give up, but do not abort the namespace scan. */
-		goto err;
-	}
-	/*
-	 * processor_device_array is not cleared on errors to allow buggy BIOS
-	 * checks.
-	 */
-	per_cpu(processor_device_array, pr->id) = device;
-	per_cpu(processors, pr->id) = pr;
-
 	dev = get_cpu_device(pr->id);
 	if (!dev) {
 		result = -ENODEV;
@@ -469,10 +483,6 @@ static void acpi_processor_remove(struct acpi_device *device)
 	device_release_driver(pr->dev);
 	acpi_unbind_one(pr->dev);
 
-	/* Clean up. */
-	per_cpu(processor_device_array, pr->id) = NULL;
-	per_cpu(processors, pr->id) = NULL;
-
 	cpu_maps_update_begin();
 	cpus_write_lock();
 
@@ -480,6 +490,10 @@ static void acpi_processor_remove(struct acpi_device *device)
 	arch_unregister_cpu(pr->id);
 	acpi_unmap_cpu(pr->id);
 
+	/* Clean up. */
+	per_cpu(processor_device_array, pr->id) = NULL;
+	per_cpu(processors, pr->id) = NULL;
+
 	cpus_write_unlock();
 	cpu_maps_update_done();
 
Make the per_cpu(processors, cpu) entries available earlier so that
they are available in arch_register_cpu() as ARM64 will need access
to the acpi_handle to distinguish between acpi_processor_add()
and earlier registration attempts (which will fail as _STA cannot
be checked).

Reorder the remove flow to clear this per_cpu() after
arch_unregister_cpu() has completed, allowing it to be used in
there as well.

Note that on x86 for the CPU hotplug case, the pr->id prior to
acpi_map_cpu() may be invalid. Thus the per_cpu() structures
must be initialized after that call or after checking the ID
is valid (not hotplug path).

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
v7: Swap order with acpi_unmap_cpu() in acpi_processor_remove()
    to keep it in reverse order of the setup path. (thanks Salil)
    Fix an issue with placement of CONFIG_ACPI_HOTPLUG_CPU guards.
v6: As per discussion in v5 thread, don't use the cpu->dev and
    make this data available earlier by moving the assignment checks
    into acpi_processor_get_info().
---
 drivers/acpi/acpi_processor.c | 78 +++++++++++++++++++++--------------
 1 file changed, 46 insertions(+), 32 deletions(-)
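
The ordering that this commit message asks for can be illustrated with a userspace model. It is a sketch under assumptions rather than kernel code: the `order_*` names are hypothetical, an array replaces per_cpu(processors, ...), and the two hooks only record whether the entry was visible when they ran.

```c
#include <stdbool.h>
#include <stddef.h>

#define ORDER_NR_CPUS 4	/* hypothetical stand-in for nr_cpu_ids */

struct order_processor { unsigned int id; };

/* Array models per_cpu(processors, cpu). */
static struct order_processor *order_processors[ORDER_NR_CPUS];
static bool seen_in_register, seen_in_unregister;

/* Models arch_register_cpu(): records whether the entry was already set. */
static int order_arch_register_cpu(unsigned int cpu)
{
	seen_in_register = order_processors[cpu] != NULL;
	return 0;
}

/* Models arch_unregister_cpu(): records whether the entry is still set. */
static void order_arch_unregister_cpu(unsigned int cpu)
{
	seen_in_unregister = order_processors[cpu] != NULL;
}

/* Add path: publish the entry first, then call the arch hook. */
static int order_hotadd(struct order_processor *pr)
{
	int ret;

	/* In the kernel, acpi_map_cpu() has made pr->id valid by this point. */
	order_processors[pr->id] = pr;

	ret = order_arch_register_cpu(pr->id);
	if (ret)
		order_processors[pr->id] = NULL;	/* undo on failure */
	return ret;
}

/* Remove path: call the arch hook first, clear the entry afterwards. */
static void order_remove(struct order_processor *pr)
{
	order_arch_unregister_cpu(pr->id);
	order_processors[pr->id] = NULL;
}
```

With this ordering both hooks observe a populated entry, and the entry is empty again once removal completes, which mirrors the reverse-of-setup teardown mentioned in the v7 note.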