[v8,0/9] Parallel CPU bringup for x86_64

Message ID 20230209154156.266385-1-usama.arif@bytedance.com (mailing list archive)

Message

Usama Arif Feb. 9, 2023, 3:41 p.m. UTC
The major change over v7 is fixing CPU0 hotplug not working as reported by
Paul E. McKenney using rcu torture tests. This is fixed by setting up the
initial_gs, initial_stack and early_gdt_descr properly for this case.

The improvement in boot time is the same as v7.

Thanks,
Usama

Changes across versions:
v2: Cut it back to just INIT/SIPI/SIPI in parallel for now, nothing more
v3: Clean up x2apic patch, add MTRR optimisation, lock topology update
    in preparation for more parallelisation.
v4: Fixes to the real mode parallelisation patch spotted by SeanC, to
    avoid scribbling on initial_gs in common_cpu_up(), and to allow all
    24 bits of the physical X2APIC ID to be used. That patch still needs
    a Signed-off-by from its original author, who once claimed not to
    remember writing it at all. But now we've fixed it, hopefully he'll
    admit it now :)
v5: rebase to v6.1 and remeasure performance, disable parallel bringup
    for AMD CPUs.
v6: rebase to v6.2-rc6, disable parallel boot on AMD because of a CPU bug
    and reuse timer calibration for secondary CPUs.
v7: [David Woodhouse] iterate over all possible CPUs to find any existing
    cluster mask in alloc_clustermask. (patch 1/9)
    Keep parallel bringup enabled on AMD, using the APIC ID from CPUID leaf
    0x0B (for x2APIC mode) or CPUID leaf 0x01 where 8 bits are sufficient.
    Included sanity checks for the APIC ID from 0x0B. (patch 6/9)
    Removed patch for reusing timer calibration for secondary CPUs.
    Commit message and code improvements.
v8: Fix CPU0 hotplug by setting up the initial_gs, initial_stack and
    early_gdt_descr.
    Drop trampoline lock and bail if APIC ID not found in find_cpunr.
    Code comments improved and debug prints added.
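
For readers following the v7/v8 entries above, here is a rough userspace
sketch of how a starting CPU could map its own APIC ID to a CPU number.
It is not the code from the series: the helper names and the table layout
are hypothetical, and the real lookup is done in early assembly. The only
facts relied on are that CPUID leaf 0x01 returns the 8-bit initial APIC ID
in EBX[31:24] and that leaf 0x0B returns the 32-bit x2APIC ID in EDX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CPUID leaf 0x01 reports the 8-bit initial APIC ID in EBX[31:24]. */
static uint32_t apic_id_from_leaf_01(uint32_t ebx)
{
	return ebx >> 24;
}

/* CPUID leaf 0x0B reports the full 32-bit x2APIC ID in EDX. */
static uint32_t apic_id_from_leaf_0b(uint32_t edx)
{
	return edx;
}

/*
 * Hypothetical stand-in for the find_cpunr step: scan a CPU-number to
 * APIC-ID table (an assumed layout) for the caller's APIC ID.
 * Returns the CPU number, or -1 if the APIC ID is not in the table so
 * that the caller can bail out instead of continuing with a bogus CPU.
 */
static int lookup_cpu_number(const uint32_t *cpu_to_apicid, size_t nr_cpus,
			     uint32_t apic_id)
{
	assert(cpu_to_apicid != NULL);

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu_to_apicid[cpu] == apic_id)
			return (int)cpu;
	}
	return -1;
}
```

The point of the -1 return is the "bail if APIC ID not found" behaviour:
a CPU that cannot identify itself has no safe per-CPU state to pick up.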

David Woodhouse (9):
  x86/apic/x2apic: Allow CPU cluster_mask to be populated in parallel
  cpu/hotplug: Move idle_thread_get() to <linux/smpboot.h>
  cpu/hotplug: Add dynamic parallel bringup states before
    CPUHP_BRINGUP_CPU
  x86/smpboot: Reference count on smpboot_setup_warm_reset_vector()
  x86/smpboot: Split up native_cpu_up into separate phases and document
    them
  x86/smpboot: Support parallel startup of secondary CPUs
  x86/smpboot: Send INIT/SIPI/SIPI to secondary CPUs in parallel
  x86/mtrr: Avoid repeated save of MTRRs on boot-time CPU bringup
  x86/smpboot: Serialize topology updates for secondary bringup
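
As an illustration of the idea behind the warm reset vector patch in the
list above, the sketch below shows reference counting in its simplest
form: the first CPU being brought up installs the vector, and only the
last one to finish removes it. All names here are made up for the
example, the model is single-threaded, and real kernel code would need
appropriate locking or atomics around the counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state: number of CPUs that currently need the vector. */
static int warm_reset_users;
/* Whether the vector is currently installed. */
static bool warm_reset_installed;
/* How many times the vector was actually written, for illustration. */
static int warm_reset_install_count;

/* Take a reference; only the first user installs the vector. */
static void warm_reset_get(void)
{
	if (warm_reset_users++ == 0) {
		warm_reset_installed = true;
		warm_reset_install_count++;
	}
}

/* Drop a reference; only the last user removes the vector. */
static void warm_reset_put(void)
{
	assert(warm_reset_users > 0);

	if (--warm_reset_users == 0)
		warm_reset_installed = false;
}
```

With several CPUs in flight at once, an uncounted teardown by the first
CPU to finish would pull the vector out from under the others, which is
what the counting avoids.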

 arch/x86/include/asm/realmode.h       |   3 +
 arch/x86/include/asm/smp.h            |  14 +-
 arch/x86/include/asm/topology.h       |   2 -
 arch/x86/kernel/acpi/sleep.c          |   1 +
 arch/x86/kernel/apic/apic.c           |   2 +-
 arch/x86/kernel/apic/x2apic_cluster.c | 130 ++++++----
 arch/x86/kernel/cpu/common.c          |   6 +-
 arch/x86/kernel/cpu/mtrr/mtrr.c       |   9 +
 arch/x86/kernel/head_64.S             |  99 +++++++-
 arch/x86/kernel/smpboot.c             | 350 +++++++++++++++++++-------
 arch/x86/realmode/init.c              |   3 +
 arch/x86/realmode/rm/trampoline_64.S  |  14 ++
 arch/x86/xen/smp_pv.c                 |   4 +-
 include/linux/cpuhotplug.h            |   2 +
 include/linux/smpboot.h               |   7 +
 kernel/cpu.c                          |  31 ++-
 kernel/smpboot.c                      |   2 +-
 kernel/smpboot.h                      |   2 -
 18 files changed, 521 insertions(+), 160 deletions(-)

Comments

Paul E. McKenney Feb. 10, 2023, 4:11 a.m. UTC | #1
On Thu, Feb 09, 2023 at 03:41:47PM +0000, Usama Arif wrote:
> The major change over v7 is fixing CPU0 hotplug not working as reported by
> Paul E. McKenney using rcu torture tests. This is fixed by setting up the
> initial_gs, initial_stack and early_gdt_descr properly for this case.
> 
> The improvement in boot time is the same as v7.

This one passes moderate rcutorture testing.

							Thanx, Paul

David Woodhouse Feb. 10, 2023, 9:02 a.m. UTC | #2
On Thu, 2023-02-09 at 20:11 -0800, Paul E. McKenney wrote:
> On Thu, Feb 09, 2023 at 03:41:47PM +0000, Usama Arif wrote:
> > The major change over v7 is fixing CPU0 hotplug not working as reported by
> > Paul E. McKenney using rcu torture tests. This is fixed by setting up the
> > initial_gs, initial_stack and early_gdt_descr properly for this case.
> > 
> > The improvement in boot time is the same as v7.
> 
> This one passes moderate rcutorture testing.

Thanks, Paul!