[v4,0/9] Parallel CPU bringup for x86_64

Message ID 20220201205328.123066-1-dwmw2@infradead.org (mailing list archive)

Message

David Woodhouse Feb. 1, 2022, 8:53 p.m. UTC
Doing the INIT/SIPI/SIPI in parallel for all APs and *then* waiting for
them shaves about 80% off the AP bringup time on a 96-thread 2-socket
Skylake box (EC2 c5.metal) — from about 500ms to 100ms.

There are more wins to be had with further parallelisation, but this is
the simple part.
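
To illustrate why kicking every AP first and only then waiting helps, here
is a toy timing model in plain C. It is not code from the series: the
function names and the kick_us/ready_us parameters are made up for this
sketch, and it assumes the boot CPU's per-AP INIT/SIPI/SIPI cost is small
compared with the time each AP needs before it reports in.

```c
#include <stddef.h>

/*
 * Toy timing model, not kernel code. All names are hypothetical.
 *   kick_us:  time the boot CPU spends sending INIT/SIPI/SIPI to one AP
 *   ready_us: time each AP then needs before it reports in
 */

/* Serial: kick one AP, wait for it to come up, then move on to the next. */
unsigned long serial_bringup_us(const unsigned long *ready_us, size_t n,
				unsigned long kick_us)
{
	unsigned long total = 0;

	for (size_t i = 0; i < n; i++)
		total += kick_us + ready_us[i];
	return total;
}

/* Parallel: kick every AP first, and only *then* wait for all of them. */
unsigned long parallel_bringup_us(const unsigned long *ready_us, size_t n,
				  unsigned long kick_us)
{
	unsigned long done = 0;

	for (size_t i = 0; i < n; i++) {
		/* AP i is kicked by (i + 1) * kick_us and is ready ready_us[i] later. */
		unsigned long t = (i + 1) * kick_us + ready_us[i];

		if (t > done)
			done = t;
	}
	return done;
}
```

With four APs that each need 5000us and a 100us kick, the serial model
gives 20400us and the parallel one 5400us. Real hardware will not be this
tidy, but the serial cost grows with the sum of the AP latencies while the
parallel cost is bounded by the slowest AP.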

v2: Cut it back to just INIT/SIPI/SIPI in parallel for now, nothing more
v3: Clean up x2apic patch, add MTRR optimisation, lock topology update
    in preparation for more parallelisation.
v4: Fixes to the real mode parallelisation patch spotted by SeanC, to
    avoid scribbling on initial_gs in common_cpu_up(), and to allow all
    24 bits of the physical X2APIC ID to be used. That patch still needs
    a Signed-off-by from its original author, who once claimed not to
    remember writing it at all. But now we've fixed it, hopefully he'll
    admit it now :)

David Woodhouse (8):
      x86/apic/x2apic: Fix parallel handling of cluster_mask
      cpu/hotplug: Move idle_thread_get() to <linux/smpboot.h>
      cpu/hotplug: Add dynamic parallel bringup states before CPUHP_BRINGUP_CPU
      x86/smpboot: Reference count on smpboot_setup_warm_reset_vector()
      x86/smpboot: Split up native_cpu_up into separate phases and document them
      x86/smpboot: Send INIT/SIPI/SIPI to secondary CPUs in parallel
      x86/mtrr: Avoid repeated save of MTRRs on boot-time CPU bringup
      x86/smpboot: Serialize topology updates for secondary bringup

Thomas Gleixner (1):
      x86/smpboot: Support parallel startup of secondary CPUs

[dwoodhou@i7 linux-2.6]$ git diff --stat  v5.17-rc2..share/parallel-5.17-part1 
 arch/x86/include/asm/realmode.h       |   3 +
 arch/x86/include/asm/smp.h            |  13 +-
 arch/x86/include/asm/topology.h       |   2 -
 arch/x86/kernel/acpi/sleep.c          |   1 +
 arch/x86/kernel/apic/apic.c           |   2 +-
 arch/x86/kernel/apic/x2apic_cluster.c | 108 ++++++-----
 arch/x86/kernel/cpu/common.c          |   6 +-
 arch/x86/kernel/cpu/mtrr/mtrr.c       |   9 +
 arch/x86/kernel/head_64.S             |  73 ++++++++
 arch/x86/kernel/smpboot.c             | 325 ++++++++++++++++++++++++----------
 arch/x86/realmode/init.c              |   3 +
 arch/x86/realmode/rm/trampoline_64.S  |  14 ++
 arch/x86/xen/smp_pv.c                 |   4 +-
 include/linux/cpuhotplug.h            |   2 +
 include/linux/smpboot.h               |   7 +
 kernel/cpu.c                          |  27 ++-
 kernel/smpboot.c                      |   2 +-
 kernel/smpboot.h                      |   2 -
 18 files changed, 442 insertions(+), 161 deletions(-)

Comments

Tom Lendacky Feb. 7, 2022, 6:50 p.m. UTC | #1
On 2/1/22 14:53, David Woodhouse wrote:
> Doing the INIT/SIPI/SIPI in parallel for all APs and *then* waiting for
> them shaves about 80% off the AP bringup time on a 96-thread 2-socket
> Skylake box (EC2 c5.metal) — from about 500ms to 100ms.
> 
> There are more wins to be had with further parallelisation, but this is
> the simple part.
> 
> v2: Cut it back to just INIT/SIPI/SIPI in parallel for now, nothing more
> v3: Clean up x2apic patch, add MTRR optimisation, lock topology update
>      in preparation for more parallelisation.
> v4: Fixes to the real mode parallelisation patch spotted by SeanC, to
>      avoid scribbling on initial_gs in common_cpu_up(), and to allow all
>      24 bits of the physical X2APIC ID to be used. That patch still needs
>      a Signed-off-by from its original author, who once claimed not to
>      remember writing it at all. But now we've fixed it, hopefully he'll
>      admit it now :)

I'm no longer seeing crashes launching high vCPU-count guests with this 
series.

Thanks,
Tom

Usama Arif Feb. 1, 2023, 2:40 p.m. UTC | #2
On 01/02/2022 20:53, David Woodhouse wrote:
> Doing the INIT/SIPI/SIPI in parallel for all APs and *then* waiting for
> them shaves about 80% off the AP bringup time on a 96-thread 2-socket
> Skylake box (EC2 c5.metal) — from about 500ms to 100ms.
> 
> There are more wins to be had with further parallelisation, but this is
> the simple part.
> 

Hi,

We are interested in reducing the boot time of servers (with kexec), and 
smpboot takes up a significant amount of time while booting. When 
testing the patch series (rebased to v6.1) on a server with 128 CPUs 
split across 2 NUMA nodes, it brought down the smpboot time from ~700ms 
to 100ms. Adding another cpuhp state for do_wait_cpu_initialized to make 
sure cpu_init is reached (as done in v1 of the series + using the 
cpu_finishup_mask) brought it down further to ~30ms.
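
For reference, the savings quoted in this thread can be checked with a
throwaway helper like the one below. The name percent_saved is
hypothetical and not part of the series; the inputs are the approximate
timings mentioned above.

```c
/* Percentage of time saved when going from before_ms to after_ms. */
double percent_saved(double before_ms, double after_ms)
{
	/* Guard against a zero or negative baseline. */
	if (before_ms <= 0.0)
		return 0.0;
	return 100.0 * (before_ms - after_ms) / before_ms;
}
```

Going from 500ms to 100ms is the 80% quoted in the cover letter; 700ms to
100ms is roughly 85.7%, and 700ms to 30ms is roughly 95.7%.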

I just wanted to check what was needed to progress the patch series 
further for review? There weren't any comments on v4 of the patch so I 
couldn't figure out what more is needed. I think it's quite useful to 
have this working so would be really glad to help with anything needed to 
restart the review.

Thanks!
Usama



David Woodhouse Feb. 1, 2023, 3:08 p.m. UTC | #3
On Wed, 2023-02-01 at 14:40 +0000, Usama Arif wrote:
> On 01/02/2022 20:53, David Woodhouse wrote:
> > Doing the INIT/SIPI/SIPI in parallel for all APs and *then* waiting for
> > them shaves about 80% off the AP bringup time on a 96-thread 2-socket
> > Skylake box (EC2 c5.metal) — from about 500ms to 100ms.
> > 
> > There are more wins to be had with further parallelisation, but this is
> > the simple part.
> > 
> 
> Hi,
> 
> We are interested in reducing the boot time of servers (with kexec), and 
> smpboot takes up a significant amount of time while booting. When 
> testing the patch series (rebased to v6.1) on a server with 128 CPUs 
> split across 2 NUMA nodes, it brought down the smpboot time from ~700ms 
> to 100ms. Adding another cpuhp state for do_wait_cpu_initialized to make 
> sure cpu_init is reached (as done in v1 of the series + using the 
> cpu_finishup_mask) brought it down further to ~30ms.
> 
> I just wanted to check what was needed to progress the patch series 
> further for review? There weren't any comments on v4 of the patch so I 
> couldn't figure out what more is needed. I think it's quite useful to 
> have this working so would be really glad to help with anything needed to 
> restart the review.


I believe the only thing holding it back was the fact that it broke on
some AMD CPUs.

We don't *think* there are any remaining software issues; we think it's
hardware. Either an actual hardware race in CPU or chipset, or perhaps
even something as simple as a voltage regulator which can't cope with
an increase in power draw from *all* the CPUs at the same time.

We have prodded AMD a few times to investigate, but so far to no avail.

Last time I actually spoke to Thomas in person, I think he agreed that
we should just merge it and disable the parallel mode for the affected
AMD CPUs.

If you've already rebased to a newer kernel and tested it, perhaps now
is the time to do just that.
Usama Arif Feb. 1, 2023, 4:38 p.m. UTC | #4
On 01/02/2023 15:08, David Woodhouse wrote:
> On Wed, 2023-02-01 at 14:40 +0000, Usama Arif wrote:
>> On 01/02/2022 20:53, David Woodhouse wrote:
>>> Doing the INIT/SIPI/SIPI in parallel for all APs and *then* waiting for
>>> them shaves about 80% off the AP bringup time on a 96-thread 2-socket
>>> Skylake box (EC2 c5.metal) — from about 500ms to 100ms.
>>>
>>> There are more wins to be had with further parallelisation, but this is
>>> the simple part.
>>>
>>
>> Hi,
>>
>> We are interested in reducing the boot time of servers (with kexec), and
>> smpboot takes up a significant amount of time while booting. When
>> testing the patch series (rebased to v6.1) on a server with 128 CPUs
>> split across 2 NUMA nodes, it brought down the smpboot time from ~700ms
>> to 100ms. Adding another cpuhp state for do_wait_cpu_initialized to make
>> sure cpu_init is reached (as done in v1 of the series + using the
>> cpu_finishup_mask) brought it down further to ~30ms.
>>
>> I just wanted to check what was needed to progress the patch series
>> further for review? There weren't any comments on v4 of the patch so I
>> couldn't figure out what more is needed. I think it's quite useful to
>> have this working so would be really glad to help with anything needed to
>> restart the review.
> 
> 
> I believe the only thing holding it back was the fact that it broke on
> some AMD CPUs.
> 
> We don't *think* there are any remaining software issues; we think it's
> hardware. Either an actual hardware race in CPU or chipset, or perhaps
> even something as simple as a voltage regulator which can't cope with
> an increase in power draw from *all* the CPUs at the same time.
> 
> We have prodded AMD a few times to investigate, but so far to no avail.
> 
> Last time I actually spoke to Thomas in person, I think he agreed that
> we should just merge it and disable the parallel mode for the affected
> AMD CPUs.
>

From the comments on v3, it seems to affect multiple generations. Would
it be worth proceeding with the patches by disabling parallel bringup on
all AMD CPUs to be on the safe side until the root cause is found, and
then following up later by disabling it only on the affected CPUs? Maybe
simply do something like the below?

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 0f144773a7fc..6b8884592341 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1575,7 +1575,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
          * for SEV-ES guests because they can't use CPUID that early.
          */
         if (IS_ENABLED(CONFIG_X86_32) || boot_cpu_data.cpuid_level < 0x0B ||
-           cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
+           cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT) ||
+           boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
                 do_parallel_bringup = false;

         if (do_parallel_bringup) {




> If you've already rebased to a newer kernel and tested it, perhaps now
> is the time to do just that.

If you would like me to repost the rebased patches to restart the 
reviews (with do_parallel_bringup disabled for AMD), please let me know!

Thanks,
Usama
H. Peter Anvin Feb. 1, 2023, 4:55 p.m. UTC | #5
On February 1, 2023 8:38:14 AM PST, Usama Arif <usama.arif@bytedance.com> wrote:
>
>
>On 01/02/2023 15:08, David Woodhouse wrote:
>> On Wed, 2023-02-01 at 14:40 +0000, Usama Arif wrote:
>>> On 01/02/2022 20:53, David Woodhouse wrote:
>>>> Doing the INIT/SIPI/SIPI in parallel for all APs and *then* waiting for
>>>> them shaves about 80% off the AP bringup time on a 96-thread 2-socket
>>>> Skylake box (EC2 c5.metal) — from about 500ms to 100ms.
>>>> 
>>>> There are more wins to be had with further parallelisation, but this is
>>>> the simple part.
>>>> 
>>> 
>>> Hi,
>>> 
>>> We are interested in reducing the boot time of servers (with kexec), and
>>> smpboot takes up a significant amount of time while booting. When
>>> testing the patch series (rebased to v6.1) on a server with 128 CPUs
>>> split across 2 NUMA nodes, it brought down the smpboot time from ~700ms
>>> to 100ms. Adding another cpuhp state for do_wait_cpu_initialized to make
>>> sure cpu_init is reached (as done in v1 of the series + using the
>>> cpu_finishup_mask) brought it down further to ~30ms.
>>> 
>>> I just wanted to check what was needed to progress the patch series
>>> further for review? There weren't any comments on v4 of the patch so I
>>> couldn't figure out what more is needed. I think it's quite useful to
>>> have this working so would be really glad to help with anything needed to
>>> restart the review.
>> 
>> 
>> I believe the only thing holding it back was the fact that it broke on
>> some AMD CPUs.
>> 
>> We don't *think* there are any remaining software issues; we think it's
>> hardware. Either an actual hardware race in CPU or chipset, or perhaps
>> even something as simple as a voltage regulator which can't cope with
>> an increase in power draw from *all* the CPUs at the same time.
>> 
>> We have prodded AMD a few times to investigate, but so far to no avail.
>> 
>> Last time I actually spoke to Thomas in person, I think he agreed that
>> we should just merge it and disable the parallel mode for the affected
>> AMD CPUs.
>> 
>
>From the comments on v3, it seems to affect multiple generations. Would it be worth proceeding with the patches by disabling parallel bringup on all AMD CPUs to be on the safe side until the root cause is found, and then following up later by disabling it only on the affected CPUs? Maybe simply do something like the below?
>
>diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>index 0f144773a7fc..6b8884592341 100644
>--- a/arch/x86/kernel/smpboot.c
>+++ b/arch/x86/kernel/smpboot.c
>@@ -1575,7 +1575,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
>         * for SEV-ES guests because they can't use CPUID that early.
>         */
>        if (IS_ENABLED(CONFIG_X86_32) || boot_cpu_data.cpuid_level < 0x0B ||
>-           cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
>+           cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT) ||
>+           boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
>                do_parallel_bringup = false;
>
>        if (do_parallel_bringup) {
>
>
>
>
>> If you've already rebased to a newer kernel and tested it, perhaps now
>> is the time to do just that.
>
>If you would like me to repost the rebased patches to restart the reviews (with do_parallel_bringup disabled for AMD), please let me know!
>
>Thanks,
>Usama

This should be a CPU bug flag in my opinion.
David Woodhouse Feb. 1, 2023, 5:12 p.m. UTC | #6
On Wed, 2023-02-01 at 08:55 -0800, H. Peter Anvin wrote:
> On February 1, 2023 8:38:14 AM PST, Usama Arif <usama.arif@bytedance.com> wrote:
> > 
> > 
> > On 01/02/2023 15:08, David Woodhouse wrote:
> > > On Wed, 2023-02-01 at 14:40 +0000, Usama Arif wrote:
> > > > On 01/02/2022 20:53, David Woodhouse wrote:
> > > > > Doing the INIT/SIPI/SIPI in parallel for all APs and *then* waiting for
> > > > > them shaves about 80% off the AP bringup time on a 96-thread 2-socket
> > > > > Skylake box (EC2 c5.metal) — from about 500ms to 100ms.
> > > > > 
> > > > > There are more wins to be had with further parallelisation, but this is
> > > > > the simple part.
> > > > > 
> > > > 
> > > > Hi,
> > > > 
> > > > We are interested in reducing the boot time of servers (with kexec), and
> > > > smpboot takes up a significant amount of time while booting. When
> > > > testing the patch series (rebased to v6.1) on a server with 128 CPUs
> > > > split across 2 NUMA nodes, it brought down the smpboot time from ~700ms
> > > > to 100ms. Adding another cpuhp state for do_wait_cpu_initialized to make
> > > > sure cpu_init is reached (as done in v1 of the series + using the
> > > > cpu_finishup_mask) brought it down further to ~30ms.
> > > > 
> > > > I just wanted to check what was needed to progress the patch series
> > > > further for review? There weren't any comments on v4 of the patch so I
> > > > couldn't figure out what more is needed. I think it's quite useful to
> > > > have this working so would be really glad to help with anything needed to
> > > > restart the review.
> > > 
> > > 
> > > I believe the only thing holding it back was the fact that it broke on
> > > some AMD CPUs.
> > > 
> > > We don't *think* there are any remaining software issues; we think it's
> > > hardware. Either an actual hardware race in CPU or chipset, or perhaps
> > > even something as simple as a voltage regulator which can't cope with
> > > an increase in power draw from *all* the CPUs at the same time.
> > > 
> > > We have prodded AMD a few times to investigate, but so far to no avail.
> > > 
> > > Last time I actually spoke to Thomas in person, I think he agreed that
> > > we should just merge it and disable the parallel mode for the affected
> > > AMD CPUs.
> > > 
> > 
> > From the comments on v3, it seems to affect multiple generations. Would it be worth proceeding with the patches by disabling parallel bringup on all AMD CPUs to be on the safe side until the root cause is found, and then following up later by disabling it only on the affected CPUs? Maybe simply do something like the below?
> > 
> > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > index 0f144773a7fc..6b8884592341 100644
> > --- a/arch/x86/kernel/smpboot.c
> > +++ b/arch/x86/kernel/smpboot.c
> > @@ -1575,7 +1575,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
> >         * for SEV-ES guests because they can't use CPUID that early.
> >         */
> >        if (IS_ENABLED(CONFIG_X86_32) || boot_cpu_data.cpuid_level < 0x0B ||
> > -           cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
> > +           cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT) ||
> > +           boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
> >                do_parallel_bringup = false;
> > 
> >        if (do_parallel_bringup) {
> > 
> > 
> > 
> > 
> > > If you've already rebased to a newer kernel and tested it, perhaps now
> > > is the time to do just that.
> > 
> > If you would like me to repost the rebased patches to restart the reviews (with do_parallel_bringup disabled for AMD), please let me know!
> > 

Sounds like you have a far fresher context on it all than I do now, so
yes please that sounds like a great idea.

I think we still need a sign-off from Thomas on the real mode patch but
as I noted in the last cover letter, now we've *fixed* it perhaps we
can persuade him to concede that it's his? Either that or we post it in
email and hope to trick him into adding a S-o-B in transit as he
applies it...

> > Thanks,
> > Usama
> 
> > This should be a CPU bug flag in my opinion.

Yeah, probably true. But I think I agree with Usama that we should do
it for all AMD to start with. Best to err on the side of caution.
David Woodhouse Feb. 2, 2023, 10:06 a.m. UTC | #7
On Wed, 2023-02-01 at 08:55 -0800, H. Peter Anvin wrote:
> This should be a CPU bug flag in my opinion.

This is in the tree that I've just rebased to v6.2-rc6 for Usama to
continue testing and repost as appropriate.

(Oh, as I post it in email I realise we should probably retcon the
explicit check for AMD out of the previous patch in the series. You can
see it being *removed* in this patch.)

From 1f7cece1241e5b9c9988f943962155bb7154d4f8 Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw@amazon.co.uk>
Date: Thu, 2 Feb 2023 09:53:26 +0000
Subject: [PATCH 07/15] x86/smpboot: Disable parallel boot for AMD CPUs

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/kernel/cpu/amd.c          | 11 +++++++++++
 arch/x86/kernel/smpboot.c          |  9 +++++++--
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 61012476d66e..ed7f32354edc 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -466,5 +466,6 @@
 #define X86_BUG_MMIO_UNKNOWN		X86_BUG(26) /* CPU is too old and its MMIO Stale Data status is unknown */
 #define X86_BUG_RETBLEED		X86_BUG(27) /* CPU is affected by RETBleed */
 #define X86_BUG_EIBRS_PBRSB		X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
+#define X86_BUG_NO_PARALLEL_BRINGUP	X86_BUG(29) /* CPU has hardware issues with parallel AP bringup */
 
 #endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f769d6d08b43..19b5c8342d7e 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -941,6 +941,17 @@ static void init_amd(struct cpuinfo_x86 *c)
 	case 0x19: init_amd_zn(c); break;
 	}
 
+	/*
+	 * Various AMD CPUs appear not to cope with APs being brought up
+	 * in parallel. In debugging, the AP doesn't even seem to reach an
+	 * outb to port 0x3f8 right at the top of the startup trampoline.
+	 * We don't *think* there are any remaining software issues which
+	 * may contribute to this, although it's possible. So far, attempts
+	 * to get AMD to investigate this have been to no avail. So just
+	 * disable parallel bring up for all AMD CPUs for now.
+	 */
+	set_cpu_bug(c, X86_BUG_NO_PARALLEL_BRINGUP);
+
 	/*
 	 * Enable workaround for FXSAVE leak on CPUs
 	 * without a XSaveErPtr feature
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 7920823d5a3b..95c182023d09 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1538,9 +1538,14 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 	 * it for all AMD CPUs to be on the safe side.
 	 */
 	if (IS_ENABLED(CONFIG_X86_32) || boot_cpu_data.cpuid_level < 0x0B ||
-	    cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT) ||
-	    boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+	    cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) {
 		do_parallel_bringup = false;
+	}
+	if (do_parallel_bringup &&
+	    boot_cpu_has_bug(X86_BUG_NO_PARALLEL_BRINGUP)) {
+		pr_info("Disabling parallel bringup due to CPU bugs\n");
+		do_parallel_bringup = false;
+	}
 
 	snp_set_wakeup_secondary_cpu();
 }
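
The decision made by the hunk above can be modelled outside the kernel as
a small pure function. This is only a sketch of the logic as I read it
from the diff: struct bringup_inputs and parallel_bringup_allowed() are
hypothetical names, and each field stands in for the kernel expression
named in its comment.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the inputs the patch tests; not kernel types. */
struct bringup_inputs {
	bool is_32bit;            /* IS_ENABLED(CONFIG_X86_32) */
	unsigned int cpuid_level; /* boot_cpu_data.cpuid_level */
	bool guest_state_encrypt; /* cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT) */
	bool bug_no_parallel;     /* boot_cpu_has_bug(X86_BUG_NO_PARALLEL_BRINGUP) */
};

/* Returns whether parallel bringup stays enabled, given the initial request. */
bool parallel_bringup_allowed(const struct bringup_inputs *in, bool requested)
{
	bool allowed = requested;

	/* First check: configurations where the parallel path cannot work at all. */
	if (in->is_32bit || in->cpuid_level < 0x0B || in->guest_state_encrypt)
		allowed = false;

	/* Second check: the CPU bug flag, consulted only if still enabled. */
	if (allowed && in->bug_no_parallel)
		allowed = false;

	return allowed;
}
```

Keeping the bug flag as a separate second check is what lets the patch
print its pr_info() message only when the flag is the actual reason for
disabling parallel bringup.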