Message ID | 20240702005800.622910-1-zhiquan1.li@intel.com (mailing list archive) |
---|---|
State | Handled Elsewhere, archived |
Headers | show |
Series | [v3] x86/acpi: fix panic while AP online later with kernel parameter maxcpus=1 | expand |
On Tue, 2024-07-02 at 08:58 +0800, Zhiquan Li wrote: > The issue was found on the platform that using "Multiprocessor Wakeup > Structure"[1] to startup secondary CPU, which is usually used by > encrypted guest. When restrict boot time CPU to 1 with the kernel > parameter "maxcpus=1" and bring other CPUs online later, there will be > a kernel panic. > > The variable acpi_mp_wake_mailbox, which holds the virtual address of > the MP Wakeup Structure mailbox, will be set as read-only after init. > If the first AP gets online later, after init, the attempt to update > the variable results in panic. > > The memremap() call that initializes the variable cannot be moved into > acpi_parse_mp_wake() because memremap() is not functional at that point > in the boot process. > > [1] Details about the MP Wakeup structure can be found in ACPI v6.4, in > the "Multiprocessor Wakeup Structure" section. > > Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com> > Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Seems this changelog only mentions the problem, but doesn't say how to fix: Remove the __ro_after_init annotation of acpi_mp_wake_mailbox to fix. [...] > > /* Virtual address of the Multiprocessor Wakeup Structure mailbox */ > -static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox __ro_after_init; > +static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox; > > But if memremap() isn't available in acpi_parse_mp_wake() is the concern, could we change to do it before smp_init() but after memremap() is available? E.g, we should be able to use early_initcall() (only compile tested): diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c index 6cfe762be28b..ca0aea0832ac 100644 --- a/arch/x86/kernel/acpi/madt_wakeup.c +++ b/arch/x86/kernel/acpi/madt_wakeup.c @@ -176,18 +176,6 @@ static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip) return -EOPNOTSUPP; } - /* - * Remap mailbox memory only for the first call to acpi_wakeup_cpu(). - * - * Wakeup of secondary CPUs is fully serialized in the core code. - * No need to protect acpi_mp_wake_mailbox from concurrent accesses. - */ - if (!acpi_mp_wake_mailbox) { - acpi_mp_wake_mailbox = memremap(acpi_mp_wake_mailbox_paddr, - sizeof(*acpi_mp_wake_mailbox), - MEMREMAP_WB); - } - /* * Mailbox memory is shared between the firmware and OS. Firmware will * listen on mailbox command address, and once it receives the wakeup @@ -290,3 +278,29 @@ int __init acpi_parse_mp_wake(union acpi_subtable_headers *header, return 0; } + +static int __init acpi_mp_wake_map_mailbox(void) +{ + /* + * Nothing to do if the "Multiprocessor Wakeup Structure" is + * not present. acpi_mp_wake_mailbox_paddr could also be 0 + * if the kernel is from kexec. + */ + if (!acpi_mp_wake_mailbox_paddr) + return 0; + + /* memremap() isn't available in acpi_parse_mp_wake() */ + acpi_mp_wake_mailbox = memremap(acpi_mp_wake_mailbox_paddr, + sizeof(*acpi_mp_wake_mailbox), + MEMREMAP_WB); + /* + * When memremap() fails, clear acpi_mp_wake_mailbox_paddr + * to prevent NULL dereference of acpi_mp_wake_mailbox + * when bringing up secondary CPUs. + */ + if (WARN_ON(!acpi_mp_wake_mailbox)) + acpi_mp_wake_mailbox_paddr = 0; + + return 0; +} +early_initcall(acpi_mp_wake_map_mailbox);
On Tue, Jul 02, 2024 at 12:05:38PM +0000, Huang, Kai wrote: > On Tue, 2024-07-02 at 08:58 +0800, Zhiquan Li wrote: > > The issue was found on the platform that using "Multiprocessor Wakeup > > Structure"[1] to startup secondary CPU, which is usually used by > > encrypted guest. When restrict boot time CPU to 1 with the kernel > > parameter "maxcpus=1" and bring other CPUs online later, there will be > > a kernel panic. > > > > The variable acpi_mp_wake_mailbox, which holds the virtual address of > > the MP Wakeup Structure mailbox, will be set as read-only after init. > > If the first AP gets online later, after init, the attempt to update > > the variable results in panic. > > > > The memremap() call that initializes the variable cannot be moved into > > acpi_parse_mp_wake() because memremap() is not functional at that point > > in the boot process. > > > > [1] Details about the MP Wakeup structure can be found in ACPI v6.4, in > > the "Multiprocessor Wakeup Structure" section. > > > > Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com> > > Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > Seems this changelog only mentions the problem, but doesn't say how to fix: > > Remove the __ro_after_init annotation of acpi_mp_wake_mailbox to fix. Do not talk about *what* the patch is doing in the commit message - that should be obvious from the diff itself. Rather, concentrate on the *why* it needs to be done. Imagine one fine day you're doing git archeology, you find the place in the code about which you want to find out why it was changed the way it is now. You do git annotate <filename> ... find the line, see the commit id and you do: git show <commit id> You read the commit message and there's just gibberish and nothing's explaining *why* that change was done. And you start scratching your head, trying to figure out why. Because the damn commit message is worth sh*t. This happens to us maintainers at least once a week. Well, I don't want that to happen in my tree anymore. So none of this text above still doesn't explain to me *why* this is happening. Why do APs need to update acpi_mp_wake_mailbox? Which patch is this fixing? See Documentation/process/submitting-patches.rst Questions over questions...
On Tue, 2024-07-02 at 14:45 +0200, Borislav Petkov wrote: > On Tue, Jul 02, 2024 at 12:05:38PM +0000, Huang, Kai wrote: > > On Tue, 2024-07-02 at 08:58 +0800, Zhiquan Li wrote: > > > The issue was found on the platform that using "Multiprocessor Wakeup > > > Structure"[1] to startup secondary CPU, which is usually used by > > > encrypted guest. When restrict boot time CPU to 1 with the kernel > > > parameter "maxcpus=1" and bring other CPUs online later, there will be > > > a kernel panic. > > > > > > The variable acpi_mp_wake_mailbox, which holds the virtual address of > > > the MP Wakeup Structure mailbox, will be set as read-only after init. > > > If the first AP gets online later, after init, the attempt to update > > > the variable results in panic. > > > > > > The memremap() call that initializes the variable cannot be moved into > > > acpi_parse_mp_wake() because memremap() is not functional at that point > > > in the boot process. > > > > > > [1] Details about the MP Wakeup structure can be found in ACPI v6.4, in > > > the "Multiprocessor Wakeup Structure" section. > > > > > > Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com> > > > Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > > Seems this changelog only mentions the problem, but doesn't say how to fix: > > > > Remove the __ro_after_init annotation of acpi_mp_wake_mailbox to fix. > > Do not talk about *what* the patch is doing in the commit message - that > should be obvious from the diff itself. Rather, concentrate on the *why* > it needs to be done. > > Imagine one fine day you're doing git archeology, you find the place in > the code about which you want to find out why it was changed the way it > is now. > > You do git annotate <filename> ... find the line, see the commit id and > you do: > > git show <commit id> > > You read the commit message and there's just gibberish and nothing's > explaining *why* that change was done. And you start scratching your > head, trying to figure out why. Because the damn commit message is worth > sh*t. Yeah fully agree. Thanks for saying this again. > > This happens to us maintainers at least once a week. Well, I don't want > that to happen in my tree anymore. > > So none of this text above still doesn't explain to me *why* this is > happening. > > Why do APs need to update acpi_mp_wake_mailbox? They don't need to if acpi_mp_wake_mailbox can be setup before smp_init() once for all. But currently the setup of acpi_mp_wake_mailbox is done when the first AP is brought up because memremap() doesn't work in acpi_parse_mp_wake(), as mentioned in the changelog of this patch. I also feel it's not ideal to setup acpi_mp_wake_mailbox when bringing up the first AP, so I provided my diff. IIUC, if memremap() works for acpi_mp_wake_mailbox when bringing up the first AP, then it should also work in the early_initcall(). > > Which patch is this fixing? It fiexes below commit AFAICT: 24dd05da8c79 ("x86/apic: Mark acpi_mp_wake_* variables as __ro_after_init") Which didn't consider 'maxvcpus=xx' case. But I will leave to Kirill to confirm.
On 2024/7/3 07:55, Huang, Kai wrote: >> This happens to us maintainers at least once a week. Well, I don't want >> that to happen in my tree anymore. >> >> So none of this text above still doesn't explain to me *why* this is >> happening. >> >> Why do APs need to update acpi_mp_wake_mailbox? Not AP needs to update acpi_mp_wake_mailbox, but BSP might need to update it after the init stage. In the encrypted guest CPU hot-plug scenario, BSP memremap() the acpi_mp_wake_mailbox_paddr, and writes APIC ID of APs, wakeup vector and the ACPI_MP_WAKE_COMMAND_WAKEUP command into mailbox. Firmware will listen on mailbox command address, and once it receives the wakeup command, the CPU associated with the given apicid will be booted. We cannot assume that all APs will be brought up in the init stage. > They don't need to if acpi_mp_wake_mailbox can be setup before smp_init() > once for all. > > But currently the setup of acpi_mp_wake_mailbox is done when the first AP is > brought up because memremap() doesn't work in acpi_parse_mp_wake(), as > mentioned in the changelog of this patch. > > I also feel it's not ideal to setup acpi_mp_wake_mailbox when bringing up > the first AP, so I provided my diff. IIUC, if memremap() works for > acpi_mp_wake_mailbox when bringing up the first AP, then it should also work > in > the early_initcall(). Besides the factor that whether memremap() is functional at the point in the boot process, another reason I can think of is, if the intention is just to work with BSP, then the remapping is a redundant step. Especially in the kexec & kdump case, the capture kernel only needs single CPU to work usually with the "maxcpus=1" option. IMHO, the solution that postpone the remapping while really needs to bring up APs is reasonable, just don't make acpi_mp_wake_mailbox read-only. The APs might be brought up later, might be never. > >> Which patch is this fixing? > It fiexes below commit AFAICT: > > 24dd05da8c79 ("x86/apic: Mark acpi_mp_wake_* variables as > __ro_after_init") > > Which didn't consider 'maxvcpus=xx' case. > Thanks a lot for checking this, Kai. > > But I will leave to Kirill to confirm. Best Regards, Zhiquan
On July 3, 2024 4:39:43 AM GMT+02:00, Zhiquan Li <zhiquan1.li@intel.com> wrote: > >On 2024/7/3 07:55, Huang, Kai wrote: >>> This happens to us maintainers at least once a week. Well, I don't want >>> that to happen in my tree anymore. >>> >>> So none of this text above still doesn't explain to me *why* this is >>> happening. >>> >>> Why do APs need to update acpi_mp_wake_mailbox? > >Not AP needs to update acpi_mp_wake_mailbox, but BSP might need to >update it after the init stage. In the encrypted guest CPU hot-plug >scenario, BSP memremap() the acpi_mp_wake_mailbox_paddr, and writes APIC >ID of APs, wakeup vector and the ACPI_MP_WAKE_COMMAND_WAKEUP command >into mailbox. Firmware will listen on mailbox command address, and once >it receives the wakeup command, the CPU associated with the given apicid >will be booted. > >We cannot assume that all APs will be brought up in the init stage. > >> They don't need to if acpi_mp_wake_mailbox can be setup before smp_init() >> once for all. >> >> But currently the setup of acpi_mp_wake_mailbox is done when the first AP is >> brought up because memremap() doesn't work in acpi_parse_mp_wake(), as >> mentioned in the changelog of this patch. >> >> I also feel it's not ideal to setup acpi_mp_wake_mailbox when bringing up >> the first AP, so I provided my diff. IIUC, if memremap() works for >> acpi_mp_wake_mailbox when bringing up the first AP, then it should also work >> in >> the early_initcall(). > >Besides the factor that whether memremap() is functional at the point in >the boot process, another reason I can think of is, if the intention is >just to work with BSP, then the remapping is a redundant step. >Especially in the kexec & kdump case, the capture kernel only needs >single CPU to work usually with the "maxcpus=1" option. > >IMHO, the solution that postpone the remapping while really needs to >bring up APs is reasonable, just don't make acpi_mp_wake_mailbox >read-only. The APs might be brought up later, might be never. > > >> >>> Which patch is this fixing? >> It fiexes below commit AFAICT: >> >> 24dd05da8c79 ("x86/apic: Mark acpi_mp_wake_* variables as >> __ro_after_init") >> >> Which didn't consider 'maxvcpus=xx' case. >> > >Thanks a lot for checking this, Kai. > >> >> But I will leave to Kirill to confirm. > >Best Regards, >Zhiquan Then please extend the commit message with the "why", add the Fixes tag and resend. Thx.
diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c index 6cfe762be28b..d5ef6215583b 100644 --- a/arch/x86/kernel/acpi/madt_wakeup.c +++ b/arch/x86/kernel/acpi/madt_wakeup.c @@ -19,7 +19,7 @@ static u64 acpi_mp_wake_mailbox_paddr __ro_after_init; /* Virtual address of the Multiprocessor Wakeup Structure mailbox */ -static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox __ro_after_init; +static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox; static u64 acpi_mp_pgd __ro_after_init; static u64 acpi_mp_reset_vector_paddr __ro_after_init;