
[v8,16/16] microcode: block #NMI handling when loading an ucode

Message ID 1564654971-31328-17-git-send-email-chao.gao@intel.com (mailing list archive)
State Superseded
Series improve late microcode loading

Commit Message

Chao Gao Aug. 1, 2019, 10:22 a.m. UTC
Register an NMI callback. This callback busy-loops on threads which are
waiting for loading completion while 'loading_ucode' is true.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v8:
 - new
---
 xen/arch/x86/microcode.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

Comments

Jan Beulich Aug. 5, 2019, 12:11 p.m. UTC | #1
On 01.08.2019 12:22, Chao Gao wrote:
> @@ -439,12 +440,37 @@ static int do_microcode_update(void *patch)
>       return ret;
>   }
>   
> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
> +{
> +    bool print = false;
> +
> +    /* The first thread of a core is to load an update. Don't block it. */
> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +        return 0;
> +
> +    if ( loading_state == LOADING_ENTERED )
> +    {
> +        cpumask_set_cpu(cpu, &cpu_callin_map);
> +        printk(XENLOG_DEBUG "CPU%u enters %s\n", smp_processor_id(), __func__);

Here and ...

> +        print = true;
> +    }
> +
> +    while ( loading_state == LOADING_ENTERED )
> +        rep_nop();
> +
> +    if ( print )
> +        printk(XENLOG_DEBUG "CPU%u exits %s\n", smp_processor_id(), __func__);

... here - why smp_processor_id() when you can use "cpu"? And what
use is __func__ here?

The rep_nop() above also presumably wants to be cpu_relax() again.

But on the whole I was really hoping for more aggressive disabling
of NMI handling, more like (but of course not quite as heavy as)
the crash path wiring the IDT entry to trap_nop(). Andrew, I'm
curious to learn what you're thinking would be best here.

Jan
Chao Gao Aug. 7, 2019, 7:59 a.m. UTC | #2
On Mon, Aug 05, 2019 at 12:11:01PM +0000, Jan Beulich wrote:
>On 01.08.2019 12:22, Chao Gao wrote:
>> @@ -439,12 +440,37 @@ static int do_microcode_update(void *patch)
>>       return ret;
>>   }
>>   
>> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
>> +{
>> +    bool print = false;
>> +
>> +    /* The first thread of a core is to load an update. Don't block it. */
>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>> +        return 0;
>> +
>> +    if ( loading_state == LOADING_ENTERED )
>> +    {
>> +        cpumask_set_cpu(cpu, &cpu_callin_map);
>> +        printk(XENLOG_DEBUG "CPU%u enters %s\n", smp_processor_id(), __func__);
>
>Here  and ...
>
>> +        print = true;
>> +    }
>> +
>> +    while ( loading_state == LOADING_ENTERED )
>> +        rep_nop();
>> +
>> +    if ( print )
>> +        printk(XENLOG_DEBUG "CPU%u exits %s\n", smp_processor_id(), __func__);
>
>... here - why smp_processor_id() when you can use "cpu"? And what
>use is __func__ here?
>
>The rep_nop() above also presumably wants to be cpu_relax() again.
>
>But on the whole I was really hoping for more aggressive disabling
>of NMI handling, more like (but of course not quite as heavy as)
>the crash path wiring the IDT entry to trap_nop().

Hi Jan,

I agree with you that it should be more aggressive. This patch is
problematic considering there is a lot of code before reaching this
callback (especially, SPEC_CTRL_ENTRY_FROM_INTR_IST which may write
MSR_SPEC_CTRL).

In my mind, we have two options to solve the issue:
1. Wire the IDT entry to trap_nop() like the crash path.

2. Enhance this patch: a thread which is not going to load an update
is forced to send an #NMI to itself so that it enters the callback and
busy-loops until loading of the ucode completes on all cores.
With this method, no #NMI delivery or MSR write would happen on these
threads during the microcode update.

Thanks
Chao
Jan Beulich Aug. 7, 2019, 8:44 a.m. UTC | #3
On 07.08.2019 09:59, Chao Gao wrote:
> On Mon, Aug 05, 2019 at 12:11:01PM +0000, Jan Beulich wrote:
>> On 01.08.2019 12:22, Chao Gao wrote:
>>> @@ -439,12 +440,37 @@ static int do_microcode_update(void *patch)
>>>        return ret;
>>>    }
>>>    
>>> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
>>> +{
>>> +    bool print = false;
>>> +
>>> +    /* The first thread of a core is to load an update. Don't block it. */
>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>>> +        return 0;
>>> +
>>> +    if ( loading_state == LOADING_ENTERED )
>>> +    {
>>> +        cpumask_set_cpu(cpu, &cpu_callin_map);
>>> +        printk(XENLOG_DEBUG "CPU%u enters %s\n", smp_processor_id(), __func__);
>>
>> Here  and ...
>>
>>> +        print = true;
>>> +    }
>>> +
>>> +    while ( loading_state == LOADING_ENTERED )
>>> +        rep_nop();
>>> +
>>> +    if ( print )
>>> +        printk(XENLOG_DEBUG "CPU%u exits %s\n", smp_processor_id(), __func__);
>>
>> ... here - why smp_processor_id() when you can use "cpu"? And what
>> use is __func__ here?
>>
>> The rep_nop() above also presumably wants to be cpu_relax() again.
>>
>> But on the whole I was really hoping for more aggressive disabling
>> of NMI handling, more like (but of course not quite as heavy as)
>> the crash path wiring the IDT entry to trap_nop().
> 
> I agree with you that it should be more aggressive. This patch is
> problematic considering there is a lot of code before reaching this
> callback (especially, SPEC_CTRL_ENTRY_FROM_INTR_IST which may write
> MSR_SPEC_CTRL).
> 
> In my mind, we have two options to solve the issue:
> 1. Wire the IDT entry to trap_nop() like the crash path.

As said, this is not directly an option - at the very least a thread
should record the fact that there was an NMI, such that it can
handle it after the ucode update has completed.

> 2. Enhance this patch: a thread which is not going to load an update
> is forced to send an #NMI to itself so that it enters the callback and
> busy-loops until loading of the ucode completes on all cores.
> With this method, no #NMI delivery or MSR write would happen on these
> threads during the microcode update.

This sounds doable at the first glance; iirc the CPU would latch
another NMI while NMIs are blocked due to there not having been an
IRET yet after the last one was raised. There would still be some
ambiguity in case the self-NMI and an actual one arrived at about
the same time, but I guess we need to live with this small window.

Jan

Patch

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 67549be..4ac7e93 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -38,6 +38,7 @@ 
 
 #include <asm/delay.h>
 #include <asm/msr.h>
+#include <asm/nmi.h>
 #include <asm/processor.h>
 #include <asm/setup.h>
 #include <asm/microcode.h>
@@ -439,12 +440,37 @@  static int do_microcode_update(void *patch)
     return ret;
 }
 
+static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
+{
+    bool print = false;
+
+    /* The first thread of a core is to load an update. Don't block it. */
+    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
+        return 0;
+
+    if ( loading_state == LOADING_ENTERED )
+    {
+        cpumask_set_cpu(cpu, &cpu_callin_map);
+        printk(XENLOG_DEBUG "CPU%u enters %s\n", smp_processor_id(), __func__);
+        print = true;
+    }
+
+    while ( loading_state == LOADING_ENTERED )
+        rep_nop();
+
+    if ( print )
+        printk(XENLOG_DEBUG "CPU%u exits %s\n", smp_processor_id(), __func__);
+
+    return 0;
+}
+
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
     void *buffer;
     unsigned int cpu, updated;
     struct microcode_patch *patch;
+    nmi_callback_t *saved_nmi_callback;
 
     if ( len != (uint32_t)len )
         return -E2BIG;
@@ -509,6 +535,8 @@  int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
      * watchdog timeout.
      */
     watchdog_disable();
+
+    saved_nmi_callback = set_nmi_callback(microcode_nmi_callback);
     /*
      * Late loading dance. Why the heavy-handed stop_machine effort?
      *
@@ -521,6 +549,7 @@  int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
      *   conservative and good.
      */
     ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
+    set_nmi_callback(saved_nmi_callback);
     watchdog_enable();
 
     updated = atomic_read(&cpu_updated);