
[2/3] x86/smp: use a dedicated scratch cpumask in send_IPI_mask

Message ID 20200212164949.56434-3-roger.pau@citrix.com (mailing list archive)
State Superseded
Series x86: fixes/improvements for scratch cpumask

Commit Message

Roger Pau Monne Feb. 12, 2020, 4:49 p.m. UTC
Using scratch_cpumask in send_IPI_mask is not safe because it can be
called from interrupt context, and hence Xen would have to make sure
all the users of the scratch cpumask disable interrupts while using
it.

Instead introduce a new cpumask to be used by send_IPI_mask, and
disable interrupts while using it.

Fixes: 5500d265a2a8 ('x86/smp: use APIC ALLBUT destination shorthand when possible')
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/smp.c     | 14 +++++++++++++-
 xen/arch/x86/smpboot.c |  9 ++++++++-
 2 files changed, 21 insertions(+), 2 deletions(-)

Comments

Jan Beulich Feb. 13, 2020, 9:59 a.m. UTC | #1
On 12.02.2020 17:49, Roger Pau Monne wrote:
> Using scratch_cpumask in send_IPI_mask is not safe because it can be
> called from interrupt context, and hence Xen would have to make sure
> all the users of the scratch cpumask disable interrupts while using
> it.
> 
> Instead introduce a new cpumask to be used by send_IPI_mask, and
> disable interrupts while using it.

My first thought here was: What about NMI or #MC context? Even if
not using the function today (which I didn't check), there shouldn't
be a latent issue introduced here preventing them from doing so in
the future. Instead I think you want to allocate the scratch mask
dynamically (at least if caller context doesn't allow use of the
generic one), and simply refrain from coalescing IPIs if this
allocation fails.

Jan
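For illustration only, a rough sketch of that allocation-based alternative (not
the posted patch; the hotplug/cpu-maps checks of the real function are omitted
and the send_IPI_shortcut() arguments are assumed from the existing shorthand
path): allocate a mask on demand and simply fall back to individual IPIs when
the allocation fails.

void send_IPI_mask(const cpumask_t *mask, int vector)
{
    cpumask_var_t scratch;

    if ( alloc_cpumask_var(&scratch) )
    {
        cpumask_or(scratch, mask, cpumask_of(smp_processor_id()));

        if ( cpumask_equal(scratch, &cpu_online_map) )
        {
            /* Every online CPU is a target: the ALLBUT shorthand applies. */
            free_cpumask_var(scratch);
            send_IPI_shortcut(APIC_DEST_ALLBUT, vector, APIC_DEST_PHYSICAL);
            return;
        }
        free_cpumask_var(scratch);
    }

    /* Allocation failed or not all CPUs targeted: no coalescing. */
    alternative_vcall(genapic.send_IPI_mask, mask, vector);
}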
Roger Pau Monne Feb. 13, 2020, 10:03 a.m. UTC | #2
On Thu, Feb 13, 2020 at 10:59:29AM +0100, Jan Beulich wrote:
> On 12.02.2020 17:49, Roger Pau Monne wrote:
> > Using scratch_cpumask in send_IPI_mask is not safe because it can be
> > called from interrupt context, and hence Xen would have to make sure
> > all the users of the scratch cpumask disable interrupts while using
> > it.
> > 
> > Instead introduce a new cpumask to be used by send_IPI_mask, and
> > disable interrupts while using it.
> 
> My first thought here was: What about NMI or #MC context? Even if
> not using the function today (which I didn't check), there shouldn't
> be a latent issue introduced here preventing them from doing so in
> the future. Instead I think you want to allocate the scratch mask
> dynamically (at least if caller context doesn't allow use of the
> generic one), and simply refrain from coalescing IPIs if this
> allocation fails.

Hm, isn't this going to be quite expensive, and hence render the
benefit of using the shorthand moot?

Thanks, Roger.
Jan Beulich Feb. 13, 2020, 10:19 a.m. UTC | #3
On 13.02.2020 11:03, Roger Pau Monné wrote:
> On Thu, Feb 13, 2020 at 10:59:29AM +0100, Jan Beulich wrote:
>> On 12.02.2020 17:49, Roger Pau Monne wrote:
>>> Using scratch_cpumask in send_IPI_mask is not safe because it can be
>>> called from interrupt context, and hence Xen would have to make sure
>>> all the users of the scratch cpumask disable interrupts while using
>>> it.
>>>
>>> Instead introduce a new cpumask to be used by send_IPI_mask, and
>>> disable interrupts while using it.
>>
>> My first thought here was: What about NMI or #MC context? Even if
>> not using the function today (which I didn't check), there shouldn't
>> be a latent issue introduced here preventing them from doing so in
>> the future. Instead I think you want to allocate the scratch mask
>> dynamically (at least if caller context doesn't allow use of the
>> generic one), and simply refrain from coalescing IPIs if this
>> allocation fails.
> 
> Hm, isn't this going to be quite expensive, and hence render the
> benefit of using the shorthand moot?

Depends on how many CPUs there are, i.e. how long the loop otherwise
would be. When xmalloc() doesn't need to turn to the page allocator,
I think it won't be overly slow. Another option would be to avoid
coalescing in a slightly different way (without having to fiddle
with the scratch mask) when called in interrupt context.

Jan
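A minimal sketch of that last option, assuming in_irq() is an acceptable test
for the contexts of concern: gate the coalescing path on not running in
interrupt context, so the generic scratch mask is never touched from an
interrupt handler. The remainder of send_IPI_mask() would stay as today.

    /* Only coalesce (and use the scratch mask) outside interrupt context. */
    if ( !in_irq() && local_irq_is_enabled() &&
         (cpus_locked = get_cpu_maps()) &&
         (park_offline_cpus ||
          cpumask_equal(&cpu_online_map, &cpu_present_map)) )
        cpumask_or(scratch, mask, cpumask_of(smp_processor_id()));
    else
    {
        if ( cpus_locked )
        {
            put_cpu_maps();
            cpus_locked = false;
        }
        /* An empty scratch mask keeps the shorthand path from triggering. */
        cpumask_clear(scratch);
    }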
Roger Pau Monne Feb. 13, 2020, 11:41 a.m. UTC | #4
On Thu, Feb 13, 2020 at 11:19:02AM +0100, Jan Beulich wrote:
> On 13.02.2020 11:03, Roger Pau Monné wrote:
> > On Thu, Feb 13, 2020 at 10:59:29AM +0100, Jan Beulich wrote:
> >> On 12.02.2020 17:49, Roger Pau Monne wrote:
> >>> Using scratch_cpumask in send_IPI_mask is not safe because it can be
> >>> called from interrupt context, and hence Xen would have to make sure
> >>> all the users of the scratch cpumask disable interrupts while using
> >>> it.
> >>>
> >>> Instead introduce a new cpumask to be used by send_IPI_mask, and
> >>> disable interrupts while using it.
> >>
> >> My first thought here was: What about NMI or #MC context? Even if
> >> not using the function today (which I didn't check), there shouldn't
> >> be a latent issue introduced here preventing them from doing so in
> >> the future. Instead I think you want to allocate the scratch mask
> >> dynamically (at least if caller context doesn't allow use of the
> >> generic one), and simply refrain from coalescing IPIs if this
> >> allocation fails.
> > 
> > Hm, isn't this going to be quite expensive, and hence render the
> > benefit of using the shorthand moot?
> 
> Depends on how many CPUs there are, i.e. how long the loop otherwise
> would be. When xmalloc() doesn't need to turn to the page allocator,
> I think it won't be overly slow. Another option would be to avoid
> coalescing in a slightly different way (without having to fiddle
> with the scratch mask) when called in interrupt context.

What about preventing the mask usage when in NMI context?

I could introduce something like in_nmi and just avoid the scratch
mask usage in that case (and the shorthand).

Thanks, Roger.
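A sketch of that idea; both in_nmi() and the helper name below are hypothetical
(Xen has no such predicate at this point in the thread) and merely illustrate
gating the scratch mask and the shorthand on the calling context:

    /* Hypothetical helper; in_nmi() would need to be introduced first. */
    static bool scratch_cpumask_usable(void)
    {
        /* NMI context must use neither the scratch mask nor the shorthand. */
        return !in_nmi();
    }

send_IPI_mask() (and other scratch mask users) would then take the
non-coalescing path whenever this returns false.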
Jan Beulich Feb. 13, 2020, 1:35 p.m. UTC | #5
On 13.02.2020 12:41, Roger Pau Monné wrote:
> On Thu, Feb 13, 2020 at 11:19:02AM +0100, Jan Beulich wrote:
>> On 13.02.2020 11:03, Roger Pau Monné wrote:
>>> On Thu, Feb 13, 2020 at 10:59:29AM +0100, Jan Beulich wrote:
>>>> On 12.02.2020 17:49, Roger Pau Monne wrote:
>>>>> Using scratch_cpumask in send_IPI_mask is not safe because it can be
>>>>> called from interrupt context, and hence Xen would have to make sure
>>>>> all the users of the scratch cpumask disable interrupts while using
>>>>> it.
>>>>>
>>>>> Instead introduce a new cpumask to be used by send_IPI_mask, and
>>>>> disable interrupts while using it.
>>>>
>>>> My first thought here was: What about NMI or #MC context? Even if
>>>> not using the function today (which I didn't check), there shouldn't
>>>> be a latent issue introduced here preventing them from doing so in
>>>> the future. Instead I think you want to allocate the scratch mask
>>>> dynamically (at least if caller context doesn't allow use of the
>>>> generic one), and simply refrain from coalescing IPIs if this
>>> allocation fails.
>>>
>>> Hm, isn't this going to be quite expensive, and hence render the
>>> benefit of using the shorthand moot?
>>
>> Depends on how many CPUs there are, i.e. how long the loop otherwise
>> would be. When xmalloc() doesn't need to turn to the page allocator,
>> I think it won't be overly slow. Another option would be to avoid
>> coalescing in a slightly different way (without having to fiddle
>> with the scratch mask) when called in interrupt context.
> 
> What about preventing the mask usage when in NMI context?
> 
> I could introduce something like in_nmi and just avoid the scratch
> mask usage in that case (and the shorthand).

Right, allocation isn't permitted in NMI context anyway. As to
#MC context - I'm afraid we don't have any indicator of this so
far, which is a problem (here and maybe also elsewhere).

Jan
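Purely as an illustration of what such an indicator could look like, with every
name below being an assumption rather than existing code: a per-CPU nesting
counter maintained by the #MC entry/exit paths, plus a predicate to query it.

    /* Hypothetical sketch; nothing like this exists in Xen at this point. */
    DEFINE_PER_CPU(unsigned int, mce_nesting);

    void mce_handler_enter(void)
    {
        this_cpu(mce_nesting)++;
    }

    void mce_handler_exit(void)
    {
        this_cpu(mce_nesting)--;
    }

    bool in_mce_handler(void)
    {
        return this_cpu(mce_nesting) != 0;
    }

send_IPI_mask() could then also avoid the per-CPU mask and the shorthand when
in_mce_handler() returns true.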

Patch

diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index 9bc925616a..384c3ba924 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -59,6 +59,7 @@  static void send_IPI_shortcut(unsigned int shortcut, int vector,
     apic_write(APIC_ICR, cfg);
 }
 
+DECLARE_PER_CPU(cpumask_var_t, send_ipi_cpumask);
 /*
  * send_IPI_mask(cpumask, vector): sends @vector IPI to CPUs in @cpumask,
  * excluding the local CPU. @cpumask may be empty.
@@ -67,7 +68,8 @@  static void send_IPI_shortcut(unsigned int shortcut, int vector,
 void send_IPI_mask(const cpumask_t *mask, int vector)
 {
     bool cpus_locked = false;
-    cpumask_t *scratch = this_cpu(scratch_cpumask);
+    cpumask_t *scratch = this_cpu(send_ipi_cpumask);
+    unsigned long flags;
 
     /*
      * This can only be safely used when no CPU hotplug or unplug operations
@@ -81,7 +83,15 @@  void send_IPI_mask(const cpumask_t *mask, int vector)
          local_irq_is_enabled() && (cpus_locked = get_cpu_maps()) &&
          (park_offline_cpus ||
           cpumask_equal(&cpu_online_map, &cpu_present_map)) )
+    {
+        /*
+         * send_IPI_mask can be called from interrupt context, and hence we
+         * need to disable interrupts in order to protect the per-cpu
+         * send_ipi_cpumask while being used.
+         */
+        local_irq_save(flags);
         cpumask_or(scratch, mask, cpumask_of(smp_processor_id()));
+    }
     else
     {
         if ( cpus_locked )
@@ -89,6 +99,7 @@  void send_IPI_mask(const cpumask_t *mask, int vector)
             put_cpu_maps();
             cpus_locked = false;
         }
+        local_irq_save(flags);
         cpumask_clear(scratch);
     }
 
@@ -97,6 +108,7 @@  void send_IPI_mask(const cpumask_t *mask, int vector)
     else
         alternative_vcall(genapic.send_IPI_mask, mask, vector);
 
+    local_irq_restore(flags);
     if ( cpus_locked )
         put_cpu_maps();
 }
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index e83e4564a4..82e89201b3 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -57,6 +57,9 @@  DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, scratch_cpumask);
 static cpumask_t scratch_cpu0mask;
 
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, send_ipi_cpumask);
+static cpumask_t send_ipi_cpu0mask;
+
 cpumask_t cpu_online_map __read_mostly;
 EXPORT_SYMBOL(cpu_online_map);
 
@@ -930,6 +933,8 @@  static void cpu_smpboot_free(unsigned int cpu, bool remove)
         FREE_CPUMASK_VAR(per_cpu(cpu_core_mask, cpu));
         if ( per_cpu(scratch_cpumask, cpu) != &scratch_cpu0mask )
             FREE_CPUMASK_VAR(per_cpu(scratch_cpumask, cpu));
+        if ( per_cpu(send_ipi_cpumask, cpu) != &send_ipi_cpu0mask )
+            FREE_CPUMASK_VAR(per_cpu(send_ipi_cpumask, cpu));
     }
 
     cleanup_cpu_root_pgt(cpu);
@@ -1034,7 +1039,8 @@  static int cpu_smpboot_alloc(unsigned int cpu)
 
     if ( !(cond_zalloc_cpumask_var(&per_cpu(cpu_sibling_mask, cpu)) &&
            cond_zalloc_cpumask_var(&per_cpu(cpu_core_mask, cpu)) &&
-           cond_alloc_cpumask_var(&per_cpu(scratch_cpumask, cpu))) )
+           cond_alloc_cpumask_var(&per_cpu(scratch_cpumask, cpu)) &&
+           cond_alloc_cpumask_var(&per_cpu(send_ipi_cpumask, cpu))) )
         goto out;
 
     rc = 0;
@@ -1175,6 +1181,7 @@  void __init smp_prepare_boot_cpu(void)
     cpumask_set_cpu(cpu, &cpu_present_map);
 #if NR_CPUS > 2 * BITS_PER_LONG
     per_cpu(scratch_cpumask, cpu) = &scratch_cpu0mask;
+    per_cpu(send_ipi_cpumask, cpu) = &send_ipi_cpu0mask;
 #endif
 
     get_cpu_info()->use_pv_cr3 = false;