[v2,06/12] x86/IRQ: consolidate use of ->arch.cpu_mask

Message ID 5CD2D545020000780022CD3D@prv1-mh.provo.novell.com (mailing list archive)
State Superseded
Series x86: IRQ management adjustments

Commit Message

Jan Beulich May 8, 2019, 1:10 p.m. UTC
Mixed meaning was implied so far by different pieces of code -
disagreement was in particular about whether to expect offline CPUs'
bits to possibly be set. Switch to a mostly consistent meaning
(exception being high priority interrupts, which would perhaps better
be switched to the same model as well in due course). Use the field to
record the vector allocation mask, i.e. potentially including bits of
offline (parked) CPUs. This implies that before passing the mask to
certain functions (most notably cpu_mask_to_apicid()) it needs to be
further reduced to the online subset.
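
Concretely, consumers end up with a pattern like the following (a minimal
sketch mirroring the io_apic.c hunks below, not anything beyond them):

    cpumask_t *mask = this_cpu(scratch_cpumask);

    /* Reduce the recorded mask to its online subset before use. */
    cpumask_and(mask, desc->arch.cpu_mask, &cpu_online_map);
    SET_DEST(entry, logical, cpu_mask_to_apicid(mask));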

The exception of high priority interrupts is also why for the moment
_bind_irq_vector() is left as is, despite looking wrong: It's used
exclusively for IRQ0, which isn't supposed to move off CPU0 at any time.

The prior lack of restricting to online CPUs in set_desc_affinity()
before calling cpu_mask_to_apicid() in particular allowed (in x2APIC
clustered mode) offlined CPUs to end up enabled in an IRQ's destination
field. (I wonder whether vector_allocation_cpumask_flat() shouldn't
follow a similar model, using cpu_present_map in favor of
cpu_online_map.)
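
(For reference, a sketch only of what that would mean - not part of this
patch, and assuming the flat variant currently just returns the online map:

    const cpumask_t *vector_allocation_cpumask_flat(int cpu)
    {
        return &cpu_present_map; /* instead of &cpu_online_map */
    }
)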

For IO-APIC code it was definitely wrong to potentially store, as a
fallback, TARGET_CPUS (i.e. all online ones) into the field, as that
would have caused problems when determining on which CPUs to release
vectors when they've gone out of use. Disable interrupts instead when
no valid target CPU can be established (which code elsewhere should
guarantee to never happen), and log a message in such an unlikely event.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

Comments

Roger Pau Monné May 13, 2019, 11:32 a.m. UTC | #1
On Wed, May 08, 2019 at 07:10:29AM -0600, Jan Beulich wrote:
> Mixed meaning was implied so far by different pieces of code -
> disagreement was in particular about whether to expect offline CPUs'
> bits to possibly be set. Switch to a mostly consistent meaning
> (exception being high priority interrupts, which would perhaps better
> be switched to the same model as well in due course). Use the field to
> record the vector allocation mask, i.e. potentially including bits of
> offline (parked) CPUs. This implies that before passing the mask to
> certain functions (most notably cpu_mask_to_apicid()) it needs to be
> further reduced to the online subset.
> 
> The exception of high priority interrupts is also why for the moment
> _bind_irq_vector() is left as is, despite looking wrong: It's used
> exclusively for IRQ0, which isn't supposed to move off CPU0 at any time.
> 
> The prior lack of restricting to online CPUs in set_desc_affinity()
> before calling cpu_mask_to_apicid() in particular allowed (in x2APIC
> clustered mode) offlined CPUs to end up enabled in an IRQ's destination
> field. (I wonder whether vector_allocation_cpumask_flat() shouldn't
> follow a similar model, using cpu_present_map in favor of
> cpu_online_map.)
> 
> For IO-APIC code it was definitely wrong to potentially store, as a
> fallback, TARGET_CPUS (i.e. all online ones) into the field, as that
> would have caused problems when determining on which CPUs to release
> vectors when they've gone out of use. Disable interrupts instead when
> no valid target CPU can be established (which code elsewhere should
> guarantee to never happen), and log a message in such an unlikely event.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Thanks.

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Some comments below.

> ---
> v2: New.
> 
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -680,7 +680,7 @@ void /*__init*/ setup_ioapic_dest(void)
>                  continue;
>              irq = pin_2_irq(irq_entry, ioapic, pin);
>              desc = irq_to_desc(irq);
> -            BUG_ON(cpumask_empty(desc->arch.cpu_mask));
> +            BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));

I wonder if maybe you could instead do:

if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) )
    set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
else
    ASSERT_UNREACHABLE();

I guess if the IRQ is in use by Xen itself the failure ought to be
fatal.

>              set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
>          }
>  
> @@ -2197,7 +2197,6 @@ int io_apic_set_pci_routing (int ioapic,
>  {
>      struct irq_desc *desc = irq_to_desc(irq);
>      struct IO_APIC_route_entry entry;
> -    cpumask_t mask;
>      unsigned long flags;
>      int vector;
>  
> @@ -2232,11 +2231,17 @@ int io_apic_set_pci_routing (int ioapic,
>          return vector;
>      entry.vector = vector;
>  
> -    cpumask_copy(&mask, TARGET_CPUS);
> -    /* Don't chance ending up with an empty mask. */
> -    if (cpumask_intersects(&mask, desc->arch.cpu_mask))
> -        cpumask_and(&mask, &mask, desc->arch.cpu_mask);
> -    SET_DEST(entry, logical, cpu_mask_to_apicid(&mask));
> +    if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
> +        cpumask_t *mask = this_cpu(scratch_cpumask);
> +
> +        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
> +        SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
> +    } else {
> +        printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
> +               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
> +               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
> +        desc->status |= IRQ_DISABLED;
> +    }

Hm, part of this file doesn't seem to use Xen coding style, but the
chunk you add below does use it. And there are functions (like
mask_and_ack_level_ioapic_irq) that seem to use a mix of coding
styles.

I'm not sure what the policy is here: should new chunks follow Xen's
coding style?

>  
>      apic_printk(APIC_DEBUG, KERN_DEBUG "IOAPIC[%d]: Set PCI routing entry "
>  		"(%d-%d -> %#x -> IRQ %d Mode:%i Active:%i)\n", ioapic,
> @@ -2422,7 +2427,21 @@ int ioapic_guest_write(unsigned long phy
>      /* Set the vector field to the real vector! */
>      rte.vector = desc->arch.vector;
>  
> -    SET_DEST(rte, logical, cpu_mask_to_apicid(desc->arch.cpu_mask));
> +    if ( cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS) )
> +    {
> +        cpumask_t *mask = this_cpu(scratch_cpumask);
> +
> +        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
> +        SET_DEST(rte, logical, cpu_mask_to_apicid(mask));
> +    }
> +    else
> +    {
> +        gprintk(XENLOG_ERR, "IRQ%d: no target CPU (%*pb vs %*pb)\n",
> +               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
> +               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
> +        desc->status |= IRQ_DISABLED;
> +        rte.mask = 1;
> +    }
>  
>      __ioapic_write_entry(apic, pin, 0, rte);
>      
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -471,11 +471,13 @@ static int __assign_irq_vector(
>       */
>      static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
>      int cpu, err, old_vector;
> -    cpumask_t tmp_mask;
>      vmask_t *irq_used_vectors = NULL;
>  
>      old_vector = irq_to_vector(irq);
> -    if (old_vector > 0) {
> +    if ( old_vector > 0 )

Another candidate to switch to valid_irq_vector or at least make an
explicit comparison with IRQ_VECTOR_UNASSIGNED.

Seeing your reply to my comment in that direction on a previous patch,
this can be done as a follow-up.
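
For illustration, the suggested spellings would be (a sketch only, with
valid_irq_vector() being the helper introduced earlier in this series):

    if ( valid_irq_vector(old_vector) )

or, making the sentinel explicit:

    if ( old_vector != IRQ_VECTOR_UNASSIGNED )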

Roger.
Jan Beulich May 13, 2019, 3:21 p.m. UTC | #2
>>> On 13.05.19 at 13:32, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 07:10:29AM -0600, Jan Beulich wrote:
>> --- a/xen/arch/x86/io_apic.c
>> +++ b/xen/arch/x86/io_apic.c
>> @@ -680,7 +680,7 @@ void /*__init*/ setup_ioapic_dest(void)
>>                  continue;
>>              irq = pin_2_irq(irq_entry, ioapic, pin);
>>              desc = irq_to_desc(irq);
>> -            BUG_ON(cpumask_empty(desc->arch.cpu_mask));
>> +            BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));
> 
> I wonder if maybe you could instead do:
> 
> if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) )
>     set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
> else
>     ASSERT_UNREACHABLE();
> 
> I guess if the IRQ is in use by Xen itself the failure ought to be
> fatal.

And imo also when it's another one (used by Dom0). Iirc we get
here only during Dom0 boot (the commented-out __init serving as
a hint). Hence I think BUG_ON() is better in this case than any
form of assertion.

>> @@ -2232,11 +2231,17 @@ int io_apic_set_pci_routing (int ioapic,
>>          return vector;
>>      entry.vector = vector;
>>  
>> -    cpumask_copy(&mask, TARGET_CPUS);
>> -    /* Don't chance ending up with an empty mask. */
>> -    if (cpumask_intersects(&mask, desc->arch.cpu_mask))
>> -        cpumask_and(&mask, &mask, desc->arch.cpu_mask);
>> -    SET_DEST(entry, logical, cpu_mask_to_apicid(&mask));
>> +    if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
>> +        cpumask_t *mask = this_cpu(scratch_cpumask);
>> +
>> +        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
>> +        SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
>> +    } else {
>> +        printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
>> +               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
>> +               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
>> +        desc->status |= IRQ_DISABLED;
>> +    }
> 
> Hm, part of this file doesn't seem to use Xen coding style, but the
> chunk you add below does use it. And there are functions (like
> mask_and_ack_level_ioapic_irq) that seem to use a mix of coding
> styles.
> 
> I'm not sure what the policy is here: should new chunks follow Xen's
> coding style?

Well, I've decided to match the surrounding code's style, until the file
gets morphed into a consistent shape.

Jan

Patch

--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -680,7 +680,7 @@  void /*__init*/ setup_ioapic_dest(void)
                 continue;
             irq = pin_2_irq(irq_entry, ioapic, pin);
             desc = irq_to_desc(irq);
-            BUG_ON(cpumask_empty(desc->arch.cpu_mask));
+            BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));
             set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
         }
 
@@ -2197,7 +2197,6 @@  int io_apic_set_pci_routing (int ioapic,
 {
     struct irq_desc *desc = irq_to_desc(irq);
     struct IO_APIC_route_entry entry;
-    cpumask_t mask;
     unsigned long flags;
     int vector;
 
@@ -2232,11 +2231,17 @@  int io_apic_set_pci_routing (int ioapic,
         return vector;
     entry.vector = vector;
 
-    cpumask_copy(&mask, TARGET_CPUS);
-    /* Don't chance ending up with an empty mask. */
-    if (cpumask_intersects(&mask, desc->arch.cpu_mask))
-        cpumask_and(&mask, &mask, desc->arch.cpu_mask);
-    SET_DEST(entry, logical, cpu_mask_to_apicid(&mask));
+    if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
+        cpumask_t *mask = this_cpu(scratch_cpumask);
+
+        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
+        SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
+    } else {
+        printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
+               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
+               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
+        desc->status |= IRQ_DISABLED;
+    }
 
     apic_printk(APIC_DEBUG, KERN_DEBUG "IOAPIC[%d]: Set PCI routing entry "
 		"(%d-%d -> %#x -> IRQ %d Mode:%i Active:%i)\n", ioapic,
@@ -2422,7 +2427,21 @@  int ioapic_guest_write(unsigned long phy
     /* Set the vector field to the real vector! */
     rte.vector = desc->arch.vector;
 
-    SET_DEST(rte, logical, cpu_mask_to_apicid(desc->arch.cpu_mask));
+    if ( cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS) )
+    {
+        cpumask_t *mask = this_cpu(scratch_cpumask);
+
+        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
+        SET_DEST(rte, logical, cpu_mask_to_apicid(mask));
+    }
+    else
+    {
+        gprintk(XENLOG_ERR, "IRQ%d: no target CPU (%*pb vs %*pb)\n",
+               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
+               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
+        desc->status |= IRQ_DISABLED;
+        rte.mask = 1;
+    }
 
     __ioapic_write_entry(apic, pin, 0, rte);
     
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -471,11 +471,13 @@  static int __assign_irq_vector(
      */
     static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
     int cpu, err, old_vector;
-    cpumask_t tmp_mask;
     vmask_t *irq_used_vectors = NULL;
 
     old_vector = irq_to_vector(irq);
-    if (old_vector > 0) {
+    if ( old_vector > 0 )
+    {
+        cpumask_t tmp_mask;
+
         cpumask_and(&tmp_mask, mask, &cpu_online_map);
         if (cpumask_intersects(&tmp_mask, desc->arch.cpu_mask)) {
             desc->arch.vector = old_vector;
@@ -498,7 +500,9 @@  static int __assign_irq_vector(
     else
         irq_used_vectors = irq_get_used_vector_mask(irq);
 
-    for_each_cpu(cpu, mask) {
+    for_each_cpu(cpu, mask)
+    {
+        const cpumask_t *vec_mask;
         int new_cpu;
         int vector, offset;
 
@@ -506,8 +510,7 @@  static int __assign_irq_vector(
         if (!cpu_online(cpu))
             continue;
 
-        cpumask_and(&tmp_mask, vector_allocation_cpumask(cpu),
-                    &cpu_online_map);
+        vec_mask = vector_allocation_cpumask(cpu);
 
         vector = current_vector;
         offset = current_offset;
@@ -528,7 +531,7 @@  next:
             && test_bit(vector, irq_used_vectors) )
             goto next;
 
-        for_each_cpu(new_cpu, &tmp_mask)
+        for_each_cpu(new_cpu, vec_mask)
             if (per_cpu(vector_irq, new_cpu)[vector] >= 0)
                 goto next;
         /* Found one! */
@@ -547,12 +550,12 @@  next:
                 release_old_vec(desc);
         }
 
-        trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask);
+        trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, vec_mask);
 
-        for_each_cpu(new_cpu, &tmp_mask)
+        for_each_cpu(new_cpu, vec_mask)
             per_cpu(vector_irq, new_cpu)[vector] = irq;
         desc->arch.vector = vector;
-        cpumask_copy(desc->arch.cpu_mask, &tmp_mask);
+        cpumask_copy(desc->arch.cpu_mask, vec_mask);
 
         desc->arch.used = IRQ_USED;
         ASSERT((desc->arch.used_vectors == NULL)
@@ -783,6 +786,7 @@  unsigned int set_desc_affinity(struct ir
 
     cpumask_copy(desc->affinity, mask);
     cpumask_and(&dest_mask, mask, desc->arch.cpu_mask);
+    cpumask_and(&dest_mask, &dest_mask, &cpu_online_map);
 
     return cpu_mask_to_apicid(&dest_mask);
 }
--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -32,6 +32,12 @@  struct irq_desc;
 struct arch_irq_desc {
         s16 vector;                  /* vector itself is only 8 bits, */
         s16 old_vector;              /* but we use -1 for unassigned  */
+        /*
+         * Except for high priority interrupts @cpu_mask may have bits set for
+         * offline CPUs.  Consumers need to be careful to mask this down to
+         * online ones as necessary.  There is supposed to always be a non-
+         * empty intersection with cpu_online_map.
+         */
         cpumask_var_t cpu_mask;
         cpumask_var_t old_cpu_mask;
         cpumask_var_t pending_mask;