[RFC,1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs()

Message ID 5CC6DE8F0200007800229E9A@prv1-mh.provo.novell.com (mailing list archive)
State Superseded
Series x86: IRQ management adjustments

Commit Message

Jan Beulich April 29, 2019, 11:22 a.m. UTC
The flag being set may prevent affinity changes, as these often imply
assignment of a new vector. When there's no possible destination left
for the IRQ, the clearing of the flag needs to happen right from
fixup_irqs().
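That fixup-time decision can be sketched minimally as follows (plain 64-bit masks and a stripped-down state struct stand in for Xen's cpumask_t and irq_desc; all names here are illustrative, not Xen's actual API):

```c
#include <stdint.h>

/* Illustrative stand-ins for Xen's cpumask_t and irq_desc state. */
typedef uint64_t cpumask_t;

struct irq_state {
    cpumask_t cpu_mask;      /* CPUs the current vector targets */
    int move_in_progress;
    int old_vector_released; /* stands in for release_old_vec() */
};

/*
 * If a move is in progress but the surviving CPUs ("mask") no longer
 * intersect the IRQ's current target set, nothing can ever complete
 * the move, so release the old vector and clear the flag right here,
 * allowing the subsequent vector assignment to succeed.
 */
static void fixup_clear_move(struct irq_state *s, cpumask_t mask)
{
    if (s->move_in_progress && !(mask & s->cpu_mask)) {
        s->old_vector_released = 1; /* release_old_vec() equivalent */
        s->move_in_progress = 0;
    }
}
```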

Additionally _assign_irq_vector() needs to avoid setting the flag when
there's no online CPU left in what gets put into ->arch.old_cpu_mask.
The old vector can be released right away in this case.
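The adjusted _assign_irq_vector() behaviour amounts to the following minimal sketch (plain 64-bit masks standing in for cpumask_t; all names are illustrative, not Xen's actual API):

```c
#include <stdint.h>

/* Illustrative stand-ins for Xen's cpumask_t and irq_desc state. */
typedef uint64_t cpumask_t;

struct irq_state {
    cpumask_t cpu_mask;      /* CPUs the outgoing vector targets */
    cpumask_t old_cpu_mask;  /* CPUs still to be moved away from */
    int move_in_progress;
    int old_vector_released; /* stands in for release_old_vec() */
};

/*
 * On reassignment, record only the still-online CPUs of the outgoing
 * vector. If none survive, there is nothing to move away from, so the
 * old vector can be released right away instead of setting the flag.
 */
static void start_move(struct irq_state *s, cpumask_t online)
{
    s->old_cpu_mask = s->cpu_mask & online;
    if (s->old_cpu_mask)
        s->move_in_progress = 1;
    else
        s->old_vector_released = 1; /* release_old_vec() equivalent */
}
```

The empty-mask branch corresponds to the "/* This can happen while offlining a CPU. */" path in the patch below.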

Also extend the log message about broken affinity to include the new
affinity as well, allowing one to notice when an affinity change did
not actually take place. Swap the if/else-if order there at the same
time to reduce the number of conditions checked.

At the same time replace two open-coded instances with calls to the
new helper function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
RFC: I've seen the new ASSERT() in irq_move_cleanup_interrupt() trigger.
     I'm pretty sure that this assertion triggering means something else
     is wrong, and has been even prior to this change (adding the
     assertion without any of the other changes here should be valid in
     my understanding).

Comments

Jan Beulich April 29, 2019, 12:55 p.m. UTC | #1
>>> On 29.04.19 at 13:22, <JBeulich@suse.com> wrote:
> RFC: I've seen the new ASSERT() in irq_move_cleanup_interrupt() trigger.
>      I'm pretty sure that this assertion triggering means something else
>      is wrong, and has been even prior to this change (adding the
>      assertion without any of the other changes here should be valid in
>      my understanding).

So I think what is missing is updating of vector_irq ...

> @@ -2391,6 +2401,24 @@ void fixup_irqs(const cpumask_t *mask, b
>              continue;
>          }
>  
> +        /*
> +         * In order for the affinity adjustment below to be successful, we
> +         * need __assign_irq_vector() to succeed. This in particular means
> +         * clearing desc->arch.move_in_progress if this would otherwise
> +         * prevent the function from succeeding. Since there's no way for the
> +         * flag to get cleared anymore when there's no possible destination
> +         * left (the only possibility then would be the IRQs enabled window
> +         * after this loop), there's then also no race with us doing it here.
> +         *
> +         * Therefore the logic here and there need to remain in sync.
> +         */
> +        if ( desc->arch.move_in_progress &&
> +             !cpumask_intersects(mask, desc->arch.cpu_mask) )
> +        {
> +            release_old_vec(desc);
> +            desc->arch.move_in_progress = 0;
> +        }

... here and in the somewhat similar logic that patch 2 inserts a few
lines further up. I'm about to try this out, but given how rarely I've
seen the problem, it will take a while before I can feel confident
(if, of course, this helps in the first place).

Jan
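For reference, the vector_irq updating mentioned above mirrors what __clear_irq_vector() already does before releasing the old vector: the per-CPU entries for that vector are invalidated by storing ~irq. A minimal standalone sketch (fixed-size arrays stand in for Xen's per-CPU vector_irq tables; names are illustrative):

```c
#include <stdint.h>

#define NR_CPUS    4
#define NR_VECTORS 256

/* Stand-in for Xen's per_cpu(vector_irq, cpu) tables. */
static int vector_irq[NR_CPUS][NR_VECTORS];

/*
 * Invalidate the old vector's per-CPU entries, storing ~irq (as
 * __clear_irq_vector() does), so that a late interrupt arriving on
 * that vector is recognisable as stale rather than mapping to a
 * live IRQ.
 */
static void invalidate_old_vector(uint64_t old_cpu_mask,
                                  unsigned int old_vector, int irq)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        if (old_cpu_mask & (1ULL << cpu))
            vector_irq[cpu][old_vector] = ~irq;
}
```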
Jan Beulich April 29, 2019, 1:08 p.m. UTC | #2
>>> On 29.04.19 at 14:55, <JBeulich@suse.com> wrote:
>>>> On 29.04.19 at 13:22, <JBeulich@suse.com> wrote:
>> RFC: I've seen the new ASSERT() in irq_move_cleanup_interrupt() trigger.
>>      I'm pretty sure that this assertion triggering means something else
>>      is wrong, and has been even prior to this change (adding the
>>      assertion without any of the other changes here should be valid in
>>      my understanding).
> 
> So I think what is missing is updating of vector_irq ...
> 
>> @@ -2391,6 +2401,24 @@ void fixup_irqs(const cpumask_t *mask, b
>>              continue;
>>          }
>>  
>> +        /*
>> +         * In order for the affinity adjustment below to be successful, we
>> +         * need __assign_irq_vector() to succeed. This in particular means
>> +         * clearing desc->arch.move_in_progress if this would otherwise
>> +         * prevent the function from succeeding. Since there's no way for the
>> +         * flag to get cleared anymore when there's no possible destination
>> +         * left (the only possibility then would be the IRQs enabled window
>> +         * after this loop), there's then also no race with us doing it here.
>> +         *
>> +         * Therefore the logic here and there need to remain in sync.
>> +         */
>> +        if ( desc->arch.move_in_progress &&
>> +             !cpumask_intersects(mask, desc->arch.cpu_mask) )
>> +        {
>> +            release_old_vec(desc);
>> +            desc->arch.move_in_progress = 0;
>> +        }
> 
> ... here and in the somewhat similar logic that patch 2 inserts a few
> lines further up. I'm about to try this out, but given how rarely I've
> seen the problem, it will take a while before I can feel confident
> (if, of course, this helps in the first place).

Actually no, the 2nd patch doesn't need any change - the code
added there only deals with CPUs already marked offline.

Jan
Patch

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -242,6 +242,20 @@  void destroy_irq(unsigned int irq)
     xfree(action);
 }
 
+static void release_old_vec(struct irq_desc *desc)
+{
+    unsigned int vector = desc->arch.old_vector;
+
+    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
+    cpumask_clear(desc->arch.old_cpu_mask);
+
+    if ( desc->arch.used_vectors )
+    {
+        ASSERT(test_bit(vector, desc->arch.used_vectors));
+        clear_bit(vector, desc->arch.used_vectors);
+    }
+}
+
 static void __clear_irq_vector(int irq)
 {
     int cpu, vector, old_vector;
@@ -285,14 +299,7 @@  static void __clear_irq_vector(int irq)
         per_cpu(vector_irq, cpu)[old_vector] = ~irq;
     }
 
-    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
-    cpumask_clear(desc->arch.old_cpu_mask);
-
-    if ( desc->arch.used_vectors )
-    {
-        ASSERT(test_bit(old_vector, desc->arch.used_vectors));
-        clear_bit(old_vector, desc->arch.used_vectors);
-    }
+    release_old_vec(desc);
 
     desc->arch.move_in_progress = 0;
 }
@@ -517,12 +524,21 @@  next:
         /* Found one! */
         current_vector = vector;
         current_offset = offset;
-        if (old_vector > 0) {
-            desc->arch.move_in_progress = 1;
-            cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask);
+
+        if ( old_vector > 0 )
+        {
+            cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask,
+                        &cpu_online_map);
             desc->arch.old_vector = desc->arch.vector;
+            if ( !cpumask_empty(desc->arch.old_cpu_mask) )
+                desc->arch.move_in_progress = 1;
+            else
+                /* This can happen while offlining a CPU. */
+                release_old_vec(desc);
         }
+
         trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask);
+
         for_each_cpu(new_cpu, &tmp_mask)
             per_cpu(vector_irq, new_cpu)[vector] = irq;
         desc->arch.vector = vector;
@@ -691,14 +707,8 @@  void irq_move_cleanup_interrupt(struct c
 
         if ( desc->arch.move_cleanup_count == 0 )
         {
-            desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
-            cpumask_clear(desc->arch.old_cpu_mask);
-
-            if ( desc->arch.used_vectors )
-            {
-                ASSERT(test_bit(vector, desc->arch.used_vectors));
-                clear_bit(vector, desc->arch.used_vectors);
-            }
+            ASSERT(vector == desc->arch.old_vector);
+            release_old_vec(desc);
         }
 unlock:
         spin_unlock(&desc->lock);
@@ -2391,6 +2401,24 @@  void fixup_irqs(const cpumask_t *mask, b
             continue;
         }
 
+        /*
+         * In order for the affinity adjustment below to be successful, we
+         * need __assign_irq_vector() to succeed. This in particular means
+         * clearing desc->arch.move_in_progress if this would otherwise
+         * prevent the function from succeeding. Since there's no way for the
+         * flag to get cleared anymore when there's no possible destination
+         * left (the only possibility then would be the IRQs enabled window
+         * after this loop), there's then also no race with us doing it here.
+         *
+         * Therefore the logic here and there need to remain in sync.
+         */
+        if ( desc->arch.move_in_progress &&
+             !cpumask_intersects(mask, desc->arch.cpu_mask) )
+        {
+            release_old_vec(desc);
+            desc->arch.move_in_progress = 0;
+        }
+
         cpumask_and(&affinity, &affinity, mask);
         if ( cpumask_empty(&affinity) )
         {
@@ -2409,15 +2437,18 @@  void fixup_irqs(const cpumask_t *mask, b
         if ( desc->handler->enable )
             desc->handler->enable(desc);
 
+        cpumask_copy(&affinity, desc->affinity);
+
         spin_unlock(&desc->lock);
 
         if ( !verbose )
             continue;
 
-        if ( break_affinity && set_affinity )
-            printk("Broke affinity for irq %i\n", irq);
-        else if ( !set_affinity )
-            printk("Cannot set affinity for irq %i\n", irq);
+        if ( !set_affinity )
+            printk("Cannot set affinity for IRQ%u\n", irq);
+        else if ( break_affinity )
+            printk("Broke affinity for IRQ%u, new: %*pb\n",
+                   irq, nr_cpu_ids, &affinity);
     }
 
     /* That doesn't seem sufficient.  Give it 1ms. */