diff mbox series

[for-4.19,4/9] x86/irq: describe how the interrupt CPU movement works

Message ID 20240529090132.59434-5-roger.pau@citrix.com (mailing list archive)
State Superseded
Series x86/irq: fixes for CPU hot{,un}plug

Commit Message

Roger Pau Monne May 29, 2024, 9:01 a.m. UTC
The logic to move interrupts across CPUs is complex; attempt to provide a
comment that describes the expected behavior, so users of the interrupt system
have more context about the usage of the arch_irq_desc structure fields.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/include/asm/irq.h | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

Comments

Jan Beulich May 29, 2024, 1:57 p.m. UTC | #1
On 29.05.2024 11:01, Roger Pau Monne wrote:
> --- a/xen/arch/x86/include/asm/irq.h
> +++ b/xen/arch/x86/include/asm/irq.h
> @@ -28,6 +28,32 @@ typedef struct {
>  
>  struct irq_desc;
>  
> +/*
> + * Xen logic for moving interrupts around CPUs allows manipulating interrupts
> + * that target remote CPUs.  The logic to move an interrupt from CPU(s) is as
> + * follows:
> + *
> + * 1. cpu_mask and vector is copied to old_cpu_mask and old_vector.
> + * 2. New cpu_mask and vector are set, vector is setup at the new destination.
> + * 3. move_in_progress is set.
> + * 4. Interrupt source is updated to target new CPU and vector.
> + * 5. Interrupts arriving at old_cpu_mask are processed normally.
> + * 6. When an interrupt is delivered at the new destination (cpu_mask) as part
> + *    of acking the interrupt move_in_progress is cleared and move_cleanup_count

Nit: A comma after "interrupt" may help reading.

> + *    is set to the weight of online CPUs in old_cpu_mask.
> + *    IRQ_MOVE_CLEANUP_VECTOR is sent to all CPUs in old_cpu_mask.

These last two steps aren't precise enough, compared to what the code does.
old_cpu_mask is first reduced to online CPUs therein. If the result is non-
empty, what you describe is done. If, however, the result is empty, the
vector is released right away (this code may be there just in case, but I
think it shouldn't be omitted here).

> + * 7. When receiving IRQ_MOVE_CLEANUP_VECTOR CPUs in old_cpu_mask clean the
> + *    vector entry and decrease the count in move_cleanup_count.  The CPU that
> + *    sets move_cleanup_count to 0 releases the vector.
> + *
> + * Note that when interrupt movement (either move_in_progress or
> + * move_cleanup_count set) is in progress it's not possible to move the
> + * interrupt to yet a different CPU.
> + *
> + * By keeping the vector in the old CPU(s) configured until the interrupt is
> + * acked on the new destination Xen allows draining any pending interrupts at
> + * the old destinations.
> + */
>  struct arch_irq_desc {
>          s16 vector;                  /* vector itself is only 8 bits, */
>          s16 old_vector;              /* but we use -1 for unassigned  */

I take it that it is not a goal to (also) describe under what conditions
an IRQ move may actually be initiated (IRQ_MOVE_PENDING)? I ask not the
least because the 2nd from last paragraph lightly touches that area.

Jan
Roger Pau Monne May 29, 2024, 3:28 p.m. UTC | #2
On Wed, May 29, 2024 at 03:57:19PM +0200, Jan Beulich wrote:
> On 29.05.2024 11:01, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/include/asm/irq.h
> > +++ b/xen/arch/x86/include/asm/irq.h
> > @@ -28,6 +28,32 @@ typedef struct {
> >  
> >  struct irq_desc;
> >  
> > +/*
> > + * Xen logic for moving interrupts around CPUs allows manipulating interrupts
> > + * that target remote CPUs.  The logic to move an interrupt from CPU(s) is as
> > + * follows:
> > + *
> > + * 1. cpu_mask and vector is copied to old_cpu_mask and old_vector.
> > + * 2. New cpu_mask and vector are set, vector is setup at the new destination.
> > + * 3. move_in_progress is set.
> > + * 4. Interrupt source is updated to target new CPU and vector.
> > + * 5. Interrupts arriving at old_cpu_mask are processed normally.
> > + * 6. When an interrupt is delivered at the new destination (cpu_mask) as part
> > + *    of acking the interrupt move_in_progress is cleared and move_cleanup_count
> 
> Nit: A comma after "interrupt" may help reading.
> 
> > + *    is set to the weight of online CPUs in old_cpu_mask.
> > + *    IRQ_MOVE_CLEANUP_VECTOR is sent to all CPUs in old_cpu_mask.
> 
> These last two steps aren't precise enough, compared to what the code does.
> old_cpu_mask is first reduced to online CPUs therein. If the result is non-
> empty, what you describe is done. If, however, the result is empty, the
> vector is released right away (this code may be there just in case, but I
> think it shouldn't be omitted here).

I've left that out because I got the impression it made the text more
complex to follow (with the extra branch) for no real benefit, but I'm
happy to attempt to add it.

> 
> > + * 7. When receiving IRQ_MOVE_CLEANUP_VECTOR CPUs in old_cpu_mask clean the
> > + *    vector entry and decrease the count in move_cleanup_count.  The CPU that
> > + *    sets move_cleanup_count to 0 releases the vector.
> > + *
> > + * Note that when interrupt movement (either move_in_progress or
> > + * move_cleanup_count set) is in progress it's not possible to move the
> > + * interrupt to yet a different CPU.
> > + *
> > + * By keeping the vector in the old CPU(s) configured until the interrupt is
> > + * acked on the new destination Xen allows draining any pending interrupts at
> > + * the old destinations.
> > + */
> >  struct arch_irq_desc {
> >          s16 vector;                  /* vector itself is only 8 bits, */
> >          s16 old_vector;              /* but we use -1 for unassigned  */
> 
> I take it that it is not a goal to (also) describe under what conditions
> an IRQ move may actually be initiated (IRQ_MOVE_PENDING)? I ask not the
> least because the 2nd from last paragraph lightly touches that area.

Right, I was mostly focused on moves (forcefully) initiated from
fixup_irqs(), which is different from the opportunistic affinity
changes signaled by IRQ_MOVE_PENDING.

Not sure whether I want to mention this ahead of the list in a
paragraph, or just add it as a step.  Do you have any preference?

Thanks, Roger.
Jan Beulich May 31, 2024, 7:06 a.m. UTC | #3
On 29.05.2024 17:28, Roger Pau Monné wrote:
> On Wed, May 29, 2024 at 03:57:19PM +0200, Jan Beulich wrote:
>> On 29.05.2024 11:01, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/include/asm/irq.h
>>> +++ b/xen/arch/x86/include/asm/irq.h
>>> @@ -28,6 +28,32 @@ typedef struct {
>>>  
>>>  struct irq_desc;
>>>  
>>> +/*
>>> + * Xen logic for moving interrupts around CPUs allows manipulating interrupts
>>> + * that target remote CPUs.  The logic to move an interrupt from CPU(s) is as
>>> + * follows:
>>> + *
>>> + * 1. cpu_mask and vector is copied to old_cpu_mask and old_vector.
>>> + * 2. New cpu_mask and vector are set, vector is setup at the new destination.
>>> + * 3. move_in_progress is set.
>>> + * 4. Interrupt source is updated to target new CPU and vector.
>>> + * 5. Interrupts arriving at old_cpu_mask are processed normally.
>>> + * 6. When an interrupt is delivered at the new destination (cpu_mask) as part
>>> + *    of acking the interrupt move_in_progress is cleared and move_cleanup_count
>>
>> Nit: A comma after "interrupt" may help reading.
>>
>>> + *    is set to the weight of online CPUs in old_cpu_mask.
>>> + *    IRQ_MOVE_CLEANUP_VECTOR is sent to all CPUs in old_cpu_mask.
>>
>> These last two steps aren't precise enough, compared to what the code does.
>> old_cpu_mask is first reduced to online CPUs therein. If the result is non-
>> empty, what you describe is done. If, however, the result is empty, the
>> vector is released right away (this code may be there just in case, but I
>> think it shouldn't be omitted here).
> 
> I've left that out because I got the impression it made the text more
> complex to follow (with the extra branch) for no real benefit, but I'm
> happy to attempt to add it.

Why "no real benefit"? Isn't the goal to accurately describe what code does
(in various places)? If the result isn't an accurate description in one
specific regard, how reliable would the rest be from a reader's perspective?

>>> + * 7. When receiving IRQ_MOVE_CLEANUP_VECTOR CPUs in old_cpu_mask clean the
>>> + *    vector entry and decrease the count in move_cleanup_count.  The CPU that
>>> + *    sets move_cleanup_count to 0 releases the vector.
>>> + *
>>> + * Note that when interrupt movement (either move_in_progress or
>>> + * move_cleanup_count set) is in progress it's not possible to move the
>>> + * interrupt to yet a different CPU.
>>> + *
>>> + * By keeping the vector in the old CPU(s) configured until the interrupt is
>>> + * acked on the new destination Xen allows draining any pending interrupts at
>>> + * the old destinations.
>>> + */
>>>  struct arch_irq_desc {
>>>          s16 vector;                  /* vector itself is only 8 bits, */
>>>          s16 old_vector;              /* but we use -1 for unassigned  */
>>
>> I take it that it is not a goal to (also) describe under what conditions
>> an IRQ move may actually be initiated (IRQ_MOVE_PENDING)? I ask not the
>> least because the 2nd from last paragraph lightly touches that area.
> 
> Right, I was mostly focused on moves (forcefully) initiated from
> fixup_irqs(), which is different from the opportunistic affinity
> changes signaled by IRQ_MOVE_PENDING.
> 
> Not sure whether I want to mention this ahead of the list in a
> paragraph, or just add it as a step.  Do you have any preference?

I think ahead might be better. But I also won't insist on it being added.
Just if you don't, perhaps mention in the description that leaving that
out is intentional.

Jan
Roger Pau Monne May 31, 2024, 7:39 a.m. UTC | #4
On Fri, May 31, 2024 at 09:06:10AM +0200, Jan Beulich wrote:
> On 29.05.2024 17:28, Roger Pau Monné wrote:
> > On Wed, May 29, 2024 at 03:57:19PM +0200, Jan Beulich wrote:
> >> On 29.05.2024 11:01, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/include/asm/irq.h
> >>> +++ b/xen/arch/x86/include/asm/irq.h
> >>> @@ -28,6 +28,32 @@ typedef struct {
> >>>  
> >>>  struct irq_desc;
> >>>  
> >>> +/*
> >>> + * Xen logic for moving interrupts around CPUs allows manipulating interrupts
> >>> + * that target remote CPUs.  The logic to move an interrupt from CPU(s) is as
> >>> + * follows:
> >>> + *
> >>> + * 1. cpu_mask and vector is copied to old_cpu_mask and old_vector.
> >>> + * 2. New cpu_mask and vector are set, vector is setup at the new destination.
> >>> + * 3. move_in_progress is set.
> >>> + * 4. Interrupt source is updated to target new CPU and vector.
> >>> + * 5. Interrupts arriving at old_cpu_mask are processed normally.
> >>> + * 6. When an interrupt is delivered at the new destination (cpu_mask) as part
> >>> + *    of acking the interrupt move_in_progress is cleared and move_cleanup_count
> >>
> >> Nit: A comma after "interrupt" may help reading.
> >>
> >>> + *    is set to the weight of online CPUs in old_cpu_mask.
> >>> + *    IRQ_MOVE_CLEANUP_VECTOR is sent to all CPUs in old_cpu_mask.
> >>
> >> These last two steps aren't precise enough, compared to what the code does.
> >> old_cpu_mask is first reduced to online CPUs therein. If the result is non-
> >> empty, what you describe is done. If, however, the result is empty, the
> >> vector is released right away (this code may be there just in case, but I
> >> think it shouldn't be omitted here).
> > 
> > I've left that out because I got the impression it made the text more
> > complex to follow (with the extra branch) for no real benefit, but I'm
> > happy to attempt to add it.
> 
> Why "no real benefit"? Isn't the goal to accurately describe what code does
> (in various places)? If the result isn't an accurate description in one
> specific regard, how reliable would the rest be from a reader's perspective?

FWIW, it seemed to me the reduction of old_cpu_mask was (kind of) a
shortcut to what the normal path does, by releasing the vector early
if there are no online CPUs in old_cpu_mask.

Now that you made me look into it, I think after this series the
old_cpu_mask should never contain offline CPUs, as fixup_irqs() will
take care of removing offline CPUs from old_cpu_mask, and freeing the
vector if the set becomes empty.

I will expand the comment to mention this case, and consider adjusting
it if this series gets merged.

> >>> + * 7. When receiving IRQ_MOVE_CLEANUP_VECTOR CPUs in old_cpu_mask clean the
> >>> + *    vector entry and decrease the count in move_cleanup_count.  The CPU that
> >>> + *    sets move_cleanup_count to 0 releases the vector.
> >>> + *
> >>> + * Note that when interrupt movement (either move_in_progress or
> >>> + * move_cleanup_count set) is in progress it's not possible to move the
> >>> + * interrupt to yet a different CPU.
> >>> + *
> >>> + * By keeping the vector in the old CPU(s) configured until the interrupt is
> >>> + * acked on the new destination Xen allows draining any pending interrupts at
> >>> + * the old destinations.
> >>> + */
> >>>  struct arch_irq_desc {
> >>>          s16 vector;                  /* vector itself is only 8 bits, */
> >>>          s16 old_vector;              /* but we use -1 for unassigned  */
> >>
> >> I take it that it is not a goal to (also) describe under what conditions
> >> an IRQ move may actually be initiated (IRQ_MOVE_PENDING)? I ask not the
> >> least because the 2nd from last paragraph lightly touches that area.
> > 
> > Right, I was mostly focused on moves (forcefully) initiated from
> > fixup_irqs(), which is different from the opportunistic affinity
> > changes signaled by IRQ_MOVE_PENDING.
> > 
> > Not sure whether I want to mention this ahead of the list in a
> > paragraph, or just add it as a step.  Do you have any preference?
> 
> I think ahead might be better. But I also won't insist on it being added.
> Just if you don't, perhaps mention in the description that leaving that
> out is intentional.

No, I'm fine with adding it.

Thanks, Roger.

Patch

diff --git a/xen/arch/x86/include/asm/irq.h b/xen/arch/x86/include/asm/irq.h
index 413994d2133b..80a3aa7a88b9 100644
--- a/xen/arch/x86/include/asm/irq.h
+++ b/xen/arch/x86/include/asm/irq.h
@@ -28,6 +28,32 @@  typedef struct {
 
 struct irq_desc;
 
+/*
+ * Xen logic for moving interrupts around CPUs allows manipulating interrupts
+ * that target remote CPUs.  The logic to move an interrupt from CPU(s) is as
+ * follows:
+ *
+ * 1. cpu_mask and vector is copied to old_cpu_mask and old_vector.
+ * 2. New cpu_mask and vector are set, vector is setup at the new destination.
+ * 3. move_in_progress is set.
+ * 4. Interrupt source is updated to target new CPU and vector.
+ * 5. Interrupts arriving at old_cpu_mask are processed normally.
+ * 6. When an interrupt is delivered at the new destination (cpu_mask) as part
+ *    of acking the interrupt move_in_progress is cleared and move_cleanup_count
+ *    is set to the weight of online CPUs in old_cpu_mask.
+ *    IRQ_MOVE_CLEANUP_VECTOR is sent to all CPUs in old_cpu_mask.
+ * 7. When receiving IRQ_MOVE_CLEANUP_VECTOR CPUs in old_cpu_mask clean the
+ *    vector entry and decrease the count in move_cleanup_count.  The CPU that
+ *    sets move_cleanup_count to 0 releases the vector.
+ *
+ * Note that when interrupt movement (either move_in_progress or
+ * move_cleanup_count set) is in progress it's not possible to move the
+ * interrupt to yet a different CPU.
+ *
+ * By keeping the vector in the old CPU(s) configured until the interrupt is
+ * acked on the new destination Xen allows draining any pending interrupts at
+ * the old destinations.
+ */
 struct arch_irq_desc {
         s16 vector;                  /* vector itself is only 8 bits, */
         s16 old_vector;              /* but we use -1 for unassigned  */