[for-4.14,2/8] x86/hvm: don't force vCPU 0 for IRQ 0 when using fixed destination mode
diff mbox series

Message ID 20200612155640.4101-3-roger.pau@citrix.com
State New
Headers show
Series
  • x86/vpt: fixes for vpt and enable vPIT for PVH dom0
Related show

Commit Message

Roger Pau Monné June 12, 2020, 3:56 p.m. UTC
When the IO APIC pin mapped to the ISA IRQ 0 has been configured to
use fixed delivery mode do not forcefully route interrupts to vCPU 0,
as the OS might have setup those interrupts to be injected to a
different vCPU, and injecting to vCPU 0 can cause the OS to miss such
interrupts or errors to happen due to unexpected vectors being
injected on vCPU 0.

In order to fix remove such handling altogether for fixed destination
mode pins and just inject them according to the data setup in the
IO-APIC entry.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/vioapic.c | 23 ++++-------------------
 1 file changed, 4 insertions(+), 19 deletions(-)

Comments

Jan Beulich June 18, 2020, 1:43 p.m. UTC | #1
On 12.06.2020 17:56, Roger Pau Monne wrote:
> When the IO APIC pin mapped to the ISA IRQ 0 has been configured to
> use fixed delivery mode do not forcefully route interrupts to vCPU 0,
> as the OS might have setup those interrupts to be injected to a
> different vCPU, and injecting to vCPU 0 can cause the OS to miss such
> interrupts or errors to happen due to unexpected vectors being
> injected on vCPU 0.
> 
> In order to fix remove such handling altogether for fixed destination
> mode pins and just inject them according to the data setup in the
> IO-APIC entry.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Technically
Reviewed-by: Jan Beulich <jbeulich@suse.com>

I wonder though why this was done in the first place - it very much
feels like a workaround for certain guest behavior, and hence
getting rid of it may mean a certain risk of regressions. Not a
very good point in time to make risky changes ...

Jan
Roger Pau Monné June 18, 2020, 1:48 p.m. UTC | #2
On Thu, Jun 18, 2020 at 03:43:00PM +0200, Jan Beulich wrote:
> On 12.06.2020 17:56, Roger Pau Monne wrote:
> > When the IO APIC pin mapped to the ISA IRQ 0 has been configured to
> > use fixed delivery mode do not forcefully route interrupts to vCPU 0,
> > as the OS might have setup those interrupts to be injected to a
> > different vCPU, and injecting to vCPU 0 can cause the OS to miss such
> > interrupts or errors to happen due to unexpected vectors being
> > injected on vCPU 0.
> > 
> > In order to fix remove such handling altogether for fixed destination
> > mode pins and just inject them according to the data setup in the
> > IO-APIC entry.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Technically
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> I wonder though why this was done in the first place - it very much
> feels like a workaround for certain guest behavior, and hence
> getting rid of it may mean a certain risk of regressions. Not a
> very good point in time to make risky changes ...

We can defer to after the release I guess, but I will still ask for
the changes to be backported.

What we currently do is broken, up to the point that FreeBSD cannot
use the PIT because it will likely route the interrupt to a vCPU != 0
in fixed mode, and then it will just get stuck because the vector is
delivered to vCPU 0 where it's not even configured.

Roger.
Jan Beulich June 18, 2020, 2:08 p.m. UTC | #3
On 18.06.2020 15:48, Roger Pau Monné wrote:
> On Thu, Jun 18, 2020 at 03:43:00PM +0200, Jan Beulich wrote:
>> On 12.06.2020 17:56, Roger Pau Monne wrote:
>>> When the IO APIC pin mapped to the ISA IRQ 0 has been configured to
>>> use fixed delivery mode do not forcefully route interrupts to vCPU 0,
>>> as the OS might have setup those interrupts to be injected to a
>>> different vCPU, and injecting to vCPU 0 can cause the OS to miss such
>>> interrupts or errors to happen due to unexpected vectors being
>>> injected on vCPU 0.
>>>
>>> In order to fix remove such handling altogether for fixed destination
>>> mode pins and just inject them according to the data setup in the
>>> IO-APIC entry.
>>>
>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Technically
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>
>> I wonder though why this was done in the first place - it very much
>> feels like a workaround for certain guest behavior, and hence
>> getting rid of it may mean a certain risk of regressions. Not a
>> very good point in time to make risky changes ...
> 
> We can defer to after the release I guess, but I will still ask for
> the changes to be backported.

That's fine, albeit if we decide to delay it until 4.15 was branched,
then I think we want to also wait longer than usual until it would hit
the stable trees. Unfortunately c8e79412c001's description is of no
help to understand what or why "time jumps" may result from delivering
the interrupt as requested.

> What we currently do is broken, up to the point that FreeBSD cannot
> use the PIT because it will likely route the interrupt to a vCPU != 0
> in fixed mode, and then it will just get stuck because the vector is
> delivered to vCPU 0 where it's not even configured.

All understood.

Jan
Roger Pau Monné June 18, 2020, 2:18 p.m. UTC | #4
On Thu, Jun 18, 2020 at 04:08:28PM +0200, Jan Beulich wrote:
> On 18.06.2020 15:48, Roger Pau Monné wrote:
> > On Thu, Jun 18, 2020 at 03:43:00PM +0200, Jan Beulich wrote:
> >> On 12.06.2020 17:56, Roger Pau Monne wrote:
> >>> When the IO APIC pin mapped to the ISA IRQ 0 has been configured to
> >>> use fixed delivery mode do not forcefully route interrupts to vCPU 0,
> >>> as the OS might have setup those interrupts to be injected to a
> >>> different vCPU, and injecting to vCPU 0 can cause the OS to miss such
> >>> interrupts or errors to happen due to unexpected vectors being
> >>> injected on vCPU 0.
> >>>
> >>> In order to fix remove such handling altogether for fixed destination
> >>> mode pins and just inject them according to the data setup in the
> >>> IO-APIC entry.
> >>>
> >>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >>
> >> Technically
> >> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> >>
> >> I wonder though why this was done in the first place - it very much
> >> feels like a workaround for certain guest behavior, and hence
> >> getting rid of it may mean a certain risk of regressions. Not a
> >> very good point in time to make risky changes ...
> > 
> > We can defer to after the release I guess, but I will still ask for
> > the changes to be backported.
> 
> That's fine, albeit if we decide to delay it until 4.15 was branched,
> then I think we want to also wait longer than usual until it would hit
> the stable trees. Unfortunately c8e79412c001's description is of no
> help to understand what or why "time jumps" may result from delivering
> the interrupt as requested.

Yes, I've also looked at the original commit and have no idea what it
was actually trying to fix, and why delivering to vCPU 0 fixed it.
FWIW, I tried delivering to a different vCPU and it seems to work
fine.

Note that other timers (ie: RTC or HPET) are not tied to vCPU 0, so it
must have been something related to the PIT?

Thanks, Roger.
Jan Beulich June 18, 2020, 2:29 p.m. UTC | #5
On 18.06.2020 16:18, Roger Pau Monné wrote:
> On Thu, Jun 18, 2020 at 04:08:28PM +0200, Jan Beulich wrote:
>> On 18.06.2020 15:48, Roger Pau Monné wrote:
>>> On Thu, Jun 18, 2020 at 03:43:00PM +0200, Jan Beulich wrote:
>>>> On 12.06.2020 17:56, Roger Pau Monne wrote:
>>>>> When the IO APIC pin mapped to the ISA IRQ 0 has been configured to
>>>>> use fixed delivery mode do not forcefully route interrupts to vCPU 0,
>>>>> as the OS might have setup those interrupts to be injected to a
>>>>> different vCPU, and injecting to vCPU 0 can cause the OS to miss such
>>>>> interrupts or errors to happen due to unexpected vectors being
>>>>> injected on vCPU 0.
>>>>>
>>>>> In order to fix remove such handling altogether for fixed destination
>>>>> mode pins and just inject them according to the data setup in the
>>>>> IO-APIC entry.
>>>>>
>>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>>
>>>> Technically
>>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>>>
>>>> I wonder though why this was done in the first place - it very much
>>>> feels like a workaround for certain guest behavior, and hence
>>>> getting rid of it may mean a certain risk of regressions. Not a
>>>> very good point in time to make risky changes ...
>>>
>>> We can defer to after the release I guess, but I will still ask for
>>> the changes to be backported.
>>
>> That's fine, albeit if we decide to delay it until 4.15 was branched,
>> then I think we want to also wait longer than usual until it would hit
>> the stable trees. Unfortunately c8e79412c001's description is of no
>> help to understand what or why "time jumps" may result from delivering
>> the interrupt as requested.
> 
> Yes, I've also looked at the original commit and have no idea what it
> was actually trying to fix, and why delivering to vCPU 0 fixed it.
> FWIW, I tried delivering to a different vCPU and it seems to work
> fine.

Right, I too was thinking that delivering to a "stable" CPU might be
all that's needed. In patch 3 this may then call for latching that
CPU, and preferring it over what vlapic_lowest_prio() produces.

> Note that other timers (ie: RTC or HPET) are not tied to vCPU 0, so it
> must have been something related to the PIT?

Likely, but it may easily have been that an issue was papered over
by this change. Then we won't even know whether that underlying issue
still exists.

Jan
Roger Pau Monné June 18, 2020, 2:49 p.m. UTC | #6
On Thu, Jun 18, 2020 at 04:29:59PM +0200, Jan Beulich wrote:
> On 18.06.2020 16:18, Roger Pau Monné wrote:
> > On Thu, Jun 18, 2020 at 04:08:28PM +0200, Jan Beulich wrote:
> >> On 18.06.2020 15:48, Roger Pau Monné wrote:
> >>> On Thu, Jun 18, 2020 at 03:43:00PM +0200, Jan Beulich wrote:
> >>>> On 12.06.2020 17:56, Roger Pau Monne wrote:
> >>>>> When the IO APIC pin mapped to the ISA IRQ 0 has been configured to
> >>>>> use fixed delivery mode do not forcefully route interrupts to vCPU 0,
> >>>>> as the OS might have setup those interrupts to be injected to a
> >>>>> different vCPU, and injecting to vCPU 0 can cause the OS to miss such
> >>>>> interrupts or errors to happen due to unexpected vectors being
> >>>>> injected on vCPU 0.
> >>>>>
> >>>>> In order to fix remove such handling altogether for fixed destination
> >>>>> mode pins and just inject them according to the data setup in the
> >>>>> IO-APIC entry.
> >>>>>
> >>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >>>>
> >>>> Technically
> >>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> >>>>
> >>>> I wonder though why this was done in the first place - it very much
> >>>> feels like a workaround for certain guest behavior, and hence
> >>>> getting rid of it may mean a certain risk of regressions. Not a
> >>>> very good point in time to make risky changes ...
> >>>
> >>> We can defer to after the release I guess, but I will still ask for
> >>> the changes to be backported.
> >>
> >> That's fine, albeit if we decide to delay it until 4.15 was branched,
> >> then I think we want to also wait longer than usual until it would hit
> >> the stable trees. Unfortunately c8e79412c001's description is of no
> >> help to understand what or why "time jumps" may result from delivering
> >> the interrupt as requested.
> > 
> > Yes, I've also looked at the original commit and have no idea what it
> > was actually trying to fix, and why delivering to vCPU 0 fixed it.
> > FWIW, I tried delivering to a different vCPU and it seems to work
> > fine.
> 
> Right, I too was thinking that delivering to a "stable" CPU might be
> all that's needed. In patch 3 this may then call for latching that
> CPU, and preferring it over what vlapic_lowest_prio() produces.

Yes, I also considered that route for the lowest priority mode (which
is dealt with in the next patch), but for fixed mode we need to
delivered to the listed vCPUs, there's no trick we can play here
IMO.

Roger.
Jan Beulich June 18, 2020, 3:16 p.m. UTC | #7
On 18.06.2020 16:49, Roger Pau Monné wrote:
> On Thu, Jun 18, 2020 at 04:29:59PM +0200, Jan Beulich wrote:
>> On 18.06.2020 16:18, Roger Pau Monné wrote:
>>> On Thu, Jun 18, 2020 at 04:08:28PM +0200, Jan Beulich wrote:
>>>> On 18.06.2020 15:48, Roger Pau Monné wrote:
>>>>> On Thu, Jun 18, 2020 at 03:43:00PM +0200, Jan Beulich wrote:
>>>>>> On 12.06.2020 17:56, Roger Pau Monne wrote:
>>>>>>> When the IO APIC pin mapped to the ISA IRQ 0 has been configured to
>>>>>>> use fixed delivery mode do not forcefully route interrupts to vCPU 0,
>>>>>>> as the OS might have setup those interrupts to be injected to a
>>>>>>> different vCPU, and injecting to vCPU 0 can cause the OS to miss such
>>>>>>> interrupts or errors to happen due to unexpected vectors being
>>>>>>> injected on vCPU 0.
>>>>>>>
>>>>>>> In order to fix remove such handling altogether for fixed destination
>>>>>>> mode pins and just inject them according to the data setup in the
>>>>>>> IO-APIC entry.
>>>>>>>
>>>>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>>>>
>>>>>> Technically
>>>>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>>>>>
>>>>>> I wonder though why this was done in the first place - it very much
>>>>>> feels like a workaround for certain guest behavior, and hence
>>>>>> getting rid of it may mean a certain risk of regressions. Not a
>>>>>> very good point in time to make risky changes ...
>>>>>
>>>>> We can defer to after the release I guess, but I will still ask for
>>>>> the changes to be backported.
>>>>
>>>> That's fine, albeit if we decide to delay it until 4.15 was branched,
>>>> then I think we want to also wait longer than usual until it would hit
>>>> the stable trees. Unfortunately c8e79412c001's description is of no
>>>> help to understand what or why "time jumps" may result from delivering
>>>> the interrupt as requested.
>>>
>>> Yes, I've also looked at the original commit and have no idea what it
>>> was actually trying to fix, and why delivering to vCPU 0 fixed it.
>>> FWIW, I tried delivering to a different vCPU and it seems to work
>>> fine.
>>
>> Right, I too was thinking that delivering to a "stable" CPU might be
>> all that's needed. In patch 3 this may then call for latching that
>> CPU, and preferring it over what vlapic_lowest_prio() produces.
> 
> Yes, I also considered that route for the lowest priority mode (which
> is dealt with in the next patch), but for fixed mode we need to
> delivered to the listed vCPUs, there's no trick we can play here
> IMO.

The set may still be empty, in which case the lowest-prio consideration
(of falling back to CPU0) may still apply here as well. But of course
there's nothing to latch here, as fixed mode means multi-cast if more
than one destination matches.

Jan

Patch
diff mbox series

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index bd41036137..67472e5934 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -445,26 +445,11 @@  static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
     }
 
     case dest_Fixed:
-    {
-#ifdef IRQ0_SPECIAL_ROUTING
-        /* Do not deliver timer interrupts to VCPU != 0 */
-        if ( (irq == hvm_isa_irq_to_gsi(0)) && pit_channel0_enabled() )
-        {
-            if ( (v = d->vcpu ? d->vcpu[0] : NULL) != NULL )
-                ioapic_inj_irq(vioapic, vcpu_vlapic(v), vector,
-                               trig_mode, delivery_mode);
-        }
-        else
-#endif
-        {
-            for_each_vcpu ( d, v )
-                if ( vlapic_match_dest(vcpu_vlapic(v), NULL,
-                                       0, dest, dest_mode) )
-                    ioapic_inj_irq(vioapic, vcpu_vlapic(v), vector,
-                                   trig_mode, delivery_mode);
-        }
+        for_each_vcpu ( d, v )
+            if ( vlapic_match_dest(vcpu_vlapic(v), NULL, 0, dest, dest_mode) )
+                ioapic_inj_irq(vioapic, vcpu_vlapic(v), vector, trig_mode,
+                               delivery_mode);
         break;
-    }
 
     case dest_NMI:
     {