diff mbox series

[for-4.13,v3,1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR...

Message ID 20191126132648.6917-2-roger.pau@citrix.com (mailing list archive)
State New, archived
Headers show
Series x86/vmx: posted interrupt fixes | expand

Commit Message

Roger Pau Monné Nov. 26, 2019, 1:26 p.m. UTC
... if the vCPU is different than the one currently running or if it's
not paused. Note that syncing PIR to IRR when the vCPU is running is
not allowed, since the hardware is in control of VMCS IRR field.

Allow syncing PIR to IRR when the vCPU is paused, this is required in
order to save the local APIC state.

No functional change intended.

Suggested by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Juergen Gross <jgross@suse.com>
---
Changes since v2:
 - Only allow syncing if the vCPU is the current one or if it's
   paused.

Changes since v1:
 - Use vcpu_runnable instead of is_running.
---
 xen/arch/x86/hvm/vmx/vmx.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Jan Beulich Nov. 26, 2019, 4:32 p.m. UTC | #1
On 26.11.2019 14:26, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu *v)
>      unsigned int group, i;
>      DECLARE_BITMAP(pending_intr, NR_VECTORS);
>  
> +    if ( v != current && !atomic_read(&v->pause_count) )
> +    {
> +        /*
> +         * Syncing PIR to IRR must not be done behind the back of the CPU,
> +         * since the IRR is controlled by the hardware when the vCPU is
> +         * executing. Only allow Xen to do such sync if the vCPU is the current
> +         * one or if it's paused: that's required in order to sync the lapic
> +         * state before saving it.
> +         */

Is this stated this way by the SDM anywhere? I ask because the
comment then really doesn't apply to just this function, but to
vlapic_{,test_and_}{set,clear}_vector() more generally. It's
not clear to me at all whether the CPU caches (in an incoherent
fashion) IRR (and maybe other APIC page elements), rather than
honoring the atomic updates these macros do.

Jan
Roger Pau Monné Nov. 26, 2019, 4:47 p.m. UTC | #2
On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote:
> On 26.11.2019 14:26, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu *v)
> >      unsigned int group, i;
> >      DECLARE_BITMAP(pending_intr, NR_VECTORS);
> >  
> > +    if ( v != current && !atomic_read(&v->pause_count) )
> > +    {
> > +        /*
> > +         * Syncing PIR to IRR must not be done behind the back of the CPU,
> > +         * since the IRR is controlled by the hardware when the vCPU is
> > +         * executing. Only allow Xen to do such sync if the vCPU is the current
> > +         * one or if it's paused: that's required in order to sync the lapic
> > +         * state before saving it.
> > +         */
> 
> Is this stated this way by the SDM anywhere?

No, I think the SDM is not very clear on this, there's a paragraph
about PIR:

"The logical processor performs a logical-OR of PIR into VIRR and
clears PIR. No other agent can read or write a PIR bit (or group of
bits) between the time it is read (to determine what to OR into VIRR)
and when it is cleared."

> I ask because the
> comment then really doesn't apply to just this function, but to
> vlapic_{,test_and_}{set,clear}_vector() more generally. It's
> not clear to me at all whether the CPU caches (in an incoherent
> fashion) IRR (and maybe other APIC page elements), rather than
> honoring the atomic updates these macros do.

IMO syncing PIR to IRR when the vCPU is running on a different pCPU is
likely to at least defeat the purpose of posted interrupts: when the
CPU receives the posted interrupt vector it won't see the
outstanding-notification bit in the posted-interrupt descriptor
because the sync done from a different pCPU would have cleared it, at
which point it's not clear to me that the processor will check vIRR
for pending interrupts. The description in section 29.6
POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the
value of the outstanding-notification bit affects the logic of posted
interrupt processing.

Thanks, Roger.
Jan Beulich Nov. 26, 2019, 4:58 p.m. UTC | #3
On 26.11.2019 17:47, Roger Pau Monné  wrote:
> On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote:
>> On 26.11.2019 14:26, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu *v)
>>>      unsigned int group, i;
>>>      DECLARE_BITMAP(pending_intr, NR_VECTORS);
>>>  
>>> +    if ( v != current && !atomic_read(&v->pause_count) )
>>> +    {
>>> +        /*
>>> +         * Syncing PIR to IRR must not be done behind the back of the CPU,
>>> +         * since the IRR is controlled by the hardware when the vCPU is
>>> +         * executing. Only allow Xen to do such sync if the vCPU is the current
>>> +         * one or if it's paused: that's required in order to sync the lapic
>>> +         * state before saving it.
>>> +         */
>>
>> Is this stated this way by the SDM anywhere?
> 
> No, I think the SDM is not very clear on this, there's a paragraph
> about PIR:
> 
> "The logical processor performs a logical-OR of PIR into VIRR and
> clears PIR. No other agent can read or write a PIR bit (or group of
> bits) between the time it is read (to determine what to OR into VIRR)
> and when it is cleared."

Well, this is about PIR, but my question was rather towards the
effects on vIRR.

>> I ask because the
>> comment then really doesn't apply to just this function, but to
>> vlapic_{,test_and_}{set,clear}_vector() more generally. It's
>> not clear to me at all whether the CPU caches (in an incoherent
>> fashion) IRR (and maybe other APIC page elements), rather than
>> honoring the atomic updates these macros do.
> 
> IMO syncing PIR to IRR when the vCPU is running on a different pCPU is
> likely to at least defeat the purpose of posted interrupts:

I agree here.

> when the
> CPU receives the posted interrupt vector it won't see the
> outstanding-notification bit in the posted-interrupt descriptor
> because the sync done from a different pCPU would have cleared it, at
> which point it's not clear to me that the processor will check vIRR
> for pending interrupts. The description in section 29.6
> POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the
> value of the outstanding-notification bit affects the logic of posted
> interrupt processing.

But overall this again is all posted interrupt centric when my
question was about vIRR, in particular whether the asserting you
add may need to be even more rigid.

Anyway, let's see what the VMX maintainers have to say.

Jan
Tian, Kevin Nov. 27, 2019, 3:09 a.m. UTC | #4
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Wednesday, November 27, 2019 12:59 AM
> 
> On 26.11.2019 17:47, Roger Pau Monné  wrote:
> > On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote:
> >> On 26.11.2019 14:26, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/hvm/vmx/vmx.c
> >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >>> @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu
> *v)
> >>>      unsigned int group, i;
> >>>      DECLARE_BITMAP(pending_intr, NR_VECTORS);
> >>>
> >>> +    if ( v != current && !atomic_read(&v->pause_count) )
> >>> +    {
> >>> +        /*
> >>> +         * Syncing PIR to IRR must not be done behind the back of the CPU,
> >>> +         * since the IRR is controlled by the hardware when the vCPU is
> >>> +         * executing. Only allow Xen to do such sync if the vCPU is the
> current
> >>> +         * one or if it's paused: that's required in order to sync the lapic
> >>> +         * state before saving it.
> >>> +         */
> >>
> >> Is this stated this way by the SDM anywhere?
> >
> > No, I think the SDM is not very clear on this, there's a paragraph
> > about PIR:
> >
> > "The logical processor performs a logical-OR of PIR into VIRR and
> > clears PIR. No other agent can read or write a PIR bit (or group of
> > bits) between the time it is read (to determine what to OR into VIRR)
> > and when it is cleared."
> 
> Well, this is about PIR, but my question was rather towards the
> effects on vIRR.
> 
> >> I ask because the
> >> comment then really doesn't apply to just this function, but to
> >> vlapic_{,test_and_}{set,clear}_vector() more generally. It's
> >> not clear to me at all whether the CPU caches (in an incoherent
> >> fashion) IRR (and maybe other APIC page elements), rather than
> >> honoring the atomic updates these macros do.
> >
> > IMO syncing PIR to IRR when the vCPU is running on a different pCPU is
> > likely to at least defeat the purpose of posted interrupts:
> 
> I agree here.
> 
> > when the
> > CPU receives the posted interrupt vector it won't see the
> > outstanding-notification bit in the posted-interrupt descriptor
> > because the sync done from a different pCPU would have cleared it, at
> > which point it's not clear to me that the processor will check vIRR
> > for pending interrupts. The description in section 29.6
> > POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the
> > value of the outstanding-notification bit affects the logic of posted
> > interrupt processing.

I think the outstanding-notification is one-off checked for triggering 
interrupt posting process. Once the process starts, there is no need to 
look at it again. The step 3 of posting process in 29.6 clearly says:

"The processor clears the outstanding-notification bit in the posted-
interrupt descriptor. This is done atomically so as to leave the remainder 
of the descriptor unmodified (e.g., with a locked AND operation)."

But regardless of the hardware behavior, I think it's safe to restrict
sync_pir_to_irr as this patch does.

> 
> But overall this again is all posted interrupt centric when my
> question was about vIRR, in particular whether the asserting you
> add may need to be even more rigid.
> 
> Anyway, let's see what the VMX maintainers have to say.
> 

There is one paragraph in 29.6:

"Use of the posted-interrupt descriptor differs from that of other data 
structures that are referenced by pointers in a VMCS. There is a general 
requirement that software ensure that each such data structure is 
modified only when no logical processor with a current VMCS that 
references it is in VMX non-root operation. That requirement does
not apply to the posted-interrupt descriptor. There is a requirement, 
however, that such modifications be done using locked read-modify-write 
instructions."

virtual-APIC page is pointer-referenced by VMCS, thus it falls into above
general requirement. But I suppose there should be some exception with
this page too, otherwise the point of posted interrupt is killed (if we have
to kick the dest vcpu into root to update the vIRR). Let me confirm
internally.

Thanks
Kevin
Roger Pau Monné Nov. 27, 2019, 11:18 a.m. UTC | #5
On Wed, Nov 27, 2019 at 03:09:51AM +0000, Tian, Kevin wrote:
> > From: Jan Beulich <jbeulich@suse.com>
> > Sent: Wednesday, November 27, 2019 12:59 AM
> > 
> > On 26.11.2019 17:47, Roger Pau Monné  wrote:
> > > On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote:
> > >> On 26.11.2019 14:26, Roger Pau Monne wrote:
> > >>> --- a/xen/arch/x86/hvm/vmx/vmx.c
> > >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > >>> @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu
> > *v)
> > >>>      unsigned int group, i;
> > >>>      DECLARE_BITMAP(pending_intr, NR_VECTORS);
> > >>>
> > >>> +    if ( v != current && !atomic_read(&v->pause_count) )
> > >>> +    {
> > >>> +        /*
> > >>> +         * Syncing PIR to IRR must not be done behind the back of the CPU,
> > >>> +         * since the IRR is controlled by the hardware when the vCPU is
> > >>> +         * executing. Only allow Xen to do such sync if the vCPU is the
> > current
> > >>> +         * one or if it's paused: that's required in order to sync the lapic
> > >>> +         * state before saving it.
> > >>> +         */
> > >>
> > >> Is this stated this way by the SDM anywhere?
> > >
> > > No, I think the SDM is not very clear on this, there's a paragraph
> > > about PIR:
> > >
> > > "The logical processor performs a logical-OR of PIR into VIRR and
> > > clears PIR. No other agent can read or write a PIR bit (or group of
> > > bits) between the time it is read (to determine what to OR into VIRR)
> > > and when it is cleared."
> > 
> > Well, this is about PIR, but my question was rather towards the
> > effects on vIRR.
> > 
> > >> I ask because the
> > >> comment then really doesn't apply to just this function, but to
> > >> vlapic_{,test_and_}{set,clear}_vector() more generally. It's
> > >> not clear to me at all whether the CPU caches (in an incoherent
> > >> fashion) IRR (and maybe other APIC page elements), rather than
> > >> honoring the atomic updates these macros do.
> > >
> > > IMO syncing PIR to IRR when the vCPU is running on a different pCPU is
> > > likely to at least defeat the purpose of posted interrupts:
> > 
> > I agree here.
> > 
> > > when the
> > > CPU receives the posted interrupt vector it won't see the
> > > outstanding-notification bit in the posted-interrupt descriptor
> > > because the sync done from a different pCPU would have cleared it, at
> > > which point it's not clear to me that the processor will check vIRR
> > > for pending interrupts. The description in section 29.6
> > > POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the
> > > value of the outstanding-notification bit affects the logic of posted
> > > interrupt processing.
> 
> I think the outstanding-notification is one-off checked for triggering 
> interrupt posting process. Once the process starts, there is no need to 
> look at it again. The step 3 of posting process in 29.6 clearly says:
> 
> "The processor clears the outstanding-notification bit in the posted-
> interrupt descriptor. This is done atomically so as to leave the remainder 
> of the descriptor unmodified (e.g., with a locked AND operation)."

Yes, my question would be what happens if the outstanding-notification
bit is 0, does the processor jump to step 6 then?

Does it just ignore the value of the outstanding-notification bit and
continue to step 4?

> But regardless of the hardware behavior, I think it's safe to restrict
> sync_pir_to_irr as this patch does.
> 
> > 
> > But overall this again is all posted interrupt centric when my
> > question was about vIRR, in particular whether the asserting you
> > add may need to be even more rigid.
> > 
> > Anyway, let's see what the VMX maintainers have to say.
> > 
> 
> There is one paragraph in 29.6:
> 
> "Use of the posted-interrupt descriptor differs from that of other data 
> structures that are referenced by pointers in a VMCS. There is a general 
> requirement that software ensure that each such data structure is 
> modified only when no logical processor with a current VMCS that 
> references it is in VMX non-root operation. That requirement does
> not apply to the posted-interrupt descriptor. There is a requirement, 
> however, that such modifications be done using locked read-modify-write 
> instructions."
> 
> virtual-APIC page is pointer-referenced by VMCS, thus it falls into above
> general requirement. But I suppose there should be some exception with
> this page too, otherwise the point of posted interrupt is killed (if we have
> to kick the dest vcpu into root to update the vIRR). Let me confirm
> internally.

Ack, thanks. I think we can can hold off this improvement/restriction
until we get confirmation of the intended software behavior here.

Thanks, Roger.
diff mbox series

Patch

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index a55ff37733..c817aec75d 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2054,6 +2054,19 @@  static void vmx_sync_pir_to_irr(struct vcpu *v)
     unsigned int group, i;
     DECLARE_BITMAP(pending_intr, NR_VECTORS);
 
+    if ( v != current && !atomic_read(&v->pause_count) )
+    {
+        /*
+         * Syncing PIR to IRR must not be done behind the back of the CPU,
+         * since the IRR is controlled by the hardware when the vCPU is
+         * executing. Only allow Xen to do such sync if the vCPU is the current
+         * one or if it's paused: that's required in order to sync the lapic
+         * state before saving it.
+         */
+        ASSERT_UNREACHABLE();
+        return;
+    }
+
     if ( !pi_test_and_clear_on(&v->arch.hvm.vmx.pi_desc) )
         return;