Message ID | 20191126132648.6917-2-roger.pau@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/vmx: posted interrupt fixes | expand |
On 26.11.2019 14:26, Roger Pau Monne wrote: > --- a/xen/arch/x86/hvm/vmx/vmx.c > +++ b/xen/arch/x86/hvm/vmx/vmx.c > @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu *v) > unsigned int group, i; > DECLARE_BITMAP(pending_intr, NR_VECTORS); > > + if ( v != current && !atomic_read(&v->pause_count) ) > + { > + /* > + * Syncing PIR to IRR must not be done behind the back of the CPU, > + * since the IRR is controlled by the hardware when the vCPU is > + * executing. Only allow Xen to do such sync if the vCPU is the current > + * one or if it's paused: that's required in order to sync the lapic > + * state before saving it. > + */ Is this stated this way by the SDM anywhere? I ask because the comment then really doesn't apply to just this function, but to vlapic_{,test_and_}{set,clear}_vector() more generally. It's not clear to me at all whether the CPU caches (in an incoherent fashion) IRR (and maybe other APIC page elements), rather than honoring the atomic updates these macros do. Jan
On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote: > On 26.11.2019 14:26, Roger Pau Monne wrote: > > --- a/xen/arch/x86/hvm/vmx/vmx.c > > +++ b/xen/arch/x86/hvm/vmx/vmx.c > > @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu *v) > > unsigned int group, i; > > DECLARE_BITMAP(pending_intr, NR_VECTORS); > > > > + if ( v != current && !atomic_read(&v->pause_count) ) > > + { > > + /* > > + * Syncing PIR to IRR must not be done behind the back of the CPU, > > + * since the IRR is controlled by the hardware when the vCPU is > > + * executing. Only allow Xen to do such sync if the vCPU is the current > > + * one or if it's paused: that's required in order to sync the lapic > > + * state before saving it. > > + */ > > Is this stated this way by the SDM anywhere? No, I think the SDM is not very clear on this, there's a paragraph about PIR: "The logical processor performs a logical-OR of PIR into VIRR and clears PIR. No other agent can read or write a PIR bit (or group of bits) between the time it is read (to determine what to OR into VIRR) and when it is cleared." > I ask because the > comment then really doesn't apply to just this function, but to > vlapic_{,test_and_}{set,clear}_vector() more generally. It's > not clear to me at all whether the CPU caches (in an incoherent > fashion) IRR (and maybe other APIC page elements), rather than > honoring the atomic updates these macros do. IMO syncing PIR to IRR when the vCPU is running on a different pCPU is likely to at least defeat the purpose of posted interrupts: when the CPU receives the posted interrupt vector it won't see the outstanding-notification bit in the posted-interrupt descriptor because the sync done from a different pCPU would have cleared it, at which point it's not clear to me that the processor will check vIRR for pending interrupts. The description in section 29.6 POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the value of the outstanding-notification bit affects the logic of posted interrupt processing. Thanks, Roger.
On 26.11.2019 17:47, Roger Pau Monné wrote: > On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote: >> On 26.11.2019 14:26, Roger Pau Monne wrote: >>> --- a/xen/arch/x86/hvm/vmx/vmx.c >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c >>> @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu *v) >>> unsigned int group, i; >>> DECLARE_BITMAP(pending_intr, NR_VECTORS); >>> >>> + if ( v != current && !atomic_read(&v->pause_count) ) >>> + { >>> + /* >>> + * Syncing PIR to IRR must not be done behind the back of the CPU, >>> + * since the IRR is controlled by the hardware when the vCPU is >>> + * executing. Only allow Xen to do such sync if the vCPU is the current >>> + * one or if it's paused: that's required in order to sync the lapic >>> + * state before saving it. >>> + */ >> >> Is this stated this way by the SDM anywhere? > > No, I think the SDM is not very clear on this, there's a paragraph > about PIR: > > "The logical processor performs a logical-OR of PIR into VIRR and > clears PIR. No other agent can read or write a PIR bit (or group of > bits) between the time it is read (to determine what to OR into VIRR) > and when it is cleared." Well, this is about PIR, but my question was rather towards the effects on vIRR. >> I ask because the >> comment then really doesn't apply to just this function, but to >> vlapic_{,test_and_}{set,clear}_vector() more generally. It's >> not clear to me at all whether the CPU caches (in an incoherent >> fashion) IRR (and maybe other APIC page elements), rather than >> honoring the atomic updates these macros do. > > IMO syncing PIR to IRR when the vCPU is running on a different pCPU is > likely to at least defeat the purpose of posted interrupts: I agree here. > when the > CPU receives the posted interrupt vector it won't see the > outstanding-notification bit in the posted-interrupt descriptor > because the sync done from a different pCPU would have cleared it, at > which point it's not clear to me that the processor will check vIRR > for pending interrupts. The description in section 29.6 > POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the > value of the outstanding-notification bit affects the logic of posted > interrupt processing. But overall this again is all posted interrupt centric when my question was about vIRR, in particular whether the asserting you add may need to be even more rigid. Anyway, let's see what the VMX maintainers have to say. Jan
> From: Jan Beulich <jbeulich@suse.com> > Sent: Wednesday, November 27, 2019 12:59 AM > > On 26.11.2019 17:47, Roger Pau Monné wrote: > > On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote: > >> On 26.11.2019 14:26, Roger Pau Monne wrote: > >>> --- a/xen/arch/x86/hvm/vmx/vmx.c > >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c > >>> @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu > *v) > >>> unsigned int group, i; > >>> DECLARE_BITMAP(pending_intr, NR_VECTORS); > >>> > >>> + if ( v != current && !atomic_read(&v->pause_count) ) > >>> + { > >>> + /* > >>> + * Syncing PIR to IRR must not be done behind the back of the CPU, > >>> + * since the IRR is controlled by the hardware when the vCPU is > >>> + * executing. Only allow Xen to do such sync if the vCPU is the > current > >>> + * one or if it's paused: that's required in order to sync the lapic > >>> + * state before saving it. > >>> + */ > >> > >> Is this stated this way by the SDM anywhere? > > > > No, I think the SDM is not very clear on this, there's a paragraph > > about PIR: > > > > "The logical processor performs a logical-OR of PIR into VIRR and > > clears PIR. No other agent can read or write a PIR bit (or group of > > bits) between the time it is read (to determine what to OR into VIRR) > > and when it is cleared." > > Well, this is about PIR, but my question was rather towards the > effects on vIRR. > > >> I ask because the > >> comment then really doesn't apply to just this function, but to > >> vlapic_{,test_and_}{set,clear}_vector() more generally. It's > >> not clear to me at all whether the CPU caches (in an incoherent > >> fashion) IRR (and maybe other APIC page elements), rather than > >> honoring the atomic updates these macros do. > > > > IMO syncing PIR to IRR when the vCPU is running on a different pCPU is > > likely to at least defeat the purpose of posted interrupts: > > I agree here. > > > when the > > CPU receives the posted interrupt vector it won't see the > > outstanding-notification bit in the posted-interrupt descriptor > > because the sync done from a different pCPU would have cleared it, at > > which point it's not clear to me that the processor will check vIRR > > for pending interrupts. The description in section 29.6 > > POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the > > value of the outstanding-notification bit affects the logic of posted > > interrupt processing. I think the outstanding-notification is one-off checked for triggering interrupt posting process. Once the process starts, there is no need to look at it again. The step 3 of posting process in 29.6 clearly says: "The processor clears the outstanding-notification bit in the posted- interrupt descriptor. This is done atomically so as to leave the remainder of the descriptor unmodified (e.g., with a locked AND operation)." But regardless of the hardware behavior, I think it's safe to restrict sync_pir_to_irr as this patch does. > > But overall this again is all posted interrupt centric when my > question was about vIRR, in particular whether the asserting you > add may need to be even more rigid. > > Anyway, let's see what the VMX maintainers have to say. > There is one paragraph in 29.6: "Use of the posted-interrupt descriptor differs from that of other data structures that are referenced by pointers in a VMCS. There is a general requirement that software ensure that each such data structure is modified only when no logical processor with a current VMCS that references it is in VMX non-root operation. That requirement does not apply to the posted-interrupt descriptor. There is a requirement, however, that such modifications be done using locked read-modify-write instructions." virtual-APIC page is pointer-referenced by VMCS, thus it falls into above general requirement. But I suppose there should be some exception with this page too, otherwise the point of posted interrupt is killed (if we have to kick the dest vcpu into root to update the vIRR). Let me confirm internally. Thanks Kevin
On Wed, Nov 27, 2019 at 03:09:51AM +0000, Tian, Kevin wrote: > > From: Jan Beulich <jbeulich@suse.com> > > Sent: Wednesday, November 27, 2019 12:59 AM > > > > On 26.11.2019 17:47, Roger Pau Monné wrote: > > > On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote: > > >> On 26.11.2019 14:26, Roger Pau Monne wrote: > > >>> --- a/xen/arch/x86/hvm/vmx/vmx.c > > >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c > > >>> @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu > > *v) > > >>> unsigned int group, i; > > >>> DECLARE_BITMAP(pending_intr, NR_VECTORS); > > >>> > > >>> + if ( v != current && !atomic_read(&v->pause_count) ) > > >>> + { > > >>> + /* > > >>> + * Syncing PIR to IRR must not be done behind the back of the CPU, > > >>> + * since the IRR is controlled by the hardware when the vCPU is > > >>> + * executing. Only allow Xen to do such sync if the vCPU is the > > current > > >>> + * one or if it's paused: that's required in order to sync the lapic > > >>> + * state before saving it. > > >>> + */ > > >> > > >> Is this stated this way by the SDM anywhere? > > > > > > No, I think the SDM is not very clear on this, there's a paragraph > > > about PIR: > > > > > > "The logical processor performs a logical-OR of PIR into VIRR and > > > clears PIR. No other agent can read or write a PIR bit (or group of > > > bits) between the time it is read (to determine what to OR into VIRR) > > > and when it is cleared." > > > > Well, this is about PIR, but my question was rather towards the > > effects on vIRR. > > > > >> I ask because the > > >> comment then really doesn't apply to just this function, but to > > >> vlapic_{,test_and_}{set,clear}_vector() more generally. It's > > >> not clear to me at all whether the CPU caches (in an incoherent > > >> fashion) IRR (and maybe other APIC page elements), rather than > > >> honoring the atomic updates these macros do. > > > > > > IMO syncing PIR to IRR when the vCPU is running on a different pCPU is > > > likely to at least defeat the purpose of posted interrupts: > > > > I agree here. > > > > > when the > > > CPU receives the posted interrupt vector it won't see the > > > outstanding-notification bit in the posted-interrupt descriptor > > > because the sync done from a different pCPU would have cleared it, at > > > which point it's not clear to me that the processor will check vIRR > > > for pending interrupts. The description in section 29.6 > > > POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the > > > value of the outstanding-notification bit affects the logic of posted > > > interrupt processing. > > I think the outstanding-notification is one-off checked for triggering > interrupt posting process. Once the process starts, there is no need to > look at it again. The step 3 of posting process in 29.6 clearly says: > > "The processor clears the outstanding-notification bit in the posted- > interrupt descriptor. This is done atomically so as to leave the remainder > of the descriptor unmodified (e.g., with a locked AND operation)." Yes, my question would be what happens if the outstanding-notification bit is 0, does the processor jump to step 6 then? Does it just ignore the value of the outstanding-notification bit and continue to step 4? > But regardless of the hardware behavior, I think it's safe to restrict > sync_pir_to_irr as this patch does. > > > > > But overall this again is all posted interrupt centric when my > > question was about vIRR, in particular whether the asserting you > > add may need to be even more rigid. > > > > Anyway, let's see what the VMX maintainers have to say. > > > > There is one paragraph in 29.6: > > "Use of the posted-interrupt descriptor differs from that of other data > structures that are referenced by pointers in a VMCS. There is a general > requirement that software ensure that each such data structure is > modified only when no logical processor with a current VMCS that > references it is in VMX non-root operation. That requirement does > not apply to the posted-interrupt descriptor. There is a requirement, > however, that such modifications be done using locked read-modify-write > instructions." > > virtual-APIC page is pointer-referenced by VMCS, thus it falls into above > general requirement. But I suppose there should be some exception with > this page too, otherwise the point of posted interrupt is killed (if we have > to kick the dest vcpu into root to update the vIRR). Let me confirm > internally. Ack, thanks. I think we can can hold off this improvement/restriction until we get confirmation of the intended software behavior here. Thanks, Roger.
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index a55ff37733..c817aec75d 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu *v) unsigned int group, i; DECLARE_BITMAP(pending_intr, NR_VECTORS); + if ( v != current && !atomic_read(&v->pause_count) ) + { + /* + * Syncing PIR to IRR must not be done behind the back of the CPU, + * since the IRR is controlled by the hardware when the vCPU is + * executing. Only allow Xen to do such sync if the vCPU is the current + * one or if it's paused: that's required in order to sync the lapic + * state before saving it. + */ + ASSERT_UNREACHABLE(); + return; + } + if ( !pi_test_and_clear_on(&v->arch.hvm.vmx.pi_desc) ) return;
... if the vCPU is different than the one currently running or if it's not paused. Note that syncing PIR to IRR when the vCPU is running is not allowed, since the hardware is in control of VMCS IRR field. Allow syncing PIR to IRR when the vCPU is paused, this is required in order to save the local APIC state. No functional change intended. Suggested by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- Cc: Juergen Gross <jgross@suse.com> --- Changes since v2: - Only allow syncing if the vCPU is the current one or if it's paused. Changes since v1: - Use vcpu_runnable instead of is_running. --- xen/arch/x86/hvm/vmx/vmx.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)