
[v8,15/15] KVM: xen: allow vcpu_info content to be 'safely' copied

Message ID 20231121180223.12484-16-paul@xen.org (mailing list archive)
State New, archived
Series KVM: xen: update shared_info and vcpu_info handling

Commit Message

Paul Durrant Nov. 21, 2023, 6:02 p.m. UTC
From: Paul Durrant <pdurrant@amazon.com>

If the guest sets an explicit vcpu_info GPA then, for any of the first 32
vCPUs, the content of the default vcpu_info in the shared_info page must be
copied into the new location. Because this copy may race with event
delivery (which updates the 'evtchn_pending_sel' field in vcpu_info), there
needs to be a way to defer the update until the copy is complete.
Happily, there is already a shadow of 'evtchn_pending_sel' in kvm_vcpu_xen
that is used in atomic context if the vcpu_info PFN cache has been
invalidated, so that the update of vcpu_info can be deferred until the
cache can be refreshed (on the vCPU thread's way back into guest context).

Also use this shadow if the vcpu_info cache has been *deactivated*, so that
the VMM can safely copy the vcpu_info content and then re-activate the
cache with the new GPA. To do this, stop considering an inactive vcpu_info
cache as a hard error in kvm_xen_set_evtchn_fast().

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org

v8:
 - Update commit message.

v6:
 - New in this version.
---
 arch/x86/kvm/xen.c | 3 ---
 1 file changed, 3 deletions(-)
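
For illustration, a minimal sketch of the delivery path being relaxed here,
assuming the field and helper names from arch/x86/kvm/xen.c; locking and the
shared_info fallback are omitted, so this is a paraphrase rather than the
actual kernel code:

  /*
   * Minimal sketch, not the actual kernel code. With this patch an
   * *inactive* (deactivated) vcpu_info cache takes the same deferral
   * path as an invalidated one, instead of being a hard -EINVAL error.
   */
  static int evtchn_deliver_sketch(struct kvm_vcpu *vcpu, int port_word_bit)
  {
          struct gfn_to_pfn_cache *gpc = &vcpu->arch.xen.vcpu_info_cache;
          struct vcpu_info *vi;

          if (!kvm_gpc_check(gpc, sizeof(struct vcpu_info))) {
                  /*
                   * Cache invalidated or deactivated: park the bit in
                   * the kvm_vcpu_xen shadow; it is folded back into
                   * vcpu_info on the vCPU's way into guest context.
                   */
                  if (!test_and_set_bit(port_word_bit,
                                        &vcpu->arch.xen.evtchn_pending_sel))
                          kvm_vcpu_kick(vcpu);
                  return -EWOULDBLOCK;
          }

          /* Cache usable: update vcpu_info directly. */
          vi = gpc->khva;
          if (!test_and_set_bit(port_word_bit,
                                (unsigned long *)&vi->evtchn_pending_sel))
                  WRITE_ONCE(vi->evtchn_upcall_pending, 1);
          return 0;
  }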

Comments

David Woodhouse Nov. 21, 2023, 10:53 p.m. UTC | #1
On Tue, 2023-11-21 at 18:02 +0000, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> If the guest sets an explicit vcpu_info GPA then, for any of the first 32
> vCPUs, the content of the default vcpu_info in the shared_info page must be
> copied into the new location. Because this copy may race with event
> delivery (which updates the 'evtchn_pending_sel' field in vcpu_info), there
> needs to be a way to defer the update until the copy is complete.
> Happily, there is already a shadow of 'evtchn_pending_sel' in kvm_vcpu_xen
> that is used in atomic context if the vcpu_info PFN cache has been
> invalidated, so that the update of vcpu_info can be deferred until the
> cache can be refreshed (on the vCPU thread's way back into guest context).
> 
> Also use this shadow if the vcpu_info cache has been *deactivated*, so that
> the VMM can safely copy the vcpu_info content and then re-activate the
> cache with the new GPA. To do this, stop considering an inactive vcpu_info
> cache as a hard error in kvm_xen_set_evtchn_fast().
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>

Wait, didn't we realise that this leaves the bits set in the shadow
evtchn_pending_sel that get lost on migration?

The point in your previous patch which split out a shiny new
set_shinfo_evtchn_pending() function was that you could then *call*
that function to ensure that the corresponding index bit was set on the
destination host after migration, if the bit in the shinfo is.

So we'd do that from kvm_xen_setup_evtchn(), kvm_xen_eventfd_assign(),
and when setting KVM_XEN_VCPU_ATTR_TYPE_TIMER.

 if (bit_is_set_in_shinfo)
   set_shinfo_evtchn_pending()
David Woodhouse Nov. 22, 2023, 10:39 a.m. UTC | #2
On Tue, 2023-11-21 at 22:53 +0000, David Woodhouse wrote:
> On Tue, 2023-11-21 at 18:02 +0000, Paul Durrant wrote:
> > From: Paul Durrant <pdurrant@amazon.com>
> > 
> > If the guest sets an explicit vcpu_info GPA then, for any of the first 32
> > vCPUs, the content of the default vcpu_info in the shared_info page must be
> > copied into the new location. Because this copy may race with event
> > delivery (which updates the 'evtchn_pending_sel' field in vcpu_info), there
> > needs to be a way to defer the update until the copy is complete.
> > Happily, there is already a shadow of 'evtchn_pending_sel' in kvm_vcpu_xen
> > that is used in atomic context if the vcpu_info PFN cache has been
> > invalidated, so that the update of vcpu_info can be deferred until the
> > cache can be refreshed (on the vCPU thread's way back into guest context).
> > 
> > Also use this shadow if the vcpu_info cache has been *deactivated*, so that
> > the VMM can safely copy the vcpu_info content and then re-activate the
> > cache with the new GPA. To do this, stop considering an inactive vcpu_info
> > cache as a hard error in kvm_xen_set_evtchn_fast().
> > 
> > Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> > Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
> 
> Wait, didn't we realise that this leaves the bits set in the shadow
> evtchn_pending_sel that get lost on migration?
> 
> The point in your previous patch which split out a shiny new
> set_shinfo_evtchn_pending() function was that you could then *call*
> that function to ensure that the corresponding index bit was set on the
> destination host after migration, if the bit in the shinfo is.
> 
> So we'd do that from kvm_xen_setup_evtchn(), kvm_xen_eventfd_assign(),
> and when setting KVM_XEN_VCPU_ATTR_TYPE_TIMER.
> 
>  if (bit_is_set_in_shinfo)
>    set_shinfo_evtchn_pending()

I mean set_vcpu_info_evtchn_pending() of course. And we probably want
to extend the xen_shinfo_test to test it, by setting the bit in the
shinfo to mark the event as pending, and then doing each of

 • Set up timer (a bit like in TEST_TIMER_RESTORE at line 817).
 • Add incoming eventfd with KVM_SET_GSI_ROUTING (cf. line 563)
 • Add IPI with KVM_XEN_ATTR_TYPE_EVTCHN (cf. line 597)

Each of those should set the index bit in the vcpu_info immediately if
the evtchn port is already set (and unmasked) in the shinfo.
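
To make that concrete, a sketch of the proposed fix-up;
set_vcpu_info_evtchn_pending() is the helper split out earlier in the
series, while sync_pending_on_restore() and the selector arithmetic are
illustrative assumptions, not the final code:

  /*
   * Illustrative only: called from kvm_xen_setup_evtchn(),
   * kvm_xen_eventfd_assign() and the KVM_XEN_VCPU_ATTR_TYPE_TIMER path.
   * 32- vs 64-bit guest width handling is omitted for brevity.
   */
  static void sync_pending_on_restore(struct kvm *kvm, struct kvm_vcpu *vcpu,
                                      u32 port)
  {
          struct shared_info *shinfo = kvm->arch.xen.shinfo_cache.khva;

          if (test_bit(port, (unsigned long *)shinfo->evtchn_pending) &&
              !test_bit(port, (unsigned long *)shinfo->evtchn_mask))
                  set_vcpu_info_evtchn_pending(vcpu, port / BITS_PER_LONG);
  }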


(Ignore this part if you're cleverer than me or have had more coffee…)

It took me a moment to get my head around the different setups we have
for event channels, but that's because the standard KVM_SET_GSI_ROUTING
one is for *incoming* events, just as we would for MSIs, and we use the
standard way of attaching an eventfd to an incoming GSI/MSI/evtchn.

The KVM_XEN_ATTR_TYPE_EVTCHN one is for *outbound* events where the
guest does an EVTCHNOP_send. That can *raise* events on an eventfd, or
it can be an IPI or loopback interdomain port, which is the case we
need to test.
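
A rough userspace-side sketch of the two setups, assuming the UAPI structs
from include/uapi/linux/kvm.h and EVTCHNSTAT_* from the Xen event_channel.h
interface header (error handling and the struct kvm_irq_routing wrapper
omitted):

  /* Inbound: attach an eventfd-backed GSI to an event channel port,
   * just as for an MSI (wrap in struct kvm_irq_routing and call
   * ioctl(vm_fd, KVM_SET_GSI_ROUTING, ...)). */
  struct kvm_irq_routing_entry e = {
          .gsi = gsi,
          .type = KVM_IRQ_ROUTING_XEN_EVTCHN,
          .u.xen_evtchn = {
                  .port = port,
                  .vcpu = vcpu_id,
                  .priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL,
          },
  };

  /* Outbound: make EVTCHNOP_send on 'port' loop back as a local IPI
   * delivery rather than raising an eventfd in the VMM. */
  struct kvm_xen_hvm_attr xa = {
          .type = KVM_XEN_ATTR_TYPE_EVTCHN,
          .u.evtchn = {
                  .send_port = port,
                  .type = EVTCHNSTAT_ipi,
                  .deliver.port = {
                          .port = port,
                          .vcpu = vcpu_id,
                          .priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL,
                  },
          },
  };
  ioctl(vm_fd, KVM_XEN_HVM_SET_ATTR, &xa);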
Paul Durrant Nov. 22, 2023, 10:55 a.m. UTC | #3
On 22/11/2023 10:39, David Woodhouse wrote:
> On Tue, 2023-11-21 at 22:53 +0000, David Woodhouse wrote:
>> On Tue, 2023-11-21 at 18:02 +0000, Paul Durrant wrote:
>>> From: Paul Durrant <pdurrant@amazon.com>
>>>
>>> If the guest sets an explicit vcpu_info GPA then, for any of the first 32
>>> vCPUs, the content of the default vcpu_info in the shared_info page must be
>>> copied into the new location. Because this copy may race with event
>>> delivery (which updates the 'evtchn_pending_sel' field in vcpu_info), there
>>> needs to be a way to defer the update until the copy is complete.
>>> Happily, there is already a shadow of 'evtchn_pending_sel' in kvm_vcpu_xen
>>> that is used in atomic context if the vcpu_info PFN cache has been
>>> invalidated, so that the update of vcpu_info can be deferred until the
>>> cache can be refreshed (on the vCPU thread's way back into guest context).
>>>
>>> Also use this shadow if the vcpu_info cache has been *deactivated*, so that
>>> the VMM can safely copy the vcpu_info content and then re-activate the
>>> cache with the new GPA. To do this, stop considering an inactive vcpu_info
>>> cache as a hard error in kvm_xen_set_evtchn_fast().
>>>
>>> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
>>> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
>>
>> Wait, didn't we realise that this leaves the bits set in the shadow
>> evtchn_pending_sel that get lost on migration?
>>

Indeed we did not, but that's not something that *this* patch, or even 
this series, is dealing with.  We also know that setting the 'width' of 
shared_info has some issues, but again, can we keep that for other 
patches? The series is at v9 and has already suffered a fair amount of 
scope-creep.

   Paul
David Woodhouse Nov. 22, 2023, 11:25 a.m. UTC | #4
On Wed, 2023-11-22 at 10:55 +0000, Paul Durrant wrote:
> 
> > > Wait, didn't we realise that this leaves the bits set in the shadow
> > > evtchn_pending_sel that get lost on migration?
> > > 
> 
> Indeed we did not, but that's not something that *this* patch, or even 
> this series, is dealing with.  We also know that setting the 'width' of 
> shared_info has some issues, but again, can we keep that for other 
> patches? The series is at v9 and has already suffered a fair amount of 
> scope-creep.

Hrm... OK, that makes sense. This series is attempting to address the
fact that we can't do overlays on memslots without temporarily taking
away GPA ranges that we didn't mean to change. This patch is sufficient
to allow evtchn delivery to work while the memslots are being frobbed,
because userspace takes the vcpu_info away during the update.

In that case the shadow works just fine.

So yeah, if you want to handle the migration case separately then that
seems reasonable.
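
For reference, a hedged sketch of the VMM sequence this patch enables;
guest_va_of() is a hypothetical helper mapping a GPA to a VMM-local pointer,
and error handling is omitted:

  struct kvm_xen_vcpu_attr va = {
          .type = KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO,
          .u.gpa = KVM_XEN_INVALID_GPA,   /* 1. deactivate the cache */
  };
  ioctl(vcpu_fd, KVM_XEN_VCPU_SET_ATTR, &va);

  /* 2. Event delivery now parks bits in the in-kernel shadow, so the
   *    vcpu_info content can be copied without losing events. */
  memcpy(guest_va_of(new_gpa), guest_va_of(old_gpa),
         sizeof(struct vcpu_info));

  /* 3. Re-activate with the new GPA; the shadow bits are folded into
   *    the new vcpu_info on the next guest entry. */
  va.u.gpa = new_gpa;
  ioctl(vcpu_fd, KVM_XEN_VCPU_SET_ATTR, &va);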

Patch

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index eff405eead1c..cfd5051e0800 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -1742,9 +1742,6 @@  int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		WRITE_ONCE(xe->vcpu_idx, vcpu->vcpu_idx);
 	}
 
-	if (!vcpu->arch.xen.vcpu_info_cache.active)
-		return -EINVAL;
-
 	if (xe->port >= max_evtchn_port(kvm))
 		return -EINVAL;