
[v5,07/10] KVM: xen: allow vcpu_info to be mapped by fixed HVA

Message ID: 20230922150009.3319-8-paul@xen.org (mailing list archive)
State: New, archived
Series: KVM: xen: update shared_info and vcpu_info handling

Commit Message

Paul Durrant Sept. 22, 2023, 3 p.m. UTC
From: Paul Durrant <pdurrant@amazon.com>

If the guest does not explicitly set the GPA of the vcpu_info structure
in memory then, for guests with 32 vCPUs or fewer, the vcpu_info
embedded in the shared_info page may be used. As described in a
previous commit, the shared_info page is an overlay at a fixed HVA
within the VMM, so in this case it is also more efficient to activate
the vcpu_info cache with a fixed HVA, avoiding unnecessary invalidation
if the guest memory layout is modified.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org

v5:
 - New in this version.
---
 Documentation/virt/kvm/api.rst | 26 +++++++++++++++++++++-----
 arch/x86/kvm/xen.c             | 33 +++++++++++++++++++++++++++------
 include/uapi/linux/kvm.h       |  3 +++
 3 files changed, 51 insertions(+), 11 deletions(-)
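
For context, a minimal sketch of how a VMM might consume the new attribute is
shown below. It assumes the shared_info page has already been mapped at a fixed
userspace address (via the shared_info-by-HVA attribute added earlier in this
series) and that the Xen vCPU id has been set with
KVM_XEN_VCPU_ATTR_TYPE_VCPU_ID; vcpu_fd, shinfo_hva and the hard-coded 64-byte
slot size are illustrative assumptions, not anything defined by this patch.

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Sketch only: point a vCPU's vcpu_info cache at its "default" slot in
 * the shared_info overlay via the new fixed-HVA attribute. shinfo_hva is
 * the VMM's mapping of the shared_info page; in the 64-bit layout the
 * vcpu_info array occupies the start of that page in 64-byte slots (a
 * real VMM would use struct shared_info from the Xen public headers
 * rather than this hard-coded size).
 */
static int set_vcpu_info_hva(int vcpu_fd, void *shinfo_hva,
                             uint32_t xen_vcpu_id)
{
	struct kvm_xen_vcpu_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.type = KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA;
	attr.u.hva = (uint64_t)(uintptr_t)shinfo_hva + xen_vcpu_id * 64;

	return ioctl(vcpu_fd, KVM_XEN_VCPU_SET_ATTR, &attr);
}

If KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA is not advertised in the Xen
capabilities, or the vcpu_info does not reside at its default location, the
existing KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO attribute (taking a GPA) should be
used instead.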

Comments

David Woodhouse Sept. 22, 2023, 3:44 p.m. UTC | #1
On Fri, 2023-09-22 at 15:00 +0000, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> If the guest does not explicitly set the GPA of the vcpu_info structure
> in memory then, for guests with 32 vCPUs or fewer, the vcpu_info
> embedded in the shared_info page may be used. As described in a
> previous commit, the shared_info page is an overlay at a fixed HVA
> within the VMM, so in this case it is also more efficient to activate
> the vcpu_info cache with a fixed HVA, avoiding unnecessary invalidation
> if the guest memory layout is modified.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>

But it should *only* be defined as an HVA in the case where it's the
one in the shinfo. Otherwise, it's defined by its GPA.

Which almost makes me want to see a sanity check that it precisely
equals &shinfo->vcpu_info[vcpu->arch.xen.vcpu_id].

Which brings me back around the circle again to wonder why we don't
just *default* to it.... you hate me, don't you?

Your previous set of patches did that, and it did end up requiring that
the VMM restore both VCPU_INFO and VCPU_ID for each vCPU *before*
restoring the SHARED_INFO_HVA on resume, but wasn't that OK?
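
The check being suggested here might look roughly like the sketch below, placed
in the KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA branch of kvm_xen_vcpu_set_attr().
It is only an illustration of the idea: it uses the kvm_gpc_hva() helper
introduced by this series and ignores both the 32-bit compat shared_info layout
and locking of the shared_info cache.

	/* Sketch of the suggested sanity check (not part of this patch):
	 * only accept a fixed-HVA vcpu_info that aliases this vCPU's
	 * default slot in the shared_info overlay. */
	struct gfn_to_pfn_cache *sgpc = &vcpu->kvm->arch.xen.shinfo_cache;
	struct shared_info *shinfo;

	if (!sgpc->active) {
		r = -EINVAL;
		break;
	}

	shinfo = (void *)kvm_gpc_hva(sgpc);
	if (data->u.hva !=
	    (u64)(unsigned long)&shinfo->vcpu_info[vcpu->arch.xen.vcpu_id]) {
		r = -EINVAL;
		break;
	}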
Paul Durrant Sept. 22, 2023, 3:49 p.m. UTC | #2
On 22/09/2023 16:44, David Woodhouse wrote:
> On Fri, 2023-09-22 at 15:00 +0000, Paul Durrant wrote:
>> From: Paul Durrant <pdurrant@amazon.com>
>>
>> If the guest does not explicitly set the GPA of the vcpu_info structure
>> in memory then, for guests with 32 vCPUs or fewer, the vcpu_info
>> embedded in the shared_info page may be used. As described in a
>> previous commit, the shared_info page is an overlay at a fixed HVA
>> within the VMM, so in this case it is also more efficient to activate
>> the vcpu_info cache with a fixed HVA, avoiding unnecessary invalidation
>> if the guest memory layout is modified.
>>
>> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> 
> But it should *only* be defined as an HVA in the case where it's the
> one in the shinfo. Otherwise, it's defined by its GPA.
> 
> Which almost makes me want to see a sanity check that it precisely
> equals &shinfo->vcpu_info[vcpu->arch.xen.vcpu_id].
> 
> Which brings me back around the circle again to wonder why we don't
> just *default* to it.... you hate me, don't you?
> 
> Your previous set of patches did that, and it did end up requiring that
> the VMM restore both VCPU_INFO and VCPU_ID for each vCPU *before*
> restoring the SHARED_INFO_HVA on resume, but wasn't that OK?
> 

No. It was painful and overly complex, with too many corner cases that 
were making my brain hurt. Given the pain of the last week, leaving the 
default handling in the VMM is preferable. It means I don't need to
impose rules about attribute ordering and hence get caught by needing
one thread to wait for another inside the VMM. It's better and more 
robust this way.

   Paul
David Woodhouse Sept. 22, 2023, 3:52 p.m. UTC | #3
On Fri, 2023-09-22 at 16:49 +0100, Paul Durrant wrote:
> On 22/09/2023 16:44, David Woodhouse wrote:
> > On Fri, 2023-09-22 at 15:00 +0000, Paul Durrant wrote:
> > > From: Paul Durrant <pdurrant@amazon.com>
> > > 
> > > If the guest does not explicitly set the GPA of the vcpu_info structure
> > > in memory then, for guests with 32 vCPUs or fewer, the vcpu_info
> > > embedded in the shared_info page may be used. As described in a
> > > previous commit, the shared_info page is an overlay at a fixed HVA
> > > within the VMM, so in this case it is also more efficient to activate
> > > the vcpu_info cache with a fixed HVA, avoiding unnecessary invalidation
> > > if the guest memory layout is modified.
> > > 
> > > Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> > 
> > But it should *only* be defined as an HVA in the case where it's the
> > one in the shinfo. Otherwise, it's defined by its GPA.
> > 
> > Which almost makes me want to see a sanity check that it precisely
> > equals &shinfo->vcpu_info[vcpu->arch.xen.vcpu_id].
> > 
> > Which brings me back around the circle again to wonder why we don't
> > just *default* to it.... you hate me, don't you?
> > 
> > > Your previous set of patches did that, and it did end up requiring that
> > the VMM restore both VCPU_INFO and VCPU_ID for each vCPU *before*
> > restoring the SHARED_INFO_HVA on resume, but wasn't that OK?
> > 
> 
> No. It was painful and overly complex, with too many corner cases that 
> were making my brain hurt. Given the pain of the last week, leaving the 
> default handling in the VMM is preferable. It means I don't need to
> impose rules about attribute ordering and hence get caught by needing
> one thread to wait for another inside the VMM. It's better and more 
> robust this way.

Fair enough, I won't argue. That is fairly much the reason I ripped it
out of João's original patch set in the first place :)
David Woodhouse Sept. 22, 2023, 4:10 p.m. UTC | #4
On Fri, 2023-09-22 at 15:00 +0000, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> If the guest does not explicitly set the GPA of vcpu_info structure in
> memory then, for guests with 32 vCPUs or fewer, the vcpu_info embedded
> in the shared_info page may be used. As described in a previous commit,
> the shared_info page is an overlay at a fixed HVA within the VMM, so in
> this case it also more optimal to activate the vcpu_info cache with a
> fixed HVA to avoid unnecessary invalidation if the guest memory layout
> is modified.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>

Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>

Patch

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index e9df4df6fe48..5adc6dfc8c6e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5444,11 +5444,12 @@  KVM_XEN_ATTR_TYPE_SHARED_INFO
   Sets the guest physical frame number at which the Xen shared_info
   page resides. Note that although Xen places vcpu_info for the first
   32 vCPUs in the shared_info page, KVM does not automatically do so
-  and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO be used
-  explicitly even when the vcpu_info for a given vCPU resides at the
-  "default" location in the shared_info page. This is because KVM may
-  not be aware of the Xen CPU id which is used as the index into the
-  vcpu_info[] array, so may know the correct default location.
+  and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO or
+  KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA be used explicitly even when
+  the vcpu_info for a given vCPU resides at the "default" location
+  in the shared_info page. This is because KVM may not be aware of
+  the Xen CPU id which is used as the index into the vcpu_info[]
+  array, so may not know the correct default location.
 
   Note that the shared_info page may be constantly written to by KVM;
   it contains the event channel bitmap used to deliver interrupts to
@@ -5570,6 +5571,21 @@  KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
   on dirty logging. Setting the gpa to KVM_XEN_INVALID_GPA will disable
   the vcpu_info.
 
+KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA
+  If the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag is also set in the
+  Xen capabilities, then this attribute may be used to set the
+  userspace address of the vcpu_info for a given vCPU. It should
+  only be used when the vcpu_info resides at the "default" location
+  in the shared_info page. In this case it is safe to assume the
+  userspace address will not change, because the shared_info page is
+  an overlay on guest memory and remains at a fixed host address
+  regardless of where it is mapped in guest physical address space
+  and hence unnecessary invalidation of an internal cache may be
+  avoided if the guest memory layout is modified.
+  If the vcpu_info does not reside at the "default" location then
+  it is not guaranteed to remain at the same host address and
+  hence the aforementioned cache invalidation is required.
+
 KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO
   Sets the guest physical address of an additional pvclock structure
   for a given vCPU. This is typically used for guest vsyscall support.
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 1abb4547642a..aafc794940e4 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -736,20 +736,33 @@  int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 
 	switch (data->type) {
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO:
+	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA:
 		/* No compat necessary here. */
 		BUILD_BUG_ON(sizeof(struct vcpu_info) !=
 			     sizeof(struct compat_vcpu_info));
 		BUILD_BUG_ON(offsetof(struct vcpu_info, time) !=
 			     offsetof(struct compat_vcpu_info, time));
 
-		if (data->u.gpa == KVM_XEN_INVALID_GPA) {
-			kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache);
-			r = 0;
-			break;
+		if (data->type == KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO) {
+			if (data->u.gpa == KVM_XEN_INVALID_GPA) {
+				kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache);
+				r = 0;
+				break;
+			}
+
+			r = kvm_gpc_activate(&vcpu->arch.xen.vcpu_info_cache,
+					     data->u.gpa, sizeof(struct vcpu_info));
+		} else {
+			if (data->u.hva == 0) {
+				kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache);
+				r = 0;
+				break;
+			}
+
+			r = kvm_gpc_activate_hva(&vcpu->arch.xen.vcpu_info_cache,
+						 data->u.hva, sizeof(struct vcpu_info));
 		}
 
-		r = kvm_gpc_activate(&vcpu->arch.xen.vcpu_info_cache,
-				     data->u.gpa, sizeof(struct vcpu_info));
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 
@@ -978,6 +991,14 @@  int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 		r = 0;
 		break;
 
+	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA:
+		if (vcpu->arch.xen.vcpu_info_cache.active)
+			data->u.hva = kvm_gpc_hva(&vcpu->arch.xen.vcpu_info_cache);
+		else
+			data->u.hva = 0;
+		r = 0;
+		break;
+
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO:
 		if (vcpu->arch.xen.vcpu_time_info_cache.active)
 			data->u.gpa = kvm_gpc_gpa(&vcpu->arch.xen.vcpu_time_info_cache);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 062bfa14b4d9..0267c2ef43de 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1858,6 +1858,7 @@  struct kvm_xen_vcpu_attr {
 	union {
 		__u64 gpa;
 #define KVM_XEN_INVALID_GPA ((__u64)-1)
+		__u64 hva;
 		__u64 pad[8];
 		struct {
 			__u64 state;
@@ -1888,6 +1889,8 @@  struct kvm_xen_vcpu_attr {
 #define KVM_XEN_VCPU_ATTR_TYPE_VCPU_ID		0x6
 #define KVM_XEN_VCPU_ATTR_TYPE_TIMER		0x7
 #define KVM_XEN_VCPU_ATTR_TYPE_UPCALL_VECTOR	0x8
+/* Available with KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA */
+#define KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA	0x9
 
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
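
Continuing the userspace sketch above, the kvm_xen_vcpu_get_attr() change reads
the new attribute back as a zero HVA when no vcpu_info cache is active, so a
VMM can probe the current mapping along these lines (illustrative only; vcpu_fd
is a placeholder):

	struct kvm_xen_vcpu_attr attr = {
		.type = KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA,
	};

	/* attr.u.hva comes back as 0 when no vcpu_info cache is active. */
	if (!ioctl(vcpu_fd, KVM_XEN_VCPU_GET_ATTR, &attr) && attr.u.hva)
		printf("vcpu_info backed at HVA %#llx\n",
		       (unsigned long long)attr.u.hva);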