
[4/5] KVM: x86: Ensure a full memory barrier is emitted in the VM-Exit path

Message ID 20240309010929.1403984-5-seanjc@google.com (mailing list archive)
State New, archived
Series KVM: VMX: Drop MTRR virtualization, honor guest PAT

Commit Message

Sean Christopherson March 9, 2024, 1:09 a.m. UTC
From: Yan Zhao <yan.y.zhao@intel.com>

Ensure a full memory barrier is emitted in the VM-Exit path, as a full
barrier is required on Intel CPUs to evict WC buffers.  This will allow
unconditionally honoring guest PAT on Intel CPUs that support self-snoop.

As srcu_read_lock() is always called in the VM-Exit path and it internally
has an smp_mb(), call smp_mb__after_srcu_read_lock() to avoid adding a
second fence, while still guaranteeing the smp_mb() without depending on
the implementation details of srcu_read_lock().

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 6 ++++++
 1 file changed, 6 insertions(+)
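
For context, smp_mb__after_srcu_read_lock() is a documentation-style helper:
because the Tree SRCU flavor of srcu_read_lock() already executes a full
smp_mb() internally, the helper itself can be (and currently is) a no-op.
A sketch of its definition, paraphrased from include/linux/srcu.h (comments
trimmed; exact wording varies by kernel version):

	/*
	 * Converts the preceding srcu_read_lock() into a two-way memory
	 * barrier, so that everything after this call appears to happen
	 * after the srcu_read_lock().
	 */
	static inline void smp_mb__after_srcu_read_lock(void)
	{
		/* __srcu_read_lock has smp_mb() internally so nothing to do here. */
	}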

Comments

Paolo Bonzini June 20, 2024, 10:38 p.m. UTC | #1
On 3/9/24 02:09, Sean Christopherson wrote:
> From: Yan Zhao <yan.y.zhao@intel.com>
> 
> Ensure a full memory barrier is emitted in the VM-Exit path, as a full
> barrier is required on Intel CPUs to evict WC buffers.  This will allow
> unconditionally honoring guest PAT on Intel CPUs that support self-snoop.
> 
> As srcu_read_lock() is always called in the VM-Exit path and it internally
> has an smp_mb(), call smp_mb__after_srcu_read_lock() to avoid adding a
> second fence, while still guaranteeing the smp_mb() without depending on
> the implementation details of srcu_read_lock().

Do you really need mfence or is a locked operation enough?  mfence is 
mb(), not smp_mb().

Paolo

> +	/*
> +	 * Call this to ensure WC buffers in the guest are evicted after each
> +	 * VM-Exit, so that the evicted WC writes can be snooped across all CPUs.
> +	 */
> +	smp_mb__after_srcu_read_lock();
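
For reference, x86 defines the two barriers differently: mb() is an MFENCE,
while smp_mb() is a locked read-modify-write on the stack, which is cheaper
and is also one of the WC-buffer-evicting events the SDM lists (quoted later
in this thread). A simplified sketch based on arch/x86/include/asm/barrier.h
(macro names and details vary by kernel version):

	#define __mb()		asm volatile("mfence" ::: "memory")
	/* _ASM_SP expands to the stack pointer register (rsp/esp) */
	#define __smp_mb()	asm volatile("lock; addl $0,-4(%%" _ASM_SP ")" \
					     ::: "memory", "cc")
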
Paul E. McKenney June 20, 2024, 11:42 p.m. UTC | #2
On Fri, Jun 21, 2024 at 12:38:21AM +0200, Paolo Bonzini wrote:
> On 3/9/24 02:09, Sean Christopherson wrote:
> > From: Yan Zhao <yan.y.zhao@intel.com>
> > 
> > Ensure a full memory barrier is emitted in the VM-Exit path, as a full
> > barrier is required on Intel CPUs to evict WC buffers.  This will allow
> > unconditionally honoring guest PAT on Intel CPUs that support self-snoop.
> > 
> > As srcu_read_lock() is always called in the VM-Exit path and it internally
> > has an smp_mb(), call smp_mb__after_srcu_read_lock() to avoid adding a
> > second fence, while still guaranteeing the smp_mb() without depending on
> > the implementation details of srcu_read_lock().
> 
> Do you really need mfence or is a locked operation enough?  mfence is mb(),
> not smp_mb().

We only need smp_mb(), which is supplied by the srcu_read_lock()
function.  For now, anyway.  If we ever figure out how to get by with
lighter-weight ordering for srcu_read_lock(), then we will add an smp_mb()
to smp_mb__after_srcu_read_lock() to compensate.

							Thanx, Paul

> Paolo
> 
> > +	/*
> > +	 * Call this to ensure WC buffers in the guest are evicted after each
> > +	 * VM-Exit, so that the evicted WC writes can be snooped across all CPUs.
> > +	 */
> > +	smp_mb__after_srcu_read_lock();
>
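
The smp_mb() Paul refers to lives in the Tree SRCU read-lock path; a
simplified sketch based on kernel/rcu/srcutree.c (field names differ across
kernel versions):

	int __srcu_read_lock(struct srcu_struct *ssp)
	{
		int idx;

		idx = READ_ONCE(ssp->srcu_idx) & 0x1;
		this_cpu_inc(ssp->sda->srcu_lock_count[idx].counter);
		smp_mb(); /* B */  /* Avoid leaking the critical section. */
		return idx;
	}
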
Yan Zhao June 21, 2024, 12:52 a.m. UTC | #3
On Fri, Jun 21, 2024 at 12:38:21AM +0200, Paolo Bonzini wrote:
> On 3/9/24 02:09, Sean Christopherson wrote:
> > From: Yan Zhao <yan.y.zhao@intel.com>
> > 
> > Ensure a full memory barrier is emitted in the VM-Exit path, as a full
> > barrier is required on Intel CPUs to evict WC buffers.  This will allow
> > unconditionally honoring guest PAT on Intel CPUs that support self-snoop.
> > 
> > As srcu_read_lock() is always called in the VM-Exit path and it internally
> > has an smp_mb(), call smp_mb__after_srcu_read_lock() to avoid adding a
> > second fence, while still guaranteeing the smp_mb() without depending on
> > the implementation details of srcu_read_lock().
> 
> Do you really need mfence or is a locked operation enough?  mfence is mb(),
> not smp_mb().
> 
A locked operation should be enough, since the barrier here is to evict
partially filled WC buffers.  Per the Intel SDM:

"
If the WC buffer is partially filled, the writes may be delayed until the next
occurrence of a serializing event; such as an SFENCE or MFENCE instruction,
CPUID or other serializing instruction, a read or write to uncached memory, an
interrupt occurrence, or an execution of a LOCK instruction (including one with
an XACQUIRE or XRELEASE prefix).
"

> 
> > +	/*
> > +	 * Call this to ensure WC buffers in the guest are evicted after each
> > +	 * VM-Exit, so that the evicted WC writes can be snooped across all CPUs.
> > +	 */
> > +	smp_mb__after_srcu_read_lock();
>
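
As an illustration of the eviction events quoted above, a hypothetical driver
writing through a WC mapping typically drains the buffers explicitly before
notifying another agent. This sketch is illustrative only; the device, the
doorbell register, and push_desc() are made up:

	#include <linux/io.h>

	/* Hypothetical: copy a descriptor into a WC-mapped BAR, then ring a doorbell. */
	static void push_desc(void __iomem *wc_buf, void __iomem *doorbell,
			      const void *desc, size_t len)
	{
		memcpy_toio(wc_buf, desc, len);
		/*
		 * Portable write barrier; an SFENCE on x86, which is one of
		 * the WC-buffer-evicting events listed in the SDM.  (The UC
		 * doorbell write below is itself another such event.)
		 */
		wmb();
		writel(1, doorbell);	/* UC write; observes the drained WC data */
	}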

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 276ae56dd888..69e815df1699 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11082,6 +11082,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 	kvm_vcpu_srcu_read_lock(vcpu);
 
+	/*
+	 * Call this to ensure WC buffers in the guest are evicted after each
+	 * VM-Exit, so that the evicted WC writes can be snooped across all CPUs.
+	 */
+	smp_mb__after_srcu_read_lock();
+
 	/*
 	 * Profile KVM exit RIPs:
 	 */