[3/6] KVM: MMU: Explicitly set D-bit for writable spte.

Message ID 1422413668-3509-4-git-send-email-kai.huang@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Kai Huang Jan. 28, 2015, 2:54 a.m. UTC
This patch avoids unnecessary dirty GPA logging to the PML buffer in the EPT
violation path by setting the D-bit manually before the write from the guest
occurs.
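
For reference, the hardware behaviour this relies on can be modelled roughly
as below (a conceptual sketch with hypothetical names, not actual VMX or KVM
code; u64 is the kernel type from <linux/types.h>): the CPU only logs a GPA
to the PML buffer when it is the one transitioning the EPT D-bit from 0 to 1,
so pre-setting the bit in software means the later guest write produces no
PML entry.

	/* Hypothetical model of PML logging on a guest write; bit 9 assumed as the EPT D-bit. */
	#define EPT_DIRTY_BIT	(1ull << 9)

	static void pml_hw_model(u64 *spte, u64 gpa, u64 *pml_buf, int *pml_index)
	{
		if (*spte & EPT_DIRTY_BIT)
			return;			/* D-bit already set: nothing is logged */

		*spte |= EPT_DIRTY_BIT;		/* CPU sets the D-bit itself ... */
		pml_buf[(*pml_index)--] = gpa;	/* ... and logs the GPA to the PML buffer */
	}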

We only set the D-bit manually in set_spte, and leave the fast_page_fault path
unchanged, as fast_page_fault is very unlikely to happen when PML is enabled.

For the hva <-> pa change case, the spte is updated to either read-only (host
pte is read-only) or dropped (host pte is writeable). Both cases are handled
by the set_spte change above, so no further change is necessary.
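
Roughly, that refault path can be sketched as below (a hypothetical helper for
illustration only, assuming the mmu.c definitions of PT_WRITABLE_MASK and
shadow_dirty_mask; the real path goes through the normal EPT violation handler
into set_spte): once the old spte is dropped or made read-only, the next guest
write faults and the newly installed writable spte gets its D-bit pre-set by
this patch.

	/* Hypothetical sketch: installing a fresh spte after a host pte change. */
	static void reinstall_spte_after_host_change(u64 *sptep, u64 new_spte)
	{
		if (new_spte & PT_WRITABLE_MASK)
			new_spte |= shadow_dirty_mask;	/* pre-set D-bit, as in set_spte */

		*sptep = new_spte;	/* the later guest write is not PML-logged */
	}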

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
---
 arch/x86/kvm/mmu.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

Patch

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c438224..fb35535 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2597,8 +2597,14 @@  static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		}
 	}
 
-	if (pte_access & ACC_WRITE_MASK)
+	if (pte_access & ACC_WRITE_MASK) {
 		mark_page_dirty(vcpu->kvm, gfn);
+		/*
+		 * Explicitly set the dirty bit. This eliminates unnecessary
+		 * dirty GPA logging when PML is enabled on VMX.
+		 */
+		spte |= shadow_dirty_mask;
+	}
 
 set_pte:
 	if (mmu_spte_update(sptep, spte))
@@ -2914,6 +2920,16 @@  fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	 */
 	gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
 
+	/*
+	 * Theoretically we could also set the dirty bit (and flush the TLB)
+	 * here to eliminate unnecessary PML logging; see the comment in
+	 * set_spte. But since fast_page_fault is very unlikely to happen
+	 * when PML is enabled, we leave it unchanged. This might result in
+	 * the same GPA being logged in the PML buffer again when the write
+	 * really happens, and in mark_page_dirty eventually being called
+	 * twice, but that does no harm. It also avoids the TLB flush needed
+	 * after setting the dirty bit, so non-PML cases won't be impacted.
+	 */
 	if (cmpxchg64(sptep, spte, spte | PT_WRITABLE_MASK) == spte)
 		mark_page_dirty(vcpu->kvm, gfn);