From patchwork Thu Dec 22 04:29:32 2016
X-Patchwork-Submitter: Junaid Shahid <junaids@google.com>
X-Patchwork-Id: 9484245
From: Junaid Shahid <junaids@google.com>
To: kvm@vger.kernel.org
Cc: andreslc@google.com, pfeiner@google.com, pbonzini@redhat.com,
    guangrong.xiao@linux.intel.com
Subject: [PATCH 5/5] kvm: x86: mmu: Verify that restored PTE has needed perms
 in fast page fault
Date: Wed, 21 Dec 2016 20:29:32 -0800
Message-Id: <1482380972-25573-6-git-send-email-junaids@google.com>
In-Reply-To: <1482380972-25573-1-git-send-email-junaids@google.com>
References: <1482380972-25573-1-git-send-email-junaids@google.com>

Before fast page fault restores an access-track PTE back to a regular PTE,
it now also verifies that the restored PTE would grant the necessary
permissions for the faulting access to succeed. If not, it falls back to
the slow page fault path.

Signed-off-by: Junaid Shahid <junaids@google.com>
---
 arch/x86/kvm/mmu.c | 132 ++++++++++++++++++++++++++++-------------------------
 1 file changed, 70 insertions(+), 62 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 49a7fe4..1317d1e 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -372,6 +372,11 @@ static int is_last_spte(u64 pte, int level)
         return 0;
 }
 
+static bool is_executable_pte(u64 spte)
+{
+        return (spte & (shadow_x_mask | shadow_nx_mask)) == shadow_x_mask;
+}
+
 static kvm_pfn_t spte_to_pfn(u64 pte)
 {
         return (pte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
@@ -727,6 +732,23 @@ static u64 mark_spte_for_access_track(u64 spte)
         return spte;
 }
 
+/* Restore an acc-track PTE back to a regular PTE */
+static u64 restore_acc_track_spte(u64 spte)
+{
+        u64 new_spte = spte;
+        u64 saved_bits = (spte >> shadow_acc_track_saved_bits_shift)
+                         & shadow_acc_track_saved_bits_mask;
+
+        WARN_ON_ONCE(!is_access_track_spte(spte));
+
+        new_spte &= ~shadow_acc_track_mask;
+        new_spte &= ~(shadow_acc_track_saved_bits_mask <<
+                      shadow_acc_track_saved_bits_shift);
+        new_spte |= saved_bits;
+
+        return new_spte;
+}
+
 /* Returns the Accessed status of the PTE and resets it at the same time. */
 static bool mmu_spte_age(u64 *sptep)
 {
@@ -3018,27 +3040,12 @@ static bool page_fault_can_be_fast(u32 error_code)
  */
 static bool
 fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
-                        u64 *sptep, u64 old_spte,
-                        bool remove_write_prot, bool remove_acc_track)
+                        u64 *sptep, u64 old_spte, u64 new_spte)
 {
         gfn_t gfn;
-        u64 new_spte = old_spte;
 
         WARN_ON(!sp->role.direct);
 
-        if (remove_acc_track) {
-                u64 saved_bits = (old_spte >> shadow_acc_track_saved_bits_shift)
-                                 & shadow_acc_track_saved_bits_mask;
-
-                new_spte &= ~shadow_acc_track_mask;
-                new_spte &= ~(shadow_acc_track_saved_bits_mask <<
-                              shadow_acc_track_saved_bits_shift);
-                new_spte |= saved_bits;
-        }
-
-        if (remove_write_prot)
-                new_spte |= PT_WRITABLE_MASK;
-
         /*
          * Theoretically we could also set dirty bit (and flush TLB) here in
          * order to eliminate unnecessary PML logging. See comments in
@@ -3054,7 +3061,7 @@ fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
         if (cmpxchg64(sptep, old_spte, new_spte) != old_spte)
                 return false;
 
-        if (remove_write_prot) {
+        if (is_writable_pte(new_spte) && !is_writable_pte(old_spte)) {
                 /*
                  * The gfn of direct spte is stable since it is
                  * calculated by sp->gfn.
@@ -3066,6 +3073,18 @@ fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
         return true;
 }
 
+static bool is_access_allowed(u32 fault_err_code, u64 spte)
+{
+        if (fault_err_code & PFERR_FETCH_MASK)
+                return is_executable_pte(spte);
+
+        if (fault_err_code & PFERR_WRITE_MASK)
+                return is_writable_pte(spte);
+
+        /* Fault was on Read access */
+        return spte & PT_PRESENT_MASK;
+}
+
 /*
  * Return value:
  * - true: let the vcpu to access on the same address again.
@@ -3089,8 +3108,7 @@ static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
         walk_shadow_page_lockless_begin(vcpu);
 
         do {
-                bool remove_write_prot = false;
-                bool remove_acc_track;
+                u64 new_spte;
 
                 for_each_shadow_entry_lockless(vcpu, gva, iterator, spte)
                         if (!is_shadow_present_pte(spte) ||
@@ -3111,52 +3129,43 @@ static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
                  * Need not check the access of upper level table entries since
                  * they are always ACC_ALL.
                  */
-
-                if (error_code & PFERR_FETCH_MASK) {
-                        if ((spte & (shadow_x_mask | shadow_nx_mask))
-                            == shadow_x_mask) {
-                                fault_handled = true;
-                                break;
-                        }
-                } else if (error_code & PFERR_WRITE_MASK) {
-                        if (is_writable_pte(spte)) {
-                                fault_handled = true;
-                                break;
-                        }
-
-                        /*
-                         * Currently, to simplify the code, write-protection can
-                         * be removed in the fast path only if the SPTE was
-                         * write-protected for dirty-logging.
-                         */
-                        remove_write_prot =
-                                spte_can_locklessly_be_made_writable(spte);
-                } else {
-                        /* Fault was on Read access */
-                        if (spte & PT_PRESENT_MASK) {
-                                fault_handled = true;
-                                break;
-                        }
+                if (is_access_allowed(error_code, spte)) {
+                        fault_handled = true;
+                        break;
                 }
 
-                remove_acc_track = is_access_track_spte(spte);
-
-                /* Verify that the fault can be handled in the fast path */
-                if (!remove_acc_track && !remove_write_prot)
-                        break;
+                new_spte = is_access_track_spte(spte)
+                           ? restore_acc_track_spte(spte)
+                           : spte;
 
                 /*
-                 * Do not fix write-permission on the large spte since we only
-                 * dirty the first page into the dirty-bitmap in
-                 * fast_pf_fix_direct_spte() that means other pages are missed
-                 * if its slot is dirty-logged.
-                 *
-                 * Instead, we let the slow page fault path create a normal spte
-                 * to fix the access.
-                 *
-                 * See the comments in kvm_arch_commit_memory_region().
+                 * Currently, to simplify the code, write-protection can
+                 * be removed in the fast path only if the SPTE was
+                 * write-protected for dirty-logging or access tracking.
                  */
-                if (sp->role.level > PT_PAGE_TABLE_LEVEL && remove_write_prot)
+                if ((error_code & PFERR_WRITE_MASK) &&
+                    spte_can_locklessly_be_made_writable(spte))
+                {
+                        new_spte |= PT_WRITABLE_MASK;
+
+                        /*
+                         * Do not fix write-permission on the large spte since
+                         * we only dirty the first page into the dirty-bitmap in
+                         * fast_pf_fix_direct_spte() that means other pages are
+                         * missed if its slot is dirty-logged.
+                         *
+                         * Instead, we let the slow page fault path create a
+                         * normal spte to fix the access.
+                         *
+                         * See the comments in kvm_arch_commit_memory_region().
+                         */
+                        if (sp->role.level > PT_PAGE_TABLE_LEVEL)
+                                break;
+                }
+
+                /* Verify that the fault can be handled in the fast path */
+                if (new_spte == spte ||
+                    !is_access_allowed(error_code, new_spte))
                         break;
 
                 /*
@@ -3166,8 +3175,7 @@ static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
                  */
                 fault_handled = fast_pf_fix_direct_spte(vcpu, sp,
                                                         iterator.sptep, spte,
-                                                        remove_write_prot,
-                                                        remove_acc_track);
+                                                        new_spte);
                 if (fault_handled)
                         break;
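[Editor's note] For readers unfamiliar with the KVM MMU fast path, the
standalone sketch below models the rule this patch introduces: build a
candidate SPTE (restored from access tracking and/or made writable), and only
commit it if it actually grants the permission the fault needs; otherwise
fall back to the slow path. The PT_* and X/NX mask bit positions and the
lockless_writable flag are made-up placeholders, not the kernel's actual
definitions (in KVM the shadow_* masks are configured at runtime).

/*
 * Illustrative sketch only, not kernel code.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PT_PRESENT_MASK   (1ull << 0)   /* placeholder bit */
#define PT_WRITABLE_MASK  (1ull << 1)   /* placeholder bit */
#define X_MASK            (1ull << 2)   /* placeholder for shadow_x_mask */
#define NX_MASK           (1ull << 63)  /* placeholder for shadow_nx_mask */

#define PFERR_WRITE_MASK  (1u << 1)     /* x86 #PF error code: write */
#define PFERR_FETCH_MASK  (1u << 4)     /* x86 #PF error code: instruction fetch */

static bool is_executable(uint64_t spte)
{
        return (spte & (X_MASK | NX_MASK)) == X_MASK;
}

/* Mirrors is_access_allowed(): does this SPTE already satisfy the fault? */
static bool access_allowed(uint32_t error_code, uint64_t spte)
{
        if (error_code & PFERR_FETCH_MASK)
                return is_executable(spte);
        if (error_code & PFERR_WRITE_MASK)
                return spte & PT_WRITABLE_MASK;
        /* read fault */
        return spte & PT_PRESENT_MASK;
}

int main(void)
{
        /* A present, executable, read-only SPTE takes a write fault. */
        uint64_t spte = PT_PRESENT_MASK | X_MASK;
        uint32_t error_code = PFERR_WRITE_MASK;

        /* Stand-in for restore_acc_track_spte(); assume no acc-track bits set. */
        uint64_t new_spte = spte;

        /* Stand-in for spte_can_locklessly_be_made_writable(spte). */
        bool lockless_writable = false;

        if ((error_code & PFERR_WRITE_MASK) && lockless_writable)
                new_spte |= PT_WRITABLE_MASK;

        /*
         * The key check added by the patch: if the candidate SPTE is
         * unchanged, or still would not allow the faulting access, the
         * fast path must not install it.
         */
        if (new_spte == spte || !access_allowed(error_code, new_spte))
                printf("fall back to the slow page fault path\n");
        else
                printf("commit new_spte with cmpxchg64 in the fast path\n");

        return 0;
}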