From patchwork Tue Sep 18 22:01:27 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christoffer Dall <c.dall@virtualopensystems.com>
X-Patchwork-Id: 1474871
Return-Path: <kvm-owner@vger.kernel.org>
X-Original-To: patchwork-kvm@patchwork.kernel.org
Delivered-To: patchwork-process-083081@patchwork1.kernel.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by patchwork1.kernel.org (Postfix) with ESMTP id 66E3A400EC
	for <patchwork-kvm@patchwork.kernel.org>;
	Tue, 18 Sep 2012 22:01:34 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755630Ab2IRWB3 (ORCPT
	<rfc822;patchwork-kvm@patchwork.kernel.org>);
	Tue, 18 Sep 2012 18:01:29 -0400
Received: from mail-vc0-f174.google.com ([209.85.220.174]:50316 "EHLO
	mail-vc0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1755552Ab2IRWB2 (ORCPT <rfc822; kvm@vger.kernel.org>);
	Tue, 18 Sep 2012 18:01:28 -0400
Received: by vcbfo13 with SMTP id fo13so475061vcb.19
	for <kvm@vger.kernel.org>; Tue, 18 Sep 2012 15:01:27 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=google.com; s=20120113;
	h=mime-version:x-originating-ip:in-reply-to:references:date
	:message-id:subject:from:to:cc:content-type:x-gm-message-state;
	bh=t2gb5QOVgjUu5nWjocCT9LT2OL4GPMtpmsP8Q/rQdsM=;
	b=g9Kik2Uufj1jOimrEu2Cef0wLHtTTwSEJvsaHrDGbbXYyBW6Lb2ynNd2mI+XJ6irvx
	EKMyyWV78msvX7tzhOgXkET4KEQKxs0E+LmzDyr9zXcU4nGFlU7bVPmaYSGKrMOzWwjg
	XRzfuZemSXPWduNvNjoTBFiHDMyZC2D2/0ijDYQ7c9I98hmqn0+PdM0F4SaR2k8ynOWE
	GgF4ODu7yTerTdgoud6pZnb703E5mwqJ0iNRnU6J+vXxUep4xkJUSMiNMMJ70d0sxUXl
	L2rakb6dC1wbHPxd30HfXLyES0BeLUky0H6iivKpzttF6ZLejRbgl7L6e8agYQOfAIE1
	r6hg==
MIME-Version: 1.0
Received: by 10.220.239.209 with SMTP id kx17mr802105vcb.41.1348005687513;
	Tue, 18 Sep 2012 15:01:27 -0700 (PDT)
Received: by 10.58.127.97 with HTTP; Tue, 18 Sep 2012 15:01:27 -0700 (PDT)
X-Originating-IP: [128.59.22.176]
In-Reply-To: <20120918124724.GK32204@mudshark.cambridge.arm.com>
References: <20120915153359.21241.86002.stgit@ubuntu>
	<20120915153443.21241.37958.stgit@ubuntu>
	<20120918124724.GK32204@mudshark.cambridge.arm.com>
Date: Tue, 18 Sep 2012 18:01:27 -0400
Message-ID: 
 <CANM98qJZBYjBh5Jqu4H_oiS+4vgVi3PNcOUP1prUApOTmhoqpQ@mail.gmail.com>
Subject: Re: [PATCH 02/15] ARM: Add page table and page defines needed by KVM
From: Christoffer Dall <c.dall@virtualopensystems.com>
To: Will Deacon <will.deacon@arm.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>
X-Gm-Message-State: 
 ALoCoQnU/IRLJeumpeUFA10SrGzx/bd3bzlATrbDkbp1yjhA2YSvLGpYT2kWu/z+Djww+qGZh/H/
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

On Tue, Sep 18, 2012 at 8:47 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Sat, Sep 15, 2012 at 04:34:43PM +0100, Christoffer Dall wrote:
>> KVM uses the stage-2 page tables and the Hyp page table format,
>> so let's define the fields we need to access in KVM.
>>
>> We use pgprot_guest to indicate stage-2 entries.
>>
>> Christoffer Dall <c.dall@virtualopensystems.com>
>> ---
>>  arch/arm/include/asm/pgtable-3level.h |   13 +++++++++++++
>>  arch/arm/include/asm/pgtable.h        |    5 +++++
>>  arch/arm/mm/mmu.c                     |    3 +++
>>  3 files changed, 21 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
>> index b249035..7351eee 100644
>> --- a/arch/arm/include/asm/pgtable-3level.h
>> +++ b/arch/arm/include/asm/pgtable-3level.h
>> @@ -102,11 +102,24 @@
>>   */
>>  #define L_PGD_SWAPPER                (_AT(pgdval_t, 1) << 55)        /* swapper_pg_dir entry */
>>
>> +/*
>> + * 2-nd stage PTE definitions for LPAE.
>> + */
>
> Minor nit: 2nd
>
>> +#define L_PTE2_SHARED                L_PTE_SHARED
>> +#define L_PTE2_READ          (_AT(pteval_t, 1) << 6) /* HAP[0] */
>> +#define L_PTE2_WRITE         (_AT(pteval_t, 1) << 7) /* HAP[1] */
>
> This is actually HAP[2:1], not HAP[1:0]. Also, can you follow what we do for
> stage 1 translation and name these RDONLY and WRONLY (do you even use
> that?).
>

The ARM arm is actually ambiguous, B3-1335 defines it as HAP[2:1], but
B3-1355 defines it as HAP[1:0] and I chose the latter as it is more
clear to most people not knowing that for historical reasons {H}AP[0]
is not defined. If there's a consensus for the other choice here, then
I'm good with that.

Also, these bits have a different meaning for stage-2, HAP[2] (ok, in
this case it's less misleading with bit index 2), HAP[2] actually
means you can write this, two clear bits means access denied, not
read/write, so it made more sense to me to do:

prot = READ | WRITE;

than

prt = RDONLY | WRONLY; // (not mutually exclusive). See my point?

>> +#define L_PTE2_NORM_WB               (_AT(pteval_t, 3) << 4) /* MemAttr[3:2] */
>> +#define L_PTE2_INNER_WB              (_AT(pteval_t, 3) << 2) /* MemAttr[1:0] */
>
> Another minor nit: PTE2 looks awful. Maybe L_PTE_HYP_* instead?
>
>>  #ifndef __ASSEMBLY__
>>
>>  #define pud_none(pud)                (!pud_val(pud))
>>  #define pud_bad(pud)         (!(pud_val(pud) & 2))
>>  #define pud_present(pud)     (pud_val(pud))
>> +#define pmd_table(pmd)               ((pmd_val(pmd) & PMD_TYPE_MASK) == \
>> +                                              PMD_TYPE_TABLE)
>> +#define pmd_sect(pmd)                ((pmd_val(pmd) & PMD_TYPE_MASK) == \
>> +                                              PMD_TYPE_SECT)
>>
>>  #define pud_clear(pudp)                      \
>>       do {                            \
>> diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
>> index 41dc31f..c422f62 100644
>> --- a/arch/arm/include/asm/pgtable.h
>> +++ b/arch/arm/include/asm/pgtable.h
>> @@ -70,6 +70,7 @@ extern void __pgd_error(const char *file, int line, pgd_t);
>>
>>  extern pgprot_t              pgprot_user;
>>  extern pgprot_t              pgprot_kernel;
>> +extern pgprot_t              pgprot_guest;
>>
>>  #define _MOD_PROT(p, b)      __pgprot(pgprot_val(p) | (b))
>>
>> @@ -82,6 +83,10 @@ extern pgprot_t            pgprot_kernel;
>>  #define PAGE_READONLY_EXEC   _MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_RDONLY)
>>  #define PAGE_KERNEL          _MOD_PROT(pgprot_kernel, L_PTE_XN)
>>  #define PAGE_KERNEL_EXEC     pgprot_kernel
>> +#define PAGE_HYP             _MOD_PROT(pgprot_kernel, L_PTE_USER)
>
> Just define L_PTE_HYP to L_PTE_USER, otherwise that's confusing.
>
>> +#define PAGE_KVM_GUEST               _MOD_PROT(pgprot_guest, L_PTE2_READ | \
>> +                                       L_PTE2_NORM_WB | L_PTE2_INNER_WB | \
>> +                                       L_PTE2_SHARED)
>
> It would be cleaner to separate the cacheability attributes out from here
> and into the cache_policies array. Then you just need L_PTE_HYP_RDONLY here.
>

ok, below is an attempt to rework all this, comments please:
---
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/arch/arm/include/asm/pgtable-3level.h
b/arch/arm/include/asm/pgtable-3level.h
index 7351eee..6df235c 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -103,13 +103,19 @@
 #define L_PGD_SWAPPER		(_AT(pgdval_t, 1) << 55)	/* swapper_pg_dir entry */

 /*
- * 2-nd stage PTE definitions for LPAE.
+ * 2nd stage PTE definitions for LPAE.
  */
-#define L_PTE2_SHARED		L_PTE_SHARED
-#define L_PTE2_READ		(_AT(pteval_t, 1) << 6)	/* HAP[0] */
-#define L_PTE2_WRITE		(_AT(pteval_t, 1) << 7)	/* HAP[1] */
-#define L_PTE2_NORM_WB		(_AT(pteval_t, 3) << 4)	/* MemAttr[3:2] */
-#define L_PTE2_INNER_WB		(_AT(pteval_t, 3) << 2)	/* MemAttr[1:0] */
+#define L_PTE_S2_SHARED		L_PTE_SHARED
+#define L_PTE_S2_READ		(_AT(pteval_t, 1) << 6)	  /* HAP[1] */
+#define L_PTE_S2_WRITE		(_AT(pteval_t, 1) << 7)	  /* HAP[2] */
+#define L_PTE_S2_MT_UNCACHED	(_AT(pteval_t, 0x5) << 2) /* MemAttr[3:0] */
+#define L_PTE_S2_MT_WRTHROUGH	(_AT(pteval_t, 0xa) << 2) /* MemAttr[3:0] */
+#define L_PTE_S2_MT_WRBACK	(_AT(pteval_t, 0xf) << 2) /* MemAttr[3:0] */
+
+/*
+ * Hyp-mode PL2 PTE definitions for LPAE.
+ */
+#define L_PTE_HYP		L_PTE_USER

 #ifndef __ASSEMBLY__

diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index c422f62..6ab276b 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -83,10 +83,8 @@ extern pgprot_t		pgprot_guest;
 #define PAGE_READONLY_EXEC	_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_RDONLY)
 #define PAGE_KERNEL		_MOD_PROT(pgprot_kernel, L_PTE_XN)
 #define PAGE_KERNEL_EXEC	pgprot_kernel
-#define PAGE_HYP		_MOD_PROT(pgprot_kernel, L_PTE_USER)
-#define PAGE_KVM_GUEST		_MOD_PROT(pgprot_guest, L_PTE2_READ | \
-					  L_PTE2_NORM_WB | L_PTE2_INNER_WB | \
-					  L_PTE2_SHARED)
+#define PAGE_HYP		_MOD_PROT(pgprot_kernel, L_PTE_HYP)
+#define PAGE_KVM_GUEST		_MOD_PROT(pgprot_guest, L_PTE_S2_READ)

 #define __PAGE_NONE		__pgprot(_L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN)
 #define __PAGE_SHARED		__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_XN)
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 82d0edf..3ff427b 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -476,7 +476,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm,
phys_addr_t guest_ipa,

 	end = (guest_ipa + size + PAGE_SIZE - 1) & PAGE_MASK;
 	prot = __pgprot(get_mem_type_prot_pte(MT_DEVICE) | L_PTE_USER |
-			L_PTE2_READ | L_PTE2_WRITE);
+			L_PTE_S2_READ | L_PTE_S2_WRITE);
 	pfn = __phys_to_pfn(pa);

 	for (addr = guest_ipa; addr < end; addr += PAGE_SIZE) {
@@ -567,7 +567,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu,
phys_addr_t fault_ipa,
 		goto out;
 	new_pte = pfn_pte(pfn, PAGE_KVM_GUEST);
 	if (writable)
-		pte_val(new_pte) |= L_PTE2_WRITE;
+		pte_val(new_pte) |= L_PTE_S2_WRITE;
 	coherent_icache_guest_page(vcpu->kvm, gfn);

 	spin_lock(&vcpu->kvm->arch.pgd_lock);
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index f2b6287..a06f3496 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -67,6 +67,7 @@ struct cachepolicy {
 	unsigned int	cr_mask;
 	pmdval_t	pmd;
 	pteval_t	pte;
+	pteval_t	pte_s2;
 };

 static struct cachepolicy cache_policies[] __initdata = {
@@ -75,26 +76,31 @@ static struct cachepolicy cache_policies[] __initdata = {
 		.cr_mask	= CR_W|CR_C,
 		.pmd		= PMD_SECT_UNCACHED,
 		.pte		= L_PTE_MT_UNCACHED,
+		.pte_s2		= L_PTE_S2_MT_UNCACHED,
 	}, {
 		.policy		= "buffered",
 		.cr_mask	= CR_C,
 		.pmd		= PMD_SECT_BUFFERED,
 		.pte		= L_PTE_MT_BUFFERABLE,
+		.pte_s2		= L_PTE_S2_MT_UNCACHED,
 	}, {
 		.policy		= "writethrough",
 		.cr_mask	= 0,
 		.pmd		= PMD_SECT_WT,
 		.pte		= L_PTE_MT_WRITETHROUGH,
+		.pte_s2		= L_PTE_S2_MT_WRTHROUGH,
 	}, {
 		.policy		= "writeback",
 		.cr_mask	= 0,
 		.pmd		= PMD_SECT_WB,
 		.pte		= L_PTE_MT_WRITEBACK,
+		.pte_s2		= L_PTE_S2_MT_WRBACK,
 	}, {
 		.policy		= "writealloc",
 		.cr_mask	= 0,
 		.pmd		= PMD_SECT_WBWA,
 		.pte		= L_PTE_MT_WRITEALLOC,
+		.pte_s2		= L_PTE_S2_MT_WRBACK,
 	}
 };

@@ -318,7 +324,7 @@ static void __init build_mem_type_table(void)
 {
 	struct cachepolicy *cp;
 	unsigned int cr = get_cr();
-	pteval_t user_pgprot, kern_pgprot, vecs_pgprot;
+	pteval_t user_pgprot, kern_pgprot, vecs_pgprot, guest_pgprot;
 	int cpu_arch = cpu_architecture();
 	int i;

@@ -430,6 +436,7 @@ static void __init build_mem_type_table(void)
 	 */
 	cp = &cache_policies[cachepolicy];
 	vecs_pgprot = kern_pgprot = user_pgprot = cp->pte;
+	guest_pgprot = cp->pte_s2;

 	/*
 	 * Enable CPU-specific coherency if supported.
@@ -464,6 +471,7 @@ static void __init build_mem_type_table(void)
 			user_pgprot |= L_PTE_SHARED;
 			kern_pgprot |= L_PTE_SHARED;
 			vecs_pgprot |= L_PTE_SHARED;
+			guest_pgprot |= L_PTE_SHARED;
 			mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_S;
 			mem_types[MT_DEVICE_WC].prot_pte |= L_PTE_SHARED;
 			mem_types[MT_DEVICE_CACHED].prot_sect |= PMD_SECT_S;
@@ -518,7 +526,7 @@ static void __init build_mem_type_table(void)
 	pgprot_user   = __pgprot(L_PTE_PRESENT | L_PTE_YOUNG | user_pgprot);
 	pgprot_kernel = __pgprot(L_PTE_PRESENT | L_PTE_YOUNG |
 				 L_PTE_DIRTY | kern_pgprot);
-	pgprot_guest  = __pgprot(L_PTE_PRESENT | L_PTE_YOUNG);
+	pgprot_guest  = __pgprot(L_PTE_PRESENT | L_PTE_YOUNG | guest_pgprot);

 	mem_types[MT_LOW_VECTORS].prot_l1 |= ecc_mask;
 	mem_types[MT_HIGH_VECTORS].prot_l1 |= ecc_mask;