[v6,05/11] arm64/mm: Add support for XPFO

Message ID 20170907173609.22696-6-tycho@docker.com (mailing list archive)
State New, archived
Headers show

Commit Message

Tycho Andersen Sept. 7, 2017, 5:36 p.m. UTC
From: Juerg Haefliger <juerg.haefliger@canonical.com>

Enable support for eXclusive Page Frame Ownership (XPFO) for arm64 and
provide a hook for updating a single kernel page table entry (which is
required by the generic XPFO code).

v6: use flush_tlb_kernel_range() instead of __flush_tlb_one()

CC: linux-arm-kernel@lists.infradead.org
Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Tycho Andersen <tycho@docker.com>
---
 arch/arm64/Kconfig     |  1 +
 arch/arm64/mm/Makefile |  2 ++
 arch/arm64/mm/xpfo.c   | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 61 insertions(+)

Comments

Christoph Hellwig Sept. 8, 2017, 7:53 a.m. UTC | #1
> +/*
> + * Lookup the page table entry for a virtual address and return a pointer to
> + * the entry. Based on x86 tree.
> + */
> +static pte_t *lookup_address(unsigned long addr)

Seems like this should be moved to common arm64 mm code and used by
kernel_page_present.
Mark Rutland Sept. 14, 2017, 6:22 p.m. UTC | #2
Hi,

On Thu, Sep 07, 2017 at 11:36:03AM -0600, Tycho Andersen wrote:
> From: Juerg Haefliger <juerg.haefliger@canonical.com>
> 
> Enable support for eXclusive Page Frame Ownership (XPFO) for arm64 and
> provide a hook for updating a single kernel page table entry (which is
> required by the generic XPFO code).
> 
> v6: use flush_tlb_kernel_range() instead of __flush_tlb_one()
> 
> CC: linux-arm-kernel@lists.infradead.org
> Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>
> Signed-off-by: Tycho Andersen <tycho@docker.com>
> ---
>  arch/arm64/Kconfig     |  1 +
>  arch/arm64/mm/Makefile |  2 ++
>  arch/arm64/mm/xpfo.c   | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 61 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index dfd908630631..44fa44ef02ec 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -121,6 +121,7 @@ config ARM64
>  	select SPARSE_IRQ
>  	select SYSCTL_EXCEPTION_TRACE
>  	select THREAD_INFO_IN_TASK
> +	select ARCH_SUPPORTS_XPFO

A bit of a nit, but this list is (intended to be) organised alphabetically.
Could you please try to retain that?

i.e. place this between ARCH_SUPPORTS_NUMA_BALANCING and
ARCH_WANT_COMPAT_IPC_PARSE_VERSION.

>  	help
>  	  ARM 64-bit (AArch64) Linux support.
>  
> diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
> index 9b0ba191e48e..22e5cab543d8 100644
> --- a/arch/arm64/mm/Makefile
> +++ b/arch/arm64/mm/Makefile
> @@ -11,3 +11,5 @@ KASAN_SANITIZE_physaddr.o	+= n
>  
>  obj-$(CONFIG_KASAN)		+= kasan_init.o
>  KASAN_SANITIZE_kasan_init.o	:= n
> +
> +obj-$(CONFIG_XPFO)		+= xpfo.o
> diff --git a/arch/arm64/mm/xpfo.c b/arch/arm64/mm/xpfo.c
> new file mode 100644
> index 000000000000..678e2be848eb
> --- /dev/null
> +++ b/arch/arm64/mm/xpfo.c
> @@ -0,0 +1,58 @@
> +/*
> + * Copyright (C) 2017 Hewlett Packard Enterprise Development, L.P.
> + * Copyright (C) 2016 Brown University. All rights reserved.
> + *
> + * Authors:
> + *   Juerg Haefliger <juerg.haefliger@hpe.com>
> + *   Vasileios P. Kemerlis <vpk@cs.brown.edu>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published by
> + * the Free Software Foundation.
> + */
> +
> +#include <linux/mm.h>
> +#include <linux/module.h>
> +
> +#include <asm/tlbflush.h>
> +
> +/*
> + * Lookup the page table entry for a virtual address and return a pointer to
> + * the entry. Based on x86 tree.
> + */

Is this intended for kernel VAs, user VAs, or both?

There are different constraints for fiddling with either (e.g. holding
mmap_sem), so we should be clear regarding the intended use-case.

> +static pte_t *lookup_address(unsigned long addr)
> +{
> +	pgd_t *pgd;
> +	pud_t *pud;
> +	pmd_t *pmd;
> +
> +	pgd = pgd_offset_k(addr);
> +	if (pgd_none(*pgd))
> +		return NULL;
> +
> +	pud = pud_offset(pgd, addr);
> +	if (pud_none(*pud))
> +		return NULL;

What if it's not none, but not a table?

I think we should check pud_sect() here, and/or pud_bad().

> +
> +	pmd = pmd_offset(pud, addr);
> +	if (pmd_none(*pmd))
> +		return NULL;

Likewise.

> +
> +	return pte_offset_kernel(pmd, addr);
> +}

Given this expects a pte, it might make more sense to call this
lookup_address_pte() to make that clear.

> +
> +/* Update a single kernel page table entry */
> +inline void set_kpte(void *kaddr, struct page *page, pgprot_t prot)
> +{
> +	pte_t *pte = lookup_address((unsigned long)kaddr);
> +
> +	set_pte(pte, pfn_pte(page_to_pfn(page), prot));

We can get NULL from lookup_address(), so this doesn't look right.

If NULL implies an error, drop a BUG_ON(!pte) before the set_pte.

> +}
> +
> +inline void xpfo_flush_kernel_tlb(struct page *page, int order)
> +{
> +	unsigned long kaddr = (unsigned long)page_address(page);
> +	unsigned long size = PAGE_SIZE;

unsigned long size = PAGE_SIZE << order;

> +
> +	flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size);

... and this can be simpler.

I haven't brought myself back up to speed, so it might not be possible, but I
still think it would be preferable for XPFO to call flush_tlb_kernel_range()
directly.

Thanks,
Mark.
Tycho Andersen Sept. 18, 2017, 9:27 p.m. UTC | #3
Hi Mark,

On Thu, Sep 14, 2017 at 07:22:08PM +0100, Mark Rutland wrote:
> Hi,
> 
> On Thu, Sep 07, 2017 at 11:36:03AM -0600, Tycho Andersen wrote:
> > From: Juerg Haefliger <juerg.haefliger@canonical.com>
> > 
> > Enable support for eXclusive Page Frame Ownership (XPFO) for arm64 and
> > provide a hook for updating a single kernel page table entry (which is
> > required by the generic XPFO code).
> > 
> > v6: use flush_tlb_kernel_range() instead of __flush_tlb_one()
> > 
> > CC: linux-arm-kernel@lists.infradead.org
> > Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>
> > Signed-off-by: Tycho Andersen <tycho@docker.com>
> > ---
> >  arch/arm64/Kconfig     |  1 +
> >  arch/arm64/mm/Makefile |  2 ++
> >  arch/arm64/mm/xpfo.c   | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 61 insertions(+)
> > 
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index dfd908630631..44fa44ef02ec 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -121,6 +121,7 @@ config ARM64
> >  	select SPARSE_IRQ
> >  	select SYSCTL_EXCEPTION_TRACE
> >  	select THREAD_INFO_IN_TASK
> > +	select ARCH_SUPPORTS_XPFO
> 
> A bit of a nit, but this list is (intended to be) organised alphabetically.
> Could you please try to retain that?
> 
> i.e. place this between ARCH_SUPPORTS_NUMA_BALANCING and
> ARCH_WANT_COMPAT_IPC_PARSE_VERSION.

Will do.

> >  	help
> >  	  ARM 64-bit (AArch64) Linux support.
> >  
> > diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
> > index 9b0ba191e48e..22e5cab543d8 100644
> > --- a/arch/arm64/mm/Makefile
> > +++ b/arch/arm64/mm/Makefile
> > @@ -11,3 +11,5 @@ KASAN_SANITIZE_physaddr.o	+= n
> >  
> >  obj-$(CONFIG_KASAN)		+= kasan_init.o
> >  KASAN_SANITIZE_kasan_init.o	:= n
> > +
> > +obj-$(CONFIG_XPFO)		+= xpfo.o
> > diff --git a/arch/arm64/mm/xpfo.c b/arch/arm64/mm/xpfo.c
> > new file mode 100644
> > index 000000000000..678e2be848eb
> > --- /dev/null
> > +++ b/arch/arm64/mm/xpfo.c
> > @@ -0,0 +1,58 @@
> > +/*
> > + * Copyright (C) 2017 Hewlett Packard Enterprise Development, L.P.
> > + * Copyright (C) 2016 Brown University. All rights reserved.
> > + *
> > + * Authors:
> > + *   Juerg Haefliger <juerg.haefliger@hpe.com>
> > + *   Vasileios P. Kemerlis <vpk@cs.brown.edu>
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms of the GNU General Public License version 2 as published by
> > + * the Free Software Foundation.
> > + */
> > +
> > +#include <linux/mm.h>
> > +#include <linux/module.h>
> > +
> > +#include <asm/tlbflush.h>
> > +
> > +/*
> > + * Lookup the page table entry for a virtual address and return a pointer to
> > + * the entry. Based on x86 tree.
> > + */
> 
> Is this intended for kernel VAs, user VAs, or both?
> 
> There are different constraints for fiddling with either (e.g. holding
> mmap_sem), so we should be clear regarding the intended use-case.

kernel only; I can add a comment noting this.

> > +static pte_t *lookup_address(unsigned long addr)
> > +{
> > +	pgd_t *pgd;
> > +	pud_t *pud;
> > +	pmd_t *pmd;
> > +
> > +	pgd = pgd_offset_k(addr);
> > +	if (pgd_none(*pgd))
> > +		return NULL;
> > +
> > +	pud = pud_offset(pgd, addr);
> > +	if (pud_none(*pud))
> > +		return NULL;
> 
> What if it's not none, but not a table?
> 
> I think we should check pud_sect() here, and/or pud_bad().
> 
> > +
> > +	pmd = pmd_offset(pud, addr);
> > +	if (pmd_none(*pmd))
> > +		return NULL;
> 
> Likewise.

In principle pud_sect() should be okay, because we say that XPFO
doesn't support section mappings yet. I'll add a check for pud_bad().

However, Christoph suggested that we move this to common code and
there it won't be okay.

> > +
> > +	return pte_offset_kernel(pmd, addr);
> > +}
> 
> Given this expects a pte, it might make more sense to call this
> lookup_address_pte() to make that clear.
> 
> > +
> > +/* Update a single kernel page table entry */
> > +inline void set_kpte(void *kaddr, struct page *page, pgprot_t prot)
> > +{
> > +	pte_t *pte = lookup_address((unsigned long)kaddr);
> > +
> > +	set_pte(pte, pfn_pte(page_to_pfn(page), prot));
> 
> We can get NULL from lookup_address(), so this doesn't look right.
> 
> If NULL implies an error, drop a BUG_ON(!pte) before the set_pte.

It does, I'll add this (as a WARN() and then no-op), thanks.

> > +}
> > +
> > +inline void xpfo_flush_kernel_tlb(struct page *page, int order)
> > +{
> > +	unsigned long kaddr = (unsigned long)page_address(page);
> > +	unsigned long size = PAGE_SIZE;
> 
> unsigned long size = PAGE_SIZE << order;
> 
> > +
> > +	flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size);
> 
> ... and this can be simpler.
> 
> I haven't brought myself back up to speed, so it might not be possible, but I
> still think it would be preferable for XPFO to call flush_tlb_kernel_range()
> directly.

I don't think we can, since on x86 flush_tlb_kernel_range() uses SMP
cross-calls, and in some of the contexts XPFO flushes from those aren't safe.

Cheers,

Tycho

Patch

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index dfd908630631..44fa44ef02ec 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -121,6 +121,7 @@  config ARM64
 	select SPARSE_IRQ
 	select SYSCTL_EXCEPTION_TRACE
 	select THREAD_INFO_IN_TASK
+	select ARCH_SUPPORTS_XPFO
 	help
 	  ARM 64-bit (AArch64) Linux support.
 
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index 9b0ba191e48e..22e5cab543d8 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -11,3 +11,5 @@  KASAN_SANITIZE_physaddr.o	+= n
 
 obj-$(CONFIG_KASAN)		+= kasan_init.o
 KASAN_SANITIZE_kasan_init.o	:= n
+
+obj-$(CONFIG_XPFO)		+= xpfo.o
diff --git a/arch/arm64/mm/xpfo.c b/arch/arm64/mm/xpfo.c
new file mode 100644
index 000000000000..678e2be848eb
--- /dev/null
+++ b/arch/arm64/mm/xpfo.c
@@ -0,0 +1,58 @@ 
+/*
+ * Copyright (C) 2017 Hewlett Packard Enterprise Development, L.P.
+ * Copyright (C) 2016 Brown University. All rights reserved.
+ *
+ * Authors:
+ *   Juerg Haefliger <juerg.haefliger@hpe.com>
+ *   Vasileios P. Kemerlis <vpk@cs.brown.edu>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ */
+
+#include <linux/mm.h>
+#include <linux/module.h>
+
+#include <asm/tlbflush.h>
+
+/*
+ * Lookup the page table entry for a virtual address and return a pointer to
+ * the entry. Based on x86 tree.
+ */
+static pte_t *lookup_address(unsigned long addr)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	pgd = pgd_offset_k(addr);
+	if (pgd_none(*pgd))
+		return NULL;
+
+	pud = pud_offset(pgd, addr);
+	if (pud_none(*pud))
+		return NULL;
+
+	pmd = pmd_offset(pud, addr);
+	if (pmd_none(*pmd))
+		return NULL;
+
+	return pte_offset_kernel(pmd, addr);
+}
+
+/* Update a single kernel page table entry */
+inline void set_kpte(void *kaddr, struct page *page, pgprot_t prot)
+{
+	pte_t *pte = lookup_address((unsigned long)kaddr);
+
+	set_pte(pte, pfn_pte(page_to_pfn(page), prot));
+}
+
+inline void xpfo_flush_kernel_tlb(struct page *page, int order)
+{
+	unsigned long kaddr = (unsigned long)page_address(page);
+	unsigned long size = PAGE_SIZE;
+
+	flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size);
+}