[v3,03/10] arm64, kfence: enable KFENCE for ARM64

Message ID 20200921132611.1700350-4-elver@google.com (mailing list archive)
State New, archived
Series KFENCE: A low-overhead sampling-based memory safety error detector

Commit Message

Marco Elver Sept. 21, 2020, 1:26 p.m. UTC
Add architecture specific implementation details for KFENCE and enable
KFENCE for the arm64 architecture. In particular, this implements the
required interface in <asm/kfence.h>. Currently, the arm64 version does
not yet use a statically allocated memory pool, at the cost of a pointer
load for each is_kfence_address().

Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Co-developed-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Marco Elver <elver@google.com>
---
For ARM64, we would like to solicit feedback on what the best option is
to obtain a constant address for __kfence_pool. One option is to declare
a memory range in the memory layout to be dedicated to KFENCE (like is
done for KASAN), however, it is unclear if this is the best available
option. We would like to avoid touching the memory layout.
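
For reference, the check in question looks roughly like this (a simplified
sketch, not the exact code in this series); with a statically placed pool,
the load of __kfence_pool would become a link-time constant:

  extern char *__kfence_pool;	/* currently assigned at boot on arm64 */

  static __always_inline bool is_kfence_address(const void *addr)
  {
  	/* One pointer load of __kfence_pool per check. */
  	return unlikely((char *)addr >= __kfence_pool &&
  			(char *)addr < __kfence_pool + KFENCE_POOL_SIZE);
  }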
---
 arch/arm64/Kconfig              |  1 +
 arch/arm64/include/asm/kfence.h | 39 +++++++++++++++++++++++++++++++++
 arch/arm64/mm/fault.c           |  4 ++++
 3 files changed, 44 insertions(+)
 create mode 100644 arch/arm64/include/asm/kfence.h

Comments

Will Deacon Sept. 21, 2020, 2:31 p.m. UTC | #1
On Mon, Sep 21, 2020 at 03:26:04PM +0200, Marco Elver wrote:
> Add architecture specific implementation details for KFENCE and enable
> KFENCE for the arm64 architecture. In particular, this implements the
> required interface in <asm/kfence.h>. Currently, the arm64 version does
> not yet use a statically allocated memory pool, at the cost of a pointer
> load for each is_kfence_address().
> 
> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> Co-developed-by: Alexander Potapenko <glider@google.com>
> Signed-off-by: Alexander Potapenko <glider@google.com>
> Signed-off-by: Marco Elver <elver@google.com>
> ---
> For ARM64, we would like to solicit feedback on what the best option is
> to obtain a constant address for __kfence_pool. One option is to declare
> a memory range in the memory layout to be dedicated to KFENCE (like is
> done for KASAN), however, it is unclear if this is the best available
> option. We would like to avoid touching the memory layout.

Sorry for the delay on this.

Given that the pool is relatively small (i.e. when compared with our virtual
address space), dedicating an area of virtual space sounds like it makes
the most sense here. How early do you need it to be available?

An alternative approach would be to patch in the address at runtime, with
something like a static key to swizzle off the direct __kfence_pool load
once we're up and running.
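
A rough illustration of the static-key half of that (not code from this
series, just a sketch): a static branch keeps the check free until the
pool exists, after which the remaining cost is the pool-base load:

  #include <linux/jump_label.h>

  DEFINE_STATIC_KEY_FALSE(kfence_ready_key);	/* name made up for the sketch */

  static __always_inline bool is_kfence_address(const void *addr)
  {
  	/* Compiled to a patchable NOP/branch; no load until enabled. */
  	if (!static_branch_unlikely(&kfence_ready_key))
  		return false;
  	return (char *)addr >= __kfence_pool &&
  	       (char *)addr < __kfence_pool + KFENCE_POOL_SIZE;
  }

  /* ... with static_branch_enable(&kfence_ready_key) once KFENCE is initialised. */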

Will
Alexander Potapenko Sept. 21, 2020, 2:58 p.m. UTC | #2
On Mon, Sep 21, 2020 at 4:31 PM Will Deacon <will@kernel.org> wrote:
>
> On Mon, Sep 21, 2020 at 03:26:04PM +0200, Marco Elver wrote:
> > Add architecture specific implementation details for KFENCE and enable
> > KFENCE for the arm64 architecture. In particular, this implements the
> > required interface in <asm/kfence.h>. Currently, the arm64 version does
> > not yet use a statically allocated memory pool, at the cost of a pointer
> > load for each is_kfence_address().
> >
> > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > Co-developed-by: Alexander Potapenko <glider@google.com>
> > Signed-off-by: Alexander Potapenko <glider@google.com>
> > Signed-off-by: Marco Elver <elver@google.com>
> > ---
> > For ARM64, we would like to solicit feedback on what the best option is
> > to obtain a constant address for __kfence_pool. One option is to declare
> > a memory range in the memory layout to be dedicated to KFENCE (like is
> > done for KASAN), however, it is unclear if this is the best available
> > option. We would like to avoid touching the memory layout.
>
> Sorry for the delay on this.

NP, thanks for looking!

> Given that the pool is relatively small (i.e. when compared with our virtual
> address space), dedicating an area of virtual space sounds like it makes
> the most sense here. How early do you need it to be available?

Yes, having a dedicated address sounds good.
We're inserting kfence_init() into start_kernel() after timekeeping_init().
So way after mm_init(), if that matters.

> An alternative approach would be to patch in the address at runtime, with
> something like a static key to swizzle off the direct __kfence_pool load
> once we're up and running.

IIUC there's no such thing as address patching in the kernel at the
moment; at least, static keys work differently?
I am not sure how much we need to randomize this address range (we
don't on x86 anyway).

> Will
>

--
Alexander Potapenko
Alexander Potapenko Sept. 21, 2020, 3:37 p.m. UTC | #3
On Mon, Sep 21, 2020 at 4:58 PM Alexander Potapenko <glider@google.com> wrote:
>
> On Mon, Sep 21, 2020 at 4:31 PM Will Deacon <will@kernel.org> wrote:
> >
> > On Mon, Sep 21, 2020 at 03:26:04PM +0200, Marco Elver wrote:
> > > Add architecture specific implementation details for KFENCE and enable
> > > KFENCE for the arm64 architecture. In particular, this implements the
> > > required interface in <asm/kfence.h>. Currently, the arm64 version does
> > > not yet use a statically allocated memory pool, at the cost of a pointer
> > > load for each is_kfence_address().
> > >
> > > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > > Co-developed-by: Alexander Potapenko <glider@google.com>
> > > Signed-off-by: Alexander Potapenko <glider@google.com>
> > > Signed-off-by: Marco Elver <elver@google.com>
> > > ---
> > > For ARM64, we would like to solicit feedback on what the best option is
> > > to obtain a constant address for __kfence_pool. One option is to declare
> > > a memory range in the memory layout to be dedicated to KFENCE (like is
> > > done for KASAN), however, it is unclear if this is the best available
> > > option. We would like to avoid touching the memory layout.
> >
> > Sorry for the delay on this.
>
> NP, thanks for looking!
>
> > Given that the pool is relatively small (i.e. when compared with our virtual
> > address space), dedicating an area of virtual space sounds like it makes
> > the most sense here. How early do you need it to be available?
>
> Yes, having a dedicated address sounds good.
> We're inserting kfence_init() into start_kernel() after timekeeping_init().
> So way after mm_init(), if that matters.

The question is though, how big should that dedicated area be?
Right now KFENCE_NUM_OBJECTS can be up to 16383 (which makes the pool
size 64MB), but this number actually comes from the limitation on
static objects, so we might want to increase that number on arm64.
Will Deacon Sept. 21, 2020, 5:43 p.m. UTC | #4
On Mon, Sep 21, 2020 at 05:37:10PM +0200, Alexander Potapenko wrote:
> On Mon, Sep 21, 2020 at 4:58 PM Alexander Potapenko <glider@google.com> wrote:
> >
> > On Mon, Sep 21, 2020 at 4:31 PM Will Deacon <will@kernel.org> wrote:
> > >
> > > On Mon, Sep 21, 2020 at 03:26:04PM +0200, Marco Elver wrote:
> > > > Add architecture specific implementation details for KFENCE and enable
> > > > KFENCE for the arm64 architecture. In particular, this implements the
> > > > required interface in <asm/kfence.h>. Currently, the arm64 version does
> > > > not yet use a statically allocated memory pool, at the cost of a pointer
> > > > load for each is_kfence_address().
> > > >
> > > > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > > > Co-developed-by: Alexander Potapenko <glider@google.com>
> > > > Signed-off-by: Alexander Potapenko <glider@google.com>
> > > > Signed-off-by: Marco Elver <elver@google.com>
> > > > ---
> > > > For ARM64, we would like to solicit feedback on what the best option is
> > > > to obtain a constant address for __kfence_pool. One option is to declare
> > > > a memory range in the memory layout to be dedicated to KFENCE (like is
> > > > done for KASAN), however, it is unclear if this is the best available
> > > > option. We would like to avoid touching the memory layout.
> > >
> > > Sorry for the delay on this.
> >
> > NP, thanks for looking!
> >
> > > Given that the pool is relatively small (i.e. when compared with our virtual
> > > address space), dedicating an area of virtual space sounds like it makes
> > > the most sense here. How early do you need it to be available?
> >
> > Yes, having a dedicated address sounds good.
> > We're inserting kfence_init() into start_kernel() after timekeeping_init().
> > So way after mm_init(), if that matters.
> 
> The question is though, how big should that dedicated area be?
> Right now KFENCE_NUM_OBJECTS can be up to 16383 (which makes the pool
> size 64MB), but this number actually comes from the limitation on
> static objects, so we might want to increase that number on arm64.

What happens on x86 and why would we do something different?

Will
Marco Elver Sept. 22, 2020, 9:56 a.m. UTC | #5
On Mon, 21 Sep 2020 at 19:44, Will Deacon <will@kernel.org> wrote:
[...]
> > > > > For ARM64, we would like to solicit feedback on what the best option is
> > > > > to obtain a constant address for __kfence_pool. One option is to declare
> > > > > a memory range in the memory layout to be dedicated to KFENCE (like is
> > > > > done for KASAN), however, it is unclear if this is the best available
> > > > > option. We would like to avoid touching the memory layout.
> > > >
> > > > Sorry for the delay on this.
> > >
> > > NP, thanks for looking!
> > >
> > > > Given that the pool is relatively small (i.e. when compared with our virtual
> > > > address space), dedicating an area of virtual space sounds like it makes
> > > > the most sense here. How early do you need it to be available?
> > >
> > > Yes, having a dedicated address sounds good.
> > > We're inserting kfence_init() into start_kernel() after timekeeping_init().
> > > So way after mm_init(), if that matters.
> >
> > The question is though, how big should that dedicated area be?
> > Right now KFENCE_NUM_OBJECTS can be up to 16383 (which makes the pool
> > size 64MB), but this number actually comes from the limitation on
> > static objects, so we might want to increase that number on arm64.
>
> What happens on x86 and why would we do something different?

On x86 we just do `char __kfence_pool[KFENCE_POOL_SIZE] ...;` to
statically allocate the pool. On arm64 this doesn't seem to work
because static memory doesn't have struct pages?

Thanks,
-- Marco
Alexander Potapenko Sept. 25, 2020, 3:25 p.m. UTC | #6
Will,

> Given that the pool is relatively small (i.e. when compared with our virtual
> address space), dedicating an area of virtual space sounds like it makes
> the most sense here. How early do you need it to be available?

How do we assign struct pages to a fixed virtual space area (I'm
currently experimenting with 0xffff7f0000000000-0xffff7f0000200000)?
Looks like filling page table entries (similarly to what's being done
in arch/arm64/mm/kasan_init.c) is not enough.
I thought maybe vmemmap_populate() would do the job, but it didn't
(virt_to_pfn() still returns invalid PFNs).
Marco Elver Sept. 28, 2020, 11:53 a.m. UTC | #7
On Mon, 21 Sep 2020 at 16:31, Will Deacon <will@kernel.org> wrote:
> On Mon, Sep 21, 2020 at 03:26:04PM +0200, Marco Elver wrote:
> > Add architecture specific implementation details for KFENCE and enable
> > KFENCE for the arm64 architecture. In particular, this implements the
> > required interface in <asm/kfence.h>. Currently, the arm64 version does
> > not yet use a statically allocated memory pool, at the cost of a pointer
> > load for each is_kfence_address().
[...]
> > For ARM64, we would like to solicit feedback on what the best option is
> > to obtain a constant address for __kfence_pool. One option is to declare
> > a memory range in the memory layout to be dedicated to KFENCE (like is
> > done for KASAN), however, it is unclear if this is the best available
> > option. We would like to avoid touching the memory layout.

> Given that the pool is relatively small (i.e. when compared with our virtual
> address space), dedicating an area of virtual space sounds like it makes
> the most sense here. How early do you need it to be available?

Note: we're going to send a v4 this or next week with a few other
minor fixes in it. But I think we just don't want to block the entire
series on figuring out what the static-pool arm64 version should do,
especially if we'll have a few iterations with only this patch here
changing.

So the plan will be:

1. Send v4, which could from our point-of-view be picked up for
merging. Unless of course there are more comments.

2. Work out the details for the static-pool arm64 version, since it
doesn't seem trivial to do the same thing as we do for x86. In
preparation for that, v4 will allow the __kfence_pool's attributes to
be defined entirely by <asm/kfence.h>, so that we can fiddle with
sections etc.

3. Send patch switching out the simpler arm64 version here for one
that places __kfence_pool at a static location.

Hopefully that plan is reasonable.

Thanks,
-- Marco
Mark Rutland Sept. 29, 2020, 1:53 p.m. UTC | #8
On Tue, Sep 22, 2020 at 11:56:26AM +0200, Marco Elver wrote:
> On Mon, 21 Sep 2020 at 19:44, Will Deacon <will@kernel.org> wrote:
> [...]
> > > > > > For ARM64, we would like to solicit feedback on what the best option is
> > > > > > to obtain a constant address for __kfence_pool. One option is to declare
> > > > > > a memory range in the memory layout to be dedicated to KFENCE (like is
> > > > > > done for KASAN), however, it is unclear if this is the best available
> > > > > > option. We would like to avoid touching the memory layout.
> > > > >
> > > > > Sorry for the delay on this.
> > > >
> > > > NP, thanks for looking!
> > > >
> > > > > Given that the pool is relatively small (i.e. when compared with our virtual
> > > > > address space), dedicating an area of virtual space sounds like it makes
> > > > > the most sense here. How early do you need it to be available?
> > > >
> > > > Yes, having a dedicated address sounds good.
> > > > We're inserting kfence_init() into start_kernel() after timekeeping_init().
> > > > So way after mm_init(), if that matters.
> > >
> > > The question is though, how big should that dedicated area be?
> > > Right now KFENCE_NUM_OBJECTS can be up to 16383 (which makes the pool
> > > size 64MB), but this number actually comes from the limitation on
> > > static objects, so we might want to increase that number on arm64.
> >
> > What happens on x86 and why would we do something different?
> 
> On x86 we just do `char __kfence_pool[KFENCE_POOL_SIZE] ...;` to
> statically allocate the pool. On arm64 this doesn't seem to work
> because static memory doesn't have struct pages?

Are you using virt_to_page() directly on that statically-allocated
__kfence_pool? If so you'll need to use lm_alias() if so, as is done in
mm/kasan/init.c.

Anything statically allocated is part of the kernel image address range
rather than the linear/direct map, and doesn't have a valid virt addr,
but its linear map alias does.

If you enable CONFIG_DEBUG_VIRTUAL you should get warnings if missing
lm_alias() calls.

Thanks,
Mark.
Mark Rutland Sept. 29, 2020, 2:02 p.m. UTC | #9
On Fri, Sep 25, 2020 at 05:25:11PM +0200, Alexander Potapenko wrote:
> Will,
> 
> > Given that the pool is relatively small (i.e. when compared with our virtual
> > address space), dedicating an area of virtual space sounds like it makes
> > the most sense here. How early do you need it to be available?
> 
> How do we assign struct pages to a fixed virtual space area (I'm
> currently experimenting with 0xffff7f0000000000-0xffff7f0000200000)?

You don't.

There should be a struct page for each of the /physical/ pages, and
these can be found:

* via the physical address, using phys_to_page() or pfn_to_page()
* via the linear/direct map, using virt_to_page()
* via the vmalloc page tables using vmalloc_to_page()

If you need virt_to_page() to work, the address has to be part of the
linear/direct map.

If you need to find the struct page for something that's part of the
kernel image you can use virt_to_page(lm_alias(x)).
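
For example, a minimal sketch (lm_alias() is defined in <linux/mm.h>; the
helper name below is made up):

| static struct page *kfence_pool_first_page(void)
| {
| 	/*
| 	 * Made-up helper: __kfence_pool here is an image address; lm_alias()
| 	 * returns its linear-map alias, which does have a valid struct page.
| 	 */
| 	return virt_to_page(lm_alias(__kfence_pool));
| }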

> Looks like filling page table entries (similarly to what's being done
> in arch/arm64/mm/kasan_init.c) is not enough.
> I thought maybe vmemmap_populate() would do the job, but it didn't
> (virt_to_pfn() still returns invalid PFNs).

As above, I think lm_alias() will solve the problem here. Please see
that and CONFIG_DEBUG_VIRTUAL.

Thanks,
Mark.
Mark Rutland Sept. 29, 2020, 2:27 p.m. UTC | #10
On Mon, Sep 21, 2020 at 03:26:04PM +0200, Marco Elver wrote:
> Add architecture specific implementation details for KFENCE and enable
> KFENCE for the arm64 architecture. In particular, this implements the
> required interface in <asm/kfence.h>. Currently, the arm64 version does
> not yet use a statically allocated memory pool, at the cost of a pointer
> load for each is_kfence_address().
> 
> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> Co-developed-by: Alexander Potapenko <glider@google.com>
> Signed-off-by: Alexander Potapenko <glider@google.com>
> Signed-off-by: Marco Elver <elver@google.com>
> ---
> For ARM64, we would like to solicit feedback on what the best option is
> to obtain a constant address for __kfence_pool. One option is to declare
> a memory range in the memory layout to be dedicated to KFENCE (like is
> done for KASAN), however, it is unclear if this is the best available
> option. We would like to avoid touching the memory layout.
> ---
>  arch/arm64/Kconfig              |  1 +
>  arch/arm64/include/asm/kfence.h | 39 +++++++++++++++++++++++++++++++++
>  arch/arm64/mm/fault.c           |  4 ++++
>  3 files changed, 44 insertions(+)
>  create mode 100644 arch/arm64/include/asm/kfence.h
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 6d232837cbee..1acc6b2877c3 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -132,6 +132,7 @@ config ARM64
>  	select HAVE_ARCH_JUMP_LABEL_RELATIVE
>  	select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
>  	select HAVE_ARCH_KASAN_SW_TAGS if HAVE_ARCH_KASAN
> +	select HAVE_ARCH_KFENCE if (!ARM64_16K_PAGES && !ARM64_64K_PAGES)
>  	select HAVE_ARCH_KGDB
>  	select HAVE_ARCH_MMAP_RND_BITS
>  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> diff --git a/arch/arm64/include/asm/kfence.h b/arch/arm64/include/asm/kfence.h
> new file mode 100644
> index 000000000000..608dde80e5ca
> --- /dev/null
> +++ b/arch/arm64/include/asm/kfence.h
> @@ -0,0 +1,39 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __ASM_KFENCE_H
> +#define __ASM_KFENCE_H
> +
> +#include <linux/kfence.h>
> +#include <linux/log2.h>
> +#include <linux/mm.h>
> +
> +#include <asm/cacheflush.h>
> +
> +#define KFENCE_SKIP_ARCH_FAULT_HANDLER "el1_sync"
> +
> +/*
> + * FIXME: Support HAVE_ARCH_KFENCE_STATIC_POOL: Use the statically allocated
> + * __kfence_pool, to avoid the extra pointer load for is_kfence_address(). By
> + * default, however, we do not have struct pages for static allocations.
> + */
> +
> +static inline bool arch_kfence_initialize_pool(void)
> +{
> +	const unsigned int num_pages = ilog2(roundup_pow_of_two(KFENCE_POOL_SIZE / PAGE_SIZE));
> +	struct page *pages = alloc_pages(GFP_KERNEL, num_pages);
> +
> +	if (!pages)
> +		return false;
> +
> +	__kfence_pool = page_address(pages);
> +	return true;
> +}
> +
> +static inline bool kfence_protect_page(unsigned long addr, bool protect)
> +{
> +	set_memory_valid(addr, 1, !protect);
> +
> +	return true;
> +}

This is only safe if the linear map is forced to page granularity. That's
the default with rodata=full, but this is not always the case, so this
will need some interaction with the MMU setup in arch/arm64/mm/mmu.c.

Thanks,
Mark.
Alexander Potapenko Sept. 29, 2020, 4:52 p.m. UTC | #11
> > On x86 we just do `char __kfence_pool[KFENCE_POOL_SIZE] ...;` to
> > statically allocate the pool. On arm64 this doesn't seem to work
> > because static memory doesn't have struct pages?
>
> Are you using virt_to_page() directly on that statically-allocated
> __kfence_pool? If so you'll need to use lm_alias() if so, as is done in
> mm/kasan/init.c.
>
> Anything statically allocated is part of the kernel image address range
> rather than the linear/direct map, and doesn't have a valid virt addr,
> but its linear map alias does.
>
> If you enable CONFIG_DEBUG_VIRTUAL you should get warnings if missing
> lm_alias() calls.

I just checked that on x86 CONFIG_DEBUG_VIRTUAL prints no warnings on our tests.
virt_addr_valid() also returns true for addresses belonging to
__kfence_pool declared in BSS.
Could this be related to x86 mapping the kernel twice?
Alexander Potapenko Sept. 29, 2020, 5:04 p.m. UTC | #12
On Tue, Sep 29, 2020 at 4:28 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Mon, Sep 21, 2020 at 03:26:04PM +0200, Marco Elver wrote:
> > Add architecture specific implementation details for KFENCE and enable
> > KFENCE for the arm64 architecture. In particular, this implements the
> > required interface in <asm/kfence.h>. Currently, the arm64 version does
> > not yet use a statically allocated memory pool, at the cost of a pointer
> > load for each is_kfence_address().
> >
> > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > Co-developed-by: Alexander Potapenko <glider@google.com>
> > Signed-off-by: Alexander Potapenko <glider@google.com>
> > Signed-off-by: Marco Elver <elver@google.com>
> > ---
> > For ARM64, we would like to solicit feedback on what the best option is
> > to obtain a constant address for __kfence_pool. One option is to declare
> > a memory range in the memory layout to be dedicated to KFENCE (like is
> > done for KASAN), however, it is unclear if this is the best available
> > option. We would like to avoid touching the memory layout.
> > ---
> >  arch/arm64/Kconfig              |  1 +
> >  arch/arm64/include/asm/kfence.h | 39 +++++++++++++++++++++++++++++++++
> >  arch/arm64/mm/fault.c           |  4 ++++
> >  3 files changed, 44 insertions(+)
> >  create mode 100644 arch/arm64/include/asm/kfence.h
> >
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 6d232837cbee..1acc6b2877c3 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -132,6 +132,7 @@ config ARM64
> >       select HAVE_ARCH_JUMP_LABEL_RELATIVE
> >       select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
> >       select HAVE_ARCH_KASAN_SW_TAGS if HAVE_ARCH_KASAN
> > +     select HAVE_ARCH_KFENCE if (!ARM64_16K_PAGES && !ARM64_64K_PAGES)
> >       select HAVE_ARCH_KGDB
> >       select HAVE_ARCH_MMAP_RND_BITS
> >       select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> > diff --git a/arch/arm64/include/asm/kfence.h b/arch/arm64/include/asm/kfence.h
> > new file mode 100644
> > index 000000000000..608dde80e5ca
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/kfence.h
> > @@ -0,0 +1,39 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +#ifndef __ASM_KFENCE_H
> > +#define __ASM_KFENCE_H
> > +
> > +#include <linux/kfence.h>
> > +#include <linux/log2.h>
> > +#include <linux/mm.h>
> > +
> > +#include <asm/cacheflush.h>
> > +
> > +#define KFENCE_SKIP_ARCH_FAULT_HANDLER "el1_sync"
> > +
> > +/*
> > + * FIXME: Support HAVE_ARCH_KFENCE_STATIC_POOL: Use the statically allocated
> > + * __kfence_pool, to avoid the extra pointer load for is_kfence_address(). By
> > + * default, however, we do not have struct pages for static allocations.
> > + */
> > +
> > +static inline bool arch_kfence_initialize_pool(void)
> > +{
> > +     const unsigned int num_pages = ilog2(roundup_pow_of_two(KFENCE_POOL_SIZE / PAGE_SIZE));
> > +     struct page *pages = alloc_pages(GFP_KERNEL, num_pages);
> > +
> > +     if (!pages)
> > +             return false;
> > +
> > +     __kfence_pool = page_address(pages);
> > +     return true;
> > +}
> > +
> > +static inline bool kfence_protect_page(unsigned long addr, bool protect)
> > +{
> > +     set_memory_valid(addr, 1, !protect);
> > +
> > +     return true;
> > +}
>
> This is only safe if the linear map is forced to page granularity. That's
> the default with rodata=full, but this is not always the case, so this
> will need some interaction with the MMU setup in arch/arm64/mm/mmu.c.

On x86 we ensure this by reallocating the necessary page tables.

But it looks like your suggestion is what we need for arm64, as long as
we also want virt_to_page() to work for our pool.
It's still unclear to me whether the carveout you suggest can be placed
at a fixed (known at link time) address, as the main point of this
dance is to remove memory accesses from is_kfence_address().

> Thanks,
> Mark.
Alexander Potapenko Oct. 1, 2020, 11:24 a.m. UTC | #13
Mark,

> If you need virt_to_page() to work, the address has to be part of the
> linear/direct map.
>
> If you need to find the struct page for something that's part of the
> kernel image you can use virt_to_page(lm_alias(x)).
>
> > Looks like filling page table entries (similarly to what's being done
> > in arch/arm64/mm/kasan_init.c) is not enough.
> > I thought maybe vmemmap_populate() would do the job, but it didn't
> > (virt_to_pfn() still returns invalid PFNs).
>
> As above, I think lm_alias() will solve the problem here. Please see
> that and CONFIG_DEBUG_VIRTUAL.

The approach you suggest works to some extent, but there are some caveats.

To reiterate, we are trying to allocate the pool (2MB by default, but
users may want a bigger one, up to, say, 64MB) in a way that:
(1) The underlying page tables support 4K granularity.
(2) is_kfence_address() (checks that __kfence_pool <= addr <=
__kfence_pool + KFENCE_POOL_SIZE) does not reference memory
(3) For addresses belonging to that pool virt_addr_valid() is true
(SLAB/SLUB rely on that)

On x86 we achieve (2) by making our pool a .bss array, so that its
address is known statically. Aligning that array on 4K and calling
set_memory_4k() ensures that (1) is also fulfilled. (3) seems to just
happen automagically without any address translations.
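
Roughly, the x86 side looks like this (a sketch following the description
above; attribute and helper details approximate):

  /* Pool lives in .bss, so its address is a link-time constant. */
  char __kfence_pool[KFENCE_POOL_SIZE] __aligned(PAGE_SIZE);

  static bool arch_kfence_initialize_pool(void)
  {
  	/* Split any huge linear-map pages covering the pool into 4K pages. */
  	return !set_memory_4k((unsigned long)__kfence_pool,
  			      KFENCE_POOL_SIZE / PAGE_SIZE);
  }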

Now, what we are seeing on arm64 is different.
My understanding (please correct me if I'm wrong) is that on arm64
only the memory range at 0xffff000000000000 has valid struct pages,
and the size of that range depends on the amount of memory on the
system.
This probably means we cannot just pick a fixed address for our pool
in that range, unless it is very close to 0xffff000000000000.

If we allocate the pool statically in the way x86 does (assuming we
somehow resolve (1)), we can apply lm_alias() to addresses returned by
the KFENCE allocator, making kmalloc() always return addresses from
the linear map and satisfying (3).
But in that case is_kfence_address() will also need to be updated to
compare the addresses to lm_alias(__kfence_pool), and this becomes
more heavyweight than just reading the address from memory.

So looks like it's still more preferable to allocate the pool
dynamically on ARM64, unless there's a clever trick to allocate a
fixed address in the linear map (DTS maybe?)

Thanks,
Alex
Mark Rutland Oct. 1, 2020, 5:57 p.m. UTC | #14
On Thu, Oct 01, 2020 at 01:24:49PM +0200, Alexander Potapenko wrote:
> Mark,
> 
> > If you need virt_to_page() to work, the address has to be part of the
> > linear/direct map.
> >
> > If you need to find the struct page for something that's part of the
> > kernel image you can use virt_to_page(lm_alias(x)).
> >
> > > Looks like filling page table entries (similarly to what's being done
> > > in arch/arm64/mm/kasan_init.c) is not enough.
> > > I thought maybe vmemmap_populate() would do the job, but it didn't
> > > (virt_to_pfn() still returns invalid PFNs).
> >
> > As above, I think lm_alias() will solve the problem here. Please see
> > that and CONFIG_DEBUG_VIRTUAL.
> 
> The approach you suggest works to some extent, but there are some caveats.
> 
> To reiterate, we are trying to allocate the pool (2Mb by default, but
> users may want a bigger one, up to, say, 64 Mb) in a way that:
> (1) The underlying page tables support 4K granularity.
> (2) is_kfence_address() (checks that __kfence_pool <= addr <=
> __kfence_pool + KFENCE_POOL_SIZE) does not reference memory

What's the underlying requirement here? Is this a performance concern,
codegen/codesize, or something else?

> (3) For addresses belonging to that pool virt_addr_valid() is true
> (SLAB/SLUB rely on that)

As I hinted at before, there's a reasonable amount of code which relies
on being able to round-trip convert (va->{pa,page}->va) allocations from
SLUB, e.g. page = virt_to_page(addr); ... ; addr = page_to_virt(page).
Usually this is because the phys addr is stored in some HW register, or
in-memory structure shared with HW.

I'm fairly certain KFENCE will need to support this in order to be
deployable in production, and arm64 is the canary in the coalmine.

I added tests for this back when tag-based KASAN broke this property.
See commit:

  b92a953cb7f727c4 ("lib/test_kasan.c: add roundtrip tests")

... for which IIUC the kfree_via_phys() test would be broken by KFENCE,
even on x86:

| static noinline void __init kfree_via_phys(void)
| {
|        char *ptr;
|        size_t size = 8;
|        phys_addr_t phys;
| 
|        pr_info("invalid-free false positive (via phys)\n");
|        ptr = kmalloc(size, GFP_KERNEL);
|        if (!ptr) {
|                pr_err("Allocation failed\n");
|                return;
|        }
| 
|        phys = virt_to_phys(ptr);
|        kfree(phys_to_virt(phys));
| }

... since the code will pass the linear map alias of the KFENCE VA into
kfree().

To avoid random breakage we either need to:

* Have KFENCE retain this property (which effectively requires
  allocation VAs to fall within the linear/direct map)

* Decide that round-trips are forbidden, and go modify that code
  somehow, which was deemed to be impractical in the past

... and I would strongly prefer the former as it's less liable to break any
existing code.

> On x86 we achieve (2) by making our pool a .bss array, so that its
> address is known statically. Aligning that array on 4K and calling
> set_memory_4k() ensures that (1) is also fulfilled. (3) seems to just
> happen automagically without any address translations.
> 
> Now, what we are seeing on arm64 is different.
> My understanding (please correct me if I'm wrong) is that on arm64
> only the memory range at 0xffff000000000000 has valid struct pages,
> and the size of that range depends on the amount of memory on the
> system.

The way virt_to_page() works is based on there being a constant (at
runtime) offset between a linear map address and the corresponding
physical page. That makes it easy to get the PA with a subtraction, then
the PFN with a shift, then to index the vmemmap array with that to get
the page. The x86 version of virt_to_page() automatically fixes up an
image address to its linear map alias internally.
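
As a deliberately simplified illustration of those steps (not the literal
arm64 macros):

| /* Illustrative only, not the real arm64 implementation. */
| static inline struct page *linear_va_to_page(unsigned long va)
| {
| 	phys_addr_t pa = (phys_addr_t)(va - PAGE_OFFSET) + PHYS_OFFSET;	/* constant offset */
| 	unsigned long pfn = pa >> PAGE_SHIFT;				/* shift to a PFN */
| 
| 	return vmemmap + pfn;		/* index vmemmap, i.e. pfn_to_page() */
| }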

> This probably means we cannot just pick a fixed address for our pool
> in that range, unless it is very close to 0xffff000000000000.

It would have to be part of the linear map, or we'd have to apply the
same fixup as x86 does. But as above, I'm reluctant to do that as it
only encourages writing fragile code. The only sensible way to detect
that is to disallow virt_to_*() on image addresses, since that's the
only time we can distinguish the source.

> If we allocate the pool statically in the way x86 does (assuming we
> somehow resolve (1)), we can apply lm_alias() to addresses returned by
> the KFENCE allocator, making kmalloc() always return addresses from
> the linear map and satisfying (3).
> But in that case is_kfence_address() will also need to be updated to
> compare the addresses to lm_alias(__kfence_pool), and this becomes
> more heavyweight than just reading the address from memory.

We can calculate the lm_alias(__kfence_pool) at boot time, so it's only
a read from memory in the fast-path.
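
i.e. something like this (sketch; the names below are made up):

| static char *kfence_pool_lm_base;	/* cached once at init */
| 
| void __init kfence_note_pool_alias(void)
| {
| 	kfence_pool_lm_base = lm_alias(__kfence_pool);
| }
| 
| /*
|  * is_kfence_address() then compares against kfence_pool_lm_base: one load
|  * plus two compares in the fast path.
|  */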

> So looks like it's still more preferable to allocate the pool
> dynamically on ARM64, unless there's a clever trick to allocate a
> fixed address in the linear map (DTS maybe?)

I'm not too worried about allocating this dynamically, but:

* The arch code needs to set up the translation tables for this, as we
  cannot safely change the mapping granularity live.

* As above I'm fairly certain x86 needs to use a carveout from the
  linear map to function correctly anyhow, so we should follow the same
  approach for both arm64 and x86. That might be a static carveout that
  we figure out the aliasing for, or something entirely dynamic.

Thanks,
Mark.
Marco Elver Oct. 8, 2020, 9:40 a.m. UTC | #15
On Thu, 1 Oct 2020 at 19:58, Mark Rutland <mark.rutland@arm.com> wrote:
[...]
> > > If you need virt_to_page() to work, the address has to be part of the
> > > linear/direct map.
[...]
>
> What's the underlying requirement here? Is this a performance concern,
> codegen/codesize, or something else?

It used to be performance, since is_kfence_address() is used in the
fast path. However, with some further tweaks we just did to
is_kfence_address(), our benchmarks show a pointer load can be
tolerated.

> > (3) For addresses belonging to that pool virt_addr_valid() is true
> > (SLAB/SLUB rely on that)
>
> As I hinted at before, there's a reasonable amount of code which relies
> on being able to round-trip convert (va->{pa,page}->va) allocations from
> SLUB, e.g. page = virt_to_page(addr); ... ; addr = page_to_virt(page).
> Usually this is because the phys addr is stored in some HW register, or
> in-memory structure shared with HW.
>
> I'm fairly certain KFENCE will need to support this in order to be
> deployable in production, and arm64 is the canary in the coalmine.
>
> I added tests for this back when tag-based KASAN broke this property.
> See commit:
>
>   b92a953cb7f727c4 ("lib/test_kasan.c: add roundtrip tests")
>
> ... for which IIUC the kfree_via_phys() test would be broken by KFENCE,
> even on x86:

Yeah, we're fixing that by also making x86 use a dynamically allocated
pool now. The benefits we got from the static pool no longer apply, so
the whole dance to make the static pool work right is no longer worth
it.

> | static noinline void __init kfree_via_phys(void)
> | {
> |        char *ptr;
> |        size_t size = 8;
> |        phys_addr_t phys;
> |
> |        pr_info("invalid-free false positive (via phys)\n");
> |        ptr = kmalloc(size, GFP_KERNEL);
> |        if (!ptr) {
> |                pr_err("Allocation failed\n");
> |                return;
> |        }
> |
> |        phys = virt_to_phys(ptr);
> |        kfree(phys_to_virt(phys));
> | }
>
> ... since the code will pass the linear map alias of the KFENCE VA into
> kfree().
>
> To avoid random breakage we either need to:
>
> * Have KFENCE retain this property (which effectively requires
>   allocation VAs to fall within the linear/direct map)

^^ Yes, this is the only realistic option.

> * Decide that round-trips are forbidden, and go modify that code
>   somehow, which was deemed to be impractical in the past
>
> ... and I would strongly prefer the former as it's less liable to break any
> existing code.
>
> > On x86 we achieve (2) by making our pool a .bss array, so that its
> > address is known statically. Aligning that array on 4K and calling
> > set_memory_4k() ensures that (1) is also fulfilled. (3) seems to just
> > happen automagically without any address translations.
> >
> > Now, what we are seeing on arm64 is different.
> > My understanding (please correct me if I'm wrong) is that on arm64
> > only the memory range at 0xffff000000000000 has valid struct pages,
> > and the size of that range depends on the amount of memory on the
> > system.
>
> The way virt_to_page() works is based on there being a constant (at
> runtime) offset between a linear map address and the corresponding
> physical page. That makes it easy to get the PA with a subtraction, then
> the PFN with a shift, then to index the vmemmap array with that to get
> the page. The x86 version of virt_to_page() automatically fixes up an
> image address to its linear map alias internally.
>
> > This probably means we cannot just pick a fixed address for our pool
> > in that range, unless it is very close to 0xffff000000000000.
>
> It would have to be part of the linear map, or we'd have to apply the
> same fixup as x86 does. But as above, I'm reluctant to do that as it
> only encourages writing fragile code. The only sensible way to detect
> that is to disallow virt_to_*() on image addresses, since that's the
> only time we can distinguish the source.
>
> > If we allocate the pool statically in the way x86 does (assuming we
> > somehow resolve (1)), we can apply lm_alias() to addresses returned by
> > the KFENCE allocator, making kmalloc() always return addresses from
> > the linear map and satisfying (3).
> > But in that case is_kfence_address() will also need to be updated to
> > compare the addresses to lm_alias(__kfence_pool), and this becomes
> > more heavyweight than just reading the address from memory.
>
> We can calculate the lm_alias(__kfence_pool) at boot time, so it's only
> a read from memory in the fast-path.
>
> > So looks like it's still more preferable to allocate the pool
> > dynamically on ARM64, unless there's a clever trick to allocate a
> > fixed address in the linear map (DTS maybe?)
>
> I'm not too worried about allocating this dynamically, but:
>
> * The arch code needs to set up the translation tables for this, as we
>   cannot safely change the mapping granularity live.
>
> * As above I'm fairly certain x86 needs to use a carveout from the
>   linear map to function correctly anyhow, so we should follow the same
>   approach for both arm64 and x86. That might be a static carveout that
>   we figure out the aliasing for, or something entirely dynamic.

We're going with dynamically allocating the pool (for both x86 and
arm64), since any benefits we used to measure from the static pool are
no longer measurable (after removing a branch from
is_kfence_address()). It should hopefully simplify a lot of things,
given all the caveats that you pointed out.

For arm64, the only thing left then is to fix up the case if the
linear map is not forced to page granularity.

Thanks,
-- Marco
Mark Rutland Oct. 8, 2020, 10:45 a.m. UTC | #16
On Thu, Oct 08, 2020 at 11:40:52AM +0200, Marco Elver wrote:
> On Thu, 1 Oct 2020 at 19:58, Mark Rutland <mark.rutland@arm.com> wrote:
> [...]
> > > > If you need virt_to_page() to work, the address has to be part of the
> > > > linear/direct map.
> [...]
> >
> > What's the underlying requirement here? Is this a performance concern,
> > codegen/codesize, or something else?
> 
> It used to be performance, since is_kfence_address() is used in the
> fast path. However, with some further tweaks we just did to
> is_kfence_address(), our benchmarks show a pointer load can be
> tolerated.

Great!

I reckon that this is something we can optimize in future if necessary
(e.g. with some form of code-patching for immediate values), but it's
good to have a starting point that works everywhere!

[...]

> > I'm not too worried about allocating this dynamically, but:
> >
> > * The arch code needs to set up the translation tables for this, as we
> >   cannot safely change the mapping granularity live.
> >
> > * As above I'm fairly certain x86 needs to use a carveout from the
> >   linear map to function correctly anyhow, so we should follow the same
> >   approach for both arm64 and x86. That might be a static carveout that
> >   we figure out the aliasing for, or something entirely dynamic.
> 
> We're going with dynamically allocating the pool (for both x86 and
> arm64), since any benefits we used to measure from the static pool are
> no longer measurable (after removing a branch from
> is_kfence_address()). It should hopefully simplify a lot of things,
> given all the caveats that you pointed out.
> 
> For arm64, the only thing left then is to fix up the case if the
> linear map is not forced to page granularity.

The simplest way to do this is to modify arm64's arch_add_memory() to
force the entire linear map to be mapped at page granularity when KFENCE
is enabled, something like:

| diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
| index 936c4762dadff..f6eba0642a4a3 100644
| --- a/arch/arm64/mm/mmu.c
| +++ b/arch/arm64/mm/mmu.c
| @@ -1454,7 +1454,8 @@ int arch_add_memory(int nid, u64 start, u64 size,
|  {
|         int ret, flags = 0;
|  
| -       if (rodata_full || debug_pagealloc_enabled())
| +       if (rodata_full || debug_pagealloc_enabled() ||
| +           IS_ENABLED(CONFIG_KFENCE))
|                 flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
|  
|         __create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),

... and given that RODATA_FULL_DEFAULT_ENABLED is the default, I suspect
it's not worth trying to do that only for the KFENCE region unless
someone complains.

Thanks,
Mark.
Marco Elver Oct. 14, 2020, 7:12 p.m. UTC | #17
On Thu, 8 Oct 2020 at 12:45, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Oct 08, 2020 at 11:40:52AM +0200, Marco Elver wrote:
> > On Thu, 1 Oct 2020 at 19:58, Mark Rutland <mark.rutland@arm.com> wrote:
> > [...]
> > > > > If you need virt_to_page() to work, the address has to be part of the
> > > > > linear/direct map.
> > [...]
> > >
> > > What's the underlying requirement here? Is this a performance concern,
> > > codegen/codesize, or something else?
> >
> > It used to be performance, since is_kfence_address() is used in the
> > fast path. However, with some further tweaks we just did to
> > is_kfence_address(), our benchmarks show a pointer load can be
> > tolerated.
>
> Great!
>
> I reckon that this is something we can optimize in future if necessary
> (e.g. with some form of code-patching for immediate values), but it's
> good to have a starting point that works everywhere!
>
> [...]
>
> > > I'm not too worried about allocating this dynamically, but:
> > >
> > > * The arch code needs to set up the translation tables for this, as we
> > >   cannot safely change the mapping granularity live.
> > >
> > > * As above I'm fairly certain x86 needs to use a carveout from the
> > >   linear map to function correctly anyhow, so we should follow the same
> > >   approach for both arm64 and x86. That might be a static carveout that
> > >   we figure out the aliasing for, or something entirely dynamic.
> >
> > We're going with dynamically allocating the pool (for both x86 and
> > arm64), since any benefits we used to measure from the static pool are
> > no longer measurable (after removing a branch from
> > is_kfence_address()). It should hopefully simplify a lot of things,
> > given all the caveats that you pointed out.
> >
> > For arm64, the only thing left then is to fix up the case if the
> > linear map is not forced to page granularity.
>
> The simplest way to do this is to modify arm64's arch_add_memory() to
> force the entire linear map to be mapped at page granularity when KFENCE
> is enabled, something like:
>
[...]
>
> ... and given that RODATA_FULL_DEFAULT_ENABLED is the default, I suspect
> it's not worth trying to do that only for the KFENCE region unless
> someone complains.

We've got most of this sorted now for v5 -- thank you!

The only thing we're wondering now, is if there are any corner cases
with using memblock_alloc'd memory for the KFENCE pool? (We'd like to
avoid page alloc's MAX_ORDER limit.) We have a version that passes
tests on x86 and arm64, but checking just in case. :-)

Thanks,
-- Marco
Mark Rutland Oct. 15, 2020, 1:39 p.m. UTC | #18
On Wed, Oct 14, 2020 at 09:12:37PM +0200, Marco Elver wrote:
> On Thu, 8 Oct 2020 at 12:45, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Thu, Oct 08, 2020 at 11:40:52AM +0200, Marco Elver wrote:
> > > On Thu, 1 Oct 2020 at 19:58, Mark Rutland <mark.rutland@arm.com> wrote:

> > > > > > If you need virt_to_page() to work, the address has to be part of the
> > > > > > linear/direct map.

> > > We're going with dynamically allocating the pool (for both x86 and
> > > arm64), 

[...]

> We've got most of this sorted now for v5 -- thank you!
> 
> The only thing we're wondering now, is if there are any corner cases
> with using memblock_alloc'd memory for the KFENCE pool? (We'd like to
> avoid page alloc's MAX_ORDER limit.) We have a version that passes
> tests on x86 and arm64, but checking just in case. :-)

AFAICT otherwise the only noticeable difference might be PageSlab(), if
that's clear for KFENCE allocated pages? A few helpers appear to check
that to determine how something was allocated (e.g. in the scatterlist
and hwpoison code), and I suspect that needs to behave the same.

Otherwise, I *think* using memblock_alloc should be fine on arm64; I'm
not entirely sure for x86 (but suspect it's similar). On arm64:

* All memory is given a struct page via memblocks_present() adding all
  memory memblocks. This includes memory allocated by memblock_alloc().

* All memory is mapped into the linear map via arm64's map_mem() adding
  all (non-nomap) memory memblocks. This includes memory allocated by
  memblock_alloc().
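
So an early allocation along these lines should be fine (a sketch, not the
actual v5 code; the helper name is made up):

| #include <linux/memblock.h>
| 
| static void __init kfence_alloc_pool(void)
| {
| 	/*
| 	 * Early boot: take the pool from memblock. Per the two points above,
| 	 * the memory ends up in the linear map and has struct pages.
| 	 */
| 	__kfence_pool = memblock_alloc(KFENCE_POOL_SIZE, PAGE_SIZE);
| 	if (!__kfence_pool)
| 		pr_err("kfence: failed to allocate pool\n");
| }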

Thanks,
Mark.
Marco Elver Oct. 15, 2020, 2:15 p.m. UTC | #19
On Thu, 15 Oct 2020 at 15:39, Mark Rutland <mark.rutland@arm.com> wrote:
> On Wed, Oct 14, 2020 at 09:12:37PM +0200, Marco Elver wrote:
> > On Thu, 8 Oct 2020 at 12:45, Mark Rutland <mark.rutland@arm.com> wrote:
> > > On Thu, Oct 08, 2020 at 11:40:52AM +0200, Marco Elver wrote:
> > > > On Thu, 1 Oct 2020 at 19:58, Mark Rutland <mark.rutland@arm.com> wrote:
>
> > > > > > > If you need virt_to_page() to work, the address has to be part of the
> > > > > > > linear/direct map.
>
> > > > We're going with dynamically allocating the pool (for both x86 and
> > > > arm64),
>
> [...]
>
> > We've got most of this sorted now for v5 -- thank you!
> >
> > The only thing we're wondering now, is if there are any corner cases
> > with using memblock_alloc'd memory for the KFENCE pool? (We'd like to
> > avoid page alloc's MAX_ORDER limit.) We have a version that passes
> > tests on x86 and arm64, but checking just in case. :-)
>
> AFAICT otherwise the only noticeable difference might be PageSlab(), if
> that's clear for KFENCE allocated pages? A few helpers appear to check
> that to determine how something was allocated (e.g. in the scatterlist
> and hwpoison code), and I suspect that needs to behave the same.

We had to take care of setting PageSlab before, too. We do this during
kfence_init().
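
Roughly (a sketch, not the exact kfence_init() code; the helper name is
made up):

  static void __init kfence_mark_pool_pages(void)
  {
  	unsigned long addr = (unsigned long)__kfence_pool;

  	/*
  	 * Mark every page backing the pool as a slab page, so that
  	 * PageSlab()-based checks (scatterlist, hwpoison, ...) treat KFENCE
  	 * objects like regular slab allocations.
  	 */
  	for (; addr < (unsigned long)__kfence_pool + KFENCE_POOL_SIZE;
  	     addr += PAGE_SIZE)
  		__SetPageSlab(virt_to_page(addr));
  }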

> Otherwise, I *think* using memblock_alloc should be fine on arm64; I'm
> not entirely sure for x86 (but suspect it's similar). On arm64:
>
> * All memory is given a struct page via memblocks_present() adding all
>   memory memblocks. This includes memory allocated by memblock_alloc().
>
> * All memory is mapped into the linear map via arm64's map_mem() adding
>   all (non-nomap) memory memblocks. This includes memory allocated by
>   memblock_alloc().

Very good, thank you. We'll send v5 with these changes rebased on
5.10-rc1 (in ~2 weeks).

Thanks,
-- Marco

Patch

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6d232837cbee..1acc6b2877c3 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -132,6 +132,7 @@  config ARM64
 	select HAVE_ARCH_JUMP_LABEL_RELATIVE
 	select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
 	select HAVE_ARCH_KASAN_SW_TAGS if HAVE_ARCH_KASAN
+	select HAVE_ARCH_KFENCE if (!ARM64_16K_PAGES && !ARM64_64K_PAGES)
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_MMAP_RND_BITS
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
diff --git a/arch/arm64/include/asm/kfence.h b/arch/arm64/include/asm/kfence.h
new file mode 100644
index 000000000000..608dde80e5ca
--- /dev/null
+++ b/arch/arm64/include/asm/kfence.h
@@ -0,0 +1,39 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_KFENCE_H
+#define __ASM_KFENCE_H
+
+#include <linux/kfence.h>
+#include <linux/log2.h>
+#include <linux/mm.h>
+
+#include <asm/cacheflush.h>
+
+#define KFENCE_SKIP_ARCH_FAULT_HANDLER "el1_sync"
+
+/*
+ * FIXME: Support HAVE_ARCH_KFENCE_STATIC_POOL: Use the statically allocated
+ * __kfence_pool, to avoid the extra pointer load for is_kfence_address(). By
+ * default, however, we do not have struct pages for static allocations.
+ */
+
+static inline bool arch_kfence_initialize_pool(void)
+{
+	const unsigned int num_pages = ilog2(roundup_pow_of_two(KFENCE_POOL_SIZE / PAGE_SIZE));
+	struct page *pages = alloc_pages(GFP_KERNEL, num_pages);
+
+	if (!pages)
+		return false;
+
+	__kfence_pool = page_address(pages);
+	return true;
+}
+
+static inline bool kfence_protect_page(unsigned long addr, bool protect)
+{
+	set_memory_valid(addr, 1, !protect);
+
+	return true;
+}
+
+#endif /* __ASM_KFENCE_H */
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index f07333e86c2f..d5b72ecbeeea 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -10,6 +10,7 @@ 
 #include <linux/acpi.h>
 #include <linux/bitfield.h>
 #include <linux/extable.h>
+#include <linux/kfence.h>
 #include <linux/signal.h>
 #include <linux/mm.h>
 #include <linux/hardirq.h>
@@ -310,6 +311,9 @@  static void __do_kernel_fault(unsigned long addr, unsigned int esr,
 	    "Ignoring spurious kernel translation fault at virtual address %016lx\n", addr))
 		return;
 
+	if (kfence_handle_page_fault(addr))
+		return;
+
 	if (is_el1_permission_fault(addr, esr, regs)) {
 		if (esr & ESR_ELx_WNR)
 			msg = "write to read-only memory";