diff mbox series

[RFC,01/10] vmalloc: Add basic perm alloc implementation

Message ID 20201120202426.18009-2-rick.p.edgecombe@intel.com (mailing list archive)
State RFC
Series New permission vmalloc interface

Commit Message

Edgecombe, Rick P Nov. 20, 2020, 8:24 p.m. UTC
In order to allow for future arch specific optimizations for vmalloc
permissions, first add an implementation of a new interface that will
work cross arch by using the existing set_memory_() functions.

For example, when allocating memory that will end up RO, the interface
should be used like:

/* Reserve va */
struct perm_allocation *alloc = perm_alloc(vstart, vend, page_cnt, PERM_R);
unsigned long ro = (unsigned long)perm_alloc_address(alloc);

/* Write to writable address */
strcpy((char *)perm_writable_addr(alloc, ro), "Some data to be RO");
/* Signal that writing is done and mapping should be live */
perm_writable_finish(alloc);
/* Print from RO address */
printk("Read only data is: %s\n", (char *)ro);

Create some new flags to handle the memory permissions currently defined
cross-architecturally in the set_memory_() function names themselves. The
PAGE_ defines are not uniform across the architectures, so they couldn't be
used without unifying them. In the future there may also be some other
flags, for example one requesting to try to allocate into part of a 2MB
page for longer lived allocations.

Have the default implementation use the primary address for loading the
data as is done today for special kernel permission usages. However, make
the interface compatible with having the writable data loaded at a
separate address or via some PKS backed solution. Allocate using
module_alloc() in the default implementation in order to allocate from
each arch's chosen place for executable code.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/Kconfig            |   3 +
 include/linux/vmalloc.h |  82 ++++++++++++++++++++++++
 mm/nommu.c              |  66 ++++++++++++++++++++
 mm/vmalloc.c            | 135 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 286 insertions(+)

Comments

Andy Lutomirski Nov. 22, 2020, 4:10 a.m. UTC | #1
On Fri, Nov 20, 2020 at 12:30 PM Rick Edgecombe
<rick.p.edgecombe@intel.com> wrote:
>
> In order to allow for future arch specific optimizations for vmalloc
> permissions, first add an implementation of a new interface that will
> work cross arch by using the existing set_memory_() functions.
>
> When allocating some memory that will be RO, for example it should be used
> like:
>
> /* Reserve va */
> struct perm_allocation *alloc = perm_alloc(vstart, vend, page_cnt, PERM_R);

I'm sure I could reverse-engineer this from the code, but:

Where do vstart and vend come from?  Does perm_alloc() allocate memory
or just virtual addresses?  Is the caller expected to call vmalloc()?
How does one free this thing?

> unsigned long ro = (unsigned long)perm_alloc_address(alloc);
>
> /* Write to writable address */
> strcpy((char *)perm_writable_addr(alloc, ro), "Some data to be RO");
> /* Signal that writing is done and mapping should be live */
> perm_writable_finish(alloc);
> /* Print from RO address */
> printk("Read only data is: %s\n", (char *)ro);
>
Edgecombe, Rick P Nov. 23, 2020, 12:01 a.m. UTC | #2
On Sat, 2020-11-21 at 20:10 -0800, Andy Lutomirski wrote:
> On Fri, Nov 20, 2020 at 12:30 PM Rick Edgecombe
> <rick.p.edgecombe@intel.com> wrote:
> > In order to allow for future arch specific optimizations for
> > vmalloc
> > permissions, first add an implementation of a new interface that
> > will
> > work cross arch by using the existing set_memory_() functions.
> > 
> > When allocating some memory that will be RO, for example it should
> > be used
> > like:
> > 
> > /* Reserve va */
> > struct perm_allocation *alloc = perm_alloc(vstart, vend, page_cnt,
> > PERM_R);
> 
> I'm sure I could reverse-engineer this from the code, but:
> 
> Where do vstart and vend come from?

They are the start and end virtual address range to try to allocate in,
like __vmalloc_node_range() has. So like, MODULES_VADDR and
MODULES_END. Sorry for the terse example. The header in this patch has
some comments about each of the new functions to supplement it a bit.

> Does perm_alloc() allocate memory or just virtual addresses? Is the
> caller expected to call vmalloc()?

The caller does not need to call vmalloc(). perm_alloc() behaves
similarly to __vmalloc_node_range(), where it allocates both memory and
virtual addresses. I left a little wiggle room in the descriptions in
the header, that the virtual address range doesn't actually need to be
mapped until after perm_writable_finish(). But both of the
implementations in this series will map it right away like a vmalloc().

So the interface could actually pretty easily be changed to look like
another flavor of vmalloc() that just returns a pointer to the
allocation. The reason why it returns this new struct instead is that,
unlike most vmalloc()'s, the callers will be looking up metadata about
the allocation a bunch of times (the writable address). Having this
metadata stored in some struct inside vmalloc would mean
perm_writable_addr() would have to do something like find_vmap_area()
every time in order to find the writable address from the allocation's
address passed in. So returning a struct makes it so the writable
translation can skip a global lock and lookup.

Another option could be putting the new metadata in vm_struct and just
return that, like get_vm_area(). Then we don't need to invent a new
struct. But then normal vmalloc()'s would have a bit of wasted memory
since they don't need this metadata.

A nice thing about that though, is there would be a central place to
translate to the writable addresses in cases where only a virtual
address is available. In later patches here, a similar lookup happens
anyway for modules using __module_address() to get the writable
address. This is due to some existing code where plumbing the new
struct all the way through would have resulted in too many changes.

I'm not sure which is best.

> How does one free this thing?

void perm_free(struct perm_allocation *alloc);
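
To make the lifecycle concrete (allocate, write, finish, read, free), here
is a rough userspace analogy built on mmap()/mprotect()/munmap(). It is not
the perm_alloc() implementation: struct ro_buf and the ro_buf_*() helpers
are hypothetical names invented for this sketch, and the writable and final
addresses are the same here, as in the default implementation in this
patch.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical userspace stand-in for struct perm_allocation. */
struct ro_buf {
	void *addr;
	size_t size;
};

/* Like perm_alloc(): reserve memory that starts out writable. */
static inline int ro_buf_alloc(struct ro_buf *buf, size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;

	buf->addr = p;
	buf->size = size;
	return 0;
}

/* Like perm_writable_finish(): drop write access once loading is done. */
static inline int ro_buf_finish(struct ro_buf *buf)
{
	return mprotect(buf->addr, buf->size, PROT_READ);
}

/* Like perm_free(): release the mapping and clear the handle. */
static inline void ro_buf_free(struct ro_buf *buf)
{
	munmap(buf->addr, buf->size);
	buf->addr = NULL;
	buf->size = 0;
}
```

A caller would ro_buf_alloc(), strcpy() into buf.addr, call
ro_buf_finish(), read the data back, and finally ro_buf_free().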
Christoph Hellwig Nov. 23, 2020, 9 a.m. UTC | #3
First thanks for doing this, having a vmalloc variant that starts out
with proper permissions has been on my todo list for a while.


> +#define PERM_R	1
> +#define PERM_W	2
> +#define PERM_X	4
> +#define PERM_RWX	(PERM_R | PERM_W | PERM_X)
> +#define PERM_RW		(PERM_R | PERM_W)
> +#define PERM_RX		(PERM_R | PERM_X)

Why can't this use the normal pgprot flags?

> +typedef u8 virtual_perm;

This would need __bitwise annotations to allow sparse to typecheck the
flags.
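
A minimal sketch of what that annotation could look like for these flags.
The #ifdef __CHECKER__ fallback is only there so the snippet builds outside
the kernel tree, uint8_t stands in for the kernel's u8, and perm_has() is a
hypothetical helper added for illustration:

```c
#include <stdint.h>

/* Outside of sparse these annotations compile away to nothing. */
#ifdef __CHECKER__
#define __bitwise	__attribute__((bitwise))
#define __force		__attribute__((force))
#else
#define __bitwise
#define __force
#endif

/* With __bitwise, sparse can flag plain integers mixed in with the flags. */
typedef uint8_t __bitwise virtual_perm;

#define PERM_R		((__force virtual_perm)1)
#define PERM_W		((__force virtual_perm)2)
#define PERM_X		((__force virtual_perm)4)
#define PERM_RWX	(PERM_R | PERM_W | PERM_X)
#define PERM_RW		(PERM_R | PERM_W)
#define PERM_RX		(PERM_R | PERM_X)

/* Hypothetical helper: true if every bit in flag is set in perms. */
static inline int perm_has(virtual_perm perms, virtual_perm flag)
{
	return (perms & flag) == flag;
}
```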

> +/*
> + * Allocate a special permission kva region. The region may not be mapped
> + * until a call to perm_writable_finish(). A writable region will be mapped
> + * immediately at the address returned by perm_writable_addr(). The allocation
> + * will be made between the start and end virtual addresses.
> + */
> +struct perm_allocation *perm_alloc(unsigned long vstart, unsigned long vend, unsigned long page_cnt,
> +				   virtual_perm perms);

Please avoid totally pointless overly long lines (all over the series)

Also I find the unsigned long for kernel virtual address interface
strange, but I'll take a look at the callers later.
Edgecombe, Rick P Nov. 23, 2020, 8:44 p.m. UTC | #4
On Mon, 2020-11-23 at 09:00 +0000, Christoph Hellwig wrote:
> First thanks for doing this, having a vmalloc variant that starts out
> with proper permissions has been on my todo list for a while.
> 
> > +#define PERM_R	1
> > +#define PERM_W	2
> > +#define PERM_X	4
> > +#define PERM_RWX	(PERM_R | PERM_W | PERM_X)
> > +#define PERM_RW		(PERM_R | PERM_W)
> > +#define PERM_RX		(PERM_R | PERM_X)
> 
> Why can't this use the normal pgprot flags?
> 

Well, there were two reasons:
1. Non-standard naming for the PAGE_FOO flags. For example,
PAGE_KERNEL_ROX vs PAGE_KERNEL_READ_EXEC. This could be unified. I
think it's just riscv that breaks the conventions. Others are just
missing some.

2. The need to translate between the flags and set_memory_foo() calls.
For example if a permission is RW and the caller is asking to change it
to RWX. Some architectures have an X permission and others an NX
permission, and it's the same with read only vs writable. So these
flags are trying to be more analogous to the cross-arch set_memory_()
function names than to the pgprot flags.

I guess you could do something like (pgprot_val(PAGE_KERNEL_EXEC) &
~pgprot_val(PAGE_KERNEL)) and assume if there are any bits set it is a
positive permission and from that deduce whether to call
set_memory_nx() or set_memory_x().

But I thought that using those pgprot flags was still sort of overloading
the meaning of pgprot. My understanding was that it is supposed to hold
the actual bits set in the PTE. For example large pages or TLB hints
(like PAGE_KERNEL_EXEC_CONT) could set or unset extra bits, so asking
for PAGE_KERNEL_EXEC wouldn't necessarily mean "set these bits in all
of the PTEs", it could mean something more like "infer what I want from
these bits and do that".

x86's cpa will also avoid changing NX if it is not supported, so if the
caller asked for PAGE_KERNEL->PAGE_KERNEL_EXEC in perm_change() it
should not necessarily bother setting all of the PAGE_KERNEL_EXEC bits
in the actual PTEs. Asking for PERM_RW->PERM_RWX on the other hand,
would let the implementation do whatever it needs to set the memory
executable, like set_memory_x() does. It should work either way but
seems like the expectations would be a little clearer with the PERM_
flags.
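
As a rough userspace sketch of that translation: the function below
mirrors the set/unset computation in perm_change() from this patch, but
only reports which set_memory_*() calls would be made. The enum
perm_action values and perm_transition_actions() are hypothetical names
for this illustration; no page tables are touched.

```c
#include <stdint.h>

#define PERM_R		1u
#define PERM_W		2u
#define PERM_X		4u
#define PERM_RWX	(PERM_R | PERM_W | PERM_X)
#define PERM_RW		(PERM_R | PERM_W)
#define PERM_RX		(PERM_R | PERM_X)

/* Hypothetical: one bit per set_memory_*() call that would be issued. */
enum perm_action {
	ACT_SET_RW = 1,	/* set_memory_rw() */
	ACT_SET_RO = 2,	/* set_memory_ro() */
	ACT_SET_X  = 4,	/* set_memory_x()  */
	ACT_SET_NX = 8,	/* set_memory_nx() */
};

static inline unsigned int perm_transition_actions(uint8_t cur, uint8_t want)
{
	uint8_t set = (uint8_t)(~cur & want);	/* bits being gained */
	uint8_t unset = (uint8_t)(cur & ~want);	/* bits being dropped */
	unsigned int actions = 0;

	if (set & PERM_W)
		actions |= ACT_SET_RW;
	if (unset & PERM_W)
		actions |= ACT_SET_RO;
	if (set & PERM_X)
		actions |= ACT_SET_X;
	if (unset & PERM_X)
		actions |= ACT_SET_NX;

	return actions;
}
```

So PERM_RW -> PERM_RWX only needs the "make executable" step, whichever
PTE bit (X or NX) the architecture uses to express it.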

On the other hand, creating a whole new set of flags is not ideal
either. But that was just my reasoning. Does it seem worth it?

> > +typedef u8 virtual_perm;
> 
> This would need __bitwise annotations to allow sparse to typecheck
> the
> flags.
> 

Ok, thanks.

> > +/*
> > + * Allocate a special permission kva region. The region may not be
> > mapped
> > + * until a call to perm_writable_finish(). A writable region will
> > be mapped
> > + * immediately at the address returned by perm_writable_addr().
> > The allocation
> > + * will be made between the start and end virtual addresses.
> > + */
> > +struct perm_allocation *perm_alloc(unsigned long vstart, unsigned
> > long vend, unsigned long page_cnt,
> > +				   virtual_perm perms);
> 
> Please avoid totally pointless overly long lines (all over the series)

Could easily wrap this one, but just to clarify, do you mean lines over
80 chars? There were already some over 80 in vmalloc before the move to
100 chars, so figured it was ok to stretch out now.

> Also I find the unsigned long for kernel virtual address interface
> strange, but I'll take a look at the callers later.

Yea, some of the callers need to cast either way. I think I changed it
to unsigned long, because casting (void *) was smaller in the code than
(unsigned long) and it shortened some line lengths.
Christoph Hellwig Nov. 24, 2020, 10:16 a.m. UTC | #5
On Mon, Nov 23, 2020 at 12:01:35AM +0000, Edgecombe, Rick P wrote:
> Another option could be putting the new metadata in vm_struct and just
> return that, like get_vm_area(). Then we don't need to invent a new
> struct. But then normal vmalloc()'s would have a bit of wasted memory
> since they don't need this metadata.

That would seem most natural to me.  We'll need to figure out how we
can do that without bloating vm_struct too much.  One option would
be a bigger structure that embeds vm_struct and can be retrieved using
container_of().
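
A minimal userspace sketch of that embedding, under stated assumptions:
struct vm_struct_stub is a cut-down stand-in for the real vm_struct, and
struct perm_vm_struct and to_perm_vm() are hypothetical names, not
something from this series. container_of() is spelled out here only so
the snippet builds outside the kernel.

```c
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the kernel's struct vm_struct (most fields omitted). */
struct vm_struct_stub {
	void *addr;
	unsigned long size;
};

/* Hypothetical wrapper: only permission allocations pay for the extras. */
struct perm_vm_struct {
	struct vm_struct_stub vm;
	void *writable;
	unsigned char cur_perm;
	unsigned char orig_perm;
};

/* Recover the wrapper from a pointer to the embedded vm_struct. */
static inline struct perm_vm_struct *to_perm_vm(struct vm_struct_stub *vm)
{
	return container_of(vm, struct perm_vm_struct, vm);
}
```

Plain vmalloc() users keep the unmodified vm_struct, and only callers that
know they hold a permission allocation would use the conversion.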
Christoph Hellwig Nov. 24, 2020, 10:19 a.m. UTC | #6
On Mon, Nov 23, 2020 at 08:44:12PM +0000, Edgecombe, Rick P wrote:
> Well, there were two reasons:
> 1. Non-standard naming for the PAGE_FOO flags. For example,
> PAGE_KERNEL_ROX vs PAGE_KERNEL_READ_EXEC. This could be unified. I
> think it's just riscv that breaks the conventions. Others are just
> missing some.

We need to standardize those anyway.  I've done that for a few
PAGE_* constants already but as you see there is more work to do.

> But I thought that using those pgprot flags was still sort of overloading
> the meaning of pgprot. My understanding was that it is supposed to hold
> the actual bits set in the PTE. For example large pages or TLB hints
> (like PAGE_KERNEL_EXEC_CONT) could set or unset extra bits, so asking
> for PAGE_KERNEL_EXEC wouldn't necessarily mean "set these bits in all
> of the PTEs", it could mean something more like "infer what I want from
> these bits and do that".
> 
> x86's cpa will also avoid changing NX if it is not supported, so if the
> caller asked for PAGE_KERNEL->PAGE_KERNEL_EXEC in perm_change() it
> should not necessarily bother setting all of the PAGE_KERNEL_EXEC bits
> in the actual PTEs. Asking for PERM_RW->PERM_RWX on the other hand,
> would let the implementation do whatever it needs to set the memory
> executable, like set_memory_x() does. It should work either way but
> seems like the expectations would be a little clearer with the PERM_
> flags.

Ok, maybe that is an argument, and we should use the new flags more
broadly.

> Could easily wrap this one, but just to clarify, do you mean lines over
> 80 chars? There were already some over 80 in vmalloc before the move to
> 100 chars, so figured it was ok to stretch out now.

CodingStyle still says 80 characters unless you have an exception where
a longer line improves the readability.  The quoted code absolutely
does not fit the definition of an exception or improve readability.
Edgecombe, Rick P Nov. 24, 2020, 7:59 p.m. UTC | #7
On Tue, 2020-11-24 at 10:19 +0000, hch@infradead.org wrote:
> > But I thought that using those pgprot flags was still sort of
> > overloading
> > the meaning of pgprot. My understanding was that it is supposed to
> > hold
> > the actual bits set in the PTE. For example large pages or TLB
> > hints
> > (like PAGE_KERNEL_EXEC_CONT) could set or unset extra bits, so
> > asking
> > for PAGE_KERNEL_EXEC wouldn't necessarily mean "set these bits in
> > all
> > of the PTEs", it could mean something more like "infer what I want
> > from
> > these bits and do that".
> > 
> > x86's cpa will also avoid changing NX if it is not supported, so if
> > the
> > caller asked for PAGE_KERNEL->PAGE_KERNEL_EXEC in perm_change() it
> > should not necessarily bother setting all of the PAGE_KERNEL_EXEC
> > bits
> > in the actual PTEs. Asking for PERM_RW->PERM_RWX on the other hand,
> > would let the implementation do whatever it needs to set the memory
> > executable, like set_memory_x() does. It should work either way but
> > seems like the expectations would be a little clearer with the
> > PERM_
> > flags.
> 
> Ok, maybe that is an argument, and we should use the new flags more
> broadly.

They might make sense to live in set_memory.h then. Separate from this
patchset, a call like set_memory(addr, numpages, PERM_R) could be more
efficient than two calls to set_memory_ro() and set_memory_nx(). Not
that it happens very much outside of vmalloc usages. But just to try to
think where else it could be used.

> > Could easily wrap this one, but just to clarify, do you mean lines
> > over
> > 80 chars? There were already some over 80 in vmalloc before the
> > move to
> > 100 chars, so figured it was ok to stretch out now.
> 
> CodingStyle still says 80 characters unless you have an exception
> where
> a longer line improves the readability.  The quoted code absolutely
> does not fit the definition of an exception or improve readability.

Fair enough.

And to the other comment in your first mail, glad to do this and
finally send it out. This series has been sitting in a local branch for
most of the year while stuff kept interrupting it.
Edgecombe, Rick P Nov. 24, 2020, 8 p.m. UTC | #8
On Tue, 2020-11-24 at 10:16 +0000, Christoph Hellwig wrote:
> On Mon, Nov 23, 2020 at 12:01:35AM +0000, Edgecombe, Rick P wrote:
> > Another option could be putting the new metadata in vm_struct and
> > just
> > return that, like get_vm_area(). Then we don't need to invent a new
> > struct. But then normal vmalloc()'s would have a bit of wasted
> > memory
> > since they don't need this metadata.
> 
> That would seem most natural to me.  We'll need to figure out how we
> can do that without bloating vm_struct too much.  One option would
> be a bigger structure that embeds vm_struct and can be retrieved
> using
> container_of().

Hmm, neat. I can change this in the next version.
Sean Christopherson Dec. 4, 2020, 11:24 p.m. UTC | #9
On Fri, Nov 20, 2020, Rick Edgecombe wrote:
> +struct perm_allocation {
> +	struct page **pages;
> +	virtual_perm cur_perm;
> +	virtual_perm orig_perm;
> +	struct vm_struct *area;
> +	unsigned long offset;
> +	unsigned long size;
> +	void *writable;
> +};
> +
> +/*
> + * Allocate a special permission kva region. The region may not be mapped
> + * until a call to perm_writable_finish(). A writable region will be mapped
> + * immediately at the address returned by perm_writable_addr(). The allocation
> + * will be made between the start and end virtual addresses.
> + */
> +struct perm_allocation *perm_alloc(unsigned long vstart, unsigned long vend, unsigned long page_cnt,
> +				   virtual_perm perms);

IMO, 'perm' as the root namespace is too generic, and perm_ is already very
prevalent throughout the kernel.  E.g. it's not obvious when looking at the
callers that perm_alloc() is the first step in setting up an alternate kernel
VA->PA mapping.

I don't have a suggestion for a more intuitive name, but in the absence of a
perfect name, I'd vote for an acronym that is easy to grep.  Something like
pvmap?  That isn't currently used in the kernel, though I can't help but read it
as "paravirt map"...
Edgecombe, Rick P Dec. 7, 2020, 11:55 p.m. UTC | #10
On Fri, 2020-12-04 at 15:24 -0800, Sean Christopherson wrote:
> On Fri, Nov 20, 2020, Rick Edgecombe wrote:
> > +struct perm_allocation {
> > +	struct page **pages;
> > +	virtual_perm cur_perm;
> > +	virtual_perm orig_perm;
> > +	struct vm_struct *area;
> > +	unsigned long offset;
> > +	unsigned long size;
> > +	void *writable;
> > +};
> > +
> > +/*
> > + * Allocate a special permission kva region. The region may not be
> > mapped
> > + * until a call to perm_writable_finish(). A writable region will
> > be mapped
> > + * immediately at the address returned by perm_writable_addr().
> > The allocation
> > + * will be made between the start and end virtual addresses.
> > + */
> > +struct perm_allocation *perm_alloc(unsigned long vstart, unsigned
> > long vend, unsigned long page_cnt,
> > +				   virtual_perm perms);
> 
> IMO, 'perm' as the root namespace is too generic, and perm_ is
> already very
> prevalent throughout the kernel.  E.g. it's not obvious when looking
> at the
> callers that perm_alloc() is the first step in setting up an
> alternate kernel
> VA->PA mapping.
> 
> I don't have a suggestion for a more intuitive name, but in the
> absence of a
> perfect name, I'd vote for an acronym that is easy to
> grep.  Something like
> pvmap?  That isn't currently used in the kernel, though I can't help
> but read it
> as "paravirt map"...

Good point, thanks.

After Christoph's comments to return a vm_struct pointer, I was going
to try to pick some more vmalloc-like names. Like vmalloc_perm(),
vmalloc_writable_finish(), etc. Still have to play around with it some
more.

Patch

diff --git a/arch/Kconfig b/arch/Kconfig
index 56b6ccc0e32d..0fa42f76548d 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -259,6 +259,9 @@  config ARCH_HAS_SET_MEMORY
 config ARCH_HAS_SET_DIRECT_MAP
 	bool
 
+config ARCH_HAS_PERM_ALLOC_IMPLEMENTATION
+	bool
+
 #
 # Select if the architecture provides the arch_dma_set_uncached symbol to
 # either provide an uncached segement alias for a DMA allocation, or
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 938eaf9517e2..4a6b30014fff 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -248,4 +248,86 @@  pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
 int register_vmap_purge_notifier(struct notifier_block *nb);
 int unregister_vmap_purge_notifier(struct notifier_block *nb);
 
+#define PERM_R	1
+#define PERM_W	2
+#define PERM_X	4
+#define PERM_RWX	(PERM_R | PERM_W | PERM_X)
+#define PERM_RW		(PERM_R | PERM_W)
+#define PERM_RX		(PERM_R | PERM_X)
+
+typedef u8 virtual_perm;
+
+struct perm_allocation {
+	struct page **pages;
+	virtual_perm cur_perm;
+	virtual_perm orig_perm;
+	struct vm_struct *area;
+	unsigned long offset;
+	unsigned long size;
+	void *writable;
+};
+
+/*
+ * Allocate a special permission kva region. The region may not be mapped
+ * until a call to perm_writable_finish(). A writable region will be mapped
+ * immediately at the address returned by perm_writable_addr(). The allocation
+ * will be made between the start and end virtual addresses.
+ */
+struct perm_allocation *perm_alloc(unsigned long vstart, unsigned long vend, unsigned long page_cnt,
+				   virtual_perm perms);
+
+/* The writable address for data to be loaded into the allocation */
+unsigned long perm_writable_addr(struct perm_allocation *alloc, unsigned long addr);
+
+/* Signal that loading is done and the final permissions should be applied */
+bool perm_writable_finish(struct perm_allocation *alloc);
+
+/* Change the permission of an allocation that is already live */
+bool perm_change(struct perm_allocation *alloc, virtual_perm perms);
+
+/* Free an allocation */
+void perm_free(struct perm_allocation *alloc);
+
+/* Helper for memsetting an allocation. Should be called before perm_writable_finish() */
+void perm_memset(struct perm_allocation *alloc, char val);
+
+/* The final address of the allocation */
+static inline unsigned long perm_alloc_address(const struct perm_allocation *alloc)
+{
+	return (unsigned long)alloc->area->addr + alloc->offset;
+}
+
+/* The size of the allocation */
+static inline unsigned long perm_alloc_size(const struct perm_allocation *alloc)
+{
+	return alloc->size;
+}
+
+static inline bool within_perm_alloc(const struct perm_allocation *alloc,
+				     unsigned long addr)
+{
+	unsigned long base, size;
+
+	if (!alloc)
+		return false;
+
+	base = perm_alloc_address(alloc);
+	size = perm_alloc_size(alloc);
+
+	return base <= addr && addr < base + size;
+}
+
+static inline unsigned long perm_writable_base(struct perm_allocation *alloc)
+{
+	return perm_writable_addr(alloc, perm_alloc_address(alloc));
+}
+
+static inline bool perm_is_writable(struct perm_allocation *alloc)
+{
+	if (!alloc)
+		return false;
+
+	return (alloc->cur_perm & PERM_W) || alloc->writable;
+}
+
 #endif /* _LINUX_VMALLOC_H */
diff --git a/mm/nommu.c b/mm/nommu.c
index 0faf39b32cdb..6458bd23de3e 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1810,6 +1810,72 @@  int nommu_shrink_inode_mappings(struct inode *inode, size_t size,
 	return 0;
 }
 
+struct perm_allocation *perm_alloc(unsigned long vstart, unsigned long vend, unsigned long page_cnt,
+				   virtual_perm perms)
+{
+	struct perm_allocation *alloc;
+	struct vm_struct *area;
+	unsigned long size = page_cnt << PAGE_SHIFT;
+	void *ptr;
+
+	if (!size)
+		return NULL;
+
+	alloc = kmalloc(sizeof(*alloc), GFP_KERNEL | __GFP_ZERO);
+
+	if (!alloc)
+		return NULL;
+
+	area = kmalloc(sizeof(*area), GFP_KERNEL | __GFP_ZERO);
+
+	if (!area)
+		goto free_alloc;
+
+	alloc->area = area;
+
+	ptr = vmalloc(size);
+
+	if (!ptr)
+		goto free_area;
+
+	alloc->size = size;
+	alloc->cur_perm = PERM_RWX;
+
+	return alloc;
+
+free_area:
+	kfree(area);
+free_alloc:
+	kfree(alloc);
+	return NULL;
+}
+
+unsigned long perm_writable_addr(struct perm_allocation *alloc, unsigned long addr)
+{
+	return addr;
+}
+
+bool perm_writable_finish(struct perm_allocation *alloc)
+{
+	return true;
+}
+
+bool perm_change(struct perm_allocation *alloc, virtual_perm perms)
+{
+	return true;
+}
+
+void perm_free(struct perm_allocation *alloc)
+{
+	if (!alloc)
+		return;
+
+	kfree(alloc->area);
+	kfree(alloc);
+}
+
+void perm_memset(struct perm_allocation *alloc, char val) {}
+
 /*
  * Initialise sysctl_user_reserve_kbytes.
  *
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6ae491a8b210..3e8e54a75dfc 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -34,6 +34,7 @@ 
 #include <linux/bitops.h>
 #include <linux/rbtree_augmented.h>
 #include <linux/overflow.h>
+#include <linux/moduleloader.h>
 
 #include <linux/uaccess.h>
 #include <asm/tlbflush.h>
@@ -3088,6 +3089,140 @@  void free_vm_area(struct vm_struct *area)
 }
 EXPORT_SYMBOL_GPL(free_vm_area);
 
+#ifndef CONFIG_ARCH_HAS_PERM_ALLOC_IMPLEMENTATION
+
+#ifndef CONFIG_MODULES
+/* If modules are not configured, provide stubs so perm_alloc() could use fallback logic. */
+void *module_alloc(unsigned long size)
+{
+	return NULL;
+}
+
+void module_memfree(void *module_region) { }
+#endif /* !CONFIG_MODULES */
+
+struct perm_allocation *perm_alloc(unsigned long vstart, unsigned long vend, unsigned long page_cnt,
+				   virtual_perm perms)
+{
+	struct perm_allocation *alloc;
+	unsigned long size = page_cnt << PAGE_SHIFT;
+	void *ptr;
+
+	if (!size)
+		return NULL;
+
+	alloc = kmalloc(sizeof(*alloc), GFP_KERNEL | __GFP_ZERO);
+
+	if (!alloc)
+		return NULL;
+
+	ptr = module_alloc(size);
+
+	if (!ptr) {
+		kfree(alloc);
+		return NULL;
+	}
+
+	/*
+	 * In order to work with all arch's we call the arch's module_alloc() which is the only
+	 * cross-arch place where information about where an executable allocation should go is
+	 * located. If the caller passed in a different range they want for the allocation...we
+	 * could try a vmalloc_node_range() at this point, but just return NULL for now.
+	 */
+	if ((unsigned long)ptr < vstart || (unsigned long)ptr >= vend) {
+		module_memfree(ptr);
+		kfree(alloc);
+		return NULL;
+	}
+
+	alloc->area = find_vm_area(ptr);
+	alloc->size = size;
+
+	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_X86))
+		alloc->cur_perm = PERM_RW;
+	else
+		alloc->cur_perm = PERM_RWX;
+
+	alloc->orig_perm = perms;
+
+	return alloc;
+}
+
+unsigned long perm_writable_addr(struct perm_allocation *alloc, unsigned long addr)
+{
+	return addr;
+}
+
+bool perm_writable_finish(struct perm_allocation *alloc)
+{
+	if (!alloc)
+		return false;
+
+	return perm_change(alloc, alloc->orig_perm);
+}
+
+bool perm_change(struct perm_allocation *alloc, virtual_perm perm)
+{
+	unsigned long start, npages;
+	virtual_perm unset, set;
+
+	if (!alloc)
+		return false;
+
+	npages = alloc->size >> PAGE_SHIFT;
+
+	start = perm_alloc_address(alloc);
+
+	set = ~alloc->cur_perm & perm;
+	unset = alloc->cur_perm & ~perm;
+
+	if (set & PERM_W)
+		set_memory_rw(start, npages);
+
+	if (unset & PERM_W)
+		set_memory_ro(start, npages);
+
+	if (set & PERM_X)
+		set_memory_x(start, npages);
+
+	if (unset & PERM_X)
+		set_memory_nx(start, npages);
+
+	alloc->cur_perm = perm;
+
+	return true;
+}
+
+static inline bool perms_need_reset(struct perm_allocation *alloc)
+{
+	return (alloc->cur_perm & PERM_X) || (~alloc->cur_perm & PERM_W);
+}
+
+void perm_free(struct perm_allocation *alloc)
+{
+	unsigned long addr;
+
+	if (!alloc)
+		return;
+
+	addr = perm_alloc_address(alloc);
+
+	if (perms_need_reset(alloc))
+		set_vm_flush_reset_perms((void *)addr);
+
+	module_memfree((void *)addr);
+
+	kfree(alloc);
+}
+
+void perm_memset(struct perm_allocation *alloc, char val)
+{
+	if (!alloc)
+		return;
+	memset((void *)perm_writable_base(alloc), val, perm_alloc_size(alloc));
+}
+#endif /* !CONFIG_ARCH_HAS_PERM_ALLOC_IMPLEMENTATION */
+
 #ifdef CONFIG_SMP
 static struct vmap_area *node_to_va(struct rb_node *n)
 {