Message ID | 20230918072955.2507221-4-rppt@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | mm: jit/text allocator |
On Mon, 2023-09-18 at 10:29 +0300, Mike Rapoport wrote:
> +
> +static void execmem_init_missing(struct execmem_params *p)
> +{
> +	struct execmem_range *default_range = &p->ranges[EXECMEM_DEFAULT];
> +
> +	for (int i = EXECMEM_DEFAULT + 1; i < EXECMEM_TYPE_MAX; i++) {
> +		struct execmem_range *r = &p->ranges[i];
> +
> +		if (!r->start) {
> +			r->pgprot = default_range->pgprot;
> +			r->alignment = default_range->alignment;
> +			r->start = default_range->start;
> +			r->end = default_range->end;
> +		}
> +	}
> +}
> +

It seems a bit weird to copy all of this. Is it trying to be faster or
something?

Couldn't it just check r->start in execmem_text/data_alloc() path and
switch to EXECMEM_DEFAULT if needed then? The execmem_range_is_data()
part that comes later could be added to the logic there too. So this
seems like unnecessary complexity to me or I don't see the reason.
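The alternative raised above, resolving the fallback at allocation time instead of copying the default range at init, could look roughly like the following userspace sketch. It is not code from the series: `execmem_pick_range()` is a hypothetical helper, the structures are trimmed stand-ins for the kernel ones (no `pgprot`), and an unset range is assumed to be signalled by `start == 0`, as in `execmem_init_missing()`.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; illustrative only. */
enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_KPROBES,
	EXECMEM_BPF,
	EXECMEM_TYPE_MAX,
};

struct execmem_range {
	unsigned long start;
	unsigned long end;
	unsigned int alignment;
};

struct execmem_params {
	struct execmem_range ranges[EXECMEM_TYPE_MAX];
};

/*
 * Hypothetical helper (not part of the series): choose the range for
 * @type when an allocation is requested. An entry the architecture left
 * unset (start == 0) falls back to EXECMEM_DEFAULT, so nothing has to be
 * copied at init time.
 */
const struct execmem_range *
execmem_pick_range(const struct execmem_params *p, enum execmem_type type)
{
	assert(type < EXECMEM_TYPE_MAX);

	if (!p->ranges[type].start)
		return &p->ranges[EXECMEM_DEFAULT];

	return &p->ranges[type];
}
```

The trade-off is one extra branch per allocation in exchange for dropping the init-time copy; the full-size `ranges[]` array is still there either way.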
On Tue, 2023-10-03 at 17:29 -0700, Rick Edgecombe wrote:
> It seems a bit weird to copy all of this. Is it trying to be faster or
> something?
>
> Couldn't it just check r->start in execmem_text/data_alloc() path and
> switch to EXECMEM_DEFAULT if needed then? The execmem_range_is_data()
> part that comes later could be added to the logic there too. So this
> seems like unnecessary complexity to me or I don't see the reason.

I guess this is a bad idea because if you have the full size array
sitting around anyway you might as well use it and reduce the
execmem_alloc() logic.

Just looking at it from the x86 side (and similar) though, where there
is actually only one execmem_range, building this whole array with
identical data seems weird.
On Wed, Oct 04, 2023 at 03:39:26PM +0000, Edgecombe, Rick P wrote:
> On Tue, 2023-10-03 at 17:29 -0700, Rick Edgecombe wrote:
> > It seems a bit weird to copy all of this. Is it trying to be faster or
> > something?
> >
> > Couldn't it just check r->start in execmem_text/data_alloc() path and
> > switch to EXECMEM_DEFAULT if needed then? The execmem_range_is_data()
> > part that comes later could be added to the logic there too. So this
> > seems like unnecessary complexity to me or I don't see the reason.
>
> I guess this is a bad idea because if you have the full size array
> sitting around anyway you might as well use it and reduce the
> execmem_alloc() logic.

That was the idea, indeed. :)

> Just looking at it from the x86 side (and similar) though, where there
> is actually only one execmem_range, building this whole array with
> identical data seems weird.

Right, most architectures have only one range, but to support all
variants that we have, execmem has to maintain the whole array.
On Thu, 2023-10-05 at 08:26 +0300, Mike Rapoport wrote:
> On Wed, Oct 04, 2023 at 03:39:26PM +0000, Edgecombe, Rick P wrote:
> > On Tue, 2023-10-03 at 17:29 -0700, Rick Edgecombe wrote:
> > > It seems a bit weird to copy all of this. Is it trying to be faster
> > > or something?
> > >
> > > Couldn't it just check r->start in execmem_text/data_alloc() path
> > > and switch to EXECMEM_DEFAULT if needed then? The
> > > execmem_range_is_data() part that comes later could be added to the
> > > logic there too. So this seems like unnecessary complexity to me or
> > > I don't see the reason.
> >
> > I guess this is a bad idea because if you have the full size array
> > sitting around anyway you might as well use it and reduce the
> > execmem_alloc() logic.
>
> That was the idea, indeed. :)
>
> > Just looking at it from the x86 side (and similar) though, where
> > there is actually only one execmem_range, building this whole array
> > with identical data seems weird.
>
> Right, most architectures have only one range, but to support all
> variants that we have, execmem has to maintain the whole array.

What about just having an index into a smaller set of ranges: the
module area and the extra JIT area? So ->ranges can be size 3
(statically allocated in the arch code) for three areas and then the
index array can be size EXECMEM_TYPE_MAX. The default 0 value of the
indexing array will point to the default area and any special areas can
be set in the index to point to the desired range.

Looking at how it would do for x86 and arm64, it looks maybe a bit
better to me. A little bit less code and memory usage, and a bit easier
to trace the configuration through to the final state (IMO). What do
you think? Very rough, on top of this series, below.

As I was playing around with this, I was also wondering why it needs
two copies of struct execmem_params: one returned from the arch code
and one in exec mem. And why the temporary arch copy is ro_after_init,
but the final execmem.c copy is not ro_after_init?

 arch/arm64/mm/init.c    | 67 ++++++++++++++++++++++++++++++++++++++-----------------------------
 arch/x86/mm/init.c      | 24 +++++++++++++-----------
 include/linux/execmem.h |  5 +++--
 mm/execmem.c            | 61 ++++++++++++++++---------------------------------------------
 4 files changed, 70 insertions(+), 87 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 9b7716b4d84c..7df119101f20 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -633,49 +633,58 @@ static int __init module_init_limits(void)
 	return 0;
 }

-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.flags = EXECMEM_KASAN_SHADOW,
-			.alignment = MODULE_ALIGN,
-		},
-		[EXECMEM_KPROBES] = {
-			.start = VMALLOC_START,
-			.end = VMALLOC_END,
-			.alignment = 1,
-		},
-		[EXECMEM_BPF] = {
-			.start = VMALLOC_START,
-			.end = VMALLOC_END,
-			.alignment = 1,
-		},
+static struct execmem_range[2] ranges __ro_after_init = {
+	/* Module area */
+	[0] = {
+		.flags = EXECMEM_KASAN_SHADOW,
+		.alignment = MODULE_ALIGN,
+	},
+	/* Kprobes area */
+	[1] = {
+		.start = VMALLOC_START,
+		.end = VMALLOC_END,
+		.alignment = 1,
+	},
+	/* BPF area */
+	[2] = {
+		.start = VMALLOC_START,
+		.end = VMALLOC_END,
+		.alignment = 1,
 	},
 };

-struct execmem_params __init *execmem_arch_params(void)
+void __init execmem_arch_params(struct execmem_params *p)
 {
-	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
+	struct execmem_range *default;
+	struct execmem_range *jit;
+
+	p->ranges = &ranges;

 	module_init_limits();

-	r->pgprot = PAGE_KERNEL;
-
+	/* Default area */
+	default = &ranges[0];
+	default->pgprot = PAGE_KERNEL;
 	if (module_direct_base) {
-		r->start = module_direct_base;
-		r->end = module_direct_base + SZ_128M;
+		default->start = module_direct_base;
+		default->end = module_direct_base + SZ_128M;

 		if (module_plt_base) {
-			r->fallback_start = module_plt_base;
-			r->fallback_end = module_plt_base + SZ_2G;
+			default->fallback_start = module_plt_base;
+			default->fallback_end = module_plt_base + SZ_2G;
 		}
 	} else if (module_plt_base) {
-		r->start = module_plt_base;
-		r->end = module_plt_base + SZ_2G;
+		default->start = module_plt_base;
+		default->end = module_plt_base + SZ_2G;
 	}

-	execmem_params.ranges[EXECMEM_KPROBES].pgprot = PAGE_KERNEL_ROX;
-	execmem_params.ranges[EXECMEM_BPF].pgprot = PAGE_KERNEL;
+	/* Jit area */
+	ranges[1].pgprot = PAGE_KERNEL_ROX;
+	p->defaults[EXECMEM_KPROBES] = 1;
+

-	return &execmem_params;
+	/* BPF Area */
+	ranges[2].pgprot = PAGE_KERNEL;
+	p->defaults[EXECMEM_BPF] = 2;
 }
 #endif
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 022af7ab50f9..7397472ffc39 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -1102,16 +1102,15 @@ unsigned long arch_max_swapfile_size(void)
 #endif

 #ifdef CONFIG_EXECMEM
-static struct execmem_params execmem_params __ro_after_init = {
-	.ranges = {
-		[EXECMEM_DEFAULT] = {
-			.flags = EXECMEM_KASAN_SHADOW,
-			.alignment = MODULE_ALIGN,
-		},
+static struct execmem_range ranges[1] __ro_after_init = {
+	/* Module area */
+	[0] = {
+		.flags = EXECMEM_KASAN_SHADOW,
+		.alignment = MODULE_ALIGN,
 	},
 };

-struct execmem_params __init *execmem_arch_params(void)
+void __init execmem_arch_params(struct execmem_params *p)
 {
 	unsigned long module_load_offset = 0;
 	unsigned long start;
@@ -1121,10 +1120,13 @@ struct execmem_params __init *execmem_arch_params(void)
 			get_random_u32_inclusive(1, 1024) * PAGE_SIZE;

 	start = MODULES_VADDR + module_load_offset;
-	execmem_params.ranges[EXECMEM_DEFAULT].start = start;
-	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
-	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
+	p->ranges = ranges;

-	return &execmem_params;
+	/* Module area */
+	p->ranges[0].start = start;
+	p->ranges[0].end = MODULES_END;
+	p->ranges[0].pgprot = PAGE_KERNEL;
+	p->ranges[0].flags = EXECMEM_KASAN_SHADOW;
+	p->ranges[0].alignment = MODULE_ALIGN;
 }
 #endif /* CONFIG_EXECMEM */
diff --git a/include/linux/execmem.h b/include/linux/execmem.h
index 09d45ac786e9..702435443d87 100644
--- a/include/linux/execmem.h
+++ b/include/linux/execmem.h
@@ -77,7 +77,8 @@ struct execmem_range {
  * each type of executable memory allocations
  */
 struct execmem_params {
-	struct execmem_range ranges[EXECMEM_TYPE_MAX];
+	int areas[EXECMEM_TYPE_MAX];
+	struct execmem_range *ranges;
 };

 /**
@@ -92,7 +93,7 @@ struct execmem_params {
  * Return: a structure defining architecture parameters and restrictions
  * for allocations of executable memory
  */
-struct execmem_params *execmem_arch_params(void);
+void execmem_arch_params(struct execmem_params *p);

 /**
  * execmem_text_alloc - allocate executable memory
diff --git a/mm/execmem.c b/mm/execmem.c
index aeff85261360..dfdec8c2b074 100644
--- a/mm/execmem.c
+++ b/mm/execmem.c
@@ -6,15 +6,15 @@
 #include <linux/moduleloader.h>

 static struct execmem_params execmem_params;
+static struct execmem_range default_range;

-static void *execmem_alloc(size_t size, struct execmem_range *range)
+static void *execmem_alloc(size_t size, struct execmem_range *range, pgprot_t pgprot)
 {
 	unsigned long start = range->start;
 	unsigned long end = range->end;
 	unsigned long fallback_start = range->fallback_start;
 	unsigned long fallback_end = range->fallback_end;
 	unsigned int align = range->alignment;
-	pgprot_t pgprot = range->pgprot;
 	bool kasan = range->flags & EXECMEM_KASAN_SHADOW;
 	unsigned long vm_flags = VM_FLUSH_RESET_PERMS;
 	bool fallback = !!fallback_start;
@@ -60,14 +60,18 @@ static inline bool execmem_range_is_data(enum execmem_type type)

 void *execmem_text_alloc(enum execmem_type type, size_t size)
 {
-	return execmem_alloc(size, &execmem_params.ranges[type]);
+	struct execmem_range *range = &execmem_params.ranges[execmem_params.areas[type]];
+
+	return execmem_alloc(size, range, range->pgprot);
 }

 void *execmem_data_alloc(enum execmem_type type, size_t size)
 {
+	struct execmem_range *range = &execmem_params.ranges[execmem_params.areas[type]];
+
 	WARN_ON_ONCE(!execmem_range_is_data(type));

-	return execmem_alloc(size, &execmem_params.ranges[type]);
+	return execmem_alloc(size, range, PAGE_KERNEL);
 }

 void execmem_free(void *ptr)
@@ -80,9 +84,13 @@ void execmem_free(void *ptr)
 	vfree(ptr);
 }

-struct execmem_params * __weak execmem_arch_params(void)
+void __weak execmem_arch_params(struct execmem_params *p)
 {
-	return NULL;
+	p->ranges = default_range;
+	p->ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
+	p->ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
+	p->ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL_EXEC;
+	p->ranges[EXECMEM_DEFAULT].alignment = 1;
 }

 static bool execmem_validate_params(struct execmem_params *p)
@@ -97,46 +105,9 @@ static bool execmem_validate_params(struct execmem_params *p)
 	return true;
 }

-static void execmem_init_missing(struct execmem_params *p)
-{
-	struct execmem_range *default_range = &p->ranges[EXECMEM_DEFAULT];
-
-	for (int i = EXECMEM_DEFAULT + 1; i < EXECMEM_TYPE_MAX; i++) {
-		struct execmem_range *r = &p->ranges[i];
-
-		if (!r->start) {
-			if (execmem_range_is_data(i))
-				r->pgprot = PAGE_KERNEL;
-			else
-				r->pgprot = default_range->pgprot;
-			r->alignment = default_range->alignment;
-			r->start = default_range->start;
-			r->end = default_range->end;
-			r->flags = default_range->flags;
-			r->fallback_start = default_range->fallback_start;
-			r->fallback_end = default_range->fallback_end;
-		}
-	}
-}
-
 void __init execmem_init(void)
 {
-	struct execmem_params *p = execmem_arch_params();
+	execmem_arch_params(&execmem_params);

-	if (!p) {
-		p = &execmem_params;
-		p->ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
-		p->ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
-		p->ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL_EXEC;
-		p->ranges[EXECMEM_DEFAULT].alignment = 1;
-
-		return;
-	}
-
-	if (!execmem_validate_params(p))
-		return;
-
-	execmem_init_missing(p);
-
-	execmem_params = *p;
+	execmem_validate_params(&execmem_params);
 }
On Mon, 2023-09-18 at 10:29 +0300, Mike Rapoport wrote:
> +/**
> + * struct execmem_range - definition of a memory range suitable for code and
> + *			   related data allocations
> + * @start:	address space start
> + * @end:	address space end (inclusive)
> + * @pgprot:	permissions for memory in this address space
> + * @alignment:	alignment required for text allocations
> + */
> +struct execmem_range {
> +	unsigned long	start;
> +	unsigned long	end;
> +	pgprot_t	pgprot;
> +	unsigned int	alignment;
> +};

Not a strong opinion, but range doesn't seem an appropriate name. It
*has* a range, but also other allocation configuration. It gets
especially confusing when multiple "ranges" have the same range.

Maybe execmem_alloc_params?
Hi Rick,

Sorry for the delay, I was a bit preoccupied with $stuff.

On Thu, Oct 05, 2023 at 06:09:07PM +0000, Edgecombe, Rick P wrote:
> On Thu, 2023-10-05 at 08:26 +0300, Mike Rapoport wrote:
> > On Wed, Oct 04, 2023 at 03:39:26PM +0000, Edgecombe, Rick P wrote:
> > > On Tue, 2023-10-03 at 17:29 -0700, Rick Edgecombe wrote:
> > > > It seems a bit weird to copy all of this. Is it trying to be
> > > > faster or something?
> > > >
> > > > Couldn't it just check r->start in execmem_text/data_alloc()
> > > > path and switch to EXECMEM_DEFAULT if needed then? The
> > > > execmem_range_is_data() part that comes later could be added to
> > > > the logic there too. So this seems like unnecessary complexity
> > > > to me or I don't see the reason.
> > >
> > > I guess this is a bad idea because if you have the full size array
> > > sitting around anyway you might as well use it and reduce the
> > > execmem_alloc() logic.
> >
> > That was the idea, indeed. :)
> >
> > > Just looking at it from the x86 side (and similar) though, where
> > > there is actually only one execmem_range, building this whole
> > > array with identical data seems weird.
> >
> > Right, most architectures have only one range, but to support all
> > variants that we have, execmem has to maintain the whole array.
>
> What about just having an index into a smaller set of ranges: the
> module area and the extra JIT area? So ->ranges can be size 3
> (statically allocated in the arch code) for three areas and then the
> index array can be size EXECMEM_TYPE_MAX. The default 0 value of the
> indexing array will point to the default area and any special areas can
> be set in the index to point to the desired range.
>
> Looking at how it would do for x86 and arm64, it looks maybe a bit
> better to me. A little bit less code and memory usage, and a bit easier
> to trace the configuration through to the final state (IMO). What do
> you think? Very rough, on top of this series, below.

I like your suggestion to only have definitions of actual ranges in arch
code and an index array to redirect allocation requests to the right
range. I'll make the next version along the lines of your patch.

> As I was playing around with this, I was also wondering why it needs
> two copies of struct execmem_params: one returned from the arch code
> and one in exec mem.

No actual reason, one copy is enough, thanks for catching this.

> And why the temporary arch copy is ro_after_init,
> but the final execmem.c copy is not ro_after_init?

I just missed it, thanks for pointing it out.
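The index-array layout discussed above can be modelled in userspace as follows. This is only a sketch under stated assumptions, not the next version of the patch: the structures are trimmed stand-ins for the kernel ones, and `nr_ranges` and `execmem_lookup()` are hypothetical additions used here for bounds checking and to make the indirection explicit.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; illustrative only. */
enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_KPROBES,
	EXECMEM_BPF,
	EXECMEM_TYPE_MAX,
};

struct execmem_range {
	unsigned long start;
	unsigned long end;
	unsigned int alignment;
};

struct execmem_params {
	/* Maps each allocation type to an index into ranges[]; 0 = default. */
	int areas[EXECMEM_TYPE_MAX];
	/* Only the ranges the architecture really has. */
	const struct execmem_range *ranges;
	/* Hypothetical field: number of entries in ranges[]. */
	size_t nr_ranges;
};

/*
 * Hypothetical lookup helper: resolve @type to its range through the
 * index array. A zero-initialized areas[] sends every type to ranges[0].
 */
const struct execmem_range *
execmem_lookup(const struct execmem_params *p, enum execmem_type type)
{
	int idx;

	assert(type < EXECMEM_TYPE_MAX);

	idx = p->areas[type];
	assert(idx >= 0 && (size_t)idx < p->nr_ranges);

	return &p->ranges[idx];
}
```

An architecture with a single range leaves `areas[]` zeroed and supplies a one-element `ranges[]`; one with extra areas sets only the index entries that differ from the default.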
diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
index b8b86088b2dd..a1d8fe9796fa 100644
--- a/arch/loongarch/kernel/module.c
+++ b/arch/loongarch/kernel/module.c
@@ -18,6 +18,7 @@
 #include <linux/ftrace.h>
 #include <linux/string.h>
 #include <linux/kernel.h>
+#include <linux/execmem.h>
 #include <asm/alternative.h>
 #include <asm/inst.h>

@@ -469,10 +470,21 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 	return 0;
 }

-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-			GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0));
+	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+
+	return &execmem_params;
 }

 static void module_init_ftrace_plt(const Elf_Ehdr *hdr,
diff --git a/arch/mips/kernel/module.c b/arch/mips/kernel/module.c
index 0c936cbf20c5..1c959074b35f 100644
--- a/arch/mips/kernel/module.c
+++ b/arch/mips/kernel/module.c
@@ -20,6 +20,7 @@
 #include <linux/kernel.h>
 #include <linux/spinlock.h>
 #include <linux/jump_label.h>
+#include <linux/execmem.h>

 extern void jump_label_apply_nops(struct module *mod);

@@ -33,11 +34,21 @@ static LIST_HEAD(dbe_list);
 static DEFINE_SPINLOCK(dbe_lock);

 #ifdef MODULE_START
-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.start = MODULE_START,
+			.end = MODULE_END,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	return __vmalloc_node_range(size, 1, MODULE_START, MODULE_END,
-				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
-				__builtin_return_address(0));
+	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
+
+	return &execmem_params;
 }
 #endif

diff --git a/arch/nios2/kernel/module.c b/arch/nios2/kernel/module.c
index 9c97b7513853..5a8df4f9c04e 100644
--- a/arch/nios2/kernel/module.c
+++ b/arch/nios2/kernel/module.c
@@ -18,15 +18,24 @@
 #include <linux/fs.h>
 #include <linux/string.h>
 #include <linux/kernel.h>
+#include <linux/execmem.h>

 #include <asm/cacheflush.h>

-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.start = MODULES_VADDR,
+			.end = MODULES_END,
+			.pgprot = PAGE_KERNEL_EXEC,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-				    GFP_KERNEL, PAGE_KERNEL_EXEC,
-				    VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
-				    __builtin_return_address(0));
+	return &execmem_params;
 }

 int apply_relocate_add(Elf32_Shdr *sechdrs, const char *strtab,
diff --git a/arch/parisc/kernel/module.c b/arch/parisc/kernel/module.c
index d214bbe3c2af..0c6dfd1daef3 100644
--- a/arch/parisc/kernel/module.c
+++ b/arch/parisc/kernel/module.c
@@ -49,6 +49,7 @@
 #include <linux/bug.h>
 #include <linux/mm.h>
 #include <linux/slab.h>
+#include <linux/execmem.h>

 #include <asm/unwind.h>
 #include <asm/sections.h>
@@ -173,15 +174,21 @@ static inline int reassemble_22(int as22)
 		((as22 & 0x0003ff) << 3));
 }

-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL_RWX,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	/* using RWX means less protection for modules, but it's
-	 * easier than trying to map the text, data, init_text and
-	 * init_data correctly */
-	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
-				    GFP_KERNEL,
-				    PAGE_KERNEL_RWX, 0, NUMA_NO_NODE,
-				    __builtin_return_address(0));
+	execmem_params.ranges[EXECMEM_DEFAULT].start = VMALLOC_START;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = VMALLOC_END;
+
+	return &execmem_params;
 }

 #ifndef CONFIG_64BIT
diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index 7c651d55fcbd..343a0edfb6dd 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -11,6 +11,7 @@
 #include <linux/vmalloc.h>
 #include <linux/sizes.h>
 #include <linux/pgtable.h>
+#include <linux/execmem.h>
 #include <asm/alternative.h>
 #include <asm/sections.h>

@@ -436,12 +437,21 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 }

 #if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
-void *module_alloc(unsigned long size)
+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
+			.pgprot = PAGE_KERNEL,
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	return __vmalloc_node_range(size, 1, MODULES_VADDR,
-				    MODULES_END, GFP_KERNEL,
-				    PAGE_KERNEL, 0, NUMA_NO_NODE,
-				    __builtin_return_address(0));
+	execmem_params.ranges[EXECMEM_DEFAULT].start = MODULES_VADDR;
+	execmem_params.ranges[EXECMEM_DEFAULT].end = MODULES_END;
+
+	return &execmem_params;
 }
 #endif

diff --git a/arch/sparc/kernel/module.c b/arch/sparc/kernel/module.c
index 66c45a2764bc..1d8d1fba95b9 100644
--- a/arch/sparc/kernel/module.c
+++ b/arch/sparc/kernel/module.c
@@ -14,6 +14,10 @@
 #include <linux/string.h>
 #include <linux/ctype.h>
 #include <linux/mm.h>
+#include <linux/execmem.h>
+#ifdef CONFIG_SPARC64
+#include <linux/jump_label.h>
+#endif

 #include <asm/processor.h>
 #include <asm/spitfire.h>
@@ -21,34 +25,26 @@

 #include "entry.h"

+static struct execmem_params execmem_params __ro_after_init = {
+	.ranges = {
+		[EXECMEM_DEFAULT] = {
 #ifdef CONFIG_SPARC64
-
-#include <linux/jump_label.h>
-
-static void *module_map(unsigned long size)
-{
-	if (PAGE_ALIGN(size) > MODULES_LEN)
-		return NULL;
-	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
-				__builtin_return_address(0));
-}
+			.start = MODULES_VADDR,
+			.end = MODULES_END,
 #else
-static void *module_map(unsigned long size)
-{
-	return vmalloc(size);
-}
-#endif /* CONFIG_SPARC64 */
-
-void *module_alloc(unsigned long size)
+			.start = VMALLOC_START,
+			.end = VMALLOC_END,
+#endif
+			.alignment = 1,
+		},
+	},
+};
+
+struct execmem_params __init *execmem_arch_params(void)
 {
-	void *ret;
-
-	ret = module_map(size);
-	if (ret)
-		memset(ret, 0, size);
+	execmem_params.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;

-	return ret;
+	return &execmem_params;
 }

 /* Make generic code ignore STT_REGISTER dummy undefined symbols. */
diff --git a/include/linux/execmem.h b/include/linux/execmem.h
index 3491bf7e9714..44e213625053 100644
--- a/include/linux/execmem.h
+++ b/include/linux/execmem.h
@@ -32,6 +32,44 @@ enum execmem_type {
 	EXECMEM_TYPE_MAX,
 };

+/**
+ * struct execmem_range - definition of a memory range suitable for code and
+ *			   related data allocations
+ * @start:	address space start
+ * @end:	address space end (inclusive)
+ * @pgprot:	permissions for memory in this address space
+ * @alignment:	alignment required for text allocations
+ */
+struct execmem_range {
+	unsigned long	start;
+	unsigned long	end;
+	pgprot_t	pgprot;
+	unsigned int	alignment;
+};
+
+/**
+ * struct execmem_params - architecture parameters for code allocations
+ * @ranges: array of ranges defining architecture specific parameters for
+ *	    each type of executable memory allocations
+ */
+struct execmem_params {
+	struct execmem_range ranges[EXECMEM_TYPE_MAX];
+};
+
+/**
+ * execmem_arch_params - supply parameters for allocations of executable memory
+ *
+ * A hook for architectures to define parameters for allocations of
+ * executable memory described by struct execmem_params
+ *
+ * For architectures that do not implement this method a default set of
+ * parameters will be used
+ *
+ * Return: a structure defining architecture parameters and restrictions
+ * for allocations of executable memory
+ */
+struct execmem_params *execmem_arch_params(void);
+
 /**
  * execmem_text_alloc - allocate executable memory
  * @type: type of the allocation
@@ -53,4 +91,10 @@ void *execmem_text_alloc(enum execmem_type type, size_t size);
  */
 void execmem_free(void *ptr);

+#ifdef CONFIG_EXECMEM
+void execmem_init(void);
+#else
+static inline void execmem_init(void) {}
+#endif
+
 #endif /* _LINUX_EXECMEM_ALLOC_H */
diff --git a/mm/execmem.c b/mm/execmem.c
index 638dc2b26a81..f25a5e064886 100644
--- a/mm/execmem.c
+++ b/mm/execmem.c
@@ -5,14 +5,26 @@
 #include <linux/execmem.h>
 #include <linux/moduleloader.h>

-static void *execmem_alloc(size_t size)
+static struct execmem_params execmem_params;
+
+static void *execmem_alloc(size_t size, struct execmem_range *range)
 {
-	return module_alloc(size);
+	unsigned long start = range->start;
+	unsigned long end = range->end;
+	unsigned int align = range->alignment;
+	pgprot_t pgprot = range->pgprot;
+
+	return __vmalloc_node_range(size, align, start, end,
+				    GFP_KERNEL, pgprot, VM_FLUSH_RESET_PERMS,
+				    NUMA_NO_NODE, __builtin_return_address(0));
 }

 void *execmem_text_alloc(enum execmem_type type, size_t size)
 {
-	return execmem_alloc(size);
+	if (!execmem_params.ranges[type].start)
+		return module_alloc(size);
+
+	return execmem_alloc(size, &execmem_params.ranges[type]);
 }

 void execmem_free(void *ptr)
@@ -24,3 +36,51 @@ void execmem_free(void *ptr)
 	WARN_ON(in_interrupt());
 	vfree(ptr);
 }
+
+struct execmem_params * __weak execmem_arch_params(void)
+{
+	return NULL;
+}
+
+static bool execmem_validate_params(struct execmem_params *p)
+{
+	struct execmem_range *r = &p->ranges[EXECMEM_DEFAULT];
+
+	if (!r->alignment || !r->start || !r->end || !pgprot_val(r->pgprot)) {
+		pr_crit("Invalid parameters for execmem allocator, module loading will fail");
+		return false;
+	}
+
+	return true;
+}
+
+static void execmem_init_missing(struct execmem_params *p)
+{
+	struct execmem_range *default_range = &p->ranges[EXECMEM_DEFAULT];
+
+	for (int i = EXECMEM_DEFAULT + 1; i < EXECMEM_TYPE_MAX; i++) {
+		struct execmem_range *r = &p->ranges[i];
+
+		if (!r->start) {
+			r->pgprot = default_range->pgprot;
+			r->alignment = default_range->alignment;
+			r->start = default_range->start;
+			r->end = default_range->end;
+		}
+	}
+}
+
+void __init execmem_init(void)
+{
+	struct execmem_params *p = execmem_arch_params();
+
+	if (!p)
+		return;
+
+	if (!execmem_validate_params(p))
+		return;
+
+	execmem_init_missing(p);
+
+	execmem_params = *p;
+}
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 50f2f34745af..7c002b36da21 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -26,6 +26,7 @@
 #include <linux/pgtable.h>
 #include <linux/swap.h>
 #include <linux/cma.h>
+#include <linux/execmem.h>
 #include "internal.h"
 #include "slab.h"
 #include "shuffle.h"
@@ -2797,4 +2798,5 @@ void __init mm_core_init(void)
 	pti_init();
 	kmsan_init_runtime();
 	mm_cache_init();
+	execmem_init();
 }
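For readers who want to experiment with the init flow in the patch above (validate the default range, then fill unset ranges from it), here is a minimal userspace model. It is a sketch, not kernel code: `pgprot` is assumed to be a plain `unsigned long` instead of `pgprot_t`, the `pr_crit()` message is dropped, and a whole-struct copy replaces the field-by-field assignments, which is equivalent only for this four-field version of `struct execmem_range`.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel types; illustrative only. */
enum execmem_type {
	EXECMEM_DEFAULT,
	EXECMEM_KPROBES,
	EXECMEM_BPF,
	EXECMEM_TYPE_MAX,
};

struct execmem_range {
	unsigned long start;
	unsigned long end;
	unsigned long pgprot;	/* assumption: stand-in for pgprot_t */
	unsigned int alignment;
};

struct execmem_params {
	struct execmem_range ranges[EXECMEM_TYPE_MAX];
};

/* The default range must be fully specified by the architecture. */
bool execmem_validate_params(const struct execmem_params *p)
{
	const struct execmem_range *r = &p->ranges[EXECMEM_DEFAULT];

	return r->alignment && r->start && r->end && r->pgprot;
}

/* Every range left unset (start == 0) inherits the default range. */
void execmem_init_missing(struct execmem_params *p)
{
	const struct execmem_range *def = &p->ranges[EXECMEM_DEFAULT];

	/* Mirrors the call order in execmem_init(): validate first. */
	assert(execmem_validate_params(p));

	for (int i = EXECMEM_DEFAULT + 1; i < EXECMEM_TYPE_MAX; i++) {
		struct execmem_range *r = &p->ranges[i];

		if (!r->start)
			*r = *def;
	}
}
```

After `execmem_init_missing()` every entry of `ranges[]` is usable directly, which is what lets the allocation path index by type without any fallback check.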