diff mbox

[1/1] arm/module: maximum utilization of module area.

Message ID 1481012975-44478-1-git-send-email-maninder1.s@samsung.com (mailing list archive)
State New, archived
Headers show

Commit Message

Maninder Singh Dec. 6, 2016, 8:29 a.m. UTC
This patch defines new macro MODULE_START to ensure kernel text
and module remains within 32 MB of address range.

Tried this patch by inserting 20 MB size module on 4.1 kernel:-

Earlier:-

Comments

Ard Biesheuvel Dec. 6, 2016, 9:13 a.m. UTC | #1
On 6 December 2016 at 08:29, Maninder Singh <maninder1.s@samsung.com> wrote:
> This patch defines new macro MODULE_START to ensure kernel text
> and module remains within 32 MB of address range.
>
> Tried this patch by inserting 20 MB size module on 4.1 kernel:-
>
> Earlier:-
> ==========
> sh# insmod size.ko
> ....
> insmod: ERROR: could not insert module size.ko: Cannot allocate memory
> sh#
>
> With this patch
> ===============
> sh# insmod size.ko
> ...
> sh# lsmod
> Module                  Size  Used by
> size                20972425  0
>
> Signed-off-by: Vaneet Narang <v.narang@samsung.com>
> Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
> Reviewed-by: Ajeet Yadav <ajeet.y@samsung.com>
> ---
>  arch/arm/include/asm/memory.h | 4 ++--
>  arch/arm/kernel/module.c      | 9 ++++++++-
>  2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> index 76cbd9c..1a0a6e5 100644
> --- a/arch/arm/include/asm/memory.h
> +++ b/arch/arm/include/asm/memory.h
> @@ -37,7 +37,7 @@
>   * TASK_SIZE - the maximum size of a user space task.
>   * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area
>   */
> -#define TASK_SIZE              (UL(CONFIG_PAGE_OFFSET) - UL(SZ_16M))
> +#define TASK_SIZE              (UL(CONFIG_PAGE_OFFSET) - UL(SZ_32M))
>  #define TASK_UNMAPPED_BASE     ALIGN(TASK_SIZE / 3, SZ_16M)
>
>  /*
> @@ -50,7 +50,7 @@
>   * and PAGE_OFFSET - it must be within 32MB of the kernel text.
>   */
>  #ifndef CONFIG_THUMB2_KERNEL
> -#define MODULES_VADDR          (PAGE_OFFSET - SZ_16M)
> +#define MODULES_VADDR          (PAGE_OFFSET - SZ_32M)
>  #else
>  /* smaller range for Thumb-2 symbols relocation (2^24)*/
>  #define MODULES_VADDR          (PAGE_OFFSET - SZ_8M)
> diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
> index 4f14b5c..b8e1f9c 100644
> --- a/arch/arm/kernel/module.c
> +++ b/arch/arm/kernel/module.c
> @@ -35,12 +35,19 @@
>   */
>  #undef MODULES_VADDR
>  #define MODULES_VADDR  (((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
> +#define MODULES_START MODULES_VADDR
> +#else
> +#ifndef CONFIG_THUMB2_KERNEL
> +#define MODULES_START  ALIGN((unsigned long)_etext - SZ_32M, PAGE_SIZE)

On a multi_v7_defconfig kernel, the text size is >8 MB, which means
you are only adding ~7 MB to the module area, while consuming 16 MB of
additional address space. Given that 20 MB modules are very uncommon,
I think it is better to enable CONFIG_ARM_MODULE_PLTS instead. That
way, there is no need to update these defaults for everyone.

> +#else
> +#define MODULES_START MODULES_VADDR
> +#endif
>  #endif
>
>  #ifdef CONFIG_MMU
>  void *module_alloc(unsigned long size)
>  {
> -       void *p = __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> +       void *p = __vmalloc_node_range(size, 1, MODULES_START, MODULES_END,
>                                 GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
>                                 __builtin_return_address(0));
>         if (!IS_ENABLED(CONFIG_ARM_MODULE_PLTS) || p)
> --
> 1.9.1
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Nicolas Pitre Dec. 6, 2016, 3:57 p.m. UTC | #2
On Tue, 6 Dec 2016, Maninder Singh wrote:

> This patch defines new macro MODULE_START to ensure kernel text
> and module remains within 32 MB of address range.
> 
> Tried this patch by inserting 20 MB size module on 4.1 kernel:-
> 
> Earlier:-
> ==========
> sh# insmod size.ko
> ....
> insmod: ERROR: could not insert module size.ko: Cannot allocate memory
> sh#
> 
> With this patch
> ===============
> sh# insmod size.ko
> ...
> sh# lsmod
> Module                  Size  Used by
> size                20972425  0

Could you please try enabling CONFIG_ARM_MODULE_PLTS instead of this 
patch?


Nicolas
Vaneet Narang Dec. 12, 2016, 10:08 a.m. UTC | #3
Hi Ard,

>> +#ifndef CONFIG_THUMB2_KERNEL

>> +#define MODULES_START  ALIGN((unsigned long)_etext - SZ_32M, PAGE_SIZE)

>

>On a multi_v7_defconfig kernel, the text size is >8 MB, which means

>you are only adding ~7 MB to the module area, while consuming 16 MB of

>additional address space.


I am not sure if 16MB virtual address space will make any difference on embedded 
systems where physical memory is already less than virtual address space.
if required address space wastage can be reduced by keeping TASK_SIZE as 
PAGE_OFFSET - 24MB like below

#define MODULES_VADDR          (PAGE_OFFSET - SZ_24M)
#define TASK_SIZE              (UL(CONFIG_PAGE_OFFSET) - UL(SZ_24M))


> Given that 20 MB modules are very uncommon,


Size of all modules can be 20MB, this seems to be normal scenario.

>I think it is better to enable CONFIG_ARM_MODULE_PLTS instead. That


CONFIG_ARM_MODULE_PLTS has function call overhead as it refers PLT table
while calling kernel functions. Also size of modules will also gets increased a bit.
So using short calls from modules to kernel will be faster. 
These changes trying to utilize best available space for kernel modules for
making short calls. 
So CONFIG_ARM_MODULE_PLTS is not required when modules
can be accomdated within 20MB.

>way, there is no need to update these defaults for everyone.

>


>> +#else

>> +#define MODULES_START MODULES_VADDR


Thanks & Regards,
Vaneet Narang
Russell King (Oracle) Dec. 12, 2016, 10:23 a.m. UTC | #4
On Tue, Dec 06, 2016 at 01:59:35PM +0530, Maninder Singh wrote:
> This patch defines new macro MODULE_START to ensure kernel text
> and module remains within 32 MB of address range.
> 
> Tried this patch by inserting 20 MB size module on 4.1 kernel:-
> 
> Earlier:-
> ==========
> sh# insmod size.ko
> ....
> insmod: ERROR: could not insert module size.ko: Cannot allocate memory
> sh#
> 
> With this patch
> ===============
> sh# insmod size.ko
> ...
> sh# lsmod
> Module                  Size  Used by
> size                20972425  0
> 
> Signed-off-by: Vaneet Narang <v.narang@samsung.com>
> Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
> Reviewed-by: Ajeet Yadav <ajeet.y@samsung.com>

A PC24 relocation has a range of +/-32MB.  This means that where-ever
the module is placed, it must be capable of reaching any function
within the kernel text, which may itself be quite large (eg, 8MB, or
possibly larger).  The module area exists to allow modules to be
located in an area where PC24 relocations are able to reach all of the
kernel text on sensibly configured kernels, thereby allowing for
optimal performance.

If you wish to load large modules, then enable ARM_MODULE_PLTS, which
will use the less efficient PLT method (which is basically an indirect
function call) for relocations that PC24 can't handle, and will allow
the module to be loaded into the vmalloc area.

Growing the module area so that smaller modules also get penalised by
the PLT indirection is not sane.

So, I'm afraid this change is not acceptable.
Vaneet Narang Dec. 12, 2016, 11:22 a.m. UTC | #5
Hi,

>A PC24 relocation has a range of +/-32MB.  This means that where-ever

>the module is placed, it must be capable of reaching any function

>within the kernel text, which may itself be quite large (eg, 8MB, or

>possibly larger).  The module area exists to allow modules to be

>located in an area where PC24 relocations are able to reach all of the

>kernel text on sensibly configured kernels, thereby allowing for

>optimal performance.

>

>If you wish to load large modules, then enable ARM_MODULE_PLTS, which

>will use the less efficient PLT method (which is basically an indirect

>function call) for relocations that PC24 can't handle, and will allow

>the module to be loaded into the vmalloc area.

>

>Growing the module area so that smaller modules also get penalised by

>the PLT indirection is not sane.


This is exactly what i am saying. These changes are useful to accomdate
22MB modules without enabling ARM_MODULE_PLTS. 
Without these changes ARM_MODULE_PLTS needs to be enabled to insert more than 14MB
modules but with these changes 22MB modules (considering 8MB uImage) can be inserted 
without enabling ARM_MODULE_PLTS.

So till 22MB modules are not penalised but without these changes once size of modules
gets over 14MB PLT becomes must. 

Regards,
Vaneet Narang
Russell King (Oracle) Dec. 12, 2016, 12:56 p.m. UTC | #6
On Mon, Dec 12, 2016 at 11:22:30AM +0000, Vaneet Narang wrote:
> Hi,
> 
> >A PC24 relocation has a range of +/-32MB.  This means that where-ever
> >the module is placed, it must be capable of reaching any function
> >within the kernel text, which may itself be quite large (eg, 8MB, or
> >possibly larger).  The module area exists to allow modules to be
> >located in an area where PC24 relocations are able to reach all of the
> >kernel text on sensibly configured kernels, thereby allowing for
> >optimal performance.
> >
> >If you wish to load large modules, then enable ARM_MODULE_PLTS, which
> >will use the less efficient PLT method (which is basically an indirect
> >function call) for relocations that PC24 can't handle, and will allow
> >the module to be loaded into the vmalloc area.
> >
> >Growing the module area so that smaller modules also get penalised by
> >the PLT indirection is not sane.
> 
> This is exactly what i am saying. These changes are useful to accomdate
> 22MB modules without enabling ARM_MODULE_PLTS. 

Obviously I wasn't clear enough.

No, I do not like your change for the reasons given above.  I'm not going
to accept it.

If you want to use 22MB modules, enable ARM_MODULE_PLTS.

Thank you.
Nicolas Pitre Dec. 12, 2016, 3:28 p.m. UTC | #7
On Mon, 12 Dec 2016, Vaneet Narang wrote:

> Hi,
> 
> >A PC24 relocation has a range of +/-32MB.  This means that where-ever
> >the module is placed, it must be capable of reaching any function
> >within the kernel text, which may itself be quite large (eg, 8MB, or
> >possibly larger).  The module area exists to allow modules to be
> >located in an area where PC24 relocations are able to reach all of the
> >kernel text on sensibly configured kernels, thereby allowing for
> >optimal performance.
> >
> >If you wish to load large modules, then enable ARM_MODULE_PLTS, which
> >will use the less efficient PLT method (which is basically an indirect
> >function call) for relocations that PC24 can't handle, and will allow
> >the module to be loaded into the vmalloc area.
> >
> >Growing the module area so that smaller modules also get penalised by
> >the PLT indirection is not sane.
> 
> This is exactly what i am saying. These changes are useful to accomdate
> 22MB modules without enabling ARM_MODULE_PLTS. 

I think you need to figure out why you need such a huge module in the 
first place.  That is very uncommon indeed.


Nicolas
Ard Biesheuvel Dec. 12, 2016, 4:57 p.m. UTC | #8
On 12 December 2016 at 15:28, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Mon, 12 Dec 2016, Vaneet Narang wrote:
>
>> Hi,
>>
>> >A PC24 relocation has a range of +/-32MB.  This means that where-ever
>> >the module is placed, it must be capable of reaching any function
>> >within the kernel text, which may itself be quite large (eg, 8MB, or
>> >possibly larger).  The module area exists to allow modules to be
>> >located in an area where PC24 relocations are able to reach all of the
>> >kernel text on sensibly configured kernels, thereby allowing for
>> >optimal performance.
>> >
>> >If you wish to load large modules, then enable ARM_MODULE_PLTS, which
>> >will use the less efficient PLT method (which is basically an indirect
>> >function call) for relocations that PC24 can't handle, and will allow
>> >the module to be loaded into the vmalloc area.
>> >
>> >Growing the module area so that smaller modules also get penalised by
>> >the PLT indirection is not sane.
>>
>> This is exactly what i am saying. These changes are useful to accomdate
>> 22MB modules without enabling ARM_MODULE_PLTS.
>
> I think you need to figure out why you need such a huge module in the
> first place.  That is very uncommon indeed.
>

Also, note that the module PLT code was recently optimized, to remove
some pathological behavior which severely affected load times of large
modules.

Can you quantify the performance hit you are taking when using module
PLTs? And the actual increase in memory footprint?
diff mbox

Patch

==========
sh# insmod size.ko
....
insmod: ERROR: could not insert module size.ko: Cannot allocate memory
sh#

With this patch
===============
sh# insmod size.ko
...
sh# lsmod
Module                  Size  Used by
size                20972425  0

Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Reviewed-by: Ajeet Yadav <ajeet.y@samsung.com>
---
 arch/arm/include/asm/memory.h | 4 ++--
 arch/arm/kernel/module.c      | 9 ++++++++-
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 76cbd9c..1a0a6e5 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -37,7 +37,7 @@ 
  * TASK_SIZE - the maximum size of a user space task.
  * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area
  */
-#define TASK_SIZE		(UL(CONFIG_PAGE_OFFSET) - UL(SZ_16M))
+#define TASK_SIZE		(UL(CONFIG_PAGE_OFFSET) - UL(SZ_32M))
 #define TASK_UNMAPPED_BASE	ALIGN(TASK_SIZE / 3, SZ_16M)
 
 /*
@@ -50,7 +50,7 @@ 
  * and PAGE_OFFSET - it must be within 32MB of the kernel text.
  */
 #ifndef CONFIG_THUMB2_KERNEL
-#define MODULES_VADDR		(PAGE_OFFSET - SZ_16M)
+#define MODULES_VADDR		(PAGE_OFFSET - SZ_32M)
 #else
 /* smaller range for Thumb-2 symbols relocation (2^24)*/
 #define MODULES_VADDR		(PAGE_OFFSET - SZ_8M)
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index 4f14b5c..b8e1f9c 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -35,12 +35,19 @@ 
  */
 #undef MODULES_VADDR
 #define MODULES_VADDR	(((unsigned long)_exiprom + ~PMD_MASK) & PMD_MASK)
+#define MODULES_START MODULES_VADDR
+#else
+#ifndef CONFIG_THUMB2_KERNEL
+#define MODULES_START  ALIGN((unsigned long)_etext - SZ_32M, PAGE_SIZE)
+#else
+#define MODULES_START MODULES_VADDR
+#endif
 #endif
 
 #ifdef CONFIG_MMU
 void *module_alloc(unsigned long size)
 {
-	void *p = __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
+	void *p = __vmalloc_node_range(size, 1, MODULES_START, MODULES_END,
 				GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
 				__builtin_return_address(0));
 	if (!IS_ENABLED(CONFIG_ARM_MODULE_PLTS) || p)