
module/decompress: use vmalloc() for zstd decompression workspace

Message ID 20230829120508.317611-1-andrea.righi@canonical.com (mailing list archive)
State New, archived

Commit Message

Andrea Righi Aug. 29, 2023, 12:05 p.m. UTC
Using kmalloc() to allocate the decompression workspace for zstd may
trigger the following warning when large modules are loaded (e.g., xfs):

[    2.961884] WARNING: CPU: 1 PID: 254 at mm/page_alloc.c:4453 __alloc_pages+0x2c3/0x350
...
[    2.989033] Call Trace:
[    2.989841]  <TASK>
[    2.990614]  ? show_regs+0x6d/0x80
[    2.991573]  ? __warn+0x89/0x160
[    2.992485]  ? __alloc_pages+0x2c3/0x350
[    2.993520]  ? report_bug+0x17e/0x1b0
[    2.994506]  ? handle_bug+0x51/0xa0
[    2.995474]  ? exc_invalid_op+0x18/0x80
[    2.996469]  ? asm_exc_invalid_op+0x1b/0x20
[    2.997530]  ? module_zstd_decompress+0xdc/0x2a0
[    2.998665]  ? __alloc_pages+0x2c3/0x350
[    2.999695]  ? module_zstd_decompress+0xdc/0x2a0
[    3.000821]  __kmalloc_large_node+0x7a/0x150
[    3.001920]  __kmalloc+0xdb/0x170
[    3.002824]  module_zstd_decompress+0xdc/0x2a0
[    3.003857]  module_decompress+0x37/0xc0
[    3.004688]  init_module_from_file+0xd0/0x100
[    3.005668]  idempotent_init_module+0x11c/0x2b0
[    3.006632]  __x64_sys_finit_module+0x64/0xd0
[    3.007568]  do_syscall_64+0x59/0x90
[    3.008373]  ? ksys_read+0x73/0x100
[    3.009395]  ? exit_to_user_mode_prepare+0x30/0xb0
[    3.010531]  ? syscall_exit_to_user_mode+0x37/0x60
[    3.011662]  ? do_syscall_64+0x68/0x90
[    3.012511]  ? do_syscall_64+0x68/0x90
[    3.013364]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8

However, physically contiguous memory does not seem to be required in
module_zstd_decompress(), so use vmalloc() instead, to prevent the
warning and avoid potential failures when loading compressed modules.

Fixes: 169a58ad824d ("module/decompress: Support zstd in-kernel decompression")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
---
 kernel/module/decompress.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
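
For readers following along, the warning in the trace appears to come from
the page allocator's upper bound check on the allocation order: a large
kmalloc() is served as one physically contiguous power-of-two block of
pages, and a request above the maximum order is rejected with a WARN. The
userspace sketch below only illustrates that arithmetic. The 4 KiB page
size, the maximum order of 10, and all sketch_* names are assumptions for
illustration, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumptions for illustration only: 4 KiB pages and a maximum
 * page-allocator order of 10 (largest contiguous block = 4 MiB). */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_MAX_ORDER 10U

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size. */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* True when a physically contiguous request of this size would need a
 * block larger than the assumed maximum order allows. */
static bool sketch_exceeds_max_order(size_t size)
{
	return sketch_get_order(size) > SKETCH_MAX_ORDER;
}
```

Under these assumptions a workspace of exactly 4 MiB still fits (order 10),
while anything larger needs order 11 or more and would hit the check.
vmalloc() avoids the problem because it only needs virtually contiguous
memory, which it can assemble from individual pages.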

Comments

Luis Chamberlain Aug. 29, 2023, 4:41 p.m. UTC | #1
Thanks! Applied and queued up.

  Luis
Lucas De Marchi Aug. 29, 2023, 5:30 p.m. UTC | #2
On Tue, Aug 29, 2023 at 09:41:32AM -0700, Luis Chamberlain wrote:
>Thanks! Applied and queued up.

I can see at least the gz decompress would need the same kind of change.
Shouldn't we tackle them all at once?

Lucas De Marchi

>
>  Luis
Andrea Righi Aug. 29, 2023, 5:46 p.m. UTC | #3
On Tue, Aug 29, 2023 at 10:30:55AM -0700, Lucas De Marchi wrote:
> On Tue, Aug 29, 2023 at 09:41:32AM -0700, Luis Chamberlain wrote:
> > Thanks! Applied and queued up.
> 
> I can see at least the gz decompress would need the same kind of change.
> Shouldn't we tackle them all at once?

gz decompress needs to allocate a struct inflate_workspace, which is not
too bad (11 pages on my system), but it also seems safer to just use
vmalloc() there:

struct inflate_workspace {
	struct inflate_state       inflate_state;        /*     0  9544 */
	/* --- cacheline 149 boundary (9536 bytes) was 8 bytes ago --- */
	unsigned char              working_window[32768]; /*  9544 32768 */

	/* size: 42312, cachelines: 662, members: 2 */
	/* last cacheline: 8 bytes */
};

xz also uses kmalloc() internally in xz_dec_init() to allocate a
struct xz_dec, which seems to be less than a page, so kmalloc() should be
fine in this case:

struct xz_dec {
...
	/* size: 1232, cachelines: 20, members: 16 */
...
}

In conclusion I think we should be pretty safe for now by just changing
gz and zstd.

Maybe having two separate patches is better (in case we need to revert
just one for any reason...)?

-Andrea
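
The page counts quoted above can be checked with a small userspace sketch.
It assumes 4 KiB pages, and the sketch_* helpers are hypothetical names
used only for this illustration. The struct sizes are the ones reported in
the layouts above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for illustration only: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Number of whole pages needed to hold `size` bytes. */
static size_t sketch_pages_needed(size_t size)
{
	assert(size > 0);
	return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Round a page count up to the next power of two, which is how a
 * physically contiguous page-allocator request is sized. */
static size_t sketch_pages_rounded_pow2(size_t pages)
{
	size_t n = 1;

	while (n < pages)
		n <<= 1;
	return n;
}
```

With these helpers, the 42312-byte struct inflate_workspace needs 11 pages,
which a contiguous allocation would round up to 16, while the 1232-byte
struct xz_dec fits in a single page.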
Luis Chamberlain Aug. 29, 2023, 7:05 p.m. UTC | #4
On Tue, Aug 29, 2023 at 07:46:35PM +0200, Andrea Righi wrote:
> On Tue, Aug 29, 2023 at 10:30:55AM -0700, Lucas De Marchi wrote:
> In conclusion I think we should be pretty safe for now by just changing
> gz and zstd.
> 
> Maybe having two separate patches is better (in case we need to revert
> just one for any reason...)?

Yes, that is why I already merged your zstd patch. If things do blow up,
the collateral damage is smaller.

  Luis

Patch

diff --git a/kernel/module/decompress.c b/kernel/module/decompress.c
index 8a5d6d63b06c..87440f714c0c 100644
--- a/kernel/module/decompress.c
+++ b/kernel/module/decompress.c
@@ -241,7 +241,7 @@  static ssize_t module_zstd_decompress(struct load_info *info,
 	}
 
 	wksp_size = zstd_dstream_workspace_bound(header.windowSize);
-	wksp = kmalloc(wksp_size, GFP_KERNEL);
+	wksp = vmalloc(wksp_size);
 	if (!wksp) {
 		retval = -ENOMEM;
 		goto out;
@@ -284,7 +284,7 @@  static ssize_t module_zstd_decompress(struct load_info *info,
 	retval = new_size;
 
  out:
-	kfree(wksp);
+	vfree(wksp);
 	return retval;
 }
 #else