[5.10,v3] locking/csd_lock: fix csdlock_debug cause arm64 boot panic

Message ID 20220507084510.14761-1-chenzhongjin@huawei.com (mailing list archive)
State New, archived

Commit Message

Chen Zhongjin May 7, 2022, 8:45 a.m. UTC
csdlock_debug is an early_param that enables the csd_lock_wait
feature.

It uses static_branch_enable in early_param which triggers
a panic on arm64 with config:
CONFIG_SPARSEMEM=y
CONFIG_SPARSEMEM_VMEMMAP=n

The log shows:
Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000000
...
Call trace:
__aarch64_insn_write+0x9c/0x18c
...
static_key_enable+0x1c/0x30
csdlock_debug+0x4c/0x78
do_early_param+0x9c/0xcc
parse_args+0x26c/0x3a8
parse_early_options+0x34/0x40
parse_early_param+0x80/0xa4
setup_arch+0x150/0x6c8
start_kernel+0x8c/0x720
...
Kernel panic - not syncing: Oops: Fatal exception

Call trace inside __aarch64_insn_write:
__nr_to_section
__pfn_to_page
phys_to_page
patch_map
__aarch64_insn_write

Here, with CONFIG_SPARSEMEM_VMEMMAP=n, __nr_to_section returns
NULL and causes the NULL pointer dereference, because mem_section
is not initialized until sparse_init, which runs after the
parse_early_param stage.

So static_branch_enable must not be called from an early_param
handler. To avoid this, change csdlock_debug to __setup.

Reported-by: Chen jingwen <chenjingwen6@huawei.com>
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
---
Change v2 -> v3:
Add module name in title

Change v1 -> v2:
Fix return 1 for __setup
---

 kernel/smp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
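
The ordering problem described in the commit message can be illustrated
with a small userspace model. This is only a sketch, not kernel code:
every name ending in _model is hypothetical, the section table is reduced
to a single pointer, and it assumes (as the fix relies on) that the
__setup handlers run only after sparse_init has completed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the sparse memory section table. */
struct mem_section_model { int unused; };

static struct mem_section_model sections_model[1];
static struct mem_section_model *mem_section_model_ptr; /* NULL until init */

/* Models sparse_init(): the table becomes valid only after this runs. */
static void sparse_init_model(void)
{
	mem_section_model_ptr = sections_model;
}

/* Models __nr_to_section(): yields NULL while the table is missing. */
static struct mem_section_model *nr_to_section_model(void)
{
	return mem_section_model_ptr;
}

/*
 * Models the text patching done for static_branch_enable(): it needs a
 * valid section to map the page. Returns false where the real kernel
 * would dereference NULL and panic.
 */
static bool patch_text_model(void)
{
	return nr_to_section_model() != NULL;
}

enum hook_model { HOOK_EARLY_PARAM, HOOK_SETUP };

/* Replays the boot order and reports whether patching would succeed. */
static bool boot_model(enum hook_model when)
{
	bool ok = false;

	mem_section_model_ptr = NULL;	/* fresh boot */

	if (when == HOOK_EARLY_PARAM)	/* parse_early_param() stage */
		ok = patch_text_model();

	sparse_init_model();		/* section table set up here */

	if (when == HOOK_SETUP)		/* later __setup handler pass */
		ok = patch_text_model();

	return ok;
}
```

In this model, boot_model(HOOK_EARLY_PARAM) fails because patching runs
before the table exists, while boot_model(HOOK_SETUP) succeeds.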

Comments

Greg Kroah-Hartman May 7, 2022, 9:47 a.m. UTC | #1
On Sat, May 07, 2022 at 04:45:10PM +0800, Chen Zhongjin wrote:
> csdlock_debug is a early_param to enable csd_lock_wait
> feature.
> 
> It uses static_branch_enable in early_param which triggers
> a panic on arm64 with config:
> CONFIG_SPARSEMEM=y
> CONFIG_SPARSEMEM_VMEMMAP=n
> 
> The log shows:
> Unable to handle kernel NULL pointer dereference at
> virtual address 0000000000000000
> ...
> Call trace:
> __aarch64_insn_write+0x9c/0x18c
> ...
> static_key_enable+0x1c/0x30
> csdlock_debug+0x4c/0x78
> do_early_param+0x9c/0xcc
> parse_args+0x26c/0x3a8
> parse_early_options+0x34/0x40
> parse_early_param+0x80/0xa4
> setup_arch+0x150/0x6c8
> start_kernel+0x8c/0x720
> ...
> Kernel panic - not syncing: Oops: Fatal exception
> 
> Call trace inside __aarch64_insn_write:
> __nr_to_section
> __pfn_to_page
> phys_to_page
> patch_map
> __aarch64_insn_write
> 
> Here, with CONFIG_SPARSEMEM_VMEMMAP=n, __nr_to_section returns
> NULL and makes the NULL dereference because mem_section is
> initialized in sparse_init after parse_early_param stage.
> 
> So, static_branch_enable shouldn't be used inside early_param.
> To avoid this, I changed it to __setup and fixed this.
> 
> Reported-by: Chen jingwen <chenjingwen6@huawei.com>
> Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
> ---
> Change v2 -> v3:
> Add module name in title
> 
> Change v1 -> v2:
> Fix return 1 for __setup
> ---
> 
>  kernel/smp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 65a630f62363..381eb15cd28f 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -174,9 +174,9 @@ static int __init csdlock_debug(char *str)
>  	if (val)
>  		static_branch_enable(&csdlock_debug_enabled);
>  
> -	return 0;
> +	return 1;
>  }
> -early_param("csdlock_debug", csdlock_debug);
> +__setup("csdlock_debug=", csdlock_debug);
>  
>  static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
>  static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
> -- 
> 2.17.1
> 


<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>
Chen Zhongjin May 9, 2022, 3:14 a.m. UTC | #2
Hi Greg,

Since the patch:
https://lore.kernel.org/all/20210420093559.23168-1-catalin.marinas@arm.com/
has forced CONFIG_SPARSEMEM_VMEMMAP=y from 5.12 onward, this patch is not
needed on master.

However, this problem still exists on 5.10 stable, so we can either backport
the above patch to 5.10 or apply mine independently.

I'm not sure whether backporting an existing patch is better, but that patch
only changes configs and does not fix old builds.

If you have any advice, please tell me.

Thanks!
Chen

On 2022/5/7 17:47, Greg KH wrote:
> On Sat, May 07, 2022 at 04:45:10PM +0800, Chen Zhongjin wrote:
>> csdlock_debug is a early_param to enable csd_lock_wait
>> feature.
>>
>> It uses static_branch_enable in early_param which triggers
>> a panic on arm64 with config:
>> CONFIG_SPARSEMEM=y
>> CONFIG_SPARSEMEM_VMEMMAP=n
>>
>> The log shows:
>> Unable to handle kernel NULL pointer dereference at
>> virtual address 0000000000000000
>> ...
>> Call trace:
>> __aarch64_insn_write+0x9c/0x18c
>> ...
>> static_key_enable+0x1c/0x30
>> csdlock_debug+0x4c/0x78
>> do_early_param+0x9c/0xcc
>> parse_args+0x26c/0x3a8
>> parse_early_options+0x34/0x40
>> parse_early_param+0x80/0xa4
>> setup_arch+0x150/0x6c8
>> start_kernel+0x8c/0x720
>> ...
>> Kernel panic - not syncing: Oops: Fatal exception
>>
>> Call trace inside __aarch64_insn_write:
>> __nr_to_section
>> __pfn_to_page
>> phys_to_page
>> patch_map
>> __aarch64_insn_write
>>
>> Here, with CONFIG_SPARSEMEM_VMEMMAP=n, __nr_to_section returns
>> NULL and makes the NULL dereference because mem_section is
>> initialized in sparse_init after parse_early_param stage.
>>
>> So, static_branch_enable shouldn't be used inside early_param.
>> To avoid this, I changed it to __setup and fixed this.
>>
>> Reported-by: Chen jingwen <chenjingwen6@huawei.com>
>> Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
>> ---
>> Change v2 -> v3:
>> Add module name in title
>>
>> Change v1 -> v2:
>> Fix return 1 for __setup
>> ---
>>
>>  kernel/smp.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/smp.c b/kernel/smp.c
>> index 65a630f62363..381eb15cd28f 100644
>> --- a/kernel/smp.c
>> +++ b/kernel/smp.c
>> @@ -174,9 +174,9 @@ static int __init csdlock_debug(char *str)
>>  	if (val)
>>  		static_branch_enable(&csdlock_debug_enabled);
>>  
>> -	return 0;
>> +	return 1;
>>  }
>> -early_param("csdlock_debug", csdlock_debug);
>> +__setup("csdlock_debug=", csdlock_debug);
>>  
>>  static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
>>  static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
>> -- 
>> 2.17.1
>>
> 
> 
> <formletter>
> 
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree.  Please read:
>     https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
> 
> </formletter>
> .
Greg Kroah-Hartman May 9, 2022, 10:14 a.m. UTC | #3
A: http://en.wikipedia.org/wiki/Top_post
Q: Where do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Mon, May 09, 2022 at 11:14:12AM +0800, Chen Zhongjin wrote:
> Hi Greg,
> 
> Since the patch:
> https://lore.kernel.org/all/20210420093559.23168-1-catalin.marinas@arm.com/
> has forced CONFIG_SPARSEMEM_VMEMMAP=y from 5.12, it's not necessary to include
> this patch on master.
> 
> However this problem still exist on 5.10 stable, so either we can backport the
> above patch to 5.10, or independently apply mine.
> 
> I'm not sure if backporting one exist patch is better, but that patch only
> changed configs without any fix for old builds.
> 
> If you have any advice please tell me.

If you want to include a patch in the stable tree that is NOT in Linus's
tree, then you need to document it very very well as to why this is not
the case.

If backporting the above commit is better, I would much rather do that,
please ask the maintainers and developers of it if they will do that.

thanks,

greg k-h
Chen Zhongjin May 10, 2022, 6:32 a.m. UTC | #4
On 2022/5/9 18:14, Greg KH wrote:
> 
> A: http://en.wikipedia.org/wiki/Top_post
> Q: Were do I find info about this thing called top-posting?
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
> 
> A: No.
> Q: Should I include quotations after my reply?
> 
> http://daringfireball.net/2007/07/on_top

Sorry about this. Thank you for pointing it out before I made more mistakes.

> 
> On Mon, May 09, 2022 at 11:14:12AM +0800, Chen Zhongjin wrote:
>> Hi Greg,
>>
>> Since the patch:
>> https://lore.kernel.org/all/20210420093559.23168-1-catalin.marinas@arm.com/
>> has forced CONFIG_SPARSEMEM_VMEMMAP=y from 5.12, it's not necessary to include
>> this patch on master.
>>
>> However this problem still exist on 5.10 stable, so either we can backport the
>> above patch to 5.10, or independently apply mine.
>>
>> I'm not sure if backporting one exist patch is better, but that patch only
>> changed configs without any fix for old builds.
>>
>> If you have any advice please tell me.
> 
> If you want to include a patch in the stable tree that is NOT in Linus's
> tree, then you need to document it very very well as to why this is not
> the case.
> 
> If backporting the above commit is better, I would much rather do that,
> please ask the maintainers and developers of it if they will do that.

I'll send this patch to master, because I found that ppc64 is broken by the
same problem. I'll also Cc stable, so that once the patch is accepted into
master it can be backported to stable.

Thanks!

> thanks,
> 
> greg k-h
> .

Patch

diff --git a/kernel/smp.c b/kernel/smp.c
index 65a630f62363..381eb15cd28f 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -174,9 +174,9 @@ static int __init csdlock_debug(char *str)
 	if (val)
 		static_branch_enable(&csdlock_debug_enabled);
 
-	return 0;
+	return 1;
 }
-early_param("csdlock_debug", csdlock_debug);
+__setup("csdlock_debug=", csdlock_debug);
 
 static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
 static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
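
The two changed lines go together: an early_param handler returns 0 on
success, whereas a __setup handler returns 1 to say the option was
handled, and the __setup string includes the trailing "=" so the handler
receives only the value. The sketch below models that dispatch in
userspace. It is a simplification under stated assumptions: the names
ending in _model are hypothetical, and strtoul stands in for the kernel's
own option parsing.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the csdlock_debug_enabled static key. */
static bool csdlock_debug_enabled_model;

/* Same shape as the patched handler: parse the value, return 1 = handled. */
static int csdlock_debug_model(char *str)
{
	unsigned long val = strtoul(str, NULL, 0);

	if (val)
		csdlock_debug_enabled_model = true;

	return 1;
}

/* One registered __setup-style entry: a prefix and its handler. */
struct setup_entry_model {
	const char *prefix;
	int (*fn)(char *);
};

static const struct setup_entry_model setup_table_model[] = {
	{ "csdlock_debug=", csdlock_debug_model },
};

/*
 * Models the __setup dispatch: match the prefix, pass the remainder of
 * the argument to the handler, and treat a nonzero return as "consumed".
 * Returns false if no handler claimed the argument.
 */
static bool dispatch_setup_model(char *arg)
{
	size_t count = sizeof(setup_table_model) / sizeof(setup_table_model[0]);

	for (size_t i = 0; i < count; i++) {
		size_t n = strlen(setup_table_model[i].prefix);

		if (strncmp(arg, setup_table_model[i].prefix, n) == 0 &&
		    setup_table_model[i].fn(arg + n))
			return true;
	}
	return false;
}
```

Had the handler kept returning 0 after the switch to __setup, the model
would report the argument as unclaimed, which is what the v2 change
"Fix return 1 for __setup" addresses.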