
[v3] RISC-V: Add fixup to support fast call of crash_kexec()

Message ID 20220606123750.2884245-1-xianting.tian@linux.alibaba.com (mailing list archive)
State New, archived

Commit Message

Xianting Tian June 6, 2022, 12:37 p.m. UTC
Currently, almost all architectures (x86, arm64, mips...) support a fast
call of crash_kexec() when "regs && kexec_should_crash()" is true, but
RISC-V does not; it can only enter the crash system via panic(). However,
panic() doesn't pass the regs of the actual fault site to crash_kexec(),
so we can't get an accurate backtrace via gdb:
	$ riscv64-linux-gnu-gdb vmlinux vmcore
	Reading symbols from vmlinux...
	[New LWP 95]
	#0  console_unlock () at kernel/printk/printk.c:2557
	2557                    if (do_cond_resched)
	(gdb) bt
	#0  console_unlock () at kernel/printk/printk.c:2557
	#1  0x0000000000000000 in ?? ()

With the patch, we can get an accurate backtrace:
	$ riscv64-linux-gnu-gdb vmlinux vmcore
	Reading symbols from vmlinux...
	[New LWP 95]
	#0  0xffffffe00063a4e0 in test_thread (data=<optimized out>) at drivers/test_crash.c:81
	81             *(int *)p = 0xdead;
	(gdb)
	(gdb) bt
	#0  0xffffffe00064d5c0 in test_thread (data=<optimized out>) at drivers/test_crash.c:81
	#1  0x0000000000000000 in ?? ()

Test code that produces the NULL pointer dereference in test_crash.c:
	void *p = NULL;
	*(int *)p = 0xdead;

Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code")
Reviewed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
---
Changes from v1:
- simplify the commit message
Changes from v2:
- add fixup in title
---
 arch/riscv/kernel/traps.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Kefeng Wang June 7, 2022, 1:21 a.m. UTC | #1
On 2022/6/6 20:37, Xianting Tian wrote:
> Currently, almost all archs (x86, arm64, mips...) support fast call
> of crash_kexec() when "regs && kexec_should_crash()" is true. But
> RISC-V not, it can only enter crash system via panic(). However panic()
> doesn't pass the regs of the real accident scene to crash_kexec(),
> it caused we can't get accurate backtrace via gdb,
> 	$ riscv64-linux-gnu-gdb vmlinux vmcore
> 	Reading symbols from vmlinux...
> 	[New LWP 95]
> 	#0  console_unlock () at kernel/printk/printk.c:2557
> 	2557                    if (do_cond_resched)
> 	(gdb) bt
> 	#0  console_unlock () at kernel/printk/printk.c:2557
> 	#1  0x0000000000000000 in ?? ()
>
> With the patch we can get the accurate backtrace,
> 	$ riscv64-linux-gnu-gdb vmlinux vmcore
> 	Reading symbols from vmlinux...
> 	[New LWP 95]
> 	#0  0xffffffe00063a4e0 in test_thread (data=<optimized out>) at drivers/test_crash.c:81
> 	81             *(int *)p = 0xdead;
> 	(gdb)
> 	(gdb) bt
> 	#0  0xffffffe00064d5c0 in test_thread (data=<optimized out>) at drivers/test_crash.c:81
> 	#1  0x0000000000000000 in ?? ()
>
> Test code to produce NULL address dereference in test_crash.c,
> 	void *p = NULL;
> 	*(int *)p = 0xdead;
>
> Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code")
> Reviewed-by: Guo Ren <guoren@kernel.org>
> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
> ---
> Changes from v1:
> - simplify the commit message
> Changes from v2:
> - add fixup in title
> ---
>   arch/riscv/kernel/traps.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> index b40426509244..39d0f8bba4b4 100644
> --- a/arch/riscv/kernel/traps.c
> +++ b/arch/riscv/kernel/traps.c
> @@ -16,6 +16,7 @@
>   #include <linux/mm.h>
>   #include <linux/module.h>
>   #include <linux/irq.h>
> +#include <linux/kexec.h>
>   
>   #include <asm/asm-prototypes.h>
>   #include <asm/bug.h>
> @@ -44,6 +45,9 @@ void die(struct pt_regs *regs, const char *str)
>   
>   	ret = notify_die(DIE_OOPS, str, regs, 0, regs->cause, SIGSEGV);
>   
> +	if (regs && kexec_should_crash(current))
> +		crash_kexec(regs);
> +

It seems that regs can't be NULL here, right? Other than that,

Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>

>   	bust_spinlocks(0);
>   	add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);
>   	spin_unlock_irq(&die_lock);
Xianting Tian June 7, 2022, 1:46 a.m. UTC | #2
On 2022/6/7 9:21 AM, Kefeng Wang wrote:
>
> On 2022/6/6 20:37, Xianting Tian wrote:
>> Currently, almost all archs (x86, arm64, mips...) support fast call
>> of crash_kexec() when "regs && kexec_should_crash()" is true. But
>> RISC-V not, it can only enter crash system via panic(). However panic()
>> doesn't pass the regs of the real accident scene to crash_kexec(),
>> it caused we can't get accurate backtrace via gdb,
>>     $ riscv64-linux-gnu-gdb vmlinux vmcore
>>     Reading symbols from vmlinux...
>>     [New LWP 95]
>>     #0  console_unlock () at kernel/printk/printk.c:2557
>>     2557                    if (do_cond_resched)
>>     (gdb) bt
>>     #0  console_unlock () at kernel/printk/printk.c:2557
>>     #1  0x0000000000000000 in ?? ()
>>
>> With the patch we can get the accurate backtrace,
>>     $ riscv64-linux-gnu-gdb vmlinux vmcore
>>     Reading symbols from vmlinux...
>>     [New LWP 95]
>>     #0  0xffffffe00063a4e0 in test_thread (data=<optimized out>) at drivers/test_crash.c:81
>>     81             *(int *)p = 0xdead;
>>     (gdb)
>>     (gdb) bt
>>     #0  0xffffffe00064d5c0 in test_thread (data=<optimized out>) at drivers/test_crash.c:81
>>     #1  0x0000000000000000 in ?? ()
>>
>> Test code to produce NULL address dereference in test_crash.c,
>>     void *p = NULL;
>>     *(int *)p = 0xdead;
>>
>> Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code")
>> Reviewed-by: Guo Ren <guoren@kernel.org>
>> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
>> ---
>> Changes from v1:
>> - simplify the commit message
>> Changes from v2:
>> - add fixup in title
>> ---
>>   arch/riscv/kernel/traps.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
>> index b40426509244..39d0f8bba4b4 100644
>> --- a/arch/riscv/kernel/traps.c
>> +++ b/arch/riscv/kernel/traps.c
>> @@ -16,6 +16,7 @@
>>   #include <linux/mm.h>
>>   #include <linux/module.h>
>>   #include <linux/irq.h>
>> +#include <linux/kexec.h>
>>
>>   #include <asm/asm-prototypes.h>
>>   #include <asm/bug.h>
>> @@ -44,6 +45,9 @@ void die(struct pt_regs *regs, const char *str)
>>
>>   	ret = notify_die(DIE_OOPS, str, regs, 0, regs->cause, SIGSEGV);
>>
>> +	if (regs && kexec_should_crash(current))
>> +		crash_kexec(regs);
>> +
>
> It seems that the regs won't be null, right? except that,

Actually, regs won't be NULL in either case. But if the crash is triggered
via panic(), the regs are obtained via riscv_crash_save_regs(), which
captures the registers at that moment, not those of the actual fault site.
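
To illustrate the point above, here is a minimal userspace sketch of the two paths. It is not kernel code: struct model_regs, the model_* functions and the MODEL_PC_* addresses are hypothetical names invented for this illustration, and the sketch assumes that the crash path copies the caller's regs when they are provided and otherwise takes a snapshot of the current state, as described for riscv_crash_save_regs(). The kexec_should_crash() check is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the kernel's struct pt_regs. */
struct model_regs {
	unsigned long epc;	/* program counter recorded for the crash dump */
};

/* Illustrative addresses only; they do not belong to any real kernel image. */
#define MODEL_PC_FAULT	0x1000UL	/* where the bad store happened */
#define MODEL_PC_PANIC	0x2000UL	/* where the CPU is once panic() runs */

/* Stand-in for riscv_crash_save_regs(): snapshot the registers "right now". */
static void model_save_current_regs(struct model_regs *out, unsigned long pc_now)
{
	assert(out != NULL);
	out->epc = pc_now;
}

/*
 * Stand-in for the register setup on the crash_kexec() path (an assumption
 * of this sketch): use the caller's regs if given, else take a snapshot.
 */
static struct model_regs model_crash_kexec(const struct model_regs *regs,
					   unsigned long pc_now)
{
	struct model_regs dump;

	if (regs)
		dump = *regs;
	else
		model_save_current_regs(&dump, pc_now);
	return dump;
}

/* Without the patch: the crash system is entered later, via panic(). */
struct model_regs model_die_unpatched(const struct model_regs *trap_regs)
{
	(void)trap_regs;	/* the trap-time registers are not forwarded */
	return model_crash_kexec(NULL, MODEL_PC_PANIC);
}

/* With the patch: die() forwards the trap-time registers directly. */
struct model_regs model_die_patched(const struct model_regs *trap_regs)
{
	return model_crash_kexec(trap_regs, MODEL_PC_PANIC);
}
```

Calling both functions with a model_regs whose epc is MODEL_PC_FAULT shows the difference: the unpatched model records MODEL_PC_PANIC, while the patched model records MODEL_PC_FAULT, which is what gdb needs to point the backtrace at the faulting line.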

>
> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>
>>       bust_spinlocks(0);
>>       add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);
>>       spin_unlock_irq(&die_lock);
Xianting Tian June 17, 2022, 6:40 a.m. UTC | #3
Hi Palmer

Will you apply this patch for 5.19?

thanks

Guo Ren June 17, 2022, 7:13 a.m. UTC | #4
Hi Xianting,

On Fri, Jun 17, 2022 at 2:40 PM Xianting Tian
<xianting.tian@linux.alibaba.com> wrote:
>
> Hi Palmer
>
> Will you apply this patch for 5.19?
Maybe we could update the title to "[PATCH v3] RISC-V: Fixup fast call of
crash_kexec()" for a V4.

It's a fixup, not a feature, and the current title could mislead the
maintainer. Also add "Reviewed-by: Kefeng Wang".

>
> thanks
>

Patch

diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index b40426509244..39d0f8bba4b4 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -16,6 +16,7 @@ 
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/irq.h>
+#include <linux/kexec.h>
 
 #include <asm/asm-prototypes.h>
 #include <asm/bug.h>
@@ -44,6 +45,9 @@  void die(struct pt_regs *regs, const char *str)
 
 	ret = notify_die(DIE_OOPS, str, regs, 0, regs->cause, SIGSEGV);
 
+	if (regs && kexec_should_crash(current))
+		crash_kexec(regs);
+
 	bust_spinlocks(0);
 	add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);
 	spin_unlock_irq(&die_lock);