diff mbox series

riscv: stacktrace: fixed walk_stackframe()

Message ID 20240426072701.6463-1-dev.mbstr@gmail.com (mailing list archive)
State Superseded
Headers show
Series riscv: stacktrace: fixed walk_stackframe() | expand

Checks

Context Check Description
conchuod/vmtest-for-next-PR success PR summary
conchuod/patch-1-test-1 success .github/scripts/patches/tests/build_rv32_defconfig.sh
conchuod/patch-1-test-2 success .github/scripts/patches/tests/build_rv64_clang_allmodconfig.sh
conchuod/patch-1-test-3 success .github/scripts/patches/tests/build_rv64_gcc_allmodconfig.sh
conchuod/patch-1-test-4 success .github/scripts/patches/tests/build_rv64_nommu_k210_defconfig.sh
conchuod/patch-1-test-5 success .github/scripts/patches/tests/build_rv64_nommu_virt_defconfig.sh
conchuod/patch-1-test-6 success .github/scripts/patches/tests/checkpatch.sh
conchuod/patch-1-test-7 success .github/scripts/patches/tests/dtb_warn_rv64.sh
conchuod/patch-1-test-8 success .github/scripts/patches/tests/header_inline.sh
conchuod/patch-1-test-9 success .github/scripts/patches/tests/kdoc.sh
conchuod/patch-1-test-10 success .github/scripts/patches/tests/module_param.sh
conchuod/patch-1-test-11 success .github/scripts/patches/tests/verify_fixes.sh
conchuod/patch-1-test-12 success .github/scripts/patches/tests/verify_signedoff.sh

Commit Message

Matthew Bystrin April 26, 2024, 7:24 a.m. UTC
If the load access fault occures in a function without callee
(CONFIG_FRAME_POINTER=y), when wrong stack trace will be displayed:

[<ffffffff804853c2>] regmap_mmio_read32le+0xe/0x1c
---[ end trace 0000000000000000 ]---

Registers dump:
    ra     0xffffffff80485758 <regmap_mmio_read+36>
    sp     0xffffffc80200b9a0
    fp     0xffffffc80200b9b0
    pc     0xffffffff804853ba <regmap_mmio_read32le+6>

Stack dump:
    0xffffffc80200b9a0:  0xffffffc80200b9e0  0xffffffc80200b9e0
    0xffffffc80200b9b0:  0xffffffff8116d7e8  0x0000000000000100
    0xffffffc80200b9c0:  0xffffffd8055b9400  0xffffffd8055b9400
    0xffffffc80200b9d0:  0xffffffc80200b9f0  0xffffffff8047c526
    0xffffffc80200b9e0:  0xffffffc80200ba30  0xffffffff8047fe9a

The assembler dump of the function preambula:
    add     sp,sp,-16
    sd      s0,8(sp)
    add     s0,sp,16

In the fist stack frame, where ra is not stored on the stack we can
observe:

        0(sp)                  8(sp)
        .---------------------------------------------.
    sp->|       frame->fp      | frame->ra (saved fp) |
        |---------------------------------------------|
    fp->|         ....         |         ....         |
        |---------------------------------------------|
        |                      |                      |

and in the code check is performed:
	if (regs && (regs->epc == pc) && (frame->fp & 0x7))

I see no reason to check frame->fp value at all, because it is can be
uninitialized value on the stack, so removed it. After the stacktrace
shows as expect:

[<ffffffff804853c2>] regmap_mmio_read32le+0xe/0x1c
[<ffffffff80485758>] regmap_mmio_read+0x24/0x52
[<ffffffff8047c526>] _regmap_bus_reg_read+0x1a/0x22
[<ffffffff8047fe9a>] _regmap_read+0x5c/0xea
[<ffffffff80480376>] _regmap_update_bits+0x76/0xc0
...
---[ end trace 0000000000000000 ]---

Fixes: f766f77a74f5 ("riscv/stacktrace: Fix stack output without ra on the stack top")
Signed-off-by: Matthew Bystrin <dev.mbstr@gmail.com>
---
I've catched this bug on v6.1 with gcc 12.2.0 (Debian 12.2.0-13). Different
compiler versions generate the same assembler code. So I think this is not a
compiler issue.

 arch/riscv/kernel/stacktrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Samuel Holland May 8, 2024, 3:51 p.m. UTC | #1
Hi Matthew,

Thanks for the patch!

On 2024-04-26 2:24 AM, Matthew Bystrin wrote:
> If the load access fault occures in a function without callee
> (CONFIG_FRAME_POINTER=y), when wrong stack trace will be displayed:
> 
> [<ffffffff804853c2>] regmap_mmio_read32le+0xe/0x1c
> ---[ end trace 0000000000000000 ]---
> 
> Registers dump:
>     ra     0xffffffff80485758 <regmap_mmio_read+36>
>     sp     0xffffffc80200b9a0
>     fp     0xffffffc80200b9b0
>     pc     0xffffffff804853ba <regmap_mmio_read32le+6>
> 
> Stack dump:
>     0xffffffc80200b9a0:  0xffffffc80200b9e0  0xffffffc80200b9e0
>     0xffffffc80200b9b0:  0xffffffff8116d7e8  0x0000000000000100
>     0xffffffc80200b9c0:  0xffffffd8055b9400  0xffffffd8055b9400
>     0xffffffc80200b9d0:  0xffffffc80200b9f0  0xffffffff8047c526
>     0xffffffc80200b9e0:  0xffffffc80200ba30  0xffffffff8047fe9a
> 
> The assembler dump of the function preambula:
>     add     sp,sp,-16
>     sd      s0,8(sp)
>     add     s0,sp,16
> 
> In the fist stack frame, where ra is not stored on the stack we can
> observe:
> 
>         0(sp)                  8(sp)
>         .---------------------------------------------.
>     sp->|       frame->fp      | frame->ra (saved fp) |
>         |---------------------------------------------|
>     fp->|         ....         |         ....         |
>         |---------------------------------------------|
>         |                      |                      |
> 
> and in the code check is performed:
> 	if (regs && (regs->epc == pc) && (frame->fp & 0x7))
> 
> I see no reason to check frame->fp value at all, because it is can be
> uninitialized value on the stack, so removed it. After the stacktrace

The purpose of this check is to distinguish leaf functions from non-leaf
functions, which have a different frame record layout. regs->epc does not
necessarily point to somewhere in a leaf function, for example if a non-leaf
function was preempted by an interrupt. So it incorrect to remove the check
entirely and always consider the first frame in the stack trace to be a leaf
function.

You are correct that for a leaf function, frame->fp could be uninitialized or an
arbitrary saved register, so the existing check is wrong. A better check would
be based on frame->ra, which is always pushed. For non-leaf functions, frame->ra
is a kernel text address. For leaf functions, frame->ra is a stack address. So
checking if frame->ra is within the current stack (between `low` and `high`)
should be sufficient to detect this case.

> shows as expect:
> 
> [<ffffffff804853c2>] regmap_mmio_read32le+0xe/0x1c
> [<ffffffff80485758>] regmap_mmio_read+0x24/0x52
> [<ffffffff8047c526>] _regmap_bus_reg_read+0x1a/0x22
> [<ffffffff8047fe9a>] _regmap_read+0x5c/0xea
> [<ffffffff80480376>] _regmap_update_bits+0x76/0xc0
> ...
> ---[ end trace 0000000000000000 ]---
> 
> Fixes: f766f77a74f5 ("riscv/stacktrace: Fix stack output without ra on the stack top")
> Signed-off-by: Matthew Bystrin <dev.mbstr@gmail.com>
> ---
> I've catched this bug on v6.1 with gcc 12.2.0 (Debian 12.2.0-13). Different
> compiler versions generate the same assembler code. So I think this is not a
> compiler issue.

Unfortunately, this is indeed a mismatch between the compiler and the documented
ABI[1]. The ABI was created based on the existing GCC behavior[2]. However, the
GCC behavior is different for leaf functions. Specifically, the logic for
determining if ra needs to be saved on the stack does not check if frame
pointers are enabled. Since GCC was not trying to maintain a specific stack
frame layout, and just pushing registers sequentially as needed, this makes the
frame record layout different for leaf functions. So existing versions of GCC
are incompatible with the ABI as written.

This was fixed for GCC 14 as a side effect of adding the
-m(no)-omit-leaf-frame-pointer option[3]. After this change, ra is always saved
if a frame record is being saved, so the frame record layout is always the same,
matching the ABI.

For the kernel, since we want frame records to be available for stack walking,
we should pass -mno-omit-leaf-frame-pointer if that option is supported. If the
option is supported, we know the ABI is fixed, so in that case we can omit the
workaround described above.

We may still want a compiler version check, because there is a much simpler GCC
change that would fix the ABI incompatibility in older branches without adding
the new option:

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index dddd72aa237..58722507dcb 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3894,7 +3894,8 @@ riscv_save_reg_p (unsigned int regno)
   if (regno == HARD_FRAME_POINTER_REGNUM && frame_pointer_needed)
     return true;

-  if (regno == RETURN_ADDR_REGNUM && crtl->calls_eh_return)
+  if (regno == RETURN_ADDR_REGNUM && (crtl->calls_eh_return
+				      || frame_pointer_needed))
     return true;

   /* If this is an interrupt handler, then must save extra registers.  */


[1]:
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#frame-pointer-convention
[2]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/18
[3]:
https://github.com/gcc-mirror/gcc/commit/39663298b5934831a0125e12f113ebd83248c3be

Regards,
Samuel

> 
>  arch/riscv/kernel/stacktrace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
> index 64a9c093aef9..8138c68270c9 100644
> --- a/arch/riscv/kernel/stacktrace.c
> +++ b/arch/riscv/kernel/stacktrace.c
> @@ -55,7 +55,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
>  		/* Unwind stack frame */
>  		frame = (struct stackframe *)fp - 1;
>  		sp = fp;
> -		if (regs && (regs->epc == pc) && (frame->fp & 0x7)) {
> +		if (regs && regs->epc == pc) {
>  			fp = frame->ra;
>  			pc = regs->ra;
>  		} else {
Palmer Dabbelt May 8, 2024, 5:05 p.m. UTC | #2
On Wed, 08 May 2024 08:51:04 PDT (-0700), samuel.holland@sifive.com wrote:
> Hi Matthew,
>
> Thanks for the patch!
>
> On 2024-04-26 2:24 AM, Matthew Bystrin wrote:
>> If the load access fault occures in a function without callee
>> (CONFIG_FRAME_POINTER=y), when wrong stack trace will be displayed:
>>
>> [<ffffffff804853c2>] regmap_mmio_read32le+0xe/0x1c
>> ---[ end trace 0000000000000000 ]---
>>
>> Registers dump:
>>     ra     0xffffffff80485758 <regmap_mmio_read+36>
>>     sp     0xffffffc80200b9a0
>>     fp     0xffffffc80200b9b0
>>     pc     0xffffffff804853ba <regmap_mmio_read32le+6>
>>
>> Stack dump:
>>     0xffffffc80200b9a0:  0xffffffc80200b9e0  0xffffffc80200b9e0
>>     0xffffffc80200b9b0:  0xffffffff8116d7e8  0x0000000000000100
>>     0xffffffc80200b9c0:  0xffffffd8055b9400  0xffffffd8055b9400
>>     0xffffffc80200b9d0:  0xffffffc80200b9f0  0xffffffff8047c526
>>     0xffffffc80200b9e0:  0xffffffc80200ba30  0xffffffff8047fe9a
>>
>> The assembler dump of the function preambula:
>>     add     sp,sp,-16
>>     sd      s0,8(sp)
>>     add     s0,sp,16
>>
>> In the fist stack frame, where ra is not stored on the stack we can
>> observe:
>>
>>         0(sp)                  8(sp)
>>         .---------------------------------------------.
>>     sp->|       frame->fp      | frame->ra (saved fp) |
>>         |---------------------------------------------|
>>     fp->|         ....         |         ....         |
>>         |---------------------------------------------|
>>         |                      |                      |
>>
>> and in the code check is performed:
>> 	if (regs && (regs->epc == pc) && (frame->fp & 0x7))
>>
>> I see no reason to check frame->fp value at all, because it is can be
>> uninitialized value on the stack, so removed it. After the stacktrace
>
> The purpose of this check is to distinguish leaf functions from non-leaf
> functions, which have a different frame record layout. regs->epc does not
> necessarily point to somewhere in a leaf function, for example if a non-leaf
> function was preempted by an interrupt. So it incorrect to remove the check
> entirely and always consider the first frame in the stack trace to be a leaf
> function.
>
> You are correct that for a leaf function, frame->fp could be uninitialized or an
> arbitrary saved register, so the existing check is wrong. A better check would
> be based on frame->ra, which is always pushed. For non-leaf functions, frame->ra
> is a kernel text address. For leaf functions, frame->ra is a stack address. So
> checking if frame->ra is within the current stack (between `low` and `high`)
> should be sufficient to detect this case.

I was just playing with this, looks like GCC is treating tail-call-only 
functions as leaf functions.  So I think whatever workaround we come up 
with loses the tail-call-only function in the backtrace, not sure if 
there's any way we can work around that as there's essentially no 
frame remaining for it.  From https://godbolt.org/z/7bMsEbqjn and 
GCC trunk

    int glob;
    
    void leaf(int x) {
        glob = x;
    }
    
    void tail(int x) {
        leaf(x);
    }
    
    void func(int x) {
        leaf(x);
        glob = x+1;
    }
    
    -O2 -fno-inline -fno-omit-frame-pointer -momit-leaf-frame-pointer
    
    leaf(int):
            lui     a5,%hi(glob)
            sw      a0,%lo(glob)(a5)
            ret
    tail(int):
            tail    leaf(int)
    func(int):
            addi    sp,sp,-32
            sd      s0,16(sp)
            sd      s1,8(sp)
    …        addi    sp,sp,32
            jr      ra
    glob:
            .zero   4

>
>> shows as expect:
>>
>> [<ffffffff804853c2>] regmap_mmio_read32le+0xe/0x1c
>> [<ffffffff80485758>] regmap_mmio_read+0x24/0x52
>> [<ffffffff8047c526>] _regmap_bus_reg_read+0x1a/0x22
>> [<ffffffff8047fe9a>] _regmap_read+0x5c/0xea
>> [<ffffffff80480376>] _regmap_update_bits+0x76/0xc0
>> ...
>> ---[ end trace 0000000000000000 ]---
>>
>> Fixes: f766f77a74f5 ("riscv/stacktrace: Fix stack output without ra on the stack top")
>> Signed-off-by: Matthew Bystrin <dev.mbstr@gmail.com>
>> ---
>> I've catched this bug on v6.1 with gcc 12.2.0 (Debian 12.2.0-13). Different
>> compiler versions generate the same assembler code. So I think this is not a
>> compiler issue.
>
> Unfortunately, this is indeed a mismatch between the compiler and the documented
> ABI[1]. The ABI was created based on the existing GCC behavior[2]. However, the
> GCC behavior is different for leaf functions. Specifically, the logic for
> determining if ra needs to be saved on the stack does not check if frame
> pointers are enabled. Since GCC was not trying to maintain a specific stack
> frame layout, and just pushing registers sequentially as needed, this makes the
> frame record layout different for leaf functions. So existing versions of GCC
> are incompatible with the ABI as written.
>
> This was fixed for GCC 14 as a side effect of adding the
> -m(no)-omit-leaf-frame-pointer option[3]. After this change, ra is always saved
> if a frame record is being saved, so the frame record layout is always the same,
> matching the ABI.

Ya, I think so -- IIUC we don't strictly need the option because the 
default seems to follow -fno-omit-frame-pointer, so we only strictly need 
to check for the presence of -mno-omit-leaf-frame-pointer.  Probably 
cleaner to just stick the option in there when it's supported, though, 
just to be on the safe side.

> For the kernel, since we want frame records to be available for stack walking,
> we should pass -mno-omit-leaf-frame-pointer if that option is supported. If the
> option is supported, we know the ABI is fixed, so in that case we can omit the
> workaround described above.
>
> We may still want a compiler version check, because there is a much simpler GCC
> change that would fix the ABI incompatibility in older branches without adding
> the new option:
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index dddd72aa237..58722507dcb 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -3894,7 +3894,8 @@ riscv_save_reg_p (unsigned int regno)
>    if (regno == HARD_FRAME_POINTER_REGNUM && frame_pointer_needed)
>      return true;
>
> -  if (regno == RETURN_ADDR_REGNUM && crtl->calls_eh_return)
> +  if (regno == RETURN_ADDR_REGNUM && (crtl->calls_eh_return
> +				      || frame_pointer_needed))
>      return true;
>
>    /* If this is an interrupt handler, then must save extra registers.  */

IMO we should try backporting the option to GCC-13, that's a cleaner 
user interface.

> [1]:
> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#frame-pointer-convention
> [2]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/18
> [3]:
> https://github.com/gcc-mirror/gcc/commit/39663298b5934831a0125e12f113ebd83248c3be
>
> Regards,
> Samuel
>
>>
>>  arch/riscv/kernel/stacktrace.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
>> index 64a9c093aef9..8138c68270c9 100644
>> --- a/arch/riscv/kernel/stacktrace.c
>> +++ b/arch/riscv/kernel/stacktrace.c
>> @@ -55,7 +55,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
>>  		/* Unwind stack frame */
>>  		frame = (struct stackframe *)fp - 1;
>>  		sp = fp;
>> -		if (regs && (regs->epc == pc) && (frame->fp & 0x7)) {
>> +		if (regs && regs->epc == pc) {
>>  			fp = frame->ra;
>>  			pc = regs->ra;
>>  		} else {
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
Matthew Bystrin May 9, 2024, 10:13 p.m. UTC | #3
Thanks for taking your time to review the patch!

On 2024-05-08 06:51 PM, Samuel Holland wrote:
> A better check would be based on frame->ra, which is always pushed. For
> non-leaf functions, frame->ra is a kernel text address. For leaf functions,
> frame->ra is a stack address. So checking if frame->ra is within the current
> stack (between `low` and `high`) should be sufficient to detect this case.

I've came up with slightly different approach. Checking that frame->ra is not a
kernel text address seems like a more compact solution. Palmer, Samuel, what do
you think?

diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index 64a9c093aef9..e63bb926d3d9 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -55,7 +55,8 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
 		/* Unwind stack frame */
 		frame = (struct stackframe *)fp - 1;
 		sp = fp;
-		if (regs && (regs->epc == pc) && (frame->fp & 0x7)) {
+		if (regs && (regs->epc == pc) && !__kernel_text_address(frame->ra)) {
+			/* We hit function where ra is not saved on the stack */
 			fp = frame->ra;
 			pc = regs->ra;
 		} else {

On 2024-05-08 08:05 PM, Palmer Dabbelt wrote:
> I was just playing with this, looks like GCC is treating tail-call-only
> functions as leaf functions.  So I think whatever workaround we come up
> with loses the tail-call-only function in the backtrace, not sure if
> there's any way we can work around that as there's essentially no
> frame remaining for it.

Yes, we definitely will miss some functions. But at least we can observe
underlying call stack in older compiler versions.

On 2024-05-08 08:05 PM, Palmer Dabbelt wrote:
> > For the kernel, since we want frame records to be available for stack
> > walking, we should pass -mno-omit-leaf-frame-pointer if that option is
> > supported. If the option is supported, we know the ABI is fixed, so in that
> > case we can omit the workaround described above.
> >
> > We may still want a compiler version check, because there is a much simpler
> > GCC change that would fix the ABI incompatibility in older branches without
> > adding the new option:
> >
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index dddd72aa237..58722507dcb 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -3894,7 +3894,8 @@ riscv_save_reg_p (unsigned int regno)
> >    if (regno == HARD_FRAME_POINTER_REGNUM && frame_pointer_needed)
> >      return true;
> >
> > -  if (regno == RETURN_ADDR_REGNUM && crtl->calls_eh_return)
> > +  if (regno == RETURN_ADDR_REGNUM && (crtl->calls_eh_return
> > +                                      || frame_pointer_needed))
> >      return true;
> >
> >    /* If this is an interrupt handler, then must save extra registers.  */
>
> IMO we should try backporting the option to GCC-13, that's a cleaner
> user interface.

Agreed. And what about older compiler versions? Is that a good idea to move
unwinding logic into the separate function which can have different
implementations (via conditional compilation) depending on the compiler version?

Now speaking about the implementation. I'm not sure where
`-mno-omit-leaf-frame-pointer` flag should be added. Do you have any suggestions
for a proper way to do it or maybe related examples? Also I can discuss it with
Masahiro.
diff mbox series

Patch

diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index 64a9c093aef9..8138c68270c9 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -55,7 +55,7 @@  void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
 		/* Unwind stack frame */
 		frame = (struct stackframe *)fp - 1;
 		sp = fp;
-		if (regs && (regs->epc == pc) && (frame->fp & 0x7)) {
+		if (regs && regs->epc == pc) {
 			fp = frame->ra;
 			pc = regs->ra;
 		} else {