parisc: unwind tables and backtraces broken?

Message ID 4A527397.7060306@gmx.de (mailing list archive)
State Not Applicable

Commit Message

Helge Deller July 6, 2009, 9:58 p.m. UTC
I started looking into why CONFIG_BACKTRACE_SELF_TEST=y shows incomplete/wrong/broken backtraces.

It seems to me that the unwind tables are broken when using newer gcc/binutils versions.
I'm running hppa-linux-gcc (GCC) 4.3.3, and GNU ld (GNU Binutils) 2.19.51.20090704.

hppa-linux-objdump -d vmlinux gives:

vmlinux:     file format elf32-hppa-linux
Disassembly of section .text:
10100000 <stext-0x700>:
10100000:       20 2f e2 06     ldil L%103df000,r1
10100004:       e0 20 28 82     be,n 440(sr4,r1)
10100008:       20 38 42 02     ldil L%10170000,r1
1010000c:       e0 20 23 da     be,n 1ec(sr4,r1)
10100010:       20 32 62 02     ldil L%10165000,r1
10100014:       e0 20 2a 0a     be,n 504(sr4,r1)
10100018:       20 34 42 06     ldil L%10368000,r1
1010001c:       e0 20 27 12     be,n 388(sr4,r1)
10100020:       20 33 f2 04     ldil L%102e7800,r1
10100024:       e0 20 22 ea     be,n 174(sr4,r1)
10100028:       20 35 22 06     ldil L%1032b000,r1
1010002c:       e0 20 2e 2a     be,n 714(sr4,r1)
(and so on)

I assume this is a jump table generated by the linker to
resolve long-distance calls?

and later:
         ...
10100700 <stext>:
10100700:       00 00 38 20     mtsp r0,sr4
10100704:       00 00 78 20     mtsp r0,sr5
10100708:       00 00 b8 20     mtsp r0,sr6
1010070c:       00 00 f8 20     mtsp r0,sr7
10100710:       20 69 80 0c     ldil L%692000,r3
10100714:       34 63 00 00     ldo 0(r3),r3
10100718:       20 93 f0 0c     ldil L%6e7800,r4
1010071c:       34 84 0f 58     ldo 7ac(r4),r4
10100720 <$bss_loop>:
10100720:       80 83 9f f7     cmpb,<<,n r3,r4,10100720 <$bss_loop>
10100724:       0c 60 12 a8     stw,ma r0,4(r3)
....

but the unwind table when running the kernel with the attached patch (see below) shows:
...
unwind_init: start = 0x105fb3c0, end = 0x10634f30, entries = 14775
unwind 1: 100ff900 - 100ffa80, len=385
unwind 2: 100ffa84 - 100ffad4, len=81
unwind 3: 100ffad8 - 100ffb2c, len=85
unwind 4: 100ffb30 - 100ffbc8, len=153
unwind 5: 100ffbcc - 100ffc38, len=109
unwind 6: 100ffc3c - 100ffc9c, len=97
unwind 7: 100ffca0 - 100ffd00, len=97
unwind 8: 100ffd04 - 100ffd64, len=97
unwind 9: 100ffd68 - 100ffdc8, len=97
unwind 10: 100ffdcc - 100ffdec, len=33

From this table I don't even understand the values of the very first
entry (unwind 1: 100ff900 - 100ffa80).
This does not resolve to any entry in the assembly.
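
One way to make the mismatch concrete is to check each relocated entry against the address range of the text section. The following is a minimal sketch, not kernel code: struct unwind_range and count_out_of_text() are hypothetical names, and the struct models only the two address fields printed by the debug patch. region_end is treated as inclusive, matching the "+ 1" in the patch's length calculation.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an unwind table entry (hypothetical type):
 * only the start/end addresses printed by the debug patch are modelled. */
struct unwind_range {
	uint32_t region_start;	/* first address covered */
	uint32_t region_end;	/* last address covered (inclusive) */
};

/* Count entries that are malformed or lie at least partly outside the
 * half-open text range [text_start, text_end). */
static size_t count_out_of_text(const struct unwind_range *tab, size_t n,
				uint32_t text_start, uint32_t text_end)
{
	size_t bad = 0;

	for (size_t i = 0; i < n; i++) {
		if (tab[i].region_start > tab[i].region_end ||
		    tab[i].region_start < text_start ||
		    tab[i].region_end >= text_end)
			bad++;
	}
	return bad;
}
```

Fed the first two entries from the log above with text_start = 0x10100000 (the first address in the objdump listing) and any text_end above the entries, both are flagged, because they begin below the first disassembled address. The first entry starts 0x700 bytes below it; whether that gap is related to the 0x700 bytes in front of stext is not established here.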

My assumption:
When the linker creates the long-distance jump table, it does not adjust
the values in the unwind table.
Second, when the linker discards attribute-weak functions,
it doesn't delete/adjust the unwind table entries of the deleted functions either.

Question: Might my analysis be correct?

Helge

Comments

Randolph Chung July 7, 2009, 2:57 a.m. UTC | #1
Helge,

> but the unwind table when running the kernel with the attached patch 
> (see below) shows:
> ...
> unwind_init: start = 0x105fb3c0, end = 0x10634f30, entries = 14775
> unwind 1: 100ff900 - 100ffa80, len=385
> unwind 2: 100ffa84 - 100ffad4, len=81
> unwind 3: 100ffad8 - 100ffb2c, len=85
> unwind 4: 100ffb30 - 100ffbc8, len=153
> unwind 5: 100ffbcc - 100ffc38, len=109
> unwind 6: 100ffc3c - 100ffc9c, len=97
> unwind 7: 100ffca0 - 100ffd00, len=97
> unwind 8: 100ffd04 - 100ffd64, len=97
> unwind 9: 100ffd68 - 100ffdc8, len=97
> unwind 10: 100ffdcc - 100ffdec, len=33
>
> From this table I don't even understand the values of the very first
> entry (unwind 1: 100ff900 - 100ffa80).
> This does not resolve to any entry in the assembly.
I am a little fuzzy on the details, but the numbers printed above are 
what is stored in the unwind table. They do not correspond to the 
actual addresses in memory, which are adjusted by an offset. In the case of 
kernel symbols, this offset is KERNEL_START (a parameter passed 
to unwind_table_init()).
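
The relocation described above, and the lookup it enables, can be sketched as follows. This is a simplified illustration under assumptions: struct unwind_range, relocate() and find_entry() are hypothetical names, the base value is whatever the caller passes in, and the kernel's actual lookup may differ in detail. The search assumes the entries are sorted by address and do not overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an unwind table entry (hypothetical type). */
struct unwind_range {
	uint32_t region_start;	/* first address covered */
	uint32_t region_end;	/* last address covered (inclusive) */
};

/* Turn stored offsets into addresses by adding the base, in the same way
 * the loop touched by the debug patch adds base_addr to both fields. */
static void relocate(struct unwind_range *tab, size_t n, uint32_t base)
{
	for (size_t i = 0; i < n; i++) {
		tab[i].region_start += base;
		tab[i].region_end += base;
	}
}

/* Binary search over sorted, non-overlapping entries.
 * Returns the index of the entry covering pc, or -1 if none does. */
static long find_entry(const struct unwind_range *tab, size_t n, uint32_t pc)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pc < tab[mid].region_start)
			hi = mid;
		else if (pc > tab[mid].region_end)
			lo = mid + 1;
		else
			return (long)mid;
	}
	return -1;
}
```

After relocate(), an address that falls in a gap between two entries, or outside the table altogether, finds no entry and yields -1. If the stored values or the base are off, lookups for valid code addresses can miss in the same way, which is one plausible route to a truncated backtrace.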

> My assumption:
> When the linker creates the long-distance jump table, it does not adjust
> the values in the unwind table.
this used to work.....
> Second, when the linker discards attribute-weak functions,
> it doesn't delete/adjust the unwind table entries of the deleted 
> functions either.
can you try this with a userspace program? gdb uses this same unwind 
information to do backtraces. if the unwind info is wrong gdb will be 
very broken.

On the other hand, the kernel does use a more complex linker script, so 
it is possible that some option in the linker script is triggering some 
bug.

randolph
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Helge Deller July 7, 2009, 6:01 p.m. UTC | #2
On 07/07/2009 04:57 AM, Randolph Chung wrote:
> Helge,
>
>> but the unwind table when running the kernel with the attached patch
>> (see below) shows:
>> ...
>> unwind_init: start = 0x105fb3c0, end = 0x10634f30, entries = 14775
>> unwind 1: 100ff900 - 100ffa80, len=385
>> unwind 2: 100ffa84 - 100ffad4, len=81
>> unwind 3: 100ffad8 - 100ffb2c, len=85
>> unwind 4: 100ffb30 - 100ffbc8, len=153
>> unwind 5: 100ffbcc - 100ffc38, len=109
>> unwind 6: 100ffc3c - 100ffc9c, len=97
>> unwind 7: 100ffca0 - 100ffd00, len=97
>> unwind 8: 100ffd04 - 100ffd64, len=97
>> unwind 9: 100ffd68 - 100ffdc8, len=97
>> unwind 10: 100ffdcc - 100ffdec, len=33
>>
>> From this table I don't even understand the values of the very first
>> entry (unwind 1: 100ff900 - 100ffa80).
>> This does not resolve to any entry in the assembly.
> I am a little fuzzy on the details, but the numbers printed above are
> what is stored in the unwind table. This does not correspond with the
> actual address in memory, which is adjusted by an offset. In the case of
> kernel symbols, this offset is KERNEL_START (this is a parameter passed
> to unwind_table_init())

The addresses given above already have the offset added. They are wrong
nevertheless.

>> My assumption:
>> When the linker creates the long-distance jump table, it does not adjust
>> the values in the unwind table.
> this used to work.....

Yes.
Interestingly, this problem only showed up for me after I updated my 32- and
64-bit cross-compilers to 4.3.3 (and binutils, of course).
Before that I used gcc-3.3 (on 32-bit), which did not exhibit the
problem of buggy unwind tables (with the same kernel source code).

>> Second, when the linker discards attribute-weak functions,
>> it doesn't delete/adjust the unwind table entries of the deleted
>> functions either.
> can you try this with a userspace program? gdb uses this same unwind
> information to do backtraces. if the unwind info is wrong gdb will be
> very broken.

I'll try, but I assume userspace is OK. If it weren't, Dave probably
wouldn't be able to debug the other userspace issues (the segv thread on Debian's
buildds).
  
> On the other hand, the kernel does use a more complex linker script so
> it is possible that some option in the linker script is triggering some
> bug.

Yes, maybe. But again, I think gcc-3.3 (and the old binutils) could handle this gracefully.

Helge
Carlos O'Donell July 7, 2009, 8:36 p.m. UTC | #3
On Tue, Jul 7, 2009 at 2:33 PM, John David
Anglin<dave@hiauly1.hia.nrc.ca> wrote:
>> Interestingly, this problem only showed up for me after I updated my 32- and
>> 64-bit cross-compilers to 4.3.3 (and binutils, of course).
>> Before that I used gcc-3.3 (on 32-bit), which did not exhibit the
>> problem of buggy unwind tables (with the same kernel source code).
>
> GCC doesn't know anything about PA-RISC unwind info.  It's generated by the
> assembler from assembler directives.  So, I think it's unlikely that the
> problem is in GCC.
>
> Your comments about dead-code elimination by the linker make me wonder
> if that isn't the problem.

Helge,

Are you compiling with --gc-sections? Try without?

I've seen at least one problem in the past on another target where
garbage collection would not correctly update debug information.

Cheers,
Carlos.

Patch

diff --git a/arch/parisc/kernel/unwind.c b/arch/parisc/kernel/unwind.c
index 69dad5a..d7c7241 100644
--- a/arch/parisc/kernel/unwind.c
+++ b/arch/parisc/kernel/unwind.c
@@ -94,6 +94,10 @@  unwind_table_init(struct unwind_table *table, const char *name,
 	struct unwind_table_entry *start = table_start;
 	struct unwind_table_entry *end = 
 		(struct unwind_table_entry *)table_end - 1;
+	int nr = 0;
+
+	extern void stext();
+	// base_addr += ((unsigned long)&stext) - KERNEL_START; // HELGE
 
 	table->name = name;
 	table->base_addr = base_addr;
@@ -112,6 +116,15 @@  unwind_table_init(struct unwind_table *table, const char *name,
 
 		start->region_start += base_addr;
 		start->region_end += base_addr;
+		if (nr<10) {
+			nr++;
+			printk("unwind %d: %x - %x, len=%d\n",
+				nr,
+				start->region_start,
+				start->region_end,
+				start->region_end - start->region_start + 1);
+				
+		}
 	}
 }