diff mbox

4879b7ae05 ("Merge tag 'dmaengine-4.12-rc1' of .."): WARNING: kernel stack regs at bd92bc2e in 01-cpu-hotplug:3811 has bad 'bp' value 000001be

Message ID 20171002212654.rsm4uj3hfioddldd@treble (mailing list archive)
State New, archived
Headers show

Commit Message

Josh Poimboeuf Oct. 2, 2017, 9:26 p.m. UTC
On Mon, Oct 02, 2017 at 01:09:31PM -0700, Linus Torvalds wrote:
> Bringing in Josh on this, because the merge commit gets fingered
> because it seems to be an interaction between the new code from the
> merge and the ORC unwinder changes. It's probably some almost trivial
> code difference that just causes some code generation to change.
> 
> And because Josh wasn't implicated in the merge, he didn't get cc'd by
> the kernel test robot.
> 
> Of course, the stack trace itself seems to be pretty useless, since
> that randconfig doesn't have KALLSYMS enabled, so it's just random hex
> numbers.
> 
> Josh, original email on lkml and a few other mailing lists, see for example:
> 
>     https://www.spinics.net/lists/devicetree/msg196692.html
> 
> Any ideas?

The bisect is pointing to a commit which is almost 5 months old, so this
is pre-ORC.  Kallsyms *is* enabled, but the unwinder dump isn't smart
enough to realize it's dumping misaligned stack addresses:

[  110.399425] WARNING: kernel stack regs at bd92bc2e in 01-cpu-hotplug:3811 has bad 'bp' value 000001be
[  110.399428] unwind stack type:0 next_sp:  (null) mask:0x2 graph_idx:0
[  110.399431] bd92bc2e: 92bc7f00 (0x92bc7f00)
[  110.399433] bd92bc32: c27041bd (0xc27041bd)

So kallsyms doesn't help if I'm dumping bad data.  Oops.

I tried to recreate it with the reproducer script and a similar GCC
version, but no luck.

Fengguang, assuming it's reliably recreatable, any chance you could
recreate with the following patch?

Comments

Linus Torvalds Oct. 2, 2017, 9:58 p.m. UTC | #1
On Mon, Oct 2, 2017 at 2:26 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> The bisect is pointing to a commit which is almost 5 months old, so this
> is pre-ORC.  Kallsyms *is* enabled, but the unwinder dump isn't smart
> enough to realize it's dumping misaligned stack addresses:

Ahh, I didn't pick up on that "esp isn't aligned" part.

That said, if %esp gets unaligned at some point, it's not clear
exactly when we should align it. An unaligned stack pointer will
continue to _work_ just potentially perform fairly badly.

But more likely, we picked the wrong frame value to begin with.

For example, maybe that decode_frame_pointer() logic really should
check not that the low bit in bp is set, but instead check that it's a
valid "unsigned long *" that has the low bit set.

IOW, the difference would be that instead of checking

        if (!(regs & 0x1))
                return NULL;

if would check

        if ((regs & (sizeof(unsigned long)-1)) != 1)
                return NULL;

but also maybe add logic to simply not trust a next frame pointer that
isn't appropriately aligned.

So I think adding PTR_ALIGN() there in the unwind dumper might be a
bit late. By that time it has already accepted what looks like a
garbage frame. No?

                Linus
Josh Poimboeuf Oct. 3, 2017, 1:17 a.m. UTC | #2
On Mon, Oct 02, 2017 at 02:58:08PM -0700, Linus Torvalds wrote:
> On Mon, Oct 2, 2017 at 2:26 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > The bisect is pointing to a commit which is almost 5 months old, so this
> > is pre-ORC.  Kallsyms *is* enabled, but the unwinder dump isn't smart
> > enough to realize it's dumping misaligned stack addresses:
> 
> Ahh, I didn't pick up on that "esp isn't aligned" part.
> 
> That said, if %esp gets unaligned at some point, it's not clear
> exactly when we should align it. An unaligned stack pointer will
> continue to _work_ just potentially perform fairly badly.

True, but unaligned stacks seem to be very rare, and I don't remember
ever seeing one with frame pointers enabled.  I think the call return
addresses will always be aligned, so I think it makes sense to align the
starting address before dumping.

> But more likely, we picked the wrong frame value to begin with.

Agreed.

> For example, maybe that decode_frame_pointer() logic really should
> check not that the low bit in bp is set, but instead check that it's a
> valid "unsigned long *" that has the low bit set.
> 
> IOW, the difference would be that instead of checking
> 
>         if (!(regs & 0x1))
>                 return NULL;
> 
> if would check
> 
>         if ((regs & (sizeof(unsigned long)-1)) != 1)
>                 return NULL;

Yeah, that would be an improvement.

> but also maybe add logic to simply not trust a next frame pointer that
> isn't appropriately aligned.

Also a good idea.

> So I think adding PTR_ALIGN() there in the unwind dumper might be a
> bit late. By that time it has already accepted what looks like a
> garbage frame. No?

Yeah, maybe.  I guess it really depends on what exactly the root cause
is.  It got a bad frame pointer somehow, but maybe not *too* bad because
it still ended up in the task stack somehow.  It's not clear how that
happened, but it's possible the task stack had some clues about what led
up to it.

I'll implement your suggestions in addition to the patch to align sp in
the dump code.  Having all of them together would hopefully enable the
unwinder to detect a bad condition as early as possible and also do a
better job at dumping it.
diff mbox

Patch

diff --git a/arch/x86/kernel/unwind_frame.c b/arch/x86/kernel/unwind_frame.c
index 82c6d7f1fd73..64282ec73eb8 100644
--- a/arch/x86/kernel/unwind_frame.c
+++ b/arch/x86/kernel/unwind_frame.c
@@ -42,7 +42,7 @@  static void unwind_dump(struct unwind_state *state)
 			state->stack_info.type, state->stack_info.next_sp,
 			state->stack_mask, state->graph_idx);
 
-	for (sp = state->orig_sp; sp; sp = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
+	for (sp = PTR_ALIGN(state->orig_sp); sp; sp = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
 		if (get_stack_info(sp, state->task, &stack_info, &visit_mask))
 			break;